# Python Course - Tutorial 6

### Exercise 1: Robust Weather Analyzer
Extend the `analyze_weather_data` function from the previous tutorial with exception handling. The function should raise the following exceptions for invalid input:

1. If the `data` parameter is not a list, raise a `TypeError` with the message: "The data parameter must be a list!".
2. If the `analysis_type` parameter is not a string, raise a `TypeError` with the message: "The analysis_type parameter must be a string!".
3. If the `analysis_type` parameter is not one of "average", "max", "min", or "trend", raise a `ValueError` with the message: "The analysis_type parameter must be one of 'average', 'max', 'min', or 'trend'!".
4. If the `data` parameter is an empty list, raise a `ValueError` with the message: "The data parameter must not be empty!".
5. *Advanced*: If the `data` parameter is a list of dictionaries, but one of the dictionaries is missing any of the keys "date", "temperature", "humidity", or "wind_speed", raise a `ValueError` with the message: "The data parameter must be a list of dictionaries with the keys 'date', 'temperature', 'humidity', and 'wind_speed'!".

*Hint*: Use the built-in `isinstance()` function to check if a variable is of a certain type.
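
As a minimal sketch of the hint: `isinstance()` returns `True` or `False`, so it can guard a `raise` statement. The helper name `require_list` is hypothetical and only illustrates the pattern; it is not part of the exercise.

```python
def require_list(value):
    """Return value unchanged if it is a list, otherwise raise a TypeError."""
    # isinstance(obj, cls) is True when obj is an instance of cls (or a subclass)
    if not isinstance(value, list):
        raise TypeError("The data parameter must be a list!")
    return value


print(isinstance([1, 2], list))   # True
print(isinstance("foo", list))    # False

try:
    require_list("foo")
except TypeError as error:
    print(error)                  # The data parameter must be a list!
```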

**Sample Outputs**:
```python
>>> analyze_weather_data("foo", "average")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in analyze_weather_data
TypeError: The data parameter must be a list!

>>> analyze_weather_data(weather_data, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in analyze_weather_data
TypeError: The analysis_type parameter must be a string!

>>> analyze_weather_data(weather_data, "foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in analyze_weather_data
ValueError: The analysis_type parameter must be one of 'average', 'max', 'min', or 'trend'!

>>> analyze_weather_data([], "average")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in analyze_weather_data
ValueError: The data parameter must not be empty!

>>> analyze_weather_data([{"foo": 1}], "average")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in analyze_weather_data
ValueError: The data parameter must be a list of dictionaries with the keys 'date', 'temperature', 'humidity', and 'wind_speed'!
```

In [None]:
def analyze_weather_data(data, analysis_type):
    """
    Analyzes weather data and returns the result as a dictionary.
    :param data: A list of dictionaries containing weather data. Each dictionary has 'date', 'temperature', 'humidity', and 'wind_speed'.
    :param analysis_type: The type of analysis to perform. Must be one of 'average', 'max', 'min', or 'trend'.
    :return: The result of the analysis as a dictionary or a string (for trend analysis).
    """

    # Check if the parameters are of the correct type
    if not isinstance(data, list):
        raise TypeError("The data parameter must be a list!")
    if not isinstance(analysis_type, str):
        raise TypeError("The analysis_type parameter must be a string!")
    if analysis_type not in ["average", "max", "min", "trend"]:
        raise ValueError("The analysis_type parameter must be one of 'average', 'max', 'min', or 'trend'!")
    if not data:
        raise ValueError("The data parameter must not be empty!")
    
    # Check if all dictionaries have the required keys
    required_keys = {"date", "temperature", "humidity", "wind_speed"}
    for item in data:
        if not isinstance(item, dict):
            raise TypeError("The data parameter must be a list of dictionaries!")
        if not all(key in item for key in required_keys):
            raise ValueError("The data parameter must be a list of dictionaries with " +
                             "the keys 'date', 'temperature', 'humidity', and 'wind_speed'!")

    # Check if the analysis type is "average"
    if analysis_type == "average":
        # Calculate the total temperature and total humidity by summing over the data
        total_temp = sum(item['temperature'] for item in data)
        total_humidity = sum(item['humidity'] for item in data)
        
        # Calculate the average temperature and humidity by dividing the total by the number of data points
        avg_temp = total_temp / len(data)
        avg_humidity = total_humidity / len(data)
        
        # Return the average temperature and humidity as a dictionary
        return {"average_temperature": avg_temp, "average_humidity": avg_humidity}

    # Check if the analysis type is "max"
    elif analysis_type == "max":
        # Find the day with the maximum temperature using the max() function and a lambda function to compare temperatures
        max_temp_day = max(data, key=lambda x: x['temperature'])
        
        # Return the date of the day with the maximum temperature
        return {"max_temperature_date": max_temp_day['date']}

    # Check if the analysis type is "min"
    elif analysis_type == "min":
        # Find the day with the minimum temperature using the min() function and a lambda function to compare temperatures
        min_temp_day = min(data, key=lambda x: x['temperature'])
        
        # Return the date of the day with the minimum temperature
        return {"min_temperature_date": min_temp_day['date']}

    # Check if the analysis type is "trend"
    elif analysis_type == "trend":
        # Extract a list of temperatures from the data
        temperatures = [item['temperature'] for item in data]
        
        # Check if the temperatures are in an increasing trend
        if all(temperatures[i] <= temperatures[i + 1] for i in range(len(temperatures) - 1)):
            return "Increasing trend"
        
        # Check if the temperatures are in a decreasing trend
        elif all(temperatures[i] >= temperatures[i + 1] for i in range(len(temperatures) - 1)):
            return "Decreasing trend"
        
        # If neither increasing nor decreasing, it's a stable or mixed trend
        else:
            return "Stable or mixed trend"

    # Not reachable after the validation at the top of the function; kept as a defensive fallback
    else:
        return "Invalid analysis type"


if __name__ == "__main__":
    # Example usage
    weather_data = [
        {"date": "2023-11-01", "temperature": 20, "humidity": 50, "wind_speed": 5},
        {"date": "2023-11-02", "temperature": 22, "humidity": 45, "wind_speed": 7},
        {"date": "2023-11-03", "temperature": 21, "humidity": 55, "wind_speed": 4},
        # ... add more data as needed
    ]

    # Call analyze_weather_data with an unsupported analysis type ("prognosis");
    # the validation in the function raises a ValueError for it
    result = analyze_weather_data(weather_data, "prognosis")
    
    # Print the result of the analysis (not reached when the call above raises)
    print(result)
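
One possible variation on the key check in the solution above, shown only as a sketch: because the required keys are stored in a set, a set difference can report which keys are missing. The names `REQUIRED_KEYS` and `find_missing_keys` are assumptions made for illustration and do not appear in the exercise.

```python
REQUIRED_KEYS = {"date", "temperature", "humidity", "wind_speed"}


def find_missing_keys(item):
    """Return the sorted list of required keys that are absent from item."""
    # set.difference accepts any iterable; iterating a dict yields its keys
    return sorted(REQUIRED_KEYS.difference(item))


complete = {"date": "2023-11-01", "temperature": 20, "humidity": 50, "wind_speed": 5}
incomplete = {"date": "2023-11-01", "temperature": 20}

print(find_missing_keys(complete))    # []
print(find_missing_keys(incomplete))  # ['humidity', 'wind_speed']
```

Reporting the missing keys is optional here, since the exercise prescribes a fixed error message.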

### Exercise 2: Exception Handling in Data Validation

In this exercise, you will write Python functions that perform simple data validation. You will use the `raise` statement to trigger exceptions when invalid data is encountered, and you will use `try-except-else` blocks to handle these exceptions. This exercise will help you understand how to use exceptions to manage error conditions in Python without using object-oriented programming concepts.

1. Write a function `validate_age(age)` that takes an integer `age` as input. If `age` is less than 0 or greater than 120, raise a `ValueError` with the message `"Invalid age: {age}"`. Otherwise, return `age`.

2. Write a function `calculate_retirement_age(current_age)` that:
   - Uses `validate_age` to ensure `current_age` is valid.
   - Calculates and returns the number of years left until retirement age (assume retirement age is 65).
   - If `current_age` is already greater than or equal to 65, return 0.

3. In your main program:
   - Prompt the user to enter their age. Use the `input()` function to get the age as a string and then convert it to an integer.
   - Use a `try-except-else` block to handle any exceptions that may be raised during the validation and calculation process.
   - If an exception occurs, print an error message. If no exception occurs, print the number of years left until retirement.

- **Validate Age Function**: The `validate_age` function checks if the provided `age` is within a realistic human age range (0 to 120). If not, it raises a `ValueError`.
- **Calculate Retirement Age Function**: The `calculate_retirement_age` function first validates the `current_age`. It then calculates the number of years left until retirement age (65). If the person is already 65 or older, it returns 0.
- **Exception Handling**: In the `main` function, a `try-except-else` block is used to handle any `ValueError` exceptions that may be raised during input conversion or age validation. If an exception occurs, an error message is printed. If no exception occurs, the program prints the years left until retirement.
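
The control flow of a `try-except-else` block can be sketched as follows. The function `describe_number` is hypothetical and separate from the exercise solution; it only shows that the `else` branch runs when the `try` block finishes without raising.

```python
def describe_number(text):
    """Convert text to an int and report which branch handled it."""
    try:
        number = int(text)  # may raise ValueError
    except ValueError as error:
        # Runs only if the try block raised a ValueError
        return f"Error: {error}"
    else:
        # Runs only if the try block raised no exception
        return f"Parsed {number}"


print(describe_number("42"))   # Parsed 42
print(describe_number("abc"))  # Error: invalid literal for int() with base 10: 'abc'
```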


In [None]:
# (i) Validate age function
def validate_age(age):
    if age < 0 or age > 120:
        raise ValueError(f"Invalid age: {age}")
    return age

# (ii) Calculate retirement age function
def calculate_retirement_age(current_age):
    age = validate_age(current_age)
    retirement_age = 65
    years_left = retirement_age - age
    if years_left < 0:
        return 0
    else:
        return years_left

# (iii) Main program with exception handling
def main():
    try:
        age_input = input("Enter your age: ")
        age = int(age_input)
        years_until_retirement = calculate_retirement_age(age)
    except ValueError as e:
        print(e)
    else:
        print(f"You have {years_until_retirement} years left until retirement.")

if __name__ == "__main__":
    main()

### Exercise 3: Debugging a Program

The following Python program is intended to read sales data from a text file called `sales_data.txt` (in the GitHub repository), process the data to compute the total sales, average sales per day, and identify the day with the highest total sales. However, the program contains several bugs that prevent it from working correctly, particularly in handling datetime values when grouping sales by date. Your task is to identify and fix these bugs using a debugger.

In [None]:
def process_sales_data(path):
    total_sales = 0.0
    sales_per_day = {}
    highest_sales = 0.0
    highest_day = ''
    sales_count = 0

    with open(path, 'r') as file:
        for line in file:
            line = line.strip()
            if not line:
                continue
            date, sales = line.split(',')
            sales = float(sales)
            total_sales += sales
            sales_count += 1
            if date not in sales_per_day:
                sales_per_day[date] = sales
            else:
                sales_per_day[date] += sales
            if sales > highest_sales:
                highest_sales = sales
                highest_day = date

    average_sales_per_day = total_sales / len(sales_per_day)
    return total_sales, average_sales_per_day, highest_day

# Sample usage
total, average_per_day, best_day = process_sales_data('sales_data.txt')
print(f"Total Sales: ${total}")
print(f"Average Daily Sales: ${average_per_day}")
print(f"Highest Sales Day: {best_day}")

**Identifying and Fixing Bugs**

When addressing logical errors, set breakpoints at critical points where key values change or computations occur. This allows you to observe how the data evolves and pinpoint potential issues. Remember that the debugger pauses only when an exception is raised or when execution reaches a breakpoint you have set. By stepping through the code, you can systematically trace the program's flow and identify discrepancies.
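
As a rough sketch of setting a breakpoint manually, Python's built-in `breakpoint()` pauses execution and opens the default debugger (`pdb`) at that line. The function `running_total` and its `debug` flag are assumptions made for this illustration; in an IDE such as Visual Studio Code or PyCharm you would typically click in the editor margin instead.

```python
def running_total(values, debug=False):
    """Sum values, optionally pausing in the debugger before each addition."""
    total = 0.0
    for value in values:
        if debug:
            # Pauses here; inspect `total` and `value`, then type `c` to continue
            breakpoint()
        total += value
    return total


# With debug=False the function runs straight through
print(running_total([100.0, 50.0, 25.0]))  # 175.0
```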

(i) Bugs in the Function:
- **Bug 1**: The first field of each line is a full date-time string, and the code uses it unchanged as `date`, so the dictionary key still includes the time component. This prevents correct grouping by date.
- **Bug 2**: When checking for the highest sales day, the code compares individual `sales` amounts instead of the total sales per day. This means it identifies the transaction with the highest amount, not the day with the highest total sales.
- **Bug 3**: The average sales per day is calculated as `total_sales / len(sales_per_day)`, but if the date extraction is incorrect, the number of days may be wrong or inflated due to time components.
- **Bug 4**: The program does not handle exceptions that may arise from incorrect date formats or parsing errors.
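
The effect described in Bug 1 and Bug 3 can be reproduced with a few in-memory lines. The sample records below are invented for illustration and assume that each line has the form `YYYY-MM-DD HH:MM:SS,amount`; they are not taken from `sales_data.txt`.

```python
import datetime

# Hypothetical records: two sales on the same day, one on the next day
lines = [
    "2023-11-01 09:15:00,100.0",
    "2023-11-01 14:30:00,50.0",
    "2023-11-02 10:00:00,120.0",
]

by_raw_string = {}
by_date = {}
for line in lines:
    date_time_str, sales = line.split(",")
    sales = float(sales)
    # Buggy grouping: the key still contains the time component
    by_raw_string[date_time_str] = by_raw_string.get(date_time_str, 0.0) + sales
    # Correct grouping: parse the string and keep only the date
    date = datetime.datetime.strptime(date_time_str, "%Y-%m-%d %H:%M:%S").date()
    by_date[date] = by_date.get(date, 0.0) + sales

print(len(by_raw_string))  # 3 keys, one per transaction
print(len(by_date))        # 2 keys, one per calendar day
```

Dividing the same total by 3 instead of 2 is what distorts the average in the buggy version.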

In [None]:
#(ii) Fixed Function:

import datetime

def process_sales_data(filename):
    # Initialize variables to track total sales, daily sales, and highest sales
    total_sales = 0.0
    sales_per_day = {}  # Dictionary to store sales grouped by date
    highest_sales = 0.0  # Variable to track the highest sales value for a single day
    highest_day = ''  # Variable to track the date with the highest sales
    sales_count = 0  # Counter for the total number of valid sales records processed

    # Open the file containing sales data
    with open(filename, 'r') as file:
        for line in file:
            line = line.strip()  # Remove any leading/trailing whitespace from the line
            if not line:
                continue  # Skip empty lines
            try:
                # Split the line into date-time and sales values
                date_time_str, sales = line.split(',')
                sales = float(sales)  # Convert the sales value to a float
                # Parse the date-time string into a datetime object
                date_time = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S')
                date = date_time.date()  # Extract the date (ignoring time)
            except ValueError:
                # Print a warning for lines that can't be processed and skip them
                print(f"Skipping invalid line: {line}")
                continue

            # Accumulate total sales and count the record
            total_sales += sales
            sales_count += 1
            # Add the sales value to the corresponding date in the dictionary
            if date not in sales_per_day:
                sales_per_day[date] = sales
            else:
                sales_per_day[date] += sales

    # Find the day with the highest total sales
    for date, daily_total in sales_per_day.items():
        if daily_total > highest_sales:
            highest_sales = daily_total
            highest_day = date

    # Calculate average sales per day, guarding against division by zero
    if sales_per_day:
        average_sales_per_day = total_sales / len(sales_per_day)
    else:
        average_sales_per_day = 0.0

    # Return a tuple with total sales, average sales per day, and the day with the highest sales
    return total_sales, average_sales_per_day, highest_day


# Sample usage
total, average_per_day, best_day = process_sales_data('sales_data.txt')
print(f"Total Sales: ${total}")
print(f"Average Daily Sales: ${average_per_day}")
print(f"Highest Sales Day: {best_day}")

(iii) Explanation:
- **Bug 1 Fix**: Instead of using the raw date-time string as `date`, the fixed function parses it with `datetime.datetime.strptime` and uses `date_time.date()` as the dictionary key. This ensures that sales are grouped correctly by date without time components interfering.
- **Bug 2 Fix**: Moved the logic for finding the day with the highest total sales after processing all the data. Instead of comparing individual sales amounts, we now compare the total sales per day stored in `sales_per_day`.
- **Bug 3 Fix**: With the correct extraction of dates, `len(sales_per_day)` accurately reflects the number of unique days, making the average calculation correct.
- **Bug 4 Fix**: Added a `try-except` block around the parsing code to handle any potential `ValueError` exceptions, such as incorrect date formats or non-numeric sales values. This prevents the program from crashing on bad data.
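
As an alternative to the explicit loop in the Bug 2 fix, the day with the highest total can also be found with `max()` and a `key` function. This is only a sketch; the dictionary below uses made-up totals.

```python
import datetime

# Hypothetical per-day totals, keyed by date as in the fixed function
sales_per_day = {
    datetime.date(2023, 11, 1): 150.0,
    datetime.date(2023, 11, 2): 120.0,
}

# max() iterates over the keys; key=sales_per_day.get ranks them by their total
highest_day = max(sales_per_day, key=sales_per_day.get)

print(highest_day)                 # 2023-11-01
print(sales_per_day[highest_day])  # 150.0
```

Note that `max()` raises a `ValueError` on an empty dictionary unless a `default` argument is supplied, so the empty-file case still needs its own handling.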

### Exercise 4: GitHub Copilot Installation

1. Sign up for the free student plan of GitHub Copilot, e.g., on [GitHub Education](https://github.com/education/students).
2. Install the GitHub Copilot extension in Visual Studio Code and the GitHub Copilot plugin in PyCharm.
3. Test the extension/plugin by writing a few lines of code in both editors.

In [None]:
# Moved to tutorial 7