# **Assignment: Exploring Weather Data Using NumPy**

---

## Objective

This assignment focuses on analyzing weather data using **NumPy**. You will:
1. Create and manipulate datasets.
2. Perform statistical computations on NumPy arrays.
3. Solve real-world problems involving temperature data.

---

## Dataset Description

The dataset represents daily temperature records for one year (365 days). It consists of the following columns:
1. **Day of the Year**: An integer from 1 to 365.
2. **Minimum Temperature**: A random integer between -10 and 15 (in degrees Celsius).
3. **Maximum Temperature**: A random integer between 5 and 35 (in degrees Celsius).

To ensure uniformity across all submissions, we’ve fixed the random seed for data generation.

---

## Instructions

1. Use **only NumPy** for all computations.
2. Do **NOT** use loops unless explicitly stated.
3. Pass all visible and hidden test cases to complete the assignment.

---

## Tasks

### Task 1: Generate the Dataset

Write a function `generate_weather_data()` that:
1. Generates a NumPy array with the following columns:
   - Column 1: Days of the year (1 to 365).
   - Column 2: Minimum temperatures (random integers between -10 and 15).
   - Column 3: Maximum temperatures (random integers between 5 and 35).
2. Uses a fixed random seed (`np.random.seed(42)`) to ensure reproducibility.
3. Returns the generated array.
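
The sketch below illustrates two details that matter for this task: `np.random.randint` excludes its upper bound, and reseeding before each draw reproduces the same values. The helper name `sample_temps` and its parameters are hypothetical, introduced only for illustration. They are not part of the required interface.

```python
import numpy as np

def sample_temps(low, high, size, seed=42):
    """Hypothetical helper: draw integers from the closed range [low, high]."""
    np.random.seed(seed)
    # randint excludes its upper bound, so add 1 to make `high` reachable
    return np.random.randint(low, high + 1, size=size)

first = sample_temps(-10, 15, 365)
second = sample_temps(-10, 15, 365)

# Same seed, same call: the two draws are identical
print(np.array_equal(first, second))
# Every value stays inside the closed range
print(first.min() >= -10, first.max() <= 15)
```

Note that the order of the `randint` calls after seeding also affects the values, so drawing maximum temperatures before minimum temperatures would produce a different dataset.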

---

### Task 2: Basic Statistics

Write a function `basic_statistics(data)` that:
1. Accepts the weather data array from Task 1.
2. Computes and returns:
   - The average minimum temperature.
   - The average maximum temperature.
   - The highest temperature and the day it occurred.
   - The lowest temperature and the day it occurred.
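
As a minimal sketch of the idea, `np.argmax` and `np.argmin` return a row index, which can then be used to look up the matching day. The four-row `sample` array is made up for illustration and is not the assignment dataset.

```python
import numpy as np

# Hypothetical 4-day sample: columns are day, min temp, max temp
sample = np.array([
    [1, -3, 12],
    [2,  4, 30],
    [3, -8, 21],
    [4,  0, 30],
])

hottest_row = np.argmax(sample[:, 2])  # first row holding the highest max temp
coldest_row = np.argmin(sample[:, 1])  # first row holding the lowest min temp

print(sample[hottest_row, 0], sample[hottest_row, 2])  # 2 30
print(sample[coldest_row, 0], sample[coldest_row, 1])  # 3 -8
print(sample[:, 1].mean(), sample[:, 2].mean())        # -1.75 23.25
```

When several days tie for the extreme value, as days 2 and 4 do here, `np.argmax` and `np.argmin` report the first one.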

---

### Task 3: Daily Temperature Range

Write a function `daily_temperature_range(data)` that:
1. Computes the daily temperature range (maximum - minimum) for all days.
2. Returns:
   - An array of daily ranges.
   - The day with the largest temperature range and its value.
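
The range computation can be sketched on a small made-up array. Because the minimum (-10 to 15) and maximum (5 to 35) intervals overlap and the two columns are drawn independently, a generated minimum can exceed the maximum, so some ranges may be negative. The `sample` values below are hypothetical.

```python
import numpy as np

# Hypothetical 3-day sample: columns are day, min temp, max temp
sample = np.array([
    [1, -3, 12],
    [2, 14,  6],   # min above max: possible because the intervals overlap
    [3, -8, 21],
])

# Element-wise subtraction of whole columns, no loop needed
ranges = sample[:, 2] - sample[:, 1]
widest = np.argmax(ranges)

print(ranges)                             # [15 -8 29]
print(sample[widest, 0], ranges[widest])  # 3 29
```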

---

### Task 4: Heatwave Identification

Write a function `identify_heatwaves(data)` that:
1. Defines a **heatwave** as three or more consecutive days where the maximum temperature is above 30°C.
2. Identifies and returns:
   - The total number of heatwaves.
   - A list of tuples, where each tuple represents the start and end day of a heatwave.

**Example**:
- If maximum temperatures on days 74, 75, and 76 are all above 30°C, it is a single heatwave: `(74, 76)`.
- A single day above 30°C is **not** a heatwave.
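
One possible loop-free approach, sketched here under the assumption that vectorized code is preferred, is to locate the edges of each hot run with `np.diff` on a zero-padded mask. The function name `find_runs` and its `threshold` and `min_length` parameters are hypothetical. This is an illustration of the technique, not the required solution.

```python
import numpy as np

def find_runs(days, max_temps, threshold=30, min_length=3):
    """Hypothetical helper: (start_day, end_day) for each qualifying hot run."""
    hot = (max_temps > threshold).astype(int)
    # Pad with zeros so runs touching either end still produce both edges
    edges = np.diff(np.concatenate(([0], hot, [0])))
    starts = np.flatnonzero(edges == 1)      # index of the first hot day in each run
    ends = np.flatnonzero(edges == -1) - 1   # index of the last hot day in each run
    keep = (ends - starts + 1) >= min_length
    return list(zip(days[starts[keep]].tolist(), days[ends[keep]].tolist()))

days = np.arange(1, 12)
temps = np.array([25, 31, 32, 33, 20, 31, 34, 28, 35, 31, 32])

print(find_runs(days, temps))  # [(2, 4), (9, 11)]
```

Days 6 and 7 form a run of only two hot days, so they are dropped. The run on days 9 to 11 reaches the end of the array and is still detected because of the padding.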


In [None]:
#### STUDENT CODE CELL

import numpy as np

def generate_weather_data():
    # Set a fixed random seed for reproducibility
    np.random.seed(42)
    
    # Generate days array (1 to 365)
    days = np.arange(1, 366)
    
    # Generate random temperatures (randint excludes the upper bound)
    min_temps = np.random.randint(-10, 16, size=365)  # -10 to 15 inclusive
    max_temps = np.random.randint(5, 36, size=365)    # 5 to 35 inclusive
    
    # Stack the arrays to create the weather data
    weather_data = np.column_stack((days, min_temps, max_temps))
    
    return weather_data

In [None]:
def basic_statistics(data):
    # Extract min and max temperature columns
    min_temps = data[:, 1]
    max_temps = data[:, 2]
    
    # Calculate averages
    avg_min = np.mean(min_temps)
    avg_max = np.mean(max_temps)
    
    # Find highest temperature and its day
    max_temp_idx = np.argmax(max_temps)
    max_temp = max_temps[max_temp_idx]
    max_temp_day = data[max_temp_idx, 0]
    
    # Find lowest temperature and its day
    min_temp_idx = np.argmin(min_temps)
    min_temp = min_temps[min_temp_idx]
    min_temp_day = data[min_temp_idx, 0]
    
    return avg_min, avg_max, (max_temp_day, max_temp), (min_temp_day, min_temp)

In [None]:
def daily_temperature_range(data):
    # Calculate daily temperature range
    daily_ranges = data[:, 2] - data[:, 1]
    
    # Find the day with the largest range
    max_range_idx = np.argmax(daily_ranges)
    max_range = daily_ranges[max_range_idx]
    max_range_day = data[max_range_idx, 0]
    
    return daily_ranges, (max_range_day, max_range)

In [None]:
def identify_heatwaves(data):
    # Find days where max temp is above 30°C
    hot_days = data[:, 2] > 30
    
    # Initialize variables for tracking heatwaves
    heatwaves = []
    in_heatwave = False
    heatwave_start = 0
    
    # Loop through each day
    for i in range(len(hot_days)):
        day = int(data[i, 0])  # plain int so the returned tuples hold Python ints
        
        if hot_days[i]:  # If it's a hot day
            if not in_heatwave:  # Start of potential heatwave
                in_heatwave = True
                heatwave_start = day
        else:  # Not a hot day
            if in_heatwave:  # End of a potential heatwave
                # `day` is the first non-hot day, so the run spans heatwave_start..day - 1
                heatwave_length = day - heatwave_start
                if heatwave_length >= 3:  # It's a heatwave (3+ consecutive days)
                    heatwaves.append((heatwave_start, day - 1))
                in_heatwave = False
    
    # Check if the last sequence was a heatwave
    if in_heatwave:
        last_day = int(data[-1, 0])
        heatwave_length = last_day - heatwave_start + 1
        if heatwave_length >= 3:
            heatwaves.append((heatwave_start, last_day))
    
    return len(heatwaves), heatwaves