# Scenario: Climate Data Analysis for a Research Center

*As a data scientist at a climate research center, you have been provided with daily temperature and humidity data collected from 500 locations over one year. Your objective is to analyze this data for trends, seasonal patterns, and other useful metrics that can aid in understanding climate changes across various regions.*

## Assignment Tasks

### 1. Initialize Temperature and Humidity Data
Set up two arrays to represent daily data:
* `temperature_data`: Randomly generated temperature values in Celsius, ranging between `-10` and `40` degrees, for each of the `500` locations across `365` days.
* `humidity_data`: Randomly generated humidity percentages, ranging from `0` to `100`, for each location and day.

In [19]:
import numpy as np

temperature_data = np.random.uniform(-10, 40, (500, 365))
humidity_data = np.random.uniform(0, 100, (500, 365))
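
Because both arrays are random, every printed result in this notebook changes from run to run. A minimal sketch of a reproducible setup, assuming NumPy's `Generator` API; the seed `42` and the names `rng`, `N_LOCATIONS`, and `N_DAYS` are illustrative choices, not part of the assignment:

```python
import numpy as np

N_LOCATIONS, N_DAYS = 500, 365   # illustrative constant names
rng = np.random.default_rng(42)  # arbitrary seed, assumed only for reproducibility

# uniform() samples from the half-open interval [low, high)
temperature_data = rng.uniform(-10, 40, size=(N_LOCATIONS, N_DAYS))
humidity_data = rng.uniform(0, 100, size=(N_LOCATIONS, N_DAYS))

print(temperature_data.shape, humidity_data.shape)
```

With a fixed seed, rerunning the cell yields the same arrays, which makes later outputs comparable between runs.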


### 2. Check for Missing Data 
Simulate missing data by randomly setting `5%` of the values in `temperature_data` and `humidity_data` to null values. Determine how many null values exist in each array and report the total number of missing entries.

In [13]:
mask_temp = np.random.rand(*temperature_data.shape) < 0.05
temperature_data[mask_temp] = np.nan

mask_humid = np.random.rand(*humidity_data.shape) < 0.05
humidity_data[mask_humid] = np.nan

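n_missing_temp = np.isnan(temperature_data).sum()
n_missing_humid = np.isnan(humidity_data).sum()
print("Total missing temperature values:", n_missing_temp)
print("Total missing humidity values:", n_missing_humid)
print("Total missing entries:", n_missing_temp + n_missing_humid)

Total missing temperature values: 8977
Total missing humidity values: 8920
Total missing entries: 17897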
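
The random mask above marks each value independently with probability `0.05`, so the number of missing entries is only approximately 5% of the array. If exactly 5% is wanted, one possible sketch is to draw that many flat positions without replacement; the seed and the names `data`, `n_missing`, and `flat_idx` are assumptions made for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, assumed for this sketch
data = rng.uniform(-10, 40, size=(500, 365))

# Choose exactly 5% of the flat positions, with no position picked twice
n_missing = round(0.05 * data.size)
flat_idx = rng.choice(data.size, size=n_missing, replace=False)
data.flat[flat_idx] = np.nan

print(np.isnan(data).sum())  # equals n_missing
```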


### 3. Convert Temperature and Calculate Discomfort Index 
Convert `temperature_data` from Celsius to Fahrenheit to facilitate data sharing with international teams. Then, compute a "`feels like`" discomfort index by combining temperature and humidity data.
* Ensure that any values in the "`feels like`" index that exceed `80` are capped at `80`, meaning they should be set to `80` if they are originally greater than `80`.

In [23]:
temperature_data_fahrenheit = temperature_data * 9/5 + 32
discomfort_index = 0.8 * temperature_data_fahrenheit + 0.6 * humidity_data
discomfort_index = np.minimum(discomfort_index, 80)
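
To make the arithmetic concrete, here is a small worked example on made-up values, using the same conversion, the same weights (`0.8` and `0.6`), and the same cap as the cell above. It also shows that `np.minimum` leaves missing values as `NaN` rather than capping them:

```python
import numpy as np

temp_c = np.array([-10.0, 20.0, 40.0, np.nan])  # made-up sample values
humidity = np.array([50.0, 50.0, 50.0, 50.0])

temp_f = temp_c * 9 / 5 + 32            # 14, 68, 104, nan
index = 0.8 * temp_f + 0.6 * humidity   # about 41.2, 84.4, 113.2, nan
capped = np.minimum(index, 80)          # about 41.2, 80, 80, nan

print(capped)
```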


### 4. Analyze January Temperatures 
Extract the daily temperatures for January (first 31 days). Calculate and display the average January temperature across all 500 locations.

In [20]:
january_temps = temperature_data[:, :31]
# nanmean skips the NaN entries introduced in step 2; np.mean would return nan here
avg_january_temp = np.nanmean(january_temps, axis=1)
print("Average January temperature across locations:", np.nanmean(avg_january_temp))

Average January temperature across locations: 14.920665584343665
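
A single `NaN` in a slice makes `np.mean` return `nan` for that slice, whereas `np.nanmean` averages only the values that are present. A small illustration on made-up numbers:

```python
import numpy as np

temps = np.array([[10.0, np.nan, 20.0],   # one missing reading
                  [5.0, 15.0, 25.0]])     # complete row

plain = np.mean(temps, axis=1)       # first entry is nan because of the gap
skipping = np.nanmean(temps, axis=1) # gap ignored: both rows average to 15

print(plain)
print(skipping)
```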


### 5. Identify Extreme Temperatures 
Mark any temperature in `temperature_data` that exceeds `35°C` as a potential error by replacing it with a null value. Count the number of null values per location.

In [16]:
temperature_data[temperature_data > 35] = np.nan
extreme_temp_counts = np.isnan(temperature_data).sum(axis=1)
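
Note that the per-location count above includes values that were already missing before this step, not only the readings above `35°C`. If the two need to be reported separately, one possible sketch is to count the extremes before overwriting them; the small array and the names `already_missing` and `extreme` are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, assumed for this sketch
temps = rng.uniform(-10, 40, size=(5, 10))
temps[0, 0] = np.nan            # a reading that was already missing

already_missing = np.isnan(temps)
extreme = temps > 35            # NaN compares as False, so it is not flagged
extreme_per_location = extreme.sum(axis=1)

temps[extreme] = np.nan
total_nan_per_location = np.isnan(temps).sum(axis=1)

# Total NaNs per location = previously missing + newly flagged extremes
print(extreme_per_location, total_nan_per_location)
```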

### 6. Calculate Quarterly Temperature Averages 
Reshape `temperature_data` into four quarters (one per season) and calculate the average temperature for each location across these quarters.

In [25]:
quarters = np.split(temperature_data, [90, 181, 273], axis=1)

quarterly_avg_temps = [np.nanmean(q, axis=1) for q in quarters]
quarterly_avg_temps = np.array(quarterly_avg_temps).T 
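
Since `365` is not divisible by `4`, a plain `reshape` into four equal blocks is not possible, which is why the split indices are given explicitly. The indices `[90, 181, 273]` correspond to calendar quarters of a non-leap year (January to March, April to June, and so on). A quick check of the resulting block lengths:

```python
import numpy as np

days = np.arange(365).reshape(1, 365)  # one dummy location, day indices 0..364
quarters = np.split(days, [90, 181, 273], axis=1)

lengths = [q.shape[1] for q in quarters]
print(lengths)  # [90, 91, 92, 92]
```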

### 7. Classify Humidity Levels 
Classify each day’s humidity level as "`Dry`" if below `30%` and "`Humid`" if above `70%`, and count the total number of "`Dry`" and "`Humid`" days for each location.

In [26]:
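# "Moderate" is a placeholder label for 30-70% and missing readings, so they are not counted as "Humid"
humidity_classification = np.where(humidity_data < 30, "Dry", np.where(humidity_data > 70, "Humid", "Moderate"))
dry_days = np.sum(humidity_classification == "Dry", axis=1)
humid_days = np.sum(humidity_classification == "Humid", axis=1)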
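
If only the counts are needed, the string labels can be skipped and the boolean masks summed directly. A small sketch on made-up values, showing that readings between `30` and `70` (inclusive) and missing readings fall into neither class:

```python
import numpy as np

humidity = np.array([[10.0, 50.0, 90.0, np.nan],
                     [25.0, 29.9, 70.0, 71.0]])  # made-up sample values

dry_days = (humidity < 30).sum(axis=1)    # [1, 2]
humid_days = (humidity > 70).sum(axis=1)  # [1, 1]; 70.0 itself is not "above 70"

print(dry_days, humid_days)
```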

### 8. Apply Daily Pressure Trend to Temperature Data 
Account for daily atmospheric pressure variations by generating a trend across the `365` days and applying it to adjust daily temperatures at each location.

In [27]:
pressure_trend = np.linspace(-2, 2, 365)
temperature_data_adjusted = temperature_data + pressure_trend
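
The addition above relies on NumPy broadcasting: the `(365,)` trend is aligned with the last axis of the `(500, 365)` array, so each location receives the same day-by-day offset. A reduced sketch with 3 locations and 5 days (sizes chosen only for readability):

```python
import numpy as np

temps = np.zeros((3, 5))       # 3 locations, 5 days, all zeros for clarity
trend = np.linspace(-2, 2, 5)  # one offset per day: -2, -1, 0, 1, 2
adjusted = temps + trend       # shapes (3, 5) + (5,) broadcast to (3, 5)

print(adjusted.shape)
print(adjusted[0])
```

Every row of `adjusted` equals `trend` here because the starting temperatures are zero.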