Scenario: You are a data analyst working with a global weather dataset. Your task is to analyze temperature trends and visualize the results.

Tasks:

1. Data Preparation:

Hint 1: Use np.random.uniform(low, high, size) to generate the temperature data.
Hint 2: Create a DataFrame using pd.DataFrame(data, index, columns) with appropriate index and columns.

Use NumPy to generate a synthetic dataset of average monthly temperatures (in degrees Celsius) for 10 cities over 12 months. The temperatures should range from -5 to 35 °C.

Convert this NumPy array into a Pandas DataFrame, with city names as the index and months as the columns.
2. Data Analysis:

Hint 1: Calculate the annual average temperature using DataFrame.mean(axis).
Hint 2: Find the cities with the highest and lowest average temperatures using the idxmax() and idxmin() methods.

Calculate the annual average temperature for each city.

Identify the cities with the highest and lowest annual average temperatures.
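As a small illustration of the two hints, the sketch below uses a made-up 2x3 table (the city labels "A" and "B" and all values are hypothetical, not the assignment data). It shows that axis=1 averages across the columns, giving one value per row, and that idxmax() and idxmin() return index labels rather than positions.

```python
import pandas as pd

# Hypothetical toy data: two cities, three months (not the assignment dataset).
toy = pd.DataFrame(
    [[10.0, 20.0, 30.0], [0.0, 5.0, 10.0]],
    index=["A", "B"],
    columns=["Jan", "Feb", "Mar"],
)

row_means = toy.mean(axis=1)  # one mean per city (averages across the month columns)
col_means = toy.mean(axis=0)  # one mean per month (averages across the cities)

warmest = row_means.idxmax()  # index label of the largest per-city mean
coldest = row_means.idxmin()  # index label of the smallest per-city mean

print(row_means, col_means, warmest, coldest, sep="\n")
```

For the annual average per city, axis=1 is the direction that matches a table with cities as rows and months as columns.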
3. Data Visualization:


Deliverables:
A Jupyter Notebook containing all the code for data generation, analysis, and visualization.
A brief report within the notebook summarizing your findings, including the city with the highest and lowest average temperatures and any interesting trends observed in the data.


In [28]:
import numpy as np
import pandas as pd

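# Fix the seed so the synthetic data are reproducible.
np.random.seed(42)
# 10 cities (rows) x 12 months (columns), drawn uniformly from [-5, 35) °C.
temperatures = np.random.uniform(-5, 35, size=(10, 12))
temperatures = np.round(temperatures, 1)
city_names = [f"City_{i+1}" for i in range(10)]
month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temperature_df = pd.DataFrame(temperatures, index=city_names, columns=month_names)

# axis=1 averages across the month columns, giving one value per city.
annual_avg_temp = temperature_df.mean(axis=1)

# idxmax()/idxmin() return the index labels (city names) of the extremes.
city_highest_temp = annual_avg_temp.idxmax()
city_lowest_temp = annual_avg_temp.idxmin()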

print("Temperature DataFrame:")
print(temperature_df)
print("\nAnnual Average Temperatures:")
print(annual_avg_temp)
print(f"\nCity with Highest Average Temperature: {city_highest_temp}")
print(f"City with Lowest Average Temperature: {city_lowest_temp}")


Temperature DataFrame:
          Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov  \
City_1   10.0  33.0  24.3  18.9   1.2   1.2  -2.7  29.6  19.0  23.3  -4.2   
City_2   28.3   3.5   2.3   2.3   7.2  16.0  12.3   6.6  19.5   0.6   6.7   
City_3   13.2  26.4   3.0  15.6  18.7  -3.1  19.3   1.8  -2.4  33.0  33.6   
City_4    7.2  -1.1  22.4  12.6  -0.1  14.8  -3.6  31.4   5.4  21.5   7.5   
City_5   16.9   2.4  33.8  26.0  32.6  30.8  18.9  31.9  -1.5   2.8  -3.2   
City_6   10.5   5.9  28.1   9.3   6.2  16.7   0.6  27.1  -2.0  34.5  25.9   
City_7   -4.8  27.6  23.3  24.2  25.9  -2.0   9.3  -0.4  29.5  19.9   8.2   
City_8    7.4   8.0  24.2  20.5  30.5  13.9  -0.2  23.5  25.4  17.5  25.8   
City_9   15.9  12.1  -4.0  -0.7  -3.7  20.5   7.6  15.3  31.3   5.0  11.4   
City_10   4.2  -1.9   6.6   1.4  32.2  27.3  20.3  29.9  27.1   2.5  30.7   

          Dec  
City_1   33.8  
City_2    9.7  
City_3   27.3  
City_4   15.8  
City_5    8.0  
City_6    2.9  
City_7   -2.5  
C

In [29]:
import plotly.express as px

# Reshape from wide (one column per month) to long (one row per city-month),
# the layout px.line needs for x, y, and color.
temperature_long = (
    temperature_df.rename_axis("City")
    .reset_index()
    .melt(id_vars="City", var_name="Month", value_name="Temperature")
)

fig = px.line(
    temperature_long,
    x="Month",
    y="Temperature",
    color="City",
    title="Monthly Temperatures Across Cities",
    labels={"Temperature": "Temperature (°C)", "Month": "Month", "City": "City"},
)

fig.update_layout(
    template="plotly_white",
    xaxis=dict(title="Month", tickmode="array", tickvals=month_names),
    yaxis_title="Temperature (°C)",
    legend_title="City"
)
fig.show()
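
The px.line call above refers to Month, Temperature, and City as column names, so it needs long-form data: one row per city-month observation. A minimal sketch of that reshape on a hypothetical 2x2 frame (labels and values are illustrative only) makes the resulting layout easier to see:

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per month.
wide = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=["A", "B"],
    columns=["Jan", "Feb"],
)

# Name the index, turn it into a column, then stack the month columns into rows.
long_df = (
    wide.rename_axis("City")
    .reset_index()
    .melt(id_vars="City", var_name="Month", value_name="Temperature")
)

print(long_df)
```

melt stacks the month columns one after another, so the long table has (number of cities) x (number of months) rows and keeps the months in their original column order.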


City with Highest Average Temperature: City_8
City with Lowest Average Temperature: City_2
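City with Highest Single Monthly Temperature: City_6 (34.5 °C in Oct)
City with Lowest Single Monthly Temperature: City_7 (-4.8 °C in Jan)
Anomalies: Cities 1, 4, and 8 have sub-zero temperatures in July (-2.7, -3.6, and -0.2 °C), and City_1 exceeds 30 °C in December (33.8 °C). Such values are expected here because every reading is drawn independently from the same uniform range, so the data carry no seasonal pattern.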
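
The last three findings are not produced by the code cells shown above. One way they could be checked is sketched below. It rebuilds the same seeded DataFrame so that it runs on its own, and the names city_max, city_min, july_below_zero, and december_above_30 are illustrative rather than taken from the notebook.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame as in the data-preparation cell (same seed, same shape).
np.random.seed(42)
temperatures = np.round(np.random.uniform(-5, 35, size=(10, 12)), 1)
city_names = [f"City_{i+1}" for i in range(10)]
month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temperature_df = pd.DataFrame(temperatures, index=city_names, columns=month_names)

# City holding the single hottest and the single coldest monthly reading.
city_max = temperature_df.max(axis=1).idxmax()
city_min = temperature_df.min(axis=1).idxmin()

# Cities below 0 °C in July, and cities above 30 °C in December.
july_below_zero = temperature_df.index[temperature_df["Jul"] < 0].tolist()
december_above_30 = temperature_df.index[temperature_df["Dec"] > 30].tolist()

print(f"Highest single reading: {city_max}")
print(f"Lowest single reading: {city_min}")
print(f"Below 0 °C in July: {july_below_zero}")
print(f"Above 30 °C in December: {december_above_30}")
```

Computing these lists in code rather than reading them off the printed table avoids overlooking values close to a threshold.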