
In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# PROJECT SUMMARY

**Weather & Air Quality Analysis Project**


This project explores a global dataset of weather conditions and air quality indicators (≈96k records), combining spatial, temporal, and environmental insights into a unified analysis.

**OBJECTIVES**

    Clean and preprocess weather & air quality data.
    
    Engineer features like day length and seasonal breakdowns.
    
    Explore global geospatial patterns in weather and pollution.
    
    Assess weather–pollution interactions (temperature, humidity, wind).
    
    Benchmark pollutant levels against WHO standards to identify hotspots.
    
    Build interactive visualizations for exploration.

    

**KEY ANALYSIS**

    Data Cleaning & Preparation
    
    Standardized time-based features (sunrise, sunset, moonrise, moonset).
    
    Removed redundancies (e.g., °C vs °F, mph vs kph).
    
    Identified and managed outliers (Chile’s extreme PM2.5 events).
    
    Engineered features like day length (hours) and seasonal groupings.
    
    Exploratory Data Analysis (EDA)
    
    Distributions: Histograms, KDE plots, and boxplots for pollutants & weather.
    
    Geospatial patterns: Choropleth maps (temperature, humidity, AQI) and hotspot detection.
    
    Seasonality: Time-series of temperature vs PM2.5 across months.
    
    Weather–Pollution Interaction:
    
    Correlation heatmaps (temp, humidity, AQI).
    
    Scatter plots of wind speed vs pollutants.
    
    Pollutant distribution by wind direction.
    
    Astronomical impacts: Relationship between day length, moon phase, and visibility/temperature.
    


**ADVANCED INSIGHTS**

    Identified countries exceeding WHO thresholds for PM2.5 and other pollutants.
    
    Found strong seasonal patterns (e.g., pollution spikes in dry months).
    
    Evidence of wind dispersal effects—higher wind speeds reduce PM2.5 concentrations.



**TOOLS AND LIBRARIES**

    Python (Pandas, NumPy) – data cleaning & feature engineering.
    
    Matplotlib / Seaborn / Plotly – static & interactive visualizations.
    
    GeoPandas / Folium – geospatial mapping.
    
    ipywidgets – dropdowns & interactive filtering in Jupyter Notebook.




**OUTCOMES**

    A reproducible notebook that combines statistical, temporal, and geospatial analysis.

    Clear identification of pollution hotspots and weather patterns.
    
    Interactive visualizations for exploring seasonal and spatial dynamics.
    
    A framework adaptable for climate research, urban planning, or environmental monitoring.

# **1. SETUP AND IMPORTS**

Note: Some visualizations in this notebook are interactive. Download the notebook and run it to interact with them.

In [None]:
# Import the core libraries. Some libraries are imported later, at the point of use, to show what they are used for.

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns #for visualization
import geopandas as gpd #map visuals
import ipywidgets as widgets
from ipywidgets import interact



In [None]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)  #To ignore every "futurewarning"

In [None]:
#importing the data

df = pd.read_csv("/kaggle/input/global-weather-repository/GlobalWeatherRepository.csv") #reads the CSV data into a pandas DataFrame
df.head() #shows the first five rows of the dataframe by default

# DATA UNDERSTANDING

Next we need to check the shape of our data to know how many rows and columns we are dealing with here.


In [None]:
#Next we need to check the shape of our data to know how many rows and columns we are dealing with here and also see if there are duplicates or missing values


def data_details(df):
    shapes = df.shape #stores a tuple that corresponds with the number of rows and cols in the dataset
    numerical_columns = df.select_dtypes(include="number").shape[1] #stores the number of cols in the data that are numerical
    non_numerical_columns =  df.select_dtypes(exclude="number").shape[1] #stores the number of cols in the data that are non numerical
    total_distinct_countries = df["country"].nunique() #stores the number of distinct country names in the dataset
    missing = df.isnull().sum() #stores the number of null values in each column
    missing_filtered = missing[missing > 0] #keeps only the columns that contain null values
    if not missing_filtered.empty:
        missing_report = (missing_filtered)
    else:
        missing_report = ("No missing values found.")
    
    duplicate_report = df.duplicated().sum()

    print(f"Total missing values = {missing_report}\nTotal duplicates = {duplicate_report}\nNo of rows = {shapes[0]}\nNo of columns = {shapes[1]} \nNumerical columns = {numerical_columns}\nNon-numerical columns = {non_numerical_columns}\nTotal Number of Countries = {total_distinct_countries}")

data_details(df)

In [None]:
#let's see the names of the columns and their respective types

df.info()

# DATA CLEANING

In [None]:
"""
While exploring the data, some issues were highlighted such as:
Misspelt country names and location names
The time column is in object type but it should be in datetime so time series information can be efficiently deduced
"""


def time_normalization(df):
    """
    Converts the last_updated column to a datetime type, then drops rows whose
    timestamp could not be parsed and any fully duplicated rows
    """
    df['last_updated'] = pd.to_datetime(df['last_updated'], errors='coerce')
    df = df.dropna(subset=['last_updated'])
    df = df.drop_duplicates()
    return df

def country_correction(df): #to correct misspelt or non-English country names
    country_fix = {
        'كولومبيا': 'Colombia',
        '火鸡': 'Turkey',
        'Польша': 'Poland',
        'Турция': 'Turkey',
        'Jemen': 'Yemen',
        'Turkménistan': 'Turkmenistan',
        'Bélgica': 'Belgium',
        'Südkorea': 'South Korea',
        'Marrocos': 'Morocco',
        'Inde': 'India',
        'Polônia': 'Poland',
        'Mexique': 'Mexico',
        'Saint-Vincent-et-les-Grenadines': 'Saint Vincent and the Grenadines',
        'Saudi Arabien': 'Saudi Arabia',
        'Letonia': 'Latvia',
        'Estonie': 'Estonia',
        'Komoren': 'Comoros',
        'Malásia': 'Malaysia',
        'USA United States of America': 'United States of America',
        'Congo': 'Democratic Republic of Congo',
        'Гватемала': 'Guatemala'
    }
    df['country'] = df['country'].replace(country_fix)
    return df
    
def location_correction(df): #to correct misspelt location names
    location_fix ={
    "'S Gravenjansdyk": "'S Gravenjansdijk",
    "'S Gravenstaffel": "'S Gravenjansdijk",
    "Phnum Penh": "Phnom Penh",
    "Beijing Shi": "Beijing",
    "Addis Abeba": "Addis Ababa",
    "New Guatemala": "Guatemala City",
    "Kuwait": "Kuwait City",
    "Mexico (Grupo Mexico)": "Mexico City",
    "City Of San Marino": "San Marino",
    "Ar Riyadh": "Riyadh",
    "Nuku'alofa": "Nuku`Aloia",
    "-Kingdom": "Ankara"
    }
    df['location_name'] = df['location_name'].replace(location_fix)
    return df
    
def clean_weather_data(df): #returns the cleaned dataset
    df = time_normalization(df)
    df = country_correction(df)
    df = location_correction(df)
    return df



df = clean_weather_data(df)
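
One way to check that the mapping caught every foreign-script or accented spelling is to list the country names that still contain non-ASCII characters. The sketch below runs on a small made-up frame rather than the real data, and `non_ascii_names` is a hypothetical helper that is not part of this notebook. Legitimate names with accents would also be flagged, so the output is a list to review by hand, not a list of errors.

```python
import pandas as pd

def non_ascii_names(names):
    """Return the sorted unique names that contain any non-ASCII character."""
    unique = pd.Series(names).dropna().astype(str).unique()
    return sorted(name for name in unique if not name.isascii())

# Toy data standing in for df["country"] (values are illustrative only)
sample = pd.DataFrame({"country": ["Poland", "Польша", "Belgium", "Bélgica", "Poland"]})

leftover = non_ascii_names(sample["country"])
print(leftover)
```

Running the same helper on `df["country"]` before and after `country_correction` would show which spellings the dictionary still misses.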

In [None]:
#let's see if the changes we made took effect

df.head(10)

In [None]:
"""
To make use of the sunrise and sunset columns, we parse them as datetimes anchored to the date in the last_updated column.
Some columns are also derived from existing ones, e.g. the year, month name, day name and day length in hours.
"""


def sun_moon(df):
    # Helper function to safely combine date + time
    def safe_to_datetime(date_series, time_series):
        return pd.to_datetime(
            date_series.astype(str) + " " + time_series.astype(str),
            format="%Y-%m-%d %I:%M %p",  # enforce consistent AM/PM parsing
            errors="coerce"              # turn "No moonrise"/bad values into NaT
        )

    # Combine last_updated date with times
    df["sunrise_dt"] = safe_to_datetime(df["last_updated"].dt.date, df["sunrise"])
    df["sunset_dt"]  = safe_to_datetime(df["last_updated"].dt.date, df["sunset"])

    # Day length in hours; rows with an unparseable sunrise or sunset become NaN
    df["day_length_hours"] = (df["sunset_dt"] - df["sunrise_dt"]).dt.total_seconds() / 3600

    return df

def year_month_day_extraction(df):
    df["year"] = df["last_updated"].dt.year
    df["month_name"] = df["last_updated"].dt.strftime("%B")
    df["day_name"] = df["last_updated"].dt.strftime("%A")
    return df

def drop_redundant(df):
    df.drop(["sunrise",
             "sunset",
             "moonrise",
             "moonset", 
             "last_updated_epoch", 
             "temperature_fahrenheit", 
             "wind_kph", 
             "pressure_in", 
             "precip_in", 
             "visibility_miles", 
             "feels_like_fahrenheit", 
             "gust_kph"],
            axis=1, 
            inplace=True
           )
    return df

def columns_creation(df):
    df = sun_moon(df)
    df = year_month_day_extraction(df)
    df = drop_redundant(df)
    return df

# Apply
df = columns_creation(df)

#let's see if the necessary columns have been dropped
data_details(df)



In [None]:
#let's see if we are good to go

df.head()
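
The day-length feature can be checked by hand on a couple of rows. The sketch below repeats the same parse-and-subtract steps on a made-up frame; the timestamps are invented, and the `"No sunset"` string is an assumed placeholder in the style of the "No moonrise" values mentioned in the code comment above.

```python
import pandas as pd

# Made-up rows with the same column names as the dataset
toy = pd.DataFrame({
    "last_updated": pd.to_datetime(["2024-05-16 13:15", "2024-05-16 09:45"]),
    "sunrise": ["05:30 AM", "06:10 AM"],
    "sunset": ["07:00 PM", "No sunset"],  # second value is deliberately unparseable
})

# Attach the calendar date to the 12-hour clock strings, then parse both
date_part = toy["last_updated"].dt.strftime("%Y-%m-%d")
fmt = "%Y-%m-%d %I:%M %p"
sunrise_dt = pd.to_datetime(date_part + " " + toy["sunrise"], format=fmt, errors="coerce")
sunset_dt = pd.to_datetime(date_part + " " + toy["sunset"], format=fmt, errors="coerce")

# Difference in hours; an unparseable time yields NaT and therefore NaN
toy["day_length_hours"] = (sunset_dt - sunrise_dt).dt.total_seconds() / 3600
print(toy[["sunrise", "sunset", "day_length_hours"]])
```

The first row runs from 05:30 to 19:00, which is 13.5 hours, and the second row is left missing instead of raising an error.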

# DETECTING AND TREATING OUTLIERS


In [None]:
#Let's generate a boxplot across all numerical columns for a quick look at whether our data really has outliers

import matplotlib.pyplot as plt

num_cols = df.select_dtypes(include="number")

plt.figure(figsize=(12, 6))
num_cols.boxplot(rot=90)
plt.title("Outlier Detection Across Numeric Columns")
plt.show()
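
The boxplot gives a visual impression only. To put a number on it, one option is to count the points that fall outside the whiskers, which Matplotlib draws at 1.5 times the interquartile range by default. The sketch below applies that rule to a made-up frame; `iqr_outlier_counts` and the toy column values are illustrative and not part of this notebook.

```python
import pandas as pd

def iqr_outlier_counts(frame, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each numeric column."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column thresholds with the frame's columns
    outside = (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)
    return outside.sum().sort_values(ascending=False)

# Made-up values: one extreme PM2.5 reading, no extreme humidity reading
toy = pd.DataFrame({
    "pm25": [10, 12, 11, 13, 12, 300],
    "humidity": [40, 45, 50, 55, 60, 65],
})

counts = iqr_outlier_counts(toy)
print(counts)
```

Calling `iqr_outlier_counts(df)` on the cleaned data would rank the real columns by how many flagged points they contain.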

In [None]:
"""
The boxplot shows many outliers, so let's look at them more closely.
First choose "All" or a particular country name to see how each numerical column is distributed.
A second dropdown, populated with the 10 most skewed columns, then shows the raw and log1p-transformed distribution of the chosen column.
"""

from ipywidgets import interact
from IPython.display import display  # display() is used below to render the tables
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# --- First part: skewness per country ---
@interact(selected_country = ["All"] + sorted(df["country"].unique()))
def kde_by_country(selected_country):
    if selected_country == "All":
        filtered_df = df.copy()
    else:
        filtered_df = df[df["country"] == selected_country]
    
    numerical_columns = filtered_df.select_dtypes('number').drop(["year","latitude","longitude"], axis=1)
    
    # KDE plots
    fig, axes = plt.subplots(7, 3, figsize=(15, 15))
    axes = axes.flatten()
    for i, column in enumerate(numerical_columns.columns):
        sns.kdeplot(data=numerical_columns, x=column, fill=True, color='skyblue', ax=axes[i])
        axes[i].set_title(column)
    for j in range(len(numerical_columns.columns), len(axes)):
        axes[j].axis('off')
    plt.tight_layout()
    plt.show()

    # Skewness table
    skewness = numerical_columns.skew().sort_values(ascending=False)
    skewness_df = skewness.reset_index()
    skewness_df.columns = ["Feature", "Skewness"]

    display(skewness_df)

    # Take top 10 most skewed features
    top_skewed = skewness_df.head(10)["Feature"].tolist()

    # --- Second part: deeper analysis tied to top skewed ---
    @interact(selected_feature=top_skewed)
    def skewness_analysis(selected_feature):
        fig, axes = plt.subplots(1, 3, figsize=(20, 6))

        # 1. Histogram (raw)
        sns.histplot(filtered_df[selected_feature], bins=50, kde=True, ax=axes[0])
        axes[0].set_title(f"{selected_feature} Distribution (Raw)")

        # 2. Histogram (log-transformed)
        sns.histplot(np.log1p(filtered_df[selected_feature]), bins=50, kde=True, ax=axes[1], color="orange")
        axes[1].set_title(f"{selected_feature} Distribution (Log1p)")

        # 3. Boxplot
        sns.boxplot(x=filtered_df[selected_feature], ax=axes[2], color="red")
        axes[2].set_title(f"{selected_feature} Boxplot")

        plt.tight_layout()
        plt.show()

        # Outlier countries
        threshold = filtered_df[selected_feature].quantile(0.95)
        outlier_countries = (
            filtered_df[filtered_df[selected_feature] > threshold]
            .groupby("country")[selected_feature]
            .count()
            .sort_values(ascending=False)
        )
        print(f"Countries contributing most to {selected_feature} skewness:")
        display(outlier_countries.head(10))
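
The middle panel above relies on `np.log1p` to pull in a long right tail. A small numeric check of that effect, on made-up values rather than the dataset, might look like this:

```python
import numpy as np
import pandas as pd

# Made-up right-skewed values: mostly small, with two large readings
values = pd.Series([1, 2, 2, 3, 3, 3, 4, 5, 50, 400], dtype=float)

raw_skew = values.skew()
log_skew = np.log1p(values).skew()  # log1p(x) = log(1 + x), defined at x = 0

print(f"raw skewness:   {raw_skew:.2f}")
print(f"log1p skewness: {log_skew:.2f}")
```

The transformed series is still right-skewed, but less so than the raw one. `log1p` is only defined for values greater than -1, so columns that can go below that, such as temperature in °C, are not suitable for this transform.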


# EDA

In [None]:
#Top 10 COUNTRIES RANKING (basic_features)

import ipywidgets as widgets
from IPython.display import display

def top_hot_cold(df, mode="Current Month"):
    # "Current Month" is taken as the most recent month present in the data;
    # filtering on today's date returns no rows once the dataset is older than that
    if mode == "Current Month":
        latest = df["last_updated"].max()
        data = df[(df['last_updated'].dt.month == latest.month) &
                  (df['last_updated'].dt.year == latest.year)]

    else:
        # All months = use full dataset
        data = df.copy()

    # Group and compute averages
    top_hot = data.groupby("country")["temperature_celsius"].mean().sort_values(ascending=False).head(10)
    top_cold = data.groupby("country")["temperature_celsius"].mean().sort_values(ascending=True).head(10)
    longest_days = data.groupby("country")["day_length_hours"].mean().sort_values(ascending=False).head(10)
    pm_2_5 = data.groupby("country")["air_quality_PM2.5"].mean().sort_values(ascending=False).head(10)

    # Combine into one DataFrame
    combined_df = pd.DataFrame({
        'Cold(c)': top_cold.index, 
        'Mean_Temp_Cold': top_cold.values,
        'Hot(c)': top_hot.index, 
        'Mean_Temp_Hot': top_hot.values,
        "Day_length" : longest_days.index,
        "Mean_Length" : longest_days.values,
        "PM2.5" : pm_2_5.index,
        "Mean2.5" : pm_2_5.values
        
    }, index=pd.Index(range(1, 11), name='Rank'))

    display(combined_df)
    

# Dropdown widget
mode_dropdown = widgets.Dropdown(
    options=["Current Month", "All Months Average"],
    value="Current Month",
    description="View:"
)

widgets.interact(lambda mode: top_hot_cold(df, mode), mode=mode_dropdown)

In [None]:

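# Load country polygons from the Natural Earth low-resolution dataset.
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0, so this call needs an older GeoPandas version.
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))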

# Compute average per country
d_count= ["latitude", "longitude", "year"]
wanted = df.select_dtypes(include="number").drop(d_count, axis=1)
wanted["country"] = df["country"]
avg_aqi = wanted.groupby("country", as_index=False).mean()

# Merge with shapefile
world = world.merge(avg_aqi, left_on="name", right_on="country", how="left")


excluded = ["name", "country", "iso_a3", "geometry", "pop_est", "gdp_md_est", "continent"]

features = [col for col in world.columns if col not in excluded]

def plot_feature(feature):
    if feature not in world.columns:
        print(f"❌ Column '{feature}' not found in world DataFrame. Available: {list(world.columns)}")
        return
    
    fig, ax = plt.subplots(1, 1, figsize=(14, 8))

    world.boundary.plot(ax=ax, linewidth=0.8, color="black")
    world.plot(
        column=feature,
        cmap="OrRd",
        legend=True,
        ax=ax,
        missing_kwds={
            "color": "lightgrey",
            "edgecolor": "white",
            "hatch": "///",
            "label": "No data",
        }
    )

    ax.set_title(f"Average {feature.replace('_',' ').title()} by Country",
                 fontsize=16, fontweight="bold")
    ax.axis("off")
    plt.show()

interact(
    plot_feature,
    feature=widgets.Dropdown(options=features, value=features[0], description="Feature:")
)
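
Countries whose names differ between the weather data and the map file drop out of the left merge silently and are drawn as "No data". A quick way to find them is to merge in the other direction with `indicator=True`. The sketch below uses two made-up frames in place of `avg_aqi` and `world`; the names and values are illustrative only.

```python
import pandas as pd

# Toy stand-ins: `shapes` plays the role of the map's name column,
# `averages` the per-country means (all values are made up)
shapes = pd.DataFrame({"name": ["France", "Dem. Rep. Congo", "Kenya"]})
averages = pd.DataFrame({
    "country": ["France", "Democratic Republic of Congo", "Kenya"],
    "temperature_celsius": [14.0, 26.5, 22.1],
})

# indicator=True adds a _merge column saying where each row was found
check = averages.merge(shapes, left_on="country", right_on="name",
                       how="left", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "country"].tolist()
print(unmatched)
```

Any name in `unmatched` needs an extra entry in the country mapping, or a rename on the map side, before it can appear on the choropleth.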



In [None]:
#Let's see if there is any correlation among air pollutants

# Select pollutants
pollutants = ["air_quality_PM2.5", "air_quality_PM10", "air_quality_Nitrogen_dioxide", 
              "air_quality_Sulphur_dioxide", "air_quality_Ozone", "air_quality_Carbon_Monoxide"]

# Correlation matrix
corr = df[pollutants].corr()

# Heatmap
plt.figure(figsize=(8,6))
sns.heatmap(corr, annot=True, cmap="coolwarm", fmt=".2f")
plt.title("Correlation Among Air Pollutants")
plt.show()



In [None]:
#Let's see how the air quality parameters compare with WHO limits

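# WHO 2021 air quality guideline levels in µg/m³: 24-hour means, except ozone (8-hour mean)
who_limits = {
    "air_quality_PM2.5": 15,
    "air_quality_PM10": 45,
    "air_quality_Nitrogen_dioxide": 25,
    "air_quality_Sulphur_dioxide": 40,
    "air_quality_Ozone": 100,
    "air_quality_Carbon_Monoxide": 4000  # 4 mg/m³ expressed in µg/m³
}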

for pollutant, limit in who_limits.items():
    exceedance = (df[pollutant] > limit).mean() * 100
    print(f"{pollutant}: {exceedance:.1f}% of records exceed WHO guideline")
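
The loop above gives one global percentage per pollutant. The same comparison can be broken down by country to see where the exceedances come from. The sketch below does this for PM2.5 on a made-up frame; the country labels and readings are invented for illustration.

```python
import pandas as pd

limit = 15  # 24-hour PM2.5 guideline used above, in µg/m³

# Made-up readings for two placeholder countries
toy = pd.DataFrame({
    "country": ["A", "A", "A", "A", "B", "B"],
    "air_quality_PM2.5": [10.0, 20.0, 30.0, 5.0, 8.0, 12.0],
})

# Share of records above the limit, per country, as a percentage
share = (
    toy["air_quality_PM2.5"].gt(limit)
    .groupby(toy["country"])
    .mean()
    .mul(100)
    .sort_values(ascending=False)
)
print(share)
```

Country A exceeds the limit in two of its four records (50%), while country B never does.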


In [None]:
#Let's see which countries exceed the limits the most

from plotly.subplots import make_subplots
import plotly.graph_objects as go


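# --- WHO 2021 guideline levels in µg/m³ (kept for reference; not used in the plot below) ---
WHO_PM25_LIMIT = 5     # annual mean
WHO_PM10_LIMIT = 15    # annual mean
WHO_NO2_LIMIT = 10     # annual mean
WHO_SO2_LIMIT = 40     # 24-hour mean
WHO_O3_LIMIT = 100     # 8-hour mean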

# --- Aggregate by country ---
country_avg = df.groupby("country").agg({
    "air_quality_PM2.5": "mean",
    "air_quality_PM10": "mean",
    "air_quality_Nitrogen_dioxide": "mean",
    "air_quality_Sulphur_dioxide": "mean",
    "air_quality_Ozone": "mean"
}).reset_index()

# --- Merge with world map ---
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
world = world.merge(country_avg, left_on="name", right_on="country", how="left")

# --- Pollutants dict ---
pollutants = {
    "PM2.5 (µg/m³)": "air_quality_PM2.5",
    "PM10 (µg/m³)": "air_quality_PM10",
    "NO₂ (µg/m³)": "air_quality_Nitrogen_dioxide",
    "SO₂ (µg/m³)": "air_quality_Sulphur_dioxide",
    "O₃ (µg/m³)": "air_quality_Ozone"
}

# --- Base figure with 2 subplots: map + bar chart ---
fig = make_subplots(
    rows=1, cols=2,
    column_widths=[0.7, 0.3],
    subplot_titles=("Global Pollution Hotspots", "Top 5 Violators"),
    specs=[[{"type": "choropleth"}, {"type": "bar"}]]
)

# Default pollutant (PM2.5)
default_col = list(pollutants.values())[0]

# Choropleth map
choropleth = go.Choropleth(
    geojson=world.__geo_interface__,
    locations=world.index,
    z=world[default_col],
    colorscale="Reds",
    colorbar_title="µg/m³",
    marker_line_color="black",
    marker_line_width=0.2
)

# Top 5 bar chart
top5 = country_avg.nlargest(5, default_col)
bar = go.Bar(
    x=top5[default_col],
    y=top5["country"],
    orientation="h",
    marker=dict(color="crimson")
)

# Add traces
fig.add_trace(choropleth, row=1, col=1)
fig.add_trace(bar, row=1, col=2)

# --- Dropdown menu ---
buttons = []
for label, col in pollutants.items():
    # Top 5 for this pollutant
    top5 = country_avg.nlargest(5, col)
    buttons.append(dict(
        method="update",
        label=label,
        args=[
            {
                "z": [world[col], top5[col]], 
                "x": [None, top5[col]], 
                "y": [None, top5["country"]]
            },
            {"title": f"Global Air Quality Hotspots – {label}"}
        ]
    ))

fig.update_layout(
    geo=dict(projection_type="natural earth"),
    updatemenus=[dict(
        active=0,
        buttons=buttons,
        x=0.1, y=1.1,
        xanchor="left", yanchor="top"
    )],
    height=600, width=1100,
    showlegend=False
)

fig.show()


In [None]:
#Let's see if there are correlations between weather, air quality and location

# Correlate every numeric column except the year
col_to_corr = df.select_dtypes("number").drop(["year"], axis = 1)
corr_matrix = col_to_corr.corr()

# Heatmap
plt.figure(figsize=(15,9))
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm", fmt=".2f")
plt.title("Correlation: Weather vs AirQuality vs Location")
plt.show()
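
A heatmap this large is hard to read cell by cell. One option is to list the strongest pairs directly from the correlation matrix. The sketch below shows the idea on a made-up frame; `top_correlations` is a hypothetical helper, not part of this notebook.

```python
import numpy as np
import pandas as pd

def top_correlations(frame, n=5):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = frame.corr(numeric_only=True)
    # Keep the upper triangle only, so each pair appears once and the diagonal is dropped
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

# Made-up values: pm25 is an exact multiple of temp, wind is loosely opposed
toy = pd.DataFrame({
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pm25": [2.0, 4.0, 6.0, 8.0, 10.0],
    "wind": [5.0, 3.0, 4.0, 1.0, 2.0],
})

top = top_correlations(toy, n=1)
print(top)
```

Applied to `col_to_corr`, the helper would return a short ranked list that can be read alongside the heatmap.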



**Time Series Analysis: Temperature vs Air Quality Parameters**


In [None]:
import pandas as pd
import plotly.graph_objects as go

# --- Ensure datetime ---
df["last_updated"] = pd.to_datetime(df["last_updated"])

# --- Function to resample ---
def aggregate_data(freq, country="All", param="air_quality_PM2.5"):
    if country != "All":
        temp_df = df[df["country"] == country]
    else:
        temp_df = df.copy()
    
    resampled = temp_df.resample(freq, on="last_updated").agg({
        "temperature_celsius": "mean",
        param: "mean"
    }).reset_index()
    
    return resampled

# --- Initial defaults ---
param = "air_quality_PM2.5"
country = "All"
freq = "D"  # Daily by default
data = aggregate_data(freq, country, param)

# --- Base Figure ---
fig = go.Figure()

# Temperature trace
fig.add_trace(go.Scatter(
    x=data["last_updated"], y=data["temperature_celsius"],
    mode="lines+markers", name="Temperature (°C)", line=dict(color="red")
))

# AQI trace
fig.add_trace(go.Scatter(
    x=data["last_updated"], y=data[param],
    mode="lines+markers", name=param, line=dict(color="blue"), yaxis="y2"
))

# Layout with dual y-axes
fig.update_layout(
    title="Weather–Pollution Interaction (Interactive)",
    xaxis=dict(title="Date"),
    yaxis=dict(title="Temperature (°C)", color="red"),
    yaxis2=dict(title=f"{param} (µg/m³)", overlaying="y", side="right", color="blue"),
    height=600
)

# --- Dropdown options ---
countries = ["All"] + sorted(df["country"].dropna().unique().tolist())
aq_params = ["air_quality_PM2.5", "air_quality_PM10",
             "air_quality_Nitrogen_dioxide",
             "air_quality_Sulphur_dioxide", "air_quality_Ozone"]

# Each menu changes one dimension only; the other two stay at the defaults set above,
# because the button payloads are precomputed when the figure is built.
def trace_update(freq, country, param):
    """Aggregate once and return the x/y/name update for both traces."""
    d = aggregate_data(freq, country, param)
    return {"x": [d["last_updated"], d["last_updated"]],
            "y": [d["temperature_celsius"], d[param]],
            "name": ["Temperature (°C)", param]}

# 1) Frequency buttons
# Dotted layout keys update only the axis title and leave the other axis settings untouched
freq_buttons = [
    dict(label=label, method="update",
         args=[trace_update(f, country, param), {"xaxis.title.text": axis_title}])
    for label, f, axis_title in [("Daily", "D", "Date"), ("Weekly", "W", "Week"), ("Monthly", "M", "Month")]
]

# 2) Parameter buttons
param_buttons = [
    dict(label=p, method="update",
         args=[trace_update(freq, country, p), {"yaxis2.title.text": f"{p} (µg/m³)"}])
    for p in aq_params
]

# 3) Country buttons
country_buttons = [
    dict(label=c, method="update", args=[trace_update(freq, c, param)])
    for c in countries
]

# --- Attach dropdowns ---
fig.update_layout(
    updatemenus=[
        dict(buttons=freq_buttons, direction="down", showactive=True, x=0.0, y=1.15, xanchor="left", yanchor="top"),
        dict(buttons=param_buttons, direction="down", showactive=True, x=0.25, y=1.15, xanchor="left", yanchor="top"),
        dict(buttons=country_buttons, direction="down", showactive=True, x=0.55, y=1.15, xanchor="left", yanchor="top"),
    ]
)

fig.show()
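
The `aggregate_data` function above depends on `DataFrame.resample` with the `on` argument, which groups rows into calendar bins using a datetime column instead of the index. A minimal sketch on three made-up readings shows what the daily aggregation returns:

```python
import pandas as pd

# Made-up readings: two on 16 May, one on 17 May
toy = pd.DataFrame({
    "last_updated": pd.to_datetime([
        "2024-05-16 08:00", "2024-05-16 20:00", "2024-05-17 08:00",
    ]),
    "temperature_celsius": [10.0, 20.0, 18.0],
    "air_quality_PM2.5": [30.0, 50.0, 12.0],
})

# One row per calendar day, holding the mean of each column
daily = (
    toy.resample("D", on="last_updated")[["temperature_celsius", "air_quality_PM2.5"]]
    .mean()
    .reset_index()
)
print(daily)
```

The two readings from 16 May collapse into one row with a mean temperature of 15 °C and a mean PM2.5 of 40 µg/m³. Days without any readings would appear as rows of NaN, which the line plot shows as gaps.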
