Weather data:

As with New Zealand, the following stations were chosen to represent the UK:


| Station             | Coordinates (lat, lon) | Location / Region           | Climate Representation   |
| ------------------- | ---------------------- | --------------------------- | ------------------------ |
| Heathrow Airport    | 51.4700, -0.4543       | London / South England      | Mild Temperate (Lowland) |
| Edinburgh Gogarbank | 55.9220, -3.3680       | Southeast Scotland          | Cool Temperate           |
| Cardiff Bute Park   | 51.4810, -3.1810       | South Wales                 | Maritime / Wet           |
| Belfast Aldergrove  | 54.6550, -6.2150       | Northern Ireland            | Maritime / Cloudy        |
| Shap Fell           | 54.5410, -2.5800       | Northwest England (Uplands) | Cool, Wet, Mountainous   |


Data were obtained from Open-Meteo (open-meteo.com).
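For reference, a download of this kind could be sketched as below. This is only a sketch under stated assumptions, not the query actually used for this notebook: the endpoint (Open-Meteo's historical archive API), the `format=csv` option, the hourly variable names, the date range, and the helper name `build_archive_url` are all assumptions made for illustration. Only the coordinates come from the table above.

```python
from urllib.parse import urlencode

# Station coordinates (lat, lon) taken from the table above.
STATIONS = {
    "Heathrow Airport": (51.4700, -0.4543),
    "Edinburgh Gogarbank": (55.9220, -3.3680),
    "Cardiff Bute Park": (51.4810, -3.1810),
    "Belfast Aldergrove": (54.6550, -6.2150),
    "Shap Fell": (54.5410, -2.5800),
}

# Assumption: endpoint of the Open-Meteo historical weather API.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_archive_url(start_date: str, end_date: str) -> str:
    """Build (but do not send) a request URL covering all five stations."""
    lats = ",".join(str(lat) for lat, _ in STATIONS.values())
    lons = ",".join(str(lon) for _, lon in STATIONS.values())
    params = {
        "latitude": lats,
        "longitude": lons,
        "start_date": start_date,
        "end_date": end_date,
        # Assumption: hourly variables matching the columns used later on.
        "hourly": "temperature_2m,precipitation",
        "format": "csv",
    }
    # Keep commas readable instead of percent-encoding them.
    return f"{ARCHIVE_URL}?{urlencode(params, safe=',')}"


# Illustrative date range only.
url = build_archive_url("2022-01-01", "2025-12-30")
print(url)
```

The sketch only builds the URL, so it runs without network access; the resulting link can be pasted into a browser or passed to an HTTP client.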

In [1]:
# Imports

import pandas as pd 
import numpy as np
from dotenv import load_dotenv
import os

The data for all five stations are in a single .csv file, with hourly temperature and rainfall values.
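Before aggregating, it can be worth confirming that the file has the columns the later cells rely on. The sketch below does this on a tiny made-up sample rather than the real file; the column names are the ones used in this notebook, while the sample values and the `EXPECTED_COLUMNS` name are illustrative.

```python
import io

import pandas as pd

# Columns that the cleaning cell relies on.
EXPECTED_COLUMNS = {
    "location_id",
    "time",
    "temperature_2m (°C)",
    "precipitation (mm)",
}

# Made-up sample standing in for the raw hourly file.
sample_csv = io.StringIO(
    "location_id,time,temperature_2m (°C),precipitation (mm)\n"
    "0,2022-01-01T00:00,12.1,0.0\n"
    "0,2022-01-01T01:00,12.3,0.0\n"
    "1,2022-01-01T00:00,9.8,0.4\n"
)
sample = pd.read_csv(sample_csv)

# Fail early if a required column is absent.
missing = EXPECTED_COLUMNS - set(sample.columns)
if missing:
    raise ValueError(f"Missing columns: {sorted(missing)}")

# Number of distinct stations present in the sample.
n_stations = sample["location_id"].nunique()
print(n_stations)
```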

In [2]:
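# Load the raw hourly UK weather data and aggregate it to daily values

df = pd.read_csv('./datasets/uk/uk_weather_raw.csv')
df["Date"] = pd.to_datetime(df["time"]).dt.date  # Keep only the calendar date of each hourly timestamp

# Aggregate per station and date: daily mean temperature and daily total rainfall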
df_diario = df.groupby(['location_id', 'Date']).agg(
    avg_temp=('temperature_2m (°C)', 'mean'),
    total_rain=('precipitation (mm)', 'sum')
).reset_index()

df_diario

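|      | location_id | Date       | avg_temp  | total_rain |
| ---- | ----------- | ---------- | --------- | ---------- |
| 0    | 0           | 2022-01-01 | 12.600000 | 0.0        |
| 1    | 0           | 2022-01-02 | 11.354167 | 3.2        |
| 2    | 0           | 2022-01-03 | 9.254167  | 0.0        |
| 3    | 0           | 2022-01-04 | 5.245833  | 3.8        |
| 4    | 0           | 2022-01-05 | 3.070833  | 0.0        |
| ...  | ...         | ...        | ...       | ...        |
| 7300 | 4           | 2025-12-27 | 2.466667  | 0.0        |
| 7301 | 4           | 2025-12-28 | 3.100000  | 0.0        |
| 7302 | 4           | 2025-12-29 | 2.045833  | 0.0        |
| 7303 | 4           | 2025-12-30 | 1.241667  | 0.0        |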
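To make the hourly-to-daily step easier to follow, here is the same aggregation applied to a few made-up hourly rows for one station. The values are illustrative only; the column names and the aggregation match the cell above.

```python
import pandas as pd

# Four made-up hourly readings for a single station over two days.
hourly = pd.DataFrame({
    "location_id": [0, 0, 0, 0],
    "time": [
        "2022-01-01T00:00", "2022-01-01T12:00",
        "2022-01-02T00:00", "2022-01-02T12:00",
    ],
    "temperature_2m (°C)": [10.0, 14.0, 8.0, 12.0],
    "precipitation (mm)": [0.0, 1.5, 0.5, 0.5],
})
hourly["Date"] = pd.to_datetime(hourly["time"]).dt.date

# Temperature is averaged over the day; rainfall is summed over the day.
daily = hourly.groupby(["location_id", "Date"]).agg(
    avg_temp=("temperature_2m (°C)", "mean"),
    total_rain=("precipitation (mm)", "sum"),
).reset_index()

print(daily)
```

Each day collapses to one row per station: the two temperature readings are averaged and the two rainfall readings are added.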


No date filtering is needed here, because the data were downloaded from the source for exactly the desired date range.
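If a quick confirmation is wanted anyway, the first and last dates can be compared against the intended bounds. The sketch below uses a small made-up frame; the bounds are an assumption taken from the first and last dates visible in the output above, not a stated requirement.

```python
import datetime as dt

import pandas as pd

# Assumed bounds, read off the displayed output (illustrative).
START = dt.date(2022, 1, 1)
END = dt.date(2025, 12, 30)

# Made-up frame standing in for the daily table.
toy = pd.DataFrame({
    "Date": [dt.date(2022, 1, 1), dt.date(2023, 6, 15), dt.date(2025, 12, 30)],
})

# The data are within range if the extremes are within range.
first, last = toy["Date"].min(), toy["Date"].max()
in_range = (first >= START) and (last <= END)
print(first, last, in_range)
```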

In [3]:
# Aggregate across stations by date; precipitation is the mean of the stations' daily totals

uk_agg = df_diario.groupby(['Date']).agg(
    avg_temp_uk=('avg_temp', 'mean'),
    avg_rain_uk=('total_rain', 'mean')
).round(2).reset_index()

uk_agg

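|      | Date       | avg_temp_uk | avg_rain_uk |
| ---- | ---------- | ----------- | ----------- |
| 0    | 2022-01-01 | 11.80       | 1.72        |
| 1    | 2022-01-02 | 9.25        | 5.54        |
| 2    | 2022-01-03 | 7.22        | 5.26        |
| 3    | 2022-01-04 | 3.00        | 1.10        |
| 4    | 2022-01-05 | 1.98        | 0.00        |
| ...  | ...        | ...         | ...         |
| 1456 | 2025-12-27 | 3.77        | 0.00        |
| 1457 | 2025-12-28 | 4.68        | 0.00        |
| 1458 | 2025-12-29 | 3.60        | 0.00        |
| 1459 | 2025-12-30 | 2.72        | 0.00        |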
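Because both national columns are means over stations, a date on which a station is missing would be averaged over fewer stations. One way to make that visible is to count the stations contributing to each date. The sketch below uses made-up values for two stations; the extra `n_stations` column is an illustrative addition, not part of the table above.

```python
import datetime as dt

import pandas as pd

# Made-up daily values for two stations over two days.
daily = pd.DataFrame({
    "location_id": [0, 1, 0, 1],
    "Date": [dt.date(2022, 1, 1)] * 2 + [dt.date(2022, 1, 2)] * 2,
    "avg_temp": [12.0, 8.0, 10.0, 6.0],
    "total_rain": [0.0, 4.0, 3.0, 1.0],
})

# Same aggregation as above, plus a count of contributing stations.
national = daily.groupby("Date").agg(
    avg_temp_uk=("avg_temp", "mean"),
    avg_rain_uk=("total_rain", "mean"),
    n_stations=("location_id", "nunique"),
).round(2).reset_index()

print(national)
```

Any date where `n_stations` is below the expected station count would be worth inspecting before the averages are used.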


In [7]:
# Save the aggregated table to a CSV file
uk_agg.to_csv('./datasets/jl_uk_weather.csv', index=False)
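When the saved file is read back later, the `Date` column comes back as plain text unless it is parsed explicitly. A minimal round-trip sketch, writing to a temporary directory rather than the real path (the file name `weather_check.csv` is hypothetical, and the two rows are copied from the output above):

```python
import os
import tempfile

import pandas as pd

# Two rows copied from the aggregated output, used as a small stand-in.
table = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02"],
    "avg_temp_uk": [11.80, 9.25],
    "avg_rain_uk": [1.72, 5.54],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weather_check.csv")  # hypothetical file name
    table.to_csv(path, index=False)
    # parse_dates turns the text column back into datetimes on load.
    reloaded = pd.read_csv(path, parse_dates=["Date"])

columns_match = list(reloaded.columns) == list(table.columns)
print(reloaded.dtypes)
```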