Extract & Clean ERA5 Wind, Temp, and Cloud Data
This script extracts four ERA5 variables:

u10, v10 → 10 m wind components, combined into wind speed (m/s)
t2m → air temperature at 2 m (converted from Kelvin to °C)
tcc → total cloud cover (fraction, 0 to 1)

It averages each variable over the grid points covering Ireland and exports a model-ready .csv file.
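As a quick illustration of the two conversions described above, here is a minimal sketch on made-up numbers. The sample values are hypothetical and are not taken from the ERA5 file; only the formulas match the ones used in this script.

```python
import numpy as np

# Hypothetical sample values, not taken from the ERA5 file.
u10 = np.array([3.0, -6.0, 0.0])           # eastward 10 m wind component, m/s
v10 = np.array([4.0, 8.0, -2.5])           # northward 10 m wind component, m/s
t2m = np.array([273.15, 283.15, 288.65])   # 2 m air temperature, K

# Wind speed is the magnitude of the (u10, v10) vector: sqrt(u10**2 + v10**2).
wind_speed = np.hypot(u10, v10)            # 5.0, 10.0 and 2.5 m/s

# Kelvin to degrees Celsius.
temp_c = t2m - 273.15

print(wind_speed)
print(temp_c)
```

Note that the sign of each component drops out: wind speed carries no direction information.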

In [1]:
import xarray as xr
import pandas as pd
import numpy as np
import os

The clean_weather_data function cleans ERA5 weather data for Ireland:
- Calculates wind speed from u10 and v10
- Converts temperature from Kelvin to Celsius
- Averages each variable over all grid points
- Exports to a clean CSV file
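One point worth keeping in mind for the first step: if u10 and v10 are already time averages (for example monthly means), the speed computed from them is the magnitude of the mean wind vector, which can be lower than the mean of the instantaneous wind speeds. The sketch below uses hypothetical hourly values to show the gap; whether it matters here depends on how the input file was produced.

```python
import numpy as np

# Hypothetical wind components (m/s) within one averaging period:
# the wind blows at 5 m/s but flips direction, so the vectors cancel.
u = np.array([5.0, -5.0, 5.0, -5.0])
v = np.array([0.0, 0.0, 0.0, 0.0])

# Speed computed from averaged components (magnitude of the mean vector).
speed_of_mean = np.hypot(u.mean(), v.mean())

# Average of the speeds computed at each time step.
mean_of_speed = np.hypot(u, v).mean()

print(speed_of_mean, mean_of_speed)
```

If the source product offers a mean wind speed field directly, that may be the better input for wind-power style models.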

In [2]:
def clean_weather_data(input_path, output_path):
    """Derive wind speed, convert t2m to °C, average over the grid, and export a CSV."""
    spatial_dims = ["latitude", "longitude"]

    # Load the NetCDF dataset; the context manager closes the file afterwards
    with xr.open_dataset(input_path) as ds:
        # Calculate wind speed from the 10 m wind components
        wind_speed = np.sqrt(ds["u10"]**2 + ds["v10"]**2)

        # Average over all spatial grid points (Ireland)
        df = xr.Dataset({
            "Wind_Speed_10m": wind_speed.mean(dim=spatial_dims),
            "Temperature_Celsius": (ds["t2m"] - 273.15).mean(dim=spatial_dims),
            "Cloud_Cover": ds["tcc"].mean(dim=spatial_dims)
        }).to_dataframe().reset_index()

    # Rename the time column, then sort chronologically
    df = df.rename(columns={"valid_time": "Date"})
    df = df.sort_values("Date").reset_index(drop=True)

    # Save to processed CSV; only create a directory if the path contains one
    output_dir = os.path.dirname(output_path)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)
    df.to_csv(output_path, index=False)
    print(f"✅ Cleaned ERA5 weather data saved to: {output_path}")

    return df
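The function above takes a plain mean over latitude and longitude, which treats every grid cell equally. On a regular latitude-longitude grid, cells shrink toward the poles, so an area-weighted mean is a common alternative. The following is a minimal sketch of that variant on a synthetic field, assuming the same dimension names (valid_time, latitude, longitude); the grid and values are made up, and over an area as small as Ireland the difference is likely to be minor.

```python
import numpy as np
import xarray as xr

# Synthetic 2 m temperature field in Kelvin on a small grid (hypothetical values).
lat = np.array([51.5, 53.5, 55.5])
lon = np.array([-10.0, -8.0, -6.0])
time = np.array(["2020-01-01", "2020-02-01"], dtype="datetime64[ns]")
data = 280.0 + np.arange(18, dtype=float).reshape(2, 3, 3)
t2m = xr.DataArray(
    data,
    coords={"valid_time": time, "latitude": lat, "longitude": lon},
    dims=("valid_time", "latitude", "longitude"),
)

# cos(latitude) is proportional to grid-cell area on a regular lat-lon grid.
weights = np.cos(np.deg2rad(t2m["latitude"]))
weights.name = "weights"

# Unweighted mean, as in clean_weather_data.
plain_mean = t2m.mean(dim=["latitude", "longitude"])

# Area-weighted mean using xarray's weighted reductions.
weighted_mean = t2m.weighted(weights).mean(dim=["latitude", "longitude"])

print(plain_mean.values)
print(weighted_mean.values)
```

In this synthetic field the values increase with latitude, so the weighted mean comes out slightly lower than the plain mean because the southern rows carry more weight.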


In [3]:
if __name__ == "__main__":
    input_file = "../data/raw/data_stream-moda_stepType-avgua.nc"
    output_file = "../data/processed/Cleaned_Weather_Data_ERA5.csv"
    clean_weather_data(input_file, output_file)

✅ Cleaned ERA5 weather data saved to: ../data/processed/Cleaned_Weather_Data_ERA5.csv
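Before feeding the CSV to a model, it can help to read it back and run a few sanity checks. The helper below, check_cleaned_csv, is hypothetical and not part of the original script; it assumes the column names written by clean_weather_data and is demonstrated on a tiny made-up file rather than the real output.

```python
import os
import tempfile

import pandas as pd


def check_cleaned_csv(path):
    """Hypothetical helper: load the exported CSV and run basic sanity checks."""
    df = pd.read_csv(path, parse_dates=["Date"])

    expected = {"Date", "Wind_Speed_10m", "Temperature_Celsius", "Cloud_Cover"}
    missing = expected - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    if not df["Date"].is_monotonic_increasing:
        raise ValueError("Dates are not sorted in ascending order")
    if (df["Wind_Speed_10m"] < 0).any():
        raise ValueError("Wind speed must be non-negative")
    if not df["Cloud_Cover"].between(0, 1).all():
        raise ValueError("Cloud cover must lie between 0 and 1")
    return df


# Demo on a small made-up file with the same column layout.
sample = pd.DataFrame({
    "Date": ["2020-01-01", "2020-02-01"],
    "Wind_Speed_10m": [4.2, 3.8],
    "Temperature_Celsius": [5.1, 5.6],
    "Cloud_Cover": [0.78, 0.74],
})
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    sample.to_csv(sample_path, index=False)
    checked = check_cleaned_csv(sample_path)

print(checked.dtypes)
```

For the real file, the same call would be check_cleaned_csv("../data/processed/Cleaned_Weather_Data_ERA5.csv").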
