# Climate Overview for Energy Modelling

The first step in energy modelling is to get an overview of the climate data available for the region of interest. This notebook shows how to download and process ERA5-Land monthly data using the CDS API.


## 0. Getting ERA5-Land monthly data via the Python API

> You need to install the `cdsapi` package to use this notebook.

You can install it using `pip install cdsapi`. You also need to register on the [Climate Data Store](https://cds.climate.copernicus.eu/api-how-to) and get your API key.
You only need to do this once; the key is then valid for all future downloads.

The key should be saved in the `~/.cdsapirc` file in your home directory. The link above explains the full setup.
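
As a quick sanity check before the first download, the sketch below parses the text of a `~/.cdsapirc` file and reports whether both entries are present. It assumes the two-line `url:` / `key:` layout described in the CDS instructions; `parse_cdsapirc` and `missing_entries` are hypothetical helper names, not part of `cdsapi`.

```python
from pathlib import Path


def parse_cdsapirc(text: str) -> dict:
    """Parse 'name: value' lines into a dict (hypothetical helper)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ':' not in line:
            continue
        name, value = line.split(':', 1)  # split once, because URLs contain ':'
        config[name.strip()] = value.strip()
    return config


def missing_entries(config: dict) -> list:
    """Return the required entries that are absent or empty."""
    return [name for name in ('url', 'key') if not config.get(name)]


rc_path = Path.home() / '.cdsapirc'
if rc_path.exists():
    missing = missing_entries(parse_cdsapirc(rc_path.read_text()))
    print('Missing entries:', missing if missing else 'none')
else:
    print(f'{rc_path} not found; create it as described in the link above.')
```

The check only looks at the file layout; it does not verify that the key is accepted by the server.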


In [5]:
import calendar
import os
import timeit
import zipfile

import cdsapi
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import xarray as xr

# Helpers used below (get_bbox, extract_data, plotting functions, ...) are expected to come from utils
from utils import *

### 1. Fill in the parameters

You only need to fill in the three parameters below to download and extract the data you need.

1. Define the region of interest using ISO A2 codes within the `ISO_A2` list.
2. Specify the start and end years for the data collection.
3. List the variables you want to download from the ERA5-Land dataset in the `variables_list`.
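
Before launching long downloads, it can help to check the three parameters for obvious mistakes. The sketch below is one possible check, not part of the notebook's `utils` module: `validate_parameters` is a hypothetical helper, and the 1950 lower bound is taken from the comment in the parameter cell.

```python
from datetime import date


def validate_parameters(iso_codes, start_year, end_year, first_year=1950):
    """Return a list of problems found in the user parameters (empty if none)."""
    problems = []
    for code in iso_codes:
        # ISO A2 codes are two upper-case letters, e.g. 'AO'
        if not (isinstance(code, str) and len(code) == 2
                and code.isalpha() and code.isupper()):
            problems.append(f'{code!r} is not a two-letter upper-case code')
    if start_year < first_year:
        problems.append(f'start_year {start_year} is before {first_year}')
    if end_year > date.today().year:
        problems.append(f'end_year {end_year} is in the future')
    if start_year > end_year:
        problems.append(f'start_year {start_year} is after end_year {end_year}')
    return problems


print(validate_parameters(['AO', 'BI'], 1950, 2020))
print(validate_parameters(['ao', 'BI'], 1940, 2020))
```

The function only checks the shape of the inputs; whether a code is known to `get_bbox` still has to be checked against the data that helper relies on.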

In [6]:
ISO_A2 = ['AO', 'BI', 'CM', 'CF', 'TD', 'CG', 'CD', 'GQ', 'GA', 'ST']

start_year, end_year = 1950, 2025  # from 1950 to present year

variables_list = ['2m_temperature',
                  'total_precipitation']

#### Processing parameters
- Prepare the folder structure
- Data will be downloaded to the `input/era5_api` folder
- Extracted data will be saved in the `input/era5_extract` folder
- Output data will be saved in a subfolder of `output` named after the selected ISO codes joined by underscores
- Data will be downloaded from the ERA5-Land monthly means dataset

In [7]:

# Create folder structure
folder_input = 'input'
folder_api = os.path.join(folder_input, 'era5_api')
folder_extract = os.path.join(folder_input, 'era5_extract')
folder_output = os.path.join('output', '_'.join(ISO_A2))
for folder in (folder_api, folder_extract, folder_output):
    os.makedirs(folder, exist_ok=True)  # also creates missing parent folders

# Define the dataset
dataset_name = 'reanalysis-era5-land-monthly-means'  # 'reanalysis-era5-land'

# Map CDS variable names to the short names used in the downloaded data
variable_name = {
    'total_precipitation': 'tp',
    'surface_runoff': 'sro',
    'runoff': 'ro',
    'snow_depth_water_equivalent': 'sd',
    '2m_temperature': 't2m',
    'potential_evaporation': 'pev',
    'total_evaporation': 'e'
}

# Build the download path of each archive
short_names = '_'.join(variable_name[variable] for variable in variables_list)
downloaded_files = {
    iso: os.path.join(folder_api, f'{dataset_name}_{iso}_{start_year}_{end_year}_{short_names}.zip')
    for iso in ISO_A2
}
print(f'Files will be downloaded to: {folder_api}, files: {downloaded_files}')

# Define the bounding box for each ISO A2 code
locations = {iso: get_bbox(iso) for iso in ISO_A2}
print(f'Locations: {locations}')

Files will be downloaded to: input/era5_api, files: {'AO': 'input/era5_api/reanalysis-era5-land-monthly-means_AO_1950_2025_t2m_tp.zip', 'BI': 'input/era5_api/reanalysis-era5-land-monthly-means_BI_1950_2025_t2m_tp.zip', 'CM': 'input/era5_api/reanalysis-era5-land-monthly-means_CM_1950_2025_t2m_tp.zip', 'CF': 'input/era5_api/reanalysis-era5-land-monthly-means_CF_1950_2025_t2m_tp.zip', 'TD': 'input/era5_api/reanalysis-era5-land-monthly-means_TD_1950_2025_t2m_tp.zip', 'CG': 'input/era5_api/reanalysis-era5-land-monthly-means_CG_1950_2025_t2m_tp.zip', 'CD': 'input/era5_api/reanalysis-era5-land-monthly-means_CD_1950_2025_t2m_tp.zip', 'GQ': 'input/era5_api/reanalysis-era5-land-monthly-means_GQ_1950_2025_t2m_tp.zip', 'GA': 'input/era5_api/reanalysis-era5-land-monthly-means_GA_1950_2025_t2m_tp.zip', 'ST': 'input/era5_api/reanalysis-era5-land-monthly-means_ST_1950_2025_t2m_tp.zip'}
Locations: {'AO': (11.66939414300009, -18.031404723999884, 24.061714315000103, -4.391203714999932), 'BI': (28.986891723
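
The bounding boxes printed above are `(west, south, east, north)` tuples, while the CDS documentation lists the `area` field as `[north, west, south, east]`. The sketch below reorders a tuple accordingly. Expanding the box outward to multiples of 0.1° is an optional choice made for this illustration, and `bbox_to_cds_area` is a hypothetical helper name.

```python
import math


def bbox_to_cds_area(bbox, step=0.1):
    """Convert (west, south, east, north) to [north, west, south, east],
    expanded outward to multiples of `step` degrees."""
    west, south, east, north = bbox
    scale = round(1 / step)  # 10 for a 0.1 degree grid
    return [
        math.ceil(north * scale) / scale,   # push the northern edge up
        math.floor(west * scale) / scale,   # push the western edge left
        math.floor(south * scale) / scale,  # push the southern edge down
        math.ceil(east * scale) / scale,    # push the eastern edge right
    ]


# Angola's bounding box, rounded from the output above
print(bbox_to_cds_area((11.669, -18.031, 24.062, -4.391)))
```

Expanding outward rather than rounding to the nearest step ensures the requested area never cuts into the country's extent.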

### 2. Automatic download from the Climate Data Store API

By default, the data is downloaded to the `input/era5_api` folder. To use a different folder, change the `folder_api` variable above.
Each download is a zip file containing `.grib` files with all monthly means for the specified period.
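
To illustrate what unpacking such an archive involves, here is a minimal sketch using only the standard library. It is not the `extract_data` helper used later in this notebook, whose implementation is not shown; `extract_grib_members` is a hypothetical name, and the demo runs on a throwaway archive rather than a real download.

```python
import os
import tempfile
import zipfile


def extract_grib_members(zip_path, extract_to):
    """Extract every .grib member of the archive and return the extracted paths."""
    os.makedirs(extract_to, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith('.grib'):
                extracted.append(archive.extract(member, path=extract_to))
    return extracted


# Demo on a throwaway archive so the sketch runs without a real download
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, 'demo.zip')
    with zipfile.ZipFile(demo_zip, 'w') as archive:
        archive.writestr('data.grib', b'GRIB')
        archive.writestr('notes.txt', 'not a GRIB file')
    demo_paths = extract_grib_members(demo_zip, os.path.join(tmp, 'extract'))
    demo_names = [os.path.basename(path) for path in demo_paths]

print(demo_names)
```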

In [8]:
years = [str(year) for year in range(start_year, end_year + 1)]

# Create folder if not exists
os.makedirs(folder_api, exist_ok=True)

for (iso, (long_west, lat_south, long_east, lat_north)) in locations.items():

    downloaded_file = downloaded_files[iso]

    print(f'Processing {iso} with bbox: {long_west}, {lat_south}, {long_east}, {lat_north}')
    if not os.path.exists(downloaded_file):
        print('Process started. Please wait for the completion message ...')
        start = timeit.default_timer()
        c = cdsapi.Client()

        c.retrieve(
            dataset_name,
            {
                #'format': 'netcdf',
                'format': 'grib',
                'product_type': 'monthly_averaged_reanalysis',
                'variable': variables_list,
                'year': years,
                'month': [ '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12' ],
                'time': '00:00',
                'area': [ lat_north, long_west, lat_south, long_east ],  # documented order: north, west, south, east
            }, downloaded_file
            )

        stop = timeit.default_timer()
        print(f'Process completed in {(stop - start) / 60:.1f} minutes')
    else:
        print('File already exists.')

Processing AO with bbox: 11.66939414300009, -18.031404723999884, 24.061714315000103, -4.391203714999932
File already exists.
Processing BI with bbox: 28.9868917230001, -4.463344014999933, 30.833962443000075, -2.3030624389998877
File already exists.
Processing CM with bbox: 8.505056186000047, 1.6545512900001285, 16.20772342900011, 13.081140646000023
File already exists.
Processing CF with bbox: 14.387266072000074, 2.2364537560000457, 27.441301310000142, 11.000828349000074
File already exists.
Processing TD with bbox: 13.449183797000103, 7.4555667110001025, 23.98440637200008, 23.444719951000067
File already exists.
Processing CG with bbox: 11.114016304109821, -5.019630835999919, 18.642406860000023, 3.7082760620000528
File already exists.
Processing CD with bbox: 12.210541212000066, -13.45835052399994, 31.280446818000087, 5.375280253000085
File already exists.
Processing GQ with bbox: 5.6119897800000444, -1.475681247999944, 11.33634118700013, 3.7724063170000477
File already exists.
Proces

### 3. Extracting and formatting the data

In [9]:
dataset = {}
for iso, downloaded_file in downloaded_files.items():
    print(f'Processing {iso} with file: {downloaded_file}')
    data = extract_data(downloaded_file, step_type=True, extract_to=folder_extract)

    # Calculate resolution and add units
    calculate_resolution_netcdf(data, lat_name='latitude', lon_name='longitude')

    for var in data.data_vars:
        print(f"{var}: {data[var].attrs['units']}")

    # Convert units if necessary
    data = convert_dataset_units(data)

    # Export transformed data to NetCDF for future analysis
    output_name = os.path.splitext(os.path.basename(downloaded_file))[0] + '.nc'
    data.to_netcdf(os.path.join(folder_output, output_name))

    dataset[iso] = data


Processing AO with file: input/era5_api/reanalysis-era5-land-monthly-means_AO_1950_2025_t2m_tp.zip
Opening GRIB file: input/era5_extract/reanalysis-era5-land-monthly-means_AO_1950_2025_t2m_tp.grib
Opening GRIB file: input/era5_extract/reanalysis-era5-land-monthly-means_AO_1950_2025_t2m_tp.grib
Spatial resolution: 0.10000813008130116° lon x -0.09999999999999964° lat
Approximate spatial resolution:
10.89 km (lon_name) x -11.10 km (lat_name) at -11.23° lat
Temporal resolution: 31 days
t2m: K
tp: m
Processing BI with file: input/era5_api/reanalysis-era5-land-monthly-means_BI_1950_2025_t2m_tp.zip
Opening GRIB file: input/era5_extract/reanalysis-era5-land-monthly-means_BI_1950_2025_t2m_tp.grib
Opening GRIB file: input/era5_extract/reanalysis-era5-land-monthly-means_BI_1950_2025_t2m_tp.grib
Spatial resolution: 0.10005555555555645° lon x -0.10000000000000009° lat
Approximate spatial resolution:
11.09 km (lon_name) x -11.10 km (lat_name) at -3.41° lat
Temporal resolution: 31 days
t2m: K
tp: m
P
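
The approximate resolution in kilometres printed above can be reproduced from the grid spacing in degrees. The sketch below assumes a constant 111 km per degree and scales the east-west spacing by the cosine of the latitude. That constant is an assumption chosen because it matches the printed values; the formula inside `calculate_resolution_netcdf` is not shown in this notebook, and `grid_spacing_km` is a hypothetical name.

```python
import math

KM_PER_DEGREE = 111.0  # assumed constant; it reproduces the values printed above


def grid_spacing_km(dlon_deg, dlat_deg, lat_deg):
    """Approximate east-west and north-south grid spacing in kilometres."""
    lon_km = dlon_deg * KM_PER_DEGREE * math.cos(math.radians(lat_deg))
    lat_km = dlat_deg * KM_PER_DEGREE
    return lon_km, lat_km


# Values taken from the AO output above
lon_km, lat_km = grid_spacing_km(0.10000813008130116, -0.09999999999999964, -11.23)
print(f'{lon_km:.2f} km x {lat_km:.2f} km')  # 10.89 km x -11.10 km
```

The negative north-south value reflects latitudes stored from north to south; the magnitude is the spacing.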

### 4. Visualization of climate data

In [10]:
# Plotting spatial mean timeseries for all variables
aggregations = {'2m_temperature': 'avg', 'total_precipitation': 'sum'}
for var in variables_list:
    # Look up the short name used in the dataset (falls back to the CDS name)
    short_name = variable_name.get(var, var)

    # Without aggregation
    plot_spatial_mean_timeseries_all_iso(dataset, var=short_name, folder=folder_output)

    # With aggregation: average for temperature, sum for precipitation, None otherwise
    plot_spatial_mean_timeseries_all_iso(dataset, var=short_name, agg=aggregations.get(var), folder=folder_output)

# Plotting scatter plot of annual spatial means
scatter_annual_spatial_means(dataset, var_x='t2m', var_y='tp', folder=folder_output)

# Plotting monthly average for each variable
plot_monthly_mean(dataset, "t2m", lat_name='latitude', lon_name='longitude', folder=folder_output)
plot_monthly_mean(dataset, "tp", lat_name='latitude', lon_name='longitude', folder=folder_output)

# Plotting monthly climatology for each variable
for iso, data in dataset.items():
    print(f'Processing {iso}')
    for short_name in ('t2m', 'tp'):
        # Name the file after the plotted variable so the two plots do not overwrite each other
        plot_monthly_climatology_grid(data, short_name, filename=os.path.join(folder_output, f'monthly_climatology_{iso}_{short_name}.pdf'))


Processing AO
Processing BI
Processing CM
Processing CF
Processing TD
Processing CG
Processing CD
Processing GQ
Processing GA
Processing ST
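
The cell above aggregates temperature with `'avg'` and precipitation with `'sum'`. The sketch below illustrates why the two differ when going from monthly to annual values: a state variable such as temperature is averaged, while an accumulated variable such as precipitation is summed. It assumes the precipitation values are mean daily rates in mm/day, which depends on what `convert_dataset_units` does; how `plot_spatial_mean_timeseries_all_iso` implements `agg` is not shown here, and the function names and numbers below are illustrative only.

```python
import calendar
from statistics import mean


def annual_mean(monthly_values):
    """Average monthly means (suitable for state variables such as temperature)."""
    return mean(monthly_values)


def annual_total_mm(monthly_mean_mm_per_day, year):
    """Sum monthly totals, each rebuilt as mean daily rate x days in the month."""
    return sum(
        rate * calendar.monthrange(year, month)[1]
        for month, rate in enumerate(monthly_mean_mm_per_day, start=1)
    )


# Illustrative numbers only, not taken from the dataset
temperature_c = [24.0, 24.5, 24.5, 24.0, 23.0, 21.0, 20.0, 21.0, 22.5, 23.5, 24.0, 24.0]
precip_mm_per_day = [1.0] * 12

print(annual_mean(temperature_c))
print(annual_total_mm(precip_mm_per_day, 2021))
```

Note that `annual_mean` weights every month equally; weighting by the number of days per month would be slightly more accurate.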


### 5. Checking representative regions

In [29]:
def plot_monthly_precipitation_heatmap(dataset: dict,
                                       var: str = 'tp',
                                       cmap: str = 'cividis',
                                       lat_name='latitude',
                                       lon_name='longitude',
                                       path=None):
    """
    Plot a heatmap of mean monthly precipitation across countries.

    Parameters:
    -----------
    dataset : dict
        Dictionary where keys are country ISO codes or names, and values are xarray Datasets
        with a time dimension and a variable representing precipitation.

    var : str
        Name of the variable to extract (e.g., 'tp').

    cmap : str
        Colormap for the heatmap. Default is 'cividis' (colorblind-friendly).

    lat_name, lon_name : str
        Names of the latitude and longitude dimensions that are averaged over.

    path : str or None
        If given, the figure is saved to this path; otherwise it is displayed.

    Returns:
    --------
    None. Displays or saves a heatmap (countries x months) of mean precipitation.
    """
    data_rows = []

    for iso, ds in dataset.items():
        if var not in ds:
            continue

        da = ds[var]
        spatial_mean = da.mean(dim=[lat_name, lon_name])
        # Group by calendar month and average across years
        monthly_mean = spatial_mean.groupby('time.month').mean(dim='time')

        # Convert to pandas Series (index = month, value = mean precip)
        monthly_series = monthly_mean.to_series()

        # Ensure all months are present (fill missing with 0)
        for month in range(1, 13):
            value = monthly_series.get(month, 0.0)
            data_rows.append({'country': iso, 'month': month, 'precip': value})

    # Create DataFrame and pivot
    df = pd.DataFrame(data_rows)
    heatmap_data = df.pivot(index='country', columns='month', values='precip').fillna(0)
    heatmap_data = heatmap_data.reindex(columns=range(1, 13))  # ensure 1–12 in order

    # Plot heatmap
    plt.figure(figsize=(12, max(4, 0.5 * len(heatmap_data))))
    sns.heatmap(heatmap_data, cmap=cmap, annot=True, fmt=".0f",
                cbar_kws={'label': 'Mean Precipitation (mm/day)'}, linewidths=0.5)

    plt.title("Monthly Mean Precipitation per Country (mm/day)")
    plt.xlabel("Month")
    plt.ylabel("Country")
    plt.xticks(
        ticks=[i + 0.5 for i in range(12)],
        labels=[calendar.month_abbr[i + 1] for i in range(12)],
        rotation=0
    )
    plt.yticks(rotation=0)
    plt.tight_layout()
    if path is not None:
        plt.savefig(path)
        plt.close()
    else:
        plt.show()


plot_monthly_precipitation_heatmap(dataset, path=os.path.join(folder_output, 'monthly_precipitation_heatmap.png'))
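
The `groupby('time.month').mean(dim='time')` step in the function above collapses the full time series into one value per calendar month. As a plain-Python sketch of the same idea, with illustrative values and a hypothetical helper name:

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_climatology(series):
    """Average (date, value) pairs by calendar month; returns {month: mean}."""
    by_month = defaultdict(list)
    for day, value in series:
        by_month[day.month].append(value)
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Illustrative values only: two Januaries and two Julys
series = [
    (date(2000, 1, 1), 2.0), (date(2000, 7, 1), 0.0),
    (date(2001, 1, 1), 4.0), (date(2001, 7, 1), 1.0),
]
print(monthly_climatology(series))  # {1: 3.0, 7: 0.5}
```

Months with no data are simply absent from the result, which is why the heatmap function fills missing months explicitly.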