# **Hazard assessment for Infrastructures using Euro-Cordex datasets**

## Calculation of the Indicator "Percentiles of the Temperature"

- See our [how to use risk workflows](https://handbook.climaax.eu/notebooks/workflows_how_to.html) page for information on how to run this notebook.

## **Hazard assessment methodology**
In this notebook we compute the 95th and 99.9th percentiles of the daily maximum temperature for each time period.

We used outputs from 14 models within the EURO-CORDEX framework to evaluate hazards affecting infrastructure. In this notebook, the daily maximum temperature serves as the hazard indicator. Our analysis included three Representative Concentration Pathways (RCPs): RCP2.6, RCP4.5, and RCP8.5. Each RCP scenario was analyzed over three future timeframes: 2021–2050, 2041–2070, and 2071–2100. The historical period (1981–2010) serves as the baseline against which changes in climate hazards are evaluated.

### Analysis of the Temperature Percentiles
We computed the 95th and 99.9th percentiles of the daily maximum temperature for each model, scenario, and time period, for both the historical run and the future RCP scenarios. To quantify changes, we calculated anomalies by subtracting, for each individual model, the historical percentile field from the corresponding future field (RCP2.6, RCP4.5, and RCP8.5).
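The percentile and anomaly steps can be illustrated on synthetic data. This is a minimal sketch using NumPy rather than the xarray workflow of this notebook; the array shapes, the random values, and the uniform +2 °C shift are assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic daily maximum temperature in degC, shape (time, lat, lon).
# These values are made up for illustration; they are not EURO-CORDEX data.
hist_tasmax = rng.normal(loc=25.0, scale=5.0, size=(3650, 2, 3))
fut_tasmax = hist_tasmax + 2.0  # assumed uniform +2 degC shift


def percentile_field(tasmax, q):
    """Return the q-quantile along the time axis, ignoring NaNs."""
    return np.nanquantile(tasmax, q, axis=0)


# Anomaly = future percentile minus historical percentile, per grid cell
anomaly_p95 = percentile_field(fut_tasmax, 0.95) - percentile_field(hist_tasmax, 0.95)
anomaly_p999 = percentile_field(fut_tasmax, 0.999) - percentile_field(hist_tasmax, 0.999)

print(anomaly_p95.round(3))
print(anomaly_p999.round(3))
```

Because the synthetic future series is the historical one shifted by a constant, both anomaly fields recover that constant in every grid cell. With real model output the two percentiles generally change by different amounts.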

To account for uncertainties and provide a robust projection, we computed the average across all 14 models for each percentile of temperature, scenario, and time period. The ensemble averaging process involved aggregating anomalies for all models and then calculating the mean, yielding a single representative dataset for each RCP scenario and timeframe.
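The ensemble averaging can be sketched as follows. The three models and the random percentile fields are hypothetical; the only point is the order of operations (per-model anomaly first, then the mean over the model axis).

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical percentile fields (degC) for 3 models on a 2x3 grid
n_models, grid_shape = 3, (2, 3)
hist = rng.normal(30.0, 1.0, size=(n_models, *grid_shape))
fut = hist + rng.normal(2.0, 0.5, size=(n_models, *grid_shape))

# Per-model anomalies, then the ensemble mean over the model axis
anomalies = fut - hist
ensemble_mean = anomalies.mean(axis=0)

# With the same models in both periods, this equals the difference
# between the future and historical ensemble means.
difference_of_means = fut.mean(axis=0) - hist.mean(axis=0)

print(ensemble_mean.round(3))
```

The two routes agree only when every model contributes to both the historical and the future mean, which is why the anomalies are computed model by model before averaging.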

## **Limitations of the EURO-CORDEX dataset**
The EURO-CORDEX (Coordinated Regional Climate Downscaling Experiment for Europe) project is a set of high-resolution regional climate projections for Europe, designed to support impact, adaptation, and vulnerability assessments under various climate change scenarios. EURO-CORDEX combines global climate model (GCM) outputs with regional climate models (RCMs), enabling the simulation of climatic patterns and extremes. The models explore different Representative Concentration Pathways (RCPs) from CMIP5 (RCP2.6, RCP4.5, RCP8.5) and Shared Socioeconomic Pathways (SSPs) from CMIP6 (SSP1-2.6, SSP5-8.5). The simulations cover historical periods (1950–2005) and future projections (2006–2100). These models are validated against observational data and reanalysis datasets.

Some of the limitations:
- Although EURO-CORDEX offers high-resolution data (typically 0.11°, about 12.5 km, and 0.44°, about 50 km), it may still not fully capture localized phenomena such as urban heat islands, small-scale topographic effects, and small-scale meteorological events.
- Like all climate models, EURO-CORDEX RCMs and their driving GCMs exhibit biases compared to observed data. These biases can vary regionally and seasonally, and the models may struggle to accurately simulate extreme weather events such as heatwaves, heavy precipitation, or storms.
- While the dataset captures trends in extremes, values beyond very high thresholds (>45°C or >100 mm/day rainfall) may carry higher uncertainty due to limited observational data.

## Preparation work
All the EURO-CORDEX models used in this workflow are freely available on the Copernicus C3S platform (https://cds.climate.copernicus.eu/datasets/projections-cordex-domains-single-levels?tab=overview) and were downloaded through its API. The data were then processed to ensure that the grid type was consistent across all models and to fill any gaps in the dates. The remainder of this section shows an example for one model.
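The date-gap check mentioned above is not shown in this notebook. A minimal sketch of one way to detect missing days, using only the standard library and a hypothetical list of dates, might look like this:

```python
from datetime import date, timedelta


def find_missing_days(dates):
    """Return the calendar days absent between the first and last date."""
    present = set(dates)
    start, end = min(present), max(present)
    n_days = (end - start).days + 1
    full_range = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in full_range if d not in present]


# Hypothetical time axis with two days missing
dates = [date(1981, 1, 1), date(1981, 1, 2), date(1981, 1, 4), date(1981, 1, 6)]
missing = find_missing_days(dates)
print(missing)
```

Some driving GCMs use non-standard calendars (for example a 360-day calendar), so a check against the standard calendar like this one would flag days that are not actually missing for those models.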

### Select area of interest
Before downloading the data, we define the coordinates of the area of interest; for this workflow we selected Italy. Based on the shapefile of the country, we can clip the datasets for further processing and display hazard and damage maps for this area.
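As a rough stand-in for shapefile clipping, a bounding-box mask on two-dimensional latitude and longitude arrays can be sketched as below. The box coordinates are approximate values assumed for illustration, the grid is hypothetical, and a real clip against the country outline would need a geospatial library.

```python
import numpy as np

# Approximate bounding box for Italy in degrees (an assumption for illustration)
ITALY_BBOX = {"lon_min": 6.5, "lon_max": 18.6, "lat_min": 35.4, "lat_max": 47.2}


def bbox_mask(lat2d, lon2d, bbox):
    """Boolean mask that is True where a grid cell falls inside the box."""
    return (
        (lat2d >= bbox["lat_min"]) & (lat2d <= bbox["lat_max"])
        & (lon2d >= bbox["lon_min"]) & (lon2d <= bbox["lon_max"])
    )


# Hypothetical coarse grid covering part of Europe (5-degree spacing)
lon2d, lat2d = np.meshgrid(np.arange(0.0, 25.0, 5.0), np.arange(30.0, 55.0, 5.0))
mask = bbox_mask(lat2d, lon2d, ITALY_BBOX)
print(int(mask.sum()), "of", mask.size, "cells fall inside the box")
```

Working with two-dimensional coordinate arrays matters here because the EUR-11 data are on a rotated grid, where latitude and longitude both vary along each grid axis.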

### Load libraries

In [2]:
import os
import xarray as xr
import xclim
import re

from collections import defaultdict

### Create the directory structure

In [None]:
# Define paths
nc_files = "/climax/data/cordex"
general_path = "/climax/indicators/cordex"
subfolders = ['historical','rcp26', 'rcp45', 'rcp85']

# Percentiles to compute (kept as strings so they can be reused in file names)
percentiles = ['0.95', '0.999']

# Time ranges to process
rcp_time_ranges = [('2021', '2050'), ('2041', '2070'), ('2071', '2100')]
historical_time_range = [('1981', '2010')]
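To make the size of the job explicit, the combinations implied by this configuration can be enumerated. This sketch repeats the configuration values so that it runs on its own; the helper `tasks_for` is a hypothetical name, not part of the notebook.

```python
from itertools import product

subfolders = ["historical", "rcp26", "rcp45", "rcp85"]
percentiles = ["0.95", "0.999"]
rcp_time_ranges = [("2021", "2050"), ("2041", "2070"), ("2071", "2100")]
historical_time_range = [("1981", "2010")]


def tasks_for(subfolder):
    """List (subfolder, percentile, start, end) combinations for one folder."""
    ranges = historical_time_range if subfolder == "historical" else rcp_time_ranges
    return [(subfolder, p, start, end) for p, (start, end) in product(percentiles, ranges)]


all_tasks = [task for s in subfolders for task in tasks_for(s)]
print(len(all_tasks))
```

Assuming one input file per model and scenario, this is the number of percentile files written per model: two for the historical run and six for each RCP.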

In [None]:
# Function to process each NetCDF file for a given time range
def process_file(file_path, percentile, save_path, start_year, end_year):
    print("---------------------------------------------------")
    print(f"Processing {file_path} for time range {start_year}-{end_year} and {percentile}")
    ds = xr.open_dataset(file_path)

    # Select daily max temperature for the given time range
    ds_sliced = ds.sel(time=slice(start_year, end_year))
    dailyMaxTemp = (ds_sliced['tasmax'] - 273.15).resample(time='D').max()
    dailyMaxTemp.attrs['units'] = 'C'

    # Get the minimum and maximum values
    min_value = dailyMaxTemp.min(skipna=True).item()  # Convert to a scalar with .item()
    max_value = dailyMaxTemp.max(skipna=True).item()

    # Print the results
    print(f"Temp min value: {min_value}")
    print(f"Temp max value: {max_value}")

    # Calculate the percentiles across all time steps
    dailyMaxTemp_nonan = dailyMaxTemp.dropna(dim='time', how='all')
    calc_percentile = dailyMaxTemp_nonan.quantile(float(percentile), dim='time')


    # Create the new filename with the time range and threshold information
    filename = os.path.basename(file_path)  # Extract original filename
    file_name_no_ext = os.path.splitext(filename)[0]  # Remove extension
    percentile_numb = percentile.split('.')[1]
    new_filename = f"{file_name_no_ext}_p{percentile_numb}_{start_year}-{end_year}.nc"

    # Save the result to the new file path
    calc_percentile.to_netcdf(os.path.join(save_path, new_filename))

    # Get the minimum and maximum values
    min_value_indic = calc_percentile.min(skipna=True).item()  # Convert to a scalar with .item()
    max_value_indic = calc_percentile.max(skipna=True).item()

    # Print the results
    print(f"Minimum percentile {percentile}: {min_value_indic}")
    print(f"Maximum percentile {percentile}: {max_value_indic}")

    print(f"Saved {new_filename} to {save_path}")

    return os.path.join(save_path, new_filename)  # Return path of processed file
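The output naming used by `process_file` can be illustrated in isolation. The helper name `output_name` and the input file name below are hypothetical; the input name is chosen only to resemble the pattern that the anomaly step expects.

```python
import os


def output_name(file_path, percentile, start_year, end_year):
    """Build the output file name from the input name, percentile and period."""
    stem = os.path.splitext(os.path.basename(file_path))[0]
    percentile_tag = percentile.split(".")[1]  # "0.95" -> "95", "0.999" -> "999"
    return f"{stem}_p{percentile_tag}_{start_year}-{end_year}.nc"


# Hypothetical input path
example_input = (
    "/climax/data/cordex/rcp26/"
    "tasmax_EUR-11_ICHEC-EC-EARTH_rcp26_r12i1p1_SMHI-RCA4_v1_day_20060101.nc"
)
name = output_name(example_input, "0.999", "2071", "2100")
print(name)
```

The tag is taken from the string form of the percentile, so `"0.9"` and `"0.90"` would give different tags (`p9` and `p90`). Keeping the strings in the configuration consistent avoids ambiguous file names.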


In [None]:
# Loop through each subfolder (historical, rcp26, rcp45, rcp85)
for subfolder in subfolders:
    print(subfolder)
    folder_path = os.path.join(nc_files, subfolder)
    save_subfolder = os.path.join(general_path, 'tempPercentiles', subfolder)

    # Create the destination subfolder if it doesn't exist
    os.makedirs(save_subfolder, exist_ok=True)

    # Choose the time ranges based on the subfolder
    if subfolder == 'historical':
        time_ranges = historical_time_range
    else:
        time_ranges = rcp_time_ranges

    # Initialize a dictionary to store processed files per threshold and time range
    processed_files_by_threshold = {percentile: [] for percentile in percentiles}

    # Loop through each NetCDF file in the subfolder
    for file in os.listdir(folder_path):
        file_path = os.path.join(folder_path, file)

        # Check if it's a NetCDF file (usually ends with .nc)
        if file.endswith('.nc'):
            # Loop through the percentiles
            for percentile in percentiles:
                # Loop through the defined time ranges
                for start_year, end_year in time_ranges:
                    print(f"Processing Percentile {percentile} for time range {start_year}-{end_year}")

                    # Process and save the file with the new name for each time range
                    processed_file_path = process_file(file_path, percentile, save_subfolder, start_year, end_year)

print("Percentile calculation complete!")

### **Anomaly Calculation**

In [None]:
# Directories
historical_dir = "/climax/indicators/cordex/tempPercentiles/historical"
rcp26_dir = "/climax/indicators/cordex/tempPercentiles/rcp26"
rcp45_dir = "/climax/indicators/cordex/tempPercentiles/rcp45"
rcp85_dir = "/climax/indicators/cordex/tempPercentiles/rcp85"

output_dir = "/climax/indicators/cordex/tempPercentiles/anomalies"

# Create the output directory if it does not exist
os.makedirs(output_dir, exist_ok=True)

In [None]:
# Function to parse filenames and extract key components
def parse_filename(filename):
    pattern = r"tasmax_EUR-11_(.+?)_(historical|rcp26|rcp45|rcp85)_r\d+i\d+p\d+_(.+?)_day_\d+_p(\d+)_([\d-]+)\.nc$"
    match = re.match(pattern, filename)
    if match:
        model = match.group(1)
        scenario = match.group(2)
        rcm = match.group(3)
        percentile = match.group(4)
        time_period = match.group(5)
        return model, scenario, rcm, percentile, time_period
    return None
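A quick check of the pattern on a sample name shows which parts end up in each group. The file names below are hypothetical, and the pattern is written with the dot before `nc` escaped and the end anchored. The sketch also collects names that do not match, so they can be inspected rather than silently sharing a single `None` key.

```python
import re

PATTERN = re.compile(
    r"tasmax_EUR-11_(.+?)_(historical|rcp26|rcp45|rcp85)_r\d+i\d+p\d+_(.+?)"
    r"_day_\d+_p(\d+)_([\d-]+)\.nc$"
)


def parse(filename):
    """Return (model, scenario, rcm, percentile, time_period) or None."""
    match = PATTERN.match(filename)
    return match.groups() if match else None


# Hypothetical file names: one that follows the convention, one that does not
names = [
    "tasmax_EUR-11_ICHEC-EC-EARTH_rcp26_r12i1p1_SMHI-RCA4_v1_day_20060101_p95_2021-2050.nc",
    "README.nc",
]

parsed, unparsed = {}, []
for n in names:
    key = parse(n)
    if key is None:
        unparsed.append(n)
    else:
        parsed[key] = n

print(list(parsed))
print(unparsed)
```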

In [None]:
# Load filenames into dictionaries
historical_files = {parse_filename(f): os.path.join(historical_dir, f) for f in os.listdir(historical_dir) if f.endswith(".nc")}
rcp26_files = {parse_filename(f): os.path.join(rcp26_dir, f) for f in os.listdir(rcp26_dir) if f.endswith(".nc")}
rcp45_files = {parse_filename(f): os.path.join(rcp45_dir, f) for f in os.listdir(rcp45_dir) if f.endswith(".nc")}
rcp85_files = {parse_filename(f): os.path.join(rcp85_dir, f) for f in os.listdir(rcp85_dir) if f.endswith(".nc")}

In [None]:
# Function to perform subtraction with xarray
def subtract_and_save(historical_file, future_file, output_file):
    # Load the datasets using xarray
    historical_ds = xr.open_dataset(historical_file)
    future_ds = xr.open_dataset(future_file)
    print(historical_file)
    print(future_file)

    # Perform subtraction for the 'tasmax' variable
    diff = future_ds['tasmax'] - historical_ds['tasmax']

    # Create a new dataset with the diff
    diff_ds = diff.to_dataset(name='tasmax')

    # Copy attributes if needed
    diff_ds.attrs = future_ds.attrs

    # Save the result to a new NetCDF file
    diff_ds.to_netcdf(output_file)
    print(f"Saved: {output_file}")

In [None]:
# Match files and process
for key, hist_file in historical_files.items():
    if key is None:
        continue
    model, _, rcm, percentile, _ = key

    # Iterate over all future scenarios
    for future_files in [rcp26_files, rcp45_files, rcp85_files]:
        for f_key, fut_file in future_files.items():
            if f_key is None:
                continue
            fut_model, scenario, fut_rcm, fut_percentile, time_period = f_key

            # Match by model, percentile, and RCM
            if model == fut_model and rcm == fut_rcm and percentile == fut_percentile:
                output_filename = f"tasmax_EUR-11_{model}_{scenario}_diff_{rcm}_p{percentile}_{time_period}.nc"
                output_path = os.path.join(output_dir, output_filename)
                subtract_and_save(hist_file, fut_file, output_path)
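The nested loops above compare every historical file with every future file. An alternative sketch indexes the historical files by (model, RCM, percentile) so each future file is matched with one dictionary lookup. The function name `match_pairs` and the example entries are hypothetical.

```python
def match_pairs(historical_files, future_files):
    """Pair each future file with the historical file of the same model, RCM and percentile."""
    hist_index = {}
    for key, path in historical_files.items():
        if key is None:  # file name did not match the pattern
            continue
        model, _scenario, rcm, percentile, _period = key
        hist_index[(model, rcm, percentile)] = path

    pairs = []
    for key, fut_path in future_files.items():
        if key is None:
            continue
        model, scenario, rcm, percentile, period = key
        hist_path = hist_index.get((model, rcm, percentile))
        if hist_path is not None:
            pairs.append((hist_path, fut_path, scenario, period))
    return pairs


# Hypothetical parsed dictionaries
historical_files = {
    ("ICHEC-EC-EARTH", "historical", "SMHI-RCA4_v1", "95", "1981-2010"): "hist_p95.nc",
    None: "junk.nc",
}
future_files = {
    ("ICHEC-EC-EARTH", "rcp26", "SMHI-RCA4_v1", "95", "2021-2050"): "rcp26_p95.nc",
    ("ICHEC-EC-EARTH", "rcp26", "SMHI-RCA4_v1", "999", "2021-2050"): "rcp26_p999.nc",
}
pairs = match_pairs(historical_files, future_files)
print(pairs)
```

Future files without a historical counterpart, such as the 99.9th percentile file in this example, are skipped. Logging them would make missing baselines easier to spot.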

### **Outputs and Hazard Assessment**

For each period and RCP scenario, we calculated the average across all 14 models, producing a single representative dataset for each timeframe, including both the historical and RCP scenarios. The final outputs of the analysis include both individual model anomalies and ensemble-averaged datasets for each RCP scenario and time period. These datasets show how the temperature percentiles change relative to the historical baseline under the different climate scenarios. The averaged datasets were used to visualize the spatial distribution and magnitude of these hazards, as demonstrated in the notebook 05_cordex_precipPercentile_plots.ipynb.

In [3]:
# Directory containing the anomaly files
anomalies_dir = "/climax/indicators/cordex/tempPercentiles/anomalies"
output_dir = "/climax/indicators/cordex/tempPercentiles/averaged_ensembles"

# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)

In [4]:
# Function to parse filenames and extract key components
def parse_filename(filename):
    pattern = r"tasmax_EUR-11_(.+?)_(rcp26|rcp45|rcp85)_diff_(.+?)_p(\d+)_([\d-]+)\.nc$"
    match = re.match(pattern, filename)
    if match:
        model = match.group(1)
        scenario = match.group(2)
        rcm = match.group(3)
        percentile = match.group(4)
        time_period = match.group(5)
        return model, scenario, rcm, percentile, time_period
    return None

# Group files by scenario, percentile, and time period
files = [f for f in os.listdir(anomalies_dir) if f.endswith(".nc")]
grouped_files = defaultdict(list)

for f in files:
    parsed = parse_filename(f)
    if parsed:
        _, scenario, _, percentile, time_period = parsed
        key = (scenario, percentile, time_period)
        grouped_files[key].append(os.path.join(anomalies_dir, f))

In [None]:
# Function to average files and save the ensemble
def average_ensemble(files, output_file):
    print(files)
    print("-------------------------")
    datasets = [xr.open_dataset(f)['tasmax'] for f in files]  # Open and load the 'tasmax' variable
    ensemble_mean = xr.concat(datasets, dim='model').mean(dim='model')  # Average across models

    # Assign coordinates from the first dataset
    first_ds = xr.open_dataset(files[0])
    ensemble_mean = ensemble_mean.assign_coords({'lon': first_ds['lon'], 'lat': first_ds['lat']})

    # Save the averaged dataset
    ensemble_mean_ds = ensemble_mean.to_dataset(name='tasmax')
    ensemble_mean_ds.to_netcdf(output_file)
    print(f"Averaged ensemble saved: {output_file}")


# Process each group
for key, file_list in grouped_files.items():
    scenario, percentile, time_period = key
    output_filename = f"tasmax_EUR-11_{scenario}_ensemble_p{percentile}_{time_period}.nc"
    output_path = os.path.join(output_dir, output_filename)
    average_ensemble(file_list, output_path)

### **Standard Deviation Calculation**

In [5]:
output_dir_std = "/climax/indicators/cordex/tempPercentiles/std_ensembles"

# Ensure the output directory exists
os.makedirs(output_dir_std, exist_ok=True)

In [6]:
# Function to compute the ensemble standard deviation and save it
def calculate_std_ensemble(files, output_file):
    print(files)
    datasets = [xr.open_dataset(f)['tasmax'].drop_vars('height', errors='ignore') for f in files]
    ensemble_std = xr.concat(datasets, dim='model').std(dim='model')

    # Assign coordinates from the first dataset
    first_ds = xr.open_dataset(files[0])
    ensemble_std = ensemble_std.assign_coords({'lon': first_ds['lon'], 'lat': first_ds['lat']})

    # Save the standard deviation dataset
    ensemble_std_ds = ensemble_std.to_dataset(name='tasmax')
    ensemble_std_ds.to_netcdf(output_file)
    print(f"Standard deviation ensemble saved: {output_file}")
    print("-------------------------")


# Process each group
for key, file_list in grouped_files.items():
    scenario, threshold, time_period = key
    output_filename = f"tasmax_EUR-11_{scenario}_ensemble_temp_std{threshold}_{time_period}.nc"
    output_path = os.path.join(output_dir_std, output_filename)
    calculate_std_ensemble(file_list, output_path)
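xarray's `std` uses `ddof=0` (the population formula) by default. With a 14-member ensemble, the choice between `ddof=0` and `ddof=1` changes the spread by a few percent, as the following NumPy sketch on hypothetical anomaly fields shows.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

n_models = 14
# Hypothetical anomaly fields (degC), shape (model, lat, lon)
anomalies = rng.normal(2.0, 0.8, size=(n_models, 2, 3))

std_population = anomalies.std(axis=0, ddof=0)  # divides by n
std_sample = anomalies.std(axis=0, ddof=1)      # divides by n - 1

# The ratio is sqrt(n / (n - 1)) in every grid cell
ratio = std_sample / std_population
print(ratio.round(4))
```

For 14 models the sample standard deviation is about 3.8% larger than the population value. Either convention can be defended, as long as it is stated alongside the maps.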


## Contributors
- Giuseppe Giugliano (giuseppe.giugliano@cmcc.it)
- Carmela de Vivo (carmela.devivo@cmcc.it)
- Daniela Quintero (daniela.quintero@cmcc.it)