# Uncertainty in thermal comfort indices derived from ERA5

Production date: 2025-MM-DD.

**Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch.**

Dataset version: 1.1.

Produced by: C3S2_521 contract.

## 🌍 Use case: ERA5-HEAT

## ❓ Quality assessment question
* **In most cases there should be one question listed here in bold**

Introduction:

## 📢 Quality assessment statement

```{admonition} These are the key outcomes of this assessment
:class: note
* Finding 1
* Finding 2
* Finding 3
* etc
```

## 📋 Methodology

This assessment uses the dataset **Thermal comfort indices derived from ERA5 reanalysis** (*ERA5-HEAT*; [doi 10.24381/cds.553b7518](https://doi.org/10.24381/cds.553b7518)).

The analysis and results are organised in the following steps, which are detailed in the sections below: 

**[](section-setup)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

**[](section-consistency)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

**[](section-uncertainty)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

Any further notes on the method could go here (explanations, caveats or limitations).

## 📈 Analysis and results

(section-setup)=
### 1. Code setup

#### Imports

In [None]:
# Input / Output
from pathlib import Path
import earthkit.data as ekd
import warnings

# General data handling
import numpy as np
np.seterr(divide="ignore")  # Ignore divide-by-zero warnings
import pandas as pd
import xarray as xr
from functools import partial
from dask.array.core import PerformanceWarning
warnings.simplefilter(action="ignore", category=PerformanceWarning)

# Visualisation
import earthkit.plots as ekp
from earthkit.plots.styles import Style
import matplotlib.pyplot as plt
plt.rcParams["grid.linestyle"] = "--"
from tqdm import tqdm  # Progress bars

# Visualisation in Jupyter book -- automatically ignored otherwise
try:
    from myst_nb import glue
except ImportError:
    glue = None

#### Define thermal comfort indices

In [None]:
## Constants
# Variable names (to aid with grib/netcdf consistency)
Ta = "t2m"
Td = "d2m"
va = "va"
u10, v10 = "u10", "v10"
L, S = "str", "ssr"

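# Physical constants
σ = 5.67e-8  # Stefan-Boltzmann constant [W m-2 K-4]
εp = 0.97    # Emissivity of the clothed human body
air = 0.7    # Absorption coefficient of the body surface for shortwave radiation (α_ir)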

##### Mean radiant temperature (MRT)
[Di Napoli et al. (2020)](https://doi.org/10.1007/s00484-020-01900-5):

In [None]:
## MRT
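
Until this cell is filled in, the MRT formula from Di Napoli et al. (2020) can be sketched as follows. The function name, the argument names, and the assumption that all fluxes are already in W m-2 are illustrative and not taken from this notebook. The formula needs separate downward and upward flux components, whereas the ERA5 requests further down retrieve net fluxes, so this is a starting point rather than a drop-in implementation.

```python
# Constants as defined above (ASCII names are used here for portability)
SIGMA = 5.67e-8   # Stefan-Boltzmann constant [W m-2 K-4]
EPS_P = 0.97      # Emissivity of the clothed human body
ALPHA_IR = 0.7    # Shortwave absorption coefficient of the body surface
F_A = 0.5         # Angle factor for the upper and lower hemispheres


def mean_radiant_temperature(strd, stru, ssrd_diffuse, ssru, direct, f_p):
    """
    Sketch of MRT [K] from radiative fluxes [W m-2]:
    downward/upward thermal (strd, stru), diffuse downward and upward
    solar (ssrd_diffuse, ssru), direct solar on a plane perpendicular
    to the beam (direct), and the projected area factor (f_p).
    """
    longwave = F_A * (strd + stru)
    shortwave = (ALPHA_IR / EPS_P) * (F_A * (ssrd_diffuse + ssru) + f_p * direct)
    return ((longwave + shortwave) / SIGMA) ** 0.25


# Sanity check: no sunlight, surroundings radiating as a black body at 300 K
blackbody_300 = SIGMA * 300.0**4
mrt_night = mean_radiant_temperature(blackbody_300, blackbody_300, 0.0, 0.0, 0.0, 0.0)
print(f"{mrt_night:.2f} K")
```

Without solar radiation and with black-body surroundings, MRT reduces to the temperature of the surroundings, which gives a quick check on the constants and exponents.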

##### Universal Thermal Climate Index (UTCI)
[Di Napoli et al. (2021)](https://doi.org/10.1002/gdj3.102):

In [None]:
## UTCI

#### Helper functions

##### General

In [None]:
# Type hints for helper functions
from typing import Callable, Optional, Iterable


##### Downloading data

In [None]:
## Data downloading
def domain_to_request(domain: ekp.geo.domains.Domain) -> dict:
    """ From an earthkit-plots domain, generate a request for earthkit-data / cdsapi. """
    bbox = domain.bbox.to_latlon_bbox()

    # Round
    north = int(np.ceil(bbox.north) + 1)
    south = int(np.floor(bbox.south) - 1)
    west = int(np.floor(bbox.west) - 1)
    east = int(np.ceil(bbox.east) + 1)
    
    area = [north, west, south, east]
    return {"area": area}
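
The rounding in `domain_to_request` can be illustrated without earthkit-plots. In this sketch, `pad_bbox` and the coordinates are hypothetical stand-ins for the domain's bounding box; the result follows the same `[north, west, south, east]` order as the `area` list built above.

```python
import math


def pad_bbox(north, west, south, east, margin=1):
    """ Round a bounding box outwards to whole degrees and add a margin. """
    return [
        int(math.ceil(north) + margin),
        int(math.floor(west) - margin),
        int(math.floor(south) - margin),
        int(math.ceil(east) + margin),
    ]


# Hypothetical bounding box in degrees, loosely covering Italy
area = pad_bbox(north=47.1, west=6.6, south=35.5, east=18.5)
print(area)
```

Rounding outwards before adding the margin means the requested area always fully contains the plotted domain.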

##### Data (pre-)processing

In [None]:
## Wind
def add_10m_wind(data: xr.Dataset) -> xr.Dataset:
    """ Returns a copy of dataset `data` with 10m wind speed added, calculated from its U,V components. """
    # np.sqrt works directly on DataArrays and does not depend on xr.ufuncs,
    # which is not available in all xarray versions
    data = data.assign({va: np.sqrt(data[u10]**2 + data[v10]**2)})
    data[va] = data[va].assign_attrs({"long_name": "10 metre wind speed", "units": data[u10].units})
    return data

In [None]:
## Loops for convenience
def loop_over_(*args, progress=True, **kwargs) -> tqdm:
    """ Generate a tqdm progressbar; inverts `disable` keyword """
    return tqdm(*args, disable=not progress, leave=False, **kwargs)

def loop_over_ensemble_members(data: xr.Dataset, **tqdm_kwargs) -> tqdm:
    """ Loop over ensemble members in `data`, with a progress bar. """
    return loop_over_(data.groupby("number"), unit="member", **tqdm_kwargs)

def loop_over_data_variables(data: xr.Dataset, **tqdm_kwargs) -> tqdm:
    """ Loop over variable keys in `data`, with a progress bar. """
    return loop_over_(data.data_vars.keys(), unit="variable", **tqdm_kwargs)

##### Visualisation

In [None]:
# Styles for variables

# Stats
_style_std = {"vmin": 0, "extend": "max"}

# Temperature
# note currently in K
_style_t = {"cmap": plt.cm.YlOrBr.resampled(10), "vmin": 240, "vmax": 340, "extend": "both"}
_style_t_std = _style_std | {"cmap": plt.cm.YlOrBr.resampled(10), "vmax": 10}

# Wind
_style_wind = {"cmap": plt.cm.Purples.resampled(10), "vmin": 0, "vmax": 25, "extend": "max"}
_style_wind_vector = {"cmap": plt.cm.PRGn_r.resampled(15), "vmin": -_style_wind["vmax"], "vmax": _style_wind["vmax"], "extend": "both"}
_style_wind_std = _style_std | {"cmap": plt.cm.Purples.resampled(10), "vmax": 5}

# Radiation
_style_solar   = {"cmap": plt.cm.YlOrRd.resampled(10), "vmin": 0, "vmax": 1e7, "extend": "max"}
_style_thermal = {"cmap": plt.cm.YlOrRd.resampled(10), "vmin": 0, "vmax": 6e5, "extend": "max"}
_style_solar_std   = _style_std | {"cmap": plt.cm.YlOrRd.resampled(10), "vmax": 2.5e6}
_style_thermal_std = _style_std | {"cmap": plt.cm.YlOrRd.resampled(10), "vmax": 6e5}

# Thermal indices
_style_utci = {"cmap": plt.cm.Spectral_r.resampled(11), "vmin": -13, "vmax": 46, "extend": "both"}
_style_utci_std = _style_std | {"cmap": plt.cm.Spectral_r.resampled(11), "vmax": 5}

# Individual styles
# Set up like this so they can still be edited individually
styles = {
    Ta: Style(**_style_t),            f"{Ta}_std": Style(**_style_t_std),
    Td: Style(**_style_t),            f"{Td}_std": Style(**_style_t_std),
    va: Style(**_style_wind),         f"{va}_std": Style(**_style_wind_std),
    u10: Style(**_style_wind_vector), f"{u10}_std": Style(**_style_wind_std),
    v10: Style(**_style_wind_vector), f"{v10}_std": Style(**_style_wind_std),
    S: Style(**_style_solar),         f"{S}_std": Style(**_style_solar_std),
    L: Style(**_style_thermal),       f"{L}_std": Style(**_style_thermal_std),
    "utci": Style(**_style_utci),      "utci_std": Style(**_style_utci_std),
}

# Apply general settings
for style in styles.values():
    style.normalize = False

In [None]:
# Visualisation: Helper functions, general
def _glue_or_show(fig: plt.Figure, glue_label: Optional[str]=None) -> None:
    """
    If `glue` is available and a label is provided, glue the figure using that label.
    If not, display the figure in the notebook.
    """
    try:
        if glue is None or glue_label is None:
            plt.show()
        else:
            glue(glue_label, fig, display=False)
    finally:
        plt.close(fig)

def _add_textbox_to_subplots(text: str, *axs: Iterable[plt.Axes | ekp.Subplot], right=False) -> None:
    """ Add a text box to each of the specified subplots. """
    # Get the plt.Axes for each ekp.Subplot
    axs = [subplot.ax if isinstance(subplot, ekp.Subplot) else subplot for subplot in axs]

    # Set up location
    x = 0.95 if right else 0.05
    horizontalalignment = "right" if right else "left"

    # Add the text
    for ax in axs:
        ax.text(x, 0.95, text, transform=ax.transAxes,
        horizontalalignment=horizontalalignment, verticalalignment="top",
        bbox={"facecolor": "white", "edgecolor": "black", "boxstyle": "round",
              "alpha": 1})

def _sharexy(axs: np.ndarray) -> None:
    """ Force all of the axes in axs to share x and y with the first element. """
    main_ax = axs.ravel()[0]
    for ax in axs.ravel():
        ax.sharex(main_ax)
        ax.sharey(main_ax)

def _symmetric_xlim(ax: plt.Axes) -> None:
    """ Adjust the xlims for one Axes to be symmetric, based on existing values. """
    current = ax.get_xlim()
    current = np.abs(current)
    maxlim = np.max(current)
    newlim = (-maxlim, maxlim)

    ax.set_xlim(newlim)

In [None]:
def decorate_fig(fig: ekp.Figure, *, title: Optional[str]="") -> None:
    """ Decorate an earthkit figure with land, coastlines, etc. """
    # Add progress bar because individual steps can be very slow for large plots
    with tqdm(total=5, desc="Decorating", leave=False) as progressbar:
        fig.land()
        progressbar.update()
        fig.coastlines()
        progressbar.update()
        fig.borders()
        progressbar.update()
        fig.gridlines(linestyle=plt.rcParams["grid.linestyle"])
        progressbar.update()
        fig.title(title)
        progressbar.update()

(section-consistency)=
### 2. Consistency between ERA5 and ERA5-HEAT

This section is a sanity check on the UTCI calculation, not a full consistency analysis (which could be its own notebook if desired).
We download ERA5 at its native resolution for a subset of space and time, calculate the thermal comfort indices from it, and compare the result to ERA5-HEAT to verify that the indices are calculated correctly.

#### General setup
This notebook uses [earthkit-data](https://github.com/ecmwf/earthkit-data) to download files from the CDS.
If you intend to run this notebook multiple times, it is highly recommended that you [enable caching](https://earthkit-data.readthedocs.io/en/latest/guide/caching.html) to prevent having to download the same files multiple times.

We will be downloading multiple datasets in this notebook.
In this section, we define the parameters common to all datasets: time and space.
This way, these only need to be changed in one place if you wish to modify the notebook for your own use case.

In this example, we will be looking at data for Europe for every day in June–August 2024 (the cell below defaults to the smaller Italy domain for testing).
We will also do a time series comparison at one site within this area.
These settings can easily be changed by editing the variables in the following cell:

In [None]:
# domain = ekp.geo.domains.Domain.from_string("Europe")
domain = ekp.geo.domains.Domain.from_string("Italy")  # Smaller domain for testing
year = 2024
months = [6, 7, 8]

In [None]:
request_domain = domain_to_request(domain)

request_time = {
    "year": [f"{year}"],
    "month": [f"{month:02}" for month in months],
    "day": [f"{d:02}" for d in range(1, 32)],
}
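
The dictionary union operator `|` (Python 3.9 and later) is what allows the shared time and space settings to be reused across datasets. A minimal sketch, in which all names and values are illustrative stand-ins for the dictionaries defined above:

```python
# Stand-ins for the shared settings (values are illustrative)
shared_domain = {"area": [49, 5, 34, 20]}
shared_time = {"year": ["2024"], "month": ["06", "07", "08"]}

# Dataset-specific part of a request
specific = {"variable": ["2m_temperature"], "product_type": ["reanalysis"]}

# Operands are merged left to right; if a key appears more than once,
# the value from the right-most dictionary is kept
request = specific | shared_domain | shared_time
print(sorted(request))
```

Because each part lives in its own dictionary, changing the domain or period in one place updates every request built from it.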

#### ERA5-HEAT

In [None]:
# Download ERA5-HEAT
era5heat_ID = "derived-utci-historical"

request_era5heat = {
    "variable": [
        "universal_thermal_climate_index",
        "mean_radiant_temperature"
    ],
    "version": "1_1",
    "product_type": "consolidated_dataset",
} | request_domain | request_time

ds_era5heat = ekd.from_source("cds", era5heat_ID, request_era5heat)

In [None]:
data_era5heat = ds_era5heat.to_xarray(compat="equals")
data_era5heat

#### ERA5

Following [Di Napoli et al. (2021)](https://doi.org/10.1002/gdj3.102), the indices are calculated from the following ERA5 variables:
* 2 m air temperature (Ta)
* 2 m dew point temperature (Td)
* wind speed at 10 m above ground level (va)
* solar radiation at the Earth's surface
* thermal radiation at the Earth's surface
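
ERA5 provides surface radiation as energy accumulated over the preceding period, in J m-2, rather than as a flux. The sketch below converts this to W m-2. It assumes an accumulation period of 1 hour for the reanalysis and 3 hours for the ensemble products, which is how the ERA5 documentation describes them but is worth verifying for the data you download. The function name and input values are illustrative.

```python
def accumulated_to_flux(accumulated_j_m2, hours=1.0):
    """ Convert accumulated energy [J m-2] to a mean flux [W m-2]. """
    return accumulated_j_m2 / (hours * 3600.0)


# Illustrative values only
flux_reanalysis = accumulated_to_flux(1.8e6)          # Assumed 1 h accumulation
flux_ensemble = accumulated_to_flux(5.4e6, hours=3)   # Assumed 3 h accumulation
print(flux_reanalysis, flux_ensemble)
```

The function uses only arithmetic, so it applies unchanged to `xarray` objects such as `data_era5[S]`.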

In [None]:
# Download ERA5
# General setup
era5_ID = "reanalysis-era5-single-levels"

request_era5_files = {
    "data_format": "netcdf",
    "download_format": "zip",
}

request_era5_all = {
    "variable": [
        "10m_u_component_of_wind",
        "10m_v_component_of_wind",
        "2m_dewpoint_temperature",
        "2m_temperature",
        "surface_net_thermal_radiation",
        "surface_net_solar_radiation",
    ],
}

In [None]:
# Download ERA5
# Reanalysis
request_reanalysis = {
    "product_type": ["reanalysis"],
    "time": [f"{h:02}:00" for h in range(0, 24)],
}

request_era5_reanalysis_all = request_reanalysis | request_era5_all | request_era5_files | request_time | request_domain

ds_era5 = ekd.from_source("cds", era5_ID, request_era5_reanalysis_all)

In [None]:
data_era5 = ds_era5.to_xarray(compat="equals")

In [None]:
data_era5

In [None]:
data_era5 = add_10m_wind(data_era5)

In [None]:
data_era5

In [None]:
# Calculate indices

#### Comparison

In [None]:
date_time = f"{year}0831T12"  # 31 August, noon
data_era5heat_singleday = data_era5heat.sel(time=date_time)
data_era5_singleday = data_era5.sel(valid_time=date_time)

In [None]:
# Show side-by-side
fig = ekp.Figure(rows=1, columns=3, size=(10, 10))

# Plot datasets
subplot = fig.add_map(domain=domain)
subplot.grid_cells(data_era5heat_singleday, z="utci", style=styles["utci"])
subplot.legend()
subplot.ax.set_title("ERA5-HEAT")

subplot = fig.add_map(domain=domain)
subplot.grid_cells(data_era5_singleday, z=Ta, style=styles[Ta])
subplot.legend()
subplot.ax.set_title("ERA5")

# Show differences

# Decorate figures
decorate_fig(fig)

# Glue the figure when building the Jupyter-book web version;
# show it when running this notebook yourself
_glue_or_show(fig.fig, "fig_geo")  # fig.fig: matplotlib fig extracted from earthkit fig

In [None]:
# Calculate difference metrics
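
One possible set of difference metrics is sketched below with NumPy on synthetic arrays. The function name and the choice of metrics (mean bias, RMSE, maximum absolute difference) are assumptions rather than the final method of this assessment. Both fields are assumed to be on the same grid and time steps.

```python
import numpy as np


def difference_metrics(reference, candidate):
    """ Summarise candidate - reference, ignoring missing values. """
    diff = np.asarray(candidate, dtype=float) - np.asarray(reference, dtype=float)
    return {
        "bias": float(np.nanmean(diff)),
        "rmse": float(np.sqrt(np.nanmean(diff**2))),
        "max_abs": float(np.nanmax(np.abs(diff))),
    }


# Synthetic example: a 2x2 field with one missing value
reference = np.array([[20.0, 25.0], [30.0, np.nan]])
candidate = reference + np.array([[0.5, -0.5], [1.0, 0.0]])
metrics = difference_metrics(reference, candidate)
print(metrics)
```

The NaN-aware reductions matter here because gridded products can contain missing values, which would otherwise turn every metric into NaN.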

(section-uncertainty)=
### 3. Uncertainty

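In this section, we apply the indices to the members of the ERA5 ensemble of data assimilations (ERA5-EDA) and use the spread between the members as a measure of uncertainty.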
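
For a non-linear index, the order of operations matters: applying the index to each member and then taking statistics is not the same as applying it to the ensemble mean. A sketch on synthetic data, where `toy_index` is a hypothetical stand-in for UTCI and the 10-member ensemble is randomly generated:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic ensemble: 10 members on a 4x5 grid
t2m = 300.0 + rng.normal(scale=0.5, size=(10, 4, 5))          # [K]
wind = np.abs(3.0 + rng.normal(scale=0.3, size=(10, 4, 5)))   # [m s-1]


def toy_index(temperature, wind_speed):
    """ Hypothetical non-linear index; NOT the UTCI polynomial. """
    return temperature - 2.0 * np.sqrt(wind_speed)


# Apply the index per member first, then take statistics over members (axis 0)
index_members = toy_index(t2m, wind)
index_mean = index_members.mean(axis=0)
index_spread = index_members.std(axis=0)  # Population standard deviation (ddof=0)

print(index_mean.shape, float(index_spread.max()))
```

`np.std` and `xarray`'s `.std` both default to `ddof=0`; with only a handful of members, `ddof=1` may be worth considering.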

In [None]:
# Download ERA5 ensemble
request_time_ensemble = {
    "time": [f"{h:02}:00" for h in range(0, 22, 3)],
} | request_time

request_ensemble_members = {
    "product_type": ["ensemble_members"],  
}

request_ensemble_stats = {
    "product_type": [
        "ensemble_mean",
        "ensemble_spread"
    ],  
}

request_era5_ensemble_members = request_ensemble_members | request_era5_all | request_era5_files | request_time_ensemble | request_domain
# request_era5_ensemble_stats   = request_ensemble_stats   | request_era5_all | request_era5_files | request_time_ensemble | request_domain

ds_era5_members = ekd.from_source("cds", era5_ID, request_era5_ensemble_members)
# ds_era5_stats   = ekd.from_source("cds", era5_ID, request_era5_ensemble_stats)

In [None]:
ensemble_members = ds_era5_members.to_xarray()

In [None]:
ensemble_members = add_10m_wind(ensemble_members)

In [None]:
ensemble_members

In [None]:
ensemble_singleday = ensemble_members.sel(valid_time=date_time)

In [None]:
ensemble_singleday[S].max().values

In [None]:
# Show side-by-side
n_member, n_var = len(ensemble_members["number"]), len(ensemble_members.data_vars)
fig = ekp.Figure(rows=n_member, columns=n_var, size=(2*n_var, 2*n_member))

# Loop over ensemble members
for n, mem in loop_over_ensemble_members(ensemble_singleday):
    for var in loop_over_data_variables(ensemble_singleday, progress=False):
        # Plot single variable
        subplot = fig.add_map(domain=domain)
        subplot.grid_cells(mem, z=var, style=styles[var])

# Legend at the bottom
for subplot in fig.subplots[-n_var:]:
    subplot.legend()

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

In [None]:
# ensemble_stats = ds_era5_stats.to_xarray()
# ensemble_stats

In [None]:
# Calculate indices

In [None]:
# Show side-by-side

In [None]:
ensemble_mean = ensemble_members.mean(dim="number")
ensemble_std  = ensemble_members.std(dim="number")  \
                                .rename_vars({var: f"{var}_std" for var in ensemble_mean.data_vars})

ensemble_stats = xr.merge([ensemble_mean, ensemble_std], compat="equals", join="outer")

In [None]:
ensemble_stats_singleday = ensemble_stats.sel(valid_time=date_time)

In [None]:
# Show mean and spread
n_stats, n_var = 2, len(ensemble_mean.data_vars)
fig = ekp.Figure(rows=n_var, columns=n_stats, size=(3*n_stats, 2*n_var))

# Loop over variables
for var in loop_over_data_variables(ensemble_singleday):
    # Setup
    var_std = var+"_std"

    # Plot single variable
    subplot_mean, subplot_std = [fig.add_map(domain=domain) for _ in range(2)]
    subplot_mean.grid_cells(ensemble_stats_singleday, z=var, style=styles[var])
    subplot_std.grid_cells(ensemble_stats_singleday, z=var_std, style=styles[var_std])

    # Legend on either side
    subplot_mean.legend(location="left")
    subplot_std.legend(location="right")

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

In [None]:
# Show comparison to high-resolution ERA5 / ERA5-HEAT

## ℹ️ If you want to know more

### Key resources
The CDS catalogue entries for the data used were:
* Thermal comfort indices derived from ERA5 reanalysis (ERA5-HEAT): [derived-utci-historical](https://doi.org/10.24381/cds.553b7518)
* ERA5 hourly data on single levels from 1940 to present: [reanalysis-era5-single-levels](https://doi.org/10.24381/cds.adbb2d47)

Code libraries used:
* [earthkit](https://github.com/ecmwf/earthkit)
  * [earthkit-data](https://github.com/ecmwf/earthkit-data)
  * [earthkit-plots](https://github.com/ecmwf/earthkit-plots)

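More about thermal comfort indices:
* [Mean radiant temperature from global-scale numerical weather prediction models](https://doi.org/10.1007/s00484-020-01900-5)
* [ERA5-HEAT: A global gridded historical dataset of human thermal comfort indices from climate reanalysis](https://doi.org/10.1002/gdj3.102)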

More about reanalysis data:
* [The ERA5 global reanalysis](https://doi.org/10.1002/qj.3803)

More about heatwaves:
* [](../Applications/application_pulse_extreme-events_q01)

### References