<img src='./img/LogoWekeo_Copernicus_RGB_0.png' align='right' width='20%'></img>

# Tutorial on creating a climate index for wind chill
In this tutorial we will plot a map of wind chill over Europe using regional climate reanalysis data (UERRA) of wind speed and temperature. Working in the WEkEO JupyterHub, we will download this data using the WEkEO HDA API client. The tutorial comprises the following steps:

1. [Search and download](#search_download) regional climate reanalysis data (UERRA) of 10m wind speed and 2m temperature.
2. [Read data](#read_data): Once downloaded, we will read and understand the data, including its variables and coordinates.
3. [Calculate wind chill index](#wind_chill): We will calculate the wind chill index from the two parameters of wind speed and temperature, and view a map of average wind chill over Europe.
4. [Calculate wind chill with ERA5](#era5): In order to assess the reliability of the results, repeat the process with ERA5 reanalysis data and compare the results with those derived with UERRA.

<img src='./img/climate_indices.png' align='center' width='100%'></img>

## <a id='search_download'></a>1. Search and download data

Before we begin, we must prepare our environment. This includes installing the HDA Application Programming Interface (API) client and importing the various Python libraries that we will need.

#### Import libraries

We will be working with data in NetCDF format. To best handle this data we need a number of libraries for working with multidimensional arrays, in particular Xarray. We will also need libraries for plotting and viewing data, in particular Matplotlib and Cartopy.

In [None]:
# Libraries for working with multidimensional arrays
import numpy as np
import xarray as xr
import os
import glob

# Libraries for plotting and visualising data
import matplotlib.path as mpath
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

#### Install the WEkEO HDA client

The WEkEO HDA client is a Python-based library. To install it on Unix/Linux via the package management system pip, run the command shown below. The leading exclamation mark passes the command to the shell (not to the Python interpreter).

In [None]:
!pip install -U hda

Please verify that the following requirements are installed before moving on to the next step:
   - Python 3
   - requests
   - tqdm

#### Load WEkEO HDA client

The `hda` package provides a Python 3 client that can be used to search and download products using the Harmonised Data Access (HDA) WEkEO API.
HDA is a RESTful interface that allows users to search and download WEkEO datasets.
Documentation about its usage can be found at the <a href='https://www.wekeo.eu/' target='_blank'>WEkEO website</a>.

In [None]:
from hda import Client

### <a id='wekeo_search'></a>Search for datasets on WEkEO

Go to <a href='https://wekeo.eu/data?view=catalogue' target='_blank'>WEkEO DATA</a>. Clicking the + to add a layer opens a catalogue search. Here you can use free text, or you can use the filter options on the left to refine your search by satellite platform, sensor, Copernicus service, area (region of interest), general time period (past or future), as well as through a variety of flags.

You can click on the dataset you are interested in and you will be guided to a range of details including the dataset temporal and spatial extent, collection ID, and metadata.

Now search for the product `UERRA regional reanalysis for Europe on single levels from 1961 to 2019`. You can find it more easily by selecting 'UERRA' in the 'COPERNICUS SERVICE' filter group. 

Once you have found it, select 'Details' to read the dataset description.

<br>

<div style='text-align:center;'>
<figure><img src='./img/WEKEO_UERRA_data.png' width='70%' />
    <figcaption><i>WEkEO interface to search for datasets</i></figcaption>
</figure>
</div>

The dataset description provides the following information:
- **Abstract**, containing a general description of the dataset,
- **Classification**, including the Dataset ID,
- **Resources**, such as a link to the Product Data Format Specification guide, and JSON metadata,
- **Contacts**, where you can find further information about the data source from its provider.

You need the `Dataset ID` to request data from the Harmonised Data Access API. 

<br>

<div style='text-align:center;'>
<figure><img src='./img/UERRA_info.png' width='50%' />
    <figcaption><i>Dataset information on WEkEO</i></figcaption>
</figure>
</div>
<br>

Let's store the Dataset ID as a variable called `dataset_id` to be used later.

In [None]:
dataset_id = "EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS"

Now select `Add to map` in the data description to add the selected dataset to the list of layers in your map view. Once the dataset appears as a layer, select the `subset and download` icon. This will enable you to specify the variables, the temporal extent and, in some cases, the geographic extent of the data you would like to download. Select the dataset information and then select `NetCDF` as format.

Now select `Show API request`. This will show the details of your selection in `JSON` format. If you now select `Copy`, you can copy these details to the clipboard and then paste them either into a text file to create a `JSON` file (see example [here](./SeaLevel_data_descriptor.json)), or directly into the cell below.

The Harmonised Data Access API can read this information, which is in the form of a dictionary.

<br>

<div style='text-align:center;'>
<figure><img src='./img/WEKEO_UERRA_params_json.png' width='60%' />
    <figcaption><i>Displaying a JSON query from a request made to the Harmonised Data Access API through the data portal</i></figcaption>
</figure>
</div>
<br>

#### Configure the WEkEO API Authentication

In order to interact with WEkEO's Harmonised Data Access API, first make sure that the file "$HOME/.hdarc" exists and contains the URL of the API endpoint together with your username and password.

To check whether the file .hdarc exists in your $HOME directory, open a terminal and list the hidden files there.

If the file is missing, create "$HOME/.hdarc" (in your Unix/Linux environment) and fill it in with the credentials of your WEkEO account.

If you don't have a WEkEO account, please register at the <a href='https://my.wekeo.eu/web/guest/user-registration' target='_blank'>WEkEO registration page</a>.
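
As a minimal sketch of the check described above, the cell below looks for `.hdarc` in the home directory and builds the text of a template file. The helper names `hdarc_path` and `hdarc_template` are hypothetical, and the key names `url`, `user` and `password` are an assumption about what the `hda` client expects; please confirm them, and the API URL for your account, against the current `hda` documentation before writing the file.

```python
from pathlib import Path


def hdarc_path(home=None):
    """Return the expected location of the .hdarc file."""
    base = Path(home) if home is not None else Path.home()
    return base / ".hdarc"


def hdarc_template(user, password, url=None):
    """Build the text of a .hdarc file (key names are an assumption)."""
    lines = []
    if url is not None:
        # Only written when a URL is supplied; no default URL is assumed here
        lines.append(f"url: {url}")
    lines.append(f"user: {user}")
    lines.append(f"password: {password}")
    return "\n".join(lines) + "\n"


path = hdarc_path()
print(f"{path} exists: {path.exists()}")
print(hdarc_template("<your_wekeo_username>", "<your_wekeo_password>"))
```

Nothing is written to disk here; once you are happy with the printed template you can save it to the reported path with a text editor.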

#### Load data descriptor file and download data

The Harmonised Data Access API can read your data request from a dictionary. In this dictionary, you can describe the dataset you are interested in downloading.

In [4]:
data = {
  "dataset_id": "EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS",
  "origin": "uerra_harmonie",
  "variable": "10m_wind_speed",
  "year": [
    "1998",
    "1999",
    "2000",
    "2001",
    "2002",
    "2003",
    "2004",
    "2005",
    "2006",
    "2007",
    "2008",
    "2009",
    "2010",
    "2011",
    "2012",
    "2013",
    "2014",
    "2015",
    "2016",
    "2017",
    "2018"
  ],
  "month": [
    "12"
  ],
  "day": [
    "15"
  ],
  "time": [
    "12:00"
  ],
  "format": "netcdf"
}
data

{'dataset_id': 'EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS',
 'origin': 'uerra_harmonie',
 'variable': '10m_wind_speed',
 'year': [
    "1998",
    "1999",
    "2000",
    "2001",
    "2002",
    "2003",
    "2004",
    "2005",
    "2006",
    "2007",
    "2008",
    "2009",
    "2010",
    "2011",
    "2012",
    "2013",
    "2014",
    "2015",
    "2016",
    "2017",
    "2018"
  ],
 'month': ['12'],
 'day': ['15'],
 'time': ['12:00'],
 'format': 'netcdf'}

In [5]:
data_2 = {
  "dataset_id": "EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS",
  "origin": "uerra_harmonie",
  "variable": "2m_temperature",
  "year": [
    "1998",
    "1999",
    "2000",
    "2001",
    "2002",
    "2003",
    "2004",
    "2005",
    "2006",
    "2007",
    "2008",
    "2009",
    "2010",
    "2011",
    "2012",
    "2013",
    "2014",
    "2015",
    "2016",
    "2017",
    "2018"
  ],
  "month": [
    "12"
  ],
  "day": [
    "15"
  ],
  "time": [
    "12:00"
  ],
  "format": "netcdf"
}
data_2

{'dataset_id': 'EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS',
 'origin': 'uerra_harmonie',
 'variable': '2m_temperature',
  "year": [
    "1998",
    "1999",
    "2000",
    "2001",
    "2002",
    "2003",
    "2004",
    "2005",
    "2006",
    "2007",
    "2008",
    "2009",
    "2010",
    "2011",
    "2012",
    "2013",
    "2014",
    "2015",
    "2016",
    "2017",
    "2018"
  ],
 'month': ['12'],
 'day': ['15'],
 'time': ['12:00'],
 'format': 'netcdf'}
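
The two request dictionaries above differ only in the `variable` entry. As an optional sketch, a small helper can build both from one template so that the year list is generated rather than typed out. The function name `build_request` and its parameters are hypothetical; the keys and values are copied from the requests above.

```python
def build_request(variable, first_year=1998, last_year=2018):
    """Return a request dictionary shaped like the ones shown above."""
    return {
        "dataset_id": "EO:ECMWF:DAT:REANALYSIS_UERRA_EUROPE_SINGLE_LEVELS",
        "origin": "uerra_harmonie",
        "variable": variable,
        # One string per year, both end years included
        "year": [str(year) for year in range(first_year, last_year + 1)],
        "month": ["12"],
        "day": ["15"],
        "time": ["12:00"],
        "format": "netcdf",
    }


wind_request = build_request("10m_wind_speed")
temperature_request = build_request("2m_temperature")
print(len(wind_request["year"]), wind_request["year"][0], wind_request["year"][-1])
```

The printed summary shows 21 years, running from 1998 to 2018, which matches the lists written out by hand.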

As a final step, you can use the client directly to download the data, as in the following examples.

In [None]:
c = Client(debug=True)

matches = c.search(data)
print(matches)
matches.download()

In [None]:
c = Client(debug=True)

matches = c.search(data_2)
print(matches)
matches.download()

The code below finds all the NetCDF files in the current directory and reports the names of the two most recently downloaded ones: the latest and the one before it.

In [None]:
# List the .nc files in the current directory, most recently modified first
files = [file for file in os.listdir(".") if file.lower().endswith('.nc')]
list_nc_file = sorted(files, key=os.path.getmtime, reverse=True)

# The wind speed request ran first, so its file is the second newest
print(f'{list_nc_file[1]} is downloaded from the first data request')
print(f'{list_nc_file[0]} is downloaded from the second data request')

In addition, the .nc files can be renamed once the downloads have finished. For example, the file from the first download can be renamed to 'UERRA_ws10m.nc' and the file from the second to 'UERRA_t2m.nc', as follows.

In [None]:
# rename nc file
os.rename(list_nc_file[1], 'UERRA_ws10m.nc')
os.rename(list_nc_file[0], 'UERRA_t2m.nc')

If the operation is not permitted, you can also right-click on a file to rename it manually.
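
Picking files by modification time assumes that at least two NetCDF files are present and that the two newest ones come from the requests above. The sketch below, with a hypothetical helper `newest_nc_files`, performs the same lookup with `pathlib` and raises a clear error when too few files are found instead of failing with an index error.

```python
from pathlib import Path


def newest_nc_files(directory=".", count=2):
    """Return the `count` most recently modified .nc files, newest first."""
    nc_files = [p for p in Path(directory).iterdir()
                if p.is_file() and p.suffix.lower() == ".nc"]
    if len(nc_files) < count:
        raise ValueError(
            f"expected at least {count} .nc files in {directory!r}, "
            f"found {len(nc_files)}"
        )
    newest_first = sorted(nc_files, key=lambda p: p.stat().st_mtime, reverse=True)
    return newest_first[:count]


try:
    second_download, first_download = newest_nc_files(".")
    print(f"{first_download.name} is from the first data request")
    print(f"{second_download.name} is from the second data request")
except ValueError as error:
    print(f"Nothing to rename yet: {error}")
```

If other NetCDF files have been modified since the downloads, it is safer to check the printed names before renaming anything.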

## <a id='read_data'></a>2. Read Data

Now that we have downloaded the data, we can start to play ...

We have requested the data in NetCDF format. This is a commonly used format for array-oriented scientific data. 

To read and process this data we will make use of the Xarray library. Xarray is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun! We will read the data from our NetCDF file into an Xarray **"dataset"**.

In [None]:
fw = 'UERRA_ws10m.nc'
ft = 'UERRA_t2m.nc'

# Create Xarray Dataset
dw = xr.open_dataset(fw)
dt = xr.open_dataset(ft)

Now we can query our newly created Xarray datasets ...

In [None]:
dw

In [None]:
dt

We see that dw (dataset for wind speed) has one variable called **"si10"**. If you view the documentation for this dataset on the CDS you will see that this is the wind speed valid for a grid cell at the height of 10m above the surface. It is computed from both the zonal (u) and the meridional (v) wind components by 
$\sqrt{(u^{2} + v^{2})}$.

The units are m/s.

The other dataset, dt (2m temperature), has a variable called **"t2m"**. According to the documentation on the CDS this is air temperature valid for a grid cell at the height of 2m above the surface, in units of Kelvin.

While an Xarray **dataset** may contain multiple variables, an Xarray **data array** holds a single multi-dimensional variable and its coordinates. To make the processing of the **si10** and **t2m** data easier, we will convert them into Xarray data arrays.

In [None]:
# Create Xarray Data Arrays
aw = dw['si10']
at = dt['t2m']
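
To illustrate the difference between the two container types without relying on the downloaded files, the sketch below builds a small synthetic dataset and extracts its single variable. The variable name `si10` is reused from above, while the dimension names `time`, `y` and `x`, the array shape and the values are illustrative assumptions and may not match the real files.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the downloaded file: 3 time steps on a 2 x 2 grid
demo_ds = xr.Dataset(
    data_vars={"si10": (("time", "y", "x"), np.full((3, 2, 2), 5.0))},
    coords={"time": np.arange(3)},
)

# Indexing a Dataset by variable name returns a DataArray
demo_da = demo_ds["si10"]

print(type(demo_ds).__name__, list(demo_ds.data_vars))
print(type(demo_da).__name__, demo_da.dims, demo_da.shape)
```

The data array keeps the dimensions and coordinates of the dataset it came from, which is why arithmetic on `aw` and `at` later stays aligned on the same grid.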

## <a id='wind_chill'></a>3. Calculate wind chill index
There are several indices to calculate wind chill based on air temperature and wind speed. Until recently, a commonly applied index was the following:

$\textit{WCI} = (10 \sqrt{\upsilon}-\upsilon + 10.5) \cdot (33 - \textit{T}_{a})$

where:
- WCI = wind chill index, kcal/m$^{2}$/h
- $\upsilon$ = wind velocity, m/s
- $\textit{T}_{a}$ = air temperature, °C
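
As a quick worked example of the older index above, the sketch below evaluates it for a single pair of values. The function name `older_wind_chill_index` and the sample inputs (4 m/s and -7 °C) are illustrative choices, not values taken from the data.

```python
import math


def older_wind_chill_index(wind_ms, temp_c):
    """Older wind chill index: wind speed in m/s, air temperature in deg C."""
    return (10 * math.sqrt(wind_ms) - wind_ms + 10.5) * (33 - temp_c)


# (10 * 2 - 4 + 10.5) * (33 + 7) = 26.5 * 40
print(older_wind_chill_index(4.0, -7.0))  # 1060.0
```

Note that this index describes a rate of heat loss rather than a temperature, which makes it harder to interpret than the index used in the rest of this tutorial.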

We will use the more recently adopted North American and United Kingdom wind chill index, which is calculated as follows:

$\textit{T}_{WC} = 13.12 + 0.6215\textit{T}_{a} - 11.37\upsilon^{0.16} + 0.3965\textit{T}_{a}\upsilon^{0.16}$

where:
- $\textit{T}_{WC}$ = wind chill index, expressed as an equivalent temperature in degrees Celsius
- $\textit{T}_{a}$ = air temperature in degrees Celsius
- $\upsilon$ = wind speed at 10 m standard anemometer height, in kilometres per hour
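
Before applying the formula above to whole arrays, it can help to evaluate it for a single point. The sketch below does this in plain Python; the function name `wind_chill` and the sample inputs (-10 °C and 20 km/h) are illustrative.

```python
def wind_chill(temp_c, wind_kmh):
    """Wind chill index: air temperature in deg C, 10 m wind speed in km/h."""
    wind_term = wind_kmh ** 0.16
    return 13.12 + 0.6215 * temp_c - 11.37 * wind_term + 0.3965 * temp_c * wind_term


print(round(wind_chill(-10.0, 20.0), 1))
```

For these inputs the result is close to -18, so a 20 km/h wind makes -10 °C feel roughly eight degrees colder.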

To calculate $\textit{T}_{WC}$ we first have to ensure our data is in the right units. For the wind speed we need to convert from m/s to km/h, and for air temperature we need to convert from Kelvin to degrees Celsius:

In [None]:
# wind speed, convert from m/s to km/h: si10 * (60*60) / 1000
w = aw * 3600 / 1000
# air temperature, convert from Kelvin to Celsius: t2m - 273.15
t = at - 273.15
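
The two conversions can be checked on single values before trusting them on the full arrays. The helper names `ms_to_kmh` and `kelvin_to_celsius` below are hypothetical and simply wrap the same arithmetic as the cell above.

```python
def ms_to_kmh(speed_ms):
    """Convert metres per second to kilometres per hour."""
    return speed_ms * 3600 / 1000


def kelvin_to_celsius(temp_k):
    """Convert Kelvin to degrees Celsius."""
    return temp_k - 273.15


print(ms_to_kmh(5.0))             # 18.0
print(kelvin_to_celsius(273.15))  # 0.0
```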

Now we can calculate the North American and United Kingdom wind chill index:
$\textit{T}_{WC} = 13.12 + 0.6215\textit{T}_{a} - 11.37\upsilon^{0.16} + 0.3965\textit{T}_{a}\upsilon^{0.16}$

In [None]:
twc = 13.12 + (0.6215*t) - (11.37*(w**0.16)) + (0.3965*t*(w**0.16))

Let's calculate the average wind chill for 12:00 on 15 December for the 21-year period from 1998 to 2018:

In [None]:
twc_mean = twc.mean(dim='time')
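
The order of operations matters here: the index is computed for each time step first and averaged afterwards. Because the formula is not linear in wind speed, averaging the inputs first would give a different answer. The sketch below illustrates this with two made-up samples; the function `wind_chill` repeats the formula used above and the sample values are purely illustrative.

```python
def wind_chill(temp_c, wind_kmh):
    """Wind chill index: air temperature in deg C, wind speed in km/h."""
    wind_term = wind_kmh ** 0.16
    return 13.12 + 0.6215 * temp_c - 11.37 * wind_term + 0.3965 * temp_c * wind_term


# Two made-up samples: (air temperature in deg C, wind speed in km/h)
samples = [(-10.0, 5.0), (-10.0, 45.0)]

# Order used in this tutorial: index per sample, then the mean
mean_of_index = sum(wind_chill(t, v) for t, v in samples) / len(samples)

# Alternative order: mean of the inputs, then a single index
mean_temp = sum(t for t, _ in samples) / len(samples)
mean_wind = sum(v for _, v in samples) / len(samples)
index_of_means = wind_chill(mean_temp, mean_wind)

print(round(mean_of_index, 1), round(index_of_means, 1))
```

In this example the two results differ by more than a degree, so it is worth stating which order was used whenever results are compared.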

Now let's plot the average wind chill for this time over Europe:

In [None]:
# create the figure panel 
fig = plt.figure(figsize=(10,10))
# create the map using the cartopy Orthographic projection
ax = plt.subplot(1,1,1, projection=ccrs.Orthographic(central_longitude=8., central_latitude=42.))
# add coastlines
ax.coastlines()
ax.gridlines(draw_labels=False, linewidth=1, color='gray', alpha=0.5, linestyle='--')
# provide a title
ax.set_title('Wind Chill Index 12:00, 15 Dec, 1998 to 2018')
# plot twc
im = plt.pcolormesh(twc_mean.longitude, twc_mean.latitude,
                    twc_mean, cmap='viridis', transform=ccrs.PlateCarree())
# add colourbar
cbar = plt.colorbar(im)
cbar.set_label('Wind Chill Index')

Can you identify areas where frostbite may occur (see chart below)?

<img src='./img/Windchill_effect_en.svg' align='left' width='60%'></img>

RicHard-59, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons

## <a id='era5'></a>4. Exercise: Repeat process with ERA5 data and compare results
So far you have plotted wind chill using the UERRA regional reanalysis dataset, but how accurate is this plot? One way to assess a dataset is to compare it with an alternative independent one to see what differences there may be. An alternative to UERRA is the ERA5 reanalysis data that you used in the previous tutorials. Repeat the steps above with ERA5 and compare your results with those obtained using UERRA.
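
One simple way to quantify the comparison is to compute the mean difference (bias) and the root-mean-square difference between the two wind chill fields at matching locations. The sketch below assumes that both datasets have already been brought onto the same points, which in practice requires regridding since ERA5 and UERRA use different grids. The function name `bias_and_rmse` and the sample values are made up for illustration.

```python
import math


def bias_and_rmse(reference, candidate):
    """Return (mean difference, root-mean-square difference) of two sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    differences = [c - r for r, c in zip(reference, candidate)]
    bias = sum(differences) / len(differences)
    rmse = math.sqrt(sum(d * d for d in differences) / len(differences))
    return bias, rmse


# Made-up wind chill values at four co-located points
uerra_values = [-5.0, -12.0, 3.0, 0.0]
era5_values = [-4.0, -13.0, 4.0, 1.0]

bias, rmse = bias_and_rmse(uerra_values, era5_values)
print(bias, rmse)  # 0.5 1.0
```

A bias near zero with a large root-mean-square difference would suggest that the datasets agree on average but disagree locally, for example over complex terrain.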

<hr>

<p><img src='./img/all_partners_wekeo.png' align='left' alt='Logo EU Copernicus' width='100%'></img></p>