# Part II Hands-on: ESA Implementation and Normalization

- Hands-on exercise for the StuMeTa 2021 Workshop "__A Practical Introduction to Ensemble Sensitivity Analysis__"
- Presentation by [Christopher Polster](mailto:cpolster@uni-mainz.de)
- Material: https://github.com/chpolste/ESA-Workshop


## Objective

Implement ESA and compare normalizations.


## Preparation

In [None]:
import datetime as dt
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
from cartopy import crs

%matplotlib inline

In [None]:
# Map projection
carree = crs.PlateCarree()

# Contour setup for ensemble-mean 500 hPa Geopotential
z500_kwargs = {
    "levels": [4900, 5100, 5300, 5500, 5700, 5900],
    "colors": "black",
    "linewidths": 1
}

# add_axes with basic setup of a map
def add_map(fig, pos):
    ax = fig.add_axes(pos, projection=carree)
    ax.set_ylim((10, 90))
    ax.set_xlim((-180, 60))
    ax.coastlines()
    grid = ax.gridlines(
        xlocs=[-150, -120, -90, -60, -30, 0, 30],
        ylocs=[30, 60],
        draw_labels=True
    )
    grid.right_labels = False
    grid.top_labels = False
    return ax

Load the data:

In [None]:
data = xr.load_dataset("data/data-2016-03-07.nc")
print(data)

The file contains two variables:

- `z500`: 500 hPa geopotential height (Z500) forecasts from the IFS ENS, initialized on 2016-03-07 00Z. Fields are available every 24 hours of lead time, out to 6 days (144 hours). These are the source fields in the sensitivity analysis.
- `rmse`: Forecast error of the 500 hPa geopotential height forecast in terms of the RMSE of every ensemble member relative to ERA5 over Europe (35°N-75°N, 12.5°W-42.5°E) evaluated at 144 hours lead time. This is our target metric in the sensitivity analysis. The larger this value, the worse the forecast for Europe.

Extract the RMSE values and the corresponding Z500 fields for the same valid time.

In [None]:
# Coordinates for plotting
lat = data.latitude.values
lon = data.longitude.values

# The sensitivity target/forecast metric is the Z500 RMSE over Europe at +144 h
target = data.rmse.values
# The sensitivity source/forecast field is Z500
source = data.z500.sel(time="2016-03-13", drop=True).values

## First Look

---

__Task__: Plot a histogram of the target values.

In [None]:
plt.figure(figsize=(10, 3))

... # TODO

plt.title("500 hPa Geopotential RMSE [gpm] +144 h", loc="left");

The metric is evaluated at 6 days (144 hours) lead time, so we expect some errors in the forecasts and a bit of spread in the RMSE.

---

__Task__: Plot the ensemble mean and standard deviation of the source Z500 field.

In [None]:
fig = plt.figure(figsize=(10, 5))
ax = add_map(fig, (0.1, 0.2, 0.8, 0.8))
cx = fig.add_axes((0.2, 0.2, 0.6, 0.03))

mean = ... # TODO
std  = ... # TODO

# Ensemble Z500 spread as filled contours
cf = ax.contourf(lon, lat, std, transform=carree, cmap="cubehelix_r")
plt.colorbar(cf, cax=cx, orientation="horizontal")

# Ensemble Z500 mean as contours
ct = ax.contour(lon, lat, mean, transform=carree, **z500_kwargs)

# Forecast Error Metric Region (35°N-75°N, 12.5°W-42.5°E)
ax.add_patch(plt.Rectangle([-12.5, 35], 55, 40, fill=False, edgecolor="k", linewidth=3, transform=carree))

ax.set_title("500 hPa Geopotential Mean and Standard Deviation [gpm] +144 h", loc="left");
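
As a hint for the task above, here is a minimal sketch of gridpoint-wise ensemble statistics on synthetic data. The array `demo_source` and its sizes are made up for illustration and only mimic the `(member, latitude, longitude)` layout of `source`. The choice `ddof=1` (sample standard deviation) is an assumption; `np.std` defaults to `ddof=0`.

```python
import numpy as np

# Synthetic stand-in for the source array with layout (member, latitude, longitude).
# Sizes and values are invented for illustration only.
rng = np.random.default_rng(0)
demo_source = 5500.0 + 50.0 * rng.standard_normal((50, 4, 6))

# Reduce along axis 0 (the ensemble dimension) to get one value per gridpoint
demo_mean = np.mean(demo_source, axis=0)
# ddof=1 gives the sample standard deviation (an assumption, not prescribed here)
demo_std = np.std(demo_source, axis=0, ddof=1)

print(demo_mean.shape, demo_std.shape)
```

Both results have the shape of a single horizontal field, so they can be passed directly to `contour` and `contourf`.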

## Slope-based ESA

---

**Task**: Implement slope-based ESA. Compute the slope of the linear regression line between the source and target ensemble at every gridpoint:

$$ \mathrm{l}_i = \frac{\mathrm{cov}(\mathbf{t}, \mathbf{s}_i)}{\sigma_{\mathbf{s}_i}^2} $$

Reminder: the (co)variances are evaluated along the ensemble dimension. Here, this is axis `0` of the source array (axis `1` is latitude and axis `2` is longitude). Note that [`np.cov`](https://numpy.org/doc/stable/reference/generated/numpy.cov.html) does not support the `axis` argument. You can use loops (the problem considered here is small enough that plain Python loops are feasible) or compute the covariance yourself with functions that accept an `axis` argument.

In [None]:
def esa_slope(source, target):
    ... # TODO
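
One possible way to avoid the loop is sketched below on synthetic data. The function name `slope_sketch` and the `demo_*` arrays are hypothetical and not part of the workshop material; this is one of several valid approaches. It computes anomalies along axis `0` and relies on the normalization factors of covariance and variance cancelling in the ratio. The result is cross-checked against `np.cov` at a single gridpoint.

```python
import numpy as np

def slope_sketch(source, target):
    """Regression slope of target on source at every gridpoint.

    source: array of shape (member, lat, lon)
    target: array of shape (member,)
    """
    # Anomalies relative to the ensemble mean (axis 0 is the ensemble dimension)
    s_anom = source - np.mean(source, axis=0)
    t_anom = target - np.mean(target)
    # Reshape to (member, 1, 1) so the target anomalies broadcast over the grid
    t_anom = t_anom[:, None, None]
    # The 1/(N-1) factors of covariance and variance cancel in the ratio
    cov_sum = np.sum(t_anom * s_anom, axis=0)
    var_sum = np.sum(s_anom * s_anom, axis=0)
    return cov_sum / var_sum

# Synthetic check: the target depends linearly on one gridpoint with slope 2
rng = np.random.default_rng(1)
demo_source = rng.standard_normal((40, 3, 5))
demo_target = 2.0 * demo_source[:, 1, 2] + 10.0
demo_slope = slope_sketch(demo_source, demo_target)

# Cross-check against np.cov at another gridpoint
c = np.cov(demo_target, demo_source[:, 0, 0])
ref = c[0, 1] / c[1, 1]
print(np.isclose(demo_slope[1, 2], 2.0), np.isclose(demo_slope[0, 0], ref))
```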

---

__Task__: Plot a slope-based sensitivity map.

In [None]:
fig = plt.figure(figsize=(10, 5))
ax = add_map(fig, (0.1, 0.2, 0.8, 0.8))
cx = fig.add_axes((0.2, 0.2, 0.6, 0.03))

mean = ... # TODO
sens = ... # TODO

# Slope-sensitivity map
cf = ax.contourf(lon, lat, sens, transform=carree, cmap="RdBu_r", extend="both",
                 levels=[-1.2, -1.0, -0.8, -0.6, -0.4, -0.2, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2])
plt.colorbar(cf, cax=cx, orientation="horizontal", spacing="proportional")

# Ensemble Z500 mean as contours
ct = ax.contour(lon, lat, mean, transform=carree, **z500_kwargs)

# Forecast Error Metric Region (35°N-75°N, 12.5°W-42.5°E)
ax.add_patch(plt.Rectangle([-12.5, 35], 55, 40, fill=False, edgecolor="k", linewidth=3, transform=carree))

ax.set_title("Sensitivity (Slope) of RMSE to 500 hPa Geopotential [gpm/gpm] +144 h", loc="left");

---

## Apply normalization

---

__Task__: Implement the 3 normalized variants of ESA and compare the sensitivity maps.

- Normalize source (multiply by $\sigma_{\mathbf{s}_i}$): $ \frac{\mathrm{cov}(\mathbf{t}, \mathbf{s}_i)}{\sigma_{\mathbf{s}_i} } $
- Normalize target (divide by $\sigma_{\mathbf{t}}$): $ \frac{\mathrm{cov}(\mathbf{t}, \mathbf{s}_i)}{\sigma_{\mathbf{s}_i}^2 \sigma_{\mathbf{t}}} $
- Normalize both (correlation): $ \frac{\mathrm{cov}(\mathbf{t}, \mathbf{s}_i)}{\sigma_{\mathbf{s}_i} \sigma_{\mathbf{t}}} $
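
The relations in the list above can be sketched on synthetic data as follows. All `demo_*` names are hypothetical, and the use of `ddof=1` throughout is an assumption. What matters is that covariance and standard deviations use the same normalization. The correlation variant is cross-checked against `np.corrcoef` at one gridpoint.

```python
import numpy as np

rng = np.random.default_rng(2)
demo_source = rng.standard_normal((40, 3, 5))                  # (member, lat, lon)
demo_target = demo_source[:, 0, 0] + rng.standard_normal(40)   # (member,)

# Sample statistics along the ensemble dimension, all with ddof=1 for consistency
s_anom = demo_source - demo_source.mean(axis=0)
t_anom = (demo_target - demo_target.mean())[:, None, None]
cov = np.sum(t_anom * s_anom, axis=0) / (demo_target.size - 1)
std_s = demo_source.std(axis=0, ddof=1)
std_t = demo_target.std(ddof=1)

demo_slope = cov / std_s**2
demo_norm_source = demo_slope * std_s    # equals cov / std_s
demo_norm_target = demo_slope / std_t    # equals cov / (std_s**2 * std_t)
demo_corr = demo_slope * std_s / std_t   # equals cov / (std_s * std_t)

# Cross-check the correlation variant at one gridpoint
ref = np.corrcoef(demo_target, demo_source[:, 0, 0])[0, 1]
print(np.isclose(demo_corr[0, 0], ref))
```

Each normalized variant is the slope rescaled by a standard deviation, so an existing slope implementation can be reused rather than recomputing the covariance.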

In [None]:
sens_slope       = esa_slope(source, target)
sens_norm_source = ... # TODO
sens_norm_target = ... # TODO
sens_corr        = ... # TODO


# To generate a colorbar symmetric around 0
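def symm_lvls(data, nlvl=14):
    # Largest absolute value in the data sets the symmetric range
    extr = max(abs(np.min(data)), abs(np.max(data)))
    # nlvl is the number of contour level boundaries
    return { "cmap": "RdBu_r", "levels": np.linspace(-extr, extr, nlvl) }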

# Create a 2x2-panel plot similar to the table in the presentation
fig = plt.figure(figsize=(12, 6))

# Top left: slope without normalization
ax1 = add_map(fig, (0.07, 0.60, 0.4, 0.4))
cx1 = fig.add_axes((0.12, 0.57, 0.3, 0.03))
cf = ax1.contourf(lon, lat, sens_slope, transform=carree, **symm_lvls(sens_slope))
plt.colorbar(cf, cax=cx1, orientation="horizontal")
ax1.set_title(r"$\mathrm{cov}(y, x_i) \sigma_{x_i}^{-2}$ (slope) [gpm/gpm]", loc="left")

# Top right: slope normalized by the gridpoint-wise standard deviation of the source fields
ax2 = add_map(fig, (0.57, 0.6, 0.4, 0.4))
cx2 = fig.add_axes((0.62, 0.57, 0.3, 0.03))
cf = ax2.contourf(lon, lat, sens_norm_source, transform=carree, **symm_lvls(sens_norm_source))
plt.colorbar(cf, cax=cx2, orientation="horizontal")
ax2.set_title(r"$\mathrm{cov}(y, x_i) \sigma_{x_i}^{-1}$ [gpm]", loc="left")

# Bottom left: slope normalized by the standard deviation of the target metric
ax3 = add_map(fig, (0.07, 0.1, 0.4, 0.4))
cx3 = fig.add_axes((0.12, 0.07, 0.3, 0.03))
cf = ax3.contourf(lon, lat, sens_norm_target, transform=carree, **symm_lvls(sens_norm_target))
plt.colorbar(cf, cax=cx3, orientation="horizontal")
ax3.set_title(r"$\mathrm{cov}(y, x_i) \sigma_{x_i}^{-2} \sigma_{y}^{-1}$ [1/gpm]", loc="left")

# Bottom right: correlation (both normalizations applied)
ax4 = add_map(fig, (0.57, 0.1, 0.4, 0.4))
cx4 = fig.add_axes((0.62, 0.07, 0.3, 0.03))
cf = ax4.contourf(lon, lat, sens_corr, transform=carree, **symm_lvls(sens_corr))
plt.colorbar(cf, cax=cx4, orientation="horizontal")
ax4.set_title(r"$\mathrm{cov}(y, x_i) \sigma_{x_i}^{-1} \sigma_{y}^{-1}$ (correlation) [unitless]", loc="left");

---