# Windstorm tracks and footprints derived from reanalysis over Europe between 1940 to present: Windstorm tracks & footprints

Production date: YYYY-MM-DD

**Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch.**

Produced by: Olivier Burggraaff (National Physical Laboratory).

## 🌍 Use case: Use case listed here in full

## ❓ Quality assessment question
* **In most cases there should be one question listed here in bold**

**‘Context paragraph’ (no title/heading)** - a very short introduction before the assessment statement describing the approach taken to answer the user question. One or two key references could be useful if the assessment summarises literature.

**Background**

## 📢 Quality assessment statement

```{admonition} These are the key outcomes of this assessment
:class: note
* Finding 1
* Finding 2
* Finding 3
* etc
```

## 📋 Methodology

* Internal consistency: TempestExtremes vs TRACK/Hodges? (read papers)
* 
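
One possible way to quantify the internal consistency between the two tracking algorithms is the mean great-circle distance between a pair of matched tracks at the time steps they share. The following is a minimal sketch only, not the method used in this assessment: the function names, the dictionary-based track format, and the example coordinates are all made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; an assumption of this sketch


def great_circle_distance_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_track_separation_km(track_a, track_b):
    """Mean distance between two tracks at their shared time steps.

    Each track is a dict mapping a time stamp to a (latitude, longitude) pair.
    Returns None if the tracks share no time steps.
    """
    shared_times = sorted(set(track_a) & set(track_b))
    if not shared_times:
        return None
    distances = [great_circle_distance_km(*track_a[t], *track_b[t]) for t in shared_times]
    return sum(distances) / len(distances)


# Made-up example tracks (not taken from the dataset)
track_hodges = {"2004-01-01T00": (50.0, -20.0), "2004-01-01T06": (51.0, -15.0)}
track_te = {"2004-01-01T06": (51.0, -15.0), "2004-01-01T12": (52.0, -10.0)}
separation = mean_track_separation_km(track_hodges, track_te)
print(separation)
```

A small mean separation would suggest that both algorithms follow the same storm centre; what counts as "small" would still have to be decided, for example relative to the grid spacing of the underlying reanalysis.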

A ‘free text’ introduction to the data analysis steps or a description of the literature synthesis, with a justification of the approach taken, and limitations mentioned. **Mention which CDS catalogue entry is used, including a link, and also any other entries used for the assessment**.

[Windstorm tracks and footprints derived from reanalysis over Europe between 1940 to present](https://doi.org/10.24381/bf1f06a9)

Hodges (TRACK) algorithm [[Hoskins+02](https://doi.org/10.1175/1520-0469(2002)059%3C1041:NPOTNH%3E2.0.CO;2), [Hodges+99](https://doi.org/10.1175/1520-0493(1999)127%3C1362:ACFFT%3E2.0.CO;2), [Hodges+95](https://doi.org/10.1175/1520-0493(1995)123%3C3458:FTOTUS%3E2.0.CO;2)]

TempestExtremes [[Ullrich+21](https://doi.org/10.5194/gmd-14-5023-2021), [Ullrich+17](https://doi.org/10.5194/gmd-10-1069-2017)]

**Note:** This notebook is currently just a brain-dump in anticipation of starting the actual quality assessment at a later stage.

E.g. 'The analysis and results are organised in the following steps, which are detailed in the sections below:' 

**[](section-setup)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

**[](section-analysis)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

**[](section-results)**
 * Sub-steps or key points listed in bullet below. No strict requirement to match and link to sub-headings.

Any further notes on the method could go here (explanations, caveats or limitations).

## 📈 Analysis and results

(section-setup)=
### 1. Code setup

#### Imports

In [None]:
# Input / Output
from pathlib import Path
import earthkit.data as ekd
import warnings

# General data handling
import numpy as np
np.seterr(divide="ignore")  # Ignore divide-by-zero warnings
import pandas as pd
import geopandas as gpd
import xarray as xr
from functools import partial
from dask.array.core import PerformanceWarning
warnings.simplefilter(action="ignore", category=PerformanceWarning)

# Visualisation
import earthkit.plots as ekp
from earthkit.plots.styles import Style
import matplotlib.pyplot as plt
plt.rcParams["grid.linestyle"] = "--"
from matplotlib.colors import LogNorm
from cartopy import crs as ccrs
import cmcrameri as cmc
from tqdm import tqdm  # Progress bars

# Visualisation in Jupyter book -- automatically ignored otherwise
try:
    from myst_nb import glue
except ImportError:
    glue = None

# Type hints
from typing import Callable, Iterable, Optional

#### Helper functions

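##### Constants
The following cell contains some pre-defined constants for convenience,
such as the tracking algorithm labels used in the track data:

In [None]:
# Data
# Algorithm labels as used to index the track data below
# (these differ from the `tracking_algorithm` values in the CDS request)
TRACKING_ALGORITHMS = ["hodges", "tempest_extremes"]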

##### Data (pre-)processing

In [None]:
## Loops for convenience
def loop_over_(*args, progress=True, **kwargs) -> tqdm:
    """ Generate a tqdm progressbar; inverts `disable` keyword """
    return tqdm(*args, disable=not progress, leave=False, **kwargs)

def loop_over_ensemble_members(data: xr.Dataset, **tqdm_kwargs) -> tqdm:
    """ Loop over ensemble members in `data`, with a progress bar. """
    return loop_over_(data.groupby("number"), unit="member", **tqdm_kwargs)

def loop_over_data_variables(data: xr.Dataset, **tqdm_kwargs) -> tqdm:
    """ Loop over variable keys in `data`, with a progress bar. """
    return loop_over_(data.data_vars.keys(), unit="variable", **tqdm_kwargs)

##### Visualisation

In [None]:
_style_footprint = {"cmap": plt.cm.cividis, "vmin": 0, "vmax": 40}

styles = {
    "footprint": Style(**_style_footprint),
}

# Apply general settings
for style in styles.values():
    style.normalize = False

In [None]:
def _add_textbox_to_subplots(text: str, *axs: plt.Axes | ekp.Subplot, right: bool = False) -> None:
    """ Add a text box to each of the specified subplots. """
    # Get the plt.Axes for each ekp.Subplot
    axs = [subplot.ax if isinstance(subplot, ekp.Subplot) else subplot for subplot in axs]

    # Set up location
    x = 0.95 if right else 0.05
    horizontalalignment = "right" if right else "left"

    # Add the text
    for ax in axs:
        ax.text(x, 0.95, text, transform=ax.transAxes,
                horizontalalignment=horizontalalignment, verticalalignment="top",
                bbox={"facecolor": "white", "edgecolor": "black", "boxstyle": "round",
                      "alpha": 1})

In [None]:
def decorate_fig(fig: ekp.Figure, *, title: Optional[str]="") -> None:
    """ Decorate an earthkit figure with land, coastlines, etc. """
    # Add progress bar because individual steps can be very slow for large plots
    with tqdm(total=4, desc="Decorating", leave=False) as progressbar:
        fig.land()
        progressbar.update()
        fig.coastlines()
        progressbar.update()
        # fig.borders()
        # progressbar.update()
        fig.gridlines(linestyle=plt.rcParams["grid.linestyle"])
        progressbar.update()
        fig.title(title)
        progressbar.update()

(section-analysis)=
### 2. Analysis

[sis-european-wind-storm-reanalysis](https://doi.org/10.24381/bf1f06a9)

In [None]:
dataset = "sis-european-wind-storm-reanalysis"

In [None]:
# domain = ekp.geo.domains.Domain.from_string("Europe")
domain = "North Atlantic"
year = 2040  # N.B. Not used below; the requested years are set in `request_time`

In [None]:
tracking_algorithms = ["hodges", "tempestextremes"]

request_variables = {
    "variable": "all",
    "tracking_algorithm": tracking_algorithms,
}

request_time = {
    "year": [f"{year}" for year in range(2003, 2006)],
    "month": [f"{month:02}" for month in range(1, 13)],
    "day": [f"{day:02}" for day in range(1, 32)],
}

#### Windstorm track

In [None]:
request = {
    "product": "windstorm_track",
    "event_aggregation": "single_event",
} | request_variables | request_time

In [None]:
data = ekd.from_source("cds", dataset, request)
data = data.to_pandas()
data["time"] = pd.to_datetime(data["time"])
data

In [None]:
# Reindex
data = data.set_index(["algorithm", "id"])
data

##### Matching tracks
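Tracks from the two algorithms are matched purely on shared time stamps. Note that this produces duplicates: one track can be paired with several tracks from the other algorithm.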
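
Matching on time stamps alone pairs any two tracks that coexist at some time step, which is why duplicate IDs appear. The sketch below illustrates this on made-up data and shows one possible refinement, namely keeping the partner with the most shared time steps for each Hodges track. The column names `id_hodges` and `id_te`, the example IDs, and the refinement itself are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Made-up track points: one row per (track ID, time stamp)
points_hodges = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "time": pd.to_datetime(["2004-01-01 00:00", "2004-01-01 06:00",
                            "2004-01-01 12:00", "2004-01-02 00:00"]),
})
points_te = pd.DataFrame({
    "id": [10, 10, 10, 11],
    "time": pd.to_datetime(["2004-01-01 00:00", "2004-01-01 06:00",
                            "2004-01-01 12:00", "2004-01-01 12:00"]),
})

# Inner merge on time: every pair of tracks sharing a time stamp is matched
pairs = pd.merge(points_hodges, points_te, on="time", how="inner",
                 suffixes=("_hodges", "_te"))

# Unique ID pairs; Hodges track 1 appears twice (paired with both 10 and 11)
unique_pairs = pairs[["id_hodges", "id_te"]].drop_duplicates()
has_duplicates = unique_pairs["id_hodges"].duplicated().any()

# Possible refinement: count shared time steps and keep the best partner
n_shared = pairs.groupby(["id_hodges", "id_te"]).size().reset_index(name="n_shared")
best = n_shared.sort_values("n_shared", ascending=False).drop_duplicates("id_hodges")

print(unique_pairs)
print(best)
```

Counting shared time steps is only one option; a distance criterion on the track positions could be combined with it to avoid pairing unrelated storms that merely coexist in time.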

In [None]:
# Pick out time column for each algorithm
df1, df2 = [data.loc[alg][["time"]].reset_index() for alg in TRACKING_ALGORITHMS]

# Find intersection
overlap = pd.merge(df1, df2, on="time", how="inner", suffixes=[f"_{alg}" for alg in TRACKING_ALGORITHMS])
overlap = overlap.drop_duplicates(subset=[f"id_{alg}" for alg in TRACKING_ALGORITHMS])
overlap = overlap[[f"id_{alg}" for alg in TRACKING_ALGORITHMS]]
overlap = overlap.reset_index(drop=True).set_index("id_hodges")

overlap

In [None]:
# Add storms that are in Hodges but not in TempestExtremes
hodges_all = data.loc["hodges"].index.unique()

hodges_only = [ind for ind in hodges_all if ind not in overlap.index]

# Empty rows (no TempestExtremes partner), with the same index name and columns as `overlap`
hodges_only = pd.DataFrame(index=pd.Index(hodges_only, name=overlap.index.name), columns=overlap.columns)
hodges_only

In [None]:
# Ignored for now for simplicity
# Add storms that are in TempestExtremes but not in Hodges
te_all = data.loc["tempest_extremes"].index.unique()
te_all
te_only = [ind for ind in te_all if ind not in overlap[overlap.columns[0]].unique()]
te_only
# hodges_only_mapping = {overlap.columns[0]: [None for i in hodges_only]}

# hodges_only = pd.DataFrame(index=hodges_only, columns=hodges_only_mapping)
# hodges_only

In [None]:
combined_mapping = pd.concat([overlap, hodges_only]).sort_index()
combined_mapping
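
The two one-sided steps above (storms found only by Hodges, storms found only by TempestExtremes) could also be handled symmetrically with outer merges. The following is a sketch on made-up IDs, assuming a table of matched pairs with hypothetical columns `id_hodges` and `id_te`; it is not the approach currently used in this notebook.

```python
import pandas as pd

# Hypothetical inputs: matched ID pairs and the full set of IDs per algorithm
matched = pd.DataFrame({"id_hodges": [1, 2], "id_te": [10, 11]})
all_hodges = pd.DataFrame({"id_hodges": [1, 2, 3]})
all_te = pd.DataFrame({"id_te": [10, 11, 12]})

# Outer merges keep IDs without a partner; the missing partner becomes NaN
mapping = all_hodges.merge(matched, on="id_hodges", how="outer")
mapping = mapping.merge(all_te, on="id_te", how="outer")

# Nullable integers keep the IDs as integers despite the missing values
mapping = mapping.astype("Int64")

# Storms detected by only one of the two algorithms
only_hodges = mapping.loc[mapping["id_te"].isna(), "id_hodges"].tolist()
only_te = mapping.loc[mapping["id_hodges"].isna(), "id_te"].tolist()

print(mapping)
print(only_hodges, only_te)
```

The resulting table has one row per matched pair plus one row per unmatched storm, which could serve directly as the row layout for side-by-side comparison plots.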

##### Plot tracks

In [None]:
def _plot_track(fig: ekp.Figure, df: pd.DataFrame) -> None:
    subplot = fig.add_map(domain=domain)
    subplot.scatter(x=df["longitude"], y=df["latitude"], c=df["fg10"],
                    cmap=plt.cm.magma, vmin=0, vmax=20, zorder=2)
    subplot.line(x=df["longitude"], y=df["latitude"], c="k", zorder=1)

In [None]:
# Create figure
fig = ekp.Figure(rows=1, columns=2, size=(8, 4))

# Plot tracks by algorithm
_plot_track(fig, data.loc["hodges", 1396].set_index("time"))  # Example where TE merges two storms
_plot_track(fig, data.loc["tempest_extremes", 646].set_index("time"))

# Legend at the bottom
# N.B. Currently doesn't work because it looks for a style in the subplot.line call
for subplot in fig.subplots:
    try:
        subplot.legend(location="bottom", style=style)
    except ValueError:
        continue

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

In [None]:
# Create figure
fig = ekp.Figure(rows=1, columns=2, size=(8, 4))

# Plot tracks by algorithm
_plot_track(fig, data.loc["hodges", 1422].set_index("time"))  # Simple example
_plot_track(fig, data.loc["tempest_extremes", 660].set_index("time"))

# Legend at the bottom
# N.B. Currently doesn't work because it looks for a style in the subplot.line call
for subplot in fig.subplots:
    try:
        subplot.legend(location="bottom", style=style)
    except ValueError:
        continue

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

#### Windstorm footprint

In [None]:
request = {
    "product": "windstorm_footprint",
    "event_aggregation": "single_event",
    "windstorm_footprint_resolution": "original",
    "spatial_extent": "full_domain",
} | request_variables | request_time

# Note: A single request with both algorithms only downloads the last one, so send one request per algorithm
requests = {alg: request | {"tracking_algorithm": alg} for alg in tracking_algorithms}
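
The per-algorithm requests rely on the dictionary union operator `|` (Python 3.9 and later), where the right-hand operand wins for duplicate keys and the original dictionary is left unchanged. A minimal sketch with a shortened, hypothetical request:

```python
# Hypothetical, shortened request; only the merging behaviour matters here
algorithms = ["hodges", "tempestextremes"]
base_request = {"product": "windstorm_footprint", "tracking_algorithm": algorithms}

# The right-hand operand of `|` overrides "tracking_algorithm" for each copy
per_algorithm = {alg: base_request | {"tracking_algorithm": alg} for alg in algorithms}

for alg, req in per_algorithm.items():
    print(alg, req)

# The base request itself is not modified
print(base_request)
```

Because `|` returns a new dictionary each time, the shared base request can be reused safely for both algorithms.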

In [None]:
data = {alg: ekd.from_source("cds", dataset, request).to_xarray() for alg, request in requests.items()}

In [None]:
data["hodges"]

In [None]:
data["tempestextremes"]

##### Plot individual footprints
The storms shown here are currently arbitrary selections; they should be matching pairs across the two algorithms.
A possible layout: side-by-side plots with one row per matching storm, and empty panels where a storm has no match.

In [None]:
track_ids = {"hodges": 1422, "tempestextremes": 660}  # Matching pair ; slightly boring but safe

In [None]:
# Show variables
fig = ekp.Figure(rows=1, columns=len(tracking_algorithms), size=(8, 4))

# Plot single variable
for alg in tracking_algorithms:
    subplot = fig.add_map(domain=domain)

    data_plot = data[alg].sel(track_id=track_ids[alg])
    subplot.grid_cells(data_plot, z="footprint", style=styles["footprint"])

    subplot.title(alg)

# Legend at the bottom
for subplot in fig.subplots:
    try:
        subplot.legend(location="bottom")
    except ValueError:
        continue

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

#### Windstorm footprint (downscaled)
To be merged into the previous section later.

In [None]:
request = {
    "product": "windstorm_footprint",
    "event_aggregation": "single_event",
    "windstorm_footprint_resolution": "downscaled",
    "spatial_extent": "full_domain",
} | request_variables | request_time

# Note: A single request with both algorithms only downloads the last one, so send one request per algorithm
requests = {alg: request | {"tracking_algorithm": alg} for alg in tracking_algorithms}

In [None]:
data = {alg: ekd.from_source("cds", dataset, request).to_xarray() for alg, request in requests.items()}

In [None]:
data["hodges"]

In [None]:
data["tempestextremes"]

In [None]:
# Show variables
fig = ekp.Figure(rows=1, columns=len(tracking_algorithms), size=(8, 4))

# Plot single variable
for alg in tracking_algorithms:
    subplot = fig.add_map(domain=domain)

    data_plot = data[alg].sel(track_id=track_ids[alg])
    subplot.grid_cells(data_plot, z="footprint", style=styles["footprint"])

    subplot.title(alg)

# Legend at the bottom
for subplot in fig.subplots:
    try:
        subplot.legend(location="bottom")
    except ValueError:
        continue

# Decorate figures
decorate_fig(fig)

# Show result
plt.show()
plt.close()

(section-results)=
### 3. Results

#### Results Subsections
Describe what is done in this step/section and what the `code` in the cell does (if code is included). 

If this is the **results section**, we expect the final plots to be created here with a description of how to interpret them, and what information can be extracted for the specific use case and user question. The information in the 'quality assessment statement' should be derived here. 

## ℹ️ If you want to know more

### Key resources

Code libraries used:
* [earthkit](https://github.com/ecmwf/earthkit)
  * [earthkit-data](https://github.com/ecmwf/earthkit-data)
  * [earthkit-plots](https://github.com/ecmwf/earthkit-plots)

### References
[[Hoskins+02](https://doi.org/10.1175/1520-0469(2002)059%3C1041:NPOTNH%3E2.0.CO;2)] B. J. Hoskins and K. I. Hodges, ‘New Perspectives on the Northern Hemisphere Winter Storm Tracks’, Journal of the Atmospheric Sciences, vol. 59, no. 6, pp. 1041–1061, Mar. 2002, doi: 10.1175/1520-0469(2002)059%3C1041:NPOTNH%3E2.0.CO;2.

[[Hodges+99](https://doi.org/10.1175/1520-0493(1999)127%3C1362:ACFFT%3E2.0.CO;2)] K. I. Hodges, ‘Adaptive Constraints for Feature Tracking’, Monthly Weather Review, vol. 127, no. 6, pp. 1362–1373, June 1999, doi: 10.1175/1520-0493(1999)127%3C1362:ACFFT%3E2.0.CO;2.

[[Hodges+95](https://doi.org/10.1175/1520-0493(1995)123%3C3458:FTOTUS%3E2.0.CO;2)] K. I. Hodges, ‘Feature Tracking on the Unit Sphere’, Monthly Weather Review, vol. 123, no. 12, pp. 3458–3465, Dec. 1995, doi: 10.1175/1520-0493(1995)123%3C3458:FTOTUS%3E2.0.CO;2.

[[Ullrich+21](https://doi.org/10.5194/gmd-14-5023-2021)] P. A. Ullrich, C. M. Zarzycki, E. E. McClenny, M. C. Pinheiro, A. M. Stansfield, and K. A. Reed, ‘TempestExtremes v2.1: a community framework for feature detection, tracking, and analysis in large datasets’, Geoscientific Model Development, vol. 14, no. 8, pp. 5023–5048, Aug. 2021, doi: 10.5194/gmd-14-5023-2021.

[[Ullrich+17](https://doi.org/10.5194/gmd-10-1069-2017)] P. A. Ullrich and C. M. Zarzycki, ‘TempestExtremes: a framework for scale-insensitive pointwise feature tracking on unstructured grids’, Geoscientific Model Development, vol. 10, no. 3, pp. 1069–1090, Mar. 2017, doi: 10.5194/gmd-10-1069-2017.