# Notebook 1: Data Retrieval from FDB and Preprocessing

This notebook is a guide to retrieving data from FDB (Fields Database) object storage and preprocessing it. The first part computes the ensemble median of precipitation aggregated over 6 hours. The second part covers a more involved computation: potential vorticity.

The examples rely on the meteodata-lab library: https://github.com/MeteoSwiss/meteodata-lab

## Accessing Data from FDB

To access the data from FDB, follow these steps:

### Configuring Access to FDB

In [None]:
import logging
import os
import sys
from pathlib import Path

from meteodatalab import mars, mch_model_data

import plot_utils

In [None]:
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logging.getLogger("matplotlib").setLevel(logging.INFO)
cwd = Path().resolve().parent
view = cwd / "spack-env/.spack-env/view"
assert view.exists()

In [None]:
os.environ["FDB5_HOME"] = str(view)
os.environ["FDB5_CONFIG"] = """
---
type: local
engine: toc
schema: /scratch/mch/vcherkas/fdb-realtime-lcm/schema
spaces:
- handler: Default
  roots:
  - path: /scratch/mch/vcherkas/fdb-root-realtime
"""

### Retrieving Data

Use query functions to retrieve the required data.

The request for data is made by specifying the values of MARS keys.
MARS keys are derived from GRIB keys and serve as a base for the FDB index.
The `meteodatalab.mars` module provides helpers to build valid MARS requests in the context of MeteoSwiss.

Note that the test FDB instance typically holds only the last two runs, so update the date and time in the request accordingly.
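Because the date and time have to track the most recent runs, it can help to derive them from the current time instead of hard-coding them. The sketch below is one way to do that with the standard library. The helper `latest_run`, the 3-hour run interval, and the 3-hour availability delay are illustrative assumptions, not documented properties of the test FDB instance.

```python
from datetime import datetime, timedelta, timezone


def latest_run(now, interval_hours=3, delay_hours=3):
    """Return (date, time) strings for the most recent run assumed to be available.

    interval_hours and delay_hours are assumptions; interval_hours must divide 24.
    """
    # Step back by the assumed delay between run start and availability in FDB.
    available = now - timedelta(hours=delay_hours)
    # Round the hour down to the nearest run start.
    run_hour = available.hour - available.hour % interval_hours
    run = available.replace(hour=run_hour, minute=0, second=0, microsecond=0)
    return run.strftime("%Y%m%d"), run.strftime("%H%M")


# Example with a fixed timestamp so that the result is reproducible.
date, time = latest_run(datetime(2024, 4, 9, 22, 30, tzinfo=timezone.utc))
```

The resulting strings can be passed as the `date` and `time` arguments of `mars.Request`. In live use, `datetime.now(timezone.utc)` would replace the fixed timestamp.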

In [None]:
request = mars.Request(
    param="TOT_PREC",
    date="20240409",
    time="1800",
    number=tuple(range(11)),
    step=tuple(i * 60 for i in range(10)),  # minutes
    levtype=mars.LevType.SURFACE,
    model=mars.Model.ICON_CH1_EPS,
)

In [None]:
request.to_fdb()

The `meteodatalab.mch_model_data` module provides some convenience functions to access model data.
Earthkit-data is used in the background to read the data that is being returned by FDB.

In [None]:
ds = mch_model_data.get_from_fdb(request)

The data is returned as a dictionary of xarray DataArrays, keyed by the param short name.

In [None]:
ds["TOT_PREC"]

## Data Preprocessing for Computing Median Ensembles

Computing the ensemble median of precipitation aggregated over 6 hours requires the data to be preprocessed first:

### Data Aggregation

Aggregate data over 6-hour intervals.

`meteodatalab` implements operators that transform the data.

For example, the model accumulates total precipitation from the reference time onward, and the `delta` operator re-aggregates it into 6-hour intervals.
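The idea behind this re-aggregation can be illustrated on a single grid point with plain NumPy. This is a minimal sketch under stated assumptions, not the `meteodatalab` implementation: the values are made up, the output is assumed to be hourly, and leaving the first steps undefined is a choice made for this example.

```python
import numpy as np

# Hypothetical hourly precipitation accumulated since the reference time (mm).
acc = np.array([0.0, 1.0, 1.5, 1.5, 2.0, 4.0, 4.5, 5.0, 5.0, 6.0])
steps_per_window = 6  # a 6 h window with hourly output

# Precipitation in the last 6 h is the accumulated value minus the value
# 6 steps earlier. The first 6 entries have nothing to subtract and stay NaN.
prec_6h = np.full_like(acc, np.nan)
prec_6h[steps_per_window:] = acc[steps_per_window:] - acc[:-steps_per_window]
```

Only lead times of at least 6 hours carry a defined 6-hour sum in this sketch.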

In [None]:
import numpy as np
from meteodatalab.operators import time_operators as time_ops

In [None]:
tot_prec_6h = time_ops.delta(ds["TOT_PREC"], np.timedelta64(6, "h"))

In [None]:
tot_prec_6h

### Ensemble Calculation

Compute the ensemble median from the preprocessed data.

In [None]:
data = tot_prec_6h.isel(time=8).median(dim="eps").clip(min=0)
data.attrs["geography"] = tot_prec_6h.geography
plot_utils.plot_tot_prec(data)
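The reduction in the cell above takes the median across ensemble members at each grid point and then clips the result at zero. The NumPy sketch below mirrors that on made-up values (5 members, 3 grid points). The small negative values are an assumption for illustration; they can arise when accumulated fields stored with limited precision are differenced.

```python
import numpy as np

# Hypothetical 6 h precipitation (mm): rows are ensemble members, columns grid points.
members = np.array([
    [0.0, 2.0, -0.1],
    [0.5, 3.0, -0.2],
    [0.2, 2.5, -0.1],
    [0.0, 8.0, 0.0],
    [0.1, 2.2, -0.3],
])

# Median over the member axis, then remove unphysical negative precipitation.
median = np.median(members, axis=0)
median = np.clip(median, 0.0, None)
```

Unlike the mean, the median is barely affected by the single outlier member (8.0 mm) in the second column.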

## Potential Vorticity Calculation and Wind Field Rotation

This section computes potential vorticity (PV) and rotates the wind field. It is a more involved computation than the first part of this notebook, which focused on data retrieval and simple preprocessing.

### Querying Data

Use the query functions to retrieve the nine required fields on the model levels: eight fields at the forecast step and the constant `HHL` field at step 0.

In [None]:
request = mars.Request(
    param=("P", "T", "U", "V", "W", "QV", "QC", "QI"),
    date="20240419",
    time="1200",
    number=0,
    step=420,
    levtype=mars.LevType.MODEL_LEVEL,
    levelist=tuple(range(1, 82)),
    model=mars.Model.ICON_CH1_EPS,
)

In [None]:
request_hhl_const = mars.Request(
    param="HHL",
    date="20240419",
    time="1200",
    number=0,
    step=0,
    levtype=mars.LevType.MODEL_LEVEL,
    levelist=tuple(range(1, 82)),
    model=mars.Model.ICON_CH1_EPS,
)

In [None]:
ds = mch_model_data.get_from_fdb(request)

In [None]:
ds |= mch_model_data.get_from_fdb(request_hhl_const)

In [None]:
hhl = ds["HHL"].squeeze(drop=True)
hhl

### Computing Potential Vorticity

The next cells compute potential vorticity, a derived quantity that the model does not output directly.
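For reference, the standard (Ertel) definition of potential vorticity combines the absolute vorticity, the gradient of potential temperature, and the density. It is given here as background; whether `compute_pot_vortic` applies further approximations is not stated in this notebook.

```latex
PV = \frac{1}{\rho} \left( \nabla \times \mathbf{v} + 2\,\boldsymbol{\Omega} \right) \cdot \nabla \theta
```

Here $\mathbf{v}$ is the three-dimensional wind, $\boldsymbol{\Omega}$ the rotation vector of the Earth, $\theta$ the potential temperature, and $\rho$ the density.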

In [None]:
from meteodatalab import metadata
from meteodatalab.operators.rho import compute_rho_tot
from meteodatalab.operators.theta import compute_theta
from meteodatalab.operators.pot_vortic import compute_pot_vortic

In [None]:
theta = compute_theta(ds["P"], ds["T"])
rho_tot = compute_rho_tot(ds["T"], ds["P"], ds["QV"], ds["QC"], ds["QI"])

metadata.set_origin_xy(ds, "HHL")
pot_vortic = compute_pot_vortic(ds["U"], ds["V"], ds["W"], theta, rho_tot, hhl)
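Potential temperature is the temperature an air parcel would have if it were brought adiabatically to a reference pressure. A minimal sketch of the textbook formula follows. The function name and the constants are assumptions for illustration and may differ from the values used inside `compute_theta`.

```python
P0 = 1.0e5               # reference pressure in Pa (assumed)
KAPPA = 287.05 / 1005.0  # R_d / c_pd for dry air, approximate textbook values


def potential_temperature(p, t):
    """Potential temperature in K from pressure p (Pa) and temperature t (K)."""
    return t * (P0 / p) ** KAPPA


theta_surface = potential_temperature(1.0e5, 300.0)  # at the reference pressure
theta_500hpa = potential_temperature(5.0e4, 250.0)   # colder air higher up
```

At the reference pressure the potential temperature equals the temperature itself, and at lower pressure it exceeds the temperature.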

### Interpolating to Potential Temperature Levels

The potential vorticity can be interpolated to surfaces of constant potential temperature (isentropes).

In [None]:
from meteodatalab.operators.destagger import destagger
from meteodatalab.operators.vertical_interpolation import interpolate_k2theta

In [None]:
hfl = destagger(hhl, "z")
theta_values = [310.0, 315.0, 320.0, 325.0, 330.0, 335.0]
pot_vortic_th = interpolate_k2theta(pot_vortic, "low_fold", theta, theta_values, "K", hfl)

In [None]:
pot_vortic_th.coords

In [None]:
plot_utils.plot_pot_vortic(pot_vortic_th.sel(theta=320), hhl.geography, "Potential Vorticity at $\\theta$ = 320K")
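Interpolating to a potential temperature surface amounts to finding, in each column, where $\theta$ crosses the target value and interpolating the field linearly at that position. The sketch below does this for one column and, where $\theta$ is not monotonic (a fold), keeps the lowest crossing. It is a simplified illustration, not the `interpolate_k2theta` implementation. The function name and the bottom-to-top ordering of the inputs are assumptions, and reading `"low_fold"` as "lowest crossing" is also an assumption.

```python
def interpolate_to_theta(values, theta, target):
    """Interpolate a column to the lowest level where theta crosses target.

    values and theta are sequences ordered from bottom to top (assumption).
    Returns None if the target is never crossed.
    """
    for k in range(len(theta) - 1):
        lo, hi = theta[k], theta[k + 1]
        # Check whether the target lies between two neighbouring levels.
        if lo != hi and min(lo, hi) <= target <= max(lo, hi):
            weight = (target - lo) / (hi - lo)
            return values[k] + weight * (values[k + 1] - values[k])
    return None


# Monotonic column: 315 K lies halfway between the second and third level.
smooth = interpolate_to_theta([1.0, 2.0, 3.0, 4.0], [300.0, 310.0, 320.0, 330.0], 315.0)

# Column with a fold: 315 K is crossed three times, the lowest crossing wins.
folded = interpolate_to_theta([0.0, 2.0, 1.0, 3.0], [300.0, 320.0, 310.0, 330.0], 315.0)
```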

### Computing the Mean Between Pressure Levels

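The mean potential vorticity between two isobars (pressure levels) can also be computed. The heights of the 700 and 900 hPa surfaces are determined first and then used as the bounds of the vertical integration.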

In [None]:
from meteodatalab.operators.vertical_interpolation import interpolate_k2p
from meteodatalab.operators.vertical_reduction import integrate_k

In [None]:
isobars = interpolate_k2p(hfl, "linear_in_lnp", ds["P"], [700, 900], "hPa")
h700, h900 = isobars.transpose("pressure", ...)
pot_vortic_mean = integrate_k(pot_vortic, "normed_integral", "z2z", hhl, (h900, h700))

In [None]:
plot_utils.plot_pot_vortic(pot_vortic_mean, hhl.geography, "Mean potential vorticity between 900 and 700 hPa")
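A normed integral between two heights is a thickness-weighted mean: each model layer contributes in proportion to the part of it that lies between the bounds. The NumPy sketch below shows this for a single column. It is an illustration under assumptions, not the `integrate_k` implementation: the function name is hypothetical, the half-level heights are ordered top first, and the bounds are assumed to overlap at least one layer.

```python
import numpy as np


def layer_mean(values, hhl, z_bottom, z_top):
    """Thickness-weighted mean of full-level values between two heights.

    values: full-level values, shape (n,)
    hhl: half-level heights in m, shape (n + 1,), ordered top first (assumption)
    """
    # Restrict each layer to the part between z_bottom and z_top.
    upper = np.minimum(hhl[:-1], z_top)
    lower = np.maximum(hhl[1:], z_bottom)
    thickness = np.clip(upper - lower, 0.0, None)
    return float((values * thickness).sum() / thickness.sum())


hhl_column = np.array([3000.0, 2000.0, 1000.0, 0.0])
pv_column = np.array([3.0, 2.0, 1.0])
mean_pv = layer_mean(pv_column, hhl_column, z_bottom=500.0, z_top=2500.0)
```

In this example the top and bottom layers are only half inside the bounds, so they each get half the weight of the middle layer.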

## Summary

- Retrieve data from FDB in Python
- Read GRIB data into xarray
- Process the data with meteorological operators that are aware of the GRIB metadata
- Keep the GRIB metadata consistent throughout operations