In [1]:
#! pip install -U climetlab --quiet
#! pip install -U climetlab_s2s_ai_challenge --quiet

In [2]:
import climetlab as cml

import climetlab_s2s_ai_challenge

print(f"Climetlab version : {cml.__version__}")
print(f"Climetlab-s2s-ai-challenge plugin version : {climetlab_s2s_ai_challenge.__version__}")

Climetlab version : 0.9.1
Climetlab-s2s-ai-challenge plugin version : 0.8.1


# Observations data from training

Climetlab provides the observation datasets. They can be loaded as an `xarray.Dataset`:

In [3]:
cmlds = cml.load_dataset("s2s-ai-challenge-training-output-reference", date=20200102, parameter="t2m")
cmlds.to_xarray().coords

By downloading data from this dataset, you agree to the terms and conditions defined at https://apps.ecmwf.int/datasets/data/s2s/licence/. If you do not agree with such terms, do not download the data. 
 This dataset has been dowloaded from IRIDL. By downloading this data you also agree to the terms and conditions defined at https://iridl.ldeo.columbia.edu.


Coordinates:
    valid_time     (lead_time, forecast_time) datetime64[ns] dask.array<chunksize=(47, 20), meta=np.ndarray>
  * longitude      (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
  * latitude       (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
  * forecast_time  (forecast_time) datetime64[ns] 2000-01-02 ... 2019-01-02
  * lead_time      (lead_time) timedelta64[ns] 0 days 1 days ... 45 days 46 days
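
The `valid_time` coordinate above is two-dimensional because it is the sum of `forecast_time` and `lead_time`. A minimal sketch with synthetic dates (these are not challenge data, and the variable names are chosen only for illustration) shows how xarray broadcasting produces it:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic coordinates, chosen only for illustration.
forecast_time = pd.to_datetime(["1999-01-07", "2000-01-07", "2001-01-07"])
lead_time = pd.to_timedelta([1, 2, 3, 4], unit="D")

ft = xr.DataArray(forecast_time, dims="forecast_time", coords={"forecast_time": forecast_time})
lt = xr.DataArray(lead_time, dims="lead_time", coords={"lead_time": lead_time})

# Adding two 1-D arrays with different dimensions broadcasts them
# to a 2-D array with dimensions (forecast_time, lead_time).
valid_time = (ft + lt).rename("valid_time")
print(valid_time.dims, valid_time.shape)
```

Each entry of `valid_time` is the calendar date that a given forecast start and lead time refer to.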

# Observations data like forecast data

The `training-input` hindcast for `origin='ncep'` is only available for `forecast_time` from 1999 to 2010.

In [4]:
forecast = cml.load_dataset(
    "s2s-ai-challenge-training-input", date=[20100107], origin="ncep", parameter="tp", format="netcdf"
).to_xarray()
forecast.coords

By downloading data from this dataset, you agree to the terms and conditions defined at https://apps.ecmwf.int/datasets/data/s2s/licence/. If you do not agree with such terms, do not download the data. 


Coordinates:
  * realization    (realization) int64 0 1 2 3
  * forecast_time  (forecast_time) datetime64[ns] 1999-01-07 ... 2010-01-07
  * lead_time      (lead_time) timedelta64[ns] 1 days 2 days ... 43 days 44 days
  * latitude       (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
  * longitude      (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
    valid_time     (forecast_time, lead_time) datetime64[ns] dask.array<chunksize=(12, 44), meta=np.ndarray>

Download the `observations` dataset for precipitation flux `pr` (2m-temperature `t2m` works the same way). It has a single `time` dimension.
Use `climetlab_s2s_ai_challenge.extra.forecast_like_observations` to reshape the observations onto the forecast's `forecast_time` and `lead_time` dimensions. This also converts `pr` to total precipitation `tp`.

In [5]:
from climetlab_s2s_ai_challenge.extra import forecast_like_observations

obs_ds = cml.load_dataset("s2s-ai-challenge-observations", parameter=["pr"]).to_xarray()

obs_lead_time_forecast_time = forecast_like_observations(forecast, obs_ds)
obs_lead_time_forecast_time.coords

By downloading data from this dataset, you agree to the terms and conditions defined at https://apps.ecmwf.int/datasets/data/s2s/licence/. If you do not agree with such terms, do not download the data. 
 This dataset has been dowloaded from IRIDL. By downloading this data you also agree to the terms and conditions defined at https://iridl.ldeo.columbia.edu.


Coordinates:
  * lead_time      (lead_time) timedelta64[ns] 1 days 2 days ... 43 days 44 days
    valid_time     (forecast_time, lead_time) datetime64[ns] 1999-01-08 ... 2...
  * longitude      (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
  * latitude       (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
  * forecast_time  (forecast_time) datetime64[ns] 1999-01-07 ... 2010-01-07
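
To give an intuition for the reshaping step, here is a rough sketch on synthetic data. It uses plain xarray vectorized indexing to look up a time-indexed observation at every `valid_time`. It is not the plugin's implementation, and it does not attempt the `pr` to `tp` conversion. The variable `t2m` holds made-up values and `obs_like_sketch` is a name invented for this example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily observations with a single `time` dimension.
time = pd.date_range("2000-01-01", periods=20, freq="D")
obs = xr.Dataset({"t2m": ("time", np.arange(20.0))}, coords={"time": time})

# A hypothetical forecast layout: two start dates, three lead times.
forecast_time = pd.to_datetime(["2000-01-02", "2000-01-09"])
lead_time = pd.to_timedelta([1, 2, 3], unit="D")
valid_time = xr.DataArray(
    forecast_time.values[:, None] + lead_time.values[None, :],
    dims=("forecast_time", "lead_time"),
    coords={"forecast_time": forecast_time, "lead_time": lead_time},
)

# Indexing with a 2-D DataArray picks one observation per
# (forecast_time, lead_time) pair; the selected `time` values become
# a 2-D coordinate, renamed here to `valid_time`.
obs_like_sketch = obs.sel(time=valid_time).rename({"time": "valid_time"})
print(obs_like_sketch["t2m"].dims)
```

The result has the same `(forecast_time, lead_time)` layout as a forecast, so the two can be compared element by element.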

Calling `forecast_like_observations(forecast, obs_ds)` is equivalent to passing `like=forecast` to `.to_xarray()`.

In [6]:
obs_like = cml.load_dataset("s2s-ai-challenge-observations", parameter=["pr"]).to_xarray(like=forecast)

In [7]:
import xarray

# Raises an AssertionError if the two datasets differ.
xarray.testing.assert_equal(obs_like, obs_lead_time_forecast_time)

> Note that this works with any initialized forecast `xr.Dataset` that has the coordinate `valid_time(forecast_time, lead_time)`,
> i.e. any initialized NMME, SubX or S2S output.
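
Before passing a dataset from another source as the `like` argument, it can help to check that it has the expected coordinate. The helper below, `has_forecast_layout`, is hypothetical and not part of climetlab. It is a small sketch that only checks that `valid_time` exists and spans `forecast_time` and `lead_time`, in either order.

```python
import pandas as pd
import xarray as xr


def has_forecast_layout(ds):
    """Hypothetical check: does `ds` carry valid_time(forecast_time, lead_time)?"""
    if "valid_time" not in ds.coords:
        return False
    return set(ds["valid_time"].dims) == {"forecast_time", "lead_time"}


# Synthetic example of a dataset with the expected layout.
forecast_time = pd.to_datetime(["2010-01-07", "2010-01-14"])
lead_time = pd.to_timedelta([1, 2, 3], unit="D")
good = xr.Dataset(coords={"forecast_time": forecast_time, "lead_time": lead_time})
good = good.assign_coords(valid_time=good["forecast_time"] + good["lead_time"])

# A dataset with only a plain `time` dimension does not qualify.
bad = xr.Dataset(coords={"time": pd.date_range("2010-01-01", periods=5)})

print(has_forecast_layout(good), has_forecast_layout(bad))
```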