# Warming levels

``xs.get_warming_level`` returns the year, or the period, in which a given model reaches a given warming level.

The arguments of ``xs.get_warming_level`` are:

- `models`: Dataset, string, or list of strings. Strings should follow the format 'mip-era_source_experiment_member'
- `wl`: warming level.
- `window`: Number of years in the centered window during which the warming level is reached. Note that in the case of an even number, the IPCC standard is used (-n/2+1, +n/2).
- `tas_baseline_period`: The period over which the warming level is calculated, equivalent to "+0°C". Defaults to 1850-1900.
- `ignore_member`: The default `warming_level_csv` only contains data for 1 member. If you want a result regardless of the realization number, set this to True. This is only used when `models` is a Dataset.
- `return_horizon`: Whether to return the start/end of the horizon or to return the middle year.
    
If `models` is a list, the function returns a dictionary. Otherwise, it will return either a string or ['start_yr', 'end_yr'], depending on `return_horizon`. For entries that it fails to find in the csv, or for instances where a given warming level is not reached, the function returns None.
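
To make the `window` convention concrete, here is a minimal sketch of how a centered window could be derived from a middle year, following the (-n/2+1, +n/2) rule quoted above for even windows. The helper `window_bounds` is hypothetical and is not part of xscen; the odd-window case assumes a symmetric window.

```python
def window_bounds(middle_year: int, window: int) -> tuple[int, int]:
    """Return (start_year, end_year) of a centered window of `window` years.

    Hypothetical helper for illustration only; not part of xscen.
    """
    half = window // 2
    # Even windows cannot be symmetric: use (-n/2 + 1, +n/2).
    # Odd windows are assumed to be symmetric: (-(n - 1)/2, +(n - 1)/2).
    start = middle_year - half + (1 if window % 2 == 0 else 0)
    end = middle_year + half
    return start, end


print(window_bounds(2040, 20))  # (2031, 2050)
print(window_bounds(2040, 21))  # (2030, 2050)
```

In both cases the window contains exactly `window` years, counting the start and end years.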

In [None]:
import xscen as xs

# Multiple entries, returns a dictionary
print(
    xs.get_warming_level(
        [
            "CMIP6_CanESM5_ssp126_r1i1p1f1",
            "CMIP6_CanESM5_ssp245_r1i1p1f1",
            "CMIP6_CanESM5_ssp370_r1i1p1f1",
            "CMIP6_CanESM5_ssp585_r1i1p1f1",
        ],
        wl=2,
        window=20,
        return_horizon=False,
    )
)
# Returns a list
print(
    xs.get_warming_level(
        "CMIP6_CanESM5_ssp585_r1i1p1f1", wl=2, window=20, return_horizon=True
    )
)
# Only the middle year is requested, returns a string
print(
    xs.get_warming_level(
        "CMIP6_CanESM5_ssp585_r1i1p1f1", wl=2, window=20, return_horizon=False
    )
)
# +10°C is never reached, returns None
print(xs.get_warming_level("CMIP6_CanESM5_ssp585_r1i1p1f1", wl=10, window=20))

The rest of this notebook demonstrates a typical workflow for showing indicators by warming levels. First, initialize your project catalog.

In [None]:
# Basic imports
from pathlib import Path

import xarray as xr
import xesmf as xe
from matplotlib import pyplot as plt

output_folder = Path().absolute() / "_data"

project = {
    "title": "example-warminglevel",
    "description": "This is an example catalog for xscen's documentation.",
}

pcat = xs.ProjectCatalog(
    str(output_folder / "example-wl.json"),
    project=project,
    create=True,
    overwrite=True,
)

In [None]:
# Extract and regrid the data needed for the Tutorial

cat_sim = xs.search_data_catalogs(
    data_catalogs=[str(output_folder / "tutorial-catalog.json")],
    variables_and_freqs={"tas": "D"},
    other_search_criteria={"source": "NorESM2-MM", "activity": "ScenarioMIP"},
    periods=[[2000, 2050]],
    match_hist_and_fut=True,
    restrict_members={"ordered": 1},
)

region = {
    "name": "example-region",
    "method": "bbox",
    "tile_buffer": 1.5,
    "lon_bnds": [-68.5, -67.5],
    "lat_bnds": [48.5, 49.5],
}

ds_grid = xe.util.cf_grid_2d(-68.5, -67.5, 0.25, 48.5, 49.5, 0.25)
ds_grid.attrs["cat:domain"] = "region1"
for ds_id, dc in cat_sim.items():
    dset_dict = xs.extract_dataset(
        catalog=dc,
        region=region,
        xr_open_kwargs={"drop_variables": ["height", "time_bnds"]},
    )

    for key, ds in dset_dict.items():
        ds = xs.regrid_dataset(
            ds=ds,
            ds_grid=ds_grid,
            weights_location=str(output_folder / "gs-weights"),
            to_level="extracted",
        )
        filename = str(
            output_folder
            / f"wl_{ds.attrs['cat:id']}.{ds.attrs['cat:domain']}.{ds.attrs['cat:processing_level']}.{ds.attrs['cat:frequency']}.zarr"
        )
        chunks = xs.io.estimate_chunks(ds, dims=["time"], target_mb=50)
        xs.save_to_zarr(ds, filename, rechunk=chunks, mode="o")
        pcat.update_from_ds(ds=ds, path=filename, info_dict={"format": "zarr"})

## Subsetting the time period

``xs.subset_warming_level`` is used to subset a dataset for a window over which a given global warming level is reached.

Warming levels are computed individually in order to be able to calculate the ensemble weights properly (see [subsection below](#Ensemble-statistics)).

The function calls `get_warming_level`, so the arguments are essentially the same:

- `ds`: input dataset.
- `wl`: warming level.
- `window`: Number of years in the centered window during which the warming level is reached. Note that in the case of an even number, the IPCC standard is used (-n/2+1, +n/2).
- `tas_baseline_period`: The period over which the warming level is calculated, equivalent to "+0°C". Defaults to 1850-1900.
- `ignore_member`: The default `warming_level_csv` only contains data for 1 member. If you want a result regardless of the realization number, set this to True.
- `to_level`: Contrary to other methods, you can use "{wl}", "{period0}" and "{period1}" in the string to dynamically include `wl`, `tas_baseline_period[0]` and `tas_baseline_period[1]` in the `processing_level`.
- `wl_dim`: The string used to fill the new `warminglevel` dimension. You can use "{wl}", "{period0}" and "{period1}" in the string to dynamically include `wl`, `tas_baseline_period[0]` and `tas_baseline_period[1]`. If None, no new dimension will be added.
    
If the source, experiment, (member), and warming level are not found in the csv, the function returns None.
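
The placeholder mechanism used by `to_level` and `wl_dim` can be pictured with ordinary Python string formatting. This is a minimal sketch, not xscen's internal code. The two templates below are assumptions inferred from values that appear later in this notebook (`warminglevel-1.5vs1850-1900` and `+1.5Cvs1850-1900`).

```python
wl = 1.5
tas_baseline_period = ["1850", "1900"]

# Assumed templates, inferred from values seen later in this notebook.
to_level = "warminglevel-{wl}vs{period0}-{period1}"
wl_dim = "+{wl}Cvs{period0}-{period1}"

# Values substituted for the three supported placeholders.
fields = {
    "wl": wl,
    "period0": tas_baseline_period[0],
    "period1": tas_baseline_period[1],
}
print(to_level.format(**fields))  # warminglevel-1.5vs1850-1900
print(wl_dim.format(**fields))  # +1.5Cvs1850-1900
```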



In [None]:
dict_input = pcat.search(processing_level="extracted").to_dataset_dict(
    xarray_open_kwargs={"decode_timedelta": False}
)
wls = [1, 1.5]
for wl in wls:
    for id_input, ds_input in dict_input.items():
        ds_wl = xs.subset_warming_level(
            ds_input,
            wl=wl,
            window=20,
        )

        if ds_wl is not None:  # None is returned if the warming level was not reached
            # Save and update the catalog
            filename = str(
                output_folder
                / f"wl_{ds_wl.attrs['cat:id']}.{ds_wl.attrs['cat:domain']}.{ds_wl.attrs['cat:processing_level']}.{ds_wl.attrs['cat:frequency']}.zarr"
            )
            xs.save_to_zarr(ds_wl, filename, mode="o")
            pcat.update_from_ds(ds=ds_wl, path=filename, info_dict={"format": "zarr"})

In [None]:
ds_wl = pcat.search(
    id="CMIP6_ScenarioMIP_NCC_NorESM2-MM_ssp585_r1i1p1f1_example-region",
    processing_level="warminglevel-1.5vs1850-1900",
).to_dataset()
display(ds_wl)

## Producing the horizons

The extracted and subsetted dataset can be passed to ``xs.aggregate.produce_horizon`` to calculate indicators and the climatological mean. 

Since the years are meaningless for warming levels, and are even detrimental to making ensemble statistics, the function also formats the output such that 'time' and 'year' information is removed, while the seasons/months are unstacked to different coordinates. Hence, the single output dataset can contain indicators of different frequencies.

The arguments of ``xs.aggregate.produce_horizon`` are:

- `ds`: input dataset.
- `indicators`: As in `compute_indicators`
- `period`: Period to cut. If None, the whole time series is used. Useful in the case when the timeseries was already extracted by ``xs.extract.subset_warming_level``.
- `to_level`: The processing level to assign to the output. Use "{wl}", "{period0}" and "{period1}" in the string to dynamically include the first value of the `warminglevel` coord of ds if it exists, `period[0]` and `period[1]`.

In [None]:
dict_input = pcat.search(processing_level="^warminglevel+").to_dataset_dict(
    xarray_open_kwargs={"decode_timedelta": False}
)

for id_input, ds_input in dict_input.items():
    ds_hor_wl = xs.produce_horizon(
        ds_input, indicators="samples/indicators.yml", to_level="clim{wl}"
    )

    # Save
    filename = str(
        output_folder
        / f"wl_{ds_hor_wl.attrs['cat:id']}.{ds_hor_wl.attrs['cat:domain']}.{ds_hor_wl.attrs['cat:processing_level']}.{ds_hor_wl.attrs['cat:frequency']}.zarr"
    )
    xs.save_to_zarr(ds_hor_wl, filename, mode="o")
    pcat.update_from_ds(ds=ds_hor_wl, path=filename, info_dict={"format": "zarr"})

In [None]:
display(ds_hor_wl)

### Reference horizon

For the purpose of deltas and future ensemble statistics, a time-based reference horizon is often required. That dataset can also be created by calling ``xs.aggregate.produce_horizon``, but with the reference period:

In [None]:
dict_input = pcat.search(processing_level="extracted").to_dataset_dict(
    xarray_open_kwargs={"decode_timedelta": False}
)
for id_input, ds_input in dict_input.items():
    ds_hor = xs.produce_horizon(
        ds_input,
        period=["2001", "2020"],
        indicators="samples/indicators.yml",
        to_level="clim{period0}-{period1}",
    )

    # Save
    filename = str(
        output_folder
        / f"wl_{ds_hor.attrs['cat:id']}.{ds_hor.attrs['cat:domain']}.{ds_hor.attrs['cat:processing_level']}.{ds_hor.attrs['cat:frequency']}.zarr"
    )
    xs.save_to_zarr(ds_hor, filename, mode="o")
    pcat.update_from_ds(ds=ds_hor, path=filename, info_dict={"format": "zarr"})

In [None]:
display(ds_hor)

## Deltas

This step is done as in the [Getting Started](2_getting_started.ipynb#Computing-deltas) Notebook, with the difference that for each simulation, the warming level and the reference horizon need to be concatenated in order to pass them to `xs.compute_deltas`.
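
As a rough picture of why both horizons must sit in the same object, the sketch below computes a delta relative to a named reference horizon using plain Python. It assumes an absolute difference; the options actually offered by `xs.compute_deltas` are not described in this notebook. The function `absolute_deltas` and the indicator values are hypothetical.

```python
# Hypothetical indicator values for one grid cell (illustration only).
horizons = {
    "2001-2020": 1500.0,  # time-based reference horizon
    "+1.5Cvs1850-1900": 1725.0,  # warming-level horizon
}


def absolute_deltas(values: dict, reference_horizon: str) -> dict:
    """Subtract the reference horizon from every other horizon."""
    ref = values[reference_horizon]
    return {h: v - ref for h, v in values.items() if h != reference_horizon}


print(absolute_deltas(horizons, "2001-2020"))  # {'+1.5Cvs1850-1900': 225.0}
```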

In [None]:
dict_wl = pcat.search(processing_level="^clim+.*C").to_dataset_dict(
    xarray_open_kwargs={"decode_timedelta": False}
)
ref_period = "2001-2020"
dict_hor = pcat.search(processing_level=f"clim{ref_period}").to_dataset_dict(
    xarray_open_kwargs={"decode_timedelta": False}
)

for id_wl, ds_wl in dict_wl.items():
    # for this simulation, find the accompanying reference horizon
    level = ds_wl.attrs["cat:processing_level"]
    id_hor = id_wl.replace(level, f"clim{ref_period}")
    ds_hor = dict_hor[id_hor]

    # concat warming level and reference
    ds_concat = xr.concat([ds_wl, ds_hor], dim="horizon")

    # compute delta
    ds_delta = xs.aggregate.compute_deltas(
        ds=ds_concat, reference_horizon=ref_period, to_level=f"delta-{level}"
    )

    # Save
    filename = str(
        output_folder
        / f"wl_{ds_delta.attrs['cat:id']}.{ds_delta.attrs['cat:domain']}.{ds_delta.attrs['cat:processing_level']}.{ds_delta.attrs['cat:frequency']}.zarr"
    )
    xs.save_to_zarr(ds_delta, filename, mode="o")
    pcat.update_from_ds(ds=ds_delta, path=filename, info_dict={"format": "zarr"})

In [None]:
display(ds_delta)

## Ensemble statistics

Even more than with time-based horizons, the first step of ensemble statistics should be to generate the weights. Indeed, if a model has 3 experiments reaching a given warming level, we want it to have the same weight as a model with only 2 experiments reaching that warming level.
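
The reasoning above can be illustrated with a simplified weighting scheme in which each simulation receives the inverse of the number of experiments its model contributes. This is only a sketch of the idea, not the algorithm implemented by `xs.ensembles.generate_weights`. The model and experiment names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (model, experiment) pairs that reach a given warming level.
simulations = [
    ("ModelA", "ssp126"),
    ("ModelA", "ssp245"),
    ("ModelA", "ssp585"),
    ("ModelB", "ssp245"),
    ("ModelB", "ssp585"),
]

# Number of experiments contributed by each model.
n_per_model = Counter(model for model, _ in simulations)

# Each simulation is weighted by 1 / (number of experiments of its model).
weights = {sim: 1 / n_per_model[sim[0]] for sim in simulations}

# Sum the weights per model to check that every model counts equally.
totals = defaultdict(float)
for (model, _), weight in weights.items():
    totals[model] += weight

print(dict(totals))
```

With this scheme, both models end up with the same total weight even though one contributes three simulations and the other two.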

<div class="alert alert-warning"> <b>WARNING</b>
    
`xs.ensembles.generate_weights` is currently purely based on metadata, and thus cannot distinguish subtleties about which realization reaches which warming level if multiple experiments are concatenated together before passing them to the function. The results are likely to be wrong, which is why each warming level needs to be computed individually.
</div>

Next, the weights and the datasets can be passed to `xs.ensemble_stats` to calculate the ensemble statistics.

In [None]:
for wl in wls:
    datasets = pcat.search(
        processing_level=f"delta-clim+{wl}Cvs1850-1900"
    ).to_dataset_dict(xarray_open_kwargs={"decode_timedelta": False})

    weights = xs.ensembles.generate_weights(datasets=datasets, independence_level="all")

    ds_ens = xs.ensemble_stats(
        datasets=datasets,
        common_attrs_only=True,
        weights=weights,
        statistics={"ensemble_mean_std_max_min": None},
        to_level=f"ensemble-deltas+{wl}C",
    )

    # It is sometimes useful to keep track of how many realisations made the ensemble.
    ds_ens.horizon.attrs["ensemble_size"] = len(datasets)

    filename = str(
        output_folder
        / f"wl_{ds_ens.attrs['cat:id']}.{ds_ens.attrs['cat:domain']}.{ds_ens.attrs['cat:processing_level']}.{ds_ens.attrs['cat:frequency']}.zarr"
    )
    xs.save_to_zarr(ds_ens, filename, mode="o")
    pcat.update_from_ds(ds=ds_ens, path=filename, info_dict={"format": "zarr"})

    # Create a figure
    plt.figure()
    ds_ens["growing_degree_days_delta_2001_2020_mean"].sel(
        horizon=f"+{wl}Cvs1850-1900"
    ).plot.imshow(vmin=0, vmax=400, cmap="inferno")

In [None]:
display(ds_ens)