
# Introduction to `cf_xarray`

This notebook is a brief introduction to `cf_xarray`'s current capabilities.

In [None]:
import cf_xarray
import numpy as np
import xarray as xr

Let's set up the datasets used below, starting with xarray's `air_temperature` tutorial dataset.

In [None]:
ds = xr.tutorial.load_dataset("air_temperature")
ds

This one is inspired by POP model output and illustrates how the `coordinates` attribute is interpreted.

In [None]:
pop = xr.Dataset()
pop.coords["TLONG"] = (("nlat", "nlon"), np.ones((20, 30)), {"units": "degrees_east"})
pop.coords["TLAT"] = (("nlat", "nlon"), 2 * np.ones((20, 30)), {"units": "degrees_north"})
pop.coords["ULONG"] = (("nlat", "nlon"), 0.5 * np.ones((20, 30)), {"units": "degrees_east"})
pop.coords["ULAT"] = (("nlat", "nlon"), 2.5 * np.ones((20, 30)), {"units": "degrees_north"})

pop["UVEL"] = (
    ("nlat", "nlon"),
    np.ones((20, 30)) * 15,
    {"coordinates": "ULONG ULAT"},
)
pop["TEMP"] = (
    ("nlat", "nlon"),
    np.ones((20, 30)) * 15,
    {"coordinates": "TLONG TLAT"},
)
pop

This synthetic dataset has multiple `X` and `Y` coords. An example would be model output on a staggered grid.

In [None]:
multiple = xr.Dataset()
multiple.coords["x1"] = ("x1", range(30), {"axis": "X"})
multiple.coords["y1"] = ("y1", range(20), {"axis": "Y"})
multiple.coords["x2"] = ("x2", range(10), {"axis": "X"})
multiple.coords["y2"] = ("y2", range(5), {"axis": "Y"})

multiple["v1"] = (("x1", "y1"), np.ones((30, 20)) * 15)
multiple["v2"] = (("x2", "y2"), np.ones((10, 5)) * 15)
multiple

## What attributes have been discovered?

In [None]:
ds.lon

`ds.lon` has the attribute `axis: X`. This means that `cf_xarray` can identify the `'X'` axis as being represented by the `lon` variable.

It can also use the `standard_name` and `units` attributes to infer that `lon` is "Longitude". To see the variable names that `cf_xarray` can infer, use `.cf.describe()`.

In [None]:
ds.cf.describe()

For `pop`, only `latitude` and `longitude` are detected, not `X` or `Y`. Please comment here: https://github.com/xarray-contrib/cf-xarray/issues/23 if you have opinions about this behaviour.

In [None]:
pop.cf.describe()

For `multiple`, several `X` and `Y` coordinates are detected.

In [None]:
multiple.cf.describe()
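
As a rough illustration of the kind of detection described in this section, the sketch below scans coordinate attributes using plain xarray. `discover` and `UNITS_TO_NAME` are hypothetical names invented for this example, and only the `axis` and `units` attributes are checked; `cf_xarray`'s actual criteria are broader (it also looks at `standard_name`, as noted above).

```python
import numpy as np
import xarray as xr

# Hypothetical, simplified lookup table; not cf_xarray's real rule set.
UNITS_TO_NAME = {"degrees_east": "longitude", "degrees_north": "latitude"}


def discover(ds):
    """Map axis / coordinate names to the coordinate variables that match them."""
    found = {}
    for name, coord in ds.coords.items():
        # An explicit `axis` attribute tags the variable as X, Y, Z or T.
        axis = coord.attrs.get("axis")
        if axis is not None:
            found.setdefault(axis, []).append(name)
        # `units` can identify latitude / longitude even without `axis`.
        cf_name = UNITS_TO_NAME.get(coord.attrs.get("units"))
        if cf_name is not None:
            found.setdefault(cf_name, []).append(name)
    return found


demo = xr.Dataset()
demo.coords["x1"] = ("x1", np.arange(3), {"axis": "X"})
demo.coords["x2"] = ("x2", np.arange(2), {"axis": "X"})
demo.coords["TLONG"] = (("x1", "x2"), np.ones((3, 2)), {"units": "degrees_east"})

result = discover(demo)
print(result)
```

Note that `TLONG` is reported under `longitude` only, not under `X`, which mirrors the behaviour described for `pop` above.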

## Feature: Accessing coordinate variables

`.cf` implements `__getitem__` to allow easy access to coordinate and axis variables.

In [None]:
ds.cf["X"]

`Dataset.cf[...]` returns a `Dataset`, possibly with multiple variables.

In [None]:
pop.cf["longitude"]

`DataArray.cf[...]` returns a single `DataArray`, parsing the `coordinates` attribute if present, so we correctly get the `TLONG` variable and not the `ULONG` variable.

In [None]:
pop.TEMP.cf["longitude"]
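
To make the role of the `coordinates` attribute concrete, here is a minimal sketch in plain xarray. `longitude_names` is a hypothetical helper, the `SALT` variable is made up for the example, and matching on `units == "degrees_east"` alone is a simplifying assumption rather than `cf_xarray`'s actual logic.

```python
import numpy as np
import xarray as xr


def longitude_names(da):
    """Names of longitude-like coords on `da`, honouring `coordinates` if set."""
    listed = da.attrs.get("coordinates", "").split()
    # Without a `coordinates` attribute, consider every attached coordinate.
    candidates = listed if listed else list(da.coords)
    return [
        name
        for name in candidates
        if name in da.coords and da.coords[name].attrs.get("units") == "degrees_east"
    ]


grid = xr.Dataset()
grid.coords["TLONG"] = (("nlat", "nlon"), np.ones((2, 3)), {"units": "degrees_east"})
grid.coords["ULONG"] = (("nlat", "nlon"), 0.5 * np.ones((2, 3)), {"units": "degrees_east"})
grid["TEMP"] = (("nlat", "nlon"), np.full((2, 3), 15.0), {"coordinates": "TLONG"})
grid["SALT"] = (("nlat", "nlon"), np.full((2, 3), 35.0))  # no `coordinates` attribute

temp_lon = longitude_names(grid["TEMP"])  # restricted by the attribute
salt_lon = longitude_names(grid["SALT"])  # ambiguous: both candidates match
print(temp_lon, salt_lon)
```

Both longitude variables are attached to `TEMP` because they share its dimensions, so the `coordinates` attribute is what disambiguates them.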

## Feature: Rewriting arguments

`cf_xarray` can rewrite arguments for a large number of xarray functions. By this I mean that instead of specifying, say, `dim="lon"`, you can pass `dim="X"` or `dim="longitude"`, and `cf_xarray` will rewrite that to `dim="lon"` based on the attributes present in the dataset.

Here are a few examples:

### Slicing

In [None]:
ds.air.cf.isel(T=1)

Slicing will expand a single key like `X` to multiple dimensions if those dimensions are tagged with `axis: X`.

In [None]:
multiple.cf.isel(X=1, Y=1)
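
The key expansion can be pictured as a small rewriting step that runs before the call reaches xarray. The sketch below is an assumption about the general idea, not `cf_xarray`'s implementation: `expand_axis_keys` is a hypothetical helper that matches keys against the `axis` attribute only and passes everything else through unchanged.

```python
import numpy as np
import xarray as xr


def expand_axis_keys(ds, indexers):
    """Replace keys such as "X" with every coordinate tagged `axis: X`."""
    expanded = {}
    for key, value in indexers.items():
        matches = [
            name for name, coord in ds.coords.items() if coord.attrs.get("axis") == key
        ]
        if matches:
            # One special key may fan out to several real dimension names.
            for name in matches:
                expanded[name] = value
        else:
            # Not a special name: keep the key as given.
            expanded[key] = value
    return expanded


stag = xr.Dataset()
stag.coords["x1"] = ("x1", np.arange(4), {"axis": "X"})
stag.coords["x2"] = ("x2", np.arange(3), {"axis": "X"})
stag["v1"] = ("x1", np.arange(4) * 10.0)
stag["v2"] = ("x2", np.arange(3) * 100.0)

rewritten = expand_axis_keys(stag, {"X": 1})
subset = stag.isel(rewritten)  # plain xarray call with the rewritten keys
passthrough = expand_axis_keys(stag, {"x1": 2})
print(rewritten, passthrough)
```

The same rewriting idea applies to `dim=` arguments in reductions, which is why one special name can stand in for a list of dimensions there as well.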

### Reductions

In [None]:
ds.air.cf.mean("X")

Expanding to multiple dimensions is also supported.

In [None]:
# takes the mean along ["x1", "x2"]
multiple.cf.mean("X")

### Plotting

In [None]:
ds.air.cf.isel(time=1).cf.plot(x="X", y="Y")

In [None]:
ds.air.cf.isel(T=1, Y=[0, 1, 2]).cf.plot(x="longitude", hue="latitude")

`cf_xarray` also supports faceted plots.

In [None]:
ds.air.isel(time=slice(4)).cf.plot(x="longitude", y="latitude", col="time")

### Resample & groupby

In [None]:
ds.cf.resample(T="D").mean()

In [None]:
ds.cf.groupby("time.month").mean("longitude")

### Rolling & coarsen 

In [None]:
ds.cf.rolling(X=5).mean()

`coarsen` works, but everything after it will break because of xarray bug https://github.com/pydata/xarray/issues/4120

`ds.isel(lon=slice(50)).cf.coarsen(Y=5, X=10).mean()`

#### miscellaneous features

You can mix "special names" and variable names.

In [None]:
ds.cf.groupby("time.month").mean(["lat", "X"])

## Feature: Weight by Cell Measures

`cf_xarray` can weight by the cell measure variables `"area"` and `"volume"` if the appropriate `cell_measures` attribute is set.

In [None]:
# Let's make some weights (not sure if this is right)
ds.coords["cell_area"] = (
    np.cos(ds.air.cf["latitude"] * np.pi / 180)
    * xr.ones_like(ds.air.cf["longitude"])
    * 105e3
    * 110e3
)
# and set proper attributes
ds.air.attrs["cell_measures"] = "area: cell_area"

In [None]:
# area-weighted mean
ds.air.cf.weighted("area").mean(["latitude", "time"]).cf.plot(x="longitude")
# unweighted mean, for comparison
ds.air.mean(["lat", "time"]).cf.plot(x="longitude")
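
For a numerical check of what the weighting does, the sketch below resolves the `cell_measures` attribute by hand and passes the result to xarray's own `DataArray.weighted`. `measure_variable` is a hypothetical helper, the two-latitude dataset is synthetic, and the weights are plain `cos(latitude)` without the scale factors used above (a constant factor cancels in a weighted mean).

```python
import numpy as np
import xarray as xr


def measure_variable(da, measure):
    """Look up the coordinate named for `measure` in the `cell_measures` attribute."""
    # "area: cell_area volume: cell_vol" -> {"area": "cell_area", "volume": "cell_vol"}
    tokens = da.attrs.get("cell_measures", "").replace(":", " ").split()
    mapping = dict(zip(tokens[0::2], tokens[1::2]))
    return da.coords[mapping[measure]]


toy = xr.Dataset(
    {
        "air": (
            ("lat", "lon"),
            np.array([[0.0, 0.0], [3.0, 3.0]]),
            {"cell_measures": "area: cell_area"},
        )
    },
    coords={"lat": [0.0, 60.0], "lon": [0.0, 90.0]},
)
# cos(latitude) weights broadcast over longitude: 1.0 at 0 degrees, 0.5 at 60 degrees.
toy.coords["cell_area"] = np.cos(np.deg2rad(toy["lat"])) * xr.ones_like(toy["lon"])

air = toy["air"]
weighted_mean = air.weighted(measure_variable(air, "area")).mean("lat")
plain_mean = air.mean("lat")
print(weighted_mean.values, plain_mean.values)
```

The weighted mean is pulled toward the equatorial row because cells there get twice the weight of cells at 60 degrees.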