import numpy as np

- Dataset

```python
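import matplotlib.pyplot as plt
import xarray as xr

# assumption: `ds` is the tutorial dataset that the Dask section below also opens
ds = xr.tutorial.open_dataset("air_temperature")

# the Dataset holds the datacube
ds
# access a data array by name, dictionary-style
ds["air"]
# dimension names; these usually refer to the spatial dimensions and time
ds.air.dims
# coordinates: a simple data container for coordinate variables
ds.air.coords
# datacube metadata
ds.air.attrs
# underlying data (e.g. a numpy array)
ds.air.data
# what is the type of the underlying data?
type(ds.air.data)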

# plot
# Without xarray
lat = ds.air.lat.data  # numpy array
lon = ds.air.lon.data  # numpy array
temp = ds.air.data  # numpy array
plt.figure()
plt.pcolormesh(lon, lat, temp[0, :, :]);

# With xarray
ds.air.isel(time=0).plot(x="lon");
# Use dimension names instead of axis numbers
ds.air.mean(dim="time").plot(x="lon")

# label-based indexing using .sel
# pull out data for all of 2013-May
ds.sel(time="2013-05")
# demonstrate slicing
ds.sel(time=slice("2013-05", "2013-07"))
ds.sel(time="2013")
# demonstrate "nearest" indexing
ds.sel(lon=240.2, method="nearest")
# "nearest" indexing at multiple points
ds.sel(lon=[240.125, 234], lat=[40.3, 50.3], method="nearest")

# position-based indexing using .isel
# plain positional indexing on the underlying array
ds.air.data[0, 2, 3]
# pull out time index 0, lat index 2, and lon index 3
ds.air.isel(time=0, lat=2, lon=3)  # much clearer than ds.air[0, 2, 3]
# demonstrate slicing
ds.air.isel(lat=slice(10))

```
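The difference between label-based and position-based selection is easy to check on a small synthetic array. The sketch below uses made-up values and coordinates (an assumption for illustration, not the tutorial dataset). It also shows that passing plain lists to `.sel` selects every combination of the requested labels, while passing `DataArray` indexers that share a dimension (here the hypothetical name `points`) selects individual points.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for ds.air: 3 latitudes x 4 longitudes (values are made up)
da = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("lat", "lon"),
    coords={"lat": [50.0, 47.5, 45.0], "lon": [230.0, 232.5, 235.0, 237.5]},
)

# Label-based: pick the grid point whose coordinates are closest to the request
by_label = da.sel(lat=47.0, lon=233.0, method="nearest")

# Position-based: the same grid point addressed by integer index
by_position = da.isel(lat=1, lon=1)

# Plain lists select a block: every requested lat paired with every requested lon
block = da.sel(lat=[47.0, 50.3], lon=[233.0, 230.1], method="nearest")

# DataArray indexers sharing a dimension select individual (lat, lon) pairs
points = da.sel(
    lat=xr.DataArray([47.0, 50.3], dims="points"),
    lon=xr.DataArray([233.0, 230.1], dims="points"),
    method="nearest",
)

print(float(by_label), float(by_position))
print(block.shape, points.shape)
```

Both `by_label` and `by_position` refer to the same grid point; the first is found from coordinate values, the second from its position in the array.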

## Concepts for computation

Consider calculating the *mean air temperature per unit surface area* for this dataset. Because latitude and longitude are spherical coordinates on Earth's surface, the surface area of each 2.5° × 2.5° grid cell changes with latitude: cells shrink as you move away from the equator! This is because *latitudinal length* is fixed ($\delta Lat = R \delta \phi$), but *longitudinal length varies with latitude* ($\delta Lon = R \delta \lambda \cos(\phi)$).

So the [area element for lat-lon coordinates](https://en.wikipedia.org/wiki/Spherical_coordinate_system#Integration_and_differentiation_in_spherical_coordinates) is


$$ \delta A = R^2 \delta\phi \, \delta\lambda \cos(\phi) $$

where $\phi$ is latitude, $\delta \phi$ is the spacing of the points in latitude, $\delta \lambda$ is the spacing of the points in longitude, and $R$ is Earth's radius. (In this formula, $\phi$ and $\lambda$ are measured in radians.)

```python
# Earth's average radius in meters
R = 6.371e6

# Coordinate spacing for this dataset is 2.5 x 2.5 degrees
dϕ = np.deg2rad(2.5)
dλ = np.deg2rad(2.5)

dlat = R * dϕ * xr.ones_like(ds.air.lon)
dlon = R * dλ * np.cos(np.deg2rad(ds.air.lat))
```

There are two concepts here:

1. You can call functions like `np.cos` and `np.deg2rad` ("numpy ufuncs") on Xarray objects and receive an Xarray object back.
1. We used `ones_like` to create a DataArray that looks like `ds.air.lon` in all respects, except that the data are all ones.

```python
# returns an xarray DataArray!
np.cos(np.deg2rad(ds.lat))
# area
cell_area = dlon * dlat
cell_area
```
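One way to sanity-check the area element is to sum the cell areas over a full sphere, which should come out close to $4 \pi R^2$. The sketch below assumes a hypothetical global 2.5° grid of cell centers (the tutorial dataset itself is not assumed to be global), and it relies on Xarray broadcasting a `lat`-only array against a `lon`-only array into a `(lat, lon)` array.

```python
import numpy as np
import xarray as xr

R = 6.371e6  # Earth's average radius in meters
step = 2.5   # grid spacing in degrees

# Hypothetical global grid of cell centers (assumption: not the tutorial dataset)
lat_vals = np.arange(-90 + step / 2, 90, step)
lon_vals = np.arange(0, 360, step)
lat = xr.DataArray(lat_vals, dims="lat", coords={"lat": lat_vals})
lon = xr.DataArray(lon_vals, dims="lon", coords={"lon": lon_vals})

dphi = np.deg2rad(step)
dlam = np.deg2rad(step)

dlat = R * dphi * xr.ones_like(lon)        # north-south cell length, dims: (lon,)
dlon = R * dlam * np.cos(np.deg2rad(lat))  # east-west cell length, dims: (lat,)

# Multiplying arrays with different dimensions broadcasts them by name
cell_area = dlon * dlat

total = float(cell_area.sum())
sphere = 4 * np.pi * R**2
print(cell_area.dims, total / sphere)
```

The ratio is not exactly 1 because the formula treats each cell as a small flat patch evaluated at its center latitude.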

Tip: If you notice extra NaNs or missing points after an Xarray computation, your coordinates were probably not aligned exactly.

For more, see
[the Xarray documentation](https://docs.xarray.dev/en/stable/user-guide/computation.html#automatic-alignment). [This tutorial notebook](https://tutorial.xarray.dev/fundamentals/02.3_aligning_data_objects.html) also covers alignment and broadcasting (*highly recommended*)

To make sure variables are aligned as you think they are, do the following:

```python
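# raises an error if coordinates differ; `cell_area_bad` is assumed to be a copy of cell_area with slightly shifted lat values
xr.align(cell_area_bad, ds.air, join="exact")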
```
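To see what a silent misalignment looks like, here is a minimal sketch on synthetic data. The arrays `field` and `weights_bad` are hypothetical stand-ins, with the latitudes of the second shifted by 1e-5 degrees on purpose.

```python
import numpy as np
import xarray as xr

lat = np.array([45.0, 47.5, 50.0])
field = xr.DataArray(np.ones(3), dims="lat", coords={"lat": lat})

# Hypothetical weights whose latitudes are off by 1e-5 degrees
weights_bad = xr.DataArray(np.ones(3), dims="lat", coords={"lat": lat + 1e-5})

# Arithmetic keeps only the coordinate labels present in both operands,
# so the product silently ends up with no points at all
product = field * weights_bad
print(product.sizes["lat"])

# join="exact" turns the silent mismatch into an error
try:
    xr.align(weights_bad, field, join="exact")
    aligned = True
except ValueError as err:
    aligned = False
    print("not aligned:", type(err).__name__)
```

Checking with `join="exact"` before a long computation is cheap, and it fails loudly instead of returning an empty or NaN-filled result.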

Xarray has some very useful high-level objects that let you do common
computations:

1. `groupby` :
   [Bin data in to groups and reduce](https://docs.xarray.dev/en/stable/groupby.html)
1. `resample` :
   [Groupby specialized for time axes. Either downsample or upsample your data.](https://docs.xarray.dev/en/stable/user-guide/time-series.html#resampling-and-grouped-operations)
1. `rolling` :
   [Operate on rolling windows of your data e.g. running mean](https://docs.xarray.dev/en/stable/user-guide/computation.html#rolling-window-operations)
1. `coarsen` :
   [Downsample your data](https://docs.xarray.dev/en/stable/user-guide/computation.html#coarsen-large-arrays)
1. `weighted` :
   [Weight your data before reducing](https://docs.xarray.dev/en/stable/user-guide/computation.html#weighted-array-reductions)


```python
# groupby
ds.groupby("time.season")
seasonal_mean = ds.groupby("time.season").mean()

# The seasons are out of order (they are alphabetically sorted).
# This is a common annoyance. The solution is to use .sel to change the order of labels
# "DJF" = Dec, Jan, Feb
seasonal_mean = seasonal_mean.sel(season=["DJF", "MAM", "JJA", "SON"])
# resample to monthly frequency ("M" means month-end; recent pandas versions prefer the alias "ME")
ds.resample(time="M").mean()
# weight by cell_area and take mean over (time, lon)
ds.weighted(cell_area).mean(["lon", "time"]).air.plot(y="lat");
```
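`rolling` and `coarsen` from the list above are not exercised in the tutorial code, so here is a minimal sketch of both, plus a hand-checkable `weighted` mean. The series is synthetic (values 0 to 9 on made-up dates), chosen so the results can be verified by eye.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series: values 0..9 (made up for illustration)
time = pd.date_range("2013-01-01", periods=10, freq="D")
da = xr.DataArray(np.arange(10.0), dims="time", coords={"time": time})

# rolling: 3-point running mean; the first two windows are incomplete, so NaN
running = da.rolling(time=3).mean()

# coarsen: average non-overlapping blocks of 5 points
blocks = da.coarsen(time=5).mean()

# weighted: the mean of [1, 3] with weights [1, 3] is (1*1 + 3*3) / (1 + 3)
vals = xr.DataArray([1.0, 3.0], dims="x")
wmean = vals.weighted(xr.DataArray([1.0, 3.0], dims="x")).mean("x")

print(running.values)
print(blocks.values)
print(float(wmean))
```

`rolling` keeps the original length of the time axis, while `coarsen` shortens it by the block size.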

## Plotting

```python
# facet the seasonal_mean
seasonal_mean.air.plot(col="season", col_wrap=2);
# contours
seasonal_mean.air.plot.contour(col="season", levels=20, add_colorbar=True);
# line plots work too: one line per season
seasonal_mean.air.mean("lon").plot.line(hue="season", y="lat");
```

```python
# write to netCDF
ds.to_netcdf("my-example-dataset.nc")
# read from disk
fromdisk = xr.open_dataset("my-example-dataset.nc")
# check that the two are identical
ds.identical(fromdisk)
# convert to pandas dataframe
df = ds.isel(time=slice(10)).to_dataframe()
# convert dataframe to xarray
df.to_xarray()
```

## Using Dask
```python
# demonstrate dask dataset
dasky = xr.tutorial.open_dataset(
    "air_temperature",
    chunks={"time": 10},  # 10 time steps in each block
)

dasky.air
# demonstrate lazy mean
dasky.air.mean("lat")
# "compute" the mean
dasky.air.mean("lat").compute()
```

## HoloViz
Quickly generate interactive plots from your data!

The [`hvplot` package](https://hvplot.holoviz.org/user_guide/Gridded_Data.html) attaches itself to all
xarray objects under the `.hvplot` namespace. So instead of `.plot`, use `.hvplot`.


```python
import hvplot.xarray

ds.air.hvplot(groupby="time", clim=(270, 300), widget_location='bottom')
```

### cf_xarray 

[cf_xarray](https://cf-xarray.readthedocs.io/) is a project that tries to
let you make use of other CF attributes that xarray ignores. It attaches itself
to all xarray objects under the `.cf` namespace.

Where xarray allows you to specify dimension names for analysis, `cf_xarray`
lets you specify logical names like `"latitude"` or `"longitude"` instead as
long as the appropriate CF attributes are set.

For example, the `"longitude"` dimension in different files might be labelled as `lon`, `LON`, `long`, `x`, and so on, but cf_xarray lets you always refer to the logical name `"longitude"` in your code.

The following `mean` operation will work with any dataset that has appropriate
attributes set that allow detection of the "latitude" variable (e.g.
`units: "degrees_north"` or `standard_name: "latitude"`).


```python
import cf_xarray
# describe cf attributes in dataset
ds.air.cf
# demonstrate equivalent of .mean("lat")
ds.air.cf.mean("latitude")
# demonstrate indexing
ds.air.cf.sel(longitude=242.5, method="nearest")
```
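To make the idea of a logical name concrete, the sketch below looks up a latitude coordinate from its CF attributes using plain xarray. `find_latitude` and `make` are hypothetical helpers written for this illustration. They are not part of cf_xarray, and they only check the two attribute conventions mentioned above, so treat this as a picture of the concept rather than of how cf_xarray works internally.

```python
import xarray as xr


def find_latitude(obj):
    """Return the name of the first coordinate tagged as latitude (hypothetical helper)."""
    for name, coord in obj.coords.items():
        if coord.attrs.get("standard_name") == "latitude":
            return name
        if coord.attrs.get("units") == "degrees_north":
            return name
    raise KeyError("no coordinate with latitude-like CF attributes")


def make(lat_name, attrs):
    """Build a tiny array whose latitude coordinate has the given name and attrs."""
    return xr.DataArray(
        [1.0, 3.0],
        dims=lat_name,
        coords={lat_name: (lat_name, [10.0, 20.0], attrs)},
    )


# Two datasets that label latitude differently (made-up examples)
a = make("lat", {"standard_name": "latitude"})
b = make("LAT", {"units": "degrees_north"})

# The same code works on both, because the lookup goes through attributes
mean_a = a.mean(find_latitude(a))
mean_b = b.mean(find_latitude(b))
print(find_latitude(a), find_latitude(b), float(mean_a), float(mean_b))
```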

### Other cool packages

- [xgcm](https://xgcm.readthedocs.io/) : grid-aware operations with xarray
  objects
- [xrft](https://xrft.readthedocs.io/) : Fourier transforms with xarray
- [xclim](https://xclim.readthedocs.io/) : calculating climate indices with
  xarray objects
- [intake-xarray](https://intake-xarray.readthedocs.io/) : forget about file
  paths
- [rioxarray](https://corteva.github.io/rioxarray/stable/index.html) : raster
  files and xarray
- [xesmf](https://xesmf.readthedocs.io/) : regrid using ESMF
- [MetPy](https://unidata.github.io/MetPy/latest/index.html) : tools for working
  with weather data

Check the Xarray [Ecosystem](https://docs.xarray.dev/en/stable/ecosystem.html) page and [this tutorial](https://tutorial.xarray.dev/intermediate/xarray_ecosystem.html) for even more packages and demonstrations.

## Next

1. Read the [tutorial](https://tutorial.xarray.dev) material and [user guide](https://docs.xarray.dev/en/stable/user-guide/index.html)
1. See the description of [common terms](https://docs.xarray.dev/en/stable/terminology.html) used in the xarray documentation.
1. Answers to common questions on "how to do X" with Xarray are [here](https://docs.xarray.dev/en/stable/howdoi.html)
1. Ryan Abernathey has a book on data analysis with a [chapter on Xarray](https://earth-env-data-science.github.io/lectures/xarray/xarray_intro.html)
1. [Project Pythia](https://projectpythia.org/) has [foundational](https://foundations.projectpythia.org/landing-page.html) and more [advanced](https://cookbooks.projectpythia.org/) material on Xarray. Pythia also aggregates other [Python learning resources](https://projectpythia.org/resource-gallery.html).
1. The [Xarray GitHub Discussions](https://github.com/pydata/xarray/discussions) and [Pangeo Discourse](https://discourse.pangeo.io/) are good places to ask questions.
1. Tell your friends! Tweet!
