## GroupBy Operations

Adapted from: https://ncar-hackathons.github.io/scientific-computing/intro.html

xarray supports “group by” operations with the same API as pandas to implement the split-apply-combine strategy:

- Split your data into multiple independent groups.
- Apply some function to each group.
- Combine your groups back into a single data object.

Group by operations work on both Dataset and DataArray objects. Most of the examples focus on grouping by a single one-dimensional variable, although grouping over a multi-dimensional variable is also supported:

- **Using groupby to calculate a monthly climatology:**

In [1]:
import xarray as xr

In [2]:
da = xr.tutorial.open_dataset('air_temperature')

In [3]:
da_climatology = da.groupby('time.month').mean('time')

da_climatology

In this case, we provide what we refer to as a virtual variable (`time.month`). Other virtual variables include: `year`, `month`, `day`, `hour`, `minute`, `second`, `dayofyear`, `week`, `dayofweek`, `weekday` and `quarter`. It is also possible to use another DataArray or pandas object as the grouper.

In [4]:
da.groupby('time.season').median('time')
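
The split-apply-combine steps can also be tried without downloading the tutorial dataset. The following is a minimal sketch on a synthetic daily series; the names `temps`, `manual`, `half`, and `by_half` are illustrative assumptions, not part of the original notebook. It compares `groupby('time.month')` against the three steps written out by hand, and then groups by another DataArray instead of a virtual variable.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering two years (assumed data, not the tutorial dataset).
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
temps = xr.DataArray(
    np.arange(time.size, dtype=float),
    coords={"time": time},
    dims="time",
    name="temps",
)

# Split-apply-combine in a single call: one mean per calendar month.
climatology = temps.groupby("time.month").mean("time")

# The same computation spelled out by hand.
months = temps.time.dt.month
manual = np.array(
    [
        float(temps.where(months == m).mean("time"))  # split on month m, apply mean
        for m in range(1, 13)
    ]
)  # combine into one array ordered by month

# Grouping by another DataArray: a hypothetical "half of the year" label (1 or 2).
half = ((months - 1) // 6 + 1).rename("half")
by_half = temps.groupby(half).mean("time")
```

The result of `groupby(...).mean('time')` replaces the `time` dimension with a dimension named after the grouper (`month` or `half` here).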

## Resampling Operations

To resample time-series data, xarray provides a `resample` convenience method for frequency conversion.

In [5]:
da

- **Downsample our 6-hourly time-series data to quarterly data:**

In [6]:
da1 = da.resample(time='QS').mean(dim='time')
da1

- **Upsample our quarterly time-series data to daily data:**

In [7]:
da1.resample(time='1D').interpolate('linear')
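
To make the effect of down- and upsampling on the time axis concrete, here is a minimal sketch on a synthetic 6-hourly series. The data and the names `series`, `daily`, `quarterly`, and `upsampled` are assumptions made for illustration. The upsampling step uses forward fill rather than linear interpolation, so the sketch depends only on xarray, NumPy, and pandas.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic 6-hourly series for the whole of 2013 (assumed data): 365 days * 4 steps.
time = pd.date_range("2013-01-01", periods=1460, freq=pd.Timedelta(hours=6))
series = xr.DataArray(
    np.arange(time.size, dtype=float),
    coords={"time": time},
    dims="time",
)

# Downsample: aggregate the four 6-hourly values of each day into a daily mean.
daily = series.resample(time="1D").mean()

# Downsample further: one mean per quarter, labelled by the quarter start ('QS').
quarterly = series.resample(time="QS").mean()

# Upsample: put the quarterly values on a daily axis, carrying each value
# forward until the next quarter starts.
upsampled = quarterly.resample(time="1D").ffill()
```

Downsampling needs an aggregation such as `mean`, while upsampling needs a fill rule such as `ffill` or `interpolate('linear')`; the latter may require SciPy to be installed.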

## Rolling Window Operations

xarray objects include a `rolling` method to support rolling window aggregations:

In [8]:
roller = da.rolling(time=3)

In [9]:
roller

DatasetRolling [time->3]

In [10]:
roller.mean()

- **We can also provide a custom function:**

In [11]:
def sum_minus_273(da, axis):
    # Sum each rolling window along the given axis, then subtract 273
    return da.sum(axis=axis) - 273

roller.reduce(sum_minus_273)
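
The behavior of `rolling` and `reduce` is easier to check on a tiny array. The sketch below uses a synthetic six-element series; the names `series`, `sum_minus_offset`, and `offset` are hypothetical. It assumes the default `min_periods`, under which windows that are not yet full yield NaN, and it relies on `reduce` handing the function a NumPy array of windows together with the axis to collapse.

```python
import numpy as np
import xarray as xr

# Tiny synthetic series (assumed data): 0, 1, 2, 3, 4, 5 along "time".
series = xr.DataArray(np.arange(6, dtype=float), dims="time")

# Trailing window of three time steps.
roller = series.rolling(time=3)

# Built-in aggregation; the first two windows are incomplete and yield NaN.
rolling_mean = roller.mean()  # values: [nan, nan, 1., 2., 3., 4.]


def sum_minus_offset(values, axis, offset=273.0):
    """Sum each window along `axis`, then subtract a constant offset."""
    return values.sum(axis=axis) - offset


# Custom reduction applied to every window.
custom = roller.reduce(sum_minus_offset)
```

For complete windows, `custom` equals `roller.sum() - 273`, which gives a quick way to sanity-check a custom reduction against a built-in one.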