# GroupBy, Resample, Rolling Operations

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#GroupBy,-Resample,-Rolling-Operations" data-toc-modified-id="GroupBy,-Resample,-Rolling-Operations-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>GroupBy, Resample, Rolling Operations</a></span><ul class="toc-item"><li><span><a href="#Learning-Objectives" data-toc-modified-id="Learning-Objectives-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Learning Objectives</a></span></li><li><span><a href="#GroupBy-Operations" data-toc-modified-id="GroupBy-Operations-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>GroupBy Operations</a></span></li><li><span><a href="#Resampling-Operations" data-toc-modified-id="Resampling-Operations-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Resampling Operations</a></span></li><li><span><a href="#Rolling-Window-Operations" data-toc-modified-id="Rolling-Window-Operations-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Rolling Window Operations</a></span></li><li><span><a href="#Going-Further" data-toc-modified-id="Going-Further-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Going Further</a></span></li></ul></li></ul></div>

## Learning Objectives


- Use groupby to create climatologies and calculate anomalies.
- Change the temporal resolution of data via resample and rolling.

## GroupBy Operations

xarray supports “group by” operations with the same API as pandas to implement the split-apply-combine strategy:

- Split your data into multiple independent groups.
- Apply some function to each group.
- Combine your groups back into a single data object.
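The three steps above can be sketched on a small synthetic series. This is a minimal illustration, not part of the tutorial dataset: the `toy` array and its values are made up, and the manual loop exists only to show what the one-line `groupby(...).mean()` call does.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: one value per day for 2000 and 2001.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
toy = xr.DataArray(
    np.arange(time.size, dtype=float),
    coords={"time": time},
    dims="time",
    name="toy",
)

# Split and apply by hand: iterating over a GroupBy object
# yields (label, group) pairs.
manual = {
    int(year): float(group.mean())
    for year, group in toy.groupby("time.year")
}

# Split, apply, and combine in one step: the result has a new
# "year" dimension with one entry per group.
combined = toy.groupby("time.year").mean()
```

Both routes give the same per-year means; the `groupby` version additionally returns them as a single labelled DataArray.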

Group by operations work on both Dataset and DataArray objects. The examples here focus on grouping by a single one-dimensional variable, although grouping over a multi-dimensional variable is also supported:

- **Using groupby to calculate a monthly climatology:**

In [None]:
import xarray as xr

In [None]:
da = xr.open_dataarray("../../../data/air_temperature.nc")

In [None]:
da_climatology = da.groupby('time.month').mean('time')

da_climatology

In this case, we group by what xarray calls a virtual variable (`time.month`), a datetime component derived from the `time` coordinate. Other available components include `year`, `month`, `day`, `hour`, `minute`, `second`, `dayofyear`, `week`, `dayofweek`, `weekday`, `quarter` and `season`. It is also possible to use another DataArray or pandas object as the grouper.
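As a small sketch of using another DataArray as the grouper, the example below groups four made-up observations by a label array. The names `obs` and `station` and all values are hypothetical; the only requirement assumed here is that the grouper is named and shares the `time` dimension with the data.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", periods=4, freq="D")

# Hypothetical observations and a label for each time step.
obs = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0], coords={"time": time}, dims="time", name="obs"
)
station = xr.DataArray(
    ["a", "b", "a", "b"], coords={"time": time}, dims="time", name="station"
)

# Group by the label array; the result is indexed by "station".
by_station = obs.groupby(station).mean()
```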

In [None]:
da.groupby('time.season').median('time')
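A climatology is usually the first step toward anomalies: subtracting a GroupBy object and its grouped mean removes each month's average from every time step in that month. The sketch below uses a synthetic seasonal signal rather than the tutorial file, so the `temps` array and its values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series: a seasonal cycle around 280 (think kelvin).
time = pd.date_range("2000-01-01", "2003-12-31", freq="D")
doy = time.dayofyear.values
values = 280.0 + 10.0 * np.sin(2 * np.pi * doy / 365.25)
temps = xr.DataArray(values, coords={"time": time}, dims="time", name="air")

# Monthly climatology: one mean value per calendar month.
climatology = temps.groupby("time.month").mean("time")

# Anomalies: each time step minus the mean of its calendar month.
# The result keeps the original time axis.
anomalies = temps.groupby("time.month") - climatology
```

By construction, the anomalies within any given calendar month average to zero.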

## Resampling Operations

To change the frequency of time-series data, xarray provides a `resample` convenience method that handles both downsampling and upsampling.

In [None]:
da

- **Downsample our 6-hourly time-series data to quarterly data:**

In [None]:
da1 = da.resample(time='QS').mean(dim='time')
da1

- **Upsample our quarterly time-series data to daily data:**

In [None]:
da1.resample(time='1D').interpolate('linear')
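To make the two directions concrete, here is a minimal sketch on eight made-up 6-hourly values. It downsamples with a mean and then upsamples with a forward fill; note that forward fill is a different strategy from the linear interpolation used above, chosen here only because its result is easy to check by hand.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 6-hourly series covering two days: values 0..7.
time = pd.date_range("2000-01-01", periods=8, freq=pd.Timedelta(hours=6))
six_hourly = xr.DataArray(np.arange(8.0), coords={"time": time}, dims="time")

# Downsample: average the four 6-hourly values in each day.
daily = six_hourly.resample(time="1D").mean()  # values 1.5 and 5.5

# Upsample: move to a 12-hourly axis and carry the last known
# value forward into the newly created time steps.
upsampled = daily.resample(time="12h").ffill()
```

Downsampling needs a reduction (such as `mean`) because several input values fall into each output bin, while upsampling needs a filling rule because the new time steps have no data of their own.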

## Rolling Window Operations

xarray objects provide a `rolling` method for rolling-window aggregations:

In [None]:
roller = da.rolling(time=3)

In [None]:
roller

In [None]:
roller.mean()

- **We can also provide a custom function:**

In [None]:
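def sum_minus_273(window, axis):
    # `window` holds the windowed values as a plain array, not a DataArray;
    # `axis` is the window axis to reduce over.
    return window.sum(axis=axis) - 273

roller.reduce(sum_minus_273)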
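By default a rolling window is trailing and yields NaN until it is full. The sketch below, on a made-up five-point series, shows how the `center` and `min_periods` arguments of `rolling` change that behaviour.

```python
import numpy as np
import xarray as xr

# Hypothetical series: 0, 1, 2, 3, 4 along "time".
series = xr.DataArray(np.arange(5.0), dims="time")

# Trailing window: the first two positions lack a full window.
trailing = series.rolling(time=3).mean()  # [nan, nan, 1, 2, 3]

# Centered window: the label sits in the middle, so NaN appears
# at both ends instead.
centered = series.rolling(time=3, center=True).mean()  # [nan, 1, 2, 3, nan]

# min_periods=1: compute a result as soon as one value is available.
partial = series.rolling(time=3, min_periods=1).mean()  # [0, 0.5, 1, 2, 3]
```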

## Going Further

- [Xarray Docs - GroupBy: split-apply-combine](https://xarray.pydata.org/en/stable/groupby.html)
- [Xarray Docs - Rolling Window Operations](https://xarray.pydata.org/en/stable/computation.html#rolling-window-operations)
- [Xarray Docs - Resampling and grouped operations](https://xarray.pydata.org/en/stable/time-series.html#resampling-and-grouped-operations)
