# Examples of `xarray` resample and groupby
This notebook demonstrates how to use resample and groupby methods to calculate a time series of monthly means and a monthly climatology from daily data.

I'm using an example `DataArray` with dimensions `(time=731, point=10)` as a starting point: two years of daily values (2019 and 2020, a leap year) at 10 points.  This mimics the type of `DataArray` you would get after extracting 10 points from a grid.

In [1]:
import numpy as np
import xarray as xr
import pandas as pd

## Generate example `DataArray`

In [3]:
time = pd.date_range('2019-01-01', '2020-12-31', freq='D')  # Use pandas to generate time coordinate
ntime = len(time)
npoint = 10
point = np.arange(npoint)

data = np.random.rand(ntime, npoint)  # Use random array as an example

da = xr.DataArray(data, coords=[time, point], dims=['time', 'point'])
da

## Resample data to monthly means along time dimension
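We want to resample along the `time` dimension at monthly (`M`) frequency and calculate a mean.  With `M`, each monthly value is labelled with the last day of its month.  If you want the variance or standard deviation instead, use `var` or `std` in place of `mean`.  Note that recent versions of pandas deprecate the `M` alias in favor of `ME` (month end); if `M` raises a warning or an error in your environment, use `ME`.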

In [20]:
da_mon = da.resample(time='M').mean()
da_mon
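As a rough sketch of the `std` variant mentioned above, the cell below builds a separate copy of the example array and computes the standard deviation for each month. The name `demo` and the seeded random generator are my own choices for illustration, not part of the original notebook. It also uses the `MS` (month start) alias rather than `M`, on the assumption that this avoids version-specific warnings; the only difference is that each value is labelled with the first day of its month instead of the last.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Build a separate example array; the seed is an arbitrary choice for reproducibility
time = pd.date_range("2019-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
demo = xr.DataArray(rng.random((len(time), 10)),
                    coords=[time, np.arange(10)], dims=["time", "point"])

# Monthly standard deviation; "MS" labels each bin with the first day of the month
demo_mon_std = demo.resample(time="MS").std()
print(demo_mon_std.sizes)
```

Swapping `.std()` for `.var()`, `.min()` or `.max()` follows the same pattern.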

## Calculate monthly climatology
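There are two ways to do this.  First, average all the daily values in the time series that fall in a given calendar month.  Second, average the monthly means calculated above.  The two give the same answer (to within rounding) unless a month has a different number of days in different years, which here affects only February because 2020 is a leap year.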

In both methods, we use `groupby` to group the data by month and then calculate a mean.  The month number of each time stamp is available from the `time` coordinate as `da.time.dt.month`.

In [24]:
da.time.dt.month

Similarly, `da.time.dt.day`, `da.time.dt.year`, and `da.time.dt.season` can be used to access the day, year and season of each time stamp.  
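A minimal sketch of grouping on two of these other components, season and year, is shown below. It builds its own seeded copy of the example array under the hypothetical name `demo`, so it can be run on its own.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Build a separate example array; the seed is an arbitrary choice for reproducibility
time = pd.date_range("2019-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
demo = xr.DataArray(rng.random((len(time), 10)),
                    coords=[time, np.arange(10)], dims=["time", "point"])

# Group on other datetime components in the same way as the month
demo_season = demo.groupby(demo.time.dt.season).mean(dim="time")
demo_year = demo.groupby(demo.time.dt.year).mean(dim="time")

print(demo_season.season.values)  # season labels (DJF, MAM, JJA, SON in some order)
print(demo_year.year.values)      # the years present in the data
```

The grouped result replaces the `time` dimension with a new dimension named after the component (`season` or `year`).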

First method: average all daily values that fall in each calendar month.

In [23]:
da_clim = da.groupby(da.time.dt.month).mean(dim='time')
da_clim

Second method: average the monthly means for each calendar month.

In [27]:
da_clim = da_mon.groupby(da_mon.time.dt.month).mean(dim='time')
da_clim
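To see how the two methods compare, the sketch below computes both climatologies on a seeded copy of the example array and prints the largest absolute difference for each month. The names `demo`, `clim_daily` and `clim_monthly` are hypothetical, and `MS` is used for the resampling step on the assumption that it avoids version-specific warnings about `M`. Months with the same number of days in both years should agree to within rounding, while February (28 days in 2019, 29 in 2020) should differ slightly.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Build a separate example array; the seed is an arbitrary choice for reproducibility
time = pd.date_range("2019-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
demo = xr.DataArray(rng.random((len(time), 10)),
                    coords=[time, np.arange(10)], dims=["time", "point"])

# Method 1: average all daily values in each calendar month
clim_daily = demo.groupby(demo.time.dt.month).mean(dim="time")

# Method 2: average the monthly means ("MS" labels each bin at month start)
demo_mon = demo.resample(time="MS").mean()
clim_monthly = demo_mon.groupby(demo_mon.time.dt.month).mean(dim="time")

# Largest absolute difference between the two methods for each month
max_diff = np.abs(clim_daily - clim_monthly).max(dim="point")
print(max_diff.values)
```

Method 1 weights every day equally, while method 2 weights every month equally, so which one is appropriate depends on what the climatology is meant to represent.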