## The `groupby` function

`groupby` organizes data into groups and then applies an aggregation to each group. It is often used to combine data over different time groupings, such as seasons or months, and common aggregations are `mean()` and `std()`.

This is typically referred to as `split-apply-combine`:

* Split your data into multiple independent groups.
* Apply some function to each group.
* Combine your groups back into a single data object.
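
The three steps above can be sketched on a small synthetic series. This is only an illustration: `da_demo`, `seas_mean`, and `manual` are hypothetical names that do not appear elsewhere in this notebook, and the data are made up (each value is simply its calendar month number). The sketch compares the one-line `groupby` call with a hand-written split and apply loop.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic data (assumption for illustration): 2 years of monthly values,
# where each value equals the calendar month number (1..12)
times = pd.date_range("2000-01-01", periods=24, freq="MS")
da_demo = xr.DataArray(
    times.month.values.astype(float),
    coords={"time": times},
    dims="time",
)

# split-apply-combine in one line
seas_mean = da_demo.groupby("time.season").mean()

# The same steps written out by hand
manual = {}
for season in ["DJF", "MAM", "JJA", "SON"]:
    # Split: keep only the time steps that belong to this season
    group = da_demo.where(da_demo.time.dt.season == season, drop=True)
    # Apply: reduce the group to a single number
    manual[season] = float(group.mean())

# Combine: here the results are gathered in a dict keyed by season
for season, value in manual.items():
    print(season, value, float(seas_mean.sel(season=season)))
```

Both approaches should give the same number for each season; `groupby` simply handles the bookkeeping and returns the result as a single object with a `season` dimension.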

In [None]:
import xarray as xr
import matplotlib.pyplot as plt
import numpy as np

We will read in some sea surface temperature (SST) data.

In [None]:
path = '/home/lortizur/clim680/OISSTv2'
subdir = 'monthly'
file = 'sst.mnmean.nc'

In [None]:
ds = xr.open_dataset(path+'/'+subdir+'/'+file)
ds

In [None]:
subdir = 'lmask'
file = 'lsmask.nc'
ds_mask = xr.open_dataset(path+'/'+subdir+'/'+file).squeeze()
ds_mask

In [None]:
# Keep SST where the mask equals 1; all other grid points become NaN
da_ocean = ds['sst'].where(ds_mask['mask']==1)

### We can use `groupby` to make seasonal means

In [None]:
da_seas = da_ocean.groupby('time.season').mean()
da_seas

In [None]:
plt.contourf(da_seas.lon,da_seas.lat,da_seas.sel(season='DJF'),cmap='coolwarm')
plt.colorbar()

### Or seasonal standard deviations or variances

In [None]:
da_seas_std = da_ocean.groupby('time.season').std()
plt.contourf(da_seas_std.lon,da_seas_std.lat,da_seas_std.sel(season='DJF'),cmap='cubehelix_r')
plt.colorbar()

In [None]:
da_seas_var = da_ocean.groupby('time.season').var()
plt.contourf(da_seas_var.lon,da_seas_var.lat,da_seas_var.sel(season='DJF'),cmap='cubehelix_r')
plt.colorbar()

### We can also group by other time increments such as `month`

In [None]:
da_month = da_ocean.groupby('time.month').mean()
da_month
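
As a side note, the string `'time.month'` is a shorthand for a datetime component of the `time` coordinate. The sketch below, on made-up data, suggests that grouping by the string and grouping by the `.dt` accessor give the same result. The names `da_demo`, `by_string`, and `by_accessor` are hypothetical and used only for this illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic data (assumption for illustration): 3 years of monthly values 0..35
times = pd.date_range("2000-01-01", periods=36, freq="MS")
da_demo = xr.DataArray(
    np.arange(36, dtype=float),
    coords={"time": times},
    dims="time",
)

# Group using the string shorthand, as in the cells above
by_string = da_demo.groupby("time.month").mean()

# Group using the datetime accessor on the time coordinate
by_accessor = da_demo.groupby(da_demo.time.dt.month).mean()

# Both results have a 'month' dimension with labels 1..12
print(by_string.month.values)
print(np.allclose(by_string.values, by_accessor.values))
```

The accessor form can be handy when the grouping variable needs to be built or modified before it is passed to `groupby`.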

In [None]:
da_month_std = da_ocean.groupby('time.month').std()
plt.contourf(da_month_std.lon,da_month_std.lat,da_month_std.sel(month=12),cmap='cubehelix_r')
plt.colorbar()

In [None]:
plt.contourf(da_month_std.lon,da_month_std.lat,da_month_std.sel(month=4),cmap='cubehelix_r')
plt.colorbar()