# Data binning

## Binning 2D

Statistical data binning groups a set of more or less continuous values into a smaller number of "bins". For example, if you have observations irregularly distributed over the oceans, you can organize them into a smaller number of geographical intervals (for example, by grouping them into boxes of five degrees in latitude and longitude).
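As a rough illustration of the idea, independent of pyinterp, the sketch below averages scattered values in boxes of 5 degrees using only NumPy. The observations are synthetic, and the arrays `lon_edges` and `lat_edges` are treated as box edges, which is an assumption of this sketch and not necessarily how `pyinterp.Axis` interprets its values.

```python
import numpy as np

# Synthetic, irregularly distributed observations (hypothetical data).
rng = np.random.default_rng(0)
lon = rng.uniform(-180, 180, 1000)
lat = rng.uniform(-90, 90, 1000)
value = np.cos(np.radians(lat))

# Box edges every 5 degrees: 73 edges give 72 boxes in longitude,
# 37 edges give 36 boxes in latitude.
lon_edges = np.arange(-180, 185, 5)
lat_edges = np.arange(-90, 95, 5)

# Sum of the values and number of observations falling in each box.
total, _, _ = np.histogram2d(lon, lat, bins=(lon_edges, lat_edges),
                             weights=value)
count, _, _ = np.histogram2d(lon, lat, bins=(lon_edges, lat_edges))

# Mean per box; boxes without observations stay NaN.
mean = np.full(count.shape, np.nan)
np.divide(total, count, out=mean, where=count > 0)
```

The resulting `mean` array has one row per longitude box and one column per latitude box.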

In [None]:
import os
import numpy as np
import xarray as xr
import pyinterp

In this example, we compute the binning of the MSS (mean sea surface) grid using boxes of 5 degrees in latitude and longitude.

In [None]:
binning = pyinterp.Binning2D(
    pyinterp.Axis(np.arange(-180, 185, 5), is_circle=True),
    pyinterp.Axis(np.arange(-90, 95, 5)))

In [None]:
binning

Load the test dataset into memory.

In [None]:
ds = xr.load_dataset("../tests/dataset/mss.nc")

In [None]:
lon, lat = np.meshgrid(ds.lon, ds.lat)

We push the loaded data into the bins defined above using simple binning (the last argument of `push` is `True`).

In [None]:
binning.clear()
binning.push(lon, lat, ds.mss, True)
nearest = binning.variable('mean')

Then we clear the bins and push the same data again, this time using linear binning (the last argument of `push` is `False`).

In [None]:
binning.clear()
binning.push(lon, lat, ds.mss, False)
linear = binning.variable('mean')
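To give an intuition for the difference between the two modes, here is a minimal one-dimensional sketch written with NumPy only. It assumes that simple binning assigns each sample to its nearest bin center, and that linear binning splits each sample between the two neighbouring centers with weights proportional to proximity. The function `bin_1d` is hypothetical and is not part of pyinterp, whose two-dimensional implementation may differ in its details.

```python
import numpy as np


def bin_1d(x, z, centers, simple=True):
    """Weighted mean of z per bin center (1D illustration only)."""
    step = centers[1] - centers[0]
    total = np.zeros(centers.size)
    weight = np.zeros(centers.size)
    # Fractional position of each sample on the axis of bin centers.
    pos = (np.asarray(x, dtype=float) - centers[0]) / step
    if simple:
        # Simple binning: the whole sample goes to the nearest center.
        idx = np.clip(np.rint(pos).astype(int), 0, centers.size - 1)
        np.add.at(total, idx, z)
        np.add.at(weight, idx, 1.0)
    else:
        # Linear binning: the sample is shared between the two
        # surrounding centers, the closer one receiving more weight.
        left = np.clip(np.floor(pos).astype(int), 0, centers.size - 2)
        frac = np.clip(pos - left, 0.0, 1.0)
        np.add.at(total, left, (1 - frac) * z)
        np.add.at(weight, left, 1 - frac)
        np.add.at(total, left + 1, frac * z)
        np.add.at(weight, left + 1, frac)
    # Bins that received no weight stay NaN.
    mean = np.full(centers.size, np.nan)
    np.divide(total, weight, out=mean, where=weight > 0)
    return mean


centers = np.array([0.0, 5.0, 10.0])
x = np.array([1.0, 4.0])
z = np.array([10.0, 20.0])
simple_mean = bin_1d(x, z, centers, simple=True)
linear_mean = bin_1d(x, z, centers, simple=False)
```

In this sketch, the linear variant yields a smoother result because each sample contributes to both of its neighbouring bins instead of only one.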

We visualize the results of both methods side by side.

In [None]:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
%matplotlib inline

fig = plt.figure(figsize=(18, 9))

# Coordinates of the bins; 'ij' indexing matches the (x, y) layout
# of the arrays returned by binning.variable.
lon, lat = np.meshgrid(binning.x, binning.y, indexing='ij')

# Left panel: simple binning
ax = fig.add_subplot(121, projection=ccrs.PlateCarree())
ax.pcolormesh(lon, lat, nearest, cmap='jet',
              transform=ccrs.PlateCarree())
ax.coastlines()
ax.set_title("Simple binning")

# Right panel: linear binning
ax = fig.add_subplot(122, projection=ccrs.PlateCarree())
ax.pcolormesh(lon, lat, linear, cmap='jet',
              transform=ccrs.PlateCarree())
ax.coastlines()
ax.set_title("Linear binning")