In [None]:
%matplotlib inline

Binning
======

Statistical data binning is a way to group a set of more or less
continuous values into a smaller number of *bins*. For example, if you
have irregularly distributed data over the oceans, you can organize
these observations into a smaller number of geographical intervals (for
example, by grouping them into five-degree bins of latitude and
longitude).

In this example, we will calculate drifter velocity statistics on the
Black Sea over a period of 9 years.

In [None]:
import cartopy.crs
import matplotlib
import matplotlib.pyplot
import numpy
import pyinterp
import pyinterp.backends.xarray
import pyinterp.tests
import xarray

The first step is to load the data into memory:

In [None]:
ds = xarray.open_dataset(pyinterp.tests.aoml_path())

Let's start by calculating the norm of the velocity vector from its
components u and v.

In [None]:
norm = (ds.ud**2 + ds.vd**2)**0.5
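
As a quick check of the formula on a single pair of components, the snippet
below uses made-up values (0.3 and 0.4 are hypothetical and not taken from
the dataset) and compares the expression above with `math.hypot`.

```python
import math

# Hypothetical velocity components, not taken from the dataset.
u, v = 0.3, 0.4

# Same expression as the one applied to the dataset variables.
speed = (u**2 + v**2)**0.5

# math.hypot computes the Euclidean norm directly.
speed_hypot = math.hypot(u, v)
```

Both expressions give a speed of about 0.5 for these components.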

Now, we will describe the grid used to calculate our
[binned](https://pangeo-pyinterp.readthedocs.io/en/latest/generated/pyinterp.Binning2D.html#pyinterp.Binning2D)
statistics.

In [None]:
binning = pyinterp.Binning2D(
    pyinterp.Axis(numpy.arange(27, 42, 0.3), is_circle=True),
    pyinterp.Axis(numpy.arange(40, 47, 0.3)))
binning

We push the loaded data into the bins defined above using [simple
binning](https://pangeo-pyinterp.readthedocs.io/en/latest/generated/pyinterp.Binning2D.push.html#bilinear-binning).

In [None]:
binning.clear()
# simple=True selects simple binning.
binning.push(ds.lon, ds.lat, norm, simple=True)

We then retrieve the mean of each bin. Other statistical
[variables](https://pangeo-pyinterp.readthedocs.io/en/latest/generated/pyinterp.Binning2D.variable.html#pyinterp.Binning2D.variable)
such as the variance, minimum, or maximum can be retrieved in the same way.

In [None]:
nearest = binning.variable('mean')
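
To illustrate how several statistics can be maintained for a bin without
storing every sample, here is a minimal sketch based on Welford's online
algorithm. It is an assumption made for illustration, not a description of
how pyinterp computes its variables; the `BinStats` class is hypothetical,
and the variance shown is the population variance.

```python
import math


class BinStats:
    """Running statistics for the samples pushed into one bin (hypothetical)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def push(self, value):
        """Update the statistics with one new sample (Welford's update)."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def variance(self):
        """Population variance, or NaN if the bin is empty."""
        return self._m2 / self.count if self.count else math.nan


stats = BinStats()
for sample in (1.0, 2.0, 3.0, 4.0):
    stats.push(sample)
```

After the four pushes, `stats.mean`, `stats.variance()`, `stats.minimum` and
`stats.maximum` describe the bin without the samples themselves being kept.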

Then, we push the loaded data into the same bins using
[linear binning](https://pangeo-pyinterp.readthedocs.io/en/latest/generated/pyinterp.Binning2D.push.html#bilinear-binning):

In [None]:
binning.clear()
# simple=False selects linear binning.
binning.push(ds.lon, ds.lat, norm, simple=False)
linear = binning.variable('mean')
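
The difference between the two modes can be illustrated in one dimension.
The sketch below shows the general technique under the assumption of
regularly spaced bin centers; it is not pyinterp's implementation, and the
function names are hypothetical. Simple binning gives a sample entirely to
the nearest bin, while linear binning splits it between the two surrounding
bins with weights that depend on the distance to each center.

```python
def simple_binning(centers, x, value, sums, weights):
    """Give the whole sample to the bin whose center is nearest to ``x``."""
    nearest = min(range(len(centers)), key=lambda i: abs(centers[i] - x))
    sums[nearest] += value
    weights[nearest] += 1.0


def linear_binning(centers, x, value, sums, weights):
    """Split the sample between the two bin centers surrounding ``x``.

    Assumes regularly spaced centers and centers[0] <= x <= centers[-1].
    """
    step = centers[1] - centers[0]
    left = min(int((x - centers[0]) / step), len(centers) - 2)
    # 0 when x is on the left center, 1 when it is on the right center.
    fraction = (x - centers[left]) / step
    for index, weight in ((left, 1.0 - fraction), (left + 1, fraction)):
        sums[index] += weight * value
        weights[index] += weight


centers = [0.0, 1.0, 2.0]  # hypothetical bin centers

simple_sums, simple_weights = [0.0] * 3, [0.0] * 3
simple_binning(centers, 0.25, 8.0, simple_sums, simple_weights)

linear_sums, linear_weights = [0.0] * 3, [0.0] * 3
linear_binning(centers, 0.25, 8.0, linear_sums, linear_weights)
```

Here the sample at `x = 0.25` contributes only to the first bin in the simple
case, and with weights 0.75 and 0.25 to the first two bins in the linear
case. The mean of a bin is then its accumulated sum divided by its
accumulated weight. In two dimensions the same weighting would typically be
applied along each axis.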

Finally, we visualize the results of both methods:

In [None]:
fig = matplotlib.pyplot.figure(figsize=(10, 8))
ax1 = fig.add_subplot(211, projection=cartopy.crs.PlateCarree())
lon, lat = numpy.meshgrid(binning.x, binning.y, indexing='ij')
pcm = ax1.pcolormesh(lon,
                     lat,
                     nearest,
                     cmap='jet',
                     vmin=0,
                     vmax=1,
                     transform=cartopy.crs.PlateCarree())
ax1.coastlines()
ax1.set_title("Simple binning.")

ax2 = fig.add_subplot(212, projection=cartopy.crs.PlateCarree())
lon, lat = numpy.meshgrid(binning.x, binning.y, indexing='ij')
pcm = ax2.pcolormesh(lon,
                     lat,
                     linear,
                     cmap='jet',
                     vmin=0,
                     vmax=1,
                     transform=cartopy.crs.PlateCarree())
ax2.coastlines()
ax2.set_title("Linear binning.")
fig.colorbar(pcm, ax=[ax1, ax2], shrink=0.8)
fig.show()