# GroupBy

## Grouping with bins

Note that this notebook requires [Mantid](https://www.mantidproject.org/Main_Page) and data files.
Therefore, outputs are unfortunately not available on readthedocs.
A [binder](https://mybinder.org/v2/gh/scipp/scipp-neutron-jupyter-demo/master) is available that can run this notebook.

In [None]:
import numpy as np
import scipp as sc
from scipp import Dim

In [None]:
events = sc.neutron.load(filename='PG3_4844_event.nxs', load_pulse_times=False)

### Example 1 (dense data): split-sum-combine

We histogram the sparse (event) data:

In [None]:
bins = sc.Variable([Dim.Tof], values=np.arange(0.0, 17000.0, 50.0), unit=sc.units.us)
pos_hist = sc.histogram(events, bins)
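
To make the histogramming step concrete, here is a minimal dependency-free sketch of counting events into bins for a single spectrum. It is not scipp's implementation: the helper `histogram`, the sample values, and the half-open bin convention `[edges[i], edges[i+1])` are assumptions made for illustration.

```python
from bisect import bisect_right


def histogram(events, edges):
    """Count events falling into the half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for tof in events:
        # Index of the bin whose left edge is the last one <= tof.
        i = bisect_right(edges, tof) - 1
        if 0 <= i < len(counts):
            counts[i] += 1  # events outside the edges are ignored
    return counts


# Hypothetical time-of-flight values (microseconds) for one spectrum.
tof_events = [12.0, 49.9, 50.0, 120.5, 149.0, 400.0]
edges = [0.0, 50.0, 100.0, 150.0]
counts = histogram(tof_events, edges)
print(counts)  # [2, 1, 2]
```

The event at 400.0 lies outside the bin edges and is dropped, which is why the counts add up to 5 rather than 6.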

The plot exposes a shortcoming of this data representation.
No physical meaning is attached to the "spectrum" dimension, so the plot is hard to interpret:

In [None]:
sc.plot.plot(pos_hist)

To improve the plot, we first store the scattering angle as labels in the data array.
Then we create a variable containing the desired target binning:

In [None]:
pos_hist.labels['scattering_angle'] = sc.neutron.scattering_angle(pos_hist)
theta = sc.Variable([Dim.ScatteringAngle],
                    unit=sc.units.rad,
                    values=np.linspace(0.0, np.pi/2, num=200))

We use `scipp.groupby` with the desired bins and apply a `sum` over `Dim.Spectrum`:

In [None]:
theta_hist = sc.groupby(pos_hist, 'scattering_angle', bins=theta).sum(Dim.Spectrum)
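
Conceptually, the split-sum-combine operation can be sketched in plain Python as follows. This is an illustration only, not how scipp implements `groupby`: the function `groupby_sum`, the sample data, and the choice to drop spectra whose label falls outside the target bins are assumptions of this sketch.

```python
from bisect import bisect_right


def groupby_sum(histograms, labels, edges):
    """Sum all histograms whose label falls into the same bin [edges[i], edges[i+1])."""
    n_bins = len(edges) - 1
    n_tof = len(histograms[0])
    out = [[0.0] * n_tof for _ in range(n_bins)]
    for hist, label in zip(histograms, labels):
        i = bisect_right(edges, label) - 1  # split: find the group of this spectrum
        if not 0 <= i < n_bins:
            continue  # label outside the target binning (assumed to be dropped)
        for j, value in enumerate(hist):
            out[i][j] += value  # sum: accumulate into the group
    return out  # combine: one row per target bin


# Hypothetical data: 4 spectra with 3 time-of-flight bins each.
pos_hist = [
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],
    [100.0, 200.0, 300.0],
    [5.0, 5.0, 5.0],
]
scattering_angle = [0.1, 0.7, 0.2, 2.0]  # radians, one label per spectrum
theta_edges = [0.0, 0.5, 1.0]
theta_hist = groupby_sum(pos_hist, scattering_angle, theta_edges)
print(theta_hist)  # [[101.0, 202.0, 303.0], [10.0, 20.0, 30.0]]
```

The first and third spectra share an angle bin and are summed, while the time-of-flight axis of each histogram is left untouched.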

The result has `Dim.Spectrum` replaced by the physically meaningful `Dim.ScatteringAngle` and the resulting plot is easily interpretable:

In [None]:
sc.plot.plot(theta_hist)

### Example 2 (sparse data): split-flatten-combine

This is essentially the same as Example 1 but avoids histogramming the data too early.
A plot of the original data is hard to interpret:

In [None]:
sc.plot.plot(events, bins=np.linspace(0.0, 17000.0, 1000))

To improve the plot, we first store the scattering angle as labels in the data array.
Then we create a variable containing the desired target binning:

In [None]:
events.labels['scattering_angle'] = sc.neutron.scattering_angle(events)
theta = sc.Variable([Dim.ScatteringAngle],
                    unit=sc.units.rad,
                    values=np.linspace(0.0, np.pi/2, num=200))

We use `scipp.groupby` with the desired bins and apply `flatten` over `Dim.Spectrum`.
This is the sparse-data equivalent to summing histograms:

In [None]:
theta_events = sc.groupby(events, 'scattering_angle', bins=theta).flatten(Dim.Spectrum)
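
The following plain-Python sketch illustrates why flattening is the event-data counterpart of summing histograms: concatenating the event lists of a group and then histogramming gives the same counts as histogramming each spectrum and then summing. The helpers `bin_index`, `groupby_flatten`, and `histogram` and all sample values are hypothetical and do not reflect scipp's internals.

```python
from bisect import bisect_right


def bin_index(edges, x):
    """Return the index of the bin [edges[i], edges[i+1]) containing x, or None."""
    i = bisect_right(edges, x) - 1
    return i if 0 <= i < len(edges) - 1 else None


def groupby_flatten(event_lists, labels, edges):
    """Concatenate the event lists of all spectra whose label shares a bin."""
    out = [[] for _ in range(len(edges) - 1)]
    for events, label in zip(event_lists, labels):
        i = bin_index(edges, label)
        if i is not None:
            out[i].extend(events)
    return out


def histogram(events, edges):
    """Count events per bin, ignoring events outside the edges."""
    counts = [0] * (len(edges) - 1)
    for x in events:
        i = bin_index(edges, x)
        if i is not None:
            counts[i] += 1
    return counts


# Hypothetical event lists (time-of-flight in microseconds), one per spectrum.
event_lists = [[10.0, 60.0], [20.0], [70.0, 80.0, 130.0]]
scattering_angle = [0.1, 0.7, 0.3]  # radians, one label per spectrum
theta_edges = [0.0, 0.5, 1.0]
tof_edges = [0.0, 50.0, 100.0, 150.0]

# Route 1: flatten first, histogram later.
theta_events = groupby_flatten(event_lists, scattering_angle, theta_edges)
flat_then_hist = [histogram(ev, tof_edges) for ev in theta_events]

# Route 2: histogram each spectrum first, then sum within each angle bin.
hist_then_sum = [[0] * (len(tof_edges) - 1) for _ in range(len(theta_edges) - 1)]
for events, label in zip(event_lists, scattering_angle):
    i = bin_index(theta_edges, label)
    if i is not None:
        for j, c in enumerate(histogram(events, tof_edges)):
            hist_then_sum[i][j] += c

print(theta_events)    # [[10.0, 60.0, 70.0, 80.0, 130.0], [20.0]]
print(flat_then_hist)  # [[1, 3, 1], [1, 0, 0]]
```

Both routes give identical counts here, but only the flattened result keeps the individual events, so the time-of-flight binning can still be chosen afterwards.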

The result has `Dim.Spectrum` replaced by the physically meaningful `Dim.ScatteringAngle` and the resulting plot is easily interpretable:

In [None]:
sc.plot.plot(theta_events, bins=np.linspace(0.0, 17000.0, 1000))