# Using boost-histogram

This workbook was designed with Python 3 in mind.

In [None]:
import boost_histogram as bh
import numpy as np
import matplotlib.pyplot as plt

## 1: Basic 1D histogram

Let's start with the basics. We will create a histogram using boost-histogram and fill it.

### 1.1: Data

Let's make a 1D dataset to work with.

In [None]:
data1 = np.random.normal(3.5, 2.5, size=1_000_000)

Now, let's make a histogram with 40 regular bins from -2 to 10:

In [None]:
hist1 = bh.Histogram(bh.axis.Regular(40, -2, 10))

In [None]:
hist1.fill(data1)

You can see that the histogram has been filled. Let's explicitly check to see how many entries are in the histogram:

In [None]:
hist1.sum()

What happened to the missing items? They are in the underflow and overflow bins:

In [None]:
hist1.sum(flow=True)

Like ROOT, we have overflow bins by default. We can turn them off, but they enable some powerful things like projections.

Let's plot this (the Hist library should make this easier):

In [None]:
plt.bar(hist1.axes[0].centers, hist1.view(), width=hist1.axes[0].widths);

Note: You can leave off the `.view()` if you want to - histograms conform to the buffer protocol. Also, you can select the axes before or after calling `.centers`; this is very useful for ND histograms.

From now on, let's be lazy

In [None]:
def plothist(h):
    return plt.bar(*h.axes.centers, h, width=h.axes.widths[0])

Aside: here's `plt.step`. It's quite ugly for us, just like it is for NumPy. Or anyone.

In [None]:
plt.step(hist1.axes[0].edges[:-1], hist1, where='post');

No plotting is built in, but the data is easy to access.

## 2: Drop-in replacement for NumPy

To start using this yourself, you barely need to change your code. Let's try the NumPy adapters, which take the same arguments as `np.histogram`.

In [None]:
bins2, edges2 = bh.numpy.histogram(data1, bins=10)

In [None]:
b2, e2 = np.histogram(data1, bins=10)

In [None]:
bins2 - b2

In [None]:
e2 - edges2

Not bad! Let's start moving to the boost-histogram API, so we can use our little plotting function:

In [None]:
hist2 = bh.numpy.histogram(data1, bins='auto', histogram=bh.Histogram)
plothist(hist2);

Now we can move over to boost-histogram one step at a time! Just to be complete, we can also go back to a NumPy-style `(counts, edges)` tuple from a Histogram object:

In [None]:
b2p, e2p = bh.numpy.histogram(data1, bins=10, histogram=bh.Histogram).to_numpy()
b2p == b2

## 3: More dimensions

The same API works for multiple dimensions.

In [None]:
hist3 = bh.Histogram(
    bh.axis.Regular(150, -1.5, 1.5),
    bh.axis.Regular(100, -1, 1)
)

In [None]:
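def make_2D_data(*, mean=(0, 0), widths=(1, 1), size=1_000_000):
    # `widths` fills the diagonal of the covariance matrix, so each entry acts as a variance.
    cov = np.asarray(widths) * np.eye(2)
    # Transpose so the result unpacks as (x_values, y_values).
    return np.random.multivariate_normal(mean, cov, size=size).T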

In [None]:
data3x = make_2D_data(mean=[-.75, .5], widths=[.2, 0.02])
data3y = make_2D_data(mean=[.75, .5], widths=[.2, 0.02])

From here on out, I will be using `.reset()` before a `.fill()`, just to make sure each cell in the notebook can be rerun.

In [None]:
hist3.reset()
hist3.fill(*data3x)
hist3.fill(*data3y)

Again, let's make plotting a little function:

In [None]:
def plothist2d(h):
    X, Y = h.axes.edges
    return plt.pcolormesh(X.T, Y.T, h.view().T)

This is transposed because it is ij indexed instead of xy. You can also do:

```python
X, Y = h.axes.edges
X, Y = np.broadcast_arrays(X, Y)
return plt.pcolormesh(X, Y, h)
```
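The shapes involved are easier to follow on a toy grid. The sketch below uses only NumPy and assumes that `h.axes.edges` behaves like a sparse `ij`-indexed meshgrid, which is an assumption made for illustration; the edge values and counts are made up:

```python
import numpy as np

# Hypothetical edges for a histogram with 3 bins in x and 2 bins in y.
x_edges = np.linspace(0.0, 3.0, 4)
y_edges = np.linspace(0.0, 2.0, 3)

# Sparse 'ij' grids: X has shape (4, 1) and Y has shape (1, 3).
X, Y = np.meshgrid(x_edges, y_edges, indexing="ij", sparse=True)

# Counts indexed as counts[i, j] = (x bin i, y bin j).
counts = np.arange(6).reshape(3, 2)

# Option 1: transpose everything so that rows run along y.
Xt, Yt, Ct = X.T, Y.T, counts.T

# Option 2: broadcast the sparse grids to full arrays and keep ij order.
Xb, Yb = np.broadcast_arrays(X, Y)

print(Xt.shape, Yt.shape, Ct.shape)  # transposed variant
print(Xb.shape, Yb.shape, counts.shape)  # broadcast variant
```

In both variants the coordinate grids end up one element larger than the counts along each dimension, which is the corner-based layout `pcolormesh` works with.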

In [None]:
plothist2d(hist3);

Let's try a 3D histogram:

In [None]:
data3d = [np.random.normal(size=1_000_000) for _ in range(3)]

hist3d = bh.Histogram(
    bh.axis.Regular(150, -5, 5),
    bh.axis.Regular(100, -5, 5),
    bh.axis.Regular(100, -5, 5)
)

hist3d.fill(*data3d)

Let's project to the first two axes:

In [None]:
plothist2d(hist3d.project(0, 1));

## 4: UHI

Let's explore the boost-histogram UHI (Unified Histogram Indexing) syntax. We will reuse the previous 2D histogram from part 3:

In [None]:
plothist2d(hist3);

I can see that I want y from 0.25 to 0.75, in data coordinates:

In [None]:
plothist2d(hist3[:, bh.loc(.25):bh.loc(.75)]);

What are the contents of a bin?

In [None]:
hist3[100, 87]

How about in data coordinates?

In [None]:
hist3[bh.loc(.5), bh.loc(.75)]

> Note: to get the coordinates manually:
>
> ```python
> hist3.axes[0].index(.5) == 100
> hist3.axes[1].index(.75) == 87
> ```

How about a 1D histogram? Summing over one axis leaves a histogram of the other:

In [None]:
plothist(hist3[:, ::bh.sum]);
plothist(hist3[::bh.sum, :]);

Let's look at one part and rebin:

In [None]:
plothist2d(hist3[:50:bh.rebin(2), 50::bh.rebin(2)]);

What is the value at `(-.75, .5)`?

In [None]:
hist3[bh.loc(-.75), bh.loc(.5)]