# High-Performance Histogramming for HEP Analysis
## Henry Schreiner, Aman Goel, and Shuo Liu


Run the code with us through Binder, altering examples and asking "what if" questions along the way :)

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/henryiii/histogram-tutorial/master?filepath=talk_1_bh_uhi.ipynb)

In [None]:
import sys

# A new era: boost-histogram 1.0!

Boost-histogram 1.0 came out with some great features:

* Removal of Python 2.7 & 3.5 support
    * Fewer lines of code and more features than 0.13!
* Fully statically typed
* Easier subclassing

Along with many great features available in 0.11-0.13:
* Much simpler, Pythonic metadata system
* Much better access to accumulator storages
* PlottableProtocol support! (see UHI section later)

Huge updates to pybind11 & cibuildwheel as part of this work, too!
* New CMake and setuptools helpers
* 40% faster accumulation with less memory used
* Python 3.9 support
* Universal2 wheels for Apple Silicon, including 3.8+
* And much more!

# Demo: statically typed Histogram

In [None]:
%%writefile tmp_bh_mypy.py

import boost_histogram as bh

h = bh.Histogram(bh.axis.StrCategory(["good", "bad", "ugly"]), storage=bh.storage.Double())

vals = h.values()

reveal_locals()

# Try:
# h = bh.Histogram(bh.axis.IntCategory(["good", "bad", "ugly"]), storage=bh.storage.Double())
# h["hi"]
# h.value()

In [None]:
!{sys.executable} -m mypy --strict tmp_bh_mypy.py

Try uncommenting some of the mistakes above!

# Boost-histogram 1.1

Released yesterday, boost-histogram 1.1 adds a few exciting features:

* Experimental support for selection on category axes!
* Python 3.8 Apple Silicon support (3.9 was already supported)
* Scaling and addition affect flow bins

In [None]:
import boost_histogram as bh

In [None]:
h = bh.Histogram(bh.axis.StrCategory(["good", "bad", "ugly"]))
h

In [None]:
h[[0,2]]

We can use this to reorder categorical axes!

In [None]:
h[[2, 1, 0]]

Of course, the locator protocol is supported as well:

In [None]:
h[[bh.loc("bad"), bh.loc("good")]]

# UHI: Unified Histogram Interface

A statically typed protocol for plotting histograms from any source.

Also an actual PyPI library, with helper functions, and lots of [centralized documentation](https://uhi.readthedocs.io/en/latest/plotting.html#using-the-protocol).

Supported by:

* boost-histogram 0.13+
* hist
* uproot 4
* mplhep
* histoprint

New in the latest version: converter for PyROOT to a PlottableHistogram!

Histograms have:

* `h.kind`: The `bh.Kind` of the histogram (COUNT or MEAN)
* `h.values()`: The value (as given by the kind)
* `h.variances()`: The variance in the value (None if an unweighted histogram was filled with weights)
* `h.counts()`: How many fills the bin received or the effective number of fills if the histogram is weighted
* `h.axes`: A Sequence of axes

Axes have:

* `ax[i]`: A tuple of (lower, upper) bin, or the discrete bin value (integer or string)
* `len(ax)`: The number of bins
* Iteration is supported
* `ax.traits.circular`: True if circular
* `ax.traits.discrete`: True if the bin represents a single value (e.g. Integer or Category axes) instead of an interval (e.g. Regular or Variable axes)

Your usage of these is _statically verified_ by mypy when you claim to have a PlottableProtocol as an input.

```python
from typing import TYPE_CHECKING, cast
from uhi.typing.plottable import PlottableHistogram
from mylibrary import MyHistogram

if TYPE_CHECKING:
    _: PlottableHistogram = cast(MyHistogram, None)
```

No runtime dependence (unless you want the nice helpers)

# Demo: Plotting a boost-histogram with mplhep or histoprint

In [None]:
import boost_histogram as bh
import mplhep
import histoprint
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

rng = np.random.default_rng()

In [None]:
h = bh.Histogram(bh.axis.Regular(100, -10, 10))
h.fill(rng.standard_normal(1_000_000), threads=4)

We can pass this to mplhep, and it understands it, even though it doesn't depend on or import boost-histogram:

In [None]:
mplhep.histplot(h);

Or histoprint (though let's tighten the axis to -3:3 and rebin a bit):

In [None]:
histoprint.print_hist(h[bh.loc(-3) : bh.loc(3) : bh.rebin(2)])

## And, a 2D demo:

In [None]:
sp = Image.open("bh-icon.png")
sp

In [None]:
arr = np.asarray(sp)[:, :, 3]

In [None]:
x, y, p = rng.random((3, 200_000))

In [None]:
bx = (x * 200).astype(np.int32)
by = (y * 200).astype(np.int32)

In [None]:
# Probability of hit
prob = arr[by, bx] / 255

# True if hit
hits = prob > p

In [None]:
X = x[hits]
Y = 1 - (y[hits] / 200 * 200 + (1 / 200))
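
The cells above amount to accept-reject sampling against the icon's alpha channel. If the image file is not at hand, the same idea can be tried with a synthetic mask; the disc-shaped `alpha` array below is a made-up stand-in for the icon, not the data used in the talk.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the alpha channel: an opaque disc on a
# transparent 200x200 background.
n = 200
rows, cols = np.ogrid[:n, :n]
inside = (rows - n / 2) ** 2 + (cols - n / 2) ** 2 < (n / 4) ** 2
alpha = np.where(inside, 255, 0).astype(np.uint8)

# Uniform candidate points plus one uniform number per point for the test.
x, y, p = rng.random((3, 50_000))

# Pixel containing each candidate point
bx = (x * n).astype(np.int32)
by = (y * n).astype(np.int32)

# Accept a point with probability equal to the pixel's opacity.
prob = alpha[by, bx] / 255
hits = prob > p

print(hits.mean())  # accepted fraction, close to the disc's area fraction
```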

In [None]:
h = bh.Histogram(
    bh.axis.Regular(300, 0, 1, metadata="x"), bh.axis.Regular(300, 0, 1, metadata="y")
)

h.fill(X, Y)

In [None]:
def plt_h2(h):
    x, y = h.axes.edges
    plt.pcolormesh(x.T, y.T, h.view().T)
    plt.gca().set_aspect("equal", "box")
    plt.xlabel(h.axes[0].metadata)
    plt.ylabel(h.axes[1].metadata)

In [None]:
plt_h2(h)

In [None]:
plt_h2(h[:: bh.rebin(6), :: bh.rebin(3)])

In [None]:
plt_h2(h[bh.loc(0.2):bh.loc(0.68), bh.loc(0.5):bh.loc(0.8)])