boost-histogram for Python
Warning: These bindings are in progress and have not yet reached an alpha stage.
Join the discussion on gitter to follow the development!
This library is under development, but you can install it directly from GitHub if you would like. You need a C++14 compiler and Python 2.7--3.7. A separate Boost 1.70 installation is not needed (the build depends only on the included header-only dependencies).
All the normal best practices for Python apply: you should be in a virtual environment; otherwise, add a per-user install option such as --user to the command:
python -m pip install git+https://github.com/scikit-hep/boost-histogram.git@develop
This is a suggested example of usage.

    import boost.histogram as bh

    # Compose axes however you like
    hist = bh.histogram(bh.axis.regular(2, 0, 1),
                        bh.axis.regular(4, 0.0, 1.0))

    # Filling can be done with arrays, one per dimension
    hist([.3, .5, .2], [.1, .4, .9])

    # Numpy array view into histogram counts, no overflow bins
    counts = hist.view()

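To make the fill step in the example concrete, here is a minimal pure-Python sketch of how a regular axis can map values to bin indices and how a view without flow bins relates to storage that keeps them. This is not boost-histogram's implementation; the names `regular_index`, `fill_2d`, and `view_without_flow` are hypothetical helpers introduced only for illustration.

```python
def regular_index(x, n, start, stop):
    """Bin index of x on a regular axis with n bins over [start, stop).

    Returns -1 for underflow and n for overflow (an assumed convention).
    """
    z = (x - start) / (stop - start)
    if z < 0:
        return -1
    if z >= 1:
        return n
    # Clamp to guard against floating-point rounding at the upper edge
    return min(int(z * n), n - 1)


def fill_2d(xs, ys, x_axis, y_axis):
    """Count (x, y) pairs; each axis is an (n, start, stop) tuple."""
    nx, ny = x_axis[0], y_axis[0]
    # Storage has one underflow and one overflow bin per axis
    counts = [[0] * (ny + 2) for _ in range(nx + 2)]
    for x, y in zip(xs, ys):
        i = regular_index(x, *x_axis) + 1  # shift so underflow lands at 0
        j = regular_index(y, *y_axis) + 1
        counts[i][j] += 1
    return counts


def view_without_flow(counts):
    """Drop the underflow/overflow rows and columns."""
    return [row[1:-1] for row in counts[1:-1]]


counts = fill_2d([.3, .5, .2], [.1, .4, .9], (2, 0, 1), (4, 0.0, 1.0))
inner = view_without_flow(counts)
print(inner)  # [[1, 0, 0, 1], [0, 1, 0, 0]]
```

The same three points used in the example land in bins (0, 0), (1, 1), and (0, 3), so the inner view has shape 2 x 4 with three non-zero entries.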
- Many axis types (all support
bh.axis.regular(n, start, stop, flow=True, growth=False): Shortcut for the regular axis variants below
bh.axis.regular_uoflow(n, start, stop): n evenly spaced bins from start to stop
bh.axis.regular_noflow(n, start, stop): regular, but with no underflow or overflow bins
bh.axis.regular_growth(n, start, stop): regular, but grows if a value is added outside the range
bh.axis.circular(n, start, stop): Values outside the range wrap into the range
bh.axis.regular_log(n, start, stop): Regularly spaced values in log 10 scale
bh.axis.regular_sqrt(n, start, stop): Regularly spaced values in sqrt scale
bh.axis.regular_pow(n, start, stop, power): Regularly spaced values to some power
bh.axis.integer(start, stop): Special high-speed version of regular for evenly spaced bins of width 1
Variant of integer with no under/overflow bins
Variant of integer that grows if new values are added
bh.axis.variable([start, edge1, edge2, ..., stop]): Uneven bin spacing
bh.axis.category_int([1, 2, ...]): Integer bins
bh.axis.category_int_growth([1, 2, ...]): Integer bins where new items are added automatically
bh.axis.category_str(["item1", "item2", ...]): String bins
bh.axis.category_str_growth(["item1", "item2", ...]): String bins where new items are added automatically
- Axis features:
.bin(i): The bin or a bin view for continuous axis types
.lower(): The lower value
.upper(): The upper value
.center(): The center value
.width(): The bin width
.bins(): A list of bins or bin views
.size(): The number of bins (not including under/overflow)
.size(flow=True): The number of bins (including under/overflow)
.options(): The options set on the axis (
.edges(flow=False): The N+1 bin edges (if continuous)
.centers(flow=False): The N bin centers (if continuous)
.update(value): Grow an axis to include value if needed and able
.index(values): The index at a point (or points) on the axis
.value(index): The value for a fractional bin in the axis
- Many storage types
bh.storage.int: 64 bit unsigned integers for high performance and useful view access
bh.storage.double: Doubles for weighted values
bh.storage.unlimited: Starts small, but can go up to unlimited precision ints or doubles.
bh.storage.atomic_int: Threadsafe filling, for higher performance on multithreaded backends. Does not support growing axes in threads.
bh.accumulator.weighted_sum: Tracks a weighted sum and variance
bh.accumulator.weighted_mean: Tracks a weighted sum, mean, and variance (West's incremental algorithm)
bh.accumulator.sum: High accuracy sum (Neumaier)
bh.accumulator.mean: Running count, mean, and variance (Welford's incremental algorithm)
- Histogram operations
(a, b, ...): Fill with arrays or single values
+: Add two histograms
.rank(): The number of dimensions
.size(): The number of bins (including under/overflow bins)
.reset(): Set counters to 0
*=: Multiply by a scalar (not all storages) (hist * scalar and scalar * hist supported too)
/=: Divide by a scalar (not all storages) (hist / scalar supported too)
.to_numpy(flow=False): Convert to a numpy style tuple (with or without under/overflow bins)
.view(flow=False): Get a view on the bin contents (with or without under/overflow bins)
np.asarray(...): Get a view on the bin contents with under/overflow bins
.axis(i): Get the i-th axis
.at(i, j, ...): Get the bin contents at a location
.sum(): The total count of all bins
.project(ax1, ax2, ...): Project down to the listed axes (given by number)
bh.histogram(..., storage=...) to make a histogram (there are several different types)
- Several common combinations are optimized, such as regular axes + int storage
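The accumulator entries above name two classic numerical techniques: Welford's incremental algorithm for a running mean and variance, and Neumaier's compensated summation. The sketch below shows both in plain Python to illustrate the idea; it is not the library's code, and the names `RunningMean` and `neumaier_sum` are hypothetical. It assumes the sample variance (dividing by count - 1), which may differ from what a given accumulator reports.

```python
class RunningMean:
    """Running count, mean, and variance using Welford's incremental update."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from both the old and the new mean
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance (assumption); undefined for fewer than two values
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


def neumaier_sum(values):
    """Compensated sum that tracks the low-order bits lost in each addition."""
    total = 0.0
    compensation = 0.0
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            compensation += (total - t) + v  # low-order bits of v were lost
        else:
            compensation += (v - t) + total  # low-order bits of total were lost
        total = t
    return total + compensation


acc = RunningMean()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(value)

tricky = [1.0, 1e100, 1.0, -1e100]
print(sum(tricky), neumaier_sum(tricky))  # 0.0 2.0
```

The last line shows why a compensated sum matters: naive summation loses both small terms next to the large one, while the compensated version recovers them.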
For a source build, for example from an "sdist" package, the only requirement is a C++14-compatible compiler. If you are using Python 2.7 on Windows, this can be done in theory but is challenging; please upgrade to Python 3.6 or newer. Check the PyBind11 documentation for more help. On some Linux systems, you may need to use a newer compiler than the one your distribution ships with.
The easiest way to get boost-histogram is to use a binary wheel. These are the supported platforms for which wheels are produced:

| Platform | Architecture | Python versions |
|---|---|---|
| ManyLinux1 (custom GCC 8.3) | 64 & 32-bit | 2.7, 3.5, 3.6, 3.7 |
| ManyLinux2010 | 64-bit | 2.7, 3.5, 3.6, 3.7 |
| macOS 10.9+ | 64-bit | 2.7, 3.6, 3.7 |
| Windows | 64 & 32-bit | 3.6, 3.7 |

- Linux: I'm not supporting 3.4 because I would have to build the NumPy wheels to do so.
- manylinux1: Using a custom docker container with GCC 8.3; should work, but can't be called directly by other compiled extensions unless they do the same thing (I think that's the main caveat). Supporting 32 bits because it's there.
- manylinux2010: Requires pip 10+ and a version of Linux newer than 2010. This is very new technology.
- macOS: Using the dedicated 64-bit 10.9+ Python.org builds. Not supporting 3.5 because those no longer provide binaries (could add a 32+64 fat 10.6+ that really was 10.9+, but not worth it, IMO).
- Windows: older Python versions are hard to support for now due to MSVC; this could be attempted later, since PyBind11 is technically supposed to be able to do it.
If you are on a Linux system that is not part of the "many" in manylinux, such as Alpine or ClearLinux, building from source is usually fine, since the compilers on those systems are often quite new. It will just take a little longer to install when it's using the sdist instead of a wheel.
Conda support is planned.
This repository has dependencies in submodules. Check out the repository like this:

    git clone --recursive https://github.com/scikit-hep/boost-histogram.git
    cd boost-histogram

Faster version:

    git clone https://github.com/scikit-hep/boost-histogram.git
    cd boost-histogram
    git submodule update --init --depth 10

Make a build directory and run CMake. If you have a specific Python you want to use, add -DPYTHON_EXECUTABLE=$(which python) or similar to the CMake line.

    mkdir build
    cd build
    cmake ..
    make -j4

Run the unit tests (requires pytest and numpy) with make test from the build directory.
The tests also require pytest-benchmark. If you are using Python 2, you will need futures as well.
You can enable benchmarking with
--benchmark-enable. You can also run explicit performance tests with
To install using the pip method for development instead, run:

    python3 -m venv .env
    . .env/bin/activate
    python -m pip install .[test]

You'll need to reinstall it if you want to rebuild.
Updating dependencies:
This will check out new versions of the dependencies. The example uses the fish shell.

    for f in *
        cd $f
        git fetch
        git checkout boost-1.70.0 || echo "Not found"
        cd ..
    end