# Performance with NumPy arrays

We compare the performance of histogram computations on randomly generated NumPy arrays, all loaded eagerly into memory.
This is the same setup that the [boost-histogram documentation](https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html) uses for its comparison with NumPy.
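The timings in this notebook come from the IPython `%%timeit` cell magic, which is not available in a plain Python script. As a rough sketch, a similar "mean ± std. dev. of 7 runs" figure can be obtained with the standard-library `timeit` module. The `workload` function is a hypothetical stand-in for a histogram call, and using the population standard deviation is an assumption about what the magic reports.

```python
import statistics
import timeit


def workload():
    # Hypothetical stand-in for a call such as np.histogram(da_1d, bins=edges[0])
    return sum(i * i for i in range(10_000))


runs, loops = 7, 10

# Each entry is the total time for `loops` consecutive calls
totals = timeit.repeat(workload, repeat=runs, number=loops)
per_loop = [total / loops for total in totals]

mean = statistics.mean(per_loop)
std = statistics.pstdev(per_loop)  # assumption: population std. dev.

print(
    f"{mean * 1e3:.3f} ms ± {std * 1e3:.3f} ms per loop "
    f"(mean ± std. dev. of {runs} runs, {loops} loops each)"
)
```

Note that the timed cells in this notebook also include the `assert_allclose` check, so the reported figures cover both the histogram computation and the comparison with the reference answer.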

In [1]:
import boost_histogram as bh
import numpy as np
import xarray as xr
import xarray_histogram as xh
from numpy.testing import assert_allclose
from xhistogram.xarray import histogram as xhistogram

In [2]:
nbins = (100, 100)
ranges = ((-3, 3), (-3, 3))

axes = [bh.axis.Regular(nbins[i], *ranges[i]) for i in range(2)]
edges = [ax.edges for ax in axes]
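The `edges` list is what allows NumPy and xhistogram to use the same binning as the boost-histogram axes. The sketch below illustrates that binning with NumPy only, assuming that a regular axis with 100 bins on the interval from -3 to 3 has edges equivalent to `np.linspace(-3, 3, 101)`. The sample size and the `rng` generator are chosen for illustration and are not part of the benchmark.

```python
import numpy as np

nbins, (start, stop) = 100, (-3, 3)

# Assumed equivalent of bh.axis.Regular(nbins, start, stop).edges
edges = np.linspace(start, stop, nbins + 1)

# Small illustrative sample (the benchmark uses 10 million values)
rng = np.random.default_rng(42)
sample = rng.normal(size=100_000).astype(np.float32)

counts, _ = np.histogram(sample, bins=edges)

# Values outside [start, stop] are not counted by np.histogram
in_range = int(((sample >= start) & (sample <= stop)).sum())

print(edges.shape, counts.shape)
print(int(counts.sum()) == in_range)
```

With a standard normal distribution, only a small fraction of the samples falls outside the range and is left out of the counts.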

In [3]:
np.random.seed(42)

vals1d = np.random.normal(size=[10_000_000]).astype(np.float32)
da_1d = xr.DataArray(vals1d, name='test_1d')

vals = np.random.normal(size=[2, 10_000_000]).astype(np.float32)
da_2d = [xr.DataArray(vals[i], name=f'test_2d_{i}') for i in range(2)]

## One-dimensional histogram (flattened)

In [4]:
answer, _ = np.histogram(da_1d, bins=edges[0])

### Numpy

In [5]:
%%timeit
h, _ = np.histogram(da_1d, bins=edges[0])
assert_allclose(h, answer)

578 ms ± 1.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### xhistogram

In [6]:
%%timeit
h = xhistogram(da_1d, bins=edges[0])
assert_allclose(h.values, answer)

402 ms ± 857 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


### xarray-histogram

In [7]:
%%timeit
h = xh.histogram(da_1d, bins=[axes[0]])
assert_allclose(h.values, answer)

34.5 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


## Two-dimensional histogram (flattened)

In [8]:
answer, _, _ = np.histogram2d(*da_2d, bins=edges)

### Numpy

In [9]:
%%timeit
h, _, _ = np.histogram2d(*da_2d, bins=edges)
assert_allclose(h, answer)

793 ms ± 836 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


### xhistogram

In [10]:
%%timeit
h = xhistogram(*da_2d, bins=edges)
assert_allclose(h.values, answer)

808 ms ± 1.15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### xarray-histogram

In [11]:
%%timeit
h = xh.histogram(*da_2d, bins=axes)
assert_allclose(h.values, answer)

68 ms ± 273 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


## One-dimensional histogram (along second dimension)

In [12]:
vals_1d_t = np.random.normal(size=[3, 3_000_000]).astype(np.float32)
da_1d_t = xr.DataArray(vals_1d_t, name='test_1d', dims=['t', 'x'])

In [13]:
answer = xhistogram(da_1d_t, bins=edges[0], dim=['x']).load()

### xhistogram

In [14]:
%%timeit
h = xhistogram(da_1d_t, bins=edges[0], dim=['x'])
assert_allclose(h.values, answer.values)

362 ms ± 3.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### xarray-histogram

In [15]:
%%timeit
h = xh.histogram(da_1d_t, bins=[axes[0]], dims=['x'])
assert_allclose(h.values.T, answer.values)

26.4 ms ± 224 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
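
To put the timings side by side, the following sketch computes how many times faster xarray-histogram is than each alternative. The numbers are the mean timings copied from the cell outputs above, so they only reflect this particular run and machine. The labels used for the three cases are chosen here for illustration.

```python
# Mean timings in milliseconds, copied from the cell outputs above
timings_ms = {
    "1D flattened": {"numpy": 578, "xhistogram": 402, "xarray-histogram": 34.5},
    "2D flattened": {"numpy": 793, "xhistogram": 808, "xarray-histogram": 68},
    "1D along x": {"xhistogram": 362, "xarray-histogram": 26.4},
}

speedups = {}
for case, results in timings_ms.items():
    reference = results["xarray-histogram"]
    # Ratio of each alternative's time to the xarray-histogram time
    speedups[case] = {
        name: value / reference
        for name, value in results.items()
        if name != "xarray-histogram"
    }

for case, ratios in speedups.items():
    for name, ratio in ratios.items():
        print(f"{case}: xarray-histogram is about {ratio:.1f}x faster than {name}")
```

For these measurements the ratios lie roughly between 12 and 17.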
