performance benchmarking #48

dcherian · 2021-11-22T21:03:59Z

In the microbenchmark below, a bincount-ed sum is roughly 6x slower than a plain sum, and a bincount-ed count is roughly 2x slower than a plain count. So there are definitely cases where it makes sense to split the dataset early instead of using flox. I'm seeing this on Pangeo Cloud with the GODAS dataset.

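import numpy as np

array = np.zeros((10 ** 5,), dtype=int)
by = array  # group labels: every element falls in group 0

%timeit np.bincount(by)  # bincount-ed count
%timeit np.bincount(by, weights=array)  # bincount-ed sum
%timeit np.sum(~np.isnan(by))  # plain count
%timeit array.sum()  # plain sum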

264 µs ± 2.76 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
441 µs ± 1.33 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
117 µs ± 2.68 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
65.9 µs ± 1.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
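
For anyone who wants to reproduce this outside IPython, here is a minimal sketch using the standard-library `timeit` module instead of the `%timeit` magic. The helper `best_time_us`, the `candidates` dict, and the repeat/number settings are my own illustrative choices, not part of flox. Absolute timings will vary by machine, so no expected numbers are given.

```python
import timeit

import numpy as np

# Same inputs as the snippet above: one group (label 0) holding 10**5 zeros.
array = np.zeros((10**5,), dtype=int)
by = array

# Sanity check: with a single group, bincount matches the plain reductions.
assert np.bincount(by)[0] == np.sum(~np.isnan(by))
assert np.bincount(by, weights=array)[0] == array.sum()

# The four operations being compared (names are illustrative labels).
candidates = {
    "bincount count": lambda: np.bincount(by),
    "bincount sum": lambda: np.bincount(by, weights=array),
    "plain count": lambda: np.sum(~np.isnan(by)),
    "plain sum": lambda: array.sum(),
}


def best_time_us(func, repeat=5, number=50):
    """Return the best per-call time in microseconds over `repeat` runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1e6


timings = {name: best_time_us(func) for name, func in candidates.items()}
for name, t in timings.items():
    print(f"{name:>15}: {t:8.1f} us")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; `%timeit` reports mean ± std. dev. instead, so the numbers are not directly comparable to the ones above.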


dcherian · 2021-12-29T21:45:12Z

With engine="flox"

%timeit b, a = flox.core._prepare_for_flox(by, array); flox.aggregate_flox.nanlen(b, a, fill_value=0)
%timeit b, a = flox.core._prepare_for_flox(by, array); flox.aggregate_flox.sum(b, a, fill_value=0)

237 µs ± 33.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
180 µs ± 6.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

dcherian closed this as completed Oct 19, 2022
