
ENH: add initial and out parameters to bincount #22907

Closed

Conversation

@arthurfeeney commented Dec 31, 2022

This PR adds optional parameters initial and out to np.bincount, mostly based on discussion in #22471 and a prior attempt at this #9424.

  • initial is used to set the initial values for the output array.
  • out is an alternative output parameter. If initial is out, then np.bincount will reuse and directly accumulate onto out. (This allows bincount to reuse out across multiple calls, rather than overwriting it.)

It's a bit niche, but a feature like this has been requested several times in the past (#22471, #8495, #9424). It's sort of a faster alternative to np.add.at.

The intended use case is essentially to do bincounts over large chunks of data:

high_res_map = np.zeros(10000**2)
for indices, quantities in next_big_chunk_of_data():
    np.bincount(indices, quantities, initial=high_res_map, out=high_res_map)
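
For comparison, the same chunked accumulation can already be expressed with np.add.at. This is a sketch only: next_big_chunk_of_data is a hypothetical stand-in for the data source in the example above, and the sizes are made up.

```python
import numpy as np

# Hypothetical stand-in for the chunked data source in the example above.
def next_big_chunk_of_data():
    rng = np.random.default_rng(0)
    for _ in range(3):
        yield rng.integers(0, 100, size=1000), rng.random(1000)

high_res_map = np.zeros(100)
for indices, quantities in next_big_chunk_of_data():
    # add.at accumulates in place, so repeated indices all contribute --
    # the same effect as the proposed initial=high_res_map, out=high_res_map.
    np.add.at(high_res_map, indices, quantities)
```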

This is my first time trying to contribute to NumPy, so apologies in advance if I've missed something. Of course, I am happy to make any changes / improvements :-)

@arthurfeeney force-pushed the bincount-out-param branch 3 times, most recently from fc70b8e to 9f6e608 on December 31, 2022
The optional parameters initial and out allow bincount
to efficiently reuse and accumulate results across
multiple invocations.
@mattip (Member) commented Jan 1, 2023

Do we need both initial and out? They seem redundant: if I go to the bother of supplying an initial, it can just be out.

It's sort of a faster alternative to np.add.at

Did you run benchmarks on this new implementation? Also see #22889 which speeds up np.add.at by about 6x.

@mhvk (Contributor) left a comment

Had a first quick look at this, and I think the initial plus out idea actually works well! I have some first in-line comments on the docstring and the tests.

The main code is harder to review: it seems you made quite a few changes that may be improvements but are not essential for the addition of initial and out. I worry the changes in how strides are dealt with may make the code slower in some cases. I'll need to think a bit more, but my sense is that rather than check this exhaustively, it may be better to keep the code as similar as possible, i.e., not change list and weight, and only deal with strides for out. (For initial, it should not be necessary to worry about strides: in the case of initial_array != out_array, one can simply use an array copy, which would also take care of casting -- though the default of 0 should probably be treated separately.) Or at least split it into 2 commits? Anyway, probably good to have others have a look too!

@@ -912,22 +913,32 @@ def bincount(x, weights=None, minlength=None):
Weights, array of the same shape as `x`.
minlength : int, optional
A minimum number of bins for the output array.
initial: ndarray, 1 dimension, optional
Array of initial values for each bin. It must have the same shape and
buffer length as the expected output
Review comment:

I think ideally it would just broadcast to the output (i.e., effectively the default is initial=0).


.. versionadded:: 1.6.0

Returns
-------
out : ndarray of ints
The result of binning the input array.
The length of `out` is equal to ``np.amax(x)+1``.
The length of `out` is equal to ``max(minlength, np.amax(x)+1)``.
Review comment:

Strictly, it can be even larger if the input out has a larger size.

@@ -417,6 +417,8 @@ def bincount(
/,
weights: None | ArrayLike = ...,
minlength: SupportsIndex = ...,
initial: None | NDArray[Any] = ...,
Review comment:

So, I think scalar would be OK in principle.

{
npy_intp min = *data;
npy_intp max = *data;
*mn = *(npy_intp *)data;
Review comment:

Why these changes?

}

lst = (PyArrayObject *)PyArray_ContiguousFromAny(list, NPY_INTP, 1, 1);
if (lst == NULL) {
list_arr = _array_from_object(list_obj, NPY_INTP, 0);
Review comment:

I'm a bit worried this change to deal with strides might actually make the code slower in some cases -- it may be faster to copy the index and weights to new arrays in turn rather than loop over them with unequal stride.

out = np.empty(6, dtype=float)
assert_raises_regex(TypeError,
"Cannot cast",
lambda: np.bincount(x, out=out))
Review comment:

Should this be an error? In principle, casting is possible. But I can see that this would make things more complicated, so probably not worth it!

initial = np.empty(6, dtype=float)
assert_raises_regex(TypeError,
"Cannot cast",
lambda: np.bincount(x, initial=initial))
Review comment:

Here the case is a bit less clear, since one clearly could cast (if unsafely). This would be especially nice since one could then really just define a default of zero for the argument (initial=0), which would cast to float when weight is present.

x = np.array([1, 5, 2, 4, 1])
out = np.ones(6, dtype=int)
y = np.bincount(x, initial=out, out=out)
assert_array_equal(out, np.array([0, 2, 1, 0, 1, 1]) + 1)
Review comment:

Here, also test assert y is out.

Review comment:

Also, good to do a test with weight present too.

def test_with_initial_and_out2(self):
x = np.array([1, 5, 2, 4, 1])
out = np.zeros(6, dtype=int)
initial = np.ones(6, dtype=int)
Review comment:

One of the above tests would be good to do with initial=1 (which I feel should work too).


def test_with_strided_initial(self):
x = np.array([1, 5, 2, 4, 1])
initial = np.ones(12, dtype=int)
Review comment:

Maybe good to define an array with order, so we can see the slicing worked correctly (i.e., catch possible bugs with strides in the code). E.g., initial = np.arange(12).

@arthurfeeney (Author) replied:

Do we need both initial and out? They seem redundant, if I go to the bother of supplying an initial, it can just be out.

So, initial is kind of necessary. There needs to be some way of telling bincount to accumulate on top of the existing values in out rather than overwriting them. (By default, if initial=None, the out array will be zeroed before the actual binning algorithm is run.) So, I'm not sure I agree that it's redundant, but it could be implemented in a different way.
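
A minimal sketch of these semantics, emulated in pure NumPy rather than C (bincount_accumulate is a hypothetical helper name, not the PR's implementation; it assumes out is at least as long as the bincount result):

```python
import numpy as np

def bincount_accumulate(x, weights=None, minlength=0, initial=None, out=None):
    # Emulation of the proposed behavior, not the PR's C implementation.
    counts = np.bincount(x, weights=weights, minlength=minlength)
    if out is None:
        out = np.zeros(len(counts), dtype=counts.dtype)
    if initial is None:
        out[:] = 0              # default: out is simply overwritten
    elif initial is not out:
        out[:] = initial        # start from a copy of initial
    # if initial is out, keep the existing values and accumulate onto them
    out[:len(counts)] += counts
    return out

x = np.array([1, 5, 2, 4, 1])
buf = np.ones(6, dtype=np.intp)
bincount_accumulate(x, initial=buf, out=buf)
# buf now holds np.bincount(x, minlength=6) + 1
```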

Did you run benchmarks on this new implementation?

I did run the microbenchmarks: python runtests.py --bench-compare fe2c2a4cbb bench_function_base.Bincount shows either no significant performance difference or a slight improvement. The histogram benchmarks also show no difference.

Also see #22889 which speeds up np.add.at by about 6x.

Thank you for sharing this! I hadn't noticed that. Impressive speedup. I guess it makes this change to bincount less worthwhile :-( I can set up a comparison if you'd like?

The main code is a bit harder as it seems you made quite a few changes that may be improvements but are not essential for the addition of initial and out. [...] I'll need to think a bit more but my sense is that rather than check this exhaustively, it may be better to keep the code as similar as possible

I forgot to attribute it in the original post, but I actually built on top of #9424, rather than starting from scratch. So some of the code changes (mostly the work to handle strides) are actually from that old PR. I agree, it's probably safer to limit the changes for now. So, I can undo some of it.

@mhvk (Contributor) commented Jan 3, 2023

@arthurfeeney - I think it is better to do one thing at a time, but an alternative would be to at least make them separate commits. That would also give some credit to @jaimefrio.

@seberg (Member) commented Feb 16, 2023

Since bincount and add.at are now much more comparable in speed, should we close this? (See #5922 (comment))

I am not opposed to this, since I do think it is straightforward in its core. But it is also just a bit awkward with out= and minlength=. And at this point I suspect there are few use-cases left that np.add.at isn't (almost) as good at (or could be improved to be).

(There are many ways that add.at can be improved still – including deep ones – but most use-cases are likely reasonably covered at this point.)
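
For reference, the two spellings already give identical results for the weighted case; what this PR adds is only the in-place accumulation part (a minimal illustration):

```python
import numpy as np

x = np.array([1, 5, 2, 4, 1])
w = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

# bincount allocates a fresh result array on every call ...
a = np.bincount(x, weights=w, minlength=6)

# ... while add.at accumulates into an existing buffer in place.
b = np.zeros(6)
np.add.at(b, x, w)

assert np.allclose(a, b)
```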

@seberg added the "57 - Close?" label on Feb 16, 2023
@mhvk (Contributor) commented Feb 16, 2023

@seberg - I think it makes sense, but it would be good to check that performance for the optimal case (contiguous, 1d) is now indeed comparable. If so, we should probably mention in the release notes, and perhaps even in the docstring of np.bincount, that we now recommend np.add.at (and then eventually either deprecate np.bincount or re-implement it in terms of np.add.at).

On the other hand, if np.bincount still is substantially (2x?) faster, then I think it is worth putting (a version of) this in.
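
A rough sketch of what re-implementing bincount on top of np.add.at could look like (bincount_via_add_at is a hypothetical name; input validation, casting rules, and the real bincount's dtype details are glossed over):

```python
import numpy as np

def bincount_via_add_at(x, weights=None, minlength=0):
    # Hypothetical re-implementation sketch, not NumPy's actual code.
    x = np.asarray(x)
    n = max(minlength, int(x.max()) + 1 if x.size else 0)
    if weights is None:
        out = np.zeros(n, dtype=np.intp)
        np.add.at(out, x, 1)          # plain counting
    else:
        out = np.zeros(n, dtype=np.float64)
        np.add.at(out, x, weights)    # weighted counting
    return out
```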

@mattip (Member) commented Feb 16, 2023

I think the two are asymptotically equivalent; bincount has much less overhead. If we add initial and out, that will add more overhead and make them even closer.

@seberg (Member) commented Feb 17, 2023

@arthurfeeney thanks so much for the PR. I am going to close it for now, since I am very sure your use-case is pretty well covered by np.add.at (a doc update would be nice).

Reimplementing may well make sense (or re-using the inner loop). Bincount is very limited in which dtypes it can take, but it has the advantage of being much easier to find/read for new users, I expect. And it has the auto-resizing feature.

@seberg seberg closed this Feb 17, 2023
Labels
01 - Enhancement · 57 - Close? (Issues which may be closable unless discussion continued)

4 participants