in place fancy indexing for ufuncs #2821

Merged
merged 56 commits into from Aug 16, 2013
+524 −2

5 participants

Currently, when doing something like `a[idx] += 1`, where `idx` is a fancy index that lists an element more than once, the increment is performed on each element's original value instead of accumulating the increments. This is because the first operand is buffered, and the ufunc operates on the buffered operand.
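A minimal demonstration of the problem described above (the values here are illustrative, not taken from the PR):

```python
import numpy as np

a = np.arange(4)     # [0, 1, 2, 3]
# index 1 is listed twice, but both increments are applied to the
# buffered original value, so a[1] ends up incremented only once
a[[0, 1, 1]] += 1
print(a)             # -> [1 2 2 3]
```

Note that `a[1]` is 2, not 3, even though index 1 appears twice.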

This pull request adds a new method called `at` to the ufunc object for in-place fancy indexing with no buffering. This allows a ufunc operation to be performed on an element more than once, with the result of each operation stored in place before the next operation is performed. For addition, this is equivalent to `a[idx] += b` with no buffering of operand `a`. For example:

a = np.array([0,1,2,3])
# second element is incremented twice
np.add.at(a, [0, 1, 1], 1)
a
>>> array([1, 3, 2, 3])

The 'at' method also supports slicing. For example:

a = np.array([0,1,2,3])
np.add.at(a, slice(None), 1)
a
>>> array([1, 2, 3, 4])

A more complicated example:

a = np.arange(27).reshape(3,3,3)
b = np.array([100,200,300])
np.add.at(a, (slice(None), slice(None), [1, 2, 1]), b)
a
>>> array([[[  0, 401, 202],
        [  3, 404, 205],
        [  6, 407, 208]],

       [[  9, 410, 211],
        [ 12, 413, 214],
        [ 15, 416, 217]],

       [[ 18, 419, 220],
        [ 21, 422, 223],
        [ 24, 425, 226]]])

The 'at' method can be used with all ufuncs, including unary ufuncs:

a = np.array([0, 1, 2, 3])
np.negative.at(a, [0,1])
a
>>> array([0, -1, 2, 3])

commented on an outdated diff Dec 14, 2012
```
@@ -5684,6 +5684,47 @@ def luf(lamdaexpr, *args, **kwargs):
     """))

+add_newdoc('numpy.core', 'ufunc', ('at',
+    """
+    at(a, indices, b=None)
+
+    Performs operation in place on array for items specified by indices.
+    Items can be listed more than once and operation will be performed
+    on result of operation on previous item.
```
 njsmith NumPy member I can't parse "operation will be performed on result of operation on previous item."
commented on an outdated diff Dec 14, 2012
```
@@ -5684,6 +5684,47 @@ def luf(lamdaexpr, *args, **kwargs):
     """))

+add_newdoc('numpy.core', 'ufunc', ('at',
+    """
+    at(a, indices, b=None)
+
+    Performs operation in place on array for items specified by indices.
+    Items can be listed more than once and operation will be performed
+    on result of operation on previous item.
+
+    Equivalent to a[indices] += b for addition ufunc, except that results
+    are accumulated for indices listed more than once.
```
 njsmith NumPy member The docs should assume that the reader has no idea why there would be any problem with writing `a[indices] += 1`, and explain. It might also be useful to contrast with `ufunc(a[indices], *args, out=a[indices])` (which for fancy indexing doesn't work at all, but again it's not necessarily obvious why), or `a[indices] = ufunc(a[indices], *args)`.
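The contrast njsmith suggests can be sketched as follows (illustrative values, not from the PR itself):

```python
import numpy as np

a = np.arange(4)
idx = [0, 1, 1]
# the "obvious" spelling reads all indexed values, applies the ufunc to
# the extracted copy, then writes back -- duplicate indices collapse
a[idx] = np.add(a[idx], 1)
print(a)     # -> [1 2 2 3]

# ufunc.at applies the operation in place, element by element,
# so duplicate indices accumulate
b = np.arange(4)
np.add.at(b, idx, 1)
print(b)     # -> [1 3 2 3]
```

This side-by-side is the sort of explanation the review is asking the docstring to carry.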
and 1 other commented on an outdated diff Dec 14, 2012
numpy/core/code_generators/numpy_api.py
```
@@ -332,7 +332,10 @@
     'PyDataMem_SetEventHook': 291,
     'PyArray_MapIterSwapAxes': 293,
     'PyArray_MapIterArray': 294,
-    'PyArray_MapIterNext': 295,
+    'PyArray_MapIterNew': 295,
+    'PyArray_MapIterBind': 296,
+    'PyArray_MapIterReset': 297,
+    'PyArray_MapIterNext': 298,
```
njsmith NumPy member: What's going on here? Remind me whether the `PyArray_MapIter` API has been merged into 1.7? Because we can't change the numeric value for `PyArray_MapIterNext` after it's been in any release; that would be an ABI break. Also, didn't we just decide to export a simplified API that doesn't have all the weird rules that New/Bind/Reset do? Why is that changed here?

jayvius: You're right... I missed that discussion. I'll remove this change and replace my calls to New/Bind/Reset with MapIterArray.
and 1 other commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 @@ -4498,6 +4498,185 @@ static int get_ufunc_arguments(PyUFuncObject *ufunc, return PyUFunc_GenericReduction(ufunc, args, kwds, UFUNC_REDUCEAT); } +/* + * Call ufunc only on selected array items and store result in first operand. + * For add ufunc, method call is equivalent to op1[idx] += op2 with no + * buffering of the first operand. + * Arguments: + * op1 - First operand to ufunc + * idx - Indices that are applied to first operand. Equivalent to op1[idx]. + * op2 - Second operand to ufunc (if needed). Must be able to broadcast + * over first operand. + */ +static PyObject * +ufunc_at(PyUFuncObject *ufunc, PyObject *args, PyObject *kwds) +{ + static char *kwlist[] = {"op1", "idx", "op2"};
njsmith NumPy member: Uh, these strings aren't meaningless... they actually define which kwargs people can write. The docs also define which kwargs people can write. But the code and docs don't match. (The best solution is probably to just not accept kwargs in the code -- it's easier to extend APIs later if you know that existing callers aren't playing funny buggers with kwarg ordering and stuff.)

jayvius: I'm not quite sure what you're getting at here. I was trying to allow for both positional and keyword arguments like the other ufunc methods do (reduce, accumulate, ...). Is this not the right way to do that?

njsmith NumPy member: I'm arguing that for required arguments, we shouldn't accept the keyword form. The only reason we'd want to support the kw form is so people can write things like

```
np.add.at(idx=[1, 2, 3], op2=b, op1=a)
```

but... that's a terrible reason :-). And the more calling forms we accept, the harder it is to extend interfaces later without breaking backwards compatibility. Like, for example, suppose we want to change the code later to accept an arbitrary number of positional arguments, because people start making 3-argument, 4-argument etc. ufuncs. The easy way to do this is to treat op2 and all further arguments as a simple list; but if backwards compatibility means we have to continue to accept `op2=` as a kwarg, then the code will become pointlessly complex. AFAICT this is how plain old ufunc calls work right now (`__call__`, not reduce etc.) -- `np.add(op1=x, op2=y)` doesn't work, and that's fine.
commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + PyArray_Descr *iter_descr = NULL; + PyArray_Descr *dtypes[3]; + int needs_api; + PyUFuncGenericFunction innerloop; + void *innerloopdata; + char *dataptr[3]; + npy_intp count[1], stride[1]; + int ndim; + int i; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "OO|O", kwlist, + &op1, &idx, &op2)) { + return NULL; + } + + if (ufunc->nin == 2 && op2 == NULL) {
 njsmith NumPy member IIUC, we in theory support ufuncs with more than 2 args. So in theory the parameters for this method should be `at(arr, indices, *args)`. Is there any reason that they aren't? Even if supporting >2 arg ufuncs is too hard right now, we should at least avoid closing the door on doing so later by (1) using an appropriate signature so that we aren't locked into treating the second argument specially (see above about kwargs), (2) checking for such ufuncs and raising an error.
and 2 others commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + "number of dimensions in index must match number of dimensions in first operand"); + goto fail; + } + } + else if (ndim > 1) { + PyErr_SetString(PyExc_ValueError, + "number of dimensions in index must match number of dimensions in first operand"); + goto fail; + } + + iter = (PyArrayMapIterObject*)PyArray_MapIterNew(idx, 0, 1); + if (iter == NULL) { + goto fail; + } + + PyArray_MapIterBind(iter, op1_array);
njsmith NumPy member: Didn't we decide that MapIterBind should have a return code? Why aren't we checking it?

jayvius: MapIterBind currently has a void return type, which is why the next line checks for a NULL `iter->ait` value as a cheap workaround. I'm going to remove this call to MapIterBind though, so it's irrelevant now.

njsmith NumPy member: Oh, gah, I'm sorry! It turns out I had you confused with @jsalvatier, who has been working on very similar stuff. (@jsalvatier, you'll probably find this PR interesting.) The comment about MapIterBind in particular was in reference to #2710, which does add a return code to MapIterBind, and which I just merged. As you say, this probably doesn't matter for you since you'll switch to the simplified API anyway, but you may want to make sure you're working against the latest master, since #2710 may have affected your code.

jsalvatier: Indeed, very interesting, nice work @jayvius!
and 1 other commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + /* + * May need to swap axes so that second operand is + * iterated over correctly + */ + if ((iter->subspace != NULL) && (iter->consec)) { + if (iter->iteraxes[0] > 0) { + PyArray_MapIterSwapAxes(iter, &op2_array, 0); + if (op2_array == NULL) { + goto fail; + } + } + } + + /* + * Be sure values array is "broadcastable" + * to shape of mit->dimensions, mit->nd
njsmith NumPy member: you mean `iter->dimensions` and `iter->nd`, I assume.

jayvius: Correct :)
commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + * to shape of mit->dimensions, mit->nd + */ + if ((iter2 = (PyArrayIterObject *)\ + PyArray_BroadcastToShape((PyObject *)op2_array, + iter->dimensions, iter->nd))==NULL) { + goto fail; + } + } + + /* + * Create dtypes array for either one or two input operands. + * The output operand is set to the first input operand + */ + dtypes[0] = PyArray_DESCR(op1_array); + if (op2_array != NULL) { + dtypes[1] = PyArray_DESCR(op2_array);
njsmith NumPy member: Didn't we just FORCECAST `op2_array` to match the dtype of `op1_array`? I guess this is still good defensive style, but acknowledging it in the comment would be good to remind people what's going on.

njsmith NumPy member: Or, uh, actually, why did we FORCECAST it? When I read the call to `PyArray_FromAny` above it seemed like it made sense, but now that I'm looking at the calling code down here I can't tell if it was necessary.
commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + PyObject *op2 = NULL; + PyArrayObject *op1_array = NULL; + PyArrayObject *op2_array = NULL; + PyArrayMapIterObject *iter = NULL; + PyArrayIterObject *iter2 = NULL; + PyArray_Descr *iter_descr = NULL; + PyArray_Descr *dtypes[3]; + int needs_api; + PyUFuncGenericFunction innerloop; + void *innerloopdata; + char *dataptr[3]; + npy_intp count[1], stride[1]; + int ndim; + int i; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "OO|O", kwlist,
seberg NumPy member: I had to add an `O` here to be able to try the binary ufuncs...
commented on an outdated diff Dec 14, 2012
numpy/core/tests/test_ufunc.py
 + [112,213,314], + [115,216,317]], + + [[118,219,320], + [121,222,323], + [124,225,326]]]) + + a = np.arange(10) + np.negative.at(a, [2,5,2]) + assert_equal(a, [0, 1, 2, 3, 4, -5, 6, 7, 8, 9]) + + # test exception when indices dimensions < first operand dimensions + a = np.arange(27).reshape(3,3,3) + b = np.array([100,200,300]) + assert_raises(ValueError, np.add.at, a, [1,2,1], b) + assert_raises(ValueError, np.add.at, a, ([1,2,1],), b)
njsmith NumPy member: Other things to check:

- 0-d array (the only valid index should be `()`; scalars and non-empty tuples should be errors... I'm not sure if the code above gets this right actually)
- some mixed dtype calls. Actually I think the code right now will produce the following (wrong) results because of the FORCECAST thing:

```
>>> a = np.array([2])
>>> a[0] **= 3.5
>>> a
array([11])
>>> a = np.array([2])
>>> np.power.at(a, 0, 3.5)
>>> a
array([8])
```

- Fancy-indexing with multidimensional index arrays.
- Boolean indexing.
commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + PyErr_SetString(PyExc_TypeError, + "first operand must be array"); + return NULL; + } + + op1_array = op1; + + ndim = PyArray_NDIM(op1_array); + if (PyTuple_Check(idx)) { + if (PyTuple_GET_SIZE(idx) != ndim) { + PyErr_SetString(PyExc_ValueError, + "number of dimensions in index must match number of dimensions in first operand"); + goto fail; + } + } + else if (ndim > 1) {
njsmith NumPy member: This is the check that I think goes wrong if `ndim == 0`. Overall this code would be simpler (in the sense of not just being correct, but being obviously correct) if you just had a block up front that did

```
if (!PyTuple_Check(idx)) {
    idx = (idx,)  /* well, the C API equivalent */
}
```

and then only dealt with the tuple case.
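In Python terms, the normalization suggested here amounts to the following sketch (`normalize_index` is a hypothetical helper name, not part of the PR):

```python
def normalize_index(idx):
    """Wrap a non-tuple index in a 1-tuple so that downstream code
    only ever has to handle the tuple case."""
    if not isinstance(idx, tuple):
        return (idx,)
    return idx

print(normalize_index(3))        # -> (3,)
print(normalize_index((1, 2)))   # -> (1, 2)
```

Normalizing once up front means the dimension check and the iteration code never need separate scalar/list/tuple branches.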
NumPy member

Anyway, overall this is awesome :-)

and 1 other commented on an outdated diff Dec 14, 2012
numpy/core/src/umath/ufunc_object.c
 + return NULL; + } + + if (!PyArray_Check(op1)) { + PyErr_SetString(PyExc_TypeError, + "first operand must be array"); + return NULL; + } + + op1_array = op1; + + ndim = PyArray_NDIM(op1_array); + if (PyTuple_Check(idx)) { + if (PyTuple_GET_SIZE(idx) != ndim) { + PyErr_SetString(PyExc_ValueError, + "number of dimensions in index must match number of dimensions in first operand");
seberg NumPy member: I don't know if this check actually does much, but it is simplifying things too much. You could maybe argue that giving only one index is not a good idea for this function, but `Ellipsis`, `newaxis/None` (even if it's not supported in current master with fancy indexing) and boolean arrays will distort this check.

njsmith (via email, Dec 14, 2012) NumPy member: Yeah, you're right... I originally managed to confuse myself into thinking that the newaxis/None and scalar indexing special cases didn't make sense here, but they do. Now I'm thinking this check should just be removed entirely; let the indexing code take care of the checking. …
NumPy member

On further thought, I think our handling of broadcasting is not quite perfect. (Maybe we'll decide this is okay, but I want to point it out.)

```
np.add([1], [1, 2, 3])  # returns [2, 3, 4]
a = np.array([1])
# should 'a' now be 1 + 1 + 2 + 3 = 7?
```
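For comparison, with explicitly repeated indices `at` does accumulate every value of the second operand (a sketch with assumed values, not from the PR):

```python
import numpy as np

a = np.array([1])
# index 0 is repeated three times, so each of 1, 2, 3 is
# accumulated in turn: 1 + 1 + 2 + 3 = 7
np.add.at(a, [0, 0, 0], [1, 2, 3])
print(a)   # -> [7]
```

The open question above is whether broadcasting a length-1 operand against a longer second operand should behave the same way.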
NumPy member

@jayvius This needs a rebase.

NumPy member

@jayvius Now it's failing in single file builds. Probably the include order needs to be changed, but I can't tell as travisbot is taking too long to show the output.

NumPy member

@njsmith What do you think about putting this in when the failed build is fixed?

NumPy member

I would love to get this in, but it looks like there were several issues raised in review, and I don't know if they've been dealt with:

• We don't want to accept kw arguments for `.at`
• The code doesn't appear to handle ufuncs with >2 arguments -- not even to raise an error saying "we don't handle ufuncs with >2 arguments"
• The dtype/casting handling is suspicious
• The test coverage could be better
• There's a known segfault bug
• @seberg found he had to patch it further to make 'binary ufuncs' work (not sure if that means those that operate on bools or those that have arity 2)
• It appears to mishandle several sorts of indexing, including ellipses and 0d arrays.

None of this looks hard to fix, and some of it may have been fixed (I didn't re-review all the code), but I'm wary of just merging given all the above...

Just pushed some commits that fix some of the above issues. The main change is in commit 2a9d557 which replaces a direct call to the ufunc with a call to iterator_loop, so that iterator_loop can handle any buffering that's needed to handle mixed dtype cases. I'm not sure if this is the best or most efficient way to do this though. I especially don't like having to create array operands for every iteration. Is there a better or easier way to do this? I'm sure there is some code cleanup, error checking, etc that is still needed, but I wanted to verify my approach first before spending any more time on it.

commented on an outdated diff Apr 28, 2013
doc/source/reference/ufuncs.rst
 @@ -417,6 +417,11 @@ an integer (or Boolean) data-type and smaller than the size of the :class:`int_` data type, it will be internally upcast to the :class:`int_` (or :class:`uint`) data-type. +Ufuncs also have a fifth method that allows in place operations to be +performed using fancy indexing. No buffering is used, so the fancy index
 charris NumPy member This is only for binary ufuncs, right?
commented on the diff Apr 28, 2013
numpy/core/src/umath/ufunc_object.c
 + if (op2_array == NULL) { + goto fail; + } + } + } + + /* + * Create array iter object for second operand that + * "matches" the map iter object for the first operand. + * Then we can just iterate over the first and second + * operands at the same time and not have to worry about + * picking the correct elements from each operand to apply + * the ufunc to. + */ + if ((iter2 = (PyArrayIterObject *)\ + PyArray_BroadcastToShape((PyObject *)op2_array,
charris NumPy member: Need extra indent. The `\` can be dropped, no? The next line should line up with this one.

seberg NumPy member: Can you use the new iterator here with the itershape argument? You could then move the buffering/casting for the op2_array into that iterator as well, using op_dtypes. That would mean that the "dummy" casting iterator only needs to handle the first operand. Probably should check how to replace that with a direct call to the casting functions, but to be honest I don't care too much about that right now, and if we add subspace iteration optimizations the dummy iterator wouldn't be quite as dummy any more for those.
commented on an outdated diff Apr 28, 2013
numpy/core/src/umath/ufunc_object.c
 + operands[2] = op1_array; + dtypes[1] = PyArray_DESCR(operands[1]); + dtypes[2] = PyArray_DESCR(operands[2]); + } + else { + operands[1] = op1_array; + dtypes[1] = PyArray_DESCR(operands[1]); + } + + if (ufunc->type_resolver(ufunc, NPY_DEFAULT_ASSIGN_CASTING, + operands, NULL, dtypes) < 0) { + goto fail; + } + + if (ufunc->legacy_inner_loop_selector(ufunc, dtypes, + &innerloop, &innerloopdata, &needs_api) < 0) {
 charris NumPy member Needs more indentation, two tabs ideally.
commented on an outdated diff Apr 28, 2013
numpy/core/src/umath/ufunc_object.c
 + PyArrayIterObject *iter2 = NULL; + PyArray_Descr *iter_descr = NULL; + PyArray_Descr *dtypes[3] = {NULL, NULL, NULL}; + PyArrayObject *operands[3] = {NULL, NULL, NULL}; + int needs_api; + PyUFuncGenericFunction innerloop; + void *innerloopdata; + char *dataptr[3]; + npy_intp count[1], stride[1]; + npy_intp dims[1]; + int i; + int buffersize = 0; + int errormask = 0; + PyObject *errobj = NULL; + PyObject *arr_prep[NPY_MAXARGS]; + PyObject *arr_prep_args;
 charris NumPy member Can some of these declarations be made local in some other block?
commented on the diff Apr 28, 2013
numpy/core/tests/test_ufunc.py
 + assert_equal(a, [0, 1, 2, 3, 4, -5, 6, 7, 8, 9]) + + # Test 0-dim array + a = np.array(0) + np.add.at(a, (), 1) + assert_equal(a, 1) + + assert_raises(IndexError, np.add.at, a, 0, 1) + assert_raises(IndexError, np.add.at, a, [], 1) + + # Test mixed dtypes + a = np.arange(10) + np.power.at(a, [1,2,3,2], 3.5) + assert_equal(a, np.array([0, 1, 4414, 46, 4, 5, 6, 7, 8, 9])) + + # Test boolean indexing and boolean ufuncs
charris NumPy member: @seberg Are there any issues with boolean indexing? ISTR you will be making some mods there.

seberg NumPy member: No, they just go through the typical fancy indexing stuff, i.e. get converted to normal fancy indexes first. The issue is that purely non-fancy indices will not work.
commented on an outdated diff Apr 28, 2013
numpy/core/tests/test_ufunc.py
 @@ -771,5 +771,118 @@ def add_inplace(a, b): assert_no_warnings(np.add, a, 1.1, out=a, casting="unsafe") assert_array_equal(a, [4, 5, 6]) + def test_inplace_fancy_indexing(self): +
 charris NumPy member What happens with unary, trinary ufuncs? Are exceptions raised? If so that should be tested also.
and 1 other commented on an outdated diff Apr 28, 2013
numpy/core/src/umath/ufunc_object.c
 + + /* + * Iterate over first and second operands and call ufunc + * for each pair of inputs + */ + i = iter->size; + dims[0] = 1; + while (i > 0) + { + /* Set up operands for call to iterator_loop */ + operands[0] = PyArray_NewFromDescr(&PyArray_Type, + PyArray_DESCR(op1_array), + 1, dims, NULL, iter->dataptr, + NPY_ARRAY_WRITEABLE, NULL); + if (iter2 != NULL) { + operands[1] = PyArray_NewFromDescr(&PyArray_Type,
seberg NumPy member: You already said that, but creating an array for every single element smells slow. Can't you pull out the copyswap/casting from iterator_loop and then call the inner_loop directly, working on raw pointers? Maybe creating just one set of arrays before the loop?

jayvius: Finally taking a fresh look at this. I'm not sure exactly what needs to be pulled from iterator_loop, especially since iterator_loop uses a different type of iterator created by NpyIter_AdvancedNew (and which seems to take care of any buffering and casting needed?). Reusing the array operands for each iteration would be easier, but is there a way to set the data pointer of an array? Or am I misunderstanding what you're suggesting?

seberg NumPy member: Yeah, the iterator handles all that stuff, so pulling it fully out requires doing this by hand with copyswap and possibly other casting functions (I do not know this well either; it shouldn't be hard, but I am not sure what is needed exactly). A middle way would maybe be to abuse the new iterator, i.e. build up the iterator before the loop for a single item. I think you can then change the baseptrs of the iterator and reset it; np.nested_iters does something like it. It would be a little bit slower, but at least we don't create new arrays and iterators for each item.
NumPy member

@jayvius Needs a rebase.

NumPy member

@njsmith Can you take another look at this?
@jayvius Needs a rebase.

and 2 others commented on an outdated diff May 11, 2013
numpy/core/src/umath/ufunc_object.c
 + } + + /* Create second operand from number array if needed. */ + if (op2 != NULL) { + op2_array = (PyArrayObject *)PyArray_FromAny(op2, NULL, + 0, 0, 0, NULL); + if (op2_array == NULL) { + goto fail; + } + + /* + * May need to swap axes so that second operand is + * iterated over correctly + */ + if ((iter->subspace != NULL) && (iter->consec)) { + if (iter->iteraxes[0] > 0) {
seberg NumPy member: Could you remove this check (the `iteraxes[0] > 0` one)? I think I will merge gh-2701 soon, and with that change this check becomes unnecessary (and possibly wrong).

charris NumPy member: @jayvius gh-2701 has been merged.

jayvius: Looks like I still need this code, unless I'm missing something.

seberg NumPy member: Yes, but not the `iteraxes[0] > 0` check; you should remove that. The check is included in the `iter->consec` check, and iteraxes[0] can now be 0 in principle even when swapaxes is needed.

jayvius: Ah, I see what you mean now.
NumPy member

Needs a rebase.

Refactored to use NpyIter object on each element of input operands to handle buffering/casting, and then call innerloop directly using data pointers from NpyIter object.

NumPy member

@seberg Are you happy with this now?

NumPy member

I think it should be good. It would be interesting to see how slow/fast it is, but since fancy indexing isn't very fast anyway, I don't think it is anything to worry about. I would like a test where the buffering is tested, though — non-native byte orders or such; I didn't see one when I looked over it quickly.

NumPy member

Also, if it doesn't exist yet, please add a test for the empty subspace:

```
orig = np.arange(4)
a = orig[:, None][:, 0:0]
assert_array_equal(orig, np.arange(4))
```

Another small note, which does not yet make a difference. If I remember correctly, the docs currently say that buffering is not enabled. The question is, do we want that? I think it should say that buffering is not enabled for the fancy index. That way the subspace iteration could still be optimized with buffering in the future without contradicting the documentation.

and 1 other commented on an outdated diff May 27, 2013
numpy/core/src/umath/ufunc_object.c
 + NPY_ITER_NO_SUBTYPE; + } + + PyUFunc_GetPyValues(ufunc->name, &buffersize, &errormask, &errobj); + + /* + * Create NpyIter object to "iterate" over single element of each input + * operand. This is an easy way to reuse the NpyIter logic for dealing + * with certain cases like casting operands to correct dtype. On each + * iteration over the MapIterArray object created above, we'll take the + * current data pointers from that and reset this NpyIter object using + * those data pointers, and then trigger a buffer copy. The buffer data + * pointers from the NpyIter object will then be passed to the inner loop + * function. + */ + iter_buffer = NpyIter_AdvancedNew(nop, operands,
seberg NumPy member: Is there a good reason to iterate over the operand once as input and once as output (nop=2/3) instead of as readwrite with nop=1/2?

seberg NumPy member: I guess see the other comment above. I think we probably should put the casting for the second operand directly into its own iterator, which it probably needs anyway. Though thinking about it, with an optimized subspace it would make sense to do it as you are doing it here, but maybe when adding that optimization we should give it its own dedicated code path anyway.

jayvius: If I understand your first comment correctly, the reason for iterating twice is that the first iterator is for the fancy indexing, and the second iterator (the call to NpyIter_AdvancedNew) is for buffering each element of the operand in certain cases like casting to a new dtype. It's not really iterating over the operand; it's just to reuse the buffering logic for each individual element in the operand. Is that what you were getting at, or am I missing something? In any case, I'm not 100% sure that comment block in the code does a good job of explaining what's going on here.

seberg NumPy member: I think what I meant was that the operands are `op1_array, op2_array, op1_array` or `op1_array, op1_array`. I guess that's because it mirrors what happens in usual ufuncs, since the out operand isn't generally the input there.
and 1 other commented on an outdated diff May 27, 2013
numpy/core/src/umath/ufunc_object.c
 + + /* + * Set up data pointers for either one or two input operands. + * The output data pointer points to the first operand data. + */ + dataptr[0] = iter->dataptr; + if (iter2 != NULL) { + dataptr[1] = PyArray_ITER_DATA(iter2); + dataptr[2] = iter->dataptr; + } + else { + dataptr[1] = iter->dataptr; + dataptr[2] = NULL; + } + + iternext = NpyIter_GetIterNext(iter_buffer, NULL);
seberg NumPy member: You can move this outside the loop, I am sure.

jayvius: I'm not sure about this. The data pointer of each iterator is updated each time through the loop, so I think we need to update the dataptr array inside the loop.

seberg NumPy member: Didn't try, but I would think you can move the iternext definition outside, since the function should just work with whatever dataptrs it sees.

jayvius: Ah, I see what you mean now.
commented on an outdated diff May 27, 2013
numpy/core/src/umath/ufunc_object.c
 + NPY_END_THREADS; + } + + PyArray_MapIterNext(iter); + if (iter2 != NULL) { + PyArray_ITER_NEXT(iter2); + } + + i--; + } + + return op1; + +fail: + + if (op2_array != NULL)
 seberg NumPy member You can use Py_XDECREF instead of all the NULL checks.
commented on an outdated diff May 27, 2013
numpy/core/src/umath/ufunc_object.c
 + dataptr[1] = iter->dataptr; + dataptr[2] = NULL; + } + + iternext = NpyIter_GetIterNext(iter_buffer, NULL); + if (iternext == NULL) { + NpyIter_Deallocate(iter_buffer); + goto fail; + } + + /* Reset NpyIter data pointers which will trigger a buffer copy */ + NpyIter_ResetBasePointers(iter_buffer, dataptr, NULL); + buffer_dataptr = NpyIter_GetDataPtrArray(iter_buffer); + + if (!needs_api) { + NPY_BEGIN_THREADS;
 seberg NumPy member I have no idea, but if NPY_BEGIN_THREADS adds some real overhead it might be misplaced in the inner loop? I am not sure if it can be moved outside the inner loop though since MapIter might need it for some reason?
NumPy member

@charris Sorry, got sidetracked by other things. I'll address these remaining issues today.

NumPy member

@jayvius Needs a rebase also.

I would like a test where the buffering is tested, though — non-native byte orders or such; I didn't see one when I looked over it quickly.

I'm not sure what you mean by this. Can you elaborate? I did add a test for mixed dtypes which tests buffering after casting to a new dtype.

NumPy member

I meant dtypes such as

```
dt = np.dtype(np.intp).newbyteorder()
index = np.array([1, 2, 3], dt)
dt = np.dtype(np.float_).newbyteorder()
values = np.array([1, 2, 3, 4], dt)
```

testing that that works should be good enough.
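A complete version of such a test might look like the following sketch (the arrays and expected values are assumed for illustration, not taken from the PR):

```python
import numpy as np

# non-native byte order forces ufunc.at through its buffering/casting path
index = np.array([1, 2, 2], np.dtype(np.intp).newbyteorder())
values = np.array([1.0, 2.0, 3.0], np.dtype(np.float64).newbyteorder())

a = np.arange(10, dtype=np.float64)
# a[1] += 1; a[2] += 2 + 3 (duplicate index accumulates)
np.add.at(a, index, values)
print(a)   # -> [0. 2. 7. 3. 4. 5. 6. 7. 8. 9.]
```

Both the index and the value arrays are byte-swapped here, so the buffered copy/cast machinery has to do real work rather than passing pointers straight through.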

I think all the above comments have been addressed now. Would it be good to squash all this into a single commit?

NumPy member

@jayvius: I don't care that much about squashing either way, but before I read the patch can you confirm that the stuff here got addressed? #2821 (comment)

I don't care that much about squashing either way, but before I read the patch can you confirm that the stuff here got addressed? #2821

Yep, just confirmed all those issues have been addressed.

NumPy member

Considering that this has had a lot of back-and-forth editing, I would somewhat prefer squashing, but I don't mind much. I will try to play around with it a bit and have another quick look over it today. So as far as I can see, we can merge this soon and have a nice new feature! (Though at some point it would be nice to have a faster mapiter and use it here.)

NumPy member

OK, so this seems fine to me. It is slower than I expected, so if we care about speed, maybe we should rethink that abuse and do it by hand (which shouldn't be much harder), and e.g. disable the dance completely when it is not necessary...

Now that it works okay and we've hopefully shaken all the bugs and edge cases out, I'll go back and look at performance improvements (as a separate pull request if that's okay with everyone).

NumPy member

Sounds fine to me. Does anyone still have any comments about the API as such? Since we have to get that right.

Just noticed that there is a memory leak somewhere; my guess is that all those old-style iterators also need some cleanup:

```
a = np.arange(2000)
while True:
    ...
```

slowly eats RAM.

Using only a single slice does not work, but I am willing to ignore that and leave it until the indexing itself gets cleaned up a bit. Because it takes a slightly different code path, it also requires a tuple and not a list of indices. I am not sure if that is a drag (since I think one shouldn't use lists anyway, it feels more like a difference from usual indexing than a problem, but of course differences should be avoided).

and 1 other commented on an outdated diff Jul 23, 2013
```
+    Performs operation in place on array for items specified by indices.
+    For addition ufunc, this method is equivalent to a[indices] += b,
+    except that results are accumulated for indices listed more than once.
+    This solves the problem with a[indices] += b where each time a duplicate
+    index is encounted, the increment is performed on the original element.
+    As a result, an element that appears three times in the fancy indexing
+    list will only be incremented once in the final result, whereas with the
+    new 'at' method the original element will be incremented by three in the
+    final result.
+
+    Parameters
+    ----------
+    a : array_like
+        The array to act on.
+    indices : array_like
+        Paired indices, comma separated (not colon), specifying slices to
```
seberg NumPy member: I am not sure what to write, but I honestly don't understand anything of this right now. It should probably say `indices : array_like or tuple of indexing objects` or even just `indexing object`? And maybe then just refer to indexing in the longer description?

jayvius (via email, Jul 23, 2013): Is it just the indices parameter that's unclear, or the entire description? …

seberg NumPy member: Honestly, the whole thing. The "comma separated" sounds like it hints at tuples, and I currently have no idea what "slices to reduce" means. Of course describing indexing is hard... but I guess we don't need to here. I am not a great doc writer...

jayvius: Now that I'm taking a closer look at that "indices" line, I have no idea what I was trying to say. I must have been half asleep.
commented on an outdated diff Jul 23, 2013
```diff
 @@ -5746,6 +5746,50 @@ def luf(lamdaexpr, *args, **kwargs):
      """))

+add_newdoc('numpy.core', 'ufunc', ('at',
+    """
+    at(a, indices, b=None)
+
+    Performs operation in place on array for items specified by indices.
+    For addition ufunc, this method is equivalent to a[indices] += b,
+    except that results are accumulated for indices listed more than once.
+    This solves the problem with a[indices] += b where each time a duplicate
+    index is encounted, the increment is performed on the original element.
+    As a result, an element that appears three times in the fancy indexing
```
 seberg NumPy member General question (I really don't know). Do we still call it fancy indexing or was that changed to advanced indexing at some point? It sometimes seemed like that to me.
commented on an outdated diff Jul 23, 2013
```diff
+    This solves the problem with a[indices] += b where each time a duplicate
+    index is encounted, the increment is performed on the original element.
+    As a result, an element that appears three times in the fancy indexing
+    list will only be incremented once in the final result, whereas with the
+    new 'at' method the original element will be incremented by three in the
+    final result.
+
+    Parameters
+    ----------
+    a : array_like
+        The array to act on.
+    indices : array_like
+        Paired indices, comma separated (not colon), specifying slices to
+        reduce.
+    b : array_like or scalar
+        Second operation for ufuncs requiring two operands.
```
seberg (NumPy member): "Second operand for"... Also, a scalar is array_like too, so "or scalar" could be dropped.
commented on an outdated diff Jul 23, 2013
```diff
 @@ -5746,6 +5746,50 @@ def luf(lamdaexpr, *args, **kwargs):
      """))

+add_newdoc('numpy.core', 'ufunc', ('at',
+    """
+    at(a, indices, b=None)
+
+    Performs operation in place on array for items specified by indices.
+    For addition ufunc, this method is equivalent to a[indices] += b,
+    except that results are accumulated for indices listed more than once.
+    This solves the problem with a[indices] += b where each time a duplicate
+    index is encounted, the increment is performed on the original element.
+    As a result, an element that appears three times in the fancy indexing
+    list will only be incremented once in the final result, whereas with the
+    new 'at' method the original element will be incremented by three in the
```
seberg (NumPy member): Maybe make the example `a[indices] += 1` so this adds up better, or say "incremented three times"? Also not sure about the "new"; that will read as wrong long before anyone bothers to change the docs. Might add backticks to `a[indices] += 1`.
commented on an outdated diff Jul 23, 2013
```diff
+    list will only be incremented once in the final result, whereas with the
+    new 'at' method the original element will be incremented by three in the
+    final result.
+
+    Parameters
+    ----------
+    a : array_like
+        The array to act on.
+    indices : array_like
+        Paired indices, comma separated (not colon), specifying slices to
+        reduce.
+    b : array_like or scalar
+        Second operation for ufuncs requiring two operands.
+
+    Examples
+    --------
```
seberg (NumPy member): The examples should be doctest style I think, like:

```
Set items 0 and 1 to their negative values:

>>> a = np.array([1, 2, 3, 4])
>>> np.negative.at(a, [0, 1])
array([-1, -2, 3, 4])

Increment items 0 and 1 once and item 2 twice:

>>> a = np.array([1, 2, 3, 4])
>>> np.add.at(a, [0, 1, 2, 2], 1)
array([2, 3, 5, 4])
```

etc.
and 1 other commented on an outdated diff Jul 24, 2013
numpy/core/src/umath/ufunc_object.c
```diff
+
+        PyArray_MapIterNext(iter);
+        if (iter2 != NULL) {
+            PyArray_ITER_NEXT(iter2);
+        }
+
+        i--;
+    }
+
+    if (!needs_api) {
+        NPY_END_THREADS;
+    }
+
+    NpyIter_Deallocate(iter_buffer);
+
+    return op1;
```
seberg (NumPy member): An API opinion I just noticed. I think we should not return the operand. `ufunc.at` is always in place, so returning the operand is not the usual Python design.

jayvius: Good point. Also, shouldn't I be decref-ing op2_array, iter, and iter2 before returning successfully, like in the "fail" section? I think that's part of the reason for the memory leak, but when I add some decrefs here, Python crashes during the finalize/gc stage after my test script runs.

seberg (NumPy member): Sounds right to me (well, XDECREF for op2_array and nothing for op1); not sure why the refcounting would get broken. The dtypes may still need DECREFing, too.

Fixed memory leaks and updated documentation.

and 1 other commented on an outdated diff Aug 5, 2013
```diff
+        Array like index object or slice object for indexing into first
+        operand. If first operand has multiple dimensions, indices can be a
+        tuple of array like index objects or slice objects.
+    b : array_like
+        Second operand for ufuncs requiring two operands. Operand must be
+        broadcastable over first operand after indexing or slicing.
+
+    Examples
+    --------
+    Set items 0 and 1 to their negative values:
+
+    >>> a = np.array([1, 2, 3, 4])
+    >>> np.negative.at(a, [0, 1])
+    array([-1, -2, 3, 4])
+
+    ::
```
seberg (NumPy member): Not sure if this is necessary/should be there, I mean the `::`. But I am not a doc specialist.

jayvius: I believe the double colon marks the end of a code example and the beginning of a new one.
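As a sketch of the broadcasting rule described in the diff above (`b` must be broadcastable over the first operand after indexing), with illustrative values:

```python
import numpy as np

# b has shape (3,) and broadcasts over the selected rows a[[0, 1]],
# which have shape (2, 3).
a = np.zeros((2, 3), dtype=int)
b = np.array([1, 2, 3])
np.add.at(a, [0, 1], b)
print(a)  # both rows become [1, 2, 3]
```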
commented on an outdated diff Aug 5, 2013
```diff
+        The array to perform in place operation on.
+    indices : array_like or tuple
+        Array like index object or slice object for indexing into first
+        operand. If first operand has multiple dimensions, indices can be a
+        tuple of array like index objects or slice objects.
+    b : array_like
+        Second operand for ufuncs requiring two operands. Operand must be
+        broadcastable over first operand after indexing or slicing.
+
+    Examples
+    --------
+    Set items 0 and 1 to their negative values:
+
+    >>> a = np.array([1, 2, 3, 4])
+    >>> np.negative.at(a, [0, 1])
+    array([-1, -2, 3, 4])
```
seberg (NumPy member): Since we return None now, we will have to add a line `>>> a` here to "print" the result.
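A quick runnable check of this point: `ufunc.at` mutates its first operand and returns `None`, so the doctest has to evaluate `a` afterwards to show the result.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
result = np.negative.at(a, [0, 1])

print(result)  # None: 'at' operates purely in place
print(a)       # array([-1, -2,  3,  4])
```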
commented on the diff Aug 5, 2013
numpy/core/src/umath/ufunc_object.c
```diff
+     * puts result in buffer.
+     */
+    do {
+        innerloop(buffer_dataptr, count, stride, innerloopdata);
+    } while (iternext(iter_buffer));
+
+    PyArray_MapIterNext(iter);
+    if (iter2 != NULL) {
+        PyArray_ITER_NEXT(iter2);
+    }
+
+    i--;
+    }
+
+    if (!needs_api) {
+        NPY_END_THREADS;
```
seberg (NumPy member): Question (I really don't know): What happens if we have an object array and we get an error during the ufunc call? Such as:

```
a = np.array([1, 2], dtype=object)
b = np.array(['a'])
np.add.at(a, [0], b)
```

When api is needed, the inner loop can return an error, I think? I'm not sure how this is handled in the ufunc machinery, but I suspect we may need to add some magic in that case.

seberg (NumPy member): I guess we can live with `object_array += something` leaving `object_array` in an inconsistent state upon an error? The same will be the case for `ufunc.at(object_array, indices, something)`. Just thought there was an API question here, but `+=` also just stops when the first error occurred, so I think we can copy that behaviour, or does anyone disagree?

njsmith (NumPy member, via email, Aug 5, 2013): Yeah, that sounds right to me.
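A runnable sketch of the failure mode discussed above (the exact exception message varies by NumPy version): the object-dtype inner loop raises as soon as the Python-level `+` fails, and since the very first operation fails here, the input array is left unchanged, matching the stop-at-first-error behaviour of `+=`.

```python
import numpy as np

a = np.array([1, 2], dtype=object)
b = np.array(['a'], dtype=object)

try:
    np.add.at(a, [0], b)   # 1 + 'a' raises TypeError inside the inner loop
    raised = False
except TypeError:
    raised = True

print(raised)  # True
print(a)       # the failed call did not modify the input array
```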
NumPy member

Sorry, rather busy lately. I hadn't realized before the possible problem with errors in the ufunc machinery when the api is needed, but modulo that, I guess we can merge it and try to improve performance later.

commented on the diff Aug 9, 2013
numpy/core/src/umath/ufunc_object.c
```diff
+
+    if (!needs_api) {
+        NPY_END_THREADS;
+    }
+
+    NpyIter_Deallocate(iter_buffer);
+
+    Py_XDECREF(op2_array);
+    Py_XDECREF(iter);
+    Py_XDECREF(iter2);
+    Py_XDECREF(array_operands[0]);
+    Py_XDECREF(array_operands[1]);
+    Py_XDECREF(array_operands[2]);
+    Py_XDECREF(errobj);
+
+    if (needs_api && PyErr_Occurred()) {
```
seberg (NumPy member): Hmm, I think the other ufuncs already stop during calculation instead of at the end? Not sure if it matters; it might need code duplication to do it in the loop, but... Could you add a test for this too? Something like `np.add.at(np.array(['a', 1], dtype=object), [0, 1], 1)` gives an error (and maybe does not change the 1).

Added test to make sure input array is unchanged if exception is thrown.

NumPy member

@jayvius Needs a rebase, probably `1.8.0-notes.rst` is causing the problem. The 1.8 branch will be Aug 18.

added some commits Oct 4, 2012
All commits by jayvius:

- `925220b` Proof-of-concept for ufunc fancy indexing (conflicts: `numpy/core/code_generators/numpy_api.py`)
- `7191df0` Add support for both unary and binary ufuncs
- `025a056` Add comments
- `99fa26c` Change 'select' method name to 'at'
- `e1fa3d7` Add unit tests (conflicts: `numpy/core/tests/test_ufunc.py`)
- `f79e56d` Check for sucessful call to MapIterBind
- `dcea500` Misc fixes (fix crash when bind for second map iter fails; accept array like object for second operand; fix memory leaks)
- `7b551e9` Rework 'at' method to fit customer's expectations ('at' method should be functionally equivalent to op1[idx] += op2, for the add ufunc)
- `6bf50f1` Update/expand unit tests
- `04c341d` Update documentation for 'at' method
- `56d74a4` Add comment to 'at' method unit test
- `d94cd57` Only allow explicit index slicing in 'at' method
- `7dde42a` Correct implementation of 'at' method that covers all corner cases
- `0eafea7` Add more unit tests for 'at' method
- `cf9452f` Code cleanup for 'at' method
- `a8d13b8` Add more unit tests for 'at' method
- `5b7cda0` Fix issue with 'at' method when indices are all slice objects
- `f63eb17` Add documentation for new 'at' ufunc method
- `0d087dc` Fix build issue
- `d6fa103` Fix for crash when PyArray_MapIterSwapAxes is called (set up default iteraxes array values in call to MapIterBind())
- `1b81be7` Add more comments
- `13c1847` Reword some comments/documentation
- `45419ea` Replace calls to New/Bind/Reset with MapIterArray
- `2e2d99a` Remove keyword support for required arguments
- `2a8c04b` Let indexing code do error checking for arguments with mismatched dimensions
- `6d7b51a` Improve comment for creating iter object from second operand
- `f9a3e17` Add tests for 0-dim array
- `e4cbcb2` Just create an array object for second operand; don't cast to dtype of first operand
- `da3f8cb` Call iterator_loop instead of directly calling ufunc (iterator_loop will call the ufunc and handle any buffering for cases with mixed dtypes)
- `1e786ea` Add test for mixed dtypes
- `5dea4ca` Add test for boolean indexing and boolean ufunc
- `4b33849` Fix DeprecationWarning on python 3.3
- `2e65204` Another attempt at fixing DeprecationWarning
- `d467d49` Maybe this will fix the DeprecationWarning
- `3820cf3` Use NpyIter object to buffer each input value (in order to handle certain situations such as operands with mixed dtypes)
- `b259cdc` Revert "Call iterator_loop instead of directly calling ufunc." (reverts commit 2a9d5577a087e664ee047b3e099c1355000d8661)
- `f899c19` Clean up previous comment and add test for unary ufunc
- `9438a8f` More clean up
- `ea2e1f5` Remove unnecessary check
- `c26e548` Reverse commit 6ec51b3
- `8f22733` Fix deprecation warning
- `f917e38` Use Py_XDECREF in fail section
- `fcc891d` Add NpyIter_Deallocate
- `2c37d3b` Add test for empty subspace
- `b440929` Update docs
- `a4bb3bc` Add test for non native bytetypes
- `6c249cd` Move NpyIter_IterNextFunc call outside of loop
- `5729685` Move gil release/aquire outside of loop
- `1e4d589` Add decrefs at end of function and return None object
- `336151e` Fix memory leaks and inc dtype refs
- `79bc97c` Update docs
- `7c4eeea` Update docs
- `a04bb37` Update docs
- `2ec4c94` Check for python error when needs_api flag is set
- `ec9d9cc` Bail out of loop if python error occurs
- `acef718` Change casting type for ufunc operands

@charris Rebased.

NumPy member

@seberg Sounds like this is ready. Yes, No? Resistance is futile.

NumPy member

:), yeah, I think it is ready.

NumPy member

In it goes. Thanks Jay. And thanks to @seberg and @njsmith for the exhaustive review.

closed this Aug 16, 2013
reopened this Aug 16, 2013
merged commit `d4af612` into numpy:master Aug 16, 2013

1 check passed

Details default The Travis CI build passed