Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: umath: ensure ufuncs are well-defined with memory overlapping inputs #8043

Merged
merged 33 commits into from Feb 16, 2017

Conversation

Projects
None yet
7 participants
@pv
Copy link
Member

commented Sep 11, 2016

Make ufuncs and ufunc operations to make temporary copies if input and output operands overlap in memory. Currently (cf. gh-6272, gh-1683) output from such operations is undefined. With this PR, the result is the same as if the operands were non-overlapping.

Fixes gh-1683

Adds new flags to NpyIter, and one new API function is added to deal with MapIter.

This PR does not fix up overlap handling in array assignment, that needs to be done later (and is slightly more hairy --- some common overlapping cases are in fact already handled). Unlike adding overlap-if-copy flag to NpyIter, it appears a similar flag would not be as useful in MapIter, as there are many special-cased code paths, and MapIter is not always made aware of all arrays operated on.

  • reread whole patch
  • re-check test coverage, and whether some of the tests are unnecessary
PyArray_ITEMSIZE(arr)))

static NPY_INLINE int
PyArray_EQUIVALENTLY_ITERABLE_OVERLAP_OK(PyArrayObject *arr1, PyArrayObject *arr2,

This comment has been minimized.

Copy link
@charris

charris Nov 6, 2016

Member

Seems rather large for inline ;)

This comment has been minimized.

Copy link
@pv

pv Jan 22, 2017

Author Member

OTOH, I don't really want to put it in a .c file since the logic is tightly coupled to the macros here (and I suspect the multiarray/umath split adds to the mess). Removing the inline makes it emit compiler warnings. I could write this as a macro, but that's messy.

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

C++ beckons ;)

@@ -4,9 +4,11 @@
import itertools

import numpy as np
from numpy.testing import run_module_suite, assert_, assert_raises, assert_equal
from numpy.testing import (run_module_suite, assert_, assert_raises, assert_equal, assert_array_equal,

This comment has been minimized.

Copy link
@charris

charris Nov 6, 2016

Member

Line too long ?

@@ -344,6 +344,8 @@
# End 1.9 API
'PyArray_CheckAnyScalarExact': (300, NonNull(1)),
# End 1.10 API
# End 1.11 API

This comment has been minimized.

Copy link
@charris

charris Nov 6, 2016

Member

Should update this now that 1.12 is branched. The cversion and C-API number also need updating, see documentation in doc/HOWTO_RELEASE.rst.txt.

@charris

This comment has been minimized.

Copy link
Member

commented Nov 6, 2016

Just starting to look at this.

NpyIter_Deallocate(iter);
return -1;
}
Py_DECREF(op_tmp);

This comment has been minimized.

Copy link
@pv

pv Nov 12, 2016

Author Member

Now that I look at this again, I guess op_tmp should be placed back into op instead of being discarded.

In the memory overlap checks done when ``NPY_ITER_COPY_IF_OVERLAP``
is specified, consider this array as overlapping even if it is
exactly the same as another array.

This comment has been minimized.

Copy link
@charris

charris Dec 6, 2016

Member

?. What other array? I think the operation here could be best explained with the help of some diagrams showing the data access and writebacks. The operation can be performed, and is valid, even if there is no overlap, just that it isn't necessary if no overlap.

This comment has been minimized.

Copy link
@charris

charris Dec 6, 2016

Member

I think this might be help clarify behavior when the indexing is many to one in either array.

operation is the same as if all operands were copied.
In cases where copies would need to be made, **the result of the
computation may be undefined without this flag!**

This comment has been minimized.

Copy link
@charris

charris Dec 6, 2016

Member

This is only for two operands? For inplace operations, both inputs are read, but only one is written too. Or does this apply for three operands, two inputs and one output. Or is it more general than that?

This comment has been minimized.

Copy link
@pv

pv Jan 10, 2017

Author Member

It applies to any number of operands.

* @returns 1 if memory overlap found, 0 if not.
*/
NPY_NO_EXPORT int
index_has_memory_overlap(PyArrayObject *self,

This comment has been minimized.

Copy link
@charris

charris Dec 6, 2016

Member

Hmm, so you are actually checking indexes rather than the resulting view?

This comment has been minimized.

Copy link
@pv

pv Jan 10, 2017

Author Member

It's used to checking the overlap between the array operated on, and the index array used in ufunc.at, for example

x = np.arange(100)
np.invert.at(x[::-1], x)  # <- crazy stuff if no x.copy() of the second argument

It doesn't look at the content of the indices, so it's a somewhat conservative check.

@charris

This comment has been minimized.

Copy link
Member

commented Dec 6, 2016

Just a first read through. This will need a release note eventually.

@charris

This comment has been minimized.

Copy link
Member

commented Dec 6, 2016

Probably should bring in @seberg on this also.

Test ufunc call memory overlap handling
"""

def check_unary_fuzz(self, operation, get_out_axis_size, dtype=np.int16,

This comment has been minimized.

Copy link
@pv

pv Jan 10, 2017

Author Member

I would need to check again if these tests cover all places where the copyifoverlap behavior is in, and/or whether some of the tests are unnecessary.

This comment has been minimized.

Copy link
@pv

pv Jan 22, 2017

Author Member

ok, checked --- removing any of NPY_COPY_IF_OVERLAP et al. causes some test to fail

@@ -3635,6 +3654,7 @@ PyUFunc_Reduceat(PyUFuncObject *ufunc, PyArrayObject *arr, PyArrayObject *ind,
/* In case COPY or UPDATEIFCOPY occurred */
op[0] = NpyIter_GetOperandArray(iter)[0];
op[1] = NpyIter_GetOperandArray(iter)[1];
op[2] = NpyIter_GetOperandArray(iter)[2];

This comment has been minimized.

Copy link
@juliantaylor

juliantaylor Jan 12, 2017

Contributor

why is this line added? a bug in the old code or due to the other changes?

This comment has been minimized.

Copy link
@pv

pv Jan 12, 2017

Author Member

I think it's a noop. Not necessary to add.

This comment has been minimized.

Copy link
@pv

pv Jan 19, 2017

Author Member

On second thought, I think it's useful to have here to ensure correctness, if we later on change NPY_ITER_COPY_IF_OVERLAP to make copies of also the input arrays.

pv added some commits Sep 5, 2016

ENH: NpyIter: add a flag to handle read/write operand overlap
Add a new NPY_ITER_COPY_IF_OVERLAP iterator flag to NpyIter, which
instructs it to check if read operands overlap with write operands in
memory, and make temporary copies to eliminate detected overlap.

Thanks to Sebastian Berg.
BUG: core: fix use of wrong output array in iterator_loop
These can differ if the iterator decides to make temporary copies of
output arrays.
ENH: core: add overlap detection logic to EQUIVALENTLY/TRIVIALLY_ITER…
…ABLE loops

These loops iterate over whole arrays in "trivial" order, so that it is
possible to reason about the data dependency.
ENH: umath: turn on overlap detection + copying in ufunc basic calls
After this commit, all code paths in PyUFunc_GeneralizedFunction and
PyUFunc_GenericFunction perform input/output operand memory overlap
detection, and make copies to eliminate it.
BUG: core/ufunc: fix assumptions about iterator in execute_fancy_ufun…
…c_loops

Because the wheremask loop does not write to all elements of the array, it
needs to consider output arrays as READWRITE, in case COPY_IF_OVERLAP makes
temporary copies.

As the iterator may make temporary copies of arrays, make sure to pass those to
__array_prepare__ instead of the originals.
ENH: core: handle memory overlap in ufunc.at
This adds a new method PyArray_MapIterArrayCopyIfOverlap to the API.
@pv

This comment has been minimized.

Copy link
Member Author

commented Jan 23, 2017

pv added some commits Jan 23, 2017

@charris

This comment has been minimized.

Copy link
Member

commented Jan 29, 2017

@pv Do you feel that this is about ready?

@pv

This comment has been minimized.

Copy link
Member Author

commented Feb 6, 2017

@charris: I've read this several times through now, and only finding forgotten decref on error handling. I'm fairly confident now that it's what it says on the tin, and doesn't do anything more.

It's a bit ugly that this adds PyArray_MapIterArrayCopyIfOverlap to the API --- it's essentially a private function, as was PyArray_MapIterArray --- but the better way is not obvious. On the other hand, adding the new iterator flags appears to be a good way to do this, as it makes reasoning about the change easier, and the changes in ufuncs themselves are mostly minimal. Currently, the copying is somewhat conservative in that perhaps some could be avoided (e.g. in accumulate) and the heuristics in deciding what to copy is a bit primitive, but that's something that in principle can be tightened up later.

@pv

This comment has been minimized.

Copy link
Member Author

commented Feb 6, 2017

One additional thing is that this probably won't work on PyPy due to the use of updateifcopy, but I guess that's part of a bigger pypy issue that needs separate fixes (probably sprinkling PyArray_FlushUpdateIfCopy around) in more places than just this.

@pv pv referenced this pull request Feb 8, 2017

Open

ENH: Automatic copy on overlap #6272

3 of 5 tasks complete
@charris

This comment has been minimized.

Copy link
Member

commented Feb 8, 2017

@mattip @rlamy Comments?

@rlamy

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2017

Well, I don't really understand the UPDATEIFCOPY issue, but since @mattip is on holiday, I'll comment anyway.

All the new tests appear to pass on both pypy and pypy3. I'm guessing that's because the copy cannot be seen by the RPython GC. So I think this is OK for us.

@pv

This comment has been minimized.

Copy link
Member Author

commented Feb 9, 2017

if (index_type & (HAS_FANCY | HAS_BOOL)) {
for (i = 0; i < num; ++i) {
if (indices[i].object != NULL && PyArray_Check(indices[i].object) &&
solve_may_share_memory(self, (PyArrayObject *)indices[i].object,

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Should have extra indent here to separate the condition from the block.

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Bit tricky with the long line, but it doesn't read well as is.

}

if (extra_op != NULL && PyArray_Check(extra_op) &&
solve_may_share_memory(self, (PyArrayObject *)extra_op, 1) != 0) {

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

See above.

@@ -3197,6 +3259,21 @@ PyArray_MapIterArray(PyArrayObject * a, PyObject * index)
}


/*NUMPY_API
*
* Use advanced indexing to iterate an array. Please note

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

This needs to be documented somewhere public. It seems dangerous to expose an API that may change...

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

How could someone detect the change to this particular function?

This comment has been minimized.

Copy link
@pv

pv Feb 10, 2017

Author Member

This the is old comment for PyArray_MapIterArray, not added by me.

This comment has been minimized.

Copy link
@pv

pv Feb 10, 2017

Author Member

Anyway, removing it, since it's unlikely we will change this...

@@ -2708,6 +2710,80 @@ npyiter_allocate_arrays(NpyIter *iter,
bufferdata = NIT_BUFFERDATA(iter);
}

if (flags & NPY_ITER_COPY_IF_OVERLAP) {
/* Perform operand memory overlap checks if requested */

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Could add more detail on what is being checked here, the notes below are a bit too local.

* make sense to go further.
*/
may_share_memory = solve_may_share_memory(
op[iop], op[iother], 1);

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Maybe extra indent here, or even break the args and align.

return 1;
}

/* Arrays overlapping in memory may be equivalently iterable

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Multiline comment not correctly formatted ;) Also below.

}
else if (stride1 < 0) {
arr1_ahead = (stride1 <= stride2 &&
(npy_uintp)PyArray_BYTES(arr1) <= (npy_uintp)PyArray_BYTES(arr2));

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Hmm... Out of curiousity, did the compiler complain without the cast?

This comment has been minimized.

Copy link
@pv

pv Feb 10, 2017

Author Member

The casts are not needed (probably leftovers from earlier versions).

@@ -1730,12 +1764,17 @@ execute_fancy_ufunc_loop(PyUFuncObject *ufunc,
}
}
for (i = nin; i < nop; ++i) {
/*
* Because we don't write to all elements, the output arrays
* must be considered READWRITE by the iterator.

This comment has been minimized.

Copy link
@charris

charris Feb 10, 2017

Member

Curious.

This comment has been minimized.

Copy link
@pv

pv Feb 10, 2017

Author Member

Updateifcopy copies all elements back, not only those for which wheremask is true, so the old content of the output array must be copied there.

@charris

This comment has been minimized.

Copy link
Member

commented Feb 10, 2017

Not done reviewing yet. Determining when a copy isn't needed for overlapping arrays seems tricky except in single operand the linear case. ISTR that the iterator may coalesce axes.

pv added some commits Feb 10, 2017

DOC: core: remove old comment saying public API may change
It's unlikely we will break backward compatibility, so better
not claim we will do it in the code.

@charris charris merged commit d55f40b into numpy:master Feb 16, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@charris

This comment has been minimized.

Copy link
Member

commented Feb 16, 2017

I'm going to put this in so it will start getting more testing. I don't claim to have understood all the details, but I doubt I will discover any errors by inspection. The write up in the release notes is very nice, you should consider copying that over to the official documentation. There are still some style nits that I may attempt to clean up later, I would gently urge you to pay more attention to spaces around operators ('+') and line length, even though the latter may not always be as readable.

Thanks for this work, Pauli.

@seberg

This comment has been minimized.

Copy link
Member

commented Feb 17, 2017

Wow, cool. Assuming nothing goes haywire, this is the next huge addition :)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.