# Broken `__iadd__` behavior on views (#6119)

## Comments
The problem here is that numpy performs the in-place operation while iterating over both operands, so elements of the right-hand side can be read after they have already been overwritten.
My understanding of this issue was that it indeed is a bug, with the correct behavior being as if a copy of the right-hand side were made first.
Not completely the same --- consider …
I would already be happy if the arrays with ndim > 1 were covered. Are there any other important and safe use cases?
I don't recall seeing too many "safe" examples given, although there may be some additional ones in old threads on the mailing list. I suspect there's not much (correct) code that actually relies on this as a performance optimization, as it is cumbersome to reason about, but it AFAIK worked like this also in Numeric (pre-2006), so some may exist.
I would be a little wary of providing features, or documenting behaviors, that rely on implementation details like array traversal order. It may be shooting ourselves in the foot if someone ever comes up with some iterator optimization scheme that needs to change this, and we are stuck with backwards compatibility. You can get the original example to work by playing around with the iterator buffer size, but it's the kind of thing that should come with a big fat DO NOT TRY THIS AT HOME sign.
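For illustration (not from the thread), here is what "playing around with the iterator buffer size" amounts to. The described sensitivity applies to the numpy of this era (~1.9), where a ufunc buffered the transposed operand in chunks of 8192 elements by default; that is why the original `h += h.T` example flips from right to wrong around `n = 90` (90·90 ≈ 8192). Current numpy copies overlapping operands up front, so this knob no longer changes the result:

```python
import numpy as np

np.setbufsize(512)             # shrink the ufunc buffer (size must be a multiple of 16)
n = 50
h = np.random.rand(n, n)
h += h.T                       # on numpy ~1.9, correctness hinges on buffering details
print(np.abs(h - h.T).max())   # 0.0 on current numpy; may be of order 1 on 1.9 here
np.setbufsize(8192)            # restore the default buffer size
```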
@jaimefrio: the suggestion here (gh-1683) was to have it always behave as if a copy of the RHS was made. The ufunc implementation in Numpy knows what order it is going to do the traversal in (and about buffering etc.), and can skip the copies in some cases. The current behavior is implementation-defined, but as seen from the issue being discussed at regular intervals, it's fairly surprising.
If the current behavior does qualify as a bug, then wouldn't the optimal resolution be to first fix it, while only accounting for the most common safe case? On a related note, is there a way to automatically test a few thousand lines of an existing codebase for possible appearances of this undesired behavior?
Yes --- given the assumption that it is a bug and not a wontfix. (However, allowances for the 1D cases are probably fairly simple to implement.) At runtime, you can use `np.may_share_memory` to flag potentially aliased operands.
Firstly: should I continue the discussion here (the issue is still closed), in #1683 (same issue, still open), or here: http://thread.gmane.org/gmane.comp.python.numeric.general/60880/focus=60884 ? I would strongly argue for considering it a bug, for the following reasons: …
I see two potentially productive action items: (a) improve the docs, (b) …
Sorry, which example? I don't see any other messages from you in this issue discussion.
@akhmerov I'm quite sure that @njsmith is referring to this kind of thing:

```python
>>> import numpy as np
>>> a = np.ones((4, 4))
>>> a
array([[ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])
>>> a[:, 0] += a[:, 1]
>>> a
array([[ 2.,  1.,  1.,  1.],
       [ 2.,  1.,  1.,  1.],
       [ 2.,  1.,  1.,  1.],
       [ 2.,  1.,  1.,  1.]])
>>> np.may_share_memory(a[:, 0], a[:, 1])
True
```

In this example we are adding the second column of `a` to the first. The two columns have no elements in common, yet `np.may_share_memory` returns `True`, since it only compares overall memory bounds, and the bounds of two interleaved columns do overlap.

For what it's worth, I'd personally be OK with a policy where …

Also I don't understand @njsmith's suggestion that "what we need here is a …"
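For reference (not from the thread), a rough sketch of what a bounds-only test like `np.may_share_memory` checks; `bounds_overlap` is a hypothetical name:

```python
import numpy as np

def bounds_overlap(a, b):
    """Do the byte ranges spanned by the two arrays intersect?

    This mirrors the bounds-only semantics of np.may_share_memory: it
    says nothing about whether any actual element addresses coincide,
    hence the false positive for two interleaved columns.
    """
    def extent(x):
        lo = hi = x.__array_interface__['data'][0]
        for stride, dim in zip(x.strides, x.shape):
            if dim == 0:
                return lo, lo          # empty array spans no bytes
            if stride < 0:
                lo += stride * (dim - 1)
            else:
                hi += stride * (dim - 1)
        return lo, hi + x.itemsize
    a_lo, a_hi = extent(a)
    b_lo, b_hi = extent(b)
    return a_lo < b_hi and b_lo < a_hi

a = np.ones((4, 4))
print(bounds_overlap(a[:, 0], a[:, 1]))  # True, yet no element is shared
```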
I believe that @njsmith is suggesting that we'd need an overlap check with no false positives --- a `definitely_share_memory()`, so to speak.
I'm not sure it's productive to insist on `definitely_share_memory()`, given that the scaling is non-polynomial (in the number of dimensions), so it's likely going to be slow for high-dimensional arrays. For a compiler this might be OK, but Numpy has to do the calculation on the fly.
The simplest solution might be to use copy semantics whenever there is a possible overlap. Less efficient in some cases, but that is pretty much my standard procedure to avoid problems. The same might be appropriate for the `out` parameter of binary ufuncs. @njsmith I mentioned the memcpy case because we don't have documented behavior for the one-dimensional case. If we want to keep that behavior, it should be documented and tested.
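A minimal sketch of that copy-on-possible-overlap policy, applied by hand; `safe_iadd` is a hypothetical helper, not a numpy API:

```python
import numpy as np

def safe_iadd(out, rhs):
    # Copy-on-possible-overlap: if the operands might alias, work on a
    # private copy of the right-hand side. may_share_memory errs on the
    # side of True, so at worst this makes an unnecessary copy, never
    # producing a wrong answer.
    if np.may_share_memory(out, rhs):
        rhs = rhs.copy()
    out += rhs
    return out

h = np.random.rand(100, 100)
safe_iadd(h, h.T)             # well-defined even though h.T aliases h
print(np.abs(h - h.T).max())  # 0.0: the result is exactly symmetric
```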
Sorry about getting confused about the different issue threads there.
We have a firm upper bound on the number of dimensions, and almost all arrays have a tiny number of dimensions. Without an implementation to test, I wouldn't discount it as slow out of hand.
To be clear, I don't necessarily oppose making copies just-in-case myself, …
Would it seem as plausible to improve `may_share_memory`?
Ignoring the variable details of order of access, the contiguous case should be easy to handle. If the step size is fixed throughout, that is sort of like the contiguous case. The complicated cases can be done (I think) using GCD and checks, or we could simply make a copy in that situation.
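A toy sketch of that GCD idea for the fixed-step (1-D) case, under the simplifying assumption that overlap means exact element-address coincidence; the function name and fallback search are illustrative only:

```python
from math import gcd

def elements_coincide_1d(off1, s1, n1, off2, s2, n2):
    # Do {off1 + s1*i : 0 <= i < n1} and {off2 + s2*j : 0 <= j < n2}
    # share an address? The unbounded equation s1*i - s2*j = off2 - off1
    # has integer solutions iff gcd(s1, s2) divides off2 - off1; if not,
    # the views cannot overlap and no search is needed.
    if (off2 - off1) % gcd(s1, s2) != 0:
        return False
    # Otherwise fall back to a direct check (fine for a sketch; a real
    # implementation would solve the bounded problem analytically).
    addresses2 = {off2 + s2 * j for j in range(n2)}
    return any(off1 + s1 * i in addresses2 for i in range(n1))

# Example: byte offsets/strides of a[0::2] and a[1::2] for 8-byte elements.
print(elements_coincide_1d(0, 16, 50, 8, 16, 50))  # False: 16 does not divide 8
```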
I dunno, maybe! The thing to do is to try it :-)
For what it's worth, I'd personally still favor copying if the possibility of aliasing is detected, even without waiting for someone to take up this challenge to improve `may_share_memory`.
I think the priority should be correctness rather than efficiency. Here we have an example of someone getting wrong answers for reasons that were not obvious to him: code that was correct with one set of variables was not correct for another. That is probably not the sort of reputation that we want to develop.
So make the case on the list?
The usual structure is that (permute to sorted strides) `stride[i]*shape[i] <= stride[i+1]`, in which case every index maps to a distinct address.
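For concreteness, a rough check of that structure; `has_unique_addresses` is a hypothetical name, and the condition is a sufficient one rather than exact:

```python
import numpy as np

def has_unique_addresses(x):
    # Sort dimensions by |stride| and require that each stride jumps past
    # everything the smaller-stride dimensions (plus the itemsize) can
    # reach. Under this "sorted strides" structure every index maps to a
    # distinct address, which is the easy case discussed above.
    dims = sorted(zip((abs(s) for s in x.strides), x.shape))
    reach = x.itemsize                  # bytes reachable so far
    for stride, dim in dims:
        if dim > 1 and stride < reach:
            return False                # this dimension can revisit used bytes
        reach += stride * (dim - 1)
    return True

a = np.ones((4, 4))
print(has_unique_addresses(a))                              # True (contiguous)
print(has_unique_addresses(np.broadcast_to(a, (3, 4, 4))))  # False (stride 0)
```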
Here's a chapter that claims to provide an algorithm for solving bounded linear Diophantine equations: …
I think intersection of (shifted?) lattices might be the thing to look at first. One problem is likely to be integer overflow, as some of the basis vectors of the intersection can be very, very large. In any case, that would give all common points, and from there it might be possible to check for those in bounds in a relatively efficient manner. I don't know if any of that would be efficient enough, however. And while the strict lattice case is straightforward, the shifted case looks to be tricky. I suspect William Stein is the guy to ask here, he did his PhD under Hendrik Lenstra ;)
I read the Ramachandran paper some time ago. If the runtimes there are in seconds, it was of order 10 s for 9 variables (Java implementation).
@charris: If I understood correctly, the unbounded problem is not hard, but the bounded one is.
Yeah, IIUC finding the space of all possible solutions to a linear Diophantine equation is basically trivial (just apply Euclid's algorithm repeatedly, in polynomial time), but checking whether that space intersects with the bounds is still NP-hard. For our purposes, though, there are probably a lot of cases where solving the unbounded equation will immediately reveal that overlap is impossible. If the two arrays are aligned and have same-size elements, then one can simplify out the element-size entry in the equation, and if you can do that then you can immediately rule out cases like …
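The "trivial" unbounded step can be made concrete with extended Euclid (a sketch; the stride and offset numbers below are made up for illustration):

```python
def ext_gcd(a, b):
    # Extended Euclid: returns (g, x, y) with a*x + b*y == g == gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

# Overlap equation for two strided views: s1*i - s2*j == off2 - off1.
# If gcd(s1, s2) does not divide the offset difference, there is no
# integer solution at all, bounded or not, so overlap is impossible.
s1, s2, diff = 48, 32, 8
g, x, y = ext_gcd(s1, s2)
print(g, diff % g == 0)   # 16 False: these two views cannot overlap
```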
I think we could subdivide the ranges of each array, say into the subarrays indexed by the index with the largest stride, and check for possible overlap between pairs. Then recurse on the possible pairs until the subarrays are small enough to check explicitly. Of course, that doesn't solve the problem of when it is OK to use arrays with common element addresses. It needs a bit of cleverness to avoid blowup of the n**2 variety, but that might be possible by subdividing just the largest (sub)arrays each time into, say, at most four parts. Stride-trick arrays pretty much mess up everything, as any element can have multiple indices. But in the case of unique indices I think it is doable.
Here's something to play with: https://gist.github.com/pv/fe407dee3170a7e2243a
In fact, in the unique index case the element addresses can be accessed in sorted order, so the upper bound on comparisons is O(n1 + n2); it's like merging two sorted lists.
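A sketch of that merge step, assuming both address sequences are already produced in sorted order; `sorted_streams_overlap` is a hypothetical name:

```python
def sorted_streams_overlap(addrs1, addrs2):
    # Merge step of mergesort, stopping at the first common address:
    # O(n1 + n2) comparisons when both inputs are sorted, which is
    # possible whenever indices map uniquely to addresses.
    i = j = 0
    while i < len(addrs1) and j < len(addrs2):
        if addrs1[i] == addrs2[j]:
            return True
        if addrs1[i] < addrs2[j]:
            i += 1
        else:
            j += 1
    return False

print(sorted_streams_overlap([0, 32, 64], [8, 40, 72]))   # False
print(sorted_streams_overlap([0, 32, 64], [16, 32, 48]))  # True
```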
@pv: from a quick skim, it looks like that code is currently not taking the itemsize into account?
Yes, it wasn't considering itemsize; added now as an extra dimension. The unique index cases (usual strided arrays) are probably easy for the GCD algorithm, which it might be useful to actually show. The worst case scales exponentially with dimension, and by random trial you can easily find cases in the 0.1 sec range (for the C code) at 10 variables. I don't think these correspond to something you'd ever have to solve when dealing with Numpy arrays, but I'd prefer to cap the maximum work done in order to not add potentially exponentially long runtimes.
Would it make sense to upgrade the gist to a repo? I guess it's a work in progress to reduce the false positive rate of `may_share_memory`?
@argriffing: the gist actually is a repository that you can clone etc. Assuming the algorithm is correct and I didn't flub the integer overflows, the next step would be to use it for may_share_memory, and after that reuse the logic for ufuncs.
This has been added to the cython `may_share_memory`, but not to the maybe.py testing code, right? Also, I think it causes an issue with the scalar detection: scalars with itemsize larger than 1 are no longer detected as scalars?
Here goes, not yet for ufuncs though: gh-6166
## Original report

Using `+=` with a view of the same array is sometimes broken. Running the reproduction script produces a bunch of zeros (the correct result), followed by a bunch of finite numbers of order 1 (wrong). For me the result is wrong for `n > 90`. Replacing `h += h.T` with `h = h + h.T` or anything equivalent fixes the issue. I have verified that the issue appears in a Linux pip installation of numpy v1.9.2 in Python 2 as well as Python 3. In particular, it is reproducible on http://try.jupyter.org.
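The reproduction script itself did not survive extraction; a minimal sketch consistent with the description (the names and loop bounds are guesses) would be:

```python
import numpy as np

# Compare in-place h += h.T against out-of-place h = h + h.T for growing n.
# On numpy 1.9.2 the printed difference is 0.0 for small n and of order 1
# once n exceeds ~90; current numpy copies overlapping operands, so the
# difference stays 0.0 throughout.
for n in range(2, 120):
    h = np.random.rand(n, n)
    expected = h + h.T      # out-of-place, always correct
    h += h.T                # in-place on a transposed view of itself
    print(n, np.abs(h - expected).max())
```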