-
-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: np.piecewise not working for scalars #8194
Conversation
I just noticed that the changes are making this test fail: np.piecewise([0, 0], [True, False], [1]) But I do not understand why the default behavior is for this not to fail, and to return
In that example the number of functions is neither equal, or one more than the number of conditions. np.piecewise([0, 0], [[True, False]], [1]) which in fact also returns |
Did not check, but it is not completely impossible that the test more documents behaviour then actually trying to fix it (i.e. there might even be a comment that it should be deprecated). It is however a hint that even if the current behaviour is bad, a deprecation/future warning cycle should likely be made before changing it. |
@seberg |
I have found a way to do this, changing the behaviour only in the cases when it was failing, so 100% retrocompatible. Essentially I have modified my original condition to be: if (isscalar(condlist) or not (isinstance(condlist[0], list) or
isinstance(condlist[0], ndarray))):
if not isscalar(condlist) and x.size == 1 and len(x.shape) == 0:
condlist = [[c] for c in condlist]
else:
condlist = [condlist] and before my fork it was: if (isscalar(condlist) or not (isinstance(condlist[0], list) or
isinstance(condlist[0], ndarray))):
condlist = [condlist] This means that the change in behaviour will only happen when the condition list is not a scalar ( The only circumstance when this can be different than before is when the input is a single value, and the condition list is something of the shape It should now be good to go :) |
I actually realized, based on one of the old existing tests:
that the behaviour that I was implementing was already expected to work. So this is not really an enhancement, but a bug correction. Essentially that test was not failing becuase if only two intervals are given, then after condlist=[condlist], len(condlist)==1 which is still allowed for len(functlist)==2. However, as soon as there are three intervals len(condlist) is still 1, but len(functlist)==3, making it stop working. |
@tkamishima is this the same bug you were going to fix in gh-7800? |
@seberg It does look like the same problem. In fact I just realize that the current implementation in 1.11 does not give an error (as I suggested in my first post, I was using 1.10 when I first tried that, my bad), but it still gives the wrong value as @tkamishima pointed out (I was not aware of that previous pull request at all). However, I still think that my solution is a bit less invasive, as it only changes behavior in a very particular subcase, leaving the function untouched in all other cases. This should be better than changing the location of |
@seberg @alvarosg This patch can fix the bug that I want to fix, and I confirmed that the unittests added in my pactch could be passed by @alvarosg 's patch. I leave it up to @seberg decision which patch would be used. Note: I tried to remove the hack as @seberg suggested, but I failed to simplify; it's too complicated. |
Frankly, I am not sure that this fix is correct, can we go a step back and make sure we are on the same page as to what is right? I still would like to get rid of that zerod hack, which I think is the reason for this confusing There are two basic things that you can do with condlist when condlist has less then
Now what actually happens? (this might be wrong):
I don't like the fact that we get two very different types of behaviour based on the dimensionality of the inputs. An option might be to just deprecate it to go with 3. Or we try to consolidate it (will need future warnings)? To me option 2 seems slightly more sensible, but because boolean indexing does not actually truly broadcast, that is very slightly, so I am actually tempted to think option 3 is best... My simplified piecewise code: def piecewise(x, condlist, funclist):
x = np.asanyarray(x)
cond = np.array(condlist, copy=False, dtype=bool, ndmin=1); condlist=list(cond)
if len(funclist) == len(condlist) + 1:
condlist.append(~np.logical_or.reduce(cond, axis=0))
y = np.zeros(x.shape, x.dtype)
for k in range(len(condlist)):
item = funclist[k]
if not isinstance(item, collections.Callable):
y[condlist[k]] = item
else:
vals = x[condlist[k]]
if vals.size > 0:
y[condlist[k]] = item(vals)
return y This function always uses option 2 (plus the "just make it 1-d if its 0-d logic). But I think adding the axis insertions after the cond array creation here is probably much more straight forward (especially since there is no weird |
It would also be good to throw actual errors when funclist is too long.... |
@seberg On the other hand, maybe we should treat this PR as a bug, and not an enhancement:
It just completely ignores everything after the first condition, and after the first function. Furthermore all those cases are cases where condlist has exactly n+1 dimensions, so it should be a no brainer. My tiny fix, gets this:
And because of the way it is implemented we know is only acting differently from the previous implementation in cases when: x.ndim==0, condlist.ndim=1, and len(condlist)>1. And almost all possible tests under those conditions are there. So my proposal would be to merge this as it is, and start a separate PR to make it nicer, including exceptions, making as many tests as we can to make sure it is as consistent as possible with previous behaviour. If that makes it before the next release, great. If it does not, then we at least would have solved the bug. |
Yes, I am OK with doing the bug fix only thing, but had to have a bit deeper look to see what exactly your code changes anyway. I guess we can probably put this in as is, could you squash the commits and make the commit message "BUG: ...", etc. as in the dev guidelines? |
As an example, does the code
actually do anything...? |
EDIT: Squash has been done Great, I will squash the commits and include an appropriate message. A couple of questions:
I will get back to you about the latest question later when I have access to a computer! |
3848902
to
714993e
Compare
About this:
The reason for this lines is actually related to what I have done. Essentially my implementation is preventing the function from transforming a one dimensional array of conditions in something like [cond1, cond2,...,condn] when the input is zero-dimensional, and instead transform it into [cond1],[cond2],...,[condn]. This was doing exactly the same, but a posteriori: First it makes it into (A), and then, if the input was zero-dimensional, it transposes it to make it like (B). the problem, is that if calculates n = len(condlist), before transposing, and this leads to bad behavior, hence, @tkamishima solution. The good thing, is that after this fix, we will just never be in a case where that happens(Only when condlist.shape=[1] --> condlis.shape=[1,1] , and in that case, transposing does nothing) , so now we can just remove those two lines. I will do that. |
714993e
to
4c3096c
Compare
Yeah please always start of against master (not sure you can change the PR in that regard maybe need to make a new PR). Yeah not sure about thanks.txt, we have not really been using it anyway and but list contributers in the individual release notes. Luckily with version control you basically got a list in some sense anyway. |
4c3096c
to
6420f84
Compare
The rebase to master is done, and everything is working. And for future reference, yes, there is an option in the pull requests to change the base branch. So I just had to cherry pick my last commit and re run it in top of master. I did not edit the release notes. Based on your message and the commits history I assumed you would do that. |
We add the contributers at release times from the git history and a bug fix like this does not have to be in the release notes for changed behaviour, so it should be fine. Will look over it later though, no time now. |
Looking further into solving the bug in a more elegant way I tried a different approach to make scalar and arrays work more similarly by casting scalars into arrays with flatten, and then back to scalars with reshape. Essentially I made this wrapper: def piecewise(x, condlist, *args, **kwargs):
x = np.asarray(x)
shape = x.shape
if x.ndim==0:
if hasattr(condlist, '__iter__'):
condlist = [np.asarray(c).flatten() for c in condlist]
else:
condlist = [[condlist]]
x = x.flatten()
xout = np.piecewise(x, condlist, *args, **kwargs)
return xout.reshape(shape) Which passes all the tests, because it essentially always makes the input how it should be: x is an array, and condlist is a list of arrays (So no further reshaping would be required inside piecewise). I guess the downside of this may be the mandatory casting into an array for efficiency-wise. But I am sure we can find a balance. If you like this approach I can explore it further. It would take some time as we should include more tests on the old behaviour so we make sure it really is backwards compatible, so I would still prefer doing this in the context of a different PR (Essentially so the tests we just added in this PR are already taken into account). |
I had some time this weekend to look into this, and came with the following implementation for piecewise that passes all the tests, without the previously existing hack. The idea is to first homogenize the input so the x is always a 1-d array, and the list of conditions a list of 1-d arrays (2-d array) following the existing behavior, and then reshape the output at the end according to the input. It is a bit similar to what you proposed here, but taking into account some of the undocumented behaviour, and the zero-d case (I tried your version as it was, and it made lots of the tests fail.) def piecewise(x, condlist, funclist, *args, **kw):
x = asarray(x)
condlist = array(condlist, dtype=bool)
# Make sure that piecewise([0,1],[True,False]),
# is interpreted as piecewise([0,1],[[True,False]]),
# according to previous undocumented behavior
if x.ndim==condlist.ndim:
condlist = [condlist]
# We flatten everything, this way, 0-d arrays
# can be treated in exactly the same way as n-d arrays
condlist = array([np.asarray(c).flatten() for c in condlist])
shape = x.shape
x = x.flatten()
# We look at the lenght only after normalizing input
nf = len(funclist)
nc = len(condlist)
# Adding the default case, when there are more functions than conditions
if nf == nc + 1:
totlist = np.logical_or.reduce(condlist, axis=0)
condlist = np.vstack([condlist, ~totlist])
nc += 1
# Calculating output
y = zeros(x.shape, x.dtype)
for k in range(nc):
item = funclist[k]
if not isinstance(item, collections.Callable):
y[condlist[k]] = item
else:
vals = x[condlist[k]]
if vals.size > 0:
y[condlist[k]] = item(vals, *args, **kw)
return y.reshape(shape) It is shorter than the previous version (after removing the comments), and in my opinion, it is clearer in what is doing what. It may still need some work regarding optimizing array copying, and making more tests (specially with higher dimensional arrays, for which there are none, now) to check against previous behavior. Also, I would need to add some exceptions. I am happy to work on this as long as you think it is worth it and will be merged. Please, let me know :) |
If there are enough new tests, we are happy to merge cleanups, if you The way I did it with the 0-d arrays, at the very least the next stage |
Yes, I guess it may be better to do the cleanup after 1.12, so we can count on, e.g. high_d_array[True] working.
I think then the only other thing that your approach would be missing is what to do in cases where |
OK, I will merge it as is then. @charris might make sense to still squeeze into 1.12, though maybe also not a big deal. |
Great, just let me know if/when you want me to create a separate PR to refactor piecewise |
OK, will put it in. I'm planning on releasing the beta this weekend. |
I was using np.piecewise to create a piecewise lambda function, and I realized that the way it is written it only allows to work with ndarrays, and not with list of scalars for the condlist argument.
Similarly to may other functions in np which can handle both ndarrays and native types, this also should.
I am making the following function:
This works fine:
But this fails:
And the reason is that internally, it is detecting that my masklist is
[False,False,True]
, this part of the code:makes len(condlist)==1.
I have replaced that condition by:
This makes it work in both cases: