BUG: np.piecewise not working for scalars #8194

alvarosg · 2016-10-21T14:07:14Z

I was using np.piecewise to create a piecewise lambda function and realized that, as written, it only works when the entries of condlist are ndarrays, not when condlist is a list of scalars.

Like many other NumPy functions that handle both ndarrays and native types, this one should too.

I am making the following function:

import numpy as np

def buildPieceWise(x):
    condlist=[x<-1,(x>=-1)*(x<1),x>1]
    funclist=[lambda x: -x, 0, lambda x:x]
    return np.piecewise(x,condlist,funclist)

This works fine:

>>>print buildPieceWise(np.linspace(-2,2,11))
[ 2.   1.6  1.2  0.   0.   0.   0.   0.   1.2  1.6  2. ]

But this fails:

>>>print buildPieceWise(-1.5)
C:\Anaconda\lib\site-packages\numpy\lib\function_base.pyc in piecewise(x, condlist, funclist, *args, **kw)
    773     if (n != n2):
    774         raise ValueError(
--> 775                 "function list and condition list must be the same")
ValueError: function list and condition list must be the same

The reason is that internally my condition list is [True,False,False], a flat list of scalars, and this part of the code:

if (isscalar(condlist) or not (isinstance(condlist[0], list) or
                                   isinstance(condlist[0], ndarray))):
        condlist = [condlist]

makes len(condlist)==1.
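To see the effect in isolation, here is a minimal sketch of that wrapping check pulled out into a standalone helper. The name `old_wrap` is hypothetical and not part of NumPy; it mirrors only the condition quoted above, not the rest of piecewise.

```python
import numpy as np

def old_wrap(condlist):
    # Hypothetical helper: reproduces only the check quoted above.
    if np.isscalar(condlist) or not isinstance(condlist[0], (list, np.ndarray)):
        condlist = [condlist]
    return condlist

# Scalar input: every condition is a plain Python bool (or int).
x = -1.5
scalar_conds = [x < -1, (x >= -1) * (x < 1), x > 1]

# Array input: every condition is a boolean ndarray.
xs = np.linspace(-2, 2, 11)
array_conds = [xs < -1, (xs >= -1) * (xs < 1), xs > 1]

n_scalar = len(old_wrap(scalar_conds))  # the three scalars collapse into one "condition"
n_array = len(old_wrap(array_conds))    # the three arrays are left untouched
print(n_scalar, n_array)
```

With three functions in funclist, the scalar case therefore ends up with a single condition, which is what the length check complains about.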

I have replaced that condition by:

if (isscalar(condlist) or not (isinstance(condlist[0], list) or
                                   isinstance(condlist[0], ndarray))):
        if not isscalar(condlist) and x.size == 1 and len(x.shape) == 0:
            condlist = [[c] for c in condlist]
        else:
            condlist = [condlist]

This makes it work in both cases:

import numpy as np

def buildPieceWise(x):
    condlist=[x<-1,(x>=-1)*(x<1),x>1]
    funclist=[lambda x: -x, 0, lambda x:x]
    return np.piecewise(x,condlist,funclist)

>>>print buildPieceWise(np.linspace(-2,2,11))
>>>print buildPieceWise(-1.5)
[ 2.   1.6  1.2  0.   0.   0.   0.   0.   1.2  1.6  2. ]
1.5

alvarosg · 2016-10-21T15:22:15Z

I just noticed that the changes are making this test fail:

np.piecewise([0, 0], [True, False], [1])

But I do not understand why the default behavior is for this not to fail and to return [1,0] instead. It must be undocumented behaviour:

condlist : list of bool arrays
Each boolean array corresponds to a function in funclist. Wherever condlist[i] is True, funclist[i] is used as the output value.
Each boolean array in condlist selects a piece of x, and should therefore be of the same shape as x.
The length of condlist must correspond to that of funclist. If one extra function is given, i.e. if len(funclist) - len(condlist) == 1, then that extra function is the default value, used wherever all conditions are false.

In that example the number of functions is neither equal to nor one more than the number of conditions.
In reality it is interpreting the call as if it was:

np.piecewise([0, 0], [[True, False]], [1])

which in fact also returns [1,0].
However, the documentation clearly says that condlist is a LIST of boolean arrays, not a single boolean array. So if there is only one function and one boolean array, it should be passed as [[True, False]].
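A short check of that equivalence, as a sketch that assumes the installed NumPy still performs the undocumented promotion of a flat boolean list to a single condition for non-scalar x:

```python
import numpy as np

# Flat list of booleans: treated as ONE condition of the same shape as x.
flat = np.piecewise([0, 0], [True, False], [1])

# The documented form: a list containing one boolean array.
nested = np.piecewise([0, 0], [[True, False]], [1])

print(flat, nested)
```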

seberg · 2016-10-21T15:41:54Z

I did not check, but it is not impossible that the test merely documents the existing behaviour rather than asserting that it is correct (i.e. there might even be a comment saying it should be deprecated). It is, however, a hint that even if the current behaviour is bad, a deprecation/future warning cycle should likely be made before changing it.

alvarosg · 2016-10-21T15:47:36Z

@seberg
I think I may be able to make it work in both cases, by checking the shape of the input compared to the shape of the conditions, but I am not sure if there will be more tests that will still fail.

alvarosg · 2016-10-22T12:06:26Z

@seberg

I have found a way to do this that changes the behaviour only in the cases where it was failing, so it is 100% backward compatible.

Essentially I have modified my original condition to be:

    if (isscalar(condlist) or not (isinstance(condlist[0], list) or
                                   isinstance(condlist[0], ndarray))):
        if not isscalar(condlist) and x.size == 1 and len(x.shape) == 0:
            condlist = [[c] for c in condlist]
        else:
            condlist = [condlist]

and before my fork it was:

    if (isscalar(condlist) or not (isinstance(condlist[0], list) or
                                   isinstance(condlist[0], ndarray))):
        condlist = [condlist]

This means that the change in behaviour will only happen when the condition list is not a scalar (not isscalar(condlist)), is also not a list of lists or arrays (not (isinstance(condlist[0], list) or isinstance(condlist[0], ndarray))), and the input is a scalar (x.size == 1 and len(x.shape) == 0).

The only circumstance in which this can differ from before is when the input is a single value and the condition list has the form [True,False,True,...,False]. The only subset of these cases that overlaps with the behaviour I described in my previous comment is when both the input and the condition list have length one, in which case condlist = [condlist] and condlist = [[c] for c in condlist] give the same result.
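The three situations described above can be sketched with a standalone copy of the modified check. `patched_wrap` is a hypothetical name used for illustration only; it is not a NumPy function.

```python
import numpy as np

def patched_wrap(x, condlist):
    # Hypothetical helper: mirrors only the modified condition shown above.
    x = np.asanyarray(x)
    if np.isscalar(condlist) or not isinstance(condlist[0], (list, np.ndarray)):
        if not np.isscalar(condlist) and x.size == 1 and x.ndim == 0:
            # 0-d input with a flat list: one length-1 condition per entry.
            condlist = [[c] for c in condlist]
        else:
            # Old behaviour: the whole list is a single condition.
            condlist = [condlist]
    return condlist

scalar_case = patched_wrap(3, [False, True, False])  # new behaviour
array_case = patched_wrap([0, 0], [True, False])     # unchanged behaviour
single_case = patched_wrap(3, [True])                # both branches agree here
print(scalar_case, array_case, single_case)
```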

It should now be good to go :)

alvarosg · 2016-10-23T11:28:21Z

@seberg @charris

I actually realized, based on one of the old existing tests:

x = 3
piecewise(x, [x <= 3, x > 3], [4, 0])  # Should succeed.

that the behaviour that I was implementing was already expected to work. So this is not really an enhancement, but a bug correction.

Essentially that test was not failing because, if only two intervals are given, then after condlist=[condlist], len(condlist)==1, which is still allowed for len(funclist)==2. However, as soon as there are three intervals, len(condlist) is still 1 but len(funclist)==3, and it stops working.
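The arithmetic can be modelled in a few lines. This is a sketch of the length check only, inferred from the traceback and docstring quoted in this thread; `old_accepts` is a hypothetical helper, not NumPy code.

```python
def old_accepts(condlist, funclist):
    # Hypothetical model of the pre-fix path for a 0-d x and a flat list of booleans.
    wrapped = [condlist]                 # the flat list becomes a single condition
    n, n2 = len(wrapped), len(funclist)
    # Accepted when the counts match or when there is exactly one extra (default) function.
    return n == n2 or n == n2 - 1

two_intervals = old_accepts([True, False], [4, 0])
three_intervals = old_accepts([True, False, False], [1, 2, 3])
print(two_intervals, three_intervals)
```

In the two-interval case the second function is silently treated as the default value, which is why the old test passed by accident.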

seberg · 2016-10-23T22:11:50Z

@tkamishima is this the same bug you were going to fix in gh-7800?

alvarosg · 2016-10-23T22:37:41Z

@seberg It does look like the same problem. In fact, I just realized that the current implementation in 1.11 does not raise an error (contrary to what I suggested in my first post; I was using 1.10 when I first tried that, my bad), but it still gives the wrong value, as @tkamishima pointed out (I was not aware of that previous pull request at all).

However, I still think that my solution is a bit less invasive, as it only changes behavior in a very particular subcase and leaves the function untouched in all other cases. This should be better than changing the location of n = len(condlist), which may also work, but for which it is much harder to justify that behavior does not change in other cases.

tkamishima · 2016-10-24T08:09:18Z

@seberg @alvarosg This patch fixes the bug that I wanted to fix, and I confirmed that the unit tests added in my patch pass with @alvarosg's patch.

I leave it up to @seberg to decide which patch to use.

Note: I tried to remove the hack as @seberg suggested, but I failed to simplify; it's too complicated.

seberg · 2016-10-24T17:06:59Z

Frankly, I am not sure that this fix is correct. Can we take a step back and make sure we are on the same page as to what is right? I still would like to get rid of that zerod hack, which I think is the reason for this confusing condlist manipulation....

There are three basic things that you can do with condlist when condlist has fewer than n+1 dimensions:

1. You assume that there is only a single condition.
2. You use boolean indexing broadcasting-like rules (ignore the missing dimensions). (Warning: This is not broadcasting; for broadcasting the indexing operations would have to be y[..., condlist[k]].)
3. Throw an error (or warning at least).
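The difference between boolean indexing with missing dimensions and true broadcasting can be illustrated on a small array. This is only a sketch; the array and mask names are made up for the example.

```python
import numpy as np

y = np.arange(6).reshape(2, 3)

# A mask with fewer dimensions than y is matched against the LEADING axes,
# so a length-2 mask selects whole rows.
row_mask = np.array([True, False])
rows = y[row_mask]

# Broadcasting aligns TRAILING axes instead; the indexing equivalent is
# y[..., mask], which needs a length-3 mask here and selects columns.
col_mask = np.array([True, False, True])
cols = y[..., col_mask]

print(rows.shape, cols.shape)
```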

Now what actually happens? (this might be wrong):

If condlist.ndim == 0, fine you can make it 1D if you really want, since 0D, does not really make sense in any case (though we could deprecate that as well). (unrelated to the above)
If condlist.ndim == 1 all is fine for x.ndim == 0 (after your fix), for x.ndim > 0 we have behaviour 1.
If condlist.ndim > 1 we get behaviour 2 always.

I don't like the fact that we get two very different types of behaviour based on the dimensionality of the inputs. An option might be to just deprecate it to go with 3. Or we try to consolidate it (will need future warnings)? To me option 2 seems slightly more sensible, but because boolean indexing does not actually truly broadcast, that is very slightly, so I am actually tempted to think option 3 is best...

My simplified piecewise code:

import numpy as np

def piecewise(x, condlist, funclist):
    x = np.asanyarray(x)
    # Stack all conditions into one boolean array with at least one dimension
    cond = np.array(condlist, dtype=bool, ndmin=1)
    condlist = list(cond)
    # One extra function is the default, used wherever no condition holds
    if len(funclist) == len(condlist) + 1:
        condlist.append(~np.logical_or.reduce(cond, axis=0))
    y = np.zeros(x.shape, x.dtype)
    for k in range(len(condlist)):
        item = funclist[k]
        if not callable(item):
            # Constant entry: assign it wherever the condition is true
            y[condlist[k]] = item
        else:
            vals = x[condlist[k]]
            if vals.size > 0:
                y[condlist[k]] = item(vals)
    return y

This function always uses option 2 (plus the "just make it 1-d if it is 0-d" logic). But I think adding the axis insertions after the cond array creation here is probably much more straightforward (especially since there is no weird x.ndim == 0 avoidance logic).

seberg · 2016-10-24T17:09:00Z

It would also be good to throw actual errors when funclist is too long....
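A minimal sketch of such a check, assuming it runs after condlist has been normalized. `check_lengths` and the wording of its message are hypothetical and are not taken from NumPy.

```python
def check_lengths(n_conds, n_funcs):
    # Hypothetical helper: allow one function per condition,
    # plus at most one extra function acting as the default.
    if n_funcs not in (n_conds, n_conds + 1):
        raise ValueError(
            "with {} condition(s), either {} or {} functions are expected, got {}"
            .format(n_conds, n_conds, n_conds + 1, n_funcs))

check_lengths(3, 3)  # one function per condition: fine
check_lengths(3, 4)  # one extra default function: fine
```

Calling check_lengths(1, 3) would raise, instead of silently ignoring the surplus functions.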

alvarosg · 2016-10-24T20:32:14Z

@seberg
I agree with you, taking a step back is the best option. And if we can find a way to make it more elegant than it is right now, while keeping backward compatibility, we should do it. Basically the "hack" was not very well implemented...

On the other hand, maybe we should treat this PR as a bug fix, and not an enhancement.
With the current implementation:

>>> np.piecewise(3, [False, True, False], [1, 2, 3])
array(0)
>>> np.piecewise(3, [False, True, False], [1, 2, 3, 4])
array(0)

It just completely ignores everything after the first condition, and after the first function. Furthermore all those cases are cases where condlist has exactly n+1 dimensions, so it should be a no brainer.

With my tiny fix, I get this:

>>> np.piecewise(3, [False, True], [1, 2])
array(2)
>>> np.piecewise(3, [False, False], [1, 2, 3])
array(3)
>>> np.piecewise(3, [False, True, False], [1, 2, 3])
array(2)
>>> np.piecewise(3, [False, False, False], [1, 2, 3])
array(0)
>>> np.piecewise(3, [False, True, False], [1, 2, 3, 4])
array(2)
>>> np.piecewise(3, [False, False, False], [1, 2, 3, 4])
array(4)

And because of the way it is implemented, we know it only acts differently from the previous implementation when x.ndim==0, condlist.ndim==1, and len(condlist)>1. Almost all possible tests under those conditions are there.
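The cases listed above can be collected into a small table-driven check. This is a sketch that assumes a NumPy release containing this fix; the `cases` table simply restates the outputs shown above.

```python
import numpy as np

# (condlist, funclist, expected value) for the scalar input x = 3
cases = [
    ([False, True], [1, 2], 2),
    ([False, False], [1, 2, 3], 3),
    ([False, True, False], [1, 2, 3], 2),
    ([False, False, False], [1, 2, 3], 0),
    ([False, True, False], [1, 2, 3, 4], 2),
    ([False, False, False], [1, 2, 3, 4], 4),
]

results = [int(np.piecewise(3, conds, funcs)) for conds, funcs, _ in cases]
expected = [value for _, _, value in cases]
print(results == expected)
```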

So my proposal would be to merge this as it is, and start a separate PR to make it nicer, including exceptions, making as many tests as we can to make sure it is as consistent as possible with previous behaviour. If that makes it before the next release, great. If it does not, then we at least would have solved the bug.

seberg · 2016-10-25T15:54:45Z

Yes, I am OK with doing the bug fix only thing, but had to have a bit deeper look to see what exactly your code changes anyway. I guess we can probably put this in as is, could you squash the commits and make the commit message "BUG: ...", etc. as in the dev guidelines?

seberg · 2016-10-25T15:56:10Z

As an example, does the code

        if condlist.shape[-1] != 1:
             condlist = condlist.T

actually do anything...?

alvarosg · 2016-10-25T19:18:31Z

@seberg

EDIT: Squash has been done

Great, I will squash the commits and include an appropriate message. A couple of questions:

Do you want me to rebase to master? Currently this is based on the maintenance branch.
Is there something similar to Thanks.txt in scipy to acknowledge contributions? I saw that there is an equivalent file here, but it is not really updated for small contributions like this, and hasn't been for 6 years hehehe.

I will get back to you about the latest question later when I have access to a computer!

alvarosg · 2016-10-26T00:09:24Z

@seberg

About this:

As an example, does the code
        if condlist.shape[-1] != 1:
             condlist = condlist.T
actually do anything...?

The reason for these lines is actually related to what I have done.

Essentially, my implementation prevents the function from transforming a one-dimensional array of conditions into something like (A) [[cond1, cond2,...,condn]] when the input is zero-dimensional, and instead transforms it into (B) [[cond1],[cond2],...,[condn]].

This was doing exactly the same, but a posteriori: first it makes it into (A), and then, if the input was zero-dimensional, it transposes it to make it like (B). The problem is that it calculates n = len(condlist) before transposing, and this leads to bad behavior; hence @tkamishima's solution.

The good thing is that after this fix we will never be in a case where that happens (only when condlist.shape == [1] --> condlist.shape == [1,1], and in that case transposing does nothing), so now we can just remove those two lines. I will do that.
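The effect of taking the length before the transpose can be seen on a tiny array. This is only an illustrative sketch; the variable names are made up.

```python
import numpy as np

# Layout (A): the flat list was wrapped into a single condition of length 3.
cond_a = np.array([[False, True, False]])
n_before = len(cond_a)   # number of conditions counted BEFORE transposing

# Layout (B): after the transpose there are three conditions of length 1.
cond_b = cond_a.T
n_after = len(cond_b)

# With a single condition the transpose changes nothing.
single = np.array([[True]])
print(n_before, n_after, single.T.shape)
```

Because n was taken from layout (A), the later loop only ever looked at the first condition and the first function.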

seberg · 2016-10-26T07:52:33Z

Yeah, please always start off against master (not sure you can change the PR in that regard; you may need to make a new PR). Yeah, not sure about thanks.txt; we have not really been using it anyway, but we list contributors in the individual release notes. Luckily, with version control you basically get a list in some sense anyway.

alvarosg · 2016-10-26T09:57:00Z

@seberg

The rebase to master is done, and everything is working. And for future reference, yes, there is an option in the pull request to change the base branch. So I just had to cherry-pick my last commit and reapply it on top of master.

I did not edit the release notes. Based on your message and the commits history I assumed you would do that.

seberg · 2016-10-26T10:51:27Z

We add the contributors at release time from the git history, and a bug fix like this does not have to be in the release notes for changed behaviour, so it should be fine. I will look over it later though, no time now.

alvarosg · 2016-10-31T11:44:44Z

@seberg

Looking further into solving the bug in a more elegant way I tried a different approach to make scalar and arrays work more similarly by casting scalars into arrays with flatten, and then back to scalars with reshape.

Essentially I made this wrapper:

import numpy as np

def piecewise(x, condlist, *args, **kwargs):
    x = np.asarray(x)
    shape = x.shape
    if x.ndim==0:
        if hasattr(condlist, '__iter__'):
            condlist = [np.asarray(c).flatten() for c in condlist]
        else:
            condlist = [[condlist]]
    x = x.flatten()
    xout = np.piecewise(x, condlist, *args, **kwargs)
    return xout.reshape(shape)

This passes all the tests, because it essentially always puts the input in the form it should have: x is an array, and condlist is a list of arrays (so no further reshaping is required inside piecewise). I guess the downside may be the efficiency cost of the mandatory cast to an array. But I am sure we can find a balance.

If you like this approach I can explore it further. It would take some time as we should include more tests on the old behaviour so we make sure it really is backwards compatible, so I would still prefer doing this in the context of a different PR (Essentially so the tests we just added in this PR are already taken into account).

alvarosg · 2016-11-06T16:10:07Z

@seberg

I had some time this weekend to look into this and came up with the following implementation of piecewise, which passes all the tests without the previously existing hack. The idea is to first homogenize the input so that x is always a 1-d array and the list of conditions is a list of 1-d arrays (a 2-d array), following the existing behavior, and then reshape the output at the end according to the input.

It is a bit similar to what you proposed here, but taking into account some of the undocumented behaviour, and the zero-d case (I tried your version as it was, and it made lots of the tests fail.)

import numpy as np

def piecewise(x, condlist, funclist, *args, **kw):
    x = np.asarray(x)
    condlist = np.array(condlist, dtype=bool)

    # Make sure that piecewise([0, 1], [True, False], ...)
    # is interpreted as piecewise([0, 1], [[True, False]], ...),
    # according to previous undocumented behavior
    if x.ndim == condlist.ndim:
        condlist = [condlist]

    # We flatten everything; this way, 0-d arrays
    # can be treated in exactly the same way as n-d arrays
    condlist = np.array([np.asarray(c).flatten() for c in condlist])
    shape = x.shape
    x = x.flatten()

    # We look at the length only after normalizing the input
    nf = len(funclist)
    nc = len(condlist)

    # Add the default case when there is one more function than conditions
    if nf == nc + 1:
        totlist = np.logical_or.reduce(condlist, axis=0)
        condlist = np.vstack([condlist, ~totlist])
        nc += 1

    # Calculate the output
    y = np.zeros(x.shape, x.dtype)
    for k in range(nc):
        item = funclist[k]
        if not callable(item):
            y[condlist[k]] = item
        else:
            vals = x[condlist[k]]
            if vals.size > 0:
                y[condlist[k]] = item(vals, *args, **kw)

    return y.reshape(shape)

It is shorter than the previous version (after removing the comments), and in my opinion, it is clearer in what is doing what.

It may still need some work on reducing array copying and on adding more tests (especially with higher-dimensional arrays, for which there are currently none) to check against previous behavior. Also, I would need to add some exceptions.

I am happy to work on this as long as you think it is worth it and will be merged. Please, let me know :)

seberg · 2016-11-06T16:55:23Z

If there are enough new tests, we are happy to merge cleanups, if you
are interested in making this nicer.

The way I did it with the 0-d arrays, at the very least the next stage
in the indexing deprecations would have to be done first, I guess
(possibly more, but high_d_array[True], etc. won't work currently
IIRC). If that would make things nicer, I think we might be able to do
that next step after 1.12 is branched off for good. But more things
might be wrong with my approach....

alvarosg · 2016-11-06T17:10:05Z

The way I did it with the 0-d arrays, at the very least the next stage
in the indexing deprecations would have to be done first, I guess
(possibly more, but high_d_array[True], etc. won't work currently
IIRC). If that would make things nicer, I think we might be able to do
that next step after 1.12 is branched off for good.

Yes, I guess it may be better to do the cleanup after 1.12, so we can count on, e.g. high_d_array[True] working.
Would it then be better in that case to first solve the BUG by merging the code as it is in this PR for 1.12, so the bug is gone in the next release, and then do the cleanup afterwards directly on 1.12, probably with a hybrid between your code and my code proposed before?

But more things
might be wrong with my approach....

I think the only other thing that your approach would be missing is what to do when x.ndim == condlist.ndim, which under the current behaviour should be interpreted as if condlist was [condlist].

seberg · 2016-11-09T16:36:22Z

OK, I will merge it as is then. @charris might make sense to still squeeze into 1.12, though maybe also not a big deal.

alvarosg · 2016-11-09T17:50:16Z

Great, just let me know if/when you want me to create a separate PR to refactor piecewise

charris · 2016-11-10T20:01:58Z

OK, will put it in. I'm planning on releasing the beta this weekend.

alvarosg mentioned this pull request Oct 21, 2016

np.piecewise not working with lists of scalars #8193

Closed

charris added 01 - Enhancement component: numpy.lib labels Oct 21, 2016

alvarosg changed the title ~~np.piecewise to also work with lists of scalars~~ BUG: np.piecewise not working for scalars Oct 25, 2016

alvarosg force-pushed the scalar-piecewise branch from 3848902 to 714993e Compare October 25, 2016 23:11

alvarosg force-pushed the scalar-piecewise branch from 714993e to 4c3096c Compare October 26, 2016 00:13

alvarosg changed the base branch from maintenance/1.11.x to master October 26, 2016 08:48

BUG: np.piecewise not working for scalars

6420f84

alvarosg force-pushed the scalar-piecewise branch from 4c3096c to 6420f84 Compare October 26, 2016 09:17

seberg merged commit 0268680 into numpy:master Nov 9, 2016

charris mentioned this pull request Nov 10, 2016

BUG: np.piecewise not working for scalars #8265

Merged
