Deprecate truth-testing on empty arrays #9583
Rationale
Overview
It is impossible to take advantage of the fact that empty arrays are
Unfortunately, this tries to follow two different incompatible conventions at once. Somebody typing
>>> import numpy as np
>>> bool(np.array([]))
False
>>> # but this is not a good way to test for emptiness, because...
... bool(np.array([0]))
False
Personally, this behavior made me waste a good hour barking up the wrong tree debugging a function after I changed a list variable with 0 or 1 members into an array type, causing an
Proposal
Based on comments in the newsgroup thread, the best proposal seemed to be as follows:
One possible addition is to deprecate this for non-scalar arrays as well; (e.g. stop accepting pv warns against this addition:
To address some possible concerns
Why take away our ability to test for emptiness?
We aren't! The fact of the matter is, the truthiness of a numpy array is not a reliable test for emptiness to begin with! As shown above,
But isn't truthiness the accepted way to test for emptiness in Python? Why should numpy be different?
Eric Firing presents the stance that the behavior ought to be changed in the other direction:
While this does have its advantages (in particular, it facilitates duck typing in polymorphic code that wants to work with both NumPy and non-NumPy types), it is a pretty tough sell, since it changes the semantics of working, currently bug-free code. Unfortunately, NumPy has made a number of fundamental design decisions that are simply incompatible with using truth testing to determine emptiness. @eric-wieser puts it as follows:
Another way in which arrays behave like scalars is in the fact that comparison operators return arrays. The implications of this are huge! Here's some things that would happen if we redefined
>>> bool(array([1,2,3]) == array([7,8,9])) # currently an error
True
>>> bool(array([1]) != array([1])) # currently False
True
>>> 2 < array([0,7]) < 5 # currently an error
True
>>> bool(array([]) == array([])) # actually, this is False even today
False
The first example is particularly sinister. Imagine unit tests that trivially succeed without actually testing anything!
My concern isn't addressed!
Please share it! Deprecations in such a popular library should be taken seriously. Also share if you have any existing code that legitimately/correctly makes use of the current behavior, as my currently held belief is that no such code exists.
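
The pitfalls described above can be avoided by stating the intended test explicitly. The following is a minimal sketch, not part of the original proposal: `is_empty` is a hypothetical helper name introduced here for illustration, while `ndarray.size` and `np.array_equal` are existing NumPy APIs.

```python
import numpy as np


def is_empty(values):
    """Return True when `values` holds no elements (hypothetical helper)."""
    # np.asarray accepts lists, tuples, and arrays alike;
    # .size is the total number of elements across all dimensions.
    return np.asarray(values).size == 0


print(is_empty([]))             # True
print(is_empty(np.array([0])))  # False: one element, even though it is falsy

# For unit tests, compare whole arrays explicitly instead of truth-testing `==`.
print(np.array_equal(np.array([1, 2, 3]), np.array([7, 8, 9])))  # False
print(np.array_equal(np.array([]), np.array([])))                # True
```

Because neither check relies on `bool()` of an array, the result does not depend on how truth-testing of empty arrays is handled.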

Thanks for the nice & detailed summary!

This is a trivial fix on

Hi, I am looking to start contributing, and I would love to work on adding the patch (deprecation warning) based on the proposal described if it is still desired and no one is currently working on it.

@hemildesai: go for it - most of the work is in coming up with the right message to show, which you may as well propose here first

Cool by me, I've been busy with other things. When I was trying to write a message earlier, I was looking through other deprecations and was disappointed to find that they don't link to anything; this is what I was planning to do with my "Rationale" post.

Being consistent with the error for arrays with size > 1, how about Also, is it suggested to use

Consistency is a decent goal and helps enforce a consistent mental model, though my 2 cents is that I don't think it's so important for deprecation warnings. There is a human element here; there likely exists code out there which uses this feature, and it may be intentional or accidental. I would try to consider how such code could arise and what message we want to communicate to the author in each case. I can think of these scenarios:
So by these measures, at least, I do think the message is good. Edit: I added one more goal, which I'm not certain is effectively met. I'm trying to think of how one could do this while keeping the message terse.

@ExpHP Thanks for the feedback. That is an excellent way to think about deprecation warnings and I will definitely use this thought process wherever possible. Regarding the last goal, how about - Based off of https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L753-L754

I don't like this one as much because it feels like it says a lot. In particular, the "this is ambiguous" bit feels awkward now and my brain doesn't quite as easily make the connection to the size > 1 error message. IMO, the best error messages are ones that communicate a lot while appearing to say very little. The same words just happen to communicate the right information to the right people. (Have I mentioned that I have a tendency to overthink things?)

I agree that the iterated message is a bit verbose.
I think this works.

Maybe
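
For readers following the thread, the behavior under discussion can be sketched with a toy class. This is an illustration only, under stated assumptions: `ToyArray` is hypothetical and is not NumPy's implementation, and both message strings are placeholders rather than the wording eventually chosen. It warns on truth-testing an empty container, raises for more than one element (mirroring the existing size > 1 error), and defers to the single element otherwise.

```python
import warnings


class ToyArray:
    """Hypothetical stand-in for an array; not NumPy's implementation."""

    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):
        if len(self.items) == 0:
            # Deprecation period: keep returning False, but tell the caller.
            warnings.warn(
                "truth value of an empty array is ambiguous (placeholder message)",
                DeprecationWarning,
                stacklevel=2,
            )
            return False
        if len(self.items) > 1:
            # Mirrors the existing error for arrays with more than one element.
            raise ValueError(
                "truth value of an array with more than one element is ambiguous"
            )
        # Exactly one element: behave like that scalar.
        return bool(self.items[0])


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = bool(ToyArray([]))

print(result)                                 # False
print([w.category.__name__ for w in caught])  # ['DeprecationWarning']
```

Passing `stacklevel=2` attributes the warning to the line that performed the truth test rather than to `__bool__` itself, which is usually what the author of the calling code needs to see.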
as recommended in numpy/numpy#9583, to handle np.arrays's new bool-behavior.
I tested the waters with this on the numpy-discussion newsgroup yesterday, and the general response seemed to be that this is actionable, so I am making an issue for further discussion.
The long and short is that truth-testing on empty arrays is dangerous, misleading, and not in any way useful, and should be deprecated.
The next post will explain the rationale for deprecation more in-depth.