Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEP: Deprecate boolean math operations #4105

Merged
merged 2 commits into from Feb 15, 2014

Conversation

seberg
Copy link
Member

@seberg seberg commented Dec 5, 2013

The bitwise or logical operations should be used instead,
after a while.

To be honest, I am a bit unsure here. While it is nice, the impact out there might be rather big (this currently does not include the fixes necessary in ma and some tests, so failing).

@seberg seberg mentioned this pull request Dec 5, 2013
@njsmith
Copy link
Member

njsmith commented Dec 5, 2013

Best way to find out how big the impact is is to run a few other projects
test suites - scipy, sklearn, any other big old projects like that?
On 5 Dec 2013 13:56, "seberg" notifications@github.com wrote:

The bitwise or logical operations should be used instead,
after a while.

To be honest, I am a bit unsure here. While it is nice, the impact out
there might be rather big (this currently does not include the fixes

necessary in ma and some tests, so failing).

You can merge this Pull Request by running

git pull https://github.com/seberg/numpy deprecate-boolean-math

Or view, comment on, or merge it at:

#4105
Commit Summary

  • DEP: Deprecate boolean math operations

File Changes

Patch Links:

@rkern
Copy link
Member

rkern commented Dec 5, 2013

Does this affect np.sum(bool_array)?

@charris
Copy link
Member

charris commented Dec 5, 2013

@rkern if the sum dtype is boolean

In [1]: sum(ones(5, dtype=bool_))
Out[1]: 5

In [2]: sum(ones(5, dtype=bool_), dtype=bool_)
Out[2]: True

@seberg
Copy link
Member Author

seberg commented Dec 5, 2013

Sum and prod are fine, since they get the output type set to an integer dtype, which means the ??->? loop is not selected (and it is the only loop that has the first argument boolean).
Quick try with scipy suggests that it is likely OK. The new sparse code/tests doesn't like it at all, but that makes a lot of sense considering it is replicating the numpy bool behaviour.

EDIT: Your second example would be deprecated.

@alan-isaac
Copy link

The current behavior of + and * for Boolean arrays are correct and should not be changed. These are the standard meet and join operations for Boolean arrays, including Boolean matrices. Please research the literature on Boolean matrices before considering such a change.

@abalkin
Copy link
Contributor

abalkin commented Dec 6, 2013

Please research the literature on Boolean matrices before considering such a change.

According to wikipedia, "Frequently operations on binary matrices are defined in terms of modular arithmetic mod 2."

The same article says "If the Boolean domain is viewed as a semiring, where addition corresponds to logical OR and multiplication to logical AND, the matrix representation of the composition of two relations is equal to the matrix product of the matrix representations of these relation."

Note that ordinary numbers can also be viewed as a semiring with for example + being max and * being +. This does not mean that it would make sense to redefine numpy's + and * this way.

I think most people expect things that can be added with + and multiplied with * to form a ring, not a semiring.

If you want semiring operations you can use | and &. If you want to implement matrix algebra over a semiring, you can use logical_or.reduce and logical_and.reduce or even maximum.reduce and minimum.reduce.

@alan-isaac
Copy link

@balkin I believe your comment misses the point. A correct behavior currently exists in NumPy. It is a behavior that has a widely recognized usage. (There is a whole literature.) Removing a correct behavior and suggesting users can always find a workaround does not make sense.

Very simple example: compare what computation of the (non-reflexive) transitive closure of a graph (using its Boolean matrix representation) will look like before and after the proposed change. (Hint: dot current works correctly for boolean arrays; this will break.)

@abalkin
Copy link
Contributor

abalkin commented Dec 6, 2013

@alan-isaac - there is a discussion on this topic at the ML. I suggest that we move our dispute there.

@alan-isaac
Copy link

@balkin Rather than engaging in a dispute, I prefer to think of myself as making a request. The current behavior of + and * for boolean arrays is correct, and as a result dot behaviors correctly too. By correctly, I mean that it is in conformance with a large literature. My request is: don't break this well established, useful, and correct behavior. PLEASE.

@seberg
Copy link
Member Author

seberg commented Dec 7, 2013

Just to note, it is now only - being deprecated. I modified the occurances in MA and some tests. np.allclose and np.ma.allclose need some kind of fix (ma allclose is tested and fails), though not sure how to best fix it (the np.assert_... modification might need to be revisited similarly).

@charris
Copy link
Member

charris commented Dec 19, 2013

This should be run past the list once more so that everyone is clear on the proposal and can sign off on it.

@charris
Copy link
Member

charris commented Feb 2, 2014

The tests fail because the - deprecations get turned into errors in the tests.

@charris
Copy link
Member

charris commented Feb 9, 2014

@seberg If you fix the tests I'll put this in.

@seberg
Copy link
Member Author

seberg commented Feb 10, 2014

Ah, now I remember why I didn't fix it earlier... The allclose functions (only ma gets tested) also use abs(a - b), and the question is whether we should add a new special case for them or not. (I guess yes, but...)

@njsmith
Copy link
Member

njsmith commented Feb 10, 2014

+1 for just teaching allclose how to compare booleans. (Possibly allclose
should just delegate to array_equal when dealing with any exact type.)

On Mon, Feb 10, 2014 at 8:28 AM, seberg notifications@github.com wrote:

Ah, now I remember why I didn't fix it earlier... The allclose functions
(only ma gets tested) also use abs(a - b), and the question is whether we
should add a new special case for them or not. (I guess yes, but...)


Reply to this email directly or view it on GitHubhttps://github.com//pull/4105#issuecomment-34630630
.

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

@pv
Copy link
Member

pv commented Feb 10, 2014

allclose should not delegate to array_equal for integer types, as the
semantics imply comparison with tolerances.

This perhaps implies that arguments of allclose should be cast to floats/complex
inside the function.

@seberg
Copy link
Member Author

seberg commented Feb 10, 2014

I fixed that, but there is a question remaining for me actually... Right now I did this with a deprecation warning since I thought we will remove the functionality. But removing it will probably cause it to be upcast (not sure to what exactly right now) unless we raise an error explicitly. So it should be a FutureWarning or do we plan on raising an error explicitly.

@njsmith
Copy link
Member

njsmith commented Feb 10, 2014

Okay, right, my off the cuff suggestion wasn't so good. Let me try making
another and see if this one has better luck :-)

In np.allclose, maybe we should just always cast all non-inexact types to
float? The allclose semantics are defined in terms of floating point
arithmetic anyway, so the only way it can possibly work for non-floats is
if a coercion happens at some point. Right now it's implicit in abs or
something, but if we just make it explicit then shouldn't that handle all
these messy int and bool cases correctly?

On Mon, Feb 10, 2014 at 10:20 AM, seberg notifications@github.com wrote:

I fixed that, but there is a question remaining for me actually... Right
now I did this with a deprecation warning since I thought we will remove
the functionality. But removing it will probably cause it to be upcast (not
sure to what exactly right now) unless we raise an error explicitly. So it
should be a FutureWarning or do we plan on raising an error explicitly.


Reply to this email directly or view it on GitHubhttps://github.com//pull/4105#issuecomment-34643067
.

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

@seberg
Copy link
Member Author

seberg commented Feb 10, 2014

Could do the cast to an "at least" float type. That way that minimum int stuff gets fixed... For this it doesn't really matter.

@charris
Copy link
Member

charris commented Feb 10, 2014

Sounds like a good approach.

@seberg
Copy link
Member Author

seberg commented Feb 12, 2014

OK, I did that by casting y to result_type(y, 1.) which will make everything inexact. The error is still a DeprecationWarning, should it be changed to a futurewarning saying that it will be cast, or do we want to forbid it explicitly for the subtraction?

Boolean - is not well defined, especially the unary and
binary operator are not compatible. In general boolean
minus seems to have no real application and does not do
what might be expected.

All "allclose" type functions (numpy, tests, masked) have to
now check for boolean to avoid the deprecation warning. In
the future one could think about removing it again and just
allowing the upcast.
# make sure y is an inexact type to avoid abs(MIN_INT); will cause
# casting of x later.
dtype = result_type(y, 1.)
y = array(y, dtype=dtype, copy=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to import array in line 796.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I seem to keep forgotting to run the tests... should those imports really be there and not on the module level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, they are for all, so it would be a larger change to really do that right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of imports like that in this file. I'd be inclined to clean them up sometime, but it isn't needed for this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, put the imports there, if you guys prefer another way to do this (I mean the result_type thing), I can change it. Added a simple extra test, because I was a bit worried I overlooked things with result_type and array-likes.

@charris
Copy link
Member

charris commented Feb 15, 2014

@seberg Test failure due to missing array import.

# casting of x later.
dtype = result_type(y, 1.)
y = array(y, dtype=dtype, copy=False)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to have acquired an insatiable taste for blank lines ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, like blank lines :)... but two blank lines is just insane (and that one after y is nonsense too...). fixed.

Casting y to an inexact type fixes problems such as
abs(MIN_INT) < 0, and generally makes sense since the allclose
logic is inherently for float types.
@charris
Copy link
Member

charris commented Feb 15, 2014

This should fix allclose for minimum integers also, right?

@seberg
Copy link
Member Author

seberg commented Feb 15, 2014

Yes, tests are in place for that. (probably not the magic closes in the commit message)

@charris
Copy link
Member

charris commented Feb 15, 2014

OK, merging. Thanks Sebastian. I'll hunt down the other tickets dealing with minimum ints.

charris added a commit that referenced this pull request Feb 15, 2014
DEP: Deprecate boolean math operations
@charris charris merged commit 2868dc4 into numpy:master Feb 15, 2014
@seberg seberg deleted the deprecate-boolean-math branch February 15, 2014 23:41
@seberg
Copy link
Member Author

seberg commented Feb 15, 2014

Thanks.

@charris
Copy link
Member

charris commented Feb 15, 2014

One thing we might want to do in the future is have more explicit error messages for non-numeric + boolean types. Currently they fail with isinf, which is a bit fortuitous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants