New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make comparison function (gt, ge, ...) respect __array_priority__. #3324

Merged
merged 11 commits into from May 23, 2013

Conversation

Projects
None yet
4 participants
@nouiz
Copy link
Contributor

nouiz commented May 10, 2013

I made tests in Theano and they passed. Where should I add test about this in NumPy?

fixes Theano/Theano#1375

@charris

This comment has been minimized.

Copy link
Member

charris commented May 10, 2013

I'd add the test to numpy/core/tests/test_multiarray.py. Probably all the builtin operators should be tested, but that would be a start. Maybe as class TestArrayPriority. A quick check here shows it works for polynomials now:

In [1]: from numpy.polynomial import Polynomial as P

In [2]: p = P([1,2,3])

In [3]: ones(3) == p
Out[3]: False
@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 10, 2013

Out of curiosity, how does it work for the usual operators? Do those work because they are defined on the C-level, and just always have lower priority?

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 12, 2013

Two things, first to make it work like the other things work, the change should probably be more along the diff below. That said, I think this can probably use some more thought and the proposed change may actually be better then the current behaviour. Until I don't even understand why array * matrix works (and np.multiple(array, matrix) does correct normal multiplication), I am not sure what to think...

diff --git a/numpy/core/src/multiarray/arrayobject.c b/numpy/core/src/multiarray/arrayobject.c
index ad196e0..a380a48 100644
--- a/numpy/core/src/multiarray/arrayobject.c
+++ b/numpy/core/src/multiarray/arrayobject.c
@@ -1298,7 +1298,7 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
         }

         result = PyArray_GenericBinaryFunction(self,
-                (PyObject *)array_other,
+                (PyObject *)other,
                 n_ops.equal);
         if ((result == Py_NotImplemented) &&
                 (PyArray_TYPE(self) == NPY_VOID)) {
@@ -1354,7 +1354,7 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
             return Py_NotImplemented;
         }

-        result = PyArray_GenericBinaryFunction(self, (PyObject *)array_other,
+        result = PyArray_GenericBinaryFunction(self, (PyObject *)other,
                 n_ops.not_equal);
         if ((result == Py_NotImplemented) &&
                 (PyArray_TYPE(self) == NPY_VOID)) {
diff --git a/numpy/core/src/umath/ufunc_object.c b/numpy/core/src/umath/ufunc_object.c
index a3a1647..5cf1075 100644
--- a/numpy/core/src/umath/ufunc_object.c
+++ b/numpy/core/src/umath/ufunc_object.c
@@ -492,6 +492,14 @@ _has_reflected_op(PyObject *op, char *name)
     _GETATTR_(bitwise_and, rand);
     _GETATTR_(bitwise_xor, rxor);
     _GETATTR_(bitwise_or, ror);
+    /* Comparisons */
+    _GETATTR_(equal, eq);
+    _GETATTR_(not_equal, ne);
+    _GETATTR_(greater, lt);
+    _GETATTR_(less, gt);
+    _GETATTR_(greater_equal, le);
+    _GETATTR_(less_equal, ge);
     return 0;
 }
@charris

This comment has been minimized.

Copy link
Member

charris commented May 15, 2013

@nouiz Any progress on this?

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 15, 2013

@seberg With the +, -, ... operator, NumPy check the array_priority flag. I copy/pasted part of the code from there. If is find that that right operand have higher priority, it will raise NotImplemented, and python will call the radd, ... function.

There isn't such __r* function for logic operator, so we need to call them manually.

If the right operand don't have an higher priority, it is the ndarray function that will be executed.

Do this answer your question?

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 15, 2013

about array * matrix, matrix have higher priority then array, so it is matrix.rmul(array) that get executed in the end. I'll try your suggested diff.

@charris I have many other priority, I'll try to finish this shortly, but it won't be as shortly as I would like. I haven't started doing new tests in NumPy.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 15, 2013

@nouiz, well, I simply have no idea where the exact code is which causes array * matrix to trigger the matrix product! I know it should be related to array priority, but if the code I changed in that diff also handles that I don't understand how.

The change you suggested checked array_priority before actually hitting the ufunc machinery at all. And quite honestly I think that might be better then doing it in the ufunc machinery. np.multiply(array, matrix) does not trigger a matrix product, but I am not sure if you cannot construct cases where such a thing happens. But that is probably a larger discussion for the mailing list if it gets confusing...

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 15, 2013

I perfer @seberg fix. It reuse the current mechanism. I don't like the idea of using 2 differents mechanism for the comparison and for the other function. I don't know a reason why we should not use the current mechanism. Do you know one? If there is, I think it should be another PR that change the mechanism for all function.

I think my PR is ready to merge. Other questions/comments? I can remove the 2 firsts commits if you wish. But I wanted to keep them for now, just in case you prefer that.

I think we need a way to have element-wise multiple between an array and a matrix. Changing the behavior of np.multiple would I think remove this possibility. To my knowledge array_priority is only used for function with special python syntax. I think we should continue to do that. But I don't have a strong opinion, I can easily be convinced with some argument on that.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 16, 2013

Yes, you are right, anything more could be thought about later. In my diff, I just replaced array_other with other in the ufunc call. I didn't look carefully, but maybe the whole thing should be a bit more modified/cleaned up (i.e. most of the time array_other is never used), so I would prefer if the the call is moved before the array_other creation and then it breaks if everything is fine already.

The test look nice, but I believe assert_(...) is preferable to the assert keyword argument for testing? There are also many defined assertion functions, i.e. assert_array_equal, assert_allclose...

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 16, 2013

One thing I don't understand, why the case of eq and neq is so different
then the other comparison? There isn't any comment to explain why. Could it
be just for historic reason and we could just change it to be of the same
structure? I'm trying to find, but if you know why or have any guest, it
would help.

On Thu, May 16, 2013 at 9:43 AM, seberg notifications@github.com wrote:

Yes, you are right, anything more could be thought about later. In my
diff, I just replaced array_other with other in the ufunc call. I didn't
look carefully, but maybe the whole thing should be a bit more
modified/cleaned up (i.e. most of the time array_other is never used), so
I would prefer if the the call is moved before the array_other creation
and then it breaks if everything is fine already.

The test look nice, but I believe assert_(...) is preferable to the
assert keyword argument for testing? There are also many defined assertion
functions, i.e. assert_array_equal, assert_allclose...


Reply to this email directly or view it on GitHubhttps://github.com//pull/3324#issuecomment-18001650
.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 16, 2013

I think it is for structured arrays. Comparisons for them are ill defined anyway, but since ufuncs cannot handle general structured arrays this is a hack to make at least the simple comparisons work for them. I.e.

a = np.array([(1,2), (3,4)], dtype='i,i')
a == a[0] # works, pretty much nothing else seems to work, but...
@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 16, 2013

I updated the PR to reorder that. I added a comment about what you told.

On Thu, May 16, 2013 at 10:18 AM, seberg notifications@github.com wrote:

I think it is for structured arrays. Comparisons for them are ill defined
anyway, but since ufuncs cannot handle general structured arrays this is a
hack to make at least the simple comparisons work for them. I.e.

a = np.array([(1,2), (3,4)], dtype='i,i')
a == a[0] # works, pretty much nothing else seems to work, but...


Reply to this email directly or view it on GitHubhttps://github.com//pull/3324#issuecomment-18003845
.


// I have been told that the rest of this case is probably an
// hack to support a few cases of structured arrays since
// ufuncs cannot handle general structured arrays.

This comment has been minimized.

@seberg

seberg May 16, 2013

Member

No C-style comments please, use for multiline comments use:

/*
 * Multi line comment
 * here.
 */
@@ -1297,9 +1307,6 @@ static void _unistripw(npy_ucs4 *s, int n)
return Py_NotImplemented;
}

result = PyArray_GenericBinaryFunction(self,
(PyObject *)array_other,
n_ops.equal);
if ((result == Py_NotImplemented) &&
(PyArray_TYPE(self) == NPY_VOID)) {

This comment has been minimized.

@seberg

seberg May 16, 2013

Member

You can move the whole array creation inside this or actually, invert the block and break.

// I have been told that the rest of this case is probably an
// hack to support a few cases of structured arrays since
// ufuncs cannot handle general structured arrays.

/* Make sure 'other' is an array */
if (PyArray_TYPE(self) == NPY_OBJECT) {

This comment has been minimized.

@seberg

seberg May 16, 2013

Member

Once we break out early for anything but void arrays, this if should become unnecessary. (which means dtype is just NULL). I wonder if we should not actually force-cast to the correct void typed array. That way

a = np.array([(0,1)], dtype='i,i')
a == (0,1)

should work correctly. Is there anything that could break if we change that or would that be another bug fix? And if, we need to test for that case as well if array priority works correctly.

* hack to support a few cases of structured arrays since
* ufuncs cannot handle general structured arrays.
*/

/* Make sure 'other' is an array */
if (PyArray_TYPE(self) == NPY_OBJECT) {

This comment has been minimized.

@seberg

seberg May 16, 2013

Member

Seems I lost my comment. If we break out earlier, the check is unnecessary (it cannot be npy_object). However the block is there to force object dtype. I wonder if it would not be correct to force the dtype for void arrays. That way the code:

a = np.array([(0,1)], dtype='i,i')
a == (0, 1)

should work. Would that be another fix, or is there anything that might go bad? Would need to test if array priority works correctly for that case then.

This comment has been minimized.

@nouiz

nouiz May 16, 2013

Contributor

I'm not willing to change the structured array behavior. This is something
I never used and don't know well. It would be easy for me to make mistake
without learning more about them. Also, logically, this is something else
and should be in another PR.

I fixed the comment style, simplified the if and moved the computation
inside the if (after validating this is equivalent).

About the if with NPY_OBJECT, why PyArray_GenericBinaryFunction will always
work when this is false? I think we need to keep it. Removing it would at
worst, generate bad results. Keeping it at worst would make it as slow as
before. Don't forget, numpy is missing many tests and I don't have the time
to look into this case in more details. So I think we should keep it.

On Thu, May 16, 2013 at 4:03 PM, seberg notifications@github.com wrote:

In numpy/core/src/multiarray/arrayobject.c:

@@ -1280,6 +1280,18 @@ static void _unistripw(npy_ucs4 *s, int n)
Py_INCREF(Py_False);
return Py_False;
}

  •    result = PyArray_GenericBinaryFunction(self,
    
  •            (PyObject *)other,
    
  •            n_ops.equal);
    
  •    if (result && result != Py_NotImplemented)
    
  •      break;
    
  •    /*
    
  •     \* I have been told that the rest of this case is probably an
    
  •     \* hack to support a few cases of structured arrays since
    
  •     \* ufuncs cannot handle general structured arrays.
    
  •     _/
    
    • /_ Make sure 'other' is an array */
      if (PyArray_TYPE(self) == NPY_OBJECT) {

Seems I lost my comment. If we break out earlier, the check is unnecessary
(it cannot be npy_object). However the block is there to force object
dtype. I wonder if it would not be correct to force the dtype for void
arrays. That way the code:

a = np.array([(0,1)], dtype='i,i')
a == (0, 1)

should work. Would that be another fix, or is there anything that might go
bad? Would need to test if array priority works correctly for that case
then.


Reply to this email directly or view it on GitHubhttps://github.com//pull/3324/files#r4263883
.

This comment has been minimized.

@seberg

seberg May 17, 2013

Member

Yeah, I think it would be wrong anyway (because of casting), I was just thinking out loud...

The if contradicts the one that is in your current version right above it, which means dtype is always NULL. Maybe you could change the comment to something more specific: The ufunc does not support void/structured types, so these need to be handled specifically.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 17, 2013

Anyway, I am pretty much good with this, only other thing would be replacing the asserts in the test with assert_(...) function, and the assert np.allclose(...) with assert_allclose(...) or assert_array_equal.
I am not sure how important avoiding asserts is in tests (and I see there are some in numpy...), I guess it could happen that in release the assert the is optimized away by python otherwise?

@charris

This comment has been minimized.

Copy link
Member

charris commented May 17, 2013

Yeah, assert becomes a nop when python is run with the -O optimize flag. It's really a nose problem, asserts should only be used as debugging aids. The assert_ function avoids that difficulty.

@charris

This comment has been minimized.

Copy link
Member

charris commented May 17, 2013

If there are some asserts in numpy, they should be fixed.

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 17, 2013

I fixed the assert and the comment.

On Fri, May 17, 2013 at 8:52 AM, Charles Harris notifications@github.comwrote:

If there are some asserts in numpy, they should be fixed.


Reply to this email directly or view it on GitHubhttps://github.com//pull/3324#issuecomment-18059524
.

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 23, 2013

Is there something else I need to do for this PR?

thanks

@charris

This comment has been minimized.

Copy link
Member

charris commented May 23, 2013

@seberg I was leaving this to Sebastian.

(PyArray_TYPE(self) == NPY_VOID)) {
if (PyArray_TYPE(self) == NPY_VOID) {
/* Make sure 'other' is an array */
if (PyArray_TYPE(self) == NPY_OBJECT) {

This comment has been minimized.

@seberg

seberg May 23, 2013

Member

Sorry, quite busy. if type is void, then here it cannot be object, so this really is unnecessary (and dtype always NULL). Puzzles a bit why it was there in the first place, but it is not there for the other comparisons so I can't see it being necessary.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 23, 2013

Anyway, looks good to me and tests seem nice, so will merge it soon one way or another I think.

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 23, 2013

I just pushed a new commit that remove the useless code.

I didn't test it locally, so wait for travis to pass.

On Thu, May 23, 2013 at 11:40 AM, seberg notifications@github.com wrote:

Anyway, looks good to me and tests seem nice, so will merge it soon one
way or another I think.


Reply to this email directly or view it on GitHubhttps://github.com//pull/3324#issuecomment-18351803
.

@nouiz

This comment has been minimized.

Copy link
Contributor

nouiz commented May 23, 2013

travis-ci test passed.

@seberg

This comment has been minimized.

Copy link
Member

seberg commented May 23, 2013

OK, I will merge this then, thanks. Btw. we usually add prefixes like "BUG" to the commit message in numpy. For the future maybe :).

seberg added a commit that referenced this pull request May 23, 2013

Merge pull request #3324 from nouiz/prio_cmp
Make comparison function (gt, ge, ...) respect __array_priority__.

@seberg seberg merged commit 56a878c into numpy:master May 23, 2013

1 check passed

default The Travis CI build passed
Details
@cowlicks

This comment has been minimized.

Copy link
Contributor

cowlicks commented Jun 30, 2013

Any chance of this being backported to 1.7? I'm adding support for sparse matrix inequalities to scipy/sparse and it depends on this.

@charris

This comment has been minimized.

Copy link
Member

charris commented Jul 1, 2013

@cowlicks Looks good to backport. Want to do a PR?

@cowlicks

This comment has been minimized.

Copy link
Contributor

cowlicks commented Jul 1, 2013

@charris will do.

@cowlicks cowlicks referenced this pull request Jul 1, 2013

Merged

Backport 3324 #3490

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment