-
-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: __array_ufunc__ should always be looked up on the type, never the instance #9087
Conversation
1836f00
to
260a128
Compare
260a128
to
671cfdd
Compare
int disables; | ||
|
||
array_ufunc = PyObject_GetAttrString(obj, "__array_ufunc__"); | ||
disables = (array_ufunc == Py_None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seemed dumb to look up the attribute again, when we already did given that "The __array_func__
attribute must already be known to exist."
@@ -548,15 +554,13 @@ PyUFunc_CheckOverride(PyUFuncObject *ufunc, char *method, | |||
goto fail; | |||
} | |||
|
|||
/* Access the override */ | |||
array_ufunc = PyObject_GetAttrString(override_obj, | |||
"__array_ufunc__"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why look it up a third time? Especially since here we're using a different lookup rule to the previous time
For some reason the last commit is not showing up in this PR - you'll see it if you click the branch name at the bottom of the page |
671cfdd
to
c190798
Compare
} | ||
/* Set the self argument, since we have an unbound method */ | ||
Py_INCREF(override_obj); | ||
PyTuple_SetItem(override_args, 0, override_obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because PyArray_LookupSpecial
returns an unbound method. Arguably it should return a bound one, but that's wasteful to construct, and I wasn't able to make it work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyObject_GetAttrString
should have done that too, was it buggy before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was buggy because it did PyObject_GetAttrString(obj, ...)
not PyObject_GetAttrString(Py_TYPE(obj), ...)
if (obj == Py_None || | ||
PyBool_Check(obj) || | ||
/* Basic number types */ | ||
PyTypeObject const * const known_types[] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is somewhat performance relevant for small arrays, a large if will give better code than creating this array each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't the compiler optimize the array since it's const, and full of addresses to static variables?
I could add a static
if it helps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no those variables may come from a dynamic loading so they are not const as far as the compiler can tell.
adding static won't work for the same reason.
You could add a static init variable, but imo just using an if statement is nicer than modifiable global state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, that's a bit of a pain. I guess the if
is not too verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an if is less lines of code and a decent text editor with block and multicursor editing should handle it just as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait for the tests to pass before pushing that, to see if I missed anything that my local build didn't pick up. Any other comments?
This also corrects a broken short-circuit when looking up __array_ufunc__
c190798
to
99789a4
Compare
Previously, we would check if the attribute existed on the class, yet use it from the instance. This also cuts 3 lookups of `__array_ufunc__` down to one.
99789a4
to
5bf2a79
Compare
Ugh, warning hell - this file is shared between multiple libraries, but each library only uses one function from it, so warnings are produced about the unused ones
|
add |
|
||
return 0; | ||
/* sentinel to swallow trailing || */ | ||
NPY_FALSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just remove the trailing ||
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do, but it makes it much easier to reorder the above conditions, and add new ones, leaving it this way.
In particular, there's probably some small performance gain to be had by putting the most common types first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed, Py_None should be at the top as in the original as it is one of the common arguments (out, axis, ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll move that back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though that is very very minor as GCC just does all comparisons at once anyway and follows it up with some form of jump table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of not incurring another rebase/wait-for-tests cycle, shall I leave it as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave it, it is not that important that it needs so much microoptimization.
bc888bd
to
d37fc08
Compare
int nargs = PyTuple_GET_SIZE(args); | ||
npy_intp nin = ufunc->nin; | ||
npy_intp nout = ufunc->nout; | ||
npy_intp nargs = PyTuple_GET_SIZE(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing this will emit warnings in the following format functions, they use %d
for intp you have to use NPY_INTP_FMT
or cast them to int again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should just add casts to PyTuple_GET_SIZE
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just leaving it int is probably fine, I'm not too worried about stuff crashing when you pass more than 32 bit arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it seems like either travis does not catch this type of warning, or int == intp
on travis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the warning check is only done on 32 bit where int == intp, I wanted to fix that at some point ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we leave fixing these remaining warnings until that time then? With this change, I get fewer warnings on MSVC, which insists on only showing warnings when an error is elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have them fixed. I can push a commit doing so when you are otherwise done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think I'm otherwise done here. I've given up on making PyArray_LookupSpecial
behave more like _PyObject_LookupSpecial
, because it messes with our if (override == ndarray.__array_ufunc__)
check in ways that I can't debug (probably comparing bound/unbound methods, or a copy being made of the function object at some level)
d37fc08
to
171cdee
Compare
Is there any way to make the warning travis build be first? It's maddening having to come back to this every half hour, only to find that most of the way through the test, the warning one (which isn't easy to test locally) has failed |
the warning test is after the regular matrix job in the.travis.yaml, I never tried switching the ordering of there. It probably works. |
You can reorder the runs in the matrixl: section of .travis.yml
|
I know, these are ordered from slowest to fastest for scheduling already, but I wonder if the |
} | ||
/* does the class define __array_ufunc__? */ | ||
cls_array_ufunc = PyArray_GetAttrString_SuppressException( | ||
(PyObject *)Py_TYPE(obj), "__array_ufunc__"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth noting that before and after this patch, __array_ufunc__
would fail on old-style classes, where type(obj)
is instancetype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is
TypeError: unsupported operand type(s) for +: 'instance' and 'instance'
Which I think is probably good enough
@juliantaylor: Thanks for that commit. Tests all pass. |
thanks |
Does this fix the performance issue you raised in #9085? |
yes as you moved the basic check to the type object it doesn't call GetAttr anymore so CheckOverride is down to 4% of the runtime from 50% |
/* PyTuple_SET_ITEM steals reference */ | ||
PyTuple_SET_ITEM(override_args, 0, (PyObject *)ufunc); | ||
Py_INCREF(Py_None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, thanks for putting this together. Much better! But why set override_args 0 here? It is just overwritten below anyway (and if you keep this write, wouldn't you have to write to decref None below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was worried about the behaviour of PyTuple_SetItem
with a NULL in its target position.
I do not need to decref None
, because unlike SETITEM
, SetItem
handles the decref
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use SET_ITEM
below too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because that doesn't free the reference of the object previously in the tuple. To be honest, the reference counting design here was based on the case that I was sure I had reasoned correctly about, rather than picking a strategy that would require me to read the tuple source code again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough! Indeed, while correcting, I played with changing it to SET_ITEM
, and realised it became a convoluted pain of now having to decref things oneself.
Py_DECREF(array_ufunc); | ||
/* Call the method */ | ||
*result = PyObject_Call( | ||
override_array_ufunc, override_args, normal_kwds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to DECREF override_array_ufunc
after this, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, they all get should have been cleaned up at the bottom. Wanna patch that quick?
@@ -580,6 +584,9 @@ PyUFunc_CheckOverride(PyUFuncObject *ufunc, char *method, | |||
return 0; | |||
|
|||
fail: | |||
for (i = 0; i < num_override_args; i++) { | |||
Py_XDECREF(array_ufunc_methods[i]); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, mistake here - this path needs to be run under both failure and success
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is probably the easiest solution...
@eric-wieser, @juliantaylor - this is much nicer than what I had -- I had thought of storing But also a worry about a memory leak: see inline comment. |
I assume a fixup for this is coming? |
I don't have time tonight for a fixup, I'm afraid. @mhvk, can you make the fix mentioned in the inline comment? |
OK, I'll make a fixup. |
See #9092 |
there is a subtle problem with this code. The |
This also corrects a broken short-circuit when looking up
__array_ufunc__
.Doesn't build on my machine, but msvc is not giving me useful errors.