ENH: Add `__array_ufunc__` #8247

charris · 2016-11-06T17:58:15Z

This reverts commit bac094c and enables
__numpy_ufunc__ for development in the NumPy 1.13.0 development cycle.

EDIT: Note that the name has been changed to __array_ufunc__ together with various changes to the function signature and implementation.

njsmith · 2016-11-06T18:09:11Z

I'm all for getting this into 1.13, but I don't get why we would want to
enable it on master before it's ready. All that does is put master into an
unreleasable state? Generally the rule is things should live on branches
until they're ready, and this is part of how we got into the mess with
scipy jumping the gun...

On Nov 6, 2016 9:58 AM, "Charles Harris" notifications@github.com wrote:

This reverts commit bac094c
bac094c
and enables

numpy_ufunc for development in the NumPy 1.13.0 development cycle.

You can view, comment on, or merge this pull request online at:

#8247
Commit Summary

ENH: Revert "Temporarily disable numpy_ufunc"

File Changes

M numpy/core/src/private/ufunc_override.h
https://github.com/numpy/numpy/pull/8247/files#diff-0 (6)

M numpy/core/tests/test_multiarray.py
https://github.com/numpy/numpy/pull/8247/files#diff-1 (20)

M numpy/core/tests/test_umath.py
https://github.com/numpy/numpy/pull/8247/files#diff-2 (25)

Patch Links:

https://github.com/numpy/numpy/pull/8247.patch

https://github.com/numpy/numpy/pull/8247.diff

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#8247, or mute the thread
https://github.com/notifications/unsubscribe-auth/AAlOaEd4slFcqly26_j-yqed1ikcnhwQks5q7hU5gaJpZM4KqoDm
.

charris · 2016-11-06T18:39:18Z

My thought is to have it there so we can work on it and have it tested against. If things breaks, tests will break and will need fixing anyway. I think trying to work this out in one big commit would be difficult and I would like to get an early start. However, there are a couple of things we might want to put in here before merging this:

Change the name, mostly on account of scipy tests enabling themselves.
Remove the index.

Other changes, such as adding an override, have less effect downstream and can come in later. Currently priority still plays a role and will need a deprecation cycle. I suppose we need to finalize what the function looks like, then argue about the binops.

rgommers · 2016-11-07T08:18:40Z

I would agree with @njsmith in principle, but with @charris in practice. If we don't re-enable it on master, there's a good chance it will never make it in. However, at the moment it's very hard to judge how far off we still are. Would be good to have a concise summary of the open points and a plan about who/how to approach those in broad lines before hitting the green button here.

pv · 2016-11-07T13:05:27Z

IIRC, the plan we sort-of converged to try was to (i) rename numpy_ufunc, (ii) remove the index argument, (iii) in Python binops, make it so that ndarray and its subclasses always defer to numpy_ufunc, if present. If numpy_ufunc ends up with NotImplemented, this results to a TypeError, never falling back to Python resolution. . The last point takes a similar (and backward-compatible) view as Numpy does currently vs. treating all non-array Python objects similarly as shape=() object arrays --- in the Python binop resolution hierarchy, arrays always win over non-arrays. I think this is not the only possible choice, but it was preferred over the alternatives by @njsmith and @shoyer. . I think the question of numpy scalars did not come up so far in the discussion, so that's an open point. . I think the points (i-iii) were discussed to death last time, so unless there's something new to discuss, I think we shouldn't reopen that.

charris · 2016-11-07T15:54:42Z

@pv I think the sticking point was the last. My own preference would be to have an opt out mechanism with the usual NotImplemented return. The reason is twofold, first, at the present time, many Objects opt out by using __array_priority__ in combination with an __rop__ as we have recommended , second, there is a desire to deprecate __array_priority__ as it doesn't really work for its original intended purpose, the second is that the Python way is to rely on NotImplemented and write an __rop__ . Note that the core problem is greedy arrays and how to keep them in check. Now it is possible to get around the lack of an __rop__ call using __numpy_ufunc__ (possibly one of the few cases where an index would be helpful), but it requires a fair amount to new code and knowledge of the what ufuncs are used for which __op__. The easy way out would be to change the meaning of __array_priority__ >= 1000 to opt out and otherwise ignore it at some future date, the other option is to add an opt out attribute. I'm in favor of the either of the last two options.

pv · 2016-11-08T00:19:06Z

Yes, that is the point where we failed to find consensus on what is the optimum. It's quite possible that if we start again, the issue will not be clarified, so the decision needs to be made without convergence (and with an escape route in mind). I think there were some responses to your specific point above in the old huge discussion (rare use case to handle numpy stuff without knowing about numpy, provide dummy implementation in numpy?), plus alternative proposals for the opt-out.

charris · 2016-11-09T02:27:30Z

I've left the NEP alone as an historical document, even though names and arguments have changed.

mhvk · 2016-11-11T11:17:06Z

Agreed with @pv's summary. And that the remaining unresolved issue was with treating items that become object arrays. If I recall correctly, plans are now afoot to disallow automatic creation of object arrays with np.array, so this would be a good time to do the same with __numpy_ufunc__. With that, it should be possible to get rid of __array_priority__ much more easily.

charris · 2016-11-12T20:02:19Z

I've removed the index. For out always tuple, does that mean the the **kwds will always contain an out entry, possibly empty? I note that the index will currently contain positions in the out argument, so I assume that having __array_ufunc__ is also called if one of an out arguments implements it.

Note on documentation, there looks to be very little any apart from the NEP, which is now outdated.

EDIT: Looks like the tuple needs to contain exactly as many entries, some possibly None, as the function has, so I guess this means always having out in the keyword dictionary. Is there any interest in deprecating non-keyword out? I note that having one of each for two output functions is deprecated and should probably go to an error soon.

charris · 2016-11-13T19:26:23Z

It looks to me like the nin argument to PyUFunc_CheckOverride can be removed if the out argument is always a tuple keyword argument. IFAICT, the only need for it now is for the three psuedo ufuncs used for matrix multiplication, but if the required arguments, and only the required arguments, are passed in the args tuple, the length of that tuple fills the need. @pv Comment?

charris · 2016-11-14T21:15:49Z

OK, there are complications with __array__ufunc__ in general that I think we should deal with to keep it simple on the user end. Take for instance the following for the reduce, accumulate, and reduceat methods:

    static char *reduce_kwlist[] = {
            "array", "axis", "dtype", "out", "keepdims", NULL};
    static char *accumulate_kwlist[] = {
            "array", "axis", "dtype", "out", "keepdims", NULL};
    static char *reduceat_kwlist[] = {
            "array", "indices", "axis", "dtype", "out", NULL};

Note that all of these can be passed by position or by keyword, there is no simple division between the two except that everything after the last positional argument must be passed by keyword. ISTM that from the user end that we should have a definite division into required (positional) and optional (keyword) arguments. That is not so difficult to achieve, but is likely to add call overhead. I think that the simplest way to implement this is to use PyArg_ParseTupleAndKeywords with all object output types and then construct the args tuple and kwargs dictionary from them. We could possibly save some time by only doing this if one of the inputs implements __array_ufunc__, although even in that case we need to determine the inputs. Also note how far along the out argument appears in the given examples.

Thoughts?

mhvk · 2016-11-14T22:08:22Z

I agree that all optional arguments should always end up in kwargs. I'm not sure why you say one needs to have an out even if there are no actual output arrays passed in. The advantage of not doing that is that the user can hide possibly expensive checks for any special cases behind a simply if kwargs: ....

One question: can even an array passed in via where have __array_ufunc__? (out arrays already do -- and should. I cannot immediately think why it would be useful for where [except trickery where it is an array that adapts itself based on the inputs] but it is a form of consistency to allow any of the array inputs to override)

njsmith · 2016-11-14T22:14:06Z

It would be useful to allow where= to be a sparse array.

On Nov 14, 2016 2:09 PM, "Marten van Kerkwijk" notifications@github.com
wrote:

I agree that all optional arguments should always end up in kwargs. I'm
not sure why you say one needs to have an out even if there are no actual
output arrays passed in. The advantage of not doing that is that the user
can hide possibly expensive checks for any special cases behind a simply if
kwargs: ....

One question: can even an array passed in via where have array_ufunc?
(out arrays already do -- and should. I cannot immediately think why it
would be useful for where [except trickery where it is an array that
adapts itself based on the inputs] but it is a form of consistency to allow
any of the array inputs to override)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#8247 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAlOaOmaG1Xmnj9Qs1d2dlR0sQTBQgwsks5q-NvXgaJpZM4KqoDm
.

charris · 2016-11-14T22:23:32Z

@mhvk I didn't say there needs to be an out argument, I'm thinking of parsing out all the arguments, but in that case the ones not found will be NULL and can be ignored in constructing the kwrds dictionary.

eric-wieser · 2016-11-15T01:27:52Z

we should have a definite division into required (positional) and optional (keyword) arguments

What exactly do you mean by that? Are you suggesting position-only arguments, keyword-only arguments, or both? Is there any precedent elsewhere in numpy for keyword-only arguments (when not combined with vararg functions)?

shoyer · 2016-11-15T01:36:59Z

What exactly do you mean by that? Are you suggesting position-only arguments, keyword-only arguments, or both? Is there any precedent elsewhere in numpy for keyword-only arguments (when not combined with vararg functions)?

This is for the arguments passed into __array_ufunc__, which is only called by NumPy, not users.

charris · 2016-11-15T02:24:59Z

Checking, I see that the override function already takes care of the argument normalization.

charris · 2016-11-15T17:42:42Z

@pv The outer ufunc method just reshapes the input arrays and calls ufunc_generic_call, passing on the keywords untouched. The upshot is that PyUFunc_CheckOverride is called twice, but in only the second call are the arguments properly normalized. I propose to drop the first call. If not dropped, the arguments should at least be normalized. Thoughts?

EDIT: OK, I think we need to preserve the call as the array reshape might fail for some inputs, so the arguments need a proper normalization.

pv · 2016-11-15T17:51:47Z

The input may contain objects for which reshaping may not be even defined, so preferably it should dispatch once, immediately at the beginning of the method.

charris · 2016-11-15T18:26:46Z

Hmm, there is no error checking in the normalization functions, not even if there are sufficient positional arguments. All of the python C-API functions may error, even PyTuple_GetSlice where it is undocumented.

EDIT: And the positional arguments could be passed as keywords. Should we error on that?

charris · 2016-11-15T19:37:17Z

The default values of optional arguments are also not set, should we handle those?

mhvk · 2016-11-15T19:55:38Z

The default values of optional arguments are also not set, should we handle those?

I'd say we should not set anything unless it is explicitly passed in, if only to ensure that if kwargs is empty, no special stuff needs to be done.

charris · 2017-04-27T20:04:18Z

Merged. Thanks to all for pulling this off, and a special thanks to Marten for sheparding the process to completion.

Further modifications and enhancements should go through the normal PR process.

shoyer · 2017-04-27T20:09:36Z

Thanks @charris.

Just to be 100% clear: I think we've very close, but before releasing NumPy we need to change the ufunc override rules so that encountering any argument with __array_ufunc__ = None disables overrides entirely.

mrocklin · 2017-04-27T20:12:17Z

I am very glad to see this. Thank you for pushing this through.

I'm now looking forward to the 1.13 release more than any other release that I can recall :)

eric-wieser · 2017-04-27T20:13:03Z

Do we want to get MaskedArray using __array_ufunc__ before we release 1.13, as a testing ground?

charris · 2017-04-27T20:15:48Z

@eric-wieser I think it is best to leave applications to future releases as they are likely to take a fair amount of time to settle as people experiment and gain experience.

shoyer · 2017-04-27T20:24:11Z

Do we want to get MaskedArray using array_ufunc before we release 1.13, as a testing ground?

I think this is a definitely worth trying. Certainly implementing NDArrayOperatorsMixin turned up enough design design issues on its own.

mhvk · 2017-04-27T21:27:18Z

🎆 🎆 🎆

I'd actually not make further changes right now -- we do note that the thing is provisional after all.

Definitely, redoing MaskedArray using it would be great, but surely for 1.14!

shoyer · 2017-04-27T21:28:59Z

To be clear, I think it's worth trying to redo MaskedArray with it, not necessarily worth merging the attempt :).

eric-wieser · 2017-04-27T21:31:16Z

@shoyer: Was the use of the word redo intentional? It's tempting to make a new np.ma2 or np.masked module which discards the hardmask and sharedmask behaviour...

shoyer · 2017-04-27T21:35:49Z

@eric-wieser ask @mhvk, he used "redo" first :)

mhvk · 2017-04-27T21:39:30Z

I'm definitely tempted to redo -- that way we also don't have to worry about, e.g., introducing views of masks instead of copies. And the code could become substantially more agnostic about what the items being masked actually are. Anyway, for another issue/PR...

shoyer · 2017-04-28T06:52:01Z

See #9014 for a follow-up PR, changing the behavior of __array_ufunc__ = None and adding recommendations to the NEP.

njsmith · 2017-04-29T04:46:23Z

If re-doing np.ma with more sensible semantics, then it should probably at least start as a third-party library, to allow a little more room for experimentation, independent release cycle, etc.? We could still link to it from the docs or even a DeprecationWarning on np.ma.

It definitely would be an interesting experiment.

* ENH: NDArrayOperatorsMixin calls ufuncs directly, like ndarray Per our discussion in numpy#8247 (comment) * add back in accidentally dropped __repr__

mrocklin · 2017-06-07T21:36:16Z

I have a naive implementation of this up for dask.array here: dask/dask#2438

In [1]: import dask.array as da, numpy as np

In [2]: x = np.arange(24).reshape((4, 6))

In [3]: np.sum(np.sin(x), axis=0)
Out[3]:
array([-1.56697566,  2.06850183,  3.80220828,  2.04018197, -1.59757823,
       -3.76653238])

In [4]: d = da.arange(24, chunks=(4,)).reshape((4, 6))

In [5]: np.sum(np.sin(d), axis=0)
Out[5]: dask.array<sum-aggregate, shape=(6,), dtype=float64, chunksize=(6,)>

In [6]: _.compute()
Out[6]:
array([-1.56697566,  2.06850183,  3.80220828,  2.04018197, -1.59757823,
       -3.76653238])

charris · 2017-06-07T23:07:26Z

@mrocklin Have you been happy with the implementation so far?

mrocklin · 2017-06-07T23:39:08Z

@charris very. My first attempt took about ten lines to reroute dask.array's existing functions to be called when the user uses the proper numpy equivalent. I suspect that with a bit more work I can remove dask's functions entirely so that da.add is np.add

mhvk · 2017-06-08T07:06:37Z

@mrocklin -- Great that it seemed relatively easy! Do ping if there seem to be things that do not work quite as expected.

charris added 01 - Enhancement component: numpy._core labels Nov 6, 2016

charris changed the title ~~ENH: Revert "Temporarily disable __numpy_ufunc__"~~ WIP, ENH: Revert "Temporarily disable __numpy_ufunc__" Nov 6, 2016

charris mentioned this pull request Nov 6, 2016

Drop index from __numpy_ufunc__ signature? out always a tuple? #5986

Closed

charris force-pushed the enable-numpy_ufunc branch from 5cf3706 to 88845c0 Compare November 9, 2016 02:26

charris force-pushed the enable-numpy_ufunc branch 2 times, most recently from 50b585f to 07b0e5c Compare November 12, 2016 21:27

dusenberrymw mentioned this pull request Dec 3, 2016

[SYSTEMML-1116] Make SystemML Python DSL NumPy-friendly apache/systemds#290

Closed

charris force-pushed the enable-numpy_ufunc branch from c33d9c4 to 32221df Compare April 27, 2017 19:38

charris merged commit e332ba4 into numpy:master Apr 27, 2017

charris deleted the enable-numpy_ufunc branch April 27, 2017 20:02

shoyer mentioned this pull request Apr 27, 2017

Ufunc overrides should always raise TypeError if any arguments set __array_ufunc__ = None #9011

Closed

shoyer mentioned this pull request Apr 28, 2017

Backwards compatible mixin like NDArrayOperatorsMixin as a separate project #9016

Closed

mhvk mentioned this pull request Apr 28, 2017

Quantity.__array_ufunc__ astropy/astropy#2583

Merged

eric-wieser mentioned this pull request Apr 28, 2017

Decide on what the resolution rules for __op__/__rop__/__numpy_ufunc__ actually are #5844

Closed

charris mentioned this pull request Apr 28, 2017

Scipy sparse inplace fails after __array_ufunc__ merge #9019

Closed

gdementen mentioned this pull request May 2, 2017

consider support for np 1.13 __array_ufunc__ larray-project/larray#234

Open

mrocklin mentioned this pull request May 3, 2017

Matmul/generic2 pydata/sparse#18

Closed

charris changed the title ~~ENH: Add __array_ufunc__~~ ENH: Add __array_ufunc__ May 9, 2017

jbrockmendel mentioned this pull request Jun 14, 2017

__numpy_ufunc__ confusion in docs #9252

Open

ENH: Add __array_ufunc__ #8247

ENH: Add __array_ufunc__ #8247

Conversation

charris commented Nov 6, 2016 • edited

njsmith commented Nov 6, 2016

numpy_ufunc for development in the NumPy 1.13.0 development cycle.

charris commented Nov 6, 2016

rgommers commented Nov 7, 2016

pv commented Nov 7, 2016 via email

charris commented Nov 7, 2016 • edited

pv commented Nov 8, 2016 • edited

charris commented Nov 9, 2016

mhvk commented Nov 11, 2016

charris commented Nov 12, 2016 • edited

charris commented Nov 13, 2016

charris commented Nov 14, 2016

mhvk commented Nov 14, 2016

njsmith commented Nov 14, 2016

charris commented Nov 14, 2016

eric-wieser commented Nov 15, 2016

shoyer commented Nov 15, 2016

charris commented Nov 15, 2016

charris commented Nov 15, 2016 • edited

pv commented Nov 15, 2016 via email

charris commented Nov 15, 2016 • edited

charris commented Nov 15, 2016

mhvk commented Nov 15, 2016

charris commented Apr 27, 2017 • edited by rgommers

shoyer commented Apr 27, 2017

mrocklin commented Apr 27, 2017

eric-wieser commented Apr 27, 2017

charris commented Apr 27, 2017

shoyer commented Apr 27, 2017

mhvk commented Apr 27, 2017

shoyer commented Apr 27, 2017

eric-wieser commented Apr 27, 2017 • edited

shoyer commented Apr 27, 2017

mhvk commented Apr 27, 2017

shoyer commented Apr 28, 2017

njsmith commented Apr 29, 2017

mrocklin commented Jun 7, 2017

charris commented Jun 7, 2017

mrocklin commented Jun 7, 2017

mhvk commented Jun 8, 2017

ENH: Add `__array_ufunc__` #8247

ENH: Add `__array_ufunc__` #8247

charris commented Nov 6, 2016 •

edited

charris commented Nov 7, 2016 •

edited

pv commented Nov 8, 2016 •

edited

charris commented Nov 12, 2016 •

edited

charris commented Nov 15, 2016 •

edited

charris commented Nov 15, 2016 •

edited

charris commented Apr 27, 2017 •

edited by rgommers

eric-wieser commented Apr 27, 2017 •

edited