A catch-all protocol for numpy-like duck arrays #11129
This is interesting in that it's a bit more general than the |
Right. My hope would be that this would be a placeholder for the common case while more specific protocols, like |
Why restrict this module protocol to certain types at all? We can follow the same algorithm as, for example, (Apologies for the long post)
def variable_dispatch(name, args, **kwargs):
    for arg in args:
        if hasattr(arg, '__array_module__'):
            module = arg.__array_module__()
            if hasattr(module, name):
                retval = getattr(module, name)(args, **kwargs)
                if retval is not NotImplemented:
                    return retval
    raise TypeError('This operation is not possible with the supplied types.')

def dispatch(name, *args, **kwargs):
    for arg in args:
        if hasattr(arg, '__array_module__'):
            module = arg.__array_module__()
            if hasattr(module, name):
                retval = getattr(module, name)(*args, **kwargs)
                if retval is not NotImplemented:
                    return retval
    raise TypeError('This operation is not possible with the supplied types.')

def where(*args, **kwargs):
    return dispatch('where', *args, **kwargs)

def stack(args, **kwargs):
    return variable_dispatch('stack', args, **kwargs)

In[2]: import numpy as np
In[3]: import sparse
In[4]: import sandbox
In[5]: class PotatoArray(sparse.COO):
  ...:     def __array_module__(self):
  ...:         return sparse
  ...:
In[6]: x = PotatoArray(np.eye(5))
In[7]: y = PotatoArray(np.zeros((5, 5)))
In[8]: condition = PotatoArray(np.ones((5, 5), dtype=np.bool_))
In[9]: result = sandbox.where(condition, x, y)
In[10]: sandbox.where(condition, x, y)
Out[10]: <COO: shape=(5, 5), dtype=float64, nnz=5>
In[11]: sandbox.where(condition, x, y).todense()
Out[11]:
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
In[12]: sandbox.stack([x, y], axis=0)
Out[12]: <COO: shape=(2, 5, 5), dtype=float64, nnz=5>
In[13]: sandbox.stack([x, y], axis=0).todense()
Out[13]:
array([[[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]])
In[14]: class A():
   ...:     pass
   ...:
In[15]: sandbox.where(A(), x, y)
Traceback (most recent call last):
  File "/anaconda3/envs/sparse/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-15-775f69206c32>", line 1, in <module>
    sandbox.where(A(), x, y)
  File "/Users/hameerabbasi/PycharmProjects/sparse/sandbox.py", line 30, in where
    return dispatch('where', *args, **kwargs)
  File "/Users/hameerabbasi/PycharmProjects/sparse/sandbox.py", line 26, in dispatch
    raise TypeError('This operation is not possible with the supplied types.')
TypeError: This operation is not possible with the supplied types.
My main concern with this approach is that top level functions should be raising For example, consider Python arithmetic (on which

def _not_implemented(*args, **kwargs):
    return NotImplemented

class A:
    __add__ = __radd__ = _not_implemented

class B:
    __add__ = __radd__ = _not_implemented

a = A()
b = B()
a.__add__(b)   # NotImplemented
a.__radd__(b)  # NotImplemented
a + b          # TypeError: unsupported operand type(s) for +: 'A' and 'B'

However, I do like the idea of a generic method for NumPy functions that aren't ufuncs. I would still make this a method on array objects, though, e.g., In most cases, you could write something like the following:

import dask.array as da

class DaskArray:
    def __array_function__(self, func, *args, **kwargs):
        # HANDLED_TYPES is not defined in this snippet; it stands for the
        # tuple of argument types this class knows how to handle.
        if (not hasattr(da, func.__name__) or
                not all(isinstance(arg, HANDLED_TYPES) for arg in args)):
            return NotImplemented
        return getattr(da, func.__name__)(*args, **kwargs)
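A minimal sketch of how a top-level function might use such a method, assuming the `__array_function__(self, func, *args, **kwargs)` signature from the snippet above. `MyArray`, `_dispatch`, and the stand-in `concatenate` and `stack` functions are hypothetical names for illustration, not part of NumPy. The point it tries to show is that the method returns `NotImplemented`, while the top-level function raises `TypeError` once every argument has declined:

```python
class MyArray:
    """Hypothetical duck array that only knows how to handle `concatenate`."""

    def __init__(self, data):
        self.data = list(data)

    def __array_function__(self, func, *args, **kwargs):
        # Decline anything we do not recognise by returning NotImplemented.
        if func.__name__ != "concatenate":
            return NotImplemented
        (arrays,) = args
        if not all(isinstance(a, MyArray) for a in arrays):
            return NotImplemented
        return MyArray(x for a in arrays for x in a.data)


def _dispatch(func, relevant_args, *args, **kwargs):
    """Ask each relevant argument in turn; raise if all of them decline."""
    for arg in relevant_args:
        method = getattr(type(arg), "__array_function__", None)
        if method is None:
            continue
        result = method(arg, func, *args, **kwargs)
        if result is not NotImplemented:
            return result
    names = [type(a).__name__ for a in relevant_args]
    raise TypeError(f"no implementation of {func.__name__!r} for types {names}")


def concatenate(arrays):
    # Stand-in for a top-level library function (assumption, not NumPy's).
    return _dispatch(concatenate, arrays, arrays)


def stack(arrays):
    # MyArray declines this one, so the call ends in TypeError.
    return _dispatch(stack, arrays, arrays)
```

With these definitions, `concatenate([MyArray([1, 2]), MyArray([3])])` returns a `MyArray`, whereas `stack([MyArray([1])])` raises `TypeError` rather than leaking `NotImplemented` to the caller, mirroring how `a + b` behaves in the arithmetic example.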
There is a NEP PR #10706 which has yet to be merged |
@njsmith and I want to revisit #10706 in a follow-on NEP. The conclusion of our in-person discussion a few months ago (see our notes) was that it would be better to introduce a protocol for a duck-array equivalent of
From our notes:
Focus on protocols
Historically, numpy has had lots of success at interoperating with third-party objects by defining protocols, like
NEP 16 took a different approach: we need a duck-array equivalent of
This is strictly more powerful: if an object is already a duck array, it can simply
The protocol approach is also more consistent with core Python conventions: for example, see the
Conclusion: protocols are one honking great idea – let’s do more of those. |
I'm neutral between @shoyer's and @mrocklin's proposals, but I do see a few issues with @shoyer's:
On the other hand, @mrocklin's has certain issues as well:
As a side note, All-in-all, I'm definitely for a catch-all protocol vs many separate protocols. It may be a bit "quick-and-dirty", but I'm rather fine with that. |
This discussion seems to have evolved into PR #11189 |
Can we close this? The ideas all seem to be in NEP 18 |
Closing. Please open a new issue if there are discussion points not covered in NEPs 18 and 16. |
There are several functions for which I would like to see protocols constructed. I've raised issues for #11074 and #11128, but these are just special cases of a much larger issue that includes many operations. The sense I've gotten is that the process to change numpy takes a while, so I'm inclined to find a solution that can serve as a catch-all while things evolve.
To that end I propose that duck-arrays include a method that returns a module that mimics the numpy namespace
Then, in various functions like stack or concatenate we check for these modules
There are likely several things wrong with the implementation above, but my hope is that it gets a general point across: that we'll dispatch wholesale to the module of the provided duck arrays.
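As a rough illustration of the proposal (not the original implementation), a dependency-free sketch could look like the following. `ListArray`, `listarray_namespace`, and this stand-in `concatenate` are hypothetical names; a `SimpleNamespace` stands in for a real module that mimics the numpy namespace, and the sketch raises `TypeError` where a real function would presumably fall back to its ordinary NumPy code path.

```python
from types import SimpleNamespace


class ListArray:
    """Hypothetical duck array backed by a list of rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __array_module__(self):
        # Return the namespace that implements numpy-like functions
        # for this array type.
        return listarray_namespace


def _concatenate(arrays):
    # Row-wise concatenation only, to keep the sketch small.
    return ListArray(row for a in arrays for row in a.rows)


# Stand-in for a module mimicking the numpy namespace (assumption).
listarray_namespace = SimpleNamespace(concatenate=_concatenate)


def concatenate(arrays, **kwargs):
    """Top-level function that dispatches wholesale to the duck array's module."""
    for arr in arrays:
        get_module = getattr(arr, "__array_module__", None)
        if get_module is not None:
            return get_module().concatenate(arrays, **kwargs)
    # A real library function would run its default implementation here.
    raise TypeError("no argument provides __array_module__")
```

Calling `concatenate([ListArray([[1, 2]]), ListArray([[3, 4]])])` hands the whole call to `listarray_namespace.concatenate`, which is the "dispatch wholesale to the module" idea in its simplest form.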
cc @shoyer @hameerabbasi @njsmith @ericmjl