If one wants the output of a ufunc to be an ndarray subclass while the inputs are regular arrays, the behaviour differs depending on whether the output is passed as a positional or as a keyword argument: when passed positionally, __numpy_ufunc__ is called, but when passed as a keyword argument, __array_prepare__ and __array_wrap__ are called instead. I would have expected __numpy_ufunc__ to be called in both cases.
p.s. If the input is a subclass, it always works as expected.
import numpy as np

class MyArray(np.ndarray):
    def __array_prepare__(self, obj, context=None):
        print('prepare:', self, obj, context)
        return super(MyArray, self).__array_prepare__(obj, context)

    def __array_wrap__(self, obj, context=None):
        print('wrap:', self, obj, context)
        return super(MyArray, self).__array_wrap__(obj, context)

    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        print('ufunc:', self, ufunc, method, i, inputs, kwargs)
        values = [np.asarray(arg) for arg in inputs]
        if 'out' in kwargs:
            kwargs['out'] = kwargs['out'].view(np.ndarray)
        result = getattr(ufunc, method)(*values, **kwargs) / 2.
        return result

a = np.array([-1., -1.]).view(MyArray)
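Since __numpy_ufunc__ is an ordinary Python method, the hook can at least be exercised by calling it directly, independent of the C dispatch machinery. This self-contained sketch repeats just the override from the class above (the division by 2 makes it visible that the override, not the plain ufunc, produced the result):

```python
import numpy as np

class MyArray(np.ndarray):
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        # Drop to plain ndarrays, run the ufunc, and halve the result
        # so the override is observable.
        values = [np.asarray(arg) for arg in inputs]
        if 'out' in kwargs:
            kwargs['out'] = kwargs['out'].view(np.ndarray)
        return getattr(ufunc, method)(*values, **kwargs) / 2.

a = np.array([-1., -1.]).view(MyArray)
# Call the hook by hand; i=0 says self is the first input.
r = a.__numpy_ufunc__(np.add, '__call__', 0, (a, a))
print(r)  # (a + a) / 2
```

Whether numpy itself routes a given call through this hook is exactly what this issue is about; the direct call only demonstrates the protocol's signature.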
Is this something that was not addressed in gh-4626?
EDIT: apparently yes; this is an oversight.
@cowlicks: an overlooked case. Also, what should i be when self is one of the output arrays (probably an index past the inputs, which is likely OK...)?
Checking 1.10 blockers brought up by my __numpy_ufunc__ trials: In numpy 1.10.0.dev-256d256, this is still an issue.
I'll take a look this weekend.
Okay, so what currently happens is that PyUFunc_CheckOverride is called at every ufunc call. It sets result to the result of the override if one exists; if no override is found, result is NULL. Inside PyUFunc_CheckOverride we check the positional arguments for a __numpy_ufunc__ attribute. If it is not there, we set result = NULL and return 0. Clearly we also need to check out here, but that will add a little overhead to every ufunc call, since we are now checking a dictionary too. I think we need to insert this logic here
result = NULL
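In Python terms, the scan described above might look roughly like this. This is a hypothetical sketch of the C logic in PyUFunc_CheckOverride, not the actual implementation; check_override is an invented name, and for simplicity it assumes out= is a single array:

```python
def check_override(ufunc, method, args, kwargs):
    """Sketch of the override scan: return (handled, result)."""
    # Candidates are the positional inputs plus the out= array, if any.
    # Checking kwargs for 'out' is the extra dictionary lookup that
    # adds a little overhead to every ufunc call.
    candidates = list(args)
    if 'out' in kwargs:
        candidates.append(kwargs['out'])
    for i, obj in enumerate(candidates):
        if hasattr(obj, '__numpy_ufunc__'):
            result = obj.__numpy_ufunc__(ufunc, method, i, args, **kwargs)
            if result is not NotImplemented:
                return True, result
    # No override found; the caller falls back to the normal ufunc
    # machinery (result = NULL in the C code).
    return False, None

class WithOverride:
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        return 'overridden at index {}'.format(i)

handled, res = check_override('add', '__call__', (1.0, 2.0),
                              {'out': WithOverride()})
print(handled, res)  # the out array gets index len(args) == 2
```

Note how an override found on the out array receives an index past the end of the positional inputs, which is the i-numbering question discussed further down.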
They are only normalized (to what __numpy_ufunc__ needs) if there is a __numpy_ufunc__ attribute present. I was reluctant to normalize first because that would add even more overhead to every ufunc call. I'm testing my prototype (based on my last comment) now. I can try both approaches and see if the timing vs code complexity tradeoff is significant.
Yeah, that is definitely the right way to do this. I'm not sure why I did not do it that way originally. I'll try to find where that happens and check it out... After I get this working.
@njsmith I found some arg parsing stuff here, but it is just for multiarray stuff? I don't think it is used for ufuncs. I need to look more into this.
My prototype is here (based on my first comment). But it is breaking np.dot and I don't know why.
Had a look at the prototype, and while it looks good generally, I worry it would still break for ufuncs with two outputs (where out is a tuple; see your work in #4171).
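For multi-output ufuncs (np.modf, np.frexp, ...) the out keyword can be a tuple, so any scan of the out argument has to handle both forms. A small normalization helper makes that explicit (iter_outputs is an invented name, not numpy API):

```python
def iter_outputs(kwargs):
    """Yield each output array, whether out= is a single array
    or a tuple of arrays (as for two-output ufuncs like np.modf)."""
    out = kwargs.get('out')
    if out is None:
        return
    if isinstance(out, tuple):
        for o in out:
            yield o
    else:
        yield out

print(list(iter_outputs({'out': (1, 2)})))  # [1, 2]
print(list(iter_outputs({'out': 3})))       # [3]
print(list(iter_outputs({})))               # []
```

With this, the override scan can treat every output uniformly instead of special-casing the tuple form at each call site.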
Little update: I rebased on current master and the segfault went away... So it wasn't my fault I guess? That had me stumped for a while. Working on a fix taking @mhvk's comments into consideration.
This seems to fix it. But I have not refactored to use the general arg processing as suggested by @njsmith. I have not assessed the difficulty of that yet. If it seems hard I'll make an issue with a plan of action.
I think there is still cleaning up to do. And I have not checked that I did not create any memory leaks. I'm not confident in my C coding. So I'll double check this then make a PR.
Looks good to me, but I am definitely even less of an expert.
There is a bit of a problem. The __numpy_ufunc__ protocol specifies it has the signature:
def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs)
Where "i is the index of self in inputs" but output arguments get put into kwargs. So what should the i index be in this case? (See the NEP here).
Basically __numpy_ufunc__ does not account for the output arguments being able to override the ufunc. I'm still thinking about the best way to approach this. Suggestions? @pv
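One candidate convention is to keep counting past the inputs, so that an output's index is len(inputs) plus its position among the outputs. A sketch of how that index would be computed (self_index is an invented helper, not numpy code):

```python
def self_index(inputs, outputs, self):
    """Index to pass as `i`: position in inputs, or, if self is an
    output argument, len(inputs) plus its position among the outputs."""
    for i, obj in enumerate(inputs):
        if obj is self:
            return i
    for j, obj in enumerate(outputs):
        if obj is self:
            return len(inputs) + j
    raise ValueError('self is neither an input nor an output')

x, y, out = object(), object(), object()
print(self_index((x, y), (out,), x))    # 0
print(self_index((x, y), (out,), out))  # 2, i.e. i >= len(inputs)
```

Under this convention, i >= len(inputs) unambiguously signals that self was passed as an output.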
FWIW: I think the approach you took, of just continuing to add to i, is the most logical one. It is important for an output to be able to have __numpy_ufunc__ and thus to tell that it cannot handle the result.
@mhvk But it is breaking this test.
Basically the test uses the "defective" behavior.
Hmm, fortunately, __numpy_ufunc__ has not been part of a release, so we are still in time to get it right! It seems obvious the implementer will have to be able to deal with either i is None or i >= len(inputs). The latter carries more information, so seems preferred.
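On the implementer side, the i >= len(inputs) convention is easy to act on: anything at or past len(inputs) is an output, so a subclass can, for instance, decline to handle being written into. A hedged sketch of what an implementer might write (Quantity is a hypothetical subclass name):

```python
import numpy as np

class Quantity(np.ndarray):
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        if i >= len(inputs):
            # self was passed as an output argument; this subclass
            # chooses not to handle the result and opts out.
            return NotImplemented
        values = [np.asarray(arg) for arg in inputs]
        return getattr(ufunc, method)(*values, **kwargs)

q = np.arange(3.).view(Quantity)
# self as input (i == 0): handled normally.
print(q.__numpy_ufunc__(np.negative, '__call__', 0, (q,)))
# self as output (i == len(inputs)): the subclass opts out.
print(q.__numpy_ufunc__(np.negative, '__call__', 1, (np.arange(3.),)))
```

Returning NotImplemented is the protocol's way to "tell that it cannot handle the result", as noted above.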
If it hasn't been released, perhaps the signature should be changed. Maybe add an o parameter, or make i be input_index and add an output_index, and require that one or the other always be None. This seems much clearer than indexing into a list of output arrays when i >= len(inputs).
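The alternative proposed here could be sketched as follows (purely hypothetical; classify_self is an invented helper, and this signature was never adopted):

```python
def classify_self(inputs, outputs, self):
    """Return (input_index, output_index); exactly one is None,
    so the implementer can tell at a glance whether self is being
    read from or written into."""
    for i, obj in enumerate(inputs):
        if obj is self:
            return i, None
    for j, obj in enumerate(outputs):
        if obj is self:
            return None, j
    raise ValueError('self is neither an input nor an output')

x, out = object(), object()
print(classify_self((x,), (out,), x))    # (0, None)
print(classify_self((x,), (out,), out))  # (None, 0)
```

The pair would then replace the single i argument in the __numpy_ufunc__ signature.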
Is it documented anywhere that passing an output array can affect the ufunc dispatch? It seems kind of surprising that it works this way.
@ewmoore - but the output should not just be treated as an arbitrary ndarray either (indeed, that is what the current __array_prepare__ and __array_wrap__ do well).
I'd also think i >= len(inputs) is most logical conceptually, in the sense that it then effectively indexes *args in the original call for the case where no out keyword was present.
Okay. I'll correct the test that is failing to check for i >= len(inputs), do some cleanup, and make a PR.