Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we promise to delegate to reverse operators ? #732

Closed
NeilGirdhar opened this issue Jan 20, 2024 · 9 comments
Closed

Can we promise to delegate to reverse operators ? #732

NeilGirdhar opened this issue Jan 20, 2024 · 9 comments

Comments

@NeilGirdhar
Copy link

NeilGirdhar commented Jan 20, 2024

Background

Ordinary NumPy arrays have a peculiar operator behaviour:

  • if an operator like __add__ fails, then instead of returning NotImplemented (which would be idiomatic Python),
  • NumPy decomposes the array into basic Python types (such as float or int), and then calls the corresponding reverse operator (like __radd__) on these elements and the RHS
  • It then collates the results into a new array.

It is too late for NumPy to change this since this would break a lot code.

Motivation

Consider someone implementing a class that must interact with arrays. For example, a rectangle class defined by storing the opposite corners of a rectangle. If x = Rectange(...) and a is an array, we want to support both x + a and a + x. If the decomposition/collation described above happens, then this will not work: a + x will try to build an array of rectangles offset by floats.

Proposal

Conformant implementations of the Array API should return NotImplemented when they don't know how to evaluate an operator with their arrays—rather than this behavior being implementation-defined. Also, it would be useful to have a test in the conformance test suite that verifies this.

@rgommers
Copy link
Member

Interesting question. The discussion you linked in numpy/numpy#9677 is quite relevant. The NumPy behavior is clearly not great, but @seberg makes a good point:

Frankly, if you want to add such a promise there (which is an issue for the Array API), you better have an idea how to pull of changing it in NumPy. Saying numpy must change it is useless if it can't.

The Array API does not even define a way to mix array objects, interacting with a custom object is undefined to begin with anyway.

As noted in https://data-apis.org/array-api/latest/purpose_and_scope.html, subclassing of an array class is out of scope; an implementation may not even allow this. E.g., JAX doesn't seem to support it at all (google/jax#4269), and in CuPy subclassing is mostly supported since 2022 only (see cupy/cupy#3972 (comment)) and still undocumented as far as I can tell.

So I suspect that the best we can do here is a note that recommends to return NotImplemented as it's the idiomatic thing to do in Python - but also says that this isn't guaranteed to be the case and notes that this is out of scope for the standard.

@NeilGirdhar
Copy link
Author

NeilGirdhar commented Jan 21, 2024

Frankly, if you want to add such a promise there (which is an issue for the Array API), you better have an idea how to pull of changing it in NumPy. Saying numpy must change it is useless if it can't.

I guess I'm confused by this wrt the Array API. I understand that it's not possible to change Numpy. What's the problem with making the Array API promise this? Are there implementors that can't do it for some reason? Even NumPy's implementation of the Array API should be able to do it, right?

As noted in https://data-apis.org/array-api/latest/purpose_and_scope.html, subclassing of an array class is out of scope

What do you think of the example I gave of a rectangle class? Should it be possible to write such a class and have it interact normally with the array API?

@rgommers
Copy link
Member

Even NumPy's implementation of the Array API should be able to it, right?

No, the implementation is in the main namespace from 2.0 onwards - and even if that wasn't the case, it would still want to reuse the ndarray object, so changing behavior of operators would not really be feasible.

What do you think of the example I gave of a rectangle class? Should it be possible to write such a class and have it interact normally with the array API?

I don't think so, no. It's not just about operator support or subclassing. It'd be very hard to see how to implement any such rectangle object that can be used with regular functions, since those only support array input and not custom objects.

For every arrays library in existence, custom objects like this are either unsupported or very messy. I don't see any chance of fixing that in this standard.

@NeilGirdhar
Copy link
Author

NeilGirdhar commented Jan 21, 2024

No, the implementation is in the main namespace from 2.0 onwards

Thank you for explaining. I think I understand now.

  • and even if that wasn't the case, it would still want to reuse the ndarray object, so changing behavior of operators would not really be feasible.

If it weren't the case, then numpy.xp.ndarray could reuse numpy.ndarray by composition though? and then you could forward everything to the composed object—except of course taking care to return NotImplemented where appropriate. This is what I imagined was happening. I didn't realize that numpy.ndarray is going to implement the Array API.

Doesn't that mean that for a = numpy.xp.asarray(...), then a has all kinds of extra methods and attributes not specified by the Array API?

So I suspect that the best we can do here is a note that recommends to return NotImplemented as it's the idiomatic thing to do in Python

Right, now I understand this. This would be a nice start—maybe one day, we'll be able to shake off this technical debt.

@rgommers
Copy link
Member

Doesn't that mean that for a = numpy.xp.asarray(...), then a has all kinds of extra methods and attributes not specified by the Array API?

Yes, sure. That is perfectly fine though - it will implement a superset, and that doesn't impede the ability to write portable code which targets the standard.

@seberg
Copy link
Contributor

seberg commented Jan 21, 2024

That was not how it is designed for better or worse. Remember, the Array API mainly targets libraries. Those get passed a NumPy array by the user and must return a NumPy array.
So if xp.asarray() converts to a custom np.xp.ndarray object, that can work, but you will also need a mechanism like xp.unwrap() to return the right thing again (and libraries must use it then).

That would be a legitimate design and I do think it would have remove some obstacles. But, it is not the current choice it looks like that is probably not a big headache.
For the dataframe API that was never even an option I think, so their choice is very different and it entirely relies on custom objects.

@NeilGirdhar
Copy link
Author

Got it. Thank you @seberg and @rgommers for patiently explaining this to me! I guess I will close this as impossible, but I hope that the recommendation for other libraries ends up somewhere.

@asmeurer
Copy link
Member

It does seem kind of odd to include the __r*__ methods in the spec and not say anything about this. The only point of those methods is to interface with other classes.

Note that for numpy specifically, you can work around this by making __array_ufunc__ return NotImplemented.

TBH, I'm a little disappointed that nothing was done about this in numpy 2.0 specifically for the case where a non-object array is promoted to an object array. That's a major source of bugs, or at worse, inscrutable error messages in my experience. I know there are some libraries like pandas that use object arrays on purpose, but requiring more explicitness about something becoming an object array would be good thing. It's inline, for instance, with the change to make ragged arrays raise an exception instead of silently becoming an object array.

@asmeurer
Copy link
Member

By the way, this was discussed before at #229 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants