Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Add outer to numpy.linalg [Array API] #25101

Merged
merged 2 commits into from Dec 5, 2023

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Nov 10, 2023

Hi @rgommers @ngoldbaum,

This PR contains next Array API compatibility change. It adds numpy.linalg.outer (almost the same as numpy.outer) with docs taken from Array API specification.

@ngoldbaum
Copy link
Member

Needs a release note. It also occurs to me that the changes you're making for array API compatibility should be mentioned in the migration guide. Also this page in the docs should probably get updated as you're making changes.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 10, 2023

Right, I added a release note and updated Array API Standard Compatibility doc page (I will update it in other PRs as well).

About migration guide: The guide explains changes and provides instructions that make user's codebase compatible with NumPy 2.0. The changes in this PR aren't breaking (former outer isn't deprecated).

Should I add a separate section with "recommended" ways of using API? (e.g. use np.linalg.outer instead of np.outer because the former is Array API compatible? But it doesn't mean that original one should be avoided) WDYT?

@ngoldbaum
Copy link
Member

I don't think we need to go into a ton of detail and totally OK to do this is a different PR. That said, I think it's worth adding a note that various aliases and keyword arguments were added to numpy's main namespace specifically to improve array API compatibility. It could be just a short blurb that links out to the NEP you and Ralf are writing for full details.

@ngoldbaum
Copy link
Member

The docstrings for numpy.outer should probably also get updated to mention that linalg.outer also exists and has slightly different semantics because of array API support. I have no opinion about whether it's worth it to deprecate numpy.outer but it is a wart to have two identical names that do slightly different things.

@lucascolley
Copy link
Contributor

It could be just a short blurb that links out to the NEP you and Ralf are writing for full details.

+1 on keeping it short and sweet here, since many users will want to migrate to 2.0 but not care about complying with array API. (Assuming the NEP goes through and is finished) IMO it would deserve a page of its own, detailing how to convert code that has already been migrated to 2.0.

For context, I'm considering working on a ruff rule for np2 to xp

@rgommers
Copy link
Member

I think I agree with @mtsokol here - the migration guide is only for things that were changed, deprecated or removed and affect users' code as they wrote it for 1.26.x, and hence require action from them. That includes a few things that will be done for array API support, but this isn't one of them.

I have no opinion about whether it's worth it to deprecate numpy.outer but it is a wart to have two identical names that do slightly different things.

Not yet I'd say, since that would require if-else conditional code depending on version. Should be considered in 3-4 releases from now.

I think it's worth adding a note that various aliases and keyword arguments were added to numpy's main namespace specifically to improve array API compatibility.

I completely agree that that needs to be documented, but not in the migration guide. Let's treat array API support as a separate new feature in 2.0, with its own doc page focused on that.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 12, 2023

The docstrings for numpy.outer should probably also get updated to mention that linalg.outer also exists and has slightly different semantics because of array API support. I have no opinion about whether it's worth it to deprecate numpy.outer but it is a wart to have two identical names that do slightly different things.

I updated np.outer docstring so np.linalg.outer is mentioned there.

@mtsokol mtsokol force-pushed the array-api-outer branch 2 times, most recently from d70612d to df2eed9 Compare November 29, 2023 11:55
@mtsokol mtsokol added this to the 2.0.0 release milestone Nov 29, 2023
@ngoldbaum
Copy link
Member

This has conflicts but can likely go in quickly once those are resolved. The mypy failure is real too.

@@ -18,6 +18,7 @@ AR_c16: npt.NDArray[np.complex128]
AR_O: npt.NDArray[np.object_]
AR_m: npt.NDArray[np.timedelta64]
AR_S: npt.NDArray[np.str_]
AR_b: npt.NDArray[np.bool]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, but @BvB93, is there a way to make the error you get from npt.NDArray[np.bool_] nicer? I bet there are downstream type stubs that are broken now. In an older version of this PR this line triggered the following error:

E   AssertionError: Reveal mismatch at line 21
    
    error:
    Name "np.bool_" is not defined  [name-defined]
        AR_b: npt.NDArray[np.bool_]
        ^~~~~~~~~~~~~~~~~~~~~~~~~~

But both np.bool and np.bool_ are defined right now in the main numpy namespace and neither are deprecated, so the typing should probably accept both or somehow warn that bool_ is deprecated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, but @BvB93, is there a way to make the error you get from npt.NDArray[np.bool_] nicer?

Unfortunately not yet, no. There is PEP 702: Marking deprecations using the type system in the works which might help with this in the future, but even then both numpy and mypy will have to add some support for it before seeing the actual benefits.

@ngoldbaum
Copy link
Member

Let's pull this one in as well, since I think it's uncontroversial. Thanks @mtsokol!

@ngoldbaum ngoldbaum merged commit f7effb8 into numpy:main Dec 5, 2023
58 of 62 checks passed
@mtsokol mtsokol deleted the array-api-outer branch December 5, 2023 21:45
outer

"""
x1 = asarray(x1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtsokol - only seeing this now since it entered nightly and hence caused astropy to notice it is missing coverage: is there any reason this is not asanyarray? It means we have to cover this function even though we already cover np.outer, so with np.asanyarray this would just have worked. I think for new code we should just assume subclasses are OK and able to deal with standard numpy code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It indeed seems reasonable to accept subclasses in new functions. However, as long as np.matrix and masked arrays are still around, I think it'd be a lot safe to have a new validation function that does what asanyarray does plus explicitly rejects instances of those two problematic subclasses. And in order to do so, we need a faster check than isinstance calls (perhaps adding a private ._invalid_subclass and checking for that attribute?). WDYT @mhvk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just ignore those? Concretely:

  1. Matrix is already rejected here and even if it wasn't, I wouldn't want to bother: we are trying to phase it out, don't use it with new functions.
  2. Masked arrays are a good example of a place where using asarray() is guaranteed to give bad results, while asanyarray() at least has a chance to do the right thing (and in fact, it looks like it does something pretty reasonable here).

The question is what pattern we want, which is the _invalid_subclass thought. I think the answer is: it's the subclasses problem, especially for new functions. We just always use asanyarray(), the subclass should/must implement __array_function__ and deal with it if that doesn't work out.

Using asanyarray() gives the subclass a chance at gambling it can use super().__array_function__ or call func._implementation() (and a bit of a gamble it is, as we may change the implementation; astropy is happy with that gamble).

The main thing I do not quite like about the pattern is that it would be nice if we could open this up more clearly to non-subclasses but...

PS: I actually do wonder if rather than trying to implement a proper __array_function__ for masked arrays, it may be plausible to implement one that does nothing but reject clearly broken ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. For non-subclasses, in this case all that is needed is the dimension of the array, so one could do np.ndim(x1) which would be fast for anything except list input. But just changing to asanyarray here would seem sensible!

@charris charris changed the title API: Add outer to numpy.linalg [Array API] API: Add 1outer to numpy.linalg [Array API] Dec 11, 2023
@charris charris changed the title API: Add 1outer to numpy.linalg [Array API] API: Add outer to numpy.linalg [Array API] Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants