Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: accept np-coercible array-likes with array API flag set #19187

Merged
merged 6 commits into from Sep 6, 2023

Conversation

lucascolley
Copy link
Member

Reference issue

Closes #19118.

What does this implement/fix?

compliance_scipy is modified to accept array-likes by coercing with np.asanyarray. Previously, they were rejected, which is unwanted for backwards compatibility reasons.

Additional information

This implementation may be suboptimal - I'm not confident that creating a new tuple and using try except is the right thing to do. But this seems to have the desired behaviour.

A follow-up may come in the future as @mdhaber has expressed a want to sometimes accept certain types of array like Masked arrays.

@lucascolley lucascolley marked this pull request as ready for review September 5, 2023 15:10
@lucascolley
Copy link
Member Author

CI has caught that we are still testing for array-like denial. I'll remove the check now. (Maybe we want to check that they are accepted?)

scipy/_lib/_array_api.py Outdated Show resolved Hide resolved
scipy/_lib/_array_api.py Outdated Show resolved Hide resolved
mdhaber
mdhaber previously requested changes Sep 6, 2023
Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to check that they are accepted?

Sounds good to me.

@mdhaber mdhaber added scipy._lib maintenance Items related to regular maintenance tasks array types Items related to array API support and input array validation (see gh-18286) labels Sep 6, 2023
@lucascolley
Copy link
Member Author

lucascolley commented Sep 6, 2023

Edit: will the try except stop the correct TypeError being raised?

@mdhaber I've addressed your comments, I think this is correct?

@mdhaber
Copy link
Contributor

mdhaber commented Sep 6, 2023

Close, but you need to loop over the indices of the list if you want to modify it. Right now, you are just redefining the array variable, but this does not change the list.

for i in range(len(arrays)):
    array = arrays[i]
    ...
    arrays[i] = ...  # to modify list element

and technically, you don't even need to return arrays (although it's probably not a bad idea to do so to be explicit that it's modified).

You could also try:

for i, array in enumerate(arrays):
    ...
    arrays[i] = ...  # to modify list element

Although a linter or mypy might not like it.


The rest of the changes LGTM but you might add a test that when objects are coerced to object arrays, the new error is emitted. Nit: to save a level of indentation, you could put the object dtype check after the try-except block rather than within the try.

Also @rgommers should character arrays continue to be let through?

[skip cirrus] [skip circle]
@lucascolley
Copy link
Member Author

Close, but you need to loop over the indices of the list if you want to modify it

My most recent commit has tried to address this.

add a test that when objects are coerced to object arrays, the new error is emitted.

Can you give me an example input that would be coerced to an object array?

@rgommers
Copy link
Member

rgommers commented Sep 6, 2023

Also @rgommers should character arrays continue to be let through?

Good question. I hadn't given that any thought before, because I can't remember a single bug report about them. However, I think there is value in defining more clearly what dtypes are accepted. I'd say it should be only boolean and numerical dtypes (signed/unsigned integer and real/complex floating-point dtypes). Would you agree with that?

@mdhaber
Copy link
Contributor

mdhaber commented Sep 6, 2023

Can you give me an example input that would be coerced to an object array?

Sure.

import numpy as np
from scipy import stats

x = np.asanyarray([object()])
x.dtype  # dtype('O')

y = np.asanyarray(stats)
y.dtype  # dtype('O')

Would you agree with that?

After mode stopped supporting non-numerical arrays, I can't think of functions that should support non-numerical types. So yes, I think we can restrict to subdtypes of np.number and np.bool_.

@lucascolley
Copy link
Member Author

lucascolley commented Sep 6, 2023

thanks Matt, hopefully that's everything now.

restrict to subdtypes of np.number and np.bool_

Unless we want to include that in here too.

@mdhaber
Copy link
Contributor

mdhaber commented Sep 6, 2023

Unless we want to include that in here too.

That would be nice. @rgommers are we going to be bothered by the overhead of np.issubdtype? It's probably the slowest thing in this function, and in the simplest implementation, it would be called twice per argument.

import numpy as np
x = np.asanyarray(['a'])
%timeit np.issubdtype(x.dtype, np.flexible)  # 646 ns ± 2.71 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Or @rgommers feel free to merge as is and we can consider it in a follow-up.

@tylerjereddy tylerjereddy added this to the 1.12.0 milestone Sep 6, 2023
Copy link
Contributor

@tylerjereddy tylerjereddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time reading over the diff/discussion and ensuring that these produce sensible results on this branch (modulo: gh-19194 ):

  • SCIPY_DEVICE=cuda python dev.py test -j 32 -b all
  • python dev.py test -j 32 -b all

Seems good for my local case with GPU and associated libs locally.

I think we can postpone the refinement of acceptable type coercions to a separate PR, and indeed that would probably be easier to review separately, at least for me.

@tylerjereddy tylerjereddy dismissed mdhaber’s stale review September 6, 2023 22:37

Matt ok with merging as-is with follow-up

@tylerjereddy tylerjereddy merged commit a26a964 into scipy:main Sep 6, 2023
21 of 22 checks passed
@rgommers
Copy link
Member

rgommers commented Sep 7, 2023

The overhead of ~650 ns is quite a bit, so we may have to optimize this later. But it's not a big worry for now - we can circle back to that once it start showing up in profiling results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array types Items related to array API support and input array validation (see gh-18286) maintenance Items related to regular maintenance tasks scipy._lib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: np-coercible array-likes are not accepted with array API flag set
4 participants