Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: array types: make compliance_scipy more strict #19276

Closed
lucascolley opened this issue Sep 21, 2023 · 1 comment · Fixed by #19606
Closed

MAINT: array types: make compliance_scipy more strict #19276

lucascolley opened this issue Sep 21, 2023 · 1 comment · Fixed by #19606
Labels
array types Items related to array API support and input array validation (see gh-18286) maintenance Items related to regular maintenance tasks
Milestone

Comments

@lucascolley
Copy link
Member

lucascolley commented Sep 21, 2023

Following gh-19187, there is desire to restrict the types of array we accept. I'm collecting the relevant comments here for future reference when a PR wants to deal with this. A test in gh-19260 which involves a string dtype is tripping us up.

@rgommers:

I'd say it should be only boolean and numerical dtypes (signed/unsigned integer and real/complex floating-point dtypes).

@mdhaber:

I think we can restrict to subdtypes of np.number and np.bool_.

are we going to be bothered by the overhead of np.issubdtype? It's probably the slowest thing in this function, and in the simplest implementation, it would be called twice per argument.

import numpy as np
x = np.asanyarray(['a'])
%timeit np.issubdtype(x.dtype, np.flexible)  # 646 ns ± 2.71 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

@rgommers:

The overhead of ~650 ns is quite a bit, so we may have to optimize this later. But it's not a big worry for now - we can circle back to that once it start showing up in profiling results.


Related to this is the possibility of optional parameters in array_namespace and compliance_scipy, or as_xparray, which could eventually eliminate the need for scipy._lib._util._asarray_validated (used in linalg). Here are its optional parameters:

"""
...
check_finite : bool, optional
    Whether to check that the input matrices contain only finite numbers.
    Disabling may give a performance gain, but may result in problems
    (crashes, non-termination) if the inputs do contain infinities or NaNs.
    Default: True
sparse_ok : bool, optional
    True if scipy sparse matrices are allowed.
objects_ok : bool, optional
    True if arrays with dype('O') are allowed.
mask_ok : bool, optional
    True if masked arrays are allowed.
as_inexact : bool, optional
    True to convert the input array to a np.inexact dtype.
...
"""
@lucascolley lucascolley changed the title ENH: array types: make compliance_scipy stricter ENH: array types: make compliance_scipy more strict Sep 21, 2023
@j-bowhay j-bowhay added enhancement A new feature or improvement array types Items related to array API support and input array validation (see gh-18286) labels Sep 24, 2023
@lucascolley
Copy link
Member Author

Regarding the potential to get rid of _asarray_validated, we probably want to think about how array_namespace will fit in with gh-18972.

@lucascolley lucascolley changed the title ENH: array types: make compliance_scipy more strict MAINT: array types: make compliance_scipy more strict Nov 29, 2023
@lucascolley lucascolley added maintenance Items related to regular maintenance tasks and removed enhancement A new feature or improvement labels Nov 29, 2023
@rgommers rgommers added this to the 1.12.0 milestone Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array types Items related to array API support and input array validation (see gh-18286) maintenance Items related to regular maintenance tasks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants