dask.array module should expose dtype objects and the isdtype function #10387

Open
ogrisel opened this issue Jun 30, 2023 · 3 comments
Labels
needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. needs triage Needs a response from a contributor

Comments

@ogrisel
Contributor

ogrisel commented Jun 30, 2023

>>> import dask.array as da
>>> da.float32
Traceback (most recent call last):
  Cell In[5], line 1
    da.float32
AttributeError: module 'dask.array' has no attribute 'float32'
>>> da.isdtype
Traceback (most recent call last):
  Cell In[6], line 1
    da.isdtype
AttributeError: module 'dask.array' has no attribute 'isdtype'

See:

It's not clear whether all the dtype objects should be exposed as attributes of the top-level module, but this is usually the case in other implementations of the spec:

>>> import numpy.array_api as xp
<ipython-input-7-e68880e64b48>:1: UserWarning: The numpy.array_api submodule is still experimental. See NEP 47.
  import numpy.array_api as xp
>>> xp.float32
dtype('float32')
>>> xp.float64
dtype('float64')
>>> xp.isdtype(xp.zeros(shape=5, dtype=xp.float32).dtype, "real floating")
True
>>> numpy.array_api.float32 == numpy.float32
True

Since dask arrays use numpy arrays by default, I think that dask could just alias numpy.float32 as dask.array.float32 (and similarly for the other numpy dtypes).

Unfortunately, numpy.isdtype does not exist yet. Maybe dask.array could lazily alias numpy.array_api.isdtype as dask.array.isdtype, using a module-level __getattr__, so that importing dask.array does not trigger the experimental numpy.array_api warning by default.
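
A rough sketch of the lazy-alias mechanism, assuming a module-level `__getattr__` (PEP 562) that imports the source module only on first attribute access. `make_lazy_module` and `fake_dask_array` are hypothetical names, and `math.sqrt` is only a stand-in so the snippet runs without `numpy.array_api`; in dask the mapping would presumably be something like `{"isdtype": ("numpy.array_api", "isdtype")}` on NumPy versions that ship that submodule.

```python
import importlib
import types


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are resolved on first access.

    ``lazy_attrs`` maps an attribute name to a ``(module path, attribute)`` pair.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            source_module, source_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        # The import (and any warning it emits) happens here, not at
        # module creation time.
        value = getattr(importlib.import_module(source_module), source_attr)
        setattr(module, attr, value)  # cache so later lookups skip this hook
        return value

    module.__getattr__ = __getattr__
    return module


# Stand-in mapping for illustration only.
lazy = make_lazy_module("fake_dask_array", {"sqrt": ("math", "sqrt")})
print("sqrt" in vars(lazy))  # not resolved yet
print(lazy.sqrt(9.0))
print("sqrt" in vars(lazy))  # cached after first access
```

In a real package the same function would simply be defined as `__getattr__` at the top level of the module file.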

Alternatively, dask.array could ship its own implementation of isdtype (maybe vendoring/forking the one from numpy.array_api).
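
For illustration, a standalone `isdtype` could look roughly like the sketch below. This is not the `numpy.array_api` code, just my own approximation of the behaviour described in the array API standard; the `_KINDS` table and the helper `_dtypes` are names I made up.

```python
import numpy as np


def _dtypes(*names):
    """Return the set of ``numpy.dtype`` objects for the given names."""
    return {np.dtype(name) for name in names}


_SIGNED = _dtypes("int8", "int16", "int32", "int64")
_UNSIGNED = _dtypes("uint8", "uint16", "uint32", "uint64")
_REAL_FLOATING = _dtypes("float32", "float64")
_COMPLEX_FLOATING = _dtypes("complex64", "complex128")

# Kind names as I understand them from the array API standard (assumption).
_KINDS = {
    "bool": _dtypes("bool"),
    "signed integer": _SIGNED,
    "unsigned integer": _UNSIGNED,
    "integral": _SIGNED | _UNSIGNED,
    "real floating": _REAL_FLOATING,
    "complex floating": _COMPLEX_FLOATING,
    "numeric": _SIGNED | _UNSIGNED | _REAL_FLOATING | _COMPLEX_FLOATING,
}


def isdtype(dtype, kind):
    """Return True if ``dtype`` matches ``kind`` (a kind name, a dtype, or a tuple)."""
    if isinstance(kind, tuple):
        return any(isdtype(dtype, k) for k in kind)
    if isinstance(kind, str):
        if kind not in _KINDS:
            raise ValueError(f"unrecognized data type kind: {kind!r}")
        return np.dtype(dtype) in _KINDS[kind]
    # Otherwise ``kind`` is itself a dtype: compare for equality.
    return np.dtype(dtype) == np.dtype(kind)


print(isdtype(np.float32, "real floating"))
print(isdtype(np.int8, ("bool", "real floating")))
```

Using explicit dtype sets keeps NumPy-only dtypes (float16, datetime64, ...) out of the standard kinds.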

@github-actions github-actions bot added the needs triage Needs a response from a contributor label Jun 30, 2023
@ogrisel
Contributor Author

ogrisel commented Jun 30, 2023

#8750 should address this by exposing a dedicated array_api submodule instead of using the dask.array module directly.

However, as of now, #8750 does not expose an implementation of isdtype.

@github-actions github-actions bot added the needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. label Aug 7, 2023
@lucascolley

lucascolley commented Feb 28, 2024

FWIW, exposing these would help with early adoption of Dask into SciPy. Now that we have array_api_compat.dask, we should be able to get our array-agnostic code working.

The slight problem at the minute is that in our tests, we make some assumptions about very basic things that NumPy, CuPy, PyTorch and the standard all have in common, such as xp.float64. This assumption has allowed us to reduce diffs quite a lot by avoiding (so far) unnecessary calls to get the wrapped namespaces.

I understand that resolving this may be somewhat blocked on deciding, in gh-8750, whether converging the main namespace or adding a separate namespace is preferable. For now though, even just exposing the dtypes as @ogrisel suggested would be helpful. Thanks!

@lucascolley

bump on this!
