Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: array_api stream has wrong type #24642

Open
aaronmondal opened this issue Sep 5, 2023 · 7 comments
Open

BUG: array_api stream has wrong type #24642

aaronmondal opened this issue Sep 5, 2023 · 7 comments
Labels
33 - Question Question about NumPy usage or development component: numpy.array_api Static typing

Comments

@aaronmondal
Copy link

aaronmondal commented Sep 5, 2023

Describe the issue:

According to the spec the stream parameter for __dlpack__ and to_device should be Optional[Union[int, Any]]. The current implementation just uses None.

Reproduce the code example:

from typing import Any, Protocol, Self

from numpy import array_api as xp

PyCapsule = Any

class Array(Protocol):
    def __dlpack__(*, stream: int | Any | None = None) -> PyCapsule:
        ...


def identity(array: Array) -> Array:
    return array

identity(xp.asarray([1]))  # Error.

class Array2(Protocol):
    def to_device(
        self,
        device: Any,
        /,
        *,
        stream: int | Any | None = None,
    ) -> Self:
        ...

def identity2(array: Array2) -> Array2:
    return array

identity2(xp.asarray([1]))  # Error

Error message:

[Pyright] Argument of type "Array" cannot be assigned to parameter "array" of type "Array" in function "identity"
  "Array" is incompatible with protocol "Array"
    "__dlpack__" is an incompatible type
      Type "(*, stream: None = None) -> PyCapsule" cannot be assigned to type "(*, stream: int | Any | None = None) -> PyCapsule"
        Keyword parameter "stream" of type "int | Any | None" cannot be assigned to type "None"
          Type "int | Any | None" cannot be assigned to type "None"

[Pyright] Argument of type "Array" cannot be assigned to parameter "array" of type "Array2" in function "identity2"
  "Array" is incompatible with protocol "Array2"
    "to_device" is an incompatible type
      Type "(device: Device, /, stream: None = None) -> Array" cannot be assigned to type "(device: Any, /, *, stream: int | Any | None = None) -> Array"
        Keyword parameter "stream" of type "int | Any | None" cannot be assigned to type "None"
          Type "int | Any | None" cannot be assigned to type "None"

Runtime information:

1.25.1
3.11.4 (main, Jun 6 2023, 22:16:46) [GCC 12.3.0]

Context for the issue:

Loosely related to #24641

__dlpack__: https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__dlpack__.html
to_device: https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.to_device.html

@BvB93
Copy link
Member

BvB93 commented Sep 5, 2023

The __dlpack__ docs in the link you provided above do also clearly note that "Support for a stream value other than None is optional and implementation-dependent.", a case which is very much applicable to numpy.

EDIT: Arguably would be more accurate to type it upstream as a free parameter, e.g. a union between None and some type variable, rather than a fixed type as is the case now.

@BvB93 BvB93 added 33 - Question Question about NumPy usage or development component: numpy.array_api and removed 00 - Bug labels Sep 5, 2023
@aaronmondal
Copy link
Author

aaronmondal commented Sep 5, 2023

The __dlpack__ docs in the link you provided above do also clearly note that "Support for a stream value other than None is optional and implementation-dependent.", a case which is very much applicable to numpy.

EDIT: Arguably would be more accurate to type it upstream as a free parameter, e.g. a union between None and some type variable, rather than a fixed type as is the case now.

The note only talks about the implementation for a value other than None, not about the type. From what I understand the concrete implementation that handles the non-None case doesn't have to exist, but the type signature should still match. So I'm not saying that there should be an implementation for the non-None case, just that the signature should be annotated.

None vs int | Any |None is problematic since a Protocol that satisfies the standard API can't satisfy numpy.array_api arrays. I believe that this is the reason why the note you mentioned only talks about concrete values, not about types.

@BvB93
Copy link
Member

BvB93 commented Sep 6, 2023

The note only talks about the implementation for a value other than None, not about the type.

That's in practice purely academic difference? The only member of the None type is the literal None. Never the less, the fact remains that the signature in the array API docs is a simplification (xref #19083 (comment)), one that doesn't fully capture the optional nature of int/Any in favor of providing a simpler type signature (hence the note).

@aaronmondal
Copy link
Author

aaronmondal commented Sep 6, 2023

I wouldn't call this an academic difference since (int | None) != None and while this doesn't get flagged during runtime it gets flagged by static type checkers and potentially leads to hard errors when using runtime type checkers.

It seems to me that the issue here is with the API spec though. It's not clear whether the Optional means the actual typing.Optional (which would be int | None) or something like an informal "you can decide whether you type this parameter int or None".

I'd argue that the following are quite different:

def f(x: int | None = None) -> Any:
    # Probably the variant that should generally be preferred.
    if x is not None:
        raise NotImplementedError('We don't support this.')
    return do_something(x)
def g(x: int | None = None) -> Any:
    # Not ideal, but valid. Spec doesn't require handling the not-None case.
    return do_something(x)  # Undefined behavior, but fine per spec.
def h(x: None = None) -> Any:  # Wrong. Doesn't match type signature.
    return do_something(x)

I would assume that the spec means the optional in the sense of f and g, but not in the sense of h.

If h is valid it might make sense to either change this in the spec or to make the terminology more precise. AFAIU it would mean that we can't have generic protocols to check a library for it's array API compatibility.

cc @rgommers

@rgommers
Copy link
Member

rgommers commented Sep 6, 2023

All these signatures use positional arguments, so this doesn't make sense to me. It's a keyword argument with a default of None, and whether it's unused or an explicit stream=None is passed in is expected to not make a difference - you get the default behavior.

@aaronmondal
Copy link
Author

All these signatures use positional arguments, so this doesn't make sense to me.

My bad those were meant to be kwargs. I've updated my previous comment.

@rgommers
Copy link
Member

Okay, after some more thought:

  • I agree that f(x) is preferred, and that we should change the annotation in numpy.array_api
  • g(x) is indeed valid; handling unexpected input values by a library is good practice, but not required by the standard
  • h(x) indeed seems wrong

AFAIU it would mean that we can't have generic protocols to check a library for it's array API compatibility.

It is a goal to have the whole standard be type-safe/stable/checkable.

It's not clear whether the Optional means the actual typing.Optional (which would be int | None) or something like an informal "you can decide whether you type this parameter int or None".

That should be typing.Optional. In general, the aim is to both have valid and complete type annotations, and make them easy to understand as documentation. Sometimes those things conflict a little, but this doesn't seem to be one of those cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
33 - Question Question about NumPy usage or development component: numpy.array_api Static typing
Projects
None yet
Development

No branches or pull requests

4 participants