Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use TypeIs in is_dataclass #11929

Merged
merged 1 commit into from
May 26, 2024
Merged

Use TypeIs in is_dataclass #11929

merged 1 commit into from
May 26, 2024

Conversation

NeilGirdhar
Copy link
Contributor

Fixes #9723

@NeilGirdhar NeilGirdhar force-pushed the isdataclass branch 2 times, most recently from 7721a02 to 3f47361 Compare May 17, 2024 10:37

This comment has been minimized.

@NeilGirdhar
Copy link
Contributor Author

@erictraut Is this a problem with Pyright?

@erictraut
Copy link
Contributor

@NeilGirdhar, can you be more specific? I'm not sure what you're asking.

@JelleZijlstra
Copy link
Member

I think it's about this CI failure:

/home/runner/work/typeshed/typeshed/stdlib/@tests/test_cases/check_dataclasses.py
  /home/runner/work/typeshed/typeshed/stdlib/@tests/test_cases/check_dataclasses.py:60:21 - error: "assert_type" mismatch: expected "type[DataclassInstance]" but received "<subclass of type and DataclassInstance>" (reportAssertTypeFailure)

I haven't looked at this code in detail to figure out whether pyright is correct here.

@NeilGirdhar
Copy link
Contributor Author

NeilGirdhar commented May 17, 2024

So, I removed one of the overloads, but from my testing it didn't seem necessary anymore. I'll quickly try replacing it to see if it makes a difference. (Edit: still broken. Removing it since it doesn't belong anymore IMO.)

The failure in PyRIght seems to be because Pyright isn't evaluating the overloads for TypeIs functions? Or else it's picking the wrong one. But even if it's picking the wrong one, why is type & (DataclassInstance | type[DataclassInstance]) equal to type & DataclassInstanc (or Pyright's approximation to the intersection). Seems odd to me.

This comment has been minimized.

This comment has been minimized.

@erictraut
Copy link
Contributor

@NeilGirdhar, PEP 742 allows some variation between type checkers. What you're seeing here is an example of this. Pyright's evaluation is arguably more accurate (narrower) and more consistent than mypy's, but both results are defensible. I think you'll need to change your test case to accommodate this variation.

PEP 742 says:

As a rule of thumb, a type checker should use the same type narrowing logic – and get results that are consistent with – its handling of isinstance().

If you try this self-contained code sample in both pyright and mypy, you'll see that mypy is inconsistent but pyright is consistent.

from dataclasses import Field
from typing import Any, ClassVar, Protocol, runtime_checkable
from typing_extensions import TypeIs

@runtime_checkable
class DataclassInstance(Protocol):
    __dataclass_fields__: ClassVar[dict[str, Field[Any]]]

def is_dataclass(obj: type) -> TypeIs[type[DataclassInstance]]: ...

def test1(x: type) -> None:
    if isinstance(x, DataclassInstance):
        reveal_type(x) # Both mypy and pyright reveal <subclass of "type" and "DataclassInstance">

def test2(x: type) -> None:
    if is_dataclass(x):
        reveal_type(x) # Mypy reveals "DataclassInstance" but pyright gives results consistent with "isinstance"

@NeilGirdhar
Copy link
Contributor Author

NeilGirdhar commented May 18, 2024

@erictraut Sorry, but I don't understand how these two functions are supposed to be the same. In the first function, you are checking against DataclassInstance whereas in the second you are checking against type[DataclassInstance].

If you change is_dataclass to return TypeIs[DataclassInstance], then MyPy and PyRight agree with each other and are consistent between the two functions.

Similarly, if you change the isinstance line to issubclass(x, DataclassInstance), then you will see that MyPy is consistent between the two functions whereas Pyright is not.

Therefore, I believe this is a bug in Pyright even if as you say PEP 742 doesn't specify this case.

@erictraut
Copy link
Contributor

Ah yes, you're correct. This is a bug in pyright.

@erictraut
Copy link
Contributor

@NeilGirdhar, this will be fixed in the next release of pyright. I typically publish a new version every Tuesday evening PST. You could either wait until then to land this PR or you could temporarily comment out this test case.

@NeilGirdhar NeilGirdhar force-pushed the isdataclass branch 2 times, most recently from 27bb3ff to e48608a Compare May 22, 2024 09:06

This comment has been minimized.

@NeilGirdhar
Copy link
Contributor Author

@JelleZijlstra There appear to be some errors with updating PyRight. Would you mind updating PyRight when you have a chance?

@srittau
Copy link
Collaborator

srittau commented May 22, 2024

Dependencies are updated automatically by renovate each UTC night.

Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

pydantic (https://github.com/samuelcolvin/pydantic)
+ pydantic/v1/json.py:80: error: No overload variant of "asdict" matches argument type "type[DataclassInstance]"  [call-overload]
+ pydantic/v1/json.py:80: note: Possible overload variants:
+ pydantic/v1/json.py:80: note:     def asdict(obj: DataclassInstance) -> dict[str, Any]
+ pydantic/v1/json.py:80: note:     def [_T] asdict(obj: DataclassInstance, *, dict_factory: Callable[[list[tuple[str, Any]]], _T]) -> _T
+ pydantic/deprecated/json.py:98: error: No overload variant of "asdict" matches argument type "type[DataclassInstance]"  [call-overload]
+ pydantic/deprecated/json.py:98: note: Possible overload variants:
+ pydantic/deprecated/json.py:98: note:     def asdict(obj: DataclassInstance) -> dict[str, Any]
+ pydantic/deprecated/json.py:98: note:     def [_T] asdict(obj: DataclassInstance, *, dict_factory: Callable[[list[tuple[str, Any]]], _T]) -> _T

hydra-zen (https://github.com/mit-ll-responsible-ai/hydra-zen)
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "Literal[False] | None"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "bool"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "Builds[Callable[[Callable[..., Any]], Callable[..., Any]]] | ZenPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]] | HydraPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]] | Just[Callable[[Callable[..., Any]], Callable[..., Any]]] | type[Builds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[ZenPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[HydraPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[Just[Callable[[Callable[..., Any]], Callable[..., Any]]]] | Callable[[Callable[..., Any]], Callable[..., Any]] | str | None | Sequence[Builds[Callable[[Callable[..., Any]], Callable[..., Any]]] | ZenPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]] | HydraPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]] | Just[Callable[[Callable[..., Any]], Callable[..., Any]]] | type[Builds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[ZenPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[HydraPartialBuilds[Callable[[Callable[..., Any]], Callable[..., Any]]]] | type[Just[Callable[[Callable[..., Any]], Callable[..., Any]]]] | Callable[[Callable[..., Any]], Callable[..., Any]] | str | None]"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "Mapping[str, SupportedPrimitive] | None"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "list[str | DataClass_ | type[DataClass_] | Mapping[str, str | Sequence[str] | None]] | None"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "str | None"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "tuple[type[DataClass_], ...]"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "ZenConvert | None"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1129: error: Argument 2 to "builds" of "BuildsFn" has incompatible type "**dict[str, int | float | Path | DataClass_ | type[DataClass_] | Enum | Any | Sequence[HydraSupportedType] | Mapping[Any, HydraSupportedType] | None]"; expected "T"  [arg-type]
- src/hydra_zen/structured_configs/_implementations.py:1161: error: Incompatible types in assignment (expression has type "Just", variable has type "type[Builds[Any]]")  [assignment]

pytest (https://github.com/pytest-dev/pytest)
+ src/_pytest/_io/pprint.py:117: error: Right operand of "and" is never evaluated  [unreachable]
+ src/_pytest/_io/pprint.py:125: error: Statement is unreachable  [unreachable]

streamlit (https://github.com/streamlit/streamlit)
+ lib/streamlit/runtime/caching/hashing.py: note: In member "_to_bytes" of class "_CacheFuncHasher":
+ lib/streamlit/runtime/caching/hashing.py:409:34: error: No overload variant of "asdict" matches argument type "Type[DataclassInstance]"  [call-overload]
+ lib/streamlit/runtime/caching/hashing.py:409:34: note: Possible overload variants:
+ lib/streamlit/runtime/caching/hashing.py:409:34: note:     def asdict(obj: DataclassInstance) -> Dict[str, Any]
+ lib/streamlit/runtime/caching/hashing.py:409:34: note:     def [_T] asdict(obj: DataclassInstance, *, dict_factory: Callable[[List[Tuple[str, Any]]], _T]) -> _T

@NeilGirdhar
Copy link
Contributor Author

NeilGirdhar commented May 24, 2024

Updated. Ready for pull.

@srittau
Copy link
Collaborator

srittau commented May 24, 2024

Looking at the primer output: pydantic and streamlit are true positives. They use is_dataclass() and immediately call asdict() if true. This can lead to errors if is_dataclass() is a dataclass and not a dataclass instance. (Weird design choice to have one function to check for both types and instances ...)

pytest is a false positive:

https://github.com/pytest-dev/pytest/blob/889d9b28d786c75b66f4d1acb80123bdb341639c/src/_pytest/_io/pprint.py#L117C13-L118C45

It's basically complaining about this code (where object is Any):

if is_dataclass(object) and not isinstance(object, type): ...

I'm unsure why that is. Maybe it's assuming that is_dataclass(Any) is always True?

@NeilGirdhar
Copy link
Contributor Author

(Weird design choice to have one function to check for both types and instances ...)

Definitely a weird choice.

So, can the PR be pulled, or should we wait for the dependent projects to fix bugs?

@srittau
Copy link
Collaborator

srittau commented May 25, 2024

I'm fine with merging this, despite the false positive, but I'll leave it open for a day or two if someone has an idea how to fix/work around it.

@NeilGirdhar
Copy link
Contributor Author

Sounds good 😄 !

@srittau srittau merged commit 6246a38 into python:main May 26, 2024
63 checks passed
@NeilGirdhar NeilGirdhar deleted the isdataclass branch May 26, 2024 17:09
@NeilGirdhar
Copy link
Contributor Author

Thanks for the quick merge. If anyone has time, it may be worth mentioning that most of the other uses of TypeGuard should actually be changed to TypeIs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Repair is_dataclass definition?
4 participants