Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support flexible TypedDict creation/update #15425

Merged
merged 13 commits into from
Jun 26, 2023

Conversation

ilevkivskyi
Copy link
Member

Fixes #9408
Fixes #4122
Fixes #6462
Supersedes #13353

This PR enables two similar technically unsafe behaviors for TypedDicts, as @JukkaL explained in #6462 (comment) allowing an "incomplete" TypedDict as an argument to .update() is technically unsafe (and a similar argument applies to ** syntax in TypedDict literals). These are however very common patterns (judging from number of duplicates to above issues), so I think we should support them. Here is what I propose:

  • Always support cases that are safe (like passing the type itself to update)
  • Allow popular but technically unsafe cases by default
  • Have a new flag (as part of --strict) to fall back to current behavior

Note that unfortunately we can't use just a custom new error code, since we need to conditionally tweak some types in a plugin. Btw there are couple TODOs I add here:

  • First is for unsafe behavior for repeated TypedDict keys. This is not new, I just noticed it when working on this
  • Second is for tricky corner case involving multiple ** items where we may have false-negatives in strict mode.

Note that I don't test all the possible combinations here (since the phase space is huge), but I think I am testing all main ingredients (and I will be glad to add more if needed):

  • All syntax variants for TypedDicts creation are handled
  • Various shadowing/overrides scenarios
  • Required vs non-required keys handling
  • Union types (both as item and target types)
  • Inference for generic TypedDicts
  • New strictness flag

More than half of the tests I took from the original PR #13353 by @eflorico

@github-actions

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

OK, so about the few new errors in mypy_primer:

  • Auto-Split: two new errors are correct, previously not-shown because inferred type from previous error was Any. Btw @hauntsaninja that repo is archived, it looks like development is now at https://github.com/Toufool/AutoSplit
  • pydantic: Previously error code was misc, so now type ignores stopped working. Not sure what is the policy on this, I guess this is OK. Also maybe we can allow **Any and **dict[Any, Any] in TypedDict constructor? We probably can still give an error about missing required keys.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't look at the code in detail, but I do think the Any cases should be allowed. Any means it could be anything, so as long as the runtime type could be something legal, we shouldn't show an error if you use Any.

pass

foo: Dict[str, Any] = {}
bar: Bar = {**foo} # E: Unsupported type "Dict[str, Any]" for ** expansion in TypedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this should be allowed; Any means Any.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Ivan that for consistency we can only this if the type is Dict[Any, Any]. However, I'm not sure if Dict[Any, Any] is supported in other contexts (need to double check).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW as far as I know Dict[Any, Any] is not well supported in other TypedDict contexts. I decided to allow Dict[Any, Any] only because I found an example in mypy_primer where Dict[Any, Any] is used specifically in ** context. I think we can allow Dict[Any, Any] in other contexts, e.g. it can be a (non-proper) subtype of all non-total TypedDicts.

x: Any
y: Dict[Any, Any]
z: Union[Any, Dict[Any, Any]]
t1: Foo = {**x} # E: Missing key "a" for TypedDict "Foo"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel these should all be allowed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion, but a reasonable first step would be to do the same we do with Dict[Any, Any] in other TypedDict contexts, and if we decide to make things more flexible, we should do consistently.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Jun 14, 2023

For what it's worth, the behaviour with Any for this kind of thing was intentional, see #4976 (comment)
pyright's behaviour also matches mypy's here.

@ilevkivskyi
Copy link
Member Author

Just to clarify, there are two question/decisions here: First question is should we allow:

  • **Any -- IMO definitely yes
  • **dict[Any, Any] -- I think probably yes. My motivation is this is kind of like allowing a: Any; k: Literal["key"] = a (and we allow this)
  • **dict[str, Any] -- IMO definitely no, because this is like allowing s: str; k: Literal["key"] = s. str is not like Any in the world of literals, it is like object in the world of literals.

Second question is whether we should still warn about missing required keys in cases where we allow Any types. I think probably yes, my motivation is this is kind of like we always prohibit

a: Any
k: Final = "key"
k = a

This is fine from the purely type comparison point of view (because of Any), but not semantically correct.

@ilevkivskyi
Copy link
Member Author

Are there any other comments on this? (Also besides the Any type behavior)

Copy link
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For strict update, can we also allow in the case that the argument TypedDict is final? Similar situation to #7981 / microsoft/pyright#1899 (comment)

Comment on lines 2965 to 2966
[case testTypedDictUnpackSame]
from typing import TypedDict
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might as well test this one with strict. Probably some of the other test cases with no errors could be tested with strict as well, like testTypedDictUnpackCompatible, testTypedDictUnpackMultiple

Suggested change
[case testTypedDictUnpackSame]
from typing import TypedDict
[case testTypedDictUnpackSame]
# flags: --strict-typeddict-update
from typing import TypedDict

@ilevkivskyi
Copy link
Member Author

@hauntsaninja

For strict update, can we also allow in the case that the argument TypedDict is final?

I think this is kind of a separate question, first we need a more general support for final TypedDicts, i.e. first #7981 needs to be closed, most importantly a TypedDict with extra keys should not be a subtype of a final TypedDict. Once this is done, yes, we can allow final TypedDicts even in strict mode.

@github-actions

This comment has been minimized.

@JukkaL
Copy link
Collaborator

JukkaL commented Jun 23, 2023

I think that it's reasonable to let users enable the old (safe) behavior using a command-line flag, but adding a new flag doesn't seem worth it. I'd suggest adding a new flag that would cover both --strict-typeddict-update and --strict-concatenate, and we'd make --strict-concatenate a deprecated flag that isn't shown in --help. The new flag could be called --pedantic or --extra-checks, for example. We can later add more safety checks that are technically correct, will probably not find many real issues, but can cause friction, behind this flag. Some other flags, such as --strict-equality, should probably not be merged into this flag, since even though it has false positives, it often finds real issues.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! These are very useful and often requested improvements. Some details are still open, but they can be refined later on after this has been merged.

pass

foo: Dict[str, Any] = {}
bar: Bar = {**foo} # E: Unsupported type "Dict[str, Any]" for ** expansion in TypedDict
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Ivan that for consistency we can only this if the type is Dict[Any, Any]. However, I'm not sure if Dict[Any, Any] is supported in other contexts (need to double check).

x: Any
y: Dict[Any, Any]
z: Union[Any, Dict[Any, Any]]
t1: Foo = {**x} # E: Missing key "a" for TypedDict "Foo"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion, but a reasonable first step would be to do the same we do with Dict[Any, Any] in other TypedDict contexts, and if we decide to make things more flexible, we should do consistently.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

AutoSplit (https://github.com/Toufool/AutoSplit)
- src/user_profile.py:118: error: Expected keyword arguments, {...}, or dict(...) in TypedDict constructor  [misc]
+ src/user_profile.py:133: error: TypedDict key must be a string literal; expected one of ("split_hotkey", "reset_hotkey", "undo_split_hotkey", "skip_split_hotkey", "pause_hotkey", ...)  [literal-required]
+ src/user_profile.py:137: error: TypedDict key must be a string literal; expected one of ("split_hotkey", "reset_hotkey", "undo_split_hotkey", "skip_split_hotkey", "pause_hotkey", ...)  [literal-required]

graphql-core (https://github.com/graphql-python/graphql-core): typechecking got 1.05x slower (361.3s -> 380.5s)
(Performance measurements are based on a single noisy sample)

optuna (https://github.com/optuna/optuna)
+ Warning: --strict-concatenate is deprecated; use --extra-checks instead

pydantic (https://github.com/samuelcolvin/pydantic)
+ pydantic/v1/networks.py:237: error: Unsupported type "dict[str, str]" for ** expansion in TypedDict  [typeddict-item]
+ pydantic/v1/networks.py:237: note: Error code "typeddict-item" not covered by "type: ignore" comment

hydra-zen (https://github.com/mit-ll-responsible-ai/hydra-zen)
- src/hydra_zen/wrapper/_implementations.py:1472: error: Argument 1 to "update" of "TypedDict" has incompatible type "_StoreCallSig"; expected "TypedDict({'name'?: str | Callable[[Any], str], 'group'?: str | None | Callable[[Any], str | None], 'package'?: str | Callable[[Any], str] | None, 'provider'?: str | None, '__kw'?: dict[str, Any], 'to_config'?: Callable[[Any], Any]})"  [typeddict-item]

@ilevkivskyi ilevkivskyi merged commit 8290bb8 into python:master Jun 26, 2023
21 checks passed
@ilevkivskyi ilevkivskyi deleted the support-start-td branch June 26, 2023 23:26
@eflorico
Copy link

eflorico commented Jul 2, 2023

Thank you for your work @ilevkivskyi, this is really exciting! I'm glad I could help a little bit through my earlier tests 😊

@LarsDu
Copy link

LarsDu commented Jul 6, 2023

Thanks for this @ilevkivskyi. I was shocked this was only fixed so recently

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants