Skip to content

Conversation

ezyang
Copy link
Contributor

@ezyang ezyang commented Feb 1, 2024

Stack from ghstack (oldest at bottom):

Type checking Python is a pain. Here are my learnings:

  • The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using @overload liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
  • Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
  • GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang ezyang@meta.com

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Feb 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118870

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 0a35db4 with merge base 8f82a44 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Feb 1, 2024
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: f87d01b
Pull Request resolved: #118870
@ezyang ezyang requested review from int3 and lezcano February 1, 2024 15:10
@Skylion007
Copy link
Collaborator

FYI: You may have an opinion on this PR @ezyang sympy/sympy#26158

@lezcano
Copy link
Collaborator

lezcano commented Feb 1, 2024

I'll defer to @Skylion007 on this one. My only concern is the obvious: It's going to be difficult bordering impossible to ask people to write this sort of stuff in their code, and we're just going to end up with a thousand noqa's.

@ezyang
Copy link
Contributor Author

ezyang commented Feb 1, 2024

I mean, I'm concerned about this too! In this case, though, I don't feel like it will take a Haskell veteran to make equivalent types in other situations? But maybe I am overestimating.

Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM!
Comment is really more of a question than a ask to do it now.

@lezcano
Copy link
Collaborator

lezcano commented Feb 2, 2024

n this case, though, I don't feel like it will take a Haskell veteran to make equivalent types in other situations?

No it would not. But it is also easier to just skip the error via a noqa once you try the two obvious solutions and the linter still complains. If you want this to be enforced, the only way of doing it would be to have a CI job that checks for new additions of the relevant noqa qualifier and then points the person to a short tutorial on how to write them.
I know it's a bit of an overkill, but I really don't see 99% of the people adding this level of typing in their PRs.

Also note that things get exponentially worse the more you use the inherent Python dynamism...

@ezyang
Copy link
Contributor Author

ezyang commented Feb 2, 2024

For what it's worth, even mypy's authors think that there should be a point where you give up and type: ignore, and that you might even expect to hit this point very frequently. But the point was never to prove that your code was correct with the type system, it was to raise the bar for dumb problems.

What I'm a little less certain about is how well this will work for cases where there's some higgledy piggledy problem like sympy.Expr/Boolean confusion, and I am specifically trying to make the typing accurate so that people can catch problems like this. I don't know if people are going to respect the typechecker enough to notice that there is some nuance they missed. Actually, I think people tend to pay attention to what the machine tells them most of the time!

@ezyang
Copy link
Contributor Author

ezyang commented Feb 2, 2024

This cannot be landed until #118529 is relanded

@ezyang
Copy link
Contributor Author

ezyang commented Feb 2, 2024

The typing failure tests are cursed. What appears to happen is if I specifically invoke mypy from the test script, it chokes when trying to understand my new mypy plugin. But I cannot tell what about my plugin is causing it to choke. And I cannot debug probe it because I appear to be hitting compiled mypy (but suo tells me mypyc is also dead so... think??)

Type checking Python is a pain. Here are my learnings:

* The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using `overload` liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
* Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
* GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
@ezyang
Copy link
Contributor Author

ezyang commented Feb 3, 2024

mypy problem is python/mypy#16866

Type checking Python is a pain. Here are my learnings:

* The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using `overload` liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
* Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
* GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Feb 4, 2024
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: b749f6f
Pull Request resolved: #118870
Type checking Python is a pain. Here are my learnings:

* The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using `overload` liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
* Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
* GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
@ezyang
Copy link
Contributor Author

ezyang commented Feb 5, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 5, 2024
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

Type checking Python is a pain. Here are my learnings:

* The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using `overload` liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
* Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
* GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Feb 5, 2024
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: 4e83ba9
Pull Request resolved: #118870
@ezyang
Copy link
Contributor Author

ezyang commented Feb 5, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@ezyang ezyang added the topic: not user facing topic category label Feb 5, 2024
@ezyang
Copy link
Contributor Author

ezyang commented Feb 5, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorch-bot bot pushed a commit that referenced this pull request Feb 8, 2024
Type checking Python is a pain. Here are my learnings:

* The types for heavily polymorphic code is going to be verbose, no way around it. I originally was hoping I could lean on polymorphism with a bounded TypeVar to compactly write signatures for many of the ValueRanges methods, but I ran into some unworkaroundable mypy bugs. Writing out all the types explicitly and using `@overload` liberally works pretty well, so I think I recommend people do that instead of trying to do fancy things.
* Sympy is missing annotations for assumptions, because they are all metaprogrammed. I don't really relish maintaining a typeshed for sympy, so I wrote a small mypy plugin to add them in.
* GADT style refinement is... just not a good idea in practice. Mypy easily gets confused whether or not a return value from a refined section is allowed for the outer return type. So many of these have been replaced with less informative implementation types and more informative external types via overloads. Hopefully this is good for use sites.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #118870
Approved by: https://github.com/Skylion007, https://github.com/albanD
@github-actions github-actions bot deleted the gh/ezyang/2545/head branch March 7, 2024 01:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants