-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type annotations #2025
Comments
Yes, please! Python-3 type annotations to torchvision would be great! |
Two questions:
|
@pmeier |
Type annotations ToDo
|
I don't know if there is already an internal discussion about this, but in the light of inline type annotations I think we really should talk about the code format. With type annotations the signatures get quite big and with in our current style this not only looks ugly but also hinders legibility: def save_image(tensor: Union[torch.Tensor, Sequence[torch.Tensor]], fp: Union[str, io.FileIO, io.BytesIO],
nrow: int = 8, padding: int = 2, normalize: bool = False, range: Optional[Tuple[int, int]] = None,
scale_each: bool = False, pad_value: int = 0, format: Optional[str, io.FileIO] = None) -> None: The same signature formatted with def save_image(
tensor: Union[torch.Tensor, Sequence[torch.Tensor]],
fp: Union[str, io.FileIO, io.BytesIO],
nrow: int = 8,
padding: int = 2,
normalize: bool = False,
range: Optional[Tuple[int, int]] = None,
scale_each: bool = False,
pad_value: int = 0,
format: Optional[str, io.FileIO] = None,
) -> None: which (subjectively) looks better and is more legible. Note that I'm not advocating specifically for |
I'm probably not the best person to ask because I never liked the pyi files for Python-implemented bits in the first place. That said, my understanding is that think it's fine to do them inline now. Regarding the formatting: I do agree that cramming as much as possible in each line probably isn't the best way. You might see if PyTorch has a formatting preference in its formatting checks and what's the generally recommended Python way (I never know). |
Thanks for the input @t-vi! @pmeier good point about the code getting messy with type annotations. I don't have a good answer now, although indeed the output of Let me add @cpuhrsch @vincentqb and @zhangguanheng66 for discussion as well |
I second the use of |
Yes. |
This would also make reviewing easier. If you have a look at this review comment is unclear at first glance which parameter is meant since GitHub only allows to review a complete line. |
Although black is a nice formatter, I would be careful with re-formatting the whole codebase at once -- this effectively messes up with |
The author talked very briefly about integrating |
The problem is that most of the users (including Github UI) most probably only uses |
As we will touching every signature while adding annotations we could simply format them with |
@pmeier I think we could adopt the function-signature style of black, but leave the rest of the code-base unchanged? If that's what you proposed, I'm ok with it! |
I don't think this remark is fair. Please when it doubt, assume positive intent and seek clarifications. Coming to the discussion, my original position was that mypy is the defacto static analysis tool used for validating typing and that we should use it as much as possible (except in cases where it makes mistakes where we turn it off). This is pretty much the way we use it right now in TorchVision. By taking into account your remarks along with those of @pmeier and @oke-aditya, it became clear that sometimes it's very hard to please it and this could lead to convoluted workarounds (see #4237). Given that perhaps a more reasonable approach is to try to adjust, configure and use mypy with caution, as @pmeier advocates. |
Philip, That we can set flags on a per-file basis does not change the fact that these annotations won't help users until mypy doesn't conflict with torchscript, thus (IMO) drastically reducing the benefits of these annotations. Also, yes, there are technical explanations and technical solutions to all of our problems here. Thank you for taking the time and effort to clarify these. But just because we can logically explain something doesn't mean that this thing is easy, or natural, or expected for most contributors. The barrier to contribute is something that I care deeply about, and it's also something that our upper management cares about (it came up various times in internal meetings). We're all trying to foster external contributions in this very issue, so surely, we all care about it here. And indeed, contributors can ask for help, but from my own experience these mypy-related issues popup so frequently that this isn't just something we can ignore, or disregard, or just waive off as a minor hindrance. Vasilis, my apologies if my comment came out as unfair. Please rest assured that I am assuming positive intent and seeking clarifications. On my end, I believe I missed your position on a few points that I tried to make, which left me with the impression that my comments had been ignored. With that regard, in order for me to better understand where we all stand, would you mind clarifying your position on the following items: Do you disagree with my POV that type annotations won't help users much until mypy doesn't conflict with torchscript? From my understanding, you believe that annotations are still relevant info for us (the developers), and you prefer annotations to docstrings. Which leads me to these next questions: What is your position regarding mypy raising the barrier for contributors? Thanks a lot for you patience and feedback. |
I believe that anyone who reads the code, either they are a user or a developer of the library, can benefit from seeing good typing information. Provided that the info is not wrong or misleading, they improve code readability and make the code more explicit. You did convince me though that in cases where mypy requires lots of workarounds, we should omit them.
I'm not sure why docstrings are even considered an option as they are meant for documentation not for defining types (perhaps when Python did not support typing it was a workaround but not anymore). Python's effort of introducing typing is still in its infancy and the static analysis tools like mypy are not mature. Still I believe it's a step towards the right direction to make the language more precise and potentially faster. For me the typing annotations is the right tool to encode typing and using them will future-proof our code (covering the code-base will take time and it's not an all-or-nothing thing, will unlock potential speed benefits as the language evolves, will improve readability etc).
We can use type aliases if we want to reduce complexity.
I don't think it's going to be a problem. We are a friendly bunch and we can help getting this fixed during the code reviews. Moreover contributors can still submit code without annotations and we can take care of that either on a follow up (good trade-off to reduce the back-and-forth and frustration on new contributors). I would even say that adding typing info is a good "bootcamp" task for someone who wants to learn the code-base and get involved. |
True, not debating that. Same for the legibility debate in the docstrings. There are definitively cases where the annotation is less legible than a short description. Unfortunately, this will usually be the case for heavily overloaded keyword arguments in convenience functions. These functions are probably used by a lot of users so good documentation is key here. We need to find a solution for that. An easy solution would be to let the annotations and the docstring diverge. We are not enforcing equality here. My point is that in the examples you have shown the nuisance stems from the fact that either
Note that you are switching your argument here from "we should hold adding type annotations until
I don't think asking for help is an issue. This is true not only for the contributors, but also for us maintainers. If you see some "weird" |
Thank you both for taking the time to reply. I will address a few last points below that I believe are important to clarify, but I think it's probably best to move on. I've made my case and I understand that I didn't convince you yet, so I probably won't be able to convince you with more discussion. On my side, I still fail to foresee the benefits of this current typing effort, but I'll be patient and hopefully the benefits will be appear clearer in the future, should you decide to move forward with it. Thanks again for your input :)
It seems that we're referring to different things here. I will definitely agree that docstrings aren't a substitute for annotations, and annotations aren't a substitute for docstrings either. They have different purposes (documentation vs type-checking) and a different audience (humans vs computers). My original point here, which was a reply to an argument of yours #2025 (comment), is that when it comes to code readability, annotations add no value over a docstring. Again: the point of an annotation is not to specify types for human readers, they specify types for the type-checker; types should be documented for humans as well, but this is the docstrings' job.
I generally agree with your opinion, but I would like to add a bit of nuance. The first one is that the word "simply" in "simply silence it" looks out of place for me :). Getting to understand and silence mypy was never simple or obvious to me. The second is that, while I agree that we should ideally take the time to properly refactor the code when mypy flags some code smell, the reality is that this is usually not the mode that we operate in. For better or worse, we tend to get sh!t done and silence / merge fast, rather than fully address the underlying issues.
To clarify: you're right that the barrier of entry will hardly change in the future. However, once mypy and torchscript are compatible, we'll be able to fully type-check torchvision. I can understand that fully type-checking torchvision is a strong enough reason to eventually raise the contribution bar. But until we can properly type check, I believe that annotations have a very limited value, and so I believe that it's not enough of a reason at the moment for raising the bar. |
Maybe we are finally getting to the source of your aversion. Let's take your example from above: def return_4() -> int:
l = [2, 4, 6]
assert 4 in l # basically something that you *know* is True for at least one element in l
def cond(x):
return x == 4
for x in l:
if cond(x):
return x Running
If you just want to silence
If you now place a def return_4() -> int: # type: ignore[return]
...
IMHO, this is as simple as locally silencing other linters such as
I can fully get behind that. On anything that is not user facing, we can just slap a
Again, I can get behind that. @oke-aditya and @frgfm have put in quite some effort into the PRs so I think it is fair to properly review them. After that I wouldn't push further until torchscript finally gets support for fundamental stuff. Of course, if the development of torchscript is halted we need to revisit this and see if it is worth to still add the annotations. |
Hey everyone 👋 Just making sure you guys don't trouble yourself too much for @oke-aditya & me:
Again, you all have brought us an extremely useful framework and related resources, I'm far from being the only one willing to help 👌 Happy to discuss this further, we all just want to continue making the PyTorch ecosystem as user/developer-friendly and useful as possible 😀 |
Hello. I would like to add type annotations to file torchvision/transforms/transforms.py. The reason is not just for static type checking. The goal is to make I am asking here because I saw the |
We are currently in the process of revamping the transforms. With this, we are likely to drop JIT scriptability for the transform classes (although there are options to retain it #6711). Thus, there is no reason to keep wrong or unnecessary strict annotations on these classes. See #5626 for a discussion. |
On this topic, should we add / update the TODO list in the issue description? Or open another perhaps? I can easily this GH issue sticking around forever otherwise :) |
@frgfm Yes, it would be a good idea to get a better summary of what needs to be done and what is already finished. Core team is swamped at the moment so we won't be making progress on that soon. If you are willing, could you post a comment like I did in #2025 (comment) (probably finer grained) and link open PRs. I freely admit that I have no idea what is still open and blocked by something (I vaguely remember there was a PR from you where the JIT simply didn't respond on core). After that we can make a decision how to move forward. For datasets and transforms it is probably harder since they are still in prototype mode. We already have #5626 and #6668 that deal with annotations there so you might have a look at them first. |
Hello all, I recently stumbled across this thread as I have been spending the past few days developing stubs for torchvision as PyTorch already had py.typed incorporated to support the code base. I spent a good deal of time reviewing this thread, the CONTRIB readme, and it appears that there is still value in moving forward with contributing to the type hinting while keeping in mind a few key points:
I was going through the dev setup for Mac, but given some recent issues in #8421, I have a dev container for linux up and running. One small question though: I notice quite a few times mypy throwing errors over import type of Any (which can be suppressed) but seems to be an issue with not have a proper annotation of |
Hi @scm-aiml , Thanks for working on this. We're not planning on adding any new type annotation at this point. Those that currently exist in torchvision are often wrong because they are incompatible with torchscript, so type-checking torchvision code has little value: it'll error for irrelevant reasons. Since here's no plan on adding more annotations, I'll close this issue to avoid misleading contributors. Sorry about that @scm-aiml |
one last thing: if there's a community-supported stub package for torchvision that gets regular updates, I'd be more than happy to point users to it (without BC guarantees from torchvision's part though) |
Hi @NicolasHug, I didn't intend to disregard your comments in the thread. My confusion just comes from seeing your comment in August about 3 years ago and having strong feelings opposing typing torchvision, but then there continued to be a large series of contribution(s) that were approved and brought in from PRs. I guess my primary question would be are PRs not supported/received which make these contributions or are they just not going to be labeled for any priority as a "feature request"? |
No worries @scm-aiml A lot has passed since the old discussion was started. I was never convinced that type annotations made much sense for torchvision, and I haven't changed my mind on that. The discussion above was never truly resolved in the sense that active maintainers at the time just... disagreed to disagree. So, with inertia, the status quo continued: PRs were submitted and those who were in favor of annotating torchvision reviewed and merged this PR. |
🚀 Feature
Type annotations.
Motivation
Right now, if a project depends on
torchvision
and you want to static type check it, you have no choice but either ignore it or write your own stubs.torch
already has partial support for type annotations.Pitch
We could add type annotations for
torchvision
as well.Additional context
If we want this, I would take that up.
The text was updated successfully, but these errors were encountered: