Using sphinx documentation annotations for type checking in py2 mode #1015
So I've been able to gather that mypy has a python 2 mode, triggered via --py2, and that the way to annotate types in this mode was via comments.
I've been using pycharm on one side, and jedi for completion in vim on the other side for quite a while. Both of those tools can use sphinx doc annotations to infer types in python 2 mode.
I was wondering if that had been considered for mypy ? It would be tremendously useful for us for those programs that we can't migrate to python 3.
Note that if this is considered a good idea but has not been done because nobody had time/desire to do it, I'm completely ready to get my hands dirty !
I'm reluctant to add this -- it was proposed and considered in the discussion leading to PEP 484 but it has too many drawbacks (including, it's very verbose).
If you really want this it would be nice to start a discussion in the PEP 484 tracker (https://github.com/ambv/typehinting) or on python-ideas about whether we should standardize one or multiple syntaxes for type hints in Python 2 code, and if so, which.
Another problem with using sphinx markup in docstrings is that they are a bit hard to translate back to the PEP 484 style annotations (once you're leaving behind Python 2), while the current comment-based syntax maps pretty directly to PEP 484 (and I intend to write a converter).
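To illustrate how directly the comment syntax maps to PEP 484 annotations, here is a minimal sketch of such a conversion. It is not the converter mentioned above: `annotate_from_type_comment` is a hypothetical name, and the sketch assumes Python 3.9+ (for `ast.unparse`), source that the running interpreter can parse, and plain positional parameters that the type comment lists in full (no omitted `self`, no `*args`).

```python
import ast


def annotate_from_type_comment(source):
    """Rewrite signature type comments as inline PEP 484 annotations.

    Hypothetical helper, for illustration only. Comments and original
    formatting are lost because the code is regenerated from the AST.
    """
    tree = ast.parse(source, type_comments=True)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.type_comment:
            # "(int, int) -> int" parses into argument types and a return type.
            func_type = ast.parse(node.type_comment, mode="func_type")
            for arg, arg_type in zip(node.args.args, func_type.argtypes):
                arg.annotation = arg_type
            node.returns = func_type.returns
            node.type_comment = None  # drop the now redundant comment
    return ast.unparse(tree)


source = (
    "def add(a, b):\n"
    "    # type: (int, int) -> int\n"
    "    return a + b\n"
)
print(annotate_from_type_comment(source))
```

The point is only that each piece of the comment has an obvious destination in the annotated signature, which is much harder to guarantee for free-form docstring markup.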
If you want to get your hands dirty in mypy it might be better to focus on tools to make working with the currently supported syntax better -- e.g. we need a tool to add a first draft of type annotations to all functions, to bootstrap type-checking code that isn't annotated. (I have several millions of lines of such code that I want to analyze, and it's not using the Sphinx convention, and the hack I currently have isn't good enough. Maybe the mypy parser can be reused for this purpose? Otherwise the lib2to3 framework should be able to handle this easily.)
We also need better stubs for the stdlib. Many of the current stub files only define 1-2 functions out of the dozens actually defined by the stdlib module. And I would really like a stub generator that parses the modules instead of importing them. (Again, maybe mypy's parser or lib2to3 can be used here.)
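As a rough sketch of what "parses the modules instead of importing them" could mean, the snippet below builds a first-draft stub with the standard `ast` module rather than mypy's parser or lib2to3. The names `generate_stub` and `_stub_function` are hypothetical, every type is a placeholder `Any`, and it assumes the source can be parsed by the interpreter running the sketch. Keyword-only arguments, decorators, base classes and nested definitions are ignored.

```python
import ast


def _stub_function(node, indent, is_method=False):
    """Return a one-line stub for a function definition node."""
    args = node.args
    first_default = len(args.args) - len(args.defaults)
    params = []
    for i, arg in enumerate(args.args):
        if is_method and i == 0:
            params.append(arg.arg)  # leave self/cls unannotated
            continue
        text = arg.arg + ": Any"
        if i >= first_default:
            text += " = ..."  # a default exists; its value is not copied
        params.append(text)
    if args.vararg:
        params.append("*" + args.vararg.arg + ": Any")
    if args.kwarg:
        params.append("**" + args.kwarg.arg + ": Any")
    return "{}def {}({}) -> Any: ...".format(indent, node.name, ", ".join(params))


def generate_stub(source):
    """Return draft stub text for top-level functions and classes.

    The source is only parsed, never imported, so no module-level
    code is executed.
    """
    lines = ["from typing import Any", ""]
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            lines.append(_stub_function(node, indent=""))
        elif isinstance(node, ast.ClassDef):
            lines.append("class {}:".format(node.name))
            methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
            for method in methods:
                lines.append(_stub_function(method, "    ", is_method=True))
            if not methods:
                lines.append("    ...")
    return "\n".join(lines) + "\n"


example = (
    "def join(a, b='x', *rest, **opts):\n"
    "    return a\n"
    "\n"
    "class C(object):\n"
    "    def m(self, n):\n"
    "        return n\n"
)
print(generate_stub(example))
```

A real generator would still need a human (or inference) to replace the `Any` placeholders; the sketch only covers the bootstrapping step.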
I agree that it is very verbose, and honestly, starting from a clean codebase, I probably would not choose this way of adding type annotations.
The point in my opinion is more that this has been the standard (by status quo) way of annotating types for a long time in a lot of tools, and that supporting it would allow a lot of people that standardized this practice internally (which is the case on a lot of projects at my company) to transition more smoothly to gradual static typing.
Another thing to keep in mind in my opinion, is that while it is verbose, it is also documentation. For a lot of code (libraries come to mind), this will be done anyway, so reusing documentation, in the context of python 2, makes more sense than maintaining two sets of type hints information. In the context of Python 3 of course this is irrelevant because doc generators can extract type information from the AST directly.
Ok, will do !
I looked at mypy's code, and I see that the current annotation syntax choices do indeed let Python 2 support be a minimal modification to the parser, and that supporting multiple different syntaxes would complicate the parser code a bit.
I don't know if it would make sense to:
That way we can avoid standardizing several syntaxes (which seems a bit heavy) but still have the benefits of allowing reuse of existing annotations for people who need them. Tell me what you think !
Well, admittedly, I'm very interested in getting my hands dirty, but I will readily admit that my motives are selfish: if I can make use of mypy in some of my projects (I'm trying the comment syntax on a new project already), I will have much more incentive to start contributing.
At my company we have standardized the use of sphinx annotations, and a lot of developers already use Jedi/Pycharm. Moving to a new syntax won't be done easily, and that's why I opened this issue.
If I can make use of mypy (which might happen with or without support for docstrings type annotations), I will definitely try and contribute !
Anyway thank you very much for your timely answer !
Hey again !
I just saw that mypy in effect already has docstring parsing for type hints, using docstring.py, and a custom format that I don't recognize. It means that in effect, much of the infrastructure for this already exists, and the format is not much less verbose than epydoc or sphinx. Why in that case not try to parse them in the common doc generators' format ?
It's really all a matter of priorities. We don't need this at Dropbox, and
On Sat, Nov 28, 2015 at 8:22 AM, Raphaël AMIARD firstname.lastname@example.org
--Guido van Rossum (python.org/~guido)
Even Pycharm works with it (https://www.jetbrains.com/pycharm/help/using-docstrings-to-specify-types.html)
PEP 257 (2001), which has been around for almost 15 years now, advises against something similar to the solution chosen for type hints in 2.7:
""" The one-line docstring should NOT be a "signature" reiterating the function/method parameters (which can be obtained by introspection). Don't do: """

    def function(a, b):
        """function(a, b) -> list"""

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        """Embezzle funds from account using fake receipts."""
        <code goes here>
I think the annotation serves to put a lot of open source projects that bought into autodoc-style docstrings into a funky situation. It's only for Python 2.7 projects, yet it's incompatible with advice many projects accommodated themselves to.
There is another thing to consider. If we were to have mypy supporting autodoc and numpy style type annotations - would that inhibit python 3 migrations?
@raph-amiard Any update on your PR? I'm happy to team up on it.
@gvanrossum I'm interested in hearing if you have anything to add or any other complications that others may not be considering. Why more against than earlier?
I notice one area that could pose a potential problem with autodoc, take something like https://github.com/tony/tmuxp/blob/475ed94/tmuxp/window.py#L75 or https://github.com/tony/tmuxp/blob/475ed94/tmuxp/window.py#L166.
That could get screwy trying to parse. Is there anything else where autodoc would be tricky? Or is that what you mean?
I want to weigh the costs and benefits of it. I'm more determined to get mypy in any way (even if it superficially feels redundant at first) than I am to try to jerry-rig autodoc to work with mypy.
First, I think the language should lead, and the tools will follow. This doesn't mean the language should intentionally try to fight the tools. It just means that the language should point the tools in the direction it wants to go and not the other way around.
In this case, the direction in which the language is pointing is the direction of the Python 3 syntax from PEP 484 (i.e. inline annotations using PEP 3107). Eventually (probably sooner rather than later) the tools will support that syntax and combine it with the human-readable descriptions from docstrings if present. The Python 2 variant from PEP 484 is a temporary solution designed to be as close to inline annotations as possible given that Python 2 doesn't support PEP 3107. It is not hard to adapt, especially for a tool that wants to support the Python 3 syntax.
Second, the docstring convention is often followed only approximately. This is no big deal for the original purpose, generating documentation: if there are typos in a docstring the human reader can easily recover the intention. I suspect that in many cases no tooling is used to generate documentation from docstrings -- programmers simply read the source code and the markup is light enough that they can follow along. (The "napoleon" convention is particularly attractive when used this way.)
But if mypy were to be pointed at a body of code using type annotations in docstrings (if it could read them) it would overwhelm the user with a barrage of complaints due to discrepancies between the information in the docstrings and the actual code. This is not a good first experience for a user interested in starting with mypy -- even if the blame lies purely with incorrect information in the docstring, it will be hard for the user to figure out what to do, and a barrage of warnings that must be ignored trains the user in the wrong attitude. Compare this to the current situation -- the user is encouraged to add annotations piecemeal, to one class or module at a time, and mypy will only type-check those parts of the program to which annotations are explicitly added. This process can be controlled by the user and the overall experience will be much more favorable.
(Note that the codebases that stand most to benefit from type hints are very large ones. For these it is particularly important not to overwhelm the user with spurious warnings, since trying to address them all at once will not be possible.)
Perhaps a more useful thing to contribute would be a separate tool that extracts information from docstrings and converts it into PEP 484 conforming type annotations. Such a tool could be used to get the annotations started. A possible starting point might be the existing annotation generator (https://github.com/python/mypy/blob/master/misc/fix_annotate.py, to be run as a 2to3 fixer).
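A minimal sketch of the extraction step such a tool would need, assuming Sphinx-style `:type name:` and `:rtype:` fields. This is not fix_annotate.py and not part of mypy; `type_comment_from_docstring` is a hypothetical helper. It copies the docstring types verbatim, so anything that is not already a valid PEP 484 expression (for example "list of str") would still need manual cleanup, which is the inconsistency problem described above.

```python
import re

# Sphinx info fields, e.g. ":type funds: int" and ":rtype: None".
_PARAM_TYPE = re.compile(r":type\s+(\w+):\s*(.+)")
_RETURN_TYPE = re.compile(r":rtype:\s*(.+)")


def type_comment_from_docstring(arg_names, docstring):
    """Build a PEP 484 signature type comment from docstring fields.

    Arguments without a ``:type:`` field, and a missing ``:rtype:``,
    fall back to ``Any``. The caller decides which arguments to pass
    (for a method, ``self`` would be left out).
    """
    param_types = {}
    return_type = "Any"
    for line in docstring.splitlines():
        line = line.strip()
        match = _PARAM_TYPE.match(line)
        if match:
            param_types[match.group(1)] = match.group(2).strip()
            continue
        match = _RETURN_TYPE.match(line)
        if match:
            return_type = match.group(1).strip()
    args = ", ".join(param_types.get(name, "Any") for name in arg_names)
    return "# type: ({}) -> {}".format(args, return_type)


doc = """Embezzle funds from account.

:param account: account name
:type account: str
:type funds: int
:rtype: None
"""
print(type_comment_from_docstring(["account", "funds"], doc))
```

Because the output is a draft that the user reviews and commits, mistakes in the docstrings surface once, at conversion time, instead of on every mypy run.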
I agree that we should give up on parsing docstrings in mypy. There are many issues with them, the most obvious being what Guido pointed out above: they are usually too inconsistent in existing code to be very useful for static type checking. We need a well-defined syntax for type annotations that doesn't conflict with existing code and that is also easy to automatically migrate to Python 3 style annotations.
I understand this thread is old, but I just wanted to bounce off an idea and ask whether it has been tried.
Is there any existing solution to not explicitly provide the type information in docstring if you already have it in the comment string or annotation format?
For ex. if you have
The overall theory is that annotation already has some technical information about the signature, which is also good to have in the documentation. So, documentation would somehow infer the type from the annotation without duplication.
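One way this idea could look, as a sketch only: derive the Sphinx type fields from Python 3 inline annotations with the standard `inspect` module, so the docstring does not have to repeat them. `sphinx_type_fields` and `_type_name` are hypothetical names, and the sketch does not read comment-style (`# type:`) annotations, which are invisible at runtime.

```python
import inspect


def _type_name(annotation):
    # Plain classes print as their name; anything else (typing generics,
    # string annotations, None) falls back to str().
    if isinstance(annotation, type):
        return annotation.__name__
    return str(annotation)


def sphinx_type_fields(func):
    """Return ``:type:``/``:rtype:`` lines derived from func's annotations."""
    signature = inspect.signature(func)
    lines = []
    for name, param in signature.parameters.items():
        if param.annotation is not inspect.Parameter.empty:
            lines.append(":type {}: {}".format(name, _type_name(param.annotation)))
    if signature.return_annotation is not inspect.Signature.empty:
        lines.append(":rtype: {}".format(_type_name(signature.return_annotation)))
    return lines


def scale(value: float, factor: int = 2) -> float:
    """Multiply value by factor."""
    return value * factor


print("\n".join(sphinx_type_fields(scale)))
```

A documentation generator could append these lines to the human-written description, keeping the annotation as the single source of type information.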
@gvanrossum Thanks for the information on the ongoing work. It seems fine in my opinion that mypy shouldn't support it directly, considering the scope of the project and the huge number of documentation formats out there.