Pluggable system for generating types from docstrings (revisited) #3225

Closed
wants to merge 1 commit into from

chadrik
Contributor

@chadrik chadrik commented Apr 23, 2017

Hi,
I'm picking up where I left off on #2240.

Here's a bit of a review:

This PR would allow third-party tools to extend mypy with callbacks to parse PEP 484 type annotations within function docstrings. To do so, this PR introduces a basic "hook" system, as a precursor to a complete plugin system (issue #1240).

A hook can be registered, along with options specific to that hook, using the mypy configuration file:

[mypy]
fast_parser = true
hooks.docstring_parser = mypydoc.parse_docstring

[docstring_parser]
default_return_type = 'None'

Hook options specified in the config file are passed to the hook as a dictionary.
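As a rough illustration of the sentence above, the sketch below reads the same config text with the standard-library configparser and collects the hook's section into a dictionary. The helper name read_hook_config is hypothetical, and this is not necessarily how mypy itself reads its configuration file.

```python
import configparser

# The config text from the PR description, reproduced verbatim.
CONFIG_TEXT = """
[mypy]
fast_parser = true
hooks.docstring_parser = mypydoc.parse_docstring

[docstring_parser]
default_return_type = 'None'
"""


def read_hook_config(text, hook_name):
    """Return (dotted path of the hook, options dict) for one hook name.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # The hook target lives under the "hooks." namespace in the [mypy] section.
    target = parser.get('mypy', 'hooks.' + hook_name, fallback=None)
    # Options for the hook live in a section named after the hook.
    options = dict(parser[hook_name]) if parser.has_section(hook_name) else {}
    return target, options


hook_target, hook_options = read_hook_config(CONFIG_TEXT, 'docstring_parser')
```

Note that configparser keeps the quotes around 'None' as part of the value, so a hook written this way would have to strip them itself.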

One big new change from where I left off with #2240 is that the docstring_parser hook is now passed the active mypy.errors.Errors instance so that it can use the full range of error reporting, especially specification of blocker and severity level.

I have concerns about exposing the mypy.errors.Errors class as part of a public API so I was considering writing a simple wrapper that exposes only the bare minimum interface, perhaps something like this:

from typing import Optional

from mypy import errors


class Errors:
    """Thin wrapper exposing only the reporting part of mypy.errors.Errors."""

    def __init__(self, errors: errors.Errors) -> None:
        self._errors = errors

    def report(self, line: int, column: int, message: str, blocker: bool = False,
               severity: str = 'error', file: Optional[str] = None,
               only_once: bool = False, origin_line: Optional[int] = None) -> None:
        # Delegate to the wrapped instance; nothing else is exposed to hooks.
        self._errors.report(line, column, message, blocker, severity, file,
                            only_once, origin_line)

What do you think?

Sorry about getting stalled out before (I blamed it on the birth of my second daughter!). Looking forward to getting this wrapped up.

@chadrik chadrik force-pushed the docstrings branch 3 times, most recently from 7f655a2 to 254f3d2 Compare April 23, 2017 16:12
@chadrik chadrik force-pushed the docstrings branch 2 times, most recently from 376cd58 to 724c484 Compare June 5, 2017 17:55
@chadrik
Contributor Author

chadrik commented Jun 5, 2017

Added tests and improved the structure a bit.

  • Hooks are now defined using a Hook generic class, for holding the hook function and its options.
  • A Hooks class holds all known hook types as Hook instances
  • A Hooks instance is stored on Options

Thus accessing a docstring_parser hook function from an Options instance is done like so: options.hooks.docstring_parser.func
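The structure described above might look roughly like the following sketch. Only the names Hook, Hooks, Options, docstring_parser and func come from the comment; the constructor arguments and the options attribute are assumptions made for illustration, not the PR's actual code.

```python
from typing import Callable, Dict, Generic, Optional, TypeVar

F = TypeVar('F', bound=Callable[..., object])


class Hook(Generic[F]):
    """Holds one hook function together with its options (assumed layout)."""

    def __init__(self, func: Optional[F] = None,
                 options: Optional[Dict[str, str]] = None) -> None:
        self.func = func
        self.options = options or {}


class Hooks:
    """Holds every known hook type as a Hook instance."""

    def __init__(self) -> None:
        self.docstring_parser = Hook()  # type: Hook[Callable[..., Dict[str, str]]]


class Options:
    """Stand-in for mypy's Options, reduced to the hooks attribute."""

    def __init__(self) -> None:
        self.hooks = Hooks()


options = Options()
options.hooks.docstring_parser = Hook(
    func=lambda docstring: {'return': 'int'},  # placeholder hook function
    options={'default_return_type': 'None'},
)
```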

Questions:

  • Do you like the hooks namespace used on Options and in the mypy.ini file, or should I ditch it?
  • I wrote a simplified version of pydoc.locate to import and locate an object given a dotted string path. I'd like to use the more complete version in pydoc, but there are no stubs for pydoc in typeshed. Additionally, locate does not appear in pydoc.__all__, so I'm unclear whether it would even be appropriate to add it to the stubs if I were to author them. As an aside, I've found pydoc.locate to be incredibly useful (as have quite a few people on Stack Overflow) and I would love for it to find a more prominent place in the stdlib.
  • Tests are failing in Python 3.3 due to use of importlib.util.spec_from_file_location. Should I use the deprecated functions, or is 3.3 support going away soon?
  • Is it OK to use mypy.errors.Errors as a public API, or would you prefer a simple wrapper to limit the public interface?
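For readers unfamiliar with the idea behind pydoc.locate, a minimal sketch of resolving a dotted path is shown here. It is not the PR's implementation and it is less complete than pydoc.locate; it only assumes importlib.import_module and getattr.

```python
import importlib


def locate(path):
    """Return the object named by a dotted path, or None if it can't be found."""
    parts = path.split('.')
    # Try the longest importable module prefix first, then walk attributes.
    for split in range(len(parts), 0, -1):
        module_name = '.'.join(parts[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue  # not a module; retry with a shorter prefix
        for attr in parts[split:]:
            try:
                obj = getattr(obj, attr)
            except AttributeError:
                return None
        return obj
    return None
```

With this sketch, a config value such as mypydoc.parse_docstring would be resolved by importing mypydoc and then fetching its parse_docstring attribute.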

Thanks!

mypy/hooks.py Outdated

from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar, TYPE_CHECKING

if TYPE_CHECKING:
Member

You should use MYPY = False and then if MYPY: .... TYPE_CHECKING is not present in older versions of Python.
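The pattern suggested in this review comment can be sketched as follows. The function first_key is a made-up example; the point is that the guarded import never runs at runtime, while mypy treats the name MYPY as true during type checking.

```python
MYPY = False  # mypy special-cases this name and treats it as True
if MYPY:
    # Only needed for the type comment below, so it is never imported at runtime.
    from collections import OrderedDict


def first_key(mapping):
    # type: (OrderedDict[str, int]) -> str
    """Return the first key of the mapping (illustrative function)."""
    return next(iter(mapping))
```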

@ilevkivskyi
Member

@gvanrossum You reviewed the previous attempt, maybe you could review this PR too?

@gvanrossum gvanrossum self-requested a review June 5, 2017 21:42
@gvanrossum gvanrossum self-assigned this Jun 5, 2017
@chadrik
Contributor Author

chadrik commented Jun 6, 2017

I see that @JukkaL just pushed up a general-plugins branch for #1240. There's a lot of overlap with what I've done so it seems like a good starting place for this PR. The mechanism for registering and loading custom plugins looks like it has not been added yet, so I'm interested to see how that will work: via mypy.ini or something else?

@JukkaL
Collaborator

JukkaL commented Jun 7, 2017

@chadrik Plugins will be registered through mypy.ini. Once the general plugin framework lands, hopefully it isn't hard to extend it to support plugins that parse docstrings.

@gvanrossum
Member

I'm not sure how to proceed. After #3501 lands maybe we can look at how to add hooks for parsing docstrings using the new Plugin class, but a bunch more machinery needs to be invented first. But I expect that a thorough review of the current PR would be a waste of time, since the plan is to let you define a plugin for this purpose outside the mypy source code.

@JukkaL
Collaborator

JukkaL commented Jun 7, 2017

Maybe wait until #3501 has landed and there is support for user-defined plugins (in a follow-up PR)? Then we can look at adding docstring parse hooks to the Plugin class. The latter would hopefully be pretty straightforward once the rest of the plugin system is in place.

@gvanrossum
Member

gvanrossum commented Jun 7, 2017 via email

@chadrik
Contributor Author

chadrik commented Jun 7, 2017

I'd be glad to help out with the user-defined plugins. I have prior experience designing a handful of similar systems. I'll drop some thoughts over in the other PR.

@gvanrossum
Member

In the meantime I hope you don't mind if I close this one (again). Thanks for your suggestions on the design of a plugin system, your insights are useful!

@gvanrossum gvanrossum closed this Jun 9, 2017
@chadrik
Contributor Author

chadrik commented Jun 10, 2017

@gvanrossum If you don't mind I'd like to keep this open specifically for adding the docstring support once we wrap up the user-plugin feature (#3517). I don't want to complicate that PR by mixing it up with docstrings. Based on the progress in #3517 this won't be lingering for long.

@gvanrossum gvanrossum reopened this Jun 10, 2017
@gvanrossum
Member

OK, np! Though I was under the impression that with a powerful enough plugin system we could have plugins installed from PyPI, so the docstring plugin wouldn't need to be a mypy PR?

@gvanrossum gvanrossum closed this Jun 10, 2017
@gvanrossum gvanrossum reopened this Jun 10, 2017
@chadrik
Contributor Author

chadrik commented Jun 10, 2017

> I was under the impression that with a powerful enough plugin system we could have plugins installed from PyPI, so the docstring plugin wouldn't need to be a mypy PR?

The closest existing hook to what I would need is the MethodSignatureHook, which is given a CallableType and is expected to return a new CallableType with an improved signature. I looked into adding a docstring attribute to CallableType, but here's the problem: a CallableType is not created if no type annotations are found, and the docstrings are intended to be the source of those type annotations, so we have a catch-22 / order-of-operations problem.

Docstrings are an alternative to type comments, so it makes sense for a docstring hook to run earlier, at the time that type annotations are extracted from code and turned into Types. That's the approach that I've taken in this PR. When I port it to the plugin system, I plan to add a new hook type and call it in fastparse. That is, unless you see a way to make it work using MethodSignatureHook in checkexpr.

@gvanrossum
Member

Agreed, I had always understood that you need a hook in the parsing stage for this purpose. It's just that the code of the hook (i.e. the actual docstring parsing) doesn't have to be part of mypy, right?

@chadrik
Contributor Author

chadrik commented Jun 10, 2017

Correct. This PR will be to add the new hook type to the Plugin class and call it at the appropriate place in fastparse.

@gvanrossum
Member

Honestly that sounds like a new PR, but you are free to reuse this one if it's easier for you.

Member

@gvanrossum gvanrossum left a comment

Waiting for you.

@chadrik
Contributor Author

chadrik commented Jun 14, 2017

here you go!

@gvanrossum
Member

Heads up: once #3534 lands (specifically 3283cbf) I believe this will require more work, since Jukka is changing the signatures for hooks. Instead of having a separate signature for each hook type, the hooks get passed a specific object which has various useful attributes -- it's easier to add new attributes that way, without having to update all hooks of that type.
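To make the "specific object with useful attributes" idea concrete, here is a small sketch of a docstring hook that takes a single context object. DocstringContext, its fields, and the Sphinx-style :type:/:rtype: fields parsed below are all assumptions for illustration; they are not taken from #3534 or from this PR.

```python
import re
from typing import Dict, List, NamedTuple

# Hypothetical context object: new attributes can be added here later
# without changing the signature of every hook of this type.
DocstringContext = NamedTuple('DocstringContext', [
    ('docstring', str),
    ('arg_names', List[str]),
    ('line', int),
])

_TYPE_FIELD = re.compile(r'^\s*:type\s+(\w+):\s*(.+?)\s*$')
_RTYPE_FIELD = re.compile(r'^\s*:rtype:\s*(.+?)\s*$')


def parse_docstring(ctx: DocstringContext) -> Dict[str, str]:
    """Map argument names (and the special key 'return') to type strings."""
    type_map = {}  # type: Dict[str, str]
    for text in ctx.docstring.splitlines():
        match = _TYPE_FIELD.match(text)
        if match:
            type_map[match.group(1)] = match.group(2)
            continue
        match = _RTYPE_FIELD.match(text)
        if match:
            type_map['return'] = match.group(1)
    return type_map
```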



 def parse(source: Union[str, bytes], fnam: str = None, errors: Errors = None,
-          options: Options = Options()) -> MypyFile:
+          options: Options = Options(), plugin: Plugin = Plugin()) -> MypyFile:
Member

There ought to be no reason to provide a default. And the default definitely should not create a Plugin instance that's usually not used.
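The concern in this comment follows from how Python evaluates default arguments: once, when the function is defined. The sketch below uses a stand-in Plugin class (not mypy's) that counts its instances to show the difference between a default instance and a required parameter.

```python
class Plugin:
    """Stand-in for mypy's Plugin class that counts how often it is built."""

    created = 0

    def __init__(self) -> None:
        Plugin.created += 1


def parse_with_default(source: str, plugin: Plugin = Plugin()) -> str:
    # The default Plugin() above was built when this def statement ran,
    # whether or not any caller ever relies on it.
    return source


def parse_required(source: str, plugin: Plugin) -> str:
    # No default: the caller must pass the plugin it already has.
    return source


created_at_definition = Plugin.created  # one instance exists before any call
parse_with_default('x = 1')
created_after_call = Plugin.created     # calling does not build another one
```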

@@ -58,10 +59,12 @@

TYPE_COMMENT_SYNTAX_ERROR = 'syntax error in type comment'
TYPE_COMMENT_AST_ERROR = 'invalid type comment or annotation'
TYPE_COMMENT_DOCSTRING_ERROR = ('Arguments parsed from docstring are not '
Member

Maybe 'One or more arguments specified in docstring are not '.

return_type = type_map.pop('return', AnyType())
if type_map:
    errors.report(line, 0,
                  TYPE_COMMENT_DOCSTRING_ERROR.format(list(type_map), arg_names))
Member

Maybe there also ought to be an error for arguments in the arg list but not in the docstring (given that there's a type map at all).

Also, I think the error message here needn't list the expected arg names (those are easily recovered from the source) but only the list of args in the docstring but not in the arg list.

Contributor Author

> Maybe there also ought to be an error for arguments in the arg list but not in the docstring (given that there's a type map at all).

I think it's important that docstring annotations can be sparse, as with multi-line Python 2 function annotations.
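The two positions in this exchange can be combined in a small sketch: missing arguments fall back to a default (sparse annotations are allowed), while names that appear only in the docstring are collected for an error message. The helper split_type_map and the use of plain strings instead of mypy Type objects are assumptions for illustration.

```python
def split_type_map(type_map, arg_names, default='Any'):
    """Split a docstring type map into per-argument types, return type, and leftovers.

    Returns (arg_types, return_type, unknown_names).
    """
    remaining = dict(type_map)  # work on a copy so the caller's map is untouched
    return_type = remaining.pop('return', default)
    # Sparse annotations: an argument missing from the docstring gets the default.
    arg_types = [remaining.pop(name, default) for name in arg_names]
    # Whatever is left was documented but is not an argument of the function.
    unknown_names = sorted(remaining)
    return arg_types, return_type, unknown_names
```

Only unknown_names would need to appear in the error message, since the expected argument names are easy to recover from the source.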

@@ -345,6 +369,14 @@ def do_func_def(self, n: Union[ast3.FunctionDef, ast3.AsyncFunctionDef],
return_type = TypeConverter(self.errors, line=n.returns.lineno
                            if n.returns else n.lineno).visit(n.returns)

docstring_hook = self.plugin.get_docstring_parser_hook()
if docstring_hook is not None and not any(arg_types) and return_type is None:
Member

I wonder if there'd be value in parsing the docstring (if present) even if there are types in the signature, and flagging discrepancies (only when both are in direct conflict) and maybe even merging the info (if some args have a type only in one of the two sources)? (Admittedly we don't do that for PEP-3107-style annotations and signature type comments either, but there we at least reject the presence of both. Here you seem to completely disregard the docstring if any types are present in the signature.)

Contributor Author

Rejecting the presence of both seems like the right choice.
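A minimal sketch of the "reject both" policy is given below. The function name, its arguments, and the use of ValueError are hypothetical; in mypy the conflict would presumably be reported through the error machinery rather than raised.

```python
def choose_annotation_source(arg_types, return_type, docstring_types):
    """Decide where a function's types come from, rejecting mixed sources.

    arg_types: per-argument types from the signature (None when absent).
    return_type: return type from the signature, or None.
    docstring_types: mapping parsed from the docstring (may be empty).
    """
    has_signature = any(t is not None for t in arg_types) or return_type is not None
    has_docstring = bool(docstring_types)
    if has_signature and has_docstring:
        raise ValueError('types given in both the signature and the docstring')
    if has_docstring:
        return 'docstring'
    if has_signature:
        return 'signature'
    return None
```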

mypy/plugin.py Outdated
# The function's return type, if specified, is stored in the mapping with the special
# key 'return'. Other than 'return', each key of the mapping must be one of the
# arguments of the documented function; otherwise, an error will be raised.
DocstringParserHook = Callable[
Member

This is the part that will need updating after #3534 lands.

Contributor Author

@chadrik chadrik left a comment

Thanks for the feedback. I'll get these fixed up and pushed, then do an optimistic rebase against #3534 so that I'm ready when that lands.

@chadrik
Contributor Author

chadrik commented Jun 22, 2017

@gvanrossum rebased against master and addressed review notes

@chadrik
Contributor Author

chadrik commented Jul 5, 2017

Hi @gvanrossum & @ilevkivskyi, I'm ready for the next round of notes. thanks!

@chadrik
Contributor Author

chadrik commented Jul 23, 2017

Hey guys, I know that this issue isn't very high on your priority list, but I feel it's very close, so it would be great to get it completed!

@gvanrossum
Member

@JukkaL Maybe you can look at this?

@JukkaL
Collaborator

JukkaL commented Aug 15, 2017

I'm going to be on vacation soon for a little over a week and I unfortunately may not have time for this until after my vacation. I can get back to this in 2 weeks or so.

@chadrik
Contributor Author

chadrik commented Aug 16, 2017 via email

@ethanhs
Collaborator

ethanhs commented Dec 16, 2017

It appears this has hit merge conflicts, @chadrik can you rebase?
@JukkaL were you going to review this?

@gvanrossum
Member

The merges don't look very hard. I used to be very skeptical of this approach but I've mellowed a bit and I think this is fine to go in. I might even find time to review it over my holiday break (which starts now).

@chadrik
Contributor Author

chadrik commented Dec 16, 2017

I merged and pushed. There are a few remaining issues, but that's all I have time for at this moment.

@gvanrossum
Member

Strangely, the conflicts are still there. Perhaps you didn't fetch before merging?

@chadrik
Contributor Author

chadrik commented Dec 16, 2017

Yeah, I noticed that, but thought there might be a delay. I squashed the branch and force pushed it, which cleared up the issue.

@smessmer

smessmer commented Mar 9, 2018

What's the status of this? Is this abandoned, or has it been replaced by a different PR or project? I'd really like to have mypy checking for pybind11 C extensions (which do add type info to the docstrings).

@gvanrossum
Member

gvanrossum commented Mar 9, 2018 via email

@smessmer

smessmer commented Mar 9, 2018

Thanks. In short, pybind11 is a Python extension that allows C++ functions to be called from Python. When accessed from Python, those functions have .__doc__ strings with PEP 484 type hints. It would be cool if mypy could use these. https://pybind11.readthedocs.io/en/stable/

@chadrik
Contributor Author

chadrik commented Mar 11, 2018

@smessmer While it would still be cool to get this merged, I don't think it's going to work for your case. This plugin runs within mypy, so it does not import any code and only has access to statically defined docstrings. Unless pybind11 is generating .py files containing entrypoint/wrapper functions and classes with statically inspectable docstrings (which I'm guessing it's not), this plugin isn't going to "see" the __doc__ attributes generated by the bindings. And if it is, then it should be simple to turn those into .pyi files.
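The distinction drawn here between static and runtime docstrings can be seen with the standard-library ast module, which reads source text without importing it. The sample source and the helper static_docstrings are made up for illustration.

```python
import ast

# Sample module source: one docstring is a literal in the function body,
# the other is only assigned when the module is actually executed.
SOURCE = '''
def add(a, b):
    """Add two numbers.

    :rtype: int
    """
    return a + b

def wrapped(a, b):
    return a + b

wrapped.__doc__ = "generated at runtime"
'''


def static_docstrings(source):
    """Map function names to the docstrings visible without running the code."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_docstring(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


docs = static_docstrings(SOURCE)
```

Here docs['wrapped'] is None, because the assignment to __doc__ only happens at runtime, which is the situation described for docstrings generated by compiled bindings.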

@smessmer

I see, makes sense. Thanks for the explanation. I had misunderstood the PR description to have plugins import python modules into a python runtime and read the doc strings there.

@gvanrossum
Member

I propose to (once again) abandon this idea.

@chadrik
Contributor Author

chadrik commented Aug 3, 2018

I've finally released my solution to this: https://pypi.org/project/doc484/

It's a command-line tool that inserts type comments into your code based on docstrings. It has the advantage of working with PyCharm and other PEP 484-compatible tools and not just mypy.

Feel free to close this.

edit: closed it myself, because it's my pull request :)

@chadrik chadrik closed this Aug 3, 2018
@gvanrossum
Member

gvanrossum commented Aug 3, 2018 via email

Labels
topic-plugins The plugin API and ideas for new plugins