Stubgen improvements: preserving existing annotations #3169

Merged
merged 22 commits into from Sep 22, 2017

5 participants
@dmoisset
Contributor

commented Apr 13, 2017

This PR heavily modifies stubgen so that it creates better stubs when processing already annotated files (implements #2106). It essentially adds annotations to the stubs for:

  • Variables/attributes with annotations
  • Annotations on functions/methods
  • TypeVar declarations
  • Type Aliases

It also tries to add all the required imports to the stub so that the resulting file is valid, and to handle forward references properly when necessary. It also adds some fixes (handling of generics as base classes, some formatting fixes in the output) that were natural consequences of some of the refactoring I made.

I know this is a large patch (and refactors some pieces of the generator), so feel free to comment if it needs discussion.

@dmoisset

Contributor Author

commented Apr 13, 2017

Just checked other stubgen issues and this seems to fix #1782 too

@@ -20,42 +20,52 @@ def g(arg): ...
 def f(a, b=2): ...
 def g(b=-1, c=0): ...
 [out]
-def f(a, b: int = ...): ...
-def g(b: int = ..., c: int = ...): ...
+def f(a, b: int=...): ...

@ilevkivskyi

ilevkivskyi Apr 13, 2017

Collaborator

TBH I like the old style better, also see this example from PEP 8:

Yes:

def munge(sep: AnyStr = None): ...
def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...

No:

def munge(input: AnyStr=None): ...
def munge(input: AnyStr, limit = 1000): ...

@JelleZijlstra

JelleZijlstra Apr 13, 2017

Collaborator

typeshed's convention is to put spaces around the =.

@dmoisset

dmoisset Apr 13, 2017

Author Contributor

OK, I was used to the old rule, not the special case for annotations. I'll change that.

@dmoisset dmoisset changed the title from "Stubgen imrpovements: preserving existing annotations" to "Stubgen improvements: preserving existing annotations" Apr 13, 2017

@ilevkivskyi
Collaborator

left a comment

I like this PR, it will definitely make stubgen more usable. Here are more detailed comments.


y: C
[out]
x: 'C'

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

Forward references are not needed in stub files, so I would propose keeping all annotations simple; that creates less "noise" when reading stubs.

@dmoisset

dmoisset Apr 27, 2017

Author Contributor

I didn't know that (and a lot of the patch could be made simpler by removing forward ref handling). I couldn't find that documented anywhere (looking at PEP 484); is that actually a defined behaviour or an implementation artifact?

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

This is just a consequence of the fact that .pyi files are never actually executed, and the only reason for forward references is to "hide" something from the Python runtime. This is already heavily used in typeshed stubs to simplify them. If you think it is worth mentioning in PEP 484, you could make a short PR to the python/peps repo.
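
To illustrate the point, here is a minimal sketch (not stubgen or mypy code): a tool that only parses a stub can read an annotation naming a class that is defined later in the file, because nothing is ever evaluated. The stub text and the helper name annotation_names are made up for this example.

```python
import ast

# Hypothetical stub contents: `x` is annotated with `C` before `C` is defined.
STUB_SOURCE = """
x: C
class C: ...
"""


def annotation_names(source: str) -> dict:
    """Map each annotated module-level name to its annotation, read from the AST only."""
    result = {}
    for node in ast.parse(source).body:
        if (isinstance(node, ast.AnnAssign)
                and isinstance(node.target, ast.Name)
                and isinstance(node.annotation, ast.Name)):
            result[node.target.id] = node.annotation.id
    return result


print(annotation_names(STUB_SOURCE))
```

Since the source is parsed and never run, the unquoted reference to C needs no quoting.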

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

OK, changed this

y: Any

[case testMultipleAssignmentAnnotated]
x, y = 1, "2" # type: int, str

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I would add a few tests with wrong annotations, for example with more types than variables or other possible mismatches, just to be sure that stubgen will not crash on those.

@dmoisset

dmoisset Apr 27, 2017

Author Contributor

OK... the current code is not crashing, but generating "something" instead of erroring out. I can add an error message, although I'm a bit concerned that validation is a bit out of scope for the stubgen tool.

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I'm a bit concerned that validation is a bit out of scope for the stubgen tool.

Just not crashing on invalid input is already OK, I think. If you can produce a reasonable warning, then it's perfect.

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

OK. The current status is that it doesn't crash; it just ignores the "extra" data.
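
The "ignore the extra data" behaviour could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: the helper name pair_names_with_types is hypothetical, and falling back to Any for a name that has no matching type is a choice made for this example only.

```python
from typing import List, Tuple


def pair_names_with_types(names: List[str], types: List[str]) -> List[Tuple[str, str]]:
    """Pair each assigned name with a type from a `# type:` comment.

    Surplus types are silently dropped; a name without a type gets 'Any'
    (an assumption of this sketch).
    """
    paired = []
    for index, name in enumerate(names):
        annotation = types[index] if index < len(types) else "Any"
        paired.append((name, annotation))
    return paired


# More types than variables: the trailing 'bool' is ignored.
print(pair_names_with_types(["x", "y"], ["int", "str", "bool"]))
```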

def f(self): ...
def g(self): ...

[case testExportViaRelativeImport]
from .api import get
[out]
from .api import get as get

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

Note that this form should be preserved. There is currently a rule in PEP 484 that says this form should be used for names that are actually imported (and therefore re-exported by the Python runtime). Mypy currently does not follow this rule and treats all names as re-exported, see #2927, but it will be fixed soon.

Therefore, to prevent breakage of stubgen stubs after the fix, and to comply with PEP 484 this form from mod import X as X should be preserved for names that are actually imported in original file. At the same time, a short form from mod import X should be used for auxiliary/fake imports created by stubgen (like typing).

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Even after reading this comment and pep-484, I'm not sure I fully understand what is required here. What I have is:

  • If a stub needs to export an imported name, it should import it with an alias (even if the alias name is the same as the original). Is that correct?
  • But then, how does stubgen know which imported names were "intended" to be re-exported? Should it use __all__ and ignore everything else? Should it re-export everything just in case? Should it re-export only names imported in a certain way?

I could use a hand here at least with the test cases, once I have some boundaries defined by those I could probably tweak the implementation to match the spec.

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

If a stub needs to export an imported name, it should import it with an alias (even if the alias name is the same as the original). Is that correct?

Yes.

But then, how does stubgen know which imported names were "intended" to be re-exported? Should it use __all__ and ignore everything else? Should it re-export everything just in case? Should it re-export only names imported in a certain way?

I think the safest bet would be to re-export all globally visible names (i.e. ignore imports in function bodies) if no __all__ is defined, and to re-export only the names from __all__ if there is one.

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

I'm implementing the __all__ scenario. I'm not doing the "reexport everything if there's no __all__" because I think the result ends up being unnecessarily noisy for most cases (you'll end up with tons of modules saying from typing import List as List because they don't have an __all__). In any case, it's easy to change our minds later

@ilevkivskyi

ilevkivskyi Sep 22, 2017

Collaborator

you'll end up with tons of modules saying from typing import List as List because they don't have an __all__.

You can just manually exclude names imported from typing. In general, this is an important point, since it already caused some bugs in typeshed, see e.g. python/typeshed#1484. I would prefer this test to work.

(Also, I don't think people will be unhappy about a slightly bulkier stub, but they will certainly be unhappy about false positives when the as name form isn't present.)
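
The rule proposed in this thread can be sketched as follows. This is a hedged illustration of the reviewer's suggestion, not what the PR implements (the PR only handles the __all__ case): reexport_lines is a hypothetical helper, it only looks at module-level from-imports, and unlike stubgen it keeps imports that are never used.

```python
import ast
from typing import List, Optional


def reexport_lines(source: str, exported: Optional[List[str]] = None) -> List[str]:
    """Rewrite module-level from-imports, using `X as X` for re-exported names.

    `exported` stands for the module's __all__ (None when it is not defined).
    """
    lines = []
    # Only module-level statements are inspected, so imports inside
    # function bodies are ignored.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ImportFrom):
            continue
        module = "." * node.level + (node.module or "")
        parts = []
        for alias in node.names:
            if alias.name == "*":
                parts.append("*")
                continue
            visible = alias.asname or alias.name
            if exported is not None:
                reexport = visible in exported
            else:
                # No __all__: re-export everything except names from typing.
                reexport = module != "typing"
            if reexport or alias.asname:
                parts.append(f"{alias.name} as {visible}")
            else:
                parts.append(alias.name)
        lines.append(f"from {module} import {', '.join(parts)}")
    return lines


print(reexport_lines("from .api import get\nfrom typing import List\n"))
```

With no __all__, the relative import is emitted in the as form while the typing import stays short, which is the compromise suggested above.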

def visit_unbound_type(self, t: UnboundType) -> str:
    s = t.name
    base = s.split('.')[0]
    self.stubgen.import_tracker.require_name(base)

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I am not sure I follow the logic here. If I have an unbound type X.Y.Z, will this require import X? But how do we know X is actually a module? Maybe it would be better to just use Any for unbound types?

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

When you have an X.Y.Z annotation, what other things could X be? I'm assuming a module here because, if it's not the only choice, it's probably the best one.

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

I am still not sure it is a good idea to guess something for an unbound type. I still think it would be better to use Any. Why do you think we should do something at all with unbound types?

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Stubgen runs mostly at a syntactical level. The semantic analysis phase is not run within it, so essentially 99% of the type annotations it finds will still be unbound.
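
As a rough illustration of what "syntactical level" means here, the sketch below collects the leftmost component of every name used in an annotation without resolving anything. It is not stubgen's implementation; required_base_names is a hypothetical helper built on the standard ast module.

```python
import ast
from typing import Set


def required_base_names(source: str) -> Set[str]:
    """Collect the leftmost component of every (dotted) name used in an annotation."""
    required = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.AnnAssign, ast.arg)):
            annotation = node.annotation
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            annotation = node.returns
        else:
            annotation = None
        if annotation is None:
            continue
        # In `X.Y.Z` the only ast.Name node is `X`; `Y` and `Z` are attributes.
        for sub in ast.walk(annotation):
            if isinstance(sub, ast.Name):
                required.add(sub.id)
    return required


print(sorted(required_base_names("def f(a: X.Y.Z) -> typing.List[int]: ...\n")))
```

Nothing in this pass can tell whether X is a module, a class, or an undefined name, which is the limitation discussed above.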

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

Stubgen runs mostly at a syntactical level. Semantic analysis phase is not run within it.

OK, is it easy to fix this? If not, then maybe add a # TODO: item here and/or open an issue?

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

I don't think it's easy to fix at all; it's probably a major change in the tool, since the design decision of working syntactically has been deeply embedded in it from day one.

@ilevkivskyi

ilevkivskyi Sep 22, 2017

Collaborator

I would also open an issue for this. Maybe some day we will have time to improve this.

        if alias:
            self.reverse_alias[alias] = name

    def add_import(self, module: str, alias: str=None) -> None:

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

IIRC it was already discussed that PEP 8 specifies spaces around = for this case; it should be alias: str = None, not alias: str=None. Also note that we are moving towards explicit Optional types for args with default None.

(This also appears a few times below.)

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Fixed

        self._state = CLASS

    def is_type_expression(self, expr: Expression, top_level: bool=True) -> bool:

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I would like to see more tests for this, to be sure that complex (generic) aliases like Union[T, List[T]] are recognized, but things like type_map[int] are not.

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

I can add a test for the former...
The latter will actually go through as an alias, and I'm not sure how to prevent that with the kind of information I have in stubgen (I'm assuming that in most libraries that kind of thing won't happen much at module top level).
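
The limitation can be reproduced with a purely syntactic check. The sketch below is not the PR's is_type_expression; looks_like_type_expression is a hypothetical stand-in that accepts names, dotted names, and subscripts of those, and it assumes Python 3.9+ (where a subscript's slice is the index expression itself).

```python
import ast


def looks_like_type_expression(expr_source: str) -> bool:
    """Syntactic heuristic: names, dotted names, and subscripts built from them."""

    def check(node: ast.AST) -> bool:
        if isinstance(node, ast.Name):
            return True
        if isinstance(node, ast.Attribute):
            return check(node.value)
        if isinstance(node, ast.Subscript):
            index = node.slice
            items = index.elts if isinstance(index, ast.Tuple) else [index]
            return check(node.value) and all(check(item) for item in items)
        return False

    return check(ast.parse(expr_source, mode="eval").body)


for candidate in ("Union[T, List[T]]", "type_map[int]", "foo(1)"):
    print(candidate, looks_like_type_expression(candidate))
```

Both the generic alias and the dictionary lookup have the same shape in the syntax tree, so only semantic information could tell them apart.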

    def require_name(self, name: str) -> None:
        self.required_names.add(name.split('.')[0])

    def import_lines(self) -> List[str]:

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

It is difficult to follow the logic of this method. Maybe a docstring (and comments) will help?

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Done

self.module_for = {} # type: Dict[str, Optional[str]]
self.direct_imports = {} # type: Dict[str, str]
self.reverse_alias = {} # type: Dict[str, str]
self.required_names = set() # type: Set[str]

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I would document the meaning of these four variables. Maybe just add a comment above every line?

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Added these
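
For readers without the diff at hand, here is one plausible reading of the four fields, written as a simplified, hypothetical tracker. The class name ImportTrackerSketch, the add_import_from method, and the field meanings in the comments are assumptions inferred from the names shown above, not the PR's actual code or comments.

```python
from typing import Dict, List, Optional, Set


class ImportTrackerSketch:
    """Simplified, hypothetical import tracker: emit only imports that are used."""

    def __init__(self) -> None:
        # Visible name -> module it was imported from (None for `import module`).
        self.module_for: Dict[str, Optional[str]] = {}
        # Name bound by a direct import -> full dotted path that was imported.
        self.direct_imports: Dict[str, str] = {}
        # Alias -> original name, e.g. 'np' -> 'numpy'.
        self.reverse_alias: Dict[str, str] = {}
        # Names actually used in the annotations written to the stub.
        self.required_names: Set[str] = set()

    def add_import_from(self, module: str, name: str, alias: Optional[str] = None) -> None:
        self.module_for[alias or name] = module
        if alias:
            self.reverse_alias[alias] = name

    def add_import(self, module: str, alias: Optional[str] = None) -> None:
        visible = alias or module.split('.')[0]
        self.module_for[visible] = None
        self.direct_imports[visible] = module
        if alias:
            self.reverse_alias[alias] = module

    def require_name(self, name: str) -> None:
        self.required_names.add(name.split('.')[0])

    def import_lines(self) -> List[str]:
        lines = []
        for name in sorted(self.required_names):
            if name not in self.module_for:
                continue  # Builtin or locally defined name: no import needed.
            module = self.module_for[name]
            original = self.reverse_alias.get(name)
            if module is None:
                path = self.direct_imports[name]
                lines.append(f"import {path} as {name}" if original else f"import {path}")
            else:
                target = f"{original} as {name}" if original else name
                lines.append(f"from {module} import {target}")
        return lines


tracker = ImportTrackerSketch()
tracker.add_import_from("typing", "List")
tracker.add_import_from("os", "path")  # never required, so never emitted
tracker.add_import("urllib.parse")
tracker.require_name("List")
tracker.require_name("urllib.parse.ParseResult")
print(tracker.import_lines())
```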

import urllib.parse
__all__ = ['urllib']
[out]
import urllib.parse

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I think there should still be an error comment in output like

# Names in __all__ with no definition:
#    urllib

before or after the import statement.

@dmoisset

dmoisset Apr 27, 2017

Author Contributor

Running import urllib.parse defines the urllib name and attaches it to the correct module. Why should this be an error?

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

Yes, then ignore this comment (or add another test where this produces an error, like from urllib import parse).
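
The difference between the two import forms is easy to check at runtime. The helper names_bound_by below is made up for this illustration; it runs one statement in a fresh namespace and reports which names it binds.

```python
def names_bound_by(statement: str) -> set:
    """Execute one import statement in a fresh namespace and report the names it binds."""
    namespace = {}
    exec(statement, namespace)
    # exec() adds __builtins__ to the namespace; it is not bound by the import.
    return {name for name in namespace if name != "__builtins__"}


print(names_bound_by("import urllib.parse"))       # binds 'urllib'
print(names_bound_by("from urllib import parse"))  # binds 'parse' only
```

So __all__ = ['urllib'] is satisfied by the first form, while the second form would leave urllib undefined.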

@@ -552,14 +602,148 @@ def f(a): ...
[case testInferOptionalOnlyFunc]
class A:
    x = None
    def __init__(self, a=None) -> None:

@ilevkivskyi

ilevkivskyi Apr 27, 2017

Collaborator

I would also keep an additional test for __init__.

@dmoisset

dmoisset Sep 1, 2017

Author Contributor

Added

@ilevkivskyi

Collaborator

commented Jul 15, 2017

@dmoisset Are you still working on this? I think this would be a useful addition (in particular in the context of helping maintainers of third-party libraries that want to support typing).

@dmoisset

Contributor Author

commented Jul 22, 2017

@ilevkivskyi I've been dealing with other stuff, I might take a look at this again during the rest of the month

@JukkaL JukkaL removed their assignment Aug 15, 2017

@ilevkivskyi

Collaborator

commented Aug 31, 2017

@dmoisset Sorry for pinging again, this is the oldest open PR now (and IMO it contains an important feature).

@ilevkivskyi ilevkivskyi self-assigned this Aug 31, 2017

@dmoisset dmoisset force-pushed the dmoisset:stubgen-patch branch from e9cfae2 to df0d5cf Sep 1, 2017

@ilevkivskyi
Collaborator

left a comment

Thanks! This is almost ready, just a few minor comments/questions.


-    def get_init(self, lvalue: str, rvalue: Expression) -> Optional[str]:
+    def get_init(self, lvalue: str, rvalue: Expression,
+                 annotation: Optional[Type]=None) -> Optional[str]:

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

Missing space around =.

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

Fixed

init_stmt = arg_.initialization_statement
if init_stmt:
    if isinstance(init_stmt.rvalue, NameExpr) and init_stmt.rvalue.name == 'None':
        initializer = 'None'

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

There is a recent agreement that all optional types should be explicit in typeshed, and this is now checked with --no-implicit-optional. If the implicit Optionals are already added at the parse stage, then I think the initializer should always be ..., even for optional types. Otherwise, we can keep None.

@JelleZijlstra

JelleZijlstra Sep 1, 2017

Collaborator

Not sure of the exact context here, but the typeshed convention is that all defaults should be ....

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

Ok removing this (actually the result was already ok, but this was a leftover when I tried to take advantage of the implicit optional)
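
The convention discussed here (every default becomes ..., with spaces around = only when the parameter is annotated) can be sketched like this. stub_signature is a hypothetical helper, not stubgen code; it assumes Python 3.9+ for ast.unparse, handles only plain positional parameters, and does not infer types from default values the way stubgen does.

```python
import ast


def stub_signature(function_source: str) -> str:
    """Render a one-line stub for a simple function, replacing every default by `...`."""
    func = ast.parse(function_source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")
    args = func.args.args
    # Defaults belong to the last len(defaults) positional parameters.
    first_default = len(args) - len(func.args.defaults)
    rendered = []
    for index, arg in enumerate(args):
        has_default = index >= first_default
        if arg.annotation is not None:
            text = f"{arg.arg}: {ast.unparse(arg.annotation)}"
            if has_default:
                text += " = ..."  # annotated: spaces around '='
        else:
            text = arg.arg + ("=..." if has_default else "")  # unannotated: no spaces
        rendered.append(text)
    returns = f" -> {ast.unparse(func.returns)}" if func.returns is not None else ""
    return f"def {func.name}({', '.join(rendered)}){returns}: ..."


print(stub_signature("def g(self, a: Optional[str] = None) -> None: pass"))
```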

@@ -35,6 +35,7 @@
- we don't seem to always detect properties ('closed' in 'io', for example)
"""

import builtins

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

Why do you need this import?

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

Probably a leftover from some previous change, removing...


[case testClassVariable]
 class C:
     x = 1
 [out]
 class C:
-    x = ... # type: int
+    x: int

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

For class attributes it is important to know whether an attribute was just declared, like x: int, or actually defined, like x: int = 1. There is a difference for protocol classes: in the former case an explicit subclass cannot be instantiated. Therefore, I think the translation for class attributes should be like this:

x: int -> x: int
x = 1 -> x: int = ...
x: int = 1 -> x: int = ...
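
The three translations above can be sketched as a small function. This is an illustration only: stub_line_for_class_attribute is a hypothetical helper, it assumes Python 3.9+ for ast.unparse, and it infers a type only from simple literals, falling back to Any otherwise.

```python
import ast

# Literal types this sketch is willing to infer from a plain assignment.
SIMPLE_LITERALS = (bool, int, float, str, bytes)


def stub_line_for_class_attribute(statement: str) -> str:
    """Translate one class-body statement into its stub form."""
    node = ast.parse(statement).body[0]
    if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
        annotation = ast.unparse(node.annotation)
        # Declared only (no value) stays bare; a definition gets `= ...`.
        suffix = "" if node.value is None else " = ..."
        return f"{node.target.id}: {annotation}{suffix}"
    if (isinstance(node, ast.Assign) and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)):
        value = node.value
        if isinstance(value, ast.Constant) and isinstance(value.value, SIMPLE_LITERALS):
            inferred = type(value.value).__name__
        else:
            inferred = "Any"
        return f"{node.targets[0].id}: {inferred} = ..."
    raise ValueError("unsupported statement")


for source in ("x: int", "x = 1", "x: int = 1"):
    print(source, "->", stub_line_for_class_attribute(source))
```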

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have self.x = 1 in __init__). Should those be just x: int or should they also be x: int = ...?

@ilevkivskyi

ilevkivskyi Sep 22, 2017

Collaborator

OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have self.x = 1 in __init__). Should those be just x: int or should they also be x: int = ...?

I think it should be x: int = ... if there is an assignment in __init__ (again mostly from the protocols POV).

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

OK, I've made the change (and some others). I will be pushing an update after some testing.

@ilevkivskyi

ilevkivskyi Sep 22, 2017

Collaborator

Great!

@@ -367,19 +413,19 @@ from re import match, search, sub
__all__ = ['match', 'sub', 'x']
x = 1
[out]
from re import match as match, sub as sub

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

Don't forget about fixing this as discussed.

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

Fixed


noalias1: Any
noalias2: Any
noalias3: bool

@ilevkivskyi

ilevkivskyi Sep 1, 2017

Collaborator

Do you preserve @abstractmethod and other decorators? I think it is important to preserve @abstractmethod, @classmethod, @staticmethod, and @property.

@dmoisset

dmoisset Sep 22, 2017

Author Contributor

I didn't make any changes here wrt the original stubgen. I see some tests checking for property/staticmethod/classmethod being preserved, and other decorators removed (I think abstractmethod is not preserved). If you don't mind, I would add that as a separate issue to avoid scope creep here

@ilevkivskyi

ilevkivskyi Sep 22, 2017

Collaborator

OK, could you please open an issue for this?

@gvanrossum

Member

commented Sep 19, 2017

Would love to see this land -- who's waiting for whom?

@ilevkivskyi

Collaborator

commented Sep 19, 2017

I am waiting for @dmoisset to implement the last few comments.

dmoisset added a commit to dmoisset/peps that referenced this pull request Sep 22, 2017

@dmoisset

Contributor Author

commented Sep 22, 2017

OK, I've pushed a new update. There are a few issues that we've discussed in the comments that were actually already issues in the previous version. Something that could be done, if you agree, is to merge this and file these items as separate issues. What I have is:

  • There's no clear road to take to export names when __all__ is not defined #3169 (comment) (The difficulty here is deciding what to do, implementation is easy); I checked a bit what tests should be updated and there will be a lot of import foo as foo everywhere to add and I'm worried that in general it's not what's desired.
  • stubgen should do semantic analysis instead of dealing with unbound types #3169 (comment) (this is a major change/rewrite)
  • stubgen treats some assignments as type aliases when they may not be, e.g. type_map[int] #3169 (comment) (this could also be easier to mitigate by doing more semantic analysis)
  • make sure abstractmethod is passed through #3169 (comment) (this should be a very small patch, but I didn't want to keep adding new stuff here)
@dmoisset

Contributor Author

commented Sep 22, 2017

(to clarify: I can add the issues, just wanted to sync on how we proceed)

@gvanrossum

Member

commented Sep 22, 2017

@dmoisset

Contributor Author

commented Sep 22, 2017

Oh, the apologies are on me; code reviews have been very quick, and the delays have been on my side. My chances to work on this are limited so it can take me a long time to move forward after the reviews

@ilevkivskyi

Collaborator

commented Sep 22, 2017

There's no clear road to take to export names when __all__ is not defined #3169 (comment) (The difficulty here is deciding what to do, implementation is easy)

The problem is that stubgen does not conform to PEP 484 either with or without your PR.
I think PEP 484 is clear here and we should just go with it, instead of reconsidering the as name rule again.

I checked a bit what tests should be updated and there will be a lot of import foo as foo everywhere to add and I'm worried that in general it's not what's desired.

These tests might be older than PEP 484 itself, so I am not surprised. We just need to update the tests, although I am fine to do this in a separate PR.

To summarize, I am fine with merging this PR as is to avoid unnecessary growth (it is already large). But, please open four separate issues for the problems you mentioned and mention this PR in every issue.

@ilevkivskyi ilevkivskyi merged commit 4fc4ae2 into python:master Sep 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@ilevkivskyi

Collaborator

commented Sep 22, 2017

@dmoisset
Thank you for working on this! This will make stubgen more usable, and I will be happy if you continue to work on the above-mentioned issues.

@gvanrossum

Member

commented Sep 22, 2017

Thanks so much Daniel for working on this! And Ivan for reviewing. Stubgen doesn't get enough developer love.
