Stubgen improvements: preserving existing annotations #3169

dmoisset · 2017-04-13T18:28:57Z

This PR heavily modifies stubgen so that it creates better stubs when processing already annotated files (implements #2106). It essentially adds annotations to the stubs for:

Variables/attributes with annotations
Annotations on functions/methods
TypeVar declarations
Type Aliases
It also tries to add all the required imports to the stub so that the resulting file is valid, and to handle forward references properly when necessary. It also includes some fixes (handling of generics as base classes, some formatting fixes in the output) that were natural consequences of the refactoring I did.

I know this is a large patch (and refactors some pieces of the generator), so feel free to comment if it needs discussion.

dmoisset · 2017-04-13T18:33:46Z

Just checked other stubgen issues and this seems to fix #1782 too

ilevkivskyi · 2017-04-13T19:38:39Z

test-data/unit/stubgen.test

@@ -20,42 +20,52 @@ def g(arg): ...
 def f(a, b=2): ...
 def g(b=-1, c=0): ...
 [out]
-def f(a, b: int = ...): ...
-def g(b: int = ..., c: int = ...): ...
+def f(a, b: int=...): ...


TBH I like the old style better, also see this example from PEP 8:

Yes:
def munge(sep: AnyStr = None): ...
def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...

No:
def munge(input: AnyStr=None): ...
def munge(input: AnyStr, limit = 1000): ...

typeshed's convention is to put spaces around the =.

OK, I was used to the old rule, not the special case for annotations. I'll change that.
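For illustration, the spacing rule agreed on here can be sketched as a small formatter. The helper name `format_arg` and its parameters are hypothetical and are not part of stubgen; the sketch only shows where the spaces go.

```python
from typing import Optional


def format_arg(name: str, annotation: Optional[str] = None,
               has_default: bool = False) -> str:
    """Render one stub argument using the PEP 8 spacing rule.

    Hypothetical helper for illustration only; not stubgen's actual code.
    """
    if annotation is None:
        # Unannotated default: no spaces around '='.
        return name + ('=...' if has_default else '')
    # Annotated default: spaces around '='.
    rendered = '{}: {}'.format(name, annotation)
    return rendered + (' = ...' if has_default else '')


print(format_arg('a'))                        # a
print(format_arg('b', 'int', True))           # b: int = ...
print(format_arg('limit', has_default=True))  # limit=...
```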

ilevkivskyi

I like this PR, it will definitely make stubgen more usable. Here are more detailed comments.

ilevkivskyi · 2017-04-27T08:02:31Z

test-data/unit/stubgen.test

+
+y: C
+[out]
+x: 'C'


Forward references are not needed in stub files, so I would propose keeping all annotations simple; that creates less "noise" when reading stubs.

I didn't know that (and a lot of the patch could be made simpler by removing forward ref handling). I couldn't find that documented anywhere (looking at PEP 484); is that actually a defined behaviour or an implementation artifact?

This is just a consequence of the fact that .pyi files are never actually executed, and the only purpose of forward references is to "hide" something from the Python runtime. This is already heavily used in typeshed stubs to simplify them. If you think it is worth mentioning in PEP 484, you could make a short PR to the python/peps repo.

OK, changed this
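A minimal sketch of what dropping the quotes could look like, assuming annotations are available as source text. The helper `unquote_annotation` is hypothetical and only handles an annotation that is a single string literal; stubgen's real handling may differ.

```python
import ast


def unquote_annotation(text: str) -> str:
    """Return the annotation without quotes if it is a plain string literal.

    Hypothetical helper for illustration; not stubgen's actual code.
    """
    try:
        node = ast.parse(text, mode='eval').body
    except SyntaxError:
        return text
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    return text


# A stub is only parsed, never run, so naming C before its definition is fine.
stub_source = "x: C\nclass C: ...\n"
ast.parse(stub_source)

print(unquote_annotation("'C'"))        # C
print(unquote_annotation("List[int]"))  # List[int]
```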

ilevkivskyi · 2017-04-27T08:05:56Z

test-data/unit/stubgen.test

+y: Any
+
+[case testMultipleAssignmentAnnotated]
+x, y = 1, "2" # type: int, str


I would add a few tests with wrong annotations, for example with more types than variables, and other possible mismatches, just to be sure that stubgen will not crash on those.

OK... the current code is not crashing, but generating "something" instead of erroring out. I can add an error message, although I'm a bit concerned that validation is somewhat out of scope for the stubgen tool.

I'm a bit concerned that validation is somewhat out of scope for the stubgen tool.

Just not crashing on invalid input is already OK, I think. If you can produce a reasonable warning, then it's perfect.

OK. The current status is that it doesn't crash; it just ignores the "extra" data.
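The "ignore the extra data" behaviour can be sketched as follows. This is not stubgen's code: `pair_annotations` is a hypothetical helper, and falling back to Any when there are fewer types than names is an assumption made for the example.

```python
from typing import List, Tuple


def pair_annotations(names: List[str], types: List[str]) -> List[Tuple[str, str]]:
    """Pair assignment targets with the types from a type comment.

    Extra types are ignored; missing types fall back to 'Any' (an assumption).
    """
    paired = []
    for index, name in enumerate(names):
        annotation = types[index] if index < len(types) else 'Any'
        paired.append((name, annotation))
    return paired


# x, y = 1, "2"  # type: int, str, bool   (one type too many)
print(pair_annotations(['x', 'y'], ['int', 'str', 'bool']))
# x, y = 1, "2"  # type: int              (one type missing)
print(pair_annotations(['x', 'y'], ['int']))
```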

ilevkivskyi · 2017-04-27T08:31:31Z

test-data/unit/stubgen.test

        def f(self): ...
    def g(self): ...

 [case testExportViaRelativeImport]
 from .api import get
 [out]
-from .api import get as get


Note that this form should be preserved. There is currently a rule in PEP 484 that says this form should be used for names that are actually imported (and therefore re-exported by the Python runtime). Mypy currently does not follow this rule and treats all names as re-exported, see #2927, but it will be fixed soon.

Therefore, to prevent breakage of stubgen stubs after the fix, and to comply with PEP 484 this form from mod import X as X should be preserved for names that are actually imported in original file. At the same time, a short form from mod import X should be used for auxiliary/fake imports created by stubgen (like typing).

Even after reading this comment and pep-484, I'm not sure I fully understand what is required here. What I have is:

If a stub needs to export an imported name, it should import it with an alias (even if the alias name is the same as the original). Is that correct?

But then, how does stubgen know which imported names were "intended" to be re-exported? Should it use __all__ and ignore everything else? Should it re-export everything just in case? Should it re-export only names imported in a certain way?

I could use a hand here at least with the test cases, once I have some boundaries defined by those I could probably tweak the implementation to match the spec.

If a stub needs to export an imported name, it should import it with an alias (even if the alias name is the same as the original). Is that correct?

Yes.

But then, how does stubgen know which imported names were "intended" to be re-exported? Should it use __all__ and ignore everything else? Should it re-export everything just in case? Should it re-export only names imported in a certain way?

I think the safest bet would be to re-export all globally visible names (i.e. ignore imports in function bodies) if there is no __all__ defined, and to re-export only names from __all__ if there is one.

I'm implementing the __all__ scenario. I'm not doing the "reexport everything if there's no __all__" because I think the result ends up being unnecessarily noisy for most cases (you'll end up with tons of modules saying from typing import List as List because they don't have an __all__). In any case, it's easy to change our minds later

you'll end up with tons of modules saying from typing import List as List because they don't have an __all__.

You can just manually exclude names imported from typing. In general, this is an important point, since it already caused some bugs in typeshed, see e.g. python/typeshed#1484. I would prefer this test to work.

(Also I don't think people will be unhappy about a bit more bulky stub, but they will be certainly unhappy about false positives when as name isn't present).
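The rule proposed by the reviewer in this exchange can be sketched as below. Note that this is the suggested rule, not what the PR implemented (the PR handles only the __all__ case); `render_from_import` and its parameters are hypothetical names used for illustration.

```python
from typing import Iterable, Optional


def render_from_import(module: str, name: str,
                       exported: Optional[Iterable[str]]) -> str:
    """Render a 'from' import for a stub.

    exported holds the contents of __all__, or None when the module defines
    no __all__. Hypothetical helper; not stubgen's actual code.
    """
    if module == 'typing':
        # Helper imports added for annotations keep the short form.
        reexport = False
    elif exported is None:
        # No __all__: treat every globally imported name as re-exported.
        reexport = True
    else:
        reexport = name in exported
    if reexport:
        return 'from {} import {} as {}'.format(module, name, name)
    return 'from {} import {}'.format(module, name)


print(render_from_import('.api', 'get', None))
print(render_from_import('re', 'search', ['match', 'sub', 'x']))
```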

ilevkivskyi · 2017-04-27T08:40:00Z

mypy/stubgen.py

+    def visit_unbound_type(self, t: UnboundType)-> str:
+        s = t.name
+        base = s.split('.')[0]
+        self.stubgen.import_tracker.require_name(base)


I am not sure I follow the logic here. If I have an unbound type X.Y.Z, will this require import X? But how do we know X is actually a module? Maybe it would be better to just use Any for unbound types?

When you have an X.Y.Z annotation, what else could X be? I'm assuming a module here because, even if it's not the only possibility, it's probably the most likely one.

I am still not sure it is a good idea to guess something for an unbound type. I still think it would be better to use Any. Why do you think we should do something at all with unbound types?

Stubgen runs mostly at a syntactical level. Semantic analysis phase is not run within it. So essentially 99% of the type annotations it will find will still be unbound

Stubgen runs mostly at a syntactical level. Semantic analysis phase is not run within it.

OK, is it easy to fix this? If not, then maybe add a # TODO: item here and/or open an issue?

I don't think it's easy to fix at all; it's probably a major change in the tool, since the design decision to work syntactically has been heavily embedded in it from day one.

I would also have an issue for this. Maybe some day we will have time to improve this.
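To make the reviewer's concern concrete: taking the first dotted component works the same way whether that component is a module or a class, so a purely syntactic tool cannot tell the two apart. In this sketch, `base_name` mirrors the `s.split('.')[0]` expression from the diff, while `Outer` and `Inner` are made-up names.

```python
import collections.abc  # 'collections' is a module here


class Outer:
    """A class whose nested class can be written as 'Outer.Inner'."""

    class Inner:
        pass


def base_name(dotted: str) -> str:
    # Same expression as in the diff: keep only the first component.
    return dotted.split('.')[0]


print(base_name('collections.abc.Iterable'))  # collections
print(base_name('Outer.Inner'))               # Outer
```

In the second case there is nothing to import, which is why guessing "module" is only a heuristic.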

ilevkivskyi · 2017-04-27T08:46:38Z

mypy/stubgen.py

+            if alias:
+                self.reverse_alias[alias] = name
+
+    def add_import(self, module: str, alias: str=None) -> None:


IIRC it was already discussed that PEP 8 specifies spaces around = for this case; it should be alias: str = None, not alias: str=None. Also note that we are moving towards explicit Optional types for args with default None.

(This also appears a few times below.)

ilevkivskyi · 2017-04-27T08:58:35Z

mypy/stubgen.py

        self._state = CLASS

+    def is_type_expression(self, expr: Expression, top_level: bool=True) -> bool:


I would like to see more tests for this, to be sure that complex (generic) aliases like Union[T, List[T]] are recognized, but things like type_map[int] are not.

I can add a test for the former...
The latter will actually go through as an alias, and I'm not sure how to prevent that with the kind of information I have in stubgen (I'm assuming that in most libraries that kind of thing won't happen much at the module top level).
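The limitation described here can be reproduced with a purely syntactic check. The sketch below uses the standard ast module rather than mypy's own node classes, assumes Python 3.9 or later (where a subscript's slice is a plain expression), and omits the top_level flag; `looks_like_type` and `check` are hypothetical names.

```python
import ast


def looks_like_type(node: ast.expr) -> bool:
    """Syntactic guess: names, dotted names and subscripts of those."""
    if isinstance(node, ast.Name):
        return True
    if isinstance(node, ast.Attribute):
        return looks_like_type(node.value)
    if isinstance(node, ast.Subscript):
        index = node.slice
        args = index.elts if isinstance(index, ast.Tuple) else [index]
        return looks_like_type(node.value) and all(
            looks_like_type(arg) for arg in args)
    return False


def check(source: str) -> bool:
    return looks_like_type(ast.parse(source, mode='eval').body)


print(check('Union[T, List[T]]'))  # True
print(check('type_map[int]'))      # True (a false positive)
print(check('f(x) + 1'))           # False
```

Without knowing what type_map refers to, the two subscripts are indistinguishable, which matches the author's point.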

ilevkivskyi · 2017-04-27T09:00:35Z

mypy/stubgen.py

+    def require_name(self, name: str) -> None:
+        self.required_names.add(name.split('.')[0])
+
+    def import_lines(self) -> List[str]:


It is difficult to follow the logic of this method. Maybe a docstring (and comments) will help?

ilevkivskyi · 2017-04-27T09:18:25Z

mypy/stubgen.py

+        self.module_for = {}  # type: Dict[str, Optional[str]]
+        self.direct_imports = {}  # type: Dict[str, str]
+        self.reverse_alias = {}  # type: Dict[str, str]
+        self.required_names = set()  # type: Set[str]


I would document the meaning of these four variables. Maybe just add a comment above every line?

Added these
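As a rough illustration of how the four fields could fit together, here is a simplified tracker. The meanings given in the comments are a reading of the field names and are assumptions, not text from the PR; the class name `ImportTrackerSketch` and the bodies of its methods are likewise made up for this example.

```python
from typing import Dict, List, Optional, Set


class ImportTrackerSketch:
    """Simplified, hypothetical version of the tracker under review."""

    def __init__(self) -> None:
        # Assumed: name -> module it came from, None if the name is a module.
        self.module_for = {}  # type: Dict[str, Optional[str]]
        # Assumed: top-level name -> full dotted path given to 'import'.
        self.direct_imports = {}  # type: Dict[str, str]
        # Assumed: alias -> original name.
        self.reverse_alias = {}  # type: Dict[str, str]
        # Assumed: names actually used in annotations.
        self.required_names = set()  # type: Set[str]

    def add_import_from(self, module: str, name: str,
                        alias: Optional[str] = None) -> None:
        self.module_for[alias or name] = module
        if alias:
            self.reverse_alias[alias] = name

    def add_import(self, module: str, alias: Optional[str] = None) -> None:
        name = module.split('.')[0]
        self.module_for[alias or name] = None
        self.direct_imports[name] = module
        if alias:
            self.reverse_alias[alias] = name

    def require_name(self, name: str) -> None:
        self.required_names.add(name.split('.')[0])

    def import_lines(self) -> List[str]:
        """Emit imports only for names that annotations actually need."""
        lines = []
        for name in sorted(self.required_names):
            if name not in self.module_for:
                continue  # defined locally, or unknown
            module = self.module_for[name]
            original = self.reverse_alias.get(name, name)
            suffix = '' if original == name else ' as ' + name
            if module is None:
                path = self.direct_imports.get(original, original)
                lines.append('import {}{}'.format(path, suffix))
            else:
                lines.append('from {} import {}{}'.format(module, original, suffix))
        return lines


tracker = ImportTrackerSketch()
tracker.add_import_from('typing', 'List')
tracker.add_import('numpy', 'np')
tracker.add_import('os')            # imported but never required
tracker.require_name('List')
tracker.require_name('np.ndarray')
print(tracker.import_lines())
```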

ilevkivskyi · 2017-04-27T09:27:12Z

test-data/unit/stubgen.test

+import urllib.parse
+__all__ = ['urllib']
+[out]
+import urllib.parse


I think there should still be an error comment in output like

# Names in __all__ with no definition:
#   urllib

before or after the import statement.

Running import urllib.parse defines the urllib name and attaches it to the correct module, so why should this be an error?

Yes, then ignore this comment (or add another test where this produces an error, like from urllib import parse).
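The difference between the two import forms can be checked with the standard ast module. The helper `bound_names` is a hypothetical name; it lists the names each top-level import statement binds in the module namespace.

```python
import ast
from typing import List


def bound_names(source: str) -> List[str]:
    """List the names that top-level import statements bind."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                # 'import a.b' binds 'a'; 'import a.b as c' binds 'c'.
                names.append(alias.asname or alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                names.append(alias.asname or alias.name)
    return names


print(bound_names('import urllib.parse'))       # ['urllib']
print(bound_names('from urllib import parse'))  # ['parse']
```

With the second form, 'urllib' in __all__ has no definition, so that is the case where the warning comment would apply.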

ilevkivskyi · 2017-04-27T09:28:39Z

test-data/unit/stubgen.test

@@ -552,14 +602,148 @@ def f(a): ...
 [case testInferOptionalOnlyFunc]
 class A:
    x = None
-    def __init__(self, a=None) -> None:


I would also keep an additional test for __init__.

ilevkivskyi · 2017-07-15T13:28:47Z

@dmoisset Are you still working on this? I think this would be a useful addition (in particular in the context of helping maintainers of third-party libraries that want to support typing).

dmoisset · 2017-07-22T23:36:17Z

@ilevkivskyi I've been dealing with other stuff, I might take a look at this again during the rest of the month

ilevkivskyi · 2017-08-31T07:35:11Z

@dmoisset Sorry for pinging again, this is the oldest open PR now (and IMO it contains an important feature).


ilevkivskyi · 2017-09-01T11:41:50Z

Thanks for an update! I replied to your questions above, will make a thorough review soon.

ilevkivskyi

Thanks! This is almost ready, just a few minor comments/questions.

ilevkivskyi · 2017-09-01T11:46:15Z

mypy/stubgen.py


-    def get_init(self, lvalue: str, rvalue: Expression) -> Optional[str]:
+    def get_init(self, lvalue: str, rvalue: Expression,
+                 annotation: Optional[Type]=None) -> Optional[str]:


Missing space around =.

ilevkivskyi · 2017-09-01T11:50:50Z

mypy/stubgen.py

            init_stmt = arg_.initialization_statement
            if init_stmt:
+                if isinstance(init_stmt.rvalue, NameExpr) and init_stmt.rvalue.name == 'None':
+                    initializer = 'None'


There is a recent agreement that all optional types should be explicit in typeshed, and this is now checked with --no-implicit-optional. If implicit Optional types are already added at the parse stage, then I think the initializer should always be ..., even for optional types. Otherwise, we can keep None.

Not sure of the exact context here, but the typeshed convention is that all defaults should be ....

OK, removing this (the result was actually already fine; this was left over from when I tried to take advantage of the implicit Optional).

ilevkivskyi · 2017-09-01T18:59:19Z

mypy/stubgen.py

@@ -35,6 +35,7 @@
 - we don't seem to always detect properties ('closed' in 'io', for example)
 """

+import builtins


Why do you need this import?

Probably a leftover from some previous change; removing...

ilevkivskyi · 2017-09-01T19:07:12Z

test-data/unit/stubgen.test


 [case testClassVariable]
 class C:
    x = 1
 [out]
 class C:
-    x = ...  # type: int
+    x: int


For class attributes it is important to know whether an attribute was just declared, like x: int, or actually defined, like x: int = 1. There is a difference for protocol classes: in the former case an explicit subclass cannot be instantiated. Therefore, I think the translation for class attributes should be like this:

x: int -> x: int
x = 1 -> x: int = ...
x: int = 1 -> x: int = ...

OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have self.x = 1 in __init__). Should those be just x: int or should they also be x: int = ...?

OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have self.x = 1 in __init__). Should those be just x: int or should they also be x: int = ...?

I think it should be x: int = ... if there is an assignment in __init__ (again mostly from the protocols POV).

OK, I've made the change (and some others). I will push an update after some testing.
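The translation table agreed on above comes down to one decision: keep an initializer (always spelled ...) whenever the source assigned a value. A minimal sketch, where `stub_attribute` is a hypothetical helper and the annotation for x = 1 is assumed to have been inferred as int by the caller:

```python
def stub_attribute(name: str, annotation: str, has_value: bool) -> str:
    """Render a class or instance attribute for a stub.

    Hypothetical helper for illustration; not stubgen's actual code.
    """
    if has_value:
        # Defined in the source: keep an initializer, but hide the value.
        return '{}: {} = ...'.format(name, annotation)
    # Only declared in the source: no initializer.
    return '{}: {}'.format(name, annotation)


print(stub_attribute('x', 'int', has_value=False))  # x: int
print(stub_attribute('x', 'int', has_value=True))   # x: int = ...
```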

ilevkivskyi · 2017-09-01T19:09:31Z

test-data/unit/stubgen.test

@@ -367,19 +413,19 @@ from re import match, search, sub
 __all__ = ['match', 'sub', 'x']
 x = 1
 [out]
-from re import match as match, sub as sub


Don't forget about fixing this as discussed.

ilevkivskyi · 2017-09-01T19:12:43Z

test-data/unit/stubgen.test

+
+noalias1: Any
+noalias2: Any
+noalias3: bool


Do you preserve @abstractmethod and other decorators? I think it is important to preserve @abstractmethod, @classmethod, @staticmethod, and @property.

I didn't make any changes here wrt the original stubgen. I see some tests checking for property/staticmethod/classmethod being preserved, and other decorators removed (I think abstractmethod is not preserved). If you don't mind, I would add that as a separate issue to avoid scope creep here

OK, could you please open an issue for this?

gvanrossum · 2017-09-19T22:07:07Z

Would love to see this land -- who's waiting for whom?

ilevkivskyi · 2017-09-19T22:08:43Z

I am waiting for @dmoisset to address the last few comments.


dmoisset · 2017-09-22T16:09:52Z

OK, I've pushed a new update. A few of the issues that we've discussed in the comments were actually already issues in the previous version. Something that could be done, if you agree, is to merge this and add these items as separate issues. What I have is:

There's no clear road to take to export names when __all__ is not defined #3169 (comment) (The difficulty here is deciding what to do, implementation is easy); I checked a bit what tests should be updated and there will be a lot of import foo as foo everywhere to add and I'm worried that in general it's not what's desired.
stubgen should do semantic analysis instead of dealing with unbound types #3169 (comment) (this is a major change/rewrite)
stubgen considers as type aliases some assignments of values which may not be type aliases, such as type_map[int] #3169 (comment) (this could be easier to mitigate also by doing more semantic analysis)
make sure abstractmethod is passed through #3169 (comment) (this should be a very small patch, but I didn't want to keep adding new stuff here)

dmoisset · 2017-09-22T16:18:59Z

(to clarify: I can add the issues, just wanted to sync on how we proceed)

gvanrossum · 2017-09-22T16:36:53Z

I haven't reviewed this but I am glad that it is moving forward and I hope we can iterate more quickly in the future! (We dropped the ball on code review here, sorry.)

dmoisset · 2017-09-22T16:52:43Z

Oh, the apologies are on me; code reviews have been very quick, and the delays have been on my side. My chances to work on this are limited so it can take me a long time to move forward after the reviews

ilevkivskyi · 2017-09-22T17:07:11Z

There's no clear road to take to export names when __all__ is not defined #3169 (comment) (The difficulty here is deciding what to do, implementation is easy)

The problem is that stubgen does not conform to PEP 484 both without and with your PR.
I think PEP 484 is clear here and we should just go with it, instead of reconsidering the as name rule again.

I checked a bit what tests should be updated and there will be a lot of import foo as foo everywhere to add and I'm worried that in general it's not what's desired.

These tests might be older than PEP 484 itself, so I am not surprised. We just need to update the tests, although I am fine to do this in a separate PR.

To summarize, I am fine with merging this PR as is to avoid unnecessary growth (it is already large). But, please open four separate issues for the problems you mentioned and mention this PR in every issue.

ilevkivskyi · 2017-09-22T17:32:56Z

@dmoisset
Thank you for working on this! This will make stubgen more usable, and I will be happy if you continue to work on the above-mentioned issues.

gvanrossum · 2017-09-22T18:49:52Z

Thanks so much Daniel for working on this! And Ivan for reviewing. Stubgen doesn't get enough developer love.

ilevkivskyi reviewed Apr 13, 2017

View reviewed changes

dmoisset changed the title ~~Stubgen imrpovements: preserving existing annotations~~ Stubgen improvements: preserving existing annotations Apr 13, 2017

ilevkivskyi requested changes Apr 27, 2017

View reviewed changes

ilevkivskyi mentioned this pull request May 30, 2017

Preserve annotations and inferred types when generating stubs #3475

Open

gvanrossum assigned JukkaL May 30, 2017

ilevkivskyi mentioned this pull request Jun 29, 2017

How to typecheck "2nd" party packages? #3350

Closed

ilevkivskyi mentioned this pull request Jul 15, 2017

Third-party stubs: recommending a default path for installing stub files, overriding stubs python/typing#84

Closed

JukkaL removed their assignment Aug 15, 2017

ilevkivskyi self-assigned this Aug 31, 2017

dmoisset added 16 commits September 1, 2017 00:43

Stubgen using python3.6 variable syntax for stubs

df22528

Adjust stubgen tests to new output

cb5296d

Added annotation pass-through for variable annotations

da25e1c

Refactor handling on import/names and add imports for annotations

96de85c

Preserve function annotations

fb42258

This also changes the style of initializer in generated stubs to be closer to PEP8 style, this implied a few test changes

Add tests for preserving annotations

749b1ac

Fix linter checks

6fbbf60

Try to preserve typevars and type alias

4b98aa0

Pass through generic base classes

a85b3c2

Use space around '=' on annotated function defaults

a4ec686

PEP8 actually suggest this eception to the normal (no spaces) rule for annotated arguments

Fix broken tests after rebase

530792e

Clarify on the use of visit_call_Expr to handl TypeVar

dad8deb

Add test for infer Optional on __init__ arguments

e50e744

remove uoting in stub files, not needed

cb8d946

Add some clarifications in complex methods and data structures in Imp…

7302599

…ortTracker

Add test for deep, generic type alias

df0d5cf

dmoisset force-pushed the stubgen-patch branch from e9cfae2 to df0d5cf Compare September 1, 2017 11:12

ilevkivskyi reviewed Sep 1, 2017

View reviewed changes

dmoisset added 5 commits September 22, 2017 12:43

remove unused import

7a7c7a9

Remove code no longer required now that implicit optional is discouraged

83006ba

Reexport (with alias) names imported in __all__

a95e7a4

Reexport of modules, not just module attributes

dd09d38

Preserve initializer (with ...) in classes

6d383ac

dmoisset added a commit to dmoisset/peps that referenced this pull request Sep 22, 2017

Note on stubs and forward references

8772aa0

As discussed in python/mypy#3169 (comment)

Fix in annotation to avoid covariance issue

93d300a

ilevkivskyi approved these changes Sep 22, 2017

View reviewed changes

ilevkivskyi merged commit 4fc4ae2 into python:master Sep 22, 2017

ilevkivskyi mentioned this pull request Sep 22, 2017

Preserving existing annotations in stubgen #2106

Closed

dmoisset deleted the stubgen-patch branch September 22, 2017 18:01

This was referenced Sep 25, 2017

stubgen often generates incorrect imports #1782

Closed

Use ellipsis as default argument value in stubgen-generated stubs #1915

Closed

ilevkivskyi mentioned this pull request Sep 25, 2017

Stubgen: Added basic support for importing modules that base classes are a member of #3704

Closed

ilevkivskyi mentioned this pull request Sep 17, 2018

stubgen does not seem to retain member variable type hint #5616

Closed
