Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect and support module aliasing via assignment. #3435

Merged
merged 25 commits into from Jun 3, 2017
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
ad646b9
Detect and support module aliasing via assignment.
carljm May 24, 2017
c498e2a
Handle chained assignment and iterable unpacking assignment.
carljm May 24, 2017
fdb3af1
Remove test case that was only for exploration, not intended for incl…
carljm May 25, 2017
14d7d0f
Merge branch 'master' into module-alias
carljm May 30, 2017
ee8bd80
Add some more tests for module assignment.
carljm May 30, 2017
7efbdb7
Also add tests for access/assignment of nonexistent module attribute.
carljm May 30, 2017
fd3eb84
Break down code and add comments for clarity; add test for mismatch l…
carljm May 30, 2017
4006d0a
Naming improvements.
carljm May 30, 2017
7aeb65f
Merge branch 'master' into module-alias
carljm May 31, 2017
6dc0757
Support tracking module assignment in non-global scope.
carljm May 31, 2017
9aa8169
Add more tests for unpacking mismatch cases.
carljm May 31, 2017
6e7cd24
Keep rvals always on the right.
carljm May 31, 2017
d3d8f9f
Don't use a form of unpacking that is Py35+ only.
carljm May 31, 2017
1cbe94c
It's the zip that is problematic, not just the unpacking.
carljm May 31, 2017
325ae94
Add tests for module assignment in class and local scopes.
carljm May 31, 2017
2326425
Merge branch 'master' into module-alias
carljm Jun 3, 2017
9592ce9
Simplify to single method.
carljm Jun 3, 2017
f438f9c
Go back to annotating genericpath as Any in stdlib-sample.
carljm Jun 3, 2017
e1ce5b8
Respect explicit type annotation and don't propagate module reference.
carljm Jun 3, 2017
c51a916
Merge branch 'master' into module-alias
carljm Jun 3, 2017
4d40207
Backtrack to simple ModuleType var in case of incompatible module ali…
carljm Jun 3, 2017
a8a4f44
Remove stray pdb comment.
carljm Jun 3, 2017
6584b0b
Also handle reassignment of an original (non-alias) module reference.
carljm Jun 3, 2017
b292673
Simplify: fail on assignment of different modules to same variable wi…
carljm Jun 3, 2017
f62d58e
Style and working tweaks.
carljm Jun 3, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
59 changes: 59 additions & 0 deletions mypy/semanal.py
Expand Up @@ -1520,6 +1520,7 @@ def visit_assignment_stmt(self, s: AssignmentStmt) -> None:
self.process_namedtuple_definition(s)
self.process_typeddict_definition(s)
self.process_enum_call(s)
self.process_module_assignment(s)

if (len(s.lvalues) == 1 and isinstance(s.lvalues[0], NameExpr) and
s.lvalues[0].name == '__all__' and s.lvalues[0].kind == GDEF and
Expand Down Expand Up @@ -2356,6 +2357,64 @@ def is_classvar(self, typ: Type) -> bool:
def fail_invalid_classvar(self, context: Context) -> None:
self.fail('ClassVar can only be used for assignments in class body', context)

def process_module_assignment(self, s: AssignmentStmt) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a function that just unconditionally calls another function? Maybe one is enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. This was to present a cleaner API (pass in only the AssignmentStmt), but given there's only one call site and will never be more, that's not very useful. Consolidated.

"""Check if s assigns a module an alias name; if so, update symbol table."""
self._process_module_assignment(s.lvalues, s.rvalue, s)

def _process_module_assignment(
self,
lvals: List[Expression],
rval: Expression,
ctx: AssignmentStmt,
) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically this style is not used in mypy. You can just wrap argument list like this:

    def process_module_assignment(self, lvals: List[Expression], rval: Expression,
                                  ctx: AssignmentStmt) -> None:

"""Propagate module references across assignments.

Recursively handles the simple form of iterable unpacking; doesn't
handle advanced unpacking with *rest, dictionary unpacking, etc.

In an expression like x = y = z, z is the rval and lvals will be [x,
y].

"""
if all(isinstance(v, (TupleExpr, ListExpr)) for v in lvals + [rval]):
# rval and all lvals are either list or tuple, so we are dealing
# with unpacking assignment like `x, y = a, b`. Mypy didn't
# understand our all(isinstance(...)), so cast them as
# Union[TupleExpr, ListExpr] so mypy knows it is safe to access
# their .items attribute.
seq_lvals = cast(List[Union[TupleExpr, ListExpr]], lvals)
seq_rval = cast(Union[TupleExpr, ListExpr], rval)
# given an assignment like:
# (x, y) = (m, n) = (a, b)
# we now have:
# seq_lvals = [(x, y), (m, n)]
# seq_rval = (a, b)
# We now zip this into:
# elementwise_assignments = [(a, x, m), (b, y, n)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put a and b on the right here and in the actual zip? (It will be more logical and will match the order of arguments in _process_module_assignment.)

# where each elementwise assignment includes one element of rval and the
# corresponding element of each lval. Basically we unpack
# (x, y) = (m, n) = (a, b)
# into elementwise assignments
# x = m = a
# y = n = b
# and then we recursively call this method for each of those assignments.
# If the rval and all lvals are not all of the same length, zip will just ignore
# extra elements, so no error will be raised here; mypy will later complain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somehow not convincing. Could you please add a test that

import m
x, y = m, m, m

is flagged as an error? (Plus also some length mismatches in nested scenarios.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yesterday I added a test for the x, y, z = m, n case; will also push a test for x, y = m, m, m

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and also for length mismatches in nested scenarios.

# about the length mismatch in type-checking.
elementwise_assignments = zip(seq_rval.items, *[v.items for v in seq_lvals])
for rv, *lvs in elementwise_assignments:
self._process_module_assignment(lvs, rv, ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks a bit unclear and probably unnecessary. We need to support only x, y = mod1, mod2 (with equal length of l.h.s and r.h.s.) and x = y = mod but not nested cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific what you find unclear about this code so I can improve the clarity? When I rewrite this code to not support nesting, it doesn't make it any simpler or clearer. The recursive approach is still the simplest even for a single layer of iterable (otherwise you have to do something else weird to support both iterable and single rval), and once you have recursion, not supporting nesting is more complex than supporting it.

I agree that nested unpacked module assignment is likely very rare, but it seems strange to me to reduce compatibility with true Python semantics in order to achieve (AFAICS) a negligible or nonexistent gain in clarity/simplicity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to agree that with all the comprehensions, casts, repetitions of (TupleExpr, ListExpr) and cleverness with zip and *x, I have no confidence that I understand what this code does. And then I'm not even touching on recursion. So the code appears unmaintainable, and I think it needs to be improved (maybe some comments would help).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! Pushed an update that breaks down the code a bit and adds comments to clarify the intent. Let me know if that helps.

elif isinstance(rval, NameExpr):
rnode = self.lookup(rval.name, ctx)
if rnode and rnode.kind == MODULE_REF:
for lval in lvals:
if not isinstance(lval, NameExpr):
continue
lnode = self.lookup(lval.name, ctx)
if lnode:
lnode.kind = MODULE_REF
lnode.node = rnode.node

def process_enum_call(self, s: AssignmentStmt) -> None:
"""Check if s defines an Enum; if yes, store the definition in symbol table."""
if len(s.lvalues) != 1 or not isinstance(s.lvalues[0], NameExpr):
Expand Down
4 changes: 2 additions & 2 deletions test-data/stdlib-samples/3.2/test/test_genericpath.py
Expand Up @@ -23,7 +23,7 @@ def safe_rmdir(dirname: str) -> None:

class GenericTest(unittest.TestCase):
# The path module to be tested
pathmodule = genericpath # type: Any
pathmodule = genericpath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should keep the Any annotation here and not add # type: ignore below. This makes me think about another important aspect, if there is an explicit type annotation in a module assignment, then it should not create a module alias, but just a normal variable with that type. This will be consistent with how type aliases behave (It would be great to document this behaviour, but not necessarily in this PR):

import m
import types
class A: ...
class B(A): ...
Alias = A # creates a type alias
Var: Type[A] = A # creates a normal variable
mod = m # creates a module alias
mod.a # OK
mod_var: types.ModuleType = m # creates a normal variable
mod_var.a # Fails, ModuleType has no attribute a

And don't forget to add a test for module assignment with an explicit type (both types.ModuleType and Any).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so taking several things separately:

  1. Why is it better to keep # Any annotation for the module instead of # type: ignore in the one specific location mypy doesn't understand? It seems to me the latter provides a better test case for mypy, as it will validate all other attribute accesses on the module. If this were in my own code base, I would definitely prefer the # type: ignore on one special-case line over an # Any annotation that turns off all checking for that variable.

  2. Nonetheless, an explicit annotation of Any type should override the module-ref propagation; that was a bug in this PR, very good catch! Pushing fix and test.

  3. I don't understand why an explicit annotation of types.ModuleType should prevent propagation of the module-reference information. It doesn't seem consistent with how class references work. To extend the above example slightly:

from typing import Type
class A:
    foo = 'bar'
var: Type[A] = A
reveal_type(var.foo)

This code type-checks fine, and reveals a type of builtins.str. To me, this is perfectly parallel to:

import m
import types
mod: types.ModuleType = m
reveal_type(m.a)

I would definitely expect the latter code to succeed and reveal type of builtins.str, not fail with Module has no attribute "a". Why does giving an explicit type of types.ModuleType imply that I want mypy to play dumb about which module it is and what attributes it has?

I've pushed the behavior that makes sense to me for now; will be happy to update once I understand why.

Copy link
Member Author

@carljm carljm Jun 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, after further thought I think I understand your point: "type alias" is to "var of type Type" as "module alias" is to "var of type types.ModuleType." I'm still not sure that this provides the most intuitive or useful behavior, given that type aliases and module references are used in different ways. I just tested and I can still pass a real module reference to a function that takes a parameter annotated as types.ModuleType -- is there anything I can't do with a real module reference that I can do with an ordinary variable of types.ModuleType? Put differently, given your preferred behavior, what real-world use case is there to annotate a variable as types.ModuleType?

Anyway, changed my mind and pushing all changes as you suggest. Still would like to understand the reasoning, though.

common_attributes = ['commonprefix', 'getsize', 'getatime', 'getctime',
'getmtime', 'exists', 'isdir', 'isfile']
attributes = [] # type: List[str]
Expand Down Expand Up @@ -143,7 +143,7 @@ def test_exists(self) -> None:
f.close()
self.assertIs(self.pathmodule.exists(support.TESTFN), True)
if not self.pathmodule == genericpath:
self.assertIs(self.pathmodule.lexists(support.TESTFN),
self.assertIs(self.pathmodule.lexists(support.TESTFN), # type: ignore
True)
finally:
if not f.closed:
Expand Down
79 changes: 79 additions & 0 deletions test-data/unit/check-modules.test
Expand Up @@ -1439,3 +1439,82 @@ class C:
a = 'foo'

[builtins fixtures/module.pyi]

[case testModuleAssignment]
import m
m2 = m
reveal_type(m2.a) # E: Revealed type is 'builtins.str'
m2.b # E: Module has no attribute "b"
m2.c = 'bar' # E: Module has no attribute "c"

[file m.py]
a = 'foo'

[builtins fixtures/module.pyi]

[case testClassModuleAssignment]
import m

class C:
x = m
def foo(self) -> None:
reveal_type(self.x.a) # E: Revealed type is 'builtins.str'

[file m.py]
a = 'foo'

[builtins fixtures/module.pyi]

[case testLocalModuleAssignment]
import m

def foo() -> None:
x = m
reveal_type(x.a) # E: Revealed type is 'builtins.str'

class C:
def foo(self) -> None:
x = m
reveal_type(x.a) # E: Revealed type is 'builtins.str'

[file m.py]
a = 'foo'

[builtins fixtures/module.pyi]

[case testChainedModuleAssignment]
import m
m3 = m2 = m
m4 = m3
m5 = m4
reveal_type(m2.a) # E: Revealed type is 'builtins.str'
reveal_type(m3.a) # E: Revealed type is 'builtins.str'
reveal_type(m4.a) # E: Revealed type is 'builtins.str'
reveal_type(m5.a) # E: Revealed type is 'builtins.str'

[file m.py]
a = 'foo'

[builtins fixtures/module.pyi]

[case testMultiModuleAssignment]
import m, n
m2, n2, (m3, n3) = m, n, [m, n]
reveal_type(m2.a) # E: Revealed type is 'builtins.str'
reveal_type(n2.b) # E: Revealed type is 'builtins.str'
reveal_type(m3.a) # E: Revealed type is 'builtins.str'
reveal_type(n3.b) # E: Revealed type is 'builtins.str'

x, y = m # E: 'types.ModuleType' object is not iterable
x, y, z = m, n # E: Need more than 2 values to unpack (3 expected)
x, y = m, m, m # E: Too many values to unpack (2 expected, 3 provided)
x, (y, z) = m, n # E: 'types.ModuleType' object is not iterable
x, (y, z) = m, (n, n, n) # E: Too many values to unpack (2 expected, 3 provided)

[file m.py]
a = 'foo'

[file n.py]
b = 'bar'

[builtins fixtures/module.pyi]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see few more tests. For example, a test that shows this failing:

import m
x, y = m  # must be an error here (Module object is not iterable or similar)

and in general more failures, like failure on attempted access/assignment to a non-existing module attribute.

Also probably some chained assignments:

import m
x = m
y = x
reveal_type(y.a) # 'builtins.str'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Very good point, will add these tests.