Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mypy errors with variable reuse with different types #1174

Closed
timabbott opened this issue Jan 27, 2016 · 27 comments
Closed

Mypy errors with variable reuse with different types #1174

timabbott opened this issue Jan 27, 2016 · 27 comments
Assignees

Comments

@timabbott
Copy link

from typing import *

class Test(object):
    x = 1
    y = 2

for sub in [(1, 2)]:
    pass

subs = {} # type: Dict[Tuple[int, int], Test]                                                                                 
for sub in [Test()]:
    subs[(sub.x, sub.y)] = sub

gives

/home/tabbott/foo.py:11: error: Incompatible types in assignment (expression has type "Test", variable has type "Tuple[int, int]")
/home/tabbott/foo.py:12: error: Tuple[int, ...] has no attribute "x"
/home/tabbott/foo.py:12: error: Tuple[int, ...] has no attribute "y"
/home/tabbott/foo.py:12: error: Incompatible types in assignment

Arguably this is not perfect code, but probably this shouldn't be a mypy error?

@gvanrossum
Copy link
Member

I think this is a duplicate of a common theme: variable reuse within one function body. Mypy really prefers that you don't do that. (I can't find a specific issue to point to though.)

@JukkaL JukkaL changed the title Mypy errors with loop variable reuse with different types Mypy errors with variable reuse with different types Feb 4, 2016
@JukkaL
Copy link
Collaborator

JukkaL commented Feb 4, 2016

Yeah, this one has been discussed many times though there doesn't seem to be a github issue for this.

I'm not really sure about what's the best way to approach this. Perhaps mypy should allow redefining a variable with a different type by default and provide a strict mode or option which disallows this.

A general solution would be to logically define a fresh variable each time an assignment redefines a variable on all possible code paths. This would define a new variable, since the old definition can't reach beyond the second assignment:

if c:
    x = 1
x = ''  # new variable x

This would not define a new variable:

x = 1
if c:
    x = ''  # does not define a new variable (error)
print(x)

Not sure about his:

x = 1
print(x)
if c:
    x = ''  # ??
    print(x)
# x not read here any more

A less general approach would be to only allow redefining with a new type only if the original definition didn't have a type annotation and the new type definition doesn't have a type annotation. So this would be okay:

x = 1
...
x = ''

But this would not be okay because the annotation is lying:

def f(x: int) -> None:
    x = str(x)  # error?

And this would not be okay:

x = []  # type: List[int]
...
x = 3  # error?

@ddfisher ddfisher added this to the Future milestone Mar 1, 2016
@matthiaskramm
Copy link
Contributor

For what it's worth, pytype always allows you do redefine a variable with a different type. We're essentially doing SSA.

On the other hand, that also means that pytype considers code like this correct:

def f(x: int):
  x = "foo"

I know it looks odd, but allowing this pattern is is quite useful when adding function annotations to existing code.

@JukkaL
Copy link
Collaborator

JukkaL commented Apr 14, 2016

I propose moving this to an earlier milestone such as 0.4.0, at least tentatively, since issues like these a little painful to refactor, and they can generate a lot of noise. This is more of a style issue than correctness issue, and I'd rather not make mypy very opinionated about style issues. An optional flag that causes mypy to complain about questionable redefinitions would be better in my opinion.

@ddfisher
Copy link
Collaborator

ddfisher commented Apr 14, 2016

Have we run into this issue a lot internally? It looks to me like this'll be a lot of work -- we'd have to start doing a significant amount of control flow analysis, and I expect doing that properly will be somewhat difficult.

@JukkaL
Copy link
Collaborator

JukkaL commented Apr 14, 2016

I've hit this in some open source code I've experimented with. If we run
mypy against a lot of code without annotations using --implicit-any we
should get an estimate of the prevalence via type errors.
On Thu, Apr 14, 2016 at 21:07 David Fisher notifications@github.com wrote:

Have we run into this issue a lot internally? It looks to me like this'll
be a lot of work -- we'd have to start doing a significant amount of
control flow analysis.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#1174 (comment)

@timabbott
Copy link
Author

When you're checking this out, I'd recommend at least also checking the tests directories for the projects; my experience was that our test code had a much higher ratio of these than the rest of the codebase.

@ddfisher
Copy link
Collaborator

Also just FYI --implicit-any is now --check-untyped-defs.

@gvanrossum
Copy link
Member

It's --check-untyped-defs now.

I ran this over a small corpus (under 150K LOC) and found, among a total of over 1200 errors, about 30 occurrences of this message (though 9 of these didn't have the (expression has type "xxx", variable has type "yyy") suffix -- those were from assignments into container items like x[a] = b).

A bunch (indeed most common in tests) were unrelated types, but in many cases arguably the error was due to a too narrowly inferred initial type. E.g.

x = randint()
x = randint() / 1000.0

A bunch were similar but the two types involved were different subclasses of the same superclass. Common was also assignment of differently sized tuples to a variable, e.g.

x = ()
x = (1,)
x = (1, 2)

I've also seen things like

if ...:
    iter_ = [some list]
else:
    iter_ = <some iterable>
for i in iter_: ...

Other complicating factors were that a few times the assignments occurred in except clauses or other "blocks" where the variable wasn't used after exiting the blocks.

All in all I do think this is a popular idiom that we ought to support better.

@JukkaL
Copy link
Collaborator

JukkaL commented Apr 15, 2016

Perhaps we could start with special casing some common idioms instead of trying to support all of the possible idioms. Here my concern is that the rules that mypy follows should be easy to describe and predictable, and approaches like SSA wouldn't really fit this description.

Here are some idioms that would be easy enough to support and that I think are pretty common.

1) Redefined in the same block

Examples:

def f(x: int) -> None:
    x = str(int)
    print(x)

y = 1
print(y)
y = ''
print(y)

Variables within for, while and try (what about with?) statements would be harder as an intermediate value may escape.

Here nested blocks would be considered separate blocks, and we consider function arguments to be defined in the same block as the function body.

2) Variable never read outside statement that defines it

Examples:

for x in (1, 2) : ...
for x in ('a', 'b'): ...
# x not read outside 'for' statements

_ = 1  # variable defined but never read
_ = ''  

Also 'with'. For try/except this might already work. For module-level variables we'd take the last definition as the type of the variable, but if there is a conditional definition things get harder.

3) Conditional definitions don't agree on type

This is actually a separate issue as here we have to infer a single type for a variable. Example (from above):

if ...:
    x = [...]
else:
    x = <iterable>
y = x

@gvanrossum
Copy link
Member

Sounds like a common theme is that if variables are defined and redefined at the same indentation level ("in the same block") and there's no local flow control in between we should check the code between the definitions using the first definition and after that use the second (last) definition. E.g.

if cond():
    x = 1
    x+1
    x = ''
else:
    x = [1, 2, 3]
    x.append(4)
    x = ''

Should work because the ultimate type in each block is the same.

However this is questionable:

while cond():
    x = f()
    if x < 0:
        break
    x = ''
# The type of x here could be int or str.

In a try block everything is questionable:

try:
    x = 0
    foo(x)
    x = ''
except Exception:
    pass
# Here x could be int or str.

A future improvement could not care if there's no use afterwards.

For branches that don't agree we could have some kind of unification (like I want for conditional expressions e.g. #1094).

@JukkaL JukkaL modified the milestones: 0.5, Future Apr 16, 2016
@JukkaL
Copy link
Collaborator

JukkaL commented Apr 16, 2016

Moved to 0.5 (in some generality).

@JukkaL
Copy link
Collaborator

JukkaL commented May 30, 2017

@rhettinger showed me an example which redefines a set as a frozenset, and apparently he sees cases similar to this pretty often. Here is a sketch (I may have forgotten about some specifics, though):

def f() -> None:
    items: Set[int] = set()
    # populate s with values
    items: FrozenSet[int] = frozenset(items)
    # do stuff with items; we want to make sure it doesn't get mutated here

@JukkaL
Copy link
Collaborator

JukkaL commented Jan 21, 2019

Basic implementation was added in #6197, behind a flag. At the moment only simple cases where the redefinition happens in the same block as the original definition, and at the same nesting level, are supported. I'll create follow-up issues to make this more general. I'd still like to enable this by default at some point.

@JukkaL JukkaL closed this as completed Jan 21, 2019
stephenmcgruer added a commit to web-platform-tests/wpt that referenced this issue May 19, 2020
Reusing variables can cause a mypy type error (python/mypy#1174).
This previously went unnoticed because the previous and current use of `key` was `str`, but in
#23644 the previous use was retyped to `Text`
and as such caused a unicode-vs-str type error in mypy.
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue May 27, 2020
…, a=testonly

Automatic update from web-platform-tests
Rename variable to avoid mypy type error (#23669)

Reusing variables can cause a mypy type error (python/mypy#1174).
This previously went unnoticed because the previous and current use of `key` was `str`, but in
web-platform-tests/wpt#23644 the previous use was retyped to `Text`
and as such caused a unicode-vs-str type error in mypy.
--

wpt-commits: 7b9a66f8a9cf68bbe220473f9eff0e652998c072
wpt-pr: 23669
xeonchen pushed a commit to xeonchen/gecko that referenced this issue May 27, 2020
…, a=testonly

Automatic update from web-platform-tests
Rename variable to avoid mypy type error (#23669)

Reusing variables can cause a mypy type error (python/mypy#1174).
This previously went unnoticed because the previous and current use of `key` was `str`, but in
web-platform-tests/wpt#23644 the previous use was retyped to `Text`
and as such caused a unicode-vs-str type error in mypy.
--

wpt-commits: 7b9a66f8a9cf68bbe220473f9eff0e652998c072
wpt-pr: 23669
@alex4747-pub
Copy link

Is there any solution? I am using mypy 0.770, I am porting python2 code where I have hundred cases like the one below. I cannot afford giving new name to every variable - it will introduce a lot of changes that will be really hard to verify.

I am ok with adding code or "# type: " directive at the "Point X". I tried using 'del val' it did not work. All I want is to forget the type of this variable at the point x (I want to keep type information in between though). And the 'val' could be a class.

def func_5() -> int:
    return 5

def func_x() -> str:
    return "x"


def example(t1: bool, t2: bool) -> bool:
    if t1:
        val = func_5()
        if val != 5:
            return False

    # Point X

    if t2:
        val = func_x()
        if val != "x":
            return False

    return True

@ethanhs
Copy link
Collaborator

ethanhs commented Nov 14, 2020

@alex4747-pub try --allow-redefinition.

@alex4747-pub
Copy link

Did not help:
mypy --allow-redefinition example.py
example.py:18: error: Incompatible types in assignment (expression has type "str", variable has type "int")
Found 1 error in 1 file (checked 1 source file)

@uranusjr
Copy link

From documentation: Only redefinitions within the same block and nesting depth as the original definition are allowed.

Your val is redefined in an independent block, and is therefore not allowed. Such variable declaration style is usually frawned upon outside of type hints as well, since it is too easy to accidentally break such code with seemingly unrelated changes.

@alex4747-pub
Copy link

I understand this but unfortunately there is a lot of python2 code to be ported to python3 and it is written in this style.

pavpen added a commit to pavpen/cadquery that referenced this issue Feb 15, 2021
* Updated `Edge.makeSpline()` to use different temporary variables in
the several for loops in the function body. Python for
[loop target variables
leak](https://eli.thegreenplace.net/2015/the-scope-of-index-variables-in-pythons-for-loops/)
to the function (or enclosing module) scope, and MyPy doesn't accept
[reusing the same variable name with a different
type](python/mypy#1174) (except in specific
cases, and when `--allow-redefinition` is used).
PIG208 added a commit to PIG208/zulip that referenced this issue Jul 26, 2021
Mypy doesn't allow redefinition of a variable using a different type
within the same scope.
python/mypy#1174
PIG208 added a commit to PIG208/zulip that referenced this issue Jul 26, 2021
Mypy doesn't allow redefinition of a variable using a different type
within the same scope.
python/mypy#1174
PIG208 added a commit to PIG208/zulip that referenced this issue Jul 26, 2021
Mypy doesn't allow redefinition of a variable using a different type
within the same scope.
python/mypy#1174
timabbott pushed a commit to zulip/zulip that referenced this issue Jul 26, 2021
Mypy doesn't allow redefinition of a variable using a different type
within the same scope.
python/mypy#1174
shawnbrown added a commit to shawnbrown/EnvLauncher that referenced this issue Sep 11, 2021
Mypy just doesn't like this when it's done in the local scope
(see python/mypy#1174).
Lai-YT added a commit to Lai-YT/webcam-applications that referenced this issue Jan 14, 2022
It's about "variable reuse with different types".
Here's one of the issues:
python/mypy#1174
@tsibley
Copy link

tsibley commented Mar 9, 2022

Another example of where --allow-redefinition isn't sufficient because of the "same block" rule:

try:
    # fetch list of numbers (as strings) from somewhere,
    # like argv, an external command's output, an HTTP request, etc.
    x: List[str] = ...
except ...:
    # handle fetch error
    ...
else:
    # convert to actual ints when fetched successfully
    x: List[int] = list(map(int, x))

tsibley added a commit to nextstrain/cli that referenced this issue Mar 9, 2022
Mypy really doesn't like redefinitions¹, and we start with List[str] but
then convert to List[int].  The limited support for redefinition
(--allow-redefinition) doesn't apply here because two different code
blocks are involved.

This particular shift of code into the try block doesn't impact the
handling of exceptions.

¹ python/mypy#1174
@solsword
Copy link

solsword commented Apr 3, 2022

Adding a comment here despite being closed because #2806 refers here, and #6263 (which is open) could also be addressed, plus I think there's an opportunity to improve lots of code. @alex4747-pub 's comment above could also be resolved. No idea how much work it would be, but why not treat del as ending a variable, allowing any redefinition afterwards? If branch analysis for del positioning is too hard, a good first step would be to allow just same-block del to work?

Examples of how this is useful:

One common issue in this thread and other bugs:

for x in some_list:
    ...

for x in other_list:
    ...

MyPy complains about redefinition of x with a different type, but this is a common pattern. Sure, renaming those variables is probably a good idea if they hold different types, but there's a whole class of bugs where x is used outside of a loop unintentionally, leading to strange behavior. It would actually be safer to write:

for x in some_list:
    ...
del x

for x in other_list:
    ...
del x

These dels ensure at runtime that nobody will accidentally use the loop variable afterwards (and if someone wanted to do so intentionally they could just leave off the del). This would also somewhat satisfy the wish for scoping (in a Pythonic way) expressed here:

#6232 (comment)

So if mypy matched the semantics of del at runtime, people who run into this would find pointers to del their variables and would end up with cleaner code.

My actual use case is conditionals in a loop; I'm doing parsing and multiple branches want to use the same variable names (with the SAME types) but declaring the type locally is not allowed in the second instance. A toy example:

for line in toParse:
    if lineType(line) == 'a':
        name: str = line.split()[0]
    elif lineType(line) == 'b':
        name: str = line.split()[1]

Here mypy complains that the second ": str "is a redefinition, which it is, and I can get it to be quiet if I just remove the type hint. But when the two cases are dozens of lines of code apart, I'd really like to have the second type hint, because conceptually they're different variables. And in fact, the situation is smelly since the variables can leak, and would be improved as:

for line in toParse:
    if lineType(line) == 'a':
        name: str = line.split()[0]
        ...
        del name
    elif lineType(line) == 'b':
        name: str = line.split()[1]
        ...
        del name

With this new code, I've explicitly tied down the variables that should be scoped to individual blocks. I won't run into errors where I accidentally use 'name' without defining it somewhere else, and instead of a warning from my linter I end up using a value from one of these inner cases. But even with the del there, mypy still complains about the second type hint. If mypy just respected del (which I know, might be hard to implement) then we'd be funneling users towards this positive pattern, and could say "just del it" to a number of recurring questions about variable reuse (not all of them, since x = str(x) doesn't have space for a del and is conceptually different).

If I should put this in as a separate feature request I'm happy to re-file it.

Note despite what is written here:

#2806 (comment)

Python at runtime gives a NameError if one attempts to use a deled variable.

One more note: mypy behaves strangely for this program:

x = 1

del x

x: str = 'hi'

print(x)

It both complains about redefinition, and claims the print is an attempt to access a deleted variable. If run with --allow-redefinition then it actually drops both complains, which is good, but also indicates that the support for detecting variable deletions (at least in same block) may already exist?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests