mypy does not recognize conversion between fixed length list and tuple #7509

boompig · 2019-09-13T19:35:53Z

Are you reporting a bug, or opening a feature request?

I believe this is a bug

Please insert below the code you are checking with mypy,
or a mock-up repro if the source is private. We would appreciate
if you try to simplify your case to a minimal repro.

Minimal example to reproduce:

from typing import List, Tuple


def get_point_as_list() -> List[int]:
    return [1, 2]


def point_to_tuple(pt: List[int]) -> Tuple[int, int]:
    assert len(pt) == 2
    return tuple(pt)


if __name__ == "__main__":
    l = get_point_as_list()
    p = point_to_tuple(l)
    print(repr(p))

What is the actual behavior/output?

mypy test.py

test.py:10: error: Incompatible return value type (got "Tuple[int, ...]", expected "Tuple[int, int]")

What is the behavior/output you expect?

mypy should evaluate this as correct code

What are the versions of mypy and Python you are using?
Do you see the same issue after installing mypy from Git master?

$ mypy --version
mypy 0.720

$ python --version
Python 3.7.3

The text was updated successfully, but these errors were encountered:

JukkaL · 2019-09-16T12:26:45Z

This is a design limitation of mypy. Mypy currently doesn't even try to keep track of the lengths of list objects. However, issues similar to this come up every once in a while, and this is something mypy might support at some point in the future.

You can work around this limitation by using a cast or a # type: ignore comment.

jamesohortle · 2019-11-18T01:28:31Z

Is there an issue for Sequences of fixed length that I can watch for developments? I am also having the same issues and can see a few use cases (e.g., points in Euclidean space, etc). I mostly just use Sequence[<type>, ...] for my own code.

JukkaL · 2019-11-18T11:46:15Z

We can use this issue to track both fixed-length lists and sequences for now.

simsa-st · 2020-06-05T12:27:06Z

I would also appreciate this feature. My use-case is even simpler, converting fixed-length tuples to fixed-length tuples. Specifically I would like this to work:

from typing import Tuple

t: Tuple[int, int] = (0, 0)
t = tuple(x for x in t)

NeilGirdhar · 2020-06-22T22:13:35Z

This comes up a lot when you're splitting strings:

    return dict(kv.split('=') for kv in ['a=b', 'cdf=ghi'])

It might be nice to have a special split method on string like split_exact(sep: str, n: int) -> SizedTuple[n, str] that somehow binds the n from the call to the SizedTuple. My proposal is impossible unless the issue of fixed length lists or tuples is solved.

thomasgilgenast · 2020-07-05T22:58:04Z

I have run into this as well and wanted to summarize my understanding of the three possible workarounds based on the comments from @JukkaL, @simsa-st, and @NeilGirdhar above.

Rewrite code to avoid lists and generators

A workaround that can be used even in the absence of SizedTuple is to manually write a fixed-size version of the size-naive function you were previously calling. For example, to handle @NeilGirdhar's string splitting use case:

def split_2(s: str, on: str) -> Tuple[str, str]:
    pieces = s.split(on)
    return (pieces[0], pieces[1])

def h() -> Dict[str, str]:
    return dict(split_2(kv, '=') for kv in ['a=b', 'cdf=ghi'])

One caveat here is that the return value of str.split() is not sized, so this will not catch the IndexError thrown when the string you are splitting does not have any = in it. I don't think this kind of IndexError is possible to catch before runtime, though.

Similarly when sorting or doubling every element in a tuple, you can do:

T = TypeVar('T', str, int, float) 

def sort_2(x: Tuple[T, T]) -> Tuple[T, T]:
    if x[0] <= x[1]:
        return x
    return (x[1], x[0])

def double_2(x: Tuple[T, T]) -> Tuple[T, T]:
    #return tuple(y * 2 for y in x)  # can't use generator in mypy
    return (x[0] * 2, x[1] * 2)

def g(x: Tuple[str, str]) -> Tuple[str, str]:
    return sort_2(double_2(x))

In this example sort_2() avoids creating a list unlike the builtin sorted(), and double_2() avoids using a generator (which is I think where @simsa-st's example loses the size information).

In summary, neither lists nor generators have lengths in mypy, but it is often possible to rewrite code so that it avoids using lists or generators and uses only tuples (which can have lengths in mypy).

Ignore the mypy error via comment or cast

An alternative perspective is that sizing is "too much work" or "too unpythonic" to implement this way (we do love generator expressions after all), and so instead of the previous example, we can write (based on @JukkaL's suggestion):

def g_ignore(x: Tuple[str, str]) -> Tuple[str, str]:
    return tuple(sorted(y * 2 for y in x))  # type: ignore

or

def g_ignore(x: Tuple[str, str]) -> Tuple[str, str]:
    return cast(Tuple[str, str], tuple(sorted(y * 2 for y in x)))

These still gain the "benefits" of a sized return type, i. e.

def f() -> str:
    return g_ignore(('t', 's'))[2]

still causes mypy to throw error: Tuple index out of range as expected.

My preliminary understanding is that the main drawback of ignore/cast is that each one you write creates the possibility of false negatives occuring in the future whenever the surrounding code is modified.

Accept the unsized tuple type

If you don't care about catching the index out of range error, you can write:

def g_nosize(x: Tuple[str, ...]) -> Tuple[str, ...]:
    return tuple(sorted(y * 2 for y in x))

which passes mypy but fails to catch the index out of range error in f().

This workaround does not apply to @NeilGirdhar's use case, because dict() does not accept an unsized input.

NeilGirdhar · 2020-07-05T23:22:02Z

@thomasgilgenast That's a good summary of what we can do right now, but we don't want to work around this problem. We want mypy to keep track of sequence lengths. We want:

x: Sequence[int]
if len(x) == 2:
    y: Tuple[int, int] = tuple(x)

We want this to happen without any casting or ignoring or auxilliary functions.

dandersson · 2020-10-12T15:31:08Z

This comes up a lot when you're splitting strings:
    return dict(kv.split('=') for kv in ['a=b', 'cdf=ghi'])

Yes, I found this thread after having attempted to simplify

    return {k: v for k, v in (var.split('=', maxsplit=1) for var in env)}

into

    return dict(var.split('=', maxsplit=1) for var in env)

and then saw the Mypy-complaints.

So yet another workaround that passes at least Mypy 0.720 checks is to go via a dictionary comprehension with tuple unpacking like the original form above. That will instead trigger a Pylint warning:

R1721: Unnecessary use of a comprehension (unnecessary-comprehension)

so for Pylint users not much will be won in any case.

This adds coverage for most of the public API, but there are still some areas that aren't covered yet. * script_host.py is ignored entirely * host.py is ignored entirely * the msgpack_rpc submodule has partial coverage * there are some Any annotations sprinkled around on some of the more complex functions Funtionality should remain largely unchanged. Notable exceptions are: * Buffer.mark() and Window.cursor now return a tuple instead of a list. This is because there is currently no type for a fixed-size list (python/mypy#7509)

### **SUMMARY:** It is unnecessary to perform `n + 1` calls to `cast` (one cast for each parameter name-value pair and one cast for the filter generator itself) in a dictionary comprehension in an effort to avoid mypy errors. Previously, the `cast` to `Tuple[str, str]` was necessary to prevent mypy from complaining that we are trying to create a dictionary out of lists as opposed to tuples (see the mypy issue [here](python/mypy#7509)). However, a `cast` is both adding unnecessary overhead due to the function call and should generally only be used when a linter is unable to properly infer the type of a variable, not to "lie" to it about the type. We can avoid this by instead using a generator within the dictionary comprehension and then indexing into it twice to produce tuples of size 2; mypy recognizes this as a valid dictionary initialization. The above change is much more performant than the previous version of the code. Timing the two versions of the dictionary construction yielded the following results: ``` >python -m timeit -s "from typing import cast, Dict, Tuple, Iterable" -n 100000 -p "dict(cast(Tuple[str, str], pair.split('=')) for pair in cast(Iterable[str], filter(None, 'rank=3&world_size=5'.split('&'))))" 100000 loops, best of 5: 2.66 usec per loop >python -m timeit -n 100000 -p "dict((pair[0], pair[1]) for pair in (pair.split('=') for pair in filter(None, 'rank=3&world_size=5'.split('&'))))" 100000 loops, best of 5: 1.09 usec per loop ``` The `cast` to `Iterable[str]` is similarly a "lie" that is not necessary. It is best to be as transparent as possible to the linter rather than using `cast` to eliminate errors. This actually does not even produce any mypy errors when removed in isolation from the other changes. Further, it is good practice to type hint the return value of a function rather than specifying the type of the return value inside the function. Thus, the unnecessary intermediate variable `query_dict` inside `_query_to_dict` was removed, and the type hint of the return value was moved to the function declaration. The type of the argument `query` is specified as `str`. ### **EDITS (additional commits):** [The sole type hint for `query_dict` (in `_env_rendezvous_handler`) was removed to match all other functions in the file.](76d78bf) [Incorrect typing is fixed for _env_rendezvous_handler typing so that `rank`, `world_size`, `master_port`, and `master_addr` are specified to be `int`, `int`, `int`, and `str`, respectively.](3cc5844) Pull Request resolved: #75959 Approved by: https://github.com/kumpera, https://github.com/mrshenli

) Summary: ### **SUMMARY:** It is unnecessary to perform `n + 1` calls to `cast` (one cast for each parameter name-value pair and one cast for the filter generator itself) in a dictionary comprehension in an effort to avoid mypy errors. Previously, the `cast` to `Tuple[str, str]` was necessary to prevent mypy from complaining that we are trying to create a dictionary out of lists as opposed to tuples (see the mypy issue [here](python/mypy#7509)). However, a `cast` is both adding unnecessary overhead due to the function call and should generally only be used when a linter is unable to properly infer the type of a variable, not to "lie" to it about the type. We can avoid this by instead using a generator within the dictionary comprehension and then indexing into it twice to produce tuples of size 2; mypy recognizes this as a valid dictionary initialization. The above change is much more performant than the previous version of the code. Timing the two versions of the dictionary construction yielded the following results: ``` >python -m timeit -s "from typing import cast, Dict, Tuple, Iterable" -n 100000 -p "dict(cast(Tuple[str, str], pair.split('=')) for pair in cast(Iterable[str], filter(None, 'rank=3&world_size=5'.split('&'))))" 100000 loops, best of 5: 2.66 usec per loop >python -m timeit -n 100000 -p "dict((pair[0], pair[1]) for pair in (pair.split('=') for pair in filter(None, 'rank=3&world_size=5'.split('&'))))" 100000 loops, best of 5: 1.09 usec per loop ``` The `cast` to `Iterable[str]` is similarly a "lie" that is not necessary. It is best to be as transparent as possible to the linter rather than using `cast` to eliminate errors. This actually does not even produce any mypy errors when removed in isolation from the other changes. Further, it is good practice to type hint the return value of a function rather than specifying the type of the return value inside the function. Thus, the unnecessary intermediate variable `query_dict` inside `_query_to_dict` was removed, and the type hint of the return value was moved to the function declaration. The type of the argument `query` is specified as `str`. ### **EDITS (additional commits):** [The sole type hint for `query_dict` (in `_env_rendezvous_handler`) was removed to match all other functions in the file.](76d78bf) [Incorrect typing is fixed for _env_rendezvous_handler typing so that `rank`, `world_size`, `master_port`, and `master_addr` are specified to be `int`, `int`, `int`, and `str`, respectively.](3cc5844) Pull Request resolved: #75959 Approved by: https://github.com/kumpera, https://github.com/mrshenli Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/d65ab9a689bc86fbdf589546dc47c69bee3bb971 Reviewed By: seemethere Differential Revision: D35902000 Pulled By: seemethere fbshipit-source-id: 0f7b357e4aca35945eef576ee13e047982de1e9e

TomFryers · 2022-09-08T21:51:53Z

Another similar nice thing to catch would be

def f(c: tuple[list[int], list[int], list[int]]):
    ...

f(tuple([1, 2] for _ in range(3)))

recognising that comprehensions into tuple over range(n) have length n.

Phil-Barber · 2023-02-09T12:27:29Z

Similarly:

StrictLenIntTuple = tuple[int, int, int, int]

def func(some_tuple: StrictLenIntTuple) -> None:
    return None

str_tuple = ('a', 'bc', 'def', 'ghij')
int_tuple = tuple(map(len, str_tuple))

func(int_tuple)  # Argument 1 to "func" has incompatible type "Tuple[int, ...]"; expected "Tuple[int, int, int, int]"

NeilGirdhar · 2023-02-09T21:26:20Z

@Phil-Barber Great example, but maybe that should be a separate issue? Since something like this would ideally work:

from typing import TypeVar

T = TypeVar('T')
def f(x: T) -> T: ...

class X: pass
class Y: pass
class Z: pass

x = (X(), Y(), Z())
y = tuple(map(f, x))
reveal_type(y)  # Ideally: tuple[X, Y, Z]

In other words, it doesn't just need to keep track of a count here.

This adds coverage for most of the public API, but there are still some areas that aren't covered yet. * script_host.py is ignored entirely * host.py is ignored entirely * the msgpack_rpc submodule has partial coverage * there are some Any annotations sprinkled around on some of the more complex functions Funtionality should remain largely unchanged. Notable exceptions are: * Buffer.mark() and Window.cursor now return a tuple instead of a list. This is because there is currently no type for a fixed-size list (python/mypy#7509)

JukkaL added false-positive mypy gave an error on correct code feature needs discussion priority-2-low labels Sep 16, 2019

hauntsaninja mentioned this issue Jul 11, 2020

feature request/idea: tuple length inference from assertions/tests on len() #9120

Closed

JelleZijlstra mentioned this issue Aug 16, 2020

Allow dict to take a Generator python/typeshed#4450

Closed

m-col mentioned this issue May 28, 2021

Complete type annotations for Qtile class qtile/qtile#2465

Merged

ishaan-mehta mentioned this issue Apr 18, 2022

Improving typing and typing-related performance in rendezvous.py pytorch/pytorch#75959

Closed

sahiljoster32 mentioned this issue Jun 1, 2022

Added Mypy type annotations to validation_decorators.py and job_result_transforms.py oppia/oppia#15501

Merged

4 tasks

grossardt mentioned this issue Feb 26, 2024

Improve mappers, add ModeBasedMapper qiskit-community/qiskit-nature#1301

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mypy does not recognize conversion between fixed length list and tuple #7509

mypy does not recognize conversion between fixed length list and tuple #7509

boompig commented Sep 13, 2019

JukkaL commented Sep 16, 2019

jamesohortle commented Nov 18, 2019

JukkaL commented Nov 18, 2019

simsa-st commented Jun 5, 2020

NeilGirdhar commented Jun 22, 2020

thomasgilgenast commented Jul 5, 2020

NeilGirdhar commented Jul 5, 2020 •

edited

Loading

dandersson commented Oct 12, 2020

TomFryers commented Sep 8, 2022

Phil-Barber commented Feb 9, 2023

NeilGirdhar commented Feb 9, 2023

mypy does not recognize conversion between fixed length list and tuple #7509

mypy does not recognize conversion between fixed length list and tuple #7509

Comments

boompig commented Sep 13, 2019

JukkaL commented Sep 16, 2019

jamesohortle commented Nov 18, 2019

JukkaL commented Nov 18, 2019

simsa-st commented Jun 5, 2020

NeilGirdhar commented Jun 22, 2020

thomasgilgenast commented Jul 5, 2020

Rewrite code to avoid lists and generators

Ignore the mypy error via comment or cast

Accept the unsized tuple type

NeilGirdhar commented Jul 5, 2020 • edited Loading

dandersson commented Oct 12, 2020

TomFryers commented Sep 8, 2022

Phil-Barber commented Feb 9, 2023

NeilGirdhar commented Feb 9, 2023

NeilGirdhar commented Jul 5, 2020 •

edited

Loading