Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrow tuple types using len() #16237

Merged
merged 20 commits into from
Oct 21, 2023
Merged

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Oct 8, 2023

Fixes #1178
Supersedes #10367

This is includes implementation for fixed length tuples, homogeneous tuples, and variadic tuples (and combinations of those). Generally implementation is straightforward. Some notes:

  • Unfortunately, it is necessary to add a new attribute min_len to TypeVarTupleType, which is probably fine, as it doesn't have that many attributes so far.
  • Supporting more general use cases (like > comparisons for variadic tuples) can cause quick proliferation of unions. I added two mechanisms to counteract this: not applying the narrowing if the integer literal in comparison is itself large, and collapsing unions of tuples into a single tuple (if possible) after we are done with the narrowing. This looks a bit arbitrary, but I think it is important to have.
  • Main missing feature here is probably not inferring type information from indirect comparisons like len(x) > foo() > 1. Supporting this kind of things in full generality is cumbersome, and we may add cases that turn out to be important later.
  • Note I am quite careful with indexing "inside" a TypeVarTuple, it is not really needed now, but I wanted to make everything future proof, so that it will be easy to add support for upper bounds for TypeVarTuples, like Nums = TypeVarTuple("Nums", bound=tuple[float, ...]).
  • I also fix couple existing inconsistencies with Any handling in type narrowing. It looks like they stem from the old incorrect logic that meet of Any and X should be X, while in fact it should be Any. These fixes are not strictly necessary, but otherwise there may be new false positives, because I introduce a bunch of additional type narrowing scenarios here.

cc @hatal175, thanks for the test cases, and for your nice first attempt to implement this!
Co-authored-by: Tal Hayon talhayon1@gmail.com

@github-actions

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

OK, output of the primer is interesting, it turns pattern like this is quite common:

x: tuple[int, ...]
x if len(x) > 1 else x[0]

Now we infer tuple[()] | tuple[int] in the else branch, causing Tuple index out of range for the case of empty tuple. On one hand this is technically a legitimate error, on the other hand it may be annoying. Completely abandoning precise narrowing from homogeneous tuples would be sad. So I see three possible (non-exclusive) things:

  • Support plain assert x as an alias for assert len(x) > 0 to simplify avoiding this error
  • Special-case 0 index in some cases (i.e. assume that a homogeneous tuple is always non-empty)
  • Postpone this part until Unpack is enabled by default. Without Unpack there are couple actual false positives, and also it is impossible to express "tuple with at least one item".

@tusharsadhwani
Copy link
Contributor

tusharsadhwani commented Oct 8, 2023

assume that a homogeneous tuple is always non-empty

This makes sense to me. This can be a flag (--warn-empty-tuple/--disallow-empty-tuple-index) probably?

@tusharsadhwani
Copy link
Contributor

Postpone this part until Unpack is enabled by default

I don't think defining "tuple of at least one item" with Unpack will catch on. Seems too convoluted of a type for a case that's pretty much always true.

@ilevkivskyi
Copy link
Member Author

This can be a flag (--warn-empty-tuple/--disallow-empty-tuple-index) probably?

An entry barrier for a new flag is high, OTOH a separate error code (off by default) is possible. I would propose to postpone this decision until we will enable Unpack, there are more important things now.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

It looks like pandas is hitting a weird corner case with

x: Any
if isinstance(x, (list, tuple)) and len(x) == 0:
    ...
else:
    reveal_type(x)  # Revealed type is "Union[Any, builtins.tuple[Any, ...], builtins.list[Any]]" ???

I will take a look tomorrow.

@ilevkivskyi
Copy link
Member Author

It turns out this is an existing issue. I have found some similar cases with and operator on existing narrowing checks, for example:

t: Any
if isinstance(t, (list, tuple)) and isinstance(t, tuple):
    ...
else:
    reveal_type(t)  # Revealed type is "Union[Any, builtins.list[Any]]" ???

I decided to still try to fix this, to avoid creating what may look like a new false positive. I will update PR description to mention this.

@github-actions

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

As discussed with @JukkaL offline, I am hiding the precise tuple types that caused new errors for code like this:

x: tuple[int, ...]
x if len(x) > 1 else x[0]

behind a separate experimental feature PreciseTupleTypes. I am going to add it to the docs in a separate PR (just to keep this one smaller), where I will also enable Unpack/TypeVarTuple. In that PR I will also expand the docs on --enable-incomplete-feature flag (and corresponding config file option).

In general, we will be using this flag more liberally to experiment with various type system features.

@github-actions

This comment has been minimized.

@ilevkivskyi
Copy link
Member Author

@JukkaL @jhance Last chance to review this. It is open for almost two weeks now, so I am merging this ~tomorrow unless I will see some objections.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a useful feature and it seems like a logical extension of the existing type narrowing rules. Feel free to merge once you've addressed comments that you feel are relevant.

if len(x) < 30:
reveal_type(x) # N: Revealed type is "builtins.tuple[builtins.int, ...]"
else:
reveal_type(x) # N: Revealed type is "builtins.tuple[builtins.int, ...]"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use --enable-incomplete-feature=PreciseTupleTypes in this test case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point, otherwise we don't really test much.

else:
reveal_type(x1) # N: Revealed type is "Any"
reveal_type(x1) # N: Revealed type is "Any"
[builtins fixtures/len.pyi]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if these test cases exist -- they might be useful if we don't have them yet:

  • len(x) == n where n has an unsupported type, such as int or Any
  • Trying to narrow a union of named tuple types, or tuple subclasses in general
  • Trying to narrow a subclass of a variable-length tuple
  • Narrowing a union of different variable-length tuples
  • Narrowing with a type such as Literal[3] (i.e. an explicit literal type) or a union of literal types

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these are not supported. But yes, I think it makes sense to test these anyway:

Trying to narrow a union of named tuple types, or tuple subclasses in general
Trying to narrow a subclass of a variable-length tuple

I decided to not add general support various subclasses for now, since this may create a "mismatch" between the fallback and the count in tuple type. IIRC I left a TODO where this is decided (I use .fullname == ... instead of .has_base(...)). But now looking at this again, I think we can support at least cases that do not create new types, but just select items from a union (and maybe also a fixed size tuples with == from variadic ones). I will check if it is easy to allow.

Narrowing with a type such as Literal[3] (i.e. an explicit literal type) or a union of literal types

I think only a single literal is supported. We can add support for unions of literals, but I guess it will be needed quite rarely. There is already a test case for Final, I will add similar for plain Literal.

) -> None:
super().__init__(name, fullname, id, upper_bound, default, line=line, column=column)
self.tuple_fallback = tuple_fallback
# This value is not settable by a user. It is an internal-only thing to support
# len()-narrowing of variadic tuples.
self.min_len = min_len
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to support this in subtyping, join etc. operations? If there is no clear use case, no need to bother with these, since this is not directly exposed to users. If we'd use this more generally, we'd need to include this in the textual representation, perhaps, which seems something to avoid, since this is not specified by the PEP, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only handle this in subtyping (which btw induces correct behavior for meet), joins are not handled, but I think joins of TypeVarTuples are rare. In general, this is supposed to be "short lived". I wanted to support basic use case where there is an assert len(args) > 1 followed by some code that uses e.g. args[1].

mypy/checker.py Outdated
@@ -228,6 +229,9 @@

DEFAULT_LAST_PASS: Final = 1 # Pass numbers start at 0

# Maximum length of fixed tuple types inferred when narrowing from variadic tuples.
MAX_PRECISE_TUPLE_SIZE: Final = 15
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit high to me, but I don't have a strong opinion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 is another option that I was thinking about.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

zetta_utils (https://github.com/ZettaAI/zetta_utils)
+ zetta_utils/layer/volumetric/frontend.py:48: error: Unused "type: ignore" comment  [unused-ignore]
+ zetta_utils/layer/volumetric/frontend.py:49: error: Unused "type: ignore" comment  [unused-ignore]
+ zetta_utils/layer/volumetric/frontend.py:52: error: Unused "type: ignore" comment  [unused-ignore]
+ zetta_utils/layer/volumetric/frontend.py:54: error: Unused "type: ignore" comment  [unused-ignore]
+ zetta_utils/layer/volumetric/frontend.py:77: error: Unused "type: ignore" comment  [unused-ignore]

black (https://github.com/psf/black)
+ src/blib2to3/pgen2/tokenize.py:265:29: error: Redundant cast to "Tuple[int, str]"  [redundant-cast]
+ src/blib2to3/pgen2/tokenize.py:267:49: error: Redundant cast to "Tuple[int, str, Tuple[int, int], Tuple[int, int], str]"  [redundant-cast]

prefect (https://github.com/PrefectHQ/prefect)
- src/prefect/utilities/collections.py:192: error: Value of type "type[T] | tuple[type[T], ...]" is not indexable  [index]
+ src/prefect/utilities/collections.py:192: error: Value of type "type[T] | tuple[type[T]]" is not indexable  [index]

Tanjun (https://github.com/FasterSpeeding/Tanjun)
- tanjun/clients.py:3139: error: Redundant cast to "Iterable[Any]"  [redundant-cast]

jax (https://github.com/google/jax)
+ jax/_src/lax/other.py:204: error: Redundant cast to "Precision | None"  [redundant-cast]

rich (https://github.com/Textualize/rich)
+ rich/padding.py:69: error: Redundant cast to "tuple[int, int]"  [redundant-cast]
+ rich/padding.py:72: error: Redundant cast to "tuple[int, int, int, int]"  [redundant-cast]

ibis (https://github.com/ibis-project/ibis)
- ibis/expr/types/relations.py:3974: error: Item "Iterable[str]" of "Iterable[str] | Any | None" has no attribute "sort_values"  [union-attr]
+ ibis/expr/types/relations.py:3974: error: Item "Iterable[str]" of "Iterable[str] | None" has no attribute "sort_values"  [union-attr]
- ibis/expr/types/relations.py:3974: error: Item "None" of "Iterable[str] | Any | None" has no attribute "sort_values"  [union-attr]
+ ibis/expr/types/relations.py:3974: error: Item "None" of "Iterable[str] | None" has no attribute "sort_values"  [union-attr]
- ibis/expr/types/relations.py:3974: error: Item "Iterable[str]" of "Iterable[str] | Any | None" has no attribute "columns"  [union-attr]
+ ibis/expr/types/relations.py:3974: error: Item "Iterable[str]" of "Iterable[str] | None" has no attribute "columns"  [union-attr]
- ibis/expr/types/relations.py:3974: error: Item "None" of "Iterable[str] | Any | None" has no attribute "columns"  [union-attr]
+ ibis/expr/types/relations.py:3974: error: Item "None" of "Iterable[str] | None" has no attribute "columns"  [union-attr]
- ibis/expr/types/relations.py:3980: error: Item "Iterable[str]" of "Iterable[str] | Any | None" has no attribute "columns"  [union-attr]
+ ibis/expr/types/relations.py:3980: error: Item "Iterable[str]" of "Iterable[str] | None" has no attribute "columns"  [union-attr]
- ibis/expr/types/relations.py:3980: error: Item "None" of "Iterable[str] | Any | None" has no attribute "columns"  [union-attr]
+ ibis/expr/types/relations.py:3980: error: Item "None" of "Iterable[str] | None" has no attribute "columns"  [union-attr]
- ibis/expr/types/relations.py:3982: error: Item "Iterable[str]" of "Iterable[str] | Any | None" has no attribute "itertuples"  [union-attr]
+ ibis/expr/types/relations.py:3982: error: Item "Iterable[str]" of "Iterable[str] | None" has no attribute "itertuples"  [union-attr]
- ibis/expr/types/relations.py:3982: error: Item "None" of "Iterable[str] | Any | None" has no attribute "itertuples"  [union-attr]
+ ibis/expr/types/relations.py:3982: error: Item "None" of "Iterable[str] | None" has no attribute "itertuples"  [union-attr]
- ibis/backends/base/sql/__init__.py:128: error: Argument 2 to "SQLQueryResult" has incompatible type "Schema | Any | None"; expected "Schema"  [arg-type]
+ ibis/backends/base/sql/__init__.py:128: error: Argument 2 to "SQLQueryResult" has incompatible type "Schema | None"; expected "Schema"  [arg-type]

bokeh (https://github.com/bokeh/bokeh)
+ src/bokeh/colors/color.py:294: error: Unused "type: ignore" comment  [unused-ignore]
+ src/bokeh/colors/color.py:297: error: Unused "type: ignore" comment  [unused-ignore]
+ src/bokeh/server/tornado.py:432: error: Unused "type: ignore" comment  [unused-ignore]

mongo-python-driver (https://github.com/mongodb/mongo-python-driver)
+ bson/decimal128.py:228: error: Unused "type: ignore" comment  [unused-ignore]
+ pymongo/server.py:281: error: Unused "type: ignore" comment  [unused-ignore]
+ pymongo/server.py:284: error: Unused "type: ignore" comment  [unused-ignore]

aiohttp (https://github.com/aio-libs/aiohttp)
+ aiohttp/resolver.py:47: error: Unused "type: ignore" comment  [unused-ignore]

anyio (https://github.com/agronholm/anyio)
+ src/anyio/_core/_sockets.py:652: error: Redundant cast to "tuple[str, int, int, int]"  [redundant-cast]
+ src/anyio/_core/_sockets.py:664: error: Redundant cast to "tuple[str, int]"  [redundant-cast]

vision (https://github.com/pytorch/vision)
+ torchvision/models/_api.py:178: error: Unused "type: ignore" comment  [unused-ignore]

pylint (https://github.com/pycqa/pylint)
+ pylint/checkers/base_checker.py:191: error: Unused "type: ignore" comment  [unused-ignore]
+ pylint/checkers/base_checker.py:194: error: Unused "type: ignore" comment  [unused-ignore]

streamlit (https://github.com/streamlit/streamlit)
+ lib/streamlit/elements/deck_gl_json_chart.py: note: In function "_get_pydeck_tooltip":
+ lib/streamlit/elements/deck_gl_json_chart.py:149:9: error: Returning Any from function declared to return "Optional[Dict[str, str]]"  [no-any-return]
+ lib/streamlit/elements/arrow_altair.py: note: In function "_get_color_encoding":
+ lib/streamlit/elements/arrow_altair.py:1355:17: error: Statement is unreachable  [unreachable]

urllib3 (https://github.com/urllib3/urllib3)
+ src/urllib3/fields.py:228: error: Redundant cast to "tuple[str, str | bytes, str]"  [redundant-cast]
+ src/urllib3/fields.py:232: error: Redundant cast to "tuple[str, str | bytes]"  [redundant-cast]

discord.py (https://github.com/Rapptz/discord.py)
- discord/ext/commands/converter.py:1048: error: Unused "type: ignore" comment  [unused-ignore]
- discord/ext/commands/cog.py:249: error: "Callable[..., CoroutineType[Any, Any, Any]]" has no attribute "__cog_listener_names__"  [attr-defined]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unpacking tuples of variable length
3 participants