-
-
Notifications
You must be signed in to change notification settings - Fork 3k
Improve Tuple/Sequence/Iterable overlapping #5315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm not Ivan, but I'm also bored so here's a code review.
Apologies in advance if the review ended up being incoherent in places -- I'm sort of bouncing between different things atm.
mypy/meet.py
Outdated
@@ -192,9 +203,8 @@ def is_overlapping_tuples(t: Type, s: Type, use_promotions: bool) -> Optional[bo | |||
if all(is_overlapping_types(ti, si, use_promotions) | |||
for ti, si in zip(t.items, s.items)): | |||
return True | |||
# TupleType and non-tuples do not overlap | |||
return False | |||
# No tuples are involved here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think getting rid of the return False
introduces a regression here -- we would no longer return 'False' when checking to see if Tuple[A]
and Tuple[B]
are overlapping, for example.
I think this is probably hard to trigger because mypy doesn't ever really call is_overlapping_types
directly/usually runs other checks first, but here's one example program:
from typing import *
class A: pass
class B: pass
possibles: Tuple[int, Tuple[A]]
x: Optional[Tuple[B]]
if x in possibles:
reveal_type(x)
else:
reveal_type(x)
Previously, this used to reveal Union[Tuple[B], None]
on both lines -- now, the first reveal_type
will incorrectly reveal Tuple[B]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this and added a regression test. Thank you for pointing this out.
mypy/meet.py
Outdated
if isinstance(t, TupleType): | ||
if isinstance(s, Instance) and is_subtype(t, s): | ||
if all(is_overlapping_types(ti, si, use_promotions) | ||
for ti, si in zip(t.items, s.args)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this zip quite does what you want -- if somebody were to call is_overlapping_types(Tuple[int, str], Sequence[int])
, the zip would be zipping together [int, str]
and [int]
-- the first list is truncated so we only end up comparing the pair (int, int)
.
I also think having both the subtype check on line 146 and the overlapping check on lines 147-148 are unnecessary -- I believe is_subtype(t, s)
implies that t
is overlapping with s
.
The converse isn't necessarily true though -- it's possible somebody somebody could pass in two types that ought to overlap despite not being a subtype of one another: for example, is_overlapping_types(Union[A, B], Union[B, C])
. So if the former call is accepted, we probably ought to accept is_overlapping_types(Iterable[Union[A, B]], Tuple[Union[B, C]])
as well for symmetry.
That said, I think this is a pretty extreme edge case, and mypy currently calls is_overlapping_types
under very limited circumstances anyways. In that case, it maybe might be worth just simplifying the logic here so it just defers to calling is_subtype(...)
and doesn't bother recursively checking whatever's inside the containers.
So, maybe just something like if is_subtype(t, s) or is_subtype(s, t): return True
.
mypy/meet.py
Outdated
if isinstance(s, TupleType): | ||
if isinstance(t, Instance) and is_subtype(s, t): | ||
if all(is_overlapping_types(ti, si, use_promotions) | ||
for ti, si in zip(s.items, t.args)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, ti
corresponds to s.items
and si
corresponds to t.args
, which is a little confusing. We should probably swap one of the expressions to keep things consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have few things to add here.
mypy/meet.py
Outdated
return False | ||
# No tuples are involved here | ||
# TupleType and non-tuples are handled later | ||
# Otherwise, no tuples are involved | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it still make sense to return None
here instead of False
here?
mypy/meet.py
Outdated
return True | ||
|
||
if is_subtype(t, s) or is_subtype(s, t): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one hand I would like to see this general case somewhere at the top of this function as an obvious case of overlap. However the fact that such an obvious case is still not here makes me think of "Chesterton's fence". Is there any danger of putting this here (like performance costs)?
An alternative may be to put return is_subtype(t, s) or is_subtype(s, t)
instead of return False
near the end of is_overlapping_tuples
.
If you think the performance is still fine, then I would keep this check here (or maybe even higher above), but add few more tests that cover the other (non-tuple) situations not covered by existing if
branches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree performance is something we might need to think about here, since we're basically traversing the types three times instead of once now. I think the subtype cache might help mitigate this, though of course it's hard to say for certain w/o benchmarking.
I suspect the other reason why the function doesn't already do something like this was because it was originally intended to be super-simple/lightweight -- here's a snapshot from 2015 and from 2016, for example. I guess it then just steadily accumulated patches over the years?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly have a great idea of how I could benchmark this, but I am happy to move it to the end if you think that is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly have a great idea of how I could benchmark this
A good benchmark is self checking mypy mypy
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self check is a decent benchmark, I never feel like mypy is large enough to see a difference, but then again, I often forget how large mypy is.
Anyway, the selfcheck test seems to perform the same within a few hundredths of a second on master and this branch (without cache in both cases), so I am not concerned about performance regressions.
mypy/meet.py
Outdated
return False | ||
# No tuples are involved here | ||
return None | ||
return is_subtype(t, s) or is_subtype(s, t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, the selfcheck test seems to perform the same within a few hundredths of a second on master and this branch (without cache in both cases), so I am not concerned about performance regressions.
Did you measure performance when this check is here or at the top of the is_overlapping_types
? If at the top of is_overlapping_types
, then I would rather add if is_subtype(t, s) or is_subtype(s, t): return True
at the top of is_overlapping_types
(after builtins.object
check) and leave this function completely "intact" then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was indeed measured from the top. I just added it after the object check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for a delay with this! Feel free to merge this after you consider one last comment.
@Michael0x2a could you please double-check this PR doesn't cause problems in internal codebases?
mypy/meet.py
Outdated
if tup_overlap is not None: | ||
return tup_overlap | ||
if is_overlapping_tuples(t, s, use_promotions): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is now unnecessary (and the corresponding change below) after you added the check at the top of this function.
@ilevkivskyi -- sure! I have to go to a thing, but I can re-run these tests a little later tonight. @ethanhs -- I'm not sure if your PR will run into this case or not, but I remember running into this edge case while working on my other PR yesterday. Could you try checking to see what your PR does on this test case? from typing import *
# E.g. maybe this was imported from a non-typed module
Parent: Any
class Child(Parent):
def some_method(self) -> int:
if self is None:
return None
return 3 I believe the current behavior of mypy is to report no errors when running in strict-optional mode. If your PR complains about the "return None", you may have to swap out the two calls to (The actual code that I synthesized this from does |
Maybe I'm missing something, but shouldn't that code sample produce an error under strict optional? It returns None but is declared to return int. |
@JelleZijlstra mypy infers the branch as unreachable. |
@Michael0x2a thanks! That was indeed an issue, and I also reverted the unneeded changes as suggested by Ivan. |
Fixes #4466
@ilevkivskyi hopefully this is similar to what you were expecting?