Improve Tuple/Sequence/Iterable overlapping #5315

emmatyping · 2018-07-04T03:48:23Z

@ilevkivskyi hopefully this is similar to what you were expecting?

Michael0x2a

Well, I'm not Ivan, but I'm also bored so here's a code review.

Apologies in advance if the review ended up being incoherent in places -- I'm sort of bouncing between different things atm.

Michael0x2a · 2018-07-04T23:43:43Z

mypy/meet.py

@@ -192,9 +203,8 @@ def is_overlapping_tuples(t: Type, s: Type, use_promotions: bool) -> Optional[bo
                if all(is_overlapping_types(ti, si, use_promotions)
                       for ti, si in zip(t.items, s.items)):
                    return True
-        # TupleType and non-tuples do not overlap
-        return False
-    # No tuples are involved here


I think getting rid of the return False introduces a regression here -- we would no longer return 'False' when checking to see if Tuple[A] and Tuple[B] are overlapping, for example.

I think this is probably hard to trigger because mypy doesn't ever really call is_overlapping_types directly/usually runs other checks first, but here's one example program:

from typing import * class A: pass class B: pass possibles: Tuple[int, Tuple[A]] x: Optional[Tuple[B]] if x in possibles: reveal_type(x) else: reveal_type(x)

Previously, this used to reveal Union[Tuple[B], None] on both lines -- now, the first reveal_type will incorrectly reveal Tuple[B].

I fixed this and added a regression test. Thank you for pointing this out.

Michael0x2a · 2018-07-04T23:50:11Z

mypy/meet.py

+    if isinstance(t, TupleType):
+        if isinstance(s, Instance) and is_subtype(t, s):
+            if all(is_overlapping_types(ti, si, use_promotions)
+                    for ti, si in zip(t.items, s.args)):


I don't think this zip quite does what you want -- if somebody were to call is_overlapping_types(Tuple[int, str], Sequence[int]), the zip would be zipping together [int, str] and [int] -- the first list is truncated so we only end up comparing the pair (int, int).

I also think having both the subtype check on line 146 and the overlapping check on lines 147-148 are unnecessary -- I believe is_subtype(t, s) implies that t is overlapping with s.

The converse isn't necessarily true though -- it's possible somebody somebody could pass in two types that ought to overlap despite not being a subtype of one another: for example, is_overlapping_types(Union[A, B], Union[B, C]). So if the former call is accepted, we probably ought to accept is_overlapping_types(Iterable[Union[A, B]], Tuple[Union[B, C]]) as well for symmetry.

That said, I think this is a pretty extreme edge case, and mypy currently calls is_overlapping_types under very limited circumstances anyways. In that case, it maybe might be worth just simplifying the logic here so it just defers to calling is_subtype(...) and doesn't bother recursively checking whatever's inside the containers.

So, maybe just something like if is_subtype(t, s) or is_subtype(s, t): return True.

Michael0x2a · 2018-07-04T23:51:16Z

mypy/meet.py

+    if isinstance(s, TupleType):
+        if isinstance(t, Instance) and is_subtype(s, t):
+            if all(is_overlapping_types(ti, si, use_promotions)
+                    for ti, si in zip(s.items, t.args)):


Currently, ti corresponds to s.items and si corresponds to t.args, which is a little confusing. We should probably swap one of the expressions to keep things consistent.

ilevkivskyi

I think I have few things to add here.

ilevkivskyi · 2018-07-05T15:23:35Z

mypy/meet.py

        return False
-    # No tuples are involved here
+    # TupleType and non-tuples are handled later
+    # Otherwise, no tuples are involved
    return None


Does it still make sense to return None here instead of False here?

ilevkivskyi · 2018-07-05T16:06:25Z

mypy/meet.py

+        return True
+
+    if is_subtype(t, s) or is_subtype(s, t):
+            return True


On one hand I would like to see this general case somewhere at the top of this function as an obvious case of overlap. However the fact that such an obvious case is still not here makes me think of "Chesterton's fence". Is there any danger of putting this here (like performance costs)?

An alternative may be to put return is_subtype(t, s) or is_subtype(s, t) instead of return False near the end of is_overlapping_tuples.

If you think the performance is still fine, then I would keep this check here (or maybe even higher above), but add few more tests that cover the other (non-tuple) situations not covered by existing if branches.

Yeah, I agree performance is something we might need to think about here, since we're basically traversing the types three times instead of once now. I think the subtype cache might help mitigate this, though of course it's hard to say for certain w/o benchmarking.

I suspect the other reason why the function doesn't already do something like this was because it was originally intended to be super-simple/lightweight -- here's a snapshot from 2015 and from 2016, for example. I guess it then just steadily accumulated patches over the years?

I don't particularly have a great idea of how I could benchmark this, but I am happy to move it to the end if you think that is better.

I don't particularly have a great idea of how I could benchmark this

A good benchmark is self checking mypy mypy.

Self check is a decent benchmark, I never feel like mypy is large enough to see a difference, but then again, I often forget how large mypy is.

Anyway, the selfcheck test seems to perform the same within a few hundredths of a second on master and this branch (without cache in both cases), so I am not concerned about performance regressions.

ilevkivskyi · 2018-07-06T14:34:42Z

mypy/meet.py

-        return False
-    # No tuples are involved here
-    return None
+        return is_subtype(t, s) or is_subtype(s, t)


Anyway, the selfcheck test seems to perform the same within a few hundredths of a second on master and this branch (without cache in both cases), so I am not concerned about performance regressions.

Did you measure performance when this check is here or at the top of the is_overlapping_types? If at the top of is_overlapping_types, then I would rather add if is_subtype(t, s) or is_subtype(s, t): return True at the top of is_overlapping_types (after builtins.object check) and leave this function completely "intact" then.

It was indeed measured from the top. I just added it after the object check.

ilevkivskyi

Sorry for a delay with this! Feel free to merge this after you consider one last comment.

@Michael0x2a could you please double-check this PR doesn't cause problems in internal codebases?

ilevkivskyi · 2018-07-21T17:01:33Z

mypy/meet.py

-    if tup_overlap is not None:
-        return tup_overlap
+    if is_overlapping_tuples(t, s, use_promotions):
+        return True


I think this change is now unnecessary (and the corresponding change below) after you added the check at the top of this function.

Michael0x2a · 2018-07-21T18:03:55Z

@ilevkivskyi -- sure! I have to go to a thing, but I can re-run these tests a little later tonight.

@ethanhs -- I'm not sure if your PR will run into this case or not, but I remember running into this edge case while working on my other PR yesterday. Could you try checking to see what your PR does on this test case?

from typing import *

# E.g. maybe this was imported from a non-typed module
Parent: Any

class Child(Parent):
    def some_method(self) -> int:
        if self is None:
            return None
        return 3

I believe the current behavior of mypy is to report no errors when running in strict-optional mode. If your PR complains about the "return None", you may have to swap out the two calls to is_subtype with is_proper_subtype or something.

(The actual code that I synthesized this from does self = super(Child, self).blah() before the if statement, so the above example is a little less stupid than it looks, if only marginally.)

JelleZijlstra · 2018-07-21T19:08:28Z

Maybe I'm missing something, but shouldn't that code sample produce an error under strict optional? It returns None but is declared to return int.

ilevkivskyi · 2018-07-21T19:09:51Z

@JelleZijlstra mypy infers the branch as unreachable.

emmatyping · 2018-07-21T20:15:42Z

@Michael0x2a thanks! That was indeed an issue, and is_proper_subtype fixed it :)

I also reverted the unneeded changes as suggested by Ivan.

Better Tuple/Sequence/Iterable meets

a1ce0e1

ilevkivskyi self-requested a review July 4, 2018 12:04

ilevkivskyi self-assigned this Jul 4, 2018

Michael0x2a reviewed Jul 5, 2018

View reviewed changes

Respond to review

859118a

ilevkivskyi removed their assignment Jul 5, 2018

ilevkivskyi removed their request for review July 5, 2018 09:40

ilevkivskyi reviewed Jul 5, 2018

View reviewed changes

Move subtype check to is_overlapping_tuples

d7e680b

ilevkivskyi reviewed Jul 6, 2018

View reviewed changes

Move is_subtype checks to the top

8c0dda8

ilevkivskyi approved these changes Jul 21, 2018

View reviewed changes

emmatyping added 2 commits July 21, 2018 15:58

Use is_proper_subtype, with thanks to @Michael0x2a

8fb14e4

Revert unneeded changes

5791783

Michael0x2a merged commit 3fb16a2 into python:master Jul 22, 2018

emmatyping deleted the tuplesequence branch July 22, 2018 02:52

Uh oh!

Improve Tuple/Sequence/Iterable overlapping #5315

Improve Tuple/Sequence/Iterable overlapping #5315

Uh oh!

Conversation

emmatyping commented Jul 4, 2018

Uh oh!

Michael0x2a left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ilevkivskyi left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

emmatyping Jul 6, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ilevkivskyi left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Michael0x2a commented Jul 21, 2018

Uh oh!

JelleZijlstra commented Jul 21, 2018

Uh oh!

ilevkivskyi commented Jul 21, 2018

Uh oh!

emmatyping commented Jul 21, 2018

Uh oh!

Uh oh!

emmatyping Jul 6, 2018 •

edited

Loading