subtypes: fast path for Union/Union subtype check #14277
mypy/subtypes.py
@@ -269,6 +270,11 @@ def is_same_type(
    )


class _SubtypeCheck(Protocol):
    def __call__(self, left: Type, right: Type, *, subtype_context: SubtypeContext) -> bool:
I'm not too used to mypy's codebase -- is this the convention, or is the convention to use mypy's extensions (which include a kwarg type for callables)?
Let's use protocols, the extensions are soft-deprecated in my view in favor of callback protocols.
> the extensions are soft-deprecated in my view in favor of callback protocols.
Not just in your view — mypy's documentation officially states that they're deprecated :) https://mypy.readthedocs.io/en/stable/additional_features.html#extended-callable-types
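For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of a callback protocol with a keyword-only parameter. It uses plain strings instead of mypy's `Type` and `SubtypeContext`, and the names `PairCheck`, `is_prefix`, and `check_all` are hypothetical, chosen only for illustration.

```python
from typing import Protocol


class PairCheck(Protocol):
    """Describes any callable taking two positionals and one keyword-only flag."""

    def __call__(self, left: str, right: str, *, ignore_case: bool) -> bool: ...


def is_prefix(left: str, right: str, *, ignore_case: bool) -> bool:
    # A plain function satisfies the protocol structurally; no inheritance needed.
    if ignore_case:
        left, right = left.lower(), right.lower()
    return right.startswith(left)


def check_all(check: PairCheck, pairs: list[tuple[str, str]], *, ignore_case: bool) -> bool:
    # The protocol lets a type checker verify that the keyword argument is accepted.
    return all(check(a, b, ignore_case=ignore_case) for a, b in pairs)
```

A plain `Callable[[str, str], bool]` annotation cannot express the keyword-only `ignore_case` parameter, which is the gap that callback protocols fill.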
Added some numbers about perf impact to hopefully motivate review/merge: <1s to >13min due to regression, and back to <1s after fix.
Could you try the newly added tool |
I did. A few observations:
I ran
This is similar to what you measured and seems real. Can you try to optimize this further -- I bet you'd need to use a faster code path for small unions or something? I can also give it a try, but it might take a while because of holidays.
Yeah, that can't be avoided right now, unfortunately. Compilation is kind of slow, and we need to compile to get realistic results. Also without multiple measurements the noise level is too high.
On my 2019-vintage Linux desktop noise can be ~5%, and we can work around that to some degree by taking 15 measurements. Even if each measurement has 10% noise, the average over many measurements will typically have much less expected noise. If you don't have access to a desktop computer, a big enough cloud instance might produce more precise results and wouldn't be expensive, since you'd only need it for less than one hour. (Obviously this may not be practical.)
Good point, I should add automatic outlier filtering to the tool.
As I discussed above, the result when outliers are taken out may actually be fairly accurate due to averaging over multiple samples.
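The averaging argument above can be sketched as follows. This is not the benchmarking tool's code: the helper names `filter_outliers` and `summarize`, the median-absolute-deviation rule, and the threshold of 3 are all assumptions made for illustration. If the noise in each run is independent, the standard error of the mean shrinks roughly as 1/sqrt(n), so 15 runs with 10% noise each give an average with roughly 2.6% noise.

```python
import statistics


def filter_outliers(samples: list[float], max_dev: float = 3.0) -> list[float]:
    """Drop samples more than max_dev median-absolute-deviations from the median."""
    med = statistics.median(samples)
    mad = statistics.median(abs(s - med) for s in samples)
    if mad == 0:
        # All samples (or most of them) are identical; nothing sensible to filter.
        return list(samples)
    return [s for s in samples if abs(s - med) <= max_dev * mad]


def summarize(samples: list[float]) -> tuple[float, float]:
    """Return (mean, standard error of the mean) after outlier filtering."""
    kept = filter_outliers(samples)
    mean = statistics.mean(kept)
    # Standard error shrinks with the square root of the number of kept samples.
    sem = statistics.stdev(kept) / len(kept) ** 0.5 if len(kept) > 1 else 0.0
    return mean, sem


# Hypothetical wall-clock timings in seconds; the last run was disturbed.
timings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 15.0]
mean, sem = summarize(timings)
print(f"mean={mean:.2f}s, standard error={sem:.3f}s")
```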
I haven't seen it before, but it looks interesting. We could add it as an optional "backend" to the script. (I don't want to require any extra deps by default.)
Probably the biggest thing to do is to decouple the Instance/Union and Union/Union codepaths, which I combined to reduce duplication. The former is by far the more common, and the extra set creations and the few extra branches are probably to blame for any observable regression. I can definitely push an updated version for that. I can also play with further micro-optimizations, but that honestly seems a little silly when weighing a slowdown below the noise floor in the common case against a 7800% speedup in the slow case.
The repeat checkout/compilation cost is a real problem for repeated measurements when iterating through possible approaches though. At the very least you'd want to keep the reference binary around for as long as possible instead of recompiling it every time.
Enums are exploded into Union of Literal when narrowed. Conditional branches on enum values can result in multiple distinct narrowings of the same enum, which are later subject to subtype checks (most notably via `is_same_type`, when exiting frame context in the binder). Such checks would have quadratic complexity: `O(N*M)` where `N` and `M` are the number of entries in each narrowed enum variable, and led to drastic slowdown if any of the enums involved has a large number of values.

Implement a linear-time fast path where literals are quickly filtered, with a fallback to the slow path for more complex values.

In our codebase there is one method with a chain of a dozen if statements operating on instances of an enum with hundreds of values. Prior to the regression it was typechecked in less than 1s. After the regression it takes over 13min to typecheck. This patch fully fixes the regression for us.

Fixes python#13821
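The idea described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in `mypy/subtypes.py`: `Lit` and `Other` are hypothetical stand-ins for mypy's literal and non-literal types, and the sketch assumes literals are hashable and that a literal is always a subtype of itself.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass(frozen=True)
class Lit:
    """Hypothetical stand-in for a literal type such as Literal[Color.RED]."""
    value: str


@dataclass(frozen=True)
class Other:
    """Hypothetical stand-in for any non-literal union item."""
    name: str


Item = Union[Lit, Other]
ItemCheck = Callable[[Item, Item], bool]


def slow_is_subtype(left: Sequence[Item], right: Sequence[Item], check: ItemCheck) -> bool:
    """Quadratic path: every left item must be a subtype of some right item."""
    return all(any(check(l, r) for r in right) for l in left)


def fast_is_subtype(left: Sequence[Item], right: Sequence[Item], check: ItemCheck) -> bool:
    """Filter matching literals via a set lookup, then fall back for the rest."""
    right_literals = {r for r in right if isinstance(r, Lit)}
    # A left literal that also appears on the right is trivially covered.
    remaining = [l for l in left if not (isinstance(l, Lit) and l in right_literals)]
    # Only items the set lookup could not settle go through the slow path.
    return slow_is_subtype(remaining, right, check)
```

When both unions consist only of literals and the left is a subset of the right, `remaining` is empty and the pairwise check is never called, so the work is linear in `N + M` rather than `N * M`.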
Result for the newest commit:
According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉
Context: I've spent several days recently reducing the impact of many (mostly) minor performance regressions. The 7800% speedup is amazing, and if we can have the speedup without impacting performance in the average case, it's even better! A slowdown has an impact even if it's below the measurement noise floor. Mypy accumulated maybe 15-20% overall slowdown in 2022 because of many minor performance regressions (see #14358). If we had similar yearly slowdowns for 5 years, mypy runtimes would increase by at least 100%. These small regressions are hard to fix months later, so if I can find a potential regression, my preference is to fix it even before merging the PR.
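The compounding estimate above can be checked with a few lines of arithmetic. The helper name `compounded_slowdown` is hypothetical, and the sketch assumes the same relative slowdown repeats every year.

```python
def compounded_slowdown(yearly: float, years: int) -> float:
    """Total relative runtime increase after compounding a yearly slowdown."""
    return (1.0 + yearly) ** years - 1.0


for yearly in (0.15, 0.20):
    total = compounded_slowdown(yearly, 5)
    print(f"{yearly:.0%} per year over 5 years -> +{total:.0%}")
```

With a 15% yearly slowdown the runtime roughly doubles after five years (1.15^5 is about 2.01), and with 20% it grows to about 2.49 times the original.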
A good idea. Added an issue to track this: #14359
Thanks for the PR and for persisting with the final tweaks! I love the huge speedup.
Enums are exploded into Union of Literal when narrowed.
Conditional branches on enum values can result in multiple distinct narrowings
of the same enum, which are later subject to subtype checks (most notably via
`is_same_type`, when exiting frame context in the binder). Such checks would
have quadratic complexity: `O(N*M)` where `N` and `M` are the number of entries
in each narrowed enum variable, and led to drastic slowdown if any of the enums
involved has a large number of values.
Implement a linear-time fast path where literals are quickly filtered, with
a fallback to the slow path for more complex values.
In our codebase there is one method with a chain of a dozen `if` statements
operating on instances of an enum with hundreds of values. Prior to the
regression it was typechecked in less than 1s. After the regression it takes
over 13min to typecheck. This patch fully fixes the regression for us.
Fixes #13821