-
-
Notifications
You must be signed in to change notification settings - Fork 3k
Remove duplicated constraints before solving and deduplicate fast_container_type
items
#19409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Remove duplicated constraints before solving and deduplicate fast_container_type
items
#19409
Conversation
dd642dd
to
2f2c391
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Okay, this is nice -
|
OK, from what I see there are no significant perf regressions, no bad primer changes, and selfcheck benchmark is zero/slightly immproved: sterliakov/mypy-issues#111 |
This comment has been minimized.
This comment has been minimized.
What's the performance improvement when type checking e.g. the original example from #14718? |
@JukkaL Repeatedly joining generic overloaded callables was a perf killer there, so avoiding at least self-joins helps massively - by orders of magnitude. Runs below were both pre-warmed and use cache fully. Now:
Current master:
Original example with a dictionary exhibits similar improvement. |
join_type_list
fast_container_type
items
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice perf fix!
I'm a little concerned about the quadratic ops introduced. For the constraints one, we could at least microoptimise to avoid creating n lists by the slicing in the list comprehension.
Other things we could do: only deduplicate against first n deduplicated elements, do some cleverer deduplication like we do for unions (e.g. things that look more like sort | uniq)
if typ not in values: | ||
values.append(typ) | ||
|
||
if len(values) == 1 and not self.chk.current_node_deferred: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
second half of clause is now always true. also dumb question: what is the "inference cycle" in the else branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
second half of clause is now always true
Huh, why? We had self.accept
after previous check, it might have changed
what is the "inference cycle" in the else branch
Do you mean what happens when this method returns None
? We run a full typecheck on container_type.__init__(*args)
, which is significantly slower than this happy path where all arguments are already "good enough".
Yes, that's a good idea. However, for short lists this might actually be slower than listcomp - does |
Ough, scratch this. Looking at the implementation, I presumed that hash is not sufficiently good (equal types producing different hashes), but I was wrong. Using plain set is good enough for constraint deduplication: I just ran a whole test suite using def _deduplicate_constraints(constraints: list[Constraint]) -> list[Constraint]:
result = []
for c in constraints:
if c not in result:
result.append(c)
assert set(result) == set(constraints)
return result and got zero failures. Hash implementation is definitely good enough for quick&dirty constraint deduplication purposes - I replaced the quadratic check with standard |
Diff from mypy_primer, showing the effect of this PR on open source code: colour (https://github.com/colour-science/colour)
+ colour/plotting/section.py:390: error: Argument 1 to "LineCollection" has incompatible type "ndarray[tuple[Any, ...], dtype[Any]]"; expected "Sequence[Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str]]" [arg-type]
zulip (https://github.com/zulip/zulip)
- zerver/lib/subscription_info.py:919: error: Incompatible types in assignment (expression has type "TypedDict({'audible_notifications': bool | None, 'can_add_subscribers_group': int | UserGroupMembersDict, 'can_administer_channel_group': int | UserGroupMembersDict, 'can_move_messages_out_of_channel_group': int | UserGroupMembersDict, 'can_move_messages_within_channel_group': int | UserGroupMembersDict, 'can_send_message_group': int | UserGroupMembersDict, 'can_remove_subscribers_group': int | UserGroupMembersDict, 'can_resolve_topics_group': int | UserGroupMembersDict, 'can_subscribe_group': int | UserGroupMembersDict, 'color': str, 'creator_id': int | None, 'date_created': int, 'description': str, 'desktop_notifications': bool | None, 'email_notifications': bool | None, 'first_message_id': int | None, 'folder_id': int | None, 'is_recently_active': bool, 'history_public_to_subscribers': bool, 'in_home_view': bool, 'invite_only': bool, 'is_announcement_only': bool, 'is_archived': bool, 'is_muted': bool, 'is_web_public': bool, 'message_retention_days': int | None, 'name': str, 'pin_to_top': bool, 'push_notifications': bool | None, 'rendered_description': str, 'stream_id': int, 'stream_post_policy': int, 'stream_weekly_traffic': int | None, 'subscriber_count': int, 'subscribers'?: list[int], 'partial_subscribers'?: list[int], 'topics_policy': str, 'wildcard_mentions_notify': bool | None})", variable has type "dict[str, Any]") [assignment]
+ zerver/lib/subscription_info.py:919: error: Incompatible types in assignment (expression has type "SubscriptionStreamDict", variable has type "dict[str, Any]") [assignment]
|
Vastly improves #14718. Some type joins are really heavy - especially joins between overloads. This does not fully remove the problem, a collection of different pairwise equivalent overloads differing only in their names is still slow to check, but at least we will not join every such monster callable with itself multiple times.