gh-139424: microoptimize _collections_abc._check_methods #139401
Conversation
We can reduce the number of attribute lookups required here. I also think we can speed it up a little more if we make the `methods` arg a packed tuple instead of an unpacked one, since that would bypass the extra tuple creation at each call. Seeking input on that before I implement it; I'm not 100% certain this internal function is safe to change.
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the `skip news` label.
Also, please create an issue where you can report the benchmarks, along with the script you used to produce them. I would suggest creating test cases with deeply nested classes.
```diff
 def _check_methods(C, *methods):
-    mro = C.__mro__
+    mro_dicts = [B.__dict__ for B in C.__mro__]
```
Maybe we should use a tuple instead of a list to reduce memory usage, since this collection isn't modified:

```diff
-    mro_dicts = [B.__dict__ for B in C.__mro__]
+    mro_dicts = tuple(B.__dict__ for B in C.__mro__)
```
Would it be reasonable to cache this tuple for reuse?
But the caching here would need to be more sophisticated: when the class is updated, the result must change too. Standard caching only keys on the input, so in this case the input would be cached while the output should actually differ depending on the class state.
Can a class's MRO change after the fact? I know the contents of the dicts could change, but if the cache just holds references to each class's dict, then any modifications to those dicts would already be reflected in the cache. If the MRO can change, this becomes invalid.
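For what it's worth, the MRO of an ordinary class can change at runtime, since assigning to `__bases__` recomputes `__mro__`; a quick demonstration:

```python
class Base:
    pass

class Other:
    pass

class C(Base):
    pass

assert Base in C.__mro__

# Rebasing the class is legal for ordinary classes with a compatible
# layout, and it recomputes __mro__, so a tuple of MRO dicts cached
# earlier would now be stale.
C.__bases__ = (Other,)

assert Base not in C.__mro__
assert Other in C.__mro__
```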
To be clear, I'm not suggesting caching the result of this function, only the tuple of MRO dicts.
Ah, got it. I think caching the MRO dict tuple should be safe.
Here you also need to decide what exactly we want to optimize: a tuple gives roughly 10% better memory efficiency, while a list is roughly 10% faster to iterate in loops.
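A rough way to measure the list-vs-tuple iteration difference yourself; absolute numbers vary by build and machine, so treat this as a measurement sketch rather than a verdict:

```python
import timeit

# Iterate the __dict__ of every class in a small MRO, as _check_methods does.
mro_dicts_list = [B.__dict__ for B in bool.__mro__]
mro_dicts_tuple = tuple(mro_dicts_list)

def iter_list():
    for d in mro_dicts_list:
        pass

def iter_tuple():
    for d in mro_dicts_tuple:
        pass

n = 1_000_000
print(f"list:  {timeit.timeit(iter_list, number=n):.3f}s")
print(f"tuple: {timeit.timeit(iter_tuple, number=n):.3f}s")
```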
Interesting, I didn't know lists are faster to iterate through, even with the length check at each iteration. TIL, thanks.
In that case I'd opt for the list for top speed in the subclass check, but I'm not sure how that fits the best practices here or what people prefer.
It's not worth debating tiny differences like this until we've verified that the change itself is worth pursuing.
```python
for base_dict in mro_dicts:
    if method in base_dict:
        if base_dict[method] is None:
            return NotImplemented
        break
```
```diff
-for base_dict in mro_dicts:
-    if method in base_dict:
-        if base_dict[method] is None:
-            return NotImplemented
-        break
+for base_dict in mro_dicts:
+    method_impl = base_dict.get(method, sentinel)
+    if method_impl is None:
+        return NotImplemented
+    if method_impl is not sentinel:
+        break
```
This avoids computing the hash and looking up the key in the dict twice. The sentinel object can be created at the top of the function.
That looks like a clear win, good eye! It might make sense to do the check optimistically, i.e. treat the sentinel as more likely than `None`:

```python
if method_impl is sentinel:
    continue
if method_impl is None:
    return NotImplemented
break
```
If you want to apply any of the given suggestions, please first open an issue and provide micro-benchmarks (use a PGO+LTO, non-debug build for that, and pyperf as well).
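A typical pyperf workflow for this kind of change might look like the following; the configure flags, benchmark statement, and filenames are illustrative, not prescribed by this review:

```shell
# Build an optimized, non-debug CPython first (PGO + LTO):
./configure --enable-optimizations --with-lto
make -j

# Time the helper directly so the ABC subclass-check cache doesn't hide it:
./python -m pyperf timeit \
    -s "from _collections_abc import _check_methods" \
    "_check_methods(list, '__len__', '__iter__', '__contains__')" \
    -o patched.json

# Compare against the same benchmark recorded on an unpatched build:
python3 -m pyperf compare_to baseline.json patched.json
```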
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase `I have made the requested changes; please review again`.
Oh wow, who knew the most active PR conversation in my GitHub history would be on a PR microoptimizing a helper function within the collections module! Thank you for the feedback, everybody! Clearly this will be more involved than I originally planned; it's going to take me some time to put together a benchmark, so bear with me. But I'll get that done soonish and test out these suggestions.
This will be my first time using this tool; do you have any suggestions or best practices for its use within this repo?
Yes, here are the steps you can do:
I don't have time now to shepherd this, but you can take your time. However, I'd appreciate it if we created an issue first as well to give more visibility (this change could be rejected if the benchmarks are not satisfactory; we usually strive for >= 10% improvements).
I'm going to close this PR for now; let's reopen it if the benchmarks in the linked issue demonstrate it's worth it. Ben's advice above is the best to follow in terms of demonstrating the worth of the change.