-
-
Notifications
You must be signed in to change notification settings - Fork 31.6k
Harmonize STORE_DEREF with STORE_FAST and LOAD_DEREF #72851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The STORE_FAST, LOAD_FAST, and LOAD_DEREF opcodes all use fast macros for variable access. This patch harmonizes STORE_DEREF to follow the same pattern. Both the C code and the generated assembly look nicer. Gives an approx 40% speed-up (using both Clang and GCC-6) on the "store_nonlocal" portion of the variable access benchmark at http://code.activestate.com/recipes/577834 |
LGTM. This saves function call and INCREF/DECREF pair. What about DELETE_DEREF? PyCell_Set() also is used in unicode_concatenate(). |
New changeset d78d45436753 by Raymond Hettinger in branch '3.6': |
Thanks for the quick review. I'll look at the other two cases when I get a chance. |
New changeset 7ec45e7d2194 by Serhiy Storchaka in branch 'default': |
You forgot to merge and created new head. |
Could you please measure the performance effect of these changes? |
I think the speed benefit for the last two patches is likely too modest to care about. The main reason I did the work was because you suggested it and because it seemed like a reasonable idea (the patched code looks nice, it does only work that is necessary, and it is more consistent with the other opcodes). Let me know if you want to go forward with it or leave it in the current state (the current juxtaposition of PyCell_GET with PyCell_Set looks a little weird but it does get the job done). |
The argument about "harmonizing" doesn't look strong to me. Opcodes for locals use the SETLOCAL() macro which decrefs But performance arguments look more weighty. I made benchmarks. fastcell.diff speeds up STORE_FAST by 40%, delete_deref.diff speeds up DELETE_DEREF by 50%. and concat_deref.diff speeds up string concatenating up to 15%. All these operations are rare in comparison with operations with locals or LOAD_DEREF, but the cognitive cost of the optimization is pretty low. All patches LGTM. I only have doubts that such changes could be pushed in 3.6 at this stage. This is not bug fix and isn't tweaking new 3.6 feature. |
New changeset 5f3b7ceb394c by Raymond Hettinger in branch 'default': |
I think small changes are fine while there is still another beta ahead but would rather just push it to 3.7 than spend more time talking about it. |
Misc/NEWS
so that it is managed by towncrier #552Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: