gh-106581: Split CALL_BOUND_METHOD_EXACT_ARGS into uops #108462

Merged
merged 1 commit into from
Aug 25, 2023

Conversation

gvanrossum
Member

@gvanrossum gvanrossum commented Aug 24, 2023

Instead of using GO_TO_INSTRUCTION(CALL_PY_EXACT_ARGS); we just add the macro elements of the latter to the macro for the former. This requires lengthening the uops array in struct opcode_macro_expansion. (It also required changes to stacking.py that were merged already.)
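As a rough illustration of why the array has to grow: if a macro's expansion is stored in a fixed-size array, splicing another macro's uops onto the end only works while the combined count fits. The sketch below is a toy model, not the generated CPython tables. `MAX_UOPS`, `struct macro_expansion`, and `splice_expansion` are hypothetical names, and the capacity value is an assumption chosen for the example.

```c
#include <assert.h>

/* Hypothetical capacity; the real array length in CPython may differ. */
#define MAX_UOPS 12

/* Toy stand-in for a macro expansion entry: just a count and uop ids. */
struct macro_expansion {
    int nuops;
    int uops[MAX_UOPS];
};

/* Append every uop of `tail` to `head`, as if the elements of one macro
 * were added to the end of another. Returns 0 on success, or -1 if the
 * combined expansion would not fit in the fixed-size array. */
static int splice_expansion(struct macro_expansion *head,
                            const struct macro_expansion *tail)
{
    assert(head->nuops >= 0 && tail->nuops >= 0);
    if (head->nuops + tail->nuops > MAX_UOPS) {
        return -1;
    }
    for (int i = 0; i < tail->nuops; i++) {
        head->uops[head->nuops++] = tail->uops[i];
    }
    return 0;
}
```

With a smaller capacity the same splice would fail, which is the situation that lengthening the array avoids.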

@gvanrossum
Member Author

@carljm Interested in reviewing this?

Member

@carljm carljm left a comment

Sure. I haven't had much opportunity to play with tier 2 yet, so I'm reviewing without the benefit of experience working in this code. But I took a pretty careful look (including the generated code), and I think I understand everything that's happening here, and it all makes sense.

Comment on lines +2951 to +2953
stack_pointer[-1 - oparg] = self; // Patch stack as it is used by _INIT_CALL_PY_EXACT_ARGS
func = Py_NewRef(((PyMethodObject *)callable)->im_func);
stack_pointer[-2 - oparg] = func; // This is used by CALL, upon deoptimization
Member

Hmm, so these lines appear to be entirely redundant for the uops executor, as the stack-effect output of func, self, ... causes the exact same assignments to be emitted automatically right after these lines.

But this is needed for the bytecode interpreter: in that case all the uops are squashed together and their inputs and outputs are chained together as local variables, so nothing is written back to the stack between them, yet we really need to update the actual stack here, for the reasons mentioned in the comments.

I don't know how often such uop stack-patching cases will occur (from what I can find, this is the first one?). If there will be more, it might be nice to have syntax to mark an output in the stack-effects definition of the uop as "must actually modify the stack", and then have the cases generator automatically emit this (and the executor cases wouldn't have the duplicate assignments.)
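To make the distinction concrete, here is a minimal toy model of the fused (bytecode interpreter) case. It is a sketch under stated assumptions, not the generated code: plain `int` tags stand in for object pointers, and `fused_without_patch` and `fused_with_patch` are hypothetical names. The first function keeps the outputs of the first uop in locals only, so a later step that reads stack memory directly sees the stale slot. The second patches the stack the way the reviewed lines do.

```c
#include <assert.h>

/* Toy tags standing in for object pointers (assumption for illustration). */
enum { NULL_SLOT = 0, BOUND_METHOD = 1, FUNC = 2, SELF = 3 };

/* Fused uops, outputs chained as locals only: the in-memory stack is never
 * updated, so a later read of the stack memory sees the old contents. */
static int fused_without_patch(int *stack_pointer, int oparg)
{
    int func = FUNC;   /* output of the first uop, held in a local */
    int self = SELF;   /* output of the first uop, held in a local */
    (void)func;
    (void)self;
    /* A later step reads the "self" slot straight from memory. */
    return stack_pointer[-1 - oparg];
}

/* Same fused sequence, but the stack is patched explicitly first. */
static int fused_with_patch(int *stack_pointer, int oparg)
{
    int func = FUNC;
    int self = SELF;
    stack_pointer[-1 - oparg] = self;  /* visible to later direct reads */
    stack_pointer[-2 - oparg] = func;  /* visible on deoptimization */
    return stack_pointer[-1 - oparg];
}
```

In the uops executor the equivalent stores would already be emitted from the stack-effect outputs, which is why the explicit ones look redundant there.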

Member Author

Thanks for the review!

It looks to be pretty uncommon -- calls are special in many ways. Agreed that if this proliferates we should teach the code generator to do this. It also looks like a C compiler would have a hard time optimizing the duplication in Tier 2 away, because there's an intervening Py_DECREF(). If that becomes an issue but it remains limited to just this case we could surround the flushes with #ifdef TIER_ONE / #endif.
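The guard idea could look roughly like the following. This is only a sketch: it assumes `TIER_ONE` is defined solely when building the bytecode interpreter (it is defined by hand here so the example compiles on its own), it uses `int` values in place of object pointers, and `init_call_bound_method` and `explicit_stores` are hypothetical names introduced to make the effect observable.

```c
#include <assert.h>

/* Assumption: the build defines TIER_ONE only for the bytecode interpreter.
 * It is defined here by hand so the sketch is self-contained. */
#define TIER_ONE 1

/* Counts the explicit stack flushes that were actually compiled in. */
static int explicit_stores = 0;

static void init_call_bound_method(int *stack_pointer, int oparg,
                                   int self, int func)
{
#ifdef TIER_ONE
    /* Explicit flush: needed when uops are fused and outputs stay in locals. */
    stack_pointer[-1 - oparg] = self;
    stack_pointer[-2 - oparg] = func;
    explicit_stores += 2;
#else
    /* In a Tier 2 build the stack-effect outputs would produce these stores,
     * so the explicit ones are compiled out. */
    (void)stack_pointer;
    (void)oparg;
    (void)self;
    (void)func;
#endif
}
```

Removing the `#define TIER_ONE 1` line compiles the flushes out entirely, which is the saving being discussed for the executor.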

I'll merge this and see what's next on the agenda. (I suspect either KW_NAMES or splitting LOAD_ATTR specializations.)

@gvanrossum gvanrossum merged commit ddf66b5 into python:main Aug 25, 2023
26 checks passed
@gvanrossum gvanrossum deleted the split-call-method branch August 25, 2023 00:36