YJIT: Refactor into gen_push_frame #6412
Cirrus Graviton2 YJIT jobs failed with both gcc and clang. Is this failure related to your change?
I think it is related. I can reproduce (inconsistently) locally. I'll investigate tomorrow.
I definitely like the readability and maintainability of this more.
Figured it out! I had mistakenly dropped a cast from
I support this effort, it's a good idea 👍
Wrote two very minor comments.
One of my hopes with the introduction of the new backend IR is the ability to refactor some of our more complex methods into smaller and (hopefully) reusable methods.
This refactors the "push frame" operation common to both gen_send_iseq and gen_send_cfunc into its own method. This allows that logic to live in one place.
A few things motivated this extraction in particular:
gen_send_iseq and gen_send_cfunc
new/initialize)

I like how this turned out. It is slightly more code, but much of that is comments. I used a struct to pass the arguments describing how the frame is to be constructed, which I think makes the calls easier to read and less error-prone. I used an enum for PushFrameBlockHandler in anticipation of a third type being added for block param proxy support.

The code that's generated is very slightly different (I believe better, especially for the cfunc case). In both cases the new CFP is now written relative to the caller CFP instead of through a temporary register (which, for cfunc, was derived from a memory addition and a memory load). This takes the function call block of 1.itself (a cfunc) from 177 bytes to 159 bytes on x86_64 (for iseq calls, a more modest 3 bytes of savings 😅). I wasn't able to see a difference in performance, but would be curious to see results on an Intel machine (vs. my AMD Zen 2).
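
For readers who want a picture of the struct-plus-enum calling style described above, here is a rough sketch. It is not the code from this PR. Only the name PushFrameBlockHandler comes from the description; the struct name, its fields, the enum variants, and the toy push_frame function (which appends to a Vec rather than emitting machine code) are all assumptions made for illustration.

```rust
/// How the block handler slot of the new frame should be filled.
/// The variants are hypothetical; the PR text only names the enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PushFrameBlockHandler {
    /// No block was passed to the call.
    None,
    /// A block was passed; the payload is an illustrative handle value.
    Block(u64),
}

/// Everything a caller decides before pushing a frame, gathered in one place.
/// Field names are hypothetical.
#[derive(Debug, Clone, Copy)]
struct PushFrameArgs {
    frame_type: u32,
    local_count: usize,
    block_handler: PushFrameBlockHandler,
}

/// Toy stand-in for a shared "push frame" routine: writes the frame's slots
/// onto a Vec acting as the VM stack and returns how many slots were written.
fn push_frame(stack: &mut Vec<u64>, args: PushFrameArgs) -> usize {
    let start = stack.len();

    // Locals are initialized to 0 (a stand-in for nil).
    stack.extend(std::iter::repeat(0).take(args.local_count));

    // An exhaustive match means a newly added variant must be handled here,
    // or the code will not compile.
    let handler = match args.block_handler {
        PushFrameBlockHandler::None => 0,
        PushFrameBlockHandler::Block(handle) => handle,
    };
    stack.push(handler);
    stack.push(u64::from(args.frame_type));

    stack.len() - start
}
```

With named fields, a call site reads as `push_frame(&mut stack, PushFrameArgs { frame_type: 1, local_count: 2, block_handler: PushFrameBlockHandler::Block(42) })`, so each argument is labeled where it is passed. The exhaustive match is also what makes adding a third enum variant later comparatively safe.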