Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YJIT: Refactor into gen_push_frame #6412

Merged
merged 1 commit into from
Sep 22, 2022
Merged

Conversation

jhawthorn
Copy link
Member

One of my hopes with the introduction of the new backend IR is the ability to refactor some of our more complex methods into smaller and (hopefully) reusable methods.

This refactors the "push frame" operation common to both gen_send_iseq and gen_send_cfunc into its own method. This allows that logic to live in one place.

A few things motivated this extraction in particular:

  • @matthewd and have been pairing on improving "blockarg" support, which required making an identical change to both gen_send_iseq and gen_send_cfunc
  • I have a wip branch which implements "bmethods", which required performing this same push frame operation in a 3rd place 😅
  • I'd like to experiment with pushing an extra frame for some optimizations (specifically to combine new/initialize)

I like how this turned out. It is slightly more code, but much of that is comments. I used a struct to pass the arguments describing how the frame is to be constructed, which I think makes the calls easier to read and less error-prone. I used an enum for PushFrameBlockHandler in anticipation of a 3rd type being added for block param proxy support.

The code that's generated is very slightly different (I believe better, especially for the cfunc case). The new CFP is written to relative to the caller CFP instead of using a temporary register (a temporary register derived from a memory addition and a memory load for cfunc) in both cases. This takes the function call block of 1.itself (a cfunc) from 177 bytes to 159 bytes on x86_64 (for iseq calls a more modest 3 bytes of savings 😅). I wasn't able to see a difference in performance, but would be curious to see on an Intel machine (vs my AMD zen 2).

@matzbot matzbot requested a review from a team September 21, 2022 03:43
@k0kubun
Copy link
Member

k0kubun commented Sep 21, 2022

Cirrus Graviton2 YJIT jobs failed with both gcc and clang. Is this failure related to your change?

@jhawthorn
Copy link
Member Author

jhawthorn commented Sep 21, 2022

I think it is related. I can reproduce (inconsistently) locally. I'll investigate tomorrow.

@noahgibbs
Copy link
Contributor

I definitely like the readability and maintainability of this more.

@jhawthorn
Copy link
Member Author

jhawthorn commented Sep 22, 2022

Figured it out! I had mistakenly dropped a cast from cme -> usize -> VALUE -> Opnd and only did cme -> usize -> Opnd so it wasn't being compacted/moved correctly. Should be fixed now! 🤞

yjit/src/codegen.rs Outdated Show resolved Hide resolved
yjit/src/codegen.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@maximecb maximecb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this effort, it's a good idea 👍

Wrote two very minor comments.

This refactors the "push frame" operation common to both gen_send_iseq
and gen_send_cfunc into its own method. This allows that logic to live
in one place.
@maximecb maximecb merged commit 4b97f1e into ruby:master Sep 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants