Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YJIT: Add ability to exit to interpreter from stubs #5180

Merged
merged 3 commits into from Nov 26, 2021
Merged

Conversation

XrXr
Copy link
Member

@XrXr XrXr commented Nov 26, 2021

Previously, YJIT assumed that it's always possible to generate a new
basic block when servicing a stub in branch_stub_hit(). When YJIT is out
of executable memory, for example, this assumption doesn't hold up.

Add handling to branch_stub_hit() for servicing stubs without consuming
more executable memory by adding a code path that exits to the
interpreter at the location the branch stub represents. The new code
path reconstructs interpreter state in branch_stub_hit() and then exits
with a new snippet called code_for_exit_from_stub that returns
Qundef from the YJIT native stack frame.

As this change adds another place where we regenerate code from
branch_t, extract the logic for it into a new function and call it
regenerate_branch(). While we are at it, make the branch shrinking code
path in branch_stub_hit() more explicit.

This new functionality is hard to test without full support for out of
memory conditions. To verify this change, I ran
RUBY_YJIT_ENABLE=1 make check -j12 with the following patch to stress
test the new code path:

diff --git a/yjit_core.c b/yjit_core.c
index 4ab63d9806..5788b8c5ed 100644
--- a/yjit_core.c
+++ b/yjit_core.c
@@ -878,8 +878,12 @@ branch_stub_hit(branch_t *branch, const uint32_t target_idx, rb_execution_contex
                 cb_set_write_ptr(cb, branch->end_addr);
             }

+if (rand() < RAND_MAX/2) {
             // Compile the new block version
             p_block = gen_block_version(target, target_ctx, ec);
+}else{
+    p_block = NULL;
+}

             if (!p_block && branch_modified) {
                 // We couldn't generate a new block for the branch, but we modified the branch.

We can enable the new test along with other OOM tests once full support
lands.

Other small changes:

  • yjit_utils.c (print_str): Update to work with new native frame shape.
    Follow up for 8fa0ee4.
  • yjit_iface.c (rb_yjit_init): Run yjit_init_core() after
    yjit_init_codegen() so cb and ocb are available.

@XrXr
Copy link
Member Author

XrXr commented Nov 26, 2021

Reviews are welcome, but please don't merge this yet. I need to review this more tomorrow.

Previously, YJIT assumed that it's always possible to generate a new
basic block when servicing a stub in branch_stub_hit(). When YJIT is out
of executable memory, for example, this assumption doesn't hold up.

Add handling to branch_stub_hit() for servicing stubs without consuming
more executable memory by adding a code path that exits to the
interpreter at the location the branch stub represents. The new code
path reconstructs interpreter state in branch_stub_hit() and then exits
with a new snippet called `code_for_exit_from_stub` that returns
`Qundef` from the YJIT native stack frame.

As this change adds another place where we regenerate code from
`branch_t`, extract the logic for it into a new function and call it
regenerate_branch(). While we are at it, make the branch shrinking code
path in branch_stub_hit() more explicit.

This new functionality is hard to test without full support for out of
memory conditions. To verify this change, I ran
`RUBY_YJIT_ENABLE=1 make check -j12` with the following patch to stress
test the new code path:

```diff
diff --git a/yjit_core.c b/yjit_core.c
index 4ab63d9806..5788b8c5ed 100644
--- a/yjit_core.c
+++ b/yjit_core.c
@@ -878,8 +878,12 @@ branch_stub_hit(branch_t *branch, const uint32_t target_idx, rb_execution_contex
                 cb_set_write_ptr(cb, branch->end_addr);
             }

+if (rand() < RAND_MAX/2) {
             // Compile the new block version
             p_block = gen_block_version(target, target_ctx, ec);
+}else{
+    p_block = NULL;
+}

             if (!p_block && branch_modified) {
                 // We couldn't generate a new block for the branch, but we modified the branch.
```

We can enable the new test along with other OOM tests once full support
lands.

Other small changes:
 * yjit_utils.c (print_str): Update to work with new native frame shape.
       Follow up for 8fa0ee4.
 * yjit_iface.c (rb_yjit_init): Run yjit_init_core() after
       yjit_init_codegen() so `cb` and `ocb` are available.
@XrXr
Copy link
Member Author

XrXr commented Nov 26, 2021

Off topic for this PR: @junaruga s390x on Travis seems to be timing out in ccache related stuff after successfully running all the tests. I've seen other instances recently. Any idea?

Comment on lines +611 to +612
static void
regenerate_branch(codeblock_t *cb, branch_t *branch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factoring this into its own function is a good idea 👍 🙂

Comment on lines +633 to +636
if (old_write_pos > cb->write_pos) {
// We rewound cb->write_pos to generate the branch, now restore it.
cb_set_pos(cb, old_write_pos);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should add a comment explaining why this if statement is there. Otherwise it seems a little strange without detailed prior knowledge of BBV.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! How do you like the comment in f423be9?

I feel like this is not a BBV issue per se and more that the code block abstraction is a bit tricky and has this gocha. I feel like we may want a setup where you derive a short-lived write cursor from the code block to make this more natural. Would be good for tracking which virtual memory pages we have written to if we want a more granular implementation of #5032 too. Totally out of scope for this PR of course, I just wanted to get this idea out there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the comment should explain that if the branch is at the end of the code block and shrinks, we want to recover that space instead of leaving a gap.

You're right that the current setup with the write position tracking the size of the generated code and also being adjusted when patching code is not ideal. I also think you're right that it would be nice to have a more granular mechanism for handling the W^X issue, because otherwise we're sadly going to be leaving performance on the table. Very much open to it if you have ideas for what this better abstraction would look like. Any way that you could sketch the API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, it might be too early to go for heavier abstraction. I searched around and this function seems to be the only place where we seek around in the code block. This if handles and documents this quirk and it should be okay enough for now.

Tracking writes probably involves passing around an array of pointers to start of pages like we currently pass around codeblock_t *. I haven't given it that much thought. #5032 probably needs benchmarking to see if it's already fast enough.

@maximecb
Copy link
Contributor

Looks good. Thanks again for tackling this challenging problem and adding all the tests, and good move factoring out regenerate_branch into its own function, that's a nice maintainability improvement.

@XrXr XrXr marked this pull request as ready for review November 26, 2021 21:14
@XrXr XrXr merged commit b5b6ab4 into ruby:master Nov 26, 2021
@XrXr XrXr deleted the oom-stubs branch November 26, 2021 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants