YJIT: Fallback Integer#<< if a shift amount varies #9426

Merged
merged 2 commits into from Jan 8, 2024

@k0kubun k0kubun commented Jan 5, 2024

  • Chain-guard Integer#<< once and fall back to a C call if the shift amount varies.
  • Rename an exit reason amt_changed to amount_changed since the name seemed a bit cryptic.
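As a rough illustration of what "a shift amount varies" means at the Ruby level, here is a minimal sketch of a call site where the receiver is constant but the shift amount changes on every iteration. The method name `column_masks` is hypothetical and is not taken from the nqueens benchmark.

```ruby
# Hypothetical example, not code from the benchmark.
# The receiver is always 1, but the shift amount `col` is different on
# every iteration, so a guard specialized on one amount would keep failing.
def column_masks(n)
  (0...n).map { |col| 1 << col }
end

p column_masks(4) # => [1, 2, 4, 8]
```

A single `1 << col` site like this sees n distinct shift amounts, which is the situation the fallback to a C call is meant to handle.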

On nqueens (Shopify/yjit-bench#260) with --yjit-call-threshold=1, this change pushes ratio_in_yjit from 0.0% to 99.9% and makes the benchmark 3.60x faster.

before: ruby 3.4.0dev (2024-01-05T16:51:37Z master 4d03140009) +YJIT [x86_64-linux]
after: ruby 3.4.0dev (2024-01-08T17:07:08Z yjit-ltlt 43edc3419f) +YJIT [x86_64-linux]

-------  -----------  ----------  ----------  ----------  -------------  ------------
bench    before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
nqueens  581.8        0.7         161.8       0.1         3.55           3.60
-------  -----------  ----------  ----------  ----------  -------------  ------------

@k0kubun k0kubun marked this pull request as ready for review January 5, 2024 23:46
@matzbot matzbot requested a review from a team January 5, 2024 23:46
if shift_amt > 63 || shift_amt < 0 {
    return false;
}

// Fallback to a C call if the shift amount varies
if asm.ctx.get_chain_depth() > 0 {
Member

Looks like this effectively disables the specialized path, with ./miniruby --yjit-dump-disasm --yjit-call-threshold=1 -e '23434 << 3' generating a call:

  # regenerate_branch
  # Block: <main>@-e:1 (chain_depth: 1)
  # reg_temps: 00000011
  # Insn: 0004 opt_ltlt (stack_size: 2)
  # call to Integer#<<
  # RUBY_VM_CHECK_INTS(ec)
  0x104d1803c: ldur w11, [x20, #0x20]
  0x104d18040: tst w11, w11
  <snip>

The check should probably be > 1. It looks like the defer_compilation comes from gen_send_general(), where it is unconditional, so chain_depth is already 1 on the first compile.
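The off-by-one described in this comment can be sketched with a toy model. This is only an illustration inferred from the thread: the names `path_for`, `DEFER_BUMP`, and `fallback_above` are made up here and do not correspond to YJIT's actual code, and the model assumes the deferral step raises the chain depth by exactly one.

```ruby
# Toy model only: names and numbers are illustrative, not YJIT internals.
# Assumption: the deferral step has already raised chain_depth by one.
DEFER_BUMP = 1

# Pick the code path for a block version, given its chain depth and the
# threshold above which the generic C call is used.
def path_for(chain_depth, fallback_above:)
  chain_depth > fallback_above ? :c_call : :specialized
end

first_compile    = 0 + DEFER_BUMP     # depth seen on the first compilation
after_guard_miss = first_compile + 1  # depth after one failed amount guard

p path_for(first_compile, fallback_above: 0)    # => :c_call
p path_for(first_compile, fallback_above: 1)    # => :specialized
p path_for(after_guard_miss, fallback_above: 1) # => :c_call
```

Under these assumptions, comparing against 0 sends even the first compilation to the C call, while comparing against 1 keeps the specialized path for the first shift amount and falls back only once the guard has failed.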

Member Author

Ugh, thanks for catching it. 43edc34

I don't get the point of this increment by the way. If it's only for the "Double defer!" detection, we could assign one bit from chain_depth_return_landing to it instead of overloading chain_depth. It seems counter-intuitive when you look at *_MAX_DEPTH constants.

@maximecb maximecb enabled auto-merge (squash) January 8, 2024 17:22
@maximecb maximecb merged commit a0eecfb into ruby:master Jan 8, 2024
93 of 94 checks passed
@k0kubun k0kubun deleted the yjit-ltlt branch January 8, 2024 17:49
@maximecb
Contributor

For some reason I'm not seeing any speedup on nqueens:

interp: ruby 3.4.0dev (2024-01-12T18:04:23Z master 0462b1b350) [arm64-darwin23]
yjit: ruby 3.4.0dev (2024-01-12T18:04:23Z master 0462b1b350) +YJIT dev [arm64-darwin23]

-------  -----------  ----------  ---------  ----------  ------------  -----------
bench    interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
nqueens  729.3        0.5         728.8      0.5         1.01          1.00       
-------  -----------  ----------  ---------  ----------  ------------  -----------
Legend:
- yjit 1st itr: ratio of interp/yjit time for the first benchmarking iteration.
- interp/yjit: ratio of interp/yjit time. Higher is better for yjit. Above 1 represents a speedup.

Wondering if maybe the last commit made this not work?

@k0kubun
Member Author

k0kubun commented Jan 12, 2024

Did you use --yjit-call-threshold=1? As I said in Shopify/yjit-bench#260 (comment), the script doesn't JIT the method by default, so it needs to be updated.
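When reproducing numbers like these, it can help to confirm from inside the script that YJIT is actually active before comparing timings. The following is a minimal sketch; the helper name `yjit_enabled?` is hypothetical, and the snippet does not check the call threshold itself.

```ruby
# Report whether this process is running with YJIT enabled.
# RubyVM::YJIT is absent on builds without YJIT, so check with defined? first.
def yjit_enabled?
  !!(defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?)
end

puts "YJIT enabled: #{yjit_enabled?}"
```

Running the script with `ruby --yjit --yjit-call-threshold=1 script.rb` should print `true` on a YJIT-capable build. Whether a given method is then compiled still depends on the threshold, as noted above.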

@maximecb
Copy link
Contributor

kk. I'll tweak the benchmark parameters.
