YJIT: Fallback Integer#<< if a shift amount varies #9426
yjit/src/codegen.rs
if shift_amt > 63 || shift_amt < 0 {
    return false;
}

// Fallback to a C call if the shift amount varies
if asm.ctx.get_chain_depth() > 0 {
Looks like this effectively disables the specialized path, with ./miniruby --yjit-dump-disasm --yjit-call-threshold=1 -e '23434 << 3' generating a call:
# regenerate_branch
# Block: <main>@-e:1 (chain_depth: 1)
# reg_temps: 00000011
# Insn: 0004 opt_ltlt (stack_size: 2)
# call to Integer#<<
# RUBY_VM_CHECK_INTS(ec)
0x104d1803c: ldur w11, [x20, #0x20]
0x104d18040: tst w11, w11
<snip>
Should be > 1 probably, and it looks like the defer_compilation is from gen_send_general() and it's unconditional.
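The effect of the threshold described above can be illustrated with a toy model. This is only a sketch, not YJIT's code: `Ctx`, `ShiftPath`, `choose_path`, and `model_defer` are hypothetical names, and the only assumption taken from the comment is that an unconditional defer leaves the chain depth at 1 on the first compile.

```rust
/// Hypothetical stand-in for the chain-depth counter kept in YJIT's context.
struct Ctx {
    chain_depth: u8,
}

/// Which code the JIT would emit for `Integer#<<` in this toy model.
#[derive(Debug, PartialEq)]
enum ShiftPath {
    Specialized,
    CCall,
}

/// Fall back to the C call once the chain depth exceeds `threshold`.
fn choose_path(ctx: &Ctx, threshold: u8) -> ShiftPath {
    if ctx.chain_depth > threshold {
        ShiftPath::CCall
    } else {
        ShiftPath::Specialized
    }
}

/// Toy model of an unconditional defer that bumps the chain depth by one.
fn model_defer(ctx: &mut Ctx) {
    ctx.chain_depth += 1;
}
```

In this model, a context that has been deferred once takes the C call with a threshold of 0 and the specialized path with a threshold of 1, which mirrors the `> 0` versus `> 1` distinction raised in the review.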
Ugh, thanks for catching it. 43edc34
I don't get the point of this increment by the way. If it's only for the "Double defer!" detection, we could assign one bit from chain_depth_return_landing to it instead of overloading chain_depth. It seems counter-intuitive when you look at *_MAX_DEPTH constants.
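One way to picture the suggestion is a single byte that carries the chain depth alongside separate flag bits. The sketch below is an assumption made for illustration, not the actual field layout: the bit positions, the `PackedFlags` type, and all method names are hypothetical.

```rust
// Hypothetical layout: bit 7 = return landing, bit 6 = deferred,
// low 6 bits = chain depth.
const RETURN_LANDING_BIT: u8 = 0b1000_0000;
const DEFERRED_BIT: u8 = 0b0100_0000;
const CHAIN_DEPTH_MASK: u8 = 0b0011_1111;

#[derive(Clone, Copy, Default, Debug, PartialEq)]
struct PackedFlags(u8);

impl PackedFlags {
    /// Chain depth, unaffected by either flag bit.
    fn chain_depth(self) -> u8 {
        self.0 & CHAIN_DEPTH_MASK
    }

    fn is_deferred(self) -> bool {
        (self.0 & DEFERRED_BIT) != 0
    }

    fn is_return_landing(self) -> bool {
        (self.0 & RETURN_LANDING_BIT) != 0
    }

    /// Record a defer without touching the depth counter.
    fn mark_deferred(&mut self) {
        self.0 |= DEFERRED_BIT;
    }

    /// Bump the depth; panics rather than overflowing into the flag bits.
    fn increment_chain_depth(&mut self) {
        assert!(self.chain_depth() < CHAIN_DEPTH_MASK, "chain depth overflow");
        self.0 += 1;
    }
}
```

With a dedicated bit, a double defer could be detected by checking `is_deferred()` before setting it, and depth comparisons against `*_MAX_DEPTH`-style limits would only count guard chains.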
For some reason I'm not seeing any speedup on
Wondering if maybe the last commit made this not work?
Did you use
kk. I'll tweak the benchmark parameters.
Integer#<< once and fallback to a C call if a shift amount varies.
amt_changed to amount_changed since the name seemed a bit cryptic.
On nqueens (Shopify/yjit-bench#260) with --yjit-call-threshold=1, this change pushes ratio_in_yjit from 0.0% to 99.9% and makes it 3.60x faster.
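The "specialize once, then fall back" policy in the description can be sketched as a small state machine. This is a simplified model under stated assumptions, not the JIT's implementation: `SiteState` and `observe` are hypothetical names, and the 0..=63 range is taken from the `shift_amt` check in the reviewed code.

```rust
/// State of one `Integer#<<` call site in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SiteState {
    /// Nothing compiled yet.
    Uncompiled,
    /// Specialized for one constant shift amount.
    Specialized(i64),
    /// Generic path (the C call in the description).
    Generic,
}

/// Advance the call site's state after observing a shift amount.
fn observe(state: SiteState, amt: i64) -> SiteState {
    match state {
        // First sighting: specialize only for amounts in 0..=63.
        SiteState::Uncompiled if (0..=63).contains(&amt) => SiteState::Specialized(amt),
        // Same amount as before: stay on the specialized path.
        SiteState::Specialized(seen) if seen == amt => state,
        // The amount varied or was out of range: go generic and stay there.
        _ => SiteState::Generic,
    }
}
```

A call site that always shifts by the same amount stays specialized, while one whose amount changes ends up generic after the first mismatch instead of leaving JIT code each time.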