YJIT: Optimize cmp REG, 0 into test REG, REG #7471
It's nice to be able to do this optimization, but if we keep adding passes, there is a risk that compilation time could become a concern. One way that we could evaluate compilation time is to measure the time taken by 30k_ifelse on the first iteration with
Otherwise, I would be tempted to ask: can we do this in the x86_merge pass you added the other day? Typically, peephole optimizations like this are done at the end, because the compiler may introduce inefficiencies during compilation.
I'll benchmark the 1st itr of 30k_ifelse on Linux and report it here.
What I wanted to explain in the code comments was that
Looks like the 1st itr of 30k_ifelse visualizes the compilation speed pretty well. The slowdown by
I'll think about removing at least one pass here. At least, maybe we could do this as part of
If we could do it in x86_split, that would be attractive 👍 Another potential alternative is to do it right when the instruction is inserted, by peeking at the previous instruction.
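As a rough illustration of the "do it when the instruction is inserted" idea, here is a minimal sketch on a toy assembler. The names `Operand`, `Op`, `ToyAssembler`, and `push_insn` are hypothetical and are not YJIT's actual backend API. It also assumes operands are already registers at insertion time, which may not hold in a real backend. For this particular rewrite no look-back at the previous instruction is needed, since it depends only on the operands of the instruction being pushed.

```rust
// Toy operand and instruction types (hypothetical, not YJIT's real IR).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operand {
    Reg(u8),
    Imm(i64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Cmp(Operand, Operand),
    Test(Operand, Operand),
}

// A toy assembler that collects instructions in a Vec.
#[derive(Default)]
struct ToyAssembler {
    insns: Vec<Op>,
}

impl ToyAssembler {
    // Rewrite `cmp reg, 0` into `test reg, reg` at insertion time,
    // so no additional pass over the instruction list is required.
    fn push_insn(&mut self, insn: Op) {
        let insn = match insn {
            Op::Cmp(Operand::Reg(r), Operand::Imm(0)) => {
                Op::Test(Operand::Reg(r), Operand::Reg(r))
            }
            other => other,
        };
        self.insns.push(insn);
    }

    // Convenience wrapper mirroring an `asm.cmp(left, right)` style call.
    fn cmp(&mut self, left: Operand, right: Operand) {
        self.push_insn(Op::Cmp(left, right));
    }
}
```

The trade-off in this sketch is that the rewrite costs one extra match per pushed instruction instead of a separate traversal of the whole instruction list.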
3d9ff80 to e01c1b8
e01c1b8 to 5b0a26c
I did that in
Looks good! Thanks for taking the time to validate the effect on compilation time :)
This implements the idea suggested at #7242 (comment). When checking equality of a register against 0, using the `test` instruction generates more compact code. This is particularly useful when you want to do something like `asm.cmp(block_handler, VM_BLOCK_HANDLER_NONE.into());` where you would need to assert `VM_BLOCK_HANDLER_NONE == 0` in order to use `asm.test`.
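To make the rewrite concrete, here is a minimal sketch of the peephole as a standalone pass over a toy instruction list. The `Opnd` and `Insn` types and the `cmp_zero_to_test` function are hypothetical stand-ins for illustration, not the code in this PR. The size saving comes from the encoding: for a 64-bit register, `cmp rax, 0` encodes as `48 83 F8 00` (4 bytes), while `test rax, rax` encodes as `48 85 C0` (3 bytes).

```rust
// Toy operand and instruction types (hypothetical, not YJIT's real IR).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Opnd {
    Reg(u8),
    Imm(i64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Insn {
    Cmp(Opnd, Opnd),
    Test(Opnd, Opnd),
    Other,
}

// Replace every `cmp reg, 0` with `test reg, reg`.
// Both forms set the zero and sign flags from the register's value and
// clear the carry and overflow flags, so conditional jumps that follow
// behave the same.
fn cmp_zero_to_test(insns: &[Insn]) -> Vec<Insn> {
    insns
        .iter()
        .map(|&insn| match insn {
            Insn::Cmp(Opnd::Reg(r), Opnd::Imm(0)) => {
                Insn::Test(Opnd::Reg(r), Opnd::Reg(r))
            }
            // Comparisons against non-zero immediates, and all other
            // instructions, are left unchanged.
            other => other,
        })
        .collect()
}
```

With a rewrite like this in the backend, the caller can keep writing the comparison against the named constant and does not need to know that the constant happens to be zero.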
Generated code
Before
After
Code size stats
On railsbench, inline code size went down by 0.1%.
Before
After