Help ensure we get the smallest encoding for commutative instructions #128521
tannergooding wants to merge 6 commits into
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR introduces a new “commutative” instruction flag for xarch SIMD/AVX instructions and uses it in the xarch emitter to opportunistically swap op1/op2 for commutative 3-operand SIMD instructions, improving the chances of using a smaller (2-byte) VEX prefix when only one source register is “extended”.
Changes:
- Add `INS_Flags_IsCommutative` to `insFlags` and mark a set of commutative SIMD instructions with it.
- Add `emitter::IsAvxCommutative` and use it in `emitIns_SIMD_R_R_R` to swap operands when `op2` is extended but `op1` is not.
- Wire the new helper into the emitter’s xarch interface.
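The swap described in the second bullet can be sketched in isolation. This is a minimal illustration, not the PR's code: `Reg`, `isExtendedReg`, `Operands`, and `orderForSmallestEncoding` are hypothetical names, registers are modelled as plain numbers 0 to 15 standing for XMM0 to XMM15, and the `isCommutative` argument stands in for whatever `emitter::IsAvxCommutative` reports for the instruction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register model: 0..15 stand for XMM0..XMM15.
using Reg = std::uint8_t;

// A register is "extended" when it needs a fourth encoding bit (XMM8 and up).
inline bool isExtendedReg(Reg reg)
{
    return reg >= 8;
}

// The two source operands of a 3-operand SIMD instruction.
struct Operands
{
    Reg op1;
    Reg op2;
};

// Returns the source operands in the order that gives the best chance of the
// smaller encoding. The swap only happens when it is both legal (the
// instruction is commutative) and useful (op2 is extended, op1 is not).
inline Operands orderForSmallestEncoding(bool isCommutative, Reg op1, Reg op2)
{
    if (isCommutative && isExtendedReg(op2) && !isExtendedReg(op1))
    {
        return Operands{op2, op1};
    }
    return Operands{op1, op2};
}
```

With these assumptions, a commutative instruction with sources XMM1 and XMM9 comes back as XMM9, XMM1, while a non-commutative one, or one where both or neither source is extended, is left untouched.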
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/coreclr/jit/instrsxarch.h | Marks specific VEX/EVEX SIMD instructions as commutative via INS_Flags_IsCommutative. |
| src/coreclr/jit/instr.h | Introduces the new INS_Flags_IsCommutative bit in the xarch insFlags enum. |
| src/coreclr/jit/emitxarch.h | Declares the new emitter::IsAvxCommutative helper. |
| src/coreclr/jit/emitxarch.cpp | Implements IsAvxCommutative and adds operand-swapping logic in emitIns_SIMD_R_R_R. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
CC. @dotnet/jit-contrib, @EgorBo. Small change that is a pure improvement: it saves 4441-8094 bytes of codegen, as we're able to emit the 2-byte VEX prefix more frequently for these commutative instructions. As per the top post, the consideration is that the 2-byte VEX prefix requires `REX.B` to be off. This should help minimize future diffs caused by LSRA changing preferences, but won't eliminate them entirely due to all the non-commutative nodes. There's an opportunity to get some more cases as well, such as floating-point comparisons, but that requires fixing up the immediate control byte, so I left it for the future.
The VEX 2-byte encoding allows `REX.R` to be on or off while it requires that `REX.B` be off. This means that `op1` can encode `[XMM0, XMM15]` while `op2` can only encode `[XMM0, XMM7]`. Thus, when we have a commutative instruction we want to check for this scenario and swap the operands. If neither is "extended" then we'll get the 2-byte encoding regardless, while if both are extended we'll require the 3-byte encoding.

This doesn't handle some of the comparison cases that also qualify but which require swapping the comparison that is being done.
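To make the size difference concrete, here is a rough sketch of how the prefix length falls out of the operand registers. It is not taken from the emitter: `vexPrefixSize` is a hypothetical helper, and it assumes a register-only `ins dst, op1, op2` form in the 0F opcode map with `W` clear, where `dst` goes in ModRM.reg, `op1` in `VEX.vvvv`, and `op2` in ModRM.r/m.

```cpp
#include <cassert>

// Hypothetical helper: bytes of VEX prefix needed for `ins dst, op1, op2`
// with register operands numbered 0..15 (XMM0..XMM15).
//
// Assumes the 0F opcode map and W = 0, so the only thing that can force the
// 3-byte form here is the B bit, which the 2-byte form cannot express.
inline int vexPrefixSize(unsigned /*dst*/, unsigned /*op1*/, unsigned op2)
{
    // dst -> ModRM.reg, extended by R, which both prefix forms carry.
    // op1 -> VEX.vvvv, a full 4-bit field in both prefix forms.
    // op2 -> ModRM.r/m, extended by B, which only the 3-byte form carries.
    return (op2 >= 8) ? 3 : 2;
}
```

Under these assumptions, `vexPrefixSize(0, 1, 9)` is 3 but the swapped `vexPrefixSize(0, 9, 1)` is 2, which is the one byte per instruction that the operand swap recovers. When both sources are extended the result is 3 in either order, matching the description above.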