Help ensure we get the smallest encoding for commutative instructions #128521

Open
tannergooding wants to merge 6 commits into dotnet:main from tannergooding:vex2-encoding

Member

@tannergooding tannergooding commented May 23, 2026

The 2-byte VEX encoding allows REX.R to be on or off, but it requires REX.B to be off. This means that op1 can encode `[XMM0, XMM15]` while op2 can only encode `[XMM0, XMM7]`. Thus, when we have a commutative instruction, we want to check for this scenario and swap the operands. If neither operand is "extended", we get the 2-byte encoding regardless; if both are extended, we require the 3-byte encoding.

This doesn't handle some of the comparison cases that also qualify but which require swapping the comparison that is being done.

- vpand    xmm3, xmm3, xmm8
-                        ;; size=5
+ vpand    xmm3, xmm8, xmm3
+                        ;; size=4
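
For illustration, here is a minimal sketch of where the extra byte comes from. It is not the JIT's emitter: `encodeVex128_66_0F` is a hypothetical helper that only handles register-register instructions of the form `VEX.128.66.0F.WIG opcode /r` (such as `vpand`, opcode `0xDB`), with the destination in ModRM.reg, op1 in VEX.vvvv, and op2 in ModRM.rm.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not part of the JIT).
// dst -> ModRM.reg, src1 -> VEX.vvvv, src2 -> ModRM.rm; registers are 0..15.
std::vector<uint8_t> encodeVex128_66_0F(uint8_t opcode, unsigned dst, unsigned src1, unsigned src2)
{
    assert(dst < 16 && src1 < 16 && src2 < 16);

    // VEX stores the extension bits and vvvv in inverted form.
    const unsigned notR  = (dst < 8) ? 1u : 0u;  // extension bit for ModRM.reg
    const unsigned notB  = (src2 < 8) ? 1u : 0u; // extension bit for ModRM.rm
    const unsigned vvvv  = ~src1 & 0xFu;         // inverted src1 register number
    const unsigned pp    = 0x1u;                 // implied 66 prefix
    const unsigned modrm = 0xC0u | ((dst & 7u) << 3) | (src2 & 7u);

    std::vector<uint8_t> bytes;
    if (notB == 1u)
    {
        // 2-byte form: C5, [R' vvvv L pp]. There is no field for B, so it is implied off.
        bytes.push_back(0xC5);
        bytes.push_back(static_cast<uint8_t>((notR << 7) | (vvvv << 3) | pp));
    }
    else
    {
        // 3-byte form: C4, [R' X' B' mmmmm], [W vvvv L pp] with map 0F (mmmmm = 00001).
        bytes.push_back(0xC4);
        bytes.push_back(static_cast<uint8_t>((notR << 7) | (1u << 6) | (notB << 5) | 0x01u));
        bytes.push_back(static_cast<uint8_t>((vvvv << 3) | pp));
    }
    bytes.push_back(opcode);
    bytes.push_back(static_cast<uint8_t>(modrm));
    return bytes;
}
```

With this sketch, `encodeVex128_66_0F(0xDB, 3, 3, 8)` (`vpand xmm3, xmm3, xmm8`) takes the 3-byte prefix path, while `encodeVex128_66_0F(0xDB, 3, 8, 3)` (`vpand xmm3, xmm8, xmm3`) fits the 2-byte prefix, which matches the size change in the diff above.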

Copilot AI review requested due to automatic review settings May 23, 2026 14:03
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 23, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new “commutative” instruction flag for xarch SIMD/AVX instructions and uses it in the xarch emitter to opportunistically swap op1/op2 for commutative 3-operand SIMD instructions, improving the chances of using a smaller (2-byte) VEX prefix when only one source register is “extended”.

Changes:

  • Add INS_Flags_IsCommutative to insFlags and mark a set of commutative SIMD instructions with it.
  • Add emitter::IsAvxCommutative and use it in emitIns_SIMD_R_R_R to swap operands when op2 is extended but op1 is not.
  • Wire the new helper into the emitter’s xarch interface.
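
The swap condition described in the bullets above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `emitIns_SIMD_R_R_R`: `SketchInsFlags`, `isExtendedReg`, and `trySwapForShorterVex` are hypothetical names, and registers are modeled as plain numbers in the VEX range 0..15.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for the JIT's insFlags; only the shape of the check matters.
enum SketchInsFlags : uint32_t
{
    Sketch_None          = 0,
    Sketch_IsCommutative = 1u << 0, // plays the role of INS_Flags_IsCommutative
};

// Registers 8..15 need an extension bit in the prefix.
inline bool isExtendedReg(unsigned reg)
{
    return reg >= 8;
}

// Swaps op1/op2 in place when op2 (ModRM.rm) is extended and op1 (VEX.vvvv) is not,
// so that the 2-byte VEX prefix becomes usable. Returns true if a swap happened.
inline bool trySwapForShorterVex(uint32_t flags, unsigned& op1Reg, unsigned& op2Reg)
{
    assert(op1Reg < 16 && op2Reg < 16); // sketch assumption: VEX-encodable registers only

    const bool commutative = (flags & Sketch_IsCommutative) != 0;
    if (commutative && isExtendedReg(op2Reg) && !isExtendedReg(op1Reg))
    {
        std::swap(op1Reg, op2Reg);
        return true;
    }

    // Neither extended: already 2-byte. Both extended: 3-byte either way.
    return false;
}
```

Swapping only in the "op2 extended, op1 not" case keeps the check cheap and leaves the other three combinations untouched, since swapping cannot change their prefix size.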

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/instrsxarch.h | Marks specific VEX/EVEX SIMD instructions as commutative via INS_Flags_IsCommutative. |
| src/coreclr/jit/instr.h | Introduces the new INS_Flags_IsCommutative bit in the xarch insFlags enum. |
| src/coreclr/jit/emitxarch.h | Declares the new emitter::IsAvxCommutative helper. |
| src/coreclr/jit/emitxarch.cpp | Implements IsAvxCommutative and adds operand-swapping logic in emitIns_SIMD_R_R_R. |

Comment thread src/coreclr/jit/emitxarch.cpp Outdated
Comment thread src/coreclr/jit/emitxarch.cpp Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 23, 2026 14:15
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/jit/emitxarch.cpp
@tannergooding tannergooding marked this pull request as ready for review May 23, 2026 15:49
@tannergooding tannergooding requested review from Copilot May 23, 2026 15:49
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/jit/codegenxarch.cpp Outdated
@tannergooding
Member Author

CC. @dotnet/jit-contrib, @EgorBo. Small change that is a pure improvement, saves 4441-8094 bytes of codegen as we're able to more frequently emit the 2-byte VEX prefix for these commutative instructions.

As per the top post, the consideration is that the 2-byte VEX prefix requires REX.B to be "off", which means that the ModR/M r/m field (used to encode op2) is restricted to [xmm0, xmm7]. For commutative instructions, we can swap op1/op2 if only op2 is an extended register ([xmm8, xmm15]) and save that byte.

This should help minimize future diffs caused by LSRA changing preferences, but it won't eliminate them entirely due to all the non-commutative nodes. There's an opportunity to get some more cases as well, such as floating-point comparisons, but that requires fixing up the immediate control byte, so I left it for the future.
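
To make the "fixing up the immediate control byte" point concrete, here is a hedged sketch of what such a fix-up could look like for the eight base `vcmpps`/`vcmppd` predicates. It is not part of this PR; `swappedComparePredicate` is a hypothetical helper, and it assumes the VEX-encoded form so that the extended predicates (0x08 and above) are available.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only (not part of this PR).
// Given a base comparison predicate (0..7), returns the predicate that produces the
// same mask once the two source operands have been swapped.
inline uint8_t swappedComparePredicate(uint8_t imm)
{
    assert(imm < 8); // sketch assumption: base predicates only

    switch (imm)
    {
        case 0x01: return 0x0E; // LT_OS  (a < b)     becomes GT_OS  (b > a)
        case 0x02: return 0x0D; // LE_OS  (a <= b)    becomes GE_OS  (b >= a)
        case 0x05: return 0x0A; // NLT_US !(a < b)    becomes NGT_US !(b > a)
        case 0x06: return 0x09; // NLE_US !(a <= b)   becomes NGE_US !(b >= a)
        default:   return imm;  // EQ_OQ, UNORD_Q, NEQ_UQ, ORD_Q are symmetric
    }
}
```

The symmetric predicates could be treated like any other commutative instruction, while the ordered comparisons need the immediate rewritten alongside the operand swap.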

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
