
arm64: Clean up SVE embedded masked codegen#127164

Draft
ylpoonlg wants to merge 1 commit into dotnet:main from ylpoonlg:github-movprfx_refactor_3

@ylpoonlg
Contributor

This PR is the last part of #115508. It makes the following changes:

  • Cleanup of hwintrinsiccodegenarm64.cpp:

    • Move the embedded masked block into a new function, genEmbeddedMaskedHWIntrinsic.
    • Combine the codepaths for the different operand counts and centralize the movprfx logic.
  • Optimizations to movprfx usage in embedded masked operation codegen:

    • Replace predicated movprfx with unpredicated movprfx when the mask is all-true. Unpredicated movprfx is generally preferred for performance.
    • Allow a zero falseOp to be contained when the mask is not all-true, so that a zeroing predicated movprfx can be used.
    • Allow unary embedded masked ops to use movprfx. Unary embedded masked ops are not RMW instructions, but they may also support movprfx, which requires the target register to be marked delay free in LSRA.
  • Fix the ConditionalSelect_ZeroOp calls in the Sve HardwareIntrinsics tests. The falseOp->IsVectorZero branch in the codegen was previously untested because the zero vector was passed as a local variable rather than as a constant vector. The zero vector needs to be passed directly into the ConditionalSelect intrinsic in the test templates.
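
The two movprfx choices described in the list above can be sketched as a small decision function. This is only an illustration of the stated rules, not the code in this PR: `MovprfxKind`, `chooseMovprfx`, and both parameter names are hypothetical, and any case the description does not spell out is deliberately left as `NotCoveredHere`.

```cpp
// Hypothetical names for illustration only; these are not the JIT's actual types.
enum class MovprfxKind
{
    Unpredicated,      // e.g. movprfx z0, z1
    PredicatedZeroing, // e.g. movprfx z0.s, p0/z, z1.s
    NotCoveredHere     // cases the PR description does not spell out
};

// Assumption: the decision depends only on whether the mask is all-true and
// whether falseOp is a zero vector. The real codegen considers more state.
constexpr MovprfxKind chooseMovprfx(bool maskIsAllTrue, bool falseOpIsZero)
{
    if (maskIsAllTrue)
    {
        // Every lane is active, so the predicate adds nothing and the
        // unpredicated form can be used instead.
        return MovprfxKind::Unpredicated;
    }

    if (falseOpIsZero)
    {
        // Inactive lanes must become zero, which the zeroing form of
        // predicated movprfx provides.
        return MovprfxKind::PredicatedZeroing;
    }

    return MovprfxKind::NotCoveredHere;
}
```

In this sketch the all-true check comes first, so an all-true mask selects the unpredicated form even when falseOp is zero; with no inactive lanes, the value of falseOp never reaches the result.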

* Move the embedded masked block to a new function.

* Combine codepaths for different number of operand cases.

* Optimise predicated movprfx into unpredicated movprfx when mask is all-true.

* Allow zero falseOp to be contained when mask is not all-true.

* Fix Sve HWIntrin tests ConditionalSelect ZeroOp. The zero vector needs
  to be passed directly as constant such that the falseOp->IsVectorZero
  branch can be tested.

* Fix LSRA delay free to allow unary embedded masked ops to use movprfx.
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 20, 2026
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 20, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.
