
[LLVM][CodeGen][SVE] Refactor isel of 128-bit constant splats. #185652

Merged
paulwalker-arm merged 1 commit into llvm:main from paulwalker-arm:sve-neon-sized-constants
Mar 11, 2026

Conversation

@paulwalker-arm
Collaborator

Rather than lowering constant splats that only SVE supports to scalable vectors, this patch keeps them as fixed-length vectors and adds isel patterns to select the necessary SVE instructions.

This lets us extend coverage to SVE operations that take an immediate operand without converting more of the DAG to scalable vectors, a conversion that can prevent larger NEON patterns from matching.

@llvmbot
Member

llvmbot commented Mar 10, 2026

@llvm/pr-subscribers-backend-aarch64

Author: Paul Walker (paulwalker-arm)

Changes

Rather than lowering constant splats that only SVE supports to scalable vectors, this patch keeps them as fixed-length vectors and adds isel patterns to select the necessary SVE instructions.

This lets us extend coverage to SVE operations that take an immediate operand without converting more of the DAG to scalable vectors, a conversion that can prevent larger NEON patterns from matching.


Full diff: https://github.com/llvm/llvm-project/pull/185652.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+3-4)
  • (modified) llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td (+3)
  • (modified) llvm/lib/Target/AArch64/SVEInstrFormats.td (+3)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index cd9de6c729649..20c4ff566defc 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -16024,10 +16024,9 @@ static SDValue trySVESplat64(SDValue Op, SelectionDAG &DAG,
     return SDValue();
 
   SDLoc DL(Op);
-  SDValue SplatVal = DAG.getSplatVector(MVT::nxv2i64, DL,
-                                        DAG.getConstant(Val64, DL, MVT::i64));
-  SDValue Res = convertFromScalableVector(DAG, MVT::v2i64, SplatVal);
-  return DAG.getNode(AArch64ISD::NVCAST, DL, VT, Res);
+  SDValue SplatVal = DAG.getNode(AArch64ISD::DUP, DL, MVT::v2i64,
+                                 DAG.getConstant(Val64, DL, MVT::i64));
+  return DAG.getNode(AArch64ISD::NVCAST, DL, VT, SplatVal);
 }
 
 static SDValue ConstantBuildVector(SDValue Op, SelectionDAG &DAG,
diff --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
index 273249b9ff44c..fab3dd41e89a1 100644
--- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -982,6 +982,9 @@ let Predicates = [HasSVE_or_SME] in {
   def : Pat<(nxv2i64 (splat_vector (i64 (SVECpyDupImm64Pat i32:$a, i32:$b)))),
             (DUP_ZI_D $a, $b)>;
 
+  def : Pat<(v2i64 (AArch64dup (i64 (SVECpyDupImm64Pat i32:$a, i32:$b)))),
+            (EXTRACT_SUBREG (DUP_ZI_D $a, $b), zsub)>;
+
   // Duplicate immediate FP into all vector elements.
   def : Pat<(nxv2f16 (splat_vector (f16 fpimm:$val))),
             (DUP_ZR_H (MOVi32imm (bitcast_fpimm_to_i32 f16:$val)))>;
diff --git a/llvm/lib/Target/AArch64/SVEInstrFormats.td b/llvm/lib/Target/AArch64/SVEInstrFormats.td
index 9d988b5654a6e..af1c8676a99ce 100644
--- a/llvm/lib/Target/AArch64/SVEInstrFormats.td
+++ b/llvm/lib/Target/AArch64/SVEInstrFormats.td
@@ -2176,6 +2176,9 @@ multiclass sve_int_dup_mask_imm<string asm> {
             (!cast<Instruction>(NAME) i64:$imm)>;
   def : Pat<(nxv2bf16 (splat_vector (bf16 (SVELogicalBFPImmPat i64:$imm)))),
             (!cast<Instruction>(NAME) i64:$imm)>;
+
+  def : Pat<(v2i64 (AArch64dup (i64 (SVELogicalImm64Pat i64:$imm)))),
+            (EXTRACT_SUBREG (!cast<Instruction>(NAME) i64:$imm), zsub)>;
 }
 
 //===----------------------------------------------------------------------===//
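
For readers unfamiliar with what `SVECpyDupImm64Pat` constrains in the new `v2i64` pattern: the architectural DUP (immediate) form carries a signed 8-bit value with an optional left shift by 8. The C++ below is a minimal standalone sketch of that range check, assuming the complex pattern accepts exactly this range for 64-bit elements. `CpyDupImm` and `encodeCpyDupImm64` are hypothetical names used for illustration only; they are not LLVM's API and are not taken from this patch.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical illustration type: the two operands a DUP (immediate)
// style pattern would produce. Not part of LLVM.
struct CpyDupImm {
  int8_t Imm;      // signed 8-bit payload
  unsigned Shift;  // left shift applied to Imm: 0 or 8
};

// Hypothetical helper: returns the (Imm, Shift) pair if Val can be expressed
// as a signed 8-bit immediate, optionally shifted left by 8; otherwise
// returns std::nullopt. Assumes 64-bit elements.
std::optional<CpyDupImm> encodeCpyDupImm64(int64_t Val) {
  // Unshifted form: -128..127.
  if (Val >= -128 && Val <= 127)
    return CpyDupImm{static_cast<int8_t>(Val), 0};
  // Shifted form: multiples of 256 in -32768..32512.
  if (Val >= -32768 && Val <= 32512 && Val % 256 == 0)
    return CpyDupImm{static_cast<int8_t>(Val / 256), 8};
  return std::nullopt;
}
```

Under these assumptions a splat of 123 is encodable without a shift and a splat of 256 needs the shifted form, while 257 cannot be expressed this way and would have to be materialised differently.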

@huntergr-arm
Collaborator

Looks ok overall.

> This lets us extend coverage to SVE operations that take an immediate operand without converting more of the DAG to scalable vectors, a conversion that can prevent larger NEON patterns from matching.

Does this mean you expect this PR to enable matching patterns for the immediate form which didn't before? If so, could you please add a test if it's not too convoluted to do so?

@paulwalker-arm
Collaborator Author

> This lets us extend coverage to SVE operations that take an immediate operand without converting more of the DAG to scalable vectors, a conversion that can prevent larger NEON patterns from matching.

> Does this mean you expect this PR to enable matching patterns for the immediate form which didn't before? If so, could you please add a test if it's not too convoluted to do so?

No, this patch is pretty much NFC (no functional change).

-                                        DAG.getConstant(Val64, DL, MVT::i64));
-  SDValue Res = convertFromScalableVector(DAG, MVT::v2i64, SplatVal);
-  return DAG.getNode(AArch64ISD::NVCAST, DL, VT, Res);
+  SDValue SplatVal = DAG.getNode(AArch64ISD::DUP, DL, MVT::v2i64,
Contributor


Is it possible to write a test that shows the benefit?

Collaborator Author


I don't believe so. This is purely refactoring at this stage.

Contributor

@david-arm david-arm left a comment


LGTM!

@paulwalker-arm merged commit 9ba92ff into llvm:main on Mar 11, 2026
12 checks passed
@paulwalker-arm deleted the sve-neon-sized-constants branch on March 11, 2026 at 11:32
Member

@MacDue MacDue left a comment


FYI: This looks like it regresses some MUL_ZI patterns.

E.g.,

define <2 x i64> @mul_v2i64(<2 x i64> %a) {
entry:
  %mul = mul <2 x i64> %a, splat (i64 123)
  ret <2 x i64> %mul
}

This used to lower to the immediate form, but it now results in:

	mov	z1.d, #123                  
	ptrue	p0.d, vl2
	mul	z0.d, p0/m, z0.d, z1.d                                     

The reason appears to be that the SVE_1_Op_Imm_Arith_Any_Predicate pattern only matches a splat_vector operand, not the AArch64dup node this patch now emits. I noticed this while rebasing #165559.
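
The mechanism behind this regression can be sketched with a toy model in standalone C++. Everything here is hypothetical and heavily simplified: `SplatKind`, `ConstSplat`, the two matcher functions, and `selectMul` are illustrative names and do not mirror LLVM's classes or TableGen patterns. The relaxed matcher is just one conceivable way to widen the match; it is not a description of the actual fix. The only architectural assumption is that the MUL immediate form takes a signed 8-bit value.

```cpp
#include <cstdint>
#include <string>

// Toy model only: which node kind produced the constant splat.
enum class SplatKind { SplatVector, Dup };

struct ConstSplat {
  SplatKind Kind;
  int64_t Value;
};

// Assumption: the multiply-immediate form accepts a signed 8-bit value.
bool fitsMulImm(int64_t V) { return V >= -128 && V <= 127; }

// Models a pattern that only recognises splat_vector operands.
bool matchStrict(const ConstSplat &S) {
  return S.Kind == SplatKind::SplatVector && fitsMulImm(S.Value);
}

// Models a hypothetical widened pattern that also accepts the dup node.
bool matchRelaxed(const ConstSplat &S) {
  return (S.Kind == SplatKind::SplatVector || S.Kind == SplatKind::Dup) &&
         fitsMulImm(S.Value);
}

// Picks the immediate form when the matcher fires, otherwise the
// register form (constant materialised into a vector register first).
std::string selectMul(const ConstSplat &S, bool (*Match)(const ConstSplat &)) {
  return Match(S) ? "immediate form" : "register form";
}
```

In this model, `selectMul({SplatKind::Dup, 123}, matchStrict)` falls back to the register form even though 123 fits the immediate range, which mirrors the `mov` + `ptrue` + `mul` sequence shown above.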

MacDue added a commit to MacDue/llvm-project that referenced this pull request Mar 11, 2026
@paulwalker-arm
Collaborator Author

Thanks @MacDue. #186090 adds test coverage and a fix.

ayasin-a pushed a commit to ayasin-a/llvm-project that referenced this pull request Apr 19, 2026
…185652)
