Skip to content

[AArch64][SVE] Match (add_like x (lsr y, c)) -> usra x, y, c#197657

Open
Harry-Ramsey wants to merge 1 commit into
llvm:mainfrom
Harry-Ramsey:ramsey/sve-generate-ursa
Open

[AArch64][SVE] Match (add_like x (lsr y, c)) -> usra x, y, c#197657
Harry-Ramsey wants to merge 1 commit into
llvm:mainfrom
Harry-Ramsey:ramsey/sve-generate-ursa

Conversation

@Harry-Ramsey
Copy link
Copy Markdown
Contributor

Modify SVEE USRA pattern to accept add_like, so both add and or disjoint forms can select usra.

Add known-bits support for predicated SVE logical shifts, allowing or_disjoint matching to prove disjointness for plain ORs where possible.

@llvmorg-github-actions
Copy link
Copy Markdown

llvmorg-github-actions Bot commented May 14, 2026

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-selectiondag

Author: Harry Ramsey (Harry-Ramsey)

Changes

Modify SVEE USRA pattern to accept add_like, so both add and or disjoint forms can select usra.

Add known-bits support for predicated SVE logical shifts, allowing or_disjoint matching to prove disjointness for plain ORs where possible.


Full diff: https://github.com/llvm/llvm-project/pull/197657.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (-5)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+12)
  • (modified) llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sve2-sli-sri.ll (+16-8)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 9af378ec28868..29bad5c513c55 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -4525,11 +4525,6 @@ KnownBits SelectionDAG::computeKnownBits(SDValue Op, const APInt &DemandedElts,
   case ISD::INTRINSIC_WO_CHAIN:
   case ISD::INTRINSIC_W_CHAIN:
   case ISD::INTRINSIC_VOID:
-    // TODO: Probably okay to remove after audit; here to reduce change size
-    // in initial enablement patch for scalable vectors
-    if (Op.getValueType().isScalableVector())
-      break;
-
     // Allow the target to implement this method for its nodes.
     TLI->computeKnownBitsForTargetNode(Op, Known, DemandedElts, *this, Depth);
     break;
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 9de0b461a216c..e78fd18f6604a 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2902,6 +2902,18 @@ void AArch64TargetLowering::computeKnownBitsForTargetNode(
     }
     break;
   }
+  case AArch64ISD::SHL_PRED:
+  case AArch64ISD::SRL_PRED: {
+    KnownBits KnownVal =
+        DAG.computeKnownBits(Op->getOperand(1), DemandedElts, Depth + 1);
+    KnownBits KnownAmt =
+        DAG.computeKnownBits(Op->getOperand(2), DemandedElts, Depth + 1);
+
+    Known = Op.getOpcode() == AArch64ISD::SHL_PRED
+                ? KnownBits::shl(KnownVal, KnownAmt)
+                : KnownBits::lshr(KnownVal, KnownAmt);
+    break;
+  }
   case ISD::INTRINSIC_WO_CHAIN:
   case ISD::INTRINSIC_VOID: {
     unsigned IntNo = Op.getConstantOperandVal(0);
diff --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
index 9df77f8e93c64..e4c281a84d72b 100644
--- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -362,7 +362,7 @@ def AArch64uaba : PatFrags<(ops node:$op1, node:$op2, node:$op3),
 
 def AArch64usra : PatFrags<(ops node:$op1, node:$op2, node:$op3),
                            [(int_aarch64_sve_usra node:$op1, node:$op2, node:$op3),
-                            (add node:$op1, (AArch64lsr_p (SVEAnyPredicate), node:$op2, (SVEShiftSplatImmR (i32 node:$op3))))]>;
+                            (add_like node:$op1, (AArch64lsr_p (SVEAnyPredicate), node:$op2, (SVEShiftSplatImmR (i32 node:$op3))))]>;
 
 def AArch64ssra : PatFrags<(ops node:$op1, node:$op2, node:$op3),
                            [(int_aarch64_sve_ssra node:$op1, node:$op2, node:$op3),
diff --git a/llvm/test/CodeGen/AArch64/sve2-sli-sri.ll b/llvm/test/CodeGen/AArch64/sve2-sli-sri.ll
index 80999fb1f4864..a7048731850d3 100644
--- a/llvm/test/CodeGen/AArch64/sve2-sli-sri.ll
+++ b/llvm/test/CodeGen/AArch64/sve2-sli-sri.ll
@@ -118,14 +118,22 @@ define <vscale x 8 x i16> @testRightGood8x16(<vscale x 8 x i16> %src1, <vscale x
 }
 
 define <vscale x 8 x i16> @testRightBad8x16(<vscale x 8 x i16> %src1, <vscale x 8 x i16> %src2) {
-; CHECK-LABEL: testRightBad8x16:
-; CHECK:       // %bb.0:
-; CHECK-NEXT:    mov w8, #16500 // =0x4074
-; CHECK-NEXT:    lsr z1.h, z1.h, #14
-; CHECK-NEXT:    mov z2.h, w8
-; CHECK-NEXT:    and z0.d, z0.d, z2.d
-; CHECK-NEXT:    orr z0.d, z0.d, z1.d
-; CHECK-NEXT:    ret
+; SVE-LABEL: testRightBad8x16:
+; SVE:       // %bb.0:
+; SVE-NEXT:    mov w8, #16500 // =0x4074
+; SVE-NEXT:    lsr z1.h, z1.h, #14
+; SVE-NEXT:    mov z2.h, w8
+; SVE-NEXT:    and z0.d, z0.d, z2.d
+; SVE-NEXT:    orr z0.d, z0.d, z1.d
+; SVE-NEXT:    ret
+;
+; SVE2-LABEL: testRightBad8x16:
+; SVE2:       // %bb.0:
+; SVE2-NEXT:    mov w8, #16500 // =0x4074
+; SVE2-NEXT:    mov z2.h, w8
+; SVE2-NEXT:    and z0.d, z0.d, z2.d
+; SVE2-NEXT:    usra z0.h, z1.h, #14
+; SVE2-NEXT:    ret
   %and.i = and <vscale x 8 x i16> %src1, splat(i16 16500)
   %vshl_n = lshr <vscale x 8 x i16> %src2, splat(i16 14)
   %result = or <vscale x 8 x i16> %and.i, %vshl_n

case ISD::INTRINSIC_WO_CHAIN:
case ISD::INTRINSIC_W_CHAIN:
case ISD::INTRINSIC_VOID:
// TODO: Probably okay to remove after audit; here to reduce change size
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know if the audit concluded and this code can be removed?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks OK, looking at the existing nodes in architectures that have scalable vectors and how they use types.

Comment thread llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
Comment thread llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Outdated
case ISD::INTRINSIC_WO_CHAIN:
case ISD::INTRINSIC_W_CHAIN:
case ISD::INTRINSIC_VOID:
// TODO: Probably okay to remove after audit; here to reduce change size
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks OK, looking at the existing nodes in architectures that have scalable vectors and how they use types.

Comment thread llvm/test/CodeGen/AArch64/sve2-sli-sri.ll
@davemgreen davemgreen requested a review from david-arm May 15, 2026 10:32
@Harry-Ramsey
Copy link
Copy Markdown
Contributor Author

I have taken a moment to additionally create tests for SSRA. Let me know if the test names should be renamed.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 18, 2026

🪟 Windows x64 Test Results

  • 135232 tests passed
  • 3335 tests skipped

✅ The build succeeded and all tests passed.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 18, 2026

🐧 Linux x64 Test Results

  • 195946 tests passed
  • 5281 tests skipped

✅ The build succeeded and all tests passed.

@Harry-Ramsey Harry-Ramsey force-pushed the ramsey/sve-generate-ursa branch 2 times, most recently from 21e077d to 4e9e66c Compare May 20, 2026 11:03
@Harry-Ramsey Harry-Ramsey requested a review from davemgreen May 20, 2026 12:51
KnownBits KnownAmt =
DAG.computeKnownBits(Op->getOperand(2), DemandedElts, Depth + 1);

Known = Op.getOpcode() == AArch64ISD::SHL_PRED
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell this isn't tested any more, or isn't triggering where it was before? The add_like would be enough to trigger the or disjoint cases IIUC. It might be better to separate this known-bits part out so that it can be tested on it's own.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just a bit confused by what is meant here. The usra/ssra tests are still going down this code path with SVE2 so this should be tested?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The add_like tablegen change you have added will pick up "add like" from it being an add or a or_disjoint. or_disjoint in turn can either be an or with the disjoint flag or where we look at the known-bits for the two inputs having haveNoCommonBitsSet.

This bit is helping with the second known-bits check, but the new tests are going through the first path I believe, with a disjoint or flag. And as no other tests are changing, they are likely not tested any more or not working in some way. Some of the sri tests used to trigger IIRC.

The add_like change alone is enough to fix the issue, so if we can separate this known-bits part out and make sure it is tested separately, we can get the other part in and each half is simpler.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh - When I say different patch I should say different PR, so they can be reviewed and submitted separately.

ret <vscale x 16 x i8> %result
}

define <vscale x 16 x i8> @usra_disjoint_or16xi8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you separate these out into a new test file for usra+ssra or add it to an existing file like llvm/test/CodeGen/AArch64/sve2-sra.ll

Comment thread llvm/test/CodeGen/AArch64/sve2-sra.ll Outdated
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s | FileCheck %s
; RUN: llc -mattr=+sve2 < %s | FileCheck %s
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fit in with the other tests here, the #0 attributes can be added to the new functions to pick up +sve2.

KnownBits KnownAmt =
DAG.computeKnownBits(Op->getOperand(2), DemandedElts, Depth + 1);

Known = Op.getOpcode() == AArch64ISD::SHL_PRED
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The add_like tablegen change you have added will pick up "add like" from it being an add or a or_disjoint. or_disjoint in turn can either be an or with the disjoint flag or where we look at the known-bits for the two inputs having haveNoCommonBitsSet.

This bit is helping with the second known-bits check, but the new tests are going through the first path I believe, with a disjoint or flag. And as no other tests are changing, they are likely not tested any more or not working in some way. Some of the sri tests used to trigger IIRC.

The add_like change alone is enough to fix the issue, so if we can separate this known-bits part out and make sure it is tested separately, we can get the other part in and each half is simpler.

Modify SVEE USRA pattern to accept add_like, so both add and or disjoint
forms can select usra.

Add known-bits support for predicated SVE logical shifts, allowing
or_disjoint matching to prove disjointness for plain ORs where possible.
@Harry-Ramsey Harry-Ramsey force-pushed the ramsey/sve-generate-ursa branch 3 times, most recently from 72d39e9 to 321f9ea Compare May 29, 2026 08:27
Copy link
Copy Markdown
Contributor

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AArch64 llvm:SelectionDAG SelectionDAGISel as well

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants