
[DAG][AArch64] Handle vscale addressing modes in reassociationCanBreakAddressingModePattern #89908

Merged 1 commit into llvm:main on May 10, 2024

Conversation

@davemgreen (Collaborator)

reassociationCanBreakAddressingModePattern tries to prevent bad add reassociations that would break addressing mode patterns. This adds support for vscale offset addressing modes, making sure we don't break patterns that already exist. It does not optimize to the correct addressing modes yet, but prevents us from optimizing away from them.
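
In effect, the new check asks the target whether an addressing mode of the form [base + C * vscale] is legal for the memory access that uses the add. A minimal sketch of that query, assuming a MemSDNode user LoadStore and a scalable byte offset ScalableOffset have already been extracted (illustrative only; see the full diff below):

  // Ask the target whether [BaseReg + ScalableOffset * vscale] is a legal
  // addressing mode for this access before letting reassociation rewrite it.
  TargetLoweringBase::AddrMode AM;
  AM.HasBaseReg = true;               // a register base is always present
  AM.ScalableOffset = ScalableOffset; // byte offset, scaled by vscale at runtime
  EVT MemVT = LoadStore->getMemoryVT();
  Type *AccessTy = MemVT.getTypeForEVT(*DAG.getContext());
  unsigned AddrSpace = LoadStore->getAddressSpace();
  bool KeepExistingForm =
      TLI.isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy, AddrSpace);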

@llvmbot (Collaborator) commented Apr 24, 2024

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

reassociationCanBreakAddressingModePattern tries to prevent bad add reassociations that would break addressing mode patterns. This adds support for vscale offset addressing modes, making sure we don't break patterns that already exist. It does not optimize to the correct addressing modes yet, but prevents us from optimizing away from them.


Full diff: https://github.com/llvm/llvm-project/pull/89908.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+31)
  • (modified) llvm/test/CodeGen/AArch64/sve-reassocadd.ll (+10-20)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index fd265b12d73ca4..0ea5f58bb53270 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -1085,6 +1085,37 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
   if (Opc != ISD::ADD || N0.getOpcode() != ISD::ADD)
     return false;
 
+  // Check for vscale addressing modes.
+  // (load/store (add (add x, y), vscale))
+  // (load/store (add (add x, y), (lsl vscale, C)))
+  // (load/store (add (add x, y), (mul vscale, C)))
+  if ((N1.getOpcode() == ISD::VSCALE ||
+       ((N1.getOpcode() == ISD::SHL || N1.getOpcode() == ISD::MUL) &&
+        N1.getOperand(0).getOpcode() == ISD::VSCALE &&
+        isa<ConstantSDNode>(N1.getOperand(1)))) &&
+      N1.getValueSizeInBits() <= 64) {
+    unsigned ScalableOffset =
+        N1.getOpcode() == ISD::VSCALE
+            ? N1.getConstantOperandVal(0)
+            : (N1.getOperand(0).getConstantOperandVal(0) *
+               (N1.getOpcode() == ISD::SHL ? (1 << N1.getConstantOperandVal(1))
+                                           : N1.getConstantOperandVal(1)));
+    if (all_of(N->uses(), [&](SDNode *Node) {
+          if (auto *LoadStore = dyn_cast<MemSDNode>(Node)) {
+            TargetLoweringBase::AddrMode AM;
+            AM.HasBaseReg = true;
+            AM.ScalableOffset = ScalableOffset;
+            EVT VT = LoadStore->getMemoryVT();
+            unsigned AS = LoadStore->getAddressSpace();
+            Type *AccessTy = VT.getTypeForEVT(*DAG.getContext());
+            return TLI.isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy,
+                                             AS);
+          }
+          return false;
+        }))
+      return true;
+  }
+
   auto *C2 = dyn_cast<ConstantSDNode>(N1);
   if (!C2)
     return false;
diff --git a/llvm/test/CodeGen/AArch64/sve-reassocadd.ll b/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
index 47ddab8e296478..9ae0a396d07929 100644
--- a/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
+++ b/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
@@ -22,11 +22,9 @@ entry:
 define <vscale x 16 x i8> @i8_4s_1v(ptr %b) {
 ; CHECK-LABEL: i8_4s_1v:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    rdvl x8, #1
 ; CHECK-NEXT:    ptrue p0.b
-; CHECK-NEXT:    mov w9, #4 // =0x4
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    ld1b { z0.b }, p0/z, [x8, x9]
+; CHECK-NEXT:    add x8, x0, #4
+; CHECK-NEXT:    ld1b { z0.b }, p0/z, [x8, #1, mul vl]
 ; CHECK-NEXT:    ret
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %b, i64 4
@@ -58,11 +56,9 @@ entry:
 define <vscale x 8 x i16> @i16_8s_1v(ptr %b) {
 ; CHECK-LABEL: i16_8s_1v:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    rdvl x8, #1
 ; CHECK-NEXT:    ptrue p0.h
-; CHECK-NEXT:    mov x9, #4 // =0x4
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    ld1h { z0.h }, p0/z, [x8, x9, lsl #1]
+; CHECK-NEXT:    add x8, x0, #8
+; CHECK-NEXT:    ld1h { z0.h }, p0/z, [x8, #1, mul vl]
 ; CHECK-NEXT:    ret
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %b, i64 8
@@ -94,11 +90,9 @@ entry:
 define <vscale x 8 x i16> @i16_8s_2v(ptr %b) {
 ; CHECK-LABEL: i16_8s_2v:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    rdvl x8, #2
 ; CHECK-NEXT:    ptrue p0.h
-; CHECK-NEXT:    mov x9, #4 // =0x4
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    ld1h { z0.h }, p0/z, [x8, x9, lsl #1]
+; CHECK-NEXT:    add x8, x0, #8
+; CHECK-NEXT:    ld1h { z0.h }, p0/z, [x8, #2, mul vl]
 ; CHECK-NEXT:    ret
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %b, i64 8
@@ -130,11 +124,9 @@ entry:
 define <vscale x 4 x i32> @i32_16s_2v(ptr %b) {
 ; CHECK-LABEL: i32_16s_2v:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    rdvl x8, #1
 ; CHECK-NEXT:    ptrue p0.s
-; CHECK-NEXT:    mov x9, #4 // =0x4
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    ld1w { z0.s }, p0/z, [x8, x9, lsl #2]
+; CHECK-NEXT:    add x8, x0, #16
+; CHECK-NEXT:    ld1w { z0.s }, p0/z, [x8, #1, mul vl]
 ; CHECK-NEXT:    ret
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %b, i64 16
@@ -166,11 +158,9 @@ entry:
 define <vscale x 2 x i64> @i64_32s_2v(ptr %b) {
 ; CHECK-LABEL: i64_32s_2v:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    rdvl x8, #1
 ; CHECK-NEXT:    ptrue p0.d
-; CHECK-NEXT:    mov x9, #4 // =0x4
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    ld1d { z0.d }, p0/z, [x8, x9, lsl #3]
+; CHECK-NEXT:    add x8, x0, #32
+; CHECK-NEXT:    ld1d { z0.d }, p0/z, [x8, #1, mul vl]
 ; CHECK-NEXT:    ret
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %b, i64 32


@david-arm david-arm left a comment


Looks like a useful patch to me! I just have a few comments ...

((N1.getOpcode() == ISD::SHL || N1.getOpcode() == ISD::MUL) &&
N1.getOperand(0).getOpcode() == ISD::VSCALE &&
isa<ConstantSDNode>(N1.getOperand(1)))) &&
N1.getValueSizeInBits() <= 64) {

Can you use getScalarValueSizeInBits here instead to avoid the implicit TypeSize->uint64_t conversion?
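
For illustration only (names taken from the surrounding code), the suggested form of the width check would be:

  // getValueSizeInBits() returns a TypeSize and relies on an implicit
  // conversion to uint64_t in the comparison; getScalarValueSizeInBits()
  // returns a plain unsigned, which is equivalent for the scalar N1 here.
  bool FitsIn64Bits = N1.getScalarValueSizeInBits() <= 64;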

N1.getOperand(0).getOpcode() == ISD::VSCALE &&
isa<ConstantSDNode>(N1.getOperand(1)))) &&
N1.getValueSizeInBits() <= 64) {
unsigned ScalableOffset =

A couple of things I've noticed here:

  1. The AddrMode struct's ScalableOffset field is actually int64_t and the offset can be negative, e.g. (add (add x, y), (vscale -3)). I think we need to use int64_t here (see the sketch after this list).
  2. This code potentially accepts ISD::VSCALE nodes with result types of 8 or 16 bits, where there is a chance of overflow for large values of vscale. Is it worth just restricting this to integer types that are suitable for pointer arithmetic?
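
A rough sketch of how both points could be addressed, using a signed extraction (illustrative, not the code that was eventually committed):

  // Point 1: keep the offset signed so negative multiples of vscale,
  // e.g. (add (add x, y), (vscale -3)), are represented correctly.
  int64_t ScalableOffset;
  if (N1.getOpcode() == ISD::VSCALE) {
    ScalableOffset = N1.getConstantOperandAPInt(0).getSExtValue();
  } else {
    int64_t VScaleMul = N1.getOperand(0).getConstantOperandAPInt(0).getSExtValue();
    int64_t Amt = N1.getConstantOperandVal(1);
    ScalableOffset = VScaleMul * (N1.getOpcode() == ISD::SHL
                                      ? (int64_t(1) << Amt)
                                      : Amt);
  }
  // Point 2 could be handled by only accepting N1 types suitable for pointer
  // arithmetic (e.g. matching the index width from DAG.getDataLayout()),
  // rather than any integer type of 64 bits or fewer.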


Is it possible to add some tests for negative vscale offsets too?

…kAddressingModePattern.

reassociationCanBreakAddressingModePattern tries to prevent bad add
reassociations that would break addressing mode patterns. This adds support for
vscale offset addressing modes, making sure we don't break patterns that
already exist. It does not optimize _to_ the correct addressing modes yet, but
prevents us from optimizing _away_ from them.
@davemgreen (Collaborator, Author)

I've added some negative tests and attempted to extend this to handle them well. In the long run we will want to do the opposite transform too, combining towards legal addressing modes, which might involve more changes.


@david-arm david-arm left a comment


LGTM! Thanks for making the changes.


@huntergr-arm huntergr-arm left a comment


LGTM

@davemgreen davemgreen merged commit 23b673e into llvm:main May 10, 2024
3 of 4 checks passed
@davemgreen davemgreen deleted the gh-a64-reassocvscale branch May 10, 2024 08:27
Labels: backend:AArch64, llvm:SelectionDAG
4 participants