[DAG][AArch64] Handle vscale addressing modes in reassociationCanBreakAddressingModePattern #89908
Conversation
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes: reassociationCanBreakAddressingModePattern tries to prevent bad add reassociations that would break addressing-mode patterns. This adds support for vscale offset addressing modes, making sure we don't break patterns that already exist. It does not optimize to the correct addressing modes yet, but prevents us from optimizing away from them.

Full diff: https://github.com/llvm/llvm-project/pull/89908.diff

2 Files Affected:
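For context, an access of the shape the updated tests exercise might be written at the source level roughly like this (a hypothetical ACLE example; the function name is illustrative): a fixed byte offset plus one whole vector length, which AArch64 can fold into the [base, #imm, mul vl] form seen in the CHECK lines below.

#include <arm_sve.h>

// Load from b + 4 + vscale*16: a fixed offset plus one full SVE vector
// length (svcntb() bytes). Kept in this shape, the load can use the
// "ld1b { z0.b }, p0/z, [x8, #1, mul vl]" addressing mode.
svuint8_t load_off4_plus_1vl(const uint8_t *b, svbool_t pg) {
  return svld1_u8(pg, b + 4 + svcntb());
}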
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index fd265b12d73ca4..0ea5f58bb53270 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -1085,6 +1085,37 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
if (Opc != ISD::ADD || N0.getOpcode() != ISD::ADD)
return false;
+ // Check for vscale addressing modes.
+ // (load/store (add (add x, y), vscale))
+ // (load/store (add (add x, y), (lsl vscale, C)))
+ // (load/store (add (add x, y), (mul vscale, C)))
+ if ((N1.getOpcode() == ISD::VSCALE ||
+ ((N1.getOpcode() == ISD::SHL || N1.getOpcode() == ISD::MUL) &&
+ N1.getOperand(0).getOpcode() == ISD::VSCALE &&
+ isa<ConstantSDNode>(N1.getOperand(1)))) &&
+ N1.getValueSizeInBits() <= 64) {
+ unsigned ScalableOffset =
+ N1.getOpcode() == ISD::VSCALE
+ ? N1.getConstantOperandVal(0)
+ : (N1.getOperand(0).getConstantOperandVal(0) *
+ (N1.getOpcode() == ISD::SHL ? (1 << N1.getConstantOperandVal(1))
+ : N1.getConstantOperandVal(1)));
+ if (all_of(N->uses(), [&](SDNode *Node) {
+ if (auto *LoadStore = dyn_cast<MemSDNode>(Node)) {
+ TargetLoweringBase::AddrMode AM;
+ AM.HasBaseReg = true;
+ AM.ScalableOffset = ScalableOffset;
+ EVT VT = LoadStore->getMemoryVT();
+ unsigned AS = LoadStore->getAddressSpace();
+ Type *AccessTy = VT.getTypeForEVT(*DAG.getContext());
+ return TLI.isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy,
+ AS);
+ }
+ return false;
+ }))
+ return true;
+ }
+
auto *C2 = dyn_cast<ConstantSDNode>(N1);
if (!C2)
return false;
diff --git a/llvm/test/CodeGen/AArch64/sve-reassocadd.ll b/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
index 47ddab8e296478..9ae0a396d07929 100644
--- a/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
+++ b/llvm/test/CodeGen/AArch64/sve-reassocadd.ll
@@ -22,11 +22,9 @@ entry:
define <vscale x 16 x i8> @i8_4s_1v(ptr %b) {
; CHECK-LABEL: i8_4s_1v:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: rdvl x8, #1
; CHECK-NEXT: ptrue p0.b
-; CHECK-NEXT: mov w9, #4 // =0x4
-; CHECK-NEXT: add x8, x0, x8
-; CHECK-NEXT: ld1b { z0.b }, p0/z, [x8, x9]
+; CHECK-NEXT: add x8, x0, #4
+; CHECK-NEXT: ld1b { z0.b }, p0/z, [x8, #1, mul vl]
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i8, ptr %b, i64 4
@@ -58,11 +56,9 @@ entry:
define <vscale x 8 x i16> @i16_8s_1v(ptr %b) {
; CHECK-LABEL: i16_8s_1v:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: rdvl x8, #1
; CHECK-NEXT: ptrue p0.h
-; CHECK-NEXT: mov x9, #4 // =0x4
-; CHECK-NEXT: add x8, x0, x8
-; CHECK-NEXT: ld1h { z0.h }, p0/z, [x8, x9, lsl #1]
+; CHECK-NEXT: add x8, x0, #8
+; CHECK-NEXT: ld1h { z0.h }, p0/z, [x8, #1, mul vl]
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i8, ptr %b, i64 8
@@ -94,11 +90,9 @@ entry:
define <vscale x 8 x i16> @i16_8s_2v(ptr %b) {
; CHECK-LABEL: i16_8s_2v:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: rdvl x8, #2
; CHECK-NEXT: ptrue p0.h
-; CHECK-NEXT: mov x9, #4 // =0x4
-; CHECK-NEXT: add x8, x0, x8
-; CHECK-NEXT: ld1h { z0.h }, p0/z, [x8, x9, lsl #1]
+; CHECK-NEXT: add x8, x0, #8
+; CHECK-NEXT: ld1h { z0.h }, p0/z, [x8, #2, mul vl]
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i8, ptr %b, i64 8
@@ -130,11 +124,9 @@ entry:
define <vscale x 4 x i32> @i32_16s_2v(ptr %b) {
; CHECK-LABEL: i32_16s_2v:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: rdvl x8, #1
; CHECK-NEXT: ptrue p0.s
-; CHECK-NEXT: mov x9, #4 // =0x4
-; CHECK-NEXT: add x8, x0, x8
-; CHECK-NEXT: ld1w { z0.s }, p0/z, [x8, x9, lsl #2]
+; CHECK-NEXT: add x8, x0, #16
+; CHECK-NEXT: ld1w { z0.s }, p0/z, [x8, #1, mul vl]
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i8, ptr %b, i64 16
@@ -166,11 +158,9 @@ entry:
define <vscale x 2 x i64> @i64_32s_2v(ptr %b) {
; CHECK-LABEL: i64_32s_2v:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: rdvl x8, #1
; CHECK-NEXT: ptrue p0.d
-; CHECK-NEXT: mov x9, #4 // =0x4
-; CHECK-NEXT: add x8, x0, x8
-; CHECK-NEXT: ld1d { z0.d }, p0/z, [x8, x9, lsl #3]
+; CHECK-NEXT: add x8, x0, #32
+; CHECK-NEXT: ld1d { z0.d }, p0/z, [x8, #1, mul vl]
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i8, ptr %b, i64 32
Looks like a useful patch to me! I just have a few comments ...
((N1.getOpcode() == ISD::SHL || N1.getOpcode() == ISD::MUL) &&
 N1.getOperand(0).getOpcode() == ISD::VSCALE &&
 isa<ConstantSDNode>(N1.getOperand(1)))) &&
N1.getValueSizeInBits() <= 64) {
Can you use getScalarValueSizeInBits here instead to avoid the implicit TypeSize->uint64_t conversion?
N1.getOperand(0).getOpcode() == ISD::VSCALE &&
isa<ConstantSDNode>(N1.getOperand(1)))) &&
N1.getValueSizeInBits() <= 64) {
unsigned ScalableOffset =
A couple of things I've noticed here:
- The AddrMode struct's ScalableOffset field is actually int64_t and the offset can be negative, e.g. add (add x, y), vscale(-3). I think we need to use int64_t here.
- This code potentially accepts ISD::VSCALE nodes with result types of 8 or 16 bits, where there is a chance of overflow for large values of vscale. Is it worth just restricting this to integer types that are suitable for pointer arithmetic?
Is it possible to add some tests for negative vscale offsets too?
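For illustration, such a test might start from an access like this (a hypothetical ACLE example; the function name is illustrative): a fixed offset minus one vector length, which should stay foldable as a negative mul vl immediate.

#include <arm_sve.h>

// Load from b + 4 - vscale*16. If the combine preserves the pattern,
// this can use a negative scaled-immediate form such as
// "ld1b { z0.b }, p0/z, [x8, #-1, mul vl]".
svuint8_t load_off4_minus_1vl(const uint8_t *b, svbool_t pg) {
  return svld1_u8(pg, b + 4 - svcntb());
}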
…kAddressingModePattern. reassociationCanBreakAddressingModePattern tries to prevent bad add reassociations that would break addressing-mode patterns. This adds support for vscale offset addressing modes, making sure we don't break patterns that already exist. It does not optimize _to_ the correct addressing modes yet, but prevents us from optimizing _away_ from them.
Force-pushed from ad140d3 to 99fc468
I've added some negative tests and attempted to extend this to handle them well. In the long run we will want to do the opposite transform too, where we combine towards legal addressing modes, which might involve more changes.
LGTM! Thanks for making the changes.
LGTM