[RISCV] More explicitly check that combineOp_VLToVWOp_VL removes the extends it is supposed to. #166710
These are test cases where we have an add and a sub with the same operands. One operand is a sign extend and the other is a zero extend. The sub can only form a vwsub.wv but because add is commutable, it could form vwadd.wv or vwaddu.wv depending on which extend is removed. We want to form vwadd.wv to match the sub so the vsext can be removed. Depending on the order of the instructions and the operand order of the add, we might form vwaddu.wv instead.
…extends it is supposed to. If we visit multiple root nodes, make sure the strategy chosen for other nodes includes the nodes we've already committed to remove. This can occur if we have add/sub nodes where one operand is a zext and the other is a sext. We might see that the nodes share a common extension but pick a strategy that doesn't share it. Stacked on llvm#166700
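The selection problem described above can be modelled outside of LLVM. The following is a minimal sketch, not the code in `RISCVISelLowering.cpp`: `Operand`, `Root`, `Candidate`, `candidatesFor`, `pick`, and `chooseOpcodes` are hypothetical names invented for illustration, and the sketch assumes every folded extend is recorded for removal (the actual patch records it only when `needToPromoteOtherUsers()` is true). It models the commuted case, `sub %a, %b` followed by `add %b, %a`, where `%a` is a zext and `%b` is a sext.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration; these are not the LLVM classes.
enum class ExtKind { SExt, ZExt };

struct Operand {
  int Id;       // identifies the extend node feeding the root
  ExtKind Kind; // how the narrow value was widened
};

struct Root {
  bool Commutable; // true for add, false for sub
  Operand LHS, RHS;
};

struct Candidate {
  std::string Opcode;
  std::optional<ExtKind> LHSExt; // set when the fold absorbs the LHS extend
  std::optional<ExtKind> RHSExt; // set when the fold absorbs the RHS extend
};

// Enumerate the .wv folds for a root. The RHS extend can always be absorbed;
// the LHS extend can be absorbed only when the operands may be swapped.
std::vector<Candidate> candidatesFor(const Root &R) {
  const std::string Base = R.Commutable ? "vwadd" : "vwsub";
  auto Name = [&Base](ExtKind K) {
    return Base + (K == ExtKind::ZExt ? "u.wv" : ".wv");
  };
  std::vector<Candidate> Out;
  Out.push_back({Name(R.RHS.Kind), std::nullopt, R.RHS.Kind});
  if (R.Commutable)
    Out.push_back({Name(R.LHS.Kind), R.LHS.Kind, std::nullopt});
  return Out;
}

// Return the first candidate that still removes every extend we have already
// committed to removing; this mirrors the two rejection checks in the patch.
std::optional<Candidate> pick(const Root &R,
                              const std::set<int> &ExtensionsToRemove) {
  for (const Candidate &C : candidatesFor(R)) {
    if (!C.LHSExt.has_value() && ExtensionsToRemove.count(R.LHS.Id))
      continue;
    if (!C.RHSExt.has_value() && ExtensionsToRemove.count(R.RHS.Id))
      continue;
    return C;
  }
  return std::nullopt;
}

// Visit the sub first, then the add, optionally applying the filter.
std::vector<std::string> chooseOpcodes(bool Filter) {
  const Operand ZextX{1, ExtKind::ZExt}, SextY{2, ExtKind::SExt};
  const Root Sub{false, ZextX, SextY}; // sub %a, %b
  const Root Add{true, SextY, ZextX};  // add %b, %a
  std::set<int> ExtensionsToRemove;
  std::vector<std::string> Opcodes;
  for (const Root &R : {Sub, Add}) {
    auto C = pick(R, Filter ? ExtensionsToRemove : std::set<int>{});
    if (!C)
      continue;
    Opcodes.push_back(C->Opcode);
    // Simplification: every folded extend is treated as committed for removal.
    if (C->LHSExt)
      ExtensionsToRemove.insert(R.LHS.Id);
    if (C->RHSExt)
      ExtensionsToRemove.insert(R.RHS.Id);
  }
  return Opcodes;
}
```

In this model, `chooseOpcodes(false)` selects `vwsub.wv` and `vwaddu.wv`, so the sext survives because the add no longer uses it in folded form. `chooseOpcodes(true)` rejects `vwaddu.wv` for the add and selects `vwadd.wv`, which lets both roots drop the same extend.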
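@llvm/pr-subscribers-backend-risc-v
Author: Craig Topper (topperc)
Changes
If we visit multiple root nodes, make sure the strategy chosen for other nodes includes the nodes we've already committed to remove.
This can occur if we have add/sub nodes where one operand is a zext and the other is a sext. We might see that the nodes share a common extension but pick a strategy that doesn't share it.
Stacked on #166700
Full diff: https://github.com/llvm/llvm-project/pull/166710.diff
2 Files Affected: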
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index c3f100e3197b1..3b5dfd5845f0b 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -17876,6 +17876,7 @@ static SDValue combineOp_VLToVWOp_VL(SDNode *N,
SmallVector<SDNode *> Worklist;
SmallPtrSet<SDNode *, 8> Inserted;
+ SmallPtrSet<SDNode *, 8> ExtensionsToRemove;
Worklist.push_back(N);
Inserted.insert(N);
SmallVector<CombineResult> CombinesToApply;
@@ -17886,8 +17887,10 @@ static SDValue combineOp_VLToVWOp_VL(SDNode *N,
NodeExtensionHelper LHS(Root, 0, DAG, Subtarget);
NodeExtensionHelper RHS(Root, 1, DAG, Subtarget);
auto AppendUsersIfNeeded = [&Worklist, &Subtarget,
- &Inserted](const NodeExtensionHelper &Op) {
+ &Inserted, &ExtensionsToRemove](const NodeExtensionHelper &Op) {
if (Op.needToPromoteOtherUsers()) {
+ // Remember that we're supposed to remove this extension.
+ ExtensionsToRemove.insert(Op.OrigOperand.getNode());
for (SDUse &Use : Op.OrigOperand->uses()) {
SDNode *TheUser = Use.getUser();
if (!NodeExtensionHelper::isSupportedRoot(TheUser, Subtarget))
@@ -17921,6 +17924,15 @@ static SDValue combineOp_VLToVWOp_VL(SDNode *N,
std::optional<CombineResult> Res =
FoldingStrategy(Root, LHS, RHS, DAG, Subtarget);
if (Res) {
+ // If this strategy wouldn't remove an extension we're supposed to
+ // remove, reject it.
+ if (!Res->LHSExt.has_value() &&
+ ExtensionsToRemove.contains(LHS.OrigOperand.getNode()))
+ continue;
+ if (!Res->RHSExt.has_value() &&
+ ExtensionsToRemove.contains(RHS.OrigOperand.getNode()))
+ continue;
+
Matched = true;
CombinesToApply.push_back(*Res);
// All the inputs that are extended need to be folded, otherwise
diff --git a/llvm/test/CodeGen/RISCV/rvv/vscale-vw-web-simplification.ll b/llvm/test/CodeGen/RISCV/rvv/vscale-vw-web-simplification.ll
index ad2ed47e67e64..034186210513c 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vscale-vw-web-simplification.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vscale-vw-web-simplification.ll
@@ -570,7 +570,82 @@ define <vscale x 2 x i32> @vwop_vscale_zext_i8i32_multiple_users(ptr %x, ptr %y,
ret <vscale x 2 x i32> %i
}
+define <vscale x 4 x i32> @mismatched_extend_sub_add(<vscale x 4 x i16> %x, <vscale x 4 x i16> %y) {
+; FOLDING-LABEL: mismatched_extend_sub_add:
+; FOLDING: # %bb.0:
+; FOLDING-NEXT: vsetvli a0, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vzext.vf2 v10, v8
+; FOLDING-NEXT: vsetvli zero, zero, e16, m1, ta, ma
+; FOLDING-NEXT: vwsub.wv v12, v10, v9
+; FOLDING-NEXT: vwadd.wv v10, v10, v9
+; FOLDING-NEXT: vsetvli zero, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vmul.vv v8, v12, v10
+; FOLDING-NEXT: ret
+ %a = zext <vscale x 4 x i16> %x to <vscale x 4 x i32>
+ %b = sext <vscale x 4 x i16> %y to <vscale x 4 x i32>
+ %c = sub <vscale x 4 x i32> %a, %b
+ %d = add <vscale x 4 x i32> %a, %b
+ %e = mul <vscale x 4 x i32> %c, %d
+ ret <vscale x 4 x i32> %e
+}
+
+; FIXME: this should remove the vsext
+define <vscale x 4 x i32> @mismatched_extend_sub_add_commuted(<vscale x 4 x i16> %x, <vscale x 4 x i16> %y) {
+; FOLDING-LABEL: mismatched_extend_sub_add_commuted:
+; FOLDING: # %bb.0:
+; FOLDING-NEXT: vsetvli a0, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vzext.vf2 v10, v8
+; FOLDING-NEXT: vsetvli zero, zero, e16, m1, ta, ma
+; FOLDING-NEXT: vwsub.wv v12, v10, v9
+; FOLDING-NEXT: vwadd.wv v10, v10, v9
+; FOLDING-NEXT: vsetvli zero, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vmul.vv v8, v12, v10
+; FOLDING-NEXT: ret
+ %a = zext <vscale x 4 x i16> %x to <vscale x 4 x i32>
+ %b = sext <vscale x 4 x i16> %y to <vscale x 4 x i32>
+ %c = sub <vscale x 4 x i32> %a, %b
+ %d = add <vscale x 4 x i32> %b, %a
+ %e = mul <vscale x 4 x i32> %c, %d
+ ret <vscale x 4 x i32> %e
+}
+define <vscale x 4 x i32> @mismatched_extend_add_sub(<vscale x 4 x i16> %x, <vscale x 4 x i16> %y) {
+; FOLDING-LABEL: mismatched_extend_add_sub:
+; FOLDING: # %bb.0:
+; FOLDING-NEXT: vsetvli a0, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vzext.vf2 v10, v8
+; FOLDING-NEXT: vsetvli zero, zero, e16, m1, ta, ma
+; FOLDING-NEXT: vwadd.wv v12, v10, v9
+; FOLDING-NEXT: vwsub.wv v10, v10, v9
+; FOLDING-NEXT: vsetvli zero, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vmul.vv v8, v12, v10
+; FOLDING-NEXT: ret
+ %a = zext <vscale x 4 x i16> %x to <vscale x 4 x i32>
+ %b = sext <vscale x 4 x i16> %y to <vscale x 4 x i32>
+ %c = add <vscale x 4 x i32> %a, %b
+ %d = sub <vscale x 4 x i32> %a, %b
+ %e = mul <vscale x 4 x i32> %c, %d
+ ret <vscale x 4 x i32> %e
+}
+
+define <vscale x 4 x i32> @mismatched_extend_add_sub_commuted(<vscale x 4 x i16> %x, <vscale x 4 x i16> %y) {
+; FOLDING-LABEL: mismatched_extend_add_sub_commuted:
+; FOLDING: # %bb.0:
+; FOLDING-NEXT: vsetvli a0, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vzext.vf2 v10, v8
+; FOLDING-NEXT: vsetvli zero, zero, e16, m1, ta, ma
+; FOLDING-NEXT: vwadd.wv v12, v10, v9
+; FOLDING-NEXT: vwsub.wv v10, v10, v9
+; FOLDING-NEXT: vsetvli zero, zero, e32, m2, ta, ma
+; FOLDING-NEXT: vmul.vv v8, v12, v10
+; FOLDING-NEXT: ret
+ %a = zext <vscale x 4 x i16> %x to <vscale x 4 x i32>
+ %b = sext <vscale x 4 x i16> %y to <vscale x 4 x i32>
+ %c = add <vscale x 4 x i32> %a, %b
+ %d = sub <vscale x 4 x i32> %a, %b
+ %e = mul <vscale x 4 x i32> %c, %d
+ ret <vscale x 4 x i32> %e
+}
;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
; RV32: {{.*}}
✅ With the latest revision this PR passed the C/C++ code formatter.
LGTM.
LGTM
ret <vscale x 4 x i32> %e
}

; FIXME: this should remove the vsext
This FIXME can be removed?
Nit
// we would be leaving the old input (since it may still be used),
If we visit multiple root nodes, make sure the strategy chosen
for other nodes includes the nodes we've already committed to remove.
This can occur if we have add/sub nodes where one operand is a zext
and the other is a sext. We might see that the nodes share a common
extension but pick a strategy that doesn't share it.