-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV] Use SHL_ADD in remaining strength reduce cases for MUL #89789
Conversation
The interesting bit is the zext folding. This is the first case where we end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y. See zext_mul68 from rv64zba.ll. The test differences are cases where we can legally fold (only because there's no one use check). These are not profitable or harmful, but we can't a oneuse check without breaking the zext_mul68 case. Note that THead doesn't appear to have the equivalent patterns so this only shows up in Zba.
@llvm/pr-subscribers-backend-risc-v Author: Philip Reames (preames) ChangesThe interesting bit is the zext folding. This is the first case where we end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y. See zext_mul68 from rv64zba.ll. The test differences are cases where we can legally fold (only because there's no one use check). These are not profitable or harmful, but we can't a oneuse check without breaking the zext_mul68 case. Note that XTHeadBa doesn't appear to have the equivalent patterns so this only shows up in Zba. Full diff: https://github.com/llvm/llvm-project/pull/89789.diff 4 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 605b4a66d6221b..9c66f09a0cbc85 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -13449,9 +13449,8 @@ static SDValue expandMul(SDNode *N, SelectionDAG &DAG,
SDValue X = DAG.getFreeze(N->getOperand(0));
SDValue Shift1 =
DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ShiftAmt, DL, VT));
- SDValue Shift2 =
- DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ScaleShift, DL, VT));
- return DAG.getNode(ISD::ADD, DL, VT, Shift1, Shift2);
+ return DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
+ DAG.getConstant(ScaleShift, DL, VT), Shift1);
}
}
@@ -13485,10 +13484,9 @@ static SDValue expandMul(SDNode *N, SelectionDAG &DAG,
SDValue X = DAG.getFreeze(N->getOperand(0));
SDValue Shift1 =
DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ShiftAmt, DL, VT));
- SDValue Shift2 =
- DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ScaleShift, DL, VT));
return DAG.getNode(ISD::ADD, DL, VT, Shift1,
- DAG.getNode(ISD::ADD, DL, VT, Shift2, X));
+ DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
+ DAG.getConstant(ScaleShift, DL, VT), X));
}
}
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
index 986148bca84950..ffe2b7e2712084 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
@@ -747,6 +747,8 @@ foreach i = {1,2,3} in {
defvar shxadd_uw = !cast<Instruction>("SH"#i#"ADD_UW");
def : Pat<(i64 (add_like_non_imm12 (shl (and GPR:$rs1, 0xFFFFFFFF), (i64 i)), (XLenVT GPR:$rs2))),
(shxadd_uw GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(i64 (riscv_shl_add (and GPR:$rs1, 0xFFFFFFFF), (i64 i), GPR:$rs2)),
+ (shxadd_uw GPR:$rs1, GPR:$rs2)>;
}
def : Pat<(i64 (add_like_non_imm12 (and (shl GPR:$rs1, (i64 1)), 0x1FFFFFFFF), (XLenVT GPR:$rs2))),
diff --git a/llvm/test/CodeGen/RISCV/rv64-legal-i32/xaluo.ll b/llvm/test/CodeGen/RISCV/rv64-legal-i32/xaluo.ll
index a1de326d16b536..1c794a1bd1684d 100644
--- a/llvm/test/CodeGen/RISCV/rv64-legal-i32/xaluo.ll
+++ b/llvm/test/CodeGen/RISCV/rv64-legal-i32/xaluo.ll
@@ -810,9 +810,9 @@ define zeroext i1 @umulo2.i32(i32 signext %v1, ptr %res) {
;
; RV64ZBA-LABEL: umulo2.i32:
; RV64ZBA: # %bb.0: # %entry
-; RV64ZBA-NEXT: zext.w a0, a0
-; RV64ZBA-NEXT: sh1add a2, a0, a0
-; RV64ZBA-NEXT: sh2add a2, a2, a0
+; RV64ZBA-NEXT: zext.w a2, a0
+; RV64ZBA-NEXT: sh1add.uw a0, a0, a2
+; RV64ZBA-NEXT: sh2add a2, a0, a2
; RV64ZBA-NEXT: srli a0, a2, 32
; RV64ZBA-NEXT: snez a0, a0
; RV64ZBA-NEXT: sw a2, 0(a1)
diff --git a/llvm/test/CodeGen/RISCV/xaluo.ll b/llvm/test/CodeGen/RISCV/xaluo.ll
index 1a88563c0ea2ed..b1efe53290e8ee 100644
--- a/llvm/test/CodeGen/RISCV/xaluo.ll
+++ b/llvm/test/CodeGen/RISCV/xaluo.ll
@@ -1759,9 +1759,9 @@ define zeroext i1 @umulo2.i32(i32 signext %v1, ptr %res) {
;
; RV64ZBA-LABEL: umulo2.i32:
; RV64ZBA: # %bb.0: # %entry
-; RV64ZBA-NEXT: zext.w a0, a0
-; RV64ZBA-NEXT: sh1add a2, a0, a0
-; RV64ZBA-NEXT: sh2add a2, a2, a0
+; RV64ZBA-NEXT: zext.w a2, a0
+; RV64ZBA-NEXT: sh1add.uw a0, a0, a2
+; RV64ZBA-NEXT: sh2add a2, a0, a2
; RV64ZBA-NEXT: srli a0, a2, 32
; RV64ZBA-NEXT: snez a0, a0
; RV64ZBA-NEXT: sw a2, 0(a1)
|
|
I can't reproduce this failure. I suspect you didn't use a recent enough ToT revision as the base of the patch. |
Baseline: c108653 |
As I said, too old. This change depends on 03760ad which is in ToT, and not in that revision. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Doing so avoids negative interactions with other combines which don't know the shl_add is a single instruction. From the commit log, we've had several combine loops already. This was originally posted as part of llvm#88791, where a bug was pointed out. That bug was fixed by llvm#89789 which hits the same issue from another angle. To confirm the fix, I included the reduce test case here.
Doing so avoids negative interactions with other combines which don't know the shl_add is a single instruction. From the commit log, we've had several combine loops already. This was originally posted as part of #88791, where a bug was pointed out. That bug was fixed by #89789 which hits the same issue from another angle. To confirm the fix, I included the reduced test case here.
The interesting bit is the zext folding. This is the first case where we end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y. See zext_mul68 from rv64zba.ll.
The test differences are cases where we can legally fold (only because there's no one use check). These are not profitable or harmful, but we can't a oneuse check without breaking the zext_mul68 case.
Note that XTHeadBa doesn't appear to have the equivalent patterns so this only shows up in Zba.