[RISCV] Narrow vector absolute value #82041
Conversation
If we have an abs(sext a) we can legally perform this as a sext (abs a). (See the same combine in instcombine - note that the IntMinIsPoison flag doesn't exist in SDAG yet.) On RVV, this is likely profitable because it may allow us to perform the arithmetic operations involved in the abs at a narrower LMUL before widening for the user. We could arguably avoid narrowing below DLEN, but the transform should at worst move around the sext and create one extra vsetvli toggle if the source could previously be handled via loads with explicit EEW.
@llvm/pr-subscribers-backend-risc-v Author: Philip Reames (preames)
Full diff: https://github.com/llvm/llvm-project/pull/82041.diff (2 files affected)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d4cee26d5f727f..2fc1418d5e768a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1417,7 +1417,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::SHL, ISD::STORE, ISD::SPLAT_VECTOR,
ISD::BUILD_VECTOR, ISD::CONCAT_VECTORS,
ISD::EXPERIMENTAL_VP_REVERSE, ISD::MUL,
- ISD::INSERT_VECTOR_ELT});
+ ISD::INSERT_VECTOR_ELT, ISD::ABS});
if (Subtarget.hasVendorXTHeadMemPair())
setTargetDAGCombine({ISD::LOAD, ISD::STORE});
if (Subtarget.useRVVForFixedLengthVectors())
@@ -15611,6 +15611,19 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
return DAG.getNode(ISD::AND, DL, VT, NewFMV,
DAG.getConstant(~SignBit, DL, VT));
}
+ case ISD::ABS: {
+ EVT VT = N->getValueType(0);
+ SDValue N0 = N->getOperand(0);
+ // abs (sext) -> sext (abs)
+ // abs (zext) -> zext (handled elsewhere)
+ if (VT.isVector() && N0.hasOneUse() && N0.getOpcode() == ISD::SIGN_EXTEND) {
+ SDValue Src = N0.getOperand(0);
+ SDLoc DL(N);
+ return DAG.getNode(ISD::SIGN_EXTEND, DL, VT,
+ DAG.getNode(ISD::ABS, DL, Src.getValueType(), Src));
+ }
+ break;
+ }
case ISD::ADD: {
if (SDValue V = combineBinOp_VLToVWBinOp_VL(N, DCI, Subtarget))
return V;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
index d2e0113e69b900..118c0743ab9698 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
@@ -152,12 +152,13 @@ declare <4 x i64> @llvm.abs.v4i64(<4 x i64>, i1)
define void @abs_v4i64_of_sext_v4i8(ptr %x) {
; CHECK-LABEL: abs_v4i64_of_sext_v4i8:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT: vsetivli zero, 4, e8, mf4, ta, ma
; CHECK-NEXT: vle8.v v8, (a0)
+; CHECK-NEXT: vrsub.vi v9, v8, 0
+; CHECK-NEXT: vmax.vv v8, v8, v9
+; CHECK-NEXT: vsetvli zero, zero, e64, m2, ta, ma
; CHECK-NEXT: vsext.vf8 v10, v8
-; CHECK-NEXT: vrsub.vi v8, v10, 0
-; CHECK-NEXT: vmax.vv v8, v10, v8
-; CHECK-NEXT: vse64.v v8, (a0)
+; CHECK-NEXT: vse64.v v10, (a0)
; CHECK-NEXT: ret
%a = load <4 x i8>, ptr %x
%a.ext = sext <4 x i8> %a to <4 x i64>
@@ -169,12 +170,13 @@ define void @abs_v4i64_of_sext_v4i8(ptr %x) {
define void @abs_v4i64_of_sext_v4i16(ptr %x) {
; CHECK-LABEL: abs_v4i64_of_sext_v4i16:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT: vsetivli zero, 4, e16, mf2, ta, ma
; CHECK-NEXT: vle16.v v8, (a0)
+; CHECK-NEXT: vrsub.vi v9, v8, 0
+; CHECK-NEXT: vmax.vv v8, v8, v9
+; CHECK-NEXT: vsetvli zero, zero, e64, m2, ta, ma
; CHECK-NEXT: vsext.vf4 v10, v8
-; CHECK-NEXT: vrsub.vi v8, v10, 0
-; CHECK-NEXT: vmax.vv v8, v10, v8
-; CHECK-NEXT: vse64.v v8, (a0)
+; CHECK-NEXT: vse64.v v10, (a0)
; CHECK-NEXT: ret
%a = load <4 x i16>, ptr %x
%a.ext = sext <4 x i16> %a to <4 x i64>
@@ -186,12 +188,13 @@ define void @abs_v4i64_of_sext_v4i16(ptr %x) {
define void @abs_v4i64_of_sext_v4i32(ptr %x) {
; CHECK-LABEL: abs_v4i64_of_sext_v4i32:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
; CHECK-NEXT: vle32.v v8, (a0)
+; CHECK-NEXT: vrsub.vi v9, v8, 0
+; CHECK-NEXT: vmax.vv v8, v8, v9
+; CHECK-NEXT: vsetvli zero, zero, e64, m2, ta, ma
; CHECK-NEXT: vsext.vf2 v10, v8
-; CHECK-NEXT: vrsub.vi v8, v10, 0
-; CHECK-NEXT: vmax.vv v8, v10, v8
-; CHECK-NEXT: vse64.v v8, (a0)
+; CHECK-NEXT: vse64.v v10, (a0)
; CHECK-NEXT: ret
%a = load <4 x i32>, ptr %x
%a.ext = sext <4 x i32> %a to <4 x i64>
case ISD::ABS: {
  EVT VT = N->getValueType(0);
  SDValue N0 = N->getOperand(0);
  // abs (sext) -> sext (abs)
Shouldn't this be abs(sext) -> zext(abs)?
InstCombine
// abs (sext X) --> zext (abs X*)
// Clear the IsIntMin (nsw) bit on the abs to allow narrowing.
if (match(IIOperand, m_OneUse(m_SExt(m_Value(X))))) {
Value *NarrowAbs =
Builder.CreateBinaryIntrinsic(Intrinsic::abs, X, Builder.getFalse());
return CastInst::Create(Instruction::ZExt, NarrowAbs, II->getType());
}
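To make the reviewer's point concrete, here is a small standalone C++ sketch (illustration only, not from the patch; the function names are made up) showing why the narrowed abs must be zero-extended rather than sign-extended when the narrow abs has wrapping (non-poison) semantics:

```cpp
#include <cassert>
#include <cstdint>

// Reference behavior: sign-extend first, then take abs at the wide type.
// No overflow is possible at i64 for an i8 input.
int64_t absOfSext(int8_t x) {
  int64_t w = static_cast<int64_t>(x); // sext i8 -> i64
  return w < 0 ? -w : w;
}

// Narrow abs with wrapping semantics (IntMinIsPoison = false):
// abs(-128) wraps back to -128 at i8. The cast back to int8_t is
// modular (well-defined since C++20; universal behavior in practice).
int8_t narrowAbs(int8_t x) {
  return static_cast<int8_t>(x < 0 ? -x : x);
}

// Incorrect rewrite: sext(abs x). For x = -128 this yields -128, not 128.
int64_t sextOfNarrowAbs(int8_t x) {
  return static_cast<int64_t>(narrowAbs(x));
}

// Correct rewrite: zext(abs x). The wrapped bit pattern 0x80 zero-extends
// to 128, which matches the reference for every i8 input.
int64_t zextOfNarrowAbs(int8_t x) {
  return static_cast<int64_t>(static_cast<uint8_t>(narrowAbs(x)));
}
```

The narrow abs result is non-negative for every input except the wrapped INT8_MIN case, where the bit pattern must be reinterpreted as unsigned before widening, hence zext.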
You're 100% right here. Change pushed.
LGTM
If we have an abs(sext a) we can legally perform this as a zext (abs a).
(See the same combine in instcombine - note that the IntMinIsPoison flag
doesn't exist in SDAG yet.)
On RVV, this is likely profitable because it may allow us to perform the arithmetic
operations involved in the abs at a narrower LMUL before widening for the user.
We could arguably avoid narrowing below DLEN, but the transform should at worst move
around the extend and create one extra vsetvli toggle if the source could previously be
handled via loads with explicit EEW.