Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Narrow vector absolute value #82041

Merged
merged 2 commits into from
Feb 17, 2024

Conversation

preames
Copy link
Collaborator

@preames preames commented Feb 16, 2024

If we have a abs(sext a) we can legally perform this as a zext (abs a).
(See the same combine in instcombine - note that the IntMinIsPoison flag
doesn't exist in SDAG yet.)

On RVV, this is likely profitable because it may allow us to perform the arithmetic
operations involved in the abs at a narrower lmul before widening for the user.
We could arguably avoid narrowing below DLEN, but the transform should at worst move
around the extend and create one extra vsetvli toggle if the source could previously be
handled via loads explicit w/EEW.

If we have a abs(sext a) we can legally perform this as a sext (abs a).
(See the same combine in instcombine - note that the IntMinIsPoison flag
 doesn't exist in SDAG yet.)

On RVV, this is likely profitable because it may allow us to perform
the arithmetic operations involved in the abs at a narrower lmul
before widening for the user.  We could arguably avoid narrowing
below DLEN, but the transform should at worst move around the
sext and create one extra vsetvli toggle if the source could
previously be handled via loads explicit w/EEW.
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 16, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)

Changes

If we have a abs(sext a) we can legally perform this as a sext (abs a).
(See the same combine in instcombine - note that the IntMinIsPoison flag
doesn't exist in SDAG yet.)

On RVV, this is likely profitable because it may allow us to perform the arithmetic operations involved in the abs at a narrower lmul before widening for the user. We could arguably avoid narrowing below DLEN, but the transform should at worst move around the sext and create one extra vsetvli toggle if the source could previously be handled via loads explicit w/EEW.


Full diff: https://github.com/llvm/llvm-project/pull/82041.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+14-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll (+15-12)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d4cee26d5f727f..2fc1418d5e768a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1417,7 +1417,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
                          ISD::SHL, ISD::STORE, ISD::SPLAT_VECTOR,
                          ISD::BUILD_VECTOR, ISD::CONCAT_VECTORS,
                          ISD::EXPERIMENTAL_VP_REVERSE, ISD::MUL,
-                         ISD::INSERT_VECTOR_ELT});
+                         ISD::INSERT_VECTOR_ELT, ISD::ABS});
   if (Subtarget.hasVendorXTHeadMemPair())
     setTargetDAGCombine({ISD::LOAD, ISD::STORE});
   if (Subtarget.useRVVForFixedLengthVectors())
@@ -15611,6 +15611,19 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
     return DAG.getNode(ISD::AND, DL, VT, NewFMV,
                        DAG.getConstant(~SignBit, DL, VT));
   }
+  case ISD::ABS: {
+    EVT VT = N->getValueType(0);
+    SDValue N0 = N->getOperand(0);
+    // abs (sext) -> sext (abs)
+    // abs (zext) -> zext (handled elsewhere)
+    if (VT.isVector() && N0.hasOneUse() && N0.getOpcode() == ISD::SIGN_EXTEND) {
+      SDValue Src = N0.getOperand(0);
+      SDLoc DL(N);
+      return DAG.getNode(ISD::SIGN_EXTEND, DL, VT,
+                         DAG.getNode(ISD::ABS, DL, Src.getValueType(), Src));
+    }
+    break;
+  }
   case ISD::ADD: {
     if (SDValue V = combineBinOp_VLToVWBinOp_VL(N, DCI, Subtarget))
       return V;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
index d2e0113e69b900..118c0743ab9698 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
@@ -152,12 +152,13 @@ declare <4 x i64> @llvm.abs.v4i64(<4 x i64>, i1)
 define void @abs_v4i64_of_sext_v4i8(ptr %x) {
 ; CHECK-LABEL: abs_v4i64_of_sext_v4i8:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 4, e8, mf4, ta, ma
 ; CHECK-NEXT:    vle8.v v8, (a0)
+; CHECK-NEXT:    vrsub.vi v9, v8, 0
+; CHECK-NEXT:    vmax.vv v8, v8, v9
+; CHECK-NEXT:    vsetvli zero, zero, e64, m2, ta, ma
 ; CHECK-NEXT:    vsext.vf8 v10, v8
-; CHECK-NEXT:    vrsub.vi v8, v10, 0
-; CHECK-NEXT:    vmax.vv v8, v10, v8
-; CHECK-NEXT:    vse64.v v8, (a0)
+; CHECK-NEXT:    vse64.v v10, (a0)
 ; CHECK-NEXT:    ret
   %a = load <4 x i8>, ptr %x
   %a.ext = sext <4 x i8> %a to <4 x i64>
@@ -169,12 +170,13 @@ define void @abs_v4i64_of_sext_v4i8(ptr %x) {
 define void @abs_v4i64_of_sext_v4i16(ptr %x) {
 ; CHECK-LABEL: abs_v4i64_of_sext_v4i16:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 4, e16, mf2, ta, ma
 ; CHECK-NEXT:    vle16.v v8, (a0)
+; CHECK-NEXT:    vrsub.vi v9, v8, 0
+; CHECK-NEXT:    vmax.vv v8, v8, v9
+; CHECK-NEXT:    vsetvli zero, zero, e64, m2, ta, ma
 ; CHECK-NEXT:    vsext.vf4 v10, v8
-; CHECK-NEXT:    vrsub.vi v8, v10, 0
-; CHECK-NEXT:    vmax.vv v8, v10, v8
-; CHECK-NEXT:    vse64.v v8, (a0)
+; CHECK-NEXT:    vse64.v v10, (a0)
 ; CHECK-NEXT:    ret
   %a = load <4 x i16>, ptr %x
   %a.ext = sext <4 x i16> %a to <4 x i64>
@@ -186,12 +188,13 @@ define void @abs_v4i64_of_sext_v4i16(ptr %x) {
 define void @abs_v4i64_of_sext_v4i32(ptr %x) {
 ; CHECK-LABEL: abs_v4i64_of_sext_v4i32:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
 ; CHECK-NEXT:    vle32.v v8, (a0)
+; CHECK-NEXT:    vrsub.vi v9, v8, 0
+; CHECK-NEXT:    vmax.vv v8, v8, v9
+; CHECK-NEXT:    vsetvli zero, zero, e64, m2, ta, ma
 ; CHECK-NEXT:    vsext.vf2 v10, v8
-; CHECK-NEXT:    vrsub.vi v8, v10, 0
-; CHECK-NEXT:    vmax.vv v8, v10, v8
-; CHECK-NEXT:    vse64.v v8, (a0)
+; CHECK-NEXT:    vse64.v v10, (a0)
 ; CHECK-NEXT:    ret
   %a = load <4 x i32>, ptr %x
   %a.ext = sext <4 x i32> %a to <4 x i64>

case ISD::ABS: {
EVT VT = N->getValueType(0);
SDValue N0 = N->getOperand(0);
// abs (sext) -> sext (abs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be abs(sext) -> zext(abs)?

InstCombine

    // abs (sext X) --> zext (abs X*)                                            
    // Clear the IsIntMin (nsw) bit on the abs to allow narrowing.               
    if (match(IIOperand, m_OneUse(m_SExt(m_Value(X))))) {                        
      Value *NarrowAbs =                                                         
          Builder.CreateBinaryIntrinsic(Intrinsic::abs, X, Builder.getFalse());  
      return CastInst::Create(Instruction::ZExt, NarrowAbs, II->getType());      
    } 

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're 100% right here. Change pushed.

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@preames preames merged commit db836f6 into llvm:main Feb 17, 2024
4 checks passed
@preames preames deleted the pr-riscv-narrow-vector-abs branch February 17, 2024 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants