Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Refine cost on Min/Max reduction #79402

Merged
merged 1 commit into from
Jan 30, 2024

Conversation

arcbbb
Copy link
Contributor

@arcbbb arcbbb commented Jan 25, 2024

This patch is split off from #77342, and follows #79103

  • Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X}
  • Add SplitCost which chains a series of VMAX/VMIN/... which scales with LMUL.
  • Use MVT to estimate VL.

This patch is split off from llvm#77342, and follows llvm#79103

- Correct for CodeSize cost that 1 instruction is not included. 3 is
  from {VMV.S, ReductionOp, VMV.X}
- Add SplitCost which chains a series of VMAX/VMIN/... which scales
    with LMUL.
- Use MVT to estimate VL.
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 25, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-risc-v

Author: Shih-Po Hung (arcbbb)

Changes

This patch is split off from #77342, and follows #79103

  • Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X}
  • Add SplitCost which chains a series of VMAX/VMIN/... which scales with LMUL.
  • Use MVT to estimate VL.

Patch is 107.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79402.diff

5 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+36-7)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-max.ll (+70-70)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-min.ll (+70-70)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-fp.ll (+42-42)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-int.ll (+48-48)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 0799af0a7bad550..0ab795ec99d8a87 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -942,13 +942,42 @@ RISCVTTIImpl::getMinMaxReductionCost(Intrinsic::ID IID, VectorType *Ty,
     return (LT.first - 1) + 3;
 
   // IR Reduction is composed by two vmv and one rvv reduction instruction.
-  InstructionCost BaseCost = 2;
-
-  if (CostKind == TTI::TCK_CodeSize)
-    return (LT.first - 1) + BaseCost;
-
-  unsigned VL = getEstimatedVLFor(Ty);
-  return (LT.first - 1) + BaseCost + Log2_32_Ceil(VL);
+  unsigned SplitOp;
+  SmallVector<unsigned, 3> Opcodes;
+  switch (IID) {
+  default:
+    llvm_unreachable("Unsupported intrinsic");
+  case Intrinsic::smax:
+    SplitOp = RISCV::VMAX_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAX_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::smin:
+    SplitOp = RISCV::VMIN_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMIN_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umax:
+    SplitOp = RISCV::VMAXU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAXU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umin:
+    SplitOp = RISCV::VMINU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMINU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::maxnum:
+    SplitOp = RISCV::VFMAX_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMAX_VS, RISCV::VFMV_F_S};
+    break;
+  case Intrinsic::minnum:
+    SplitOp = RISCV::VFMIN_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMIN_VS, RISCV::VFMV_F_S};
+    break;
+  }
+  // Add a cost for data larger than LMUL8
+  InstructionCost SplitCost =
+      (LT.first > 1) ? (LT.first - 1) *
+                           getRISCVInstructionCost(SplitOp, LT.second, CostKind)
+                     : 0;
+  return SplitCost + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
 }
 
 InstructionCost
diff --git a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
index c21b8520112cf88..11fa50d355e833d 100644
--- a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
@@ -51,14 +51,14 @@ define i32 @reduce_umax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
@@ -85,14 +85,14 @@ define i32 @reduce_umax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
@@ -115,18 +115,18 @@ define i32 @reduce_umax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
@@ -148,19 +148,19 @@ define i32 @reduce_umax_i64(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i64'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
@@ -221,14 +221,14 @@ define i32 @reduce_smax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
@@ -255,14 +255,14 @@ define i32 @reduce_smax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
@@ -285,18 +285,18 @@ define i32 @reduce_smax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.smax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.smax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.smax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.smax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.smax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an ...
[truncated]

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arcbbb arcbbb merged commit 2800448 into llvm:main Jan 30, 2024
5 of 6 checks passed
@arcbbb arcbbb deleted the cost-minmax-reduction branch January 30, 2024 08:47
hanhanW added a commit to iree-org/llvm-project that referenced this pull request Feb 1, 2024
@hanhanW
Copy link
Contributor

hanhanW commented Feb 1, 2024

Hi there, I bisected to this change in a downstream compiler crash on RISC-V. Here is the stack trace:

FAILED: tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb /work/build-linux-riscv_64/tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb 
cd /work/build-linux-riscv_64/tests/e2e/regression && /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
Unsupported intrinsic
UNREACHABLE executed at /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:954!
Please report issues to https://github.com/openxla/iree/issues and include the crash backtrace.
Stack dump:
0.	Program arguments: /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
 #0 0x00007f53278d96eb llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x00007f53278d7740 llvm::sys::RunSignalHandlers() /work/third_party/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00007f53278d9dcf SignalHandler(int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f532f3fd420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x00007f532106f00b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
 #5 0x00007f532104e859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
 #6 0x00007f5327859151 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0x6459151)
 #7 0x00007f532c7383c3 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0xb3383c3)
 #8 0x00007f532c73dadc llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getTypeBasedIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
 #9 0x00007f532c735d01 llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
#10 0x00007f532c735ba5 llvm::RISCVTTIImpl::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:837:17
#11 0x00007f532c6f217d llvm::TargetTransformInfoImplCRTPBase<llvm::RISCVTTIImpl>::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:1169:25
#12 0x00007f532e50c20d llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:270:3
#13 0x00007f532d372149 llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:411:12
#14 0x00007f532e2c00a4 llvm::InstructionCost::propagateState(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:57:19
#15 0x00007f532e2c00a4 llvm::InstructionCost::operator+=(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:100:5
#16 0x00007f532e2c00a4 llvm::CodeMetrics::analyzeBasicBlock(llvm::BasicBlock const*, llvm::TargetTransformInfo const&, llvm::SmallPtrSetImpl<llvm::Value const*> const&, bool) /work/third_party/llvm-project/llvm/lib/Analysis/CodeMetrics.cpp:180:14
#17 0x00007f532d9e5c9b llvm::FunctionSpecializer::run() /work/third_party/llvm-project/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:0:0
#18 0x00007f532d9df563 runIPSCCP(llvm::Module&, llvm::DataLayout const&, llvm::AnalysisManager<llvm::Function>*, std::function<llvm::TargetLibraryInfo const& (llvm::Function&)>, std::function<llvm::TargetTransformInfo& (llvm::Function&)>, std::function<llvm::AssumptionCache& (llvm::Function&)>, std::function<llvm::DominatorTree& (llvm::Function&)>, std::function<llvm::BlockFrequencyInfo& (llvm::Function&)>, bool) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:166:5
#19 0x00007f532d9df563 llvm::IPSCCPPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:415:8
#20 0x00007f532d20c77d llvm::detail::PassModel<llvm::Module, llvm::IPSCCPPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:5
#21 0x00007f532eaaf312 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:547:10
#22 0x00007f5329394380 llvm::SmallPtrSetImplBase::isSmall() const /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
#23 0x00007f5329394380 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:10
#24 0x00007f5329394380 llvm::PreservedAnalyses::~PreservedAnalyses() /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:172:7
#25 0x00007f5329394380 mlir::iree_compiler::IREE::HAL::runLLVMIRPasses(mlir::iree_compiler::IREE::HAL::LLVMTarget const&, llvm::TargetMachine*, llvm::Module*) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMIRPasses.cpp:105:5
#26 0x00007f532930b41f mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#27 0x00007f532930b41f mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#28 0x00007f532930b41f mlir::iree_compiler::IREE::HAL::LLVMCPUTargetBackend::serializeExecutable(mlir::iree_compiler::IREE::HAL::TargetBackend::SerializationOptions const&, mlir::iree_compiler::IREE::HAL::ExecutableVariantOp, mlir::OpBuilder&) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMCPUTarget.cpp:526:9
#29 0x00007f5328f955a0 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#30 0x00007f5328f955a0 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#31 0x00007f5328f955a0 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeTargetExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:87:11
#32 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#33 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#34 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#35 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#36 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#37 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#38 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#39 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#40 0x00007f5327a86d41 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_6>(long, mlir::OpPassManager&, mlir::Operation*) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#41 0x00007f5328f96425 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#42 0x00007f5328f96425 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#43 0x00007f5328f96425 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:118:9
#44 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#45 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#46 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#47 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#48 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#49 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#50 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#51 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#52 0x00007f5327a87e4e mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:810:5
#53 0x00007f5327a833fb mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#54 0x00007f5327a833fb mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#55 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:46:11
#56 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:92:10
#57 0x00007f5327a833fb mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:815:14
#58 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::runOnOperation(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:706:5
#59 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:517:20
#60 0x00007f5327a815a2 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#61 0x00007f5327a815a2 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#62 0x00007f5327a815a2 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#63 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#64 0x00007f5327a844b6 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#65 0x00007f5327a844b6 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#66 0x00007f5327a844b6 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#67 0x00007f5327a844b6 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:896:10
#68 0x00007f5327a8425f mlir::PassManager::run(mlir::Operation*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:0
#69 0x00007f532782e993 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#70 0x00007f532782e993 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#71 0x00007f532782e993 mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:958:7
#72 0x00007f532782e993 ireeCompilerInvocationPipeline /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1388:23
#73 0x00007f5327a4b779 mlir::iree_compiler::runIreecMain(int, char**)::$_0::operator()(iree_compiler_source_t*) const /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:247:11
#74 0x00007f5327a4aff0 mlir::iree_compiler::runIreecMain(int, char**) /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:348:10
#75 0x00007f5321050083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#76 0x000000000020102e _start (/work/full-build-dir/install/bin/iree-compile+0x20102e)
Aborted (core dumped)

I'm going to revert it locally, because many of our RISC-V tests and benchmarks are failing. I'm not sure how to generate a repro for you at this moment, but I will try to do that.

preames added a commit that referenced this pull request Feb 1, 2024
preames added a commit that referenced this pull request Feb 1, 2024
Reverts #79402. Crash reported. On closer inspection,
this patch does not handle Intrinsic::maximum and Intrinsic::minimum.
@preames
Copy link
Collaborator

preames commented Feb 1, 2024

Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

@hanhanW
Copy link
Contributor

hanhanW commented Feb 2, 2024

Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

Thanks!

@arcbbb
Copy link
Contributor Author

arcbbb commented Feb 3, 2024

Thanks @preames and @hanhanW . I created a test coverage for this in #80553

agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
Reverts llvm#79402. Crash reported. On closer inspection,
this patch does not handle Intrinsic::maximum and Intrinsic::minimum.
arcbbb added a commit to arcbbb/llvm-project that referenced this pull request Mar 15, 2024
The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs.
and if any element of the vector is a NaN.

Following llvm#79402, the patch add the cost of NaN check  (vmfne + vcpop)
arcbbb added a commit that referenced this pull request Mar 25, 2024
…mum (#80697)

The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs
if any element of the vector is a NaN.
Following #79402, the patch adds the cost for NaN check (vmfne + vcpop)
arcbbb added a commit that referenced this pull request Apr 1, 2024
This is recommitted as the test and fix for
llvm.vector.reduce.fmaximum/fminimum are covered in #80553 and #80697
arcbbb added a commit to arcbbb/llvm-project that referenced this pull request Apr 1, 2024
This is recommitted as the test and fix for
llvm.vector.reduce.fmaximum/fminimum are covered in llvm#80553 and llvm#80697
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants