-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV][TTI] Refine the cost of FCmp #88833
Conversation
- Support all fp predicates - Support f16 operations using zvfhmin in the absence of zvfh - Use the Val type to estimate the latency/throughput cost - Assign a cost of 1 for mask operations as LMULCost for mask types cannot be correctly estimated.
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-analysis Author: Shih-Po Hung (arcbbb) ChangesThis patch introduces following changes
Patch is 73.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88833.diff 3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 56f5bd8794ae76..3ed9a35198032e 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -91,6 +91,8 @@ RISCVTTIImpl::getRISCVInstructionCost(ArrayRef<unsigned> OpCodes, MVT VT,
case RISCV::VMV_S_X:
case RISCV::VFMV_F_S:
case RISCV::VFMV_S_F:
+ case RISCV::VMOR_MM:
+ case RISCV::VMAND_MM:
case RISCV::VMNAND_MM:
case RISCV::VCPOP_M:
Cost += 1;
@@ -1383,28 +1385,69 @@ InstructionCost RISCVTTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
getRISCVInstructionCost(RISCV::VMSLT_VV, LT.second, CostKind);
}
- if ((Opcode == Instruction::FCmp) && ValTy->isVectorTy()) {
+ if ((Opcode == Instruction::FCmp) && ValTy->isVectorTy() &&
+ CmpInst::isFPPredicate(VecPred)) {
+
+ // Use VMXOR_MM and VMXNOR_MM to generate all true/false mask
+ if ((VecPred == CmpInst::FCMP_FALSE) || (VecPred == CmpInst::FCMP_TRUE))
+ return getRISCVInstructionCost(RISCV::VMXOR_MM, LT.second, CostKind);
+
// If we do not support the input floating point vector type, use the base
// one which will calculate as:
// ScalarizeCost + Num * Cost for fixed vector,
// InvalidCost for scalable vector.
- if ((ValTy->getScalarSizeInBits() == 16 && !ST->hasVInstructionsF16()) ||
+ if ((ValTy->getScalarSizeInBits() == 16 && !ST->hasVInstructionsF16() &&
+ !ST->hasVInstructionsF16Minimal()) ||
(ValTy->getScalarSizeInBits() == 32 && !ST->hasVInstructionsF32()) ||
(ValTy->getScalarSizeInBits() == 64 && !ST->hasVInstructionsF64()))
return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,
I);
+
+ if ((ValTy->getScalarSizeInBits() == 16) && !ST->hasVInstructionsF16()) {
+ // pre-widening Op1 and Op2 to f32 before comparison
+ VectorType *VecF32Ty =
+ VectorType::get(Type::getDoubleTy(ValTy->getContext()),
+ cast<VectorType>(ValTy)->getElementCount());
+ std::pair<InstructionCost, MVT> VecF32LT =
+ getTypeLegalizationCost(VecF32Ty);
+ InstructionCost WidenCost =
+ 2 * getRISCVInstructionCost(RISCV::VFWCVT_F_F_V, VecF32LT.second,
+ CostKind);
+ InstructionCost CmpCost =
+ getCmpSelInstrCost(Opcode, VecF32Ty, CondTy, VecPred, CostKind, I);
+ return VecF32LT.first * WidenCost + CmpCost;
+ }
+
+ // Assuming vector fp compare and mask instructions are all the same cost
+ // until a need arises to differentiate them.
switch (VecPred) {
- // Support natively.
- case CmpInst::FCMP_OEQ:
- case CmpInst::FCMP_OGT:
- case CmpInst::FCMP_OGE:
- case CmpInst::FCMP_OLT:
- case CmpInst::FCMP_OLE:
- case CmpInst::FCMP_UNE:
- return LT.first * 1;
- // TODO: Other comparisons?
+ case CmpInst::FCMP_ONE: // vmflt.vv + vmflt.vv + vmor.mm
+ case CmpInst::FCMP_ORD: // vmfeq.vv + vmfeq.vv + vmand.mm
+ case CmpInst::FCMP_UNO: // vmfne.vv + vmfne.vv + vmor.mm
+ case CmpInst::FCMP_UEQ: // vmflt.vv + vmflt.vv + vmnor.mm
+ return LT.first * getRISCVInstructionCost(
+ {RISCV::VMFLT_VV, RISCV::VMFLT_VV, RISCV::VMOR_MM},
+ LT.second, CostKind);
+
+ case CmpInst::FCMP_UGT: // vmfle.vv + vmnot.m
+ case CmpInst::FCMP_UGE: // vmflt.vv + vmnot.m
+ case CmpInst::FCMP_ULT: // vmfle.vv + vmnot.m
+ case CmpInst::FCMP_ULE: // vmflt.vv + vmnot.m
+ return LT.first *
+ getRISCVInstructionCost({RISCV::VMFLT_VV, RISCV::VMNAND_MM},
+ LT.second, CostKind);
+
+ case CmpInst::FCMP_OEQ: // vmfeq.vv
+ case CmpInst::FCMP_OGT: // vmflt.vv
+ case CmpInst::FCMP_OGE: // vmfle.vv
+ case CmpInst::FCMP_OLT: // vmflt.vv
+ case CmpInst::FCMP_OLE: // vmfle.vv
+ case CmpInst::FCMP_UNE: // vmfne.vv
+ return LT.first *
+ getRISCVInstructionCost(RISCV::VMFLT_VV, LT.second, CostKind);
default:
- break;
+ return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,
+ I);
}
}
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-cmp.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-cmp.ll
index caa6d6f483a243..2632f6066ed715 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-cmp.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-cmp.ll
@@ -878,28 +878,28 @@ define void @fcmp_oeq() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = fcmp oeq <2 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f16 = fcmp oeq <4 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f16 = fcmp oeq <8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f16 = fcmp oeq <16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v16f16 = fcmp oeq <16 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f16 = fcmp oeq <vscale x 1 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f16 = fcmp oeq <vscale x 2 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f16 = fcmp oeq <vscale x 4 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f16 = fcmp oeq <vscale x 8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f16 = fcmp oeq <vscale x 16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv8f16 = fcmp oeq <vscale x 8 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv16f16 = fcmp oeq <vscale x 16 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32 = fcmp oeq <2 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32 = fcmp oeq <4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32 = fcmp oeq <8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32 = fcmp oeq <16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v8f32 = fcmp oeq <8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v16f32 = fcmp oeq <16 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f32 = fcmp oeq <vscale x 1 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f32 = fcmp oeq <vscale x 2 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f32 = fcmp oeq <vscale x 4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f32 = fcmp oeq <vscale x 8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f32 = fcmp oeq <vscale x 16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32 = fcmp oeq <vscale x 4 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv8f32 = fcmp oeq <vscale x 8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %nxv16f32 = fcmp oeq <vscale x 16 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64 = fcmp oeq <2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64 = fcmp oeq <4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64 = fcmp oeq <8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v4f64 = fcmp oeq <4 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v8f64 = fcmp oeq <8 x double> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f64 = fcmp oeq <vscale x 1 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f64 = fcmp oeq <vscale x 2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f64 = fcmp oeq <vscale x 4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f64 = fcmp oeq <vscale x 8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv2f64 = fcmp oeq <vscale x 2 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv4f64 = fcmp oeq <vscale x 4 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %nxv8f64 = fcmp oeq <vscale x 8 x double> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
;
%v2f16 = fcmp oeq <2 x half> undef, undef
@@ -938,31 +938,31 @@ define void @fcmp_oeq() {
define void @fcmp_one() {
; CHECK-LABEL: 'fcmp_one'
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = fcmp one <2 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f16 = fcmp one <4 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f16 = fcmp one <8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f16 = fcmp one <16 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f16 = fcmp one <vscale x 1 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f16 = fcmp one <vscale x 2 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f16 = fcmp one <vscale x 4 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f16 = fcmp one <vscale x 8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f16 = fcmp one <vscale x 16 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32 = fcmp one <2 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32 = fcmp one <4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32 = fcmp one <8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32 = fcmp one <16 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f32 = fcmp one <vscale x 1 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f32 = fcmp one <vscale x 2 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f32 = fcmp one <vscale x 4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f32 = fcmp one <vscale x 8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f32 = fcmp one <vscale x 16 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64 = fcmp one <2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64 = fcmp one <4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64 = fcmp one <8 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f64 = fcmp one <vscale x 1 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f64 = fcmp one <vscale x 2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f64 = fcmp one <vscale x 4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f64 = fcmp one <vscale x 8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = fcmp one <2 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v4f16 = fcmp one <4 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v8f16 = fcmp one <8 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v16f16 = fcmp one <16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv1f16 = fcmp one <vscale x 1 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv2f16 = fcmp one <vscale x 2 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv4f16 = fcmp one <vscale x 4 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %nxv8f16 = fcmp one <vscale x 8 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %nxv16f16 = fcmp one <vscale x 16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f32 = fcmp one <2 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v4f32 = fcmp one <4 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v8f32 = fcmp one <8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v16f32 = fcmp one <16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv1f32 = fcmp one <vscale x 1 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv2f32 = fcmp one <vscale x 2 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %nxv4f32 = fcmp one <vscale x 4 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %nxv8f32 = fcmp one <vscale x 8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %nxv16f32 = fcmp one <vscale x 16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2f64 = fcmp one <2 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %v4f64 = fcmp one <4 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %v8f64 = fcmp one <8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %nxv1f64 = fcmp one <vscale x 1 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %nxv2f64 = fcmp one <vscale x 2 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %nxv4f64 = fcmp one <vscale x 4 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %nxv8f64 = fcmp one <vscale x 8 x double> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
;
%v2f16 = fcmp one <2 x half> undef, undef
@@ -1004,28 +1004,28 @@ define void @fcmp_olt() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = fcmp olt <2 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f16 = fcmp olt <4 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f16 = fcmp olt <8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f16 = fcmp olt <16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v16f16 = fcmp olt <16 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f16 = fcmp olt <vscale x 1 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f16 = fcmp olt <vscale x 2 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f16 = fcmp olt <vscale x 4 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f16 = fcmp olt <vscale x 8 x half> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f16 = fcmp olt <vscale x 16 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv8f16 = fcmp olt <vscale x 8 x half> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv16f16 = fcmp olt <vscale x 16 x half> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32 = fcmp olt <2 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32 = fcmp olt <4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32 = fcmp olt <8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32 = fcmp olt <16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v8f32 = fcmp olt <8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v16f32 = fcmp olt <16 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f32 = fcmp olt <vscale x 1 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f32 = fcmp olt <vscale x 2 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f32 = fcmp olt <vscale x 4 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f32 = fcmp olt <vscale x 8 x float> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv16f32 = fcmp olt <vscale x 16 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32 = fcmp olt <vscale x 4 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv8f32 = fcmp olt <vscale x 8 x float> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %nxv16f32 = fcmp olt <vscale x 16 x float> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64 = fcmp olt <2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64 = fcmp olt <4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64 = fcmp olt <8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v4f64 = fcmp olt <4 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v8f64 = fcmp olt <8 x double> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv1f64 = fcmp olt <vscale x 1 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv2f64 = fcmp olt <vscale x 2 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv4f64 = fcmp olt <vscale x 4 x double> undef, undef
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %nxv8f64 = fcmp olt <vscale x 8 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %nxv2f64 = fcmp olt <vscale x 2 x double> undef, undef
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %nxv4f64 = fcmp olt <vscale x 4 x double> ...
[truncated]
|
; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %18 = select i1 undef, <vscale x 16 x i1> undef, <vscale x 16 x i1> undef | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %19 = select i1 undef, <vscale x 32 x i1> undef, <vscale x 32 x i1> undef | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %18 = select i1 undef, <vscale x 16 x i1> undef, <vscale x 16 x i1> undef | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %19 = select i1 undef, <vscale x 32 x i1> undef, <vscale x 32 x i1> undef |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is affected by the assignment of a cost of 1 to mask operations, because <vscale x N x i1>
translates directly into <vscale x N x i8>
in RISCVTargetLowering::getLMUL(MVT)
if ((ValTy->getScalarSizeInBits() == 16) && !ST->hasVInstructionsF16()) { | ||
// pre-widening Op1 and Op2 to f32 before comparison | ||
VectorType *VecF32Ty = | ||
VectorType::get(Type::getDoubleTy(ValTy->getContext()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is not to use getFloatTy
for f32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks!
Can you separate the "Support f16 operations using zvfhmin in the absence of zvfh" piece into it's own review? The rest of this looks pretty straight forward, and once that bit is pulled out, probably an easy LGTM |
Sure. will create a separate pr after this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
CmpInst::isFPPredicate(VecPred)) { | ||
|
||
// Use VMXOR_MM and VMXNOR_MM to generate all true/false mask | ||
if ((VecPred == CmpInst::FCMP_FALSE) || (VecPred == CmpInst::FCMP_TRUE)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have tests for this? I didn't see a true/false conditions in rvv-cmp.ll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, I forgot to include this. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This patch introduces following changes