CostModel/RISCV: tweak cost of vector ctpop under ZVBB #67020
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-analysis

Changes

Under RISC-V experimental-zvbb, vector variants of llvm.ctpop lower to a single instruction: vcpop.v. The cost model does not check for the Zvbb extension, and always assigns a high cost to vector variants of llvm.ctpop. Fix this defect.

-- 8< --

Full diff: https://github.com/llvm/llvm-project/pull/67020.diff

2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 6b950cd8a49fc09..2dbc3663046304e 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1074,6 +1074,12 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return LT.first;
     break;
   }
+  case Intrinsic::ctpop: {
+    auto LT = getTypeLegalizationCost(RetTy);
+    if (ST->hasVInstructions() && ST->hasStdExtZvbb() && LT.second.isVector())
+      return LT.first;
+    break;
+  }
   case Intrinsic::abs: {
     auto LT = getTypeLegalizationCost(RetTy);
     if (ST->hasVInstructions() && LT.second.isVector()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/ctpop.ll b/llvm/test/Analysis/CostModel/RISCV/ctpop.ll
new file mode 100644
index 000000000000000..dc07df9a3b40f02
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/RISCV/ctpop.ll
@@ -0,0 +1,95 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -S -mtriple=riscv64 -mattr=+v,+experimental-zvbb -riscv-v-vector-bits-min=-1 | FileCheck %s --check-prefix=ZVBB
+; Vector ctpop only exists under zvbb
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -S -mtriple=riscv64 -mattr=+v -riscv-v-vector-bits-min=-1 | FileCheck %s --check-prefix=NOZVBB
+
+define void @ctpop() {
+; ZVBB-LABEL: 'ctpop'
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %1 = call i8 @llvm.ctpop.i8(i8 undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %6 = call i16 @llvm.ctpop.i16(i16 undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %8 = call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %10 = call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %11 = call i32 @llvm.ctpop.i32(i32 undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %13 = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %14 = call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %15 = call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %16 = call i64 @llvm.ctpop.i64(i64 undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %17 = call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %18 = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %19 = call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %20 = call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
+; NOZVBB-LABEL: 'ctpop'
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %1 = call i8 @llvm.ctpop.i8(i8 undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %2 = call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %3 = call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %4 = call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %5 = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %6 = call i16 @llvm.ctpop.i16(i16 undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %7 = call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %8 = call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %9 = call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %10 = call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %11 = call i32 @llvm.ctpop.i32(i32 undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %12 = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %13 = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %14 = call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %15 = call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %16 = call i64 @llvm.ctpop.i64(i64 undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %17 = call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %18 = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %19 = call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %20 = call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
+ call i8 @llvm.ctpop.i8(i8 undef)
+ call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+ call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+ call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+ call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+ call i16 @llvm.ctpop.i16(i16 undef)
+ call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+ call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+ call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+ call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+ call i32 @llvm.ctpop.i32(i32 undef)
+ call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+ call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+ call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+ call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+ call i64 @llvm.ctpop.i64(i64 undef)
+ call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+ call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+ call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+ call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+ ret void
+}
+
+declare i8 @llvm.ctpop.i8(i8)
+declare <2 x i8> @llvm.ctpop.v2i8(<2 x i8>)
+declare <4 x i8> @llvm.ctpop.v4i8(<4 x i8>)
+declare <8 x i8> @llvm.ctpop.v8i8(<8 x i8>)
+declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
+declare i16 @llvm.ctpop.i16(i16)
+declare <2 x i16> @llvm.ctpop.v2i16(<2 x i16>)
+declare <4 x i16> @llvm.ctpop.v4i16(<4 x i16>)
+declare <8 x i16> @llvm.ctpop.v8i16(<8 x i16>)
+declare <16 x i16> @llvm.ctpop.v16i16(<16 x i16>)
+declare i32 @llvm.ctpop.i32(i32)
+declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)
+declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)
+declare <8 x i32> @llvm.ctpop.v8i32(<8 x i32>)
+declare <16 x i32> @llvm.ctpop.v16i32(<16 x i32>)
+declare i64 @llvm.ctpop.i64(i64)
+declare <2 x i64> @llvm.ctpop.v2i64(<2 x i64>)
+declare <4 x i64> @llvm.ctpop.v4i64(<4 x i64>)
+declare <8 x i64> @llvm.ctpop.v8i64(<8 x i64>)
+declare <16 x i64> @llvm.ctpop.v16i64(<16 x i64>)
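
For readers who want to reason about the new `case Intrinsic::ctpop` branch outside of LLVM, the decision it makes can be sketched as a small standalone model. This is only an illustration under stated assumptions: `SubtargetModel`, `LegalizedType`, `ctpopCostOverride`, and `ctpopCost` are hypothetical names that do not exist in LLVM, and the generic fallback cost is passed in as a parameter rather than computed.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the two subtarget queries used in the patch
// (ST->hasVInstructions() and ST->hasStdExtZvbb()).
struct SubtargetModel {
  bool HasVInstructions;
  bool HasStdExtZvbb;
};

// Hypothetical stand-in for the pair returned by getTypeLegalizationCost():
// Cost plays the role of LT.first, IsVector of LT.second.isVector().
struct LegalizedType {
  std::uint64_t Cost;
  bool IsVector;
};

// Mirrors the shape of the new case: return the legalization cost when the
// legalized type is a vector and both V and Zvbb are available; otherwise
// return nothing, which corresponds to the `break` that falls through to the
// generic cost computation.
inline std::optional<std::uint64_t>
ctpopCostOverride(const SubtargetModel &ST, const LegalizedType &LT) {
  if (ST.HasVInstructions && ST.HasStdExtZvbb && LT.IsVector)
    return LT.Cost;
  return std::nullopt;
}

// Convenience wrapper: GenericCost is an assumed input standing in for
// whatever the fallback path would have reported.
inline std::uint64_t ctpopCost(const SubtargetModel &ST,
                               const LegalizedType &LT,
                               std::uint64_t GenericCost) {
  return ctpopCostOverride(ST, LT).value_or(GenericCost);
}
```

With `Cost = 1` and a generic fallback of 12, this model gives 1 for a vector type when Zvbb is present and 12 when it is absent, matching the `<2 x i8>` lines in the ZVBB and NOZVBB check blocks above. Scalar types always take the fallback, which is consistent with the scalar cost of 4 under both prefixes.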
Code change LGTM once rebased over the revised test (see other review).
Force-pushed from ffe603b to a45f727.
@preames Fixed, thanks.
Force-pushed from a45f727 to 3c64a8b.