CostModel/RISCV: tweak cost of vector ctpop under ZVBB #67020

Merged on Sep 27, 2023 (1 commit)

artagnon (Contributor) commented Sep 21, 2023

Under RISC-V with experimental-zvbb, vector variants of llvm.ctpop lower to a single instruction, vcpop.v. The cost model does not check for the Zvbb extension, and always assigns a high cost to vector variants of llvm.ctpop. Fix this defect.
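
The decision described above can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: `ToySubtarget`, `ToyLegalizedType`, `ctpopCost`, and the `FallbackCost` parameter are hypothetical names invented here and are not LLVM types or APIs. They stand in for the subtarget feature queries, the result of type legalization, and whatever cost the generic path produces when the new case does not apply.

```cpp
#include <cassert>

// Toy stand-in for the subtarget feature queries (NOT an LLVM type).
struct ToySubtarget {
  bool HasV;    // models ST->hasVInstructions()
  bool HasZvbb; // models ST->hasStdExtZvbb()
};

// Toy stand-in for the type-legalization result (NOT an LLVM type).
struct ToyLegalizedType {
  unsigned NumParts; // models LT.first, the legalization cost factor
  bool IsVector;     // models LT.second.isVector()
};

// Returns the modeled cost of a ctpop call.
// FallbackCost is a hypothetical parameter representing the cost that the
// generic path would report when the Zvbb fast path is not taken.
unsigned ctpopCost(const ToySubtarget &ST, const ToyLegalizedType &LT,
                   unsigned FallbackCost) {
  // With V and Zvbb, a legal vector ctpop is one instruction per part.
  if (ST.HasV && ST.HasZvbb && LT.IsVector)
    return LT.NumParts;
  // Scalars, and vectors without Zvbb, keep the pre-existing cost.
  return FallbackCost;
}
```

In this model, a vector type that legalizes to one part costs 1 with both features enabled, while the same query without Zvbb simply returns the fallback value passed in.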

llvmbot (Collaborator) commented Sep 21, 2023

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-analysis

Changes


-- 8< --
Based on #67013. Please review only the second patch.


Full diff: https://github.com/llvm/llvm-project/pull/67020.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+6)
  • (added) llvm/test/Analysis/CostModel/RISCV/ctpop.ll (+95)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 6b950cd8a49fc09..2dbc3663046304e 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1074,6 +1074,12 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return LT.first;
     break;
   }
+  case Intrinsic::ctpop: {
+    auto LT = getTypeLegalizationCost(RetTy);
+    if (ST->hasVInstructions() && ST->hasStdExtZvbb() && LT.second.isVector())
+      return LT.first;
+    break;
+  }
   case Intrinsic::abs: {
     auto LT = getTypeLegalizationCost(RetTy);
     if (ST->hasVInstructions() && LT.second.isVector()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/ctpop.ll b/llvm/test/Analysis/CostModel/RISCV/ctpop.ll
new file mode 100644
index 000000000000000..dc07df9a3b40f02
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/RISCV/ctpop.ll
@@ -0,0 +1,95 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -S -mtriple=riscv64 -mattr=+v,+experimental-zvbb -riscv-v-vector-bits-min=-1 | FileCheck %s --check-prefix=ZVBB
+; Vector ctpop only exists under zvbb
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -S -mtriple=riscv64 -mattr=+v -riscv-v-vector-bits-min=-1 | FileCheck %s --check-prefix=NOZVBB
+
+define void @ctpop() {
+; ZVBB-LABEL: 'ctpop'
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %1 = call i8 @llvm.ctpop.i8(i8 undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %4 = call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %5 = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %6 = call i16 @llvm.ctpop.i16(i16 undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %8 = call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %9 = call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %10 = call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %11 = call i32 @llvm.ctpop.i32(i32 undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %13 = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %14 = call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %15 = call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %16 = call i64 @llvm.ctpop.i64(i64 undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %17 = call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %18 = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %19 = call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %20 = call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+; ZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
+; NOZVBB-LABEL: 'ctpop'
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %1 = call i8 @llvm.ctpop.i8(i8 undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %2 = call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %3 = call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %4 = call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %5 = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %6 = call i16 @llvm.ctpop.i16(i16 undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 19 for instruction: %7 = call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 19 for instruction: %8 = call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 19 for instruction: %9 = call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 19 for instruction: %10 = call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %11 = call i32 @llvm.ctpop.i32(i32 undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %12 = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %13 = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %14 = call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %15 = call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %16 = call i64 @llvm.ctpop.i64(i64 undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %17 = call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %18 = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %19 = call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %20 = call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+; NOZVBB-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
+  call i8 @llvm.ctpop.i8(i8 undef)
+  call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> undef)
+  call <4 x i8> @llvm.ctpop.v4i8(<4 x i8> undef)
+  call <8 x i8> @llvm.ctpop.v8i8(<8 x i8> undef)
+  call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> undef)
+  call i16 @llvm.ctpop.i16(i16 undef)
+  call <2 x i16> @llvm.ctpop.v2i16(<2 x i16> undef)
+  call <4 x i16> @llvm.ctpop.v4i16(<4 x i16> undef)
+  call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> undef)
+  call <16 x i16> @llvm.ctpop.v16i16(<16 x i16> undef)
+  call i32 @llvm.ctpop.i32(i32 undef)
+  call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> undef)
+  call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> undef)
+  call <8 x i32> @llvm.ctpop.v8i32(<8 x i32> undef)
+  call <16 x i32> @llvm.ctpop.v16i32(<16 x i32> undef)
+  call i64 @llvm.ctpop.i64(i64 undef)
+  call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> undef)
+  call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> undef)
+  call <8 x i64> @llvm.ctpop.v8i64(<8 x i64> undef)
+  call <16 x i64> @llvm.ctpop.v16i64(<16 x i64> undef)
+  ret void
+}
+
+declare i8 @llvm.ctpop.i8(i8)
+declare <2 x i8> @llvm.ctpop.v2i8(<2 x i8>)
+declare <4 x i8> @llvm.ctpop.v4i8(<4 x i8>)
+declare <8 x i8> @llvm.ctpop.v8i8(<8 x i8>)
+declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
+declare i16 @llvm.ctpop.i16(i16)
+declare <2 x i16> @llvm.ctpop.v2i16(<2 x i16>)
+declare <4 x i16> @llvm.ctpop.v4i16(<4 x i16>)
+declare <8 x i16> @llvm.ctpop.v8i16(<8 x i16>)
+declare <16 x i16> @llvm.ctpop.v16i16(<16 x i16>)
+declare i32 @llvm.ctpop.i32(i32)
+declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)
+declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)
+declare <8 x i32> @llvm.ctpop.v8i32(<8 x i32>)
+declare <16 x i32> @llvm.ctpop.v16i32(<16 x i32>)
+declare i64 @llvm.ctpop.i64(i64)
+declare <2 x i64> @llvm.ctpop.v2i64(<2 x i64>)
+declare <4 x i64> @llvm.ctpop.v4i64(<4 x i64>)
+declare <8 x i64> @llvm.ctpop.v8i64(<8 x i64>)
+declare <16 x i64> @llvm.ctpop.v16i64(<16 x i64>)
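
To put the expected values in the test above into perspective, the following standalone sketch adds up the vector ctpop costs from the two sets of CHECK lines. The numbers are copied from the test; the names `NoZvbbVectorCost`, `totalNoZvbb`, and `totalZvbb` are hypothetical and exist only for this arithmetic. The sum itself is not something the cost model computes or reports.

```cpp
#include <cstddef>

// Vector ctpop costs without Zvbb, per element type (i8, i16, i32, i64),
// copied from the NOZVBB CHECK lines of the test.
static const unsigned NoZvbbVectorCost[4] = {12, 19, 20, 21};

// Vector ctpop cost with Zvbb, copied from the ZVBB CHECK lines.
static const unsigned ZvbbVectorCost = 1;

// The test covers four vector widths per element type: 2, 4, 8, 16 lanes.
static const unsigned WidthsPerElementType = 4;

// Sum of the 16 vector ctpop costs expected without Zvbb.
inline unsigned totalNoZvbb() {
  unsigned Sum = 0;
  for (std::size_t I = 0; I < 4; ++I)
    Sum += NoZvbbVectorCost[I] * WidthsPerElementType;
  return Sum;
}

// Sum of the 16 vector ctpop costs expected with Zvbb.
inline unsigned totalZvbb() {
  return ZvbbVectorCost * 4 * WidthsPerElementType;
}
```

Across the 16 vector calls in the test, the expected costs sum to 288 without Zvbb and 16 with it. The four scalar calls are reported at cost 4 in both configurations.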

preames (Collaborator) commented Sep 21, 2023

Code change LGTM once rebased over revised test. (See other review)

@artagnon artagnon force-pushed the ctpop-zvbb-cost branch 2 times, most recently from ffe603b to a45f727 Compare September 25, 2023 15:50
artagnon (Contributor, Author) commented:

@preames Fixed, thanks.

@artagnon artagnon merged commit 7c128f6 into llvm:main Sep 27, 2023
2 of 3 checks passed
@artagnon artagnon deleted the ctpop-zvbb-cost branch September 27, 2023 12:00
legrosbuffle pushed a commit to legrosbuffle/llvm-project that referenced this pull request Sep 29, 2023