[RISCV] Add a new subtarget feature for throttled FP64 vector performance #162399

mshockwave · 2025-10-07T23:47:45Z

sifive-x390 and sifive-x280 share the SiFive7 scheduling model, yet the former has limited FP64 vector performance. Right now we account for this by instantiating two separate scheduling models (throttled vs. non-throttled) from the base SiFive7 model. However, this approach (which is also used for other performance features, such as fast vrgather in SiFive7) does not scale: as we add more of these performance features, the number of scheduling models will quickly become unmanageable.

The new solution I've been working on is to let a single scheduling model be configured by subtarget features for performance characteristics like these, so that we no longer need to create those derived models. This patch adds the subtarget feature that will ultimately replace the isFP64Throttled knob in the SiFive7 scheduling model mentioned earlier. A follow-up patch will integrate it into the scheduling model.
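To illustrate the scaling argument, here is a minimal, self-contained C++ sketch. It is a toy model and does not use LLVM's scheduling-model API: the `PerfFeatures` struct, the function names, and the cycle counts are all hypothetical and chosen only for illustration. It contrasts one model that is parameterized by boolean knobs with the count of derived models needed when every combination of knobs gets its own instantiation.

```cpp
#include <cstdint>

// Hypothetical performance knobs. These names are illustrative only and are
// not identifiers from LLVM or from this patch.
struct PerfFeatures {
  bool ThrottledVecFP64;
  bool FastVRGather;
};

// Toy stand-in for a single scheduling-model entry that consults a knob.
// The cycle counts are made-up numbers, not SiFive7 data.
unsigned vecFP64Latency(const PerfFeatures &F) {
  const unsigned BaseCycles = 4;
  return F.ThrottledVecFP64 ? BaseCycles * 2 : BaseCycles;
}

// Number of derived models required if each combination of NumKnobs
// independent boolean knobs is instantiated as a separate model.
std::uint64_t derivedModelCount(unsigned NumKnobs) {
  return std::uint64_t{1} << NumKnobs;
}
```

With two knobs the derived-model approach already needs four instantiations, and each additional knob doubles that, whereas the parameterized version stays a single model that reads its configuration from the feature set.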

llvmbot · 2025-10-07T23:48:22Z

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Full diff: https://github.com/llvm/llvm-project/pull/162399.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+4)
(modified) llvm/test/CodeGen/RISCV/features-info.ll (+1)

diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 27cf057112869..0d3df0e188505 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1823,6 +1823,10 @@ def TuneConditionalCompressedMoveFusion
 def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">;
 def NoConditionalMoveFusion  : Predicate<"!Subtarget->hasConditionalMoveFusion()">;
 
+def TuneHasThrottledVecFP64
+  : SubtargetFeature<"throttled-vec-fp64", "HasThrottledVectorFP64", "true",
+                     "Certain vector FP64 operations have limited performance">;
+
 def TuneMIPSP8700
     : SubtargetFeature<"mips-p8700", "RISCVProcFamily", "MIPSP8700",
                        "MIPS p8700 processor">;
diff --git a/llvm/test/CodeGen/RISCV/features-info.ll b/llvm/test/CodeGen/RISCV/features-info.ll
index 1a7a72d3e072b..40a976e871988 100644
--- a/llvm/test/CodeGen/RISCV/features-info.ll
+++ b/llvm/test/CodeGen/RISCV/features-info.ll
@@ -179,6 +179,7 @@
 ; CHECK-NEXT:   svpbmt                           - 'Svpbmt' (Page-Based Memory Types).
 ; CHECK-NEXT:   svvptc                           - 'Svvptc' (Obviating Memory-Management Instructions after Marking PTEs Valid).
 ; CHECK-NEXT:   tagged-globals                   - Use an instruction sequence for taking the address of a global that allows a memory tag in the upper address bits.
+; CHECK-NEXT:   throttled-vec-fp64               - Certain vector FP64 operations have limited performance.
 ; CHECK-NEXT:   unaligned-scalar-mem             - Has reasonably performant unaligned scalar loads and stores.
 ; CHECK-NEXT:   unaligned-vector-mem             - Has reasonably performant unaligned vector loads and stores.
 ; CHECK-NEXT:   use-postra-scheduler             - Schedule again after register allocation.

wangpc-pp

LGTM.

llvm-ci · 2025-10-10T00:50:33Z

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/23808

Here is the relevant piece of the build log for the reference

Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/323/333' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-2510779-323-333.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=333 GTEST_SHARD_INDEX=323 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Script:
--
/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests --gtest_filter=StdLibTests.StdLibDocComments
--
ASTWorker building file /clangd-test/foo.cc version null with command 
[/clangd-test]
clang -ffreestanding -isystem/clangd-test/stdlib /clangd-test/foo.cc
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name foo.cc -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -ffreestanding -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test -fcoverage-compilation-dir=/clangd-test -resource-dir lib/clang/22 -isystem /clangd-test/stdlib -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x c++ /clangd-test/foo.cc
Building first preamble for /clangd-test/foo.cc version null
../llvm/clang-tools-extra/clangd/unittests/StdLibTests.cpp:188: Failure
Value of: Server.blockUntilIdleForTest()
  Actual: false
Expected: true

Built preamble of size 737048 for file /clangd-test/foo.cc version null in 16.14 seconds
Indexing c++17 standard library in the context of /clangd-test/foo.cc
indexed preamble AST for /clangd-test/foo.cc version null:
  symbol slab: 0 symbols, 120 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
indexed preamble AST for /clangd-test/foo.cc version :
  symbol slab: 3 symbols, 4912 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
Indexed c++17 standard library: 3 symbols, 0 filtered
Build dynamic index for header symbols with estimated memory usage of 8820 bytes

../llvm/clang-tools-extra/clangd/unittests/StdLibTests.cpp:188
Value of: Server.blockUntilIdleForTest()
  Actual: false
Expected: true



********************

[RISCV] Add a new subtarget feature for throttled vector FP64

ad7940b

mshockwave requested review from lukel97, mikhailramalho, preames, topperc and wangpc-pp October 7, 2025 23:47

llvmbot added the backend:RISC-V label Oct 7, 2025

mshockwave mentioned this pull request Oct 7, 2025

[RISCV] Toggle throttled FP64 feature in SiFive7 scheduling model with subtarget feature #162400

Merged

fixup! Rename to TuneHasSingleElementVecFP64

da837f3

wangpc-pp approved these changes Oct 9, 2025

mshockwave merged commit 69f9138 into llvm:main Oct 10, 2025
9 checks passed

mshockwave deleted the patch/riscv/throttled-fp64-feature branch October 10, 2025 00:32
