mshockwave
Member

sifive-x390 and sifive-x280 both share the SiFive7 scheduling model, yet the former has limited FP64 vector performance. Right now we account for this by instantiating two separate scheduling models (throttled vs. non-throttled) from the base SiFive7 model. However, this approach (which is also used for other performance features, like fast vrgather in SiFive7) does not scale: as we add more of these performance features, the number of scheduling models will become unmanageable.

The new solution I've been working on is to let subtarget features configure a single scheduling model for performance features like these, so that we no longer need to create those derived models. This patch creates the subtarget feature that will ultimately replace the isFP64Throttled knob in the SiFive7 scheduling model mentioned earlier. A follow-up patch will integrate it into the scheduling model.
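
To make the scaling concern concrete, here is a rough back-of-the-envelope sketch in C++. It assumes, purely for illustration, that every performance knob is an independent on/off switch and that every combination would need its own derived model; in practice only the combinations used by real CPUs matter, so this is a worst case. The function names `derivedModelCount` and `featureCount` are hypothetical and do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case number of derived scheduling models when each combination of
// NumKnobs independent on/off performance knobs gets its own instantiation.
// Assumption for this sketch: NumKnobs < 64.
constexpr std::uint64_t derivedModelCount(unsigned NumKnobs) {
  return std::uint64_t{1} << NumKnobs;
}

// With the subtarget-feature approach, a single model is shared and each
// knob costs one feature definition, so the count grows linearly.
constexpr unsigned featureCount(unsigned NumKnobs) { return NumKnobs; }

// One knob (FP64 throttling) already means two models: throttled and
// non-throttled. A second knob (e.g. fast vrgather) would double that.
static_assert(derivedModelCount(1) == 2, "throttled vs. non-throttled");
static_assert(derivedModelCount(2) == 4, "two knobs, four combinations");
```

Under these assumptions, five such knobs would mean up to 32 derived models, versus one model plus five subtarget features.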

@llvmbot
Member

llvmbot commented Oct 7, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/162399.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+4)
  • (modified) llvm/test/CodeGen/RISCV/features-info.ll (+1)
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 27cf057112869..0d3df0e188505 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1823,6 +1823,10 @@ def TuneConditionalCompressedMoveFusion
 def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">;
 def NoConditionalMoveFusion  : Predicate<"!Subtarget->hasConditionalMoveFusion()">;
 
+def TuneHasThrottledVecFP64
+  : SubtargetFeature<"throttled-vec-fp64", "HasThrottledVectorFP64", "true",
+                     "Certain vector FP64 operations have limited performance">;
+
 def TuneMIPSP8700
     : SubtargetFeature<"mips-p8700", "RISCVProcFamily", "MIPSP8700",
                        "MIPS p8700 processor">;
diff --git a/llvm/test/CodeGen/RISCV/features-info.ll b/llvm/test/CodeGen/RISCV/features-info.ll
index 1a7a72d3e072b..40a976e871988 100644
--- a/llvm/test/CodeGen/RISCV/features-info.ll
+++ b/llvm/test/CodeGen/RISCV/features-info.ll
@@ -179,6 +179,7 @@
 ; CHECK-NEXT:   svpbmt                           - 'Svpbmt' (Page-Based Memory Types).
 ; CHECK-NEXT:   svvptc                           - 'Svvptc' (Obviating Memory-Management Instructions after Marking PTEs Valid).
 ; CHECK-NEXT:   tagged-globals                   - Use an instruction sequence for taking the address of a global that allows a memory tag in the upper address bits.
+; CHECK-NEXT:   throttled-vec-fp64               - Certain vector FP64 operations have limited performance.
 ; CHECK-NEXT:   unaligned-scalar-mem             - Has reasonably performant unaligned scalar loads and stores.
 ; CHECK-NEXT:   unaligned-vector-mem             - Has reasonably performant unaligned vector loads and stores.
 ; CHECK-NEXT:   use-postra-scheduler             - Schedule again after register allocation.

Contributor

@wangpc-pp wangpc-pp left a comment

LGTM.

@mshockwave mshockwave merged commit 69f9138 into llvm:main Oct 10, 2025
9 checks passed
@mshockwave mshockwave deleted the patch/riscv/throttled-fp64-feature branch October 10, 2025 00:32
@llvm-ci
Collaborator

llvm-ci commented Oct 10, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/23808

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/323/333' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-2510779-323-333.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=333 GTEST_SHARD_INDEX=323 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Script:
--
/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests --gtest_filter=StdLibTests.StdLibDocComments
--
ASTWorker building file /clangd-test/foo.cc version null with command 
[/clangd-test]
clang -ffreestanding -isystem/clangd-test/stdlib /clangd-test/foo.cc
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name foo.cc -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -ffreestanding -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test -fcoverage-compilation-dir=/clangd-test -resource-dir lib/clang/22 -isystem /clangd-test/stdlib -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x c++ /clangd-test/foo.cc
Building first preamble for /clangd-test/foo.cc version null
../llvm/clang-tools-extra/clangd/unittests/StdLibTests.cpp:188: Failure
Value of: Server.blockUntilIdleForTest()
  Actual: false
Expected: true

Built preamble of size 737048 for file /clangd-test/foo.cc version null in 16.14 seconds
Indexing c++17 standard library in the context of /clangd-test/foo.cc
indexed preamble AST for /clangd-test/foo.cc version null:
  symbol slab: 0 symbols, 120 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
indexed preamble AST for /clangd-test/foo.cc version :
  symbol slab: 3 symbols, 4912 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
Indexed c++17 standard library: 3 symbols, 0 filtered
Build dynamic index for header symbols with estimated memory usage of 8820 bytes

../llvm/clang-tools-extra/clangd/unittests/StdLibTests.cpp:188
Value of: Server.blockUntilIdleForTest()
  Actual: false
Expected: true



********************

