[RISCV] Add a new subtarget feature for throttled FP64 vector performance #162399
@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

sifive-x390 and sifive-x280 both share the SiFive7 scheduling model, yet the former has limited FP64 vector performance. Right now we account for it by instantiating two separate scheduling models (throttled vs. non-throttled) from the base SiFive7 model. However, this approach (which is also used for other performance features, like fast vrgather in SiFive7) does not scale if we add more of these performance features in the future -- the number of scheduling models will simply become unmanageable.

The new solution I've been working on is to let subtarget features configure a single scheduling model for performance features like these, such that we no longer need to create those derived models. This patch creates the subtarget feature that'll ultimately replace the `isFP64Throttled` knob in the SiFive7 scheduling model mentioned earlier. There will be a follow-up patch to integrate this into the scheduling model.

Full diff: https://github.com/llvm/llvm-project/pull/162399.diff

2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 27cf057112869..0d3df0e188505 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1823,6 +1823,10 @@ def TuneConditionalCompressedMoveFusion
def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">;
def NoConditionalMoveFusion : Predicate<"!Subtarget->hasConditionalMoveFusion()">;
+def TuneHasThrottledVecFP64
+ : SubtargetFeature<"throttled-vec-fp64", "HasThrottledVectorFP64", "true",
+ "Certain vector FP64 operations have limited performance">;
+
def TuneMIPSP8700
: SubtargetFeature<"mips-p8700", "RISCVProcFamily", "MIPSP8700",
"MIPS p8700 processor">;
diff --git a/llvm/test/CodeGen/RISCV/features-info.ll b/llvm/test/CodeGen/RISCV/features-info.ll
index 1a7a72d3e072b..40a976e871988 100644
--- a/llvm/test/CodeGen/RISCV/features-info.ll
+++ b/llvm/test/CodeGen/RISCV/features-info.ll
@@ -179,6 +179,7 @@
; CHECK-NEXT: svpbmt - 'Svpbmt' (Page-Based Memory Types).
; CHECK-NEXT: svvptc - 'Svvptc' (Obviating Memory-Management Instructions after Marking PTEs Valid).
; CHECK-NEXT: tagged-globals - Use an instruction sequence for taking the address of a global that allows a memory tag in the upper address bits.
+; CHECK-NEXT: throttled-vec-fp64 - Certain vector FP64 operations have limited performance.
; CHECK-NEXT: unaligned-scalar-mem - Has reasonably performant unaligned scalar loads and stores.
; CHECK-NEXT: unaligned-vector-mem - Has reasonably performant unaligned vector loads and stores.
; CHECK-NEXT: use-postra-scheduler - Schedule again after register allocation.
LGTM.
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/23808
sifive-x390 and sifive-x280 both share the SiFive7 scheduling model, yet the former has limited FP64 vector performance. Right now we account for it by instantiating two separate scheduling models (throttled vs. non-throttled) from the base SiFive7 model. However, this approach (which is also used for other performance features, like fast vrgather in SiFive7) does not scale if we add more of these performance features in the future -- the number of scheduling models will simply become unmanageable.

The new solution I've been working on is to let subtarget features configure a single scheduling model for performance features like these, such that we no longer need to create those derived models. This patch creates the subtarget feature that'll ultimately replace the `isFP64Throttled` knob in the SiFive7 scheduling model mentioned earlier. There will be a follow-up patch to integrate this into the scheduling model.
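To make the scaling argument concrete, here is a minimal standalone C++ sketch. It is not LLVM code and not what the follow-up patch will do. `ToySubtarget`, `vecFP64Cycles`, `derivedModelCount`, and the cycle numbers are all invented for illustration. Only the field name `HasThrottledVectorFP64` comes from the diff, and the `hasThrottledVectorFP64()` getter is an assumption about the accessor generated for it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a subtarget; NOT LLVM's RISCVSubtarget.
struct ToySubtarget {
  // Field name taken from the SubtargetFeature definition in the diff.
  bool HasThrottledVectorFP64 = false;
  // Assumed getter name; the real accessor is generated by TableGen.
  bool hasThrottledVectorFP64() const { return HasThrottledVectorFP64; }
};

// Invented numbers, chosen only to make the effect visible.
constexpr unsigned BaseVecFP64Cycles = 4;
constexpr unsigned ThrottleFactor = 2;

// One "model" that consults the feature bit instead of being cloned per CPU.
unsigned vecFP64Cycles(const ToySubtarget &ST) {
  return ST.hasThrottledVectorFP64() ? BaseVecFP64Cycles * ThrottleFactor
                                     : BaseVecFP64Cycles;
}

// If every combination of N boolean knobs needs its own derived model,
// the number of models is 2^N (NumKnobs is assumed to be below 64).
constexpr std::uint64_t derivedModelCount(unsigned NumKnobs) {
  return std::uint64_t{1} << NumKnobs;
}

// Small usage example: two CPUs share one model and differ by one feature bit.
void demo() {
  ToySubtarget X280Like;                  // feature off
  ToySubtarget X390Like;
  X390Like.HasThrottledVectorFP64 = true; // feature on

  assert(vecFP64Cycles(X280Like) == 4);
  assert(vecFP64Cycles(X390Like) == 8);
  // Two knobs (say, throttled FP64 and fast vrgather) already mean 4 models.
  assert(derivedModelCount(2) == 4);
}
```

With per-knob derived models, each new boolean performance feature doubles the count. With feature bits, it adds one more query inside the single model.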