-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[RISCV] Adopt SpacemitX60's scheduling model for -mtune=generic
#167008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v Author: Min-Yih Hsu (mshockwave) ChangesPer our discussions in RISC-V roundtable during LLVM Dev Meeting 2025, there has been a consensus to create a base / generic scheduling model for the most common performance tuning usages. As the first step, we agree to use SpacemitX60's scheduling model for that purpose for the time being, with an expectation to create a standalone generic model that could evolve independently in the future. This patch sets Note that I didn't make the same change on For reference:
So I guess the question here is: should we update all affected codegen tests? On one hand it makes the tests less stable whenever we update SpacemitX60's scheduling model -- which is still relatively active. On the other hand, it's more straightforward if we maintain the consistency between Full diff: https://github.com/llvm/llvm-project/pull/167008.diff 3 Files Affected:
diff --git a/llvm/docs/RISCVUsage.rst b/llvm/docs/RISCVUsage.rst
index d03f383a92b3b..e213db332cf38 100644
--- a/llvm/docs/RISCVUsage.rst
+++ b/llvm/docs/RISCVUsage.rst
@@ -601,3 +601,16 @@ Clang's ``-msmall-data-limit=`` option controls what the threshold size is (in b
The small data limit threshold is also used to separate small constants into sections with names starting with ``.srodata``. LLD does not place these with the ``.sdata`` and ``.sbss`` sections as ``.srodata`` sections are read only and the other two are writable. Instead the ``.srodata`` sections are placed adjacent to ``.rodata``.
Data suggests that these options can produce significant improvements across a range of benchmarks.
+
+Scheduling Model and Tuning
+===========================
+
+RISC-V is highly configurable, meaning its scheduling models could be highly diversified as well. Yet we still believe it is helpful to provide a "generic" tuning processor / scheduling model that represents the "lowest common denominator" RISC-V implementation at the time. The idea is that it could serve as a "good-enough" baseline model for performance tuning purposes on some of the most common use cases.
+
+Though details of this generic scheduling model might evolve over time, we always have some _expectations_ on the kind of processors it is used for.
+
+For example, the ``generic`` tuning processor is expected to target in-order, superscalar application processors designed for general-purpose computing. It is usually RVA22U64- or RVA23U64-capable intended to run Linux. The ``generic-ooo`` has a similar set of expectations, except it is targeting out-of-order application processors.
+
+Right now, we simply assign a scheduling model that is widely used by the community to ``generic``. But in the future, we can create a standalone scheduling model for ``generic``, or even create a generic model for each of the individual sectors. For example, a ``generic-embedded`` for embedded processors and a ``generic-server`` for server workloads.
+
+These future generic models could even serve as the "base" model for other scheduling models to derive from: it's not uncommon for multiple processors to share a similar set of instruction scheduling info except a few key instructions, and this is especially true for RISC-V given its highly configurable nature. If we could design the base model in a way that it can be _parameterized_ by subtarget tuning features, we can substitue the traditional way of creating individual scheduling models with a combination of base scheduling model + different subtarget features.
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 23bba99ec874f..d330120214705 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -138,6 +138,7 @@ Changes to the RISC-V Backend
* Adds experimental support for the 'Zibi` (Branch with Immediate) extension.
* Add support for Zvfofp8min (OFP8 conversion extension)
* Adds assembler support for the Andes `XAndesvsinth` (Andes Vector Small Int Handling Extension).
+* `-mtune=generic` now uses the scheduling model from SpacemitX60 instead of an empty scheduling model.
Changes to the WebAssembly Backend
----------------------------------
diff --git a/llvm/lib/Target/RISCV/RISCVProcessors.td b/llvm/lib/Target/RISCV/RISCVProcessors.td
index e86431f78f1ba..9e31d08ae2243 100644
--- a/llvm/lib/Target/RISCV/RISCVProcessors.td
+++ b/llvm/lib/Target/RISCV/RISCVProcessors.td
@@ -102,7 +102,12 @@ def GENERIC_RV64 : RISCVProcessorModel<"generic-rv64",
GenericTuneInfo;
// Support generic for compatibility with other targets. The triple will be used
// to change to the appropriate rv32/rv64 version.
-def GENERIC : RISCVTuneProcessorModel<"generic", NoSchedModel>, GenericTuneInfo;
+// `generic` is expected to target in-order application processors designed for
+// general-purpose computing.
+def GENERIC : RISCVTuneProcessorModel<"generic", SpacemitX60Model>,
+ GenericTuneInfo;
+// `generic-ooo` is expected to target out-of-order application processors designed
+// for general-purpose computing.
def GENERIC_OOO : RISCVTuneProcessorModel<"generic-ooo", GenericOOOModel>,
GenericTuneInfo;
|
| def GENERIC : RISCVTuneProcessorModel<"generic", NoSchedModel>, GenericTuneInfo; | ||
| // `generic` is expected to target in-order application processors designed for | ||
| // general-purpose computing. | ||
| def GENERIC : RISCVTuneProcessorModel<"generic", SpacemitX60Model>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't RISCVSubtarget.cpp always remap "generic" to "generic-rv32" or "generic-rv64"?
Please see description here edd6632
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, but it will not remap -mtune=generic.
So after this patch, -mtune=generic effectively becomes -mtune=spacemit-x60 while -mcpu=generic/generic-rv32/generic-rv64 stay the same (i.e. noschedmodel).
Now, if we want to make -mtune=generic and -mcpu=generic having the same behavior, then we need to update nearly all the test cases, hence the question I raised in the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been saying -mtune=generic, but I think what we actually mean was no -mtune provided. Meaning default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no -mtune provided. Meaning default behavior.
fair enough, if that's the case I guess we have to bite the bullet and update all the codegen tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should just update the codegen tests, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok the behavior between clang and llvm is a bit messy.
clang rejects -mcpu=generic. It requires generic-rv32/rv64 for mcpu.
With no -mcpu, clang passes generic-rv32/rv64 as the CPU.
Without an explicit mtune clang only passes a cpu in the target-cpu attribute.
Clang accepts -mtune=generic, -mtune=generic-rv32, or -mtune=generic-rv64. These will be passed in the tune-cpu function attribute. These will use 3 different RISCVProcessor.td entries.
So -mtune=generic uses the "generic" entry RISCVProcessors.td, but no -mtune and -mcpu=generic or no -mcpu uses the generic-rv32/rv64 entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be trying to resolve some of this strangeness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be trying to resolve some of this strangeness?
Done by #168612
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a follow-up to this: RISCVTuneProcessorModel<"generic", ...> is revert back to NoSchedModel, because any "generic" -- either -mcpu=generic or -mtune=generic -- will boil down to generic-rv32/rv64 now.
And update the tests.
|
Alright I just updated all the codegen tests (~700 of them got affected). I noticed that This actually didn't abort the compilation but still raise an error. I marked this test as |
This error is because the schedule sank the copies from vector argument registers to virtual registers. This overconstrained register allocation. This is effectively the same issue as #41914 This patch fixes it by giving the COPY a hardware resource and latency: |
As it becomes a dummy RISCVTuneProcessorModel entry now.
🐧 Linux x64 Test Results
|
Per our discussions in RISC-V roundtable during LLVM Dev Meeting 2025, there has been a consensus to create a base / generic scheduling model for the most common performance tuning usages. As the first step, we agree to use SpacemitX60's scheduling model for that purpose for the time being, with an expectation to create a standalone generic model that could evolve independently in the future.
This patch sets
-mtune=genericto use SpacemitX60's scheduling model, and documents the rationale behind it, including the roadmap ahead as we discussed.Update: the change was reflected on
-mcpu=generic-rv32/64as well.Note that I didn't make the same change on
-mcpu=generic-rv32/64, because it would invalidate most of, if not all of the codegen tests (we set TuneCPU to generic-rv32/64 when-mtuneis empty). I would like to solicit opinions on whether we should do it or not.For reference:
-mtune=genericis equal to-mtune=sandy-bridge. But in the absent of-mtune, the-mtuneis set of i586, which usesNoSchedModel. The rationale behind it is similar to what I mentioned above: setting to i586 can make sure it won't break any the codegen tests-mtune=genericis equal to-mtune=cortex-a510-- same for the case where-mtuneis empty. I guess they bit the bullet and updated all the test cases when they made that decision (i.e. to use cotext-a510 rather than NoSchedModel for codegen tests)So I guess the question here is: should we update all affected codegen tests? On one hand it makes the tests less stable whenever we update SpacemitX60's scheduling model -- which is still relatively active. On the other hand, it's more straightforward if we maintain a consistency between
genericandgeneric-rv32/64-- both of them use the same scheduling model.