Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] SiFive7 VLDS Sched should not depend on VL when stride is x0. #70266

Merged
merged 4 commits into from
Oct 30, 2023

Conversation

michaelmaitland
Copy link
Contributor

When stride is x0, a strided load should behave like a unit stride load, which uses the VLDE sched class.

When stride is x0, a strided load should behave like a unit stride load,
which uses the VLDE sched class.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 25, 2023

@llvm/pr-subscribers-backend-risc-v

Author: Michael Maitland (michaelmaitland)

Changes

When stride is x0, a strided load should behave like a unit stride load, which uses the VLDE sched class.


Full diff: https://github.com/llvm/llvm-project/pull/70266.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVSchedSiFive7.td (+31-7)
  • (modified) llvm/lib/Target/RISCV/RISCVScheduleV.td (+39)
diff --git a/llvm/lib/Target/RISCV/RISCVSchedSiFive7.td b/llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
index 96ebe8e3e67686a..7433d88bfaba096 100644
--- a/llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
+++ b/llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
@@ -455,11 +455,20 @@ foreach mx = SchedMxList in {
 // specific suffixes, but since SEW is already encoded in the name of the
 // resource, we do not need to use LMULSEWXXX constructors. However, we do
 // use the SEW from the name to determine the number of Cycles.
+
+// This predicate is true when the rs2 operand of vlse or vsse is x0, false
+// otherwise.
+def VLDSX0Pred: SchedPredicate<[{ MI->getOperand(3).getReg() == RISCV::X0 }]>;
+
 foreach mx = SchedMxList in {
+  defvar VLDSX0Cycles = SiFive7GetCyclesDefault<mx>.c;
   defvar Cycles = SiFive7GetCyclesOnePerElement<mx, 8>.c;
   defvar IsWorstCase = SiFive7IsWorstCaseMX<mx, SchedMxList>.c;
+  defm "" : LMULWriteResMXVariant<"WriteVLDS8",  VLDSX0Pred, [SiFive7VL],
+                                  4, [VLDSX0Cycles],
+                                  !add(3, Cycles), [Cycles], mx,
+                                  IsWorstCase, "SiFive7">;
   let Latency = !add(3, Cycles), ReleaseAtCycles = [Cycles] in {
-    defm "" : LMULWriteResMX<"WriteVLDS8",  [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDUX8", [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDOX8", [SiFive7VL], mx, IsWorstCase>;
   }
@@ -469,11 +478,18 @@ foreach mx = SchedMxList in {
     defm "" : LMULWriteResMX<"WriteVSTOX8", [SiFive7VS], mx, IsWorstCase>;
   }
 }
-foreach mx = SchedMxList in {
+// TODO: The MxLists need to be filtered by EEW. We only need to support
+// LMUL >= SEW_min/ELEN. Here, the smallest EEW prevents us from having MF8
+// since LMUL >= 16/64.
+foreach mx = ["MF4", "MF2", "M1", "M2", "M4", "M8"] in {
+  defvar VLDSX0Cycles = SiFive7GetCyclesDefault<mx>.c;
   defvar Cycles = SiFive7GetCyclesOnePerElement<mx, 16>.c;
   defvar IsWorstCase = SiFive7IsWorstCaseMX<mx, SchedMxList>.c;
+  defm "" : LMULWriteResMXVariant<"WriteVLDS16",  VLDSX0Pred, [SiFive7VL],
+                                  4, [VLDSX0Cycles],
+                                  !add(3, Cycles), [Cycles], mx,
+                                  IsWorstCase, "SiFive7">;
   let Latency = !add(3, Cycles), ReleaseAtCycles = [Cycles] in {
-    defm "" : LMULWriteResMX<"WriteVLDS16",  [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDUX16", [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDOX16", [SiFive7VL], mx, IsWorstCase>;
   }
@@ -483,11 +499,15 @@ foreach mx = SchedMxList in {
     defm "" : LMULWriteResMX<"WriteVSTOX16", [SiFive7VS], mx, IsWorstCase>;
   }
 }
-foreach mx = SchedMxList in {
+foreach mx = ["MF2", "M1", "M2", "M4", "M8"] in {
+  defvar VLDSX0Cycles = SiFive7GetCyclesDefault<mx>.c;
   defvar Cycles = SiFive7GetCyclesOnePerElement<mx, 32>.c;
   defvar IsWorstCase = SiFive7IsWorstCaseMX<mx, SchedMxList>.c;
+  defm "" : LMULWriteResMXVariant<"WriteVLDS32",  VLDSX0Pred, [SiFive7VL],
+                                  4, [VLDSX0Cycles],
+                                  !add(3, Cycles), [Cycles], mx,
+                                  IsWorstCase, "SiFive7">;
   let Latency = !add(3, Cycles), ReleaseAtCycles = [Cycles] in {
-    defm "" : LMULWriteResMX<"WriteVLDS32",  [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDUX32", [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDOX32", [SiFive7VL], mx, IsWorstCase>;
   }
@@ -497,11 +517,15 @@ foreach mx = SchedMxList in {
     defm "" : LMULWriteResMX<"WriteVSTOX32", [SiFive7VS], mx, IsWorstCase>;
   }
 }
-foreach mx = SchedMxList in {
+foreach mx = ["M1", "M2", "M4", "M8"] in {
+  defvar VLDSX0Cycles = SiFive7GetCyclesDefault<mx>.c;
   defvar Cycles = SiFive7GetCyclesOnePerElement<mx, 64>.c;
   defvar IsWorstCase = SiFive7IsWorstCaseMX<mx, SchedMxList>.c;
+  defm "" : LMULWriteResMXVariant<"WriteVLDS64",  VLDSX0Pred, [SiFive7VL],
+                                  4, [VLDSX0Cycles],
+                                  !add(3, Cycles), [Cycles], mx,
+                                  IsWorstCase, "SiFive7">;
   let Latency = !add(3, Cycles), ReleaseAtCycles = [Cycles] in {
-    defm "" : LMULWriteResMX<"WriteVLDS64",  [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDUX64", [SiFive7VL], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVLDOX64", [SiFive7VL], mx, IsWorstCase>;
   }
diff --git a/llvm/lib/Target/RISCV/RISCVScheduleV.td b/llvm/lib/Target/RISCV/RISCVScheduleV.td
index 7af7716c96b8564..2bad3b88867a975 100644
--- a/llvm/lib/Target/RISCV/RISCVScheduleV.td
+++ b/llvm/lib/Target/RISCV/RISCVScheduleV.td
@@ -62,6 +62,45 @@ multiclass LMULSEWWriteResMXSEW<string name, list<ProcResourceKind> resources,
     def : WriteRes<!cast<SchedWrite>(name # "_WorstCase"), resources>;
 }
 
+// Define a SchedAlias for the SchedWrite associated with (name, mx) whose
+// behavior is aliased to a Variant. The Variant has Latency predLad and
+// ReleaseAtCycles predCycles if the SchedPredicate Pred is true, otherwise has
+// Latency noPredLat and ReleaseAtCycles noPredCycles. The WorstCase SchedWrite
+// is created similiarly if IsWorstCase is true. The prefix is used to uniquely
+// define SchedWriteRes and Variant resources.
+multiclass LMULWriteResMXVariant<string name, SchedPredicate Pred,
+                                 list<ProcResourceKind> resources,
+                                 int predLat, list<int> predCycles,
+                                 int noPredLat, list<int> noPredCycles,
+                                 string mx, bit IsWorstCase, string prefix> {
+  defvar nameMX = prefix # name # "_" # mx;
+
+  // Define the different behaviors
+  def nameMX # "_Pred" : SchedWriteRes<resources>
+  { let Latency = predLat; let ReleaseAtCycles = predCycles; }
+  def nameMX # "_NoPred" : SchedWriteRes<resources>
+  { let Latency = noPredLat; let ReleaseAtCycles = noPredCycles; }
+
+  // Tie behavior to predicate
+  def nameMX # "_Variant" : SchedWriteVariant<[
+    SchedVar<Pred, [!cast<SchedWriteRes>(nameMX # "_Pred")]>,
+    SchedVar<NoSchedPred, [!cast<SchedWriteRes>(nameMX # "_NoPred")]>
+  ]>;
+  def : SchedAlias<
+    !cast<SchedReadWrite>(name # "_" # mx),
+    !cast<SchedReadWrite>(nameMX # "_Variant")>;
+
+  if IsWorstCase then {
+    def prefix # name # "_WorstCase_Variant" : SchedWriteVariant<[
+      SchedVar<Pred, [!cast<SchedWriteRes>(nameMX # "_Pred")]>,
+      SchedVar<NoSchedPred, [!cast<SchedWriteRes>(nameMX # "_NoPred")]>
+    ]>;
+    def : SchedAlias<
+      !cast<SchedReadWrite>(name # "_WorstCase"),
+      !cast<SchedReadWrite>(prefix # name # "_WorstCase_Variant")>;
+  }
+}
+
 // Define multiclasses to define SchedWrite, SchedRead,  WriteRes, and
 // ReadAdvance for each (name, LMUL) pair and for each LMUL in each of the
 // SchedMxList variants above. Each multiclass is responsible for defining

defvar nameMX = prefix # name # "_" # mx;

// Define the different behaviors
def nameMX # "_Pred" : SchedWriteRes<resources>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am all ears if anyone has an idea to remove the use of prefix. The problem is that if we removed it and there is another processor who wants to create a LMULWriteResMXVariant for the same name and mx, then we will be creating two records with the same name, which is not allowed.

But using it feels awkward because we would like it to be in the namespace of the caller of LMULWriteResMXVariant. We cant give a name like defm "SiFive7" : LMULWriteResMXVariant because then it will suffix the SchedAlias with that, and we don't want that.

I tried to use defvar Pred = SchedWriteRes<resources>; but that leads to error: Record anonymous_53386', field SchedModel' does not have a def initializer!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with the Name being applied to the SchedAlias?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. There is no name on SchedAlias. It is anonymous.

Copy link
Collaborator

@topperc topperc Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because then it will suffix the SchedAlias with that, and we don't want that.

Why don't we want that? Or why is it a problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of SchedAlias doesn't matter I think, we can just use the form defm "SiFive7" : LMULWriteResMXVariant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with the Name being applied to the SchedAlias?

Say we had something like this:

defm "SiFive7" : LMULWriteResMXVariant<"WriteVLDS8",  VLDSX0Pred, [SiFive7VL],

Then when we define a SchedWriteRes using LMULWriteResMXVariant, then it also gets "SiFIve7" prepended to it, creating something like "SiFive7VLDS_M1_Pred". Then to create the variant, we must have that prefix:

   // Tie behavior to predicate
   def nameMX # "_Variant" : SchedWriteVariant<[
-    SchedVar<Pred, [!cast<SchedWriteRes>(nameMX # "_Pred")]>,
-    SchedVar<NoSchedPred, [!cast<SchedWriteRes>(nameMX # "_NoPred")]>
+    SchedVar<Pred, [!cast<SchedWriteRes>("SiFive7" # nameMX # "_Pred")]>,
+    SchedVar<NoSchedPred, [!cast<SchedWriteRes>("SiFive7" # nameMX # "_NoPred")]>
   ]>;
   def : SchedAlias<
-    !cast<SchedReadWrite>(name # "_" # mx),
-    !cast<SchedReadWrite>(nameMX # "_Variant")>;
+    !cast<SchedReadWrite>(nameMX),
+    !cast<SchedReadWrite>("SiFive7" # nameMX # "_Variant")>;

The problem here is that we need to know the prefix anyway, to create the Variant and SchedAlias, so specifying it in the defm achieves absolutely nothing.

The only alternative I can think of is for us to use something like LMULSchedWritesVariant similiar to LMULSchedWrites to be used in conjunction with LMULWriteResVariant. However, I don't think this is the right approach either because this would force all subtargets to use a variant in those places, even if they didn't need to use a variant. I think the current approach may be the best approach.

Copy link
Collaborator

@topperc topperc Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use NAME to get the name of the defm that was used?

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some llvm-mca tests?

defvar nameMX = prefix # name # "_" # mx;

// Define the different behaviors
def nameMX # "_Pred" : SchedWriteRes<resources>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of SchedAlias doesn't matter I think, we can just use the form defm "SiFive7" : LMULWriteResMXVariant.

@michaelmaitland
Copy link
Contributor Author

Can we add some llvm-mca tests?

I have added them, but llvm-mca does not know how to give a correct report yet. I have confirmed that this change does work as expected on some internal benchmarks.

llvm/lib/Target/RISCV/RISCVSchedSiFive7.td Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVScheduleV.td Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVSchedSiFive7.td Outdated Show resolved Hide resolved
Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with suggestions.

llvm/lib/Target/RISCV/RISCVScheduleV.td Outdated Show resolved Hide resolved
Co-authored-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
@michaelmaitland michaelmaitland merged commit 093bc6b into llvm:main Oct 30, 2023
2 of 3 checks passed
@michaelmaitland michaelmaitland deleted the sifive7-vldsx0 branch October 30, 2023 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants