Skip to content

[AMDGPU][Scheduler] Delete RescheduleRegions bitvector from scheduler (NFC) #141595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 27, 2025

Conversation

lucas-rami
Copy link
Contributor

The GCNScheduleDAGMILive's RescheduleRegions bitvector is only used by the rematerialization stage (PreRARematStage). Its presence in the scheduler's state forces us to maintain its value throughout scheduling even though it is of no use to the iterative scheduling process itself, which instead relies on each stage's initGCNRegion hook to determine whether the current region should be rescheduled.

This moves the bitvector to the PreRARematStage, which uses it to store the set of regions that must be rescheduled between stage initialization and region initialization.

This NFC also swaps a call to GCNRegPressure::getArchVGPRNum(false) for a call to GCNRegPressure::getArchVGPRNum()---which is equivalent but simpler in the context---and makes GCNSchedStage::finalizeGCNRegion use its own API to advance to the next region.

@llvmbot
Copy link
Member

llvmbot commented May 27, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Lucas Ramirez (lucas-rami)

Changes

The GCNScheduleDAGMILive's RescheduleRegions bitvector is only used by the rematerialization stage (PreRARematStage). Its presence in the scheduler's state forces us to maintain its value throughout scheduling even though it is of no use to the iterative scheduling process itself, which instead relies on each stage's initGCNRegion hook to determine whether the current region should be rescheduled.

This moves the bitvector to the PreRARematStage, which uses it to store the set of regions that must be rescheduled between stage initialization and region initialization.

This NFC also swaps a call to GCNRegPressure::getArchVGPRNum(false) for a call to GCNRegPressure::getArchVGPRNum()---which is equivalent but simpler in the context---and makes GCNSchedStage::finalizeGCNRegion use its own API to advance to the next region.


Full diff: https://github.com/llvm/llvm-project/pull/141595.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp (+4-14)
  • (modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.h (+4-5)
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
index a9c891a1f1dd1..706ae92c9e47c 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -937,12 +937,10 @@ void GCNScheduleDAGMILive::finalizeSchedule() {
   // GCNScheduleDAGMILive::schedule().
   LiveIns.resize(Regions.size());
   Pressure.resize(Regions.size());
-  RescheduleRegions.resize(Regions.size());
   RegionsWithHighRP.resize(Regions.size());
   RegionsWithExcessRP.resize(Regions.size());
   RegionsWithMinOcc.resize(Regions.size());
   RegionsWithIGLPInstrs.resize(Regions.size());
-  RescheduleRegions.set();
   RegionsWithHighRP.reset();
   RegionsWithExcessRP.reset();
   RegionsWithMinOcc.reset();
@@ -1236,10 +1234,7 @@ bool ClusteredLowOccStage::initGCNRegion() {
 }
 
 bool PreRARematStage::initGCNRegion() {
-  if (!DAG.RescheduleRegions[RegionIdx])
-    return false;
-
-  return GCNSchedStage::initGCNRegion();
+  return RescheduleRegions[RegionIdx] && GCNSchedStage::initGCNRegion();
 }
 
 void GCNSchedStage::setupNewBlock() {
@@ -1258,7 +1253,6 @@ void GCNSchedStage::setupNewBlock() {
 
 void GCNSchedStage::finalizeGCNRegion() {
   DAG.Regions[RegionIdx] = std::pair(DAG.RegionBegin, DAG.RegionEnd);
-  DAG.RescheduleRegions[RegionIdx] = false;
   if (S.HasHighPressure)
     DAG.RegionsWithHighRP[RegionIdx] = true;
 
@@ -1271,7 +1265,7 @@ void GCNSchedStage::finalizeGCNRegion() {
     SavedMutations.swap(DAG.Mutations);
 
   DAG.exitRegion();
-  RegionIdx++;
+  advanceRegion();
 }
 
 void GCNSchedStage::checkScheduling() {
@@ -1332,10 +1326,9 @@ void GCNSchedStage::checkScheduling() {
   unsigned MaxSGPRs = ST.getMaxNumSGPRs(MF);
 
   if (PressureAfter.getVGPRNum(ST.hasGFX90AInsts()) > MaxVGPRs ||
-      PressureAfter.getVGPRNum(false) > MaxArchVGPRs ||
+      PressureAfter.getArchVGPRNum() > MaxArchVGPRs ||
       PressureAfter.getAGPRNum() > MaxArchVGPRs ||
       PressureAfter.getSGPRNum() > MaxSGPRs) {
-    DAG.RescheduleRegions[RegionIdx] = true;
     DAG.RegionsWithHighRP[RegionIdx] = true;
     DAG.RegionsWithExcessRP[RegionIdx] = true;
   }
@@ -1577,9 +1570,6 @@ void GCNSchedStage::revertScheduling() {
   DAG.RegionsWithMinOcc[RegionIdx] =
       PressureBefore.getOccupancy(ST) == DAG.MinOccupancy;
   LLVM_DEBUG(dbgs() << "Attempting to revert scheduling.\n");
-  DAG.RescheduleRegions[RegionIdx] =
-      S.hasNextStage() &&
-      S.getNextStage() != GCNSchedStageID::UnclusteredHighRPReschedule;
   DAG.RegionEnd = DAG.RegionBegin;
   int SkippedDebugInstr = 0;
   for (MachineInstr *MI : Unsched) {
@@ -2154,7 +2144,7 @@ void PreRARematStage::rematerialize() {
   AchievedOcc = TargetOcc;
   for (auto &[I, OriginalRP] : ImpactedRegions) {
     bool IsEmptyRegion = DAG.Regions[I].first == DAG.Regions[I].second;
-    DAG.RescheduleRegions[I] = !IsEmptyRegion;
+    RescheduleRegions[I] = !IsEmptyRegion;
     if (!RecomputeRP.contains(I))
       continue;
 
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
index ca4ab4a2c560f..aa48c7c9eaed9 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -243,10 +243,6 @@ class GCNScheduleDAGMILive final : public ScheduleDAGMILive {
   // Vector of regions recorder for later rescheduling
   SmallVector<RegionBoundaries, 32> Regions;
 
-  // Records if a region is not yet scheduled, or schedule has been reverted,
-  // or we generally desire to reschedule it.
-  BitVector RescheduleRegions;
-
   // Record regions with high register pressure.
   BitVector RegionsWithHighRP;
 
@@ -476,6 +472,9 @@ class PreRARematStage : public GCNSchedStage {
   /// In case we need to rollback rematerializations, save lane masks for all
   /// rematerialized registers in all regions in which they are live-ins.
   DenseMap<std::pair<unsigned, Register>, LaneBitmask> RegMasks;
+  /// After successful stage initialization, indicates which regions should be
+  /// rescheduled.
+  BitVector RescheduleRegions;
   /// Target occupancy the stage estimates is reachable through
   /// rematerialization. Greater than or equal to the pre-stage min occupancy.
   unsigned TargetOcc;
@@ -520,7 +519,7 @@ class PreRARematStage : public GCNSchedStage {
   bool shouldRevertScheduling(unsigned WavesAfter) override;
 
   PreRARematStage(GCNSchedStageID StageID, GCNScheduleDAGMILive &DAG)
-      : GCNSchedStage(StageID, DAG) {}
+      : GCNSchedStage(StageID, DAG), RescheduleRegions(DAG.Regions.size()) {}
 };
 
 class ILPInitialScheduleStage : public GCNSchedStage {

@lucas-rami lucas-rami requested review from arsenm and jrbyrnes May 27, 2025 13:13
@lucas-rami lucas-rami merged commit 3e18216 into llvm:main May 27, 2025
13 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
… (NFC) (llvm#141595)

The `GCNScheduleDAGMILive`'s `RescheduleRegions` bitvector is only used
by the rematerialization stage (`PreRARematStage`). Its presence in the
scheduler's state forces us to maintain its value throughout scheduling
even though it is of no use to the iterative scheduling process itself,
which instead relies on each stage's `initGCNRegion` hook to determine
whether the current region should be rescheduled.

This moves the bitvector to the `PreRARematStage`, which uses it to
store the set of regions that must be rescheduled between stage
initialization and region initialization.

This NFC also swaps a call to `GCNRegPressure::getArchVGPRNum(false)`
for a call to `GCNRegPressure::getArchVGPRNum()`---which is equivalent
but simpler in the context---and makes
`GCNSchedStage::finalizeGCNRegion` use its own API to advance to the
next region.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants