[MachineScheduler] Add support for scheduling while in SSA #161054
Allow targets to add a MachineScheduler before PHI elimination, i.e. in SSA mode.
Add initial support in the AMDGPU backend for using the SSA Machine Scheduler instead of the normal Machine Scheduler. (This behaviour is disabled by default.)
Also add basic "kick the tyres" demonstrator tests.
This change is intended to support the introduction of a pre-RA spilling pass which runs prior to PHI elimination. The machine scheduler has a significant impact on register pressure and as such is best run before this new spilling pass.
Co-authored-by: Konstantina Mitropoulou <KonstantinaMitropoulou@amd.com>
@llvm/pr-subscribers-backend-amdgpu
Author: Carl Ritson (perlfu)
Patch is 77.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/161054.diff
16 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/MachineScheduler.h b/llvm/include/llvm/CodeGen/MachineScheduler.h
index 5a2aee2fa7643..e2866d21ef580 100644
--- a/llvm/include/llvm/CodeGen/MachineScheduler.h
+++ b/llvm/include/llvm/CodeGen/MachineScheduler.h
@@ -103,6 +103,7 @@ namespace impl_detail {
// FIXME: Remove these declarations once RegisterClassInfo is queryable as an
// analysis.
class MachineSchedulerImpl;
+class SSAMachineSchedulerImpl;
class PostMachineSchedulerImpl;
} // namespace impl_detail
@@ -1464,6 +1465,20 @@ class MachineSchedulerPass : public PassInfoMixin<MachineSchedulerPass> {
MachineFunctionAnalysisManager &MFAM);
};
+class SSAMachineSchedulerPass : public PassInfoMixin<SSAMachineSchedulerPass> {
+ // FIXME: Remove this member once RegisterClassInfo is queryable as an
+ // analysis.
+ std::unique_ptr<impl_detail::SSAMachineSchedulerImpl> Impl;
+ const TargetMachine *TM;
+
+public:
+ SSAMachineSchedulerPass(const TargetMachine *TM);
+ SSAMachineSchedulerPass(SSAMachineSchedulerPass &&Other);
+ ~SSAMachineSchedulerPass();
+ PreservedAnalyses run(MachineFunction &MF,
+ MachineFunctionAnalysisManager &MFAM);
+};
+
class PostMachineSchedulerPass
: public PassInfoMixin<PostMachineSchedulerPass> {
// FIXME: Remove this member once RegisterClassInfo is queryable as an
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index f17d550623efc..2c30ac21446f5 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -165,6 +165,9 @@ LLVM_ABI extern char &MachineSchedulerID;
/// PostMachineScheduler - This pass schedules machine instructions postRA.
LLVM_ABI extern char &PostMachineSchedulerID;
+/// SSAMachineScheduler - This pass schedules machine instructions in SSA.
+LLVM_ABI extern char &SSAMachineSchedulerID;
+
/// SpillPlacement analysis. Suggest optimal placement of spill code between
/// basic blocks.
LLVM_ABI extern char &SpillPlacementID;
diff --git a/llvm/include/llvm/CodeGen/TargetPassConfig.h b/llvm/include/llvm/CodeGen/TargetPassConfig.h
index 5e0e641a981f9..1bf8cfc639ff7 100644
--- a/llvm/include/llvm/CodeGen/TargetPassConfig.h
+++ b/llvm/include/llvm/CodeGen/TargetPassConfig.h
@@ -135,6 +135,10 @@ class LLVM_ABI TargetPassConfig : public ImmutablePass {
/// replace a copy.
bool EnableSinkAndFold = false;
+ /// Enable insertion of SSAMachineScheduler pass, this triggers early
+ /// computation of live intervals.
+ bool EnableSSAMachineScheduler = false;
+
/// Require processing of functions such that callees are generated before
/// callers.
bool RequireCodeGenSCCOrder = false;
@@ -205,6 +209,13 @@ class LLVM_ABI TargetPassConfig : public ImmutablePass {
setOpt(RequireCodeGenSCCOrder, Enable);
}
+ bool getEnableSSAMachineScheduler() const {
+ return EnableSSAMachineScheduler;
+ }
+ void setEnableSSAMachineScheduler(bool Enable) {
+ setOpt(EnableSSAMachineScheduler, Enable);
+ }
+
/// Allow the target to override a specific pass without overriding the pass
/// pipeline. When passes are added to the standard pipeline at the
/// point where StandardID is expected, add TargetID in its place.
diff --git a/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h b/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
index a8c7a8aff83cf..fd2c4a0d13a36 100644
--- a/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
@@ -220,6 +220,10 @@ class LLVM_ABI TargetSubtargetInfo : public MCSubtargetInfo {
/// allocation.
virtual bool enablePostRAMachineScheduler() const;
+ /// True if the subtarget should run a machine scheduler before PHI
+ /// elimination.
+ virtual bool enableSSAMachineScheduler() const;
+
/// True if the subtarget should run the atomic expansion pass.
virtual bool enableAtomicExpand() const;
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 88272f053c114..c2f2765854945 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -289,6 +289,7 @@ LLVM_ABI void initializeReplaceWithVeclibLegacyPass(PassRegistry &);
LLVM_ABI void initializeResetMachineFunctionPass(PassRegistry &);
LLVM_ABI void initializeSCEVAAWrapperPassPass(PassRegistry &);
LLVM_ABI void initializeSROALegacyPassPass(PassRegistry &);
+LLVM_ABI void initializeSSAMachineSchedulerPass(PassRegistry &);
LLVM_ABI void initializeSafeStackLegacyPassPass(PassRegistry &);
LLVM_ABI void initializeSafepointIRVerifierPass(PassRegistry &);
LLVM_ABI void initializeSelectOptimizePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 04a0da06fb6ec..a0825d0bf7780 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -122,6 +122,7 @@ MACHINE_FUNCTION_PASS("machine-cse", MachineCSEPass())
MACHINE_FUNCTION_PASS("machine-latecleanup", MachineLateInstrsCleanupPass())
MACHINE_FUNCTION_PASS("machine-sanmd", MachineSanitizerBinaryMetadataPass())
MACHINE_FUNCTION_PASS("machine-scheduler", MachineSchedulerPass(TM))
+MACHINE_FUNCTION_PASS("ssamisched", SSAMachineSchedulerPass(TM))
MACHINE_FUNCTION_PASS("machinelicm", MachineLICMPass())
MACHINE_FUNCTION_PASS("no-op-machine-function", NoOpMachineFunctionPass())
MACHINE_FUNCTION_PASS("opt-phis", OptimizePHIsPass())
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index 9e0cb3bf44906..cf81a0d240004 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -120,6 +120,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
initializeRemoveLoadsIntoFakeUsesLegacyPass(Registry);
initializeRemoveRedundantDebugValuesLegacyPass(Registry);
initializeRenameIndependentSubregsLegacyPass(Registry);
+ initializeSSAMachineSchedulerPass(Registry);
initializeSafeStackLegacyPassPass(Registry);
initializeSelectOptimizePass(Registry);
initializeShadowStackGCLoweringPass(Registry);
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 299bcc46e4bd2..31c5eac4ec618 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -350,6 +350,33 @@ class MachineSchedulerImpl : public MachineSchedulerBase {
ScheduleDAGInstrs *createMachineScheduler();
};
+/// Impl class for SSAMachineScheduler.
+class SSAMachineSchedulerImpl : public MachineSchedulerBase {
+ // These are only for using MF.verify()
+ // remove when verify supports passing in all analyses
+ MachineFunctionPass *P = nullptr;
+ MachineFunctionAnalysisManager *MFAM = nullptr;
+
+public:
+ struct RequiredAnalyses {
+ MachineLoopInfo &MLI;
+ MachineDominatorTree &MDT;
+ AAResults &AA;
+ LiveIntervals &LIS;
+ };
+
+ SSAMachineSchedulerImpl() {}
+ // Migration only
+ void setLegacyPass(MachineFunctionPass *P) { this->P = P; }
+ void setMFAM(MachineFunctionAnalysisManager *MFAM) { this->MFAM = MFAM; }
+
+ bool run(MachineFunction &MF, const TargetMachine &TM,
+ const RequiredAnalyses &Analyses);
+
+protected:
+ ScheduleDAGInstrs *createMachineScheduler();
+};
+
/// Impl class for PostMachineScheduler.
class PostMachineSchedulerImpl : public MachineSchedulerBase {
// These are only for using MF.verify()
@@ -380,6 +407,7 @@ class PostMachineSchedulerImpl : public MachineSchedulerBase {
using impl_detail::MachineSchedulerBase;
using impl_detail::MachineSchedulerImpl;
using impl_detail::PostMachineSchedulerImpl;
+using impl_detail::SSAMachineSchedulerImpl;
namespace {
/// MachineScheduler runs after coalescing and before register allocation.
@@ -394,6 +422,18 @@ class MachineSchedulerLegacy : public MachineFunctionPass {
static char ID; // Class identification, replacement for typeinfo
};
+/// SSAMachineScheduler runs before PHI elimination.
+class SSAMachineScheduler : public MachineFunctionPass {
+ SSAMachineSchedulerImpl Impl;
+
+public:
+ SSAMachineScheduler();
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+ bool runOnMachineFunction(MachineFunction &) override;
+
+ static char ID; // Class identification, replacement for typeinfo
+};
+
/// PostMachineScheduler runs after shortly before code emission.
class PostMachineSchedulerLegacy : public MachineFunctionPass {
PostMachineSchedulerImpl Impl;
@@ -439,6 +479,35 @@ void MachineSchedulerLegacy::getAnalysisUsage(AnalysisUsage &AU) const {
MachineFunctionPass::getAnalysisUsage(AU);
}
+char SSAMachineScheduler::ID = 0;
+
+char &llvm::SSAMachineSchedulerID = SSAMachineScheduler::ID;
+
+INITIALIZE_PASS_BEGIN(SSAMachineScheduler, "ssamisched",
+ "SSA Machine Instruction Scheduler", false, false)
+INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(MachineLoopInfoWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(SlotIndexesWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(LiveIntervalsWrapperPass)
+INITIALIZE_PASS_END(SSAMachineScheduler, "ssamisched",
+ "SSA Machine Instruction Scheduler", false, false)
+
+SSAMachineScheduler::SSAMachineScheduler() : MachineFunctionPass(ID) {
+ initializeSSAMachineSchedulerPass(*PassRegistry::getPassRegistry());
+}
+
+void SSAMachineScheduler::getAnalysisUsage(AnalysisUsage &AU) const {
+ AU.setPreservesCFG();
+ AU.addRequired<MachineDominatorTreeWrapperPass>();
+ AU.addRequired<MachineLoopInfoWrapperPass>();
+ AU.addRequired<AAResultsWrapperPass>();
+ AU.addRequired<TargetPassConfig>();
+ AU.addRequired<SlotIndexesWrapperPass>();
+ AU.addRequired<LiveIntervalsWrapperPass>();
+ MachineFunctionPass::getAnalysisUsage(AU);
+}
+
char PostMachineSchedulerLegacy::ID = 0;
char &llvm::PostMachineSchedulerID = PostMachineSchedulerLegacy::ID;
@@ -490,6 +559,11 @@ static cl::opt<bool> EnableMachineSched(
cl::desc("Enable the machine instruction scheduling pass."), cl::init(true),
cl::Hidden);
+static cl::opt<bool> EnableSSAMachineSched(
+ "enable-ssa-misched",
+ cl::desc("Enable the machine instruction scheduling pass in SSA."),
+ cl::init(false), cl::Hidden);
+
static cl::opt<bool> EnablePostRAMachineSched(
"enable-post-misched",
cl::desc("Enable the post-ra machine instruction scheduling pass."),
@@ -586,6 +660,53 @@ bool MachineSchedulerImpl::run(MachineFunction &Func, const TargetMachine &TM,
return true;
}
+/// Instantiate a ScheduleDAGInstrs that will be owned by the caller.
+ScheduleDAGInstrs *SSAMachineSchedulerImpl::createMachineScheduler() {
+ // Get the default scheduler set by the target for this function.
+ ScheduleDAGInstrs *Scheduler = TM->createMachineScheduler(this);
+ if (Scheduler)
+ return Scheduler;
+
+ // Default to GenericScheduler.
+ return createSchedLive(this);
+}
+
+bool SSAMachineSchedulerImpl::run(MachineFunction &Func,
+ const TargetMachine &TM,
+ const RequiredAnalyses &Analyses) {
+ MF = &Func;
+ MLI = &Analyses.MLI;
+ MDT = &Analyses.MDT;
+ this->TM = &TM;
+ AA = &Analyses.AA;
+ LIS = &Analyses.LIS;
+
+ if (VerifyScheduling) {
+ LLVM_DEBUG(LIS->dump());
+ const char *MSchedBanner = "Before machine scheduling.";
+ if (P)
+ MF->verify(P, MSchedBanner, &errs());
+ else
+ MF->verify(*MFAM, MSchedBanner, &errs());
+ }
+ RegClassInfo->runOnMachineFunction(*MF);
+
+ // Instantiate the selected scheduler for this target, function, and
+ // optimization level.
+ std::unique_ptr<ScheduleDAGInstrs> Scheduler(createMachineScheduler());
+ scheduleRegions(*Scheduler, false);
+
+ LLVM_DEBUG(LIS->dump());
+ if (VerifyScheduling) {
+ const char *MSchedBanner = "After machine scheduling.";
+ if (P)
+ MF->verify(P, MSchedBanner, &errs());
+ else
+ MF->verify(*MFAM, MSchedBanner, &errs());
+ }
+ return true;
+}
+
/// Instantiate a ScheduleDAGInstrs for PostRA scheduling that will be owned by
/// the caller. We don't have a command line option to override the postRA
/// scheduler. The Target must configure it.
@@ -668,12 +789,40 @@ bool MachineSchedulerLegacy::runOnMachineFunction(MachineFunction &MF) {
return Impl.run(MF, TM, {MLI, MDT, AA, LIS});
}
+bool SSAMachineScheduler::runOnMachineFunction(MachineFunction &MF) {
+ if (skipFunction(MF.getFunction()))
+ return false;
+
+ if (EnableSSAMachineSched.getNumOccurrences()) {
+ if (!EnableSSAMachineSched)
+ return false;
+ } else if (!MF.getSubtarget().enableSSAMachineScheduler()) {
+ return false;
+ }
+
+ LLVM_DEBUG(dbgs() << "Before ssa-MI-sched:\n"; MF.print(dbgs()));
+
+ auto &MLI = getAnalysis<MachineLoopInfoWrapperPass>().getLI();
+ auto &MDT = getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree();
+ auto &TM = getAnalysis<TargetPassConfig>().getTM<TargetMachine>();
+ auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
+ auto &LIS = getAnalysis<LiveIntervalsWrapperPass>().getLIS();
+ Impl.setLegacyPass(this);
+ return Impl.run(MF, TM, {MLI, MDT, AA, LIS});
+}
+
MachineSchedulerPass::MachineSchedulerPass(const TargetMachine *TM)
: Impl(std::make_unique<MachineSchedulerImpl>()), TM(TM) {}
MachineSchedulerPass::~MachineSchedulerPass() = default;
MachineSchedulerPass::MachineSchedulerPass(MachineSchedulerPass &&Other) =
default;
+SSAMachineSchedulerPass::SSAMachineSchedulerPass(const TargetMachine *TM)
+ : Impl(std::make_unique<SSAMachineSchedulerImpl>()), TM(TM) {}
+SSAMachineSchedulerPass::SSAMachineSchedulerPass(
+ SSAMachineSchedulerPass &&Other) = default;
+SSAMachineSchedulerPass::~SSAMachineSchedulerPass() = default;
+
PostMachineSchedulerPass::PostMachineSchedulerPass(const TargetMachine *TM)
: Impl(std::make_unique<PostMachineSchedulerImpl>()), TM(TM) {}
PostMachineSchedulerPass::PostMachineSchedulerPass(
@@ -708,6 +857,33 @@ MachineSchedulerPass::run(MachineFunction &MF,
.preserve<LiveIntervalsAnalysis>();
}
+PreservedAnalyses
+SSAMachineSchedulerPass::run(MachineFunction &MF,
+ MachineFunctionAnalysisManager &MFAM) {
+ if (EnableSSAMachineSched.getNumOccurrences()) {
+ if (!EnableSSAMachineSched)
+ return PreservedAnalyses::all();
+ } else if (!MF.getSubtarget().enableSSAMachineScheduler()) {
+ LLVM_DEBUG(dbgs() << "Subtarget disables ssa-MI-sched.\n");
+ return PreservedAnalyses::all();
+ }
+ LLVM_DEBUG(dbgs() << "Before ssa-MI-sched:\n"; MF.print(dbgs()));
+ auto &MLI = MFAM.getResult<MachineLoopAnalysis>(MF);
+ auto &MDT = MFAM.getResult<MachineDominatorTreeAnalysis>(MF);
+ auto &FAM = MFAM.getResult<FunctionAnalysisManagerMachineFunctionProxy>(MF)
+ .getManager();
+ auto &AA = FAM.getResult<AAManager>(MF.getFunction());
+ auto &LIS = MFAM.getResult<LiveIntervalsAnalysis>(MF);
+ Impl->setMFAM(&MFAM);
+ bool Changed = Impl->run(MF, *TM, {MLI, MDT, AA, LIS});
+ if (!Changed)
+ return PreservedAnalyses::all();
+
+ PreservedAnalyses PA = getMachineFunctionPassPreservedAnalyses();
+ PA.preserveSet<CFGAnalyses>();
+ return PA;
+}
+
bool PostMachineSchedulerLegacy::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction()))
return false;
@@ -764,11 +940,10 @@ PostMachineSchedulerPass::run(MachineFunction &MF,
/// the boundary, but there would be no benefit to postRA scheduling across
/// calls this late anyway.
static bool isSchedBoundary(MachineBasicBlock::iterator MI,
- MachineBasicBlock *MBB,
- MachineFunction *MF,
+ MachineBasicBlock *MBB, MachineFunction *MF,
const TargetInstrInfo *TII) {
return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF) ||
- MI->isFakeUse();
+ MI->isFakeUse() || MI->isPHI();
}
using MBBRegionsVector = SmallVector<SchedRegion, 16>;
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index b6169e6c4dc34..2f4c47212215e 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1479,6 +1479,12 @@ void TargetPassConfig::addOptimizedRegAlloc() {
addPass(&UnreachableMachineBlockElimID);
addPass(&LiveVariablesID);
+ // Run SSA machine scheduler runs just before PHI elimination.
+ if (EnableSSAMachineScheduler) {
+ addPass(&LiveIntervalsID);
+ addPass(&SSAMachineSchedulerID);
+ }
+
// Edge splitting is smarter with machine loop info.
addPass(&MachineLoopInfoID);
addPass(&PHIEliminationID);
diff --git a/llvm/lib/CodeGen/TargetSubtargetInfo.cpp b/llvm/lib/CodeGen/TargetSubtargetInfo.cpp
index cd396e6a619a8..cee5162223f71 100644
--- a/llvm/lib/CodeGen/TargetSubtargetInfo.cpp
+++ b/llvm/lib/CodeGen/TargetSubtargetInfo.cpp
@@ -54,6 +54,8 @@ bool TargetSubtargetInfo::enablePostRAMachineScheduler() const {
return enableMachineScheduler() && enablePostRAScheduler();
}
+bool TargetSubtargetInfo::enableSSAMachineScheduler() const { return false; }
+
bool TargetSubtargetInfo::useAA() const {
return false;
}
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 92a587b5771b6..f8ec24c21efaf 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -526,6 +526,11 @@ static cl::opt<bool> HasClosedWorldAssumption(
cl::desc("Whether has closed-world assumption at link time"),
cl::init(false), cl::Hidden);
+static cl::opt<bool>
+ UseSSAMachineScheduler("amdgpu-use-ssa-machine-scheduler",
+ cl::desc("Use the machine scheduler in SSA mode."),
+ cl::init(false), cl::Hidden);
+
extern "C" LLVM_ABI LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
// Register the target
RegisterTargetMachine<R600TargetMachine> X(getTheR600Target());
@@ -1255,6 +1260,12 @@ AMDGPUPassConfig::AMDGPUPassConfig(TargetMachine &TM, PassManagerBase &PM)
// Garbage collection is not supported.
disablePass(&GCLoweringID);
disablePass(&ShadowStackGCLoweringID);
+
+ if (UseSSAMachineScheduler) {
+ // Use SSA Machine Scheduler instead of regular Machine Scheduler.
+ disablePass(&MachineSchedulerID);
+ setEnableSSAMachineScheduler(true);
+ }
}
void AMDGPUPassConfig::addEarlyCSEOrGVNPass() {
@@ -1594,20 +1605,24 @@ void GCNPassConfig::addOptimizedRegAlloc() {
if (EnableRewritePartialRegUses)
insertPass(&RenameIndependentSubregsID, &GCNRewritePartialRegUsesID);
+ // Insertion point for passes depends on whether MachineScheduler is enabled.
+ AnalysisID EndOfPreRA = UseSSAMachineScheduler ? &RenameIndependentSubregsID
+ : &MachineSchedulerID;
+
if (isPassEnabled(EnablePreRAOptimizations))
- insertPass(&MachineSchedulerID, &GCNPreRAOptimizationsID);
+ insertPass(EndOfPreRA, &GCNPreRAOptimizationsID);
// Allow the scheduler to run before SIWholeQuadMode inserts exec manipulation
// instructions that cause scheduling barriers.
- insertPass(&MachineSchedulerID, &SIWholeQuadModeID);
+ insertPass(EndOfPreRA, &SIWholeQuadModeID);
if (OptExecMaskPreRA)
- insertPass(&MachineSchedulerID, &SIOptimizeExecMaskingPreRAID);
+ insertPass(EndOfPreRA, &SIOptimizeExecMaskingPreRAID);
// This is not an essential optimization and it has a noticeable impact on
// compilation time, so we only enable it from O2.
if (TM->getOptLevel() > CodeGenOptLevel::Less)
- insertPass(&MachineSchedulerID, &SIFormMemoryClausesID);
+ insertPass(EndOfPreRA, &SIFormMemoryClausesID);
TargetPassConfig::addOptimizedRegAlloc();
}
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index ef63acc6355d2..65068114e8231 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -615,6 +615,8 @@ bool GCNDownwardRPTracker::advanceBeforeNext(MachineInstr *MI,
continue;
if (MO.isUse() && !MO.readsReg())
continue;
+ if (MO.isUse() && MO.getParent()->getOpcode() == AMDGPU::PHI)
+ continue;
if (!UseInternalIterator && MO.isDef())
continue;
if (!SeenRegs.insert(MO.getReg()).secon...
[truncated]
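The enablement check that appears in both `SSAMachineScheduler::runOnMachineFunction` and `SSAMachineSchedulerPass::run` gives an explicit `-enable-ssa-misched` on the command line priority over the subtarget hook. The following is a minimal standalone sketch of that precedence rule with no LLVM dependency. `FlagState` and `shouldRunSSAScheduler` are hypothetical names used only for illustration; they stand in for the `cl::opt<bool>` and the inline check in the patch.

```cpp
#include <optional>

// Hypothetical stand-in for a cl::opt<bool>: an empty value means the flag
// did not appear on the command line (getNumOccurrences() == 0 in the patch).
using FlagState = std::optional<bool>;

// Mirrors the precedence in the patch: an explicit flag wins; otherwise the
// subtarget hook (enableSSAMachineScheduler() in the patch) decides.
bool shouldRunSSAScheduler(FlagState ExplicitFlag, bool SubtargetEnables) {
  if (ExplicitFlag.has_value())
    return *ExplicitFlag;
  return SubtargetEnables;
}
```

Note that in the legacy pipeline shown above the pass is only added when `EnableSSAMachineScheduler` is set on `TargetPassConfig`, so this check only matters once the pass is actually present in the pipeline.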
continue;
if (MO.isUse() && !MO.readsReg())
continue;
+ if (MO.isUse() && MO.getParent()->getOpcode() == AMDGPU::PHI)
Suggested change:
- if (MO.isUse() && MO.getParent()->getOpcode() == AMDGPU::PHI)
+ if (MO.isUse() && CurrMI->getOpcode() == AMDGPU::PHI)
Alternatively skip this whole loop for phis?
I think there are cases where we still need to process the def from the PHI, so I have modified this to break from the loop on the first PHI use.
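The difference between skipping the whole loop for PHIs and breaking on the first PHI use can be sketched with a small standalone model, independent of LLVM. It assumes that defs precede uses in the operand list, so stopping at the first PHI use still visits the PHI's def. `Operand`, `Instr`, and `collectTrackedRegs` are hypothetical names for illustration only; the model simplifies PHI operands to plain registers and does not reproduce the body of `GCNDownwardRPTracker::advanceBeforeNext`.

```cpp
#include <vector>

// Hypothetical, simplified operand: a register number plus a def/use bit.
struct Operand {
  unsigned Reg;
  bool IsDef;
};

// Hypothetical, simplified instruction. Assumption: defs come before uses.
struct Instr {
  bool IsPHI;
  std::vector<Operand> Operands;
};

// Returns the registers the loop would examine for MI. For a PHI, the loop
// stops at the first use, so the def is still examined but no input is.
std::vector<unsigned> collectTrackedRegs(const Instr &MI) {
  std::vector<unsigned> Regs;
  for (const Operand &MO : MI.Operands) {
    if (MI.IsPHI && !MO.IsDef)
      break; // First PHI use: all remaining operands are PHI inputs too.
    Regs.push_back(MO.Reg);
  }
  return Regs;
}
```

Skipping the loop entirely for PHIs would return an empty list in this model, which is the case the reply wants to avoid.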
+static cl::opt<bool>
+ UseSSAMachineScheduler("amdgpu-use-ssa-machine-scheduler",
+ cl::desc("Use the machine scheduler in SSA mode."),
+ cl::init(false), cl::Hidden);
Can you fix this to have one general flag that works, instead of making every target override a virtual and add its own flag? This is a really stupid pattern that keeps repeating.
Yes, but my concern with that is it means embedding an AMDGPU-specific design decision in the general pass list.
This target-specific flag both enables the SSA scheduler and disables the non-SSA one.
It's possible to imagine other backends (or even future AMDGPU) running both SSA and non-SSA schedulers.
The current design offers greater flexibility to backend implementers.
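The flexibility argument can be illustrated with a small standalone model, assuming the two decisions (enable the SSA scheduler, disable the regular one) are kept independent as in the patch. `SchedulerConfig` and `buildSchedulerPipeline` are hypothetical names, not LLVM types, and the returned strings are labels for illustration rather than a faithful pass pipeline.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a target's scheduling choices; not an LLVM type.
struct SchedulerConfig {
  // Corresponds to setEnableSSAMachineScheduler(true) in the patch.
  bool EnableSSAScheduler = false;
  // Corresponds to disablePass(&MachineSchedulerID) in the patch.
  bool DisableRegularScheduler = false;
};

// Lists the relevant passes in order: the SSA scheduler runs before PHI
// elimination, the regular machine scheduler runs after it.
std::vector<std::string> buildSchedulerPipeline(const SchedulerConfig &C) {
  std::vector<std::string> Passes;
  if (C.EnableSSAScheduler)
    Passes.push_back("ssamisched");
  Passes.push_back("phi-node-elimination");
  if (!C.DisableRegularScheduler)
    Passes.push_back("machine-scheduler");
  return Passes;
}
```

With both fields set, the model yields the AMDGPU behaviour described here (SSA scheduler only). Setting only `EnableSSAScheduler` yields the "both schedulers" arrangement that a single general flag coupling the two decisions would rule out.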