Skip to content

Conversation

linuxrocks123
Copy link
Contributor

Still needs tests and testing. Also needs camel case. If the snake case bothers anyone, I'll fix it now.

Copy link

github-actions bot commented Sep 18, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/lib/Target/AMDGPU/SICustomBranchBundles.h llvm/lib/Target/AMDGPU/SIRestoreNormalEpilog.cpp llvm/lib/Target/AMDGPU/AMDGPU.h llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp llvm/lib/Target/AMDGPU/SILowerControlFlow.h

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 8210ad88b..d2c1f7f49 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1564,7 +1564,7 @@ void GCNPassConfig::addFastRegAlloc() {
   // This must be run immediately after phi elimination and before
   // TwoAddressInstructions, otherwise the processing of the tied operand of
   // SI_ELSE will introduce a copy of the tied operand source after the else.
-  //insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
+  // insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
 
   insertPass(&TwoAddressInstructionPassID, &SIWholeQuadModeID);
 
@@ -1592,12 +1592,12 @@ void GCNPassConfig::addOptimizedRegAlloc() {
   // This must be run immediately after phi elimination and before
   // TwoAddressInstructions, otherwise the processing of the tied operand of
   // SI_ELSE will introduce a copy of the tied operand source after the else.
-  //insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
+  // insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
 
   if (EnableRewritePartialRegUses)
     insertPass(&RenameIndependentSubregsID, &GCNRewritePartialRegUsesID);
-  
-  insertPass(&RenameIndependentSubregsID,&SIRestoreNormalEpilogLegacyID);
+
+  insertPass(&RenameIndependentSubregsID, &SIRestoreNormalEpilogLegacyID);
 
   if (isPassEnabled(EnablePreRAOptimizations))
     insertPass(&MachineSchedulerID, &GCNPreRAOptimizationsID);
@@ -2261,7 +2261,7 @@ void AMDGPUCodeGenPassBuilder::addOptimizedRegAlloc(
   // This must be run immediately after phi elimination and before
   // TwoAddressInstructions, otherwise the processing of the tied operand of
   // SI_ELSE will introduce a copy of the tied operand source after the else.
-  //insertPass<PHIEliminationPass>(SILowerControlFlowPass());
+  // insertPass<PHIEliminationPass>(SILowerControlFlowPass());
 
   if (EnableRewritePartialRegUses)
     insertPass<RenameIndependentSubregsPass>(GCNRewritePartialRegUsesPass());
diff --git a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
index ef091088a..de53c1984 100644
--- a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
@@ -48,12 +48,12 @@
 /// %exec = S_OR_B64 %exec, %sgpr0     // Re-enable saved exec mask bits
 //===----------------------------------------------------------------------===//
 
-#include "SICustomBranchBundles.h"
 #include "SILowerControlFlow.h"
 #include "AMDGPU.h"
 #include "AMDGPULaneMaskUtils.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "SICustomBranchBundles.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/CodeGen/LiveIntervals.h"
 #include "llvm/CodeGen/LiveVariables.h"
@@ -160,7 +160,7 @@ public:
   MachineFunctionProperties getClearedProperties() const override {
     return MachineFunctionProperties().setNoPHIs();
   }
-  
+
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.addUsedIfAvailable<LiveIntervalsWrapperPass>();
     // Should preserve the same set that TwoAddressInstructions does.
diff --git a/llvm/lib/Target/AMDGPU/SIRestoreNormalEpilog.cpp b/llvm/lib/Target/AMDGPU/SIRestoreNormalEpilog.cpp
index d6d02c940..09567d424 100644
--- a/llvm/lib/Target/AMDGPU/SIRestoreNormalEpilog.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRestoreNormalEpilog.cpp
@@ -1,7 +1,7 @@
-#include "SICustomBranchBundles.h"
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "SICustomBranchBundles.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/CodeGen/LiveIntervals.h"
 #include "llvm/CodeGen/LiveVariables.h"
@@ -12,8 +12,7 @@
 
 #define DEBUG_TYPE "si-restore-normal-epilog"
 
-namespace
-{
+namespace {
 
 class SIRestoreNormalEpilogLegacy : public MachineFunctionPass {
 public:
@@ -34,7 +33,6 @@ public:
   MachineFunctionProperties getRequiredProperties() const override {
     return MachineFunctionProperties().setNoPHIs();
   }
-
 };
 
 } // namespace


bool phi_seen = false;
MachineBasicBlock::iterator first_phi;
for (first_phi = MBB.begin(); first_phi != MBB.end(); first_phi++)
Copy link
Contributor

@alex-t alex-t Sep 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool phi_seen = !MBB.phis().empty() ?
or even:
if (!MBB.phis().empty()) first_phi = MBB.phis().begin();

break;
}

if (!phi_seen) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary as SILowerControlFlow always insert at block begin.

MachineBasicBlock::iterator Start = MBB.begin();
Register SaveReg = MRI->createVirtualRegister(BoolRC);
MachineInstr *OrSaveExec =
BuildMI(MBB, Start, DL, TII->get(LMC.OrSaveExecOpc), SaveReg)
.add(MI.getOperand(1)); // Saved EXEC

move_ins_before_phis(*OrSaveExec);

if (branch_MI.isBranch() && TII.getBranchDestBlock(branch_MI) == &succ_MBB)
return ++Epilog_Iterator(branch_MI.getIterator());

llvm_unreachable("There should always be a branch to succ_MBB.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is about fall through?
bb.1: successors: %bb.2(0x80000000)

%64:vgpr_32 = V_MOV_B32_e32 100, implicit $exec

bb.2: successors: %bb.5(0x40000000), %bb.3(0x40000000)

Copy link
Contributor Author

@linuxrocks123 linuxrocks123 Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looked like the IR wasn't allowed to have fall-throughs at this stage, because I didn't see any and I did see unconditional-branch-to-next. Are fall-throughs instead allowed and just not common at this stage?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fall-throughs are not prohibited and I have seen enough valid MIR in SSA with fall-through CF.

@@ -323,6 +332,8 @@ void SILowerControlFlow::emitElse(MachineInstr &MI) {
if (LV)
LV->replaceKillInstruction(SrcReg, MI, *OrSaveExec);

move_ins_before_phis(*OrSaveExec);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXEC restoring instructions happen not only in ELSE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the other ones are at the end of the blocks, not before the PHIs, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we lower SI_END_CF and don't split block we are going to have Exec = Exec OR Masked in the beginning or middle.

MachineInstr *cloned_MI = MF.CloneMachineInstr(&MI);
cloned_MI->getOperand(0).setReg(cloned_reg);
phi.addReg(cloned_reg).addMBB(pred_MBB);
pred_MBB->insertAfterBundle(branch_MI.getIterator(), cloned_MI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to make sure that there is no Machine Verifier between SILowerControlFlow and SIRestoreNormalEpilog
Also, check if unusual placement instructions after the branch does not break SILowerControlFlow logic itself

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must not be a machine verifier or the testcase would have broken, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants