-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[AMDGPU][NPM] Fix CFG invalidation detection in insertSimulatedTrap #169290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When SIMULATED_TRAP is at the end of a block with no successors, insertSimulatedTrap incorrectly returns the original MBB despite adding HaltLoopBB to the CFG. EmitInstrWithCustomInserter detects CFG changes by comparing the returned MBB with the original. When they match, it assumes no modification occurred and skips MachineLoopInfo invalidation. This causes stale loop information in subsequent passes. Fix: Return HaltLoopBB to properly signal the CFG modification.
|
@llvm/pr-subscribers-backend-amdgpu Author: Prasoon Mishra (PrasoonMishra) ChangesWhen SIMULATED_TRAP is at the end of a block with no successors, insertSimulatedTrap incorrectly returns the original MBB despite adding HaltLoopBB to the CFG. EmitInstrWithCustomInserter detects CFG changes by comparing the returned MBB with the original. When they match, it assumes no modification occurred and skips MachineLoopInfo invalidation. This causes stale loop information in subsequent passes. Fix: Return HaltLoopBB to properly signal the CFG modification. Full diff: https://github.com/llvm/llvm-project/pull/169290.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index a7333e3373f38..9b05d99e265e6 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -1963,6 +1963,10 @@ MachineBasicBlock *SIInstrInfo::insertSimulatedTrap(MachineRegisterInfo &MRI,
BuildMI(MBB, MI, DL, get(AMDGPU::S_CBRANCH_EXECNZ)).addMBB(TrapBB);
MF->push_back(TrapBB);
MBB.addSuccessor(TrapBB);
+ } else {
+ // Since we're adding HaltLoopBB and modifying the CFG, we must return a
+ // different block to signal the change.
+ ContBB = HaltLoopBB;
}
// Start with a `s_trap 2`, if we're in PRIV=1 and we need the workaround this
|
|
This bug was discovered while enabling the NPM pipeline for AMDGPU. Without this patch, the existing test trap-abis.ll fails under NPM. The Expected: This happens only when using NPM pipeline because legacy pass manager conservatively recomputes analyses, so MachineLoopInfo is refreshed before AsmPrinter, masking the bug. NPM relies on accurate invalidation signals hence when insertSimulatedTrap incorrectly signals “no CFG change,” MachineLoopInfo remains stale, causing missing loop headers. This fix resolves the bug. |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/21528 Here is the relevant piece of the build log for the reference |
When SIMULATED_TRAP is at the end of a block with no successors, insertSimulatedTrap incorrectly returns the original MBB despite adding HaltLoopBB to the CFG.
EmitInstrWithCustomInserter detects CFG changes by comparing the returned MBB with the original. When they match, it assumes no modification occurred and skips MachineLoopInfo invalidation. This causes stale loop information in subsequent passes, particularly when using the NPM which relies on accurate invalidation signals.
Fix: Return HaltLoopBB to properly signal the CFG modification.