[TwoAddressInstruction] Propagate undef flags for partial defs #79286
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-amdgpu
Author: Carl Ritson (perlfu)
Changes
If part of a register (lowered from REG_SEQUENCE) is undefined then we should propagate undef flags to uses of those lanes. This is only performed when live intervals are present as it requires live intervals to correctly match uses to defs, and the primary goal is to allow precise computation of subrange intervals.
Full diff: https://github.com/llvm/llvm-project/pull/79286.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
index 74d7904aee33a2d..0b8bef24d1a7bb6 100644
--- a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+++ b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
@@ -1929,21 +1929,26 @@ eliminateRegSequence(MachineBasicBlock::iterator &MBBI) {
Register DstReg = MI.getOperand(0).getReg();
SmallVector<Register, 4> OrigRegs;
+ VNInfo *DefVN = nullptr;
if (LIS) {
OrigRegs.push_back(MI.getOperand(0).getReg());
for (unsigned i = 1, e = MI.getNumOperands(); i < e; i += 2)
OrigRegs.push_back(MI.getOperand(i).getReg());
+ if (LIS->hasInterval(DstReg)) {
+ DefVN = LIS->getInterval(DstReg).Query(
+ LIS->getInstructionIndex(MI)).valueOut();
+ }
}
+ LaneBitmask UndefLanes = LaneBitmask::getNone();
bool DefEmitted = false;
- bool DefIsPartial = false;
for (unsigned i = 1, e = MI.getNumOperands(); i < e; i += 2) {
MachineOperand &UseMO = MI.getOperand(i);
Register SrcReg = UseMO.getReg();
unsigned SubIdx = MI.getOperand(i+1).getImm();
// Nothing needs to be inserted for undef operands.
if (UseMO.isUndef()) {
- DefIsPartial = true;
+ UndefLanes |= TRI->getSubRegIndexLaneMask(SubIdx);
continue;
}
@@ -1991,11 +1996,24 @@ eliminateRegSequence(MachineBasicBlock::iterator &MBBI) {
MI.removeOperand(j);
} else {
if (LIS) {
- // Force interval recomputation if we moved from full definition
- // of register to partial.
- if (DefIsPartial && LIS->hasInterval(DstReg) &&
- MRI->shouldTrackSubRegLiveness(DstReg))
+ // Force live interval recomputation if we moved to a partial definition
+ // of the register. Undef flags must be propagated to uses of undefined
+ // subregisters for accurate interval computation.
+ if (UndefLanes.any() && DefVN && MRI->shouldTrackSubRegLiveness(DstReg)) {
+ auto &LI = LIS->getInterval(DstReg);
+ for (MachineOperand &UseOp : MRI->use_operands(DstReg)) {
+ unsigned SubReg = UseOp.getSubReg();
+ if (UseOp.isUndef() || !SubReg)
+ continue;
+ auto *VN = LI.getVNInfoAt(LIS->getInstructionIndex(*UseOp.getParent()));
+ if (DefVN != VN)
+ continue;
+ LaneBitmask LaneMask = TRI->getSubRegIndexLaneMask(SubReg);
+ if ((UndefLanes & LaneMask).any())
+ UseOp.setIsUndef(true);
+ }
LIS->removeInterval(DstReg);
+ }
LIS->RemoveMachineInstrFromMaps(MI);
}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
index ac153183be642a2..3f11c122a681463 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
@@ -1,5 +1,6 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,GPRIDX %s
+; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx900 -early-live-intervals -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,GPRIDX %s
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,MOVREL %s
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx1010 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX10PLUS,GFX10 %s
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx1100 -amdgpu-enable-delay-alu=0 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX10PLUS,GFX11 %s
✅ With the latest revision this PR passed the C/C++ code formatter.
I guess this is required for correctness, but it is not clear to me what the failure is without this. Do you have a small MIR example that needs this?
I've switched this to a concrete MIR example. The root cause could be seen as an issue/bug with GlobalISel code gen, although I think this change is still beneficial. The result is that GlobalISel generates a
Then generates code to extract that element if selected:
Without propagating the undef flags, the above instructions will fail the verifier if LiveIntervals are available.
llvm/test/CodeGen/AMDGPU/GlobalISel/twoaddr-extract-dyn-v7f64.mir
If part of a register (lowered from REG_SEQUENCE) is undefined then we should propagate undef flags to uses of those lanes. This is only performed when live intervals are present as it requires live intervals to correctly match uses to defs, and the primary goal is to allow precise computation of subrange intervals.
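For readers who want the matching rule in isolation, here is a minimal sketch of the propagation described above. It is not the pass code: `LaneMask`, `SeqOperand`, `UseOperand`, `collectUndefLanes` and `propagateUndef` are hypothetical stand-ins, with a plain integer bitmask in place of `LaneBitmask` and an integer value number in place of the `VNInfo` pointer obtained from live intervals.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::LaneBitmask: one bit per register lane.
using LaneMask = std::uint32_t;

// One (source, subregister index) pair of a REG_SEQUENCE, reduced to the
// two properties that matter for this sketch.
struct SeqOperand {
  LaneMask Lanes; // lanes covered by the subregister index
  bool IsUndef;   // source operand carries the undef flag
};

// Simplified model of one use operand of the destination register.
struct UseOperand {
  LaneMask Lanes;  // lanes read via the subregister index; 0 = full register
  int ValueNumber; // definition reaching this use (stand-in for VNInfo *)
  bool IsUndef;    // undef flag on the use
};

// Union of the lanes that the REG_SEQUENCE leaves undefined.
LaneMask collectUndefLanes(const std::vector<SeqOperand> &Ops) {
  LaneMask Undef = 0;
  for (const SeqOperand &Op : Ops)
    if (Op.IsUndef)
      Undef |= Op.Lanes;
  return Undef;
}

// Flags every subregister use that is reached by DefVN and reads at least
// one undefined lane. Returns the number of uses newly flagged.
int propagateUndef(std::vector<UseOperand> &Uses, LaneMask UndefLanes,
                   int DefVN) {
  assert(UndefLanes != 0 && "only call this when some lane is undefined");
  int Flagged = 0;
  for (UseOperand &Use : Uses) {
    // Skip uses that are already undef and full-register uses.
    if (Use.IsUndef || Use.Lanes == 0)
      continue;
    // Skip uses reached by a different definition of the register.
    if (Use.ValueNumber != DefVN)
      continue;
    if ((UndefLanes & Use.Lanes) != 0) {
      Use.IsUndef = true;
      ++Flagged;
    }
  }
  return Flagged;
}
```

In this model, a use is left alone if it reads only defined lanes, reads the whole register, or is reached by a different value number. The last case is why the description ties the propagation to live intervals being present.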