-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[TII] Split isTrivialReMaterializable into two versions [nfc] #160377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change builds on llvm#160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where.
@llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-amdgpu Author: Philip Reames (preames) ChangesThis change builds on #160319 which tries to clarify which callers (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does not enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where. Full diff: https://github.com/llvm/llvm-project/pull/160377.diff 11 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index cefb0d35690ba..046b60256fb29 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -168,6 +168,19 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
/// registers so that the instructions result is independent of the place
/// in the function.
bool isTriviallyReMaterializable(const MachineInstr &MI) const {
+ if (!isReMaterializable(MI))
+ return false;
+ for (const MachineOperand &MO : MI.all_uses()) {
+ if (MO.getReg().isVirtual())
+ return false;
+ }
+ return true;
+ }
+
+ /// Return true if the instruction would be materializable at a point
+ /// in the containing function where all virtual register uses were
+ /// known to be live and available in registers.
+ bool isReMaterializable(const MachineInstr &MI) const {
return (MI.getOpcode() == TargetOpcode::IMPLICIT_DEF &&
MI.getNumOperands() == 1) ||
(MI.getDesc().isRematerializable() &&
@@ -194,10 +207,9 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
protected:
/// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
/// set, this hook lets the target specify whether the instruction is actually
- /// trivially rematerializable, taking into consideration its operands. This
+ /// rematerializable, taking into consideration its operands. This
/// predicate must return false if the instruction has any side effects other
- /// than producing a value, or if it requres any address registers that are
- /// not always available.
+ /// than producing a value.
virtual bool isReMaterializableImpl(const MachineInstr &MI) const;
/// This method commutes the operands of the given machine instruction MI.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 1db742253d594..05d3d18aa9557 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2212,11 +2212,7 @@ findPrologueEndLoc(const MachineFunction *MF) {
-> std::optional<std::pair<const MachineInstr *, bool>> {
// Is this instruction trivial data shuffling or frame-setup?
bool isCopy = (TII.isCopyInstr(MI) ? true : false);
- bool isTrivRemat =
- TII.isTriviallyReMaterializable(MI) &&
- llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) {
- return MO.getReg().isVirtual();
- });
+ bool isTrivRemat = TII.isTriviallyReMaterializable(MI);
bool isFrameSetup = MI.getFlag(MachineInstr::FrameSetup);
if (!isFrameSetup && MI.getDebugLoc()) {
diff --git a/llvm/lib/CodeGen/CalcSpillWeights.cpp b/llvm/lib/CodeGen/CalcSpillWeights.cpp
index b16694eafd90e..f5a6eec57ee7e 100644
--- a/llvm/lib/CodeGen/CalcSpillWeights.cpp
+++ b/llvm/lib/CodeGen/CalcSpillWeights.cpp
@@ -122,7 +122,7 @@ bool VirtRegAuxInfo::isRematerializable(const LiveInterval &LI,
assert(MI && "Dead valno in interval");
}
- if (!TII.isTriviallyReMaterializable(*MI))
+ if (!TII.isReMaterializable(*MI))
return false;
}
return true;
diff --git a/llvm/lib/CodeGen/LiveRangeEdit.cpp b/llvm/lib/CodeGen/LiveRangeEdit.cpp
index 33e980a5993d3..2f65be51bb726 100644
--- a/llvm/lib/CodeGen/LiveRangeEdit.cpp
+++ b/llvm/lib/CodeGen/LiveRangeEdit.cpp
@@ -80,7 +80,7 @@ void LiveRangeEdit::scanRemattable() {
MachineInstr *DefMI = LIS.getInstructionFromIndex(OrigVNI->def);
if (!DefMI)
continue;
- if (TII.isTriviallyReMaterializable(*DefMI))
+ if (TII.isReMaterializable(*DefMI))
Remattable.insert(OrigVNI);
}
ScannedRemattable = true;
@@ -387,7 +387,7 @@ void LiveRangeEdit::eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink) {
// register uses. That may provoke RA to split an interval at the KILL
// and later result in an invalid live segment end.
if (isOrigDef && DeadRemats && !HasLiveVRegUses &&
- TII.isTriviallyReMaterializable(*MI)) {
+ TII.isReMaterializable(*MI)) {
LiveInterval &NewLI = createEmptyIntervalFrom(Dest, false);
VNInfo::Allocator &Alloc = LIS.getVNInfoAllocator();
VNInfo *VNI = NewLI.getNextValue(Idx, Alloc);
diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index 4f164e2d53460..7acddff753693 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -244,8 +244,6 @@ namespace {
bool IsGuaranteedToExecute(MachineBasicBlock *BB, MachineLoop *CurLoop);
- bool isTriviallyReMaterializable(const MachineInstr &MI) const;
-
void EnterScope(MachineBasicBlock *MBB);
void ExitScope(MachineBasicBlock *MBB);
@@ -771,23 +769,6 @@ bool MachineLICMImpl::IsGuaranteedToExecute(MachineBasicBlock *BB,
return true;
}
-/// Check if \p MI is trivially remateralizable and if it does not have any
-/// virtual register uses. Even though rematerializable RA might not actually
-/// rematerialize it in this scenario. In that case we do not want to hoist such
-/// instruction out of the loop in a belief RA will sink it back if needed.
-bool MachineLICMImpl::isTriviallyReMaterializable(
- const MachineInstr &MI) const {
- if (!TII->isTriviallyReMaterializable(MI))
- return false;
-
- for (const MachineOperand &MO : MI.all_uses()) {
- if (MO.getReg().isVirtual())
- return false;
- }
-
- return true;
-}
-
void MachineLICMImpl::EnterScope(MachineBasicBlock *MBB) {
LLVM_DEBUG(dbgs() << "Entering " << printMBBReference(*MBB) << '\n');
@@ -1300,9 +1281,9 @@ bool MachineLICMImpl::IsProfitableToHoist(MachineInstr &MI,
return false;
}
- // Rematerializable instructions should always be hoisted providing the
- // register allocator can just pull them down again when needed.
- if (isTriviallyReMaterializable(MI))
+ // Trivially rematerializable instructions should always be hoisted
+ // providing the register allocator can just pull them down again when needed.
+ if (TII->isTriviallyReMaterializable(MI))
return true;
// FIXME: If there are long latency loop-invariant instructions inside the
@@ -1386,7 +1367,7 @@ bool MachineLICMImpl::IsProfitableToHoist(MachineInstr &MI,
// High register pressure situation, only hoist if the instruction is going
// to be remat'ed.
- if (!isTriviallyReMaterializable(MI) &&
+ if (!TII->isTriviallyReMaterializable(MI) &&
!MI.isDereferenceableInvariantLoad()) {
LLVM_DEBUG(dbgs() << "Can't remat / high reg-pressure: " << MI);
return false;
diff --git a/llvm/lib/CodeGen/RegAllocScore.cpp b/llvm/lib/CodeGen/RegAllocScore.cpp
index ce1eea3519b71..25a142a2a20d8 100644
--- a/llvm/lib/CodeGen/RegAllocScore.cpp
+++ b/llvm/lib/CodeGen/RegAllocScore.cpp
@@ -80,10 +80,7 @@ llvm::calculateRegAllocScore(const MachineFunction &MF,
},
[&](const MachineInstr &MI) {
auto *TTI = MF.getSubtarget().getInstrInfo();
- return TTI->isTriviallyReMaterializable(MI) &&
- llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) {
- return MO.getReg().isVirtual();
- });
+ return TTI->isTriviallyReMaterializable(MI);
});
}
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index db00f54daeb62..2d84d6b385956 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1325,7 +1325,7 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
if (!TII->isAsCheapAsAMove(*DefMI))
return false;
- if (!TII->isTriviallyReMaterializable(*DefMI))
+ if (!TII->isReMaterializable(*DefMI))
return false;
if (!definesFullReg(*DefMI, SrcReg))
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index c9e46182decc2..9e5c1bfdd745e 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -67,7 +67,7 @@ bool TargetRegisterInfo::shouldRegionSplitForVirtReg(
const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
const MachineRegisterInfo &MRI = MF.getRegInfo();
MachineInstr *MI = MRI.getUniqueVRegDef(VirtReg.reg());
- if (MI && TII->isTriviallyReMaterializable(*MI) &&
+ if (MI && TII->isReMaterializable(*MI) &&
VirtReg.size() > HugeSizeForSplit)
return false;
return true;
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
index 254b75b784e75..f52581132a999 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -1779,7 +1779,7 @@ bool PreRARematStage::canIncreaseOccupancyOrReduceSpill() {
for (auto MI = Region.first; MI != Region.second; ++MI) {
// The instruction must be trivially rematerializable.
MachineInstr &DefMI = *MI;
- if (!isTriviallyReMaterializable(DefMI))
+ if (!isReMaterializable(DefMI))
continue;
// We only support rematerializing virtual registers with one definition.
@@ -2002,8 +2002,8 @@ void PreRARematStage::rematerialize() {
}
// Copied from MachineLICM
-bool PreRARematStage::isTriviallyReMaterializable(const MachineInstr &MI) {
- if (!DAG.TII->isTriviallyReMaterializable(MI))
+bool PreRARematStage::isReMaterializable(const MachineInstr &MI) {
+ if (!DAG.TII->isReMaterializable(MI))
return false;
for (const MachineOperand &MO : MI.all_uses()) {
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
index 790370ff8ab4d..256a422e22f5f 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -485,7 +485,7 @@ class PreRARematStage : public GCNSchedStage {
/// Whether the MI is trivially rematerializable and does not have any virtual
/// register use.
- bool isTriviallyReMaterializable(const MachineInstr &MI);
+ bool isReMaterializable(const MachineInstr &MI);
/// Rematerializes all instructions in PreRARematStage::Rematerializations
/// and stores the achieved occupancy after remat in
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
index 7591541779884..08ca20b5eef6e 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
@@ -260,10 +260,7 @@ static void query(const MachineInstr &MI, bool &Read, bool &Write,
// Test whether Def is safe and profitable to rematerialize.
static bool shouldRematerialize(const MachineInstr &Def,
const WebAssemblyInstrInfo *TII) {
- return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def) &&
- llvm::all_of(Def.all_uses(), [](const MachineOperand &MO) {
- return MO.getReg().isVirtual();
- });
+ return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def);
}
// Identify the definition for this register at this point. This is a
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, went through the call sites and the choices between trivial and non-trivial rematerialization made sense.
Co-authored-by: Luke Lau <luke_lau@icloud.com>
In particular, the partial revert which effected llvm::calculateRegAllocScore and adapt the code in that spot to use non-trivial remat for now.
// The instruction must be rematerializable. | ||
MachineInstr &DefMI = *MI; | ||
if (!isTriviallyReMaterializable(DefMI)) | ||
if (!isReMaterializable(DefMI)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arsenm I want to draw your attention to this change. I'm suspecting the use of non-trivial remat might be a latent bug here from the comments, and code structure. When I tried to make this actually trivial, I do see test failures though.
@arsenm Can I get a second LGTM here? This is a conceptually big change despite being NFC, and I want to make sure we have agreement on direction before moving forward. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/18839 Here is the relevant piece of the build log for the reference
|
llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) { | ||
return MO.getReg().isVirtual(); | ||
}); | ||
bool isTrivRemat = TII.isTriviallyReMaterializable(MI); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little puzzled.
This patch changes the behavior of the code here.
So before, for isTrivRemat to be set, all registers in the MI had to be virtual.
With the patch we call
/// Return true if the instruction is trivially rematerializable, meaning it
/// has no side effects and requires no operands that aren't always available.
/// This means the only allowed uses are constants and unallocatable physical
/// registers so that the instructions result is independent of the place
/// in the function.
bool isTriviallyReMaterializable(const MachineInstr &MI) const {
if (!isReMaterializable(MI))
return false;
for (const MachineOperand &MO : MI.all_uses()) {
if (MO.getReg().isVirtual())
return false;
}
return true;
}
which will return false if any of the registers in the MI is virtual.
Switching
if (MO.getReg().isVirtual())
to
if (!MO.getReg().isVirtual())
makes it behave as before but then I guess it doesn't do what the method comment says.
I noticed this for a downstream testcase where the prologue_end placement changed with ca2e8fc and then again with this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are completely correct. Oops. This was a major bug in #160319. The check should have been "are all of the operands not a virtual register", but this is not what actually got written and landed.
I think this change accidentally fixed this (by doing what the intention had been all along), but boy does that make both patch descriptions misleading. Do you think this is worth a revert-and-reapply on both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so now it's doing what is intended. Good!
Ah, I don't know if you need to fiddle with revert/reapply, I just got confused about the ping/pong behavior of our testcase and that's clear now.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for flagging this. I'm going to put a note on each review for later reference as this could be quite confusing later when digging through history.
Warning to anyone looking at this later, the landed version contained of #160319 contained a nasty bug where I'd accidentally reversed the condition being checked for. This was (accidentally) undone in this review, so this ends up not being NFC as intended. The final state is in line with the original intention of the two patches combined, so not going to revert. It just makes the descriptions really dang confusing. Sorry! |
…60377) This change builds on llvm#160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where. --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>
#159211) In the register allocator we define non-trivial rematerialization as the rematerlization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overriden by the target in `TargetTransformInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However #160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and #160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%
…erialization (#159211) In the register allocator we define non-trivial rematerialization as the rematerlization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overriden by the target in `TargetTransformInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However llvm/llvm-project#160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and llvm/llvm-project#160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%
This change builds on #160319 which tries to clarify which callers (not backends) assume that the result is actually trivial.
This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic.
This change does not enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where.