[TII] Split isTrivialReMaterializable into two versions [nfc] #160377

preames · 2025-09-23T19:32:09Z

This change builds on #160319 which tries to clarify which callers (not backends) assume that the result is actually trivial.

This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic.

This change does not enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where.

This change builds on llvm#160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where.

llvmbot · 2025-09-23T19:32:40Z

@llvm/pr-subscribers-llvm-regalloc
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-amdgpu

Author: Philip Reames (preames)

Changes

This change builds on #160319 which tries to clarify which callers (not backends) assume that the result is actually trivial.

This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic.

This change does not enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where.

Full diff: https://github.com/llvm/llvm-project/pull/160377.diff

11 Files Affected:

(modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+15-3)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+1-5)
(modified) llvm/lib/CodeGen/CalcSpillWeights.cpp (+1-1)
(modified) llvm/lib/CodeGen/LiveRangeEdit.cpp (+2-2)
(modified) llvm/lib/CodeGen/MachineLICM.cpp (+4-23)
(modified) llvm/lib/CodeGen/RegAllocScore.cpp (+1-4)
(modified) llvm/lib/CodeGen/RegisterCoalescer.cpp (+1-1)
(modified) llvm/lib/CodeGen/TargetRegisterInfo.cpp (+1-1)
(modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp (+3-3)
(modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.h (+1-1)
(modified) llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp (+1-4)

diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index cefb0d35690ba..046b60256fb29 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -168,6 +168,19 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   /// registers so that the instructions result is independent of the place
   /// in the function.
   bool isTriviallyReMaterializable(const MachineInstr &MI) const {
+    if (!isReMaterializable(MI))
+      return false;
+    for (const MachineOperand &MO : MI.all_uses()) {
+      if (MO.getReg().isVirtual())
+        return false;
+    }
+    return true;
+  }
+
+  /// Return true if the instruction would be materializable at a point
+  /// in the containing function where all virtual register uses were
+  /// known to be live and available in registers.
+  bool isReMaterializable(const MachineInstr &MI) const {
     return (MI.getOpcode() == TargetOpcode::IMPLICIT_DEF &&
             MI.getNumOperands() == 1) ||
            (MI.getDesc().isRematerializable() &&
@@ -194,10 +207,9 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
 protected:
   /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
   /// set, this hook lets the target specify whether the instruction is actually
-  /// trivially rematerializable, taking into consideration its operands. This
+  /// rematerializable, taking into consideration its operands. This
   /// predicate must return false if the instruction has any side effects other
-  /// than producing a value, or if it requres any address registers that are
-  /// not always available.
+  /// than producing a value.
   virtual bool isReMaterializableImpl(const MachineInstr &MI) const;
 
   /// This method commutes the operands of the given machine instruction MI.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 1db742253d594..05d3d18aa9557 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2212,11 +2212,7 @@ findPrologueEndLoc(const MachineFunction *MF) {
       -> std::optional<std::pair<const MachineInstr *, bool>> {
     // Is this instruction trivial data shuffling or frame-setup?
     bool isCopy = (TII.isCopyInstr(MI) ? true : false);
-    bool isTrivRemat =
-        TII.isTriviallyReMaterializable(MI) &&
-        llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) {
-          return MO.getReg().isVirtual();
-        });
+    bool isTrivRemat = TII.isTriviallyReMaterializable(MI);
     bool isFrameSetup = MI.getFlag(MachineInstr::FrameSetup);
 
     if (!isFrameSetup && MI.getDebugLoc()) {
diff --git a/llvm/lib/CodeGen/CalcSpillWeights.cpp b/llvm/lib/CodeGen/CalcSpillWeights.cpp
index b16694eafd90e..f5a6eec57ee7e 100644
--- a/llvm/lib/CodeGen/CalcSpillWeights.cpp
+++ b/llvm/lib/CodeGen/CalcSpillWeights.cpp
@@ -122,7 +122,7 @@ bool VirtRegAuxInfo::isRematerializable(const LiveInterval &LI,
       assert(MI && "Dead valno in interval");
     }
 
-    if (!TII.isTriviallyReMaterializable(*MI))
+    if (!TII.isReMaterializable(*MI))
       return false;
   }
   return true;
diff --git a/llvm/lib/CodeGen/LiveRangeEdit.cpp b/llvm/lib/CodeGen/LiveRangeEdit.cpp
index 33e980a5993d3..2f65be51bb726 100644
--- a/llvm/lib/CodeGen/LiveRangeEdit.cpp
+++ b/llvm/lib/CodeGen/LiveRangeEdit.cpp
@@ -80,7 +80,7 @@ void LiveRangeEdit::scanRemattable() {
     MachineInstr *DefMI = LIS.getInstructionFromIndex(OrigVNI->def);
     if (!DefMI)
       continue;
-    if (TII.isTriviallyReMaterializable(*DefMI))
+    if (TII.isReMaterializable(*DefMI))
       Remattable.insert(OrigVNI);
   }
   ScannedRemattable = true;
@@ -387,7 +387,7 @@ void LiveRangeEdit::eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink) {
     // register uses. That may provoke RA to split an interval at the KILL
     // and later result in an invalid live segment end.
     if (isOrigDef && DeadRemats && !HasLiveVRegUses &&
-        TII.isTriviallyReMaterializable(*MI)) {
+        TII.isReMaterializable(*MI)) {
       LiveInterval &NewLI = createEmptyIntervalFrom(Dest, false);
       VNInfo::Allocator &Alloc = LIS.getVNInfoAllocator();
       VNInfo *VNI = NewLI.getNextValue(Idx, Alloc);
diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index 4f164e2d53460..7acddff753693 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -244,8 +244,6 @@ namespace {
 
     bool IsGuaranteedToExecute(MachineBasicBlock *BB, MachineLoop *CurLoop);
 
-    bool isTriviallyReMaterializable(const MachineInstr &MI) const;
-
     void EnterScope(MachineBasicBlock *MBB);
 
     void ExitScope(MachineBasicBlock *MBB);
@@ -771,23 +769,6 @@ bool MachineLICMImpl::IsGuaranteedToExecute(MachineBasicBlock *BB,
   return true;
 }
 
-/// Check if \p MI is trivially remateralizable and if it does not have any
-/// virtual register uses. Even though rematerializable RA might not actually
-/// rematerialize it in this scenario. In that case we do not want to hoist such
-/// instruction out of the loop in a belief RA will sink it back if needed.
-bool MachineLICMImpl::isTriviallyReMaterializable(
-    const MachineInstr &MI) const {
-  if (!TII->isTriviallyReMaterializable(MI))
-    return false;
-
-  for (const MachineOperand &MO : MI.all_uses()) {
-    if (MO.getReg().isVirtual())
-      return false;
-  }
-
-  return true;
-}
-
 void MachineLICMImpl::EnterScope(MachineBasicBlock *MBB) {
   LLVM_DEBUG(dbgs() << "Entering " << printMBBReference(*MBB) << '\n');
 
@@ -1300,9 +1281,9 @@ bool MachineLICMImpl::IsProfitableToHoist(MachineInstr &MI,
     return false;
   }
 
-  // Rematerializable instructions should always be hoisted providing the
-  // register allocator can just pull them down again when needed.
-  if (isTriviallyReMaterializable(MI))
+  // Trivially rematerializable instructions should always be hoisted
+  // providing the register allocator can just pull them down again when needed.
+  if (TII->isTriviallyReMaterializable(MI))
     return true;
 
   // FIXME: If there are long latency loop-invariant instructions inside the
@@ -1386,7 +1367,7 @@ bool MachineLICMImpl::IsProfitableToHoist(MachineInstr &MI,
 
   // High register pressure situation, only hoist if the instruction is going
   // to be remat'ed.
-  if (!isTriviallyReMaterializable(MI) &&
+  if (!TII->isTriviallyReMaterializable(MI) &&
       !MI.isDereferenceableInvariantLoad()) {
     LLVM_DEBUG(dbgs() << "Can't remat / high reg-pressure: " << MI);
     return false;
diff --git a/llvm/lib/CodeGen/RegAllocScore.cpp b/llvm/lib/CodeGen/RegAllocScore.cpp
index ce1eea3519b71..25a142a2a20d8 100644
--- a/llvm/lib/CodeGen/RegAllocScore.cpp
+++ b/llvm/lib/CodeGen/RegAllocScore.cpp
@@ -80,10 +80,7 @@ llvm::calculateRegAllocScore(const MachineFunction &MF,
       },
       [&](const MachineInstr &MI) {
         auto *TTI = MF.getSubtarget().getInstrInfo();
-        return TTI->isTriviallyReMaterializable(MI) &&
-               llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) {
-                 return MO.getReg().isVirtual();
-               });
+        return TTI->isTriviallyReMaterializable(MI);
       });
 }
 
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index db00f54daeb62..2d84d6b385956 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1325,7 +1325,7 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
   if (!TII->isAsCheapAsAMove(*DefMI))
     return false;
 
-  if (!TII->isTriviallyReMaterializable(*DefMI))
+  if (!TII->isReMaterializable(*DefMI))
     return false;
 
   if (!definesFullReg(*DefMI, SrcReg))
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index c9e46182decc2..9e5c1bfdd745e 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -67,7 +67,7 @@ bool TargetRegisterInfo::shouldRegionSplitForVirtReg(
   const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
   const MachineRegisterInfo &MRI = MF.getRegInfo();
   MachineInstr *MI = MRI.getUniqueVRegDef(VirtReg.reg());
-  if (MI && TII->isTriviallyReMaterializable(*MI) &&
+  if (MI && TII->isReMaterializable(*MI) &&
       VirtReg.size() > HugeSizeForSplit)
     return false;
   return true;
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
index 254b75b784e75..f52581132a999 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -1779,7 +1779,7 @@ bool PreRARematStage::canIncreaseOccupancyOrReduceSpill() {
     for (auto MI = Region.first; MI != Region.second; ++MI) {
       // The instruction must be trivially rematerializable.
       MachineInstr &DefMI = *MI;
-      if (!isTriviallyReMaterializable(DefMI))
+      if (!isReMaterializable(DefMI))
         continue;
 
       // We only support rematerializing virtual registers with one definition.
@@ -2002,8 +2002,8 @@ void PreRARematStage::rematerialize() {
 }
 
 // Copied from MachineLICM
-bool PreRARematStage::isTriviallyReMaterializable(const MachineInstr &MI) {
-  if (!DAG.TII->isTriviallyReMaterializable(MI))
+bool PreRARematStage::isReMaterializable(const MachineInstr &MI) {
+  if (!DAG.TII->isReMaterializable(MI))
     return false;
 
   for (const MachineOperand &MO : MI.all_uses()) {
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
index 790370ff8ab4d..256a422e22f5f 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -485,7 +485,7 @@ class PreRARematStage : public GCNSchedStage {
 
   /// Whether the MI is trivially rematerializable and does not have any virtual
   /// register use.
-  bool isTriviallyReMaterializable(const MachineInstr &MI);
+  bool isReMaterializable(const MachineInstr &MI);
 
   /// Rematerializes all instructions in PreRARematStage::Rematerializations
   /// and stores the achieved occupancy after remat in
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
index 7591541779884..08ca20b5eef6e 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
@@ -260,10 +260,7 @@ static void query(const MachineInstr &MI, bool &Read, bool &Write,
 // Test whether Def is safe and profitable to rematerialize.
 static bool shouldRematerialize(const MachineInstr &Def,
                                 const WebAssemblyInstrInfo *TII) {
-  return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def) &&
-         llvm::all_of(Def.all_uses(), [](const MachineOperand &MO) {
-           return MO.getReg().isVirtual();
-         });
+  return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def);
 }
 
 // Identify the definition for this register at this point. This is a

github-actions · 2025-09-23T19:35:51Z

✅ With the latest revision this PR passed the C/C++ code formatter.

lukel97

LGTM, went through the call sites and the choices between trivial and non-trivial rematerialization made sense.

llvm/lib/CodeGen/RegisterCoalescer.cpp

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

Co-authored-by: Luke Lau <luke_lau@icloud.com>

In particular, the partial revert which effected llvm::calculateRegAllocScore and adapt the code in that spot to use non-trivial remat for now.

preames · 2025-09-24T14:39:13Z

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

+      // The instruction must be rematerializable.
      MachineInstr &DefMI = *MI;
-      if (!isTriviallyReMaterializable(DefMI))
+      if (!isReMaterializable(DefMI))


@arsenm I want to draw your attention to this change. I'm suspecting the use of non-trivial remat might be a latent bug here from the comments, and code structure. When I tried to make this actually trivial, I do see test failures though.

preames · 2025-09-24T18:07:55Z

@arsenm Can I get a second LGTM here? This is a conceptually big change despite being NFC, and I want to make sure we have agreement on direction before moving forward.

llvm-ci · 2025-09-25T01:59:47Z

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/18839

Here is the relevant piece of the build log for the reference

Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x564b750324c0 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x564b750324c0 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

mikaelholmen · 2025-09-25T06:39:31Z

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp

-        llvm::all_of(MI.all_uses(), [](const MachineOperand &MO) {
-          return MO.getReg().isVirtual();
-        });
+    bool isTrivRemat = TII.isTriviallyReMaterializable(MI);


I'm a little puzzled.
This patch changes the behavior of the code here.

So before, for isTrivRemat to be set, all registers in the MI had to be virtual.
With the patch we call

/// Return true if the instruction is trivially rematerializable, meaning it /// has no side effects and requires no operands that aren't always available. /// This means the only allowed uses are constants and unallocatable physical /// registers so that the instructions result is independent of the place /// in the function. bool isTriviallyReMaterializable(const MachineInstr &MI) const { if (!isReMaterializable(MI)) return false; for (const MachineOperand &MO : MI.all_uses()) { if (MO.getReg().isVirtual()) return false; } return true; }

which will return false if any of the registers in the MI is virtual.

Switching
if (MO.getReg().isVirtual())
to
if (!MO.getReg().isVirtual())
makes it behave as before but then I guess it doesn't do what the method comment says.

I noticed this for a downstream testcase where the prologue_end placement changed with ca2e8fc and then again with this patch.

Ah, you are completely correct. Oops. This was a major bug in #160319. The check should have been "are all of the operands not a virtual register", but this is not what actually got written and landed.

I think this change accidentally fixed this (by doing what the intention had been all along), but boy does that make both patch descriptions misleading. Do you think this is worth a revert-and-reapply on both?

Ok, so now it's doing what is intended. Good!

Ah, I don't know if you need to fiddle with revert/reapply, I just got confused about the ping/pong behavior of our testcase and that's clear now.
Thanks.

Thank you for flagging this. I'm going to put a note on each review for later reference as this could be quite confusing later when digging through history.

preames · 2025-09-25T14:58:09Z

Warning to anyone looking at this later, the landed version contained of #160319 contained a nasty bug where I'd accidentally reversed the condition being checked for. This was (accidentally) undone in this review, so this ends up not being NFC as intended. The final state is in line with the original intention of the two patches combined, so not going to revert. It just makes the descriptions really dang confusing. Sorry!

…60377) This change builds on llvm#160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where. --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>

#159211) In the register allocator we define non-trivial rematerialization as the rematerlization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overriden by the target in `TargetTransformInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However #160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and #160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%

…erialization (#159211) In the register allocator we define non-trivial rematerialization as the rematerlization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overriden by the target in `TargetTransformInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However llvm/llvm-project#160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and llvm/llvm-project#160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%

preames requested review from arsenm, jmorse, lukel97, davemgreen, topperc, guy-david, efriedma-quic and pcwang-thead September 23, 2025 19:32

llvmbot added backend:AMDGPU backend:WebAssembly llvm:codegen llvm:regalloc debuginfo labels Sep 23, 2025

preames mentioned this pull request Sep 23, 2025

[DRAFT][RegisterCoalescer] Enable non-trivial rematerialization #160153

Closed

preames mentioned this pull request Sep 24, 2025

Update callers of isTriviallyReMaterializable to check trivialness #160319

Merged

lukel97 approved these changes Sep 24, 2025

View reviewed changes

llvm/lib/CodeGen/RegisterCoalescer.cpp Show resolved Hide resolved

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp Outdated Show resolved Hide resolved

preames and others added 3 commits September 24, 2025 07:05

Update llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

391c5a7

Co-authored-by: Luke Lau <luke_lau@icloud.com>

Update comments to match code behavior

53c5426

Merge branch 'main' into pr-tti-remat-hook-split-nfc

7d6b8a4

In particular, the partial revert which effected llvm::calculateRegAllocScore and adapt the code in that spot to use non-trivial remat for now.

preames commented Sep 24, 2025

View reviewed changes

Clang-format

fce5aaf

arsenm requested review from jrbyrnes and lucas-rami September 25, 2025 00:35

arsenm approved these changes Sep 25, 2025

View reviewed changes

preames merged commit ea721e2 into llvm:main Sep 25, 2025
9 checks passed

preames deleted the pr-tti-remat-hook-split-nfc branch September 25, 2025 01:52

mikaelholmen reviewed Sep 25, 2025

View reviewed changes

preames mentioned this pull request Sep 26, 2025

[RegAlloc] Remove default restriction on non-trivial rematerialization #159211

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[TII] Split isTrivialReMaterializable into two versions [nfc] #160377

[TII] Split isTrivialReMaterializable into two versions [nfc] #160377

preames commented Sep 23, 2025

Uh oh!

llvmbot commented Sep 23, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Sep 23, 2025 •

edited

Loading

Uh oh!

lukel97 left a comment

Uh oh!

Uh oh!

Uh oh!

preames Sep 24, 2025

Uh oh!

preames commented Sep 24, 2025

Uh oh!

Uh oh!

llvm-ci commented Sep 25, 2025

Uh oh!

mikaelholmen Sep 25, 2025

Uh oh!

preames Sep 25, 2025

Uh oh!

mikaelholmen Sep 25, 2025

Uh oh!

preames Sep 25, 2025

Uh oh!

preames commented Sep 25, 2025

Uh oh!

Uh oh!

[TII] Split isTrivialReMaterializable into two versions [nfc] #160377

[TII] Split isTrivialReMaterializable into two versions [nfc] #160377

Conversation

preames commented Sep 23, 2025

Uh oh!

llvmbot commented Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lukel97 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

preames Sep 24, 2025

Choose a reason for hiding this comment

Uh oh!

preames commented Sep 24, 2025

Uh oh!

Uh oh!

llvm-ci commented Sep 25, 2025

Uh oh!

mikaelholmen Sep 25, 2025

Choose a reason for hiding this comment

Uh oh!

preames Sep 25, 2025

Choose a reason for hiding this comment

Uh oh!

mikaelholmen Sep 25, 2025

Choose a reason for hiding this comment

Uh oh!

preames Sep 25, 2025

Choose a reason for hiding this comment

Uh oh!

preames commented Sep 25, 2025

Uh oh!

Uh oh!

llvmbot commented Sep 23, 2025 •

edited

Loading

github-actions bot commented Sep 23, 2025 •

edited

Loading