
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes #86149

Open · wants to merge 11 commits into base: main
Conversation

SLTozer
Contributor

@SLTozer SLTozer commented Mar 21, 2024

This patch is part of a set of patches that add an -fextend-lifetimes flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable -fextend-lifetimes, and an -fextend-this-ptr flag which functions as -fextend-lifetimes for the this pointer only. All changes and tests in these patches were written by @wolfy1961, though I will be managing these reviews and addressing any comments raised. The extend-lifetimes flag is intended eventually to be enabled by -Og, as discussed in the RFC here.


This patch implements a new intrinsic instruction in LLVM, llvm.fake.use in IR and FAKE_USE in MIR, that takes a single operand and has no effect other than "using" its operand, ensuring that the operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, so that for each variable (or this) a call to fake.use is inserted at the end of that variable's scope, taking the variable's value as its operand. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA).
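As an illustrative sketch (hypothetical IR, not taken from the patch), the frontend-emitted form might look like this:

```llvm
define i32 @foo(i32 %param) {
entry:
  %x = add i32 %param, 1
  %result = mul i32 %x, 2
  ; Inserted at the end of %x's scope: the call has no semantic effect,
  ; but it counts as a use, so %x is kept live until this point.
  call void (...) @llvm.fake.use(i32 %x)
  ret i32 %result
}

; Declared variadic so a single operand of any type can be passed.
declare void @llvm.fake.use(...)
```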

@llvmbot
Collaborator

llvmbot commented Mar 21, 2024

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-ir

Author: Stephen Tozer (SLTozer)

Changes



Patch is 75.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86149.diff

46 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+36)
  • (modified) llvm/include/llvm/Analysis/PtrUseVisitor.h (+1)
  • (modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+5)
  • (modified) llvm/include/llvm/CodeGen/MachineInstr.h (+2)
  • (modified) llvm/include/llvm/CodeGen/SelectionDAGISel.h (+1)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+3)
  • (modified) llvm/include/llvm/Support/TargetOpcodes.def (+3)
  • (modified) llvm/include/llvm/Target/Target.td (+8)
  • (modified) llvm/lib/CodeGen/Analysis.cpp (+2-1)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+43)
  • (modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+40-2)
  • (modified) llvm/lib/CodeGen/DeadMachineInstructionElim.cpp (+2-1)
  • (modified) llvm/lib/CodeGen/MachineCSE.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineScheduler.cpp (+2-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/FastISel.cpp (+3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+20)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (+19)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h (+7)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp (+11)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+36-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+18)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+64)
  • (modified) llvm/lib/IR/Instruction.cpp (+4-1)
  • (modified) llvm/lib/IR/Verifier.cpp (+1)
  • (modified) llvm/lib/Target/X86/X86FloatingPoint.cpp (+32)
  • (modified) llvm/lib/Transforms/Scalar/SROA.cpp (+30)
  • (modified) llvm/lib/Transforms/Utils/CloneFunction.cpp (+6)
  • (modified) llvm/lib/Transforms/Utils/Local.cpp (+3)
  • (modified) llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp (+2-1)
  • (added) llvm/test/CodeGen/MIR/X86/fake-use-phi.mir (+95)
  • (added) llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir (+115)
  • (added) llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir (+106)
  • (added) llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll (+40)
  • (added) llvm/test/CodeGen/X86/fake-use-hpfloat.ll (+17)
  • (added) llvm/test/CodeGen/X86/fake-use-ld.ll (+51)
  • (added) llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll (+33)
  • (added) llvm/test/CodeGen/X86/fake-use-split-ret.ll (+53)
  • (added) llvm/test/CodeGen/X86/fake-use-sroa.ll (+54)
  • (added) llvm/test/CodeGen/X86/fake-use-suppress-load.ll (+23)
  • (added) llvm/test/CodeGen/X86/fake-use-tailcall.ll (+23)
  • (added) llvm/test/CodeGen/X86/fake-use-vector.ll (+45)
  • (added) llvm/test/CodeGen/X86/fake-use-vector2.ll (+33)
  • (added) llvm/test/DebugInfo/X86/Inputs/check-fake-use.py (+100)
  • (added) llvm/test/DebugInfo/X86/fake-use.ll (+98)
  • (added) llvm/test/Transforms/GVN/fake-use-constprop.ll (+69)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 8bc1cab01bf0a6..e7b8af3a2116c7 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -27896,6 +27896,42 @@ execution, but is unknown at compile time.
 If the result value does not fit in the result type, then the result is
 a :ref:`poison value <poisonvalues>`.
 
+.. _llvm_fake_use:
+
+'``llvm.fake.use``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void @llvm.fake.use(...)
+
+Overview:
+"""""""""
+
+The ``llvm.fake.use`` intrinsic is a no-op. It takes a single
+value as an operand and is treated as a use of that operand, to force the
+optimizer to preserve that value prior to the fake use. This is used for
+extending the lifetimes of variables, where this intrinsic placed at the end of
+a variable's scope helps prevent that variable from being optimized out.
+
+Arguments:
+""""""""""
+
+The ``llvm.fake.use`` intrinsic takes one argument, which may be any
+function-local SSA value. Note that the signature is variadic so that the
+intrinsic can take any type of argument, but passing more than one argument will
+result in an error.
+
+Semantics:
+""""""""""
+
+This intrinsic does nothing, but optimizers must consider it a use of its single
+operand and should try to preserve the intrinsic and its position in the
+function.
+
 
 Stack Map Intrinsics
 --------------------
diff --git a/llvm/include/llvm/Analysis/PtrUseVisitor.h b/llvm/include/llvm/Analysis/PtrUseVisitor.h
index 86206b2d5e9f88..4e46d441fce17e 100644
--- a/llvm/include/llvm/Analysis/PtrUseVisitor.h
+++ b/llvm/include/llvm/Analysis/PtrUseVisitor.h
@@ -279,6 +279,7 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
     default:
       return Base::visitIntrinsicInst(II);
 
+    case Intrinsic::fake_use:
     case Intrinsic::lifetime_start:
     case Intrinsic::lifetime_end:
       return; // No-op intrinsics.
diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 49d51a27e3c0f6..43aa02b5b03762 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -1304,6 +1304,11 @@ enum NodeType {
   LIFETIME_START,
   LIFETIME_END,
 
+  /// FAKE_USE represents a use of the operand but does not do anything.
+  /// Its purpose is the extension of the operand's lifetime mainly for
+  /// debugging purposes.
+  FAKE_USE,
+
   /// GC_TRANSITION_START/GC_TRANSITION_END - These operators mark the
   /// beginning and end of GC transition  sequence, and carry arbitrary
   /// information that target might need for lowering.  The first operand is
diff --git a/llvm/include/llvm/CodeGen/MachineInstr.h b/llvm/include/llvm/CodeGen/MachineInstr.h
index fcdd73d8b65fdd..163bcf6aa5a1a2 100644
--- a/llvm/include/llvm/CodeGen/MachineInstr.h
+++ b/llvm/include/llvm/CodeGen/MachineInstr.h
@@ -1401,6 +1401,8 @@ class MachineInstr
     return getOpcode() == TargetOpcode::EXTRACT_SUBREG;
   }
 
+  bool isFakeUse() const { return getOpcode() == TargetOpcode::FAKE_USE; }
+
   /// Return true if the instruction behaves like a copy.
   /// This does not include native copy instructions.
   bool isCopyLike() const {
diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
index 837f8bf7263ea9..cb8dec554a57b6 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
@@ -453,6 +453,7 @@ class SelectionDAGISel : public MachineFunctionPass {
   void Select_READ_REGISTER(SDNode *Op);
   void Select_WRITE_REGISTER(SDNode *Op);
   void Select_UNDEF(SDNode *N);
+  void Select_FAKE_USE(SDNode *N);
   void CannotYetSelect(SDNode *N);
 
   void Select_FREEZE(SDNode *N);
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 091f9b38107989..23a9f586056267 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1778,6 +1778,9 @@ def int_is_constant : DefaultAttrsIntrinsic<[llvm_i1_ty], [llvm_any_ty],
                                 [IntrNoMem, IntrWillReturn, IntrConvergent],
                                 "llvm.is.constant">;
 
+// Introduce a use of the argument without generating any code.
+def int_fake_use : Intrinsic<[], [llvm_vararg_ty]>;
+
 // Intrinsic to mask out bits of a pointer.
 // First argument must be pointer or vector of pointer. This is checked by the
 // verifier.
diff --git a/llvm/include/llvm/Support/TargetOpcodes.def b/llvm/include/llvm/Support/TargetOpcodes.def
index 899eaad5842ae0..9797bbe02a9b5c 100644
--- a/llvm/include/llvm/Support/TargetOpcodes.def
+++ b/llvm/include/llvm/Support/TargetOpcodes.def
@@ -217,6 +217,9 @@ HANDLE_TARGET_OPCODE(PATCHABLE_TYPED_EVENT_CALL)
 
 HANDLE_TARGET_OPCODE(ICALL_BRANCH_FUNNEL)
 
+/// Represents a use of the operand but generates no code.
+HANDLE_TARGET_OPCODE(FAKE_USE)
+
 // This is a fence with the singlethread scope. It represents a compiler memory
 // barrier, but does not correspond to any generated instruction.
 HANDLE_TARGET_OPCODE(MEMBARRIER)
diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index cb1c0ed2513d45..2f5d9a5b8ea877 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1401,6 +1401,14 @@ def FAULTING_OP : StandardPseudoInstruction {
   let isTerminator = true;
   let isBranch = true;
 }
+def FAKE_USE : StandardPseudoInstruction {
+  // An instruction that uses its operands but does nothing.
+  let OutOperandList = (outs);
+  let InOperandList = (ins variable_ops);
+  let AsmString = "FAKE_USE";
+  let hasSideEffects = 0;
+  let isMeta = true;
+}
 def PATCHABLE_OP : StandardPseudoInstruction {
   let OutOperandList = (outs);
   let InOperandList = (ins variable_ops);
diff --git a/llvm/lib/CodeGen/Analysis.cpp b/llvm/lib/CodeGen/Analysis.cpp
index af7643d93591f7..05840c963ea526 100644
--- a/llvm/lib/CodeGen/Analysis.cpp
+++ b/llvm/lib/CodeGen/Analysis.cpp
@@ -566,7 +566,8 @@ bool llvm::isInTailCallPosition(const CallBase &Call, const TargetMachine &TM) {
     if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(BBI))
       if (II->getIntrinsicID() == Intrinsic::lifetime_end ||
           II->getIntrinsicID() == Intrinsic::assume ||
-          II->getIntrinsicID() == Intrinsic::experimental_noalias_scope_decl)
+          II->getIntrinsicID() == Intrinsic::experimental_noalias_scope_decl ||
+          II->getIntrinsicID() == Intrinsic::fake_use)
         continue;
     if (BBI->mayHaveSideEffects() || BBI->mayReadFromMemory() ||
         !isSafeToSpeculativelyExecute(&*BBI))
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index a15538755d73b3..1e680630e6cb58 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1099,11 +1099,46 @@ void AsmPrinter::emitFunctionEntryLabel() {
   }
 }
 
+// Recognize cases where a spilled register is reloaded solely to feed into a
+// FAKE_USE.
+static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI) {
+  const MachineFunction *MF = MI.getMF();
+  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
+
+  // If the restore size is std::nullopt then we are not dealing with a reload
+  // of a spilled register.
+  if (!MI.getRestoreSize(TII))
+    return false;
+
+  // Check if this register is the operand of a FAKE_USE and
+  // does it have the kill flag set there.
+  auto NextI = std::next(MI.getIterator());
+  if (NextI == MI.getParent()->end() || !NextI->isFakeUse())
+    return false;
+
+  unsigned Reg = MI.getOperand(0).getReg();
+  for (const MachineOperand &MO : NextI->operands()) {
+    // Return true if we came across the register from the
+    // previous spill instruction that is killed in NextI.
+    if (MO.isReg() && MO.isUse() && MO.isKill() && MO.getReg() == Reg)
+      return true;
+  }
+
+  return false;
+}
+
 /// emitComments - Pretty-print comments for instructions.
 static void emitComments(const MachineInstr &MI, raw_ostream &CommentOS) {
   const MachineFunction *MF = MI.getMF();
   const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
 
+  // If this is a reload of a spilled register that only feeds into a FAKE_USE
+  // instruction, meaning the load value has no effect on the program and has
+  // only been kept alive for debugging; since it is still available on the
+  // stack, we can skip the load itself.
+  if (isLoadFeedingIntoFakeUse(MI))
+    return;
+
   // Check for spills and reloads
 
   // We assume a single instruction only has a spill or reload, not
@@ -1828,6 +1863,8 @@ void AsmPrinter::emitFunctionBody() {
       case TargetOpcode::KILL:
         if (isVerbose()) emitKill(&MI, *this);
         break;
+      case TargetOpcode::FAKE_USE:
+        break;
       case TargetOpcode::PSEUDO_PROBE:
         emitPseudoProbe(MI);
         break;
@@ -1843,6 +1880,12 @@ void AsmPrinter::emitFunctionBody() {
         // purely meta information.
         break;
       default:
+        // If this is a reload of a spilled register that only feeds into a
+        // FAKE_USE instruction, meaning the load value has no effect on the
+        // program and has only been kept alive for debugging; since it is
+        // still available on the stack, we can skip the load itself.
+        if (isLoadFeedingIntoFakeUse(MI))
+          break;
         emitInstruction(&MI);
         if (CanDoExtraAnalysis) {
           MCInst MCI;
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 9f99bb7e693f7e..a2314779c8e687 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -2663,12 +2663,34 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
     return false;
   };
 
+  SmallVector<const IntrinsicInst *, 4> FakeUses;
+
+  auto isFakeUse = [&FakeUses](const Instruction *Inst) {
+    if (auto *II = dyn_cast<IntrinsicInst>(Inst);
+        II && II->getIntrinsicID() == Intrinsic::fake_use) {
+      // Record the instruction so it can be preserved when the exit block is
+      // removed. Do not preserve the fake use that uses the result of the
+      // PHI instruction.
+      // Do not copy fake uses that use the result of a PHI node.
+      // FIXME: If we do want to copy the fake use into the return blocks, we
+      // have to figure out which of the PHI node operands to use for each
+      // copy.
+      if (!isa<PHINode>(II->getOperand(0))) {
+        FakeUses.push_back(II);
+      }
+      return true;
+    }
+
+    return false;
+  };
+
   // Make sure there are no instructions between the first instruction
   // and return.
   const Instruction *BI = BB->getFirstNonPHI();
   // Skip over debug and the bitcast.
   while (isa<DbgInfoIntrinsic>(BI) || BI == BCI || BI == EVI ||
-         isa<PseudoProbeInst>(BI) || isLifetimeEndOrBitCastFor(BI))
+         isa<PseudoProbeInst>(BI) || isLifetimeEndOrBitCastFor(BI) ||
+         isFakeUse(BI))
     BI = BI->getNextNode();
   if (BI != RetI)
     return false;
@@ -2677,6 +2699,7 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
   /// call.
   const Function *F = BB->getParent();
   SmallVector<BasicBlock *, 4> TailCallBBs;
+  SmallVector<CallInst *, 4> CallInsts;
   if (PN) {
     for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
       // Look through bitcasts.
@@ -2710,6 +2733,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
             attributesPermitTailCall(F, CI, RetI, *TLI))
           TailCallBBs.push_back(PredBB);
       }
+      // Record the call instruction so we can insert any fake uses
+      // that need to be preserved before it.
+      CallInsts.push_back(CI);
     }
   } else {
     SmallPtrSet<BasicBlock *, 4> VisitedBBs;
@@ -2726,6 +2752,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
               (isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
                V == CI->getArgOperand(0))) {
             TailCallBBs.push_back(Pred);
+            // Record the call instruction so we can insert any fake uses
+            // that need to be preserved before it.
+            CallInsts.push_back(CI);
           }
         }
       }
@@ -2752,8 +2781,17 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
   }
 
   // If we eliminated all predecessors of the block, delete the block now.
-  if (Changed && !BB->hasAddressTaken() && pred_empty(BB))
+  if (Changed && !BB->hasAddressTaken() && pred_empty(BB)) {
+    // Copy the fake uses found in the original return block to all blocks
+    // that contain tail calls.
+    for (auto *CI : CallInsts) {
+      for (auto const *FakeUse : FakeUses) {
+        auto *ClonedInst = FakeUse->clone();
+        ClonedInst->insertBefore(CI);
+      }
+    }
     BB->eraseFromParent();
+  }
 
   return Changed;
 }
diff --git a/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp b/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
index facc01452d2f12..468949fd4cf498 100644
--- a/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
+++ b/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
@@ -87,7 +87,8 @@ bool DeadMachineInstructionElimImpl::isDead(const MachineInstr *MI) const {
     return false;
 
   // Don't delete frame allocation labels.
-  if (MI->getOpcode() == TargetOpcode::LOCAL_ESCAPE)
+  if (MI->getOpcode() == TargetOpcode::LOCAL_ESCAPE ||
+      MI->getOpcode() == TargetOpcode::FAKE_USE)
     return false;
 
   // Don't delete instructions with side effects.
diff --git a/llvm/lib/CodeGen/MachineCSE.cpp b/llvm/lib/CodeGen/MachineCSE.cpp
index 26a8d00e662651..c3c7f48677d5c2 100644
--- a/llvm/lib/CodeGen/MachineCSE.cpp
+++ b/llvm/lib/CodeGen/MachineCSE.cpp
@@ -406,7 +406,7 @@ bool MachineCSE::PhysRegDefsReach(MachineInstr *CSMI, MachineInstr *MI,
 
 bool MachineCSE::isCSECandidate(MachineInstr *MI) {
   if (MI->isPosition() || MI->isPHI() || MI->isImplicitDef() || MI->isKill() ||
-      MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo())
+      MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() || MI->isFakeUse())
     return false;
 
   // Ignore copies.
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 0d5bf329938781..1cc611107e5984 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -524,7 +524,8 @@ static bool isSchedBoundary(MachineBasicBlock::iterator MI,
                             MachineBasicBlock *MBB,
                             MachineFunction *MF,
                             const TargetInstrInfo *TII) {
-  return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF);
+  return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF) ||
+         MI->isFakeUse();
 }
 
 /// A region of an MBB for scheduling.
diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
index 27b8472ddb73d8..5451abcdfcf9d2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
@@ -1461,6 +1461,9 @@ bool FastISel::selectIntrinsicCall(const IntrinsicInst *II) {
     updateValueMap(II, ResultReg);
     return true;
   }
+  case Intrinsic::fake_use:
+    // At -O0, we don't need fake use, so just ignore it.
+    return true;
   case Intrinsic::experimental_stackmap:
     return selectStackmap(II);
   case Intrinsic::experimental_patchpoint_void:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 3332c02ec72358..ac1242ff59f164 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -2229,6 +2229,9 @@ bool DAGTypeLegalizer::PromoteFloatOperand(SDNode *N, unsigned OpNo) {
       report_fatal_error("Do not know how to promote this operator's operand!");
 
     case ISD::BITCAST:    R = PromoteFloatOp_BITCAST(N, OpNo); break;
+    case ISD::FAKE_USE:
+      R = PromoteFloatOp_FAKE_USE(N, OpNo);
+      break;
     case ISD::FCOPYSIGN:  R = PromoteFloatOp_FCOPYSIGN(N, OpNo); break;
     case ISD::FP_TO_SINT:
     case ISD::FP_TO_UINT:
@@ -2268,6 +2271,13 @@ SDValue DAGTypeLegalizer::PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo) {
   return DAG.getBitcast(N->getValueType(0), Convert);
 }
 
+SDValue DAGTypeLegalizer::PromoteFloatOp_FAKE_USE(SDNode *N, unsigned OpNo) {
+  assert(OpNo == 1 && "Only Operand 1 must need promotion here");
+  SDValue Op = GetPromotedFloat(N->getOperand(OpNo));
+  return DAG.getNode(N->getOpcode(), SDLoc(N), MVT::Other, N->getOperand(0),
+                     Op);
+}
+
 // Promote Operand 1 of FCOPYSIGN.  Operand 0 ought to be handled by
 // PromoteFloatRes_FCOPYSIGN.
 SDValue DAGTypeLegalizer::PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo) {
@@ -3138,6 +3148,9 @@ bool DAGTypeLegalizer::SoftPromoteHalfOperand(SDNode *N, unsigned OpNo) {
                        "operand!");
 
   case ISD::BITCAST:    Res = SoftPromoteHalfOp_BITCAST(N); break;
+  case ISD::FAKE_USE:
+    Res = SoftPromoteHalfOp_FAKE_USE(N, OpNo);
+    break;
   case ISD::FCOPYSIGN:  Res = SoftPromoteHalfOp_FCOPYSIGN(N, OpNo); break;
   case ISD::FP_TO_SINT:
   case ISD::FP_TO_UINT: Res = SoftPromoteHalfOp_FP_TO_XINT(N); break;
@@ -3175,6 +3188,13 @@ SDValue DAGTypeLegalizer::SoftPromoteHalfOp_BITCAST(SDNode *N) {
   return DAG.getNode(ISD::BITCAST, SDLoc(N), N->getValueType(0), Op0);
 }
 
+SDValue DAGTypeLegalizer::SoftPromoteHalfOp_FAKE_USE(SDNode *N, unsigned OpNo) {
+  assert(OpNo == 1 && "Only Operand 1 must need promotion here");
+  SDValue Op = GetSoftPromotedHalf(N->getOperand(OpNo));
+  return DAG.getNode(N->getOpcode(), SDLoc(N), MVT::Other, N->getOperand(0),
+                     Op);
+}
+
 SDValue DAGTypeLegalizer::SoftPromoteHalfOp_FCOPYSIGN(SDNode *N,
                                                       unsigned OpNo) {
   assert(OpNo == 1 && "Only Operand 1 must need promotion here");
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index 93ce9c22af5525..022cb9a4a82cdc 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -1818,6 +1818,9 @@ bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
   case ISD::BUILD_VECTOR: Res = PromoteIntOp_BUILD_VECTOR(N); break;
   case ISD::CONCAT_VECTORS: Res = PromoteIntOp_CONCAT_VECTORS(N); break;
   case ISD::EXTRACT_VECTOR_ELT: Res = PromoteIntOp_EXTRACT_VECTOR_ELT(N); break;
+  case ISD::FAKE_USE:
+    Res = PromoteIntOp_FAKE_USE(N);
+    break;
   case ISD::INSERT_VECTOR_ELT:
     Res = PromoteIntOp_INSERT_VECTOR_ELT(N, OpNo);
     break;
@@ -5124,6 +5127,9 @@ bool DAGTypeLegalizer::ExpandIntegerOperand(SDNode *N, unsigned OpNo) {
   case ISD::BR_CC:             Res = ExpandIntOp_BR_CC(N); break;
   case ISD::BUILD_VECTOR:      Res = ExpandOp_BUILD_VECTOR(N); break;
   case ISD::EXTRACT_ELEMENT:   Res = ExpandOp_EXTRACT_ELEMENT(N); break;
+  case ISD::FAKE_USE:
+    Res = ExpandOp_FAKE_USE(N);
+    break;
   case ISD::INSERT_VECTOR_ELT: Res = ExpandOp_INSERT_VECTOR_ELT(N); break;
   case ISD::SCALAR_TO_VECTOR:  Res = ExpandOp_SCALAR_TO_VECTOR(N); break;
   case ISD::SPLAT_VECTOR:      Res = ExpandIntOp_SPLAT_VECTOR(N); break;
@@ -5931,6 +5937,19 @@ SDValue DAGTypeLegalizer::PromoteIntOp_INSERT_SUBVECTOR(SDNode *N) {
   return DAG.getAnyExtOrTrunc(Ext, dl, N->getValueType(0));
 }
 
+// FIXME: We wouldn't need this if clang could promote short integers
+// that are arguments to FAKE_USE.
+SDValue DAGTypeLegalizer::PromoteIntOp_FAKE_USE(SDNode *N) {
+  SDLoc dl(N);
+  SDValue V0 = N->getOperand(0);
+  SDValue V1 = N->getOperand(1);
+  EVT InVT1 = V1.getValueType();
+  SDValue VPromoted =
+      DAG.getNode(ISD::ANY_EXTEND, dl,
+                  TLI.getTypeToTransformTo(*DAG.getContext(), InVT1), V1);
+  return DAG.getNode(N->getOpcode(), dl, N->getValueType(0), V0, VPromoted);
+}
+
 SDValue DAGTypeLegalizer::PromoteIntOp_EXTRACT_SUBVECTOR(SDNode *N) {
   SDLoc dl(N);
   SDValue V0 = GetPromotedInteger(N->getOperand(0));
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index e08acd36b41d4e..6a3f76ae4b6ded 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeG...
[truncated]


github-actions bot commented Mar 21, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff f5c90f3000bc75a344bf01bd4e0401e3fb7f9453 c8b6aa4c5733eab712314eff67043ca3c1ee9431 -- llvm/include/llvm/Analysis/PtrUseVisitor.h llvm/include/llvm/CodeGen/ISDOpcodes.h llvm/include/llvm/CodeGen/MachineInstr.h llvm/include/llvm/CodeGen/SelectionDAGISel.h llvm/lib/CodeGen/Analysis.cpp llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/CodeGenPrepare.cpp llvm/lib/CodeGen/DeadMachineInstructionElim.cpp llvm/lib/CodeGen/MachineCSE.cpp llvm/lib/CodeGen/MachineScheduler.cpp llvm/lib/CodeGen/SelectionDAG/FastISel.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/lib/IR/Instruction.cpp llvm/lib/IR/Verifier.cpp llvm/lib/Target/X86/X86FloatingPoint.cpp llvm/lib/Transforms/Scalar/SROA.cpp llvm/lib/Transforms/Utils/CloneFunction.cpp llvm/lib/Transforms/Utils/Local.cpp llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/MachineCSE.cpp b/llvm/lib/CodeGen/MachineCSE.cpp
index c3c7f48677..a1e33d4b67 100644
--- a/llvm/lib/CodeGen/MachineCSE.cpp
+++ b/llvm/lib/CodeGen/MachineCSE.cpp
@@ -406,7 +406,8 @@ bool MachineCSE::PhysRegDefsReach(MachineInstr *CSMI, MachineInstr *MI,
 
 bool MachineCSE::isCSECandidate(MachineInstr *MI) {
   if (MI->isPosition() || MI->isPHI() || MI->isImplicitDef() || MI->isKill() ||
-      MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() || MI->isFakeUse())
+      MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() ||
+      MI->isFakeUse())
     return false;
 
   // Ignore copies.

@felipepiovezan
Contributor

I only skimmed the diff, but noticed there is no global isel lowering here. Were there any challenges in doing that or was it not considered?

@pogo59
Collaborator

pogo59 commented Mar 21, 2024

I only skimmed the diff, but noticed there is no global isel lowering here. Were there any challenges in doing that or was it not considered?

Our target of interest is X86, so I believe it was not considered. @wolfy1961 can correct me if I'm misremembering.

@SLTozer
Contributor Author

SLTozer commented Mar 21, 2024

I only skimmed the diff, but noticed there is no global isel lowering here. Were there any challenges in doing that or was it not considered?

I wasn't the original patch writer, but I think it would be the latter. I'm not familiar with the internals of GlobalISel, so I'm not sure how confident I'd be implementing support for fake use there myself - though FWIW, since it's a no-op, it's mostly just a matter of filling in the boilerplate, with a few more complex cases (such as moving fake uses before tail calls).
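For illustration, a minimal MIR sketch of what this boils down to (hypothetical, not taken from the patch's tests): FAKE_USE is a no-op pseudo whose only job is to keep its operand live, and AsmPrinter emits no code for it:

```
name: demo
body: |
  bb.0:
    liveins: $edi
    ; Copy the argument; after this, $edi has no real uses left.
    $eax = MOV32rr $edi
    ; Keeps $edi live (and therefore describable by debug info) until
    ; this point; no machine code is generated for this instruction.
    FAKE_USE killed $edi
    RET64 $eax
```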

@felipepiovezan
Contributor

felipepiovezan commented Mar 21, 2024

I only skimmed the diff, but noticed there is no global isel lowering here. Were there any challenges in doing that or was it not considered?

I wasn't the original patch writer, but I think it would be the latter. I'm not familiar with the internals of GlobalISel, so I'm not sure how confident I'd be implementing support for fake use there myself - though FWIW, since it's a no-op it's mostly just filling in the boilerplate, with a few more complex cases (such as moving fake uses before tail calls).

The reason I asked is -- and I could be wrong -- that GlobalISel will give up and fall back to SelectionDAG when it sees something it doesn't know how to lower. That usually happens in IRTranslator.cpp:

        if (translate(Inst))
          continue;

        ...
        reportTranslationError(*MF, *TPC, *ORE, R);
        return false;
      }

So, depending on how/when these intrinsics are generated, this patch could affect a few different targets.

edit: not this patch, but the one that would generate the intrinsics

@adrian-prantl
Collaborator

So, depending on how/when these intrinsics are generated, this patch could affect a few different targets.

And it would introduce a situation we usually try to avoid: the presence of debug info could significantly change code generation.

@wolfy1961
Collaborator

I only skimmed the diff, but noticed there is no global isel lowering here. Were there any challenges in doing that or was it not considered?

IIRC, the original patch predated GlobalISel, or GlobalISel was in a very early stage at the time, so it was simply not considered.

Our target of interest is X86, so I believe it was not considered. @wolfy1961 can correct me if I'm mis-remembering.

The patch wasn't meant to be specific to any target, although our efforts were concentrating on X86.

@SLTozer
Contributor Author

SLTozer commented Mar 21, 2024

So, depending on how/when these intrinsics are generated, this patch could affect a few different targets.

Interesting; I think falling back on SelectionDAG would produce a relatively normal result (emitting FAKE_USE), but it's hard to say whether there would be any knock-on effects. In the worst-case scenario we could just drop fake uses in GlobalISel - that should be safe, since the intrinsics themselves are no-ops; it just means the values would no longer be protected from the start of ISel onwards.

And it would introduce a situation we usually try to avoid: the presence of debug info could significantly change code generation.

This intrinsic has no technical relation to debug info; the purpose is to improve debug info quality when enabled, but the codegen effects of this flag are unrelated to whether debug info is emitted or not; if you use -fextend-lifetimes without -g, you'll prevent optimizations that would have harmed debug info even when that debug info isn't present.
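For concreteness, the IR shape this enables looks roughly like the following (a hand-written sketch in the spirit of this patch, not output from it; the function and callee names are invented):

```llvm
define void @f(i32 %x) {
entry:
  %sum = add i32 %x, 1
  call void @use(i32 %sum)
  ; Without the fake use, %x is dead after the add and its register or
  ; stack slot can be reused; the fake use keeps %x live until the end
  ; of its scope, whether or not -g is also passed.
  notail call void (...) @llvm.fake.use(i32 %x)
  ret void
}

declare void @use(i32)
declare void @llvm.fake.use(...)
```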

@wolfy1961
Collaborator

So, depending on how/when these intrinsics are generated, this patch could affect a few different targets.

And it would introduce a situation we usually try to avoid: the presence of debug info could significantly change code generation.

Conceptually, though, extend-lifetimes is purely a codegen change. It could stand on its own, even though it would only be useful for improving debuggability. I think it's important to separate the two concepts: debug info generation by itself does not trigger extend-lifetimes; extend-lifetimes has to be requested explicitly.

@felipepiovezan
Contributor

Interesting; I think falling back on SelectionDAG would produce a relatively normal result (emitting FAKE_USE),

To clarify, the entire function would be skipped by GlobalISel, not just the lowering of one instruction.

@felipepiovezan
Contributor

Conceptually, though, extend-lifetimes is purely a codegen change. It could stand on its own, even though it would only be useful to improve debuggability. I think it's important to separate the 2 concepts. Debug info generation by itself does not trigger extend-lifetimes unless we specifically also request extend-lifetimes as well.

Isn't the following statement contradicting the above?

The extend lifetimes flag is intended to eventually be set on by -Og

@SLTozer
Contributor Author

SLTozer commented Mar 21, 2024

To clarify, the entire function would be skipped by GlobalISel, not just the lowering of one instruction.

Ah, I misunderstood - in which case forcing it to be dropped until a complete implementation is present would be preferable.

Isn't the following statement contradicting the above?

-Og is also purely a codegen change, it doesn't control whether or not debug info is emitted; in both cases, we're telling the compiler to change codegen to be better for debugging.

@arsenm
Contributor

arsenm commented Mar 22, 2024

Passing a pseudo through GlobalISel should be trivial; it will just need legalization handling approximately equivalent to whatever every target does for G_IMPLICIT_DEF.

@pogo59
Collaborator

pogo59 commented Apr 25, 2024

There was also a lightning talk at the 2017 US Dev Meeting. Our users have been happy with this feature for years.

@jryans
Member

jryans commented Apr 25, 2024

I hope to provide an additional review of this work today / tomorrow.

Member

@jryans jryans left a comment


Similar to @OCHyams, this looks good to me overall, though it does touch some parts I am not familiar with (like the legalizer). Sony's multi-year usage of this downstream adds confidence in the work. The feature as a whole has received positive feedback in various discussions, so it would be great to move forward here.

I believe the only major missing component is GlobalISel support. Once that has been added, I think this would be ready to go.

Comment on lines 1104 to 1107
static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI) {
const MachineFunction *MF = MI.getMF();
const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();

Contributor

Should probably just pass in TII instead of doing the 4 level dance to get to it

Comment on lines 1120 to 1125
for (const MachineOperand &MO : NextI->operands()) {
// Return true if we came across the register from the
// previous spill instruction that is killed in NextI.
if (MO.isReg() && MO.isUse() && MO.isKill() && MO.getReg() == Reg)
return true;
}
Contributor

This doesn't really work correctly for register aliases. Is this the same as MI.killsRegister?

Also it would be best to avoid relying on kill flags. Hopefully we will eliminate them someday

Contributor Author

Do you have a suggestion for what to use instead? The intent here is to determine whether the only reason this load exists is for the immediately following fake_use; if we just removed the check for MO.isKill(), this would presumably lead to us deleting loads that are actually being used. Is there some alternative information, such as an analysis, that can determine whether any other instruction is using the loaded value?

Contributor

The preferred method is to maintain LiveRegUnits while walking from the end of the block (which is the wrong direction for the asmprinter...)

@SLTozer
Contributor Author

SLTozer commented Jun 4, 2024

Apologies for the delay on updating this patch, my attention has been elsewhere for a while. I've added GlobalISel support that seems to work - in the new and updated check-fake-use.py, we test that we can successfully extend the lifetimes of variables for --global-isel=1 -mtriple=aarch64--linux-gnu.

Re: check-fake-use.py, I agree that it's not ideal to need a script for this. Having looked a little more at it, I've changed the script so that instead of checking that the variable's location range covers from DW_AT_low_pc to within 16 bytes of DW_AT_high_pc, we check that the variable's location range covers at least from prologue_end† to epilogue_begin. I personally think testing for this instead of testing for exact byte offsets is more logical and less brittle. We could also get rid of the python script if we currently had a measurement in LLVM for "scope bytes covered excluding prologue/epilogue", but we don't, so for now I think this ought to stay. Thoughts @OCHyams @dwblaikie ?

† It might make sense to use DW_AT_low_pc rather than prologue_end when talking about parameter lifetimes, but for the AArch64 GlobalISel version the prologue precedes the first DBG_VALUE instruction (and this is correct by IR semantics imo!) so I chose prologue_end in this case.
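The core of the check described above can be sketched independently of the dwarfdump parsing. This is a simplified illustration of the idea, not code from check-fake-use.py; the helper name and the interval representation are invented:

```python
def covers_interval(loc_ranges, prologue_end, epilogue_begin):
    """Return True if the union of half-open [lo, hi) location-list
    ranges covers every address in [prologue_end, epilogue_begin)."""
    covered_up_to = prologue_end
    for lo, hi in sorted(loc_ranges):
        if lo > covered_up_to:
            return False  # gap: some address in the interval has no location
        covered_up_to = max(covered_up_to, hi)
        if covered_up_to >= epilogue_begin:
            return True
    return covered_up_to >= epilogue_begin

# A contiguous DW_AT_location list passes; one with a gap fails.
assert covers_interval([(0x0, 0x8), (0x8, 0x30)], 0x4, 0x2C)
assert not covers_interval([(0x0, 0x8), (0x10, 0x30)], 0x4, 0x2C)
```

Testing for coverage of [prologue_end, epilogue_begin) rather than exact byte offsets is what makes the check robust against unrelated codegen changes.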

@dwblaikie
Collaborator

A prologue begin/end solution only applies to top level locals in a function, right? For smaller scopes they won't extend that far and would need to be measured relative to the high/low/ranges bytes covered by the scope the variable is within (& of course that'll always neglect the issues of variables starting their lifetime part-way through their scope ({ foo(); int x; foo(); } won't be live for the first foo call, etc))

Though if llvm-dwarfdump statistics need improvement, I'd hope that could be done rather than building a statistic in a separate script, by the sounds of it?

@SLTozer
Contributor Author

SLTozer commented Jun 5, 2024

A prologue begin/end solution only applies to top level locals in a function, right?

Yes, the main point here is about what's needed to test this specific case more than a general point about llvm-dwarfdump --statistics, although there are several things that could be improved about that measurement as you've suggested. I think to summarize, there is a larger scale problem of how to measure debug scope coverage, and a smaller problem of how to determine whether -fextend-lifetimes has worked, which is conceptually simple (does the variable have a DW_AT_location that covers [prologue_end, epilogue_begin)), but for which there is no existing LLVM component that tells us that.

Though, if using a python script is a major blocker there are a few alternatives I think: we could implement this statistic in llvm-dwarfdump (though I'm not sure whether that would be a justifiable change); we could make this a Dexter test in cross-project-tests (at the cost of involving extra components that aren't really related to the test); or we could extract this information some other way, e.g. using some series of sed commands.

@tru
Collaborator

tru commented Jun 5, 2024

Since python is already a dependency for running the tests I don't see a problem with it. While I am not sure we have tests implemented in python for clang, lldb has a lot of them so it's not unprecedented. I am interested in hearing what the objections would be in this case.

Contributor

@OCHyams OCHyams left a comment

Since python is already a dependency for running the tests I don't see a problem with it. While I am not sure we have tests implemented in python for clang, lldb has a lot of them so it's not unprecedented. I am interested in hearing what the objections would be in this case.

I can't speak for @dwblaikie but my objection was based on the fact that I would personally rather see a more brittle test that is simpler. It's not an objection against using python, just a question of whether it's really needed here.

That said, I don't feel strongly enough to insist and can see the benefit in having a slightly fuzzy test. Especially if it's going to be fiddly due to different debug info emission coming out of SelectionDAG/GlobalISel.

Does the script handle exprloc DW_AT_locations (i.e., single location that implicitly covers the scope with no explicit range)?

; RUN: %llc_dwarf -O2 -debugger-tune=sce --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
; RUN: %python %p/Inputs/check-fake-use.py %t
; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s | %llc_dwarf - -O2 -debugger-tune=sce -filetype=obj -dwarf-linkage-names=Abstract | llvm-dwarfdump --debug-info --debug-line -v - -o %t
; RUN: not %python %p/Inputs/check-fake-use.py %t
Contributor

check-fake-use.py has a few different fail states - perhaps we should FileCheck for Location list for 'b' does not cover the whole function here and below. wdyt?

Contributor Author

Perhaps - it didn't occur to me because we probably shouldn't see any other fail state emerge as a result of just turning off extend-lifetimes, but no harm in making it more specific!

@SLTozer
Contributor Author

SLTozer commented Jun 6, 2024

Does the script handle exprloc DW_AT_locations (i.e., single location that implicitly covers the scope with no explicit range)?

No, although I think that shouldn't be possible? I think we only emit that for constants or declares.

notail call void (...) @llvm.fake.use(i32 %v)
ret void
}

Contributor

Test a case where the value produced by the call isn't used by the fake use, and is the return value?

Comment on lines 17 to 23

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 9.0.0"}
!2 = !{i32 -2147471544}
Contributor

Don't need the metadata

@@ -0,0 +1,54 @@
; RUN: opt -S -passes=sroa %s | FileCheck %s
Contributor

This should go in test/Transforms/SROA, not codegen

@@ -0,0 +1,53 @@
; RUN: opt -mtriple=x86_64-unknown-unknown -S -codegenprepare <%s -o - | FileCheck %s
Contributor

Move to test/Transforms/CodeGenPrepare/

Comment on lines 18 to 21
; ModuleID = 'test.cpp'
source_filename = "test.cpp"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-unknown"
Contributor

Don't need this

Comment on lines 48 to 53
!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 2}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{!"clang version 7.0.0"}
Contributor

Don't need the metadata

@@ -0,0 +1,17 @@
; assert in DAGlegalizer with fake use of half precision float.
; Changes to half float promotion.
; RUN: llc -O2 -stop-after=finalize-isel -filetype=asm -o - %s | FileCheck %s
Contributor

-O2 and -filetype=asm are the defaults


!0 = !{i32 1, !"wchar_size", i32 2}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{!"clang version 10.0.0"}
Contributor

Missing newline at end of file

@@ -0,0 +1,40 @@
; RUN: llc < %s -stop-after=finalize-isel | FileCheck %s --implicit-check-not=FAKE_USE
Contributor

test/CodeGen/MIR is for testing of the MIR printer/parser, not tests that check MIR output. This should go to test/CodeGen/X86

@@ -0,0 +1,95 @@
# RUN: llc < %s -x mir -run-pass=codegenprepare | FileCheck %s --implicit-check-not="llvm.fake.use"
Contributor

This test doesn't make sense, and I'm surprised this doesn't error on you. You're trying to run an IR pass on MIR, and the IR is supposed to just be a constant here and the MIR should not exist yet. You should move this to test/CodeGenPrepare and run it with opt

@OCHyams
Contributor

OCHyams commented Jun 6, 2024

I said
Does the script handle exprloc DW_AT_locations (i.e., single location that implicitly covers the scope with no explicit range)?

@SLTozer said
No, although I think that shouldn't be possible? I think we only emit that for constants or declares.

It's definitely possible:

$ cat test.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-unknown"

define i32 @f(i32 %a) !dbg !4 {
entry:
  tail call void @llvm.dbg.value(metadata i32 %a, metadata !8, metadata !DIExpression()), !dbg !10
  ret i32 0, !dbg !10
}

declare void @llvm.dbg.value(metadata, metadata, metadata)

!llvm.dbg.cu = !{!0}
!llvm.debugify = !{!2, !2}
!llvm.module.flags = !{!3}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "test.ll", directory: "/")
!2 = !{i32 1}
!3 = !{i32 2, !"Debug Info Version", i32 3}
!4 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !1, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !7)
!5 = !DISubroutineType(types: !6)
!6 = !{}
!7 = !{!8}
!8 = !DILocalVariable(name: "1", scope: !4, file: !1, line: 1, type: !9)
!9 = !DIBasicType(name: "ty32", size: 32, encoding: DW_ATE_unsigned)
!10 = !DILocation(line: 1, column: 1, scope: !4)
$ llc test.ll --filetype=obj -o - | llvm-dwarfdump -
<...>
0x00000043:     DW_TAG_variable
                  DW_AT_location        (DW_OP_reg5 RDI)
                  DW_AT_name    ("1")
                  DW_AT_decl_file       ("/test.ll")
                  DW_AT_decl_line       (1)
                  DW_AT_type    (0x00000051 "ty32")

@SLTozer
Contributor Author

SLTozer commented Jun 6, 2024

It's definitely possible:

Ah that's true, if it's going to be live at a single location for every instruction in the function then it'll emit with a fixed location. That shouldn't happen in this test since we have the big register kill operation, but no point in making assumptions when we can add coverage instead - will update.

Comment on lines 1134 to 1137
// If this is a reload of a spilled register that only feeds into a FAKE_USE
// instruction, meaning the load value has no effect on the program and has
// only been kept alive for debugging; since it is still available on the
// stack, we can skip the load itself.
Contributor

I don't understand the comment in this context. The load isn't being skipped, it's just skipping the comments about the instruction

attributes #1 = { nocallback nofree nosync nounwind willreturn }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
Contributor

Can drop this

%0:gr64 = COPY $rdi
%1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load (s32) from @glb, align 16, !tbaa !2)
MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store (s32) into %ir.p, !tbaa !2)
%3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load (s32) from `ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1)`, !tbaa !2)
Contributor

Drop the IR references?

liveins:
- { reg: '$rdi', virtual-reg: '%0' }
frameInfo:
isFrameAddressTaken: false
Contributor

Can remove most of this, -simplify-mir helps

@@ -0,0 +1,223 @@
# Prevent the machine scheduler from moving instructions past FAKE_USE.
# RUN: llc -run-pass machine-scheduler -debug-only=machine-scheduler 2>&1 -o - %s | FileCheck %s
Contributor

This belongs in test/CodeGen/X86, not under MIR as it is not a MIR support test

Comment on lines 1120 to 1125
for (const MachineOperand &MO : NextI->operands()) {
// Return true if we came across the register from the
// previous spill instruction that is killed in NextI.
if (MO.isReg() && MO.isUse() && MO.isKill() && MO.getReg() == Reg)
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preferred method is to maintain LiveRegUnits while walking from the end of the block (which is the wrong direction for the asmprinter...)

Comment on lines 1887 to 1888
if (isLoadFeedingIntoFakeUse(MI, TII))
break;
Contributor

It's pretty inappropriate for an instruction to disappear in the AsmPrinter. Can you handle this optimization the same way we do for kill?

@dwblaikie
Collaborator

Since python is already a dependency for running the tests I don't see a problem with it. While I am not sure we have tests implemented in python for clang, lldb has a lot of them so it's not unprecedented. I am interested in hearing what the objections would be in this case.

I can't speak for @dwblaikie but my objection was based on the fact that I would personally rather see a more brittle test that is simpler. It's not an objection against using python, just a question of whether it's really needed here.

Yeah, I jumped in the middle a bit - I haven't looked at the python script/tests in question yet.

But perhaps one example to discuss would be useful?

& yeah, for sufficiently narrow/single-transform tests I don't think it'd be that bad to test for the exact result...

@SLTozer
Contributor Author

SLTozer commented Jun 6, 2024

Thank you @arsenm for the thorough reviews - I'll fix up the remaining comments (regarding AsmPrinter) early tomorrow.

But perhaps one example to discuss would be useful?

The script only applies to one test right now, though it's run in 4 different ways; the script is specialized for the test file, but it could be parameterized. The purpose of this test is essentially checking that FAKE_USE does what it says on the tin, that a variable that should otherwise by all rights be killed is available for its full scope.

The difficult part is that the test still isn't narrow; the test runs llc from IR to object with optimizations enabled and checks at the end that the variable under test is still available for its full scope. We could try making the checks exact, but it'd mean 4 different checks (or more if we increase coverage further) that could each change if any pass changes its behaviour. The actual check is simple I think - we read the output of dwarfdump to confirm that the parameter has a continuous location range that extends from [prologue_end, epilogue_begin). Ideally we'd have a proper way of doing this, but I think it'd be messy and brittle to substitute exact checks.

I think I'd be cautious on principle about adding special scripts to run a single test, but I don't see any easy alternatives without major drawbacks, and I think this is worth testing for; ultimately if the verdict comes down against the python script, my backup proposal is to make it a Dexter test (which will need to be moved to the subsequent patch which adds front-end support).

@SLTozer
Contributor Author

SLTozer commented Jun 7, 2024

It ended up taking a little longer than I expected to make the AsmPrinter changes; I couldn't see a sensible place to put the "remove loads feeding into fake use" behaviour, so I made it into a pass. The main goal here was to have this take place very late, so that no subsequent pass takes advantage of the removed load/fake_use to shorten the stack lifetime of a variable, but before LiveDebugValues, so that LDV has an accurate understanding of which machine locations house the required values.

The pass itself runs only if the function has optdebug enabled, since those should be the only functions with fake uses (tests have been updated to reflect this); it's not an error if we don't run this pass for a function that has fake use instructions though. I'm happy to move it to a separate patch if it makes it easier for reviewers - even outside of test files, this patch is already very large.
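The transformation the new pass performs can be illustrated with a schematic MIR fragment. This is hand-written for illustration; the register, stack slot, and exact instruction forms are invented, and the precise set of instructions deleted is a detail of the pass:

```
# Before the pass: a reload from a spill slot that exists only to feed a
# FAKE_USE, i.e. the loaded value has no effect on the program.
$eax = MOV32rm %stack.0, 1, $noreg, 0, $noreg :: (load (s32) from %stack.0)
FAKE_USE killed $eax

# After the pass: the dead reload is removed. The value is still available
# in spill slot %stack.0, and because the pass runs before LiveDebugValues,
# LDV can still describe the variable's location on the stack.
```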


SmallDenseMap<Register, SmallVector<MachineInstr *>> RegFakeUses;
LivePhysRegs.init(*TRI);
dbgs() << "Running the pass!\n";
Contributor

Stray debug printing

Contributor Author

I've pushed up the removal for those already, but this was some very quick reviewing!

skipFunction(MF.getFunction()))
return false;

dbgs() << "We could run the pass!\n";
Contributor

Stray debug printing

Comment on lines 124 to 126
dbgs() << "Reg is live: ";
TRI->dumpReg(Reg);
dbgs() << "\n";
Contributor

More leftover debugging

INITIALIZE_PASS_END(RemoveLoadsIntoFakeUses, DEBUG_TYPE,
"Remove Loads Into Fake Uses", false, false)

bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
Contributor

I wonder if we can just do this in ExpandPostRAPseudos and save having yet another pass.

This would also bring the side effect where I think we should have ExpandPostRAPseudos run in reverse and lazily maintain LiveRegUnits

Contributor Author

@SLTozer SLTozer Jun 7, 2024

I'm fine to move it into another pass, and patch up that pass as I go through it - just a quick question though: Might this step too far outside the scope of what ExpandPostRAPseudos is expected to do? If so, would that justify changing the name slightly, or do we prefer to keep those fixed over time?

Also, a minor unrelated question: I noticed that passes may be added to TargetPassConfig::addMachinePasses or to CodeGenPassBuilder&lt;Derived&gt;::addMachinePasses, and most (but not all) passes are in both - is there an important distinction between them for determining where a new pass should go?

Edit: Although one concern about moving it to ExpandPostRAPseudos would be if there are passes between that and LiveDebugValues that might shorten the stack lifetime of a variable - I don't see any obvious candidates, but if you know any off the top of your head that would be a good argument for not moving this behaviour into there.

Contributor

I do believe we have an elide-reload-of-kill optimization, but a one-minute scan didn't find it; wherever that happens would be the best place.
TargetPassConfig::addMachinePasses is the old PM; CodeGenPassBuilder is the new PM, and is still incomplete.

Contributor Author

Might be MachineLateInstrsCleanup? I see that that relies on isKill() for liveness checks as well, but might otherwise make sense as a location. It's hard to tell whether there are any optimizations afterwards that might cause an issue; it'd probably be faster to implement and test it than scan through the source of all the applicable passes.

Contributor

It's not MachineLateInstrsCleanup (which is also running in the wrong direction). It's looking for killed register operands, not KILL instructions.

@OCHyams
Contributor

OCHyams commented Jun 7, 2024

Ah that's true, if it's going to be live at a single location for every instruction in the function then it'll emit with a fixed location. That shouldn't happen in this test since we have the big register kill operation, but no point in making assumptions when we can add coverage instead - will update.

Ah ok if it can't occur in the test then no need. We can extend it as and when necessary (that's better than having additional complexity that isn't used in the script).
