Fix stack layout for frames larger than 2gb #84114

wesleywiser · 2024-03-06T04:21:11Z

For very large stack frames, the offset from the stack pointer to a
local can exceed 2^31 bytes, which overflows various `int` offsets
in the frame lowering code.

This patch updates the frame lowering code to calculate the offsets as
64-bit values and resolves the overflows, resulting in the correct
codegen for very large frames.

Fixes #48911
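
To illustrate the overflow described above, here is a minimal sketch of what happens when a frame offset just past 2^31 is kept in a 32-bit signed field. The helper `storeInInt32` is hypothetical and is not part of LLVM; it only models two's-complement wraparound in a portable way.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models storing a 64-bit frame
// offset into a 32-bit two's-complement `int` field.
inline int32_t storeInInt32(int64_t Offset) {
  // Conversion to an unsigned type is well-defined: it keeps the low 32 bits.
  const uint32_t Low = static_cast<uint32_t>(Offset);
  // Reinterpret those 32 bits as a signed two's-complement value.
  if (Low >= UINT32_C(0x80000000))
    return static_cast<int32_t>(static_cast<int64_t>(Low) -
                                INT64_C(4294967296));
  return static_cast<int32_t>(Low);
}
```

Under this model, an offset of 2^31 - 1 survives unchanged, while 2^31 becomes INT32_MIN, so a local that should sit far above the stack pointer would be addressed far below it. Widening the field to `int64_t`, as the patch does, avoids the wraparound.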

llvmbot · 2024-03-06T04:21:40Z

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-mc
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-msp430

@llvm/pr-subscribers-debuginfo

Author: Wesley Wiser (wesleywiser)

Patch is 35.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/84114.diff

21 Files Affected:

(modified) llvm/include/llvm/CodeGen/MachineFrameInfo.h (+7-9)
(modified) llvm/include/llvm/CodeGen/TargetFrameLowering.h (+2-2)
(modified) llvm/include/llvm/MC/MCAsmBackend.h (+1-1)
(modified) llvm/include/llvm/MC/MCDwarf.h (+19-19)
(modified) llvm/lib/CodeGen/CFIInstrInserter.cpp (+5-5)
(modified) llvm/lib/CodeGen/MachineFrameInfo.cpp (+1-1)
(modified) llvm/lib/CodeGen/PrologEpilogInserter.cpp (+2-2)
(modified) llvm/lib/MC/MCDwarf.cpp (+3-3)
(modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp (+5-5)
(modified) llvm/lib/Target/ARM/ARMFrameLowering.cpp (+2-2)
(modified) llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp (+1-1)
(modified) llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h (+1-1)
(modified) llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp (+2-2)
(modified) llvm/lib/Target/MSP430/MSP430FrameLowering.cpp (+1-1)
(modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+6-6)
(modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+7-6)
(modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+14-14)
(modified) llvm/lib/Target/X86/X86FrameLowering.h (+2-2)
(modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (+6-4)
(modified) llvm/test/CodeGen/PowerPC/huge-frame-size.ll (+1-1)
(added) llvm/test/CodeGen/X86/huge-stack.ll (+22)

diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 7d11d63d4066f4..243b124d89ff40 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -251,7 +251,7 @@ class MachineFrameInfo {
   /// targets, this value is only used when generating debug info (via
   /// TargetRegisterInfo::getFrameIndexReference); when generating code, the
   /// corresponding adjustments are performed directly.
-  int OffsetAdjustment = 0;
+  int64_t OffsetAdjustment = 0;
 
   /// The prolog/epilog code inserter may process objects that require greater
   /// alignment than the default alignment the target provides.
@@ -280,7 +280,7 @@ class MachineFrameInfo {
   /// setup/destroy pseudo instructions (as defined in the TargetFrameInfo
   /// class).  This information is important for frame pointer elimination.
   /// It is only valid during and after prolog/epilog code insertion.
-  unsigned MaxCallFrameSize = ~0u;
+  unsigned long MaxCallFrameSize = ~0ul;
 
   /// The number of bytes of callee saved registers that the target wants to
   /// report for the current function in the CodeView S_FRAMEPROC record.
@@ -591,10 +591,10 @@ class MachineFrameInfo {
   uint64_t estimateStackSize(const MachineFunction &MF) const;
 
   /// Return the correction for frame offsets.
-  int getOffsetAdjustment() const { return OffsetAdjustment; }
+  int64_t getOffsetAdjustment() const { return OffsetAdjustment; }
 
   /// Set the correction for frame offsets.
-  void setOffsetAdjustment(int Adj) { OffsetAdjustment = Adj; }
+  void setOffsetAdjustment(int64_t Adj) { OffsetAdjustment = Adj; }
 
   /// Return the alignment in bytes that this function must be aligned to,
   /// which is greater than the default stack alignment provided by the target.
@@ -651,17 +651,15 @@ class MachineFrameInfo {
   /// CallFrameSetup/Destroy pseudo instructions are used by the target, and
   /// then only during or after prolog/epilog code insertion.
   ///
-  unsigned getMaxCallFrameSize() const {
+  unsigned long getMaxCallFrameSize() const {
     // TODO: Enable this assert when targets are fixed.
     //assert(isMaxCallFrameSizeComputed() && "MaxCallFrameSize not computed yet");
     if (!isMaxCallFrameSizeComputed())
       return 0;
     return MaxCallFrameSize;
   }
-  bool isMaxCallFrameSizeComputed() const {
-    return MaxCallFrameSize != ~0u;
-  }
-  void setMaxCallFrameSize(unsigned S) { MaxCallFrameSize = S; }
+  bool isMaxCallFrameSizeComputed() const { return MaxCallFrameSize != ~0ul; }
+  void setMaxCallFrameSize(unsigned long S) { MaxCallFrameSize = S; }
 
   /// Returns how many bytes of callee-saved registers the target pushed in the
   /// prologue. Only used for debug info.
diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index 0b9cacecc7cbe1..a256574d9d2bee 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -51,7 +51,7 @@ class TargetFrameLowering {
   // Maps a callee saved register to a stack slot with a fixed offset.
   struct SpillSlot {
     unsigned Reg;
-    int Offset; // Offset relative to stack pointer on function entry.
+    long Offset; // Offset relative to stack pointer on function entry.
   };
 
   struct DwarfFrameBase {
@@ -66,7 +66,7 @@ class TargetFrameLowering {
       // Used with FrameBaseKind::Register.
       unsigned Reg;
       // Used with FrameBaseKind::CFA.
-      int Offset;
+      long Offset;
       struct WasmFrameBase WasmLoc;
     } Location;
   };
diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index 01a64fb425a94f..689e3cd5dbf206 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -232,7 +232,7 @@ class MCAsmBackend {
   virtual void handleAssemblerFlag(MCAssemblerFlag Flag) {}
 
   /// Generate the compact unwind encoding for the CFI instructions.
-  virtual uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  virtual uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                                  const MCContext *Ctxt) const {
     return 0;
   }
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index 18056c5fdf816a..e39c44f38a342a 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -508,7 +508,7 @@ class MCCFIInstruction {
   MCSymbol *Label;
   unsigned Register;
   union {
-    int Offset;
+    long Offset;
     unsigned Register2;
   };
   unsigned AddressSpace = ~0u;
@@ -516,7 +516,7 @@ class MCCFIInstruction {
   std::vector<char> Values;
   std::string Comment;
 
-  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int O, SMLoc Loc,
+  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, long O, SMLoc Loc,
                    StringRef V = "", StringRef Comment = "")
       : Operation(Op), Label(L), Register(R), Offset(O), Loc(Loc),
         Values(V.begin(), V.end()), Comment(Comment) {
@@ -528,7 +528,7 @@ class MCCFIInstruction {
     assert(Op == OpRegister);
   }
 
-  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int O, unsigned AS,
+  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, long O, unsigned AS,
                    SMLoc Loc)
       : Operation(Op), Label(L), Register(R), Offset(O), AddressSpace(AS),
         Loc(Loc) {
@@ -538,7 +538,7 @@ class MCCFIInstruction {
 public:
   /// .cfi_def_cfa defines a rule for computing CFA as: take address from
   /// Register and add Offset to it.
-  static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, int Offset,
+  static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, long Offset,
                                     SMLoc Loc = {}) {
     return MCCFIInstruction(OpDefCfa, L, Register, Offset, Loc);
   }
@@ -547,13 +547,13 @@ class MCCFIInstruction {
   /// on Register will be used instead of the old one. Offset remains the same.
   static MCCFIInstruction createDefCfaRegister(MCSymbol *L, unsigned Register,
                                                SMLoc Loc = {}) {
-    return MCCFIInstruction(OpDefCfaRegister, L, Register, 0, Loc);
+    return MCCFIInstruction(OpDefCfaRegister, L, Register, 0L, Loc);
   }
 
   /// .cfi_def_cfa_offset modifies a rule for computing CFA. Register
   /// remains the same, but offset is new. Note that it is the absolute offset
   /// that will be added to a defined register to the compute CFA address.
-  static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int Offset,
+  static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, long Offset,
                                           SMLoc Loc = {}) {
     return MCCFIInstruction(OpDefCfaOffset, L, 0, Offset, Loc);
   }
@@ -561,7 +561,7 @@ class MCCFIInstruction {
   /// .cfi_adjust_cfa_offset Same as .cfi_def_cfa_offset, but
   /// Offset is a relative value that is added/subtracted from the previous
   /// offset.
-  static MCCFIInstruction createAdjustCfaOffset(MCSymbol *L, int Adjustment,
+  static MCCFIInstruction createAdjustCfaOffset(MCSymbol *L, long Adjustment,
                                                 SMLoc Loc = {}) {
     return MCCFIInstruction(OpAdjustCfaOffset, L, 0, Adjustment, Loc);
   }
@@ -581,7 +581,7 @@ class MCCFIInstruction {
   /// .cfi_offset Previous value of Register is saved at offset Offset
   /// from CFA.
   static MCCFIInstruction createOffset(MCSymbol *L, unsigned Register,
-                                       int Offset, SMLoc Loc = {}) {
+                                       long Offset, SMLoc Loc = {}) {
     return MCCFIInstruction(OpOffset, L, Register, Offset, Loc);
   }
 
@@ -589,7 +589,7 @@ class MCCFIInstruction {
   /// Offset from the current CFA register. This is transformed to .cfi_offset
   /// using the known displacement of the CFA register from the CFA.
   static MCCFIInstruction createRelOffset(MCSymbol *L, unsigned Register,
-                                          int Offset, SMLoc Loc = {}) {
+                                          long Offset, SMLoc Loc = {}) {
     return MCCFIInstruction(OpRelOffset, L, Register, Offset, Loc);
   }
 
@@ -602,12 +602,12 @@ class MCCFIInstruction {
 
   /// .cfi_window_save SPARC register window is saved.
   static MCCFIInstruction createWindowSave(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpWindowSave, L, 0, 0, Loc);
+    return MCCFIInstruction(OpWindowSave, L, 0, 0L, Loc);
   }
 
   /// .cfi_negate_ra_state AArch64 negate RA state.
   static MCCFIInstruction createNegateRAState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpNegateRAState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpNegateRAState, L, 0, 0L, Loc);
   }
 
   /// .cfi_restore says that the rule for Register is now the same as it
@@ -615,31 +615,31 @@ class MCCFIInstruction {
   /// by .cfi_startproc were executed.
   static MCCFIInstruction createRestore(MCSymbol *L, unsigned Register,
                                         SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRestore, L, Register, 0, Loc);
+    return MCCFIInstruction(OpRestore, L, Register, 0L, Loc);
   }
 
   /// .cfi_undefined From now on the previous value of Register can't be
   /// restored anymore.
   static MCCFIInstruction createUndefined(MCSymbol *L, unsigned Register,
                                           SMLoc Loc = {}) {
-    return MCCFIInstruction(OpUndefined, L, Register, 0, Loc);
+    return MCCFIInstruction(OpUndefined, L, Register, 0L, Loc);
   }
 
   /// .cfi_same_value Current value of Register is the same as in the
   /// previous frame. I.e., no restoration is needed.
   static MCCFIInstruction createSameValue(MCSymbol *L, unsigned Register,
                                           SMLoc Loc = {}) {
-    return MCCFIInstruction(OpSameValue, L, Register, 0, Loc);
+    return MCCFIInstruction(OpSameValue, L, Register, 0L, Loc);
   }
 
   /// .cfi_remember_state Save all current rules for all registers.
   static MCCFIInstruction createRememberState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRememberState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpRememberState, L, 0, 0L, Loc);
   }
 
   /// .cfi_restore_state Restore the previously saved state.
   static MCCFIInstruction createRestoreState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRestoreState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpRestoreState, L, 0, 0L, Loc);
   }
 
   /// .cfi_escape Allows the user to add arbitrary bytes to the unwind
@@ -650,7 +650,7 @@ class MCCFIInstruction {
   }
 
   /// A special wrapper for .cfi_escape that indicates GNU_ARGS_SIZE
-  static MCCFIInstruction createGnuArgsSize(MCSymbol *L, int Size,
+  static MCCFIInstruction createGnuArgsSize(MCSymbol *L, long Size,
                                             SMLoc Loc = {}) {
     return MCCFIInstruction(OpGnuArgsSize, L, 0, Size, Loc);
   }
@@ -677,7 +677,7 @@ class MCCFIInstruction {
     return AddressSpace;
   }
 
-  int getOffset() const {
+  long getOffset() const {
     assert(Operation == OpDefCfa || Operation == OpOffset ||
            Operation == OpRelOffset || Operation == OpDefCfaOffset ||
            Operation == OpAdjustCfaOffset || Operation == OpGnuArgsSize ||
@@ -705,7 +705,7 @@ struct MCDwarfFrameInfo {
   unsigned CurrentCfaRegister = 0;
   unsigned PersonalityEncoding = 0;
   unsigned LsdaEncoding = 0;
-  uint32_t CompactUnwindEncoding = 0;
+  uint64_t CompactUnwindEncoding = 0;
   bool IsSignalFrame = false;
   bool IsSimple = false;
   unsigned RAReg = static_cast<unsigned>(INT_MAX);
diff --git a/llvm/lib/CodeGen/CFIInstrInserter.cpp b/llvm/lib/CodeGen/CFIInstrInserter.cpp
index 87b062a16df1d2..06c4686caba96d 100644
--- a/llvm/lib/CodeGen/CFIInstrInserter.cpp
+++ b/llvm/lib/CodeGen/CFIInstrInserter.cpp
@@ -68,9 +68,9 @@ class CFIInstrInserter : public MachineFunctionPass {
   struct MBBCFAInfo {
     MachineBasicBlock *MBB;
     /// Value of cfa offset valid at basic block entry.
-    int IncomingCFAOffset = -1;
+    long IncomingCFAOffset = -1;
     /// Value of cfa offset valid at basic block exit.
-    int OutgoingCFAOffset = -1;
+    long OutgoingCFAOffset = -1;
     /// Value of cfa register valid at basic block entry.
     unsigned IncomingCFARegister = 0;
     /// Value of cfa register valid at basic block exit.
@@ -120,7 +120,7 @@ class CFIInstrInserter : public MachineFunctionPass {
   /// Return the cfa offset value that should be set at the beginning of a MBB
   /// if needed. The negated value is needed when creating CFI instructions that
   /// set absolute offset.
-  int getCorrectCFAOffset(MachineBasicBlock *MBB) {
+  long getCorrectCFAOffset(MachineBasicBlock *MBB) {
     return MBBVector[MBB->getNumber()].IncomingCFAOffset;
   }
 
@@ -175,7 +175,7 @@ void CFIInstrInserter::calculateCFAInfo(MachineFunction &MF) {
 
 void CFIInstrInserter::calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo) {
   // Outgoing cfa offset set by the block.
-  int SetOffset = MBBInfo.IncomingCFAOffset;
+  long SetOffset = MBBInfo.IncomingCFAOffset;
   // Outgoing cfa register set by the block.
   unsigned SetRegister = MBBInfo.IncomingCFARegister;
   MachineFunction *MF = MBBInfo.MBB->getParent();
@@ -188,7 +188,7 @@ void CFIInstrInserter::calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo) {
   for (MachineInstr &MI : *MBBInfo.MBB) {
     if (MI.isCFIInstruction()) {
       std::optional<unsigned> CSRReg;
-      std::optional<int> CSROffset;
+      std::optional<long> CSROffset;
       unsigned CFIIndex = MI.getOperand(0).getCFIIndex();
       const MCCFIInstruction &CFI = Instrs[CFIIndex];
       switch (CFI.getOperation()) {
diff --git a/llvm/lib/CodeGen/MachineFrameInfo.cpp b/llvm/lib/CodeGen/MachineFrameInfo.cpp
index 280d3a6a41edc9..d7dc2a926f56e2 100644
--- a/llvm/lib/CodeGen/MachineFrameInfo.cpp
+++ b/llvm/lib/CodeGen/MachineFrameInfo.cpp
@@ -196,7 +196,7 @@ void MachineFrameInfo::computeMaxCallFrameSize(const MachineFunction &MF) {
     for (const MachineInstr &MI : MBB) {
       unsigned Opcode = MI.getOpcode();
       if (Opcode == FrameSetupOpcode || Opcode == FrameDestroyOpcode) {
-        unsigned Size = TII.getFrameSize(MI);
+        unsigned long Size = TII.getFrameSize(MI);
         MaxCallFrameSize = std::max(MaxCallFrameSize, Size);
         AdjustsStack = true;
       } else if (MI.isInlineAsm()) {
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index 8af17e63e25c75..1193a79ee0b308 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -358,7 +358,7 @@ void PEI::calculateCallFrameInfo(MachineFunction &MF) {
   const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();
   MachineFrameInfo &MFI = MF.getFrameInfo();
 
-  unsigned MaxCallFrameSize = 0;
+  unsigned long MaxCallFrameSize = 0;
   bool AdjustsStack = MFI.adjustsStack();
 
   // Get the function call frame set-up and tear-down instruction opcode
@@ -374,7 +374,7 @@ void PEI::calculateCallFrameInfo(MachineFunction &MF) {
   for (MachineBasicBlock &BB : MF)
     for (MachineBasicBlock::iterator I = BB.begin(); I != BB.end(); ++I)
       if (TII.isFrameInstr(*I)) {
-        unsigned Size = TII.getFrameSize(*I);
+        unsigned long Size = TII.getFrameSize(*I);
         if (Size > MaxCallFrameSize) MaxCallFrameSize = Size;
         AdjustsStack = true;
         FrameSDOps.push_back(I);
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index d0face9140de66..9cb26596e82250 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -1293,8 +1293,8 @@ static void EmitPersonality(MCStreamer &streamer, const MCSymbol &symbol,
 namespace {
 
 class FrameEmitterImpl {
-  int CFAOffset = 0;
-  int InitialCFAOffset = 0;
+  long CFAOffset = 0;
+  long InitialCFAOffset = 0;
   bool IsEH;
   MCObjectStreamer &Streamer;
 
@@ -1408,7 +1408,7 @@ void FrameEmitterImpl::emitCFIInstruction(const MCCFIInstruction &Instr) {
     if (!IsEH)
       Reg = MRI->getDwarfRegNumFromDwarfEHRegNum(Reg);
 
-    int Offset = Instr.getOffset();
+    long Offset = Instr.getOffset();
     if (IsRelative)
       Offset -= CFAOffset;
     Offset = Offset / dataAlignmentFactor;
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
index 30ef3680ae79c9..cdb4ddf471d418 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
@@ -584,7 +584,7 @@ class DarwinAArch64AsmBackend : public AArch64AsmBackend {
   /// Encode compact unwind stack adjustment for frameless functions.
   /// See UNWIND_ARM64_FRAMELESS_STACK_SIZE_MASK in compact_unwind_encoding.h.
   /// The stack size always needs to be 16 byte aligned.
-  uint32_t encodeStackAdjustment(uint32_t StackSize) const {
+  uint64_t encodeStackAdjustment(uint64_t StackSize) const {
     return (StackSize / 16) << 12;
   }
 
@@ -602,7 +602,7 @@ class DarwinAArch64AsmBackend : public AArch64AsmBackend {
   }
 
   /// Generate the compact unwind encoding from the CFI directives.
-  uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                          const MCContext *Ctxt) const override {
     ArrayRef<MCCFIInstruction> Instrs = FI->Instructions;
     if (Instrs.empty())
@@ -612,10 +612,10 @@ class DarwinAArch64AsmBackend : public AArch64AsmBackend {
       return CU::UNWIND_ARM64_MODE_DWARF;
 
     bool HasFP = false;
-    unsigned StackSize = 0;
+    unsigned long StackSize = 0;
 
-    uint32_t CompactUnwindEncoding = 0;
-    int CurOffset = 0;
+    uint64_t CompactUnwindEncoding = 0;
+    long CurOffset = 0;
     for (size_t i = 0, e = Instrs.size(); i != e; ++i) {
       const MCCFIInstruction &Inst = Instrs[i];
 
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 9b54dd4e4e618d..a1012f3996e76b 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -1165,7 +1165,7 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
         if (STI.splitFramePushPop(MF)) {
           unsigned DwarfReg = MRI->getDwarfRegNum(
               Reg == ARM::R12 ? ARM::RA_AUTH_CODE : Reg, true);
-          unsigned Offset = MFI.getObjectOffset(FI);
+          uint64_t Offset = MFI.getObjectOffset(FI);
           unsigned CFIIndex = MF.addFrameInst(
               MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
           BuildMI(MBB, Pos, dl, TII.get(TargetOpcode::CFI_INSTRUCTION))
@@ -1187,7 +1187,7 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
       if ((Reg >= ARM::D0 && Reg <= ARM::D31) &&
           (Reg < ARM::D8 || Reg >= ARM::D8 + AFI->getNumAlignedDPRCS2Regs())) {
         unsigned DwarfReg = MRI->getDwarfRegNum(Reg, true);
-        unsigned Offset = MFI.getObjectOffset(FI);
+        uint64_t Offset = MFI.getObjectOffset(FI);
         unsigned CFIIndex = MF.addFrameInst(
             MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
         BuildMI(MBB, Pos, dl, TII.get(TargetOpcode::CFI_INSTRUCTION))
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
index 6cd4badb7704b7..9671f69bfd2268 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
@@ -1148,7 +1148,7 @@ enum CompactUnwindEncodings {
 /// instructions. If the CFI instructions describe a frame that cannot be
 /// encoded in compact unwind, the method returns UNWIND_ARM_MODE_DWARF which
 /// tells the runtime to fallback and unwind using dwarf.
-uint32_t ARMAsmBackendDarwin::generateCompactUnwindEncoding(
+uint64_t ARMAsmBackendDarwin::generateCompactUnwindEncoding(
     const MCDwarfFrameInfo *FI, const MCContext *Ctxt) const {
   DEBUG_WITH_TYPE("compact-unwind", llvm::dbgs() << "generateCU()\n");
   // Only armv7k uses CFI based unwinding.
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h
index ac0c9b101cae13..9c958003ca756a 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h
+++ b/llvm/lib/Target/ARM/MCTargetDe...
[truncated]

wesleywiser · 2024-03-15T16:52:32Z

The Windows failure appears to be unrelated to my change. If I check out the commit prior to my first change, I still get the same failure in python/dialects/arith_dialect.py.

wesleywiser · 2024-03-20T02:32:47Z

I rebased to resolve merge conflicts. Please let me know if you'd like me to squash the history!

RKSimon

LGTM - @MaskRay any more comments?

chenzheng1030 · 2024-03-21T08:21:04Z

llvm/test/CodeGen/PowerPC/huge-frame-size.ll

@@ -18,7 +18,7 @@ define void @foo(i8 %x) {
 ; CHECK-LE-NEXT:    oris 0, 0, 65535
 ; CHECK-LE-NEXT:    ori 0, 0, 65504
 ; CHECK-LE-NEXT:    stdux 1, 1, 0
-; CHECK-LE-NEXT:    .cfi_def_cfa_offset 32
+; CHECK-LE-NEXT:    .cfi_def_cfa_offset 4294967328


This looks correct now on PPC. The stack size is 4294967296 + linkage size (32) = 4294967328 (0x100000020). Without the change, it was truncated to 0x20.
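
The arithmetic in this comment can be checked with a short sketch. The constant and function names below are illustrative only and do not come from the PowerPC backend; the values are the ones quoted in the comment.

```cpp
#include <cassert>
#include <cstdint>

// Values taken from the review comment; names are hypothetical.
constexpr uint64_t FrameSize = UINT64_C(4294967296); // 2^32 bytes of locals
constexpr uint64_t LinkageSize = 32;                 // PPC64 linkage area

// CFA offset computed in 64 bits, as after the patch.
constexpr uint64_t cfaOffset64() { return FrameSize + LinkageSize; }

// The same value squeezed through a 32-bit field, as before the patch.
constexpr uint32_t cfaOffset32() {
  return static_cast<uint32_t>(cfaOffset64()); // keeps only the low 32 bits
}

static_assert(cfaOffset64() == UINT64_C(0x100000020), "64-bit sum");
static_assert(cfaOffset32() == 0x20, "low 32 bits only");
```

The high bit (bit 32) is the only part of the value that carries the 4 GiB frame size, so dropping it leaves just the 32-byte linkage area, which matches the old `.cfi_def_cfa_offset 32` check line.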

RKSimon

LGTM

RKSimon · 2024-03-25T11:47:27Z

Any more comments before I merge this for @wesleywiser ?

MaskRay · 2024-03-26T01:26:22Z

llvm/include/llvm/MC/MCDwarf.h

    return MCCFIInstruction(OpDefCfa, L, Register, Offset, Loc);
  }

  /// .cfi_def_cfa_register modifies a rule for computing CFA. From now
  /// on Register will be used instead of the old one. Offset remains the same.
  static MCCFIInstruction createDefCfaRegister(MCSymbol *L, unsigned Register,
                                               SMLoc Loc = {}) {
-    return MCCFIInstruction(OpDefCfaRegister, L, Register, 0, Loc);
+    return MCCFIInstruction(OpDefCfaRegister, L, Register, INT64_C(0), Loc);


INT64_C(0) is not necessary. 0 is fairly common for int64_t.

RKSimon · 2024-03-26

@wesleywiser Please can you fix those cases before I commit.

wesleywiser · 2024-03-27

Without the explicit cast, gcc complains the call is ambiguous:

/home/wesley/llvm-project2/llvm/include/llvm/MC/MCDwarf.h:550:66: error: call of overloaded ‘MCCFIInstruction(llvm::MCCFIInstruction::OpType, llvm::MCSymbol*&, unsigned int&, int, llvm::SMLoc&)’ is ambiguous
  550 |     return MCCFIInstruction(OpDefCfaRegister, L, Register, 0, Loc);
      |                                                                  ^
/home/wesley/llvm-project2/llvm/include/llvm/MC/MCDwarf.h:526:3: note: candidate: ‘llvm::MCCFIInstruction::MCCFIInstruction(llvm::MCCFIInstruction::OpType, llvm::MCSymbol*, unsigned int, unsigned int, llvm::SMLoc)’
  526 |   MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R1, unsigned R2, SMLoc Loc)
      |   ^~~~~~~~~~~~~~~~
/home/wesley/llvm-project2/llvm/include/llvm/MC/MCDwarf.h:519:3: note: candidate: ‘llvm::MCCFIInstruction::MCCFIInstruction(llvm::MCCFIInstruction::OpType, llvm::MCSymbol*, unsigned int, int64_t, llvm::SMLoc, llvm::StringRef, llvm::StringRef)’
  519 |   MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int64_t O, SMLoc Loc,
      |   ^~~~~~~~~~~~~~~~

Is there a different way I should solve that?

RKSimon · 2024-03-27

OK, we can try to clean up later
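
The ambiguity reported by gcc can be reproduced in isolation. The sketch below uses a hypothetical `Instr` type with two constructors shaped like the candidates in the error message; it is not the real `MCCFIInstruction`. A plain `0` has type `int`, and converting it to either `unsigned` or `int64_t` is an integral conversion of the same rank, so neither overload wins.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct Loc {}; // stand-in for SMLoc

// Hypothetical type mirroring the two candidate constructors.
struct Instr {
  std::string Picked;
  // Register + 64-bit offset form.
  Instr(unsigned /*R*/, int64_t /*Offset*/, Loc) : Picked("offset") {}
  // Two-register form.
  Instr(unsigned /*R1*/, unsigned /*R2*/, Loc) : Picked("register-pair") {}
};

// A typed zero is an exact match for the int64_t parameter on common
// platforms, so the offset form is chosen.
inline Instr makeWithMacro() { return Instr(1u, INT64_C(0), Loc{}); }

// An explicit cast gives the same result without relying on the macro's type.
inline Instr makeWithCast() { return Instr(1u, static_cast<int64_t>(0), Loc{}); }

// The following does not compile: int -> int64_t and int -> unsigned are
// both integral conversions, so the call is ambiguous.
//   inline Instr makeWithBareZero() { return Instr(1u, 0, Loc{}); }
```

Before the patch the offset parameter was `int`, so a bare `0` was an exact match and no ambiguity arose; widening the parameter is what exposes it.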

RKSimon · 2024-03-27T16:11:53Z

@wesleywiser we're seeing some EXPENSIVE_CHECKS build failures - I'm going to revert the patch and add -verify-machineinstrs to the problem tests (they should have had this anyhow)

wesleywiser · 2024-03-27T16:23:31Z

@RKSimon thanks for the heads up! Is that something I need to look into? (Sorry, still new to contributing to LLVM)

RKSimon · 2024-03-27T16:31:15Z

If you take a look at the tests I've updated in f92fa7e - these should now fail (I'd recommend adding -verify-machineinstrs to X86/huge-stack.ll test as well)

RKSimon · 2024-04-03T14:05:29Z

@wesleywiser Any progress on the -verify-machineinstrs regressions?

wesleywiser · 2024-04-05T02:19:28Z

@RKSimon thanks for the ping! I was away last week and haven't had much time to look deeply at it. The crux of the issue is that with my changes, we correctly calculate the offset to various stack locals as being larger than will fit in a 32-bit displacement:

MOV8mi $rsp, 1, $noreg, 4294967167, $noreg, 42 :: (store (s8) into %ir.3)

If my understanding of x86_64 displacements is correct, then the verification failure is entirely valid as this can't be represented by the displacement. I'm not sure what the right way to fix this is. I think I could change the Prologue/Epilogue Inserter to insert an lea to calculate the right address and then perform the mov.

Is there an existing way to handle this? It seems like this kind of situation could occur in many different parts of x86_64 codegen.
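
As a quick check of the reasoning above: x86-64 memory operands of this form carry at most a sign-extended 32-bit displacement, so the offset in the `MOV8mi` example is out of range. The helper below is a hypothetical standalone sketch, not the verifier's actual check.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if Offset is representable as a signed 32-bit
// displacement in an x86-64 memory operand.
constexpr bool fitsInDisp32(int64_t Offset) {
  return Offset >= std::numeric_limits<int32_t>::min() &&
         Offset <= std::numeric_limits<int32_t>::max();
}

// The displacement from the failing instruction, 4294967167, is just under
// 2^32 and well above INT32_MAX (2147483647).
static_assert(!fitsInDisp32(INT64_C(4294967167)), "needs more than disp32");
static_assert(fitsInDisp32(INT64_C(2147483647)), "largest legal disp32");
```

Any frame index whose resolved offset fails this test has to be addressed some other way, for example by first computing the address into a register.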

RKSimon · 2024-04-05T09:23:58Z

@phoebewang any preference to how we fix this?

phoebewang · 2024-04-05T13:52:10Z

> @phoebewang any preference to how we fix this?

Do you mean #87154? I took a quick look at it. The problem happens only in FastISel. It looks to me like DAGISel can select lea or mov for different offsets. I didn't look deeply into it, but I guess we could either learn how DAGISel handles it or bail out to DAGISel otherwise?

efriedma-quic · 2024-04-05T18:30:07Z

On non-x86 targets, frame indexes that use an addressing mode that ends up being illegal are not rare. Rare enough that it doesn't really matter for performance, but not so rare it doesn't come up. In that case, we scavenge a register and generate equivalent math. It should be possible to do the same on x86.

There are various tricks you can use to make scavenging happen less frequently; for example, using LocalStackSlotAllocation to lower frame indexes for large allocations.
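
The scavenging approach described above can be sketched as a small decision function. Everything here is an assumption made for illustration: `FrameAccessPlan` and `planFrameAccess` are hypothetical names, and a real implementation would work on machine instructions and a register scavenger rather than on plain integers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical description of how one frame access would be lowered.
struct FrameAccessPlan {
  bool UsesScratch;   // true if a scratch register must be scavenged
  int64_t ScratchImm; // immediate materialized into the scratch register
  int32_t Disp;       // displacement left in the memory operand
};

// If the offset fits a signed 32-bit displacement, address the slot directly
// from the base register. Otherwise put the whole offset into a scratch
// register, add the base to it, and access memory with displacement 0.
inline FrameAccessPlan planFrameAccess(int64_t Offset) {
  const bool Fits = Offset >= std::numeric_limits<int32_t>::min() &&
                    Offset <= std::numeric_limits<int32_t>::max();
  if (Fits)
    return FrameAccessPlan{false, 0, static_cast<int32_t>(Offset)};
  return FrameAccessPlan{true, Offset, 0};
}
```

For the offset 4294967167 from the failing test, this plan would correspond to loading the constant into a scratch register, adding the stack pointer, and storing through the scratch register. Small offsets keep the direct form, so the extra register is only needed in the rare large-frame case.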

llvmbot added backend:ARM backend:AArch64 backend:MSP430 backend:PowerPC backend:X86 debuginfo mc Machine (object) code labels Mar 6, 2024

RKSimon reviewed Mar 6, 2024

llvm/include/llvm/CodeGen/MachineFrameInfo.h (outdated, resolved)

RKSimon reviewed Mar 7, 2024

llvm/include/llvm/CodeGen/MachineFrameInfo.h (outdated, resolved)

MaskRay reviewed Mar 7, 2024

llvm/test/CodeGen/X86/huge-stack.ll (outdated, resolved)

MaskRay reviewed Mar 7, 2024

llvm/test/CodeGen/X86/huge-stack.ll (resolved)

RKSimon reviewed Mar 8, 2024

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (outdated, resolved)

RKSimon reviewed Mar 8, 2024

llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (outdated, resolved)

wesleywiser added 10 commits March 20, 2024 01:15

[nfc] Add x86 test case for huge stacks

e7db86b

Review feedback: use explicit bitwidth types

06335cf

Placate msvc

36f9869

Review feedback: use integer constant function macros

1ddc600

Attempt to fix msvc build

529c2f4

Review feedback: use opaque ptrs

1f7acc3

Review feedback: autogenerate test

cdee65e

Review feedback: long -> int64_t

ad6283b

Use std::abs instead

ac400f8

wesleywiser force-pushed the fix_frame_layout_overflow branch from b51836a to ac400f8 Compare March 20, 2024 02:27

RKSimon approved these changes Mar 20, 2024

chenzheng1030 reviewed Mar 21, 2024

wesleywiser commented Mar 21, 2024

llvm/test/CodeGen/X86/huge-stack.ll (outdated, resolved)

Add --no_x86_scrub_sp to test

d395ab4

RKSimon approved these changes Mar 22, 2024

MaskRay reviewed Mar 26, 2024

MaskRay approved these changes Mar 26, 2024

RKSimon merged commit 58de1e2 into llvm:main Mar 27, 2024
4 checks passed

RKSimon added a commit that referenced this pull request Mar 27, 2024

Revert rG58de1e2c5eee548a9b365e3b1554d87317072ad9 "Fix stack layout f…

78f0871

…or frames larger than 2gb (#84114)" This is failing on some EXPENSIVE_CHECKS buildbots

RKSimon added a commit that referenced this pull request Mar 27, 2024

[X86] Add -verify-machineinstrs to huge stack tests

f92fa7e

Help identify EXPENSIVE_CHECKS regressions identified in #84114
