[StackColoring] Handle SEH catch object stack slots conservatively #66988

nikic · 2023-09-21T07:34:34Z

The write to the SEH catch object happens before cleanuppads are executed, while the first reference to the object will typically be in a catchpad.

If we make use of first-use analysis, we may end up allocating an alloca used inside the cleanuppad and the catch object at the same stack offset, which would be incorrect.

https://reviews.llvm.org/D86673 was a previous attempt to fix it. It used the heuristic "a slot loaded in a WinEH pad and never written" to detect catch objects. However, because it checks for more than one load (while probably more than zero was intended), the fix does not actually work.

The general approach also seems dubious to me, so this patch reverts that change entirely, and instead marks all catch object slots as conservative (i.e. excluded from first-use analysis) based on the WinEHFuncInfo. As far as I can tell we don't need any heuristics here, we know exactly which slots are affected.

Fixes #66984.

llvmbot · 2023-09-21T07:35:41Z

@llvm/pr-subscribers-backend-x86

Changes

The write to the SEH catch object happens before cleanuppads are executed, while the first reference to the object will typically be in a catchpad.

If we make use of first-use analysis, we may end up allocating an alloca used inside the cleanuppad and the catch object at the same stack offset, which would be incorrect.

https://reviews.llvm.org/D86673 was a previous attempt to fix it. It used the heuristic "a slot loaded in a WinEH pad and never written" to detect catch objects. However, because it checks for more than one load (while probably more than zero was intended), the fix does not actually work.

The general approach also seems dubious to me, so this patch reverts that change entirely, and instead marks all catch object slots as conservative (i.e. excluded from first-use analysis) based on the WinEHFuncInfo. As far as I can tell we don't need any heuristics here, we know exactly which slots are affected.

Fixes #66984.

Full diff: https://github.com/llvm/llvm-project/pull/66988.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/StackColoring.cpp (+13-48)
(modified) llvm/test/CodeGen/X86/stack-coloring-seh.ll (+4-4)

diff --git a/llvm/lib/CodeGen/StackColoring.cpp b/llvm/lib/CodeGen/StackColoring.cpp
index 4d69b6104f68372..d90c3e95f477944 100644
--- a/llvm/lib/CodeGen/StackColoring.cpp
+++ b/llvm/lib/CodeGen/StackColoring.cpp
@@ -370,37 +370,6 @@ STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime region");
 // If in RPO ordering chosen to walk the CFG  we happen to visit the b[k]
 // before visiting the memcpy block (which will contain the lifetime start
 // for "b" then it will appear that 'b' has a degenerate lifetime.
-//
-// Handle Windows Exception with LifetimeStartOnFirstUse:
-// -----------------
-//
-// There was a bug for using LifetimeStartOnFirstUse in win32.
-// class Type1 {
-// ...
-// ~Type1(){ write memory;}
-// }
-// ...
-// try{
-// Type1 V
-// ...
-// } catch (Type2 X){
-// ...
-// }
-// For variable X in catch(X), we put point pX=&(&X) into ConservativeSlots
-// to prevent using LifetimeStartOnFirstUse. Because pX may merged with
-// object V which may call destructor after implicitly writing pX. All these
-// are done in C++ EH runtime libs (through CxxThrowException), and can't
-// obviously check it in IR level.
-//
-// The loader of pX, without obvious writing IR, is usually the first LOAD MI
-// in EHPad, Some like:
-// bb.x.catch.i (landing-pad, ehfunclet-entry):
-// ; predecessors: %bb...
-//   successors: %bb...
-//  %n:gr32 = MOV32rm %stack.pX ...
-//  ...
-// The Type2** %stack.pX will only be written in EH runtime libs, so we
-// check the StoreSlots to screen it out.
 
 namespace {
 
@@ -462,9 +431,6 @@ class StackColoring : public MachineFunctionPass {
   /// slots lifetime-start-on-first-use is disabled).
   BitVector ConservativeSlots;
 
-  /// Record the FI slots referenced by a 'may write to memory'.
-  BitVector StoreSlots;
-
   /// Number of iterations taken during data flow analysis.
   unsigned NumIterations;
 
@@ -660,13 +626,10 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
   InterestingSlots.resize(NumSlot);
   ConservativeSlots.clear();
   ConservativeSlots.resize(NumSlot);
-  StoreSlots.clear();
-  StoreSlots.resize(NumSlot);
 
   // number of start and end lifetime ops for each slot
   SmallVector<int, 8> NumStartLifetimes(NumSlot, 0);
   SmallVector<int, 8> NumEndLifetimes(NumSlot, 0);
-  SmallVector<int, 8> NumLoadInCatchPad(NumSlot, 0);
 
   // Step 1: collect markers and populate the "InterestingSlots"
   // and "ConservativeSlots" sets.
@@ -722,13 +685,6 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
           if (! BetweenStartEnd.test(Slot)) {
             ConservativeSlots.set(Slot);
           }
-          // Here we check the StoreSlots to screen catch point out. For more
-          // information, please refer "Handle Windows Exception with
-          // LifetimeStartOnFirstUse" at the head of this file.
-          if (MI.mayStore())
-            StoreSlots.set(Slot);
-          if (MF->getWinEHFuncInfo() && MBB->isEHPad() && MI.mayLoad())
-            NumLoadInCatchPad[Slot] += 1;
         }
       }
     }
@@ -739,14 +695,23 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
     return 0;
   }
 
-  // 1) PR27903: slots with multiple start or end lifetime ops are not
+  // PR27903: slots with multiple start or end lifetime ops are not
   // safe to enable for "lifetime-start-on-first-use".
-  // 2) And also not safe for variable X in catch(X) in windows.
   for (unsigned slot = 0; slot < NumSlot; ++slot) {
-    if (NumStartLifetimes[slot] > 1 || NumEndLifetimes[slot] > 1 ||
-        (NumLoadInCatchPad[slot] > 1 && !StoreSlots.test(slot)))
+    if (NumStartLifetimes[slot] > 1 || NumEndLifetimes[slot] > 1)
       ConservativeSlots.set(slot);
   }
+
+  // The write to the catch object by the personality function is not propely
+  // modeled in IR: It happens before any cleanuppads are executed, even if the
+  // first mention of the catch object is in a catchpad. As such, mark catch
+  // object slots as conservative, so they are excluded from first-use analysis.
+  if (WinEHFuncInfo *EHInfo = MF->getWinEHFuncInfo())
+    for (WinEHTryBlockMapEntry &TBME : EHInfo->TryBlockMap)
+      for (WinEHHandlerType &H : TBME.HandlerArray)
+        if (H.CatchObj.FrameIndex != std::numeric_limits<int>::max())
+          ConservativeSlots.set(H.CatchObj.FrameIndex);
+
   LLVM_DEBUG(dumpBV("Conservative slots", ConservativeSlots));
 
   // Step 2: compute begin/end sets for each block
diff --git a/llvm/test/CodeGen/X86/stack-coloring-seh.ll b/llvm/test/CodeGen/X86/stack-coloring-seh.ll
index 3995bfa8101d69d..892c81a12dc1acf 100644
--- a/llvm/test/CodeGen/X86/stack-coloring-seh.ll
+++ b/llvm/test/CodeGen/X86/stack-coloring-seh.ll
@@ -3,7 +3,7 @@
 
 @type_info = external global ptr
 
-; FIXME: This is a miscompile.
+; Make sure %a1 and %a2 don't share the same stack offset.
 define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-LABEL: pr66984:
 ; CHECK:       # %bb.0: # %bb
@@ -12,7 +12,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:    pushl %ebx
 ; CHECK-NEXT:    pushl %edi
 ; CHECK-NEXT:    pushl %esi
-; CHECK-NEXT:    subl $20, %esp
+; CHECK-NEXT:    subl $24, %esp
 ; CHECK-NEXT:    movl %esp, -28(%ebp)
 ; CHECK-NEXT:    movl $-1, -16(%ebp)
 ; CHECK-NEXT:    leal -24(%ebp), %eax
@@ -31,7 +31,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:  $ehgcr_0_4:
 ; CHECK-NEXT:    movl -24(%ebp), %eax
 ; CHECK-NEXT:    movl %eax, %fs:0
-; CHECK-NEXT:    addl $20, %esp
+; CHECK-NEXT:    addl $24, %esp
 ; CHECK-NEXT:    popl %esi
 ; CHECK-NEXT:    popl %edi
 ; CHECK-NEXT:    popl %ebx
@@ -47,7 +47,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:    pushl %ebp
 ; CHECK-NEXT:    addl $12, %ebp
 ; CHECK-NEXT:    movl %esp, -28(%ebp)
-; CHECK-NEXT:    movl -32(%ebp), %ecx
+; CHECK-NEXT:    movl -36(%ebp), %ecx
 ; CHECK-NEXT:    movl $2, -16(%ebp)
 ; CHECK-NEXT:    calll _cleanup
 ; CHECK-NEXT:    movl $LBB0_3, %eax

rnk

Thanks, looks good.

rnk · 2023-09-21T17:33:23Z

llvm/lib/CodeGen/StackColoring.cpp

      ConservativeSlots.set(slot);
  }
+
+  // The write to the catch object by the personality function is not propely
+  // modeled in IR: It happens before any cleanuppads are executed, even if the


I think proper modelling would mean passing the catch object as a bundled argument to every invoke that unwinds to the catchpad. Every invoke would mention all catch objects of every catch it could unwind to. It would fix the issue, but I worry that the extra clutter would actually make the IR harder for humans to read and understand.

If we find the time to work on the model, catchswitches are another area for improvement. At the time, we didn't have callbr, so we added catchswitch as an unwind edge multiplexer. Having multiple successor edges coming off of an invoke which unwinds to a catchswitch serving multiple catch blocks is more accurate, and would eliminate BBs with no insertion point, which are an ongoing source of WinEH bugs.

We could also do some sort of fixup during ISel, I think? The analysis here is looking at MachineInstrs, so we could add frameindex references to the call instructions.

(I think it's generally a good idea to try to get the MachineIR into the form we want, then consider if we want to make corresponding IR changes.)

That would definitely work from a modelling perspective, and it is what we do for the CFG. We add in all the missing edges from the invokes. I'm not clear on how to add no-op FrameIndex operands to the call instructions, but that's a decent direction.

Do we expect to see problems in passes other than StackColoring?

The somewhat special situation here is that while the invoke effectively writes to the stack slot, code can only rely on that write inside the catchpad. So if we have something like this:

*e = x invoke (unmodeled potential write to e) cleanuppad v = *e catchpad(e)

and then v = *e gets GVNd to x because the write at the invoke is not modeled, I would not really consider that a miscompile.

I've filed #67110 to keep track of this idea.

rnk · 2023-09-21T17:40:22Z

llvm/test/CodeGen/X86/stack-coloring-seh.ll

@@ -3,7 +3,7 @@



This is a bit of a nit, but "seh" in the test name usually refers to either:

Windows call frame info, the unwind opcodes, powered by the .seh_ assembler directive family

Functionality related to __try, non-call exceptions, or EHa, which makes C++ cleanups run for non-call exceptions

For issues related to MSVC C++ EH and _CxxFrameHandler3, I usually just called it "wineh", maybe "wineh-cxx", but you can see we have varying opinions in the test suite currently:

$ git grep -l CxxFrameHand ../llvm/test/CodeGen/X86/ ../llvm/test/CodeGen/X86/branchfolding-catchpads.ll ../llvm/test/CodeGen/X86/catchpad-dynamic-alloca.ll ../llvm/test/CodeGen/X86/catchpad-lifetime.ll ../llvm/test/CodeGen/X86/catchpad-realign-savexmm.ll ../llvm/test/CodeGen/X86/catchpad-regmask.ll ../llvm/test/CodeGen/X86/catchpad-reuse.ll ../llvm/test/CodeGen/X86/catchpad-weight.ll ../llvm/test/CodeGen/X86/catchret-fallthrough.ll ../llvm/test/CodeGen/X86/catchret-regmask.ll ../llvm/test/CodeGen/X86/cfguard-checks-funclet.ll ../llvm/test/CodeGen/X86/cleanuppad-inalloca.ll ../llvm/test/CodeGen/X86/cleanuppad-large-codemodel.ll ../llvm/test/CodeGen/X86/cleanuppad-realign.ll ../llvm/test/CodeGen/X86/deopt-bundles.ll ../llvm/test/CodeGen/X86/ehcontguard.ll ../llvm/test/CodeGen/X86/funclet-layout.ll ../llvm/test/CodeGen/X86/machine-licm-vs-wineh.mir ../llvm/test/CodeGen/X86/noreturn-call-win64.ll ../llvm/test/CodeGen/X86/pop-stack-cleanup-msvc.ll ../llvm/test/CodeGen/X86/pr26757.ll ../llvm/test/CodeGen/X86/pr27501.ll ../llvm/test/CodeGen/X86/pr42064.ll ../llvm/test/CodeGen/X86/pr48064.mir ../llvm/test/CodeGen/X86/regalloc-spill-at-ehpad.ll ../llvm/test/CodeGen/X86/seh-unwind-inline-asm-codegen.ll ../llvm/test/CodeGen/X86/stack-coloring-seh.ll ../llvm/test/CodeGen/X86/tail-dup-catchret.ll ../llvm/test/CodeGen/X86/tail-merge-wineh.ll ../llvm/test/CodeGen/X86/win-catchpad-csrs.ll ../llvm/test/CodeGen/X86/win-catchpad-nested-cxx.ll ../llvm/test/CodeGen/X86/win-catchpad-varargs.ll ../llvm/test/CodeGen/X86/win-catchpad.ll ../llvm/test/CodeGen/X86/win-cleanuppad.ll ../llvm/test/CodeGen/X86/win-funclet-cfi.ll ../llvm/test/CodeGen/X86/win32-eh-available-externally.ll ../llvm/test/CodeGen/X86/win32-eh-states.ll ../llvm/test/CodeGen/X86/win32-eh.ll ../llvm/test/CodeGen/X86/win64-eh-empty-block-2.mir ../llvm/test/CodeGen/X86/win64-funclet-preisel-intrinsics.ll ../llvm/test/CodeGen/X86/win64-funclet-savexmm.ll ../llvm/test/CodeGen/X86/win64_eh_leaf2.ll ../llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll ../llvm/test/CodeGen/X86/windows-seh-EHa-CppCondiTemps.ll ../llvm/test/CodeGen/X86/windows-seh-EHa-CppDtors01.ll ../llvm/test/CodeGen/X86/wineh-no-ehpads.ll

There isn't a huge amount of consistency here, do whatever you feel is best.

The write to the SEH catch object happens before cleanuppads are executed, while the first reference to the object will typically be in a catchpad. If we make use of first-use analysis, we may end up allocating an alloca used inside the cleanuppad and the catch object at the same stack offset, which would be incorrect. https://reviews.llvm.org/D86673 was a previous attempt to fix it. It used the heuristic "a slot loaded in a WinEH pad and never written" to detect catch objects. However, because it checks for more than one load (while probably more than zero was intended), the fix does not actually work. The general approach also seems dubious to me, so this patch reverts that change entirely, and instead marks all catch object slots as conservative (i.e. excluded from first-use analysis) based on the WinEHFuncInfo. As far as I can tell we don't need any heuristics here, we know exactly which slots are affected. Fixes llvm#66984.

This is a followup to #66988. The implementation there did not account for the possibility of the catch object frame index referrring to a fixed object, which is the case on win64.

This is a followup to #66988. The implementation there did not account for the possibility of the catch object frame index referrring to a fixed object, which is the case on win64. (cherry picked from commit aa70f4d)

nikic requested review from rnk and efriedma-quic September 21, 2023 07:34

llvmbot added the backend:X86 label Sep 21, 2023

rnk approved these changes Sep 21, 2023

View reviewed changes

nikic force-pushed the stack-coloring-seh-fix branch from a47fa1e to b3cb4f0 Compare September 22, 2023 09:50

nikic merged commit b3cb4f0 into llvm:main Sep 22, 2023
2 of 3 checks passed

nikic mentioned this pull request Sep 22, 2023

[WinEH] Model catch object write in MIR #67110

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[StackColoring] Handle SEH catch object stack slots conservatively #66988

[StackColoring] Handle SEH catch object stack slots conservatively #66988

nikic commented Sep 21, 2023

llvmbot commented Sep 21, 2023

rnk left a comment

rnk Sep 21, 2023

efriedma-quic Sep 21, 2023

rnk Sep 21, 2023

nikic Sep 21, 2023

nikic Sep 22, 2023

rnk Sep 21, 2023

[StackColoring] Handle SEH catch object stack slots conservatively #66988

[StackColoring] Handle SEH catch object stack slots conservatively #66988

Conversation

nikic commented Sep 21, 2023

llvmbot commented Sep 21, 2023

rnk left a comment

Choose a reason for hiding this comment

rnk Sep 21, 2023

Choose a reason for hiding this comment

efriedma-quic Sep 21, 2023

Choose a reason for hiding this comment

rnk Sep 21, 2023

Choose a reason for hiding this comment

nikic Sep 21, 2023

Choose a reason for hiding this comment

nikic Sep 22, 2023

Choose a reason for hiding this comment

rnk Sep 21, 2023

Choose a reason for hiding this comment