Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[StackColoring] Handle SEH catch object stack slots conservatively #66988

Merged
merged 1 commit into from Sep 22, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Sep 21, 2023

The write to the SEH catch object happens before cleanuppads are executed, while the first reference to the object will typically be in a catchpad.

If we make use of first-use analysis, we may end up allocating an alloca used inside the cleanuppad and the catch object at the same stack offset, which would be incorrect.

https://reviews.llvm.org/D86673 was a previous attempt to fix it. It used the heuristic "a slot loaded in a WinEH pad and never written" to detect catch objects. However, because it checks for more than one load (while probably more than zero was intended), the fix does not actually work.

The general approach also seems dubious to me, so this patch reverts that change entirely, and instead marks all catch object slots as conservative (i.e. excluded from first-use analysis) based on the WinEHFuncInfo. As far as I can tell we don't need any heuristics here, we know exactly which slots are affected.

Fixes #66984.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 21, 2023

@llvm/pr-subscribers-backend-x86

Changes

The write to the SEH catch object happens before cleanuppads are executed, while the first reference to the object will typically be in a catchpad.

If we make use of first-use analysis, we may end up allocating an alloca used inside the cleanuppad and the catch object at the same stack offset, which would be incorrect.

https://reviews.llvm.org/D86673 was a previous attempt to fix it. It used the heuristic "a slot loaded in a WinEH pad and never written" to detect catch objects. However, because it checks for more than one load (while probably more than zero was intended), the fix does not actually work.

The general approach also seems dubious to me, so this patch reverts that change entirely, and instead marks all catch object slots as conservative (i.e. excluded from first-use analysis) based on the WinEHFuncInfo. As far as I can tell we don't need any heuristics here, we know exactly which slots are affected.

Fixes #66984.


Full diff: https://github.com/llvm/llvm-project/pull/66988.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/StackColoring.cpp (+13-48)
  • (modified) llvm/test/CodeGen/X86/stack-coloring-seh.ll (+4-4)
diff --git a/llvm/lib/CodeGen/StackColoring.cpp b/llvm/lib/CodeGen/StackColoring.cpp
index 4d69b6104f68372..d90c3e95f477944 100644
--- a/llvm/lib/CodeGen/StackColoring.cpp
+++ b/llvm/lib/CodeGen/StackColoring.cpp
@@ -370,37 +370,6 @@ STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime region");
 // If in RPO ordering chosen to walk the CFG  we happen to visit the b[k]
 // before visiting the memcpy block (which will contain the lifetime start
 // for "b" then it will appear that 'b' has a degenerate lifetime.
-//
-// Handle Windows Exception with LifetimeStartOnFirstUse:
-// -----------------
-//
-// There was a bug for using LifetimeStartOnFirstUse in win32.
-// class Type1 {
-// ...
-// ~Type1(){ write memory;}
-// }
-// ...
-// try{
-// Type1 V
-// ...
-// } catch (Type2 X){
-// ...
-// }
-// For variable X in catch(X), we put point pX=&(&X) into ConservativeSlots
-// to prevent using LifetimeStartOnFirstUse. Because pX may merged with
-// object V which may call destructor after implicitly writing pX. All these
-// are done in C++ EH runtime libs (through CxxThrowException), and can't
-// obviously check it in IR level.
-//
-// The loader of pX, without obvious writing IR, is usually the first LOAD MI
-// in EHPad, Some like:
-// bb.x.catch.i (landing-pad, ehfunclet-entry):
-// ; predecessors: %bb...
-//   successors: %bb...
-//  %n:gr32 = MOV32rm %stack.pX ...
-//  ...
-// The Type2** %stack.pX will only be written in EH runtime libs, so we
-// check the StoreSlots to screen it out.
 
 namespace {
 
@@ -462,9 +431,6 @@ class StackColoring : public MachineFunctionPass {
   /// slots lifetime-start-on-first-use is disabled).
   BitVector ConservativeSlots;
 
-  /// Record the FI slots referenced by a 'may write to memory'.
-  BitVector StoreSlots;
-
   /// Number of iterations taken during data flow analysis.
   unsigned NumIterations;
 
@@ -660,13 +626,10 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
   InterestingSlots.resize(NumSlot);
   ConservativeSlots.clear();
   ConservativeSlots.resize(NumSlot);
-  StoreSlots.clear();
-  StoreSlots.resize(NumSlot);
 
   // number of start and end lifetime ops for each slot
   SmallVector<int, 8> NumStartLifetimes(NumSlot, 0);
   SmallVector<int, 8> NumEndLifetimes(NumSlot, 0);
-  SmallVector<int, 8> NumLoadInCatchPad(NumSlot, 0);
 
   // Step 1: collect markers and populate the "InterestingSlots"
   // and "ConservativeSlots" sets.
@@ -722,13 +685,6 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
           if (! BetweenStartEnd.test(Slot)) {
             ConservativeSlots.set(Slot);
           }
-          // Here we check the StoreSlots to screen catch point out. For more
-          // information, please refer "Handle Windows Exception with
-          // LifetimeStartOnFirstUse" at the head of this file.
-          if (MI.mayStore())
-            StoreSlots.set(Slot);
-          if (MF->getWinEHFuncInfo() && MBB->isEHPad() && MI.mayLoad())
-            NumLoadInCatchPad[Slot] += 1;
         }
       }
     }
@@ -739,14 +695,23 @@ unsigned StackColoring::collectMarkers(unsigned NumSlot) {
     return 0;
   }
 
-  // 1) PR27903: slots with multiple start or end lifetime ops are not
+  // PR27903: slots with multiple start or end lifetime ops are not
   // safe to enable for "lifetime-start-on-first-use".
-  // 2) And also not safe for variable X in catch(X) in windows.
   for (unsigned slot = 0; slot < NumSlot; ++slot) {
-    if (NumStartLifetimes[slot] > 1 || NumEndLifetimes[slot] > 1 ||
-        (NumLoadInCatchPad[slot] > 1 && !StoreSlots.test(slot)))
+    if (NumStartLifetimes[slot] > 1 || NumEndLifetimes[slot] > 1)
       ConservativeSlots.set(slot);
   }
+
+  // The write to the catch object by the personality function is not propely
+  // modeled in IR: It happens before any cleanuppads are executed, even if the
+  // first mention of the catch object is in a catchpad. As such, mark catch
+  // object slots as conservative, so they are excluded from first-use analysis.
+  if (WinEHFuncInfo *EHInfo = MF->getWinEHFuncInfo())
+    for (WinEHTryBlockMapEntry &TBME : EHInfo->TryBlockMap)
+      for (WinEHHandlerType &H : TBME.HandlerArray)
+        if (H.CatchObj.FrameIndex != std::numeric_limits<int>::max())
+          ConservativeSlots.set(H.CatchObj.FrameIndex);
+
   LLVM_DEBUG(dumpBV("Conservative slots", ConservativeSlots));
 
   // Step 2: compute begin/end sets for each block
diff --git a/llvm/test/CodeGen/X86/stack-coloring-seh.ll b/llvm/test/CodeGen/X86/stack-coloring-seh.ll
index 3995bfa8101d69d..892c81a12dc1acf 100644
--- a/llvm/test/CodeGen/X86/stack-coloring-seh.ll
+++ b/llvm/test/CodeGen/X86/stack-coloring-seh.ll
@@ -3,7 +3,7 @@
 
 @type_info = external global ptr
 
-; FIXME: This is a miscompile.
+; Make sure %a1 and %a2 don't share the same stack offset.
 define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-LABEL: pr66984:
 ; CHECK:       # %bb.0: # %bb
@@ -12,7 +12,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:    pushl %ebx
 ; CHECK-NEXT:    pushl %edi
 ; CHECK-NEXT:    pushl %esi
-; CHECK-NEXT:    subl $20, %esp
+; CHECK-NEXT:    subl $24, %esp
 ; CHECK-NEXT:    movl %esp, -28(%ebp)
 ; CHECK-NEXT:    movl $-1, -16(%ebp)
 ; CHECK-NEXT:    leal -24(%ebp), %eax
@@ -31,7 +31,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:  $ehgcr_0_4:
 ; CHECK-NEXT:    movl -24(%ebp), %eax
 ; CHECK-NEXT:    movl %eax, %fs:0
-; CHECK-NEXT:    addl $20, %esp
+; CHECK-NEXT:    addl $24, %esp
 ; CHECK-NEXT:    popl %esi
 ; CHECK-NEXT:    popl %edi
 ; CHECK-NEXT:    popl %ebx
@@ -47,7 +47,7 @@ define void @pr66984(ptr %arg) personality ptr @__CxxFrameHandler3 {
 ; CHECK-NEXT:    pushl %ebp
 ; CHECK-NEXT:    addl $12, %ebp
 ; CHECK-NEXT:    movl %esp, -28(%ebp)
-; CHECK-NEXT:    movl -32(%ebp), %ecx
+; CHECK-NEXT:    movl -36(%ebp), %ecx
 ; CHECK-NEXT:    movl $2, -16(%ebp)
 ; CHECK-NEXT:    calll _cleanup
 ; CHECK-NEXT:    movl $LBB0_3, %eax

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good.

ConservativeSlots.set(slot);
}

// The write to the catch object by the personality function is not propely
// modeled in IR: It happens before any cleanuppads are executed, even if the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think proper modelling would mean passing the catch object as a bundled argument to every invoke that unwinds to the catchpad. Every invoke would mention all catch objects of every catch it could unwind to. It would fix the issue, but I worry that the extra clutter would actually make the IR harder for humans to read and understand.

If we find the time to work on the model, catchswitches are another area for improvement. At the time, we didn't have callbr, so we added catchswitch as an unwind edge multiplexer. Having multiple successor edges coming off of an invoke which unwinds to a catchswitch serving multiple catch blocks is more accurate, and would eliminate BBs with no insertion point, which are an ongoing source of WinEH bugs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also do some sort of fixup during ISel, I think? The analysis here is looking at MachineInstrs, so we could add frameindex references to the call instructions.

(I think it's generally a good idea to try to get the MachineIR into the form we want, then consider if we want to make corresponding IR changes.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would definitely work from a modelling perspective, and it is what we do for the CFG. We add in all the missing edges from the invokes. I'm not clear on how to add no-op FrameIndex operands to the call instructions, but that's a decent direction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect to see problems in passes other than StackColoring?

The somewhat special situation here is that while the invoke effectively writes to the stack slot, code can only rely on that write inside the catchpad. So if we have something like this:

*e = x
invoke (unmodeled potential write to e)
cleanuppad
 v = *e
catchpad(e)

and then v = *e gets GVNd to x because the write at the invoke is not modeled, I would not really consider that a miscompile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've filed #67110 to keep track of this idea.

@@ -3,7 +3,7 @@

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a nit, but "seh" in the test name usually refers to either:

  • Windows call frame info, the unwind opcodes, powered by the .seh_ assembler directive family
  • Functionality related to __try, non-call exceptions, or EHa, which makes C++ cleanups run for non-call exceptions

For issues related to MSVC C++ EH and _CxxFrameHandler3, I usually just called it "wineh", maybe "wineh-cxx", but you can see we have varying opinions in the test suite currently:

$ git grep -l CxxFrameHand ../llvm/test/CodeGen/X86/
../llvm/test/CodeGen/X86/branchfolding-catchpads.ll
../llvm/test/CodeGen/X86/catchpad-dynamic-alloca.ll
../llvm/test/CodeGen/X86/catchpad-lifetime.ll
../llvm/test/CodeGen/X86/catchpad-realign-savexmm.ll
../llvm/test/CodeGen/X86/catchpad-regmask.ll
../llvm/test/CodeGen/X86/catchpad-reuse.ll
../llvm/test/CodeGen/X86/catchpad-weight.ll
../llvm/test/CodeGen/X86/catchret-fallthrough.ll
../llvm/test/CodeGen/X86/catchret-regmask.ll
../llvm/test/CodeGen/X86/cfguard-checks-funclet.ll
../llvm/test/CodeGen/X86/cleanuppad-inalloca.ll
../llvm/test/CodeGen/X86/cleanuppad-large-codemodel.ll
../llvm/test/CodeGen/X86/cleanuppad-realign.ll
../llvm/test/CodeGen/X86/deopt-bundles.ll
../llvm/test/CodeGen/X86/ehcontguard.ll
../llvm/test/CodeGen/X86/funclet-layout.ll
../llvm/test/CodeGen/X86/machine-licm-vs-wineh.mir
../llvm/test/CodeGen/X86/noreturn-call-win64.ll
../llvm/test/CodeGen/X86/pop-stack-cleanup-msvc.ll
../llvm/test/CodeGen/X86/pr26757.ll
../llvm/test/CodeGen/X86/pr27501.ll
../llvm/test/CodeGen/X86/pr42064.ll
../llvm/test/CodeGen/X86/pr48064.mir
../llvm/test/CodeGen/X86/regalloc-spill-at-ehpad.ll
../llvm/test/CodeGen/X86/seh-unwind-inline-asm-codegen.ll
../llvm/test/CodeGen/X86/stack-coloring-seh.ll
../llvm/test/CodeGen/X86/tail-dup-catchret.ll
../llvm/test/CodeGen/X86/tail-merge-wineh.ll
../llvm/test/CodeGen/X86/win-catchpad-csrs.ll
../llvm/test/CodeGen/X86/win-catchpad-nested-cxx.ll
../llvm/test/CodeGen/X86/win-catchpad-varargs.ll
../llvm/test/CodeGen/X86/win-catchpad.ll
../llvm/test/CodeGen/X86/win-cleanuppad.ll
../llvm/test/CodeGen/X86/win-funclet-cfi.ll
../llvm/test/CodeGen/X86/win32-eh-available-externally.ll
../llvm/test/CodeGen/X86/win32-eh-states.ll
../llvm/test/CodeGen/X86/win32-eh.ll
../llvm/test/CodeGen/X86/win64-eh-empty-block-2.mir
../llvm/test/CodeGen/X86/win64-funclet-preisel-intrinsics.ll
../llvm/test/CodeGen/X86/win64-funclet-savexmm.ll
../llvm/test/CodeGen/X86/win64_eh_leaf2.ll
../llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll
../llvm/test/CodeGen/X86/windows-seh-EHa-CppCondiTemps.ll
../llvm/test/CodeGen/X86/windows-seh-EHa-CppDtors01.ll
../llvm/test/CodeGen/X86/wineh-no-ehpads.ll

There isn't a huge amount of consistency here, do whatever you feel is best.

The write to the SEH catch object happens before cleanuppads are
executed, while the first reference to the object will typically
be in a catchpad.

If we make use of first-use analysis, we may end up allocating
an alloca used inside the cleanuppad and the catch object at the
same stack offset, which would be incorrect.

https://reviews.llvm.org/D86673 was a previous attempt to fix it.
It used the heuristic "a slot loaded in a WinEH pad and never
written" to detect catch objects. However, because it checks
for more than one load (while probably more than zero was
intended), the fix does not actually work.

The general approach also seems dubious to me, so this patch
reverts that change entirely, and instead marks all catch object
slots as conservative (i.e. excluded from first-use analysis)
based on the WinEHFuncInfo. As far as I can tell we don't need
any heuristics here, we know exactly which slots are affected.

Fixes llvm#66984.
@nikic nikic merged commit b3cb4f0 into llvm:main Sep 22, 2023
2 of 3 checks passed
nikic added a commit that referenced this pull request Sep 22, 2023
This is a followup to #66988. The implementation there did not
account for the possibility of the catch object frame index
referrring to a fixed object, which is the case on win64.
tru pushed a commit that referenced this pull request Sep 27, 2023
This is a followup to #66988. The implementation there did not
account for the possibility of the catch object frame index
referrring to a fixed object, which is the case on win64.

(cherry picked from commit aa70f4d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect stack coloring for alloca in SEH cleanuppad
4 participants