Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[codegen] Emit missing cleanups when an expression contains a branch #80698

Closed
wants to merge 15 commits into from

Conversation

usx95
Copy link
Contributor

@usx95 usx95 commented Feb 5, 2024

Fixes: #63818 for control flow in initializations.

A control flow could happen in the middle of an expression due to stmt-expr and coroutine suspensions. These are henceforth called branch-in-expr.

Due to branch-in-expr, we miss running cleanups for the temporaries constructed in the expression before thebranch-in-expr.
Under usual circumstances (without branch-in-expr), these cleanups are only scheduled (i.e. added to EHStack) after the full expression.

Examples of such deferred cleanups include:

  • ParenList/InitList: Cleanups for fields are performed by the destructor of the object being constructed.
  • Array init: Cleanup for elements of an array is included in the array cleanup.
  • Lifetime-extended temporaries: reference-binding temporaries in braced-init are lifetime extended to the parent scope.
  • Lambda capture init: init in lambda capture list are destroyed by the lambda object.

In this PR, we emit the deferred BranchInExpr cleanups if they are present while emitting a branch.
Such cleanups were previously handled only as EHOnly cleanups for exceptions. With stmt-expr and coroutine suspensions, this becomes important to emit also for non-exceptional branches in expressions.

(This doesn't work for control flow through goto).

@usx95 usx95 added clang:codegen coroutines C++20 coroutines labels Feb 5, 2024
@usx95 usx95 self-assigned this Feb 5, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Feb 5, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 5, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-coroutines

@llvm/pr-subscribers-clang-codegen

Author: Utkarsh Saxena (usx95)

Changes

Partially solves #63818 for control flow in braced aggregate initialisation.

The reason why we miss cleanups for temporaries in braced aggregate init is that the lifetime of these temporaries is extended, and their cleanup is scheduled (added to EHStack) after the full expression.

When such expressions have control flow (through return or co_await), we only emit the EHStack cleanups which does not contain the LifetimeExtendedCleanupStack cleanups.

In this PR, we forcefully emit LifetimeExtendedCleanupStack if they are present while emitting a branch.

(This still does not work for control flow in array-like list initialization expressions.)


Full diff: https://github.com/llvm/llvm-project/pull/80698.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/CGCleanup.cpp (+22-9)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+4)
  • (added) clang/test/CodeGenCXX/return-in-stmt-expr-cleanup.cpp (+37)
  • (added) clang/test/CodeGenCoroutines/coro-suspend-in-agg-init.cpp (+82)
diff --git a/clang/lib/CodeGen/CGCleanup.cpp b/clang/lib/CodeGen/CGCleanup.cpp
index f87caf050eeaa..da0528b271aa3 100644
--- a/clang/lib/CodeGen/CGCleanup.cpp
+++ b/clang/lib/CodeGen/CGCleanup.cpp
@@ -488,16 +488,11 @@ void CodeGenFunction::PopCleanupBlocks(
   }
 }
 
-/// Pops cleanup blocks until the given savepoint is reached, then add the
-/// cleanups from the given savepoint in the lifetime-extended cleanups stack.
-void CodeGenFunction::PopCleanupBlocks(
-    EHScopeStack::stable_iterator Old, size_t OldLifetimeExtendedSize,
-    std::initializer_list<llvm::Value **> ValuesToReload) {
-  PopCleanupBlocks(Old, ValuesToReload);
-
-  // Move our deferred cleanups onto the EH stack.
+/// Adds deferred lifetime-extended cleanups onto the EH stack.
+void CodeGenFunction::AddLifetimeExtendedCleanups(size_t OldLifetimeExtendedSize) {
   for (size_t I = OldLifetimeExtendedSize,
-              E = LifetimeExtendedCleanupStack.size(); I != E; /**/) {
+              E = LifetimeExtendedCleanupStack.size();
+       I != E;) {
     // Alignment should be guaranteed by the vptrs in the individual cleanups.
     assert((I % alignof(LifetimeExtendedCleanupHeader) == 0) &&
            "misaligned cleanup stack entry");
@@ -519,6 +514,17 @@ void CodeGenFunction::PopCleanupBlocks(
       I += sizeof(ActiveFlag);
     }
   }
+}
+
+/// Pops cleanup blocks until the given savepoint is reached, then add the
+/// cleanups from the given savepoint in the lifetime-extended cleanups stack.
+void CodeGenFunction::PopCleanupBlocks(
+    EHScopeStack::stable_iterator Old, size_t OldLifetimeExtendedSize,
+    std::initializer_list<llvm::Value **> ValuesToReload) {
+  PopCleanupBlocks(Old, ValuesToReload);
+
+  // Move our deferred cleanups onto the EH stack.
+  AddLifetimeExtendedCleanups(OldLifetimeExtendedSize);
   LifetimeExtendedCleanupStack.resize(OldLifetimeExtendedSize);
 }
 
@@ -1102,6 +1108,13 @@ void CodeGenFunction::EmitBranchThroughCleanup(JumpDest Dest) {
   if (!HaveInsertPoint())
     return;
 
+  // If we have lifetime-extended (LE) cleanups, then we must be emitting a
+  // branch within an expression. Emit all the LE cleanups by adding them to the
+  // EHStack. Do not remove them from lifetime-extended stack, they need to be
+  // emitted again after the expression completes.
+  RunCleanupsScope LifetimeExtendedCleanups(*this);
+  AddLifetimeExtendedCleanups(0);
+
   // Create the branch.
   llvm::BranchInst *BI = Builder.CreateBr(Dest.getBlock());
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 143ad64e8816b..ac55e84034bc6 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -1143,6 +1143,10 @@ class CodeGenFunction : public CodeGenTypeCache {
   PopCleanupBlocks(EHScopeStack::stable_iterator OldCleanupStackSize,
                    std::initializer_list<llvm::Value **> ValuesToReload = {});
 
+  /// Adds lifetime-extended cleanups from the given position to the stack.
+  /// (does not remove the cleanups from lifetime extended stack).
+  void AddLifetimeExtendedCleanups(size_t OldLifetimeExtendedSize);
+
   /// Takes the old cleanup stack size and emits the cleanup blocks
   /// that have been added, then adds all lifetime-extended cleanups from
   /// the given position to the stack.
diff --git a/clang/test/CodeGenCXX/return-in-stmt-expr-cleanup.cpp b/clang/test/CodeGenCXX/return-in-stmt-expr-cleanup.cpp
new file mode 100644
index 0000000000000..214becd81e61a
--- /dev/null
+++ b/clang/test/CodeGenCXX/return-in-stmt-expr-cleanup.cpp
@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
+
+// Context: GH63818
+struct Printy {
+  ~Printy() { }
+};
+
+struct Printies {
+  const Printy &a;
+  const Printy &b;
+  ~Printies() {}
+};
+
+bool foo();
+
+void bar() {
+  Printies p2{
+    // CHECK: store ptr %ref.tmp
+    Printy(), 
+    ({
+      if(foo()) {
+        // CHECK-LABEL: if.then:
+        // CHECK-NEXT: call void @_ZN6PrintyD1Ev
+        // CHECK-NEXT: br label %return
+        return;
+      }
+      // CHECK-LABEL: if.end:
+      // CHECK-NEXT: store ptr %ref.tmp1
+      Printy();
+    })};
+  // CHECK-NEXT: call void @_ZN8PrintiesD1Ev
+  // CHECK-NEXT: call void @_ZN6PrintyD1Ev
+  // CHECK-NEXT: call void @_ZN6PrintyD1Ev
+  // CHECK-NEXT: br label %return
+  return;
+}
+
diff --git a/clang/test/CodeGenCoroutines/coro-suspend-in-agg-init.cpp b/clang/test/CodeGenCoroutines/coro-suspend-in-agg-init.cpp
new file mode 100644
index 0000000000000..362e212b7fba3
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-suspend-in-agg-init.cpp
@@ -0,0 +1,82 @@
+// RUN: %clang_cc1 --std=c++20 -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
+
+// Context: GH63818
+
+#include "Inputs/coroutine.h"
+
+struct coroutine {
+  struct promise_type;
+  std::coroutine_handle<promise_type> handle;
+};
+
+struct coroutine::promise_type {
+  coroutine get_return_object() {
+    return {std::coroutine_handle<promise_type>::from_promise(*this)};
+  }
+  std::suspend_never initial_suspend() noexcept { return {}; }
+  std::suspend_always final_suspend() noexcept { return {}; }
+  void return_void() {}
+  void unhandled_exception() {}
+};
+
+struct Printy { ~Printy(); };
+
+struct Printies {
+  const Printy &a;
+  const Printy &b;
+  const Printy &c;
+};
+
+struct Awaiter : std::suspend_always {
+  Printy await_resume() { return {}; }
+};
+
+// CHECK: define dso_local ptr @_Z5test1v()
+coroutine test1() {
+  // CHECK-NOT: @_ZN6PrintyD1Ev
+  Printies p1{
+    Printy(),
+    co_await Awaiter{},
+    // CHECK:       await.cleanup:
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev
+    // CHECK-NEXT:    br label %cleanup{{.*}}.from.await.cleanup
+    // CHECK-NOT: @_ZN6PrintyD1Ev
+
+    co_await Awaiter{}
+    // CHECK:       await2.cleanup:
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev
+    // CHECK-NEXT:    br label %cleanup{{.*}}.from.await2.cleanup
+    // CHECK-NOT: @_ZN6PrintyD1Ev
+  };
+
+  // CHECK-COUNT-3:       call void @_ZN6PrintyD1Ev
+  // CHECK-NEXT:          br label
+
+  // CHECK-NOT: @_ZN6PrintyD1Ev
+
+  // CHECK: unreachable:
+}
+
+void bar(const Printy& a, const Printy& b);
+
+// CHECK: define dso_local ptr @_Z5test2v()
+coroutine test2() {
+  // CHECK-NOT: @_ZN6PrintyD1Ev
+  bar(
+    Printy(),
+    co_await Awaiter{}
+    // CHECK:       await.cleanup:
+    // CHECK-NEXT:    br label %cleanup{{.*}}.from.await.cleanup
+    // CHECK-NOT: @_ZN6PrintyD1Ev
+  );
+  // CHECK: await.ready:
+  // CHECK:   call void @_ZN6PrintyD1Ev
+  // CHECK-NOT: @_ZN6PrintyD1Ev
+
+  // CHECK: cleanup{{.*}}:
+  // CHECK:   call void @_ZN6PrintyD1Ev
+  // CHECK-NOT: @_ZN6PrintyD1Ev
+
+  // CHECK: unreachable:
+}

Copy link

github-actions bot commented Feb 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@usx95 usx95 changed the title [codegen] Emit cleanups for lifetime-extended temporaries when stmt-expr has control-flow [codegen] Emit cleanups for lifetime-extended temporaries when an expr contains control-flow Feb 5, 2024
@efriedma-quic
Copy link
Collaborator

If I understand correctly, a "lifetime-extended" cleanup deals with the case of a temporary whose lifetime continues beyond the expression. In other words, it has different lifetimes depending on how you exit the expression: if the variable's lifetime begins, it lasts until the end of that variable's lifetime. Otherwise, it only lasts until the end of the full-expression. Due to the way it's structured, the current mechanism only works correctly with thrown exceptions; it doesn't work with other ways of exiting an expression. This patch attempts to fix that.

Passing "0" to AddLifetimeExtendedCleanups() seems wrong. For example, suppose you have the following:

int foo();
struct Printy {
  ~Printy() { }
};
struct Printies {
  const Printy &a;
  const Printy &b;
  ~Printies() {}
};
void f() {
  Printies p2{
    Printy(), 
    ({
      for (int i = 0; i < 10; ++i) {
        Printies p3{
          Printy(),
          ({
          if(foo()) {
            break;
          }
          Printy();
        })};
      }
      Printy();
    })};
}

The "break" needs to run some of the cleanups associated with lifetime-extended temporaries... but not all of them.

We might want to consider making the whole process of lifetime extensions a bit more explicit. Currently, we lazily construct the set of temporaries as we evaluate the initializer, then reconstruct the cleanup stack at the end of the full-expression to reflect the lifetime extension. Instead, we could have some separate code to compute the list of lifetime-extended temporaries associated with a given variable, and create cleanups for them before we start emitting code for the initializer. That way, we don't need to mess with the cleanup stack later; we just need to make sure we correctly skip executing the cleanups which aren't supposed to run.

See also #12658.

@usx95
Copy link
Contributor Author

usx95 commented Feb 6, 2024

Thanks for the example. I was only considering branches leading outside function scope (return and coroutine destruction while co_await-ing).

I have captured the lifetime-extended cleanup stack size expected by the jump destination. This seems to generalize well to all kinds of branch instructions.
PTAL

@@ -628,7 +628,7 @@ CodeGenFunction::getJumpDestForLabel(const LabelDecl *D) {

// Create, but don't insert, the new block.
Dest = JumpDest(createBasicBlock(D->getName()),
EHScopeStack::stable_iterator::invalid(),
EHScopeStack::stable_iterator::invalid(), 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"0" seems wrong; goto out of a StmtExpr is legal, the same way "break" is legal. But I guess we can't actually compute the correct number here; maybe we need to do something with BranchFixups?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm okay with implementing a fix with restricted scope if you want to get something into the LLVM 18 branch. But before we decide that, I'd like some idea of what's required to actually implement this completely.)

@usx95 usx95 changed the title [codegen] Emit cleanups for lifetime-extended temporaries when an expr contains control-flow [codegen] Emit missing cleanups when an expression contains a branch Feb 12, 2024
@usx95
Copy link
Contributor Author

usx95 commented Feb 12, 2024

I updated the patch to work for all the initializations that were missing cleanups due to branch in an expression.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add handling for new int[](co_await foo()) (CodeGenFunction::EmitNewArrayInitializer)?

@@ -627,9 +627,11 @@ CodeGenFunction::getJumpDestForLabel(const LabelDecl *D) {
if (Dest.isValid()) return Dest;

// Create, but don't insert, the new block.
// FIXME: We do not know `BranchInExprDepth` for the destination and currently
// emit *all* the BranchInExpr cleanups.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly describe what you expect the fix for this to look like, if you have some idea?

Copy link
Contributor Author

@usx95 usx95 Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remember/copy the entire BranchInExpr stack in a branch-fix-up and then emit the cleanups from this stored BranchInExpr stack up until the now-known destination depth.
This is because we simply pop the BranchInExpr stack on leaving an expression, but in this case, we need to remember which cleanups were popped earlier.

(This would not work if the goto destination is inside a different stmt-expr. In this case, the code is invalid anyways.)

Probably need to rethink how BranchInExpr is emitted to avoid copying the complete stack each time.

// Schedule to emit element cleanup if we see a branch in the array
// initialisation expression.
if (CGF.needsBranchCleanup(dtorKind))
CGF.pushDestroy(BranchInExprCleanup, address, elementType,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on pushing a cleanup for each array element, vs. using a single cleanup which iterates over the elements, like we do for exceptions?

Is there any chance an ArrayFiller needs a branch cleanup? Off the top of my head, I don't think it's possible because branches exits can only happen in code explicitly written in the current function, but that's a subtle invariant.

Copy link
Contributor Author

@usx95 usx95 Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Element wise cleanup Vs iteration-style cleanup.

I had considered using IrregularPartialArrayDestroy instead of element-wise destroy. The catch is that this cleanup expects allocated array.init.end to store the current end pointer. This is updated as we iterate over and create the elements.
I do not think we should add these "stores" overhead, especially with -fno-exceptions when there are no branches in expr (the simple cases).

What we could be doing instead is just considering the current array index and using that for the array end address.
But that seems to be warned against in the comments:

      // In principle we could tell the Cleanup where we are more
      // directly, but the control flow can get so varied here that it
      // would actually be quite complex.  Therefore we go through an
      // alloca.

I do not understand the "complex" flows where this could go wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complexity here is basically that we'd have to reimplement mem2reg to compute the correct number of cleanups along every control-flow path. Instead, we store to an alloca, and just let the mem2reg optimization clean it up.

On a similar note, because mem2reg cleans up the stores, their overhead is minimal.

Copy link
Contributor Author

@usx95 usx95 Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please tell me how to verify that mem2reg will clear these stores.
If we can use this cleanup as-is, then we should be able to reuse the EH-only cleanups without introducing another cleanup stack (as also pointed out by @jyknight).

(I was not able to remove the stores by bin/opt -S -passes=mem2reg. I am new to Codegen so I might be missing something.)

It is important that we do not introduce these stores with exceptions disabled. But from the looks of it, it should be a easy optimisation pass as the store would never be read in normal circumstances.

This is the only blocker in the reusing EH-only cleanups.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you generated the IR correctly? Note that clang marks everything optnone at -O0 by default.

That said, I guess -O0 codegen is a potential issue; I don't think we run any passes that would clean up the dead stores in that case. I'd only be concerned about the case where we don't actually emit the cleanup; trying to minimize the codesize of the cleanup code isn't important at -O0.

To address the case of no cleanups, maybe we could generate the stores lazily? Basically, save where the stores should be, but don't generate them until we know the cleanup is actually used. Or equivalently, emit the stores, then erase them if the cleanup isn't emitted.

@jyknight
Copy link
Member

Is there a valid use for having "EHCleanup" that doesn't handle branches across it? That is, do we ever need a cleanup to be called only for an exception thrown, and not otherwise leaving the scope? I'm just wondering if we can simplify things conceptually here and remove an option.

The only case I can think of is, perhaps, constructor's member initializer lists.

That is...we apparently accept the following code...and it does, I suppose, what one might "expect": skips over construction of "b" and "c" and jumps into the middle of the constructor function, from which it returns normally as if the object was fully constructed. Using "return" instead of "goto" works too, also returning "successfully" from the constructor.

struct X { X(int); ~X(); };

struct Test {
    Test() : a(1), b( ({ goto label; 2; }) ), c(3) { label: f();}
    X a, b, c;
};

Test* test() { return new Test(); }

But...that really seems like code we should be rejecting as invalid (in which case, there need not be a distinction between cleanup due to exception or branch).

@usx95
Copy link
Contributor Author

usx95 commented Feb 14, 2024

Is there a valid use for having "EHCleanup" that doesn't handle branches across it? That is, do we ever need a cleanup to be called only for an exception thrown, and not otherwise leaving the scope? I'm just wondering if we can simplify things conceptually here and remove an option.

@jyknight Previously, I had tried using all the EHCleanup in the EHStack for these purposes but had given up. I found the CallCoroEnd cleanup problematic because the suspension cleanup jump destination does not include CallCoroEnd cleanup in its scope. So we might emit it on jumping to this destination.

CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);
/// ...
GroManager.EmitGroInit();
EHStack.pushCleanup<CallCoroEnd>(EHCleanup);

I will try to juggle the CallCoroEnd cleanup and jump destination and make it work.

Another problem of relying on EHCleanup would be that many (and not all) EHOnly cleanups are added to the EHStack only when exceptions are enabled. We would need to restructure them to be added to the EHStack "always" but emitted only with -fexceptions.

I will try to do these and get back here again.

@usx95
Copy link
Contributor Author

usx95 commented Feb 28, 2024

I tried using the EHCleanups as the only source of cleanups for these branch-in-expr. #83224 is an attempt. I do not like it very much.

The first challenge is to add these cleanups to the stack even when exceptions are disabled. Some of these cleanups (like array cleanup) require auxiliary instructions, like storing pointers to the last constructed array element. In order to not blow up the number of unnecessary instructions when cleanup is not emitted (especially when exceptions are disabled), we need to somehow remember these auxiliary instructions and erase them if cleanup is not emitted. Tracking such auxiliary instructions and attributing these instructions to a cleanup becomes very complicated very quickly.

Secondly, we do not want to emit all EHCleanups in the current scope upon seeing a branch. CallCoroEnd is one such cleanup.

Using all the other EHCleanup still introduces double free in libc++ tests which might need more investigation.

My proposal would be to not directly use all the EHCleanups and forcefully add them to EHStack. Instead, maintain a separate stack for such deferred cleanups. This is something we already do for lifetime-extended cleanups. It is much easier to reason about, easier to control and keeps things simpler.

@usx95
Copy link
Contributor Author

usx95 commented Mar 15, 2024

Sent out alternative fix #85398 which looks more convincing. It uses the existing EHStack but changes some EHCleanups to NormalAndEHCleanups.

@usx95
Copy link
Contributor Author

usx95 commented Apr 15, 2024

Alternate patch 89ba7e1 has landed. Closing this as obsolete.

@usx95 usx95 closed this Apr 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category coroutines C++20 coroutines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

destructors not always properly run when a coroutine is suspended and then destroyed
4 participants