Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reapply "[coro][CoroSplit] Use llvm.lifetime.end to compute putting objects on the frame vs the stack (#90265) #91372

Merged
merged 16 commits into from
May 21, 2024

Conversation

alanzhao1
Copy link
Contributor

This reverts commit 9243841.

This reland addresses the performance regressions seen in #90265 by retaining the original definition of isPotentiallyReachableFromMany(...) instead of reimplementing it with isManyPotentiallyReachableFromMany(...).

Fixes #86580

… objects on the frame vs the stack (llvm#90265)"

This reverts commit 9243841.

This reland addresses the performance regressions seen in llvm#90265 by
retaining the original definition of
`isPotentiallyReachableFromMany(...)` instead of reimplementing it with
`isManyPotentiallyReachableFromMany(...)`.

Fixes llvm#86580
@llvmbot
Copy link
Collaborator

llvmbot commented May 7, 2024

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-coroutines

@llvm/pr-subscribers-llvm-transforms

Author: Alan Zhao (alanzhao1)

Changes

This reverts commit 9243841.

This reland addresses the performance regressions seen in #90265 by retaining the original definition of isPotentiallyReachableFromMany(...) instead of reimplementing it with isManyPotentiallyReachableFromMany(...).

Fixes #86580


Full diff: https://github.com/llvm/llvm-project/pull/91372.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/CFG.h (+12)
  • (modified) llvm/lib/Analysis/CFG.cpp (+84-1)
  • (modified) llvm/lib/Transforms/Coroutines/CoroFrame.cpp (+43-17)
  • (added) llvm/test/Transforms/Coroutines/coro-lifetime-end.ll (+142)
diff --git a/llvm/include/llvm/Analysis/CFG.h b/llvm/include/llvm/Analysis/CFG.h
index 86b01c13274fe..23bc10a4a9d1b 100644
--- a/llvm/include/llvm/Analysis/CFG.h
+++ b/llvm/include/llvm/Analysis/CFG.h
@@ -96,6 +96,18 @@ bool isPotentiallyReachableFromMany(
     const SmallPtrSetImpl<BasicBlock *> *ExclusionSet,
     const DominatorTree *DT = nullptr, const LoopInfo *LI = nullptr);
 
+/// Determine whether there is a potentially a path from at least one block in
+/// 'Worklist' to at least one block in 'StopSet' within a single function
+/// without passing through any of the blocks in 'ExclusionSet'. Returns false
+/// only if we can prove that once any block in 'Worklist' has been reached then
+/// no blocks in 'StopSet' can be executed without passing through any blocks in
+/// 'ExclusionSet'. Conservatively returns true.
+bool isManyPotentiallyReachableFromMany(
+    SmallVectorImpl<BasicBlock *> &Worklist,
+    const SmallPtrSetImpl<const BasicBlock *> &StopSet,
+    const SmallPtrSetImpl<BasicBlock *> *ExclusionSet,
+    const DominatorTree *DT = nullptr, const LoopInfo *LI = nullptr);
+
 /// Return true if the control flow in \p RPOTraversal is irreducible.
 ///
 /// This is a generic implementation to detect CFG irreducibility based on loop
diff --git a/llvm/lib/Analysis/CFG.cpp b/llvm/lib/Analysis/CFG.cpp
index 8528aa9f77e02..57fb529fada46 100644
--- a/llvm/lib/Analysis/CFG.cpp
+++ b/llvm/lib/Analysis/CFG.cpp
@@ -158,7 +158,7 @@ bool llvm::isPotentiallyReachableFromMany(
   const Loop *StopLoop = LI ? getOutermostLoop(LI, StopBB) : nullptr;
 
   unsigned Limit = DefaultMaxBBsToExplore;
-  SmallPtrSet<const BasicBlock*, 32> Visited;
+  SmallPtrSet<const BasicBlock *, 32> Visited;
   do {
     BasicBlock *BB = Worklist.pop_back_val();
     if (!Visited.insert(BB).second)
@@ -204,6 +204,89 @@ bool llvm::isPotentiallyReachableFromMany(
   return false;
 }
 
+bool llvm::isManyPotentiallyReachableFromMany(
+    SmallVectorImpl<BasicBlock *> &Worklist,
+    const SmallPtrSetImpl<const BasicBlock *> &StopSet,
+    const SmallPtrSetImpl<BasicBlock *> *ExclusionSet, const DominatorTree *DT,
+    const LoopInfo *LI) {
+  // When a stop block is unreachable, it's dominated from everywhere,
+  // regardless of whether there's a path between the two blocks.
+  llvm::DenseMap<const BasicBlock *, bool> StopBBReachable;
+  for (auto *BB : StopSet)
+    StopBBReachable[BB] = DT && DT->isReachableFromEntry(BB);
+
+  // We can't skip directly from a block that dominates the stop block if the
+  // exclusion block is potentially in between.
+  if (ExclusionSet && !ExclusionSet->empty())
+    DT = nullptr;
+
+  // Normally any block in a loop is reachable from any other block in a loop,
+  // however excluded blocks might partition the body of a loop to make that
+  // untrue.
+  SmallPtrSet<const Loop *, 8> LoopsWithHoles;
+  if (LI && ExclusionSet) {
+    for (auto *BB : *ExclusionSet) {
+      if (const Loop *L = getOutermostLoop(LI, BB))
+        LoopsWithHoles.insert(L);
+    }
+  }
+
+  llvm::DenseMap<const BasicBlock *, const Loop *> StopLoops;
+  for (auto *StopBB : StopSet)
+    StopLoops[StopBB] = LI ? getOutermostLoop(LI, StopBB) : nullptr;
+
+  unsigned Limit = DefaultMaxBBsToExplore;
+  SmallPtrSet<const BasicBlock*, 32> Visited;
+  do {
+    BasicBlock *BB = Worklist.pop_back_val();
+    if (!Visited.insert(BB).second)
+      continue;
+    if (StopSet.contains(BB))
+      return true;
+    if (ExclusionSet && ExclusionSet->count(BB))
+      continue;
+    if (DT && llvm::any_of(StopSet, [&](const BasicBlock *StopBB) {
+          return StopBBReachable[BB] && DT->dominates(BB, StopBB);
+        }))
+      return true;
+
+    const Loop *Outer = nullptr;
+    if (LI) {
+      Outer = getOutermostLoop(LI, BB);
+      // If we're in a loop with a hole, not all blocks in the loop are
+      // reachable from all other blocks. That implies we can't simply jump to
+      // the loop's exit blocks, as that exit might need to pass through an
+      // excluded block. Clear Outer so we process BB's successors.
+      if (LoopsWithHoles.count(Outer))
+        Outer = nullptr;
+      if (llvm::any_of(StopSet, [&](const BasicBlock *StopBB) {
+            const Loop *StopLoop = StopLoops[StopBB];
+            return StopLoop && StopLoop == Outer;
+          }))
+        return true;
+    }
+
+    if (!--Limit) {
+      // We haven't been able to prove it one way or the other. Conservatively
+      // answer true -- that there is potentially a path.
+      return true;
+    }
+
+    if (Outer) {
+      // All blocks in a single loop are reachable from all other blocks. From
+      // any of these blocks, we can skip directly to the exits of the loop,
+      // ignoring any other blocks inside the loop body.
+      Outer->getExitBlocks(Worklist);
+    } else {
+      Worklist.append(succ_begin(BB), succ_end(BB));
+    }
+  } while (!Worklist.empty());
+
+  // We have exhausted all possible paths and are certain that 'To' can not be
+  // reached from 'From'.
+  return false;
+}
+
 bool llvm::isPotentiallyReachable(
     const BasicBlock *A, const BasicBlock *B,
     const SmallPtrSetImpl<BasicBlock *> *ExclusionSet, const DominatorTree *DT,
diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index 08a4522e3fac6..dd9e77a855ef4 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -19,6 +19,7 @@
 #include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Analysis/CFG.h"
 #include "llvm/Analysis/PtrUseVisitor.h"
 #include "llvm/Analysis/StackLifetime.h"
 #include "llvm/Config/llvm-config.h"
@@ -1440,17 +1441,22 @@ namespace {
 struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
   using Base = PtrUseVisitor<AllocaUseVisitor>;
   AllocaUseVisitor(const DataLayout &DL, const DominatorTree &DT,
-                   const CoroBeginInst &CB, const SuspendCrossingInfo &Checker,
+                   const coro::Shape &CoroShape,
+                   const SuspendCrossingInfo &Checker,
                    bool ShouldUseLifetimeStartInfo)
-      : PtrUseVisitor(DL), DT(DT), CoroBegin(CB), Checker(Checker),
-        ShouldUseLifetimeStartInfo(ShouldUseLifetimeStartInfo) {}
+      : PtrUseVisitor(DL), DT(DT), CoroShape(CoroShape), Checker(Checker),
+        ShouldUseLifetimeStartInfo(ShouldUseLifetimeStartInfo) {
+    for (AnyCoroSuspendInst *SuspendInst : CoroShape.CoroSuspends)
+      CoroSuspendBBs.insert(SuspendInst->getParent());
+  }
 
   void visit(Instruction &I) {
     Users.insert(&I);
     Base::visit(I);
     // If the pointer is escaped prior to CoroBegin, we have to assume it would
     // be written into before CoroBegin as well.
-    if (PI.isEscaped() && !DT.dominates(&CoroBegin, PI.getEscapingInst())) {
+    if (PI.isEscaped() &&
+        !DT.dominates(CoroShape.CoroBegin, PI.getEscapingInst())) {
       MayWriteBeforeCoroBegin = true;
     }
   }
@@ -1553,10 +1559,19 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
     // When we found the lifetime markers refers to a
     // subrange of the original alloca, ignore the lifetime
     // markers to avoid misleading the analysis.
-    if (II.getIntrinsicID() != Intrinsic::lifetime_start || !IsOffsetKnown ||
-        !Offset.isZero())
+    if (!IsOffsetKnown || !Offset.isZero())
+      return Base::visitIntrinsicInst(II);
+    switch (II.getIntrinsicID()) {
+    default:
       return Base::visitIntrinsicInst(II);
-    LifetimeStarts.insert(&II);
+    case Intrinsic::lifetime_start:
+      LifetimeStarts.insert(&II);
+      LifetimeStartBBs.push_back(II.getParent());
+      break;
+    case Intrinsic::lifetime_end:
+      LifetimeEndBBs.insert(II.getParent());
+      break;
+    }
   }
 
   void visitCallBase(CallBase &CB) {
@@ -1586,7 +1601,7 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
 
 private:
   const DominatorTree &DT;
-  const CoroBeginInst &CoroBegin;
+  const coro::Shape &CoroShape;
   const SuspendCrossingInfo &Checker;
   // All alias to the original AllocaInst, created before CoroBegin and used
   // after CoroBegin. Each entry contains the instruction and the offset in the
@@ -1594,6 +1609,9 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
   DenseMap<Instruction *, std::optional<APInt>> AliasOffetMap{};
   SmallPtrSet<Instruction *, 4> Users{};
   SmallPtrSet<IntrinsicInst *, 2> LifetimeStarts{};
+  SmallVector<BasicBlock *> LifetimeStartBBs{};
+  SmallPtrSet<BasicBlock *, 2> LifetimeEndBBs{};
+  SmallPtrSet<const BasicBlock *, 2> CoroSuspendBBs{};
   bool MayWriteBeforeCoroBegin{false};
   bool ShouldUseLifetimeStartInfo{true};
 
@@ -1605,10 +1623,19 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
     // every basic block that uses the pointer to see if they cross suspension
     // points. The uses cover both direct uses as well as indirect uses.
     if (ShouldUseLifetimeStartInfo && !LifetimeStarts.empty()) {
-      for (auto *I : Users)
-        for (auto *S : LifetimeStarts)
-          if (Checker.isDefinitionAcrossSuspend(*S, I))
-            return true;
+      // If there is no explicit lifetime.end, then assume the address can
+      // cross suspension points.
+      if (LifetimeEndBBs.empty())
+        return true;
+
+      // If there is a path from a lifetime.start to a suspend without a
+      // corresponding lifetime.end, then the alloca's lifetime persists
+      // beyond that suspension point and the alloca must go on the frame.
+      llvm::SmallVector<BasicBlock *> Worklist(LifetimeStartBBs);
+      if (isManyPotentiallyReachableFromMany(Worklist, CoroSuspendBBs,
+                                             &LifetimeEndBBs, &DT))
+        return true;
+
       // Addresses are guaranteed to be identical after every lifetime.start so
       // we cannot use the local stack if the address escaped and there is a
       // suspend point between lifetime markers. This should also cover the
@@ -1646,13 +1673,13 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
   }
 
   void handleMayWrite(const Instruction &I) {
-    if (!DT.dominates(&CoroBegin, &I))
+    if (!DT.dominates(CoroShape.CoroBegin, &I))
       MayWriteBeforeCoroBegin = true;
   }
 
   bool usedAfterCoroBegin(Instruction &I) {
     for (auto &U : I.uses())
-      if (DT.dominates(&CoroBegin, U))
+      if (DT.dominates(CoroShape.CoroBegin, U))
         return true;
     return false;
   }
@@ -1661,7 +1688,7 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
     // We track all aliases created prior to CoroBegin but used after.
     // These aliases may need to be recreated after CoroBegin if the alloca
     // need to live on the frame.
-    if (DT.dominates(&CoroBegin, &I) || !usedAfterCoroBegin(I))
+    if (DT.dominates(CoroShape.CoroBegin, &I) || !usedAfterCoroBegin(I))
       return;
 
     if (!IsOffsetKnown) {
@@ -2830,8 +2857,7 @@ static void collectFrameAlloca(AllocaInst *AI, coro::Shape &Shape,
   bool ShouldUseLifetimeStartInfo =
       (Shape.ABI != coro::ABI::Async && Shape.ABI != coro::ABI::Retcon &&
        Shape.ABI != coro::ABI::RetconOnce);
-  AllocaUseVisitor Visitor{AI->getModule()->getDataLayout(), DT,
-                           *Shape.CoroBegin, Checker,
+  AllocaUseVisitor Visitor{AI->getModule()->getDataLayout(), DT, Shape, Checker,
                            ShouldUseLifetimeStartInfo};
   Visitor.visitPtr(*AI);
   if (!Visitor.getShouldLiveOnFrame())
diff --git a/llvm/test/Transforms/Coroutines/coro-lifetime-end.ll b/llvm/test/Transforms/Coroutines/coro-lifetime-end.ll
new file mode 100644
index 0000000000000..330c61360e20a
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-lifetime-end.ll
@@ -0,0 +1,142 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+
+declare ptr @malloc(i64)
+
+%i8.array = type { [100 x i8] }
+declare void @consume.i8.array(ptr)
+
+@testbool = external local_unnamed_addr global i8, align 1
+
+; testval does not contain an explicit lifetime end. We must assume that it may
+; live across suspension.
+define void @HasNoLifetimeEnd() presplitcoroutine {
+; CHECK-LABEL: define void @HasNoLifetimeEnd() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @HasNoLifetimeEnd.resumers)
+; CHECK-NEXT:    [[ALLOC:%.*]] = call ptr @malloc(i64 16)
+; CHECK-NEXT:    [[VFRAME:%.*]] = call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[ALLOC]])
+; CHECK-NEXT:    store ptr @HasNoLifetimeEnd.resume, ptr [[VFRAME]], align 8
+; CHECK-NEXT:    [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[HASNOLIFETIMEEND_FRAME:%.*]], ptr [[VFRAME]], i32 0, i32 1
+; CHECK-NEXT:    store ptr @HasNoLifetimeEnd.destroy, ptr [[DESTROY_ADDR]], align 8
+; CHECK-NEXT:    [[INDEX_ADDR1:%.*]] = getelementptr inbounds [[HASNOLIFETIMEEND_FRAME]], ptr [[VFRAME]], i32 0, i32 2
+; CHECK-NEXT:    call void @consume.i8.array(ptr [[INDEX_ADDR1]])
+; CHECK-NEXT:    [[INDEX_ADDR2:%.*]] = getelementptr inbounds [[HASNOLIFETIMEEND_FRAME]], ptr [[VFRAME]], i32 0, i32 3
+; CHECK-NEXT:    store i1 false, ptr [[INDEX_ADDR2]], align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  %testval = alloca %i8.array
+  %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  %alloc = call ptr @malloc(i64 16) #3
+  %vFrame = call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %alloc)
+
+  call void @llvm.lifetime.start.p0(i64 100, ptr %testval)
+  call void @consume.i8.array(ptr %testval)
+
+  %save = call token @llvm.coro.save(ptr null)
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+    i8 0, label %await.ready
+    i8 1, label %exit
+  ]
+await.ready:
+  br label %exit
+exit:
+  call i1 @llvm.coro.end(ptr null, i1 false, token none)
+  ret void
+}
+
+define void @LifetimeEndAfterCoroEnd() presplitcoroutine {
+; CHECK-LABEL: define void @LifetimeEndAfterCoroEnd() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @LifetimeEndAfterCoroEnd.resumers)
+; CHECK-NEXT:    [[ALLOC:%.*]] = call ptr @malloc(i64 16)
+; CHECK-NEXT:    [[VFRAME:%.*]] = call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[ALLOC]])
+; CHECK-NEXT:    store ptr @LifetimeEndAfterCoroEnd.resume, ptr [[VFRAME]], align 8
+; CHECK-NEXT:    [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[LIFETIMEENDAFTERCOROEND_FRAME:%.*]], ptr [[VFRAME]], i32 0, i32 1
+; CHECK-NEXT:    store ptr @LifetimeEndAfterCoroEnd.destroy, ptr [[DESTROY_ADDR]], align 8
+; CHECK-NEXT:    [[INDEX_ADDR1:%.*]] = getelementptr inbounds [[LIFETIMEENDAFTERCOROEND_FRAME]], ptr [[VFRAME]], i32 0, i32 2
+; CHECK-NEXT:    call void @consume.i8.array(ptr [[INDEX_ADDR1]])
+; CHECK-NEXT:    [[INDEX_ADDR2:%.*]] = getelementptr inbounds [[LIFETIMEENDAFTERCOROEND_FRAME]], ptr [[VFRAME]], i32 0, i32 3
+; CHECK-NEXT:    store i1 false, ptr [[INDEX_ADDR2]], align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  %testval = alloca %i8.array
+  %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  %alloc = call ptr @malloc(i64 16) #3
+  %vFrame = call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %alloc)
+
+  call void @llvm.lifetime.start.p0(i64 100, ptr %testval)
+  call void @consume.i8.array(ptr %testval)
+
+  %save = call token @llvm.coro.save(ptr null)
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+    i8 0, label %await.ready
+    i8 1, label %exit
+  ]
+await.ready:
+  br label %exit
+exit:
+  call i1 @llvm.coro.end(ptr null, i1 false, token none)
+  call void @llvm.lifetime.end.p0(i64 100, ptr  %testval)
+  ret void
+}
+
+define void @BranchWithoutLifetimeEnd() presplitcoroutine {
+; CHECK-LABEL: define void @BranchWithoutLifetimeEnd() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @BranchWithoutLifetimeEnd.resumers)
+; CHECK-NEXT:    [[ALLOC:%.*]] = call ptr @malloc(i64 16)
+; CHECK-NEXT:    [[VFRAME:%.*]] = call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[ALLOC]])
+; CHECK-NEXT:    store ptr @BranchWithoutLifetimeEnd.resume, ptr [[VFRAME]], align 8
+; CHECK-NEXT:    [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[BRANCHWITHOUTLIFETIMEEND_FRAME:%.*]], ptr [[VFRAME]], i32 0, i32 1
+; CHECK-NEXT:    store ptr @BranchWithoutLifetimeEnd.destroy, ptr [[DESTROY_ADDR]], align 8
+; CHECK-NEXT:    [[TESTVAL:%.*]] = getelementptr inbounds [[BRANCHWITHOUTLIFETIMEEND_FRAME]], ptr [[VFRAME]], i32 0, i32 2
+; CHECK-NEXT:    call void @consume.i8.array(ptr [[TESTVAL]])
+; CHECK-NEXT:    [[TMP0:%.*]] = load i8, ptr @testbool, align 1
+; CHECK-NEXT:    [[INDEX_ADDR1:%.*]] = getelementptr inbounds [[BRANCHWITHOUTLIFETIMEEND_FRAME]], ptr [[VFRAME]], i32 0, i32 3
+; CHECK-NEXT:    store i1 false, ptr [[INDEX_ADDR1]], align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  %testval = alloca %i8.array
+  %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  %alloc = call ptr @malloc(i64 16) #3
+  %vFrame = call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %alloc)
+
+  call void @llvm.lifetime.start.p0(i64 100, ptr %testval)
+  call void @consume.i8.array(ptr %testval)
+
+  %0 = load i8, ptr @testbool, align 1
+  %tobool = trunc nuw i8 %0 to i1
+  br i1 %tobool, label %if.then, label %if.end
+
+if.then:
+  call void @llvm.lifetime.end.p0(i64 100, ptr  %testval)
+  br label %if.end
+
+if.end:
+  %save = call token @llvm.coro.save(ptr null)
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+    i8 0, label %await.ready
+    i8 1, label %exit
+  ]
+await.ready:
+  br label %exit
+exit:
+  call i1 @llvm.coro.end(ptr null, i1 false, token none)
+  ret void
+}
+
+
+declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr)
+declare ptr @llvm.coro.begin(token, ptr writeonly) #3
+declare ptr @llvm.coro.frame() #5
+declare i8 @llvm.coro.suspend(token, i1) #3
+declare i1 @llvm.coro.end(ptr, i1, token) #3
+declare void @llvm.lifetime.start.p0(i64, ptr nocapture) #4
+declare void @llvm.lifetime.end.p0(i64, ptr nocapture) #4

Copy link

github-actions bot commented May 7, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code duplication is slightly not good. Can we try to mitigate such problems by some template techniques?

For example, (may be):

if constexpr (isMany) {
    if (StopSet.contains(BB))
      return true;
} else {
    if (BB == StopBB) // or BB == *StopSet.begin()
      return true;
}

llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
@nikic
Copy link
Contributor

nikic commented May 8, 2024

I was curious if just the SmallPtrSet change would be enough for the original implementation, but apparently not: https://llvm-compile-time-tracker.com/index.php?branch=nikic/perf/reachable It is a significant improvement, but not enough.

llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
Also make curly braces more consistent
@alanzhao1
Copy link
Contributor Author

alanzhao1 commented May 8, 2024

The code duplication is slightly not good. Can we try to mitigate such problems by some template techniques?

For example, (may be):

if constexpr (isMany) {
    if (StopSet.contains(BB))
      return true;
} else {
    if (BB == StopBB) // or BB == *StopSet.begin()
      return true;
}

Hmm...would making StopBB/StopSet a std::variant<const BasicBlock *, const SmallPtrSetImpl<const BasicBlock *> *> work?

EDIT: changed the second std::variant template parameter to a pointer as its template parameters can't be reference types.

@ChuanqiXu9
Copy link
Member

The code duplication is slightly not good. Can we try to mitigate such problems by some template techniques?
For example, (may be):

if constexpr (isMany) {
    if (StopSet.contains(BB))
      return true;
} else {
    if (BB == StopBB) // or BB == *StopSet.begin()
      return true;
}

Hmm...would making StopBB/StopSet a std::variant<const BasicBlock *, const SmallPtrSetImpl<const BasicBlock *> *> work?

EDIT: changed the second std::variant template parameter to a pointer as its template parameters can't be reference types.

Maybe it can be slightly better to have a isReachableImpl template function, where the type of stop can be a template, then we can judge the type directly in the Impl function. WDYT?

@alanzhao1
Copy link
Contributor Author

alanzhao1 commented May 8, 2024

The code duplication is slightly not good. Can we try to mitigate such problems by some template techniques?
For example, (may be):

if constexpr (isMany) {
    if (StopSet.contains(BB))
      return true;
} else {
    if (BB == StopBB) // or BB == *StopSet.begin()
      return true;
}

Hmm...would making StopBB/StopSet a std::variant<const BasicBlock *, const SmallPtrSetImpl<const BasicBlock *> *> work?
EDIT: changed the second std::variant template parameter to a pointer as its template parameters can't be reference types.

Maybe it can be slightly better to have a isReachableImpl template function, where the type of stop can be a template, then we can judge the type directly in the Impl function. WDYT?

Personally, I'm leaning towards using std::variant because template type parameters don't have any information about the underlying type (LLVM hasn't migrated to C++20 yet, so we don't have concepts), so the only feedback we have is that the code doesn't compile because a type doesn't have a method. A variant parameter however does retain type information, and the code would be virtually identical to what you propose - e.g.

const auto StopBB** = std::get_if<const BasicBlock *>(StopBBOrSet);
const auto StopSet** = std::get_if<const SmallPtrSetImpl<const BasicBlock *> *>(StopBBOrSet);
if (StopSet) {
    if ((*StopSet)->contains(BB))
      return true;
} else {
    if (BB == *StopBB)
      return true;
}

LMKWYT

@nikic
Copy link
Contributor

nikic commented May 8, 2024

@alanzhao1 std::variant is inefficient on multiple levels. It is very unlikely to be suitable in cases where the whole point is to have efficient code.

@alanzhao1
Copy link
Contributor Author

Code for isPotentiallyReachableFromMany(...) and isManyPotentiallyReachableFromMany(...) has been deduplicated using templates.

@rnk rnk requested review from alinas and aeubanks May 8, 2024 16:32
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Maybe @nikic wants to have perf test.

llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
@nikic
Copy link
Contributor

nikic commented May 9, 2024

LGTM. Maybe @nikic wants to have perf test.

I don't see any compile-time impact with this version anymore.

@alanzhao1
Copy link
Contributor Author

@nikic do you want to rerun the compile time tests again to see if using SingleEntrySet causes any compile-time regressions?

@nikic
Copy link
Contributor

nikic commented May 15, 2024

@alanzhao1 This still causes a regression, but it's possible to avoid it by removing the StopBBReachable set: bfc261b We don't really need this set, it's okay to set DT to nullptr if any of the stop BBs is not reachable.

llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
@alanzhao1
Copy link
Contributor Author

Ping. Any more comments?

@nikic
Copy link
Contributor

nikic commented May 20, 2024

@alanzhao1 This still causes a regression, but it's possible to avoid it by removing the StopBBReachable set: bfc261b We don't really need this set, it's okay to set DT to nullptr if any of the stop BBs is not reachable.

This comment is still open.

@alanzhao1
Copy link
Contributor Author

@alanzhao1 This still causes a regression, but it's possible to avoid it by removing the StopBBReachable set: bfc261b We don't really need this set, it's okay to set DT to nullptr if any of the stop BBs is not reachable.

This comment is still open.

Sorry I missed this! Done

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CFG changes LGTM. I assume the coroutine changes have already been previously reviewed.

llvm/lib/Analysis/CFG.cpp Outdated Show resolved Hide resolved
@alanzhao1 alanzhao1 merged commit 0c8bc08 into llvm:main May 21, 2024
3 of 4 checks passed
@alanzhao1 alanzhao1 deleted the feature/reland-coro-fix branch May 21, 2024 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[coroutines] LLVM incorrectly allocates variables on the coroutine stack
4 participants