Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LoopRotate][coroutines] Avoid hoisting addresses of thread-local variables outside loops in coroutines #81937

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

alanzhao1
Copy link
Contributor

Because loops in coroutines may have a co_await statement that reschedules the coroutine to another thread, we cannot cache addresses of thread-local variables obtained inside a loop by moving the computation of thoes addresses outside a loop.

Since LLVM doesn't have a model for coroutine memory accesses, this patch fixes this bug by disabling this optimization for coroutines in the same way as https://reviews.llvm.org/D135550 and https://reviews.llvm.org/D151774.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 15, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Alan Zhao (alanzhao1)

Changes

Because loops in coroutines may have a co_await statement that reschedules the coroutine to another thread, we cannot cache addresses of thread-local variables obtained inside a loop by moving the computation of thoes addresses outside a loop.

Since LLVM doesn't have a model for coroutine memory accesses, this patch fixes this bug by disabling this optimization for coroutines in the same way as https://reviews.llvm.org/D135550 and https://reviews.llvm.org/D151774.


Full diff: https://github.com/llvm/llvm-project/pull/81937.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/LoopRotationUtils.cpp (+9-1)
  • (added) llvm/test/Transforms/LoopRotate/coroutine.ll (+34)
diff --git a/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp b/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
index ec59a077302037..fb75f1622513df 100644
--- a/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
@@ -612,7 +612,15 @@ bool LoopRotate::rotateLoop(Loop *L, bool SimplifiedLatch) {
       // memory (without proving that the loop doesn't write).
       if (L->hasLoopInvariantOperands(Inst) && !Inst->mayReadFromMemory() &&
           !Inst->mayWriteToMemory() && !Inst->isTerminator() &&
-          !isa<DbgInfoIntrinsic>(Inst) && !isa<AllocaInst>(Inst)) {
+          !isa<DbgInfoIntrinsic>(Inst) && !isa<AllocaInst>(Inst) &&
+          // FIXME: It is not safe to cache the value of these instructions in
+          // coroutines, as the addresses of otherwise eligible variables (e.g.
+          // thread-local variables and errno) may change if the coroutine is
+          // resumed in a different thread. Therefore, we disable this
+          // optimization for correctness. However, this may block other correct
+          // optimizations. This should be reverted once we have a better model
+          // for memory access in coroutines.
+          !Inst->getFunction()->isPresplitCoroutine()) {
 
         if (LoopEntryBranch->getParent()->IsNewDbgInfoFormat &&
             !NextDbgInsts.empty()) {
diff --git a/llvm/test/Transforms/LoopRotate/coroutine.ll b/llvm/test/Transforms/LoopRotate/coroutine.ll
new file mode 100644
index 00000000000000..6dbaab2ecd7097
--- /dev/null
+++ b/llvm/test/Transforms/LoopRotate/coroutine.ll
@@ -0,0 +1,34 @@
+; RUN: opt -S -passes=loop-rotate < %s | FileCheck %s
+
+declare void @bar1()
+
+@threadlocalint = thread_local global i32 0, align 4
+
+define void @foo() #0 {
+; CHECK-LABEL: entry:
+; CHECK: call align 4 ptr @llvm.threadlocal.address.p0(ptr align 4 @threadlocalint)
+; CHECK: br {{.*}} label %cond.end
+entry:
+  br label %while.cond
+
+while.cond:
+  %1 = tail call align 4 ptr @llvm.threadlocal.address.p0(ptr align 4 @threadlocalint)
+  %2 = load i32, ptr %1, align 4
+  %cmp = icmp eq i32 %2, 0
+  br i1 %cmp, label %cond.end, label %cond.false
+
+cond.false:
+  call void @bar1()
+  unreachable
+
+; The address of threadlocalint must not be cached outside loops in presplit
+; coroutines.
+; CHECK-LABEL: cond.end:
+; CHECK: call align 4 ptr @llvm.threadlocal.address.p0(ptr align 4 @threadlocalint)
+; CHECK: br {{.*}} label %cond.end
+cond.end:
+  call void @bar1()
+  br label %while.cond
+}
+
+attributes #0 = { presplitcoroutine }

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/caching/hoisting

llvm/test/Transforms/LoopRotate/coroutine.ll Outdated Show resolved Hide resolved
@alanzhao1 alanzhao1 changed the title [LoopRotate][coroutines] Avoid caching addresses of thread-local variables outside loops in coroutines [LoopRotate][coroutines] Avoid hoisting addresses of thread-local variables outside loops in coroutines Feb 16, 2024
alanzhao1 added a commit to alanzhao1/llvm-project that referenced this pull request Feb 16, 2024
This is to provide a baseline test for
llvm#81937.
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

alanzhao1 added a commit to alanzhao1/llvm-project that referenced this pull request Feb 19, 2024
This is to provide a baseline test for
llvm#81937.
alanzhao1 added a commit that referenced this pull request Feb 19, 2024
This is to provide a baseline test for
#81937.
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lg with comment change

llvm/lib/Transforms/Utils/LoopRotationUtils.cpp Outdated Show resolved Hide resolved
@alanzhao1
Copy link
Contributor Author

s/caching/hoisting

Fixed the PR description, but I need to wait until this has been approved before fixing the actual commit message since that requires rewriting history and force pushing.

…iables outside loops in coroutines

Because loops in coroutines may have a co_await statement that
reschedules the coroutine to another thread, we cannot cache addresses
of thread-local variables obtained inside a loop by moving the
computation of thoes addresses outside a loop.

Since LLVM doesn't have a model for coroutine memory accesses, this
patch fixes this bug by disabling this optimization for coroutines in
the same way as https://reviews.llvm.org/D135550 and
https://reviews.llvm.org/D151774.
@alanzhao1 alanzhao1 merged commit a9b5753 into llvm:main Feb 20, 2024
3 of 4 checks passed
@alanzhao1 alanzhao1 deleted the bugfix/coroutine-loop branch February 20, 2024 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants