Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] clang incorrectly uses coroutine frame for scratch space after suspending (take 2) #65054

Closed
jacobsa opened this issue Aug 29, 2023 · 9 comments
Assignees
Labels

Comments

@jacobsa
Copy link
Contributor

jacobsa commented Aug 29, 2023

Starting a new thread for this, because the previous threads have become long and confused by rollbacks and cherrypicks. Previous history here:


I've found another case where clang incorrectly writes to the coroutine frame after suspending, introducing a data race. To recap, clang must not write to the coroutine frame after suspending, because another thread may have already resumed/destroyed the coroutine by the time the write happens.

Here is the program, a slight variant of the one in #65018 (with a fix for the "caller always leaks" issue discussed in #65030):

#include <coroutine>

// A simple awaiter type with an await_suspend method that can't be
// inlined.
struct Awaiter {
  const int& x;

  bool await_ready() { return false; }
  std::coroutine_handle<> await_suspend(const std::coroutine_handle<> h);
  void await_resume() {}
};

struct MyTask {
  // A lazy promise with an await_transform method that supports awaiting
  // integer references using the Awaiter struct above.
  struct promise_type {
    MyTask get_return_object() {
      return MyTask{
          std::coroutine_handle<promise_type>::from_promise(*this),
      };
    }

    std::suspend_always initial_suspend() { return {}; }
    std::suspend_always final_suspend() noexcept { return {}; }
    void unhandled_exception();

    auto await_transform(const int& x) { return Awaiter{x}; }
  };

  std::coroutine_handle<> h;
};

// A global array of integers.
int g_array[32];

// A coroutine that awaits each integer in the global array.
MyTask FooBar() {
  for (const int& x : g_array) {
    co_await x;
  }
}

(Compiler Explorer)

Compiler Explorer isn't yet up to date with recent commits, but I built clang manually at b32aa72 and then ran ./bin/clang -std=c++20 -O0 -S -masm=intel foo.cc. This gives me the following output:

  ; (In the preamble before modifying rdi)
  ;
  ; Stash the address 64 bytes into the coroutine frame in the local thread's
  ; stack.
  mov	rax, rdi
  add	rax, 64
  mov	qword ptr [rbp - 96], rax       # 8-byte Spill
[...]

  ; Call await_suspend, then store its result on the local thread's stack.
  call	_ZN7Awaiter13await_suspendENSt7__n486116coroutine_handleIvEE@PLT
.Ltmp16:
  mov	qword ptr [rbp - 200], rax      # 8-byte Spill
  jmp	.LBB19_17

.LBB19_17:
  ; Reload `&coroutine_frame + 64` into rdi.
  mov	rdi, qword ptr [rbp - 96]       # 8-byte Reload
  
  ; Reload the result of await_suspend into rax.
  mov	rax, qword ptr [rbp - 200]      # 8-byte Reload
  
  ; Write the result of await_suspend to `&coroutine_frame + 64`.
  ; ------> This is the bug!
  mov	qword ptr [rdi], rax
  
  ; Call std::coroutine_handle::address, handing it a `this` pointer that points
  ; at the coroutine frame. Use the result as an indirect jump target.
  call	_ZNKSt7__n486116coroutine_handleIvE7addressEv
  mov	rdi, rax
  mov	rax, qword ptr [rax]
  add	rsp, 256
  pop	rbp
  .cfi_def_cfa rsp, 8
  jmp	rax                             # TAILCALL

@ChuanqiXu9

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 29, 2023

@llvm/issue-subscribers-coroutines

@jacobsa
Copy link
Contributor Author

jacobsa commented Aug 29, 2023

To be clear, this isn't new in b32aa72 (I can reproduce it back at llvmorg-16.0.0), but that commit also didn't fix it.

As previously discussed, it seems like clang needs to directly implement a more principled rule, something like "treat the coroutine frame as having escaped (or having been deallocated) at the suspend point".

@ChuanqiXu9
Copy link
Member

This is a little bit different from the alias issue. That was workarounded by the previous fix. The problem of the issue may be that part of the semantics depends on some optimizations implicitly.

This is a little bit tough to handle...

@ChuanqiXu9 ChuanqiXu9 self-assigned this Aug 29, 2023
@ChuanqiXu9
Copy link
Member

The direct issue is still the call to coroutine_handle<>::address() after await_suspend(). Without optimizations, the current logic will put the temporary result of await_suspend() to the coroutine frame since the middle end feel the temporary is escaped from coroutine_handle<>::address. To fix this fundamentally, we should wrap the whole logic about await-suspend into a standalone function. See #64945

And as a short-term workaround, we probably can mark coroutine_handle<>::address() as always-inline so that the temporary result may not be thought to be escaped then it won't be put on the coroutine frame. Although it looks dirty, it is probably do-able since the compiler are allowed to do special tricks to standard library components.

@jacobsa
Copy link
Contributor Author

jacobsa commented Aug 30, 2023

Thanks for the quick fix!

@avikivity
Copy link

Please consider backporting to 17.0.x.

@ChuanqiXu9
Copy link
Member

This is a regression so it won't be in 17.x

@avikivity
Copy link

The fix was backported (f05226d), no? Or I'm confusing bugs?

@ChuanqiXu9
Copy link
Member

f05226d

That was backported but I feel it is not so stable. So I send another PR to revert it in 17.x. So these bugs wouldn'
t be not fixed in 17.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants