Clang misaligns variables stored in coroutine frames #56671
Comments
wg21.link/p2014r0? I haven't tracked the most recent developments though, nor am I aware of any consensus yet. Also see https://reviews.llvm.org/D102147
@yuanfang-chen Ah nice, thank you very much for the context. I'm surprised to find that this was left out of the standard. But also surprised that it needs to be in the standard in order to be fixed: wouldn't it be possible to achieve an alignment of Anyway failing that, it sounds like maybe I'll get what I want if I wait for https://reviews.llvm.org/D102147, set |
It is possible. https://reviews.llvm.org/D97915 (parent patch of https://reviews.llvm.org/D102147) will do that when aligned allocator is not available.
As mentioned above, |
I'm not sure I see why overloading I'm also not very familiar with HALO. My thought was that if there is a user-provided |
@yuanfang-chen I just noticed that https://reviews.llvm.org/D97915 hasn't had a comment in over a year. :-( Is there any hope for getting the over-allocation strategy committed any time soon? This is a correctness problem, so it seems worth doing that before the standard is perfect. |
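For readers unfamiliar with the over-allocation strategy mentioned here, the following is a minimal sketch of the general idea, not the code from D97915. The helper names overaligned_alloc and overaligned_free are hypothetical, and the sketch assumes the requested alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helpers for illustration only; not taken from Clang or D97915.
// Returns `size` bytes aligned to `align` (a power of two), built on plain
// malloc, which only guarantees the default alignment.
inline void* overaligned_alloc(std::size_t size, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    // Keep the bookkeeping slot itself suitably aligned.
    if (align < alignof(void*)) align = alignof(void*);
    // Extra room: worst-case padding plus one slot for the raw pointer.
    std::size_t total = size + (align - 1) + sizeof(void*);
    void* raw = std::malloc(total);
    if (raw == nullptr) return nullptr;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;  // round up to `align`
    // Remember the raw pointer just before the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from overaligned_alloc.
inline void overaligned_free(void* p) {
    if (p != nullptr) std::free(static_cast<void**>(p)[-1]);
}
```

The cost of this approach is up to align - 1 + sizeof(void*) wasted bytes per allocation, which is the trade-off against having a real aligned allocator available.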
This should be irrelevant with HALO. I prefer the direction of For the practical side, I think you could use the workaround of MLIR. MLIR is also a user of LLVM Coroutines. Previously, MLIR met a similar problem, and their solution in C++ may look like: https://godbolt.org/z/eYderhTTW |
Yes, I have a similar workaround. It's not totally correct though, because there is no upper bound on the alignment that might be needed (right?). I agree it's a problem that the standard doesn't let you define an |
Yes. Either the standard library-provided version or a user-defined version. If the user-defined version is not found, it should use the standard library-provided version.
It concerns both correctness and customization because the current standard wording prevents the compiler from searching for and using the aligned allocator (either the standard library-provided version or a user-defined version). So it is the standard that caused this bug (the compiler could fix it as an extension though, like D97915).
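For context, the aligned allocator referred to here is the C++17 allocation function that takes a std::align_val_t argument. The sketch below shows, outside of any coroutine, how an allocation could dispatch to it when the required alignment exceeds the default. The function names allocate_with_alignment and deallocate_with_alignment are hypothetical, and this is an illustration of the library facility rather than of what Clang emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helpers for illustration; assumes C++17 or later.
// Uses the aligned allocation function only when the request exceeds
// what plain operator new already guarantees.
inline void* allocate_with_alignment(std::size_t size, std::size_t align) {
    if (align > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
        return ::operator new(size, static_cast<std::align_val_t>(align));
    return ::operator new(size);
}

// Must mirror the choice made at allocation time, so the caller passes
// the same alignment back.
inline void deallocate_with_alignment(void* p, std::size_t align) {
    if (align > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
        ::operator delete(p, static_cast<std::align_val_t>(align));
    else
        ::operator delete(p);
}
```

The point made in the comment above is that, under the current wording, the compiler is not allowed to pick the std::align_val_t overload for a coroutine frame, even though the overload exists.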
Agreed. I'll find some time to rebase the patch. |
Strictly, there is no upper bound on the alignment. But in my experience, '64' works in most cases on x86-64. I have seen some use cases (not coroutines) on AArch64 that use '128' to speed up hardware cache access. But all of these are low-level software like kernels, and many use cases for coroutines are higher-level applications.
Update: WG21 is going to discuss P2014 (cplusplus/papers#750) on 8/18.
Update: No consensus in the meeting. The feeling was that it would be good to have some implementation and usage experience first. And option1 has some problems: And option2 is much simpler and wouldn't run into these problems, but it will break more existing code. We are also concerned about ABI problems. Personally, I strongly prefer option2: coroutine type definitions are rare, so although option2 will break existing code, only a few places need editing. Option1 may break existing code too. @yuanfang-chen, D102147 is meant to implement option1, so we may need a new revision. What do you think? If you don't have time, I'm OK with implementing option2 separately.
@ChuanqiXu9 Thanks for the updates. Glad to see that the discussion on this issue resumed. Option2 looks better to me too. Making it consistent with normal new/delete rules seems too much burden to carry without justifiable benefits. Let's just implement option2 and see what the user experiences are. I won't have the time to implement this though but happy to help review patches. |
I don't think the middle-end strictly needs to do the overload resolution as long as the language says that it's permitted to ODR-use both functions — basically, the language needs to authorize the frontend to emit an |
Yes, as long as the language permits it, it might not be a problem for the vendor. The reason EWG didn't reach consensus is that: And option2 doesn't have these problems, except that it will break more existing user code. So if option2 makes it into the standard, users will need to edit their code when upgrading. Personally, I feel that might not be a big deal. We don't have any consensus yet, so I think it would be good to try implementing option2 and see the feedback first.
For what it’s worth I’d be happy with option 2 even though it’s going to break my code. Will it be difficult to implement? |
As far as I can see, it wouldn't be difficult to implement. I just have too many TODOs to start working on it... |
Nice work, thank you! |
implement the option2 of P2014R0
This implements option2 of https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2014r0.pdf. This also fixes llvm/llvm-project#56671.
Although WG21 didn't reach consensus on the direction for this problem, we're happy to have some implementation and user experience first. And from issue 56671, option2 should be the one pursued.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D133341
Clang seems to fail to correctly align objects in a coroutine frame that need more than 8 bytes of alignment on x86-64. Here is a program that demonstrates this by having a 64-byte-aligned object persist across a co_await (Compiler Explorer):
When built from trunk with -std=c++20, this dies pretty reliably with a Bad alignment message. If you instead change the program so that it creates HighlyAligned objects on the stack in main, it doesn't crash.
(For the record, here is a gtest version that is a little easier to work with but has more dependencies.)
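The original reproducer is not included in this text. As a rough illustration of the kind of check it describes, here is a sketch of a 64-byte-aligned type that can verify its own address. Only the name HighlyAligned and the 64-byte requirement come from the report; the payload member and the well_aligned function are assumptions made for this example, and no coroutine is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative stand-in for the type in the report; the members are assumed.
struct alignas(64) HighlyAligned {
    char payload[64] = {};

    // True when this object sits at an address that satisfies its alignment.
    bool well_aligned() const {
        return reinterpret_cast<std::uintptr_t>(this) % alignof(HighlyAligned) == 0;
    }
};
```

On the stack this check is expected to pass, which matches the observation above. According to the report, the same kind of check fails when the object lives in a coroutine frame across a co_await on the affected Clang versions.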