-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures #77066
Conversation
@llvm/pr-subscribers-coroutines @llvm/pr-subscribers-clang Author: Utkarsh Saxena (usx95) ChangesProblemco_task<int> coro() {
int a = 1;
auto lamb = [a]() -> co_task<int> {
co_return a; // 'a' in the lambda object dies after the iniital_suspend in the lambda coroutine.
}();
co_return co_await lamb;
} Lambda captures (even by value) are prone to use-after-free once the lambda object dies. In the above example, the lambda object appears only as a temporary in the call expression. It dies after the first suspension ( SolutionThis problem can be formulated by saying that the Using this pattern inside a co_task<int> coro() {
int a = 1;
int res = co_await [a]() -> co_task<int> { co_return a; }();
co_return res;
} BackgroundThis came up in the discussion with seastar folks on RFC. This is a fairly common pattern in continuation-style-passing (CSP) async programming involving futures and continuations. Document "Lambda coroutine fiasco" by Seastar captures the problem. Fixes #76995 Full diff: https://github.com/llvm/llvm-project/pull/77066.diff 2 Files Affected:
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 60c0e3e74204ec..c100bf11454786 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -12,6 +12,7 @@
#include "clang/AST/ASTContext.h"
#include "clang/AST/DeclObjC.h"
+#include "clang/AST/Expr.h"
#include "clang/AST/ExprCXX.h"
#include "clang/AST/ExprObjC.h"
#include "clang/AST/ExprOpenMP.h"
@@ -33,6 +34,7 @@
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"
@@ -7575,15 +7577,27 @@ static void visitLifetimeBoundArguments(IndirectLocalPath &Path, Expr *Call,
Path.pop_back();
};
- if (ObjectArg && implicitObjectParamIsLifetimeBound(Callee))
- VisitLifetimeBoundArg(Callee, ObjectArg);
-
bool CheckCoroCall = false;
if (const auto *RD = Callee->getReturnType()->getAsRecordDecl()) {
CheckCoroCall = RD->hasAttr<CoroLifetimeBoundAttr>() &&
RD->hasAttr<CoroReturnTypeAttr>() &&
!Callee->hasAttr<CoroDisableLifetimeBoundAttr>();
}
+
+ if (ObjectArg) {
+ bool CheckCoroObjArg = CheckCoroCall;
+ // Ignore `__promise.get_return_object()` as it not lifetimebound.
+ if (Callee->getDeclName().isIdentifier() &&
+ Callee->getName() == "get_return_object")
+ CheckCoroObjArg = false;
+ // Coroutine lambda objects with empty capture list are not lifetimebound.
+ if (auto *LE = dyn_cast<LambdaExpr>(ObjectArg->IgnoreImplicit());
+ LE && LE->captures().empty())
+ CheckCoroObjArg = false;
+ if (implicitObjectParamIsLifetimeBound(Callee) || CheckCoroObjArg)
+ VisitLifetimeBoundArg(Callee, ObjectArg);
+ }
+
for (unsigned I = 0,
N = std::min<unsigned>(Callee->getNumParams(), Args.size());
I != N; ++I) {
diff --git a/clang/test/SemaCXX/coro-lifetimebound.cpp b/clang/test/SemaCXX/coro-lifetimebound.cpp
index 3fc7ca70a14a12..319134450e4b6f 100644
--- a/clang/test/SemaCXX/coro-lifetimebound.cpp
+++ b/clang/test/SemaCXX/coro-lifetimebound.cpp
@@ -64,6 +64,10 @@ Co<int> bar_coro(const int &b, int c) {
: bar_coro(0, 1); // expected-warning {{returning address of local temporary object}}
}
+// =============================================================================
+// Lambdas
+// =============================================================================
+namespace lambdas {
void lambdas() {
auto unsafe_lambda = [] [[clang::coro_wrapper]] (int b) {
return foo_coro(b); // expected-warning {{address of stack memory associated with parameter}}
@@ -84,15 +88,47 @@ void lambdas() {
co_return x + co_await foo_coro(b);
};
}
+
+Co<int> lambda_captures() {
+ int a = 1;
+ // Temporary lambda object dies.
+ auto lamb = [a](int x, const int& y) -> Co<int> { // expected-warning {{temporary whose address is used as value of local variable 'lamb'}}
+ co_return x + y + a;
+ }(1, a);
+ // Object dies but it has no capture.
+ auto no_capture = []() -> Co<int> { co_return 1; }();
+ auto bad_no_capture = [](const int& a) -> Co<int> { co_return a; }(1); // expected-warning {{temporary}}
+ // Temporary lambda object with lifetime extension under co_await.
+ int res = co_await [a](int x, const int& y) -> Co<int> {
+ co_return x + y + a;
+ }(1, a);
+ co_return 1;
+}
+} // namespace lambdas
+
// =============================================================================
-// Safe usage when parameters are value
+// Member coroutines
// =============================================================================
-namespace by_value {
-Co<int> value_coro(int b) { co_return co_await foo_coro(b); }
-[[clang::coro_wrapper]] Co<int> wrapper1(int b) { return value_coro(b); }
-[[clang::coro_wrapper]] Co<int> wrapper2(const int& b) { return value_coro(b); }
+namespace member_coroutines{
+struct S {
+ Co<int> member(const int& a) { co_return a; }
+};
+
+Co<int> use() {
+ S s;
+ int a = 1;
+ auto test1 = s.member(1); // expected-warning {{temporary whose address is used as value of local variable}}
+ auto test2 = s.member(a);
+ auto test3 = S{}.member(a); // expected-warning {{temporary whose address is used as value of local variable}}
+ co_return 1;
}
+[[clang::coro_wrapper]] Co<int> wrapper(const int& a) {
+ S s;
+ return s.member(a); // expected-warning {{address of stack memory}}
+}
+} // member_coroutines
+
// =============================================================================
// Lifetime bound but not a Coroutine Return Type: No analysis.
// =============================================================================
@@ -129,4 +165,22 @@ Co<int> foo_wrapper(const int& x) { return foo(x); }
// The call to foo_wrapper is wrapper is safe.
return foo_wrapper(1);
}
+
+struct S{
+[[clang::coro_wrapper, clang::coro_disable_lifetimebound]]
+Co<int> member(const int& x) { return foo(x); }
+};
+
+Co<int> use() {
+ S s;
+ int a = 1;
+ auto test1 = s.member(1); // param is not flagged.
+ auto test2 = S{}.member(a); // 'this' is not flagged.
+ co_return 1;
+}
+
+[[clang::coro_wrapper]] Co<int> return_stack_addr(const int& a) {
+ S s;
+ return s.member(a); // return of stack addr is not flagged.
+}
} // namespace disable_lifetimebound
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it'll be better to say this is related to [[coro_lifetimebound]]
in the title. My instinct reaction to this is that "no, this is not strictly correct". But I feel good after I know it is an extension of [[coro_lifetimebound]]
only.
✅ With the latest revision this PR passed the C/C++ code formatter. |
clang/lib/Sema/SemaDecl.cpp
Outdated
// Allow some_promise_type::get_return_object(). | ||
// Since we are still in the promise definition, we can only do this | ||
// heuristically as the promise may not be yet associated to a coroutine. | ||
if (isa<CXXMethodDecl>(FD) && FD->getDeclName().isIdentifier() && | ||
FD->getName().equals("get_return_object") && FD->param_empty()) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return directly if it has CoroDisableLifetimeBoundAttr
? (Maybe we need to edit the semantics of CoroDisableLifetimeBoundAttr
a little bit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what purpose it would serve. Do you want [[clang::coro_disable_lifetimebound]]
to imply [[clang::coro_wrapper]]
for brevity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intention here is to remove the special handling to 'get_return_object'. The 'coro_wrapper' attribute should be applied here IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point (in ActOnFinishFunctionBody
), neither CoroDisableLifetimeBoundAttr
nor CoroWrapperAttr
would be available since these are only added to the get_return_object
FunctionDecl once we encounter a coroutine body and make the return object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. In this case, we can make it into a PendingXXX struct in Sema and we can check it in the end of the TU. But this is not required in this patch. If you don't want to address that in the current patch, let's leave a FIXME or a TODO here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that would work either. For example, consider TUs with only promise definitions and no coroutine definitions. These attr would not be available even in the end of TU.
But in that case we shouldn't see any coro_return_type
annotations either and there will be no pending actions? I don't see why this approach won't work.
I am worried about breaking the ODR rules, though. @ChuanqiXu9 is it fine to add attributes to existing functions conditioned on how they are used rather than how they are defined?
Imagine two modules:
- one only defines the promise type, but never actually uses it as a coroutine, so we don't happen to get those attributes.
- another one imports the module from the previous step and attributes get added.
We can now easily get object files that have both definitions of those functions with and without the attributes (by importing either one of the other). I am not sure if there are some potential failures lurking in if we do that, e.g. could ODR hash checks for importing both as submodules will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in that case we shouldn't see any coro_return_type annotations either and there will be no pending actions? I don't see why this approach won't work.
Why not ? We would always see the coro_return_type
annotation for the return type. The declaration (and therefore the annotations) of ACoroReturnType
would be visible to ACoroReturnType get_return_object() { return {}; }
. This TU does not have to have a coroutine definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, sorry. I somehow thought we only warn when there are real coroutine somewhere, but the whole point of this check is to flag non-coroutine functions that are not flagged as wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that would work either. For example, consider TUs with only promise definitions and no coroutine definitions. These attr would not be available even in the end of TU.
Oh, yeah. I didn't think about modules : ).
@ChuanqiXu9 is it fine to add attributes to existing functions conditioned on how they are used rather than how they are defined?
Then it is troublesome to use this attribute, which violates our intention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ilya-biryukov for flagging the issue with ODR violation/Modules. Then I don't think it is wise to add these attributes. I have reverted to the not-so-great heuristic of matching function names.
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with the changes, but I left a suggestion about naming the function that I think we should follow.
Thanks for the patch! |
…aptures (llvm#77066) ### Problem ```cpp co_task<int> coro() { int a = 1; auto lamb = [a]() -> co_task<int> { co_return a; // 'a' in the lambda object dies after the iniital_suspend in the lambda coroutine. }(); co_return co_await lamb; } ``` [use-after-free](https://godbolt.org/z/GWPEovWWc) Lambda captures (even by value) are prone to use-after-free once the lambda object dies. In the above example, the lambda object appears only as a temporary in the call expression. It dies after the first suspension (`initial_suspend`) in the lambda. On resumption in `co_await lamb`, the lambda accesses `a` which is part of the already-dead lambda object. --- ### Solution This problem can be formulated by saying that the `this` parameter of the lambda call operator is a lifetimebound parameter. The lambda object argument should therefore live atleast as long as the return object. That said, this requirement does not hold if the lambda does not have a capture list. In principle, the coroutine frame still has a reference to a dead lambda object, but it is easy to see that the object would not be used in the lambda-coroutine body due to no capture list. It is safe to use this pattern inside a`co_await` expression due to the lifetime extension of temporaries. Example: ```cpp co_task<int> coro() { int a = 1; int res = co_await [a]() -> co_task<int> { co_return a; }(); co_return res; } ``` --- ### Background This came up in the discussion with seastar folks on [RFC](https://discourse.llvm.org/t/rfc-lifetime-bound-check-for-parameters-of-coroutines/74253/19?u=usx95). This is a fairly common pattern in continuation-style-passing (CSP) async programming involving futures and continuations. Document ["Lambda coroutine fiasco"](https://github.com/scylladb/seastar/blob/master/doc/lambda-coroutine-fiasco.md) by Seastar captures the problem. This pattern makes the migration from CSP-style async programming to coroutines very bugprone. Fixes llvm#76995 --------- Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
Problem
use-after-free
Lambda captures (even by value) are prone to use-after-free once the lambda object dies. In the above example, the lambda object appears only as a temporary in the call expression. It dies after the first suspension (
initial_suspend
) in the lambda.On resumption in
co_await lamb
, the lambda accessesa
which is part of the already-dead lambda object.Solution
This problem can be formulated by saying that the
this
parameter of the lambda call operator is a lifetimebound parameter. The lambda object argument should therefore live atleast as long as the return object.That said, this requirement does not hold if the lambda does not have a capture list. In principle, the coroutine frame still has a reference to a dead lambda object, but it is easy to see that the object would not be used in the lambda-coroutine body due to no capture list.
It is safe to use this pattern inside a
co_await
expression due to the lifetime extension of temporaries. Example:Background
This came up in the discussion with seastar folks on RFC. This is a fairly common pattern in continuation-style-passing (CSP) async programming involving futures and continuations. Document "Lambda coroutine fiasco" by Seastar captures the problem.
This pattern makes the migration from CSP-style async programming to coroutines very bugprone.
Fixes #76995