New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use-after-free in cql3::query_processor::migration_subscriber::on_drop_column_family #15097
Comments
Decoded:
|
Looks like this was exposed with 825d617, maybe in combination of clang 16 (miscompile?). My theory is that the schema_ptr passed to:
is a temporary implicitly created by |
Cc @avikivity |
Reproduced with this patch: diff --git a/service/migration_manager.cc b/service/migration_manager.cc
index 78acb8aeb3..90fee4e9df 100644
--- a/service/migration_manager.cc
+++ b/service/migration_manager.cc
@@ -566,6 +566,7 @@ future<> migration_notifier::drop_keyspace(const sstring& ks_name) {
future<> migration_notifier::drop_column_family(const schema_ptr& cfm) {
const auto& ks_name = cfm->ks_name();
const auto& cf_name = cfm->cf_name();
+ co_await sleep(1ms);
co_await on_schema_change([&] (migration_listener* listener) {
listener->on_drop_column_family(ks_name, cf_name);
}, [&] (std::exception_ptr ex) { |
To prevent use-after-free as seen in scylladb#15097 where a temp schema_ptr retrieved from a global_schema_ptr get destroyed when the notification function yielded. Capturing the schema_ptr on the coroutine frame is inexpensive since its a shared ptr and it makes sure that the schema remains valid throughput the coroutine life time. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To prevent use-after-free as seen in scylladb#15097 where a temp schema_ptr retrieved from a global_schema_ptr get destroyed when the notification function yielded. Capturing the schema_ptr on the coroutine frame is inexpensive since its a shared ptr and it makes sure that the schema remains valid throughput the coroutine life time. Fixes scylladb#15097 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Hmm, it looks like the culprit is at the call site:
The life of the temporary schema_ptr generated from OTOH turning the lambda into a coroutine lambda and doing I reproduced this with a reduced unit test in seastar: SEASTAR_TEST_CASE(test_coro_arg_life_extension) {
struct value {
int i;
value(int v) : i(v) {
seastar_logger.info("value {}", i);
}
~value() {
seastar_logger.info("~value {}", i);
i = -1;
}
};
struct gen {
int i = 0;
gen() = default;
gen(int first) : i(first) {}
value operator()() {
return value(++i);
}
operator value() {
return this->operator()();
}
};
auto coro = [] (const value& v) -> future<> {
auto saved = v.i;
seastar_logger.info("entered coro {}", v.i);
co_await sleep(1ms);
seastar_logger.info("resumed coro {}", v.i);
BOOST_REQUIRE_EQUAL(saved, v.i);
};
std::vector<gen> gens;
gens.reserve(3);
gens.emplace_back(0);
gens.emplace_back(10);
gens.emplace_back(20);
co_await parallel_for_each(gens, [&] (gen& g) { return coro(g); });
} |
And when did this start? |
|
To prevent use-after-free as seen in scylladb#15097 where a temp schema_ptr retrieved from a global_schema_ptr get destroyed when the notification function yielded. Capturing the schema_ptr on the coroutine frame is inexpensive since its a shared ptr and it makes sure that the schema remains valid throughput the coroutine life time. Fixes scylladb#15097 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb#15098
To prevent use-after-free as seen in scylladb#15097 where a temp schema_ptr retrieved from a global_schema_ptr get destroyed when the notification function yielded. Capturing the schema_ptr on the coroutine frame is inexpensive since its a shared ptr and it makes sure that the schema remains valid throughput the coroutine life time. Fixes scylladb#15097 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb#15098
To prevent use-after-free as seen in scylladb#15097 where a temp schema_ptr retrieved from a global_schema_ptr get destroyed when the notification function yielded. Capturing the schema_ptr on the coroutine frame is inexpensive since its a shared ptr and it makes sure that the schema remains valid throughput the coroutine life time. \Fixes scylladb#15097 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> \Closes scylladb#15098 (cherry picked from commit 0f54e24)
Seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-debug/249/artifact/logs-full.debug.020/1692515592211_schema_management_test.py%3A%3ATestSchemaManagement%3A%3Atest_update_schema_while_node_is_killed%5Bdrop_table%5D/node2.log
The text was updated successfully, but these errors were encountered: