Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use-after-free on memtable::_cleaner when partition snapshot is destroyed in the background after memtable flush #4030

Closed
tgrabiec opened this Issue Dec 27, 2018 · 0 comments

Comments

Projects
None yet
1 participant
@tgrabiec
Copy link
Contributor

tgrabiec commented Dec 27, 2018

Scylla version: f3da043 (>= 3.0-rc1)

May manifest as a crash in the following place:

partition_version::clear_gently(cache_tracker*) at /usr/src/debug/scylla-3.0.rc3/partition_version.cc:77
 (inlined by) mutation_cleaner_impl::destroy_gently(partition_version&) at /usr/src/debug/scylla-3.0.rc3/mutation_cleaner.hh:80
 (inlined by) mutation_cleaner::destroy_gently(partition_version&) at /usr/src/debug/scylla-3.0.rc3/mutation_cleaner.hh:148
 (inlined by) remove_or_mark_as_unique_owner at /usr/src/debug/scylla-3.0.rc3/partition_version.cc:38
operator() at /usr/src/debug/scylla-3.0.rc3/partition_version.cc:178
 (inlined by) with_allocator<partition_snapshot::~partition_snapshot()::<lambda()> > at /usr/src/debug/scylla-3.0.rc3/utils/allocation_strategy.hh:307
 (inlined by) partition_snapshot::~partition_snapshot() at /usr/src/debug/scylla-3.0.rc3/partition_version.cc:174
seastar::internal::lw_shared_ptr_accessors_esft<partition_snapshot>::dispose(partition_snapshot*) at /usr/src/debug/scylla-3.0.rc3/seastar/core/shared_ptr.hh:198
 (inlined by) seastar::lw_shared_ptr<partition_snapshot>::dispose(partition_snapshot*) at /usr/src/debug/scylla-3.0.rc3/seastar/core/shared_ptr.hh:276
 (inlined by) mutation_cleaner_impl::merge_some() at /usr/src/debug/scylla-3.0.rc3/mutation_partition.cc:2464
operator() at /usr/src/debug/scylla-3.0.rc3/mutation_partition.cc:2422
 (inlined by) apply at /usr/src/debug/scylla-3.0.rc3/seastar/core/apply.hh:35
 (inlined by) apply<mutation_cleaner_impl::start_worker()::<lambda()> mutable::<lambda()>::<lambda()> > at /usr/src/debug/scylla-3.0.rc3/seastar/core/apply.hh:43
 (inlined by) apply<mutation_cleaner_impl::start_worker()::<lambda()> mutable::<lambda()>::<lambda()> > at /usr/src/debug/scylla-3.0.rc3/seastar/core/future.hh:1309
 (inlined by) then<mutation_cleaner_impl::start_worker()::<lambda()> mutable::<lambda()>::<lambda()> > at /usr/src/debug/scylla-3.0.rc3/seastar/core/future.hh:952
 (inlined by) operator() at /usr/src/debug/scylla-3.0.rc3/mutation_partition.cc:2425
apply at /usr/src/debug/scylla-3.0.rc3/seastar/core/apply.hh:35
 (inlined by) apply<mutation_cleaner_impl::start_worker()::<lambda()> mutable::<lambda()>&> at /usr/src/debug/scylla-3.0.rc3/seastar/core/apply.hh:43
 (inlined by) apply<mutation_cleaner_impl::start_worker()::<lambda()> mutable::<lambda()>&> at /usr/src/debug/scylla-3.0.rc3/seastar/core/future.hh:1389
 (inlined by) operator() at /usr/src/debug/scylla-3.0.rc3/seastar/core/future-util.hh:86
 (inlined by) run_and_dispose at /usr/src/debug/scylla-3.0.rc3/seastar/core/task.hh:48
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/src/debug/scylla-3.0.rc3/seastar/core/reactor.cc:2695
seastar::reactor::run_some_tasks() at /usr/src/debug/scylla-3.0.rc3/seastar/core/reactor.cc:3118
seastar::reactor::run_some_tasks() at /usr/src/debug/scylla-3.0.rc3/seastar/util/log.hh:313
 (inlined by) seastar::reactor::run() at /usr/src/debug/scylla-3.0.rc3/seastar/core/reactor.cc:3265
seastar::smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()() const at /usr/src/debug/scylla-3.0.rc3/seastar/core/reactor.cc:4334
std::function<void ()>::operator()() const at /opt/scylladb/include/c++/7/bits/std_function.h:706
 (inlined by) seastar::posix_thread::start_routine(void*) at /usr/src/debug/scylla-3.0.rc3/seastar/core/posix.cc:52

Split from #4018

@tgrabiec tgrabiec added the bug label Dec 27, 2018

@tgrabiec tgrabiec self-assigned this Dec 27, 2018

avikivity added a commit that referenced this issue Dec 27, 2018

Merge "Fix use-after-free when destroying partition_snapshots in the …
…background" from Tomasz

"
partition_snapshots created in the memtable will keep a reference to
the memtable (as region*) and to memtable::_cleaner. As long as the
reader is alive, the memtable will be kept alive by
partition_snapshot_flat_reader::_container_guard. But after that
nothing prevents it from being destroyed. The snapshot can outlive the
read if mutation_cleaner::merge_and_destroy() defers its destruction
for later. When the read ends after memtable was flushed, the snapshot
will be queued in the cache's cleaner, but internally will reference
memtable's region and cleaner. This will result in a use-after-free
when the snapshot resumes destruction.

The fix is to update snapshots's region and cleaner references at the
time of queueing to point to the cache's region and cleaner.

When memtable is destroyed without being moved to cache there is no
problem because the snapshot would be queued into memtable's cleaner,
which will be drained on destruction from all snapshots.

Introduced in f3da043 (in >= 3.0-rc1)

Fixes #4030.
"

* 'fix-snapshot-merging-use-after-free' of github.com:tgrabiec/scylla:
  tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed
  tests: mvcc: Introduce mvcc_container::migrate()
  tests: mvcc: Make mvcc_partition move-constructible
  tests: mvcc: Introduce mvcc_container::make_not_evictable()
  tests: mvcc: Allow constructing mvcc_container without a cache_tracker
  mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup
  mvcc: partition_snapshot: Introduce migrate()
  mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner

avikivity added a commit that referenced this issue Dec 28, 2018

Merge "Fix use-after-free when destroying partition_snapshots in the …
…background"from Tomasz

"
partition_snapshots created in the memtable will keep a reference to
the memtable (as region*) and to memtable::_cleaner. As long as the
reader is alive, the memtable will be kept alive by
partition_snapshot_flat_reader::_container_guard. But after that
nothing prevents it from being destroyed. The snapshot can outlive the
read if mutation_cleaner::merge_and_destroy() defers its destruction
for later. When the read ends after memtable was flushed, the snapshot
will be queued in the cache's cleaner, but internally will reference
memtable's region and cleaner. This will result in a use-after-free
when the snapshot resumes destruction.

The fix is to update snapshots's region and cleaner references at the
time of queueing to point to the cache's region and cleaner.

When memtable is destroyed without being moved to cache there is no
problem because the snapshot would be queued into memtable's cleaner,
which will be drained on destruction from all snapshots.

Introduced in f3da043 (in >= 3.0-rc1)

Fixes #4030.

Tests:

  - mvcc_test (debug)

"

* tag 'fix-snapshot-merging-use-after-free-v1.1' of github.com:tgrabiec/scylla:
  tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed
  tests: mvcc: Introduce mvcc_container::migrate()
  tests: mvcc: Make mvcc_partition move-constructible
  tests: mvcc: Introduce mvcc_container::make_not_evictable()
  tests: mvcc: Allow constructing mvcc_container without a cache_tracker
  mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup
  mvcc: partition_snapshot: Introduce migrate()
  mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner

avikivity added a commit that referenced this issue Dec 28, 2018

Merge "Fix use-after-free when destroying partition_snapshots in the …
…background"from Tomasz

"
partition_snapshots created in the memtable will keep a reference to
the memtable (as region*) and to memtable::_cleaner. As long as the
reader is alive, the memtable will be kept alive by
partition_snapshot_flat_reader::_container_guard. But after that
nothing prevents it from being destroyed. The snapshot can outlive the
read if mutation_cleaner::merge_and_destroy() defers its destruction
for later. When the read ends after memtable was flushed, the snapshot
will be queued in the cache's cleaner, but internally will reference
memtable's region and cleaner. This will result in a use-after-free
when the snapshot resumes destruction.

The fix is to update snapshots's region and cleaner references at the
time of queueing to point to the cache's region and cleaner.

When memtable is destroyed without being moved to cache there is no
problem because the snapshot would be queued into memtable's cleaner,
which will be drained on destruction from all snapshots.

Introduced in f3da043 (in >= 3.0-rc1)

Fixes #4030.

Tests:

  - mvcc_test (debug)

"

* tag 'fix-snapshot-merging-use-after-free-v1.1' of github.com:tgrabiec/scylla:
  tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed
  tests: mvcc: Introduce mvcc_container::migrate()
  tests: mvcc: Make mvcc_partition move-constructible
  tests: mvcc: Introduce mvcc_container::make_not_evictable()
  tests: mvcc: Allow constructing mvcc_container without a cache_tracker
  mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup
  mvcc: partition_snapshot: Introduce migrate()
  mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner

(cherry picked from commit 8e2f6d0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.