-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compaction: Fix sstable cleanup after resharding on refresh #14147
compaction: Fix sstable cleanup after resharding on refresh #14147
Conversation
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
CI state |
@@ -1590,6 +1590,10 @@ bool needs_cleanup(const sstables::shared_sstable& sst, | |||
|
|||
bool compaction_manager::update_sstable_cleanup_state(table_state& t, const sstables::shared_sstable& sst, const dht::token_range_vector& sorted_owned_ranges) { | |||
auto& cs = get_compaction_state(&t); | |||
if (sst->is_shared()) { | |||
throw std::runtime_error(format("Shared SSTable {} cannot be marked as requiring cleanup, as it can only be processed by resharding", | |||
sst->get_filename())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we now have a printer for shared_sstable that prints also the sstable origin, that might be relevant here, so better pass just sst
.
co_await d.filter_sstables([&] (sstables::shared_sstable sst) -> future<bool> { | ||
if (table.update_sstable_cleanup_state(sst, *sorted_owned_ranges_ptr)) { | ||
if (needs_cleanup(sst, *sorted_owned_ranges_ptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the reason this wasn't the root cause for #14001 is that here we filter the unshared sstables, meaning it's benign to cleanup those sstables later on from the multi-shard sharing perspective.
But the hunk is still needed.
The original intent was to detect sstables requiring cleanup on start-up, when populating the table.
But we're not there yet as we don't have the token metadata in this early stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, thanks!
Just one nit regarding the runtime error message.
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#14147 (cherry picked from commit 156d771) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes #14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14147 (cherry picked from commit 156d771)
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#14147 (cherry picked from commit 156d771) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#14147 (cherry picked from commit 156d771) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#14147 (cherry picked from commit 156d771) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Problem can be reproduced easily: 1) wrote some sstables with smp 1 2) shut down scylla 3) moved sstables to upload 4) restarted scylla with smp 2 5) ran refresh (resharding happens, adds sstable to cleanup set and never removes it) 6) cleanup (tries to cleanup resharded sstables which were leaked in the cleanup set) Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding. Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files). Dtest: scylladb/scylla-dtest#3206 Fixes scylladb#14001. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#14147 (cherry picked from commit 156d771) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Problem can be reproduced easily:
Bumps into assert "Assertion `!sst->is_shared()' failed", as cleanup picks a shared sstable that was leaked and already processed by resharding.
Fix is about not inserting shared sstables into cleanup set, as shared sstables are restricted to resharding and cannot be processed later by cleanup (nor it should because resharding itself cleaned up its input files).
Dtest: https://github.com/scylladb/scylla-dtest/pull/3206
Fixes #14001.