test_cluster_restore_no_resurrection dtest fails with commitlog min_gc fix when tombstone_gc is correctly deferred without node.flush #15777
Comments
The regression was introduced to master in f42eb4d
@avikivity @mykaul if we don't find an immediate solution and we convince ourselves that there's no flaw in the test, we should consider reverting that merge, and adding
git bisect points at 3378c24
Cc @eliransin
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, compaction is unaware of mutations in the commitlog, and because different tables share commitlog segments, it is possible that data or tombstones will get resurrected by commitlog replay if the node restarts right after the respective data and tombstones are purged by (major) compaction after f42eb4d. This patch calls `database::flush_on_all_shards` in `major_keyspace_compaction_task_impl` before tables are compacted to ensure that any data in memtables is flushed to sstables, AND commitlog replay will not replay it back after restart. Note that this requires flushing all tables (and their memtables) in each shard since they share the commitlog, and an unflushed and unrelated table may hold a commitlog segment that stores mutations from another table that is about to get compacted. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
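The segment-sharing argument in the commit message above can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyNode` and all of its methods are hypothetical, it ignores timestamps, replay positions and sharding, and it assumes that a commitlog segment is retained as long as any table still has unflushed writes in it and that a restart replays every retained entry. It is not Scylla's implementation.

```python
class ToyNode:
    """Toy model: one commitlog shared by all tables (hypothetical, not Scylla code)."""

    def __init__(self):
        self.segments = [{"entries": [], "dirty": set()}]  # commitlog segments
        self.memtables = {}  # table -> list of (op, key)
        self.sstables = {}   # table -> list of (op, key)

    def write(self, table, op, key):
        seg = self.segments[-1]
        seg["entries"].append((table, op, key))
        seg["dirty"].add(table)  # table now has unflushed data in this segment
        self.memtables.setdefault(table, []).append((op, key))

    def new_segment(self):
        self.segments.append({"entries": [], "dirty": set()})

    def flush(self, table):
        self.sstables.setdefault(table, []).extend(self.memtables.pop(table, []))
        for seg in self.segments:
            seg["dirty"].discard(table)
        # Assumption: a segment is released only when no table is dirty in it.
        self.segments = [s for s in self.segments if s["dirty"]]
        if not self.segments:
            self.new_segment()

    def flush_all(self):
        for table in list(self.memtables):
            self.flush(table)

    def major_compact(self, table):
        self.flush(table)  # per-table flush, as major compaction already does
        live = []
        for op, key in self.sstables.get(table, []):
            if op == "put" and key not in live:
                live.append(key)
            elif op == "del" and key in live:
                live.remove(key)
        # Deleted data and its tombstones are both purged.
        self.sstables[table] = [("put", k) for k in live]

    def restart(self):
        self.memtables = {}
        for seg in self.segments:  # replay every retained segment
            for table, op, key in seg["entries"]:
                self.memtables.setdefault(table, []).append((op, key))
                seg["dirty"].add(table)

    def read(self, table):
        live = set()
        for op, key in self.sstables.get(table, []) + self.memtables.get(table, []):
            if op == "put":
                live.add(key)
            else:
                live.discard(key)
        return sorted(live)


def scenario(flush_all_first):
    node = ToyNode()
    node.write("A", "put", "k")
    node.write("B", "put", "x")  # unrelated table pins the same segment
    node.new_segment()
    node.write("A", "del", "k")
    if flush_all_first:
        node.flush_all()
    node.major_compact("A")
    node.restart()
    return node.read("A")
```

In this model, `scenario(False)` brings key `k` back after the restart because table `B` kept the first segment alive, while `scenario(True)` does not, since flushing every table released all segments before the purge.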
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, compaction is unaware of mutations in the commitlog, and because different tables share commitlog segments, it is possible that data or tombstones will get resurrected by commitlog replay if the node restarts right after the respective data and tombstones are purged by (major) compaction after f42eb4d. This patch calls `database::flush_all_tables` in `shard_major_keyspace_compaction_task_impl` before tables are compacted to ensure that any data in memtables is flushed to sstables, AND a new commitlog segment is forced so that major compaction is able to purge tombstones more efficiently, without them being locked by live commitlog segments. Note that this requires flushing all tables (and their memtables) in each shard since they share the commitlog, and an unflushed and unrelated table may hold a commitlog segment that stores mutations from another table that is about to get compacted. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
@fruch - is https://github.com/scylladb/scylla-dtest/pull/3689 actually fixing THIS issue?
Hmm, that's what @elcallio was pointing to...
@mykaul One can claim this is a test issue rather than a scylla issue, since there is no clear definition of what major compaction is responsible for. With regard to tombstone purging, whether in regular or major compaction, there are now 2 processes external to compaction that are a prerequisite to maximizing tombstone garbage collection:
And https://github.com/scylladb/scylla-dtest/pull/3689 is adding the latter. Ideally, the issue should have been moved to scylla-dtest first, and then it would be clearer how a dtest change fixes it.
The latter (nodetool flush) is not documented as such a method (per https://opensource.docs.scylladb.com/stable/operating-scylla/nodetool-commands/flush.html), but OK - understood - just wanted to ensure there's no error here.
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, compaction is unaware of mutations in the commitlog, and because different tables share commitlog segments, it is possible that data or tombstones will get resurrected by commitlog replay if the node restarts right after the respective data and tombstones are purged by (major) compaction after f42eb4d. This patch calls `database::flush_all_tables`, if the `compaction_flush_all_tables_before_major` option is enabled, before tables are compacted to ensure that any data in memtables is flushed to sstables, AND a new commitlog segment is forced so that major compaction is able to purge tombstones more efficiently, without them being locked by live commitlog segments. Note that this requires flushing all tables (and their memtables) in each shard since they share the commitlog, and an unflushed and unrelated table may hold a commitlog segment that stores mutations from another table that is about to get compacted. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, compaction is unaware of mutations in the commitlog, and because different tables share commitlog segments, it is possible that data or tombstones will get resurrected by commitlog replay if the node restarts right after the respective data and tombstones are purged by (major) compaction after f42eb4d. This patch calls `database::flush_all_tables`, based on the `compaction_flush_all_tables_before_major_seconds` interval, before tables are compacted to ensure that any data in memtables is flushed to sstables, AND a new commitlog segment is forced so that major compaction is able to purge tombstones more efficiently, without them being locked by live commitlog segments. Note that this requires flushing all tables (and their memtables) in each shard since they share the commitlog, and an unflushed and unrelated table may hold a commitlog segment that stores mutations from another table that is about to get compacted. In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (see f42eb4d). Flushing all tables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since, when flushing all keyspaces using `nodetool flush`, the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
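The interval mechanism described in this commit message could be sketched roughly as follows. The class name, method names and returned labels are hypothetical and only illustrate the decision. The 24 hour default is the one stated above. How Scylla tracks the last flush, and how it treats special values of the option, is not modelled here.

```python
import time


class MajorCompactionFlushPolicy:
    """Hypothetical sketch of the 'flush all tables at most once per interval' rule."""

    def __init__(self, interval_seconds=24 * 3600, clock=time.monotonic):
        self.interval_seconds = interval_seconds
        self.clock = clock              # injectable clock, handy for tests
        self.last_flush_all = None      # time of the last flush of all tables

    def plan(self):
        """Decide which flush a major compaction request should perform."""
        now = self.clock()
        due = (
            self.last_flush_all is None
            or now - self.last_flush_all >= self.interval_seconds
        )
        if due:
            # Flushing every table releases all commitlog segment references.
            self.last_flush_all = now
            return "flush_all_tables"
        # Old behavior: flush only the tables of the keyspace being compacted.
        return "flush_each_table_in_keyspace"
```

With this shape, a burst of per-keyspace compaction requests arriving within one interval triggers a single flush of all tables, and the remaining requests fall back to the per-table flush, which is the guard against producing many tiny sstables.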
I'd like #15820 to fix this issue, not https://github.com/scylladb/scylla-dtest/pull/3689. The latter indeed fixes the test and the regression, but it's superficial, while the scylla change fixes it more comprehensively, so it would serve common use cases beyond this particular dtest.
…t and flushing all tables' from Benny Halevy

Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6

However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (see f42eb4d). Flushing all tables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction.

However, flushing all tables too frequently might result in tiny sstables. Since, when flushing all keyspaces using `nodetool flush`, the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours).

In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it.

Fixes #15777
Closes #15820

* github.com:scylladb/scylladb:
  docs: nodetool: flush: enrich examples
  docs: nodetool: compact: fix example
  api: add /storage_service/compact
  api: add /storage_service/flush
  compaction_manager: flush_all_tables before major compaction
  database: add flush_all_tables
  api: compaction: add flush_memtables option
  test/nodetool: jmx: fix path to scripts/scylla-jmx
  scylla-nodetool, docs: improve optional params documentation
@scylladb/scylla-maint - can we backport this to 5.4?
Why only 5.4? If it is worth backporting, we should backport to all currently supported OSS releases. @bhalevy should we backport this?
There's a dependency on f42eb4d which is only in 5.4
The title of this issue mentions that the test fails "with commitlog min_gc fix", which is the one I mentioned above
Re-ping @scylladb/scylla-maint - please backport to 5.4 (only, per above)
Doesn't apply cleanly, please provide a backport PR.
@bhalevy ^^
@tchaikov would you like to do the backport?
Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6 However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (see f42eb4d). Flushing all tables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since, when flushing all keyspaces using `nodetool flush`, the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit 66ba983)
Backport created at #16756
… active segment and flushing all tables' from Kefu Chai

Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See 64ec1c6

However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (see f42eb4d). Flushing all tables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction.

However, flushing all tables too frequently might result in tiny sstables. Since, when flushing all keyspaces using `nodetool flush`, the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours).

In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it.

Fixes #15777
Closes #15820

To address the conflict, the following change is also included in this changeset:

tools/scylla-nodetool: implement the cleanup command

The --jobs command-line argument is accepted but ignored, just like the current nodetool does.

Refs: #15588
Closes #16756

* github.com:scylladb/scylladb:
  docs: nodetool: flush: enrich examples
  docs: nodetool: compact: fix example
  api: add /storage_service/compact
  api: add /storage_service/flush
  tools/scylla-nodetool: implement the flush command
  compaction_manager: flush_all_tables before major compaction
  database: add flush_all_tables
  api: compaction: add flush_memtables option
  test/nodetool: jmx: fix path to scripts/scylla-jmx
  scylla-nodetool, docs: improve optional params documentation
  tools/scylla-nodetool: extract keyspace/table parsing
  tools/scylla-nodetool: implement the cleanup command
  test/nodetool: rest_api_mock: add more options for multiple requests
Seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-release/397/testReport/cleanup_test/TestCleanup/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split001___test_cluster_restore_no_resurrection/
With Scylla version 7d5e22b