[dtest] Tests / Sanity Tests-RAFT / throughput_limits_test.TestCompactionLimitThroughput.test_can_limit_compaction_throughput is flakey #15721
Comments
/cc @elcallio
The log message indicates that a mutation is being written during or after commitlog shutdown. Since we are writing to compaction history, I guess it could be part of shutting down sstables? @raphaelsc - any ideas?
Yes, it indeed looks like a race in shutdown. In database::stop, we have:
so user tables are correctly closed first, but a system table (or even a user table from another shard) could still write into compaction_history after the latter was already closed. The effect is that the history misses a record for a table. Not sure it's worth fixing. A possible fix is to fail the compaction if the write to history failed, but I don't think that's a good idea: having the compaction complete matters much more than a missing entry in the history.
We should quiesce compactions before closing the tables.
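The ordering problem discussed above can be illustrated with a small, deterministic Python model. This is only a sketch under stated assumptions, not Scylla's actual shutdown code: `Table`, `Compaction`, `make_node`, `shutdown_racy` and `shutdown_quiesced` are hypothetical names, the table called `other` stands in for any table with a running compaction, and the "parallel" close is modelled as an explicit worst-case close order.

```python
class Table:
    """Toy table: writes fail once the table has been closed."""

    def __init__(self, name):
        self.name = name
        self.closed = False
        self.rows = []

    def write(self, row):
        if self.closed:
            raise RuntimeError(f"write to closed table {self.name}")
        self.rows.append(row)

    def close(self):
        self.closed = True


class Compaction:
    """Toy compaction that records itself in the history table when it finishes."""

    def __init__(self, table, history):
        self.table = table
        self.history = history
        self.done = False

    def finish(self):
        """Return True if the history record was written, False if it was lost."""
        if self.done:
            return True
        self.done = True
        try:
            self.history.write(self.table.name)
            return True
        except RuntimeError:
            return False


def make_node():
    """Build a hypothetical node: a history table plus one table being compacted."""
    history = Table("compaction_history")
    other = Table("other")
    tables = {"compaction_history": history, "other": other}
    compactions = {"other": Compaction(other, history)}
    return tables, compactions


def shutdown_racy(tables, compactions, close_order):
    """Each table stops its own compaction as part of close; close order is arbitrary."""
    lost = []
    for name in close_order:
        compaction = compactions.get(name)
        if compaction is not None and not compaction.finish():
            lost.append(name)
        tables[name].close()
    return lost


def shutdown_quiesced(tables, compactions, close_order):
    """Quiesce every compaction first, then close the tables in any order."""
    lost = [name for name, c in compactions.items() if not c.finish()]
    for name in close_order:
        tables[name].close()
    return lost
```

In this model, closing `compaction_history` before `other` makes `shutdown_racy` lose the history record for `other`, while `shutdown_quiesced` writes the record regardless of the close order.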
Seen again failing in CI run
@bhalevy - please prioritize and assign
@raphaelsc Is
Seen again in https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/4992/:
… shutdown During shutdown, as all system tables are closed in parallel, there is a possibility of a race condition between compaction stoppage and the closure of the compaction_history table. So, stop all the compaction tasks before attempting to close the tables. Fixes scylladb#15721 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
…ing shutdown During shutdown, as all system tables are closed in parallel, there is a possibility of a race condition between compaction stoppage and the closure of the compaction_history table. So, quiesce all the compaction tasks before attempting to close the tables. Fixes scylladb#15721 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Seen again on 2024.1 - @scylladb/scylla-maint please backport
…ing shutdown During shutdown, as all system tables are closed in parallel, there is a possibility of a race condition between compaction stoppage and the closure of the compaction_history table. So, quiesce all the compaction tasks before attempting to close the tables. Fixes #15721 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes #17218 (cherry picked from commit 3b7b315)
Backported to 5.4. @raphaelsc / @lkshminarayanan is this needed in 5.2?
…ing shutdown During shutdown, as all system tables are closed in parallel, there is a possibility of a race condition between compaction stoppage and the closure of the compaction_history table. So, quiesce all the compaction tasks before attempting to close the tables. Fixes scylladb#15721 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb#17218
https://jenkins.scylladb.com/view/nexts/job/scylla-master/job/next/6668/testReport/junit/throughput_limits_test/TestCompactionLimitThroughput/Tests___Sanity_Tests_RAFT___test_can_limit_compaction_throughput/
did not run on r3, seems like a real bug.