compaction_manager::perform_cleanup does not handle condition_variable_timed_out #15669
bhalevy added a commit to bhalevy/scylla that referenced this issue Oct 9, 2023
The polling loop was intended to ignore `condition_variable_timed_out` and check for progress using a longer `max_idle_duration` timeout in the loop. Fixes scylladb#15669 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This was referenced Oct 9, 2023
bhalevy added a commit to bhalevy/scylla that referenced this issue Nov 13, 2023
The polling loop was intended to ignore `condition_variable_timed_out` and check for progress using a longer `max_idle_duration` timeout in the loop. Fixes scylladb#15669 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb#15671 (cherry picked from commit 68a7bbe)
@bhalevy please evaluate for backport.
mykaul added the backport/5.4 (Issues that should be backported to 5.4 branch once they'll be fixed) label Jan 1, 2024
Backport is required to 5.4 and 2024.1.
It seems to be needed.
Backported to 5.4.
denesb removed the Backport candidate and backport/5.4 (Issues that should be backported to 5.4 branch once they'll be fixed) labels Jan 4, 2024
As seen in SCT (#15461 (comment)), `nodetool cleanup` occasionally fails with exit status 2. nodetool-cleanup-5-2-cleanup--db-node-d57e5c04-5/messages.log, for example, reveals this:
The condition variable timeout originates from scylladb/compaction/compaction_manager.cc, line 1792 in 4e6fe34, and was supposed to be benign.
A longer timeout is checked at scylladb/compaction/compaction_manager.cc, lines 1775 to 1780 in 4e6fe34.
That said, cleanup may still fail if regular compaction picks up a huge sstable that requires cleanup and takes more than 5 minutes to compact it.
@Deexie in the above case maybe perform_cleanup can identify the respective task and wait on it.
But let's leave it for a follow-up.
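For illustration only, the intended polling behaviour described above (treat the short wait timeout as benign and give up only after a longer idle period) could look roughly like the sketch below. It uses the standard C++ library rather than Seastar, so it is an analogy and not the code in compaction_manager.cc. All names here (`cleanup_state`, `wait_timed_out`, `wait_or_throw`, `wait_for_cleanup`, `poll_interval`) are hypothetical; only `max_idle_duration` and `condition_variable_timed_out` come from the issue, and the former appears as the `max_idle` parameter.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <stdexcept>

using steady = std::chrono::steady_clock;

// Hypothetical stand-in for a "condition variable timed out" exception.
struct wait_timed_out : std::runtime_error {
    wait_timed_out() : std::runtime_error("condition variable timed out") {}
};

// Hypothetical shared state; field names are illustrative, not Scylla's.
struct cleanup_state {
    std::mutex mtx;
    std::condition_variable cv;
    bool done = false;                                 // set when cleanup finishes
    steady::time_point last_progress = steady::now();  // updated on progress
};

// A wait that reports a timeout by throwing, to mimic the failure mode
// discussed in the issue.
inline void wait_or_throw(cleanup_state& s, std::unique_lock<std::mutex>& lk,
                          std::chrono::milliseconds timeout) {
    if (s.cv.wait_for(lk, timeout) == std::cv_status::timeout) {
        throw wait_timed_out();
    }
}

// Returns true once cleanup is done, false if nothing progressed for max_idle.
inline bool wait_for_cleanup(cleanup_state& s,
                             std::chrono::milliseconds poll_interval,
                             std::chrono::milliseconds max_idle) {
    std::unique_lock<std::mutex> lk(s.mtx);
    while (!s.done) {
        // The longer timeout: only this check is allowed to fail the wait.
        if (steady::now() - s.last_progress >= max_idle) {
            return false;
        }
        try {
            wait_or_throw(s, lk, poll_interval);
        } catch (const wait_timed_out&) {
            // Benign: the short timeout exists only to re-check progress,
            // so swallow it and go around the loop again.
        }
    }
    return true;
}
```

In this sketch, letting `wait_timed_out` escape the loop would correspond to the failure reported here, while catching it leaves the idle check as the only way the wait can give up.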