scrub compaction: segregate mode: unbounded number of buckets can cause OOM #9400
denesb added a commit to denesb/scylla that referenced this issue on Sep 28, 2021:

…x buckets

Recently we observed an OOM caused by the partition-based splitting writer going crazy, creating 1.7K buckets while scrubbing an especially broken sstable. To avoid situations like this in the future, this patch adds a maximum limit on the number of live buckets. When the number of buckets reaches this limit, the largest bucket is closed and replaced by a new one. This will end up creating more output sstables during scrub overall, but they will no longer all be written at the same time, which previously caused severe memory pressure and possibly OOM.

Scrub compaction sets this limit to 100, the same limit TWCS's timestamp-based splitting writer uses (implemented through the classifier - time_window_compaction_strategy::max_data_segregation_window_count).

Fixes: scylladb#9400
Tests: unit(dev)
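To make the capping scheme from the commit message concrete, here is a minimal, self-contained C++ sketch. It is an illustration under assumptions, not Scylla's actual code: `capped_bucket_set`, `bucket_for`, and `close_bucket` are hypothetical names, and real buckets own open sstable writers rather than a byte counter.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative stand-in for a splitting writer's output bucket. In the real
// writer each live bucket owns an open sstable writer with its own buffers,
// which is what makes an unbounded bucket count so expensive.
struct bucket {
    int64_t key;           // segregation key the writer splits on
    size_t bytes_written;  // used to pick the "largest" bucket to evict
};

class capped_bucket_set {
    std::vector<bucket> _live;
    size_t _max_buckets;

public:
    explicit capped_bucket_set(size_t max_buckets) : _max_buckets(max_buckets) {}

    // Return the live bucket for `key`, opening a new one if needed. When
    // the cap is reached, close the largest live bucket first: memory stays
    // bounded, at the cost of extra output sstables for the evicted bucket.
    bucket& bucket_for(int64_t key) {
        for (auto& b : _live) {
            if (b.key == key) {
                return b;
            }
        }
        if (_live.size() >= _max_buckets) {
            auto largest = std::max_element(_live.begin(), _live.end(),
                    [] (const bucket& a, const bucket& b) {
                        return a.bytes_written < b.bytes_written;
                    });
            close_bucket(*largest);  // seal its output sstable
            _live.erase(largest);
        }
        _live.push_back(bucket{key, 0});
        return _live.back();
    }

private:
    void close_bucket(bucket&) {
        // The real writer would flush and seal the bucket's sstable here.
    }
};

int main() {
    capped_bucket_set buckets(100);  // scrub sets the cap to 100
    // Even a pathological input producing 1.7K distinct keys never holds
    // more than 100 buckets open at once.
    for (int64_t k = 0; k < 1700; ++k) {
        buckets.bucket_for(k).bytes_written += 1;
    }
}
```

Evicting the largest bucket, rather than say the least recently used one, presumably seals the biggest possible sstable each time, so the data fragmented across extra output sstables is kept to the fewest, largest pieces.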
denesb added a commit to denesb/scylla that referenced this issue on Sep 29, 2021, with the same commit message as above.
denesb added a commit to denesb/scylla that referenced this issue on Oct 20, 2021, with the same commit message, additionally noting: Closes scylladb#9401 (cherry picked from commit 970fe9a)
Fix present on all active branches, not backporting.
Original issue description:

The number of buckets the partition_based_splitting_writer can create is unbounded, which can cause large memory pressure. We recently observed a case where a single especially broken sstable caused 1.7K buckets to be created, resulting in OOM.
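A rough back-of-envelope shows why this is fatal. Assuming, purely hypothetically, that each live bucket pins on the order of 1 MiB of writer buffers (the actual figure depends on the sstable writer's configuration):

```
1700 buckets × ~1 MiB per bucket ≈ 1.7 GiB pinned simultaneously
```

which can easily exceed a shard's memory budget. Capping live buckets at 100 bounds this to roughly 100 MiB under the same assumption.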