forked from cockroachdb/cockroach
compactor: purge suggestions that have live data
While debugging cockroachdb#24029 we discovered that RESTORE generates massive numbers of suggested compactions as it splits and scatters ranges. As the cluster rebalances, every removed replica leaves behind a range deletion tombstone and a suggested compaction over the keys it covered.

Occasionally, the replica chosen for rebalancing will be a member of the last range of the cluster. This range extends from wherever the restore has last split the table it's restoring to the very last key. Suppose we're restoring a table with 10,000 ranges evenly distributed across primary keys from 1-10,000. If a replica in the last range gets rebalanced early in the restore (say, after only the first 500 ranges have been split off), at least one node in the cluster will have a suggested compaction for a range like the following:

    /Table/51/1/500 - /Max

This creates a huge problem! The restore will eventually create 9500 more ranges in that keyspan, each about 32MiB in size. Some of those ranges will necessarily rebalance back onto the node with the suggested compaction. By the time the compaction queue gets around to processing the suggestion, there might be hundreds of gigabytes within the range.

In our 2TB (replicated) store dump, snapshotted immediately after a RESTORE, there were two such massive suggested compactions, each of which took over 1h to complete. This bogs down the compaction queue with unnecessary work and makes it especially dangerous to initiate a DROP TABLE (cockroachdb#24029), as the incoming range deletion tombstones will pile up until the prior compaction finishes, and the cluster grinds to a halt in the meantime.

The same problem happens whenever any replica is rebalanced away and back before the compaction queue has a chance to compact away the range deletion tombstone, though the impact is limited because the keyspan is smaller.

This commit prevents the compaction queue from getting bogged down with suggestions based on outdated information. At the time a suggestion is considered for compaction, the queue checks whether the suggested key span contains any live keys. The presence of even a single live key is a good indicator that something has changed, usually that the replica, or one of its split children, has been rebalanced back onto the node. The compaction queue now deletes such a suggestion instead of acting on it.

This is a crucial piece of the fix to cockroachdb#24029. The other half, cockroachdb#26449, involves rate-limiting ClearRange requests.

I suspect this change will give RESTORE a nice performance boost. Anecdotally, we've noticed that restores slow down over time. I'm willing to bet it's because nonsense suggested compactions start hogging disk I/O.

Release note: None
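The live-key check described above can be sketched roughly as follows. This is an illustration only, not the compactor code from this commit: `Suggestion`, `Store`, `HasLiveKey`, and `ProcessSuggestions` are hypothetical names, the store is modeled as a sorted slice of string keys, and the keys are zero-padded so that string comparison matches key order.

```go
package main

import (
	"fmt"
	"sort"
)

// Suggestion is a hypothetical stand-in for a suggested compaction: a
// half-open key span [Start, End) that held no data when it was recorded.
type Suggestion struct {
	Start, End string
}

// Store is a hypothetical stand-in for the storage engine, modeled here as
// nothing more than a sorted list of live keys.
type Store struct {
	liveKeys []string // sorted ascending
}

// HasLiveKey reports whether at least one live key falls inside [start, end).
// It stops at the first key at or after start, so it never scans the span.
func (s *Store) HasLiveKey(start, end string) bool {
	i := sort.SearchStrings(s.liveKeys, start)
	return i < len(s.liveKeys) && s.liveKeys[i] < end
}

// ProcessSuggestions separates suggestions that are still worth compacting
// from those whose span has live data again and should simply be dropped.
func ProcessSuggestions(s *Store, suggestions []Suggestion) (compact, purged []Suggestion) {
	for _, sg := range suggestions {
		if s.HasLiveKey(sg.Start, sg.End) {
			// Data has moved back into the span: the suggestion is stale.
			purged = append(purged, sg)
			continue
		}
		compact = append(compact, sg)
	}
	return compact, purged
}

func main() {
	store := &Store{liveKeys: []string{"/Table/51/1/0700", "/Table/51/1/0900"}}
	suggestions := []Suggestion{
		{Start: "/Table/51/1/0500", End: "/Table/51/1/9999"}, // ranges were rebalanced back
		{Start: "/Table/51/1/0100", End: "/Table/51/1/0200"}, // still empty
	}
	compact, purged := ProcessSuggestions(store, suggestions)
	fmt.Println("compact:", compact)
	fmt.Println("purged: ", purged)
}
```

The point of the sketch is that the check only needs to find one key, so it stays cheap even when the stale span has grown to hundreds of gigabytes.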
Showing 3 changed files with 131 additions and 21 deletions.