
Introduce bottom-pri thread pool for large universal compactions #2580

Closed
ajkr wants to merge 8 commits

Conversation

@ajkr (Contributor) commented on Jul 14, 2017

When we had a single thread pool for compactions, a thread could be busy for a long time (minutes) executing a compaction involving the bottom level. In multi-instance setups, the entire thread pool could be consumed by such bottom-level compactions, so top-level compactions (e.g., compacting a few L0 files) would be blocked for a long time ("head-of-line blocking"). Such top-level compactions are critical for preventing compaction stalls since they can quickly reduce the number of L0 files / sorted runs.

This diff introduces a bottom-priority queue for universal compactions including the bottom level. This alleviates the head-of-line blocking situation for fast, top-level compactions.

  • Added an Env::Priority::BOTTOM thread pool. The feature is enabled only if the user explicitly configures the pool to have a positive number of threads (see the sketch after this list).
  • Changed ThreadPoolImpl's default thread limit from one to zero. This change is invisible to users since we call IncBackgroundThreadsIfNeeded on the low-pri/high-pri pools during DB::Open with values of at least one. It is necessary, though, for the bottom-pri pool to start with zero threads so the feature is disabled by default.
  • Separated ManualCompaction into two parts in PrepickedCompaction. PrepickedCompaction is used for any compaction that's picked outside of its execution thread, whether manual or automatic.
  • Forward universal compactions involving the last level to the bottom pool (the worker thread's entry point is BGWorkBottomCompaction).
  • Track bg_bottom_compaction_scheduled_ so we can wait for bottom-level compactions to finish. We don't count them against the background jobs limits, so users of this feature get an extra compaction for free.
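
For reference, a minimal sketch of how an application could opt in: give the bottom-pri pool a positive thread count before opening the DB. The path and thread counts below are illustrative assumptions rather than recommendations; the calls themselves are the existing Env::SetBackgroundThreads API used with the new Env::Priority::BOTTOM value.

#include <rocksdb/db.h>
#include <rocksdb/env.h>
#include <rocksdb/options.h>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.compaction_style = rocksdb::kCompactionStyleUniversal;

  // The bottom-pri pool starts with zero threads, so bottom-level compactions
  // keep running in the low-pri pool unless the application does this explicitly.
  options.env->SetBackgroundThreads(1, rocksdb::Env::Priority::BOTTOM);
  options.env->SetBackgroundThreads(2, rocksdb::Env::Priority::LOW);

  rocksdb::DB* db = nullptr;
  rocksdb::Status s =
      rocksdb::DB::Open(options, "/tmp/bottom_pri_example", &db);
  if (!s.ok()) {
    return 1;
  }
  // ... workload: universal compactions that include the last level are now
  // scheduled on the BOTTOM pool, while top-level compactions stay on LOW ...
  delete db;
  return 0;
}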

Test Plan:

  • unit tests to ensure bottom-level compaction can execute in parallel with top-level compaction (in bottom-pri and low-pri thread pools, respectively).
  • benchmark where two DB instances need a size-amplification compaction and the other eight don't:

before:

$ TEST_TMPDIR=/data/compaction_bench/ ./db_bench -benchmarks=overwrite -num_multi_db=10 -use_existing_db=true -num=1000000 -num_low_pri_threads=2 -statistics -write_buffer_size=1048576 -level0_file_num_compaction_trigger=8 -level0_slowdown_writes_trigger=20 -level0_stop_writes_trigger=1000 -target_file_size_base=10485760 -compaction_style=1 -bytes_per_sync=1048576 -max_write_buffer_number=12 -rate_limiter_bytes_per_sec=32768000 -benchmark_write_rate_limit=8388608 -delayed_write_rate=65536 -universal_max_size_amplification_percent=95
...
rocksdb.stall.micros COUNT : 49840930
...
overwrite    :      18.670 micros/op 53561 ops/sec;    5.9 MB/s

after:

$ TEST_TMPDIR=/data/compaction_bench/ ./db_bench -benchmarks=overwrite -num_multi_db=10 -use_existing_db=true -num=1000000 -num_low_pri_threads=1 -num_bottom_pri_threads=1 -statistics -write_buffer_size=1048576 -level0_file_num_compaction_trigger=8 -level0_slowdown_writes_trigger=20 -level0_stop_writes_trigger=1000 -target_file_size_base=10485760 -compaction_style=1 -bytes_per_sync=1048576 -max_write_buffer_number=12 -rate_limiter_bytes_per_sec=32768000 -benchmark_write_rate_limit=8388608 -delayed_write_rate=65536 -universal_max_size_amplification_percent=95
...
rocksdb.stall.micros COUNT : 17707684
...
overwrite    :      15.529 micros/op 64396 ops/sec;    7.1 MB/s
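
As an aside, a hedged sketch of how an application (outside db_bench) could read the same rocksdb.stall.micros counter through the public Statistics API; the option values and path are assumptions for illustration only.

#include <cstdint>
#include <iostream>
#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/statistics.h>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.compaction_style = rocksdb::kCompactionStyleUniversal;
  options.statistics = rocksdb::CreateDBStatistics();  // enables ticker counters

  rocksdb::DB* db = nullptr;
  rocksdb::Status s =
      rocksdb::DB::Open(options, "/tmp/stall_counter_example", &db);
  if (!s.ok()) {
    return 1;
  }

  // ... run the write workload ...

  // Same counter that db_bench prints as "rocksdb.stall.micros".
  uint64_t stall_micros =
      options.statistics->getTickerCount(rocksdb::STALL_MICROS);
  std::cout << "rocksdb.stall.micros COUNT : " << stall_micros << std::endl;

  delete db;
  return 0;
}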

@facebook-github-bot (Contributor): @ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr requested a review from @siying on July 14, 2017, 05:37
@facebook-github-bot (Contributor): @ajkr updated the pull request (changes since last import).


@ajkr (Contributor, Author) commented on Jul 26, 2017

ping @siying, zippydb still wants this by end of July, AFAIK.

@siying (Contributor) left a comment:
Good job. Only one comment.

@@ -1603,6 +1641,28 @@ Status DBImpl::BackgroundCompaction(bool* made_progress,

// Clear Instrument
ThreadStatusUtil::ResetThreadStatus();
} else if (env_->GetBackgroundThreads(Env::Priority::BOTTOM) > 0 &&

Inline comment from @siying on the line above:

Can you make this Env function call the last condition? There is uncertainty about how expensive this call can be, because Env is pluggable. If we only check it when the compaction is a full universal compaction, we call it much less frequently.
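
A tiny standalone sketch of the suggested reordering; the function names are hypothetical stand-ins for the real DBImpl checks. Putting the cheap in-memory check on the left of && means the potentially expensive, pluggable Env call is short-circuited away whenever the compaction is not a full universal compaction.

#include <iostream>

// Hypothetical stand-ins; names are illustrative only.
bool IsFullUniversalCompaction() { return false; }  // cheap, in-memory flag check
int BottomPoolThreadCount() {                       // potentially expensive: Env is pluggable
  std::cout << "Env call evaluated\n";
  return 1;
}

int main() {
  // With the cheap check first, BottomPoolThreadCount() never runs here
  // because the left-hand side of && already evaluates to false.
  if (IsFullUniversalCompaction() && BottomPoolThreadCount() > 0) {
    std::cout << "schedule in the bottom-pri pool\n";
  } else {
    std::cout << "schedule in the low-pri pool\n";
  }
  return 0;
}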

