Summary: Background activities like compaction can negatively affect
latency of higher-priority tasks like request processing. To avoid this,
RocksDB already lowers the IO priority of background threads on Linux
systems. While this takes care of typical IO-bound systems, it does not
help much when CPU (temporarily) becomes the bottleneck, which is
especially likely with more expensive compression settings.
This patch adds an API to allow for lowering the CPU priority of
background threads, modeled on the IO priority API. Benchmarks (see
below) show significant latency and throughput improvements when CPU
bound. As a result, workloads with some CPU usage bursts should benefit
from lower latencies at a given utilization, or should be able to push
utilization higher at a given request latency target.
A useful side effect is that compaction CPU usage is now easily visible
in common tools, allowing for an easier estimation of the contribution
of compaction vs. request processing threads.
As with IO priority, the implementation is limited to Linux, degrading
to a no-op on other systems.
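For illustration only, a minimal sketch of how a background thread could lower its own CPU priority on Linux while compiling to a no-op elsewhere. The helper name `LowerCurrentThreadCpuPriority` is hypothetical and is not the API added by this patch. It relies on the Linux-specific behavior that nice values are per-thread and that `who == 0` refers to the caller:

```cpp
#if defined(__linux__)
#include <sys/resource.h>  // setpriority, getpriority, PRIO_PROCESS
#endif

// Hypothetical helper (assumption, not the patch's actual code).
// Lowers the CPU priority of the calling thread to the weakest nice level.
// Returns true if the priority was changed, and false on failure or on
// platforms where the call degrades to a no-op.
bool LowerCurrentThreadCpuPriority() {
#if defined(__linux__)
  // Under Linux/NPTL the nice value is a per-thread attribute, so this only
  // affects the calling thread. 19 is the lowest priority available.
  return setpriority(PRIO_PROCESS, 0, 19) == 0;
#else
  // Other systems: do nothing, mirroring the IO priority behavior.
  return false;
#endif
}
```

A background worker would call such a helper once at thread start. Raising the nice value needs no special privileges, so the call is expected to succeed for ordinary users.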
Test Plan:
`make check -j64`
= Benchmark: CPU bound, with & without background CPU priorities =
Results: Best P99 of 10 runs each.
tl;dr: Mean latency down roughly 70-80%, P50 up roughly 20%, P99 down 93-94%
Basic zstd compression:
```bash
./db_bench --compression_type=zstd --key_size 24 \
--max_background_jobs 16 --benchmarks readwhilewriting --num 3000000 \
--writes 300000 --threads 48 --histogram=1 --stats_interval=100000000 \
--enable_io_prio
```
Count: 192000000 Average: 32.3246 StdDev: 19.51
Percentiles: P50: 1.41 P75: 5.27 P99: 144.77 P99.9: 6835.64 P99.99: 24890.69
Basic zstd compression, low CPU priority:
```bash
./db_bench --compression_type=zstd --key_size 24 \
--max_background_jobs 16 --benchmarks readwhilewriting --num 3000000 \
--writes 300000 --threads 48 --histogram=1 --stats_interval=100000000 \
--enable_io_prio --enable_cpu_prio
```
Count: 192000000 Average: 6.5144 StdDev: 12.78
Percentiles: P50: 1.67 P75: 3.30 P99: 8.98 P99.9: 30.08 P99.99: 23877.45
Change vs. no CPU prio: Average -79.8%, P50 +18.4%, P99 -93.8%
Aggressive zstd compression:
```bash
./db_bench --compression_type=zstd --key_size 24 \
--max_background_jobs 16 --benchmarks readwhilewriting --num 3000000 \
--writes 300000 --threads 64 --histogram=1 --stats_interval=100000000 \
--enable_io_prio --compression_level=6 --compression_max_dict_bytes=8 \
--compression_zstd_max_train_bytes=13
```
Count: 192000000 Average: 19.0564 StdDev: 18.53
Percentiles: P50: 1.37 P75: 2.92 P99: 129.14 P99.9: 526.23 P99.99: 20604.76
Aggressive zstd compression, low CPU priority:
```bash
./db_bench --compression_type=zstd --key_size 24 \
--max_background_jobs 16 --benchmarks readwhilewriting --num 3000000 \
--writes 300000 --threads 64 --histogram=1 --stats_interval=100000000 \
--enable_io_prio --compression_level=6 --compression_max_dict_bytes=8 \
--compression_zstd_max_train_bytes=13 --enable_cpu_prio
```
Count: 192000000 Average: 6.1065 StdDev: 18.61
Percentiles: P50: 1.65 P75: 3.27 P99: 9.02 P99.9: 31.65 P99.99: 17240.06
Change vs. no CPU prio: Average -68.0%, P50 +20.4%, P99 -93.0%
Full benchmark log: https://gist.github.com/gwicke/3286cc87f09c81052d33e08a9d3d1cec