There is no way to avoid seed reuse when --benchmarks specifies more than one test #9632

mdcallag · 2022-02-24T18:47:54Z

With db_bench --seed=$X --benchmarks=multireadrandom,readrandom, the per-thread seeds used for multireadrandom are reused for readrandom, so both tests use the same key sequences. This is usually a bad idea. One problem arises with an IO-bound test and buffered IO: the first test (multireadrandom) warms the cache, so the second test (readrandom) gets a much higher hit rate in the OS page cache.
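To make the effect concrete, here is a minimal sketch of why an identical seed means an identical key sequence. It is not db_bench code: `KeySequence` is a hypothetical helper, and `std::mt19937_64` stands in for whatever generator db_bench uses per thread.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of db_bench): returns the keys that one
// benchmark thread would request when its RNG starts from `seed`.
// std::mt19937_64 is only a stand-in for db_bench's own generator.
std::vector<uint64_t> KeySequence(uint64_t seed, int reads, uint64_t num_keys) {
  assert(reads >= 0);
  assert(num_keys > 0);  // guards the modulo below
  std::mt19937_64 rng(seed);
  std::vector<uint64_t> keys;
  keys.reserve(static_cast<size_t>(reads));
  for (int i = 0; i < reads; ++i) {
    keys.push_back(rng() % num_keys);  // key index in [0, num_keys)
  }
  return keys;
}
```

Two calls with the same seed return the same vector, which is what happens when the second benchmark's threads are handed the seeds the first benchmark already used: every block the second test touches was already read by the first.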

Per-thread seeds are set here and the ThreadState constructor is called here.

This was fixed last year in LevelDB. The commit is here and the code is here. The solution is to have a global counter (across calls to RunBenchmark) that is incremented each time a thread is created and then use base_seed + counter as the seed. LevelDB still has a different problem in that base_seed is hardwired to 1000. So there is seed reuse if you run db_bench --benchmarks=overwrite and then db_bench --benchmarks=readrandom (readrandom will use the same seeds as overwrite).

Summary: When --benchmarks has more than one test, the threads in one benchmark use the same set of seeds as the threads in the previous benchmark. This diff fixes that.

This fixes facebook#9632

Test Plan: For this command line the block cache is 8MB, so it caches at most 1024 8KB blocks. Note that without this diff the second run of readrandom has a much better response time, because seed reuse means the second run reads the same 1000 blocks as the first run and they are cached at that point. With this diff that does not happen.

    ./db_bench --benchmarks=fillseq,flush,compact0,waitforcompaction,levelstats,readrandom,readrandom --compression_type=zlib --num=10000000 --reads=1000 --block_size=8192

    ...

    Level Files Size(MB)
    --------------------
      0        0        0
      1       11      238
      2        9      253
      3        0        0
      4        0        0
      5        0        0
      6        0        0

    --- perf results without this diff

    DB path: [/tmp/rocksdbtest-2260/dbbench]
    readrandom : 46.212 micros/op 21618 ops/sec; 2.4 MB/s (1000 of 1000 found)

    DB path: [/tmp/rocksdbtest-2260/dbbench]
    readrandom : 21.963 micros/op 45450 ops/sec; 5.0 MB/s (1000 of 1000 found)

    --- perf results with this diff

    DB path: [/tmp/rocksdbtest-2260/dbbench]
    readrandom : 47.213 micros/op 21126 ops/sec; 2.3 MB/s (1000 of 1000 found)

    DB path: [/tmp/rocksdbtest-2260/dbbench]
    readrandom : 42.880 micros/op 23299 ops/sec; 2.6 MB/s (1000 of 1000 found)

mdcallag added the bug Confirmed RocksDB bugs label Feb 24, 2022

mdcallag self-assigned this Feb 24, 2022

mdcallag mentioned this issue Mar 1, 2022

db_bench should not reuse RNG seeds or hardwire them #9640

Open

mdcallag mentioned this issue Mar 22, 2022

Avoid seed reuse when --benchmarks has more than one test #9733

Closed

facebook-github-bot closed this as completed in d583d23 Mar 24, 2022
