With db_bench --seed=$X --benchmarks=multireadrandom,readrandom, the per-thread seeds used for multireadrandom are reused for readrandom, so both tests use the same key sequences. This is usually a bad idea. One problem arises with an IO-bound test and buffered IO: the first test (multireadrandom) warms the cache, so the second test (readrandom) gets a much higher hit rate in the OS page cache.
Per-thread seeds are set here and the ThreadState constructor is called here.
This was fixed last year in LevelDB. The commit is here and the code is here. The solution is to have a global counter (across calls to RunBenchmark) that is incremented each time a thread is created and then use base_seed + counter as the seed. LevelDB still has a different problem in that base_seed is hardwired to 1000. So there is seed reuse if you run db_bench --benchmarks=overwrite and then db_bench --benchmarks=readrandom (readrandom will use the same seeds as overwrite).
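The counter approach described above can be sketched roughly as follows. This is a minimal illustration, not the LevelDB or RocksDB code: SeedAllocator, the simplified ThreadState, and this RunBenchmark signature are hypothetical names made up for the example. It also assumes all threads are created from a single coordinating thread, so the counter is a plain integer rather than an atomic.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-thread benchmark state; only the seed matters here.
struct ThreadState {
  int tid;
  uint32_t seed;
};

// Hands out seeds as base_seed + counter. The counter lives outside
// RunBenchmark, so it keeps growing across benchmarks instead of restarting.
class SeedAllocator {
 public:
  explicit SeedAllocator(uint32_t base_seed) : base_seed_(base_seed) {}

  // Called once per created thread; never reset between benchmarks.
  uint32_t Next() { return base_seed_ + total_threads_created_++; }

 private:
  uint32_t base_seed_;
  uint32_t total_threads_created_ = 0;
};

// Simplified RunBenchmark: creates the per-thread state for one benchmark
// and returns it so the assigned seeds can be inspected.
std::vector<ThreadState> RunBenchmark(SeedAllocator& seeds, int num_threads) {
  assert(num_threads >= 0);
  std::vector<ThreadState> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.push_back(ThreadState{i, seeds.Next()});
  }
  return threads;
}
```

With a base seed of 1000 and two threads per benchmark, the first call to RunBenchmark hands out seeds 1000 and 1001 and the second hands out 1002 and 1003, so consecutive benchmarks in one process no longer share key sequences. Separate db_bench invocations would still start from the same base seed unless it is made configurable.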
Summary:
When --benchmarks has more than one test then the threads in one benchmark
will use the same set of seeds as the threads in the previous benchmark.
This diff fixes that.
This fixes facebook#9632
Test Plan:
For this command line the block cache is 8MB, so it caches at most 1024 8KB blocks. Note that without
this diff the second run of readrandom has a much better response time because seed reuse means the
second run reads the same 1000 blocks as the first run, and they are already cached at that point. With
this diff that does not happen.
./db_bench --benchmarks=fillseq,flush,compact0,waitforcompaction,levelstats,readrandom,readrandom --compression_type=zlib --num=10000000 --reads=1000 --block_size=8192
...
Level Files Size(MB)
--------------------
0 0 0
1 11 238
2 9 253
3 0 0
4 0 0
5 0 0
6 0 0
--- perf results without this diff
DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom : 46.212 micros/op 21618 ops/sec; 2.4 MB/s (1000 of 1000 found)
DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom : 21.963 micros/op 45450 ops/sec; 5.0 MB/s (1000 of 1000 found)
--- perf results with this diff
DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom : 47.213 micros/op 21126 ops/sec; 2.3 MB/s (1000 of 1000 found)
DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom : 42.880 micros/op 23299 ops/sec; 2.6 MB/s (1000 of 1000 found)