There is no way to avoid seed reuse when --benchmarks specifies more than one test #9632

Closed
mdcallag opened this issue Feb 24, 2022 · 0 comments
Assignees
Labels
bug Confirmed RocksDB bugs

Comments

@mdcallag
Contributor

With db_bench --seed=$X --benchmarks=multireadrandom,readrandom, the per-thread seeds used for multireadrandom are reused for readrandom, so both tests use the same key sequences. This is usually a bad idea. One problem arises with an IO-bound test and buffered IO: the first test (multireadrandom) warms the OS page cache, so the second test (readrandom) gets a much higher page cache hit rate.
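A minimal sketch of why seed reuse repeats the key sequence. It is not the db_bench source: KeySequence is a hypothetical helper, std::mt19937_64 stands in for db_bench's own random number generator, and deriving each thread's seed as base seed plus thread index is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: the keys one benchmark thread would read.
// Assumption: the thread's seed is base_seed + thread_index.
std::vector<uint64_t> KeySequence(uint64_t base_seed, int thread_index,
                                  int reads, uint64_t num_keys) {
  assert(reads >= 0 && num_keys > 0);
  // Stand-in generator; db_bench uses its own RNG class.
  std::mt19937_64 rng(base_seed + static_cast<uint64_t>(thread_index));
  std::vector<uint64_t> keys;
  keys.reserve(static_cast<size_t>(reads));
  for (int i = 0; i < reads; ++i) {
    keys.push_back(rng() % num_keys);  // key id in [0, num_keys)
  }
  return keys;
}
```

Calling KeySequence twice with the same base seed and thread index returns identical vectors, which models two consecutive benchmarks whose thread 0 starts from the same seed: the second benchmark reads exactly the keys the first one already pulled into the cache.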

Per-thread seeds are set here and the ThreadState constructor is called here.

This was fixed last year in LevelDB. The commit is here and the code is here. The solution is a global counter (shared across calls to RunBenchmark) that is incremented each time a thread is created, with base_seed + counter used as that thread's seed. LevelDB still has a different problem: base_seed is hardwired to 1000, so seeds are reused across separate invocations. For example, if you run db_bench --benchmarks=overwrite and then db_bench --benchmarks=readrandom, readrandom will use the same seeds as overwrite.
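The counter-based approach described above can be sketched as follows. This is an illustration under stated assumptions, not the LevelDB or RocksDB code: SeedAllocator and RunBenchmarkSeeds are hypothetical names, and the atomic counter is a choice made here so the sketch stays safe if threads were ever created concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical class: hands out base_seed + counter, where the counter
// lives for the whole process and is never reset between benchmarks.
class SeedAllocator {
 public:
  explicit SeedAllocator(uint64_t base_seed) : base_seed_(base_seed) {}

  // Returns a seed that no earlier thread in this process received.
  uint64_t Next() {
    return base_seed_ + counter_.fetch_add(1, std::memory_order_relaxed);
  }

 private:
  const uint64_t base_seed_;
  std::atomic<uint64_t> counter_{0};
};

// Hypothetical stand-in for one RunBenchmark call: collects the seed
// each newly created thread would be given.
std::vector<uint64_t> RunBenchmarkSeeds(SeedAllocator& alloc, int num_threads) {
  assert(num_threads > 0);
  std::vector<uint64_t> seeds;
  for (int i = 0; i < num_threads; ++i) {
    seeds.push_back(alloc.Next());
  }
  return seeds;
}
```

With a base seed of 1000 and two threads per benchmark, the first call yields seeds 1000 and 1001 and the second yields 1002 and 1003, so consecutive benchmarks in one process no longer share seeds. A fresh process restarts the counter at zero, which is why a hardwired base seed still causes reuse across separate invocations.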

@mdcallag mdcallag added the bug Confirmed RocksDB bugs label Feb 24, 2022
@mdcallag mdcallag self-assigned this Feb 24, 2022
mdcallag added a commit to mdcallag/rocksdb-1 that referenced this issue Mar 22, 2022
Summary:

When --benchmarks has more than one test, the threads in one benchmark
use the same set of seeds as the threads in the previous benchmark.
This diff fixes that.

This fixes facebook#9632

Test Plan:

For this command line the block cache is 8MB, so it caches at most 1024 8KB blocks. Note that without
this diff the second run of readrandom has a much better response time, because seed reuse means the
second run reads the same 1000 blocks as the first run and they are already cached at that point. With
this diff that does not happen.

./db_bench --benchmarks=fillseq,flush,compact0,waitforcompaction,levelstats,readrandom,readrandom --compression_type=zlib --num=10000000 --reads=1000 --block_size=8192

...

Level Files Size(MB)
--------------------
  0        0        0
  1       11      238
  2        9      253
  3        0        0
  4        0        0
  5        0        0
  6        0        0

--- perf results without this diff

DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom   :      46.212 micros/op 21618 ops/sec;    2.4 MB/s (1000 of 1000 found)

DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom   :      21.963 micros/op 45450 ops/sec;    5.0 MB/s (1000 of 1000 found)

--- perf results with this diff

DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom   :      47.213 micros/op 21126 ops/sec;    2.3 MB/s (1000 of 1000 found)

DB path: [/tmp/rocksdbtest-2260/dbbench]
readrandom   :      42.880 micros/op 23299 ops/sec;    2.6 MB/s (1000 of 1000 found)

Reviewers:

Subscribers:

Tasks:

Tags: