Optimize db_stress setup phase #9475

Closed

Conversation

ajkr
Contributor

@ajkr ajkr commented Jan 31, 2022

The setup phase is slow enough that our `db_crashtest.py` often kills `db_stress` before
it completes. I profiled it and found a few ways to optimize.
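
For context, `db_crashtest.py` in blackbox mode runs `db_stress` for a fixed interval and then kills it, so a long setup phase leaves little time for actual database operations. Below is a minimal sketch of that pattern, not the real script; the interval value and command line are placeholders:

```python
# Minimal sketch of the blackbox crash-test pattern (not the real
# tools/db_crashtest.py, whose flag handling is far more involved).
import subprocess
import time

INTERVAL_SEC = 120  # illustrative; the real script takes --interval


def run_one_blackbox_round(db_stress_cmd):
    # db_stress spends its setup phase (expected-state files, DB open,
    # verification) before it logs "Starting database operations".
    proc = subprocess.Popen(db_stress_cmd)
    time.sleep(INTERVAL_SEC)
    # Blackbox mode kills db_stress unconditionally after the interval,
    # so a ~36-second setup phase eats a big slice of every round.
    proc.kill()
    proc.wait()


if __name__ == "__main__":
    run_one_blackbox_round(["./db_stress", "-max_key=100000000"])
```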

Test Plan:

Measured setup-phase time reduced by 22% (36 -> 28 seconds) for the first run and by
36% (38 -> 24 seconds) for a non-first run on an empty-ish DB. The timings come from the
`db_stress` log lines shown below (see the sketch after the list).

- first run benchmark command: `rm -rf /dev/shm/dbstress*/ && mkdir -p /dev/shm/dbstress_expected/ && ./db_stress -max_key=100000000 -destroy_db_initially=1 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
  - output before this PR:

```
2022/01/31-11:14:05  Initializing db_stress
...
2022/01/31-11:14:41  Starting database operations
```

  - output after this PR:

```
...
2022/01/31-11:12:23  Initializing db_stress
...
2022/01/31-11:12:51  Starting database operations
```

- non-first run benchmark command: `./db_stress -max_key=100000000 -destroy_db_initially=0 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
  - output before this PR:

```
2022/01/31-11:20:45  Initializing db_stress
...
2022/01/31-11:21:23  Starting database operations
```

  - output after this PR:

```
2022/01/31-11:22:02  Initializing db_stress
...
2022/01/31-11:22:26  Starting database operations
```

- ran a minified crash test for a while: `DEBUG_LEVEL=0 TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py blackbox --simple --interval=10 --max_key=1000000 --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --value_size_mult=33`
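
The setup-phase times quoted above are just the gap between the `Initializing db_stress` and `Starting database operations` log lines. A small helper along these lines (not part of this PR; shown only to make the measurement explicit) computes that gap from captured output:

```python
# Hypothetical helper: compute setup-phase duration from db_stress output
# by diffing the "Initializing db_stress" and "Starting database
# operations" timestamps (format: 2022/01/31-11:14:05).
import re
from datetime import datetime

TS_FMT = "%Y/%m/%d-%H:%M:%S"
TS_RE = re.compile(r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2})\s+(.*)")


def setup_seconds(log_text):
    init_ts = start_ts = None
    for line in log_text.splitlines():
        m = TS_RE.match(line.strip())
        if not m:
            continue
        ts, msg = m.groups()
        if "Initializing db_stress" in msg:
            init_ts = datetime.strptime(ts, TS_FMT)
        elif "Starting database operations" in msg:
            start_ts = datetime.strptime(ts, TS_FMT)
    if init_ts is None or start_ts is None:
        raise ValueError("missing setup-phase log lines")
    return (start_ts - init_ts).total_seconds()

# e.g. the "before this PR" first-run output above yields 36.0 seconds.
```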
@facebook-github-bot
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

ajkr added a commit to ajkr/rocksdb that referenced this pull request Jan 31, 2022

Despite attempts to optimize `db_stress` setup phase (i.e.,
pre-`OperateDb()`) latency in facebook#9470 and facebook#9475, it still always took tens
of seconds. Since we still aren't able to set up a 100M-key `db_stress`
efficiently, we should reduce the number of keys. This PR reduces it 4x
while increasing `value_size_mult` 4x (from its default value of 8) so
that memtables and SST files fill up similarly quickly.

Also disabled bzip2 compression since we'll probably never use it and
I noticed many CI runs spending the majority of their CPU time on bzip2
decompression.
facebook-github-bot pushed a commit that referenced this pull request Jan 31, 2022

Summary:
Despite attempts to optimize `db_stress` setup phase (i.e.,
pre-`OperateDb()`) latency in #9470 and #9475, it still always took tens
of seconds. Since we still aren't able to set up a 100M-key `db_stress`
quickly, we should reduce the number of keys. This PR reduces it 4x
while increasing `value_size_mult` 4x (from its default value of 8) so
that memtables and SST files fill at a similar rate compared to before
this PR.

Also disabled bzip2 compression since we'll probably never use it and
I noticed many CI runs spending the majority of their CPU time on bzip2
decompression.

Pull Request resolved: #9476

Reviewed By: siying

Differential Revision: D33898520

Pulled By: ajkr

fbshipit-source-id: 855021784ad9664f2be5bce21f0339a1cf93230d
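
The 4x key reduction paired with the 4x `value_size_mult` increase is a constant-volume tradeoff: a quarter of the keys with roughly four-times-larger values writes about the same number of bytes, so memtables and SST files should fill at a similar rate. A back-of-the-envelope check, assuming value size scales linearly with `value_size_mult` (a simplification of what `db_stress` actually does):

```python
# Rough data-volume check for the 4x key reduction / 4x value_size_mult
# increase. Assumes value size scales linearly with value_size_mult,
# which is a simplification of db_stress's actual value-length logic.

def approx_total_bytes(num_keys, value_size_mult, base_value_bytes=8):
    return num_keys * value_size_mult * base_value_bytes


before = approx_total_bytes(num_keys=100_000_000, value_size_mult=8)
after = approx_total_bytes(num_keys=25_000_000, value_size_mult=32)

# Same approximate volume, so memtable flushes and SST file creation
# should happen at a similar cadence before and after the change.
assert before == after
print(f"total value bytes: {before / 2**30:.1f} GiB in both cases")
```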
@ajkr ajkr requested a review from anand1976 January 31, 2022 21:58
Contributor

@anand1976 anand1976 left a comment

Nice speedup!

@ajkr
Contributor Author

ajkr commented Feb 1, 2022

> Nice speedup!

@anand1976 thanks for the review! It still needs a Phabricator accept.
