Optimize db_stress setup phase #9475
Closed
The `db_stress` setup phase is slow enough that our `db_crashtest.py` often kills `db_stress` before setup completes. Profiled it and found a few ways to optimize.

Test Plan:

Measured setup phase time reduced 22% (36 -> 28 seconds) for first run, and 36% (38 -> 24 seconds) for non-first run on empty-ish DB.

- first run benchmark command: `rm -rf /dev/shm/dbstress*/ && mkdir -p /dev/shm/dbstress_expected/ && ./db_stress -max_key=100000000 -destroy_db_initially=1 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
  - output before this PR:
    ```
    2022/01/31-11:14:05 Initializing db_stress
    ...
    2022/01/31-11:14:41 Starting database operations
    ```
  - output after this PR:
    ```
    ...
    2022/01/31-11:12:23 Initializing db_stress
    ...
    2022/01/31-11:12:51 Starting database operations
    ```
- non-first run benchmark command: `./db_stress -max_key=100000000 -destroy_db_initially=0 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
  - output before this PR:
    ```
    2022/01/31-11:20:45 Initializing db_stress
    ...
    2022/01/31-11:21:23 Starting database operations
    ```
  - output after this PR:
    ```
    2022/01/31-11:22:02 Initializing db_stress
    ...
    2022/01/31-11:22:26 Starting database operations
    ```
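For anyone re-checking the numbers, here is a minimal sketch of how the setup durations and percentage reductions can be derived from the log timestamps quoted in the test plan. The helper names `setup_seconds` and `percent_reduction` are hypothetical and not part of `db_stress` or `db_crashtest.py`; the timestamp format is inferred from the quoted output.

```python
from datetime import datetime

# Timestamp layout inferred from the db_stress log lines quoted in the test plan.
LOG_TIME_FORMAT = "%Y/%m/%d-%H:%M:%S"


def setup_seconds(init_line, start_line):
    """Seconds elapsed between the 'Initializing' and 'Starting' log lines."""
    begin = datetime.strptime(init_line.split(" ", 1)[0], LOG_TIME_FORMAT)
    end = datetime.strptime(start_line.split(" ", 1)[0], LOG_TIME_FORMAT)
    return (end - begin).total_seconds()


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    first_before = setup_seconds(
        "2022/01/31-11:14:05 Initializing db_stress",
        "2022/01/31-11:14:41 Starting database operations",
    )
    first_after = setup_seconds(
        "2022/01/31-11:12:23 Initializing db_stress",
        "2022/01/31-11:12:51 Starting database operations",
    )
    print(first_before, first_after, percent_reduction(first_before, first_after))
```

The same two helpers apply to the non-first-run pair of log lines; the percentages in the description appear to be truncated rather than rounded.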
@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
ajkr added a commit to ajkr/rocksdb that referenced this pull request Jan 31, 2022
Despite attempts to optimize `db_stress` setup phase (i.e., pre-`OperateDb()`) latency in facebook#9470 and facebook#9475, it still always took tens of seconds. Since we still aren't able to set up a 100M key `db_stress` efficiently, we should reduce the number of keys. This PR reduces it 4x while increasing `value_size_mult` 4x (from its default value of 8) so that memtables and SST files fill up similarly quickly. Also disabled bzip2 compression since we'll probably never use it and I noticed many CI runs spending the majority of CPU on bzip2 decompression.
facebook-github-bot pushed a commit that referenced this pull request Jan 31, 2022
Summary: Despite attempts to optimize `db_stress` setup phase (i.e., pre-`OperateDb()`) latency in #9470 and #9475, it still always took tens of seconds. Since we still aren't able to set up a 100M key `db_stress` quickly, we should reduce the number of keys. This PR reduces it 4x while increasing `value_size_mult` 4x (from its default value of 8) so that memtables and SST files fill at a similar rate compared to before this PR. Also disabled bzip2 compression since we'll probably never use it and I noticed many CI runs spending the majority of CPU on bzip2 decompression.

Pull Request resolved: #9476

Reviewed By: siying

Differential Revision: D33898520

Pulled By: ajkr

fbshipit-source-id: 855021784ad9664f2be5bce21f0339a1cf93230d
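One way to read the trade-off in that commit message is sketched below: cutting the key count 4x while raising `value_size_mult` 4x leaves the product of the two unchanged. Treating `max_key * value_size_mult` as a rough proxy for total value bytes across the key space is an assumption made here for illustration, not something the commit states, and the function names are hypothetical.

```python
def scaled_config(max_key, value_size_mult, factor):
    """Shrink the key count by `factor` and grow value_size_mult by the same factor."""
    return max_key // factor, value_size_mult * factor


def value_bytes_proxy(max_key, value_size_mult):
    # Assumption: value sizes scale linearly with value_size_mult, so this
    # product tracks total value bytes up to a constant factor.
    return max_key * value_size_mult


if __name__ == "__main__":
    # 100M keys and the default value_size_mult of 8 come from the text above.
    before = (100_000_000, 8)
    after = scaled_config(*before, factor=4)
    print(after, value_bytes_proxy(*before) == value_bytes_proxy(*after))
```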
anand1976 approved these changes Feb 1, 2022
Nice speedup!
@anand1976 thanks for the review! Needs Phabricator accept
DEBUG_LEVEL=0 TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py blackbox --simple --interval=10 --max_key=1000000 --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --value_size_mult=33