Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two performance improvements in BlockBuilder #9039

Closed
wants to merge 1 commit into from

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Oct 15, 2021

Summary:
Primarily, this change reserves space in the std::string for building
the next block once a block is finished, using block_size as
reservation size. Note: also tried reusing same std::string in the
common "unbuffered" path but that showed no benefit or regression.

Secondarily, this slightly reduces the work in resetting restarts_.

Test Plan: TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1, Primary change only: 292697 vs. 280267 (+4.4%)
Run 2, Primary change only: 288763 vs. 279621 (+3.3%)
Run 1, Secondary change only: 260065 vs. 254232 (+2.3%)
Run 2, Secondary change only: 275925 vs. 272248 (+1.4%)
Run 1, Both changes: 284890 vs. 270372 (+5.3%)
Run 2, Both changes: 263511 vs. 258188 (+2.0%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)

Summary:
Primarily, this change reserves space in the std::string for building
the next block once a block is finished, using `block_size` as
reservation size. Note: also tried reusing same std::string in the
common "unbuffered" path but that showed no benefit or regression.

Secondarily, this slightly reduces the work in resetting `restarts_`.

Test Plan: TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -duration=60 -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1, Primary change only: 292697 vs. 280267 (+4.4%)
Run 2, Primary change only: 288763 vs. 279621 (+3.3%)
Run 1, Secondary change only: 260065 vs. 254232 (+2.3%)
Run 2, Secondary change only: 275925 vs. 272248 (+1.4%)
Run 1, Both changes: 284890 vs. 270372 (+5.3%)
Run 2, Both changes: 263511 vs. 258188 (+2.0%)
@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Oct 15, 2021
Summary: ... by bypassing tracking of last_key in BlockBuilder when
last_key is already known (for BlockBasedTableBuilder::data_block).

I tried extracting a base class of BlockBuilder without the last_key
tracking at all, but that became complicated by NewFlushBlockPolicy() in
the public API referencing BlockBuilder, which would need to be the base
class, and I don't want to replace nearly all the internal references to
BlockBuilder.

This improvement should stack with facebook#9039

Test Plan: TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1: 278929 vs. 267799 (+4.2%)
Run 2: 281836 vs. 267432 (+5.4%)
Run 3: 278279 vs. 270454 (+2.9%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)
Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@facebook-github-bot
Copy link
Contributor

@pdillinger merged this pull request in 9d66d6d.

facebook-github-bot pushed a commit that referenced this pull request Oct 19, 2021
Summary:
... by bypassing tracking of last_key in BlockBuilder when
last_key is already known (for BlockBasedTableBuilder::data_block).

I tried extracting a base class of BlockBuilder without the last_key
tracking at all, but that became complicated by NewFlushBlockPolicy() in
the public API referencing BlockBuilder, which would need to be the base
class, and I don't want to replace nearly all the internal references to
BlockBuilder.

Possible follow-up:
* Investigate / consider using AddWithLastKey in more places

This improvement should stack with #9039

Pull Request resolved: #9040

Test Plan:
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1: 278929 vs. 267799 (+4.2%)
Run 2: 281836 vs. 267432 (+5.4%)
Run 3: 278279 vs. 270454 (+2.9%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)

Reviewed By: mrambacher

Differential Revision: D31706033

Pulled By: pdillinger

fbshipit-source-id: 8a50fe6fefdd67b6d7665ffa687bbdcf5ad0d5ec
yoori pushed a commit to yoori/rocksdb that referenced this pull request Nov 26, 2023
Summary:
Primarily, this change reserves space in the std::string for building
the next block once a block is finished, using `block_size` as
reservation size. Note: also tried reusing same std::string in the
common "unbuffered" path but that showed no benefit or regression.

Secondarily, this slightly reduces the work in resetting `restarts_`.

Pull Request resolved: facebook/rocksdb#9039

Test Plan:
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1, Primary change only: 292697 vs. 280267 (+4.4%)
Run 2, Primary change only: 288763 vs. 279621 (+3.3%)
Run 1, Secondary change only: 260065 vs. 254232 (+2.3%)
Run 2, Secondary change only: 275925 vs. 272248 (+1.4%)
Run 1, Both changes: 284890 vs. 270372 (+5.3%)
Run 2, Both changes: 263511 vs. 258188 (+2.0%)

Reviewed By: zhichao-cao

Differential Revision: D31701253

Pulled By: pdillinger

fbshipit-source-id: 7e40810afbb98e6b6446955e77bda59e69b19ffd
yoori pushed a commit to yoori/rocksdb that referenced this pull request Nov 26, 2023
Summary:
... by bypassing tracking of last_key in BlockBuilder when
last_key is already known (for BlockBasedTableBuilder::data_block).

I tried extracting a base class of BlockBuilder without the last_key
tracking at all, but that became complicated by NewFlushBlockPolicy() in
the public API referencing BlockBuilder, which would need to be the base
class, and I don't want to replace nearly all the internal references to
BlockBuilder.

Possible follow-up:
* Investigate / consider using AddWithLastKey in more places

This improvement should stack with facebook/rocksdb#9039

Pull Request resolved: facebook/rocksdb#9040

Test Plan:
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control runs simulaneous for better accuracy, units = ops/sec

Run 1: 278929 vs. 267799 (+4.2%)
Run 2: 281836 vs. 267432 (+5.4%)
Run 3: 278279 vs. 270454 (+2.9%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)

Reviewed By: mrambacher

Differential Revision: D31706033

Pulled By: pdillinger

fbshipit-source-id: 8a50fe6fefdd67b6d7665ffa687bbdcf5ad0d5ec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants