Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce binary search when reseek into the same data block #5256

Closed
wants to merge 1 commit into from

Conversation

siying
Copy link
Contributor

@siying siying commented Apr 26, 2019

Summary: Right now, when Seek() is called again, RocksDB always does a binary search against the files and index blocks, even if they end up with the same file/block. Improve it as following:

  1. in LevelIterator, reseek first try to check the boundary of the current file. If it falls into the same file, skip the binary search to find the file
  2. in block based table iterator, reseek skip to reseek the iterator block if the seek key is larger than the current key and lower than the index key (boundary of the current block and the next block).

Test Plan: add a unit test and run all existing tests

@siying
Copy link
Contributor Author

siying commented Apr 26, 2019

This is resubmit of a part of #4217

@siying siying changed the title LevelIterator to try to search for the file to read when Seek() is called again Try to avoid binary search when reseek a key into the same data block in forward order Apr 26, 2019
@siying
Copy link
Contributor Author

siying commented Apr 26, 2019

Decided to totally recreate #4217 but limit the cases to make things simple.

@siying siying changed the title Try to avoid binary search when reseek a key into the same data block in forward order Reduce binary search during reseek Apr 26, 2019
@siying siying changed the title Reduce binary search during reseek Reduce binary search when reseek into the same data block Apr 26, 2019
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@anand1976 anand1976 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will this optimization apply to the top level of a 2 level index as well, i.e don't reseek in top level if the key falls in the current 2nd level index block?

@siying
Copy link
Contributor Author

siying commented May 1, 2019

@anand1976 it does. That's why user key is used, not internal key, because for the second tier index, sometimes only user key is available. I didn't realized that it is applied to it in the beginning and only found it when I saw some unit tests failed.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

…lled again

Summary: Right now, when Seek() is called again, RocksDB always does a binary search against the files and find the file to read. Change it to first check the boundary of the current file.

Test Plan: run all existing tests

Do it for block iterator too.

Fix a bug

Update HISTORY.md
@siying siying reopened this May 1, 2019
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying merged this pull request in 4479dff.

facebook-github-bot pushed a commit that referenced this pull request May 10, 2019
Summary:
#5256 broke it: `block_iter_.user_key()` may not be valid even if `block_iter_points_to_real_block_` is true. E.g. if there was an IO error or Status::Incomplete.
Pull Request resolved: #5291

Differential Revision: D15273324

Pulled By: al13n321

fbshipit-source-id: 442e5b09f9884a58f92a6ac1ca93af719c219886
vagogte pushed a commit to vagogte/rocksdb that referenced this pull request Jun 18, 2019
)

Summary:
Right now, when Seek() is called again, RocksDB always does a binary search against the files and index blocks, even if they end up with the same file/block. Improve it as following:
1. in LevelIterator, reseek first try to check the boundary of the current file. If it falls into the same file, skip the binary search to find the file
2. in block based table iterator, reseek skip to reseek the iterator block if the seek key is larger than the current key and lower than the index key (boundary of the current block and the next block).
Pull Request resolved: facebook#5256

Differential Revision: D15105072

Pulled By: siying

fbshipit-source-id: 39634bdb4a881082451fa39cecd7ecf12160bf80
vagogte pushed a commit to vagogte/rocksdb that referenced this pull request Jun 18, 2019
Summary:
facebook#5256 broke it: `block_iter_.user_key()` may not be valid even if `block_iter_points_to_real_block_` is true. E.g. if there was an IO error or Status::Incomplete.
Pull Request resolved: facebook#5291

Differential Revision: D15273324

Pulled By: al13n321

fbshipit-source-id: 442e5b09f9884a58f92a6ac1ca93af719c219886
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants