-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce binary search when reseek into the same data block #5256
Conversation
This is resubmit of a part of #4217 |
Decided to totally recreate #4217 but limit the cases to make things simple. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Will this optimization apply to the top level of a 2 level index as well, i.e don't reseek in top level if the key falls in the current 2nd level index block?
@anand1976 it does. That's why user key is used, not internal key, because for the second tier index, sometimes only user key is available. I didn't realized that it is applied to it in the beginning and only found it when I saw some unit tests failed. |
@siying has updated the pull request. Re-import the pull request |
…lled again Summary: Right now, when Seek() is called again, RocksDB always does a binary search against the files and find the file to read. Change it to first check the boundary of the current file. Test Plan: run all existing tests Do it for block iterator too. Fix a bug Update HISTORY.md
@siying has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: #5256 broke it: `block_iter_.user_key()` may not be valid even if `block_iter_points_to_real_block_` is true. E.g. if there was an IO error or Status::Incomplete. Pull Request resolved: #5291 Differential Revision: D15273324 Pulled By: al13n321 fbshipit-source-id: 442e5b09f9884a58f92a6ac1ca93af719c219886
) Summary: Right now, when Seek() is called again, RocksDB always does a binary search against the files and index blocks, even if they end up with the same file/block. Improve it as following: 1. in LevelIterator, reseek first try to check the boundary of the current file. If it falls into the same file, skip the binary search to find the file 2. in block based table iterator, reseek skip to reseek the iterator block if the seek key is larger than the current key and lower than the index key (boundary of the current block and the next block). Pull Request resolved: facebook#5256 Differential Revision: D15105072 Pulled By: siying fbshipit-source-id: 39634bdb4a881082451fa39cecd7ecf12160bf80
Summary: facebook#5256 broke it: `block_iter_.user_key()` may not be valid even if `block_iter_points_to_real_block_` is true. E.g. if there was an IO error or Status::Incomplete. Pull Request resolved: facebook#5291 Differential Revision: D15273324 Pulled By: al13n321 fbshipit-source-id: 442e5b09f9884a58f92a6ac1ca93af719c219886
Summary: Right now, when Seek() is called again, RocksDB always does a binary search against the files and index blocks, even if they end up with the same file/block. Improve it as following:
Test Plan: add a unit test and run all existing tests