Enable blob caching for MultiGetBlob in RocksDB #10272
Conversation
Summary:
- Enabled blob caching for MultiGetBlob in RocksDB
- Refactored MultiGetBlob logic and interface in RocksDB
- Cleaned up Version::MultiGetBlob() and moved blob-related code snippets into BlobSource

This task is a part of facebook#10156
Looks great, thanks @gangliao !
Thanks for the updates @gangliao ! Looks pretty awesome 🎉
 if (!IsValidBlobOffset(offset, key_size, value_size, file_size_)) {
-  *blob_req->status = Status::Corruption("Invalid blob offset");
+  *blob_reqs[i]->status = Status::Corruption("Invalid blob offset");
I think there's a bit more to this: with this change, it is no longer guaranteed that read_reqs[i] corresponds to blob_reqs[i], because we only create FSReadRequests for blobs that pass the offset and compression type checks. We would also want to make sure we do not overwrite these Corruptions with something else when we loop over blob_reqs below.
That makes sense; this is a tricky situation. We can likely use an unordered map or even a vector to save the original index; in that case, we can still identify the mapping between them.
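A minimal sketch of the index-saving idea suggested above. The types and the validation check are simplified, hypothetical stand-ins (not the real RocksDB `BlobReadRequest`/`FSReadRequest` or `IsValidBlobOffset`); the point is only that a side vector of original indices restores the `read_reqs[j]` → `blob_reqs[i]` mapping once some blob requests are skipped:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a per-blob read request.
struct BlobReadRequest {
  uint64_t offset = 0;
  std::string status = "OK";  // stand-in for rocksdb::Status
};

// Create I/O requests only for blob requests that pass validation, and
// remember each one's original index so results can be mapped back.
// Returns original_index, where read_reqs[j] corresponds to
// blob_reqs[original_index[j]].
std::vector<size_t> BuildReadRequests(std::vector<BlobReadRequest>& blob_reqs,
                                      uint64_t file_size) {
  std::vector<size_t> original_index;
  for (size_t i = 0; i < blob_reqs.size(); ++i) {
    if (blob_reqs[i].offset >= file_size) {  // simplified validity check
      blob_reqs[i].status = "Corruption: Invalid blob offset";
      continue;  // no read request is created for this blob
    }
    original_index.push_back(i);  // stands in for creating read_reqs[j]
  }
  return original_index;
}
```

With this mapping, a later loop over the read results can write each result back to `blob_reqs[original_index[j]]` without assuming the two vectors line up.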
Just brainstorming here, but we might also be able to use a bitmask to track which requests failed this initial validation.
I may be overthinking this. We don't need any auxiliary structure, since that info is already in blob_reqs[i]->status.
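The "no auxiliary structure" approach can be sketched as follows, again with a simplified, hypothetical request type rather than the real RocksDB one: validation failures are recorded in each request's own status, so later passes simply skip non-OK entries instead of consulting an index map or bitmask, and the earlier Corruption status is never overwritten.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified per-blob request; status is a stand-in for
// rocksdb::Status and starts out OK.
struct BlobReq {
  std::string status = "OK";
};

// Process only the requests that passed earlier validation. Returns the
// number of requests actually processed.
size_t ProcessRemaining(std::vector<BlobReq>& blob_reqs) {
  size_t processed = 0;
  for (BlobReq& req : blob_reqs) {
    if (req.status != "OK") {
      continue;  // preserve the earlier Corruption status; do not overwrite
    }
    ++processed;  // stands in for issuing/consuming the actual blob read
  }
  return processed;
}
```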
@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM, thanks @gangliao !
@gangliao has updated the pull request. You must reimport the pull request before landing.
@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: Update HISTORY.md for blob cache. Implementation can be found from GitHub issue #10156 (or GitHub PRs #10155, #10178, #10225, #10198, and #10272).

Pull Request resolved: #10328
Reviewed By: riversand963
Differential Revision: D37732514
Pulled By: gangliao
fbshipit-source-id: 4c942a41c07914bfc8db56a0d3cf4d3e53d5963f
Summary:
This task is a part of #10156