Use a sorted vector instead of a map to store blob file metadata #9526
@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
LGTM. Thanks @ltamasi for the improvement.
db/version_set.h
const BlobFiles& GetBlobFiles() const { return blob_files_; }

// REQUIRES: This version has been saved (see VersionSet::SaveTo)
Just realized: it seems `SaveTo` is a method of `VersionBuilder`?
Oops, nice catch :) Will fix this across the board (there are some preexisting occurrences as well)
@ltamasi has updated the pull request. You must reimport the pull request before landing. |
…BlobGC (#9542)

Summary: Fixes a bug introduced in #9526 where we index one position past the end of a `vector`.

Pull Request resolved: #9542

Test Plan: `make asan_check`
Will add a unit test in a separate PR.

Reviewed By: akankshamahajan15
Differential Revision: D34145825
Pulled By: ltamasi
fbshipit-source-id: 4e87c948407dee489d669a3e41f59e2fcc1228d8
Summary:
The patch replaces `std::map` with a sorted `std::vector` for `VersionStorageInfo::blob_files_` and preallocates the space for the `vector` before saving the `BlobFileMetaData` into the new `VersionStorageInfo` in `VersionBuilder::Rep::SaveBlobFilesTo`. These changes reduce the time the DB mutex is held while saving new `Version`s, and using a sorted `vector` also makes lookups faster thanks to better memory locality.

In addition, the patch introduces helper methods `VersionStorageInfo::GetBlobFileMetaData` and `VersionStorageInfo::GetBlobFileMetaDataLB` that can be used by clients to perform lookups in the `vector`, and does some general cleanup in the parts of code where blob file metadata are used.
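For readers unfamiliar with the technique, here is a minimal sketch of exact and lower-bound lookups in a vector kept sorted by blob file number, plus preallocation with `reserve`. It is not the code from this patch: `BlobMeta`, `LowerBound`, `Find`, and `MakeBlobFiles` are hypothetical names invented for illustration, and the real `BlobFileMetaData` class and the `GetBlobFileMetaData` / `GetBlobFileMetaDataLB` helpers may differ in signature and behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for blob file metadata; only the file number matters here.
struct BlobMeta {
  uint64_t file_number;
};

using BlobFiles = std::vector<std::shared_ptr<BlobMeta>>;

// Returns an iterator to the first entry whose file number is >= target,
// or files.end() if there is none. Requires files to be sorted by file number.
BlobFiles::const_iterator LowerBound(const BlobFiles& files, uint64_t target) {
  assert(std::is_sorted(files.begin(), files.end(),
                        [](const std::shared_ptr<BlobMeta>& a,
                           const std::shared_ptr<BlobMeta>& b) {
                          return a->file_number < b->file_number;
                        }));
  return std::lower_bound(
      files.begin(), files.end(), target,
      [](const std::shared_ptr<BlobMeta>& lhs, uint64_t rhs) {
        return lhs->file_number < rhs;
      });
}

// Exact lookup built on the lower bound: the end iterator must be checked
// before dereferencing, otherwise we would read one past the end.
std::shared_ptr<BlobMeta> Find(const BlobFiles& files, uint64_t target) {
  const auto it = LowerBound(files, target);
  if (it != files.end() && (*it)->file_number == target) {
    return *it;
  }
  return nullptr;
}

// Builds the vector from already-sorted file numbers, reserving the space
// up front so that no reallocation happens while appending.
BlobFiles MakeBlobFiles(const std::vector<uint64_t>& sorted_numbers) {
  BlobFiles files;
  files.reserve(sorted_numbers.size());
  for (uint64_t number : sorted_numbers) {
    files.push_back(std::make_shared<BlobMeta>(BlobMeta{number}));
  }
  return files;
}
```

With files numbered 3, 7, and 12, `Find(files, 7)` returns the entry for file 7, `Find(files, 8)` returns `nullptr`, and `LowerBound(files, 8)` points at the entry for file 12. Both lookups are O(log n) binary searches over contiguous storage, which is where the memory-locality benefit over a node-based `std::map` comes from.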
Test Plan:
Ran `make check` and the crash test script for a while.

Performance was tested using a load-optimized benchmark (`fillseq` with vector memtable, no WAL) and small file sizes so that a significant number of files are produced:

Final statistics before the patch:
With the patch:

Total time to complete the benchmark is 2611 seconds with the patch, down from 2986 seconds.