Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate blob file writing with the flush logic #7345

Closed
wants to merge 36 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
eebe29b
Integrate BlobFileBuilder into BuildTable/CompactionIterator and emit…
ltamasi Jul 9, 2020
dd6eee4
Use the new AddBlobFile(Garbage) overloads in VersionEdit::DecodeFrom()
ltamasi Aug 25, 2020
312d430
Introduce VersionEdit::{SetBlobFileAdditions, SetBlobFileGarbages}
ltamasi Aug 25, 2020
3b88847
Use SetBlobFileAdditions in the flush code path
ltamasi Aug 25, 2020
f65eba1
Add TODOs re: flush logs/stats
ltamasi Aug 26, 2020
f4bc1f1
Log blob file numbers when committing flush results
ltamasi Aug 26, 2020
c603093
Log head and tail of blob files after flush
ltamasi Aug 26, 2020
54eda60
Small formatting change in GetBlobAdditionInfo
ltamasi Aug 26, 2020
6692a9b
Remove a few TODOs that got addressed
ltamasi Aug 26, 2020
a9a6ccf
Add VersionStorageInfo::BlobFileSummary
ltamasi Aug 27, 2020
c48f12e
Print blob file summary after flush
ltamasi Aug 27, 2020
db2dfed
Finish blob file before table file in BuildTable
ltamasi Aug 27, 2020
575a70e
Account for blob files when computing stats for flush
ltamasi Aug 28, 2020
d1f6d47
Revert "Log blob file numbers when committing flush results"
ltamasi Sep 1, 2020
b0b036c
Log only blob file count in MemTableList
ltamasi Sep 1, 2020
40068be
Revert "Add VersionStorageInfo::BlobFileSummary"
ltamasi Sep 1, 2020
d5247c8
Log blob file summary after flush in a more efficient way
ltamasi Sep 1, 2020
bcf31b5
Apply formatting
ltamasi Sep 2, 2020
1cc9f54
Add a test case to db_flush_test
ltamasi Sep 2, 2020
a328822
Check values after flush using Get
ltamasi Sep 2, 2020
37c6447
Only set bytes_written/num_output_files in CompactionStats if we wrot…
ltamasi Sep 2, 2020
27eaa3e
Add a test case for BlobFileBuilder::Add failures
ltamasi Sep 2, 2020
3f5f021
Parameterize the error test case and use it to cover Finish() as well
ltamasi Sep 2, 2020
358258b
Check if files generated by failed jobs are deleted
ltamasi Sep 2, 2020
a6d24e1
Keep track of the paths of all the blob files opened in the builder a…
ltamasi Sep 3, 2020
55bb048
Remove leftover empty test case
ltamasi Sep 3, 2020
8152e5e
Add comment
ltamasi Sep 3, 2020
7f18644
Pass blob_file_path to VerifyBlobFile
ltamasi Sep 3, 2020
3f3bd3e
Small cleanup
ltamasi Sep 3, 2020
d2d5a21
More of the same
ltamasi Sep 3, 2020
3ad0d9f
Make FaultInjectionTestEnv a member and call Close in the DBFlushTest…
ltamasi Sep 3, 2020
6c94527
Clear blob_file_additions upon failure in BuildTable
ltamasi Sep 3, 2020
6f0f59e
Make clang-format happy
ltamasi Sep 3, 2020
bffe597
Explicitly cast blobs.size() to int
ltamasi Sep 3, 2020
c422f81
Small cleanup: switch the order of two checks
ltamasi Sep 8, 2020
404edc5
Assert that passed in vectors are empty
ltamasi Sep 11, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions db/blob/blob_file_builder.cc
Original file line number Diff line number Diff line change
Expand Up @@ -31,12 +31,13 @@ BlobFileBuilder::BlobFileBuilder(
int job_id, uint32_t column_family_id,
const std::string& column_family_name, Env::IOPriority io_priority,
Env::WriteLifeTimeHint write_hint,
std::vector<std::string>* blob_file_paths,
std::vector<BlobFileAddition>* blob_file_additions)
: BlobFileBuilder([versions]() { return versions->NewFileNumber(); }, env,
fs, immutable_cf_options, mutable_cf_options,
file_options, job_id, column_family_id,
column_family_name, io_priority, write_hint,
blob_file_additions) {}
blob_file_paths, blob_file_additions) {}

BlobFileBuilder::BlobFileBuilder(
std::function<uint64_t()> file_number_generator, Env* env, FileSystem* fs,
Expand All @@ -45,6 +46,7 @@ BlobFileBuilder::BlobFileBuilder(
int job_id, uint32_t column_family_id,
const std::string& column_family_name, Env::IOPriority io_priority,
Env::WriteLifeTimeHint write_hint,
std::vector<std::string>* blob_file_paths,
std::vector<BlobFileAddition>* blob_file_additions)
: file_number_generator_(std::move(file_number_generator)),
env_(env),
Expand All @@ -59,6 +61,7 @@ BlobFileBuilder::BlobFileBuilder(
column_family_name_(column_family_name),
io_priority_(io_priority),
write_hint_(write_hint),
blob_file_paths_(blob_file_paths),
blob_file_additions_(blob_file_additions),
blob_count_(0),
blob_bytes_(0) {
Expand All @@ -67,7 +70,10 @@ BlobFileBuilder::BlobFileBuilder(
assert(fs_);
assert(immutable_cf_options_);
assert(file_options_);
assert(blob_file_paths_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also assert blob_file_paths_.empty()? Is there a case when it is not empty here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will add another assertion to that effect.

assert(blob_file_paths_->empty());
assert(blob_file_additions_);
assert(blob_file_additions_->empty());
}

BlobFileBuilder::~BlobFileBuilder() = default;
Expand Down Expand Up @@ -145,7 +151,7 @@ Status BlobFileBuilder::OpenBlobFileIfNeeded() {

assert(immutable_cf_options_);
assert(!immutable_cf_options_->cf_paths.empty());
const std::string blob_file_path = BlobFileName(
std::string blob_file_path = BlobFileName(
immutable_cf_options_->cf_paths.front().path, blob_file_number);

std::unique_ptr<FSWritableFile> file;
Expand All @@ -161,14 +167,20 @@ Status BlobFileBuilder::OpenBlobFileIfNeeded() {
}
}

// Note: files get added to blob_file_paths_ right after the open, so they
// can be cleaned up upon failure. Contrast this with blob_file_additions_,
// which only contains successfully written files.
assert(blob_file_paths_);
blob_file_paths_->emplace_back(std::move(blob_file_path));

assert(file);
file->SetIOPriority(io_priority_);
file->SetWriteLifeTimeHint(write_hint_);

Statistics* const statistics = immutable_cf_options_->statistics;

std::unique_ptr<WritableFileWriter> file_writer(new WritableFileWriter(
std::move(file), blob_file_path, *file_options_, env_,
std::move(file), blob_file_paths_->back(), *file_options_, env_,
nullptr /*IOTracer*/, statistics, immutable_cf_options_->listeners,
immutable_cf_options_->file_checksum_gen_factory));

Expand Down
3 changes: 3 additions & 0 deletions db/blob/blob_file_builder.h
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ class BlobFileBuilder {
const std::string& column_family_name,
Env::IOPriority io_priority,
Env::WriteLifeTimeHint write_hint,
std::vector<std::string>* blob_file_paths,
std::vector<BlobFileAddition>* blob_file_additions);

BlobFileBuilder(std::function<uint64_t()> file_number_generator, Env* env,
Expand All @@ -47,6 +48,7 @@ class BlobFileBuilder {
const std::string& column_family_name,
Env::IOPriority io_priority,
Env::WriteLifeTimeHint write_hint,
std::vector<std::string>* blob_file_paths,
std::vector<BlobFileAddition>* blob_file_additions);

BlobFileBuilder(const BlobFileBuilder&) = delete;
Expand Down Expand Up @@ -79,6 +81,7 @@ class BlobFileBuilder {
std::string column_family_name_;
Env::IOPriority io_priority_;
Env::WriteLifeTimeHint write_hint_;
std::vector<std::string>* blob_file_paths_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since we assert blob_file_paths_.empty() && blob_file_additions_.empty(), would it be easier to make these two members belong to BlobFileBuilder?
Furthermore, in non-debug mode, the assertion becomes no-op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I didn't make these members is that they outlive BlobFileBuilder and I wanted to avoid copying them (or having to introduce unconventional "read once" type of accessors that transfer them to the caller).

std::vector<BlobFileAddition>* blob_file_additions_;
std::unique_ptr<BlobLogWriter> writer_;
uint64_t blob_count_;
Expand Down
Loading