Integrated blob garbage collection: relocate blobs #7694

Closed
ltamasi wants to merge 19 commits

Conversation

ltamasi (Contributor) commented Nov 18, 2020

Summary:
The patch adds basic garbage collection support to the integrated BlobDB
implementation. Valid blobs residing in the oldest blob files are relocated
as they are encountered during compaction. The threshold that determines
which blob files qualify is computed based on the configuration option
`blob_garbage_collection_age_cutoff`, which was introduced in #7661.
Once a blob is retrieved for the purposes of relocation, it passes through the
same logic that extracts large values to blob files in general. This means that
if, for instance, the size threshold for key-value separation (`min_blob_size`)
got changed or writing blob files got disabled altogether, it is possible for the
value to be moved back into the LSM tree. In particular, one way to re-inline
all blob values if needed would be to perform a full manual compaction with
`enable_blob_files` set to `false`, `enable_blob_garbage_collection` set to
`true`, and `blob_garbage_collection_age_cutoff` set to `1.0`.
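
For illustration, here is a minimal sketch of what such a re-inlining pass could look like with the public API. The helper name `ReinlineAllBlobs` is mine, and the sketch assumes these blob options can be changed at runtime via `SetOptions` (they can alternatively be set when opening the DB):

```cpp
#include "rocksdb/db.h"
#include "rocksdb/options.h"

// Sketch: force all blob values back into the LSM tree via a full manual
// compaction. `db` and `cf` are assumed to be an open DB and the target
// column family; error handling is minimal for brevity.
rocksdb::Status ReinlineAllBlobs(rocksdb::DB* db,
                                 rocksdb::ColumnFamilyHandle* cf) {
  // Stop writing new blob files, but keep garbage collection enabled and
  // make every existing blob file eligible for relocation.
  rocksdb::Status s =
      db->SetOptions(cf, {{"enable_blob_files", "false"},
                          {"enable_blob_garbage_collection", "true"},
                          {"blob_garbage_collection_age_cutoff", "1.0"}});
  if (!s.ok()) {
    return s;
  }

  // Full manual compaction: relocated blobs no longer qualify for extraction
  // (blob files are disabled), so their values are written back into the SSTs.
  rocksdb::CompactRangeOptions cro;
  return db->CompactRange(cro, cf, /*begin=*/nullptr, /*end=*/nullptr);
}
```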

Some TODOs that I plan to address in separate PRs:

  1. We'll have to measure the amount of new garbage in each blob file and log
     `BlobFileGarbage` entries as part of the compaction job's `VersionEdit`.
     (For the time being, blob files are cleaned up solely based on the
     `oldest_blob_file_number` relationships.)
  2. When compression is used for blobs, the compression type hasn't changed,
     and the blob still qualifies for being written to a blob file, we can simply copy
     the compressed blob to the new file instead of going through decompression
     and compression.
  3. We need to update the formula for computing write amplification to account
     for the amount of data read from blob files as part of GC.

Test Plan:
make check

facebook-github-bot (Contributor) left a comment

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

riversand963 (Contributor) left a comment

LGTM with a few minor comments. Thanks @ltamasi for the PR.

void PrepareOutput();

bool ExtractLargeValueImpl();

riversand963 (Contributor):
Maybe it's better to put some comment on these methods?
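
For illustration only, the kind of documentation being asked for might read roughly as follows; the wording (and the assumed return-value semantics) are mine, not the comments that ended up in the code:

```cpp
// Finalizes the value of the key-value pair about to be emitted, e.g. by
// deciding whether a large value should be extracted to a blob file or an
// existing blob should be relocated as part of garbage collection.
void PrepareOutput();

// Writes the current large value to a blob file and replaces it with a blob
// index; the return value is assumed to indicate whether processing of the
// current key-value pair can continue.
bool ExtractLargeValueImpl();
```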

std::advance(
it, compaction->blob_garbage_collection_age_cutoff() * blob_files.size());

return it != blob_files.end() ? it->first

riversand963 (Contributor):
Just curious: if !blob_files.empty() and it == blob_files.end(), will returning blob_files.back()->first + 1 also be correct?

ltamasi (Author):
Yes, that would also work.
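
To make this exchange easier to follow, here is a simplified, self-contained stand-in for the cutoff computation being discussed (not the actual RocksDB code; in particular, the sentinel returned when the cutoff covers every file is my choice for illustration):

```cpp
#include <cstdint>
#include <iterator>
#include <limits>
#include <map>

// Given the blob files of a version keyed by file number, and the configured
// age cutoff in [0.0, 1.0], return the first blob file number that is NOT
// eligible for garbage collection. Blobs living in files with a smaller
// number get relocated when the compaction encounters them.
uint64_t ComputeGcCutoffFileNumber(
    const std::map<uint64_t, uint64_t>& blob_files /* number -> size */,
    double age_cutoff) {
  auto it = blob_files.begin();
  // The oldest age_cutoff fraction of the files falls below the cutoff.
  std::advance(it, static_cast<size_t>(age_cutoff * blob_files.size()));

  // If the cutoff covers all files (e.g. age_cutoff == 1.0), any value larger
  // than every existing file number works; as noted above, "largest existing
  // file number + 1" would be equally valid.
  return it != blob_files.end() ? it->first
                                : std::numeric_limits<uint64_t>::max();
}
```

For example, with `age_cutoff == 0.25` and four blob files numbered {4, 7, 9, 12}, the iterator is advanced by one and the function returns 7, so only blobs residing in file 4 are relocated.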

void PrepareOutput();

bool ExtractLargeValueImpl();
void ExtractLargeValue();

riversand963 (Contributor):
Neither ExtractLargeValue() nor ExtractLargeValueImpl() guarantees to extract a large value. Maybe it would be clearer to name them MaybeExtractLargeValueIfNeeded() and MaybeExtractLargeValueIfNeededImpl()?

ltamasi (Author) commented Nov 24, 2020

Thanks so much for the review, @riversand963!

facebook-github-bot (Contributor):
@ltamasi has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot (Contributor) left a comment

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@ltamasi merged this pull request in 51a8dc6.

codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021