Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BlobDB GC: add SST <-> oldest blob file referenced mapping #5903

Closed
wants to merge 5 commits into from

Conversation

ltamasi
Copy link
Contributor

@ltamasi ltamasi commented Oct 10, 2019

Summary:
This is groundwork for adding garbage collection support to BlobDB. The
patch adds logic that keeps track of the oldest blob file referred to by
each SST file. The oldest blob file is identified during flush/
compaction (similarly to how the range of keys covered by the SST is
identified), and persisted in the manifest as a custom field of the new
file edit record. Blob indexes with TTL are ignored for the purposes of
identifying the oldest blob file (since such blob files are cleaned up by the
TTL logic in BlobDB).

Test Plan:
Added new unit tests; also ran db_bench in BlobDB mode, inspected the
manifest using ldb, and confirmed (by scanning the SST files using
sst_dump) that the value of the oldest blob file number field matches
the contents of the file for each SST.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yiwu-arbug
Copy link
Contributor

Great to see the BlobDB work is continued. Just curious how the oldest blob file going to use for GC? Thanks.

Copy link
Contributor

@siying siying left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks great to me!

return;
}

if (blob_index.file_number() == kInvalidBlobFileNumber) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really -- this shouldn't happen since BlobDB numbers the blob files starting from 1. It's just a paranoid check; I'll mark it as such.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ltamasi added a commit to ltamasi/rocksdb that referenced this pull request Oct 14, 2019
Summary:
Extracted from PR facebook#5903 for technical reasons.

Test Plan:
make check
facebook-github-bot pushed a commit that referenced this pull request Oct 14, 2019
Summary:
Extracted from PR #5903 for technical reasons.
Pull Request resolved: #5919

Test Plan: make check

Differential Revision: D17910132

Pulled By: ltamasi

fbshipit-source-id: 6ecbb8d6e84b2a1d1f28575ad48ac3cc65833eb5
Summary:
This is groundwork for adding garbage collection support to BlobDB. The
patch adds logic that keeps track of the oldest blob file referred to by
each SST file. The oldest blob file is identified during flush/
compaction (similarly to how the range of keys covered by the SST is
identified), and persisted in the manifest as a custom field of the new
file edit record.

Test Plan:
Added new unit tests; also ran db_bench in BlobDB mode, inspected the
manifest using ldb, and confirmed (by scanning the SST files using
sst_dump) that the value of the oldest blob file number field matches
the contents of the file for each SST.
@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ltamasi merged this pull request in 5f025ea.

merryChris pushed a commit to merryChris/rocksdb that referenced this pull request Nov 18, 2019
Summary:
Extracted from PR facebook#5903 for technical reasons.
Pull Request resolved: facebook#5919

Test Plan: make check

Differential Revision: D17910132

Pulled By: ltamasi

fbshipit-source-id: 6ecbb8d6e84b2a1d1f28575ad48ac3cc65833eb5
merryChris pushed a commit to merryChris/rocksdb that referenced this pull request Nov 18, 2019
…5903)

Summary:
This is groundwork for adding garbage collection support to BlobDB. The
patch adds logic that keeps track of the oldest blob file referred to by
each SST file. The oldest blob file is identified during flush/
compaction (similarly to how the range of keys covered by the SST is
identified), and persisted in the manifest as a custom field of the new
file edit record. Blob indexes with TTL are ignored for the purposes of
identifying the oldest blob file (since such blob files are cleaned up by the
TTL logic in BlobDB).
Pull Request resolved: facebook#5903

Test Plan:
Added new unit tests; also ran db_bench in BlobDB mode, inspected the
manifest using ldb, and confirmed (by scanning the SST files using
sst_dump) that the value of the oldest blob file number field matches
the contents of the file for each SST.

Differential Revision: D17859997

Pulled By: ltamasi

fbshipit-source-id: 21662c137c6259a6af70446faaf3a9912c550e90
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants