Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track WAL in MANIFEST: define WAL related classes to be used in VersionEdit and VersionSet #7164

Closed
wants to merge 25 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jul 22, 2020

WalAddition, WalDeletion are defined in wal_version.h and used in VersionEdit.
WalAddition is used to represent events of creating a new WAL (no size, just log number), or closing a WAL (with size).
WalDeletion is used to represent events of deleting or archiving a WAL, it means the WAL is no longer alive (won't be replayed during recovery).

WalSet is the set of alive WALs kept in VersionSet.

  1. Why use WalDeletion instead of relying on MinLogNumber to identify outdated WALs

On recovery, we can compute MinLogNumber() based on the log numbers kept in MANIFEST, any log with number < MinLogNumber can be ignored. So it seems that we don't need to persist WalDeletion to MANIFEST, since we can ignore the WALs based on MinLogNumber.

But the MinLogNumber() is actually a lower bound, it does not exactly mean that logs starting from MinLogNumber must exist. This is because in a corner case, when a column family is empty and never flushed, its log number is set to the largest log number, but not persisted in MANIFEST. So let's say there are 2 column families, when creating the DB, the first WAL has log number 1, so it's persisted to MANIFEST for both column families. Then CF 0 is empty and never flushed, CF 1 is updated and flushed, so a new WAL with log number 2 is created and persisted to MANIFEST for CF 1. But CF 0's log number in MANIFEST is still 1. So on recovery, MinLogNumber is 1, but since log 1 only contains data for CF 1, and CF 1 is flushed, log 1 might have already been deleted from disk.

We can make MinLogNumber() be the exactly minimum log number that must exist, by persisting the most recent log number for empty column families that are not flushed. But if there are N such column families, then every time a new WAL is created, we need to add N records to MANIFEST.

In current design, a record is persisted to MANIFEST only when WAL is created, closed, or deleted/archived, so the number of WAL related records are bounded to 3x number of WALs.

  1. Why keep WalSet in VersionSet instead of applying the VersionEdits to VersionStorageInfo

VersionEdits are originally designed to track the addition and deletion of SST files. The SST files are related to column families, each column family has a list of Versions, and each Version keeps the set of active SST files in VersionStorageInfo.

But WALs are a concept of DB, they are not bounded to specific column families. So logically it does not make sense to store WALs in a column family's Versions.
Also, Version's purpose is to keep reference to SST / blob files, so that they are not deleted until there is no version referencing them. But a WAL is deleted regardless of version references.
So we keep the WALs in VersionSet for the purpose of writing out the DB state's snapshot when creating new MANIFESTs.

Test Plan:
make version_edit_test && ./version_edit_test
make wal_edit_test && ./wal_edit_test

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @Cheng-Chang ! I have a couple of questions and small comments, please see below.

db/version_edit.cc Outdated Show resolved Hide resolved
db/version_edit.cc Outdated Show resolved Hide resolved
db/version_edit.cc Outdated Show resolved Hide resolved
db/version_edit.h Outdated Show resolved Hide resolved
db/version_edit_test.cc Outdated Show resolved Hide resolved
db/wal_version.cc Outdated Show resolved Hide resolved
db/wal_version.h Outdated Show resolved Hide resolved
db/wal_version.h Outdated Show resolved Hide resolved
db/wal_version_test.cc Outdated Show resolved Hide resolved
db/wal_version.h Outdated Show resolved Hide resolved
Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Cheng-Chang for working on this. Overall LGTM except the comments.

db/version_edit.h Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@ghost
Copy link
Author

ghost commented Jul 29, 2020

@ltamasi @riversand963 Thanks for review, PTAL.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will defer to @ltamasi to stamp.

CMakeLists.txt Outdated Show resolved Hide resolved
db/version_edit.h Outdated Show resolved Hide resolved
db/wal_version.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

CMakeLists.txt Show resolved Hide resolved
db/version_edit.cc Outdated Show resolved Hide resolved
db/version_edit.h Outdated Show resolved Hide resolved
db/wal_edit.h Outdated Show resolved Hide resolved
db/wal_edit_test.cc Outdated Show resolved Hide resolved
db/wal_edit.cc Show resolved Hide resolved
db/wal_edit.cc Show resolved Hide resolved
db/version_edit_test.cc Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Cheng-Chang ! LGTM in general, I just have a couple of final comments.

db/wal_edit.cc Show resolved Hide resolved
db/wal_edit.cc Show resolved Hide resolved
db/wal_edit_test.cc Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@Cheng-Chang merged this pull request in cd48eca.

@ghost ghost changed the title Define WAL related classes to be used in VersionEdit and VersionSet Track WAL in MANIFEST: define WAL related classes to be used in VersionEdit and VersionSet Aug 15, 2020
@ghost ghost mentioned this pull request Aug 18, 2020
facebook-github-bot pushed a commit that referenced this pull request Aug 20, 2020
Summary:
The updates resolve comments left from #7164.

Pull Request resolved: #7282

Test Plan: wal_edit_test

Reviewed By: ltamasi

Differential Revision: D23196824

Pulled By: cheng-chang

fbshipit-source-id: 797f3fef27fc72114c2be777d9eadd3429da5301
codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
…acebook#7164)

Summary:
`WalAddition`, `WalDeletion` are defined in `wal_version.h` and used in `VersionEdit`.
`WalAddition` is used to represent events of creating a new WAL (no size, just log number), or closing a WAL (with size).
`WalDeletion` is used to represent events of deleting or archiving a WAL, it means the WAL is no longer alive (won't be replayed during recovery).

`WalSet` is the set of alive WALs kept in `VersionSet`.

1. Why use `WalDeletion` instead of relying on `MinLogNumber` to identify outdated WALs

On recovery, we can compute `MinLogNumber()` based on the log numbers kept in MANIFEST, any log with number < MinLogNumber can be ignored. So it seems that we don't need to persist `WalDeletion` to MANIFEST, since we can ignore the WALs based on MinLogNumber.

But the `MinLogNumber()` is actually a lower bound, it does not exactly mean that logs starting from MinLogNumber must exist. This is because in a corner case, when a column family is empty and never flushed, its log number is set to the largest log number, but not persisted in MANIFEST. So let's say there are 2 column families, when creating the DB, the first WAL has log number 1, so it's persisted to MANIFEST for both column families. Then CF 0 is empty and never flushed, CF 1 is updated and flushed, so a new WAL with log number 2 is created and persisted to MANIFEST for CF 1. But CF 0's log number in MANIFEST is still 1. So on recovery, MinLogNumber is 1, but since log 1 only contains data for CF 1, and CF 1 is flushed, log 1 might have already been deleted from disk.

We can make `MinLogNumber()` be the exactly minimum log number that must exist, by persisting the most recent log number for empty column families that are not flushed. But if there are N such column families, then every time a new WAL is created, we need to add N records to MANIFEST.

In current design, a record is persisted to MANIFEST only when WAL is created, closed, or deleted/archived, so the number of WAL related records are bounded to 3x number of WALs.

2. Why keep `WalSet` in `VersionSet` instead of applying the `VersionEdit`s to `VersionStorageInfo`

`VersionEdit`s are originally designed to track the addition and deletion of SST files. The SST files are related to column families, each column family has a list of `Version`s, and each `Version` keeps the set of active SST files in `VersionStorageInfo`.

But WALs are a concept of DB, they are not bounded to specific column families. So logically it does not make sense to store WALs in a column family's `Version`s.
Also, `Version`'s purpose is to keep reference to SST / blob files, so that they are not deleted until there is no version referencing them. But a WAL is deleted regardless of version references.
So we keep the WALs in `VersionSet`  for the purpose of writing out the DB state's snapshot when creating new MANIFESTs.

Pull Request resolved: facebook#7164

Test Plan:
make version_edit_test && ./version_edit_test
make wal_edit_test && ./wal_edit_test

Reviewed By: ltamasi

Differential Revision: D22677936

Pulled By: cheng-chang

fbshipit-source-id: 5a3b6890140e572ffd79eb37e6e4c3c32361a859
codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
Summary:
The updates resolve comments left from facebook#7164.

Pull Request resolved: facebook#7282

Test Plan: wal_edit_test

Reviewed By: ltamasi

Differential Revision: D23196824

Pulled By: cheng-chang

fbshipit-source-id: 797f3fef27fc72114c2be777d9eadd3429da5301
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants