-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track WAL in MANIFEST: define WAL related classes to be used in VersionEdit and VersionSet #7164
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Cheng-Chang ! I have a couple of questions and small comments, please see below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Cheng-Chang for working on this. Overall LGTM except the comments.
@ltamasi @riversand963 Thanks for review, PTAL. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Will defer to @ltamasi to stamp.
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
@cheng-chang has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Cheng-Chang ! LGTM in general, I just have a couple of final comments.
@Cheng-Chang merged this pull request in cd48eca. |
…acebook#7164) Summary: `WalAddition`, `WalDeletion` are defined in `wal_version.h` and used in `VersionEdit`. `WalAddition` is used to represent events of creating a new WAL (no size, just log number), or closing a WAL (with size). `WalDeletion` is used to represent events of deleting or archiving a WAL, it means the WAL is no longer alive (won't be replayed during recovery). `WalSet` is the set of alive WALs kept in `VersionSet`. 1. Why use `WalDeletion` instead of relying on `MinLogNumber` to identify outdated WALs On recovery, we can compute `MinLogNumber()` based on the log numbers kept in MANIFEST, any log with number < MinLogNumber can be ignored. So it seems that we don't need to persist `WalDeletion` to MANIFEST, since we can ignore the WALs based on MinLogNumber. But the `MinLogNumber()` is actually a lower bound, it does not exactly mean that logs starting from MinLogNumber must exist. This is because in a corner case, when a column family is empty and never flushed, its log number is set to the largest log number, but not persisted in MANIFEST. So let's say there are 2 column families, when creating the DB, the first WAL has log number 1, so it's persisted to MANIFEST for both column families. Then CF 0 is empty and never flushed, CF 1 is updated and flushed, so a new WAL with log number 2 is created and persisted to MANIFEST for CF 1. But CF 0's log number in MANIFEST is still 1. So on recovery, MinLogNumber is 1, but since log 1 only contains data for CF 1, and CF 1 is flushed, log 1 might have already been deleted from disk. We can make `MinLogNumber()` be the exactly minimum log number that must exist, by persisting the most recent log number for empty column families that are not flushed. But if there are N such column families, then every time a new WAL is created, we need to add N records to MANIFEST. In current design, a record is persisted to MANIFEST only when WAL is created, closed, or deleted/archived, so the number of WAL related records are bounded to 3x number of WALs. 2. Why keep `WalSet` in `VersionSet` instead of applying the `VersionEdit`s to `VersionStorageInfo` `VersionEdit`s are originally designed to track the addition and deletion of SST files. The SST files are related to column families, each column family has a list of `Version`s, and each `Version` keeps the set of active SST files in `VersionStorageInfo`. But WALs are a concept of DB, they are not bounded to specific column families. So logically it does not make sense to store WALs in a column family's `Version`s. Also, `Version`'s purpose is to keep reference to SST / blob files, so that they are not deleted until there is no version referencing them. But a WAL is deleted regardless of version references. So we keep the WALs in `VersionSet` for the purpose of writing out the DB state's snapshot when creating new MANIFESTs. Pull Request resolved: facebook#7164 Test Plan: make version_edit_test && ./version_edit_test make wal_edit_test && ./wal_edit_test Reviewed By: ltamasi Differential Revision: D22677936 Pulled By: cheng-chang fbshipit-source-id: 5a3b6890140e572ffd79eb37e6e4c3c32361a859
Summary: The updates resolve comments left from facebook#7164. Pull Request resolved: facebook#7282 Test Plan: wal_edit_test Reviewed By: ltamasi Differential Revision: D23196824 Pulled By: cheng-chang fbshipit-source-id: 797f3fef27fc72114c2be777d9eadd3429da5301
WalAddition
,WalDeletion
are defined inwal_version.h
and used inVersionEdit
.WalAddition
is used to represent events of creating a new WAL (no size, just log number), or closing a WAL (with size).WalDeletion
is used to represent events of deleting or archiving a WAL, it means the WAL is no longer alive (won't be replayed during recovery).WalSet
is the set of alive WALs kept inVersionSet
.WalDeletion
instead of relying onMinLogNumber
to identify outdated WALsOn recovery, we can compute
MinLogNumber()
based on the log numbers kept in MANIFEST, any log with number < MinLogNumber can be ignored. So it seems that we don't need to persistWalDeletion
to MANIFEST, since we can ignore the WALs based on MinLogNumber.But the
MinLogNumber()
is actually a lower bound, it does not exactly mean that logs starting from MinLogNumber must exist. This is because in a corner case, when a column family is empty and never flushed, its log number is set to the largest log number, but not persisted in MANIFEST. So let's say there are 2 column families, when creating the DB, the first WAL has log number 1, so it's persisted to MANIFEST for both column families. Then CF 0 is empty and never flushed, CF 1 is updated and flushed, so a new WAL with log number 2 is created and persisted to MANIFEST for CF 1. But CF 0's log number in MANIFEST is still 1. So on recovery, MinLogNumber is 1, but since log 1 only contains data for CF 1, and CF 1 is flushed, log 1 might have already been deleted from disk.We can make
MinLogNumber()
be the exactly minimum log number that must exist, by persisting the most recent log number for empty column families that are not flushed. But if there are N such column families, then every time a new WAL is created, we need to add N records to MANIFEST.In current design, a record is persisted to MANIFEST only when WAL is created, closed, or deleted/archived, so the number of WAL related records are bounded to 3x number of WALs.
WalSet
inVersionSet
instead of applying theVersionEdit
s toVersionStorageInfo
VersionEdit
s are originally designed to track the addition and deletion of SST files. The SST files are related to column families, each column family has a list ofVersion
s, and eachVersion
keeps the set of active SST files inVersionStorageInfo
.But WALs are a concept of DB, they are not bounded to specific column families. So logically it does not make sense to store WALs in a column family's
Version
s.Also,
Version
's purpose is to keep reference to SST / blob files, so that they are not deleted until there is no version referencing them. But a WAL is deleted regardless of version references.So we keep the WALs in
VersionSet
for the purpose of writing out the DB state's snapshot when creating new MANIFESTs.Test Plan:
make version_edit_test && ./version_edit_test
make wal_edit_test && ./wal_edit_test