Fix bug for WalManager with compressed WAL #10130
@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Thanks @akankshamahajan15 for the fix! Left a few comments.
// In case of wal_compression, it writes a `kSetCompressionType` record
// which is not associated with any sequence number. As result for an empty
// file, GetSortedWalsOfType() will skip these WALs causing the operations
// to fail.
> As result for an empty file, GetSortedWalsOfType() will skip these WALs causing the operations to fail.
With this PR's fix, is this still true? No operation will fail, right?
I will rewrite the comment to say that the operations will not fail.
Skipping an empty file in GetSortedWalsOfType() should not cause a failure because, even if we enable WAL tracking in the MANIFEST, we do not log a WAL with synced size 0 to the MANIFEST. Therefore, when we recover later, we won't be looking for such WALs.
@riversand963 I was thinking of running the crash tests with |
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.
@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: RocksDB uses WalManager to manage WAL files. WalManager::ReadFirstLine() assumes that reading the first record of a valid WAL file returns an OK status and sets the output sequence number to a non-zero value.
WAL compression breaks this assumption: it writes a `kSetCompressionType` record, which is not associated with any sequence number. Consequently, WalManager::GetSortedWalsOfType() skips these WALs and does not return them to callers such as Checkpoint and Backup, causing those operations to fail.
Test Plan:
- Newly added test
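
To illustrate the broken assumption described in the summary, here is a minimal, self-contained sketch. It does not use RocksDB's actual reader or WalManager code: `RecordKind`, `Record`, `Wal`, and both functions are hypothetical stand-ins, and skipping metadata records is only one plausible way to handle the problem, not necessarily what this PR does. A return value of 0 models "no sequence number found", which is the case where the WAL would be skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for WAL record kinds; not RocksDB's real types.
enum class RecordKind { kSetCompressionType, kWriteBatch };

struct Record {
  RecordKind kind;
  uint64_t sequence;  // Meaningful only for kWriteBatch records.
};

using Wal = std::vector<Record>;

// Models the old assumption: the first record always carries a sequence
// number. Returns 0 when it does not, so the WAL looks empty to the caller.
uint64_t FirstSequenceNaive(const Wal& wal) {
  if (wal.empty()) return 0;
  const Record& first = wal.front();
  return first.kind == RecordKind::kWriteBatch ? first.sequence : 0;
}

// Assumed alternative: skip records that carry no sequence number and
// report the sequence of the first write batch, or 0 if there is none.
uint64_t FirstSequenceSkippingMetadata(const Wal& wal) {
  for (const Record& r : wal) {
    if (r.kind == RecordKind::kWriteBatch) return r.sequence;
  }
  return 0;
}

// Usage example: a compressed WAL starts with a kSetCompressionType record.
void Example() {
  Wal compressed = {{RecordKind::kSetCompressionType, 0},
                    {RecordKind::kWriteBatch, 42}};
  assert(FirstSequenceNaive(compressed) == 0);
  assert(FirstSequenceSkippingMetadata(compressed) == 42);
}
```

In this model, a WAL that contains only the `kSetCompressionType` record still yields 0 from both functions, which matches the review discussion about empty WALs being skipped.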