-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Force a new manifest file if append to current one fails #6331
Conversation
ASSERT_NE(new_manifest, old_manifest); | ||
|
||
Reopen(options); | ||
ASSERT_EQ("val", Get(Key(0))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check the value of key(1) also?
} | ||
ASSERT_NE(new_manifest, old_manifest); | ||
Reopen(options); | ||
ASSERT_EQ("val", Get(Key(0))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here
@@ -204,6 +198,12 @@ Status ErrorHandler::SetBGError(const Status& bg_err, BackgroundErrorReason reas | |||
|
|||
new_bg_err = Status(bg_err, sev); | |||
|
|||
// Check if recovery is currently in progress. If it is, we will save this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this block just before if (new_bg_err == Status::NoSpace()) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but is there a particular reason? The reason for moving this down here is so recovery_error_ will have the correct severity. This way, the caller of DB::Resume() will receive an error indicating whether they can retry resume later or not. We don't necessarily care how new_bg_err_ is handled from this point on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. Can you remove the ";" in line 210 in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the process crashes after closing the previous, potentially corrupted MANIFEST, but before creating the new MANIFEST? Will the db still be able to recover?
db/error_handler_test.cc
Outdated
DestroyAndReopen(options); | ||
ASSERT_OK(dbfull()->GetLiveFiles(live_files, &manifest_size, false)); | ||
for (auto& file : live_files) { | ||
if (file.find("MANIFEST") != std::string::npos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use ParseFileName()
here and in a few other places?
db/error_handler_test.cc
Outdated
ASSERT_NE(new_manifest, old_manifest); | ||
Reopen(options); | ||
ASSERT_EQ("val", Get(Key(0))); | ||
Destroy(options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we let DBErrorHandlingTest
, a subclass of DBTestBase
observe KEEP_DB
env variable. See ~DBTestBase()
. This allows us to optionally preserve the db directory for debugging purpose.
Isn't it the same as basic crash recovery? |
It is, and will cause later recovery to fail iiuc. Just trying to understand the scope of this fix. |
Actually, I think there is a difference, especially if the writes are batched. A crash while writing manifest will only result in the last record being partially written, and that will be skipped on open I think. The previous records in the batch will be valid. However, if the DB stays up (either read-only or read-write), the files referenced by the previous records will be deleted and recovery on re-open will fail. I think this is the issue you were trying to fix in #5379? |
Thanks for reminding me of this. I checked log::Reader, and you are right, the trailing incomplete version edit in the MANIFEST will not be reported as failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @anand1976 for the fix.
ASSERT_NE(new_manifest, old_manifest); | ||
|
||
Reopen(options); | ||
ASSERT_EQ("val", Get(Key(0))); | ||
Destroy(options); | ||
ASSERT_EQ("val", Get(Key(1))); | ||
Close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Close()
at the end of each test is not needed, but it's harmless.
Summary: When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operaitons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Test Plan: Add new unit tests in error_handler_test.cc Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
1ec461a
to
ebe2496
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anand1976 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Appveyor failure is unrelated. |
The FB internal failures are due to an infra issue and unrelated to this PR. So going ahead with landing this PR. |
@anand1976 merged this pull request in fb05b5a. |
Summary: Fix for issue #6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: #6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3
Summary: Fix for issue #6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: #6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3
Summary: Fix for issue facebook#6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: facebook#6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3
Summary: Fix for issue facebook#6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: facebook#6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3 Signed-off-by: Yi Wu <yiwu@pingcap.com>
Summary: Fix for issue facebook#6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: facebook#6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3
Summary: Fix for issue facebook/rocksdb#6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: facebook/rocksdb#6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3 Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Fix for issue facebook/rocksdb#6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operatons such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: facebook/rocksdb#6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3 Signed-off-by: Changlong Chen <levisonchen@live.cn>
Fix for issue #6316
When an append/sync of the manifest file fails due to an IO error such
as NoSpace, we don't always put the DB in read-only mode. This is true
for flush and compactions, as well as foreground operatons such as column family
add/drop, CompactFiles etc. Subsequent changes to the DB will be
recorded in the same manifest file, which would have a corrupted record
in the middle due to the previous failure. On next DB::Open(), it will
fail to process the full manifest and data will be lost.
To fix this, we reset VersionSet::descriptor_log_ on append/sync
failure, which will force a new manifest file to be written on the next
append.
Test Plan:
Add new unit tests in error_handler_test.cc