-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor: add LineFileReader and Status::MustCheck #8026
Conversation
Summary: Removed confusing, awkward, and undocumented internal API ReadOneLine and replaced with very simple LineFileReader. In refactoring backupable_db.cc, this has the side benefit of removing the arbitrary cap on the size of backup metadata files. Also added Status::MustCheck to make it easy to mark a Status as "must check." Using this, I can ensure that after LineFileReader::ReadLine returns false the caller checks GetStatus(). Also removed some excessive conditional compilation in status.h Test Plan: existing tests, incl playing with ASSERT_STATUS_CHECKED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
} else { | ||
line.clear(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this else is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although LineFileReader::ReadLine actually records an empty line on 'false' return, that's not guaranteed by the API comment.
std::vector<std::shared_ptr<FileInfo>> files; | ||
while (backup_meta_reader->ReadLine(&line)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you do not want to loop only up to num_files any longer? It seems a little cleaner to initialize num_files = 0 (or the strtoul) and then loop up to that many entries. If there are still more lines, then return an error at the end. As it is now, if the code has gibberish at the end, you will process the gibberish and return something about that gibberish. If you duplicated the contents of the "files" section, you will read them all, and then say "nope, too many"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I need another 'if' and certainly another local variable to do it the way you propose, so this way is simpler.
std::vector<std::shared_ptr<FileInfo>> files; | ||
while (backup_meta_reader->ReadLine(&line)) { | ||
std::vector<std::string> components = StringSplit(line, ' '); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will this do if there are multiple consecutive spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't care
if (components[2] != ROCKSDB_NAMESPACE::ToString(checksum_value)) { | ||
return Status::Corruption("Invalid checksum value for " + filename + | ||
" in " + meta_filename_); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this happen? I think the original code was really checking to make sure the checksum_value was at EOL. I guess it might happen if the checksum_value had extra characters at the end? But we do not check that condition anywhere else (what if num_files=1234ABCD?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_files is redundant information. We don't need to check its syntax strictly because we have a later check on its value.
checksum should be canonically formatted to be accepted, and that's mostly for debugability.
Besides, all of these parsing details are nits because this is not a user-serviceable data format. It's internal. Rejecting data that wasn't strictly generated by StoreToFile is only "nice to have."
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor comments, otherwise LGTM
} | ||
ASSERT_OK(reader->GetStatus()); | ||
ASSERT_EQ(count, nlines); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a couple more tests:
- Read after EOF returns false but Status is still OK
} | ||
ASSERT_TRUE(reader->GetStatus().IsCorruption()); | ||
ASSERT_LT(count, nlines / 2); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a few more tests:
- Read after error still returns false and the status is still the same
// the line to `out`, or returning false on failure. You must check | ||
// GetStatus to determine whether the failure was just end-of-file (OK | ||
// status) or an I/O error (another status). | ||
bool ReadLine(std::string* out); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the comment to state that out does not contain the newline character.
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@pdillinger merged this pull request in 4b18c46. |
Summary: Removed confusing, awkward, and undocumented internal API ReadOneLine and replaced with very simple LineFileReader. In refactoring backupable_db.cc, this has the side benefit of removing the arbitrary cap on the size of backup metadata files. Also added Status::MustCheck to make it easy to mark a Status as "must check." Using this, I can ensure that after LineFileReader::ReadLine returns false the caller checks GetStatus(). Also removed some excessive conditional compilation in status.h Pull Request resolved: facebook/rocksdb#8026 Test Plan: added unit test, and running tests with ASSERT_STATUS_CHECKED Reviewed By: mrambacher Differential Revision: D26831687 Pulled By: pdillinger fbshipit-source-id: ef749c265a7a26bb13cd44f6f0f97db2955f6f0f Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Removed confusing, awkward, and undocumented internal API ReadOneLine and replaced with very simple LineFileReader. In refactoring backupable_db.cc, this has the side benefit of removing the arbitrary cap on the size of backup metadata files. Also added Status::MustCheck to make it easy to mark a Status as "must check." Using this, I can ensure that after LineFileReader::ReadLine returns false the caller checks GetStatus(). Also removed some excessive conditional compilation in status.h Pull Request resolved: facebook/rocksdb#8026 Test Plan: added unit test, and running tests with ASSERT_STATUS_CHECKED Reviewed By: mrambacher Differential Revision: D26831687 Pulled By: pdillinger fbshipit-source-id: ef749c265a7a26bb13cd44f6f0f97db2955f6f0f Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Removed confusing, awkward, and undocumented internal API
ReadOneLine and replaced with very simple LineFileReader.
In refactoring backupable_db.cc, this has the side benefit of
removing the arbitrary cap on the size of backup metadata files.
Also added Status::MustCheck to make it easy to mark a Status as
"must check." Using this, I can ensure that after
LineFileReader::ReadLine returns false the caller checks GetStatus().
Also removed some excessive conditional compilation in status.h
Test Plan: added unit test, and running tests with ASSERT_STATUS_CHECKED