-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
db_stress support tracking historical values #8960
Conversation
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
2d02085
to
4741881
Compare
@ajkr has updated the pull request. You must reimport the pull request before landing. |
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ajkr for the PR. Left a few initial minor comments before taking closer look.
db_stress_tool/db_stress_gflags.cc
Outdated
@@ -789,7 +790,10 @@ DEFINE_int32(get_property_one_in, 1000, | |||
|
|||
DEFINE_bool(sync_fault_injection, false, | |||
"If true, FaultInjectionTestFS will be used for write operations, " | |||
" and unsynced data in DB will lost after crash."); | |||
"and unsynced data in DB will lost after crash. In such a case we " | |||
"track DB changes in a trace file in --expected_values_dir for " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe give an example of the name of a trace file? For example, it can be reworded a bit like:
track DB changes in a trace file (*.trace)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expectation if expected_values_dir
is empty but sync_fault_injection
is set to true? The doc for sync_fault_injection
seems to imply that "in such a case, we will need expected_values_dir
.
If so, can we add a check in command line arguments parsing and processing, so that we make sure user sets expected_values_dir
if they set sync_fault_injection
to true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe give an example of the name of a trace file?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expectation if expected_values_dir is empty but sync_fault_injection is set to true?
Empty expected_values_dir
means no verification at startup, and verification at shutdown based only on the operations performed by that particular db_stress
run. Combining it with sync_fault_injection=true
would not bring additional coverage (compared to sync_fault_injection=false
) because sync_fault_injection
only causes unsynced data to be dropped at process end, which is after all verifications when expected_values_dir
is empty.
Since I believe the combination brings no additional coverage, it is currently not supported. The error looks like this:
$ ./db_stress --expected_values_dir="" --sync_fault_injection=true
...
Error enabling history tracing: Not implemented:
db_stress_tool/expected_state.cc
Outdated
@@ -239,6 +386,10 @@ Status AnonExpectedStateManager::Open() { | |||
return latest_->Open(true /* create */); | |||
} | |||
|
|||
Status AnonExpectedStateManager::SaveAtAndAfter(DB* /* db */) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can this be inlined just to save some typing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
std::string temp_path = GetTempPathForFilename(kLatestFilename); | ||
Status s = Env::Default()->FileExists(temp_path); | ||
#ifndef ROCKSDB_LITE | ||
Status FileExpectedStateManager::SaveAtAndAfter(DB* db) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is SaveAtAndAfter
only going to be called on DB open, or eventually also anytime the WAL is synced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to call it anytime the WAL is synced. However there are some difficulties with parallel writers mutating the "*.state" file while it's being copied, for example. We would need to figure out the locking to make it possible.
4741881
to
8242f86
Compare
@ajkr has updated the pull request. You must reimport the pull request before landing. |
2 similar comments
@ajkr has updated the pull request. You must reimport the pull request before landing. |
@ajkr has updated the pull request. You must reimport the pull request before landing. |
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
ecacfa3
to
db46d2e
Compare
@ajkr has updated the pull request. You must reimport the pull request before landing. |
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a reasonable starting place to me 👍
db_stress_tool/expected_state.cc
Outdated
} | ||
} | ||
if (s.ok()) { | ||
if (old_saved_seqno != kMaxSequenceNumber && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: duplicate if
could be simplified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@ajkr has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
db_stress_tool/expected_state.cc
Outdated
} | ||
} | ||
if (s.ok()) { | ||
if (old_saved_seqno != kMaxSequenceNumber && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
When
--sync_fault_injection
is set, this PR takes a snapshot of the expected values and starts an operation trace when the DB is opened. These files are stored in--expected_values_dir
. They will be used for recovering the expected state of the DB following a crash where a suffix of unsynced operations are allowed to be lost.Test Plan: injected crashed at various points in
FileExpectedStateManager
and verified the next run recovers the state/trace file with highest seqno and removes all older/temporary files. Note we don't use sync_fault_injection in CI crash tests yet.