Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rawkv: Fix unstable test-region-merge #13580

Merged
merged 5 commits into from
Oct 11, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 0 additions & 3 deletions components/causal_ts/src/tso.rs
Original file line number Diff line number Diff line change
Expand Up @@ -622,9 +622,6 @@ impl<C: PdClient + 'static> CausalTsProvider for BatchTsoProvider<C> {
}

async fn async_flush(&self) -> Result<TimeStamp> {
fail::fail_point!("causal_ts_provider_flush", |_| Err(box_err!(
"async_flush err(failpoints)"
)));
self.renew_tso_batch(true, TsoBatchRenewReason::flush)
.await?;
// TODO: Return the first tso by renew_tso_batch instead of async_get_ts
Expand Down
22 changes: 11 additions & 11 deletions tests/failpoints/cases/test_rawkv.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,7 @@ impl TestSuite {
}
}

const FP_CAUSAL_TS_PROVIDER_FLUSH: &str = "causal_ts_provider_flush";
const FP_GET_TSO: &str = "test_raftstore_get_tso";

/// Verify correctness on leader transfer.
// TODO: simulate and test for the scenario of issue #12498.
Expand All @@ -164,8 +164,8 @@ fn test_leader_transfer() {
assert_eq!(suite.must_raw_get(key1), Some(b"v4".to_vec()));
}

// Disable CausalObserver::flush_timestamp to produce causality issue.
fail::cfg(FP_CAUSAL_TS_PROVIDER_FLUSH, "return").unwrap();
// Make causal_ts_provider.async_flush() & handle_update_max_timestamp fail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why failpoint FP_GET_TSO cant make causal_ts_provider.async_flush() & handle_update_max_timestamp fail ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The critical path of causal_ts_provider.async_flush() is getting a batch of TSOs from PD and renew the TSO cache. So the error of get_tso must make causal_ts_provider.async_flush() fail.
Please refer to BatchTsoProvider::renew_tso_batch_impl().

And handle_update_max_timestamp() invokes causal_ts_provider.async_flus() here (when storage.api-version = 2). So handle_update_max_timestamp() will also fail (and keep retrying until succeed).

fail::cfg(FP_GET_TSO, "return(50)").unwrap();

// Transfer leader and write to store 2.
{
Expand All @@ -182,8 +182,8 @@ fn test_leader_transfer() {
// Transfer leader back.
suite.must_transfer_leader(&region, 1);
suite.must_leader_on_store(key1, 1);
// Enable CausalObserver::flush_timestamp.
fail::cfg(FP_CAUSAL_TS_PROVIDER_FLUSH, "off").unwrap();
// Make handle_update_max_timestamp succeed.
fail::cfg(FP_GET_TSO, "off").unwrap();
// Transfer leader and write to store 2 again.
{
suite.must_transfer_leader(&region, 2);
Expand All @@ -195,7 +195,7 @@ fn test_leader_transfer() {
assert_eq!(suite.must_raw_get(key1), Some(b"v8".to_vec()));
}

fail::remove(FP_CAUSAL_TS_PROVIDER_FLUSH);
fail::remove(FP_GET_TSO);
suite.stop();
}

Expand Down Expand Up @@ -238,8 +238,8 @@ fn test_region_merge() {
assert_eq!(suite.must_raw_get(keys[1]), Some(b"v4".to_vec()));
}

// Disable CausalObserver::flush_timestamp to produce causality issue.
fail::cfg(FP_CAUSAL_TS_PROVIDER_FLUSH, "return").unwrap();
// Make causal_ts_provider.async_flush() & handle_update_max_timestamp fail.
fail::cfg(FP_GET_TSO, "return(50)").unwrap();

// Merge region 1 to 3.
{
Expand All @@ -253,8 +253,8 @@ fn test_region_merge() {
assert_eq!(suite.must_raw_get(keys[1]), Some(b"v4".to_vec()));
}

// Enable CausalObserver::flush_timestamp.
fail::cfg(FP_CAUSAL_TS_PROVIDER_FLUSH, "off").unwrap();
// Make handle_update_max_timestamp succeed.
fail::cfg(FP_GET_TSO, "off").unwrap();

// Merge region 3 to 5.
{
Expand All @@ -268,6 +268,6 @@ fn test_region_merge() {
assert_eq!(suite.must_raw_get(keys[1]), Some(b"v8".to_vec()));
}

fail::remove(FP_CAUSAL_TS_PROVIDER_FLUSH);
fail::remove(FP_GET_TSO);
suite.stop();
}