Permalink
Show file tree
Hide file tree
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Integrate Opencensus and fix Jepsen bank test (#2764)
Got it. Fixed it. Squashed it. Done with it. Tamed it. Time to bask in the glory of victory! Fixed Jepsen bank test violation during a network partition. The violation was happening because: 1. Current Zero leader receives a commit request and assigns a commit timestamp. 2. Gets partitioned out, so the proposal gets blocked. 3. Another Zero becomes the leader and renews the txn timestamp lease, starting at a much higher number (previous lease + lease bandwidth of 10K). 4. The new leader services new txn requests, which advances all Alphas to a much higher MaxApplied timestamp. 5. The previous leader, who is now the follower, retries the old commit proposal and succeeds. This causes 2 issues. a) A later txn read doesn't see this commit, and advances. b) Alpha rejects a write on a posting list key, which already has a new commit at higher ts. Both of the scenarios caused bank txn test violation. Open Census and Dgraph debug disect was instrumental in determining the cause of this violation. The tell-tale sign was noticing a /Commit timeout of one of the penultimate commits, to the violating commit. Fixes: 0. Use OpenCensus as a torch to light every path that a txn took, to determine the cause. 1. Do not allow a context deadline when proposing a txn commit. 2. Do not allow Zero follower to propagate proposals to Zero leader. Tested this PR to fix issue #2391 . Tested with partition-ring and clock-skew nemesis in a 10-node cluster. More testing is needed around predicate moves. Changelog: * Trial: Do not purge anything (needs to be refactored and reintroduced before a release). * More debugging, more tracing for Jepsen. * Opencensus in Zero. * One fix for Jepsen test, which ensures that a context deadline cannot just cancel a txn proposal. Need some refactoring of how Zero gets membership updates, now that we need Zero to not forward proposals to the leader. * Update Raft lib, so we have access to the feature to disallow Raft proposal forwarding. * Update raftpb proto package as well * Dirty changes to ensure that Zero followers don't forward proposals to Zero leader. * Various Opencensus integration changes.
- Loading branch information
Showing
22 changed files
with
654 additions
and
251 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.