raft: scaffolding for store liveness and leader leases #123789
Draft
nvb wants to merge 30 commits into cockroachdb:master
We can use this to split up the prototyping effort. There's work to do below the new StoreLiveness interface, in Raft using StoreLiveness, and above Raft using the new Status.LeadSupportUntil field. To collaborate on this, we can push new commits to this branch. Try to avoid force pushing to prevent skew.
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
This commit creates a new replicaRLockedStoreLiveness intermediary type which translates replica IDs to store IDs using an rlocked replica's RangeDescriptor and uses the resulting store ID to call into a real (and as-of-yet unimplemented) StoreLiveness instance.
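As a rough illustration of the translation layer described above, here is a minimal sketch. The `StoreLiveness` method shape, the ID types, and the map-based descriptor are all assumptions made for the example; the real interface is not yet implemented and the real wrapper reads the store ID out of the replica's RangeDescriptor while the replica is read-locked.

```go
package main

import "fmt"

// Hypothetical ID types, standing in for the real replica and store IDs.
type ReplicaID int
type StoreID int

// StoreLiveness is an assumed, simplified shape of the interface from the PR.
type StoreLiveness interface {
	SupportFor(id StoreID) (epoch uint64, ok bool)
}

// rangeDescriptor is a stand-in that maps replica IDs to their stores.
type rangeDescriptor map[ReplicaID]StoreID

// replicaStoreLiveness translates replica IDs to store IDs and then
// delegates to the underlying StoreLiveness instance.
type replicaStoreLiveness struct {
	desc rangeDescriptor
	sl   StoreLiveness
}

func (r replicaStoreLiveness) SupportFor(id ReplicaID) (uint64, bool) {
	storeID, ok := r.desc[id]
	if !ok {
		return 0, false // replica not in the descriptor: report no support
	}
	return r.sl.SupportFor(storeID)
}

// fakeStoreLiveness is a test double keyed by store ID.
type fakeStoreLiveness map[StoreID]uint64

func (f fakeStoreLiveness) SupportFor(id StoreID) (uint64, bool) {
	epoch, ok := f[id]
	return epoch, ok
}

func main() {
	sl := replicaStoreLiveness{
		desc: rangeDescriptor{1: 10, 2: 20},
		sl:   fakeStoreLiveness{10: 3},
	}
	epoch, ok := sl.SupportFor(1)
	fmt.Println(epoch, ok) // replica 1 lives on store 10, supported at epoch 3
	epoch, ok = sl.SupportFor(2)
	fmt.Println(epoch, ok) // store 20 provides no support
}
```

Keeping Raft in terms of replica IDs and doing the lookup in one intermediary means the Raft code never needs to know about stores.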
This patch adds and implements two new message types -- MsgFortify and MsgFortifyResp. Using these, the leader broadcasts a fortification request to all its followers when it wins a vote. It handles responses by updating its leadSupport map.

Things still missing:

1. We're not evaluating whether we're supported by a majority of followers, and until when.
2. We're not persisting lead and leadEpoch to disk.
3. Re-fortification isn't handled.
4. Testing.

Release note: None
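The broadcast-and-record flow from this commit can be sketched roughly as follows. The `message` and `leader` structs and their fields are hypothetical simplifications for illustration, not the types in the patch; only the message names and the leadSupport map come from the commit message.

```go
package main

import "fmt"

type msgType int

const (
	msgFortify msgType = iota
	msgFortifyResp
)

// message is a hypothetical, trimmed-down Raft message.
type message struct {
	Type     msgType
	From, To uint64
	Term     uint64
	Epoch    uint64 // store liveness epoch at which the follower supports the leader
	Reject   bool
}

type leader struct {
	id, term    uint64
	peers       []uint64
	leadSupport map[uint64]uint64 // follower ID -> supported epoch
}

// bcastFortify builds one MsgFortify per follower; called once a vote is won.
func (l *leader) bcastFortify() []message {
	var out []message
	for _, p := range l.peers {
		if p == l.id {
			continue // the leader does not message itself
		}
		out = append(out, message{Type: msgFortify, From: l.id, To: p, Term: l.term})
	}
	return out
}

// handleFortifyResp records the follower's support, ignoring stale-term
// or rejected responses.
func (l *leader) handleFortifyResp(m message) {
	if m.Type != msgFortifyResp || m.Term != l.term || m.Reject {
		return
	}
	l.leadSupport[m.From] = m.Epoch
}

func main() {
	l := &leader{id: 1, term: 5, peers: []uint64{1, 2, 3}, leadSupport: map[uint64]uint64{}}
	fmt.Println(len(l.bcastFortify()), "fortification requests")
	l.handleFortifyResp(message{Type: msgFortifyResp, From: 2, To: 1, Term: 5, Epoch: 4})
	fmt.Println(l.leadSupport)
}
```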
d4662c5 to 7fdc252
This patch mocks store liveness in datadriven tests and adds a few directives to bump epochs and withdraw support. It then constructs an "interesting" store liveness state and runs a new election. In doing so, we ensure that MsgFortifyResp are populated correctly based on the store liveness state. Release note: None
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
This patch builds re-fortification. Now, on every tick, the leader tries to re-fortify any peers that were never fortified or that have withdrawn support from a fortified epoch. We only do this if the peer's store is currently supporting the leader's store; otherwise, we have to wait until it does. Release note: None
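A minimal sketch of the per-tick selection rule, under assumed data shapes: `fortified` maps a peer to the epoch at which it last fortified the leader, and `supported` maps a peer to the epoch at which its store currently supports the leader's store (absent means no support). Both maps and the function name are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// peersToFortify returns the peers that should receive a fortification
// request on this tick, in ascending ID order.
func peersToFortify(peers []uint64, fortified, supported map[uint64]uint64) []uint64 {
	var out []uint64
	for _, p := range peers {
		cur, ok := supported[p]
		if !ok {
			continue // store is not supporting the leader's store: wait
		}
		if epoch, done := fortified[p]; done && epoch == cur {
			continue // already fortified at the current epoch
		}
		// Either never fortified, or fortified at an epoch whose support
		// has since been withdrawn.
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	fortified := map[uint64]uint64{2: 1, 3: 1}
	supported := map[uint64]uint64{2: 1, 3: 2, 4: 1}
	fmt.Println(peersToFortify([]uint64{2, 3, 4, 5}, fortified, supported))
}
```

In this example peer 2 is already fortified, peer 3 moved to a new epoch, peer 4 was never fortified, and peer 5's store provides no support yet, so only peers 3 and 4 are selected.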
This commit adds a ClockTimestamp to RaftMessageRequestBatch.
489fde9 to d20e4c2
This patch is based on Sumeer's prototype in cockroachdb#122547. The main differences are many simplifications due to the algorithm changes between take 3 and take 4. This patch adds only the basic heartbeating capability and structure of the store liveness fabric. The algorithm logic will come in separate commits.
d20e4c2 to 80f5be6
Two main changes:

- The transport is no longer RPC-based but streaming, with a per-node send queue for outgoing heartbeat requests and separate per-store receive queues for incoming heartbeat requests and responses.
- The algorithm logic is in a very basic state. Persistence is not implemented yet (there are TODOs). There is zero testing.
This makes the transport look more like etcd/raft.
cf7420e to 4d36801
This patch introduces a new SupportTracker struct that tracks support provided by followers to a leader. The leader can then use this tracked support to calculate its QSE, which is used by higher layers. Release note: None
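One way to picture the calculation, assuming QSE means the latest timestamp up to which a quorum of voters still supports the leader: sort the per-voter support expirations in descending order and take the quorum-th entry. The function name, the `int64` timestamps, and treating a missing entry as "no support" are assumptions of this sketch, not the SupportTracker API.

```go
package main

import (
	"fmt"
	"sort"
)

// quorumSupportedUntil returns the largest timestamp t such that a majority
// of voters have a support expiration >= t. Voters without an entry in
// supportUntil count as expiration 0.
func quorumSupportedUntil(voters []uint64, supportUntil map[uint64]int64) int64 {
	if len(voters) == 0 {
		return 0
	}
	exps := make([]int64, 0, len(voters))
	for _, v := range voters {
		exps = append(exps, supportUntil[v])
	}
	// Descending order: exps[k-1] is supported by at least k voters.
	sort.Slice(exps, func(i, j int) bool { return exps[i] > exps[j] })
	quorum := len(voters)/2 + 1
	return exps[quorum-1]
}

func main() {
	support := map[uint64]int64{1: 100, 2: 80}
	fmt.Println(quorumSupportedUntil([]uint64{1, 2, 3}, support)) // 80
}
```

With three voters, two of which support the leader until 100 and 80, the quorum of two is only guaranteed until 80.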
4c2d7d7 to 3b969e3
The state machine should only campaign if it isn't supporting the leader. The only exception is if the leader has explicitly asked a follower to do so by initiating a campaignTransfer. Epic: none Release note: None
7a2d26c to 3a35370
A follower should only vote for a candidate if it isn't supporting a leader. Epic: none Release note: None
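The campaigning and voting rules from these two commits reduce to a pair of small predicates. This is only a sketch; the function and parameter names are hypothetical and the real checks live inside the Raft state machine.

```go
package main

import "fmt"

// shouldCampaign reports whether a replica may start an election.
// It must not campaign while supporting the leader, unless the leader
// explicitly asked it to via a campaign transfer.
func shouldCampaign(supportingLeader, transferRequested bool) bool {
	return !supportingLeader || transferRequested
}

// shouldGrantVote reports whether a follower may vote for a candidate.
// A follower that still supports a leader refuses.
func shouldGrantVote(supportingLeader bool) bool {
	return !supportingLeader
}

func main() {
	fmt.Println(shouldCampaign(true, false)) // false: still supporting the leader
	fmt.Println(shouldCampaign(true, true))  // true: leader requested a transfer
	fmt.Println(shouldGrantVote(true))       // false: still supporting the leader
}
```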
The cluster setting `kv.storeliveness.enabled` starts and stops the main store liveness loop.
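A rough sketch of how a boolean setting can gate the loop: each tick checks the flag and does nothing while it is off. The `livenessLoop` type and its fields are hypothetical, and an `atomic.Bool` stands in for the real cluster setting lookup.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// livenessLoop is a hypothetical stand-in for the main store liveness loop.
type livenessLoop struct {
	enabled    atomic.Bool // stand-in for kv.storeliveness.enabled
	heartbeats int         // number of heartbeat rounds performed
}

// onTick runs one iteration of the loop; it is a no-op while disabled.
func (l *livenessLoop) onTick() {
	if !l.enabled.Load() {
		return
	}
	l.heartbeats++
}

func main() {
	l := &livenessLoop{}
	l.onTick() // disabled: nothing happens
	l.enabled.Store(true)
	l.onTick()
	l.onTick()
	l.enabled.Store(false)
	l.onTick() // disabled again
	fmt.Println(l.heartbeats)
}
```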
95ffbfb to 2adef81
2067153 to f927061