raft: centralize configuration change application #10865

tbg · 2019-06-28T12:39:15Z

Put all the logic related to applying a configuration change in one
place in preparation for adding joint consensus.

This inspired various TODOs.

I had to rewrite TestSnapshotSucceedViaAppResp since it was relying
on a snapshot applied to the leader, which is now prevented.

tbg · 2019-06-28T12:40:20Z

cc @nvanbenschoten (sorry - can't request reviews from you here -- @gyuho could you add @nvanbenschoten to the repo?)

codecov-io · 2019-06-28T13:19:32Z

Codecov Report

Merging #10865 into master will increase coverage by 0.34%.
The diff coverage is 78.94%.

@@            Coverage Diff             @@
##           master   #10865      +/-   ##
==========================================
+ Coverage   62.81%   63.16%   +0.34%     
==========================================
  Files         398      398              
  Lines       37447    37483      +36     
==========================================
+ Hits        23523    23675     +152     
+ Misses      12336    12225     -111     
+ Partials     1588     1583       -5

Impacted Files	Coverage Δ
raft/quorum/quorum.go	`100% <ø> (ø)`	⬆️
raft/rawnode.go	`67.5% <100%> (+4.14%)`	⬆️
raft/quorum/majority.go	`100% <100%> (ø)`	⬆️
raft/node.go	`93.67% <100%> (+0.92%)`	⬆️
raft/tracker/tracker.go	`81.05% <46.15%> (-6.16%)`	⬇️
raft/quorum/joint.go	`92.3% <50%> (-7.7%)`	⬇️
raft/raft.go	`90.71% <79.16%> (-0.59%)`	⬇️
auth/simple_token.go	`67.47% <0%> (-19.52%)`	⬇️
auth/store.go	`45.83% <0%> (-16.91%)`	⬇️
etcdserver/util.go	`87.5% <0%> (-2.5%)`	⬇️
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d506962...691a9df.

bdarnell · 2019-06-28T19:53:30Z

raft/raft.go

+		// follower state. This should never fire, but if it did, we'd have
+		// prevented damage by returning early, so log only a loud warning.
+		r.logger.Warningf("%x attempted to restore snapshot as leader; should never happen", r.id)
+		return false


Add a comment to this function explaining the bool return value. Should we also step down to follower in this case?
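
A minimal sketch of what a documented bool return and the leader guard could look like. The `node` type, its fields, and the `restore` signature are simplified, hypothetical stand-ins rather than the actual raft definitions, and stepping down to follower is deliberately left out because the question is still open in this thread.

```go
package main

// stateType and node are hypothetical stand-ins for the real raft types;
// only the fields needed to illustrate the guards are modeled.
type stateType int

const (
	stateFollower stateType = iota
	stateLeader
)

type node struct {
	id        uint64
	state     stateType
	committed uint64
}

// restore reports whether the snapshot ending at snapIndex was applied.
// It returns false and leaves the node untouched when the snapshot is
// stale or when the node is currently a leader.
func (n *node) restore(snapIndex uint64) bool {
	if snapIndex <= n.committed {
		return false // stale snapshot, nothing to do
	}
	if n.state == stateLeader {
		// Defense-in-depth: a leader should never be sent a snapshot.
		// Returning early prevents damage; the caller can log a warning.
		return false
	}
	n.committed = snapIndex
	return true
}
```

With the contract spelled out in the doc comment, callers can treat `false` uniformly as "snapshot ignored" without knowing which guard fired.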

nvanbenschoten · 2019-06-28T21:05:00Z

raft/raft.go

+	}
+
+	// More defense-in-depth: throw away snapshot if recipient is not in the
+	// config.


Could you describe what could go wrong if we didn't do this?

Don't really have a good answer here, but I added something. The "am I currently in the config" checks show up in random places and it doesn't seem there's a system for them. I mostly just don't want to have to even think about it in this method.
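
As a rough illustration of the "am I in the config" guard, the check can be kept in one small helper so the restore path does not have to reason about it. The `confState` type and both function names below are assumptions made for this sketch, not the real raft API.

```go
package main

// confState is a hypothetical, simplified stand-in for the configuration
// carried in a snapshot's metadata.
type confState struct {
	voters   []uint64
	learners []uint64
}

// contains reports whether id appears in cs as either a voter or a learner.
func (cs confState) contains(id uint64) bool {
	for _, members := range [][]uint64{cs.voters, cs.learners} {
		for _, member := range members {
			if member == id {
				return true
			}
		}
	}
	return false
}

// shouldRestore returns false for a snapshot whose configuration does not
// include the recipient, so the caller can drop it before touching state.
func shouldRestore(recipient uint64, cs confState) bool {
	return cs.contains(recipient)
}
```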

nvanbenschoten · 2019-06-28T21:10:02Z

raft/raft.go

 	}
+
+	pr := r.prs.Progress[r.id]
+	pr.MaybeUpdate(pr.Next - 1) // TODO(tbg): this is untested and likely unneeded


This looks new. What prompted you to add it?

No, it's not new, just looks different now. I've traced this back to basically the age of the dinosaurs but there was no reason given for doing this. I don't think it matters, but am not sure enough to rip it out via a drive-by.
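
To make the effect of the questioned line concrete, here is a small model. It assumes that `MaybeUpdate(n)` acts as "the peer has acknowledged everything up to n": it raises the match index if n is higher and keeps the next index above n. The `progress` type is a hypothetical stand-in, not the tracker's real `Progress`.

```go
package main

// progress is a simplified, hypothetical model of a peer's replication state.
type progress struct {
	match uint64 // highest log index known to be replicated on the peer
	next  uint64 // next log index to send to the peer
}

// maybeUpdate records that the peer acknowledged index n. It returns true
// only if the match index actually advanced.
func (p *progress) maybeUpdate(n uint64) bool {
	updated := false
	if p.match < n {
		p.match = n
		updated = true
	}
	if p.next < n+1 {
		p.next = n + 1
	}
	return updated
}
```

Under that assumption, calling `maybeUpdate(p.next - 1)` on the node's own progress marks every entry before `next` as matched and leaves `next` unchanged.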

nvanbenschoten · 2019-06-28T21:11:23Z

raft/raft.go

+			if isLearner && !pr.IsLearner {
+				// Can only change Learner to Voter.
+				//
+				// TODO(tbg): why?


Did we come to a decision on this?

I don't think there's a good reason, but we're waiting for @xiang90 and @siddontang for more input. My plan so far is to implement everything as if we wanted to allow demotions (incl unit tests etc) but then not actually expose them to the outside world. We certainly don't need them in CRDB.

nvanbenschoten · 2019-06-28T21:18:03Z

raft/raft_test.go

@@ -356,8 +356,8 @@ func TestLearnerPromotion(t *testing.T) {

 	nt.send(pb.Message{From: 1, To: 1, Type: pb.MsgBeat})

-	n1.addNode(2)


nit: for the purpose of avoiding verbosity and the diff, you might consider adding addNode, removeNode, and addLearner as test-only (in raft_test.go) methods. They would be one-liners.

Not a bad idea, but I'll keep as is.
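
For reference, the suggested test-only helpers could look roughly like this. Every type here (`confChange`, `confChangeType`, `raft`) is a hypothetical stand-in that only mimics the shape of the real messages, and `applyConfChange` is a toy implementation of the single entry point this PR centralizes on.

```go
package main

type confChangeType int

const (
	confChangeAddNode confChangeType = iota
	confChangeRemoveNode
	confChangeAddLearnerNode
)

// confChange is a stand-in for the real conf change message.
type confChange struct {
	typ    confChangeType
	nodeID uint64
}

type raft struct {
	voters   map[uint64]struct{}
	learners map[uint64]struct{}
}

// applyConfChange is the single place where membership changes are applied.
func (r *raft) applyConfChange(cc confChange) {
	switch cc.typ {
	case confChangeAddNode:
		// Adding a voter also covers promoting an existing learner.
		delete(r.learners, cc.nodeID)
		r.voters[cc.nodeID] = struct{}{}
	case confChangeAddLearnerNode:
		if _, isVoter := r.voters[cc.nodeID]; isVoter {
			return // demoting a voter is not supported in this sketch
		}
		r.learners[cc.nodeID] = struct{}{}
	case confChangeRemoveNode:
		delete(r.voters, cc.nodeID)
		delete(r.learners, cc.nodeID)
	}
}

// Test-only one-liners that keep call sites short.
func (r *raft) addNode(id uint64)    { r.applyConfChange(confChange{typ: confChangeAddNode, nodeID: id}) }
func (r *raft) removeNode(id uint64) { r.applyConfChange(confChange{typ: confChangeRemoveNode, nodeID: id}) }
func (r *raft) addLearner(id uint64) { r.applyConfChange(confChange{typ: confChangeAddLearnerNode, nodeID: id}) }
```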

nvanbenschoten · 2019-06-28T21:21:34Z

raft/quorum/majority.go

+		if i > 0 {
+			buf.WriteByte(' ')
+		}
+		buf.WriteString(fmt.Sprintf("%d", sl[i]))


fmt.Fprintf(&buf, ...? Are you trying to prevent buf from escaping?

No, just some confused code. Fixed.
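
A short sketch of the suggested form, writing directly into the builder with `fmt.Fprintf` instead of formatting into a temporary string first. The function name `formatIDs` and the surrounding parentheses are illustrative choices, not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatIDs renders a set of IDs as a sorted, space-separated list in
// parentheses, for example "(1 2 3)".
func formatIDs(ids map[uint64]struct{}) string {
	sl := make([]uint64, 0, len(ids))
	for id := range ids {
		sl = append(sl, id)
	}
	sort.Slice(sl, func(i, j int) bool { return sl[i] < sl[j] })

	var buf strings.Builder
	buf.WriteByte('(')
	for i, id := range sl {
		if i > 0 {
			buf.WriteByte(' ')
		}
		// strings.Builder implements io.Writer, so Fprintf can write into
		// it directly without an intermediate Sprintf result.
		fmt.Fprintf(&buf, "%d", id)
	}
	buf.WriteByte(')')
	return buf.String()
}
```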

nvanbenschoten · 2019-06-28T21:24:01Z

raft/tracker/tracker.go

+// Config reflects the configuration tracked in a ProgressTracker.
+type Config struct {
+	Voters   quorum.JointConfig
+	Learners map[uint64]struct{}


This is a little surprising to me. I would have expected Learners to live in MajorityConfig. How would a joint configuration where three learners are all atomically upgraded to voting replicas be represented in this model?

> How would a joint configuration where three learners are all atomically upgraded to voting replicas be represented in this model?

You start with voters=(1) learners=(2 3) and transition to voters=(1)&(1 2 3) learners=() and then transition to voters=(1 2 3) learners=()

It gets more interesting when you add learners by demoting a voter, because we don't want a peer to be tracked as both a voter and a learner at any point, but naively you need this because you go from

voters=(1 2 3) learners=() to voters=(1 2 3)&(1 2) learners=(3)

So (and this isn't in this PR yet because it's not needed) you get another map, NextLearners, which stashes the learners that will be added when transitioning out of the joint config rather than in. I.e., you really go from

voters=(1 2 3) learners=() to voters=(1 2 3)&(1 2) learners=() next_learners=(3) and then to voters=(1 2) learners=(3)

Basically whenever you see that you need a new learner but in the joint config it's also a voter it goes to next_learners, otherwise to learners. It's more efficient to use learners directly (instead of unconditionally pushing into next_learners) so the peer gets caught up earlier.
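
The rule described above can be sketched as follows. This is only a model of the plan laid out in the comment, not code from the PR: the `config` type, `enterJoint`, and `leaveJoint` are hypothetical names, and the two voter slots follow the thread's old&new notation.

```go
package main

type set map[uint64]struct{}

func newSet(ids ...uint64) set {
	s := set{}
	for _, id := range ids {
		s[id] = struct{}{}
	}
	return s
}

func (s set) has(id uint64) bool {
	_, ok := s[id]
	return ok
}

// config models voters=(old)&(new) plus learners and next_learners.
type config struct {
	voters       [2]set // voters[0]: old config; voters[1]: new config, nil unless joint
	learners     set
	learnersNext set
}

// enterJoint moves a non-joint config into the joint state. A requested
// learner that is still a voter in the old config is stashed in
// learnersNext; every other requested learner becomes a learner at once,
// so it can start catching up early. Sets are shared, not copied, for brevity.
func enterJoint(cur config, newVoters, newLearners set) config {
	next := config{
		voters:       [2]set{cur.voters[0], newVoters},
		learners:     set{},
		learnersNext: set{},
	}
	for id := range newLearners {
		if cur.voters[0].has(id) {
			next.learnersNext[id] = struct{}{}
		} else {
			next.learners[id] = struct{}{}
		}
	}
	return next
}

// leaveJoint drops the old voters and turns the stashed learners into
// regular learners.
func leaveJoint(cur config) config {
	next := config{voters: [2]set{cur.voters[1], nil}, learners: set{}, learnersNext: set{}}
	for id := range cur.learners {
		next.learners[id] = struct{}{}
	}
	for id := range cur.learnersNext {
		next.learners[id] = struct{}{}
	}
	return next
}
```

In this model no peer is ever tracked as both a voter and a learner: demoting 3 from voters=(1 2 3) yields learners=() next_learners=(3) while joint, and learners=(3) only after leaving the joint config.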

> I would have expected Learners to live in MajorityConfig

This could've gone either way, but so far the quorum package is only concerned with quorum computations, and learners don't influence the quorum.
I don't know if this is relevant to your question, but the ConfState will have to contain nodes, joint nodes, learners, and next learners, i.e. it corresponds to a Config.
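
To illustrate why learners can stay out of the quorum package, here is a minimal sketch of a commit index computation over voters only. The function names and the slice-based representation are assumptions for this example; the joint rule used is the usual one, where an index counts as committed only if a majority of each side has acknowledged it.

```go
package main

import (
	"math"
	"sort"
)

// majorityCommitted returns the highest index acknowledged by a majority of
// voters. An empty config returns math.MaxUint64 so that it never constrains
// a joint result. IDs missing from match count as index 0.
func majorityCommitted(voters []uint64, match map[uint64]uint64) uint64 {
	if len(voters) == 0 {
		return math.MaxUint64
	}
	acked := make([]uint64, 0, len(voters))
	for _, id := range voters {
		acked = append(acked, match[id])
	}
	// Sort descending; the entry at position quorum-1 is the largest index
	// that at least quorum voters have reached.
	sort.Slice(acked, func(i, j int) bool { return acked[i] > acked[j] })
	quorum := len(voters)/2 + 1
	return acked[quorum-1]
}

// jointCommitted requires a majority in both halves of a joint config.
func jointCommitted(left, right []uint64, match map[uint64]uint64) uint64 {
	a, b := majorityCommitted(left, match), majorityCommitted(right, match)
	if a < b {
		return a
	}
	return b
}
```

Only the IDs passed in as voters are consulted, so a learner's progress in `match` has no effect on the result, however far ahead it is.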

> You start with voters=(1) learners=(2 3) and transition to voters=(1)&(1 2 3) learners=() and then transition to voters=(1 2 3) learners=()

But right now the second state would necessarily be voters=(1)&(1 2 3) learners=(2 3)

> so (and this isn't in this PR yet because it's not needed) you get another map NextLearners which stashes the learners that will be added when transitioning out of the joint config rather than in

Yeah, so I guess this is what I'm getting at. You'd want to associate a learners map with each side of the joint config. I'm ok with this living outside of the MajorityConfig, but we should make it clear that Learners only refers to the left side of the joint config.

On that note, would we want to represent this like:

type Config struct {
	Voters   quorum.JointConfig
	Learners [2]map[uint64]struct{}
}

to mirror the structure in quorum.JointConfig?

> You'd want to associate a learners map with each side of the joint config.

Sort of, but it's much less symmetric than the joint config (a learner added in the next config may end up in Learners instead of NextLearners), so I'm more comfortable keeping this more explicit. I plan to call this Learners and NextLearners (added roughly the comment above into the code to make the plan clear).

> But right now the second state would necessarily be voters=(1)&(1 2 3) learners=(2 3)

Yep, but also we don't support joint config changes yet :-) I agree that this is awkward but the in-code comment now explains exactly how this will work (and it'll also be the next PR).

nvanbenschoten · 2019-06-28T21:25:00Z

raft/tracker/tracker.go

+		Config: Config{
+			Voters: quorum.JointConfig{
+				quorum.MajorityConfig{},
+				// TODO(tbg): this will be mostly empty, so make it a nil pointer


This is helpful for quickly printing the configuration in log messages without having to specify Voters and Learners separately. It will also come in handy for joint quorums because it allows holding on to voters and learners as a unit, which is useful for unit testing.

tbg · 2019-07-03T19:29:37Z

Thanks for the reviews! I think I got everything addressed, but anything I missed I'll follow up on when pinged.

tbg requested a review from bdarnell June 28, 2019 12:40

tbg force-pushed the multi-conf-change branch from 0c33973 to 3c65a08 Compare June 28, 2019 12:45

tbg force-pushed the multi-conf-change branch 3 times, most recently from c143dd7 to 6e4f0a8 Compare June 28, 2019 16:08

bdarnell approved these changes Jun 28, 2019

tbg force-pushed the multi-conf-change branch 2 times, most recently from 745431f to 87d0330 Compare June 28, 2019 20:54

nvanbenschoten reviewed Jun 28, 2019

tbg force-pushed the multi-conf-change branch from 87d0330 to 691a9df Compare June 29, 2019 13:24

tbg added 2 commits July 3, 2019 21:26

tbg force-pushed the multi-conf-change branch from 691a9df to 6697adf Compare July 3, 2019 19:26

tbg merged commit 48f5bb6 into etcd-io:master Jul 3, 2019

tbg deleted the multi-conf-change branch July 3, 2019 19:58

BusyJay mentioned this pull request Jun 19, 2020

port etcd joint consensus tikv/raft-rs#378

Open

8 tasks
