Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: support raft learner in etcd #10645

Closed
wants to merge 35 commits into from
Closed

*: support raft learner in etcd #10645

wants to merge 35 commits into from

Conversation

jingyih
Copy link
Contributor

@jingyih jingyih commented Apr 15, 2019

Edit: This PR is replaced by #10725, #10727, #10730.

This PR adds support for learner (non-voting) member in etcd. Currently the basic functionality is done. Feels like this might be a good time to get reviews before the changes grow bigger. Please refer to the pull requests against learner branch for incremental changes and their context.

Background

Status
The implementation status is tracked in task list #10537.

Main contributors to learner branch:
@jingyih @WIZARD-CXY @gyuho @jpbetz

@codecov-io
Copy link

codecov-io commented Apr 15, 2019

Codecov Report

Merging #10645 into master will decrease coverage by 0.14%.
The diff coverage is 68.28%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10645      +/-   ##
==========================================
- Coverage    71.8%   71.66%   -0.15%     
==========================================
  Files         393      393              
  Lines       36628    36875     +247     
==========================================
+ Hits        26302    26425     +123     
- Misses       8502     8599      +97     
- Partials     1824     1851      +27
Impacted Files Coverage Δ
etcdserver/api/v3rpc/rpctypes/error.go 90.47% <ø> (ø) ⬆️
etcdserver/errors.go 0% <ø> (ø) ⬆️
etcdserver/api/membership/errors.go 100% <ø> (ø) ⬆️
etcdctl/ctlv3/command/printer_fields.go 0% <0%> (ø) ⬆️
etcdserver/api/v2v3/server.go 0% <0%> (ø) ⬆️
proxy/grpcproxy/adapter/cluster_client_adapter.go 16.66% <0%> (-3.34%) ⬇️
etcdctl/ctlv3/command/printer_simple.go 72.84% <0%> (+0.36%) ⬆️
clientv3/retry.go 87.5% <100%> (+0.22%) ⬆️
etcdctl/ctlv3/command/ep_command.go 63.57% <100%> (ø) ⬆️
etcdserver/api/v3rpc/maintenance.go 76.92% <100%> (+1.92%) ⬆️
... and 45 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f29b1ad...f852a94. Read the comment docs.

)

type Cluster interface {
// MemberList lists the current cluster membership.
MemberList(ctx context.Context) (*MemberListResponse, error)

// MemberAdd adds a new member into the cluster.
MemberAdd(ctx context.Context, peerAddrs []string) (*MemberAddResponse, error)
MemberAdd(ctx context.Context, peerAddrs []string, isLearner bool) (*MemberAddResponse, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiang90 @jpbetz Sorry for late response. Should we create a new RPC, rather than breaking API? I am fine with changing the function signature, and just release notes.

@jingyih
Copy link
Contributor Author

jingyih commented May 7, 2019

Removing WIP label, as most of the items in task list are done.

@jingyih jingyih changed the title [WIP] *: support raft learner in etcd *: support raft learner in etcd May 7, 2019
@jingyih jingyih added this to the etcd-v3.4 milestone May 7, 2019
@xiang90
Copy link
Contributor

xiang90 commented May 9, 2019

@jingyih Can you improve the commit history a little bit? Right now they are 60 commits. For example separate them into 3-5 commits that can be reviewed independently? Thanks!

@jingyih
Copy link
Contributor Author

jingyih commented May 10, 2019

@jingyih Can you improve the commit history a little bit? Right now they are 60 commits. For example separate them into 3-5 commits that can be reviewed independently? Thanks!

@xiang90

I agree there are a lot of commits. Half of these commits are merge commits, which can be ignored. What happened was we first created a feature branch, and all the code changes were first merged into the feature branch. There are about 30 PRs opened against feature branch, they closely match the items in task list #10537.

Reviewing 30 commits is not easy, rebasing and grouping them into 3 to 5 commits is not easy neither. I can pick the most important commits (maybe 15 to 20 commits) and list them in groups. Is this acceptable?

@xiang90
Copy link
Contributor

xiang90 commented May 10, 2019

@jingyih OK. Let us get rid of the merge commits first.

jingyih and others added 15 commits May 10, 2019 16:42
- Added isLearner flag to MemberAddRequest in Cluster API.
- Added isLearner field to StatusResponse in Maintenance API.
- Added MemberPromote rpc to Cluster API.
Fixed compilation erros after API change for learner.
Added IsLearner field to etcdserver internal Member type. Routed
learner MemberAdd request from server API to raft. Apply learner
MemberAdd result to server after the request is passed through Raft.
Added IsLearner flag to clientv3 MemberAdd API.
Added support for "etcdctl member add --learner".
Added an e2e test to exercise "etcdctl member add --learner".
Added learner field to endpoint status API.
Hardcoded allowed rpc for learner node. Added filtering in grpc
interceptor to check if rpc is allowed for learner node.
Adding TestKVForLearner. Also adding test utility functions for clientv3
integration tests.
1. Maintenance API MoveLeader() returns ErrBadLeaderTransferee if
transferee does not exist or is raft learner.

2. etcdserver TransferLeadership() only choose voting member as
transferee.
Adding integration test TestMoveLeaderToLearnerError, which ensures that
leader transfer to learner member will fail.
jingyih and others added 14 commits May 10, 2019 16:42
Fixes TestMemberAddForLearner and TestMemberPromoteForLearner.
Adding delay in the test for the newly started learner member to catch
up applying config change entries in raft log.
Adjust StrictReconfigCheck logic to accommodate learner members in the
cluster.
Make learner return code.Unavailable when the request is not supported
by learner. Client balancer will retry a different endpoint.
If member does not exist in cluster, IsLearner will panic.
Hard-coded the maximum number of learners to 1.
Use ReadyNotify instead of time.Sleep to wait for server ready.
If learner is not ready to be promoted, use etcdserver.ErrLearnerNotReady
instead of using membership.ErrLearnerNotReady.
Check http StatusCode. Only Unmarshal body if StatusCode is statusOK.
Update TestMemberPromote to include both learner not-ready and learner
ready test cases.

Removed unit test TestPromoteMember, it requires underlying raft node to
be started and running. The member promote is covered by the integration
test.
@jingyih
Copy link
Contributor Author

jingyih commented May 11, 2019

@jingyih OK. Let us get rid of the merge commits first.

@xiang90 Done.

@@ -51,10 +53,13 @@ type Member struct {

// NewMember creates a Member without an ID and generates one based on the
// cluster name, peer URLs, and time. This is used for bootstrapping/adding new member.
func NewMember(name string, peerURLs types.URLs, clusterName string, now *time.Time) *Member {
func NewMember(name string, peerURLs types.URLs, clusterName string, now *time.Time, isLearner bool) *Member {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably add a func called NewLearnerMember. Boolean arg on public method is not good in general.

@xiang90
Copy link
Contributor

xiang90 commented May 14, 2019

@jingyih I reviewed the first 7 commits (up to etcdctl: add learner field in member list output). Can you make a separate PR for these 7 commits? I think they can be merged separately first.

@jingyih
Copy link
Contributor Author

jingyih commented May 14, 2019

@jingyih I reviewed the first 7 commits (up to etcdctl: add learner field in member list output). Can you make a separate PR for these 7 commits? I think they can be merged separately first.

@xiang90
Thanks for reviewing! A separate PR for the first 7 commits is created #10725.

But do we want to merge only part of the learner commits into master? I though it might cause confusion?

@@ -17,10 +17,11 @@ package clientv3
import (
"context"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the import format needs to be fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. FYI, it is not in the final version. "error" is imported here temporarily because the MemberPromote API returns error "not implemented". The "error" import is removed when the actual implementation is done.

@@ -252,6 +252,16 @@ func (c *RaftCluster) Recover(onSet func(*zap.Logger, *semver.Version)) {
}
}

// IsPromoteChange checks if m is a promoteChange
func (c *RaftCluster) IsPromoteChange(m *Member) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsMemberPromoted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function checks if the config change is for promoting a member (as opposed to adding a member). Maybe use IsMemberPromoteConfChange or IsConfChangeForMemberPromote?

@@ -252,6 +252,16 @@ func (c *RaftCluster) Recover(onSet func(*zap.Logger, *semver.Version)) {
}
}

// IsPromoteChange checks if m is a promoteChange
func (c *RaftCluster) IsPromoteChange(m *Member) bool {
members, _ := membersFromStore(c.lg, c.v2store)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to load member from c.v2store, not from c.members directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Members info from store is more "accurate" then c.members. In most cases they are the same, except during server bootstrapping.

During server bootstrapping, we construct members information from "--init-cluster". The resulting members info does not have any information on member's learner flag. Because we do not provide a learner flag when starting etcd binary.

Also check here, in current code we load members from store when validating the conf change in apply stage, it is for very similar reasons.

func (c *RaftCluster) ValidateConfigurationChange(cc raftpb.ConfChange) error {
members, removed := membersFromStore(c.lg, c.v2store)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should be more specific on this topic. Consider the case where we add a new member to an existing cluster.

  1. Call MemberAdd API to add new member to existing members in cluster.
    1.1 Each existing member append confchange raft entry of adding the new member.
    1.2 Each existing member apply the raft entry which adds the new member to its c.members and backend storage.
  2. When starting the new member (during bootstrapping), new member's c.members is initially constructed from the input parameter "--init-cluster", whereas initally the backend storage has no members information. At this point, new member's c.members is almost accurate except that it does not have info on learner.
  3. As the new member getting raft entries from leader and applying them, it will eventually apply all the confchange raft entries, including the entry that adds itself to the cluster. At this point, its c.members is accurate.

During bootstrapping, apply stage should always refer to member info form backend storage. Otherwise server gets confused when applying the confchange raft entries: adding a member which already exists in its c.members.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, just realized this function is not part of the final version. We had to come up with a better way of deciding (in apply stage) if a confchange is for adding new member, or for promoting an existing member.
https://github.com/jingyih/etcd/blob/b433162c0b9ea4d13e9a0dd49b1f51e20896cbcc/etcdserver/server.go#L2232

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. then probably replace this part of the PR with the final version :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be removed in later commits. You will not see this function in part2 PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@xiang90
Copy link
Contributor

xiang90 commented May 15, 2019

@jingyih Can you create a part2 PR? I feel the next 10 commits are good candidates.

@jingyih
Copy link
Contributor Author

jingyih commented May 15, 2019

@jingyih Can you create a part2 PR? I feel the next 10 commits are good candidates.

I can do it tomorrow.

@xiang90
Copy link
Contributor

xiang90 commented May 15, 2019

@jingyih Probably the remaining commits can be easily reviewed in one PR?

@jingyih
Copy link
Contributor Author

jingyih commented May 16, 2019

@jingyih Probably the remaining commits can be easily reviewed in one PR?

Yeah I am creating the last part - part 3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

5 participants