Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug:master] change of leader each time to the one who last started/restarted #2804

Closed
kmlebedev opened this issue Mar 24, 2022 · 10 comments
Closed

Comments

@kmlebedev
Copy link
Contributor

kmlebedev commented Mar 24, 2022

change of leader each time to the one who last started/restarted

version

2.90

weed master run cmd:

/usr/bin/weed -logtostderr=true \
      -v=1 \
      master \
      -mdir /data \
      -port "9333" \
      -defaultReplication="100" \
      -metricsPort=9090 \
      -volumeSizeLimitMB=1000 \
      -resumeState=true \
      -ip.bind=0.0.0.0 \
      -ip ${POD_NAME}.$NAMESPACE-${SEAWEEDFS_FULLNAME}-master-direct.service.dc1.consul \
      -peers sw-master-0.s3-sw-master-direct.service.dc1.consul:9333,sw-master-0.s3-sw-master-direct.service.dc2.consul:9333,sw-master-0.s3-sw-master-direct.service.dc3.consul:9333

logs:

I0324 14:37:48     1 config.go:44] Reading security.toml from /etc/seaweedfs/security.toml
I0324 14:37:48     1 config.go:44] Reading master.toml from /etc/seaweedfs/master.toml
I0324 14:37:48     1 file_util.go:23] Folder /data Permission: -rwxrwxr-x
I0324 14:37:48     1 master.go:187] current: sw-master-0.s3-sw-master-direct.service.dc1.consul:9333 peers:sw-master-0.s3-sw-master-direct.service.dc1.consul:9333,sw-master-0.s3-sw-master-direct.service.dc2.consul:9333,sw-master-0.s3-sw-master-direct.service.dc3.consul:9333
I0324 14:37:48     1 master_server.go:289] [master.sequencer.type] : [memory]
I0324 14:37:48     1 master_server.go:122] Volume Size Limit is 1000 MB
I0324 14:37:48     1 master.go:133] Start Seaweed Master 30GB 2.90 497ebbbd at 0.0.0.0:9333
I0324 14:37:48     1 raft_server.go:82] Starting RaftServer with sw-master-0.s3-sw-master-direct.service.dc1.consul:9333
I0324 14:37:48     1 raft_server.go:61] Recovery raft state {MaxVolumeId:208}
I0324 14:37:48     1 raft_server.go:143] current cluster leader:
I0324 14:38:03     1 master_server.go:165] leader change event:  => sw-master-0.s3-sw-master-direct.service.dc1.consul:9333
I0324 14:38:03     1 master_server.go:168] [ sw-master-0.s3-sw-master-direct.service.dc1.consul:9333 ] sw-master-0.s3-sw-master-direct.service.dc1.consul:9333 becomes leader.
I0324 14:38:03     1 cluster_commands.go:28] max volume id 208 ==> 209
I0324 14:38:03     1 cluster_commands.go:28] max volume id 209 ==> 210
I0324 14:38:03     1 cluster_commands.go:28] max volume id 210 ==> 211
I0324 14:38:03     1 cluster_commands.go:28] max volume id 211 ==> 212
I0324 14:38:03     1 cluster_commands.go:28] max volume id 212 ==> 213
I0324 14:38:03     1 cluster_commands.go:28] max volume id 213 ==> 214
I0324 14:38:03     1 cluster_commands.go:28] max volume id 214 ==> 215
I0324 14:38:03     1 cluster_commands.go:28] max volume id 215 ==> 216
I0324 14:38:03     1 cluster_commands.go:28] max volume id 216 ==> 217
I0324 14:38:03     1 cluster_commands.go:28] max volume id 217 ==> 218
I0324 14:38:03     1 cluster_commands.go:28] max volume id 218 ==> 219
I0324 14:38:06     1 master_server.go:175] [ sw-master-0.s3-sw-master-direct.service.dc1.consul:9333 ] sw-master-0.s3-sw-master-direct.service.dc1.consul:9333 is the leader.
I0324 14:38:06     1 master.go:165] Start Seaweed Master 30GB 2.90 497ebbbd grpc server at 0.0.0.0:19333

Снимок экрана 2022-03-24 в 19 52 04
Снимок экрана 2022-03-24 в 19 57 52

@kmlebedev
Copy link
Contributor Author

locally not reproduced:

  1. make cluster
  2. Get Leader
curl -s http://127.0.0.1:9333/cluster/status | jq .
{
 "IsLeader": true,
 "Leader": "master0:9333",
 "Peers": [
   "master1:9334",
   "master2:9335"
 ]
}
  1. restart not leader
docker restart seaweedfs_master2_1

seaweedfs_master2_1 exited with code 0
master2_1  | I0325 06:28:13     1 config.go:46] Reading : Config File "security" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
master2_1  | I0325 06:28:13     1 config.go:46] Reading : Config File "master" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
master2_1  | I0325 06:28:13     1 file_util.go:23] Folder /tmp Permission: -rwxrwxrwx
master2_1  | I0325 06:28:13     1 master.go:227] current: master2:9335 peers:master0:9333,master1:9334,master2:9335
master2_1  | I0325 06:28:13     1 master_server.go:289] [master.sequencer.type] : []
master2_1  | I0325 06:28:13     1 master_server.go:122] Volume Size Limit is 100 MB
master2_1  | I0325 06:28:13     1 master.go:138] Start Seaweed Master 30GB 2.95  at master2:9335
master2_1  | I0325 06:28:13     1 raft_server.go:82] Starting RaftServer with master2:9335
master2_1  | I0325 06:28:13     1 raft_server.go:143] current cluster leader: 
master2_1  | I0325 06:28:31     1 tls.go:35] load cert:  / key:  error: open : no such file or directory
master2_1  | I0325 06:28:31     1 master.go:171] Start Seaweed Master 30GB 2.95  grpc server at master2:19335
master2_1  | I0325 06:28:31     1 masterclient.go:51] master masterClient bootstraps with masters [master0:9333 master1:9334 master2:9335]
master2_1  | I0325 06:28:31     1 masterclient.go:99] master masterClient Connecting to master master0:9333
master2_1  | I0325 06:28:31     1 masterclient.go:122] master masterClient Connected to master0:9333
master0_1  | I0325 06:28:31     1 master_grpc_server.go:278] + client master@master2:9335
master2_1  | I0325 06:28:35     1 master_server.go:165] leader change event:  => master0:9333
master2_1  | I0325 06:28:35     1 master_server.go:168] [ master2:9335 ] master0:9333 becomes leader.
  1. Check Leader
curl -s http://127.0.0.1:9333/cluster/status | jq .
{
  "IsLeader": true,
  "Leader": "master0:9333",
  "Peers": [
    "master2:9335",
    "master1:9334"
  ]
}

@kmlebedev
Copy link
Contributor Author

but if enable -resumeState=true
the bug is reproduced
3.

seaweedfs_master2_1 exited with code 0
master2_1  | I0325 06:37:55     1 config.go:46] Reading : Config File "security" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
master2_1  | I0325 06:37:55     1 config.go:46] Reading : Config File "master" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
master2_1  | I0325 06:37:55     1 file_util.go:23] Folder /tmp Permission: -rwxrwxrwx
master2_1  | I0325 06:37:55     1 master.go:227] current: master2:9335 peers:master0:9333,master1:9334,master2:9335
master2_1  | I0325 06:37:55     1 master_server.go:289] [master.sequencer.type] : []
master2_1  | I0325 06:37:55     1 master_server.go:122] Volume Size Limit is 100 MB
master2_1  | I0325 06:37:55     1 master.go:138] Start Seaweed Master 30GB 2.95  at master2:9335
master2_1  | I0325 06:37:55     1 raft_server.go:82] Starting RaftServer with master2:9335
master2_1  | I0325 06:37:55     1 raft_server.go:143] current cluster leader: 
master1_1  | I0325 06:38:13     1 master_server.go:165] leader change event: master0:9333 => 
master2_1  | I0325 06:38:13     1 master_server.go:165] leader change event:  => master2:9335
master2_1  | I0325 06:38:13     1 master_server.go:168] [ master2:9335 ] master2:9335 becomes leader.
master1_1  | I0325 06:38:13     1 master_server.go:165] leader change event:  => master2:9335
master1_1  | I0325 06:38:13     1 master_server.go:168] [ master1:9334 ] master2:9335 becomes leader.
master2_1  | I0325 06:38:13     1 master_server.go:175] [ master2:9335 ] master2:9335 is the leader.
master2_1  | I0325 06:38:13     1 tls.go:35] load cert:  / key:  error: open : no such file or directory
master2_1  | I0325 06:38:13     1 master.go:171] Start Seaweed Master 30GB 2.95  grpc server at master2:19335
master2_1  | I0325 06:38:13     1 masterclient.go:51] master masterClient bootstraps with masters [master0:9333 master1:9334 master2:9335]
master2_1  | I0325 06:38:13     1 masterclient.go:99] master masterClient Connecting to master master0:9333
master2_1  | I0325 06:38:13     1 masterclient.go:122] master masterClient Connected to master0:9333
master0_1  | I0325 06:38:13     1 master_grpc_server.go:278] + client master@master2:9335
master0_1  | I0325 06:38:14     1 master_server.go:165] leader change event: master0:9333 => 
master0_1  | I0325 06:38:14     1 master_server.go:165] leader change event:  => master2:9335
master0_1  | I0325 06:38:14     1 master_server.go:168] [ master0:9333 ] master2:9335 becomes leader.
filer_1    | I0325 06:38:16     1 masterclient.go:136] redirected to leader master2:9335
master0_1  | I0325 06:38:16     1 master_grpc_server.go:294] - client filer@172.18.0.8:8888
master2_1  | I0325 06:38:16     1 master_grpc_server.go:278] + client filer@172.18.0.8:8888
filer_1    | I0325 06:38:16     1 masterclient.go:163] + filer 172.18.0.8:8888 leader:true
master0_1  | I0325 06:38:16     1 master_grpc_server.go:294] - client master@master1:9334
master1_1  | I0325 06:38:16     1 masterclient.go:136] redirected to leader master2:9335
master1_1  | I0325 06:38:16     1 masterclient.go:99] master masterClient Connecting to master master2:9335
master1_1  | I0325 06:38:16     1 masterclient.go:122] master masterClient Connected to master2:9335
master2_1  | I0325 06:38:16     1 master_grpc_server.go:278] + client master@master1:9334
master0_1  | I0325 06:38:16     1 master_grpc_server.go:294] - client master@master0:9333
master0_1  | I0325 06:38:16     1 masterclient.go:136] redirected to leader master2:9335
master0_1  | I0325 06:38:16     1 masterclient.go:99] master masterClient Connecting to master master2:9335
master0_1  | I0325 06:38:16     1 masterclient.go:122] master masterClient Connected to master2:9335
  1. Check Leader
curl -s http://127.0.0.1:9333/cluster/status | jq .
{
  "Leader": "master2:9335",
  "Peers": [
    "master1:9334",
    "master2:9335"
  ]
}

@kmlebedev
Copy link
Contributor Author

kmlebedev commented Mar 25, 2022

after restarted master2_1 becomes promoted

master2_1  | [raft]07:40:42.336101 open.log.append log index  1
master2_1  | [raft]07:40:42.336187 open.log.append log index  2
master2_1  | [raft]07:40:42.336229 open.log.append: finish 
master2_1  | [raft]07:40:42.336246 open.log.recovery number of log  2
master2_1  | [raft]07:40:42.336270 [master2:9335 Term:0] start from previous saved state
master2_1  | [raft]07:40:42.336291 Name: master2:9335, State: follower, Term: 0, CommitedIndex: 0 
master2_1  | [raft]07:40:42.336933 [master2:9335 Term:0] server.peer.add:  master0:9333 0
master2_1  | [raft]07:40:42.336983 [master2:9335 Term:0] server.loop.run  follower
master2_1  | [raft]07:40:42.341252 [master2:9335 Term:0] server.peer.add:  master1:9334 1
master2_1  | [raft]07:40:42.343367 [master2:9335 Term:0] server.peer.add:  master2:9335 2
master2_1  | [raft]07:40:46.543780 [master2:9335 Term:0] server.loop.run  candidate
master2_1  | [raft]07:40:46.544269 peer.vote:  master2:9335 -> master1:9334
master2_1  | [raft]07:40:46.544298 peer.vote:  master2:9335 -> master0:9333
master1_1  | I0325 07:40:46     1 master_server.go:165] leader change event: master0:9333 => 
master1_1  | [raft]07:40:46.552000 [master1:9334 Term:1] server.rv.vote:  master1:9334  votes for master2:9335 at term 1
master2_1  | [raft]07:40:46.552644 peer.vote.recv:  master2:9335 <- master1:9334
master2_1  | [raft]07:40:46.553898 [master2:9335 Term:1] server.candidate.vote.granted:  1
master2_1  | [raft]07:40:46.554015 [master2:9335 Term:1] server.candidate.recv.enough.votes
master2_1  | [raft]07:40:46.554339 [master2:9335 Term:1] server.loop.run  leader
master2_1  | I0325 07:40:46     1 master_server.go:165] leader change event:  => master2:9335
master2_1  | I0325 07:40:46     1 master_server.go:168] [ master2:9335 ] master2:9335 becomes leader.
master2_1  | [raft]07:40:46.554424 [master2:9335 Term:1] leaderLoop.set.PrevIndex to  2
master2_1  | [raft]07:40:46.554626 peer.heartbeat:  master0:9333 113.98417ms
master2_1  | [raft]07:40:46.554884 peer.heartbeat:  master1:9334 113.98417ms
master2_1  | [raft]07:40:46.555051 [master2:9335 Term:1] server.command.process
master2_1  | [raft]07:40:46.555165 peer.heartbeat.flush:  master1:9334
master2_1  | [raft]07:40:46.555199 peer.heartbeat.flush:  master0:9333
master2_1  | [raft]07:40:46.555545 log.entriesAfter.partial:  2   0x1a837e0
master2_1  | [raft]07:40:46.555211 log.entriesAfter.partial:  2   0x1a837e0
master2_1  | [raft]07:40:46.555997 log.entriesAfter: startIndex: 0  length 3
master2_1  | [raft]07:40:46.556411 peer.append.send: master2:9335->master1:9334 [prevLog:2 length: 1]
master2_1  | [raft]07:40:46.555601 log.entriesAfter: startIndex: 0  length 3
master2_1  | [raft]07:40:46.557024 peer.append.send: master2:9335->master0:9333 [prevLog:2 length: 1]

@kmlebedev
Copy link
Contributor Author

i add debug

			glog.V(0).Infof("check existsPeerName server %s: raft: %s", existsPeerName, peer.ToGrpcAddress())
			if peer.ToGrpcAddress() == existsPeerName {

https://github.com/chrislusf/seaweedfs/blob/master/weed/server/raft_server.go#L120

master2_1  | I0325 09:55:58     1 raft_server.go:122] check existsPeerName server master0:9333: raft: master0:19333

Remove deleted peers
it doesn't work

kmlebedev added a commit to kmlebedev/seaweedfs that referenced this issue Mar 25, 2022
@kmlebedev
Copy link
Contributor Author

kmlebedev commented Mar 31, 2022

Fixing the work of snapshots led to #2833

@kmlebedev
Copy link
Contributor Author

kmlebedev commented Mar 31, 2022

@chrislusf The presence of snapshots on several master servers at startup causes conflicts when choosing a leader.
Also, if there is a snapshot, the log index will be non-zero, which makes the candidate master and leads to re-elections, where the candidate always becomes the leader.

It worked while the raft was created without logs and snapshots.

@chrislusf
Copy link
Collaborator

this is also related to #2847 ?

@kmlebedev
Copy link
Contributor Author

@chrislusf I did not see the point in repairing the current raft. Since I plan to store the full topology also in the raft.
This will allow you to have standby masters with a complete topology, which will significantly reduce the service unavailability time when changing the master

#2868

@kmlebedev
Copy link
Contributor Author

with raftBootstrap=false max volume restored from raft logs

master2_1  | 2022-04-04T13:40:55.937Z [INFO]  raft: pipelining replication: peer="{Voter master1:9334 master1:19334}"
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 0 ==> 1
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 1 ==> 2
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 2 ==> 3
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 3 ==> 4
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 4 ==> 5
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 5 ==> 6
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 6 ==> 7
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 7 ==> 8
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 8 ==> 9
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 9 ==> 10
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 10 ==> 11
master2_1  | I0404 13:40:55     1 raft_server.go:82] max volume id 11 ==> 12

@kmlebedev
Copy link
Contributor Author

kmlebedev commented Apr 14, 2022

sw moved to hashicorp raft #2868

add param weed master -raftHashicorp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants