If all server nodes go down, they can't be clustered again #526

Closed
wayhome opened this Issue Dec 8, 2014 · 15 comments


@wayhome
wayhome commented Dec 8, 2014

My consul version is 0.4.1; here is my consul config file.

{
  "bind_addr": "0.0.0.0",
  "bootstrap_expect": 3,
  "data_dir": "/var/consul",
  "datacenter": "aws-dev2",
  "disable_remote_exec": true,
  "encrypt": "3PvUq+FBixKZmlCCvwzDHg==",
  "log_level": "INFO",
  "node_name": "contest03.aws.dev",
  "server": true,
  "start_join": [
    "contest01.aws.dev",
    "contest02.aws.dev",
    "contest03.aws.dev"
  ]
}

Now contest01~contest03 form a cluster. If I stop them all and then start them again, they will never form a cluster again. No matter how I restart them, or call the join method, they just can't be clustered, unless I clear all the data in the data dir.

The log is like this:

2014/12/08 13:27:34 [INFO] agent: (LAN) joined: 3 Err: <nil>
2014/12/08 13:27:34 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:27:35 [INFO] agent.rpc: Accepted client: 127.0.0.1:56452
2014/12/08 13:27:35 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.
2014/12/08 13:27:36 [INFO] agent.rpc: Accepted client: 127.0.0.1:56453
2014/12/08 13:27:49 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:28:08 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:28:24 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:28:50 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:29:12 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:29:42 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:29:59 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:30:27 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:30:52 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:31:12 [ERR] agent: failed to sync remote state: No cluster leader
2014/12/08 13:31:30 [ERR] agent: failed to sync remote state: No cluster leader
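
To confirm the no-leader state, you can also query the status endpoints on any of the servers (assuming the default HTTP port; the loopback address is just an example):

curl http://127.0.0.1:8500/v1/status/leader   # returns "" while there is no leader
curl http://127.0.0.1:8500/v1/status/peers    # shows the raft peer set this server knows about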
@armon
Member
armon commented Dec 8, 2014

You are probably having the servers gracefully leave, which removes them from the peer set of Raft. Once 2 of the servers are removed, you lose quorum and the cluster goes into an outage. Outage recovery is done via: http://consul.io/docs/guides/outage.html

If you hard stop them, or they crash, or they lose power, a new leader will be elected when they restart. A graceful leave of all servers will cause an outage.
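
As a rough sketch of the difference (the exact signal behavior depends on how the agent is configured, e.g. leave_on_terminate / skip_leave_on_interrupt, so treat this as illustrative):

consul leave                 # graceful leave: the server removes itself from the raft peer set
kill -9 $(pidof consul)      # hard stop: the server stays in the peer set and rejoins on restart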

@armon armon closed this Dec 8, 2014
@dellis23

I'm not sure about the original poster, but this issue still seems to be very much in play. Your explanation makes sense, but clearing the peers.json file does nothing to fix the situation in my case. Either the documentation is lacking, or something else is missing here. I'm happy to help with the docs if I can get a better understanding of what is going on. I've tried the following:

  • Stopping the servers, deleting all the peers.json files, starting the servers. They weren't recreated.
  • Touching the peers.json file and restarting the servers one by one. The files weren't re-filled with anything.
  • Killing the consul processes via kill <pid>
  • Killing the processes via SIGINT
@armon
Member
armon commented Feb 23, 2015

@dellis23 I'm not sure what your situation is, but there are two different mechanisms for outage recovery. You can either:

  • Clear the peers file, and use the -bootstrap flag to re-establish the cluster. Not the recommended option.
  • Edit the peers file to reflect the list of all servers. Restart the servers. This is recommended.

Touching the peers file does not do anything, nor does killing the processes once the outage is already happening. Hope that helps!
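
For the second option, in these 0.4/0.5-era versions peers.json is a plain JSON array of "ip:port" entries (8300 being the server RPC port), so an edited file listing all three servers would look roughly like this, with the 10.0.1.x addresses standing in for your real server IPs:

["10.0.1.1:8300", "10.0.1.2:8300", "10.0.1.3:8300"]

With that file in place on every server, restarting them should let them elect a leader again.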

@dellis23

Thanks @armon. I think the other component to this was that the serf local snapshot had marked the servers as having left. This was due to me misreading "gracefully leaving" as being something that you would want to do when shutting down a server. I assumed this was the preferred way -- that the server would come back up just as gracefully when restarted.

@armon
Member
armon commented Feb 23, 2015

@dellis23 It is a bit confusing, I agree. Graceful leave means "I intend to leave this cluster, and when I do, do not mark it as a failure." This means the node and all of its services are deregistered instead of being marked as failed. Servers are additionally removed from the raft peer set to avoid a quorum loss.
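
The difference also shows up in consul members output: a node that left gracefully is reported as left (and eventually reaped), while a hard-stopped one shows up as failed. Roughly, with made-up addresses and columns that may vary by version:

Node               Address        Status  Type    DC
contest01.aws.dev  10.0.1.1:8301  left    server  aws-dev2
contest02.aws.dev  10.0.1.2:8301  failed  server  aws-dev2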

@thedjEJ
thedjEJ commented Apr 16, 2015

I was able to get around this issue by first stopping consul, then deleting the entire raft folder (the folder that holds peers.json) and restarting consul.
This seems to work consistently for me, and the three servers elect a new leader again without any issues.

@armon
Member
armon commented Apr 17, 2015

@thedjEJ That will cause complete data loss, and is not really recommended.

@thedjEJ
thedjEJ commented Apr 20, 2015

@armon In my testing, whenever the peers.json file is empty (it shows a value of "null"), deleting the folder and then starting up each server again seems to bring the servers back to a state of quorum. I have not seen any files other than peers.json in the raft folder, though.

Is it also not advisable to delete only the peers.json file when it is null? It seems that the servers discover each other correctly once this is done. I am on consul 0.5.0.

@armon
Member
armon commented Apr 23, 2015

@thedjEJ The raft/ folder should contain all persisted data. There should be some other files in there, namely snapshots and the mdb databases. Deleting everything probably allows clustering to work because -bootstrap-expect will only operate on a fresh cluster (no data). It is unsafe for it to act in other situations, because it could cause data loss. In this case, nuking the directory will also cause full data loss (since that is where the data is stored).
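
For reference, on a 0.5-era server the directory looks roughly like this (contents vary by version; the LMDB mdb/ store was later replaced by a raft.db file):

/var/consul/raft/
  mdb/          (LMDB databases holding the actual state)
  snapshots/    (periodic raft snapshots)
  peers.json    (the persisted peer set)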

@thedjEJ
thedjEJ commented Apr 24, 2015

Thanks @armon. I think I understand this better now, and nuking the directory is DEFINITELY not a good idea. Leaving gracefully is the issue, though, and I will look into not letting the servers do this, especially when they might once again join the cluster.

@pwilczynskiclearcode

Removing /var/lib/consul/raft/* on all consul hosts helped.

@loslosbaby

I am seeing this issue with 0.6.1 right now. If I have three machines with identical configs, but one has -bootstrap-expect=3 and it's the last one to boot (luck of the draw), the cluster never forms. I have a crummy systemd script (this is on Ubuntu 15.10) to start the system, but the shutdown has no special command, so, as per above, consul is leaving gracefully.

If I systemctl stop consul on all three machines, then start the 1st one (with -bootstrap-expect) first, then the other two, the "No cluster leader" situation persists. I really, really like consul, but I can't babysit a startup rendezvous problem....

The only reliable way I can get this to work is to manually start all three servers without a start_join list, then tell one to join the other two with consul join (two IPs).

Can anyone give me some guidance? I'd be glad to post any sanitized log/config file etc.. TIA, G.
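
In case it helps with diagnosis, what I'm considering for the shutdown side is a unit along these lines (paths and names are placeholders), so that systemd just sends SIGTERM and, assuming leave_on_terminate really does default to false for servers in this era, the agent should not leave the peer set on stop:

[Unit]
Description=Consul server
After=network.target

[Service]
ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d
KillSignal=SIGTERM
Restart=on-failure

[Install]
WantedBy=multi-user.target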

@loslosbaby

It's these messages that make my debugging nose twitch:

Jan 12 15:29:56 ieca122 sh[49439]: 2016/01/12 15:29:56 [INFO] agent: (LAN) joined: 2 Err: <nil>
Jan 12 15:30:04 ieca122 sh[49439]: 2016/01/12 15:30:04 [ERR] agent: failed to sync remote state: No cluster leader
Jan 12 15:30:04 ieca122 consul[49447]: agent: failed to sync remote state: No cluster leader

Update: When I run consul members, I get three nodes of type Server that are Alive... but there are endless "No cluster leader" messages in the three logs.

@loslosbaby

It should be noted that the machines don't have a private network and I am NOT using the -WAN option. I'm looking at that now. I just nuked /tmp/consul/* and restarted them, and did a manual join and they're still confused. Poor guys!

@loslosbaby

Update: I got it working, as per pwilczynskiclearcode's answer:

  • stop all three servers
  • remove /var/consul/*
  • restart all three in any order with one having -bootstrap-expect=3
  • consul join from that 1st server with the other two IPs

That's the only config I can get up and working. The start_join setting leads to server election loop land.
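
Spelled out as commands, with the 10.0.1.x addresses as placeholders for the other two servers:

sudo systemctl stop consul       # on all three servers
sudo rm -rf /var/consul/*        # wipes all state, so only acceptable when losing the data is OK
sudo systemctl start consul      # on all three, one of them running with -bootstrap-expect=3
consul join 10.0.1.2 10.0.1.3    # from that first server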
