
client gets repeatedly evicted from cluster #352

Closed
imkira opened this issue Sep 19, 2014 · 27 comments
Labels
type/bug Feature does not function as expected

Comments

@imkira
Contributor

imkira commented Sep 19, 2014

Hi, I don't know exactly how to reproduce this, only that it happens frequently once clients have been running for about a day.
My setup looks like this:

Node     Address             Status  Type    Build  Protocol
consul1  172.17.0.29:8301    alive   server  0.4.0  2
consul2  172.17.0.30:8301    alive   server  0.4.0  2
consul3  172.17.0.31:8301    alive   server  0.4.0  2
client1  172.30.14.118:8301  alive   client  0.4.0  2

When the bug manifests itself, the client (client1)'s serfHealth check flips back and forth between critical and passing every few seconds (5 or so). I don't know if this is a client bug or a server bug, but if I restart the client this endless loop stops and the situation goes back to normal.

I am attaching the logs of each agent if that helps understand what's going on:

consul1 (server agent):

    2014/09/18 06:56:29 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:56:29 [INFO] consul: member 'client1' joined, marking health alive
    2014/09/18 06:56:34 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:56:35 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:56:35 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:56:35 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:56:35 [INFO] consul: member 'client1' failed, marking health critical
    2014/09/18 06:56:59 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:56:59 [INFO] consul: member 'client1' joined, marking health alive
    2014/09/18 06:57:02 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:05 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:05 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:57:05 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:05 [INFO] consul: member 'client1' failed, marking health critical
    2014/09/18 06:57:15 [INFO] serf: attempting reconnect to client1 172.30.1.254:8301
    2014/09/18 06:57:29 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:57:29 [INFO] consul: member 'client1' joined, marking health alive
    2014/09/18 06:57:32 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:35 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:35 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:57:35 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:35 [INFO] consul: member 'client1' failed, marking health critical
    2014/09/18 06:57:36 [INFO] memberlist: Suspect client1 has failed, no acks received

consul2 (server agent):

    2014/09/18 06:56:36 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:56:59 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:57:00 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:02 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:05 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:57:05 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:05 [INFO] serf: attempting reconnect to client1 172.30.1.254:8301
    2014/09/18 06:57:06 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:29 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:57:30 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:32 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:34 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:35 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:57:35 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:35 [INFO] serf: attempting reconnect to client1 172.30.1.254:8301

consul3 (server agent):

    2014/09/18 06:56:30 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:56:33 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:56:35 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:56:35 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:56:59 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:57:01 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:03 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:05 [INFO] memberlist: Marking client1 as failed, suspect timeout reached
    2014/09/18 06:57:05 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:29 [INFO] serf: EventMemberJoin: client1 172.30.1.254
    2014/09/18 06:57:30 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:33 [INFO] memberlist: Suspect client1 has failed, no acks received
    2014/09/18 06:57:35 [INFO] serf: EventMemberFailed: client1 172.30.1.254
    2014/09/18 06:57:35 [INFO] memberlist: Suspect client1 has failed, no acks received

client1 (client agent):

    2014/09/19 14:27:26 [WARN] memberlist: Refuting a suspect message (from: consul1)

The lines above are copied from the logs for a specific time frame. client1's log looks quite short and only mentions consul1 here, but if I wait long enough it also shows suspect messages from consul2 and consul3 being refuted.

I am running the server agents with:

GOMAXPROCS=8 /usr/local/sbin/consul agent -server -data-dir=/dist/consul/data -ui-dir=/dist/consul/ui -pid-file=/dist/consul/agent.pid -client=0.0.0.0 -node=consul<1,2,3> -bootstrap-expect=3

For consul2 and consul3 I append -join=172.17.0.29 (the address of consul1).

And I am running client1 with:

consul agent -data-dir /data/consul/tmp -ui-dir /data/consul/ui

I think this is unrelated, but I am using the HTTP API to ask client1 to join consul1 (172.17.0.29). I do this several times a day, even when client1 is already joined.
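For reference, the join call via the HTTP API looks roughly like this (a sketch; the agent address and HTTP port assume the defaults):

    # ask the local client agent to join the cluster through one of the servers
    curl -X PUT http://127.0.0.1:8500/v1/agent/join/172.17.0.29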

Do you know what is going on?
Please let me know if you need more info to track this one down.

@imkira imkira changed the title client gets continuously evicted from cluster client gets repeatedly evicted from cluster Sep 19, 2014
@armon
Member

armon commented Sep 19, 2014

@imkira This looks like an issue with UDP messages not being delivered properly. This pattern of continuous suspected-failure / re-join flapping is exactly what you'd expect if the client is not receiving or handling the UDP messages. Verify that UDP traffic is going through on the gossip port (8301 by default).
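A quick way to sanity-check UDP reachability between a server and the client, assuming tcpdump and netcat are available on both ends (this is just a generic connectivity test, not anything Consul-specific; flag spellings vary between netcat variants):

    # on the client, watch for incoming gossip traffic on the serf LAN port
    tcpdump -n -i any udp port 8301

    # or run a throwaway listener/sender pair on an unused UDP port
    nc -u -l 9999                          # on the client (some netcat builds need -p 9999)
    echo ping | nc -u <client-ip> 9999     # from one of the servers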

@imkira
Contributor Author

imkira commented Sep 24, 2014

@armon thanks for the prompt reply.

The weird part is that, if I had a firewall problem, it shouldn't go away just because I restart the client Consul process, right? (By "restart" I mean the process itself, not the machine it runs on.) But the problem does go away. I have also noticed that this problem happens not after "1 day of operation" as I guessed before, but rather when I put my dev machine to sleep.

Anyway, this is my dev machine, and it shouldn't happen (I suppose) in production. I am using Docker to run consul1/2/3 (servers) and exposing 8300, 8301, 8302, 8400, 8500 and 8600/udp. Docker is running on top of boot2docker (since I am on OS X), and the client agent is running on the host machine itself. I have no firewall settings blocking traffic between the Docker VM (boot2docker) or the Docker containers and the host, and the host is able to reach the Docker containers on the exposed ports listed above. It does look like I should also be exposing 8301/udp as you said, but that doesn't seem to be the root cause, since everything runs fine for hours until I put the machine to sleep.

I haven't looked at the part of the implementation that handles the UDP messages, but I am assuming client agents listen on 8301/udp and wait for gossip messages(?). If that is true, could the UDP listener on the client agent be getting somewhat "confused" after waking from sleep?
Any other ideas on what may be going wrong?

@armon
Member

armon commented Sep 24, 2014

Ahhh okay. This clears it up. There is a known issue with Docker's networking that causes this. There is some kind of ARP cache that causes improper routing to take place, and breaks the UDP routing. Not sure what exactly the issue is, but we've noticed this happens anytime there is a rapid setup/teardown of Docker networking. The only known work-around is to "sleep" at least 3-5 minutes in between the teardown and re-join of the network.
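If you would rather clear the stale state manually than wait out the 3-5 minutes, inspecting and flushing the neighbor (ARP) cache on the Docker host should be roughly equivalent (a sketch using iproute2; needs root):

    ip neigh show           # inspect the ARP / neighbor cache
    ip neigh flush all      # clear it (or arp -d <ip> for a single entry)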

@imkira
Contributor Author

imkira commented Sep 29, 2014

Well that explains it all!
Thanks a lot!

@imkira imkira closed this as completed Sep 29, 2014
@dennybaa

I've also hit this case, but it turned out not to be a raw UDP delivery problem.
The thing is that the kernel performs connection tracking for UDP packets too, but for UDP it doesn't work as well as it should.
Flushing the connection tracking tables does the trick. Try conntrack -F.
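If you want to confirm that stale conntrack entries are the culprit before flushing everything, you can list the UDP entries for the gossip port first (a sketch; requires the conntrack tool and root):

    conntrack -L -p udp | grep 8301   # look for stale tuples involving the serf port
    conntrack -F                      # flush the whole table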

@arthurbarr

An alternative solution if you're using Docker might be to use host networking for the Consul container. This should (I believe) avoid any use of NAT and conntrack.
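As a rough sketch (the image name and flags are only illustrative; adjust them to your own setup), that would look something like:

    docker run -d --net=host --name consul \
      progrium/consul -server -bootstrap-expect=3 -advertise=<host-ip>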

@outrunthewolf

So does Consul still perform correctly while it is having these issues?

    2014/11/05 16:32:39 [WARN] memberlist: Refuting a suspect message (from: consulmaster)
    2014/11/05 16:33:42 [INFO] memberlist: Suspect consulslave has failed, no acks received

If not, could you explain a fix for it? I'm a little confused.

@ryanuber
Member

ryanuber commented Nov 5, 2014

@outrunthewolf these messages come from way down in the gossip layer (memberlist), indicating a network problem between nodes. If nodes aren't able to successfully send ping/ack messages to one another over UDP, then you will likely see members flapping between dead/alive state.

I would try the suggestions above from @arthurbarr and @dennybaa and see if you can get any mileage out of either solution, or, as @armon suggested, if you are doing rapid teardown/startup of Docker networking, you might want to add a pause in between.

Let us know if you are still having trouble!

@threetee

I've also solved this problem in the past by stopping consul on the problem node, forcibly deregistering the problem node using the consul dashboard, and restarting consul on the node.
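If you prefer the CLI over the dashboard, the rough equivalent of the forced deregistration (run from any agent still in the cluster) would be something like:

    consul force-leave client1    # node name here is just the example from this issue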

@imkira
Contributor Author

imkira commented Dec 15, 2014

I also noticed that if the network interface whose IP address Consul announces to the other nodes changes its address, the agent stops working. Only a restart fixes it.

@brycekahle

If you are on CoreOS, running toolbox and then yum install conntrack followed by conntrack -F seems to do the trick.

@cap10morgan

@brycekahle & @dennybaa Thank you! This needs to be shared far and wide. :)

@cap10morgan

A scriptable option on CoreOS:

docker run --net=host --privileged --rm cap10morgan/conntrack -F

@jlordiales

Hi @cap10morgan, is the Dockerfile for the conntrack image available somewhere?

Thanks!

@cap10morgan

@jlordiales Yep, I've made it an automated build.

@jlordiales

@cap10morgan Thanks man! Really appreciate it

@florentvaldelievre

Hello,

Same issue here: it keeps adding and removing server nodes. I am running Consul within a Docker (1.4.1) container with 3 server agent nodes (staging3, staging4, staging5).
You can see the logs of the staging3 node (--bootstrap-expect=3) below.
My host OS is CentOS 7. Is the only way to fix the problem to use conntrack (which is not easily installable on CentOS 7)?

EDIT: I attempted to run conntrack (docker run --net=host --privileged --rm cap10morgan/conntrack -F) on my 3 nodes and restarted Consul; same issue.

    2015/04/09 11:44:46 [INFO] serf: EventMemberJoin: staging4 192.168.1.200
    2015/04/09 11:44:46 [INFO] consul: adding server staging4 (Addr: 192.168.1.200:8300) (DC: dc1)
    2015/04/09 11:44:47 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:44:48 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:44:49 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:44:50 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:44:52 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:44:53 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:44:54 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:44:55 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:44:57 [INFO] memberlist: Marking staging4 as failed, suspect timeout reached
    2015/04/09 11:44:57 [INFO] serf: EventMemberFailed: staging4 192.168.1.200
    2015/04/09 11:44:57 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:44:57 [INFO] consul: removing server staging4 (Addr: 192.168.1.200:8300) (DC: dc1)
    2015/04/09 11:44:58 [INFO] serf: EventMemberJoin: staging4 192.168.1.200
    2015/04/09 11:44:58 [INFO] consul: adding server staging4 (Addr: 192.168.1.200:8300) (DC: dc1)
    2015/04/09 11:44:58 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:44:59 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:00 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:45:01 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:02 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:45:03 [INFO] memberlist: Marking staging5 as failed, suspect timeout reached
    2015/04/09 11:45:03 [INFO] serf: EventMemberFailed: staging5 192.168.1.210
    2015/04/09 11:45:03 [INFO] consul: removing server staging5 (Addr: 192.168.1.210:8300) (DC: dc1)
    2015/04/09 11:45:03 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:03 [ERR] dns: rpc error: No cluster leader
    2015/04/09 11:45:04 [ERR] agent: failed to sync remote state: No cluster leader
    2015/04/09 11:45:05 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:07 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:09 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:11 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:13 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:13 [ERR] dns: rpc error: No cluster leader
    2015/04/09 11:45:14 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:16 [INFO] serf: EventMemberJoin: staging5 192.168.1.210
    2015/04/09 11:45:16 [INFO] consul: adding server staging5 (Addr: 192.168.1.210:8300) (DC: dc1)
    2015/04/09 11:45:16 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:17 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:18 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:19 [INFO] memberlist: Suspect staging5 has failed, no acks received
    2015/04/09 11:45:20 [INFO] memberlist: Suspect staging4 has failed, no acks received
    2015/04/09 11:45:21 [INFO] memberlist: Suspect staging5 has failed, no acks received

This is my port configuration:

progrium/consul:latest "/bin/start -server 47 minutes ago Up 46 minutes 53/tcp, 172.17.42.1:53->53/udp, 192.168.1.190:8300->8300/tcp, 192.168.1.190:8301->8301/tcp, 192.168.1.190:8301->8301/udp, 192.168.1.190:8302->8302/tcp, 192.168.1.190:8302->8302/udp, 192.168.1.190:8400->8400/tcp, 192.168.1.190:8500->8500/tcp consul

Any hint?

@cap10morgan

@florentvaldelievre You can use my docker conntrack container. See my comment above.

@florentvaldelievre

Thanks @cap10morgan.
The issue actually happens when I docker restart consul on a node.
I guess this is the issue mentioned right at the bottom of this page.

Quoted from the page


Quickly restarting a node using the same IP issue

When testing a cluster scenario, you may kill a container and restart it again on the same host and see that it has trouble re-joining the cluster.

There is an issue when you restart a node as a new container with the same published ports that will cause heartbeats to fail and the node will flap. This is an ARP table caching problem. If you wait about 3 minutes before starting again, it should work fine. You can also manually reset the cache.


Questions

  1. Do you know where the issue is tracked?
  2. Does it mean I should never restart my consul container (docker restart consul), but should instead 'stop', wait 3 minutes, then 'start'?
  3. As a hacky solution, does it mean I need to create a custom consul image that cleans the ARP table each time consul starts?

Thanks

@cap10morgan

Hacky or not, your option 3 fixes it. That's what I'm doing. Otherwise ask Jeff Lindsay to bake it into the progrium/consul Docker image. :)
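For anyone going that route, a minimal wrapper entrypoint could look roughly like this (a sketch only; it assumes conntrack is installed in the image, the container has the privileges to touch conntrack/ARP state, and /bin/start is the image's normal entrypoint as in progrium/consul):

    #!/bin/sh
    # flush stale conntrack and neighbor-cache state left over from a previous
    # container, then hand off to the normal consul entrypoint
    conntrack -F || true
    ip neigh flush all || true
    exec /bin/start "$@"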

@progrium

progrium commented Apr 9, 2015

We're building this into a new version of the Consul container, coming soon. If anybody wants to submit a PR against the current one, that's fine too.

@vitalyisaev2

I ran into similar problems. I'm on CentOS 6.4, Docker 1.6.

The first host:

$ export HOST_IP=10.241.232.14
$ docker run -d --name consul -h $(hostname --fqdn) -p 8500:8500 -p 8400:8400 -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8600:53 -p 8600:53/udp progrium/consul -server -advertise=$HOST_IP -bootstrap -ui-dir /ui

Second one:

$ export HOST_IP=10.241.232.13
$ docker run -d --name consul -h $(hostname --fqdn) -p 8500:8500 -p 8400:8400 -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8600:53 -p 8600:53/udp progrium/consul -server -join 10.241.232.14 -advertise=$HOST_IP -ui-dir /ui

I fall into this issue after docker restart consul on any of the hosts.

@ekristen

I'm seeing this problem as well. Is there a ticket open with Docker about this bug?

@gauravphoenix

@ekristen moby/moby#8795 is tracking the issue.

@anroots

anroots commented Sep 19, 2015

apt-get install conntrack
conntrack -F

This seemed to work for me.

Docker version 1.8.2, build 0a8c2e3
Ubuntu 14.04.3 LTS

@sam0737

sam0737 commented Oct 4, 2015

I think I have hit the same issue; here is my observation.

I am running a container with Consul installed inside.
In docker run I mapped port 51086 (0.0.0.0:51086->51086/tcp, 0.0.0.0:51086->51086/udp), and in the container's Consul config I have the following:
    "ports": { "serf_lan": 51086 }

Then I hit the issue.

I used tcpdump to watch the UDP traffic:

On host's docker0 interface
02:19:07.763638 IP "CONTAINER".51086 > "ANOTHER_NODE_OUTSIDE".8301: UDP, length 48
On host's eth0 interface
02:19:07.763638 IP "ETH0".1026 > "ANOTHER_NODE_OUTSIDE".8301: UDP, length 48

I bet there was an existing conntrack tuple already occupying source port 51086, hence 1026 got allocated. This seems to mess up the gossip protocol.

When I assign another port, say 41086 instead of 51086, and restart the container, everything is back to normal.

The tcpdump output then looks like this:
On host's docker0 interface
02:19:07.763638 IP "CONTAINER".41086 > "ANOTHER_NODE_OUTSIDE".8301: UDP, length 48
On host's eth0 interface
02:19:07.763638 IP "ETH0".41086 > "ANOTHER_NODE_OUTSIDE".8301: UDP, length 48

Is it possible that the source port being different from the serf_lan configuration messes up the gossip protocol? (Just for the record, the DNAT is working; it maps ETH0:51086 to CONTAINER:51086 correctly.)

I haven't tried conntrack -F yet. I don't have a repro right now; maybe I will try that next time.

@sam0737

sam0737 commented Oct 4, 2015

Oh, here is how I can reproduce it.

I restart my container quickly, using maestro-ng stop and start.
The new container will have a new internal IP, but I still have its serf LAN port configured as 41086.
Now, when it tries to gossip with the nodes outside, conntrack assigns it a new source port (say 1026, as noted above) because 41086 is still in use by the old internal IP and has not expired yet. That results in node status flapping.
conntrack -F does solve the problem.

I am hoping Consul could change its gossip protocol to accept "ping" messages from a different source port than the one configured.
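In the meantime, if flushing the whole table feels too heavy-handed, deleting only the stale UDP tuples for the serf port should also work (a sketch; option spelling depends on the conntrack-tools version):

    conntrack -D -p udp --orig-port-src 41086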

@slackpad slackpad added type/bug Feature does not function as expected docker labels Nov 16, 2015