Swarm restarts all containers #38203
Comments
ping @dperny PTAL |
I am also having the same issue.
When I look at the logs, the only thing I can see happening during that time is:
I think this is the same error as @Umaaz's. This only started happening after I upgraded Docker recently. I am not sure how else to provide debug information. |
@wk8 PTAL |
I am also seeing the same issue when the heartbeat fails. Service 4adb11869318 on the manager node and service e7b284330420 on the worker node hit the problem very frequently. My manager node logs:
My worker node logs:
docker version
docker info
Environment: Both nodes are separate DigitalOcean droplets |
I'm facing exactly the same. Any updates about this post? |
Same here! |
Same with 19.03.5
Next, in journalctl -u docker:
All containers restarted simultaneously.
|
I am also seeing this issue in Docker v20.10.2 |
Potential solution can be found here: #36311 |
This is probably not the right issue, but it is in the same chain of issues, so I'll post it here. Debian 10 - this issue happened randomly, without any warning:
I tried disabling IPv6 on the host and setting the proper config in the daemon, but the issue didn't go away. There is a similar issue on Stack Overflow; I didn't try it since I had already migrated everything, but maybe try upgrading your Docker daemon? |
Is the problem still there? |
On my side, I can add that we found the cause in our own code. It was not a Docker issue for us. |
+1 |
Description
We are running a docker swarm cluster with 3 managers and 5 workers. Twice now we have experienced some error in the cluster where every service is restarted. After some time all the services recover and it all goes back to normal.
Steps to reproduce the issue:
I am unable to reproduce the error on demand. It has only happened twice on a cluster that has been running for 105 days, with over 200 containers.
Describe the results you received:
When looking into the issue, I came across this in the logs:
This dump only appears on one of the 3 managers, on the other 2 there are logs such as:
It seems that the managers are having trouble communicating, but I am unsure why.
I would appreciate any assistance you can give in solving this issue.
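For anyone trying to catch this as it happens, here is a minimal sketch (not part of the original report) that flags managers the swarm no longer considers healthy. It assumes the script runs on a manager node with the docker CLI on PATH, and that `docker node ls --format '{{json .}}'` prints one JSON object per node with `Hostname` and `ManagerStatus` fields. The function names are hypothetical.

```python
import json
import subprocess


def unhealthy_managers(node_ls_lines):
    """Return hostnames of managers whose status is neither Leader nor Reachable."""
    bad = []
    for line in node_ls_lines:
        line = line.strip()
        if not line:
            continue
        node = json.loads(line)
        status = node.get("ManagerStatus", "")
        # Workers report an empty ManagerStatus, so they are skipped here.
        if status and status not in ("Leader", "Reachable"):
            bad.append(node.get("Hostname", node.get("ID", "?")))
    return bad


def current_unhealthy_managers():
    # Assumption: run on a manager node; this command fails on workers.
    out = subprocess.run(
        ["docker", "node", "ls", "--format", "{{json .}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return unhealthy_managers(out.splitlines())
```

Calling `current_unhealthy_managers()` periodically (for example from cron) and logging a timestamp whenever the list is non-empty could help line up the restarts with the manager logs. It only reports what the local manager believes, so it may need to run on each manager.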
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.):
Virtualization: kvm
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:
Kernel: Linux 3.10.0-862.9.1.el7.x86_64
Architecture: x86-64