Docker swarm load balancing not working over private network #36689
Comments
GordonTheTurtle added the area/swarm label Mar 25, 2018
cecchisandrone commented Sep 19, 2018

Any update on this? Did you solve the issue in some way? It happens to me as well with a 3-node swarm on Docker 18.06.1-ce.
Not really, I've found a job. Gonna try to reproduce the issue today and report back.
Yes, the issue still persists with current wireguard (0.0.20180910-wg1) and docker-ce (18.06.1-ce). I have 2 nodes, both active and reachable over internal addresses, but every 2nd request to the Docker service fails. Sadly, I'm stuck at the same point: I could not figure out what blocks requests between the Docker nodes.
bagbag commented Nov 16, 2018

Same happens for me with wireguard 0.0.20181018 and docker-ce 18.09.0.
thaJeztah added the area/networking label Nov 16, 2018
JFYI, found the very same issue: #37985
agrrh commented Mar 25, 2018 • edited
Description
The problem is probably similar to #25325. Docker can't reach containers on hostB when I query hostA's public address.
I'm using Docker swarm with 2 hosts; they are connected via a wireguard tunnel and reachable from each other. I'm able to ping each host from the other using internal addresses.
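The reachability check described above can be sketched as follows; `10.0.0.2` is a placeholder for the peer's wireguard-internal address (an assumption, not taken from the report):

```shell
# Sanity-check the tunnel before initializing swarm
# (10.0.0.2 is a hypothetical peer internal address).
ping -c 3 10.0.0.2

# Swarm additionally needs these ports open between the nodes:
#   2377/tcp     - cluster management
#   7946/tcp+udp - node discovery (gossip)
#   4789/udp     - overlay (VXLAN) data path
nc -zv 10.0.0.2 2377
nc -zv 10.0.0.2 7946
```

A plain ping succeeding does not prove the swarm ports are open, so checking them explicitly can rule out one common cause of exactly this symptom.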
Then I initialize swarm mode using the `--advertise-addr`, `--data-path-addr` and `--listen-addr` options, also specifying the internal addresses there. Hosts are visible via `docker node ls`, both active. No errors in syslog.
But when I create a service with 2 replicas, I see strange behavior: accessing the service via one of the public IPs, I'm able to reach only the containers running on that particular node. Other requests fail with a timeout.
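For reference, a sketch of how swarm mode can be bound to the tunnel with those three options; `10.0.0.1`/`10.0.0.2` are hypothetical internal addresses and `<worker-token>` is the placeholder for the real join token:

```shell
# On the first node (internal address assumed to be 10.0.0.1):
docker swarm init \
  --advertise-addr 10.0.0.1 \
  --data-path-addr 10.0.0.1 \
  --listen-addr 10.0.0.1:2377

# On the second node (internal address assumed to be 10.0.0.2),
# using the worker token printed by the command above:
docker swarm join \
  --advertise-addr 10.0.0.2 \
  --data-path-addr 10.0.0.2 \
  --listen-addr 10.0.0.2:2377 \
  --token <worker-token> 10.0.0.1:2377

# Both nodes should then show up as Ready / Active:
docker node ls
```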
Steps to reproduce the issue:
`docker service create --name dummy --replicas 2 --publish 8080:80 agrrh/dummy-service-py`

Describe the results you received:
As I said, requests to containers on other nodes fail:
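The timeouts can be observed with a simple loop; `203.0.113.10` stands in for one node's public address (a placeholder, not from the report):

```shell
# Hit the published port repeatedly. The routing mesh round-robins
# across replicas, so in the failing state responses alternate between
# the local replica answering and a timeout for the remote one.
for i in $(seq 1 6); do
  curl -m 5 -s -o /dev/null -w "%{http_code}\n" http://203.0.113.10:8080/ \
    || echo "timeout"
done
```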
Describe the results you expected:
I expect to be able to reach all of running containers by querying public address of any single node.
Additional information you deem important (e.g. issue happens only occasionally):
It seems to me that the wireguard tunnel itself is not the cause, as I'm still able to send pings between containers. For example, containerB can reach these containerA addresses:
- 10.255.0.4 @ lo, ~0.050 ms (looks like this actually doesn't leave host2)
- 10.255.0.5 @ eth0, ~0.700 ms (I can see this with `tcpdump` on the other end, it's reachable!)
- 172.18.0.3 @ eth1, ~0.050 ms (this probably doesn't leave host2 either)

Due to using `--advertise-addr` I can see packets running between the hosts via the private interface.
I tried to install `ntp` and sync the clocks, but this did not help.
I also attempted various fixes (e.g. turning off masquerading, re-creating the default bridge with a lower MTU, setting the default bind IP, etc.), but had no luck.
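On the MTU point: overlay traffic is VXLAN-encapsulated and then carried inside the wireguard tunnel, so the effective MTU shrinks twice. A rough calculation, assuming a 1500-byte physical MTU, wg-quick's default 80-byte reservation, and VXLAN's 50-byte overhead (all three numbers are assumptions to check against your setup):

```shell
PHYS_MTU=1500        # physical interface MTU (assumption)
WG_OVERHEAD=80       # wg-quick's default reservation for wireguard headers
VXLAN_OVERHEAD=50    # VXLAN encapsulation used by the swarm overlay

TUNNEL_MTU=$((PHYS_MTU - WG_OVERHEAD))
OVERLAY_MTU=$((TUNNEL_MTU - VXLAN_OVERHEAD))
echo "tunnel MTU:  $TUNNEL_MTU"    # 1420
echo "overlay MTU: $OVERLAY_MTU"   # 1370
```

If the overlay networks were created with the default 1500 MTU, large packets can be silently dropped inside the tunnel, which looks exactly like hanging requests while small pings still get through.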
I have already reproduced the issue 3 times with a clean setup and am ready to provide collaborators access to my test hosts if you would like to investigate on-site.
Output of `docker version` (same on both hosts):

Output of `docker info`:

Additional environment details (AWS, VirtualBox, physical, etc.):
Wireguard setup guide (assuming you installed it):
Servers should be reachable via internal addresses a moment after these steps.
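Since the guide itself isn't reproduced here, a minimal `wg0.conf` sketch for host A; all keys, addresses, and the endpoint are placeholders:

```ini
# /etc/wireguard/wg0.conf on host A (hypothetical values throughout)
[Interface]
Address = 10.0.0.1/24
PrivateKey = <hostA-private-key>
ListenPort = 51820

[Peer]
PublicKey = <hostB-public-key>
AllowedIPs = 10.0.0.2/32
Endpoint = <hostB-public-ip>:51820
PersistentKeepalive = 25
```

Bring the interface up with `wg-quick up wg0` on both hosts, with mirrored settings on host B.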