Swarm mesh routing doesn't load balance between nodes (Wireguard setup) #37985

Open
angristan opened this issue Oct 6, 2018 · 10 comments

@angristan

commented Oct 6, 2018

Description

I am running a swarm cluster with 3 nodes, each acting as both worker and manager.

I am running them on Hetzner Cloud, and since it doesn't offer private networking, I use WireGuard to create a VPN between the servers.

I previously ran the same steps on DigitalOcean, without WireGuard and with "real" private networking, and had no issues, so I assume the problem comes from my current setup, but I can't understand why.

All the nodes are running Debian 9.

root@erina ~# ip a show wg0
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1
    link/none
    inet 10.0.42.5/32 scope global wg0
       valid_lft forever preferred_lft forever
root@honoka ~# ip a show wg0
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1
    link/none
    inet 10.0.42.6/32 scope global wg0
       valid_lft forever preferred_lft forever
root@utaha ~# ip a show wg0
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1
    link/none
    inet 10.0.42.7/32 scope global wg0
       valid_lft forever preferred_lft forever

Of course, the nodes can reach each other, and the 3 ports required by Swarm are open. I'm even running a GlusterFS cluster over this network without issue.

I have seen quite a number of issues in this repo about this exact problem, but since my setup is different I have opened a new one.

Steps to reproduce the issue:

  1. Set up Docker and WireGuard
  2. On erina: docker swarm init --advertise-addr 10.0.42.5
  3. On the 2 others: docker swarm join --token xxxx 10.0.42.5:2377
  4. docker service create -p 80:80 --name web nginx:latest
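The steps above as a console transcript (the join token is elided, as in the original):

```console
# On erina: initialise the swarm, advertising the WireGuard address
root@erina ~# docker swarm init --advertise-addr 10.0.42.5

# On honoka and utaha: join using the token printed by `docker swarm init`
# (token elided; 2377 is the swarm management port)
root@honoka ~# docker swarm join --token xxxx 10.0.42.5:2377
root@utaha ~# docker swarm join --token xxxx 10.0.42.5:2377

# From any manager: publish port 80 on the ingress routing mesh
root@erina ~# docker service create -p 80:80 --name web nginx:latest
```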

Describe the results you received:

The nginx container is only accessible via the node it's running on.

root@erina ~# docker service ps web
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
w6s5wvr3uyxo        web.1               nginx:latest        utaha               Running             Running 10 minutes ago

Here, I can access Nginx on utaha's localhost, private IP and public IP, but not via the other nodes.
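The symptom can be demonstrated with a small probe loop (a sketch; the addresses are the wg0 IPs listed above). With a working routing mesh, all three nodes should return an HTTP status; here only the node running the task answers:

```shell
# Probe the published port on each node's WireGuard address.
# curl prints "000" when the connection fails or times out.
for ip in 10.0.42.5 10.0.42.6 10.0.42.7; do
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "http://$ip/" || true)
  echo "$ip -> HTTP ${code:-000}"
done
```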

Describe the results you expected:

Thanks to Swarm's routing mesh, I should be able to access the service from any node.
Since the nodes can reach each other and no ports are blocked, I wonder what's wrong. Maybe something with the overlay network?
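To rule out the obvious, the ports the routing mesh depends on can be probed between every pair of nodes over wg0 (a sketch using `nc`; the peer address is one of the nodes above):

```shell
# Ports swarm needs open between all nodes:
#   2377/tcp           cluster management
#   7946/tcp, 7946/udp gossip / network discovery
#   4789/udp           VXLAN overlay (ingress data plane)
swarm_ports() {
  printf '%s\n' 2377/tcp 7946/tcp 7946/udp 4789/udp
}

peer=10.0.42.6   # e.g. test from erina towards honoka
swarm_ports | while IFS=/ read -r port proto; do
  if [ "$proto" = udp ]; then opts="-zu"; else opts="-z"; fi
  nc $opts -w 2 "$peer" "$port" >/dev/null 2>&1 \
    && echo "$port/$proto reachable" \
    || echo "$port/$proto NOT reachable"
done
```

Note that `nc -zu` can report success even when UDP is filtered, so a `tcpdump` on the receiving side is more conclusive for 4789/udp.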

Output of docker version:

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:18 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:22:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

root@erina ~# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: 5vlol9p0ulu73d79dq6ie08w7
 Is Manager: true
 ClusterID: yva12o7oaiq6tws478wrlm2bx
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.0.42.5
 Manager Addresses:
  10.0.42.5:2377
  10.0.42.6:2377
  10.0.42.7:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-8-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.769GiB
Name: erina
ID: SMEQ:ZNHW:JQKH:D3ZA:YACF:FWT5:GUYS:3DVB:7NSQ:DQVL:PYTI:AQAR
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support
root@honoka ~# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: purz0nyznvqmuk9pyng2s4crw
 Is Manager: true
 ClusterID: yva12o7oaiq6tws478wrlm2bx
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.0.42.6
 Manager Addresses:
  10.0.42.5:2377
  10.0.42.6:2377
  10.0.42.7:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-8-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.769GiB
Name: honoka
ID: L4TQ:BHFL:4TVF:5UCU:BYF4:F6GR:6XOV:HWWC:BNMJ:H6DX:6G36:3SIU
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support
root@utaha ~# docker info
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: mmxnqjixdse5ilap5rajq56rn
 Is Manager: true
 ClusterID: yva12o7oaiq6tws478wrlm2bx
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.0.42.7
 Manager Addresses:
  10.0.42.5:2377
  10.0.42.6:2377
  10.0.42.7:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-8-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.769GiB
Name: utaha
ID: LOXM:5A2Q:TXEG:L52Z:CJT6:OYVY:ZBGS:RZLW:EP65:H5GA:FLDQ:7R6A
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.):

Here are some logs:

Oct 06 23:08:36 erina systemd[1]: Starting Docker Application Container Engine...
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.768946546+02:00" level=info msg="libcontainerd: started new docker-containerd process" pid=969
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.769103268+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.769114245+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.769331073+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.769363513+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.769413803+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42020ba80, CONNECTING" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="starting containerd" revision=468a545b9edcd5932818eb9de8e72413e616e86e version=v1.1.2
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a b
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs f
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs f
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesys
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd-debug.sock"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd.sock"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36+02:00" level=info msg="containerd successfully booted in 0.022201s"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.902770000+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42020ba80, READY" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912410004+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912443441+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912478778+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912490812+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912524030+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4203481f0, CONNECTING" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.912727681+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4203481f0, READY" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.915611444+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.922769065+02:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.923277607+02:00" level=warning msg="Your kernel does not support swap memory limit"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.923648157+02:00" level=warning msg="Your kernel does not support cgroup rt period"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.923948367+02:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.924632248+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.924996002+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.925309891+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.925595559+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.925900079+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420348e90, CONNECTING" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.926492085+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420348e90, READY" module=grpc
Oct 06 23:08:36 erina dockerd[877]: time="2018-10-06T23:08:36.926922934+02:00" level=info msg="Loading containers: start."
Oct 06 23:08:37 erina dockerd[877]: time="2018-10-06T23:08:37.166515706+02:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 06 23:08:37 erina dockerd[877]: time="2018-10-06T23:08:37.200500071+02:00" level=info msg="Loading containers: done."
Oct 06 23:08:37 erina dockerd[877]: time="2018-10-06T23:08:37.246219920+02:00" level=info msg="Docker daemon" commit=e68fc7a graphdriver(s)=overlay2 version=18.06.1-ce
Oct 06 23:08:37 erina dockerd[877]: time="2018-10-06T23:08:37.246589871+02:00" level=info msg="Daemon has completed initialization"
Oct 06 23:08:37 erina dockerd[877]: time="2018-10-06T23:08:37.277017891+02:00" level=info msg="API listen on /var/run/docker.sock"
Oct 06 23:08:37 erina systemd[1]: Started Docker Application Container Engine.
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.801132469+02:00" level=info msg="parsed scheme: \"\"" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.801217056+02:00" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.801336474+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{/var/run/docker/swarm/control.sock 0  <nil>}]" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.801359905+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.801430262+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420ab3e30, CONNECTING" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.806367078+02:00" level=info msg="Listening for connections" addr="[::]:2377" module=node node.id=5vlol9p0ulu73d79dq6ie08w7 proto=tcp
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.806508629+02:00" level=info msg="Listening for local connections" addr=/var/run/docker/swarm/control.sock module=node node.id=5vlol9p0ulu73d79dq6ie08w7 proto=unix
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.810957083+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420ab3e30, READY" module=grpc
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.819838510+02:00" level=info msg="44d099625302cf1f became follower at term 0" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.820413278+02:00" level=info msg="newRaft 44d099625302cf1f [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.820467764+02:00" level=info msg="44d099625302cf1f became follower at term 1" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.822656939+02:00" level=info msg="44d099625302cf1f is starting a new election at term 1" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.822724250+02:00" level=info msg="44d099625302cf1f became candidate at term 2" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.822753509+02:00" level=info msg="44d099625302cf1f received MsgVoteResp from 44d099625302cf1f at term 2" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.822772604+02:00" level=info msg="44d099625302cf1f became leader at term 2" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.822784040+02:00" level=info msg="raft.node: 44d099625302cf1f elected leader 44d099625302cf1f at term 2" module=raft node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.824055570+02:00" level=info msg="Creating default ingress network" module=node node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.825692045+02:00" level=info msg="leadership changed from not yet part of a raft cluster to 5vlol9p0ulu73d79dq6ie08w7" module=node node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:15 erina dockerd[877]: time="2018-10-06T23:14:15.825908931+02:00" level=info msg="dispatcher starting" module=dispatcher node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.327800756+02:00" level=info msg="manager selected by agent for new session: { }" module=node/agent node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.328535321+02:00" level=info msg="waiting 0s before registering session" module=node/agent node.id=5vlol9p0ulu73d79dq6ie08w7
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.331564429+02:00" level=info msg="worker 5vlol9p0ulu73d79dq6ie08w7 was successfully registered" method="(*Dispatcher).register"
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.336467018+02:00" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=10.0.42.5 Adv-addr=10.0.42.5 Data-addr= Remote-addr-list=[] MTU=1500"
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.336952412+02:00" level=info msg="New memberlist node - Node:erina will use memberlist nodeID:61113be67569 with config:&{NodeID:61113be67569 Hostname:erina BindAddr:0.0.0.0 Adve
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.340097522+02:00" level=info msg="Node 61113be67569/10.0.42.5, joined gossip cluster"
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.340463507+02:00" level=info msg="Node 61113be67569/10.0.42.5, added to nodes list"
Oct 06 23:14:16 erina dockerd[877]: time="2018-10-06T23:14:16.451896856+02:00" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or direc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.677672996+02:00" level=info msg="parsed scheme: \"\"" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.677758228+02:00" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.678233853+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.0.42.6:2377 0  <nil>}]" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.678259473+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.678382077+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4215b4610, CONNECTING" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.685426270+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4215b4610, READY" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.689241995+02:00" level=info msg="parsed scheme: \"\"" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.689313955+02:00" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.690387376+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.0.42.6:2377 0  <nil>}]" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.690434968+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.690572105+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42173a0d0, CONNECTING" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.703830708+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42173a0d0, READY" module=grpc
Oct 06 23:14:32 erina dockerd[877]: time="2018-10-06T23:14:32.792400950+02:00" level=info msg="worker purz0nyznvqmuk9pyng2s4crw was successfully registered" method="(*Dispatcher).register"
Oct 06 23:14:33 erina dockerd[877]: time="2018-10-06T23:14:33.193782128+02:00" level=info msg="Node 66434deef943/10.0.42.6, joined gossip cluster"
Oct 06 23:14:33 erina dockerd[877]: time="2018-10-06T23:14:33.193891161+02:00" level=info msg="Node 66434deef943/10.0.42.6, added to nodes list"
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.412553161+02:00" level=info msg="parsed scheme: \"\"" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.412652667+02:00" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.412928029+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.0.42.7:2377 0  <nil>}]" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.412945426+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.413018920+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b60090, CONNECTING" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.421483563+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b60090, READY" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.426802400+02:00" level=info msg="parsed scheme: \"\"" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.427245029+02:00" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.427967680+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.0.42.7:2377 0  <nil>}]" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.428043646+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.428114067+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b05910, CONNECTING" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.435951375+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b05910, READY" module=grpc
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.515967916+02:00" level=info msg="worker mmxnqjixdse5ilap5rajq56rn was successfully registered" method="(*Dispatcher).register"
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.915913094+02:00" level=info msg="Node e29997e99be2/10.0.42.7, joined gossip cluster"
Oct 06 23:14:39 erina dockerd[877]: time="2018-10-06T23:14:39.916019536+02:00" level=info msg="Node e29997e99be2/10.0.42.7, added to nodes list"
Oct 06 23:16:05 erina dockerd[877]: time="2018-10-06T23:16:05+02:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/cff02f8ce4cd9eee5da51bd6287a539934512fce80f0ca924300bc9a53f88e93/shim.sock" debug=fal
Oct 06 23:17:03 erina dockerd[877]: time="2018-10-06T23:17:03+02:00" level=info msg="shim reaped" id=cff02f8ce4cd9eee5da51bd6287a539934512fce80f0ca924300bc9a53f88e93
Oct 06 23:17:03 erina dockerd[877]: time="2018-10-06T23:17:03.877115321+02:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 06 23:17:03 erina dockerd[877]: time="2018-10-06T23:17:03.877818612+02:00" level=warning msg="rmServiceBinding 6cbe6e43c69ed9115589acd06aee30354dd07a5480178d23dbde42c844d1f8d7 possible transient state ok:false entries:0 set:false "
Oct 06 23:17:03 erina dockerd[877]: time="2018-10-06T23:17:03.991738857+02:00" level=warning msg="rmServiceBinding 2a273d588b999b0992f3eb71527d683cb2ec9c93f24a7566311899998997ce25 possible transient state ok:false entries:0 set:false "
Oct 06 23:19:16 erina dockerd[877]: time="2018-10-06T23:19:16.341187532+02:00" level=info msg="NetworkDB stats erina(61113be67569) - netID:4sqgzuyftpssn311oacx8xuql leaving:false netPeers:3 entries:12 Queue qLen:0 netMsg/s:0"
Oct 06 23:24:16 erina dockerd[877]: time="2018-10-06T23:24:16.540785611+02:00" level=info msg="NetworkDB stats erina(61113be67569) - netID:4sqgzuyftpssn311oacx8xuql leaving:false netPeers:3 entries:12 Queue qLen:0 netMsg/s:0"
Oct 06 23:29:16 erina dockerd[877]: time="2018-10-06T23:29:16.740837575+02:00" level=info msg="NetworkDB stats erina(61113be67569) - netID:4sqgzuyftpssn311oacx8xuql leaving:false netPeers:3 entries:12 Queue qLen:0 netMsg/s:0"

I'll be happy to provide more information if needed.

@angristan (Author) commented Oct 7, 2018

I've been able to reproduce the issue on Vultr with Wireguard too.

Here are some network info of one node, if that helps:

root@node1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP group default qlen 1000
    link/ether 56:00:01:b5:94:b1 brd ff:ff:ff:ff:ff:ff
    inet 217.69.xxx.xxx/23 brd 217.69.xxx.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 2001:19f0:6801:d6c:5400:1ff:feb5:94b1/64 scope global mngtmpaddr dynamic
       valid_lft 2591730sec preferred_lft 604530sec
    inet6 fe80::5400:1ff:feb5:94b1/64 scope link
       valid_lft forever preferred_lft forever
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1
    link/none
    inet 10.0.0.1/32 scope global wg0
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:c5:f5:66:4f brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
9: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:17:2a:59:de brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
    inet6 fe80::42:17ff:fe2a:59de/64 scope link
       valid_lft forever preferred_lft forever
11: vethad870c8@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 22:ed:0a:82:44:16 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::20ed:aff:fe82:4416/64 scope link
       valid_lft forever preferred_lft forever
15: vethfa2ee72@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 0e:2d:83:44:9a:8b brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::c2d:83ff:fe44:9a8b/64 scope link
       valid_lft forever preferred_lft forever
root@node1:~# ip r
default via 217.69.xxx.1 dev ens3
10.0.0.2 dev wg0 scope link
10.0.0.3 dev wg0 scope link
169.254.169.254 via 217.69.xxx.1 dev ens3
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
217.69.xxx.0/23 dev ens3 proto kernel scope link src 217.69.xxx.xxx
root@node1:~# wg
interface: wg0
  public key: QgS5sC4pRT4MwUM5YXgElfRc0O/NvHWUXwfX7LDiJmg=
  private key: (hidden)
  listening port: 1194

peer: 1W5TiygX6nYpH7gSPYNmeWLi/1dQ28zJXJYORsPufng=
  endpoint: public_ip:1194
  allowed ips: 10.0.0.3/32
  latest handshake: 1 minute, 44 seconds ago
  transfer: 445.74 KiB received, 472.49 KiB sent

peer: 78NWEbQ6XF/wZ3d3kzUCKWf8kKDajH3YfpDpUVFzYVM=
  endpoint: public_ip:1194
  allowed ips: 10.0.0.2/32
  latest handshake: 1 minute, 50 seconds ago
  transfer: 467.85 KiB received, 492.89 KiB sent
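For readers reconstructing this setup: the `wg` output above corresponds to a `wg0.conf` roughly like the following. This is a sketch only — the public keys are copied from the output, but the private key and endpoints are placeholders, and the file layout is assumed, not taken from the report.

```ini
[Interface]
Address    = 10.0.0.1/32
ListenPort = 1194
PrivateKey = <node1-private-key>

[Peer]  # node2
PublicKey  = 78NWEbQ6XF/wZ3d3kzUCKWf8kKDajH3YfpDpUVFzYVM=
Endpoint   = <node2-public-ip>:1194
AllowedIPs = 10.0.0.2/32

[Peer]  # node3
PublicKey  = 1W5TiygX6nYpH7gSPYNmeWLi/1dQ28zJXJYORsPufng=
Endpoint   = <node3-public-ip>:1194
AllowedIPs = 10.0.0.3/32
```

Note that `AllowedIPs` is restricted to each peer's tunnel address, which matches the `10.0.0.2`/`10.0.0.3` routes in the `ip r` output above.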

@angristan changed the title Mesh doesn't load balance between nodes (Wireguard setup) Swarm mesh routing doesn't load balance between nodes (Wireguard setup) Oct 8, 2018

@angristan (Author) commented Oct 15, 2018

Containers on different hosts can reach each other through the swarm overlay.
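For anyone trying to reproduce the symptom, a minimal routing-mesh check can be sketched like this (the image and service name are arbitrary examples; the node IP is erina's wg0 address from the setup above):

```shell
# Create a 3-replica service with a published port; the routing mesh should
# answer on every node and spread requests across all replicas.
docker service create --name whoami --replicas 3 -p 8080:80 traefik/whoami

# Repeated requests to a single node should cycle through the replica
# hostnames; in the broken setup each node only ever answers with its
# local replica (or not at all).
for i in $(seq 1 10); do curl -s http://10.0.42.5:8080 | grep Hostname; done
```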

@angristan (Author) commented Oct 15, 2018

Is there any technical documentation on the routing mesh?

@cpuguy83 (Contributor) commented Oct 15, 2018

Why are you running the swarm listener through wireguard?

Also, have you tried tracing this through something like tcpdump?
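For reference, swarm's data plane can be traced directly on the WireGuard interface; the documented default ports are 2377/tcp (cluster management), 7946/tcp+udp (node gossip/NetworkDB), and 4789/udp (VXLAN overlay data). A capture sketch, assuming the interface naming from the setup above:

```shell
# VXLAN data-plane traffic (service-to-service packets between nodes):
tcpdump -ni wg0 udp port 4789

# Gossip/NetworkDB traffic (the "NetworkDB stats" entries in the daemon logs):
tcpdump -ni wg0 port 7946
```

Comparing these captures between the working (plain private network) and broken (WireGuard) clusters is one way to see whether VXLAN packets are being sent but not delivered.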

@angristan (Author) commented Oct 15, 2018

Because that's the only way the nodes can communicate? As I said, on Hetzner Cloud we don't have private networking. (Did I miss something?)

Yes, I have done a ton of tcpdump captures, comparing against a working non-WireGuard cluster. I didn't find anything, but I also don't really know what I should be looking for.

@bagbag commented Nov 16, 2018

The same happens for me with WireGuard 0.0.20181018 and docker-ce 18.09.0.
Have you found a solution to this problem?

@angristan (Author) commented Nov 16, 2018

Hello @bagbag, no, I'm still looking for a solution.

@bagbag commented Nov 17, 2018

I gave up and now use the public network with --opt encrypted instead.
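For reference, the workaround described here is the documented `encrypted` option on overlay networks, which wraps the VXLAN data plane in IPsec rather than tunnelling everything through WireGuard. A sketch (the network and service names are arbitrary examples):

```shell
# Create an overlay network whose VXLAN traffic is encrypted with IPsec:
docker network create --driver overlay --opt encrypted my-encrypted-net

# Attach services to it as usual:
docker service create --name web --network my-encrypted-net nginx
```

Note this encrypts the data plane only; swarm management traffic on 2377/tcp is already protected by mutual TLS.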

@klipitkas commented Dec 8, 2018

Also experiencing this issue on Hetzner Cloud.

Edit: it turned out my problem was caused by installing Docker on Ubuntu via a snap package. Trying the same setup on Debian 9 with the latest Docker as of today, everything works normally.
