With Docker Swarm unable to access containers on other nodes #27541

Closed
the-nw1-group opened this issue Oct 19, 2016 · 10 comments

Comments

@the-nw1-group

When we deploy a simple application into a Docker Swarm, if the swarm spans two nodes, the containers on one node are unable to reach the containers on the other node.

The application consists of one reverse proxy (Apache HTTPD) and one web site (also Apache HTTPD). When the reverse proxy can't connect to the web site, we eventually get:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

Steps to reproduce the issue:

Build a Docker image from the following Dockerfile and push it to a registry:

FROM httpd:2.4-alpine

ADD httpd.conf /usr/local/apache2/conf/httpd.conf

The new httpd.conf file enables mod_proxy and adds the proxy rules:

ProxyPreserveHost On
ProxyRequests off
ProxyPass /test/ http://web/
ProxyPassReverse /test/ http://web/
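
The report does not show how mod_proxy was enabled. One possible way, sketched here, assumes the stock httpd.conf in the httpd:2.4-alpine image ships the proxy modules as commented-out LoadModule lines; proxying to an http:// backend needs both mod_proxy and mod_proxy_http. The function name enable_proxy_modules is hypothetical.

```shell
# Uncomment the LoadModule lines for mod_proxy and mod_proxy_http.
# Reads an httpd.conf on stdin and writes the edited file to stdout.
enable_proxy_modules() {
  sed -E 's/^#(LoadModule proxy(_http)?_module )/\1/'
}

# Example on two sample lines (assumed to match the stock file's format).
# Only the first line should lose its leading "#".
printf '%s\n' \
  '#LoadModule proxy_module modules/mod_proxy.so' \
  '#LoadModule proxy_ajp_module modules/mod_proxy_ajp.so' |
  enable_proxy_modules
```

The result would still need the ProxyPass rules above appended before it is added to the image.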

Run the following:

[dev01] docker swarm init --advertise-addr $(hostname -i)
[tst01] docker swarm join --token <TOKEN> vm-beis-dev01:2377
[dev01] docker network create --driver overlay --opt encrypted test_common
[dev01] docker service create --with-registry-auth \
 --name rp \
 --publish 1080:80 \
 --network test_common \
 --constraint 'node.role==manager' \
 registry:443/rp:test
[dev01] docker service create --with-registry-auth \
 --name web \
 --network test_common \
 --replicas 2 \
 --constraint 'node.role==worker' \
 httpd:2.4-alpine

Describe the results you received:

[dev01] curl http://vm-beis-dev01:1080/test/
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

Describe the results you expected:

[dev01] curl http://vm-beis-dev01:1080/test/
<html><body><h1>It works!</h1></body></html>

Output of docker version: (the same on both servers)

Client:
 Version:      1.12.2
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   bb80604
 Built:
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.2
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   bb80604
 Built:
 OS/Arch:      linux/amd64

Output of docker info: (on dev01)

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 13
Server Version: 1.12.2
Storage Driver: devicemapper
 Pool Name: docker-253:1-1006633150-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 1.765 GB
 Data Space Total: 107.4 GB
 Data Space Available: 105.6 GB
 Metadata Space Used: 4.616 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.143 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge host null
Swarm: active
 NodeID: 5s918xu9flglfop6wolz1xci1
 Is Manager: true
 ClusterID: cim2lskzq8aaba2xnsv5gwzf5
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 10.102.16.17
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-327.18.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 14.53 GiB
Name: vm-beis-dev01
ID: YGKS:SQHV:RS6U:26F5:FU3O:A5A5:YDYG:NUGF:K7CC:7SJ5:RYXC:MSKV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 45
 Goroutines: 139
 System Time: 2016-10-19T13:13:31.433439724+01:00
 EventsListeners: 1
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 127.0.0.0/8

on tst01:

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 5
Server Version: 1.12.2
Storage Driver: devicemapper
 Pool Name: docker-253:1-207915392-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 3.593 GB
 Data Space Total: 107.4 GB
 Data Space Available: 103.8 GB
 Metadata Space Used: 9.777 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.138 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge host null
Swarm: active
 NodeID: 7533sn2g79yo3gtt75x13ettp
 Is Manager: false
 Node Address: 10.102.16.16
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-327.18.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 14.53 GiB
Name: vm-beis-tst01
ID: YGKS:SQHV:RS6U:26F5:FU3O:A5A5:YDYG:NUGF:K7CC:7SJ5:RYXC:MSKV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 41
 Goroutines: 94
 System Time: 2016-10-19T13:14:15.171870415+01:00
 EventsListeners: 2
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 127.0.0.0/8

Output of uname -a:

Linux vm-beis-dev01 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Linux vm-beis-tst01 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Ports 2377 (TCP), 7946 (TCP/UDP) and 4789 (TCP/UDP) are open on both nodes.
firewalld is disabled, but iptables is running on both servers.

[dev01] docker exec <rp-container-id> wget -O - -q http://web/
wget: can't connect to remote host (10.0.0.4): Operation timed out
[tst01] docker exec <web-1-container-id> wget -O - -q http://web/
<html><body><h1>It works!</h1></body></html>
[tst01] docker exec <web-2-container-id> wget -O - -q http://web/
<html><body><h1>It works!</h1></body></html>
@amitkumarj441

@the-nw1-group One reason this may not be working as intended is that --listen-addr is misconfigured or not set at all, which can cause problems in environments such as hosts with multiple NICs.
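
A quick way to see whether this applies is to check what `hostname -i` expands to, since the commands in the report pass it to --advertise-addr. The helper below is a hypothetical sketch, not part of Docker: it succeeds only when it is given exactly one non-loopback address.

```shell
# Succeed only if the argument is exactly one address and not loopback.
# `hostname -i` can print several addresses, or a loopback one, on some hosts.
single_routable_addr() {
  set -- $1                      # split on whitespace into positional params
  [ "$#" -eq 1 ] || return 1     # zero or several addresses: ambiguous
  case "$1" in
    127.*|::1) return 1 ;;       # loopback is useless to other nodes
  esac
  return 0
}

if single_routable_addr "$(hostname -i 2>/dev/null)"; then
  echo "hostname -i yields a single usable address"
else
  echo "pass an explicit IP to --advertise-addr / --listen-addr"
fi
```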

@the-nw1-group
Author

I did try that and got the same results, but just to double-check:

[dev01] docker swarm init --advertise-addr $(hostname -i)
[tst01] docker swarm join --token <TOKEN> --listen-addr=$(hostname -i) vm-beis-dev01:2377 

[dev01] docker network create --driver overlay --opt encrypted test_common
[dev01] docker service create --with-registry-auth \
 --name rp \
 --publish 1080:80 \
 --network test_common \
 --constraint 'node.role==manager' \
 registry:443/rp:test
[dev01] docker service create --with-registry-auth \
 --name web \
 --network test_common \
 --replicas 2 \
 --constraint 'node.role==worker' \
 httpd:2.4-alpine

[dev01] curl http://vm-beis-dev01:1080/test/
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

[dev01] docker exec <rp-container-id> wget -O - -q http://web/
wget: can't connect to remote host (10.0.0.4): Operation timed out
[tst01] docker exec <web-1-container-id> wget -O - -q http://web/
<html><body><h1>It works!</h1></body></html>
[tst01] docker exec <web-2-container-id> wget -O - -q http://web/
<html><body><h1>It works!</h1></body></html>

@aboch
Contributor

aboch commented Oct 19, 2016

@the-nw1-group Before investigating further, I think it is worth first checking whether it works over a non-encrypted overlay network.
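
A minimal sketch of that check, reusing the service definition from the report. The network name test_common_plain is hypothetical, and it is assumed that the existing web service has been removed first. The commands are only printed (dry run); change the body of run to "$@" to execute them on the manager.

```shell
# Dry run: print each command instead of executing it.
run() { echo "+ $*"; }

# Create an overlay WITHOUT "--opt encrypted" and attach the web service.
# "test_common_plain" is a hypothetical name, not from the report.
plain_overlay_test() {
  run docker network create --driver overlay test_common_plain
  run docker service create --name web --network test_common_plain \
      --replicas 2 --constraint 'node.role==worker' httpd:2.4-alpine
}

plain_overlay_test
```

If the wget test from the rp container succeeds on this network but not on the encrypted one, the problem is likely in the encryption path rather than in VXLAN or service discovery.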

@coryleeio

I just went through a lot of headaches with something similar, and removing --opt encrypted fixed it for me. I was seeing intermittent network failures when starting new hosts.

@the-nw1-group
Author

Thanks to both @aboch and @coryleeio: running it with a non-encrypted overlay network seems to work. Obviously our security team won't like it, but for initial testing that's fine.

@aboch
Contributor

aboch commented Oct 20, 2016

@the-nw1-group I suspect your 3.10.x kernel is missing some modules needed for data-plane encryption to work. Can you run the script https://github.com/docker/docker/blob/master/contrib/check-config.sh on your hosts and post the output? Thanks.

@the-nw1-group
Author

Here's the output from dev01:

warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-3.10.0-327.18.2.el7.x86_64 ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
  (RHEL7/CentOS7: User namespaces disabled; add 'user_namespace.enable=1' to boot command line)
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: missing
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: enabled
- CONFIG_MEMCG_KMEM: enabled
- CONFIG_RESOURCE_COUNTERS: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled
- CONFIG_NETPRIO_CGROUP: enabled (as module)
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: enabled
- CONFIG_IP_VS: enabled (as module)
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_EXT3_FS: missing
- CONFIG_EXT3_FS_XATTR: missing
- CONFIG_EXT3_FS_POSIX_ACL: missing
- CONFIG_EXT3_FS_SECURITY: missing
    (enable these ext3 configs if you are using ext3 as backing filesystem)
- CONFIG_EXT4_FS: enabled (as module)
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled
      - CONFIG_CRYPTO_GCM: enabled (as module)
      - CONFIG_CRYPTO_SEQIV: enabled
      - CONFIG_CRYPTO_GHASH: enabled (as module)
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled
      - CONFIG_XFRM_ALGO: enabled
      - CONFIG_INET_ESP: enabled (as module)
      - CONFIG_INET_XFRM_MODE_TRANSPORT: enabled (as module)
  - "ipvlan":
    - CONFIG_IPVLAN: missing
  - "macvlan":
    - CONFIG_MACVLAN: enabled (as module)
    - CONFIG_DUMMY: enabled (as module)
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: missing
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled (as module)
    - CONFIG_BTRFS_FS_POSIX_ACL: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled (as module)
    - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled (as module)
  - "zfs":
    - /dev/zfs: missing
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

and here's the output from tst01:

warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-3.10.0-327.18.2.el7.x86_64 ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
  (RHEL7/CentOS7: User namespaces disabled; add 'user_namespace.enable=1' to boot command line)
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: missing
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: enabled
- CONFIG_MEMCG_KMEM: enabled
- CONFIG_RESOURCE_COUNTERS: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled
- CONFIG_NETPRIO_CGROUP: enabled (as module)
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: enabled
- CONFIG_IP_VS: enabled (as module)
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_EXT3_FS: missing
- CONFIG_EXT3_FS_XATTR: missing
- CONFIG_EXT3_FS_POSIX_ACL: missing
- CONFIG_EXT3_FS_SECURITY: missing
    (enable these ext3 configs if you are using ext3 as backing filesystem)
- CONFIG_EXT4_FS: enabled (as module)
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled
      - CONFIG_CRYPTO_GCM: enabled (as module)
      - CONFIG_CRYPTO_SEQIV: enabled
      - CONFIG_CRYPTO_GHASH: enabled (as module)
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled
      - CONFIG_XFRM_ALGO: enabled
      - CONFIG_INET_ESP: enabled (as module)
      - CONFIG_INET_XFRM_MODE_TRANSPORT: enabled (as module)
  - "ipvlan":
    - CONFIG_IPVLAN: missing
  - "macvlan":
    - CONFIG_MACVLAN: enabled (as module)
    - CONFIG_DUMMY: enabled (as module)
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: missing
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled (as module)
    - CONFIG_BTRFS_FS_POSIX_ACL: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled (as module)
    - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled (as module)
  - "zfs":
    - /dev/zfs: missing
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

Thanks.
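
The part of this output that matters for encrypted overlays is short. As a sketch, assuming plain-text output laid out like the paste above, the hypothetical helper below extracts just the options listed under "Optional (for encrypted networks)":

```shell
# Print only the kernel options that check-config.sh lists under
# "Optional (for encrypted networks)", one per line, reading from stdin.
encrypted_opts() {
  awk '
    /Optional \(for encrypted networks\)/ { on = 1; next }   # section starts
    on && /^ *- "/                        { on = 0 }         # next driver
    on                                    { sub(/^ *- /, ""); print }
  '
}

# Sample input copied from the output above (abridged).
encrypted_opts <<'EOF'
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
      Optional (for encrypted networks):
      - CONFIG_XFRM: enabled
      - CONFIG_INET_ESP: enabled (as module)
  - "ipvlan":
    - CONFIG_IPVLAN: missing
EOF
```

On both hosts every option in that section is reported as enabled, which points away from a missing kernel module.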

@coryleeio

This could possibly be the same as #26523

@aboch
Contributor

aboch commented Oct 26, 2016

@the-nw1-group

Please make sure IP protocol 50 (ESP) is allowed between all your hosts.
See #27425
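
For hosts filtered by iptables, as in this report, one way to admit ESP is sketched below. It assumes the two node addresses shown in the docker info output earlier in the issue and rules inserted at the top of the INPUT chain; chain layout and rule persistence are site-specific. The hypothetical helper only prints the rules so they can be reviewed before being run as root.

```shell
# Print one iptables rule per peer that accepts ESP (IP protocol 50),
# the protocol carrying encrypted overlay traffic between nodes.
esp_rules() {
  for peer in "$@"; do
    printf 'iptables -I INPUT -p esp -s %s -j ACCEPT\n' "$peer"
  done
}

# Node addresses taken from the "docker info" output in this issue.
esp_rules 10.102.16.16 10.102.16.17
```

Each node needs the rule for the other node's address. Opening UDP 4789 alone is not enough here, because the encrypted packets arrive as ESP rather than UDP.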

@the-nw1-group
Author

Brilliant - It works!

Thanks a lot.
