Docker Overlay network -service name issue #26480

Closed
rajkumar49 opened this issue Sep 12, 2016 · 32 comments

@rajkumar49

commented Sep 12, 2016

I am facing this issue with overlay networks. I started all Docker Swarm services in the same overlay network using Docker Engine 1.12.1. I can access a container by service name only from the same host; accessing containers on another host by service name does not work. I have even tried the --listen-addr option when launching the Swarm manager and the Swarm workers. Related closed ticket: #23855
Also, I can see that the overlay network has allocated IP addresses to all containers on all hosts. I can ping the VIP of a service from that service's containers. Please help.

@cpuguy83

Contributor

commented Sep 12, 2016

Can you please provide the full details requested by the issue template?

Namely output of docker version and docker info.
Exact steps to reproduce would be helpful as well.

Thanks

@ludolac

commented Sep 12, 2016

Same issue as @rajkumar49, on a swarm with 3 nodes.

docker swarm init --advertise-addr 172.31.100.232 --listen-addr eth0:2377

create overlay net1:
docker network create -d overlay net1

create 2 services:

docker service create --name front-nginx --mode global -p 80:80 -p 443:443 --network net1 nginx

docker service create --name db --replicas 1 --network net1 postgres:9.4

The db container is deployed on node3.

I can only ping the db service from the front-nginx-xxxx container on node3.

From the front-nginx-xxx containers on node1 and node2, I get "ping: unknown host".

docker version


Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:02:53 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:02:53 2016
 OS/Arch:      linux/amd64


docker info


Containers: 5
 Running: 2
 Paused: 0
 Stopped: 3
Images: 22
Server Version: 1.12.1
Storage Driver: devicemapper
 Pool Name: docker-202:2-527764-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: ext4
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 1.946 GB
 Data Space Total: 107.4 GB
 Data Space Available: 16.44 GB
 Metadata Space Used: 3.047 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.144 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.90 (2014-09-01)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host null bridge overlay
Swarm: active
 NodeID: e90w99orw8re6cpx1a9fka6i9
 Is Manager: true
 ClusterID: 1r1nwnu4iibvnt7dyebqf95s4
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 172.31.100.232
Runtimes: runc
Default Runtime: runc
Security Options:
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.962 GiB
Name: nodea0
ID: RJTC:3OAA:AT7X:A7IE:VBHG:YQ6Q:6H4H:BLVY:CIHN:CBDL:NINX:2M7F
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Insecure Registries:
 127.0.0.0/8
@charlesa101

commented Sep 13, 2016

having exactly the same issue :(

@rajkumar49

Author

commented Sep 13, 2016

Additional info: I am using Ubuntu 16.04.1 LTS on four hosts. One of the hosts is the Docker swarm manager.

@ludolac

commented Sep 13, 2016

I would like to add more info.
ping doesn't work, but the telnet command works:

ping db
PING db (10.0.0.2): 56 data bytes
92 bytes from c86475619a13 (10.0.0.5): Destination Host Unreachable
92 bytes from c86475619a13 (10.0.0.5): Destination Host Unreachable
92 bytes from c86475619a13 (10.0.0.5): Destination Host Unreachable
telnet db 5432
Trying 10.0.0.2...
Connected to db.
Escape character is '^]'.
@rajkumar49

Author

commented Sep 13, 2016

Docker info:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 30
Server Version: 1.12.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 185
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: overlay null bridge host
Swarm: active
NodeID: 63qv5oe7apbctp1l2cfamsz3e
Is Manager: true
ClusterID: 7b81tksw7q5h3t8e7lmrlpelw
Managers: 1
Nodes: 2 (usually 4)
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 172.19.xx.xx
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-21-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.685 GiB
Name: ubuntu-docker-xxxx-1
ID: 5YVX:ZMG7:THAY:2AQD:GLPZ:MBE4:JEFL:BG4Q:65TN:5TA7:D7OO:XDRB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: dockeriot
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8

@rajkumar49

Author

commented Sep 13, 2016

@ludolac are you able to connect to the DB? Does the web app work when it uses the DB's service name?

@rajkumar49

Author

commented Sep 13, 2016

Update: I am able to connect to the MongoDB port using telnet:
root@72c007d6a7b1:/# telnet mongo-test 27017
Trying 10.0.0.5...
Connected to mongo-test.
Escape character is '^]'

@rajkumar49

Author

commented Sep 13, 2016

When I access MongoDB from the WebApp, the WebApp gives this error:
Verify that MongoDB is running on mongo-test:27017 and restart server.

(Here the two containers are running on two different nodes.)

@mrjana

Contributor

commented Sep 14, 2016

@rajkumar49 I think your issue is related to the gossip channel not being established. Can you attach the daemon logs for the node where you are having problems?

@niau

Contributor

commented Sep 15, 2016

@rajkumar49 just curious: can you also provide the output of docker service inspect for the MongoDB and WebApp services, and more specifically the VIPs part of it?
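
The VIPs requested here can be pulled out of the inspect output with a Go template rather than read from the full JSON. A minimal sketch, assuming mongo-test (the name used with telnet earlier in the thread) is the MongoDB service name; service_vips is a helper name introduced here for illustration and is not part of the Docker CLI:

```shell
# service_vips SERVICE: print the virtual IPs that swarm assigned to SERVICE,
# one entry per attached network, as JSON.
service_vips() {
  docker service inspect --format '{{json .Endpoint.VirtualIPs}}' "$1"
}

# Example (run on a manager node; the service name is an assumption):
#   service_vips mongo-test
```

Comparing the reported address with what the name resolves to inside a container on each node shows whether every node agrees on the service's VIP.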

@xiaods

Contributor

commented Sep 20, 2016

I came across the same issue. +1 to getting it resolved.

We urgently need a fix.

@rajkumar49

Author

commented Sep 21, 2016

Has anyone been able to fix it?

@xiaods

Contributor

commented Sep 21, 2016

Gossip channel not established: can anyone confirm that?

@rajkumar49

Author

commented Sep 22, 2016

Docker changelog :
https://github.com/docker/docker/blob/master/CHANGELOG.md
Networking:
Fix issue that prevented containers to be accessed by hostname with Docker overlay driver in Swarm Mode #25603 #25648

Is there any relation between this ticket and the above fix?
What does "hostname" mean here?

@erkie

commented Sep 24, 2016

I'm sorry that I don't have any more info for debugging this but: I ran into the same issue just now (docker 1.12.1). A restart of all my hosts fixed it.

@xiaods

Contributor

commented Sep 25, 2016

@erkie you restarted the hosts? Can you retry and confirm that it is effective?

@rajkumar49

Author

commented Sep 26, 2016

Hi, the network configuration of my 4 lab machines is:
Machine 1 (ifconfig):
docker0 Link encap:Ethernet HWaddr 02:42:d0:b5:5e:87
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:d0ff:feb5:5e87/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:8254 errors:0 dropped:0 overruns:0 frame:0
TX packets:8024 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1083679 (1.0 MB) TX bytes:1157751 (1.1 MB)

docker_gwbridge Link encap:Ethernet HWaddr 02:42:0c:92:ea:c0
inet addr:172.22.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:cff:fe92:eac0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:62 errors:0 dropped:0 overruns:0 frame:0
TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4060 (4.0 KB) TX bytes:1040 (1.0 KB)

enp2s0 Link encap:Ethernet HWaddr d4:3d:7e:63:93:d8
inet addr:172.19.65.35 Bcast:172.19.65.63 Mask:255.255.255.224
inet6 addr: fe80::d63d:7eff:fe63:93d8/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2524799 errors:0 dropped:0 overruns:0 frame:0
TX packets:2488706 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:460783814 (460.7 MB) TX bytes:411957304 (411.9 MB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:541612 errors:0 dropped:0 overruns:0 frame:0
TX packets:541612 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:57005136 (57.0 MB) TX bytes:57005136 (57.0 MB)

veth9c1d4f4 Link encap:Ethernet HWaddr 62:4d:7b:de:df:de
inet6 addr: fe80::604d:7bff:fede:dfde/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:1338 (1.3 KB)
Machine 2:
docker0 Link encap:Ethernet HWaddr 02:42:ea:85:3c:19
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:eaff:fe85:3c19/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:142557 errors:0 dropped:0 overruns:0 frame:0
TX packets:142620 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8682315 (8.6 MB) TX bytes:10830947 (10.8 MB)

docker_gwbridge Link encap:Ethernet HWaddr 02:42:e3:87:d2:d2
inet addr:172.18.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:e3ff:fe87:d2d2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:32466 errors:0 dropped:0 overruns:0 frame:0
TX packets:41984 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2064352 (2.0 MB) TX bytes:47629734 (47.6 MB)

enp2s0 Link encap:Ethernet HWaddr d4:3d:7e:63:93:a0
inet addr:172.19.65.36 Bcast:172.19.65.63 Mask:255.255.255.224
inet6 addr: fe80::d63d:7eff:fe63:93a0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1673586 errors:0 dropped:0 overruns:0 frame:0
TX packets:1633564 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:330979568 (330.9 MB) TX bytes:199853419 (199.8 MB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:166 errors:0 dropped:0 overruns:0 frame:0
TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:12368 (12.3 KB) TX bytes:12368 (12.3 KB)

veth2cad523 Link encap:Ethernet HWaddr 2e:60:c9:ed:a1:9a
inet6 addr: fe80::2c60:c9ff:feed:a19a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:1464 (1.4 KB)

Machine 3 :
docker0 Link encap:Ethernet HWaddr 02:42:99:50:d6:b0
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:99ff:fe50:d6b0/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:475 errors:0 dropped:0 overruns:0 frame:0
TX packets:458 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:42167 (42.1 KB) TX bytes:58504 (58.5 KB)

docker_gwbridge Link encap:Ethernet HWaddr 02:42:11:8c:7a:58
inet addr:172.18.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:11ff:fe8c:7a58/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3284 errors:0 dropped:0 overruns:0 frame:0
TX packets:4699 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2436863 (2.4 MB) TX bytes:3977167 (3.9 MB)

enp2s0 Link encap:Ethernet HWaddr d4:3d:7e:63:93:da
inet addr:172.19.65.37 Bcast:172.19.65.63 Mask:255.255.255.224
inet6 addr: fe80::d63d:7eff:fe63:93da/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1557094 errors:0 dropped:0 overruns:0 frame:0
TX packets:1504375 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:333428889 (333.4 MB) TX bytes:179183580 (179.1 MB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:164 errors:0 dropped:0 overruns:0 frame:0
TX packets:164 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:12176 (12.1 KB) TX bytes:12176 (12.1 KB)

veth9815d8c Link encap:Ethernet HWaddr 1e:8f:7a:5d:5b:49
inet6 addr: fe80::1c8f:7aff:fe5d:5b49/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:1296 (1.2 KB)

Machine 4 :
docker0 Link encap:Ethernet HWaddr 02:42:9f:e9:0b:a1
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:9fff:fee9:ba1/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:76503 errors:0 dropped:0 overruns:0 frame:0
TX packets:76717 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4472160 (4.4 MB) TX bytes:4131227 (4.1 MB)

docker_gwbridge Link encap:Ethernet HWaddr 02:42:64:06:7d:86
inet addr:172.18.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:64ff:fe06:7d86/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1383 errors:0 dropped:0 overruns:0 frame:0
TX packets:2548 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2270949 (2.2 MB) TX bytes:2583459 (2.5 MB)

enp2s0 Link encap:Ethernet HWaddr 68:05:ca:16:88:29
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:19 Memory:f7dc0000-f7de0000

enp3s0 Link encap:Ethernet HWaddr d4:3d:7e:5e:50:45
inet addr:172.19.65.38 Bcast:172.19.65.63 Mask:255.255.255.224
inet6 addr: fe80::a0ca:7d97:3972:6733/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1232734 errors:0 dropped:15 overruns:0 frame:0
TX packets:1233315 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:992803000 (992.8 MB) TX bytes:505287005 (505.2 MB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:2459862 errors:0 dropped:0 overruns:0 frame:0
TX packets:2459862 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:403802470 (403.8 MB) TX bytes:403802470 (403.8 MB)

veth1cdb9fb Link encap:Ethernet HWaddr 3a:17:da:c2:f8:27
inet6 addr: fe80::3817:daff:fec2:f827/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:491 errors:0 dropped:0 overruns:0 frame:0
TX packets:1449 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2228084 (2.2 MB) TX bytes:149035 (149.0 KB)

Do you think the above IP address details are correct for Swarm?

@rajkumar49

Author

commented Sep 26, 2016

root@ubuntu-docker-xxxx-3:~# telnet 172.19.65.36 2377
Trying 172.19.65.36...
telnet: Unable to connect to remote host: Connection refused
root@ubuntu-docker-xxxx-3:~# telnet 172.19.65.36 7946
Trying 172.19.65.36...
Connected to 172.19.65.36.
Escape character is '^]'.
Connection closed by foreign host.
root@ubuntu-docker-xxxx-3:~# telnet 172.19.65.36 4789
Trying 172.19.65.36...
telnet: Unable to connect to remote host: Connection refused

I tried to connect to the Docker swarm ports using telnet.
The above output is between two lab machines. Do you think there is a network port blocking issue?
(The two lab machines mentioned above are already part of the Docker Swarm cluster.)
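
For reference when reading the telnet output above: Docker's swarm mode documentation lists 2377/tcp for cluster management (served only by manager nodes), 7946/tcp and 7946/udp for node-to-node gossip, and 4789/udp for overlay (VXLAN) traffic. telnet only opens TCP connections, so a refused connection on 4789 does not by itself show that the port is blocked, and 2377 is expected to be refused on a node that is not a manager. A minimal sketch of a TCP-only check, assuming bash and the coreutils timeout command are available; swarm_ports and check_tcp are names introduced here for illustration:

```shell
#!/usr/bin/env bash
# Ports that swarm nodes need open between each other.
swarm_ports() {
  cat <<'EOF'
2377/tcp cluster management (manager nodes only)
7946/tcp node-to-node gossip
7946/udp node-to-node gossip
4789/udp overlay network (VXLAN) traffic
EOF
}

# check_tcp HOST PORT: succeed if a TCP connection opens within 2 seconds.
# UDP ports cannot be verified this way.
check_tcp() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example, using an address from the thread:
#   check_tcp 172.19.65.36 7946 && echo "gossip TCP port reachable"
```

The UDP ports are better verified by inspecting firewall rules on each host, since a UDP probe gives no reliable refusal signal.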

@xiaods

Contributor

commented Sep 30, 2016

Hi all. 1.12.2-rc1 has been released. I have tested the above case with the latest RC version and it works like a charm. Can anyone else test and confirm that it is fixed?

@rajkumar49

Author

commented Sep 30, 2016

Hi,
how do I upgrade to the latest Docker Engine, 1.12.2-rc1?

@thaJeztah

Member

commented Sep 30, 2016

@rajkumar49 the release candidate can be found here; https://github.com/docker/docker/releases/tag/v1.12.2-rc1. You can install it with

curl -fsSL https://test.docker.com | sh
@drajen

commented Oct 4, 2016

I had horrible name-resolution inconsistencies with 1.12.1. I updated to 1.12.2-rc1 and all my problems went away. One thing to note is that ping only works with containers on the same host, while the service's exposed port works across hosts.

@rajkumar49

Author

commented Oct 5, 2016

Yeah, Docker version 1.12.2-rc1 fixed the overlay network issue.

@rajkumar49 closed this Oct 5, 2016

@thaJeztah

Member

commented Oct 6, 2016

Thanks for testing!

@thaJeztah added this to the 1.12.2 milestone Oct 6, 2016

@bvipparla

commented Oct 13, 2016

Facing the same issue with the latest Docker 1.12.2, running in swarm mode. Not able to resolve services by service name.

@thaJeztah

Member

commented Oct 13, 2016

@bvipparla are your services connected to a custom network? If you don't specify a network, you won't be able to use service discovery by name.

@bvipparla

commented Oct 14, 2016

@thaJeztah yes. We have a custom overlay network with 4 nodes running in swarm mode. If we restart all the Docker engines, service discovery (SD) works fine using the service name. After a while, it stops working again. We are building a microservices stack in which each microservice reports its health to a central discovery server over rabbitmq. We are running rabbitmq as a swarm service, and after a while (1-2 hours or so) the microservices are no longer able to reach the rabbitmq service and fail to establish channel comms. This issue is not limited to the rabbitmq service; I used it as an example.

I've searched the forums about this issue and noticed some issues discussing the embedded DNS, iptables, and VIPs. So I played with the swarm services by scaling them up and down. We have around 12 microservices talking to each other. I shuffled through all of them by scaling up and down randomly, and once all the services had stabilized, the comms were working fine again.

@thaJeztah

Member

commented Oct 14, 2016

@bvipparla could you open a new issue with those details (and possibly more details, if relevant)?

@galindro

commented Dec 14, 2016

@thaJeztah, I can confirm that this issue is happening with Docker versions 1.12.3 and 1.12.4, all running on Ubuntu 16.04 without the systemd network fix mentioned in #26492.

Here is a screenshot of my swarm cluster:

[image]

As you can see, I'm running 4 services:

root@ip-10-0-1-100:~# docker service ls
ID            NAME    REPLICAS  IMAGE                 COMMAND
9vtg1wkhclsh  nginx   3/3       nginx                 
azquqaizlq1c  alpine  3/3       alpine                sleep 30000
bnamhbjl00az  viz     1/1       manomarks/visualizer  
brsouv0mqk7u  b       1/1       busybox               sleep 10000

From ip-10-0-1-100 (the manager node), I ran ash through docker exec in one of the containers of the alpine service, and I could ping only those services that have containers running on the same host. Those services are alpine itself and viz. If I try to ping any other service that doesn't have containers running on the same host, the ICMP packets can't reach that service's containers.

But the strange behaviour is this: if I access the nginx service through TCP port 80, it works, as reported by @ludolac here. But it only showed this behaviour with docker-engine 1.12.3. With 1.12.4, the wget nginx command no longer worked.

This issue is preventing me from using Docker 1.12.x with swarm mode on more than one host. IMHO, this is a critical issue in swarm mode.

Do I need to open a new thread?

Here is the terminal output from ip-10-0-1-100 (the manager node):

root@ip-10-0-1-100:~# docker service ps alpine
ID                         NAME      IMAGE   NODE           DESIRED STATE  CURRENT STATE           ERROR
aoeslwnpf4mruhnae8erf023r  alpine.1  alpine  ip-10-0-1-100  Running        Running 18 minutes ago  
3l007wh1l5cxevxj40rdmnvgw  alpine.2  alpine  ip-10-0-1-100  Running        Running 18 minutes ago  
e14z2qthcgjtn273jqeugv876  alpine.3  alpine  ip-10-0-2-100  Running        Running 18 minutes ago  

root@ip-10-0-1-100:~# docker exec -ti alpine.1.aoeslwnpf4mruhnae8erf023r ash
/ # 
/ # 
/ # ping viz
PING viz (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: seq=0 ttl=64 time=0.121 ms
^C
--- viz ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.121/0.121/0.121 ms
/ # 
/ # 
/ # ping nginx
PING nginx (10.0.0.6): 56 data bytes
^C
--- nginx ping statistics ---
6 packets transmitted, 0 packets received, 100% packet loss
/ # 
/ # 
/ # wget nginx
Connecting to nginx (10.0.0.6:80)
index.html           100% |*********************************************************************************************|   612   0:00:00 ETA
/ # 
/ # 
/ # cat index.html 
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
/ # 
/ # 
/ # rm index.html 
/ # 
/ # 
/ # root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# apt update
Hit:1 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial InRelease
Get:2 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
Get:3 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-backports InRelease [102 kB]         
Get:4 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main Sources [211 kB]             
Get:5 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/universe Sources [113 kB]                               
Get:6 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/multiverse Sources [3,640 B]                                    
Get:7 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [440 kB]                            
Get:8 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [370 kB]                                  
Get:9 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/universe Translation-en [134 kB]                                  
Get:10 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/multiverse amd64 Packages [7,376 B]          
Get:11 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]                              
Hit:12 https://apt.dockerproject.org/repo ubuntu-xenial InRelease
Get:13 http://security.ubuntu.com/ubuntu xenial-security/main Sources [53.0 kB]
Get:14 http://security.ubuntu.com/ubuntu xenial-security/universe Sources [15.3 kB]
Get:15 http://security.ubuntu.com/ubuntu xenial-security/multiverse Sources [724 B]
Get:16 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [191 kB]
Get:17 http://security.ubuntu.com/ubuntu xenial-security/main Translation-en [78.7 kB]
Get:18 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [64.8 kB]
Get:19 http://security.ubuntu.com/ubuntu xenial-security/universe Translation-en [35.5 kB]
Get:20 http://security.ubuntu.com/ubuntu xenial-security/multiverse amd64 Packages [2,756 B]
Fetched 2,026 kB in 2s (695 kB/s)      
Reading package lists... Done
Building dependency tree       
Reading state information... Done
8 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@ip-10-0-1-100:~# apt list --upgradable
Listing... Done
apt/xenial-updates,xenial-security 1.2.15ubuntu0.2 amd64 [upgradable from: 1.2.15]
apt-transport-https/xenial-updates,xenial-security 1.2.15ubuntu0.2 amd64 [upgradable from: 1.2.15]
apt-utils/xenial-updates,xenial-security 1.2.15ubuntu0.2 amd64 [upgradable from: 1.2.15]
docker-engine/ubuntu-xenial 1.12.4-0~ubuntu-xenial amd64 [upgradable from: 1.12.3-0~xenial]
libapt-inst2.0/xenial-updates,xenial-security 1.2.15ubuntu0.2 amd64 [upgradable from: 1.2.15]
libapt-pkg5.0/xenial-updates,xenial-security 1.2.15ubuntu0.2 amd64 [upgradable from: 1.2.15]
python3-software-properties/xenial-updates 0.96.20.5 all [upgradable from: 0.96.20.4]
software-properties-common/xenial-updates 0.96.20.5 all [upgradable from: 0.96.20.4]
root@ip-10-0-1-100:~# apt -y dist-upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  apt apt-transport-https apt-utils docker-engine libapt-inst2.0 libapt-pkg5.0 python3-software-properties software-properties-common
8 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 21.4 MB of archives.
After this operation, 214 kB of additional disk space will be used.
Get:1 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 libapt-pkg5.0 amd64 1.2.15ubuntu0.2 [702 kB]
Get:2 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 libapt-inst2.0 amd64 1.2.15ubuntu0.2 [55.7 kB]
Get:3 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 apt amd64 1.2.15ubuntu0.2 [1,042 kB]
Get:4 https://apt.dockerproject.org/repo ubuntu-xenial/main amd64 docker-engine amd64 1.12.4-0~ubuntu-xenial [19.4 MB]
Get:5 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 apt-utils amd64 1.2.15ubuntu0.2 [196 kB]
Get:6 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 apt-transport-https amd64 1.2.15ubuntu0.2 [26.0 kB]
Get:7 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 software-properties-common all 0.96.20.5 [9,432 B]
Get:8 http://sa-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/main amd64 python3-software-properties all 0.96.20.5 [19.9 kB]
Fetched 21.4 MB in 0s (38.9 MB/s)                                                        
(Reading database ... 70429 files and directories currently installed.)
Preparing to unpack .../libapt-pkg5.0_1.2.15ubuntu0.2_amd64.deb ...
Unpacking libapt-pkg5.0:amd64 (1.2.15ubuntu0.2) over (1.2.15) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
Setting up libapt-pkg5.0:amd64 (1.2.15ubuntu0.2) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
(Reading database ... 70429 files and directories currently installed.)
Preparing to unpack .../libapt-inst2.0_1.2.15ubuntu0.2_amd64.deb ...
Unpacking libapt-inst2.0:amd64 (1.2.15ubuntu0.2) over (1.2.15) ...
Preparing to unpack .../apt_1.2.15ubuntu0.2_amd64.deb ...
Unpacking apt (1.2.15ubuntu0.2) over (1.2.15) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up apt (1.2.15ubuntu0.2) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
(Reading database ... 70429 files and directories currently installed.)
Preparing to unpack .../apt-utils_1.2.15ubuntu0.2_amd64.deb ...
Unpacking apt-utils (1.2.15ubuntu0.2) over (1.2.15) ...
Preparing to unpack .../apt-transport-https_1.2.15ubuntu0.2_amd64.deb ...
Unpacking apt-transport-https (1.2.15ubuntu0.2) over (1.2.15) ...
Preparing to unpack .../docker-engine_1.12.4-0~ubuntu-xenial_amd64.deb ...
Unpacking docker-engine (1.12.4-0~ubuntu-xenial) over (1.12.3-0~xenial) ...
Preparing to unpack .../software-properties-common_0.96.20.5_all.deb ...
Unpacking software-properties-common (0.96.20.5) over (0.96.20.4) ...
Preparing to unpack .../python3-software-properties_0.96.20.5_all.deb ...
Unpacking python3-software-properties (0.96.20.5) over (0.96.20.4) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for systemd (229-4ubuntu12) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for dbus (1.10.6-1ubuntu3.1) ...
Setting up libapt-inst2.0:amd64 (1.2.15ubuntu0.2) ...
Setting up apt-utils (1.2.15ubuntu0.2) ...
Setting up apt-transport-https (1.2.15ubuntu0.2) ...
Setting up docker-engine (1.12.4-0~ubuntu-xenial) ...
Setting up python3-software-properties (0.96.20.5) ...
Setting up software-properties-common (0.96.20.5) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~# docker node ls
ID                           HOSTNAME        STATUS  AVAILABILITY  MANAGER STATUS
0413wg8x7p6dkv9ttd42gefye    ip-10-0-3-100   Down    Active        Unreachable
7fzw5kwaj7j3urxapxvsujn44    ip-10-0-22-107  Ready   Active        
a6z9520td2iw67291igk9qzot    ip-10-0-2-100   Ready   Active        Reachable
cdt5hal5avbwdxiwfw5kymp3k *  ip-10-0-1-100   Ready   Active        Leader
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# docker service ls
ID            NAME    REPLICAS  IMAGE                 COMMAND
9vtg1wkhclsh  nginx   3/3       nginx                 
azquqaizlq1c  alpine  3/3       alpine                sleep 30000
bnamhbjl00az  viz     1/1       manomarks/visualizer  
brsouv0mqk7u  b       1/1       busybox               sleep 10000
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~# docker service ps alpine
ID                         NAME          IMAGE   NODE           DESIRED STATE  CURRENT STATE                ERROR
ckqelj734hvx8n8s0eypagmmi  alpine.1      alpine  ip-10-0-1-100  Running        Running about a minute ago   
aoeslwnpf4mruhnae8erf023r   \_ alpine.1  alpine  ip-10-0-1-100  Shutdown       Complete about a minute ago  
70806xpn57mkr5haiiq20qgsx  alpine.2      alpine  ip-10-0-1-100  Running        Running about a minute ago   
3l007wh1l5cxevxj40rdmnvgw   \_ alpine.2  alpine  ip-10-0-1-100  Shutdown       Complete about a minute ago  
bpxhm55a9tvhw7t6zrtinjk8a  alpine.3      alpine  ip-10-0-2-100  Running        Running about a minute ago   
e14z2qthcgjtn273jqeugv876   \_ alpine.3  alpine  ip-10-0-2-100  Shutdown       Complete about a minute ago  
root@ip-10-0-1-100:~# 
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~#
root@ip-10-0-1-100:~# docker exec -ti alpine.1.ckqelj734hvx8n8s0eypagmmi ash
/ # 
/ # 
/ # 
/ # ping -c 2 viz
PING viz (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: seq=0 ttl=64 time=0.125 ms
64 bytes from 10.0.0.2: seq=1 ttl=64 time=0.083 ms

--- viz ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.083/0.104/0.125 ms
/ # 
/ # 
/ # ping -c 2 alpine
PING alpine (10.0.0.10): 56 data bytes
64 bytes from 10.0.0.10: seq=0 ttl=64 time=0.036 ms
64 bytes from 10.0.0.10: seq=1 ttl=64 time=0.062 ms

--- alpine ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.036/0.049/0.062 ms
/ # 
/ # 
/ # ping -c 2 nginx
ping: bad address 'nginx'
/ # 
/ # 
/ # wget nginx
wget: bad address 'nginx'
/ # 
/ # 
/ # 
@thaJeztah

Member

commented Dec 14, 2016

@galindro ping not working across hosts is due to the way service virtual IPs work. Originally, using ping was not possible at all; see #24201, which was addressed before 1.12.0. Pinging across hosts is not possible in Docker 1.12 (see #25497), but will be implemented in Docker 1.13 (see docker/libnetwork#1501 and #28019).

How are your services started? Are they all connected to the same custom network? It's better to open a new issue, because the issue that was reported here was resolved, and there are many possible causes for overlay networking not working (many depending on the configuration of the hosts).
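
Since ping against a service VIP across hosts is not expected to work in 1.12, a name lookup followed by a TCP request is a more telling check, and it separates the two failures seen in the terminal output above (100% packet loss versus bad address). A minimal sketch, assuming the container image ships busybox nslookup and wget, as alpine does; check_service is a helper name introduced here for illustration:

```shell
# check_service CONTAINER SERVICE: from inside CONTAINER, first resolve
# SERVICE through the embedded DNS, then fetch it over HTTP (TCP port 80).
# A failure of the first step points at service discovery; a failure of
# only the second step points at the data path between hosts.
check_service() {
  docker exec "$1" nslookup "$2" &&
    docker exec "$1" wget -q -O /dev/null -T 3 "http://$2/"
}

# Example, using names from the thread:
#   check_service alpine.1.ckqelj734hvx8n8s0eypagmmi nginx
```

The HTTP step only applies to services that listen on port 80; for other services the URL would need to be replaced with a check suited to that service's port.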

@galindro

commented Dec 14, 2016
