Containers lose connectivity to the internet (after some time - trigger unknown) #15172

Open · amoghe opened this issue Jul 30, 2015 · 10 comments · 8 participants
amoghe commented Jul 30, 2015

Details:

  • Output of uname -a:
    Linux storm 3.13.0-59-generic #98-Ubuntu SMP Fri Jul 24 21:05:26 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
  • Output of docker version
Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 786b29d
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): 786b29d
OS/Arch (server): linux/amd64
  • Output of docker -D info
Containers: 1
Images: 65
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 67
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.13.0-59-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 4
Total Memory: 11.43 GiB
Name: storm
ID: 7A7R:XXXP:ZC25:6VPN:L6DW:MYVC:WBDX:PNRQ:26Z5:7F4Z:DZV4:KPKC
WARNING: No swap limit support
  • Environment details: Ubuntu on baremetal (laptop)
  • How reproducible: Easily (since upgrading to 1.7.1)
  • Steps to reproduce:
    1. Restart the docker daemon (sudo service docker restart)
    2. Launch a container and test connectivity: docker run -it ubuntu /bin/ping -c4 8.8.8.8
    3. Let some time elapse (leave the system idle)
    4. Repeat step 2 -> no connectivity
  • Expected: No issues connecting the second time
  • Additional info:

Once connectivity is lost, the container can still reach interfaces on the machine itself. For example, my wired interface is still reachable, but traffic can't get past it.
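For reference, the repro above condensed into a small script (a sketch of the steps, assuming an upstart-managed docker on Ubuntu and the stock ubuntu image; the idle duration is arbitrary, since the actual trigger is unknown):

```shell
#!/bin/sh
set -e

# Step 1: restart the docker daemon
sudo service docker restart

# Step 2: launch a container and test connectivity -- this works
docker run --rm ubuntu /bin/ping -c4 8.8.8.8

# Step 3: leave the system idle for a while (duration is arbitrary)
sleep 3600

# Step 4: repeat the connectivity test -- on affected hosts this now fails
docker run --rm ubuntu /bin/ping -c4 8.8.8.8
```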

---[On laptop]--------------------

akshay@storm:~/go/src$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 28:d2:44:69:3f:2c  
          inet addr:192.168.1.38  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: 2601:647:4200:60f0:2ad2:44ff:fe69:3f2c/64 Scope:Global
          inet6 addr: fe80::2ad2:44ff:fe69:3f2c/64 Scope:Link
          inet6 addr: 2601:647:4200:60f0:34c5:2010:6c66:25a/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1111596 errors:0 dropped:0 overruns:0 frame:0
          TX packets:503526 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1186022016 (1.1 GB)  TX bytes:101440411 (101.4 MB)
          Interrupt:20 Memory:f0600000-f0620000 

---[Inside the container]----------------

akshay@storm:~/go/src/$ docker run -it ubuntu /bin/bash
root@6be66ba9ccbc:/# ping 192.168.1.38
PING 192.168.1.38 (192.168.1.38) 56(84) bytes of data.
64 bytes from 192.168.1.38: icmp_seq=1 ttl=64 time=0.083 ms
64 bytes from 192.168.1.38: icmp_seq=2 ttl=64 time=0.083 ms
64 bytes from 192.168.1.38: icmp_seq=3 ttl=64 time=0.080 ms

^C
--- 192.168.1.38 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.080/0.082/0.083/0.001 ms

But cannot reach any further.

root@6be66ba9ccbc:/# ping -c2  8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.

--- 8.8.8.8 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

Meanwhile, my laptop has no issues connecting to the interwebz.

akshay@storm:~/go/src/d7/controller$ ping -c2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=54 time=12.0 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=54 time=11.6 ms

Route table looks as it should:

akshay@storm:~/go/src$ ip route show
default via 192.168.1.1 dev eth0  proto static 
50.203.224.0/24 dev pertino0  proto kernel  scope link  src 50.203.224.4 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.42.1 
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.38  metric 1 

The firewall has stock settings (confirmed via iptables -L that there are no custom rules):

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (1 references)
target     prot opt source               destination
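One caveat worth noting here: plain iptables -L only shows the filter table, while the rule that gives containers outbound connectivity is a MASQUERADE entry in the nat table, so that table is worth checking as well (the expected output below is a sketch, assuming the default 172.17.0.0/16 bridge subnet):

```shell
# `iptables -L` above covers only the filter table; Docker's SNAT rule
# lives in the nat table and must be inspected explicitly:
sudo iptables -t nat -L POSTROUTING -n -v

# On a working host this should include a rule along the lines of:
#   MASQUERADE  all  --  *  !docker0  172.17.0.0/16  0.0.0.0/0
```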

Syslog has the following output when things work normally:

Jul 30 13:15:29 storm kernel: [75984.772971] type=1400 audit(1438287329.141:76): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=28577 comm="apparmor_parser"
---- [Docker restarted manually] ------------------------------
Jul 30 13:15:39 storm kernel: [75995.361822] aufs au_opts_parse:1155:docker[28578]: unknown option dirperm1
Jul 30 13:15:39 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab)
Jul 30 13:15:39 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab): no ifupdown configuration found.
Jul 30 13:15:39 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/veth7cad9ab: couldn't determine device driver; ignoring...
Jul 30 13:15:39 storm kernel: [75995.420836] device veth7de53b9 entered promiscuous mode
Jul 30 13:15:39 storm kernel: [75995.421597] IPv6: ADDRCONF(NETDEV_UP): veth7cad9ab: link is not ready
Jul 30 13:15:39 storm kernel: [75995.421816] IPv6: ADDRCONF(NETDEV_CHANGE): veth7cad9ab: link becomes ready
Jul 30 13:15:39 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/veth7de53b9, iface: veth7de53b9)
Jul 30 13:15:39 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/veth7de53b9, iface: veth7de53b9): no ifupdown configuration found.
Jul 30 13:15:39 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/veth7de53b9: couldn't determine device driver; ignoring...
Jul 30 13:15:39 storm avahi-daemon[668]: Withdrawing workstation service for veth7cad9ab.
Jul 30 13:15:39 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab)
Jul 30 13:15:40 storm kernel: [75996.419471] docker0: port 1(veth7de53b9) entered forwarding state
Jul 30 13:15:40 storm kernel: [75996.419537] docker0: port 1(veth7de53b9) entered forwarding state
Jul 30 13:15:41 storm avahi-daemon[668]: Joining mDNS multicast group on interface veth7de53b9.IPv6 with address fe80::2824:a5ff:feaf:149a.
Jul 30 13:15:41 storm avahi-daemon[668]: New relevant interface veth7de53b9.IPv6 for mDNS.
Jul 30 13:15:41 storm avahi-daemon[668]: Registering new address record for fe80::2824:a5ff:feaf:149a on veth7de53b9.*.
Jul 30 13:15:42 storm kernel: [75998.568946] docker0: port 1(veth7de53b9) entered disabled state
Jul 30 13:15:42 storm kernel: [75998.583006] docker0: port 1(veth7de53b9) entered forwarding state
Jul 30 13:15:42 storm kernel: [75998.583027] docker0: port 1(veth7de53b9) entered forwarding state
Jul 30 13:15:42 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab)
Jul 30 13:15:42 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab): no ifupdown configuration found.
Jul 30 13:15:42 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/veth7cad9ab: couldn't determine device driver; ignoring...
Jul 30 13:15:42 storm avahi-daemon[668]: Interface veth7de53b9.IPv6 no longer relevant for mDNS.
Jul 30 13:15:42 storm avahi-daemon[668]: Leaving mDNS multicast group on interface veth7de53b9.IPv6 with address fe80::2824:a5ff:feaf:149a.
Jul 30 13:15:42 storm avahi-daemon[668]: Withdrawing address record for fe80::2824:a5ff:feaf:149a on veth7de53b9.
Jul 30 13:15:42 storm avahi-daemon[668]: Withdrawing workstation service for veth7cad9ab.
Jul 30 13:15:42 storm avahi-daemon[668]: Withdrawing workstation service for veth7de53b9.
Jul 30 13:15:43 storm kernel: [75998.619062] docker0: port 1(veth7de53b9) entered disabled state
Jul 30 13:15:43 storm kernel: [75998.620148] device veth7de53b9 left promiscuous mode
Jul 30 13:15:43 storm kernel: [75998.620161] docker0: port 1(veth7de53b9) entered disabled state
Jul 30 13:15:43 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/veth7cad9ab, iface: veth7cad9ab)
Jul 30 13:15:43 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/veth7de53b9, iface: veth7de53b9)

But when things don't work, syslog doesn't show anything different:

Jul 30 13:16:43 storm avahi-daemon[668]: Invalid response packet from host 192.168.1.15.
Jul 30 13:17:01 storm CRON[28652]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jul 30 13:17:06 storm wpa_supplicant[1253]: wlan0: CTRL-EVENT-SCAN-STARTED 
Jul 30 13:23:04 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a)
Jul 30 13:23:04 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a): no ifupdown configuration found.
Jul 30 13:23:04 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/vetha3d205a: couldn't determine device driver; ignoring...
Jul 30 13:23:04 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/vethb2caa2c, iface: vethb2caa2c)
Jul 30 13:23:04 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/vethb2caa2c, iface: vethb2caa2c): no ifupdown configuration found.
Jul 30 13:23:04 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/vethb2caa2c: couldn't determine device driver; ignoring...
Jul 30 13:23:04 storm kernel: [76440.036232] device vethb2caa2c entered promiscuous mode
Jul 30 13:23:04 storm kernel: [76440.037058] IPv6: ADDRCONF(NETDEV_UP): vetha3d205a: link is not ready
Jul 30 13:23:04 storm kernel: [76440.040438] IPv6: ADDRCONF(NETDEV_CHANGE): vetha3d205a: link becomes ready
Jul 30 13:20:24 storm avahi-daemon[668]: Invalid response packet from host 192.168.1.15.
Jul 30 13:23:04 storm avahi-daemon[668]: Withdrawing workstation service for vetha3d205a.
Jul 30 13:23:04 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a)
Jul 30 13:23:05 storm kernel: [76441.030999] docker0: port 1(vethb2caa2c) entered forwarding state
Jul 30 13:23:05 storm kernel: [76441.031080] docker0: port 1(vethb2caa2c) entered forwarding state
Jul 30 13:23:05 storm avahi-daemon[668]: Joining mDNS multicast group on interface vethb2caa2c.IPv6 with address fe80::9041:e9ff:feac:f856.
Jul 30 13:23:05 storm avahi-daemon[668]: New relevant interface vethb2caa2c.IPv6 for mDNS.
Jul 30 13:23:05 storm avahi-daemon[668]: Registering new address record for fe80::9041:e9ff:feac:f856 on vethb2caa2c.*.
Jul 30 13:23:17 storm kernel: [76453.180992] docker0: port 1(vethb2caa2c) entered disabled state
Jul 30 13:23:17 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices added (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a)
Jul 30 13:23:17 storm NetworkManager[858]:    SCPlugin-Ifupdown: device added (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a): no ifupdown configuration found.
Jul 30 13:23:17 storm NetworkManager[858]: <warn> /sys/devices/virtual/net/vetha3d205a: couldn't determine device driver; ignoring...
Jul 30 13:23:17 storm avahi-daemon[668]: Interface vethb2caa2c.IPv6 no longer relevant for mDNS.
Jul 30 13:23:17 storm avahi-daemon[668]: Leaving mDNS multicast group on interface vethb2caa2c.IPv6 with address fe80::9041:e9ff:feac:f856.
Jul 30 13:23:17 storm avahi-daemon[668]: Withdrawing address record for fe80::9041:e9ff:feac:f856 on vethb2caa2c.
Jul 30 13:23:17 storm avahi-daemon[668]: Withdrawing workstation service for vetha3d205a.
Jul 30 13:23:17 storm avahi-daemon[668]: Withdrawing workstation service for vethb2caa2c.
Jul 30 13:23:17 storm kernel: [76453.242893] docker0: port 1(vethb2caa2c) entered disabled state
Jul 30 13:23:17 storm kernel: [76453.245111] device vethb2caa2c left promiscuous mode
Jul 30 13:23:17 storm kernel: [76453.245128] docker0: port 1(vethb2caa2c) entered disabled state
Jul 30 13:23:17 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/vetha3d205a, iface: vetha3d205a)
Jul 30 13:23:17 storm NetworkManager[858]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/vethb2caa2c, iface: vethb2caa2c)

No idea what could be going on here. As mentioned earlier, this has been a problem since upgrading to 1.7.1.

The upgrade came from the apt repo that was added to sources.list.d by the first install. I attempted a reinstall, which involved removing the old package (with purge) and then installing the new one via the get.docker.io shell script (the wget-pipe-to-shell trick).

Evidence of this (notice that docker-engine is now installed, but entries for the old packages remain):

akshay@storm:~/go/src$ dpkg-query --list "*docker*"
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                                             Version                               Architecture                          Description
+++-================================================================-=====================================-=====================================-======================================================================================================================================
un  docker                                                           <none>                                <none>                                (no description available)
ii  docker-engine                                                    1.7.1-0~trusty                        amd64                                 Docker: the open-source application container engine
un  docker.io                                                        <none>                                <none>                                (no description available)
un  lxc-docker                                                       <none>                                <none>                                (no description available)
un  lxc-docker-virtual-package                                       <none>                                <none>                                (no description available)

amoghe commented Jul 30, 2015

#ENEEDMOREINFO

Hmm... at this point I'm not sure whether this is a bug in docker or some interop issue. I've dumped everything I know about it in the first comment on this bug. Happy to provide any additional info as required.

phemmer (Contributor) commented Jul 31, 2015

That iptables output is pretty bare. Sounds like you have something on your box which is wiping the iptables rules.

amoghe commented Sep 10, 2015

@phemmer - that iptables output is the same as when things are working fine (i.e. it doesn't change when the containers lose network connectivity).

Also confirmed on a separate machine (which is running docker 1.5 and not exhibiting this behavior) that the output of iptables -L is the same.

LK4D4 (Contributor) commented Sep 10, 2015

I have these problems too. Sometimes it's random, but it happens 100% of the time after switching from wifi to wired and back.
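(A debugging suggestion, not something confirmed in this thread: one way to tell whether the wifi/wired switch actually changes the firewall rules is to snapshot the full rule set before and after and diff the two files.)

```shell
# Snapshot all tables (filter, nat, mangle, ...) before the network switch
sudo iptables-save > /tmp/rules.before

# ... switch from wifi to wired (or suspend/resume) ...

# Snapshot again and compare; any flushed or rewritten rule shows up here
sudo iptables-save > /tmp/rules.after
diff -u /tmp/rules.before /tmp/rules.after
```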


amoghe commented Sep 18, 2015

More information...

I took the following tcpdumps on the bridge interfaces while running ping inside the container.

The vethXxx interface:

akshay@storm:~/go/src$ sudo tcpdump -ni vethcec0e76 
tcpdump: WARNING: vethcec0e76: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vethcec0e76, link-type EN10MB (Ethernet), capture size 65535 bytes
16:36:54.639081 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 85, length 64
16:36:55.646940 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 86, length 64
16:36:56.654943 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 87, length 64
16:36:57.663137 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 88, length 64
16:36:58.671088 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 89, length 64

On docker0:

akshay@storm:~/go/src$ sudo tcpdump -ni docker0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:36:54.639081 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 85, length 64
16:36:55.646940 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 86, length 64
16:36:56.654943 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 87, length 64
16:36:57.663137 IP 172.17.0.2 > 8.8.8.8: ICMP echo request, id 183, seq 88, length 64

But on eth0, nothing:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

Seems like packets reach the docker0 interface, but do not make it to eth0. What gives?
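A few things that could explain traffic dying between docker0 and eth0 (a checklist sketch, not confirmed as the cause here):

```shell
# 1. IP forwarding must be enabled on the host; this should print 1
cat /proc/sys/net/ipv4/ip_forward

# 2. The nat table must still contain Docker's MASQUERADE rule
#    for the bridge subnet (172.17.0.0/16 by default)
sudo iptables -t nat -S POSTROUTING

# 3. The FORWARD chain must still accept bridged traffic
sudo iptables -S FORWARD
```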

amoghe commented Sep 18, 2015

I read through #13381; it seems related, but doesn't help much.

kennbrodhagen commented Nov 25, 2015

I think I experienced this problem recently, too. I just upgraded from docker 1.5-ish and now have the following versions:
Docker version 1.9.1, build a34a1d5
docker-compose version: 1.5.1
docker-machine version 0.5.1 (HEAD)
VirtualBox 4.3.22 r98236
Mac OS X 10.10.5 (Yosemite)

I had spun up a docker-compose.yml that I've been using since the earlier version of docker. I did some work in the office and then went home, where I later resumed my work. I just closed my MacBook lid and opened it later; no shutdowns, restarts, etc.

In earlier versions of docker I wouldn't notice anything at this point; it would just work. This time the network quit working, and I used docker exec to confirm that the container created by compose could no longer resolve hostnames. I tried docker-compose rm and rebuilt all the containers, but still no luck.

Finally, I did a docker-machine rm -f to delete the VirtualBox VM and then rebuilt it along with all the containers again. This fixed my issue, so I'm up and running now.

I'm glad I have a workaround, but it's still a pain to have to delete and rebuild the VM for docker whenever the network changes. I anticipate having issues when I connect/disconnect from VPN, etc., but I haven't confirmed this empirically yet.

I'm happy to run any troubleshooting steps anyone has. I'll give the iptables and docker -D info a shot if/when the issue comes back. FWIW here is what I get now in a working state:

(from the MacBook)
% docker -D info
DEBU[0000] Trusting certs with subjects: [010U kbrodhagen]
Containers: 5
Images: 87
Server Version: 1.9.1
Storage Driver: aufs
Root Dir: /mnt/sda1/var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 97
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.13-boot2docker
Operating System: Boot2Docker 1.9.1 (TCL 6.4.1); master : cef800b - Fri Nov 20 19:33:59 UTC 2015
CPUs: 1
Total Memory: 3.859 GiB
Name: dev
ID: 6BCF:5VG4:TI6F:KQ2P:DFVN:LUSY:CQDP:5CAS:MNML:J3AJ:XDVT:GVWA
Debug mode (server): true
File Descriptors: 43
Goroutines: 96
System Time: 2015-11-25T16:01:49.375399742Z
EventsListeners: 0
Init SHA1:
Init Path: /usr/local/bin/docker
Docker Root Dir: /mnt/sda1/var/lib/docker
Labels:
provider=virtualbox

(From the docker virtualbox host via docker-machine ssh):
$ docker -D info
Containers: 5
Images: 87
Server Version: 1.9.1
Storage Driver: aufs
Root Dir: /mnt/sda1/var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 97
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.13-boot2docker
Operating System: Boot2Docker 1.9.1 (TCL 6.4.1); master : cef800b - Fri Nov 20 19:33:59 UTC 2015
CPUs: 1
Total Memory: 3.859 GiB
Name: dev
ID: 6BCF:5VG4:TI6F:KQ2P:DFVN:LUSY:CQDP:5CAS:MNML:J3AJ:XDVT:GVWA
Debug mode (server): true
File Descriptors: 43
Goroutines: 96
System Time: 2015-11-25T16:03:34.865937776Z
EventsListeners: 0
Init SHA1:
Init Path: /usr/local/bin/docker
Docker Root Dir: /mnt/sda1/var/lib/docker
Labels:
provider=virtualbox

$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain DOCKER (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere 172.17.0.2 tcp dpt:postgres
ACCEPT tcp -- anywhere 172.17.0.4 tcp dpt:2181
ACCEPT tcp -- anywhere 172.17.0.5 tcp dpt:webcache

thaJeztah (Member) commented Nov 25, 2015

@kennbrodhagen just in case it helps; DNS resolution no longer working could also be related to docker/machine#1857

lukebennett commented Nov 28, 2015

Yep, I have this too. I re-run the following to fix it (even though the rule still shows up in iptables output without it):

iptables -t nat -A POSTROUTING ! -o docker0 -s 172.17.0.0/16 -j MASQUERADE

Docker 1.9.1 (was previously on 1.8.2) running on Ubuntu 14.10
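For anyone scripting that workaround, it can be made idempotent with iptables -C, which checks whether an identical rule already exists (a sketch assuming the default 172.17.0.0/16 bridge subnet):

```shell
#!/bin/sh
# Append the MASQUERADE rule only if an identical one is not already present.
# $RULE is left unquoted deliberately so the shell splits it into arguments.
RULE='! -o docker0 -s 172.17.0.0/16 -j MASQUERADE'
if ! sudo iptables -t nat -C POSTROUTING $RULE 2>/dev/null; then
    sudo iptables -t nat -A POSTROUTING $RULE
fi
```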

nilesh1013 commented Feb 2, 2016

Guys, I was also facing the same issue on production servers. Somehow the iptables rules were getting flushed daily.

Reason:
apf (advanced policy firewall) was installed on my Ubuntu server, and there was a daily cron job to restart it. When apf gets restarted, it flushes all iptables rules.

Solution:

Set SET_FASTLOAD="1" in /etc/apf/conf.apf and restart apf.

What this option does is the following (as described in the apf config file):
The fast load feature makes use of the iptables-save/restore facilities to do a snapshot save of the current firewall rules on an APF stop; then, when APF is instructed to start again, it will restore the snapshot. This feature allows APF to load hundreds of rules back into the firewall without the need to regenerate every firewall entry.

It took me 2 days to figure out; I hope it'll help someone. Thanks!
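The same snapshot/restore mechanism can also be done by hand if you'd rather not rely on APF's option (a sketch; the file path is just a common convention):

```shell
# Snapshot the current rules (all tables) to a file
sudo iptables-save > /etc/iptables.rules

# Later, after something has flushed the tables, restore the snapshot
# instead of regenerating every rule
sudo iptables-restore < /etc/iptables.rules
```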
