No network connectivity in some docker containers after upgrade to 1153.0.0 #1554
Comments
I was able to reproduce this with Docker 1.11.2 on Linux 4.7.1 and 4.6.3, but was unable to reproduce with Docker 1.10.3. |
@crawford did you reproduce generically? or only in QEMU/KVM? |
@bryanlatten I've been reproducing it with QEMU. I haven't tried other platforms. |
I have the same issue: Docker doesn't attach the veth interface to the docker0 bridge. Restarting the daemon helps, as does manually attaching the interface. docker info:
os-release:
|
Linking moby/moby#26492 |
This is what I have observed so far from trying to test and narrow it down to a problematic component: Docker containers' network links randomly fail to have their master set. This happens with Docker in CoreOS alpha and beta. It continues to fail when booting a kernel from stable and the user space from alpha or beta. It does not fail with alpha or beta kernels and stable user spaces. It fails whether Docker is built with Go 1.6 or 1.7. It fails with all Project Atomic patches applied. It fails when patching libnetwork to just use an |
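For anyone trying to confirm the symptom described above (a container link whose master is never set), here is a minimal sketch that lists such links. It assumes the standard Linux sysfs layout, where a link attached to a bridge exposes a `master` symlink under `/sys/class/net/<name>/`. The function name `links_without_master` and the `veth` prefix filter are illustrative choices, not something taken from Docker or libnetwork.

```python
import os


def links_without_master(sysfs_root="/sys/class/net", prefix="veth"):
    """Return names of links starting with `prefix` that have no master set.

    Assumption: an enslaved link has a `master` symlink in its sysfs
    directory pointing at the bridge device (for example docker0).
    """
    orphans = []
    for name in sorted(os.listdir(sysfs_root)):
        if not name.startswith(prefix):
            continue
        # lexists() is true for a symlink even if its target is unreadable.
        if not os.path.lexists(os.path.join(sysfs_root, name, "master")):
            orphans.append(name)
    return orphans
```

Calling `links_without_master()` on an affected host should return the host-side veth names that were never attached to a bridge; an empty list means every veth currently has a master.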
I'll add that we're experiencing the same issue using CoreOS Beta (1153.4.0) running in AWS:
I used a container that simply |
Proof-of-concept fix is here: dm0-/libnetwork@4343ba4c21f1a121f9e867efda3231a61dc5565e. Waiting for confirmation from upstream. |
I believe I have a (rather unfortunate) workaround for people who can't run a patched Docker: stop/mask |
This was fixed with coreos/coreos-overlay@874c1b8 and coreos/docker#29 and should roll out in the next Alpha. Assuming nothing goes wrong, we'll backport this to Docker 1.11.2 in Beta in the coming weeks. |
This is now available in Stable. /cc @bryanlatten |
I'm interested why you don't modify the systemd-networkd config to avoid matching on these interfaces? |
Well, it's not possible to exactly specify what you don't want, but you could write some rules like 'Name=eth*' to match what you do want. Maybe too hard to cover all bases? |
There isn't a way to match all ethernet devices. The names use the persistent naming scheme (so they won't be |
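To illustrate the matching problem discussed in the last two comments, here is a rough sketch of how a glob such as `eth*` behaves against a set of interface names. It uses Python's `fnmatch` as a stand-in for systemd-networkd's shell-style glob matching, which is an approximation. The interface names and the helper `matched_by` are hypothetical examples, not taken from an affected machine.

```python
from fnmatch import fnmatchcase


def matched_by(globs, names):
    """Return the names matched by at least one shell-style glob."""
    return [n for n in names if any(fnmatchcase(n, g) for g in globs)]


# Hypothetical names: one legacy name, two persistent-scheme names,
# the Docker bridge, and a container veth.
names = ["eth0", "ens3", "enp0s3", "docker0", "veth1a2b3c"]

print(matched_by(["eth*"], names))  # ['eth0']
print(matched_by(["*"], names))     # every name, including the veth
```

Under these assumptions, `eth*` misses the persistently named devices, while a catch-all glob also picks up the Docker-managed links, which is the trade-off raised above.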
ATM this seems to be a regression with CoreOS stable 1122.3.0 -> 1185.3.0 breaking Weave for some users. Should we have a separate issue to track that somewhere? |
I note that systemd/systemd#4228 has now been merged and the upstream Docker PR moby/libnetwork#1450 was rejected. Do you have a CoreOS issue to make use of the new systemd feature? |
Issue Report
Bug
After upgrading from stable (1068.10.0) to alpha (1153.0.0), some freshly submitted fleet services start containers that do not have network connectivity. From within the container I cannot ping the docker bridge (default gateway address) or any other containers or host interfaces. A simple "docker restart" of the affected container resolves the issue.
CoreOS Version
Environment
QEMU/KVM on Openstack
Expected Behavior
All docker containers have network connectivity, as it has always been in the past.
Actual Behavior
A few containers cannot reach any host; they cannot even ping the default gateway (docker0).
Reproduction Steps
Other Information
This happens across different docker images that are based on different distributions, so I don't think it is related to the image.
docker inspect shows no difference before (when networking is down) and after a restart of the container (when networking works again). I'm passing these options to docker: