Cannot connect from Docker containers to the outside #1936

Closed
pietervisser opened this Issue Apr 26, 2017 · 8 comments

Issue Report

Bug

Since the update to 1353.6.0 we experience network issues when trying to ping from a docker container to the outside.

Container Linux Version

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1353.6.0
VERSION_ID=1353.6.0
BUILD_ID=2017-04-25-0215
PRETTY_NAME="Container Linux by CoreOS 1353.6.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"

Environment

Google Cloud

Expected Behavior

Pinging 8.8.8.8 from a Docker container attached to a Docker network should result in no packet loss.

Actual Behavior

After creating a new Docker network, containers attached to it sometimes cannot make any connections to the outside.

  1. Create a docker network
  2. Run a container and ping to the outside
  3. Remove the network

Repeat this several times; some runs show 100% packet loss.

docker network create foo > /dev/null; docker run --rm --net foo busybox ping -c 1 -w 1 -q 8.8.8.8; docker network rm foo > /dev/null

pietervisser changed the title from "Random network issues with Docker network" to "Cannot connect from Docker containers to the outside" on Apr 26, 2017

lucab Apr 26, 2017

Member

@pietervisser I guess you are running the above command in a while loop. Can you please check whether you can reproduce the same issue with unique network names (e.g. foo$i with a monotonically increasing index, instead of foo)? I know there are some potential races in network creation and I'm not sure if you are hitting that or something else.

pietervisser Apr 26, 2017

@lucab, thanks, but no, I'm not running it in a while loop. Running it manually just a couple of times is enough to hit connection issues. To be sure, I used unique network names and can still reproduce the issue. You could use this loop to reproduce it:

for i in {1..10}; do docker network create foo$i > /dev/null; docker run --rm --net foo$i busybox ping -c 1 -w 1 -q 8.8.8.8; docker network rm foo$i > /dev/null; done
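
The loop above can also be wrapped in a small script that counts how many iterations lose connectivity, which makes the failure rate easier to compare between releases. This is only a sketch: the helper names ping_once and count_failures are hypothetical, and the network name prefix and iteration count are arbitrary. The docker and ping invocations are the same as in the loop.

```shell
#!/bin/sh
# Sketch only: ping_once and count_failures are hypothetical helper names.

# Create network $1, ping 8.8.8.8 once from a busybox container on it,
# then remove the network. Returns the ping exit status (0 = reply received).
ping_once() {
    net="$1"
    docker network create "$net" > /dev/null || return 2
    docker run --rm --net "$net" busybox ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
    status=$?
    docker network rm "$net" > /dev/null
    return "$status"
}

# Run the probe command $2 (default: ping_once) against networks
# foo1..foo$1 and print how many iterations failed.
count_failures() {
    total="$1"
    probe="${2:-ping_once}"
    failures=0
    i=1
    while [ "$i" -le "$total" ]; do
        if ! "$probe" "foo$i"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$total iterations failed"
}
```

After sourcing the script, `count_failures 10` runs the same ten iterations as the loop and prints a one-line summary.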

bsphere Apr 26, 2017

We also experience the same thing on AWS.

Re-running ping multiple times (inside a container on a user-created network) sometimes works and sometimes doesn't. I don't have to re-create the network.

This update screwed up a ~40-node nomadproject cluster for us.

dm0- Apr 26, 2017

Member

Can you try this and see if it makes a difference? Copy /usr/lib/systemd/network/50-docker.network into /etc/systemd/network, and change the Match=docker* line to Match=docker* br-*. Then run sudo systemctl restart systemd-networkd. Does that fix the issues?
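
The steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the official fix: the helper name patch_docker_network is hypothetical, and it assumes the line in question is written as Name=docker* inside the file's [Match] section, since systemd .network files put match keys under [Match]. If the shipped file differs, the sed expression needs adjusting.

```shell
#!/bin/sh
# Sketch only: patch_docker_network is a hypothetical helper name.
# Assumption: the unit contains a line "Name=docker*" in its [Match] section.

# Copy the network unit $1 to $2, extending the docker* match to also
# cover the br-* bridges created by "docker network create".
patch_docker_network() {
    src="$1"
    dst="$2"
    sed -e 's/^Name=docker\*[[:space:]]*$/Name=docker* br-*/' "$src" > "$dst"
}

# Possible usage on the host (run as root), not executed here:
#   patch_docker_network /usr/lib/systemd/network/50-docker.network \
#                        /etc/systemd/network/50-docker.network
#   systemctl restart systemd-networkd
```

Writing the copy to /etc/systemd/network leaves the file under /usr/lib untouched, so removing the copy reverts the change.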

bsphere Apr 26, 2017

@dm0- seems like it does make a difference, but I had to reboot for this to work.

dm0- Apr 26, 2017

Member

We'll build a new stable release with that fix; it should be out within the next day.

euank Apr 26, 2017

Member

To clarify the actual issue: this is basically a redux of #1554, but for the bridge interfaces docker network creates rather than the default docker0 interface.

I'm not sure of the exact root cause. I can't reproduce this on the old stable, and the docker version there was identical.
It seems like something about networkd in systemd v233 changed and caused it to break these bridges, but I'm not totally sure!
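
One way to see how networkd is treating the Docker bridges is to filter the output of networkctl. The helper below is a hedged sketch: bridge_states is a hypothetical name, and it assumes the usual networkctl list layout in which the link name is the second column and the setup state is the last. It only reports the state and makes no claim about which state is correct.

```shell
#!/bin/sh
# Sketch only: bridge_states is a hypothetical helper name.
# Assumption: input is "networkctl list" output, with the link name in
# column 2 and the SETUP state in the last column.

# Print "<link> <setup-state>" for docker* and br-* links read from stdin.
bridge_states() {
    awk '$2 ~ /^(docker|br-)/ { print $2, $NF }'
}

# Possible usage on the host, not executed here:
#   networkctl list --no-pager | bridge_states
```

Running it before and after creating a network with docker network create shows whether the new br-* bridge appears and what setup state networkd reports for it.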

We'll add a test for the bridges that docker network create sets up, to make sure this doesn't regress again; for the previous issue we only tested that it was fixed on docker0, and so missed this.

Thanks for reporting.

polygox Apr 28, 2017

I am experiencing network problems with the current version 1353.7.0 which did not occur before (at least not before 1353.6.0, I am not sure when this happened first).

I use docker-compose for starting some apps and a proxy.
Two bridge networks are defined in the compose file.

When the CoreOS server is restarted (or the networks are removed), the apps are accessible from the outside after docker-compose up (the networks are created in this case). But after stopping all app/proxy containers and starting them again, they are no longer accessible. Could this be related to this issue?
