
Connecting container to multiple bridge networks breaks port forwarding from external IPs #21741

Closed
bmerry opened this issue Apr 4, 2016 · 19 comments


@bmerry

bmerry commented Apr 4, 2016

It may just be a problem on my machine, but creating a container that sits on two networks seems to interfere with port forwarding. The forwarding works when accessing 127.0.0.1, but not when accessing the IP address of another interface.

Output of docker version:

Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 68
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 304
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-83-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.36 GiB
Name: kryton
ID: MFLN:XRFL:372N:VPKR:3AWK:USXB:3Q3E:EAH2:66ZK:T5U2:NEOP:TSCH
Username: bmerry
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.):

Physical machine

Steps to reproduce the issue:

1. Create a docker-compose.yml file with the following content:
version: "2"
services:
    server:
        image: nginx:1.9.12
        networks:
            - front
            - back
        ports:
            - "8080:80"
networks:
    front:
    back:

In actual use there would be other services connected to the back network but not the front network, but they're not necessary to demonstrate the bug.
2. With docker-compose 1.6.2, run docker-compose up.
3. From the host, run curl http://localhost:8080.
4. From the host, run curl http://IPADDRESS:8080, where IPADDRESS is an IP address of a non-local interface on the machine.

Describe the results you received:
Step 3 spits out an HTML page from nginx. Step 4 outputs nothing and eventually times out.

Describe the results you expected:
Step 4 should return the same HTML page as step 3.

Additional information you deem important (e.g. issue happens only occasionally):
I'm not running any other firewall software on this machine. If I remove the config line putting the service on the back network, then the problem disappears. Similarly, if I run docker network disconnect to disconnect the container from the back network, the problem disappears, and reconnecting it makes the problem come back.
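
For reference, the disconnect/reconnect test looks roughly like this (the container and network names below are placeholders for whatever Compose generated from my project directory, so they will differ on other setups):

# placeholders: "myproject" stands in for the Compose project name
docker network disconnect myproject_back myproject_server_1   # forwarding from external IPs starts working
docker network connect myproject_back myproject_server_1      # forwarding from external IPs breaks again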

Output of iptables -vnL:

Chain INPUT (policy ACCEPT 4912 packets, 2311K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER     all  --  *      br-b81344fadd68  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      br-b81344fadd68  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  br-b81344fadd68 !br-b81344fadd68  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  br-b81344fadd68 br-b81344fadd68  0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER     all  --  *      br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  br-a4d09867c7ea !br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  br-a4d09867c7ea br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 4538 packets, 674K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (3 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:5000
    0     0 ACCEPT     tcp  --  !br-a4d09867c7ea br-a4d09867c7ea  0.0.0.0/0            172.18.0.2           tcp dpt:80

Chain DOCKER-ISOLATION (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  br-a4d09867c7ea br-b81344fadd68  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  br-b81344fadd68 br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  docker0 br-b81344fadd68  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  br-b81344fadd68 docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  docker0 br-a4d09867c7ea  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  br-a4d09867c7ea docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Output of iptables -t nat -vnL:

Chain PREROUTING (policy ACCEPT 8 packets, 536 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 2 packets, 272 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 528 packets, 34127 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    1    60 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 529 packets, 34187 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  all  --  *      !br-b81344fadd68  172.19.0.0/16        0.0.0.0/0           
    0     0 MASQUERADE  all  --  *      !br-a4d09867c7ea  172.18.0.0/16        0.0.0.0/0           
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           
    0     0 MASQUERADE  tcp  --  *      *       172.17.0.2           172.17.0.2           tcp dpt:5000
    0     0 MASQUERADE  tcp  --  *      *       172.18.0.2           172.18.0.2           tcp dpt:80

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  br-b81344fadd68 *       0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  br-a4d09867c7ea *       0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:5000 to:172.17.0.2:5000
    1    60 DNAT       tcp  --  !br-a4d09867c7ea *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.18.0.2:80
@aboch
Contributor

aboch commented Apr 4, 2016

@bmerry

I am trying to create a Docker network that allows my containers to talk to each other, but which keeps the backend containers completely isolated from the host and the internet (in both directions)

It seems what you need is the ability to define the back network as internal as you would do if you created it manually via docker network create --internal back.
But I think this is not supported by compose yet.
As a workaround, can you try to specify the driver option com.docker.network.internal? It may work.
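
Untested sketch of what that could look like in a version 2 compose file, assuming Compose passes driver_opts through to the driver (the exact option key may differ):

networks:
    front:
    back:
        driver: bridge
        driver_opts:
            com.docker.network.internal: "true"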

@bmerry bmerry changed the title from "enable_ip_masquerade: false breaks external port forwarding on other networks" to "Connecting container to multiple bridge networks breaks port forwarding from external IPs" on Apr 4, 2016
@bmerry
Author

bmerry commented Apr 4, 2016

It now seems that it's not the enable_ip_masquerade option that's the problem: even two networks without any driver options seem to trigger it. I've updated the bug to reflect this.

I'm slightly suspicious that maybe this is due to something sticky (since I could have sworn it worked when I first tried it), but the behaviour persists even after deleting the networks and containers and rebooting the machine. It would be great if someone could confirm the behaviour I'm seeing.

@bmerry
Author

bmerry commented Apr 4, 2016

Running the daemon with --userland-proxy=false seems to fix the problem.
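
For the record, on Ubuntu 14.04 I set it roughly like this (the file location is distro-specific, so treat this as a sketch):

# /etc/default/docker
DOCKER_OPTS="--userland-proxy=false"

and then restarted the daemon with sudo service docker restart.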

@josephearl

Having the same problem. In my case (running on CentOS 7 on AWS) --userland-proxy=false did not solve the issue.
My daemon options:

[Service]
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --userland-proxy=false --storage-driver devicemapper --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=amazonec2 
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
Environment=

@oben59

oben59 commented Apr 11, 2016

I also have the same issue. I found out that it seems to be an intermittent problem: restarting the compose project many times eventually made the port forwarding work.
I'm on Ubuntu 15.10.

@railnet

railnet commented Apr 13, 2016

Same issue for me too.
It seems the port mapping is applied randomly to one of the two IP addresses of the container.
See below my docker-compose.yml file and two tests, with the related iptables content taken on the host.
It seems clear that my system works only when a specific network interface of the container named "forwarder" is used for the port mapping on the host (172.18.0.1 vs 170.20.0.1).

The com.docker.network.internal workaround doesn't seem to work with my Compose version.

Any suggestion?
Any inconsistent configuration in my docker-compose.yml?

Proposal for the next Compose release: in the "ports" key, allow specifying HOST_PORT:CONTAINER_NETWORK:CONTAINER_PORT too.
e.g. in my case:
ports:
- "4189:front:9999"

docker info
Server Version: 1.10.3
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 3.13.0-68-generic
Operating System: Ubuntu 14.04.3 LTS

docker-compose version 1.7.0rc1, build 1ad8866

docker-compose.yml file:

version: '2'

services:
    inter_container1:
        # ... it has the gw to 170.20.0.1
        networks:
            back:
                ipv4_address: 170.20.0.11

    inter_container2:
        # ... it has the gw to 170.20.0.1
        networks:
            back:
                ipv4_address: 170.20.0.12

    forwarder:
        image: ...
        # ...
        hostname: Forwarder
        networks:
            front:
                ipv4_address: 172.18.0.1
            back:
                ipv4_address: 170.20.0.1
        ports:
            - "4189:9999"
            - "9811:9181"
            - "9812:9281"
            - "9819:9981"
            - "9501:9105"
            - "9502:9205"
            - "9509:9905"

networks:
    front:
        driver: bridge
        driver_opts:
            com.docker.network.bridge.host_binding_ipv4: "172.18.0.254"
            com.docker.network.bridge.enable_ip_masquerade: "true"
        ipam:
            driver: default
            config:
                - subnet: 172.18.0.0/24
                  gateway: 172.18.0.254
    back:
        driver: bridge
        driver_opts:
            com.docker.network.bridge.enable_ip_masquerade: "false"
            com.docker.network.internal: "true"
        ipam:
            driver: default
            config:
                - subnet: 170.20.0.0/24
                  gateway: 170.20.0.254

Test1 (when it is broken):

>sudo iptables -t nat -L

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.18.0.0/24        anywhere            
MASQUERADE  all  --  <my_company_net>     anywhere            
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9981
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9905
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9999
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9281
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9181
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9205
MASQUERADE  tcp  --  170.20.0.1           170.20.0.1           tcp dpt:9105

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9819 to:170.20.0.1:9981
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9509 to:170.20.0.1:9905
DNAT       tcp  --  anywhere             anywhere             tcp dpt:4189 to:170.20.0.1:9999
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9812 to:170.20.0.1:9281
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9811 to:170.20.0.1:9181
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9502 to:170.20.0.1:9205
DNAT       tcp  --  anywhere             anywhere             tcp dpt:9501 to:170.20.0.1:9105

Test2 (when it works)

>sudo iptables -t nat -L

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.18.0.0/24        anywhere            
MASQUERADE  all  --  <my_company_net>     anywhere            
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9981
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9281
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9999
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9905
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9205
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9181
MASQUERADE  tcp  --  172.18.0.1           172.18.0.1           tcp dpt:9105

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9819 to:172.18.0.1:9981
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9812 to:172.18.0.1:9281
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:4189 to:172.18.0.1:9999
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9509 to:172.18.0.1:9905
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9502 to:172.18.0.1:9205
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9811 to:172.18.0.1:9181
DNAT       tcp  --  anywhere             172.18.0.254         tcp dpt:9501 to:172.18.0.1:9105

@choobs-dev

As @railnet pointed out I have the same issue (docker/compose#3318).

@railnet

railnet commented Apr 13, 2016

It seems to me there is an active discussion on similar symptoms here too:
docker/compose#3055
@ubi-US @dnephin @mavenugo @linfan @darkermatter @bmerry @aboch @josephearl @oben59 @choobs-dev

Can you confirm I'm right?
I'm available for further investigation in order to support you with the fix.

@railnet

railnet commented Apr 13, 2016

I just did a test following the approach reported below, but it doesn't solve the problem.

As reported here:
https://docs.docker.com/v1.7/articles/networking/#binding-ports
... "Or if you always want Docker port forwards to bind to one specific IP address, you can edit your system-wide Docker server settings and add the option --ip=IP_ADDRESS. Remember to restart your Docker server after editing this setting."

Then I configured --ip=172.18.0.1 and restarted the Docker daemon, but the iptables rules on the host are not consistent with that setting.
Ref.
...
DNAT tcp -- anywhere anywhere tcp dpt:9819 to:170.20.0.1:9981
...

Refer to my previous post for more details.
Note that the issue concerns a Compose scenario.

@bmerry
Author

bmerry commented Apr 14, 2016

@railnet I've not actually tried it, but my understanding of --ip is that it specifies a host IP address to bind to, i.e., only external connections coming in through that host interface will be forwarded.
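
In other words, something like this (a sketch I haven't actually run; 192.168.1.10 stands in for a real host interface address):

# published ports then bind to 192.168.1.10:PORT instead of 0.0.0.0:PORT
docker daemon --ip=192.168.1.10
docker run -d -p 8080:80 nginx:1.9.12

So passing a container-side address such as 172.18.0.1 to --ip is probably not what you want.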

@choobs-dev

Just tried with docker version 1.11.0 and now it works. Will be doing more testing to make sure it's not transient.

@railnet

railnet commented Apr 14, 2016

Hi @choobs-dev, unfortunately I just verified that the issue is still present with the following versions:
docker version
Client:
Version: 1.11.0
API version: 1.23
Go version: go1.5.4
Git commit: 4dc5990
Built: Wed Apr 13 18:34:23 2016
OS/Arch: linux/amd64

Server:
Version: 1.11.0
API version: 1.23
Go version: go1.5.4
Git commit: 4dc5990
Built: Wed Apr 13 18:34:23 2016
OS/Arch: linux/amd64

docker-compose version 1.7.0, build 0d7bf73

@aboch
Contributor

aboch commented Apr 14, 2016

I am not sure about the specific issue you are hitting or about a workaround in the context of Compose.
I am adding some info and history about the port-mapping and default gateway behavior. Hope it helps.

When a container connects to multiple networks providing external connectivity (e.g. more than one bridge network), the container's default gateway is dynamically chosen among the ones provided by the networks based on a certain rule.

The programming of the port mapping has to be coherent with the container's chosen default gateway; therefore it needs to change if the default gateway changes.

1.11 finally respects this.
1.10 did not have the needed changes yet, so it had a workaround: it programmed the port mapping only for the network specified in the docker run command and did not switch the container's default gateway on a new network connect.
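
If you want to check which network won for a given container, something along these lines should show it (a sketch; <container> is a placeholder, and it assumes iproute2 is available inside the image):

# the default route tells you which network's gateway was programmed
docker exec <container> ip route show default

# the DNAT rule for the published port should point at the container's address on that same network
sudo iptables -t nat -nL DOCKER | grep DNAT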

@railnet

railnet commented Apr 14, 2016

Hi @aboch,
first of all, thanks for the info provided.
As the specific container is connected to two networks, the gateways for those two networks have to be configured.
For the "front" network, it is easy to say that the gateway will be the interface towards the host.
For the "back" network, we want the gateway to be the container itself, so we configured the container's IP interface on the back network as the back gateway.
To be honest, in my scenario I'm configuring the specific container's default gateway for the back network via an explicit command sent to it during the docker-compose launch (the "command:" entry in the docker-compose.yml file). I took this approach because I didn't find an alternative solution; in other words, how can I configure the container's default gateway in the docker-compose.yml file?
Can you clarify your sentence "the container's default gateway is dynamically chosen among the ones provided by the networks based on a certain rule"?

Anyway, the principle Docker 1.11 uses to choose the port mapping is not clear to me.
Why is it choosing the "back" network of my container and not the "front" one?
Do you mean that if a container becomes the gateway of a specific network, then Docker chooses the container interface attached to that network in order to assign the port mapping?
How can I "prepare" a container that has to work as a front-end in that case?

Many Thanks in advance.

@choobs-dev is included in this post too, because we are discussing Compose and he opened an issue on it.
Bye

@aboch
Contributor

aboch commented Apr 14, 2016

@railnet

Can you clarify your sentence "the container's default gateway is dynamically chosen among the ones provided by the networks based on a certain rule"?

From the container's perspective, being connected to multiple networks is like a server with multiple NICs. In such a case you end up with multiple default gateways in the host routing table, but only one will be used: the one on top of the table. The end result is the same for a container, where libnetwork makes sure only one default gateway is programmed.

Can you clarify your sentence "the container's default gateway is dynamically chosen among the ones provided by the networks based on a certain rule"?

libnetwork chooses the default gateway for the container based on the priority associated with the network attachment point (endpoint). Given that the UI does not yet provide a way for the user to set this priority, we fall back to the default logic, which is choosing the first network in alphabetical order... That is why it is choosing the back network instead of the front one.

Anyway, the principle Docker 1.11 uses to choose the port mapping is not clear to me.

It is based on the container's current default gateway.

To be honest, in my scenario I'm configuring the specific container's default gateway for the back network

I am not sure I understand this. You can configure multiple routes, but the default is the default (0.0.0.0/0). If more default routes are specified, only the one on top will be used, and the order is dictated by their metric: the lowest metric comes first.
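
As an illustration (made-up addresses and interface names):

# two default routes can coexist when their metrics differ; the lowest metric wins
ip route add default via 172.18.0.254 dev eth0 metric 100
ip route add default via 170.20.0.254 dev eth1 metric 200
ip route show default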

Do you mean that if a container becomes the gateway of a specific network

A container never becomes the gateway for a network. The network has its own gateway (for a bridge network it is the bridge interface). When you connect a container to a network, the network driver provides libnetwork with the default gateway to be programmed for the container. In most cases the network gateway is the one returned, unless you play with driver options and tell the bridge driver to return a custom chosen IP.

How I can "prepare" a container that has to work as front-end in that case?

Please try playing with the network names, e.g. 0front and 1back.
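
For example, in your compose file the renaming could look roughly like this (a sketch; everything other than the network names stays the same as in your file, which I have elided here):

services:
    forwarder:
        networks:
            0front:
                ipv4_address: 172.18.0.1
            1back:
                ipv4_address: 170.20.0.1

networks:
    0front:
        # same driver, driver_opts and ipam as "front" above
    1back:
        # same driver, driver_opts and ipam as "back" above

Since "0front" sorts before "1back", the front network should win the default gateway selection and the port mappings should follow it.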

@railnet

railnet commented Apr 14, 2016

Hi @aboch
my previous post was open to misinterpretation, but your intro was clear, I agree with it, and it highlighted a perfect alignment on the expected goal.
I have no comments about the explanation.

The core of the post is obviously the sentence "we fall back to the default logic, which is choosing the first network in alphabetical order".
Lack of knowledge of this part was the root cause of my failure.

Now I can confirm that this solves it for me, so the issue can be closed from my point of view.
Your post was a guiding light for me. :-)
Just verified the new configuration and it works as expected.
docker-compose version 1.7.0
docker engine 1.11.0

Issue solved.

Have a great time

@aboch
Contributor

aboch commented Apr 14, 2016

@railnet
Thank you for verifying! It's very good news that we could get you unblocked.

The default gateway pick-up logic has been in place since Docker started supporting containers connecting to multiple networks. Clearly this has not been spelled out enough, even though there are some issues opened in docker/docker and libnetwork. So the blame is on us. We must specify it in the documentation if it is not already present.

@bmerry
Author

bmerry commented Apr 18, 2016

I can confirm that listing the networks in lexicographical order in docker-compose.yml works with Docker 1.10.3 and Compose 1.6.2. Since @aboch has posted #22086 to address the documentation issue, I'm going to close this.
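
Concretely, as I read it, the only change needed to the file from my original report is the order of the networks under the service (sketch of the relevant fragment):

services:
    server:
        image: nginx:1.9.12
        networks:
            - back
            - front
        ports:
            - "8080:80"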

@bmerry bmerry closed this as completed Apr 18, 2016
@dustymabe

This is my attempt at an explanation of this problem:
http://dustymabe.com/2016/05/25/non-deterministic-docker-networking-and-source-based-ip-routing/
