stop/rm container did not remove its network namespace after running docker network disconnect cmd to disconnect bridge network #31597

Closed
ryanlyy opened this Issue Mar 7, 2017 · 15 comments


ryanlyy commented Mar 7, 2017


BUG REPORT INFORMATION

Description
In a multi-NIC environment, after running docker network disconnect to disconnect one NIC and then stopping the container, the IP address on the other NIC is still reachable. At that point the container's network namespace should have disappeared entirely.

Steps to reproduce the issue:

  1. Start a container: docker run -dt centos bash
  2. Create an ipvlan device on the host: ip link add ipvlan3 link vxlan_oam type ipvlan mode l2 (here vxlan_oam is a vxlan device)
  3. cd /var/run/netns
  4. Get the PID of that container: docker inspect --format '{{.State.Pid}}' <container_id>
  5. ln -s /proc/PID/ns/net /var/run/netns/PID
  6. Add ipvlan3 to the container as follows (7973 is the container PID in this example): ip link set dev ipvlan3 netns 7973
[root@cluster-misc netns]# ip netns exec 7973 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
32: eth0@if33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link
       valid_lft forever preferred_lft forever
34: ipvlan3@if9: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default qlen 1000
    link/ether b6:0e:e8:51:2c:c9 brd ff:ff:ff:ff:ff:ff
[root@cluster-misc netns]#
  7. Add an IP addr to ipvlan3 in the container: ip netns exec PID ip addr add 10.100.10.3/24 dev ipvlan3
  8. Bring up ipvlan3 in the container: ip netns exec PID ip link set ipvlan3 up
  9. Add another netns: ip netns add ns001
  10. Create another ipvlan device on the host: ip link add ipvlan1 link vxlan_oam type ipvlan mode l2
  11. Add ipvlan1 to ns001: ip link set dev ipvlan1 netns ns001
  12. Add an IP addr to ipvlan1 in ns001: ip netns exec ns001 ip addr add 10.100.10.1/24 dev ipvlan1
  13. Bring up ipvlan1 in ns001: ip netns exec ns001 ip link set ipvlan1 up
  14. ping 10.100.10.3 from ns001 works: ip netns exec ns001 ping 10.100.10.3
  15. Now disconnect eth0 from the bridge: docker network disconnect bridge container_id
  16. docker stop container_id; docker rm container_id
  17. Step 14 still works, i.e. the ping still succeeds. That is not reasonable, because the container has been stopped and deleted.

Describe the results you received:
The device and IP address in the deleted container are still reachable.

Describe the results you expected:
The container's network namespace should be cleaned up completely when the container is stopped and deleted, including any devices and IP addresses.

Additional information you deem important (e.g. issue happens only occasionally):
always reproducible

Output of docker version:

[root@cluster-misc netns]# docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:
 OS/Arch:      linux/amd64

Output of docker info:

[root@cluster-misc netns]# docker info
Containers: 5
 Running: 0
 Paused: 0
 Stopped: 5
Images: 58
Server Version: 1.12.1
Storage Driver: devicemapper
 Pool Name: docker-253:0-25813446-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 3.695 GB
 Data Space Total: 107.4 GB
 Data Space Available: 4.699 GB
 Metadata Space Used: 5.923 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.142 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.107-RHEL7 (2015-10-14)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.11-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 2.937 GiB
Name: cluster-misc
ID: NXSJ:RTDK:LQRT:JW6W:IUNX:FMSL:IPL3:KIVY:TEYA:WVHY:ITNK:5OXK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://10.0.2.15:6699/
Https Proxy: https://10.0.2.15:6699/
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 127.0.0.0/8
[root@cluster-misc netns]#

Additional environment details (AWS, VirtualBox, physical, etc.):
This issue occurs on all platforms: VirtualBox, VMs, and physical hosts.

thaJeztah (Member) commented Mar 7, 2017

ping @mavenugo @aboch PTAL

aboch (Contributor) commented Mar 7, 2017

@ryanlyy
The container network namespace is removed by the garbage collection logic within a one-minute time frame.
https://github.com/docker/libnetwork/blob/master/osl/namespace_linux.go#L36
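
For reference, a minimal sketch of what such deferred cleanup looks like (all names here are hypothetical; the real logic lives in the osl/namespace_linux.go file linked above, and only the one-minute interval comes from the comment above):

// Minimal sketch of a deferred netns garbage collector; hypothetical
// names, not libnetwork's actual implementation.
package main

import (
	"fmt"
	"sync"
	"time"
)

const gcInterval = time.Minute // the one-minute window mentioned above

type nsGC struct {
	mu      sync.Mutex
	pending []string // netns bind-mount paths queued for removal
}

// schedule queues a namespace path for the next GC pass instead of
// removing it synchronously when the container stops.
func (g *nsGC) schedule(path string) {
	g.mu.Lock()
	g.pending = append(g.pending, path)
	g.mu.Unlock()
}

// run wakes up once per interval and disposes of everything queued.
func (g *nsGC) run() {
	for range time.Tick(gcInterval) {
		g.mu.Lock()
		batch := g.pending
		g.pending = nil
		g.mu.Unlock()
		for _, p := range batch {
			// The real code unmounts and deletes the netns file;
			// here we only log the action.
			fmt.Println("collecting namespace", p)
		}
	}
}

func main() {
	gc := &nsGC{}
	go gc.run()
	gc.schedule("/var/run/docker/netns/example")
	time.Sleep(gcInterval + time.Second) // let one GC pass run
}

So a namespace that is still visible right after docker stop can be expected to disappear within roughly a minute.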

aboch (Contributor) commented Mar 7, 2017

Also, on a regular container stop or network disconnect, the respective container's interfaces programmed by docker are removed right away.
Your issue is that you are manually adding interfaces into the container netns. Docker is not aware of those and won't remove them for you once you stop the container.
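
(For example, with the device names from the reproduction steps above, such a manually added interface could be removed by hand before the netns goes away:

ip netns exec <PID> ip link del ipvlan3

where <PID> is the container PID used for the netns symlink.)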

ryanlyy commented Mar 8, 2017

@aboch Thanks for your answer. I understand docker can garbage-collect the networks it created. I did another test: if I do not run docker network disconnect, then after stopping the container I can no longer ping the IP address on the NIC I added manually. So even though that NIC was not added by docker, docker cleans it up too, as long as I do not run docker network disconnect. I therefore think docker network disconnect leads to something going wrong.

Thanks,
Ryan

ryanlyy commented Mar 13, 2017

Hi All,

I analyzed the docker stop code, and there seems to be a possible issue. Please let me know your comments:

L740 returns if len(settings) == 0. That is not correct, for example in my test case (which, to be clear, is implemented using Kubernetes with a network plugin).

When L741 returns, L758 is not run (i.e. the sandbox is not deleted for this container). The sandbox is leaked; docker should remove the sandbox regardless of whether there is a network or not.

736 sid := container.NetworkSettings.SandboxID
737 settings := container.NetworkSettings.Networks
738 container.NetworkSettings.Ports = nil
739
740 if sid == "" || len(settings) == 0 {
741 return
742 }
743
744 var networks []libnetwork.Network
745 for n, epSettings := range settings {
746 if nw, err := daemon.FindNetwork(n); err == nil {
747 networks = append(networks, nw)
748 }
749 cleanOperationalData(epSettings)
750 }
751
752 sb, err := daemon.netController.SandboxByID(sid)
753 if err != nil {
754 logrus.Warnf("error locating sandbox id %s: %v", sid, err)
755 return
756 }
757
758 if err := sb.Delete(); err != nil {
759 logrus.Errorf("Error deleting sandbox id %s for container %s: %v", sid, container.ID, err)
760 }

ryanlyy commented Mar 13, 2017

Removing L740-742 should be OK: settings would then be empty, networks would be empty too, and L745 and L762 would not be executed.

762 for _, nw := range networks {
763 attributes := map[string]string{
764 "container": container.ID,
765 }
766 daemon.LogNetworkEventWithAttributes(nw, "disconnect", attributes)
767 }
768 }

aboch (Contributor) commented Mar 13, 2017

Thanks @ryanlyy for looking into the code.

The check at L740 means the network sandbox was not created for this container, or that the container was not connected to any network.

Checking for the missing sandbox (sid == "") is correct.

The sandbox is not created when the container is using another container's network:
docker run --network container:<id> ...
In that case no sandbox is created for it and no network connection may exist, so we must not try to delete one when the container stops.

But regarding the check on len(settings) == 0, I think you are quite right.
That is wrong for exactly the use case you are explaining: when a container is detached from all networks and then stopped.

Thanks for finding this @ryanlyy. Please push a PR.

And this is my commit that broke it: 2bb3fc1#diff-0f20873a38571444bac38770160648a5R966
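
For illustration, a sketch of the resulting logic (following the daemon snippet quoted above; the exact change will be in the PR):

sid := container.NetworkSettings.SandboxID
settings := container.NetworkSettings.Networks
container.NetworkSettings.Ports = nil

// Only a missing sandbox returns early; an empty settings map no
// longer skips sandbox deletion (e.g. after disconnecting from all
// networks and then stopping the container).
if sid == "" {
	return
}

var networks []libnetwork.Network
for n, epSettings := range settings {
	if nw, err := daemon.FindNetwork(n); err == nil {
		networks = append(networks, nw)
	}
	cleanOperationalData(epSettings)
}

sb, err := daemon.netController.SandboxByID(sid)
if err != nil {
	logrus.Warnf("error locating sandbox id %s: %v", sid, err)
	return
}

if err := sb.Delete(); err != nil {
	logrus.Errorf("Error deleting sandbox id %s for container %s: %v", sid, container.ID, err)
}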

ryanlyy commented Mar 13, 2017

@aboch, I agree with you. Yes: keep the sid == "" early return, and use the len(settings) check only to decide whether to clean up the networks.

By the way, how do I push a PR?

Thanks,
Ryan

thaJeztah (Member) commented Mar 13, 2017

@ryanlyy you can find more information about contributing in our contributors' guide: https://docs.docker.com/opensource/code/

ryanlyy commented Mar 14, 2017

@thaJeztah thanks a lot. I will investigate how I can contribute.

ryanlyy commented Mar 16, 2017

#dibs

ryanlyy commented Mar 16, 2017

@thaJeztah today I made the code change and ran make test, but it failed as shown below. However, when I built it and replaced my old docker binaries, it works well: I can run docker run/stop and other actions, and I believe my code works because I added some debug logging. Do you know what is wrong with the integration test?

---> Making bundle: .integration-daemon-start (in bundles/17.05.0-dev/test-integration-cli)
INFO: Waiting for daemon to start...
+++ exec dockerd --debug --host unix:///go/src/github.com/docker/docker/bundles/17.05.0-dev/test-integration-cli/docker.sock --storage-driver devicemapper --pidfile bundles/17.05.0-dev/test-integration-cli/docker.pid --userland-proxy=true
...........................................................
error: daemon at unix:///go/src/github.com/docker/docker/bundles/17.05.0-dev/test-integration-cli/docker.sock fails to 'docker version':
Client:
Version: 17.05.0-dev
API version: 1.29
Go version: go1.7.5
Git commit: 3e04fb9
Built: Wed Mar 15 09:41:50 2017
OS/Arch: linux/amd64
Cannot connect to the Docker daemon at unix:///go/src/github.com/docker/docker/bundles/17.05.0-dev/test-integration-cli/docker.sock. Is the docker daemon running?
---> Making bundle: .integration-daemon-stop (in bundles/17.05.0-dev/test-integration-cli)
make: *** [test] Error 1

ryanlyy commented Mar 17, 2017

@thaJeztah can we use VirtualBox to run "make test"?

thaJeztah (Member) commented Mar 17, 2017

Hm, possibly it's because it's trying to use devicemapper inside the development container (which won't work); you can override the graph driver with the DOCKER_GRAPHDRIVER variable; setting it to vfs should always work (but will be slow). You can set TESTFLAGS to run only a limited set of integration tests. For example:

make DOCKER_GRAPHDRIVER=vfs TESTFLAGS='-check.f DockerSuite.TestContainerAPIRestart*' binary test-integration-cli

runs just the tests whose names start with TestContainerAPIRestart.

aboch (Contributor) commented Mar 22, 2017

Hi @ryanlyy, there is another issue which needs to be fixed soon, and the fix for it requires your fix as well. So I will go ahead and include your commit in my PR. PTAL at #31996. Thanks.
