could not add veth pair inside the network sandbox: could not find an appropriate master "bridged563c27" for "vethac2aa6d" #25039

Open
jschunlei opened this Issue Jul 26, 2016 · 12 comments


@jschunlei
jschunlei commented Jul 26, 2016 edited

[root@docker3 ~]# docker version

Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:25:01 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:25:01 UTC 2015
 OS/Arch:      linux/amd64

Additional environment : VirtualBox

Steps to reproduce the issue:

  1. My command is: docker run -d --net=vlan100 10.10.114.162:5000/tomcat_root:7.0.68
  2. vlan100 is a network that uses the overlay driver.
  3. The container can't start; it returns the error: Cannot start container de093f4b3302d470f55f510f67ce43588fc9b8a3c69bf0564d31e47b15c7a41d: could not add veth pair inside the network sandbox: could not find an appropriate master \"bridged563c27\" for \"veth6e7b39b\"
  4. I found the source code that produces the error (https://github.com/docker/libnetwork/blob/master/osl/interface_linux.go), but I can't understand it.
  5. After I ran systemctl restart docker, the same docker run -d --net=vlan100 10.10.114.162:5000/tomcat_root:7.0.68 command succeeded.
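For anyone else debugging this: the error means libnetwork looked up the overlay bridge by name inside the container's network sandbox and the kernel reported no such link. Below is a rough diagnostic sketch, not an official procedure; it assumes root access and that your distro keeps Docker's sandbox namespaces in the default location /var/run/docker/netns, and it uses the bridge name from the error above purely as an example:

```shell
#!/bin/sh
# Sketch: peek inside Docker's hidden network namespaces to see whether
# the overlay bridge the error names actually exists there.

# link_exists NAME LISTING -- succeed if the `ip -o link show` output in
# LISTING contains a link called NAME (pure text check, no root needed).
link_exists() {
    printf '%s\n' "$2" | grep -q "^[0-9]*: *$1[:@]"
}

NETNS_DIR=/var/run/docker/netns    # default location on most distros
for ns in "$NETNS_DIR"/*; do
    [ -e "$ns" ] || continue                                   # dir may be empty
    listing=$(nsenter --net="$ns" ip -o link show 2>/dev/null) || continue
    echo "== $ns =="
    printf '%s\n' "$listing"
    # a healthy overlay sandbox contains the bridge the error complains about
    if link_exists "bridged563c27" "$listing"; then
        echo "   overlay bridge is present in this sandbox"
    fi
done
```

If the bridge (and its companion vx-... vxlan interface) is missing from every sandbox, the failure is on the host side, which would explain why restarting the daemon clears it.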

Let's look at the docker log:

Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.902868257+08:00" level=info msg="GET /v1.21/containers/10.10.114.162:5000/tomcat_root:7.0.68/json"
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.903236728+08:00" level=error msg="Handler for GET /v1.21/containers/10.10.114.162:5000/tomcat_root:7.0.68/json returned error: no such id: 10.10.114.162:5000/tomcat_root:7.0.68"
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.903287269+08:00" level=error msg="HTTP Error" err="no such id: 10.10.114.162:5000/tomcat_root:7.0.68" statusCode=404
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.903970040+08:00" level=info msg="GET /v1.21/images/10.10.114.162:5000/tomcat_root:7.0.68/json"
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.925476848+08:00" level=info msg="GET /v1.21/containers/10.10.114.162:5000/tomcat_root:7.0.68/json"
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.926012473+08:00" level=error msg="Handler for GET /v1.21/containers/10.10.114.162:5000/tomcat_root:7.0.68/json returned error: no such id: 10.10.114.162:5000/tomcat_root:7.0.68"
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.926085631+08:00" level=error msg="HTTP Error" err="no such id: 10.10.114.162:5000/tomcat_root:7.0.68" statusCode=404
Jul 21 11:08:52 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:52.926828006+08:00" level=info msg="GET /v1.21/images/10.10.114.162:5000/tomcat_root:7.0.68/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.103304398+08:00" level=info msg="POST /v1.21/containers/create?name=mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.150929182+08:00" level=info msg="GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.151967711+08:00" level=info msg="GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.152221242+08:00" level=error msg="Handler for GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3/json returned error: no such id: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.152283467+08:00" level=error msg="HTTP Error" err="no such id: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3" statusCode=404
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.152736487+08:00" level=info msg="POST /v1.21/containers/create?name=mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.164683359+08:00" level=info msg="GET /v1.21/images/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.165127125+08:00" level=error msg="Handler for GET /v1.21/images/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3/json returned error: No such image: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.165226147+08:00" level=error msg="HTTP Error" err="No such image: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3" statusCode=404
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.201350658+08:00" level=error msg="Handler for GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee/json returned error: Unknown device b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.201481500+08:00" level=error msg="HTTP Error" err="Unknown device b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640" statusCode=500
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.202610084+08:00" level=info msg="GET /v1.21/images/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.202936281+08:00" level=error msg="Handler for GET /v1.21/images/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee/json returned error: No such image: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.203010333+08:00" level=error msg="HTTP Error" err="No such image: mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee" statusCode=404
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.628313028+08:00" level=info msg="POST /v1.21/containers/eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed/attach?stderr=1&stdout=1&stream=1"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.629488780+08:00" level=info msg="POST /v1.21/containers/eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed/start"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.691938104+08:00" level=info msg="POST /v1.21/containers/b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640/attach?stderr=1&stdout=1&stream=1"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.693335453+08:00" level=info msg="POST /v1.21/containers/b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640/start"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.811502802+08:00" level=info msg="GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.47a48329-239a-4cfa-a53b-9c09b7d4caf3/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.908797586+08:00" level=info msg="GET /v1.21/containers/mesos-e60a3b0e-e3c0-4b47-8d9c-0b890b13a696-S4.634c9a4a-211b-4611-a88e-7b3fa08380ee/json"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.922388977+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]"
Jul 21 11:08:53 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:53.922510859+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.013185703+08:00" level=info msg="GET /containers/json?all=0&size=0"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.107527531+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.107634030+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.198995205+08:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /data/hps/hps_install/data/docker/containers/eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed/shm: no such file or directory\nfailed to umount /data/hps/hps_install/data/docker/containers/eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed/mqueue: no such file or directory"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.262301064+08:00" level=error msg="Handler for POST /v1.21/containers/eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed/start returned error: Cannot start container eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed: could not add veth pair inside the network sandbox: could not find an appropriate master \"bridged563c27\" for \"veth4b72cce\""
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.262483721+08:00" level=error msg="HTTP Error" err="Cannot start container eed5033a902578540bb63aca1c592a380645364ba0e5403374c0f6602a513aed: could not add veth pair inside the network sandbox: could not find an appropriate master \"bridged563c27\" for \"veth4b72cce\"" statusCode=500
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.602577178+08:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /data/hps/hps_install/data/docker/containers/b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640/shm: no such file or directory\nfailed to umount /data/hps/hps_install/data/docker/containers/b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640/mqueue: no such file or directory"
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.646335388+08:00" level=error msg="Handler for POST /v1.21/containers/b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640/start returned error: Cannot start container b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640: could not add veth pair inside the network sandbox: could not find an appropriate master \"bridged563c27\" for \"veth2b36284\""
Jul 21 11:08:54 QZ-HPS-FBXT-5 docker[20630]: time="2016-07-21T11:08:54.646547788+08:00" level=error msg="HTTP Error" err="Cannot start container b781b7840564735c35e52ffc34a9e957df5c395aef5c3e57c67ce477d1c0f640: could not add veth pair inside the network sandbox: could not find an appropriate master \"bridged563c27\" for \"veth2b36284\"" statusCode=500

Can anyone help me? Thanks very much.

@thaJeztah
Member

Could you also add the output of docker info? And are you still able to reproduce this on the current release (1.11.2)?

@jschunlei

@thaJeztah I'm sorry, I can't add the docker info output; I can't connect to that machine.
After the docker daemon was restarted, the problem no longer appeared.
It may be random; I'm not sure when it will appear again, and I haven't tested version 1.11.2.
I found that docker 1.9.1 is not stable in this respect.
Thank you for your help

@thaJeztah
Member

Thanks! I'll close this issue for now, because we have no way to reproduce currently, but let me know if you run into this again

@thaJeztah thaJeztah closed this Jul 28, 2016
@marcinkowski
marcinkowski commented Aug 2, 2016 edited

The same here.
could not add veth pair inside the network sandbox: could not find an appropriate master \"ov-000102-aahg2\" for \"veth89643e1\"

I have a cluster made of 2 nodes and one manager. I was trying to create a service on node-1:

docker service create --replicas 1 --name mariadb --constraint="node.hostname==node-1" --env MYSQL_ROOT_PASSWORD='mariadb' mariadb:10.1.16

This is an output from "docker info" on manager node:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge host null
Swarm: active
 NodeID: 14xl1yo62s0d6l8baeppkl0u2
 Is Manager: true
 ClusterID: dl5m8ytqi4gn7jqhas0nxx0b9
 Managers: 1
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: 10.111.208.212
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-86-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 992.5 MiB
Name: manager-1
ID: XISA:J5UD:HJYM:IBW4:HODQ:DRPM:XZYB:YLTO:EKCE:TS6G:CQMI:J5XW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

Thanks for help

@thaJeztah
Member

Hm, strange as 1.12 is quite different from the external k/v overlay networks in 1.9.

ping @mrjana any ideas?

@thaJeztah thaJeztah reopened this Aug 2, 2016
@technolo-g

Hi,
We are able to reproduce this using the following setup:

test-compose.sh:

#!/bin/bash
# Point the client at the swarm manager, bring up the stack,
# scale it out, then clean up the test containers and the network.
export DOCKER_HOST=tcp://swarmhost:2375

docker-compose -f test-compose.yml up -d
docker-compose -f test-compose.yml scale testnode=40
docker rm -v $(docker ps -a | grep busybox | awk '{print $1}')
docker network rm tmp_overlaytest

test-compose.yml:

version: '2'
networks:
    overlaytest:
        driver: overlay

services:
     testnode:
          image: busybox
          command: /bin/echo
          networks:
               - overlaytest

We have approximately 30 physical hosts behind the swarm (running Swarm 1.2.4). One other thing of note: if we hit the hosts behind the swarm directly, or use constraints to keep all the containers on the same host, this does not appear to happen. Here is some additional metadata from our test runs:

Docker version:

Client:
 Version:      1.12.0-rc2
 API version:  1.23
 Go version:   go1.6.2
 Git commit:   906eacd
 Built:        Fri Jun 17 20:35:33 2016
 OS/Arch:      darwin/amd64
 Experimental: true

Server:
 Version:      swarm/1.2.4
 API version:  1.22
 Go version:   go1.5.4
 Git commit:   5d5f7f0
 Built:        Thu Jul 28 19:52:54 UTC 2016
 OS/Arch:      linux/amd64

Daemon logs:

time="2016-08-04T13:00:11.201220747-06:00" level=error msg="Handler for POST /v1.23/containers/07258d009b613a0aa43e85ca88358a4a8bc83c102a702b1fd041e6268ca64e10/start returned error: could not add veth pair inside the network sandbox: could not find an appropriate master \"ov-000100-6c624\" for \"vethfd7111a\""
time="2016-08-04T13:00:11.411893486-06:00" level=error msg="Peer add failed in the driver: could not add neigbor entry into the sandbox: could not find the interface with name vx-000100-6c624\n"

Compose error

ERROR: for tmp_testnode_5  Error response from daemon: could not add veth pair inside the network sandbox: could not find an appropriate master "ov-000100-6c624" for "vethfb888c4"

Please let us know if we can provide any additional information.

@thaJeztah thaJeztah added the kind/bug label Aug 4, 2016
@thonatos
thonatos commented Aug 31, 2016 edited

Same here. Solved by upgrading the kernel:

root@swarm1:~# uname -r
3.13.0-32-generic

root@swarm1:~# apt-get install linux-generic-lts-vivid
root@swarm1:~# reboot

root@swarm1:~# uname -r
3.19.0-69-generic
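Since several reports here point at old 3.13 kernels, a small pre-flight check along these lines can compare the running kernel against a minimum version before you rely on overlay networking. This is a sketch: 3.16 is used as the threshold because Docker's documentation at the time recommended 3.16 or newer for the overlay driver, and the helper name ver_ge is made up for this example.

```shell
#!/bin/sh
# ver_ge VER MIN -- succeed if kernel release string VER
# (e.g. "3.19.0-69-generic") is at least version MIN (e.g. "3.16").
# Only major.minor are compared, which is enough for this check.
ver_ge() {
    v_major=${1%%.*}; v_rest=${1#*.}; v_minor=${v_rest%%.*}
    m_major=${2%%.*}; m_rest=${2#*.}; m_minor=${m_rest%%.*}
    [ "$v_major" -gt "$m_major" ] ||
        { [ "$v_major" -eq "$m_major" ] && [ "$v_minor" -ge "$m_minor" ]; }
}

if ver_ge "$(uname -r)" 3.16; then
    echo "kernel $(uname -r): OK for overlay networking"
else
    echo "kernel $(uname -r): older than 3.16, consider upgrading" >&2
fi
```

On the 3.13.0-32-generic host above this prints the "consider upgrading" warning; after the linux-generic-lts-vivid upgrade it passes.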
@zxkane
zxkane commented Sep 13, 2016

Same here. The host is Ubuntu 14.04.5 LTS. Docker engine is 1.12.1.

I have to upgrade kernel to 3.19.0-69-generic mentioned by @thonatos to run containers in swarm mode.

@jareddlc
jareddlc commented Sep 13, 2016 edited

Thanks @thonatos, I can confirm this fixed it for me.

My specific error was:

starting container failed: could not add veth pair inside the network sandbox: could not find an appropriate master \"ov-000100-1wkbc\" for \"vethee39f9d\"
Description:    Ubuntu 14.04.5 LTS
Release:    14.04
Codename:   trusty

Docker 1.12.0 & 1.12.1

was on 3.13.0-XX-generic
upgraded to 3.19.0-68-generic

and now my containers run

@technolo-g

I can confirm a reboot including the new kernel fixed it for us.


@jschunlei
jschunlei commented Sep 13, 2016 edited

@technolo-g My system is CentOS 7.1 and the kernel version is 4.3.3:

[root@mysql1 ~]# uname -r
4.3.3-1.el7.elrepo.x86_64

But I still hit the same problem.

@bigwhite
bigwhite commented Oct 9, 2016

Today I hit this problem too. The environment is Ubuntu 14.04 LTS, docker 1.12.1, kernel 3.13.0-86-generic. After upgrading the kernel to 3.19.0-70, the problem disappeared.
