Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address already assigned in block after restart docker #746

Closed
liuxu623 opened this issue May 3, 2017 · 18 comments
Closed

Address already assigned in block after restart docker #746

liuxu623 opened this issue May 3, 2017 · 18 comments
Assignees

Comments

@liuxu623
Copy link

liuxu623 commented May 3, 2017

Only in CentOS have this problem,Ubuntu does not have this problem.

CentOS Linux release 7.3.1611 (Core)
docker 17.04.0-ce
calicoctl 1.1.3

http://docs.projectcalico.org/v2.1/getting-started/docker/tutorials/ipam

systemctl restart docker
docker start my_workload

Error response from daemon: IpamDriver.RequestAddress: IP assignment error, data: {IP:192.0.2.100 HandleID:<nil> Attrs:map[] Hostname:centos-linux-1.shared}: Address already assigned in block
Error: failed to start containers: my_workload
@liuxu623 liuxu623 changed the title Address already assigned in block Address already assigned in block after restart docker May 3, 2017
@tmjd tmjd self-assigned this May 3, 2017
@tmjd
Copy link
Member

tmjd commented May 3, 2017

Sounds like a change in the behavior of docker. It would be helpful if you could please do the following.

Attach logs from CentOS and Ubuntu. I'd like to see the network creation, creation of the container, the restart of the docker service, and the restart of the workload. And the specific commands you are using.

Could you please provide the docker version that your Ubuntu setup is using.

I'm guessing the docker on Ubuntu is not re-requesting the IP assignment where it is on CentOS, but that is just a guess.

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

CentOS

[root@centos-linux-1 ~]# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)
[root@centos-linux-1 ~]# docker info
Containers: 2
 Running: 1
 Paused: 0
 Stopped: 1
Images: 2
Server Version: 17.04.0-ce
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge calico host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary:
containerd version: 422e31ce907fd9c3833a38d7b8fdd023e5a76e73
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 988.6MiB
Name: centos-linux-1.shared
ID: MIFO:6D6X:PSX3:PUYR:AWWH:HVVC:2GY7:K5AQ:FPLB:AOIV:H6LL:PQ3F
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Cluster Store: etcd://127.0.0.1:2379
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
[root@centos-linux-1 ~]# calicoctl version
Version:      v1.1.3
Build date:
Git commit:   e8a457d
[root@centos-linux-1 ~]# calicoctl node run
Running command to load modules: modprobe -a xt_set ip6_tables
Enabling IPv4 forwarding
Enabling IPv6 forwarding
Increasing conntrack limit
Removing old calico-node container (if running).
Running the following command to start calico-node:

docker run --net=host --privileged --name=calico-node -d --restart=always -e CALICO_NETWORKING_BACKEND=bird -e NO_DEFAULT_POOLS= -e CALICO_LIBNETWORK_ENABLED=true -e IP6_AUTODETECTION_METHOD=first-found -e CALICO_LIBNETWORK_CREATE_PROFILES=true -e CALICO_LIBNETWORK_LABEL_ENDPOINTS=false -e CALICO_LIBNETWORK_IFPREFIX=cali -e NODENAME=centos-linux-1.shared -e ETCD_AUTHORITY=127.0.0.1:2379 -e ETCD_SCHEME=http -e ETCD_ENDPOINTS= -e IP_AUTODETECTION_METHOD=first-found -v /var/log/calico:/var/log/calico -v /var/run/calico:/var/run/calico -v /lib/modules:/lib/modules -v /run/docker/plugins:/run/docker/plugins -v /var/run/docker.sock:/var/run/docker.sock quay.io/calico/node:v1.1.3

Image may take a short time to download if it is not available locally.
Container started, checking progress logs.

Skipping datastore connection test
Using autodetected IPv4 address on interface eth0: 10.211.55.11/24
No AS number configured on node resource, using global value
Created default IPv6 pool (fd80:24e2:f998:72d6::/64) with NAT outgoing enabled. IPIP mode: off
Using node name: centos-linux-1.shared
Starting libnetwork service
Calico node started successfully
[root@centos-linux-1 ~]# cat << EOF | calicoctl create -f -
> - apiVersion: v1
>   kind: ipPool
>   metadata:
>     cidr: 192.0.2.0/24
> EOF
Successfully created 1 'ipPool' resource(s)
[root@centos-linux-1 ~]# docker network create --driver calico --ipam-driver calico-ipam --subnet=192.0.2.0/24 my_net
337f752b6ff22e102b3132d27fd183ce244d66588228f3431e7941fc0db080d5
[root@centos-linux-1 ~]# docker run --net my_net --name my_workload --ip 192.0.2.100 -tid busybox
677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a
[root@centos-linux-1 ~]# systemctl restart docker

docker logs 

5月 04 11:01:38 centos-linux-1.shared systemd[1]: Stopping Docker Application Container Engine...
5月 04 11:01:38 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:38.093779820+08:00" level=info msg="Processing signal 'terminated'"
5月 04 11:01:48 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:48.096148531+08:00" level=info msg="Container 677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a failed to exit within 10 seconds of signal 15 - using the force"
5月 04 11:01:48 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:48.136163697+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
5月 04 11:01:49 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:49.136923034+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
5月 04 11:01:51 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:51.137665366+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
5月 04 11:01:53 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:53.094927183+08:00" level=error msg="Force shutdown daemon"
5月 04 11:01:53 centos-linux-1.shared dockerd[2819]: time="2017-05-04T11:01:53.095422618+08:00" level=info msg="stopping containerd after receiving terminated"
5月 04 11:01:54 centos-linux-1.shared systemd[1]: Starting Docker Application Container Engine...
5月 04 11:01:54 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:54.141074962+08:00" level=info msg="libcontainerd: new containerd process, pid: 4172"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.150450226+08:00" level=info msg="[graphdriver] using prior storage driver: overlay"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.159882008+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.160810274+08:00" level=info msg="Loading containers: start."
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.167399328+08:00" level=warning msg="libcontainerd: client is out of sync, restore was called on a fully synced container (677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a)."
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.168777087+08:00" level=warning msg="libcontainerd: failed to retrieve container 677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a state: rpc error: code = 2 desc = containerd: container not found"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.187290362+08:00" level=info msg="Firewalld running: true"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.530174787+08:00" level=info msg="Removing stale sandbox c47d0e927fcfd235551963cc4b5b03bd385e8c67817a8c5976dac5686a295750 (677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a)"
5月 04 11:01:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:55.539137772+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
5月 04 11:01:56 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:56.540145488+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
5月 04 11:01:58 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:01:58.542168870+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
5月 04 11:02:02 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:02.543771400+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 8s"
5月 04 11:02:10 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:10.546019290+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
5月 04 11:02:11 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:11.547659838+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
5月 04 11:02:13 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:13.549537315+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
5月 04 11:02:17 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:17.550346283+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 8s"
5月 04 11:02:25 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:25.556144378+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 1s"
5月 04 11:02:26 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:26.557762470+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 2s"
5月 04 11:02:28 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:28.559155358+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 4s"
5月 04 11:02:32 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:32.560053383+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 8s"
5月 04 11:02:40 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:40.561582052+08:00" level=warning msg="Failed to retrieve ipam driver to release interface address on delete of endpoint my_workload (3744d2133f6b41974ff2a9a374f4e0cc3675655f7b102d36ed0ea6a99c8f9f09): legacy plugin: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused"
5月 04 11:02:40 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:40.715647285+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
5月 04 11:02:40 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:40.991042022+08:00" level=info msg="Loading containers: done."
5月 04 11:02:41 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:41.006221576+08:00" level=info msg="Daemon has completed initialization"
5月 04 11:02:41 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:41.006264307+08:00" level=info msg="Docker daemon" commit=4845c56 graphdriver=overlay version=17.04.0-ce
5月 04 11:02:41 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:02:41.013955136+08:00" level=info msg="API listen on /var/run/docker.sock"
5月 04 11:02:41 centos-linux-1.shared systemd[1]: Started Docker Application Container Engine.
[root@centos-linux-1 ~]# docker start my_workload
Error response from daemon: IpamDriver.RequestAddress: IP assignment error, data: {IP:192.0.2.100 HandleID:<nil> Attrs:map[] Hostname:centos-linux-1.shared}: Address already assigned in block
Error: failed to start containers: my_workload

docker logs 

5月 04 11:03:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:03:55.092149479+08:00" level=warning msg="error locating sandbox id c47d0e927fcfd235551963cc4b5b03bd385e8c67817a8c5976dac5686a295750: sandbox c47d0e927fcfd235551963cc4b5b03bd385e8c67817a8c5976dac5686a295750 not found"
5月 04 11:03:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:03:55.092266550+08:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/677f30ab659c271558a2e5fa8502486b49a274f170d26c3b6c59a114d650133a/shm: invalid argument"
5月 04 11:03:55 centos-linux-1.shared dockerd[4166]: time="2017-05-04T11:03:55.092409926+08:00" level=error msg="Handler for POST /v1.28/containers/my_workload/start returned error: IpamDriver.RequestAddress: IP assignment error, data: {IP:192.0.2.100 HandleID:<nil> Attrs:map[] Hostname:centos-linux-1.shared}: Address already assigned in block"

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

Ubuntu

root@ubuntu-linux:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=17.04
DISTRIB_CODENAME=zesty
DISTRIB_DESCRIPTION="Ubuntu 17.04"
root@ubuntu-linux:~# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.03.1-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.10.0-20-generic
Operating System: Ubuntu 17.04
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 985.6 MiB
Name: ubuntu-linux
ID: UCNQ:VPWJ:LSNR:EARC:F46D:QU4Y:ENEC:CO3Z:5SLM:3YWR:AUEH:3ATU
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Cluster Store: etcd://127.0.0.1:2379
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
root@ubuntu-linux:~# calicoctl version
Version:      v1.1.3
Build date:
Git commit:   e8a457d
root@ubuntu-linux:~# calicoctl node run
Running command to load modules: modprobe -a xt_set ip6_tables
Enabling IPv4 forwarding
Enabling IPv6 forwarding
Increasing conntrack limit
Removing old calico-node container (if running).
Running the following command to start calico-node:

docker run --net=host --privileged --name=calico-node -d --restart=always -e CALICO_LIBNETWORK_LABEL_ENDPOINTS=false -e CALICO_LIBNETWORK_IFPREFIX=cali -e NODENAME=ubuntu-linux -e NO_DEFAULT_POOLS= -e IP_AUTODETECTION_METHOD=first-found -e CALICO_LIBNETWORK_CREATE_PROFILES=true -e ETCD_AUTHORITY=127.0.0.1:2379 -e ETCD_SCHEME=http -e ETCD_ENDPOINTS= -e CALICO_NETWORKING_BACKEND=bird -e CALICO_LIBNETWORK_ENABLED=true -e IP6_AUTODETECTION_METHOD=first-found -v /var/log/calico:/var/log/calico -v /var/run/calico:/var/run/calico -v /lib/modules:/lib/modules -v /run/docker/plugins:/run/docker/plugins -v /var/run/docker.sock:/var/run/docker.sock quay.io/calico/node:v1.1.3

Image may take a short time to download if it is not available locally.
Container started, checking progress logs.

Skipping datastore connection test
Using autodetected IPv4 address on interface enp0s5: 10.211.55.13/24
No AS number configured on node resource, using global value
Created default IPv4 pool (192.168.0.0/16) with NAT outgoing enabled. IPIP mode: off
Created default IPv6 pool (fd80:24e2:f998:72d6::/64) with NAT outgoing enabled. IPIP mode: off
Using node name: ubuntu-linux
Starting libnetwork service
Calico node started successfully
root@ubuntu-linux:~# cat << EOF | calicoctl create -f -
> - apiVersion: v1
>   kind: ipPool
>   metadata:
>     cidr: 192.0.2.0/24
> EOF
Successfully created 1 'ipPool' resource(s)
root@ubuntu-linux:~# docker network create --driver calico --ipam-driver calico-ipam --subnet=192.0.2.0/24 my_net
ce1c0783bff32c13454af3a24e70b7bbd21c9dbea10b811c795b06cbe4f062fb
root@ubuntu-linux:~# docker run --net my_net --name my_workload --ip 192.0.2.100 -tid busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
7520415ce762: Pull complete
Digest: sha256:32f093055929dbc23dec4d03e09dfe971f5973a9ca5cf059cbfb644c206aa83f
Status: Downloaded newer image for busybox:latest
36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642
root@ubuntu-linux:~# systemctl restart docker

docker logs 

May 04 11:31:19 ubuntu-linux systemd[1]: Stopping Docker Application Container Engine...
May 04 11:31:19 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:19.978674246+08:00" level=info msg="Processing signal 'terminated'"
May 04 11:31:29 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:29.981264075+08:00" level=info msg="Container 36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642 failed to exit within 10 seconds of signal 15 - using the force"
May 04 11:31:30 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:30.051167229+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
May 04 11:31:31 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:31.052226649+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
May 04 11:31:33 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:33.053391489+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.RevokeExternalConnectivity: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.RevokeExternalConnectivity: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
May 04 11:31:34 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:34.979159219+08:00" level=error msg="Force shutdown daemon"
May 04 11:31:34 ubuntu-linux dockerd[11863]: time="2017-05-04T11:31:34.979655795+08:00" level=info msg="stopping containerd after receiving terminated"
May 04 11:31:35 ubuntu-linux systemd[1]: Stopped Docker Application Container Engine.
May 04 11:31:35 ubuntu-linux systemd[1]: Starting Docker Application Container Engine...
May 04 11:31:36 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:36.019303915+08:00" level=info msg="libcontainerd: new containerd process, pid: 12768"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.033241947+08:00" level=info msg="[graphdriver] using prior storage driver: aufs"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.037490374+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.037699631+08:00" level=warning msg="Your kernel does not support swap memory limit"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.037732174+08:00" level=warning msg="Your kernel does not support cgroup rt period"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.037741773+08:00" level=warning msg="Your kernel does not support cgroup rt runtime"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.038069858+08:00" level=info msg="Loading containers: start."
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.043863511+08:00" level=warning msg="libcontainerd: client is out of sync, restore was called on a fully synced container (36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642)."
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.044334220+08:00" level=warning msg="libcontainerd: failed to retrieve container 36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642 state: rpc error: code = 2 desc = containerd: container not found"
May 04 11:31:37 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:37.136918111+08:00" level=info msg="Firewalld running: false"
May 04 11:31:38 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:38.119721620+08:00" level=info msg="Removing stale sandbox bb3785be12bba64f000272c18afe2f3cb064177651f410f193ea70ad09786314 (36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642)"
May 04 11:31:38 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:38.128279539+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
May 04 11:31:39 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:39.130301758+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
May 04 11:31:41 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:41.131284919+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
May 04 11:31:45 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:45.133496960+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 8s"
May 04 11:31:53 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:53.135133645+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 1s"
May 04 11:31:54 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:54.136313576+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 2s"
May 04 11:31:56 ubuntu-linux dockerd[12750]: time="2017-05-04T11:31:56.137669860+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 4s"
May 04 11:32:00 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:00.139066459+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/Plugin.Activate: dial unix /run/docker/plugins/calico.sock: connect: connection refused, retrying in 8s"
May 04 11:32:08 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:08.143898978+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 1s"
May 04 11:32:09 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:09.145485010+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 2s"
May 04 11:32:11 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:11.146168807+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 4s"
May 04 11:32:15 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:15.146824803+08:00" level=warning msg="Unable to connect to plugin: /run/docker/plugins/calico-ipam.sock/Plugin.Activate: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused, retrying in 8s"
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.147700783+08:00" level=warning msg="Failed to retrieve ipam driver to release interface address on delete of endpoint my_workload (aedb84ed49731669813940f0d7102b250afc474fe3625b907f2bf4c54f8760da): legacy plugin: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused"
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.226905824+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.404487534+08:00" level=info msg="Loading containers: done."
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.425258246+08:00" level=info msg="Daemon has completed initialization"
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.425294889+08:00" level=info msg="Docker daemon" commit=c6d412e graphdriver=aufs version=17.03.1-ce
May 04 11:32:23 ubuntu-linux systemd[1]: Started Docker Application Container Engine.
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.453799598+08:00" level=info msg="API listen on /var/run/docker.sock"
root@ubuntu-linux:~# docker start my_workload
Error response from daemon: IpamDriver.RequestAddress: IP assignment error, data: {IP:192.0.2.100 HandleID:<nil> Attrs:map[] Hostname:ubuntu-linux}: Address already assigned in block
Error: failed to start containers: my_workload

docker logs

May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.404487534+08:00" level=info msg="Loading containers: done."
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.425258246+08:00" level=info msg="Daemon has completed initialization"
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.425294889+08:00" level=info msg="Docker daemon" commit=c6d412e graphdriver=aufs version=17.03.1-ce
May 04 11:32:23 ubuntu-linux systemd[1]: Started Docker Application Container Engine.
May 04 11:32:23 ubuntu-linux dockerd[12750]: time="2017-05-04T11:32:23.453799598+08:00" level=info msg="API listen on /var/run/docker.sock"
May 04 11:33:24 ubuntu-linux dockerd[12750]: time="2017-05-04T11:33:24.337333983+08:00" level=warning msg="error locating sandbox id bb3785be12bba64f000272c18afe2f3cb064177651f410f193ea70ad09786314: sandbox bb3785be12bba64f000272c18afe2f3cb064177651f410f193ea70ad09786314 not found"
May 04 11:33:24 ubuntu-linux dockerd[12750]: time="2017-05-04T11:33:24.337782453+08:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/36fc4c1f32428f5dba8aefb2a845154cb78a6ce8bf80e70f8b4f39ec48776642/shm: invalid argument"
May 04 11:33:24 ubuntu-linux dockerd[12750]: time="2017-05-04T11:33:24.359917539+08:00" level=error msg="Handler for POST /v1.27/containers/my_workload/start returned error: IpamDriver.RequestAddress: IP assignment error, data: {IP:192.0.2.100 HandleID:<nil> Attrs:map[] Hostname:ubuntu-linux}: Address already assigned in block"

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

Ubuntu have the same problem now.

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

I think dockerd couldn't connect calico when dockerd restart led to this problem.

level=warning msg="Failed to retrieve ipam driver to release interface address on delete of endpoint my_workload (3744d2133f6b41974ff2a9a374f4e0cc3675655f7b102d36ed0ea6a99c8f9f09): legacy plugin: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused"
level=warning msg="Failed to retrieve ipam driver to release interface address on delete of endpoint my_workload (aedb84ed49731669813940f0d7102b250afc474fe3625b907f2bf4c54f8760da): legacy plugin: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico-ipam.sock/Plugin.Activate: dial unix /run/docker/plugins/calico-ipam.sock: connect: connection refused"

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

I tried not using the docker to start calico,it work.And docker start fast than before.

5月 04 17:13:16 centos-linux-1.shared systemd[1]: Stopping Docker Application Container Engine...
5月 04 17:13:16 centos-linux-1.shared dockerd[16321]: time="2017-05-04T17:13:16.741668073+08:00" level=info msg="Processing signal 'terminated'"
5月 04 17:13:26 centos-linux-1.shared dockerd[16321]: time="2017-05-04T17:13:26.744373699+08:00" level=info msg="Container 91ba705e96616766e599a0e0b9e43d7912514c6cc12f9c03c2410e16f2382544 failed to exit within 10 seconds of signal 15 - using the force"
5月 04 17:13:26 centos-linux-1.shared dockerd[16321]: time="2017-05-04T17:13:26.897680908+08:00" level=info msg="stopping containerd after receiving terminated"
5月 04 17:13:27 centos-linux-1.shared systemd[1]: Starting Docker Application Container Engine...
5月 04 17:13:27 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:27.993818124+08:00" level=info msg="libcontainerd: new containerd process, pid: 31188"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.003568120+08:00" level=info msg="[graphdriver] using prior storage driver: overlay"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.025739205+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.027269820+08:00" level=info msg="Loading containers: start."
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.064686226+08:00" level=info msg="Firewalld running: true"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.649151353+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.817218563+08:00" level=info msg="Loading containers: done."
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.834336472+08:00" level=info msg="Daemon has completed initialization"
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.834381384+08:00" level=info msg="Docker daemon" commit=4845c56 graphdriver=overlay version=17.04.0-ce
5月 04 17:13:29 centos-linux-1.shared dockerd[31182]: time="2017-05-04T17:13:29.848028052+08:00" level=info msg="API listen on /var/run/docker.sock"
5月 04 17:13:29 centos-linux-1.shared systemd[1]: Started Docker Application Container Engine.

@liuxu623
Copy link
Author

liuxu623 commented May 4, 2017

Why you use docker to start calico?
There is another way to solve the problem?

@tmjd
Copy link
Member

tmjd commented May 4, 2017

You can setup and run the processes that run in the calico/node container outside of docker, I don't believe there is anything restricting you from doing that. Docker is used to start calico because it provides an easy mechanism for distribution that bundles the needed dependencies.

@tmjd
Copy link
Member

tmjd commented May 4, 2017

Sounds like you've figured out the issue here so I'm going to close this issue. If there is more we can help you with feel free to open another issue, check out https://www.projectcalico.org//community, or join us on https://calicousers.slack.com/

@tmjd tmjd closed this as completed May 4, 2017
song-jiang pushed a commit that referenced this issue Jan 4, 2019
@ansiz
Copy link

ansiz commented Feb 11, 2019

@tmjd @liuxu623 How to run the processes that run in the calico/node container outside of docker? Can you provide some documents?

@liuxu623
Copy link
Author

liuxu623 commented Feb 11, 2019

@ansiz Just copy calico binary and config file from container to host , and use systemd manage calico process , you can see https://github.com/laincloud/lain/tree/master/playbooks/roles/calico, but it works on calico 2.1.

@ansiz
Copy link

ansiz commented Feb 12, 2019

@liuxu623 Thanks a lot, I have successfully run all the services outside the container. But if I assigned a calico network for a container which specified --restart always, then try to restart docker or reboot host, docker service can't restart successfully.

In order to successfully launch the docker service, I must run dockerd manually without --cluster-store, then force to delete the bad container, finally restart the docker service.

Have you encountered a similar situation?

@liuxu623
Copy link
Author

liuxu623 commented Feb 12, 2019

@ansiz Maybe docker.service start before calico-xxx.service,you can modify docker.service file,make docker.service start after calico-xxx.service. You should look dockerd's log. If dockerd start succeed , but container start failed, maybe you should use calicoctl release ip address.

@ansiz
Copy link

ansiz commented Feb 12, 2019

@ansiz Maybe docker.service start before calico-xxx.service,you can modify docker.service file,make docker.service start after calico-xxx.service. You should look dockerd's log.

I have configured service dependencies in systemd:

After=network-online.target etcd.service calico-felix.service calico-bird.service calico-libnetwork.service

And even though I didn't stop other services, Docker still can't restart.

Here is the systemctl log of Docker

no error or warning and won't timeout

● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: activating (start) since Tue 2019-02-12 18:00:19 CST; 6s ago
     Docs: https://docs.docker.com
 Main PID: 10499 (dockerd)
    Tasks: 26
   Memory: 49.1M
   CGroup: /system.slice/docker.service
           ├─10499 /usr/bin/dockerd --cluster-store=etcd://192.168.1.235:2379
           └─10507 docker-containerd --config /var/run/docker/containerd/containerd.toml

Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.189400947+08:00" level=info msg="[graphdriver] using prior storage driver: devicemapper"
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.228479158+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.231862733+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.231952634+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.232210547+08:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.232319572+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.232394937+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42037e770, CONNECTING" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.232765075+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42037e770, READY" module=grpc
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.232815792+08:00" level=info msg="Loading containers: start."
Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.847905583+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"

The manual execution log:

 ⚡ root@hpc-dev  ~  /usr/bin/dockerd --cluster-store=etcd://192.168.1.235:2379
INFO[2019-02-12T18:05:30.359664425+08:00] libcontainerd: started new docker-containerd process  pid=10872
INFO[2019-02-12T18:05:30.359905495+08:00] parsed scheme: "unix"                         module=grpc
INFO[2019-02-12T18:05:30.359923242+08:00] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2019-02-12T18:05:30.360018979+08:00] ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]  module=grpc
INFO[2019-02-12T18:05:30.360043948+08:00] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2019-02-12T18:05:30.360119100+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc420207a30, CONNECTING  module=grpc
INFO[0000] starting containerd                           revision=d64c661f1d51c48782c9cec8fda7604785f93587 version=v1.1.1
INFO[0000] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[0000] loading plugin "io.containerd.snapshotter.v1.btrfs"...  type=io.containerd.snapshotter.v1
WARN[0000] failed to load plugin io.containerd.snapshotter.v1.btrfs  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
INFO[0000] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
WARN[0000] failed to load plugin io.containerd.snapshotter.v1.aufs  error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1"
INFO[0000] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[0000] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
WARN[0000] failed to load plugin io.containerd.snapshotter.v1.overlayfs  error="/var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.overlayfs does not support d_type. If the backing filesystem is xfs, please reformat with ftype=1 to enable d_type support"
INFO[0000] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
WARN[0000] failed to load plugin io.containerd.snapshotter.v1.zfs  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
INFO[0000] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[0000] could not use snapshotter btrfs in metadata plugin  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
WARN[0000] could not use snapshotter aufs in metadata plugin  error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1"
WARN[0000] could not use snapshotter overlayfs in metadata plugin  error="/var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.overlayfs does not support d_type. If the backing filesystem is xfs, please reformat with ftype=1 to enable d_type support"
WARN[0000] could not use snapshotter zfs in metadata plugin  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
INFO[0000] loading plugin "io.containerd.differ.v1.walking"...  type=io.containerd.differ.v1
INFO[0000] loading plugin "io.containerd.gc.v1.scheduler"...  type=io.containerd.gc.v1
INFO[0000] loading plugin "io.containerd.service.v1.containers-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.content-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.diff-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.images-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.leases-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.namespaces-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.service.v1.snapshots-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.monitor.v1.cgroups"...  type=io.containerd.monitor.v1
INFO[0000] loading plugin "io.containerd.runtime.v1.linux"...  type=io.containerd.runtime.v1
INFO[0000] loading plugin "io.containerd.service.v1.tasks-service"...  type=io.containerd.service.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.containers"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.content"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.diff"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.events"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.healthcheck"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.images"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.leases"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.namespaces"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.snapshots"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.tasks"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.version"...  type=io.containerd.grpc.v1
INFO[0000] loading plugin "io.containerd.grpc.v1.cri"...  type=io.containerd.grpc.v1
INFO[0000] Start cri plugin with config {PluginConfig:{ContainerdConfig:{Snapshotter:overlayfs DefaultRuntime:{Type:io.containerd.runtime.v1.linux Engine: Root:} UntrustedWorkloadRuntime:{Type: Engine: Root:}} CniConfig:{NetworkPluginBinDir:/opt/cni/bin NetworkPluginConfDir:/etc/cni/net.d NetworkPluginConfTemplate:} Registry:{Mirrors:map[docker.io:{Endpoints:[https://registry-1.docker.io]}]} StreamServerAddress: StreamServerPort:10010 EnableSelinux:false SandboxImage:k8s.gcr.io/pause:3.1 StatsCollectPeriod:10 SystemdCgroup:false EnableTLSStreaming:false} ContainerdRootDir:/var/lib/docker/containerd/daemon ContainerdEndpoint:/var/run/docker/containerd/docker-containerd.sock RootDir:/var/lib/docker/containerd/daemon/io.containerd.grpc.v1.cri StateDir:/var/run/docker/containerd/daemon/io.containerd.grpc.v1.cri}
INFO[0000] Connect containerd service
WARN[0000] failed to load plugin io.containerd.grpc.v1.cri  error="failed to create CRI service: failed to find snapshotter "overlayfs""
INFO[0000] loading plugin "io.containerd.grpc.v1.introspection"...  type=io.containerd.grpc.v1
INFO[0000] serving...                                    address="/var/run/docker/containerd/docker-containerd-debug.sock"
INFO[0000] serving...                                    address="/var/run/docker/containerd/docker-containerd.sock"
INFO[0000] containerd successfully booted in 0.009422s
INFO[2019-02-12T18:05:30.401658190+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc420207a30, READY  module=grpc
INFO[2019-02-12T18:05:30.404715737+08:00] parsed scheme: "unix"                         module=grpc
INFO[2019-02-12T18:05:30.404740779+08:00] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2019-02-12T18:05:30.404794817+08:00] ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]  module=grpc
INFO[2019-02-12T18:05:30.404813379+08:00] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2019-02-12T18:05:30.404874725+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc4201801b0, CONNECTING  module=grpc
INFO[2019-02-12T18:05:30.405109760+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc4201801b0, READY  module=grpc
WARN[2019-02-12T18:05:31.482692863+08:00] Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man dockerd` to refer to dm.thinpooldev section.  storage-driver=devicemapper
WARN[2019-02-12T18:05:31.505788854+08:00] Base device already exists and has filesystem xfs on it. User specified filesystem  will be ignored.  storage-driver=devicemapper
INFO[2019-02-12T18:05:31.515348848+08:00] [graphdriver] using prior storage driver: devicemapper
INFO[2019-02-12T18:05:31.705031226+08:00] Graph migration to content-addressability took 0.00 seconds
INFO[2019-02-12T18:05:31.706408921+08:00] parsed scheme: "unix"                         module=grpc
INFO[2019-02-12T18:05:31.706430546+08:00] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2019-02-12T18:05:31.706477941+08:00] ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]  module=grpc
INFO[2019-02-12T18:05:31.706494505+08:00] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2019-02-12T18:05:31.706572080+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc4201ccc30, CONNECTING  module=grpc
INFO[2019-02-12T18:05:31.706845819+08:00] pickfirstBalancer: HandleSubConnStateChange: 0xc4201ccc30, READY  module=grpc
INFO[2019-02-12T18:05:31.706882233+08:00] Loading containers: start.
INFO[2019-02-12T18:05:32.138401410+08:00] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address

@liuxu623
Copy link
Author

Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.231952634+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc

please cat /usr/lib/systemd/system/docker.service

@ansiz
Copy link

ansiz commented Feb 12, 2019

Feb 12 18:00:19 hpc-dev dockerd[10499]: time="2019-02-12T18:00:19.231952634+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc

please cat /usr/lib/systemd/system/docker.service

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target etcd.service calico-felix.service calico-bird.service calico-libnetwork.service
Wants=network-online.target

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd --cluster-store=etcd://192.168.1.235:2379
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

@ansiz
Copy link

ansiz commented Feb 13, 2019

@liuxu623 Not caused by the Docker service, but a problem with calico-libnetwork.service,the Docker service can start normally after restarting the calico-libnetwork service, container will be stoped and here is error log:

WARN[2019-02-13T10:08:16.185390678+08:00] Unable to connect to plugin: /run/docker/plugins/calico.sock/NetworkDriver.CreateEndpoint: Post http://%2Frun%2Fdocker%2Fplugins%2Fcalico.sock/NetworkDriver.CreateEndpoint: EOF, retrying in 1s
WARN[2019-02-13T10:08:17.218588033+08:00] error locating sandbox id ffb15bee1be677efdff29ec2ee319ea6a1dc187c9d25096b6fd5de0aa586211c: sandbox ffb15bee1be677efdff29ec2ee319ea6a1dc187c9d25096b6fd5de0aa586211c not found
ERRO[2019-02-13T10:08:17.252914333+08:00] 0675a5e2c9234d528dc10c864e6fdc6e9e9d925fdf7d75d0670d0dcd964ea6fa cleanup: failed to delete container from containerd: no such container
ERRO[2019-02-13T10:08:17.252959524+08:00] Failed to start container 0675a5e2c9234d528dc10c864e6fdc6e9e9d925fdf7d75d0670d0dcd964ea6fa: failed to create endpoint test1 on network net1: NetworkDriver.CreateEndpoint: EOF

I think it should be caused by a calico-libnetwork data error. I will continue to investigate the issue

Thanks for your reply.

@ansiz
Copy link

ansiz commented Feb 13, 2019

@liuxu623 I finally found the problem after reading the source code.(I'm using v1.1.3)

// Use the Docker API to fetch the network name (so we don't have to use an ID everywhere)
	dockerCli, err := dockerClient.NewEnvClient()
	if err != nil {
		err = errors.Wrap(err, "Error while attempting to instantiate docker client from env")
		log.Errorln(err)
		return nil, err
	}
	defer dockerCli.Close()
	networkData, err := dockerCli.NetworkInspect(context.Background(), request.NetworkID)
	if err != nil {
		err = errors.Wrapf(err, "Network %v inspection error", request.NetworkID)
		log.Errorln(err)
		return nil, err
	}

When the Docker service starts, the container will start at the same time, it requests new endpoint via libnetwork-plugin, the libnetwork-plugin use the Docker API to fetch the network name, but the Docker service has not yet started...This will lead to a deadlock.

I temporarily use request.NetworkID instead of networkData.Name, the problem is solved.But I don't know if there are any other effects, I will test to confirm.

Here is the diff code: ansiz/libnetwork-plugin@3f052ae

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants