Random "Cannot start container" Errors on 1.9.0-rc5 CentOS7 #17653

Closed
ryanbauman opened this issue Nov 3, 2015 · 41 comments
@ryanbauman

Since upgrading to 1.9.0 RCs on a CentOS7 host (3.10.0-229.14.1.el7.x86_64), I've been seeing random Cannot start container errors, e.g.:

Error response from daemon: Cannot start container 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b.scope/cgroup.procs: no such device

Relevant log entries are below for reference:

 Nov  3 08:48:10 dockerhost systemd: Starting docker container 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b.
Nov  3 08:48:10 dockerhost systemd: Started docker container 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b.
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.510245551-05:00" level=warning msg="signal: killed"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.551716997-05:00" level=debug msg="Releasing addresses for endpoint adoring_bhabha's interface on network bridge"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.551781175-05:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.4)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.553129616-05:00" level=debug msg="[devmapper] UnmountDevice(hash=479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.553166270-05:00" level=debug msg="[devmapper] Unmount(/data/docker/devicemapper/mnt/479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.555363308-05:00" level=debug msg="[devmapper] Unmount done"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.555390143-05:00" level=debug msg="[devmapper] deactivateDevice(479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.555473755-05:00" level=debug msg="[devmapper] removeDevice START(docker-253:4-1074522036-479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost systemd: Unit iscsi.service cannot be reloaded because it is inactive.
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.593476956-05:00" level=debug msg="[devmapper] removeDevice END(docker-253:4-1074522036-479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.593519481-05:00" level=debug msg="[devmapper] deactivateDevice END(479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.593543175-05:00" level=debug msg="[devmapper] UnmountDevice(hash=479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b) END"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.594937237-05:00" level=error msg="failed to umount /data/docker/containers/479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b/shm: invalid argument"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.594989602-05:00" level=error msg="failed to umount /data/docker/containers/479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b/mqueue: invalid argument"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595014866-05:00" level=error msg="479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b: Failed to umount ipc filesystems: failed to cleanup ipc mounts:\ninvalid argument\ninvalid argument"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595046108-05:00" level=debug msg="[devmapper] UnmountDevice(hash=479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b)"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595067916-05:00" level=debug msg="[devmapper] UnmountDevice(hash=479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b) END"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595093237-05:00" level=error msg="Error unmounting device 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b: UnmountDevice: device not-mounted id 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595199876-05:00" level=error msg="Handler for POST /v1.21/containers/479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b/start returned error: Cannot start container 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b.scope/cgroup.procs: no such device"
Nov  3 08:48:10 dockerhost docker: time="2015-11-03T08:48:10.595236064-05:00" level=error msg="HTTP Error" err="Cannot start container 479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-479acdbe53fc117178f9797bce1eaf3a4c3b68021f22df2f3e0f95bd28afdb6b.scope/cgroup.procs: no such device" statusCode=500

I'm able to reproduce this fairly reliably on CentOS7 using the technique mentioned in #17387:

 (set -e ; while true; do ID=$(docker run -d ibuildthecloud/helloworld:latest); docker rm -fv $ID;done)

This issue does not occur with Docker 1.8.3 installed, nor can I reproduce it with 1.9.0-rc5 on a system without systemd using an AUFS storage backend (Ubuntu 14.04).
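Because the `set -e` loop above stops at the first error, it does not show how often the failure happens. A rough sketch that counts failures over a fixed number of runs could look like this; `count_failures` and `run_once` are hypothetical helper names, not part of Docker, and the image is the one from the command above.

```shell
# Run a command N times and report how many runs failed.
# Usage: count_failures N command [args...]
# The function body is a subshell, so its variables do not leak.
count_failures() (
  runs=$1; shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$runs runs failed"
)

# One iteration of the reproduction: start a container, then force-remove it.
# If the start fails, the container is not cleaned up here.
run_once() {
  id=$(docker run -d ibuildthecloud/helloworld:latest) || return 1
  docker rm -fv "$id"
}
```

Running `count_failures 200 run_once` would then print a single summary line instead of exiting on the first failed start.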

Bug Report Information

$ docker version
Client:
 Version:      1.9.0-rc5
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   9318004
 Built:        Tue Nov  3 06:14:13 UTC 2015
 OS/Arch:      linux/amd64
Server:
 Version:      1.9.0-rc5
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   9318004
 Built:        Tue Nov  3 06:14:13 UTC 2015
 OS/Arch:      linux/amd64
$ docker info
Containers: 47
Images: 731
Server Version: 1.9.0-rc5
Storage Driver: devicemapper
Pool Name: docker-253:4-1074522036-pool
Pool Blocksize: 65.54 kB
Base Device Size: 107.4 GB
Backing Filesystem: xfs
Data file: /dev/centos_docker/docker_data
Metadata file: /dev/centos_docker/docker_metadata
Data Space Used: 88.79 GB
Data Space Total: 614.9 GB
Data Space Available: 526.1 GB
Metadata Space Used: 120 MB
Metadata Space Total: 17.05 GB
Metadata Space Available: 16.93 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-229.14.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 32
Total Memory: 188.7 GiB
Name: dockerhost
ID: IBQH:7AOM:H6KR:XXK5:5JKT:GODZ:EUEM:7VF2:6THK:W7NQ:SOPU:DL5Y
Debug mode (server): true
   File Descriptors: 33
   Goroutines: 59
   System Time: 2015-11-03T11:38:44.424354397-05:00
   EventsListeners: 0
   Init SHA1: d676303f7af5bf0ded806f3e60d02b299cd01060
   Init Path: /usr/libexec/docker/dockerinit
   Docker Root Dir: /data/docker
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
$ uname -a
 Linux dockerhost 3.10.0-229.14.1.el7.x86_64 #1 SMP Tue Sep 15 15:05:51 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
@ryanbauman ryanbauman changed the title from Random "Cannot start container" to Random "Cannot start container" Errors on 1.9.0-rc5 CentOS7 Nov 3, 2015
@bings

bings commented Nov 4, 2015

I do see the same here with 1.9.0 on CentOS 7:

[bings@build df-cmtools]$ docker run -d -p 9200:9200 -p 9300:9300 build.tools.datafusion.systems:5000/dfsystems/elasticsearch
2b4d075094472318c81d57b7115ae70d22303d8c3d01851ff0607a8c7c6d4933
[bings@build df-cmtools]$ docker run build.tools.datafusion.systems:5000/dfsystems/elasticsearch
Error response from daemon: Cannot start container d326f24c3e98da59d8812926c99657016a1d42cc2f07821acf258da3c579d360: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-d326f24c3e98da59d8812926c99657016a1d42cc2f07821acf258da3c579d360.scope/cgroup.procs: no such device
[bings@build df-cmtools]$ docker run build.tools.datafusion.systems:5000/dfsystems/elasticsearch
[2015-11-04 09:14:47,596][INFO ][node                     ] [Locus] version[1.7.3], pid[1], build[05d4530/2015-10-15T09:14:17Z]
[2015-11-04 09:14:47,597][INFO ][node                     ] [Locus] initializing ...
[2015-11-04 09:14:47,797][INFO ][plugins                  ] [Locus] loaded [], sites []

Unfortunately random, not really reproducible.

[bings@build df-cmtools]$ docker version
Client:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   76d6bc9
 Built:        Tue Nov  3 18:00:05 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   76d6bc9
 Built:        Tue Nov  3 18:00:05 UTC 2015
 OS/Arch:      linux/amd64
[bings@build df-cmtools]$ docker info
Containers: 17
Images: 1182
Server Version: 1.9.0
Storage Driver: btrfs
 Build Version: Btrfs v3.16.2
 Library Version: 101
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-229.14.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 8
Total Memory: 15.51 GiB
Name: build.tools.datafusion.systems
ID: 57A7:MCG2:ISNF:J6RP:X73E:AZZY:C4FG:2ESM:MHW6:7WTZ:RZCN:F373
[bings@build df-cmtools]$ uname -a
Linux build.tools.datafusion.systems 3.10.0-229.14.1.el7.x86_64 #1 SMP Tue Sep 15 15:05:51 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

@thaJeztah thaJeztah added this to the 1.9.1 milestone Nov 4, 2015
@thaJeztah
Member

Added this to the 1.9.1 milestone so we don't lose sight of it.

ping @LK4D4 care to have a look?

@bings

bings commented Nov 4, 2015

Quick update: I've seen that now also during "docker build":

Step 5 : RUN rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch &&     yum -y update && yum install -y which hostname &&     yum install -y elasticsearch && yum clean all
 ---> Running in 6d8cb2c3fd15
[8] System error: write /sys/fs/cgroup/devices/system.slice/docker-6d8cb2c3fd15ec244eafa060a43ca648fc6711f041c47b343c820b36ced79c32.scope/cgroup.procs: no such device

Same config / setup as written in my previous comment.

@crosbymichael
Contributor

This may be something related to CentOS, because I cannot repro it on my systemd-based Ubuntu system.

@bings

bings commented Nov 4, 2015

@crosbymichael pretty sure, it seems to be a timing issue, as repeating the same command (docker build / run) after such a failure mostly succeeds then...
Same with the newest kernel that appeared today for CentOS (3.10.0-229.20)

@crosbymichael
Contributor

Ya, I'll have to set up a CentOS box and try to reproduce this.

@bings

bings commented Nov 4, 2015

Let me know if I can support in any way

@icecrime icecrime added the priority/P1 Important: P1 issues are a top priority and a must-have for the next release. label Nov 6, 2015
@icecrime
Contributor

icecrime commented Nov 6, 2015

@crosbymichael Assigning this to you, please let us know how the attempts to reproduce go.

@sklaus

sklaus commented Nov 8, 2015

Same here:

Cannot start container 6026f44a59244a453498b12e156b365dc529bf555907704ea74a839e41002d1e: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-6026f44a59244a453498b12e156b365dc529bf555907704ea74a839e41002d1e.scope/cgroup.procs: no such device

On a Centos machine. Happens occasionally.

$ sudo docker --version
Docker version 1.9.0, build 76d6bc9

@msoho

msoho commented Nov 9, 2015

Was happening randomly on Docker version 1.9.0, build 76d6bc9 running Centos 7.1, 3.10.0-229.20.1.el7.x86_64.

Error response from daemon: Cannot start container 54fafc9cdea4167c6be9fd5162e5db7ec76a58be86993b1b5ce5086be4f89a40: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-54fafc9cdea4167c6be9fd5162e5db7ec76a58be86993b1b5ce5086be4f89a40.scope/cgroup.procs: no such device

Rolled back to 1.8.3 and can't repro.

@saily

saily commented Nov 9, 2015

same here:

ERROR: Build failed with: API error (500): 
Cannot start container 4186b9edfe1f530f275e8cc4dbf9a94a91030e4f8af66805de6506f21031fd5d: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-4186b9edfe1f530f275e8cc4dbf9a94a91030e4f8af66805de6506f21031fd5d.scope/cgroup.procs: no such device

docker:

$ docker --version
Docker version 1.9.0, build 76d6bc9

running on CentOS 7.1:

$ uname -a
Linux ci2 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

@crosbymichael
Contributor

ok, i can reproduce. looking for the cause

@crosbymichael
Contributor

@LK4D4 can you help me with this. cannot find anything yet that would cause this on systemd only

@LK4D4
Contributor

LK4D4 commented Nov 10, 2015

@crosbymichael how did you reproduce it?

@crosbymichael
Contributor

So far I can reproduce but cannot find the root cause.

A workaround for everyone is to launch the daemon with --exec-opt native.cgroupdriver=cgroupfs so that it does not use systemd to manage cgroups. Your unit file would look like this:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/docker daemon --exec-opt native.cgroupdriver=cgroupfs -H fd:// 
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity

[Install]
WantedBy=multi-user.target

Sorry about not finding a better fix; I'll keep looking for clues. However, with this change in place I have been running tests for a day now with zero errors.
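If editing the packaged unit file directly is not desirable, the same override can probably be applied through a systemd drop-in. The sketch below is only an illustration: the function name `write_cgroupfs_dropin` and the file name `cgroupfs.conf` are made up for this example, and the `ExecStart` line is copied from the unit file above, so it may need adjusting to match your distribution's packaging.

```shell
# Write a systemd drop-in that overrides ExecStart for docker.service.
# $1: drop-in directory, e.g. /etc/systemd/system/docker.service.d
# The file name cgroupfs.conf is an arbitrary choice for this sketch.
write_cgroupfs_dropin() {
  dropin_dir=$1
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/cgroupfs.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the main unit file.
ExecStart=
ExecStart=/usr/bin/docker daemon --exec-opt native.cgroupdriver=cgroupfs -H fd://
EOF
}
```

After calling it as root with `/etc/systemd/system/docker.service.d`, run `systemctl daemon-reload` followed by `systemctl restart docker` for the change to take effect.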

@gvd

gvd commented Nov 12, 2015

I'm running into the same issue with

docker --version
Docker version 1.9.0, build 76d6bc9

@crosbymichael 's workaround "resolves" the issue

@emlun

emlun commented Nov 13, 2015

I'll echo having the same problem since yesterday, when I upgraded to Docker 1.9. Containers randomly fail to start, but usually succeed on the first or second retry. The workaround suggested by @crosbymichael seems to "resolve" the issue for me as well.

$ docker info
Containers: 12
Images: 383
Server Version: 1.9.0
Storage Driver: devicemapper
 Pool Name: docker-253:2-10647568-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 9.48 GB
 Data Space Total: 107.4 GB
 Data Space Available: 97.89 GB
 Metadata Space Used: 19.79 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.128 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-229.20.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 24
Total Memory: 78.47 GiB

@ethinx

ethinx commented Nov 18, 2015

ERROR: Cannot start container b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c.scope/cgroup.procs: no such device

Log for ref:

Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.793661228+08:00" level=error msg="failed to umount /docker/containers/b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c/shm: invalid argument"
Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.793728213+08:00" level=error msg="failed to umount /docker/containers/b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c/mqueue: invalid argument"
Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.793758066+08:00" level=error msg="b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c: Failed to umount ipc filesystems: failed to cleanup ipc mounts:\ninvalid argument\ninvalid argument"
Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.793791777+08:00" level=error msg="Error unmounting device b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c: UnmountDevice: device not-mounted id b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c"
Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.794089953+08:00" level=error msg="Handler for POST /v1.19/containers/b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c/start returned error: Cannot start container b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c.scope/cgroup.procs: no such device"
Nov 18 10:29:02 dockerhost docker: time="2015-11-18T10:29:02.794135527+08:00" level=error msg="HTTP Error" err="Cannot start container b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-b3989a4f7a04c61e9abc584081e2974fd786743a41193b3a631f10bad7e8283c.scope/cgroup.procs: no such device" statusCode=500
$ sudo docker version
Client:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   76d6bc9
 Built:        Tue Nov  3 18:00:05 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   76d6bc9
 Built:        Tue Nov  3 18:00:05 UTC 2015
 OS/Arch:      linux/amd64
$ uname -r
3.10.0-229.20.1.el7.x86_64

@bruth

bruth commented Nov 22, 2015

@crosbymichael Thanks, the --exec-opt native.cgroupdriver=cgroupfs option worked for me as well on RHEL 7.

@thaJeztah
Member

Wondering if this is fixed for all following this by changing to cgroupfs. A recent PR changed the default to cgroupfs so (unless there are breaking changes), this will become the default in 1.10. See the PR that changed the default here: #17704

@ryanbauman
Author

@thaJeztah changing to cgroupfs has resolved the issue for me.

@TrentBrown

Adding "--exec-opt native.cgroupdriver=cgroupfs" to my Docker unit file worked for me, too. Am running CentOS 7.

@IBMRob

IBMRob commented Dec 3, 2015

I was hitting this with docker 1.9.1 on RHEL 7.2 but the "--exec-opt native.cgroupdriver=cgroupfs" option has fixed it for me

@thaJeztah
Member

@woshihaoren does using cgroupfs resolve this for you?

@woshihaoren

@thaJeztah No, I just docker rm it and create it again.

@woshihaoren

@thaJeztah I switched to cgroupfs, created the container 10 times, and did not see this problem. Looks good, thank you.

@sruffell

sruffell commented Jan 7, 2016

Ran into this issue with centos. I edited the unit file and problem was fixed.

This was with version Docker version 1.9.1, build a34a1d5

@ps-account

Encountered: System error: open /sys/fs/cgroup/devices/system.slice/docker [...].scope/cgroup.pcros: no such file or directory

with docker 1.9.1 on 3.10.0-327.4.5.el7.x86_64

I will try the workaround.

@ps-account

Just a note for others: Trying to edit the unit file for this workaround I got the error:

docker: "daemon" requires 0 arguments

when trying to start the service, but that is due to options now requiring an "=" sign.

Seems to work so far.

@thaJeztah
Member

ping @LK4D4 @crosbymichael is this resolved with the switching to cgroupfs as default in 1.10? #17704

akanto added a commit to hortonworks/cloudbreak-images that referenced this issue Feb 18, 2016
@markfra

markfra commented Feb 18, 2016

We received these messages occasionally as well:

Error response from daemon: Cannot start container a6759992e6fc885958fe5208b29da1624158a8ea5cbeef6f75610edfc5720d37: [8] System error: write /sys/fs/cgroup/devices/system.slice/docker-a6759992e6fc885958fe5208b29da1624158a8ea5cbeef6f75610edfc5720d37.scope/cgroup.procs: no such device

After upgrading to the latest available version (docker-engine-1.10.1), our issues seem to have disappeared.

From the release notes:

Change the default cgroup-driver to cgroupfs #17704

So this seems resolved at least on rhel7 using docker-engine-1.10.1-1.el7

@antoineco

I'm seeing this error message on CoreOS 983.0.0 with Docker 1.10.2, where native.cgroupdriver defaults to systemd.

dockerd[1080]: time="2016-03-17T10:28:24.942147259Z" level=error msg="Error unmounting container 29d4f79fce06105a45fc4f3c0de469608d409231e126990eeac8d087d997fd28: not mounted"
dockerd[1080]: time="2016-03-17T10:28:24.942308188Z" level=error msg="Handler for POST /containers/29d4f79fce06105a45fc4f3c0de469608d409231e126990eeac8d087d997fd28/start returned error: Cannot start container 29d4f79fce06105a45fc4f3c0de469608d409231e126990eeac8d087d997fd28: [9] System error: write /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-29d4f79fce06105a45fc4f3c0de469608d409231e126990eeac8d087d997fd28.scope/cpu.cfs_quota_us: invalid argument"

Setting native.cgroupdriver to cgroupfs doesn't seem to have any effect.

$ systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib64/systemd/system/docker.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─cgroups.conf
...
   CGroup: /system.slice/docker.service
           ├─5147 docker daemon --host=fd:// --exec-opt native.cgroupdriver=cgroupfs --bip=172.17.95.1/24 --mtu=9001 --ip-masq=false --selinux-enabled
$ docker version
Client:
 Version:      1.10.2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   eb1bdb1
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   eb1bdb1
 Built:        
 OS/Arch:      linux/amd64

$ docker info
Containers: 8
 Running: 6
 Paused: 0
 Stopped: 2
Images: 7
Server Version: 1.10.2
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 4.4.4-coreos
Operating System: CoreOS 983.0.0 (Coeur Rouge)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.8 GiB
Name: ip-10-0-12-211.eu-west-1.compute.internal
ID: JYD4:SY7M:2IUK:XLCB:LGQ7:CHHH:RTRW:XKTO:AHO4:TRRR:XI5C:IOLK

edit

This was an issue with a bad CpuQuota in my case.

@chbatey

chbatey commented Mar 21, 2016

Also see this on docker 1.8.3 using Kubernetes.

docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 06:06:01 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 06:06:01 UTC 2015
 OS/Arch:      linux/amd64
docker info
Containers: 12
Images: 882
Storage Driver: devicemapper
 Pool Name: direct_lvm-thin_pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 40.35 GB
 Data Space Total: 137.2 GB
 Data Space Available: 96.82 GB
 Metadata Space Used: 49.35 MB
 Metadata Space Total: 134.2 MB
 Metadata Space Available: 84.87 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Library Version: 1.02.107-RHEL7 (2015-12-01)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-327.4.5.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 8
Total Memory: 14.28 GiB
Name: ip-10-50-185-91
ID: KA7J:MUKZ:2GH5:HBRI:NJ6H:GJ7I:TDEZ:VXY3:V3UG:HNON:FPMR:YB6H

Error message:

[8] System error: write /sys/fs/cgroup/devices/system.slice/docker-68eb24155d961e1ddb95988dc96f790b12de6ffa87b5097d744775a24f1b5f3f.scope/cgroup.procs: no such device

@thaJeztah
Member

Can you try using native.cgroupdriver=cgroupfs? This is now the default in newer versions of Docker because of known issues with systemd cgroups.
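To confirm which driver a daemon is actually using, the `docker info` output can be checked for a `Cgroup Driver:` line. That line appears in the 1.11.2 output further down this thread but not in the 1.9.0 outputs above, so this sketch falls back to "unknown" when it is missing. `cgroup_driver` is a hypothetical helper name.

```shell
# Read `docker info` output on stdin and print the cgroup driver,
# or "unknown" if there is no "Cgroup Driver:" line.
cgroup_driver() {
  awk -F': *' '
    /^ *Cgroup Driver:/ { print $2; found = 1; exit }
    END { if (!found) print "unknown" }
  '
}
```

Typical use would be `docker info 2>/dev/null | cgroup_driver`.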

@vdemeester vdemeester removed the priority/P1 Important: P1 issues are a top priority and a must-have for the next release. label Jul 7, 2016
@WigiPedia

WigiPedia commented Sep 16, 2016

Hey y'all.
I'm new to this program but I'm trying to install the latest version of Zenoss Core which requires this particular version of docker. That being said, I am receiving the following status message after following the instructions in the installation manual.

● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─docker.conf
Active: failed (Result: exit-code) since Fri 2016-09-16 13:51:06 EDT; 7min ago
Docs: https://docs.docker.com
Process: 1684 ExecStart=/usr/bin/docker daemon $OPTIONS -H fd:// (code=exited, status=1/FAILURE)
Main PID: 1684 (code=exited, status=1/FAILURE)

Sep 16 13:51:06 COMPNAME systemd[1]: Starting Docker Application Container Engine...
Sep 16 13:51:06 COMPNAME docker[1684]: docker: "daemon" requires 0 arguments.
Sep 16 13:51:06 COMPNAME docker[1684]: See '/usr/bin/docker daemon --help'.
Sep 16 13:51:06 COMPNAME docker[1684]: Usage: docker daemon [OPTIONS]
Sep 16 13:51:06 COMPNAME docker[1684]: Enable daemon mode
Sep 16 13:51:06 COMPNAME systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Sep 16 13:51:06 COMPNAME systemd[1]: Failed to start Docker Application Container Engine.
Sep 16 13:51:06 COMPNAME systemd[1]: Unit docker.service entered failed state.
Sep 16 13:51:06 COMPNAME systemd[1]: docker.service failed.

Any help would be appreciated.
Thanks,
WigiPedia

@thaJeztah
Member

@WigiPedia docker: "daemon" requires 0 arguments. It looks like \$OPTIONS is escaped in your unit file and is therefore passed as a literal $OPTIONS. In any case, that does not look like a bug in Docker and is not related to the issue being discussed here.

@Jimilian

Jimilian commented Oct 13, 2016

Issue still happens on CentOS based image (AMI) even with Cgroup Driver: cgroupfs:

docker: Error response from daemon: linux runtime spec devices: lstat /dev/.rename_device.lock: no such file or directory.

On the next attempt docker starts without any issues. The issue can be observed in ~1% of all executions.

+ docker version
Client:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.3
 Git commit:   b9f10c9/1.11.2
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.3
 Git commit:   b9f10c9/1.11.2
 Built:        
 OS/Arch:      linux/amd64
+ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 35
Server Version: 1.11.2
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 4.4.8-20.46.amzn1.x86_64
Operating System: Amazon Linux AMI 2016.03
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 58.97 GiB
Name: ..............
ID: DXFV:3N4R:4VZ4:ADUT:KMCN:7LEE:5JQR:A5QW:D45A:P3XI:F2KQ:5HXK
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/

@crosbymichael, I'm thinking of creating a PR with a retry mechanism for docker run. Would such a PR have any chance of being accepted?
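Until something like that exists in Docker itself, a caller-side retry can be sketched in a few lines of shell. This is only an assumption about what such a mechanism might look like, not a proposed Docker implementation; `retry` is a hypothetical helper.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Runs the command until it succeeds or MAX_ATTEMPTS is reached.
# The function body is a subshell, so `exit` only ends the retry loop.
retry() (
  max=$1; delay=$2; shift 2
  attempt=1
  while :; do
    "$@" && exit 0
    [ "$attempt" -ge "$max" ] && exit 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
)
```

For example, `retry 3 1 docker run -d ibuildthecloud/helloworld:latest` would try up to three times with a one-second pause. A failed attempt may leave a created-but-not-started container behind, so cleanup is left to the caller.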

@thaJeztah
Member

Let me close this ticket for now, as it looks like it went stale.

@thaJeztah thaJeztah closed this as not planned Apr 21, 2023