overlay2 + linux v4.13: error creating overlay mount to /var/lib/docker/overlay2/ID/merged: device or resource busy #34672

@euank

Description

Starting with 4.13, the overlay driver in the kernel returns an error for overlay mounts that re-use an upper dir already in use by another mount. This error was introduced in this patch.

Using docker 17.06.1-ce on the 4.13-rc6 kernel, I can reproduce this error message, but only intermittently.
I've only ever observed it on the first container run, and only infrequently. I assume that two mounts race and sometimes clash.

Steps to reproduce the issue:

  1. Install the 4.13 kernel
  2. Boot the machine with an empty /var/lib/docker directory. Start dockerd, and as soon as possible, run a few dozen containers in parallel.
  3. Occasionally (perhaps 1 out of 30 runs), get the error "error creating overlay mount to /var/lib/docker/overlay2/ID/merged: device or resource busy"
  4. Note that the dmesg output includes "overlayfs: upperdir is in-use by another mount"
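
The steps above can be roughly sketched as a shell harness. This is an illustration, not the exact commands used for this report: the `busybox` image, the helper names `run_parallel` and `count_upperdir_busy`, and the `RUN_CMD` override are all assumptions introduced here.

```shell
#!/bin/sh
# Hypothetical reproduction harness; the image name and helper names are
# assumptions, not taken from the original report.

# Command launched for each container. Override RUN_CMD to try the harness
# without docker installed.
RUN_CMD="${RUN_CMD:-docker run --rm busybox true}"

# Start N copies of RUN_CMD in parallel and print how many of them failed.
run_parallel() {
    n="$1"
    tmp="$(mktemp -d)"
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each background job leaves a marker file if its command fails.
        ( $RUN_CMD >/dev/null 2>&1 || : > "$tmp/fail.$i" ) &
        i=$((i + 1))
    done
    wait
    failed="$(find "$tmp" -name 'fail.*' | wc -l | tr -d ' ')"
    rm -rf "$tmp"
    echo "$failed"
}

# Count kernel log lines (read from stdin) that carry the overlayfs message
# quoted in step 4.
count_upperdir_busy() {
    grep -c 'overlayfs: upperdir is in-use by another mount' || true
}
```

On an affected machine, something like `run_parallel 30` followed by `dmesg | count_upperdir_busy`, run right after starting dockerd on an empty /var/lib/docker, would approximate steps 2 to 4. Since the failure is intermittent, a single run may well report zero failures.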

Note that this only impacts running multiple containers at once. Serializing all container runs avoids it.
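
For comparison, a serialized run can be sketched as below. The `busybox` image, the `run_serial` name, and the `RUN_CMD` override are assumptions for illustration; the point is only that each container finishes before the next one starts.

```shell
#!/bin/sh
# Hypothetical serialized runner; the image name and function name are
# assumptions, not taken from the original report.

# Command launched for each container. Override RUN_CMD to try this
# without docker installed.
RUN_CMD="${RUN_CMD:-docker run --rm busybox true}"

# Run RUN_CMD N times, one after another, and print how many runs failed.
run_serial() {
    n="$1"
    failed=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # No backgrounding: the next run starts only after this one exits.
        $RUN_CMD >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}
```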

Output of docker version:

$ docker version
Client:
 Version:      17.06.1-ce
 API version:  1.30
 Go version:   go1.8.2
 Git commit:   874a737
 Built:        Sat Aug 26 01:07:04 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.1-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.2
 Git commit:   874a737
 Built:        Fri Aug 25 18:06:27 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: v0.13.2 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.13.0-rc6-coreos
Operating System: Container Linux by CoreOS 1506.0.0+2017-08-25-1813 (Ladybug)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 996.3MiB
Name: localhost
ID: ZBQY:PD55:UTX2:K2N4:CPQJ:HWIY:SOIQ:IC6P:NNXT:YKUZ:XFNP:ESWJ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

I've seen it on AWS and QEMU; presumably it happens in all environments.

I've also reported this issue on the CoreOS bug tracker.
