docker run fails with "invalid argument" when using overlay driver on top of xfs #10294

Closed
renato-zannon opened this Issue Jan 23, 2015 · 75 comments

@renato-zannon
Contributor

renato-zannon commented Jan 23, 2015

$ docker version
Client version: 1.4.1
Client API version: 1.16
Go version (client): go1.3.3
Git commit (client): 5bc2ff8/1.4.1
OS/Arch (client): linux/amd64
Server version: 1.4.1
Server API version: 1.16
Go version (server): go1.3.3
Git commit (server): 5bc2ff8/1.4.1

$ docker info
Containers: 0
Images: 0
Storage Driver: overlay
Execution Driver: native-0.2
Kernel Version: 3.18.2-200.playground.fc21.x86_64
Operating System: Fedora 21 (Twenty One)
CPUs: 4
Total Memory: 3.646 GiB
Name: zannon
ID: ST2Y:3CA3:RYLL:QW67:55K6:BVVX:LQWB:Q4JL:XOE6:JNKH:JIQJ:MC7A

$ docker run --rm -ti ubuntu:14.04 bin/bash                                       
FATA[0000] Error response from daemon: mkdir /var/lib/docker/overlay/c4a8f5e516d401534f2d994f5546f7e08639ffd675eb3573267f76d79394f172-init/merged/dev/shm: invalid argument

This does not happen when /var/lib/docker is on ext4.

@coolljt0725

Contributor

coolljt0725 commented Jan 23, 2015

I have run into this situation too.

@ChaosEngine

ChaosEngine commented Mar 8, 2015

I see the same thing with ubuntu:14.04 and with my own image (chaosengine/memsql). When the daemon is started with --graph=/mnt/xfsPartition/docker, the container produces a core dump as follows:

$ docker run -p 127.0.0.1:3307:3306 -ti chaosengine/memsql
2015-03-08 09:25:58 STDERR INFO: Successfully became user 'memsql' (uid 999, gid 999)
00001378 2015-03-08 09:25:58 INFO: System Information: sockets (1), physical cores (4), virtual cores (4), model name (Intel(R) Core(TM) i5-2320 CPU @ 3.00GHz), memory (8309628928), uname (Linux c3c556fb0e2c 3.18.7-gentoo #1 SMP PREEMPT Sat Feb 21 21:33:17 CET 2015 x86_64)
0000014 2015-03-08 09:25:58 INFO: Log level changed to 0
00908005 2015-03-08 09:25:59 INFO: MemSQL version hash: 1925f16a786656eeb13c72bdc6e485d95e948713 (Thu Feb 5 13:31:10 2015 -0800)
01366084 2015-03-08 09:25:59 INFO: ./memsqld: ready for connections.
01372355 2015-03-08 09:25:59 INFO: Snapshot snapshots/memsql_snapshot_0 of database memsql: started replaying from offset 0
01372531 2015-03-08 09:25:59 INFO: Snapshot snapshots/memsql_snapshot_0 of database memsql: completed replaying at offset 89
01372707 2015-03-08 09:25:59 INFO: Log logs/memsql_log_0 of database memsql: started replaying from offset 0
01373215 2015-03-08 09:25:59 INFO: Log logs/memsql_log_0 of database memsql: completed replaying at offset 490
01373261 2015-03-08 09:25:59 INFO: Cleaning up columnstore file segments for database memsql
01373366 2015-03-08 09:25:59 INFO: Finished cleaning up columnstore file segments for database memsql
01412159 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.2a0752 to avoid overwriting it
01412225 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.2a0752 (No such file or directory)
01412251 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.7694bd to avoid overwriting it
01412283 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.7694bd (No such file or directory)
01412312 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.1f176 to avoid overwriting it
01412334 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.1f176 (No such file or directory)
01412355 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.609101 to avoid overwriting it
01412377 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.609101 (No such file or directory)
01412398 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.79c2c5 to avoid overwriting it
01412419 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.79c2c5 (No such file or directory)
01412439 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.3115b1 to avoid overwriting it
01412460 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.3115b1 (No such file or directory)
01412480 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.262f0f to avoid overwriting it
01412501 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.262f0f (No such file or directory)
01412521 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.6a1ad6 to avoid overwriting it
01412543 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.6a1ad6 (No such file or directory)
01412563 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.9a2058 to avoid overwriting it
01412584 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.9a2058 (No such file or directory)
01412604 2015-03-08 09:25:59 INFO: Renaming snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.d54791 to avoid overwriting it
01412625 2015-03-08 09:25:59 WARN: Failed to rename file snapshots/information_schema_snapshot_0 to snapshots/information_schema_snapshot_0.d54791 (No such file or directory)
01412651 2015-03-08 09:25:59 FAIL: Failed to create new 'information_schema' database (error 4)
+-----------------------------------------------------------------------------+
| MemSQL has encountered a fatal error and exited. |
| It could be a bug, misconfiguration, or hardware issue. |
+-----------------------------------------------------------------------------+
| |
| MemSQL was running with durability on, which means that when you |
| restart MemSQL, your data will be recovered back to a consistent state. |
| |
| Since the 'core-file' setting is enabled a core file will now be |
| generated at: |
| |
| /var/lib/memsql/data/core |
| |
| You can contact MemSQL directly at any time by emailing support@memsql.com |
| or calling 1-855-4-MEMSQL |
| |
+-----------------------------------------------------------------------------+
Aborted (core dumped)

When the default storage location /var/lib/docker (on ext4) is used, none of this happens.

docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.4.1
Git commit (client): a8a31ef
OS/Arch (client): linux/amd64
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.4.1
Git commit (server): a8a31ef

@vbatts

Contributor

vbatts commented Mar 16, 2015

This is presently not supported. Red Hat is tracking the work to get xfs to support overlayfs here: https://bugzilla.redhat.com/show_bug.cgi?id=1158888

@ChaosEngine

ChaosEngine commented Mar 16, 2015

@vbatts "You are not authorized to access bug #1158888." but...thanks for the tip

@vbatts

Contributor

vbatts commented Mar 16, 2015

@ChaosEngine ah, sorry.
Nonetheless, there is work happening to support overlayfs on xfs.

Though I'm not sure whether we should explicitly refuse to attempt it, as we do for btrfs and other unsupported underlying filesystems.

@dmcgowan

Member

dmcgowan commented Mar 17, 2015

@vbatts thanks for the insight (not for the link). I am seeing this issue as well.

@vbatts

Contributor

vbatts commented Mar 17, 2015

It is nothing magical, just that xfs does not support the RENAME_WHITEOUT and RENAME_EXCHANGE rename flags. This RFE is being worked on.
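
For readers who want to check their own filesystem, here is a minimal sketch of a probe for the RENAME_EXCHANGE flag mentioned above. It assumes Linux and a C library that exports a `renameat2` wrapper (glibc 2.28 or later does); the function name `supports_rename_exchange` is made up for this illustration and is not part of Docker. RENAME_WHITEOUT is deliberately not probed, because creating a whiteout may require elevated privileges on some kernels.

```python
import ctypes
import errno
import os
import tempfile

AT_FDCWD = -100           # "relative to the current directory", from <fcntl.h>
RENAME_EXCHANGE = 1 << 1  # atomically swap two paths, from <linux/fs.h>


def supports_rename_exchange(directory):
    """Probe `directory`: True or False if conclusive, None otherwise.

    Hypothetical helper for illustration only.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    renameat2 = getattr(libc, "renameat2", None)
    if renameat2 is None:
        return None  # the C library has no renameat2 wrapper
    renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                          ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    renameat2.restype = ctypes.c_int

    # Work inside a scratch directory so nothing is left behind.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        first = os.path.join(scratch, "a")
        second = os.path.join(scratch, "b")
        for path in (first, second):
            with open(path, "w") as handle:
                handle.write(os.path.basename(path))
        rc = renameat2(AT_FDCWD, os.fsencode(first),
                       AT_FDCWD, os.fsencode(second), RENAME_EXCHANGE)
        if rc == 0:
            return True
        if ctypes.get_errno() == errno.EINVAL:
            return False  # the filesystem rejected the flag
        return None       # ENOSYS, EPERM, ...: inconclusive
```

Calling it on the Docker graph directory (for example `/var/lib/docker`) needs write access there, so it would typically be run as root. A `False` result corresponds to the "invalid argument" failure described in this issue.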

@susandiamond

susandiamond commented Apr 30, 2015

I ran into a similar issue when trying to run a Docker container that needs to mount remote storage. It worked fine the first time I ran it. Then I created a snapshot of the VM and deployed the VM again, and now I have a problem starting the Docker container.

root@dockerregistry:/etc/docker# docker run -d -p 5000:5000 -e SETTINGS_FLAVOR=csfdev wdcloud/registry:latest
FATA[0000] Error response from daemon: Error mounting '/dev/mapper/docker-202:2-1163666-b58e234600563feffd23c0f124373279a786b6f40ce27132f1f038274bdcab6b-init' on '/var/lib/docker/devicemapper/mnt/b58e234600563feffd23c0f124373279a786b6f40ce27132f1f038274bdcab6b-init': invalid argument

root@dockerregistry:/etc/docker# docker version
Client version: 1.6.0
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 4749651
OS/Arch (client): linux/amd64
Server version: 1.6.0
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 4749651
OS/Arch (server): linux/amd64

root@dockerregistry:/etc/docker# docker info
Containers: 1
Images: 19
Storage Driver: device mapper
Pool Name: docker-202:2-1163666-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 896.1 MB
Data Space Total: 80.53 GB
Data Space Available: 79.63 GB
Metadata Space Used: 1.692 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.146 GB
Udev Sync Supported: false
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.82-git (2013-10-04)
Execution Driver: native-0.2
Kernel Version: 3.13.0-45-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 4
Total Memory: 7.829 GiB
Name: dockerregistry
ID: 2KGY:TIVI:PQ2P:3FIN:XYZE:N2VA:PWVV:ILZO:NQH5:B2IW:XQQN:EPRX
WARNING: No swap limit support

@ronin13

ronin13 commented May 5, 2015

Oddly enough, I am getting this with ext4 as well, on docker-1.5.0/CentOS 7.1/overlayfs:

docker run --rm -i --name box2 busybox ls
FATA[0000] Error response from daemon: mkdir /ssd/docker/overlay/e9d2a7c18d461aa286b51db3c3d5f4062c3cbe1a96954e568ad1685822c1448c-init/merged/run: invalid argument

docker info
Containers: 2
Images: 3
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Kernel Version: 3.10.0-229.1.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 8
Total Memory: 31.23 GiB
Name: XXXXXXX
ID: CKNT:AAX7:GLXY:JZ3P:JL55:SZS2:HFE7:57P5:VQWN:HMZG:A2W2:24BA

docker version
Client version: 1.5.0-dev
Client API version: 1.18
Go version (client): go1.3.3
Git commit (client): fc0329b/1.5.0
OS/Arch (client): linux/amd64
Server version: 1.5.0-dev
Server API version: 1.18
Go version (server): go1.3.3
Git commit (server): fc0329b/1.5.0
OS/Arch (server): linux/amd64

@ronin13

ronin13 commented May 5, 2015

Note that the previous error doesn't occur with docker 1.6.0; however, 1.6.0 fails at a later stage (#12984).

@vbatts

Contributor

vbatts commented May 5, 2015

@susandiamond interesting, but that is unrelated to this issue. Please open a separate issue for it.

FYI for everyone:
Independent of overlay, devicemapper, or any specific driver: when dealing with ioctl or other kernel interfaces, the kernel's default response when anything about the interaction is not okay is "invalid argument". Just seeing "invalid argument" does not mean that the issues are in any way related.
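
To illustrate the point above: "invalid argument" is simply the text for the generic errno EINVAL, so two reports can end in the same words while failing in different operations. The sketch below is a hypothetical triage helper (the name `summarize` and its categories are invented here, not part of Docker) that separates the shared errno text from the operation that actually failed.

```python
import errno
import os


def summarize(message):
    """Return (is_einval, operation) for a daemon error message.

    Hypothetical helper; the two categories only cover the error
    messages quoted in this thread.
    """
    text = message.strip().rstrip(".").lower()
    # os.strerror(EINVAL) is the same string for every EINVAL failure.
    is_einval = text.endswith(os.strerror(errno.EINVAL).lower())
    if "mkdir " in text and "/overlay/" in text:
        operation = "mkdir in an overlay mount"
    elif "error mounting" in text and "/devicemapper/" in text:
        operation = "devicemapper device mount"
    else:
        operation = "unknown"
    return is_einval, operation
```

Fed the overlay `mkdir` error from the original report and the devicemapper mount error from later comments, both come back flagged as EINVAL but with different operations, which is why they belong in separate issues.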

@juanluisbaptiste

juanluisbaptiste commented May 20, 2015

I'm seeing this too with docker start. I had some containers running on CentOS 7 with xfs and docker 1.6; the server had to be hard reset, and now all of the containers fail to start with the invalid argument error:

# docker start 178c58f02188
Error response from daemon: Cannot start container 178c58f02188: Error getting container 178c58f0218860525cfabe8a329139c4a0d8a3f161d18c8f3b0d3f612f193cb4 from driver devicemapper: Error mounting '/dev/mapper/docker-253:3-268689216-178c58f0218860525cfabe8a329139c4a0d8a3f161d18c8f3b0d3f612f193cb4' on '/var/lib/docker/devicemapper/mnt/178c58f0218860525cfabe8a329139c4a0d8a3f161d18c8f3b0d3f612f193cb4': invalid argument

@akranga

akranga commented Jun 25, 2015

Is there any progress on this issue? Docker v1.7 with overlay on top of xfs on RHEL 7 hits the same problem.

@ChaosEngine

ChaosEngine commented Jun 25, 2015

We'll probably have to wait for linux-4.1 (https://www.phoronix.com/scan.php?page=news_item&px=XFS-Linux-4.1), or use the current unstable version.
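
As a rough illustration of the version threshold mentioned above, the following sketch compares a kernel release string (as printed under "Kernel Version" in `docker info`) against a minimum major.minor pair. The helper name `kernel_at_least` is hypothetical, and the 4.1 threshold is taken from the comment above rather than verified here. Distribution kernels that carry backports, such as RHEL's, cannot be judged by version number alone.

```python
import re


def kernel_at_least(release, major, minor):
    """Return True if `release` is at least major.minor.

    Hypothetical helper: only the leading "X.Y" of the release
    string is compared; vendor suffixes are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= (major, minor)
```

For example, `kernel_at_least("3.18.2-200.playground.fc21.x86_64", 4, 1)` is false for the kernel in the original report.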

@marcellodesales

marcellodesales commented Jul 9, 2015

Same here... RHEL 7.1 :( Will be tracking this...

@thaJeztah

Member

thaJeztah commented Jul 9, 2015

@vbatts should xfs be added to the incompatible backing filesystem list in the overlay graph driver?
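
The idea of an incompatible-backing-filesystem check can be sketched as follows. This is not Docker's actual implementation: the deny-list contents, the helper names, and the approach of reading mount-table text (in the format of `/proc/mounts`) are all assumptions made for illustration.

```python
import os

# Hypothetical deny-list: btrfs is mentioned earlier in the thread,
# xfs is the addition proposed in the comment above.
INCOMPATIBLE_BACKING = {"btrfs", "xfs"}


def backing_fstype(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` uses the /proc/mounts layout: device, mount point,
    type, options, ... (escaped spaces in mount points are not handled).
    """
    path = os.path.normpath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry
            # wins because it shadows the earlier mount.
            if len(mount_point) >= len(best_point):
                best_point, best_type = mount_point, fstype
    return best_type


def overlay_allowed(path, mounts_text):
    """Return False when `path` sits on a deny-listed filesystem."""
    return backing_fstype(path, mounts_text) not in INCOMPATIBLE_BACKING
```

On a live system the mount table could be read with `open("/proc/mounts").read()` and the daemon's graph directory passed as `path`.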

@vbatts

Contributor

vbatts commented Jul 9, 2015

@thaJeztah it has had issues, but support for XFS is under active development.

@LordFPL

LordFPL commented Aug 6, 2015

So the problem is on the xfs side, and Red Hat is working on it via https://bugzilla.redhat.com/show_bug.cgi?id=1158888.
Unfortunately, this bug is not open to the public, and I'm on CentOS 7.1.
Does anybody know:

  • How far along the fix is?
  • Whether it will arrive via a regular 7.1 update, or be held for 7.2?

Thank you in advance for any info :)

@vbatts

Contributor

vbatts commented Aug 8, 2015

The changes are in the upstream kernel. I'm not certain which RHEL kernel
version will include them. The fix has been backported and is being tested
now; it will trickle down to CentOS afterwards.

@kervinpierre

kervinpierre commented Aug 25, 2015

This is happening with Ext4 as well... https://bugs.centos.org/view.php?id=8493

@discordianfish

Contributor

discordianfish commented Sep 21, 2015

I get the same on CentOS 7.1 / ext4 as well. #15668 was closed as a duplicate of this issue. Can someone update the issue title, given that several people have confirmed this issue on ext4?

@shenhequnying

shenhequnying commented May 12, 2016

@thaJeztah same problem on CentOS 7.2 with overlayfs on an xfs backing filesystem.
Error message:

docker run --rm --name "zookeeper" docker.io/wurstmeister/zookeeper
   Error response from daemon: mkdir /var/lib/docker/overlay/261eb7ff53f4a60439894d13468f8e080e759fc7ad2c91035755e2a7d0707204-init/merged/dev/shm: invalid argument

docker info

Containers: 2
Images: 72
Server Version: 1.9.1
Storage Driver: overlay
 Backing Filesystem: xfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-229.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 40
Total Memory: 251.6 GiB
Name: data-1-26
ID: KQHI:H3Q6:UF5A:YLOV:AZAC:TCUF:PBYF:7MCZ:CMVM:PPQZ:Y76X:6M6J
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

os-release

CentOS Linux release 7.2.1511 (Core)

@thaJeztah

Member

thaJeztah commented May 12, 2016

@shenhequnying what you're seeing may actually be #20640, and it may be something specific to RHEL/CentOS.

@chenqiangzhishen


chenqiangzhishen Jun 6, 2016

Fixed.

From the official Docker documentation:

To configure Docker to use the overlay storage driver your Docker host must be running version 3.18 of the Linux kernel (preferably newer) with the overlay kernel module loaded. OverlayFS can operate on top of most supported Linux filesystems. However, ext4 is currently recommended for use in production environments.

So updating the kernel from 3.10.0 to 3.18.0 or newer fixed the issue.
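
The 3.18 minimum quoted above can be checked against a kernel release string with a small script. This is only a rough sketch: the function names are made up for illustration, and the version number alone is a heuristic, since the reports in this thread show the overlay driver loading on 3.10 el7 kernels, which suggests vendor kernels may carry backports.

```python
import re

# Minimum kernel version quoted from the Docker documentation above.
MIN_OVERLAY_KERNEL = (3, 18)


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '3.10.0-327.10.1.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_overlay_minimum(release):
    """True if the release is at least the documented 3.18 minimum."""
    return parse_kernel_release(release) >= MIN_OVERLAY_KERNEL


if __name__ == "__main__":
    # Kernel release strings taken from the reports in this thread.
    for release in ("3.10.0-327.10.1.el7.x86_64", "3.18.2-200.playground.fc21.x86_64"):
        print(release, meets_overlay_minimum(release))
```

On a live host, `platform.release()` from the standard library returns the string to pass in.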

willstudy Jun 17, 2016

@chenqiangzhishen updating the kernel is difficult on a live production system. Is there another way to solve it?

chenqiangzhishen Jun 20, 2016

@willstudy, I don't know of any way to fix it without updating the kernel.

youngsu999 Nov 23, 2016

@willstudy, I have the same issue. My RHEL system uses a special device driver that depends on the kernel version.
The device driver does not support kernels newer than 3.10.xx (RHEL 7.2), and the system uses ext4 for security reasons.
So I tried to figure out what the problem is, and I found a few things.

[root@dcv04 docker]# nvidia-docker run -it --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: mkdir /var/lib/docker/overlay/4a0fd230b2c3f418a230a29efcb8868cc12f4c24366370ad8ae4132c01f85ae1-init/merged/dev/shm: invalid argument.
See 'docker run --help'.

I think this error may come from an attempt to create a directory inside a directory that does not exist (/var/lib/docker/overlay/4a0fd230b2c3f418a230a29efcb8868cc12f4c24366370ad8ae4132c01f85ae1-init/merged/dev).
So I tried the following "mkdir" commands.

[root@xx docker]# mkdir /var/lib/docker/overlay/4a0fd230b2c3f418a230a29efcb8868cc12f4c24366370ad8ae4132c01f85ae1-init/merged/dev/shm
mkdir: `/var/lib/docker/overlay/4a0fd230b2c3f418a230a29efcb8868cc12f4c24366370ad8ae4132c01f85ae1-init/merged/dev/shm' can not create directory : no such a file or directory
[root@xx docker]# mkdir -p /var/lib/docker/overlay/4a0fd230b2c3f418a230a29efcb8868cc12f4c24366370ad8ae4132c01f85ae1-init/merged/dev/shm

I thought we could work around the compatibility problem between the kernel version and overlayfs on ext4
if I could pinpoint the code that performs the invalid mkdir.

So I searched for mkdir-related code (using the strings "mkdir", "Error response from daemon", "/dev/shm", and other relevant function names).
Most of the code uses the "-p" option when "mkdir" is invoked.
But I found one suspicious piece of code that might be missing the option and could be trying to create "/dev/shm" under a directory that does not exist.

From grep:

[root@xx docker]# grep -rn mkdir
....
pkg/idtools/idtools_unix.go:23:func mkdirAs(path string, mode os.FileMode, ownerUID, ownerGID int, mkAll, chownExisting bool) error {
....

From the source code:

func mkdirAs(path string, mode os.FileMode, ownerUID, ownerGID int, mkAll, chownExisting bool) error {
    // make an array containing the original path asked for, plus (for mkAll == true)
    // all path components leading up to the complete path that don't exist before we MkdirAll
    // so that we can chown all of them properly at the end. If chownExisting is false, we won't
    // chown the full directory path if it exists
    var paths []string
    if _, err := os.Stat(path); err != nil && os.IsNotExist(err) {
        paths = []string{path}
    } else if err == nil && chownExisting {
        if err := os.Chown(path, ownerUID, ownerGID); err != nil {
            return err
        }
        // short-circuit--we were called with an existing directory and chown was requested
        return nil
    } else if err == nil {
        // nothing to do; directory path fully exists already and chown was NOT requested
        return nil
    }

    if mkAll {
        // walk back to "/" looking for directories which do not exist
        // and add them to the paths array for chown after creation
        dirPath := path
        for {
            dirPath = filepath.Dir(dirPath)
            if dirPath == "/" {
                break
            }
            if _, err := os.Stat(dirPath); err != nil && os.IsNotExist(err) {
                paths = append(paths, dirPath)
            }
        }
        if err := system.MkdirAll(path, mode); err != nil && !os.IsExist(err) {
            return err
        }
    } else {
        if err := os.Mkdir(path, mode); err != nil && !os.IsExist(err) {

Can anyone analyze the code further?
I'm not a developer and have very little programming experience.
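
One way to probe the missing-parent hypothesis is to look at which error code a plain mkdir (no -p) produces when the parent directory is missing. The sketch below is illustrative only: it uses a throwaway temporary directory rather than the real Docker paths, and `mkdir_errno` is a made-up helper name. A missing parent yields ENOENT ("No such file or directory"), matching the manual mkdir above, while the daemon reported "invalid argument" (EINVAL). That difference hints that the daemon's error may come from the filesystem layer rather than from a missing -p, though this is an inference and not confirmed here.

```python
import errno
import os
import tempfile


def mkdir_errno(path):
    """Try a plain mkdir (no -p) and return the errno it fails with, or 0 on success."""
    try:
        os.mkdir(path)
    except OSError as exc:
        return exc.errno
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # 'dev' does not exist under root, which mirrors running mkdir without -p.
        missing_parent = os.path.join(root, "dev", "shm")
        code = mkdir_errno(missing_parent)
        print(errno.errorcode.get(code), os.strerror(code))
```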

trendsoa Dec 11, 2016

I have the same issue with Red Hat 7.2 + Docker 1.12.3 + kernel 3.10.0-327.36.3.el7.x86_64 + overlay/extfs.

How can we fix this issue?

PS: If we upgrade the kernel to 4.4, it works.

bobrik Dec 18, 2016

Contributor

I wanted to share some internal findings. We tried Linux 4.9 and found this in dmesg:

ivan@foo:~$ dmesg | tail
[  184.223910] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[  209.364018] overlayfs: upper fs needs to support d_type.
[  209.582764] overlayfs: upper fs needs to support d_type.
[  209.944933] overlayfs: upper fs needs to support d_type.
[  210.110592] overlayfs: upper fs needs to support d_type.
[  210.317340] overlayfs: upper fs needs to support d_type.
[  210.844979] overlayfs: upper fs needs to support d_type.
[  211.055778] overlayfs: upper fs needs to support d_type.
[  211.305498] overlayfs: upper fs needs to support d_type.
[  212.146527] overlayfs: upper fs needs to support d_type.

Let's see what that is: https://patchwork.kernel.org/patch/8377611/

In some instances xfs has been created with ftype=0 and there if a file
on lower fs is removed, overlay leaves a whiteout in upper fs but that
whiteout does not get filtered out and is visible to overlayfs users.

To see what we have:

ivan@foo:~$ sudo xfs_info /dev/md127
meta-data=/dev/md127             isize=256    agcount=32, agsize=5491456 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=175724928, imaxpct=25
         =                       sunit=128    swidth=384 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=85808, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

http://man7.org/linux/man-pages/man8/mkfs.xfs.8.html

                   ftype=value
                          This feature allows the inode type to be stored in
                          the directory structure so that the readdir(3) and
                          getdents(2) do not need to look up the inode to
                          determine the inode type.

                          The value is either 0 or 1, with 1 signifiying
                          that filetype information will be stored in the
                          directory structure. The default value is 0.

                          When CRCs are enabled via -m crc=1, the ftype
                          functionality is always enabled. This feature can
                          not be turned off for such filesystem
                          configurations.

CRC sounds useful and should be cheap on modern CPUs.

                   crc=value
                          This is used to create a filesystem which
                          maintains and checks CRC information in all
                          metadata objects on disk. The value is either 0 to
                          disable the feature, or 1 to enable the use of
                          CRCs.

                          CRCs enable enhanced error detection due to
                          hardware issues, whilst the format changes also
                          improves crash recovery algorithms and the ability
                          of various tools to validate and repair metadata
                          corruptions when they are found.  The CRC
                          algorithm used is CRC32c, so the overhead is
                          dependent on CPU architecture as some CPUs have
                          hardware acceleration of this algorithm.
                          Typically the overhead of calculating and checking
                          the CRCs is not noticeable in normal operation.

                          By default, mkfs.xfs will enable metadata CRCs.

I've reformatted the disk with crc=1, and that fixed the issue we were facing. It may have been a different issue from the one reported here, but a very similar one.
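
To check an existing XFS filesystem for this condition, the ftype value can be pulled out of the `xfs_info` output shown above. This is a minimal sketch with hypothetical helper names; it assumes `xfs_info` is installed and that its output contains an `ftype=N` field as in the listing above.

```python
import re
import subprocess


def xfs_ftype(xfs_info_output):
    """Return the ftype value (0 or 1) found in xfs_info output, or None if absent."""
    match = re.search(r"\bftype=(\d)", xfs_info_output)
    return int(match.group(1)) if match else None


def read_xfs_info(target):
    """Run xfs_info on a mount point or device and return its text output."""
    result = subprocess.run(
        ["xfs_info", target], capture_output=True, text=True, check=True
    )
    return result.stdout


# Example use on a host (usually needs root):
#   if xfs_ftype(read_xfs_info("/var/lib/docker")) == 0:
#       print("ftype=0: no d_type support, overlay is likely to misbehave")
```

Keeping the parsing separate from the command invocation makes it possible to check saved `xfs_info` output from another machine.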

thaJeztah Dec 18, 2016

Member

@bobrik that's a known issue indeed; in the future, Docker will refuse to use overlay in that situation. See https://github.com/docker/docker/blob/v1.13.0-rc4/docs/deprecated.md#backing-filesystem-without-d_type-support-for-overlayoverlay2 and #27358.

ensemblebd Aug 17, 2017

Instances have been fine for almost a year and a half. All of a sudden, out of nowhere, this happens: they were running this morning and failed midday.
It seems to affect only certain containers (all the database ones), which is strange. It would be such a loss if I can't recover these. I should have mounted storage volumes for them and backed them up... sigh.
