vfs storage driver does not work on NFS #45417

Closed
ChenQi1989 opened this issue Apr 27, 2023 · 5 comments · Fixed by #45463
Labels
area/storage kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.

Comments

@ChenQi1989

Description

vfs storage driver does not work on NFS.

A simple 'docker run -it alpine' results in the following error:
docker: Error response from daemon: operation not supported.
level=error msg="Handler for POST /v1.42/containers/create returned error: operation not supported"

Running the daemon under strace shows the failing call:
lgetxattr("/var/lib/docker/vfs/dir/a93d6acc41f0fddc597f35fa1fb0b1c1b79c8ab04000570473cd15da20131cf3", "security.capability", 0xc000f3b200, 128) = -1 EOPNOTSUPP (Operation not supported)

This means the daemon is trying to read the security.capability extended attribute, but the underlying NFS mount does not support extended attributes.
Is this expected or is this a bug?

Reproduce

docker run -it alpine

Expected behavior

docker run succeeds

docker version

# docker version
Client:
 Version:           20.10.21-ce
 API version:       1.41
 Go version:        go1.20.1
 Git commit:        baeda1f82a
 Built:             Thu Apr 27 02:36:59 2023
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.21-ce
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.20.1
  Git commit:       4ed81ac0e2-unsupported
  Built:            Wed Nov  9 03:13:48 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.0-11-g6ea9bc57f.m
  GitCommit:        6ea9bc57f97cd6bdd62afe8c8295706de36afd51.m
 runc:
  Version:          1.1.5+dev
  GitCommit:        v1.1.5-1-g17a2d451-dirty
 docker-init:
  Version:          0.19.0
  GitCommit:        b9f42a0-dirty

docker info

# docker info
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.21-ce
 Storage Driver: vfs
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 6ea9bc57f97cd6bdd62afe8c8295706de36afd51.m
 runc version: v1.1.5-1-g17a2d451-dirty
 init version: b9f42a0-dirty
 Kernel Version: 5.15.103-yocto-standard
 Operating System: Poky (Yocto Project Reference Distro) 4.2 (mickledore)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.841GiB
 Name: qemux86-64
 ID: PRTU:AZVE:RYAU:RIET:2XA2:AEPF:RJAZ:XFVO:XZEC:SOS4:WH2P:E776
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

The 'docker version' and 'docker info' output above is from 20.10.21, but I want to clarify that the issue is still present in current Docker releases.

On docker 23.0.2, we got this problem.
On docker 20.10.21, we got this problem.
On docker 20.10.17, there's no such problem.

@ChenQi1989 ChenQi1989 added kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage labels Apr 27, 2023
@ChenQi1989
Author

https://docs.docker.com/storage/storagedriver/select-storage-driver/
The above link says 'vfs' supports 'any filesystem'. This does not seem to be the case any more. It currently does not support NFS.
Or am I missing something?

@ChenQi1989
Author

The commit that introduces this problem is: 31f654a

@ChenQi1989
Author

The commit (31f654a) changed the semantics of DirCopy, but it did not update the call to DirCopy in daemon/graphdriver/vfs/copy_linux.go.

ChenQi1989 added a commit to ChenQi1989/moby that referenced this issue Apr 27, 2023
vfs is declared to work with any filesystem, but after
moby@31f654a
it's no longer working with NFS.

As the extended attribute support depends on filesystem and
if we do copy it in vfs and do not allow failure, that would
essentially mean that vfs does NOT support all filesystems but
only those that support xattr.

So we should just try to copy security.capabilities and allow
for failure. In this way, vfs come back to the state of
being able to run on any filesystem as declared in
https://docs.docker.com/storage/storagedriver/select-storage-driver/.

Fixes moby#45417

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
@corhere
Contributor

corhere commented May 3, 2023

NFS v4.2 on Linux 5.9 and above supports extended attributes when the underlying filesystem of the export does.

https://www.phoronix.com/news/Linux-5.9-NFS-Server-User-Xattr
https://www.phoronix.com/news/Linux-5.9-NFS-Client-Changes

Allowing the VFS driver to operate on filesystems which do not support xattrs will seemingly work at first but will cause hard-to-diagnose problems for containers which depend on file capabilities for proper functioning. The source of the "regression" is a fix for a real problem which affects real containers.

In my opinion, the only bugs here are in the documentation and a lack of a filesystem-compatibility check by the vfs driver on daemon startup.

@ChenQi1989
Author

> NFS v4.2 on Linux 5.9 and above support extended attributes when the underlying filesystem of the export does.
>
> https://www.phoronix.com/news/Linux-5.9-NFS-Server-User-Xattr
> https://www.phoronix.com/news/Linux-5.9-NFS-Client-Changes

According to the links above, it only supports user xattrs, not the security xattrs that vfs copies.
Anyway, even if a new kernel and NFS version support security xattrs, it is not reasonable to require people to upgrade their testing infrastructure, because such upgrades usually come with some chaos, and chaos means cost.
If I ask people to upgrade the testing infrastructure and tell them that the failure comes from xattr copying, the next question I would get is: does your image have xattrs?

> Allowing the VFS driver to operate on filesystems which do not support xattrs will seemingly work at first but will cause hard-to-diagnose problems for containers which depend on file capabilities for proper functioning. The source of the "regression" is a fix for a real problem which affects real containers.
>
> In my opinion, the only bugs here are in the documentation and a lack of a filesystem-compatibility check by the vfs driver on daemon startup.

In the PR's comments, we all agreed on the daemon option solution. I'll implement it. I'm not familiar with Go or the Docker source code, so that may take some time. Please help review when it's done. Thanks :)

mseaster-wr pushed a commit to WindRiverLinux23/meta-virtualization that referenced this issue Aug 31, 2023
Issue: LIN1023-638

Docker now errors out when running on NFS because a regression in
xattr copying was introduced into the vfs storage driver.
See moby/moby#45417

Backport a patch to fix this issue.

(LOCAL REV: NOT UPSTREAM) -- WRLinux Specific Patch

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>