docker pull inside rootless LXC: failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat #45884

Open
frenzymind opened this issue Jul 5, 2023 · 15 comments · May be fixed by #45890
Labels
area/storage/overlay area/storage/zfs kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/confirmed version/23.0

Comments

@frenzymind

Description

Proxmox, rootless LXC container. I get:
failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat /tmp/v8-compile-cache-0/8.4.371.23-node.88: invalid argument
Other issues say that this is a uid/gid problem. I checked the image and there seem to be no problems with uid/gid, but I guess the problem is triggered when rm -rf is executed. Here is a screenshot.
[image: issue]
I also changed the uid/gid range of the LXC container as an experiment, and the problem is still there. So again, I think uid is not the cause here.
The user inside the image is root (0/0). The permissions look OK.
A rootful container works well. I have no idea what is wrong in this case.

What is happening here? What should I look for?

Reproduce

  1. docker pull teracy/angular-cli:14.0.6

Expected behavior

The image is pulled successfully.

docker version

Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:46:54 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb  9 19:46:54 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.16.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 5
  Running: 0
  Paused: 0
  Stopped: 5
 Images: 11
 Server Version: 23.0.1
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2456e983eb9e37e47538f59ea18f2043c9a73640
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.104-1-pve
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 4GiB
 Name: billing
 ID: 9ead38ad-466b-4f31-82ea-c3dd4ae3f3cb
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

pve-manager/7.4-3/9002ab8a (running kernel: 5.15.104-1-pve)

@frenzymind added the kind/bug and status/0-triage labels Jul 5, 2023
@neersighted
Member

Could you please provide output from dmesg? I suspect that what is happening here is you've updated your kernel or Docker, and that the newer iteration of overlay2 that attempts to detect if a filesystem is suitable is hitting issues.

Additionally, can you share the subuid/subgid maps and the image? It's possible, though unlikely (I don't think the kernel will return EINVAL in that case), that something is acting up there given remapping.

@neersighted
Member

More evidence it's not the subuid/subgid maps directly: that would make this a duplicate of #43576, and the version you're running has an improved error message for that specific case.

@frenzymind
Author

Could you please provide output from dmesg?

dmesg is:

[136959.869942] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136959.870008] overlayfs: fs on '/var/lib/docker/overlay2/l/5XRG6RCSU375FNPO7QQZRXVSCR' does not support file handles, falling back to xino=off.
[136960.043191] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136960.043254] overlayfs: fs on '/var/lib/docker/overlay2/l/MJABGHRZCXW23IQX74BJZXGRGW' does not support file handles, falling back to xino=off.
[136964.913458] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136964.913520] overlayfs: fs on '/var/lib/docker/overlay2/l/DOPK2ALFGZYAYB3YYGWABXNDPX' does not support file handles, falling back to xino=off.
[136965.261145] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136965.261208] overlayfs: fs on '/var/lib/docker/overlay2/l/GP545U4TH7IE2OT7WFKCZR2MKJ' does not support file handles, falling back to xino=off.
[136965.616563] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136965.616624] overlayfs: fs on '/var/lib/docker/overlay2/l/DKSJIGBRZ2GYTCO7RX44W4XALM' does not support file handles, falling back to xino=off.

But this log also appears with a rootful container. I use ZFS as the backing filesystem, and I haven't seen any negative consequences from the RENAME_WHITEOUT message, even for more complex Docker images such as GitLab.

Additionally, can you share the subuid/subgid maps and the image?

Image from docker hub: docker pull teracy/angular-cli:14.0.6

On the host I allow these ids:

root@testlab:~# cat /etc/subuid
root:100000:510000000

root@testlab:~# cat /etc/subgid
root:100000:510000000

and in lxc config:

lxc.idmap: u 0 100000 500000000
lxc.idmap: g 0 100000 500000000

That should be enough to use high ids inside the LXC container and have them allowed on the host.

More evidence it's not the subuid/subgid maps directly: that would make this a duplicate of #43576, and the version you're running has an improved error message for that specific case.

I read that issue; it was a pure uid problem.
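
The subuid/subgid and lxc.idmap values quoted above can be checked arithmetically. The following is a minimal Python sketch, assuming the `lxc.idmap: <u|g> <container_start> <host_start> <count>` and `/etc/subuid` (`name:start:count`) line formats exactly as pasted in this thread. The names `IdMap`, `parse_idmap`, `to_host_id` and `fits_subid` are hypothetical helpers, not part of LXC or Docker.

```python
from typing import NamedTuple, Optional


class IdMap(NamedTuple):
    kind: str             # "u" for uid, "g" for gid
    container_start: int  # first ID inside the container
    host_start: int       # first ID on the host
    count: int            # number of consecutive IDs mapped


def parse_idmap(line: str) -> IdMap:
    """Parse a line such as 'lxc.idmap: u 0 100000 500000000'."""
    _, _, value = line.partition(":")
    kind, container_start, host_start, count = value.split()
    return IdMap(kind, int(container_start), int(host_start), int(count))


def to_host_id(idmap: IdMap, container_id: int) -> Optional[int]:
    """Return the host ID for container_id, or None if it is unmapped."""
    offset = container_id - idmap.container_start
    if 0 <= offset < idmap.count:
        return idmap.host_start + offset
    return None


def fits_subid(idmap: IdMap, subid_line: str) -> bool:
    """Check that the host range of idmap lies inside one subuid/subgid entry."""
    _, start_text, count_text = subid_line.split(":")
    start, count = int(start_text), int(count_text)
    host_end = idmap.host_start + idmap.count
    return start <= idmap.host_start and host_end <= start + count
```

With the values from this thread, `fits_subid(parse_idmap("lxc.idmap: u 0 100000 500000000"), "root:100000:510000000")` is true, which is consistent with the maps not being the direct cause.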

@neersighted
Member

[136965.616563] overlayfs: upper fs does not support RENAME_WHITEOUT.
[136965.616624] overlayfs: fs on '/var/lib/docker/overlay2/l/DKSJIGBRZ2GYTCO7RX44W4XALM' does not support file handles, falling back to xino=off.

This is going to be the issue -- you're on an old enough version of ZFS that some of the system calls we use don't work in every scenario. Things might appear to work with root, but that should be an illusion; I need to find some time to download the Proxmox kernel sources and confirm the ZFS version, but it's almost guaranteed to be < 2.2 (see openzfs/zfs#8648 for more).

We should consider adding a more functional test than "can it mount overlay with multiple lowerdirs?" to prevent incorrectly picking overlay2; I suspect that previously in this situation we would have fallen all the way back to vfs unless you made fuse-overlayfs available.

I would suggest manually selecting one of those two storage drivers as overlay2 will not work here with a busted underlying filesystem.
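
The "< 2.2" condition mentioned above could be checked with a small helper. This is only a sketch, assuming version strings in the forms that appear in this thread (`2.1.9`, `2.2.0-pve3`, `zfs-2.2.2-pve2`); `parse_zfs_version` and `is_at_least_2_2` are hypothetical names. A version number alone is not proof that overlay2 will work on a given setup, so treat the result as a hint rather than a functional test.

```python
import re
from typing import Tuple


def parse_zfs_version(text: str) -> Tuple[int, ...]:
    """Extract the first 'major.minor.patch' triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least_2_2(text: str) -> bool:
    """True if the reported ZFS version is 2.2 or newer."""
    return parse_zfs_version(text)[:2] >= (2, 2)
```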

@neersighted changed the title from "Rootless docker pull failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat" to "docker pull inside rootless LXC: failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat" Jul 5, 2023
@neersighted
Member

neersighted commented Jul 5, 2023

Yes, you're on an incompatible version of ZFS that's in the uncanny valley. Your kernel was built from 9beed8f7a598fce47040fb476c15309760491aaf, which includes ZoL at 5ea8a38968ee2cc9e50b3a66819b5520f46eb660.

This version of OpenZFS/ZoL self-reports as 2.1.9, which is missing RENAME_WHITEOUT vfs support; you can test this with e.g. renameat2(2), which will return EINVAL.

We need a more functional test to detect these edge-case filesystems, as the current detection logic when combined with your kernel results in this uncanny valley situation.
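
The renameat2(2) check mentioned above can be tried from Python via ctypes. This is a rough sketch, not the detection logic Docker uses: `probe_rename_whiteout` is a hypothetical name, the constants are copied from the Linux UAPI headers, and the call relies on the C library exporting a `renameat2` wrapper (an assumption that does not hold for every libc). On some kernels an unprivileged caller may also get EPERM instead of EINVAL.

```python
import ctypes
import errno
import os
import tempfile
from typing import Optional

AT_FDCWD = -100           # resolve relative paths against the current directory
RENAME_WHITEOUT = 1 << 2  # flag value from linux/fs.h


def probe_rename_whiteout(directory: str) -> Optional[int]:
    """Try renameat2(..., RENAME_WHITEOUT) in a scratch dir under `directory`.

    Returns None if the rename succeeded, otherwise the errno value
    (for example errno.EINVAL when the filesystem rejects the flag).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    renameat2 = getattr(libc, "renameat2", None)
    if renameat2 is None:
        # Assumption: no wrapper in this libc, report "not implemented".
        return errno.ENOSYS
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        src = os.path.join(scratch, "src")
        dst = os.path.join(scratch, "dst")
        open(src, "w").close()  # create the file that will be renamed
        ret = renameat2(AT_FDCWD, os.fsencode(src),
                        AT_FDCWD, os.fsencode(dst),
                        RENAME_WHITEOUT)
        if ret == 0:
            return None
        return ctypes.get_errno()
```

Calling `probe_rename_whiteout("/var/lib/docker")` as root inside the LXC container would exercise the same filesystem that overlay2 uses as its upper directory; `errno.errorcode` can turn a returned number into a name such as `EINVAL`.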

@neersighted linked a pull request Jul 5, 2023 that will close this issue
@frenzymind
Author

@neersighted Thank you very much for the detailed explanation and your time.
I will wait for the ZFS 2.2.0 release.
I wrote up a conclusion based on this thread, including workarounds to avoid this error: Proxmox + ZFS + LXC + Docker

@f0re1gnKey

Running the LXC container in privileged mode can solve this problem.

But this causes a security problem unless you TRUST YOUR DOCKER LXC MACHINE (not the Docker container, it is still in unprivileged mode).

@qupfer

qupfer commented Sep 27, 2023

I worked around this with fuse-overlayfs.
In the LXC container (aka the Docker host):
systemctl stop docker
rm -rf /var/lib/docker
apt -y install fuse-overlayfs
Then add to /etc/docker/daemon.json:

{
  "storage-driver": "fuse-overlayfs"
}

Reboot.

Obviously, this is not as performant as overlay2 and needs the fuse capability inside the container, but (I think) it is still better than the vfs storage driver, a privileged container, or some loopback-ext4 workarounds.

(I read somewhere that the LXC host also needs fuse-overlayfs installed...)
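
If /etc/docker/daemon.json already contains other settings, the "storage-driver" key can be merged in rather than overwriting the file. A minimal Python sketch, assuming the file is plain JSON as in the snippet above; `set_storage_driver` is a hypothetical helper, not a Docker tool.

```python
import json
from pathlib import Path


def set_storage_driver(config_path: str, driver: str) -> dict:
    """Set "storage-driver" in a daemon.json, keeping all other keys."""
    path = Path(config_path)
    config = {}
    # Reuse the existing settings if the file is present and non-empty.
    if path.exists() and path.read_text().strip():
        config = json.loads(path.read_text())
    config["storage-driver"] = driver
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `set_storage_driver("/etc/docker/daemon.json", "fuse-overlayfs")` run as root, followed by a restart of the Docker service, would apply the setting described above.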

@nbrugger-tgm

@qupfer be very cautious with posting things like rm -rf /var/lib/docker! This wipes your complete Docker installation: all containers, all (local) volumes. Depending on who executes this and where, a lot of work can be lost. If you do post such destructive commands, please add a notice about what they do and cause; not everyone knows where Docker stores what.

@weboide

weboide commented Nov 22, 2023

Still the same issue on proxmox 8.0.9 and zfs 2.2.0-pve3, with the same subuid/subgid and lxc.idmap as OP.

ERROR: failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat /usr/lib/locale/C.UTF-8/LC_MESSAGES: invalid argument

@DanielHabenicht

For me, the problem is now resolved with Proxmox 8.1 and ZFS 2.2.0. I could not reproduce it anymore.

@weboide

weboide commented Nov 29, 2023

I also confirm this is resolved in Proxmox 8.1 with zfs 2.2.0-pve3.

@cpadil

cpadil commented Dec 1, 2023

@weboide I'm also testing this with an unprivileged LXC container in PVE 8.1. It seems to be working just fine, but I see this in the PVE node log:

overlayfs: fs on '/var/lib/docker/overlay2/l/2SWWDMHD7HAHLZVDLXFYX7RZZZ' does not support file handles, falling back to xino=off.

I haven't had any issues so far, but I'm wondering if you also get the same warnings?
Thanks

@fritzi001

fritzi001 commented Jan 8, 2024

I get these warnings as well, and I am not sure whether I should be concerned or whether it's just informational and things will work, maybe with a small decrease in performance?

Proxmox 8.1 with zfs 2.2.0-pve3.
LXC Container
[image]

@s0129

s0129 commented Mar 10, 2024

Has this regressed?
I am still getting the error on Proxmox 8.1.4 with zfs-2.2.2-pve2
