Closed
Labels: type: bug (Something isn't working)
Description
Sometimes the mv command prompts about overwriting a non-existent file:
* dseomn@8435c984468e:/tmp/tmp.xPFwwVcgKD$ ls -laR
.:
total 0
drwx------ 2 dseomn dseomn 100 Jun 5 20:34 .
drwxrwxrwt 2 root root 72260 Jun 5 20:33 ..
drwxr-xr-x 2 dseomn dseomn 60 Jun 5 20:34 foo
./foo:
total 0
drwxr-xr-x 2 dseomn dseomn 60 Jun 5 20:34 .
drwx------ 2 dseomn dseomn 100 Jun 5 20:34 ..
-rw-r--r-- 1 dseomn dseomn 0 Jun 5 20:34 bar
* dseomn@8435c984468e:/tmp/tmp.xPFwwVcgKD$ mv foo/bar quux
mv: replace 'quux', overriding mode 5000 (--S-----T)?
The actual mode it shows is different each time, and occasionally it works, which makes me think it might be reading the mode from uninitialized memory. Here's part of the strace log:
newfstatat(AT_FDCWD, "quux", 0x7ee458b79eb0, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
geteuid() = 1000
faccessat2(AT_FDCWD, "quux", W_OK, AT_EACCESS) = -1 ENOENT (No such file or directory)
faccessat2(AT_FDCWD, "quux", W_OK, AT_EACCESS) = -1 ENOENT (No such file or directory)
write(2, "mv: replace 'quux', overriding m"..., 54mv: replace 'quux', overriding mode 0000 (---------)? ) = 54
Steps to reproduce
I created the files and directories above using mktemp -d, mkdir foo, and touch foo/bar. runsc is running rootless with -ignore-cgroups and -network=host.
runsc version
runsc version 0.0~20240729.0
spec: 1.2.0
docker version (if using docker)
Client: Podman Engine
Version: 5.4.2
API Version: 5.4.2
Go Version: go1.24.2
Built: Sat May 24 14:25:04 2025
Build Origin: Debian
OS/Arch: linux/amd64
uname
Linux solaria 6.12.27-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.27-1 (2025-05-06) x86_64 GNU/Linux
kubectl (if using Kubernetes)
N/A
repo state (if built from source)
N/A
runsc debug logs (if available)
ayushr2 commented on Jun 5, 2025
Can you provide the full debug logs?
dseomn commented on Jun 5, 2025
When I restarted the container (with the debug logging options), it didn't happen right away. However, I tried to recreate what I had done before, and got the error.
Here's the relevant part of a shell script that I was running. The m17n/m17n-lib directory is a git clone of https://cgit.git.savannah.gnu.org/cgit/m17n/m17n-lib.git/tree/src/ with some local modifications, mounted with --volume="$HOME/Code/m17n:$HOME/m17n:ro".
Before this issue happened the first time (I think), I had started to run that, then realized that I was missing something I wanted to build, so I pressed Ctrl-C to try to cancel it. That didn't work (I think because I didn't pass --interactive to podman), so I killed the podman process instead. After killing it and running it again, I started to get the bug. Here's the end of the output from running it again after killing it, which I think is the same issue with mv getting the wrong stat results:
I got an error uploading the logs to GitHub, maybe because of the file size, so here's a Google Drive link: https://drive.google.com/file/d/1hMZIoSJpgQAwvkA9Mo7ejVv1dze8lgOY/view?usp=sharing
dseomn commented on Jun 5, 2025
It just happened again after restarting the container, without killing any podman processes this time.
ayushr2 commented on Jun 5, 2025
IIUC, we are trying to move /home/dseomn/work/m17n-lib/confpNR0ei/out to /home/dseomn/work/m17n-lib/Makefile. /home/dseomn/work/m17n-lib/confpNR0ei/out is a regular file that exists. renameat2 fails with EEXIST, but openat() and newfstatat() confirm that /home/dseomn/work/m17n-lib/Makefile does not exist.
I can see that /home/dseomn/work/m17n-lib/Makefile existed in the past:
But it was soon deleted:
It is confusing why renameat2() would fail with EEXIST in that case.
dseomn commented on Jun 6, 2025
That sounds right. It should be generated during the configure/build process, and I'm deleting all generated files and rebuilding because it's fast enough and I don't want to deal with stale build artifacts.
ayushr2 commented on Jun 6, 2025
@dseomn Could you:
- Confirm the repro steps: mktemp -d, mkdir foo, touch foo/bar, ls -laR and mv foo/bar quux. I tried this multiple times but can't repro. You mentioned that this happens frequently enough. The reproducer I am trying is: docker run --runtime=op --rm -it ubuntu bash -c 'cd "$(mktemp -d)" && pwd && mkdir foo && touch foo/bar && ls -laR && mv foo/bar quux'.
- You are on the 20240729 release. Could you try with the latest runsc release and see if it still repros?
- Run with --overlay2=none and check? This is just to narrow down the issue to overlayfs/tmpfs.
dseomn commented on Jun 6, 2025
Here's a one-liner that reliably reproduces the bug for me in some containers:
In this container, I don't get the bug:
In this container, I do get the bug:
localhost/dev-environment was built with https://github.com/dseomn/dotfiles/blob/7074061500b7b81bdadfa5f0e290ff72ec1e6664/.local/bin/dev-environment-build using https://github.com/dseomn/dotfiles/blob/7074061500b7b81bdadfa5f0e290ff72ec1e6664/.local/share/dotfiles/dev-environment/Containerfile
I added the --user, --workdir, and --env flags and specified a command to get rid of those differences between the two images. I don't think any of the other differences are relevant, except that they add additional layers and size to the image?
I tried the same command with new images generated from these to see if the additional layers mattered, but I didn't get the bug:
I want to work on other stuff today, but I can try that in a bit.
With this container, I don't see the bug:
So that seems to be a good workaround for now, thank you!
ayushr2 commented on Jun 6, 2025
Thanks @dseomn
I can reproduce this with the latest build. So no action needed from you. I will investigate this.
Ah so this is indeed an overlayfs OR tmpfs bug.
ayushr2 commented on Jun 6, 2025
I think I figured out the issue. This is a bug in our overlayfs implementation. It occurs when the renameat2(2) syscall is used with the RENAME_NOREPLACE flag to rename a file into a position which contains a whiteout. Overlayfs correctly identifies the whiteout and considers it an empty slot. But when it calls into the upper layer to actually do the rename work, it passes the same RENAME_NOREPLACE flag. tmpfs (the upper layer) does not treat the whiteout (which is a character device) specially and errors out with EEXIST.
Usually this does not cause any issues because the mv binary just retries the rename without the RENAME_NOREPLACE flag and that syscall succeeds. This is evidenced in the logs here from the non-error-printing attempts:
But sometimes, the mv binary does not do this retry step, as evidenced in these logs:
I also can't reproduce these failures with images like debian:testing and ubuntu:latest (as you pointed out). This is only reproducible with the custom image from your project. It remains a mystery to me why using your image makes the mv binary flaky in its behavior. I verified that the mv binary is the same between the two images:
Anyways, I will write a fix for this. I think overlayfs should not pass the RENAME_NOREPLACE flag to filesystem implementations. This is consistent with Linux; the kernel clears the RENAME_NOREPLACE flag in fs/overlayfs/dir.c:ovl_rename().
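The failure mode described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyUpper, ToyOverlay, WHITEOUT and try_rename are hypothetical names made up for illustration, the layers are plain dictionaries, and none of it is gVisor's or Linux's actual code. It shows why forwarding RENAME_NOREPLACE to an upper layer that treats a whiteout as an ordinary entry yields EEXIST, and why clearing the flag after the overlay's own check avoids it.

```python
import errno

RENAME_NOREPLACE = 1  # same value as the Linux flag; here it is only a marker

WHITEOUT = object()  # hypothetical stand-in for the character-device whiteout


class ToyUpper:
    """Upper layer (tmpfs stand-in). It has no notion of whiteouts."""

    def __init__(self):
        self.entries = {}

    def rename(self, src, dst, flags=0):
        if flags & RENAME_NOREPLACE and dst in self.entries:
            # At this level a whiteout is just another existing entry.
            raise OSError(errno.EEXIST, "File exists", dst)
        self.entries[dst] = self.entries.pop(src)


class ToyOverlay:
    """Overlay that understands whiteouts and delegates work to the upper layer."""

    def __init__(self, upper, clear_noreplace):
        self.upper = upper
        self.clear_noreplace = clear_noreplace

    def exists(self, name):
        entry = self.upper.entries.get(name)
        return entry is not None and entry is not WHITEOUT

    def unlink(self, name):
        # Assumption of this model: every deletion leaves a whiteout behind.
        self.upper.entries[name] = WHITEOUT

    def rename(self, src, dst, flags=0):
        # The overlay does its own NOREPLACE check; a whiteout counts as empty.
        if flags & RENAME_NOREPLACE and self.exists(dst):
            raise OSError(errno.EEXIST, "File exists", dst)
        if self.clear_noreplace:
            flags &= ~RENAME_NOREPLACE  # check already done, do not forward it
        self.upper.rename(src, dst, flags)


def try_rename(clear_noreplace):
    """Create Makefile, delete it, then rename 'out' over the whiteout."""
    upper = ToyUpper()
    overlay = ToyOverlay(upper, clear_noreplace)
    upper.entries["out"] = "new contents"
    upper.entries["Makefile"] = "old contents"
    overlay.unlink("Makefile")
    try:
        overlay.rename("out", "Makefile", RENAME_NOREPLACE)
    except OSError as e:
        return errno.errorcode[e.errno]
    return "ok"


print(try_rename(clear_noreplace=False))  # flag forwarded to the upper layer
print(try_rename(clear_noreplace=True))   # flag cleared before delegating
```

In this model the first call reports EEXIST even though the overlay itself considers Makefile absent, and the second call succeeds.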
ayushr2 commented on Jun 6, 2025
I realized that mv is a dynamically linked binary, and confirmed with ldd that its dependency .so files differ between the two images:
Notice that the offsets in these files changed. So some dependencies that you are installing in your image are causing the behavior of the mv binary to change.
overlayfs: Clear RENAME_NOREPLACE before calling into layers.
dseomn commented on Jun 6, 2025
I don't think I'm doing anything unusual with mv or any of its dependencies. I tried running debian:testing with an extra apt-get update and apt-get -y dist-upgrade, but that still didn't show the bug. Not sure what else could affect mv.
ayushr2 commented on Jun 7, 2025
Actually I get this bug on debian:testing too:
docker run --runtime=runsc --rm -it debian:testing bash -c 'mkdir /bug-test && touch /bug-test/foo && for i in {1..50}; do echo $i; mv /bug-test/{foo,bar}; mv /bug-test/{bar,foo}; done'
Not sure why with Podman I was not seeing this bug with debian:testing. Anyways, I confirmed that #11798 fixes this issue.
overlayfs: Clear RENAME_NOREPLACE before calling into layers.
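Since the behavior of mv varied between images in this thread, a probe that calls renameat2(2) directly may be a more deterministic check. The following is a rough sketch, not a verified reproducer: rename_noreplace and count_spurious_eexist are hypothetical helper names, and it assumes a libc that exports a renameat2 wrapper (glibc 2.28 or newer); where that wrapper is missing it falls back to a plain rename. It renames a file back and forth like the loop above and counts EEXIST errors for destinations that do not exist, retrying without the flag in the way mv was described as doing.

```python
import ctypes
import errno
import os
import tempfile

RENAME_NOREPLACE = 1  # value from <linux/fs.h>
AT_FDCWD = -100       # "relative to the current directory" on Linux


def rename_noreplace(src, dst):
    """Call renameat2(2) with RENAME_NOREPLACE; raise OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    func = libc.renameat2  # AttributeError if the libc has no wrapper
    func.argtypes = [ctypes.c_int, ctypes.c_char_p,
                     ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    func.restype = ctypes.c_int
    if func(AT_FDCWD, os.fsencode(src), AT_FDCWD, os.fsencode(dst),
            RENAME_NOREPLACE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), dst)


def count_spurious_eexist(directory=None, iterations=50):
    """Rename a file back and forth; count EEXIST errors for absent targets."""
    spurious = 0
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        foo = os.path.join(tmp, "foo")
        bar = os.path.join(tmp, "bar")
        open(foo, "w").close()
        for _ in range(iterations):
            for src, dst in ((foo, bar), (bar, foo)):
                try:
                    rename_noreplace(src, dst)
                except AttributeError:
                    os.rename(src, dst)  # no renameat2 wrapper available
                except OSError as e:
                    if e.errno == errno.EEXIST and not os.path.lexists(dst):
                        spurious += 1  # destination reported but not present
                    elif e.errno not in (errno.ENOSYS, errno.EINVAL):
                        raise
                    os.rename(src, dst)  # retry without the flag
    return spurious


if __name__ == "__main__":
    print("spurious EEXIST count:", count_spurious_eexist())
```

Passing a directory on the container's root filesystem (for example a path like the /bug-test directory used above) exercises the overlay rather than whatever backs the default temporary directory.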