rm: skipping stage1 GC: device or resource busy due to 'upperdir is in-use by another mount' #3805

Closed
euank opened this Issue Sep 20, 2017 · 1 comment

@euank
Member

euank commented Sep 20, 2017

Environment

rkt Version: 1.28.1
appc Version: 0.8.10
Go Version: go1.7.6
Go OS/Arch: linux/amd64
Features: -TPM +SDJOURNAL
--
Linux 4.13.2-coreos x86_64
--
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1535.1.0
VERSION_ID=1535.1.0
BUILD_ID=2017-09-15-1954
PRETTY_NAME="Container Linux by CoreOS 1535.1.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
--
systemd 234
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT -GNUTLS -ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN default-hierarchy=legacy

What did you do?

$ rkt run --uuid-file-save=/tmp/uuid docker://busybox --insecure-options=image
$ rkt rm --uuid-file=/tmp/uuid

What did you expect to see?

Normal output

What did you see instead?

rm: skipping stage1 GC: device or resource busy

$ dmesg | tail -n 1
[  614.343314] overlayfs: upperdir is in-use by another mount

Note, this doesn't actually prevent rm from succeeding. It continues on and works correctly as far as I can tell.
However, the error looks a bit scary, and I think it's a clear sign rkt's code is doing something wrong.

There's a similar error which docker hits unreliably over here: coreos/bugs#2127

@lucab

lucab Sep 20, 2017

Member

For reference, the full error is:

rm: skipping stage1 GC
  └─error mounting stage1
    └─device or resource busy

It comes from here:

rkt/rkt/gc.go, line 219 in 65d9e1c:

return errwrap.Wrap(errors.New("error mounting stage1"), err)

> Note, this doesn't actually prevent rm from succeeding.

Right, it doesn't. That's because the stage0 logic tries to recover from complex issues with an "eventually clean" approach. However, this error means that we are skipping the stage1 GC logic, which may leak other resources (i.e. anything that is not a mount or filesystem). I guess mostly cgroups and CNI items could be left around.

> However, the error looks a bit scary, and I think it's a clear sign rkt's code is doing something wrong.

Partially true. It looks like the kernel commit you linked changed the behavior of mount() and broke this invariant in our code:

rkt/rkt/gc.go, line 257 in 65d9e1c:

// an overlay fs can be mounted over itself, let's unmount it here

It used to be possible to double-mount the same overlayfs in place, but now that fails with EBUSY. We should proceed in that case, as it means we already have everything we need in place.
