
Execution issue when rootfs is mounted twice "Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory" #273

Open
TristanCacqueray opened this issue Jun 24, 2018 · 4 comments

Comments

@TristanCacqueray
Contributor

TristanCacqueray commented Jun 24, 2018

Greetings, sometimes our CI jobs fail when bwrap fails to start and dies with this message: "bwrap: Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory"

It seems like this happens when the rootfs is mounted twice:

$ cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=3984364k,nr_inodes=996091,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=28,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13028 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=801012k,mode=700 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0

$ bwrap --dir /tmp --tmpfs /tmp --chdir /tmp/ --dir /var --dir /var/tmp --dir /run/user/974 --ro-bind /usr /usr --ro-bind /lib /lib  --ro-bind /lib64 /lib64 --ro-bind /bin /bin --ro-bind /sbin /sbin --ro-bind /etc/resolv.conf /etc/resolv.conf --ro-bind /etc/hosts /etc/hosts  --proc /proc --dev /dev      --unshare-all --share-net --die-with-parent  /bin/echo ok
bwrap: Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory

I'm still trying to figure out how this double "/dev/vda1 / xfs" mount happens... This is an OpenStack cloud instance, so perhaps the cloud-init growroot process does the remount? There is also a service that bind-mounts / to /srv/host-rootfs to run runC containers, which may be causing the double mount.

This may be expected behavior, as bwrap's initial pivot_root may not set the SLAVE flag on the right mount point or something... Any help debugging this would be appreciated :-)
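
For reference, a quick way to see the symptom is to list every mount whose target is / straight from /proc/mounts; this is just an illustration of the condition, not part of a fix. On the affected host above it shows the rootfs entry plus the two /dev/vda1 entries:

$ awk '$2 == "/"' /proc/mounts
rootfs / rootfs rw 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0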

sf-project-io pushed a commit to softwarefactory-project/sf-ci that referenced this issue Jun 24, 2018
See: containers/bubblewrap#273

Change-Id: I234f1b2e8695e058573cd49918c7793c825f4f95
@TristanCacqueray
Contributor Author

Here is how to reproduce this issue:

# mkdir -p /srv/host-rootfs
# echo "/ /srv/host-rootfs none bind,ro,private 0 0" >> /etc/fstab
# for i in $(seq 100); do mount /srv/host-rootfs/ & done
# cat /proc/mounts
...
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
...
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
...
# uname -a
Linux managesf.sftests.com 3.10.0-862.2.3.el7.x86_64 #1 SMP Wed May 9 18:05:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

When this happens, bwrap fails to use anything that was mounted over the first rootfs; using e.g. /dev or /tmp results in "bwrap: Can't bind mount" or "bwrap: Can't get type of source"

To fix the system, do:

while true; do umount -l /srv/host-rootfs || break; done;

@cgwalters
Collaborator

Hmm. First, there shouldn't be a need to modify /etc/fstab; you can just do:

for i in $(seq 100); do mount --bind / /srv/host-rootfs/ & done

That fails pretty fast for me; somehow mount exits with ENOSPC? Ah, I see this undocumented count_mounts() function in linux/fs/namespace.c...
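
(For context: count_mounts() enforces a per-namespace cap on the number of mounts and returns ENOSPC once it is exceeded. On kernels that expose it, the cap is the fs.mount-max sysctl, so you can inspect it with:

sysctl fs.mount-max

The upstream default is 100000, though distributions and older kernels may differ.)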

Anyway, surely you aren't actually stacking mounts like this (why would you do that?), just trying to illustrate a race condition? So the reproducer would instead be something like:

while true; do mount --bind / /srv/host-rootfs/ && umount /srv/host-rootfs; done

Then yep, this quickly fails for me:

while bwrap --ro-bind / / true; do :; done

Will look.

@TristanCacqueray
Contributor Author

The "for i in $(seq 100)" loop is just a reproducer that happen to trigger the same bug that was making our CI flaky. The culprit was a badly written container driver that was doing simultaneous bind mount that has been fixed by using "mount -a" instead.

@cgwalters
Collaborator

I think the problem here is that the mount points are changing underneath us while we're executing the pivot, since we're using MS_SLAVE. Perhaps what we could do is use MS_PRIVATE during the setup, then change to MS_SLAVE after?
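
If it helps, here's a rough sketch of that ordering using the mount(8) equivalents of the flags rather than bubblewrap's actual code (run as root; the pivot_root and bind-mount steps are elided):

unshare -m sh -c '
  mount --make-rprivate /   # MS_PRIVATE|MS_REC: host mount/umount events cannot change our tree mid-setup
  # ... pivot_root and the bind mounts would go here ...
  mount --make-rslave /     # MS_SLAVE|MS_REC: go back to receiving host events without sending ours out
'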
