Double bind-mount of file systems during lxc-start #3073

Closed
Rachid-Koucha opened this issue Jul 3, 2019 · 0 comments

Required information

  • Distribution: Ubuntu
  • Distribution version: 18.04 bionic
  • The output of
    • lxc-start --version
      3.1.0-devel
    • lxc-checkconfig
      Kernel configuration not found at /proc/config.gz; searching...
      Kernel configuration found at /boot/config-4.15.0-47-generic
      --- Namespaces ---
      Namespaces: enabled
      Utsname namespace: enabled
      Ipc namespace: enabled
      Pid namespace: enabled
      User namespace: enabled
      Network namespace: enabled

      --- Control groups ---
      Cgroups: enabled

      Cgroup v1 mount points:
      /sys/fs/cgroup/systemd
      /sys/fs/cgroup/rdma
      /sys/fs/cgroup/pids
      /sys/fs/cgroup/net_cls,net_prio
      /sys/fs/cgroup/memory
      /sys/fs/cgroup/freezer
      /sys/fs/cgroup/perf_event
      /sys/fs/cgroup/devices
      /sys/fs/cgroup/hugetlb
      /sys/fs/cgroup/cpu,cpuacct
      /sys/fs/cgroup/cpuset
      /sys/fs/cgroup/blkio

      Cgroup v2 mount points:
      /sys/fs/cgroup/unified

      Cgroup v1 clone_children flag: enabled
      Cgroup device: enabled
      Cgroup sched: enabled
      Cgroup cpu account: enabled
      Cgroup memory controller: enabled
      Cgroup cpuset: enabled

      --- Misc ---
      Veth pair device: enabled, not loaded
      Macvlan: enabled, not loaded
      Vlan: enabled, not loaded
      Bridges: enabled, loaded
      Advanced netfilter: enabled, not loaded
      CONFIG_NF_NAT_IPV4: enabled, loaded
      CONFIG_NF_NAT_IPV6: enabled, loaded
      CONFIG_IP_NF_TARGET_MASQUERADE: enabled, loaded
      CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
      CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
      CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
      FUSE (for use with lxcfs): enabled, not loaded

      --- Checkpoint/Restore ---
      checkpoint restore: enabled
      CONFIG_FHANDLE: enabled
      CONFIG_EVENTFD: enabled
      CONFIG_EPOLL: enabled
      CONFIG_UNIX_DIAG: enabled
      CONFIG_INET_DIAG: enabled
      CONFIG_PACKET_DIAG: enabled
      CONFIG_NETLINK_DIAG: enabled

  • uname -a
    Linux pc-work 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • cat /proc/self/cgroup
    12:blkio:/user.slice
    11:cpuset:/
    10:cpu,cpuacct:/user.slice
    9:hugetlb:/
    8:devices:/user.slice
    7:perf_event:/
    6:freezer:/user/rachid/0
    5:memory:/user/rachid/0
    4:net_cls,net_prio:/
    3:pids:/user.slice/user-1000.slice/session-3.scope
    2:rdma:/
    1:name=systemd:/user.slice/user-1000.slice/session-3.scope
    0::/user.slice/user-1000.slice/session-3.scope
  • cat /proc/1/mounts
    sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
    proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
    udev /dev devtmpfs rw,nosuid,relatime,size=4006244k,nr_inodes=1001561,mode=755 0 0
    devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
    tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=807304k,mode=755 0 0
    /dev/sda6 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
    securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
    tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
    tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
    tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
    cgroup /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
    cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,name=systemd 0 0
    pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
    efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
    cgroup /sys/fs/cgroup/rdma cgroup rw,nosuid,nodev,noexec,relatime,rdma 0 0
    cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
    cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
    cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
    cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
    cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
    cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
    cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
    cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
    cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset,clone_children 0 0
    cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
    systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14895 0 0
    mqueue /dev/mqueue mqueue rw,relatime 0 0
    debugfs /sys/kernel/debug debugfs rw,relatime 0 0
    hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
    configfs /sys/kernel/config configfs rw,relatime 0 0
    fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
    /dev/sda7 /home ext4 rw,relatime,data=ordered 0 0
    /dev/sda1 /boot/efi vfat rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 0
    tmpfs /run/user/119 tmpfs rw,nosuid,nodev,relatime,size=807300k,mode=700,uid=119,gid=126 0 0
    tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=807300k,mode=700,uid=1000,gid=1000 0 0

Issue description

During lxc-start of a simple container (busybox template) with the default configuration, the bind-mounted file systems are mounted twice with the same flags: first as a bind mount, then again as a bind remount.

Here is the configuration:

lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:d5:01:fb
lxc.rootfs.path = dir:/usr/local/var/lib/lxc/bci_01/rootfs
lxc.signal.halt = SIGUSR1
lxc.signal.reboot = SIGTERM
lxc.uts.name = "bci_01"
lxc.tty.max = 1
lxc.pty.max = 1
lxc.cap.drop = sys_module mac_admin mac_override sys_time

# When using LXC with apparmor, uncomment the next line to run unconfined:
#lxc.apparmor.profile = unconfined
lxc.mount.auto = cgroup:mixed proc:mixed sys:mixed
lxc.mount.entry = shm /dev/shm tmpfs defaults 0 0
lxc.mount.entry = /lib lib none ro,bind 0 0
lxc.mount.entry = /usr/lib usr/lib none ro,bind 0 0
lxc.mount.entry = /usr/local/lib usr/local/lib none ro,bind 0 0
lxc.mount.entry = /lib64 lib64 none ro,bind 0 0
lxc.mount.entry = /sys/kernel/security sys/kernel/security none ro,bind,optional 0 0
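
Each lxc.mount.entry value above uses the fstab field layout (source, target, type, options, dump, pass). As a rough illustration of how such a line can be split and its options inspected, here is a minimal C sketch built on glibc's getmntent() and hasmntopt(). The helper name parse_mount_entry() and the use of tmpfile() as a scratch stream are assumptions made for this example; this is not LXC's code.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an LXC function): parse one fstab-style line,
 * such as the value of an lxc.mount.entry key, and report whether it asks
 * for a bind mount and whether it is read-only.
 * Returns 0 on success, -1 if the line cannot be written or parsed. */
int parse_mount_entry(const char *line,
                      char *src, size_t srclen,
                      char *dst, size_t dstlen,
                      int *is_bind, int *is_ro)
{
    assert(line && src && dst && is_bind && is_ro);

    /* getmntent() reads from a stream, so stage the line in a temp file. */
    FILE *f = tmpfile();
    if (!f)
        return -1;
    if (fputs(line, f) == EOF || fputc('\n', f) == EOF) {
        fclose(f);
        return -1;
    }
    rewind(f);

    struct mntent *ent = getmntent(f);
    if (!ent) {
        fclose(f);
        return -1;
    }

    /* Copy out the fields; getmntent() returns pointers to static storage. */
    snprintf(src, srclen, "%s", ent->mnt_fsname);
    snprintf(dst, dstlen, "%s", ent->mnt_dir);
    *is_bind = hasmntopt(ent, "bind") != NULL;
    *is_ro = hasmntopt(ent, "ro") != NULL;

    fclose(f);
    return 0;
}
```

With the entry "/lib lib none ro,bind 0 0" the sketch reports source "/lib", target "lib", and both the bind and read-only options set; the "shm /dev/shm tmpfs defaults 0 0" entry has neither.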

Here is the "strace" output at the point where the file systems are bind-mounted inside the container:

[pid 17518] memfd_create(".lxc_mount_file", MFD_CLOEXEC) = 10 ----------------> make_anonymous_mount_file()
[pid 17518] write(10, "shm /dev/shm tmpfs defaults 0 0", 31) = 31
[pid 17518] write(10, "\n", 1) = 1
[pid 17518] write(10, "/lib lib none ro,bind 0 0", 25) = 25
[pid 17518] write(10, "\n", 1) = 1
[pid 17518] write(10, "/usr/lib usr/lib none ro,bind 0 "..., 33) = 33
[pid 17518] write(10, "\n", 1) = 1
[pid 17518] write(10, "/lib64 lib64 none ro,bind 0 0", 29) = 29
[pid 17518] write(10, "\n", 1) = 1
[pid 17518] write(10, "/sys/kernel/security sys/kernel/"..., 66) = 66
[pid 17518] write(10, "\n", 1) = 1
[pid 17518] lseek(10, 0, SEEK_SET) = 0
[pid 17518] fcntl(10, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 17518] fstat(10, {st_mode=S_IFREG|0777, st_size=189, ...}) = 0
[pid 17518] read(10, "shm /dev/shm tmpfs defaults 0 0\n"..., 4096) = 189

[pid 17518] geteuid() = 0 -------------------> mount_file_entries(conf, rootfs, f, lxc_name, lxc_path);
[pid 17518] openat(AT_FDCWD, "/usr/local/lib/lxc/rootfs", O_RDONLY) = 13
[pid 17518] openat(13, "lib", O_RDONLY|O_NOFOLLOW) = 14
[pid 17518] close(13) = 0
[pid 17518] mount("/lib", "/proc/self/fd/14", 0x7ffe4b8eeb49, MS_RDONLY|MS_BIND, NULL) = 0
[pid 17518] close(14) = 0
[pid 17518] statfs("/lib", {f_type=EXT2_SUPER_MAGIC, f_bsize=4096, f_blocks=71829832, f_bfree=69767402, f_bavail=66101214, f_files=18317312, f_ffree=18046543, f_fsid={val=[2517273643, 2573456793]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0
[pid 17518] mount("/lib", "/usr/local/lib/lxc/rootfs/lib", 0x7ffe4b8eeb49, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0 -------------------------> Double bind-mount here (without additional flags)
[pid 17518] openat(AT_FDCWD, "/usr/local/lib/lxc/rootfs", O_RDONLY) = 13
[pid 17518] openat(13, "usr", O_RDONLY|O_NOFOLLOW) = 14
[pid 17518] close(13) = 0
[pid 17518] openat(14, "lib", O_RDONLY|O_NOFOLLOW) = 13
[pid 17518] close(14) = 0
[pid 17518] mount("/usr/lib", "/proc/self/fd/13", 0x7ffe4b8eeb51, MS_RDONLY|MS_BIND, NULL) = 0
[pid 17518] close(13) = 0
[pid 17518] statfs("/usr/lib", {f_type=EXT2_SUPER_MAGIC, f_bsize=4096, f_blocks=71829832, f_bfree=69767402, f_bavail=66101214, f_files=18317312, f_ffree=18046543, f_fsid={val=[2517273643, 2573456793]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0
[pid 17518] mount("/usr/lib", "/usr/local/lib/lxc/rootfs/usr/lib", 0x7ffe4b8eeb51, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0 -------------------------> Double bind-mount here (without additional flags)
[pid 17518] openat(AT_FDCWD, "/usr/local/lib/lxc/rootfs", O_RDONLY) = 13
[pid 17518] openat(13, "lib64", O_RDONLY|O_NOFOLLOW) = 14
[pid 17518] close(13) = 0
[pid 17518] mount("/lib64", "/proc/self/fd/14", 0x7ffe4b8eeb4d, MS_RDONLY|MS_BIND, NULL) = 0
[pid 17518] close(14) = 0
[pid 17518] statfs("/lib64", {f_type=EXT2_SUPER_MAGIC, f_bsize=4096, f_blocks=71829832, f_bfree=69767402, f_bavail=66101214, f_files=18317312, f_ffree=18046543, f_fsid={val=[2517273643, 2573456793]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0
[pid 17518] mount("/lib64", "/usr/local/lib/lxc/rootfs/lib64", 0x7ffe4b8eeb4d, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0 -------------------------> Double bind-mount here (without additional flags)
[pid 17518] openat(AT_FDCWD, "/usr/local/lib/lxc/rootfs", O_RDONLY) = 13
[pid 17518] openat(13, "sys", O_RDONLY|O_NOFOLLOW) = 14
[pid 17518] close(13) = 0
[pid 17518] openat(14, "kernel", O_RDONLY|O_NOFOLLOW) = 13
[pid 17518] close(14) = 0
[pid 17518] openat(13, "security", O_RDONLY|O_NOFOLLOW) = 14
[pid 17518] close(13) = 0
[pid 17518] mount("/sys/kernel/security", "/proc/self/fd/14", 0x7ffe4b8eeb69, MS_RDONLY|MS_BIND, NULL) = 0
[pid 17518] close(14) = 0
[pid 17518] statfs("/sys/kernel/security", {f_type=SECURITYFS_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={val=[0, 0]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOSUID|ST_NODEV|ST_NOEXEC|ST_RELATIME}) = 0
[pid 17518] mount("/sys/kernel/security", "/usr/local/lib/lxc/rootfs/sys/kernel/security", 0x7ffe4b8eeb69, MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND, NULL) = 0
[pid 17518] read(10, "", 4096) = 0
[pid 17518] close(10) = 0
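
In the trace above, each entry produces a statfs() on the source followed by a second mount() whose flags are the configured ones plus MS_REMOUNT, plus any nosuid/nodev/noexec restrictions reported for the source (only /sys/kernel/security has them). The following C sketch restates that flag computation under stated assumptions: the function names are hypothetical, and the ST_* bits returned by statvfs() are tested through the MS_* names because, on Linux, these four flags share the same values.

```c
#include <assert.h>
#include <sys/mount.h>
#include <sys/statvfs.h>

/* Hypothetical helper: flags of the second mount() call for a bind entry,
 * given the configured flags and the f_flag value reported for the source.
 * Assumption: on Linux, ST_RDONLY/ST_NOSUID/ST_NODEV/ST_NOEXEC have the
 * same bit values as the matching MS_* constants. */
unsigned long remount_flags(unsigned long mountflags, unsigned long fs_flags)
{
    assert(mountflags & MS_BIND); /* this sketch only covers bind entries */

    unsigned long flags = mountflags | MS_REMOUNT;

    if (fs_flags & MS_NOSUID)
        flags |= MS_NOSUID;
    if (fs_flags & MS_NODEV)
        flags |= MS_NODEV;
    if (fs_flags & MS_NOEXEC)
        flags |= MS_NOEXEC;
    if (fs_flags & MS_RDONLY)
        flags |= MS_RDONLY;

    return flags;
}

/* Same computation, reading the source's flags with statvfs().
 * Returns 0 on success, -1 if statvfs() fails. */
int remount_flags_for(const char *src, unsigned long mountflags,
                      unsigned long *out)
{
    struct statvfs sb;

    if (statvfs(src, &sb) != 0)
        return -1;

    *out = remount_flags(mountflags, sb.f_flag);
    return 0;
}
```

For a source with no restrictive flags, remount_flags(MS_RDONLY | MS_BIND, 0) yields MS_RDONLY | MS_REMOUNT | MS_BIND, the same flags as the remount of /lib in the trace. Passing the nosuid, nodev and noexec bits yields the flag set seen for /sys/kernel/security.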

The "problem" comes from mount_entry() in conf.c, where the local variable "rqd_flags" is set to MS_RDONLY whenever the entry is read-only:

    if (mountflags & MS_RDONLY)
            rqd_flags |= MS_RDONLY;

Because "rqd_flags" is then non-zero, the following check is false and the remount is not skipped, even though no flags are missing from the configured ones:

    /* If this was a bind mount request, and required_flags
     * does not have any flags which are not already in
     * mountflags, then skip the remount.
     */
    if (!(mountflags & MS_REMOUNT)) {
            if (!(required_flags & ~mountflags) &&
                rqd_flags == 0) {
                    DEBUG("Mountflags already were %lu, "
                          "skipping remount", mountflags);
                    goto skipremount;
            }
    }
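
To make the effect of this check easier to follow, here is a small standalone C restatement of the decision. It is a sketch only: skips_remount() is a hypothetical name, and the way "required_flags" is derived from the source's file system flags is an assumption, since that part of mount_entry() is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mount.h>

/* Hypothetical restatement of the skip-remount decision.
 * mountflags: flags configured for the entry (e.g. MS_BIND | MS_RDONLY).
 * fs_flags:   restrictive flags reported for the source file system,
 *             expressed with MS_* bits (assumption of this sketch). */
bool skips_remount(unsigned long mountflags, unsigned long fs_flags)
{
    /* The check only makes sense for bind or remount requests. */
    assert(mountflags & (MS_BIND | MS_REMOUNT));

    unsigned long rqd_flags = 0;
    if (mountflags & MS_RDONLY)
        rqd_flags |= MS_RDONLY;

    /* Assumed derivation: start from rqd_flags, add the source's flags. */
    unsigned long required_flags =
        rqd_flags |
        (fs_flags & (MS_RDONLY | MS_NOSUID | MS_NODEV | MS_NOEXEC));

    /* An explicit remount request is never skipped. */
    if (mountflags & MS_REMOUNT)
        return false;

    /* Skip only if nothing is missing AND the entry is not read-only. */
    return !(required_flags & ~mountflags) && rqd_flags == 0;
}
```

Under these assumptions, skips_remount(MS_BIND, 0) is true, while skips_remount(MS_BIND | MS_RDONLY, 0) is false purely because of the "rqd_flags == 0" term, which matches the "ro,bind" entries in this report. Note that mount(2) documents that flags other than MS_REC are ignored on the initial MS_BIND call, which may be why a read-only bind entry is always followed by a remount.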
