Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust buildroot creation to work inside a user namespace. #234

Merged
merged 5 commits into from
Aug 4, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 44 additions & 8 deletions mock/py/mockbuild/buildroot.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# -*- coding: utf-8 -*-
# vim: noai:ts=4:sw=4:expandtab

import errno
import fcntl
import glob
import grp
Expand Down Expand Up @@ -368,6 +369,21 @@ def _init_aux_files(self):
with open(p, 'w+') as fo:
fo.write(chroot_file_contents[key])

@traceLog()
def ensure_rpm_config_home(self):
"""Create a fake home directory with an appropriate .rpmmacros"""

rpm_config_home = os.path.join(self.basedir, 'rpmconfig')
util.mkdirIfAbsent(rpm_config_home)

# Since /proc and /sys are mounted special filesystems when RPM is running
# to install the buildroot, it doesn't make sense for RPM to try and
# set the permissions on them - and that might fail with permission errors.
with open(os.path.join(rpm_config_home, ".rpmmacros"), "w") as f:
f.write("%_netsharedpath /proc:/sys\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why this way? Isn't it better to set config_opts['macros']['%_netsharedpath'] as default in util.py? And mention it in sitedefault.cfg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing something, but my impression was that config_opts['macros'] affects rpmbuild, but would have no effect on the package management commands used to set up the chroot. [Except for microdnf? Perhaps something is needed for that case as well?]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS this is pretty much the only way to achieve the desired result at the moment if yum/dnf/whatever is run from outside the chroot. Rpm reads its configuration during Python module initialization, and since neither yum or dnf have a way to define arbitrary rpm macros via their configuration or cli-switches, this has to be placed in a file before yum/dnf is launched. If yum/dnf was run from within the target chroot then you could place the configuration there, but from the outside... you can't go changing system-wide configuration even temporarily, so that basically leaves what we have here: override HOME to a temporary directory and create ~/.rpmmacros there.

On the positive side, this could be used to house other similar configurations if needed, for example %_install_langs could be supported this way.

One possibility would be taking this all one step further and generate the temporary ~/.rpmmacros from config_opts['macros'] contents so the same config mechanism could be used for arbitrary macros on the target.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed DNF RFE and discussed that with DNF team in person:
https://bugzilla.redhat.com/show_bug.cgi?id=1673333
But it is a long run. It will not help with this PR - we can just make it easier later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Just realized/remembered that while module import still always reads in the rpm config, for many years now it's possible to reset + reload the configuration later. So there is an alternative: doing rpm.reloadConfig() inside the chroot. Just FWIW.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not sure about that .rpmmacros. It seems very hackish to me. It is not needed on bare metals hosts. Only in containers. Therefore, I would rather see this done in Toolbox itself or in Container kickstarts https://pagure.io/fedora-kickstarts/tree/master

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this could be configured in /etc/rpm in the toolbox-support package which is suppose to be installed in all containers for toolbox, and I think it would get picked up correctly. Reasons I would rather see it here:

  • Solve it once, why make someone trying to run mock under podman themselves without the toolbox wrapper hit this error and hopefully google search and find this PR and find this workaround. (Yes, maybe we could get it changed in the Fedora base containers, but might take a while.)
  • This is really something that is needed for the container we create rather than the enclosing container; e.g., if mock was enhanced at some point to use systemd-nspawn --private-users to avoid needing root privileges, then you'd need the same thing.

That being said, if you really don't want this, I'd rather have the rest of this PR landed :-)


return rpm_config_home

@traceLog()
def nuke_rpm_db(self):
"""remove rpm DB lock files from the chroot"""
Expand Down Expand Up @@ -486,15 +502,40 @@ def _setup_devices(self):
kver = os.uname()[2]
self.root_log.debug("kernel version == %s", kver)
for i in devFiles:
src_path = "/" + i[2]
chroot_path = self.make_chroot_path(i[2])

if util.cmpKernelVer(kver, '2.6.18') >= 0 and src_path == '/dev/ptmx':
continue

# create node, but only if it exist on host too
# except for loop devices, which only show up on the host after they are first used
if os.path.exists("/" + i[2]) or "loop" in i[2]:
os.mknod(self.make_chroot_path(i[2]), i[0], i[1])
if os.path.exists(src_path) or "loop" in src_path:
try:
os.mknod(chroot_path, i[0], i[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to fall-back only if we are in the USE_NSPAWN scenario? I.e. don't fallback for normal chroot()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking back ... toolbox actually implies normal chroot() inside podman container. So I should reverse my question: Do we want to fallback if USE_NSPAWN is on? Because in that case mock is run in non-toolbox environment (normal host) - and it might be better to fail on mknod (if that really was failing) instead of trying useless bind mount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my reply to @xsuchy above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I view this PR approach (running toolbox/podman wrapper around mock, instead of starting containers by mock) as much more straight-forward way to make mock maintainable in future. And really superior. While USE_NSPAWN is better than nothing, the current nspawn support isn't complete (--installroot is still not run in container) and it IMO doesn't make much sense to continue with finishing it.. having it dropped one day sounds like good idea to me.

But still, ATM, USE_NSPAWN is used by default (and e.g. used by copr build system), and in that case - falling down from mknod to bind-mount doesn't make much sense.. Also, from the other side -- trying mknod when we know in advance it will fail (toolbox case) isn't easy reading. All that is mixed up with bare host + old-chroot use-case. Just saying.

But yeah, we seem to know now what is the change about and why - at least now, so +1 from me.

except OSError as e:
# If mknod gives us a permission error, fall back to a different
# strategy of using a bind mount from root to host. This won't
# work for the loop devices, so just skip them in this case.
if e.errno == errno.EPERM:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This is done even on bare metals. Not just for containers. Previously if the mknod failed on bare metal, it failed. Now we bind mount host device. I am not saying it is bad (not sure actually), but it is definitely change from past.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fairly difficult to detect "on bare metal" reliably without encoding assumptions about the container system (docker, podman, lxd, systemd-nspawn...), so I've avoided "in a container" conditionalization anywhere in the patchset.

And conditionalizing based on !USE_NSPAWN as Pavel suggests makes the assumption that
systemp-nspawn will never ever work nested in a container - which I don't think is necessarily going to be true forever.

IMO, we can make the code simpler and more future proof by just trying the fallback in all cases - maybe it will help if we get an EPERM even on bare metal because of some other reason, e..g, selinux seccomp.

Making error reporting more confusing is always a concern, but it's not the code is expecting mknod to fail and producing a nice informative error message - it's just dying with a traceback in that case.

if os.path.exists(src_path):
self.mounts.add_device_bindmount(src_path)
continue
else:
raise

# Further adjustments if we created a new node instead of bind-mounting
# an existing one:

# set context. (only necessary if host running selinux enabled.)
# fails gracefully if chcon not installed.
if self.selinux:
util.do(["chcon", "--reference=/" + i[2], self.make_chroot_path(i[2])],
util.do(["chcon", "--reference=" + src_path, chroot_path],
raiseExc=0, shell=False, env=self.env)

if src_path in ('/dev/tty', '/dev/ptmx'):
os.chown(chroot_path, pwd.getpwnam('root')[2], grp.getgrnam('tty')[2])

os.symlink("/proc/self/fd/0", self.make_chroot_path("dev/stdin"))
os.symlink("/proc/self/fd/1", self.make_chroot_path("dev/stdout"))
os.symlink("/proc/self/fd/2", self.make_chroot_path("dev/stderr"))
Expand All @@ -504,17 +545,12 @@ def _setup_devices(self):
os.remove(self.make_chroot_path('etc', 'mtab'))
os.symlink("../proc/self/mounts", self.make_chroot_path('etc', 'mtab'))

os.chown(self.make_chroot_path('dev/tty'), pwd.getpwnam('root')[2], grp.getgrnam('tty')[2])
os.chown(self.make_chroot_path('dev/ptmx'), pwd.getpwnam('root')[2], grp.getgrnam('tty')[2])

# symlink /dev/fd in the chroot for everything except RHEL4
if util.cmpKernelVer(kver, '2.6.9') > 0:
os.symlink("/proc/self/fd", self.make_chroot_path("dev/fd"))

os.umask(prevMask)

if util.cmpKernelVer(kver, '2.6.18') >= 0:
os.unlink(self.make_chroot_path('/dev/ptmx'))
os.symlink("pts/ptmx", self.make_chroot_path('/dev/ptmx'))

@traceLog()
Expand Down
53 changes: 43 additions & 10 deletions mock/py/mockbuild/mounts.py
Original file line number Diff line number Diff line change
Expand Up @@ -85,17 +85,29 @@ def __repr__(self):
class BindMountPoint(MountPoint):
'''class for managing bind-mounts in the chroot'''
@traceLog()
def __init__(self, srcpath, bindpath):
def __init__(self, srcpath, bindpath, recursive=False, options=None):
MountPoint.__init__(self, mountsource=srcpath, mountpath=bindpath)
self.srcpath = srcpath
self.bindpath = bindpath
self.recursive = recursive
self.options = options
self.mounted = self.ismounted()

@traceLog()
def mount(self):
if not self.mounted:
util.mkdirIfAbsent(self.bindpath)
cmd = ['/bin/mount', '-n', '--bind', self.srcpath, self.bindpath]
if os.path.isdir(self.srcpath):
util.mkdirIfAbsent(self.bindpath)
elif not os.path.exists(self.bindpath):
util.touch(self.bindpath)
cmd = ['/bin/mount', '-n']
if self.recursive:
cmd.append('--rbind')
else:
cmd.append('--bind')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw., why --bind doesn't work and --rbind does? (considering that we don't actually need recursive mounts inside chroot for the package builds).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above about why we need to --rbind for the /proc and /sys mounts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, the question isn't answered here, moving here https://bugzilla.redhat.com/show_bug.cgi?id=1745048

if self.options:
cmd += ['-o', self.options]
cmd += [self.srcpath, self.bindpath]
util.do(cmd)
self.mounted = True
return True
Expand All @@ -104,7 +116,13 @@ def mount(self):
def umount(self):
if not self.mounted:
return None
cmd = ['/bin/umount', '-n', self.bindpath]
cmd = ['/bin/umount', '-n']
if self.recursive:
# The mount is busy because of the submounts - a lazy unmount
# implies a recursive unmount, so takes care of that.
# (-R also works, but is implemented in userspace, and thus racy)
cmd += ['-l']
cmd.append(self.bindpath)
try:
util.do(cmd)
except exception.Error:
Expand All @@ -125,12 +143,20 @@ def __init__(self, rootObj):
self.managed_mounts = [] # mounts owned by mock
self.user_mounts = [] # mounts injected by user
self.essential_mounts = [
FileSystemMountPoint(filetype='proc',
device='mock_chroot_proc',
path=rootObj.make_chroot_path('/proc')),
FileSystemMountPoint(filetype='sysfs',
device='mock_chroot_sys',
path=rootObj.make_chroot_path('/sys')),
# Instead of mounting a fresh procfs and sysfs, we bind mount /proc
# and /sys. This avoids problems with kernel restrictions if running
# within a user namespace, and is pretty much identical otherwise.
# The bind mount additionally needs to be recursive, because the
# kernel forbids mounts that might reveal parts of the filesystem
# that a container runtime overmounted to hide from the container.
BindMountPoint(srcpath='/proc',
bindpath=rootObj.make_chroot_path('/proc'),
recursive=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we need the recursive here at all? Previously when we mounted /proc and /sys, it mounted just the top level and did not mounted anything within. E.g., /proc/sys/fs/binfmt_misc.
Why it worked in past and why we need to recursively traverse now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that a container system might bind-mount over parts of /proc and /sys to hide them from processes inside the container. To prevent the container from escaping from this, it's not allowed to mount a new copy of the filesystem, but it's also not allowed to bind mount the filesystem in a way that would reveal the hidden parts - thus the kernel bindmount of /proc will fail unless it's recursive. I can make the comment say more of that instead of just "kernel restrictions".

options="nodev,noexec,nosuid,readonly"),
BindMountPoint(srcpath='/sys',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR, changing /sys from mount to bind mount caused this: https://bugzilla.redhat.com/show_bug.cgi?id=1740421

bindpath=rootObj.make_chroot_path('/sys'),
recursive=True,
options="nodev,noexec,nosuid,readonly"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this would work for --new-chroot it would reveal all /proc/PIDs tree of host. Which goes against the isolation, which we want to have in Mock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but I think this isn't the case for two reasons:

First, we're not really changing anything - previously the code mounted in a fresh /proc from the outer PID namespace - now it's bind mounting in the same thing. (The main difference is that if parts of /proc /sys are shadowed by overmounts, the shadowing is preserved - this is why the fresh mount is restricted in the userns case.)

But second, the "essential mounts", as I understand it, aren't actually mounted when systemd-nspawn is run - for USE_NSPAWN, they are only mounted for running package management commands.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, right.

]
if rootObj.config['internal_dev_setup']:
self.essential_mounts.append(
Expand All @@ -157,6 +183,13 @@ def __init__(self, rootObj):
def add(self, mount):
self.managed_mounts.append(mount)

@traceLog()
def add_device_bindmount(self, path):
mount = BindMountPoint(path,
self.rootObj.make_chroot_path(path),
options="noexec,nosuid,readonly")
self.essential_mounts.append(mount)

@traceLog()
def add_user_mount(self, mount):
self.user_mounts.append(mount)
Expand Down
2 changes: 2 additions & 0 deletions mock/py/mockbuild/package_manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,8 @@ def execute(self, *args, **kwargs):
# intentionally we do not call bootstrap hook here - it does not have sense
env = self.config['environment'].copy()
env.update(util.get_proxy_environment(self.config))
# use a special home directory with only our desired RPM configuration
env['HOME'] = self.buildroot.ensure_rpm_config_home()
env['LC_MESSAGES'] = 'C.UTF-8'
if self.buildroot.nosync_path:
env['LD_PRELOAD'] = self.buildroot.nosync_path
Expand Down
7 changes: 7 additions & 0 deletions mock/py/mockbuild/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -425,6 +425,13 @@ def unshare(flags):
except AttributeError:
pass

if flags & CLONE_NEWNS:
# Unsharing the mount namespace isn't immediately effective if the
# source mount propagation status is 'shared' - changes to the mounts
# will still propagate back to the parent namespace. Do the same
# thing as unshare(1) and make all mounts private.
do(['mount', '--make-rprivate', '/'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change the flag on / of host (!) - I am not sure we want this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? It's after the call to unshare()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am sure. Try to replace it with ['touch', '/DBG'] and see where the file will be created.



def sethostname(hostname):
getLog().info("Setting hostname: %s", hostname)
Expand Down