Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/snap-discard-ns: add support for per-user mount namespaces #6010

Merged
merged 12 commits into from Oct 25, 2018
202 changes: 180 additions & 22 deletions cmd/snap-discard-ns/snap-discard-ns.c
@@ -1,5 +1,5 @@
/*
* Copyright (C) 2015 Canonical Ltd
* Copyright (C) 2015-2018 Canonical Ltd
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 3 as
Expand All @@ -15,41 +15,199 @@
*
*/

#define _GNU_SOURCE

#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <fnmatch.h>
#include <limits.h>
#include <linux/magic.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/vfs.h>
#include <unistd.h>

#include "../libsnap-confine-private/error.h"
#include "../libsnap-confine-private/locking.h"
#include "../libsnap-confine-private/snap.h"
#include "../libsnap-confine-private/string-utils.h"
#include "../libsnap-confine-private/utils.h"
#include "../snap-confine/ns-support.h"

#ifndef NSFS_MAGIC
#define NSFS_MAGIC 0x6e736673
#endif

int main(int argc, char **argv)
{
if (argc != 2)
die("Usage: %s snap-name", argv[0]);
const char *snap_name = argv[1];

int snap_lock_fd = sc_lock_snap(snap_name);
debug("initializing mount namespace: %s", snap_name);
struct sc_mount_ns *group =
sc_open_mount_ns(snap_name, SC_NS_FAIL_GRACEFULLY);
if (group != NULL) {
sc_discard_preserved_mount_ns(group);
sc_close_mount_ns(group);
if (argc != 2) {
printf("Usage: snap-discard-ns <SNAP-INSTANCE-NAME>\n");
return 0;
}
// Unlink the current mount profile, if any.
char profile_path[PATH_MAX] = { 0 };
sc_must_snprintf(profile_path, sizeof(profile_path),
"/run/snapd/ns/snap.%s.fstab", snap_name);
if (unlink(profile_path) < 0) {
// Silently ignore ENOENT as the profile doens't have to be there.
if (errno != ENOENT) {
die("cannot remove current mount profile: %s",
profile_path);

const char *snap_instance_name = argv[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make sure snap instance name is sane?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we probably should. I will make that adjustment.

struct sc_error *err = NULL;
sc_instance_name_validate(snap_instance_name, &err);
sc_die_on_error(err);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thank you.


/* Grab the lock holding the snap instance. This prevents races from
* concurrently executing snap-confine. The lock is explicitly released
* during normal operation but it is not preserved across the life-cycle of
* the process anyway so no attempt is made to unlock it ahead of any call
* to die() */
int snap_lock_fd = sc_lock_snap(snap_instance_name);
debug("discarding mount namespaces of snap %s", snap_instance_name);

const char *ns_dir_path = "/run/snapd/ns";
int ns_dir_fd = open(ns_dir_path, O_DIRECTORY | O_CLOEXEC | O_NOFOLLOW);
if (ns_dir_fd < 0) {
/* The directory may legitimately not exist if no snap has started to
* prepare it. This is not an error condition. */
if (errno == ENOENT) {
return 0;
}
die("cannot open path %s", ns_dir_path);
}

/* Move to the namespace directory. This is used so that we don't need to
* traverse the path over and over in our upcoming unmount2(2) calls. */
if (fchdir(ns_dir_fd) < 0) {
die("cannot move to directory %s", ns_dir_path);
}

/* Create shell patterns that describe the things we are interested in:
*
* Preserved mount namespaces to unmount and unlink:
* - "$SNAP_INSTANCE_NAME.mnt"
* - "$SNAP_INSTANCE_NAME.[0-9]+.mnt"
*
* Applied mount profiles to unlink:
* - "snap.$SNAP_INSTANCE_NAME.fstab"
* - "snap.$SNAP_INSTANCE_NAME.[0-9]+.fstab"
*
* Use PATH_MAX as the size of each buffer since those can store any file
* name. */
char sys_fstab_pattern[PATH_MAX];
char usr_fstab_pattern[PATH_MAX];
char sys_mnt_pattern[PATH_MAX];
char usr_mnt_pattern[PATH_MAX];
zyga marked this conversation as resolved.
Show resolved Hide resolved
sc_must_snprintf(sys_fstab_pattern, sizeof sys_fstab_pattern,
"snap\\.%s\\.fstab", snap_instance_name);
sc_must_snprintf(usr_fstab_pattern, sizeof usr_fstab_pattern,
"snap\\.%s\\.*\\.fstab", snap_instance_name);
sc_must_snprintf(sys_mnt_pattern, sizeof sys_mnt_pattern,
"%s\\.mnt", snap_instance_name);
sc_must_snprintf(usr_mnt_pattern, sizeof usr_mnt_pattern,
"%s\\.*\\.mnt", snap_instance_name);

DIR *ns_dir = fdopendir(ns_dir_fd);
zyga marked this conversation as resolved.
Show resolved Hide resolved
if (ns_dir == NULL) {
die("cannot fdopendir");
}
/* ns_dir_fd is now owned by ns_dir and will not be closed. */

while (true) {
/* Reset errno ahead of any call to readdir to differentiate errors
* from legitimate end of directory. */
errno = 0;
struct dirent *dent = readdir(ns_dir);
if (dent == NULL) {
if (errno != 0) {
die("cannot read next directory entry");
}
/* We've seen the whole directory. */
break;
}

/* We use dnet->d_name a lot so let's shorten it. */
const char *dname = dent->d_name;

/* Check the four patterns that we have against the name and set the
* two should flags to decide further actions. Note that we always
* unlink matching files so that is not reflected in the structure. */
bool should_unmount = false;
bool should_unlink = false;
struct variant {
const char *pattern;
bool unmount;
};
struct variant variants[4] = {
{.pattern = sys_mnt_pattern,.unmount = true},
{.pattern = usr_mnt_pattern,.unmount = true},
{.pattern = sys_fstab_pattern},
{.pattern = usr_fstab_pattern},
};
for (size_t i = 0; i < sizeof variants / sizeof *variants; ++i) {
struct variant *v = &variants[i];
debug("checking if %s matches %s", dname, v->pattern);
int match_result = fnmatch(v->pattern, dname, 0);
if (match_result == FNM_NOMATCH) {
continue;
} else if (match_result == 0) {
should_unmount |= v->unmount;
should_unlink = true;
debug("file %s matches pattern %s", dname,
v->pattern);
/* One match is enough. */
break;
} else if (match_result < 0) {
die("cannot execute match against pattern %s",
v->pattern);
}
}

/* Stat the candidate directory entry to know what we are dealing with. */
struct stat file_info;
if (fstatat(ns_dir_fd, dname, &file_info,
AT_SYMLINK_NOFOLLOW) < 0) {
die("cannot inspect file %s", dname);
}

/* We are only interested in regular files. The .mnt files, even if
* bind-mounted, appear as regular files and not as symbolic links due
* to the peculiarities of the Linux kernel. */
if (!S_ISREG(file_info.st_mode)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to move this up & check early before trying to match?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote this watching strace logs and realised we are stat-ing things that we don't care about. The pattern match is much faster than a system call so I chose to order it this way around.

continue;
}

if (should_unmount) {
/* If we are asked to unmount the file double check that it is
* really a preserved mount namespace since the error code from
* umount2(2) is inconclusive. */
int path_fd = openat(ns_dir_fd, dname,
O_PATH | O_CLOEXEC | O_NOFOLLOW);
if (path_fd < 0) {
die("cannot open path %s", dname);
}
struct statfs fs_info;
if (fstatfs(path_fd, &fs_info) < 0) {
die("cannot inspect file-system at %s", dname);
}
close(path_fd);
if (fs_info.f_type == NSFS_MAGIC
|| fs_info.f_type == PROC_SUPER_MAGIC) {
debug("unmounting %s", dname);
if (umount2(dname, MNT_DETACH | UMOUNT_NOFOLLOW)
< 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would read better if < 0) { was on the same line where the if starts, but I presume it was formatted automatically like this? Slightly odd given that the comment block above is much wider.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is automatically formatted, the mystery of indent. Perhaps given that this a near-full rewrite I could use clang-format for this file and actually switch to a sane format profile?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be a good idea... Feel free to give it a try before landing and apply if it produces sensible output.

die("cannot unmount %s", dname);
}
}
}

if (should_unlink) {
debug("unlinking %s", dname);
if (unlinkat(ns_dir_fd, dname, 0) < 0) {
die("cannot unlink %s", dname);
}
}
}

/* Close the directory and release the lock, we're done. */
if (closedir(ns_dir) < 0) {
die("cannot close directory");
}
sc_unlock(snap_lock_fd);
return 0;
}
20 changes: 14 additions & 6 deletions cmd/snap-discard-ns/snap-discard-ns.rst
Expand Up @@ -7,16 +7,16 @@ internal tool for discarding preserved namespaces of snappy applications
------------------------------------------------------------------------

:Author: zygmunt.krynicki@canonical.com
:Date: 2016-10-05
:Date: 2018-10-17
:Copyright: Canonical Ltd.
:Version: 1.0.43
:Version: 2.36
:Manual section: 5
:Manual group: snappy

SYNOPSIS
========

snap-discard-ns SNAP_NAME
snap-discard-ns SNAP_INSTANCE_NAME

DESCRIPTION
===========
Expand All @@ -43,11 +43,19 @@ FILES

`snap-discard-ns` uses the following files:

`/run/snapd/ns/$SNAP_NAME.mnt`:
`/run/snapd/ns/$SNAP_INSTNACE_NAME.mnt`:
`/run/snapd/ns/$SNAP_INSTNACE_NAME.*.mnt`:

The preserved mount namespace that is unmounted by `snap-discard-ns`.
The preserved mount namespace that is unmounted and removed by
`snap-discard-ns`. The second form is for the per-user mount namespace.

`/run/snapd/ns/snap.$SNAP_INSTNACE_NAME.fstab`:
`/run/snapd/ns/snap.$SNAP_INSTNACE_NAME.*.fstab`:

The current mount profile of a preserved mount namespace that is removed
by `snap-discard-ns`.

BUGS
====

Please report all bugs with https://bugs.launchpad.net/snap-confine/+filebug
Please report all bugs with https://bugs.launchpad.net/snapd/+filebug
72 changes: 72 additions & 0 deletions tests/main/snap-discard-ns/mount.py
@@ -0,0 +1,72 @@
"""
Pure-python limited-use wrapper around the mount library function.
"""
from __future__ import print_function, absolute_import, unicode_literals

from argparse import ArgumentParser
from ctypes import CDLL, c_char_p, c_long, get_errno
from ctypes.util import find_library
from os import strerror
from sys import stderr, version_info

try:
from typing import Text
except ImportError:
pass

__all__ = ('mount', )


PY2 = version_info[0] == 2


class MountOpts(object):
"""MountOpts contain various flags for the mount system call."""

Bind = 4096


def mount(source, target, fstype, flags=0, data=""):
# type: (Text, Text, Text, int, Text) -> None
"""mount is a thin wrapper around the mount library function."""
if PY2:
c = b"c"
else:
c = "c"
libc_name = find_library(c)
if libc_name is None:
raise Exception("cannot find the C library")
libc = CDLL(libc_name, use_errno=True)
retval = libc.mount(
c_char_p(source.encode("UTF-8")),
c_char_p(target.encode("UTF-8")),
c_char_p(fstype.encode("UTF-8")),
c_long(flags),
None if data == "" else c_char_p(data.encode("UTF-8")))
if retval < 0:
errno = get_errno()
raise OSError(errno, strerror(errno))


def main():
# type: () -> None
parser = ArgumentParser()
parser.add_argument("source", help="path of the device or bind source")
parser.add_argument("target", help="path of the new mount point")
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("--bind", action="store_const", const="bind",
help="create a new mount point of existing path")
group.add_argument("-t", "--type", help="filesystem type to mount")
opts = parser.parse_args()
if opts.bind is not None:
mount(opts.source, opts.target, "", MountOpts.Bind)
else:
mount(opts.source, opts.target, opts.fstype)


if __name__ == "__main__":
try:
main()
except OSError as err:
print(err, file=stderr)
raise SystemExit(1)
9 changes: 9 additions & 0 deletions tests/main/snap-discard-ns/mount.sh
@@ -0,0 +1,9 @@
#!/bin/sh
if [ "$(command -v python3)" != "" ]; then
exec python3 ./mount.py "$@";
elif [ "$(command -v python2)" != "" ]; then
exec python2 ./mount.py "$@";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this logic if it works with both versions of python, why not simply check for /usr/bin/python (and have #!/usr/bin/env python shebang?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python is usually python2 but I wanted to be explicit. In practice we will only pick up the other case (python2) on the Amazon Linux 2 systems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, not a problem at all, just looks a little elaborate considering the fact that we don't actually care which python version it is.

else
echo "cannot mount: Python 2 or 3 required"
exit 1
fi
13 changes: 12 additions & 1 deletion tests/main/snap-discard-ns/task.yaml
Expand Up @@ -28,14 +28,25 @@ execute: |

echo "We can discard the namespace after a snap runs"
test-snapd-tools.success
# NOTE: the per-user profile is faked but we test that it is removed correctly.
touch /run/snapd/ns/test-snapd-tools.1000.mnt
# NOTE: use a low-level mount operation to avoid interacting with the
# /etc/mtab file. On pre-systemd systems mtab was a regular file and
# libmount can be confused into misbehaving because none of the snapd
# low-level mount tooling uses it.
./mount.sh --bind /run/snapd/ns/test-snapd-tools.mnt /run/snapd/ns/test-snapd-tools.1000.mnt
# The last hex is the same as nsfs but older stat on ubuntu 14.04 doesn't know
# proc is there because on older kernels /proc/*/ns/mnt is not on nsfs but still on procfs.
stat -f -c %T /run/snapd/ns/test-snapd-tools.mnt | MATCH 'proc|nsfs|0x6e736673'
stat -f -c %T /run/snapd/ns/test-snapd-tools.1000.mnt | MATCH 'proc|nsfs|0x6e736673'
$LIBEXECDIR/snapd/snap-discard-ns test-snapd-tools
stat -f -c %T /run/snapd/ns/test-snapd-tools.mnt | MATCH 'tmpfs'
test ! -e /run/snapd/ns/test-snapd-tools.mnt
test ! -e /run/snapd/ns/test-snapd-tools.1000.mnt

echo "We can fake a current mount profile and see that it is removed too"
test-snapd-tools.success
touch /run/snapd/ns/snap.test-snapd-tools.fstab
touch /run/snapd/ns/snap.test-snapd-tools.1000.fstab
$LIBEXECDIR/snapd/snap-discard-ns test-snapd-tools
test ! -e /run/snapd/ns/snap.test-snapd-tools.fstab
test ! -e /run/snapd/ns/snap.test-snapd-tools.1000.fstab