Add support module for namespace sharing #126

Merged
merged 35 commits into from Sep 12, 2016
Commits
Jump to file or symbol
Failed to load files and symbols.
+17 −20
Split
Viewing a subset of changes. View all

Use fstatfs() and NSFS_MAGIC to decide about setns()

This patch fixes a potentially iffy code that used to open the mount
namespace file /run/snapd/ns/$SNAP_NAME.mnt and then blindly call
sents() on the file descriptor, if the open call succeeded.

Instead, we now fstatfs() the file descriptor and check the resulting
f_type field. If it is NSFS_MAGIC then we know that we should use
setns(). If not we fall back to regular initialization.

All errors from setns() now result in snap-confine death. This should be
more reliable against unexpected bugs.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
  • Loading branch information...
commit 40a93a241b1fbfc34c6eb6b9e87f951b218ee1c7 @zyga zyga committed Sep 8, 2016
View
@@ -23,6 +23,7 @@
#include <errno.h>
#include <fcntl.h>
+#include <linux/magic.h>
#include <sched.h>
#include <signal.h>
#include <string.h>
@@ -32,6 +33,7 @@
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/types.h>
+#include <sys/vfs.h>
#include <sys/wait.h>
#include <unistd.h>
#ifdef HAVE_APPARMOR
@@ -250,38 +252,33 @@ void sc_create_or_join_ns_group(struct sc_ns_group *group)
// sc_discard_preserved_ns_group() it will revert to a regular file. If
// snap-confine is killed for whatever reason after the file is created but
// before the file is bind-mounted it will also be a regular file.
- //
- // The code below handles this by trying to join the namespace with setns()
- // and handling both the successful and the unsuccessful paths.
mnt_fd =
openat(group->dir_fd, mnt_fname,
O_CREAT | O_RDONLY | O_CLOEXEC | O_NOFOLLOW, 0600);
@jdstrand

jdstrand Sep 6, 2016

Contributor

Can you add a comment here on why you are not using O_EXCL? AIUI, you want to either open the nsfs mounted file or open the newly created file. It would be good to document that for auditability.

@zyga

zyga Sep 6, 2016

Collaborator

Yes, you are right both in the deduction of the intent and the suggestion to document this better.

@zyga

zyga Sep 8, 2016

Collaborator

Documented

if (mnt_fd < 0) {
die("cannot open mount namespace file for namespace group %s",
group->name);
}
- // attempt to join an existing group
- debug
- ("attempting to re-associate the mount namespace with the namespace group %s",
- group->name);
- if (setns(mnt_fd, CLONE_NEWNS) == 0) {
+ // Check if we got an nsfs-based file or a regular file. This can be
+ // reliably tested because nsfs has an unique filesystem type NSFS_MAGIC.
+ // We can just ensure that this is the case thanks to fstatfs.
+ struct statfs buf;
+ if (fstatfs(mnt_fd, &buf) < 0) {
+ die("cannot perform fstatfs() on an mount namespace file descriptor");
+ }
+ if (buf.f_type == NSFS_MAGIC) {
+ debug
+ ("attempting to re-associate the mount namespace with the namespace group %s",
+ group->name);
+ if (setns(mnt_fd, CLONE_NEWNS) < 0) {
+ die("cannot re-associate the mount namespace with namespace group %s", group->name);
+ }
debug
("successfully re-associated the mount namespace with the namespace group %s",
group->name);
return;
}
- // Anything but EINVAL is an unexpected error.
- //
- // EINVAL is simply a sign that the file we've opened is not a valid
- // namespace file descriptor. One potential case where this can happen is
- // when another snap-confine tried to initialize the namespace but was
- // killed before it managed to complete the process.
- if (errno != EINVAL) {
- die("cannot re-associate the mount namespace with namespace group %s", group->name);
- }
- debug
- ("cannot re-associate the mount namespace with namespace group %s, falling back to initialization",
- group->name);
+ debug("initializing new namespace group %s", group->name);
@jdstrand

jdstrand Sep 8, 2016

Contributor

This is soo much nicer, thanks! :)

It's interesting that NSFS_MAGIC is not included in man fstatfs. Note that this was added only in http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/include/uapi/linux/magic.h?id=e149ed2b805fefdccf7ccdfc19eca22fdd4514ac so this will likely need to be added to the backports list if it isn't already. It does seem present in the 14.04 Ubuntu kernel.

@zyga

zyga Sep 12, 2016

Collaborator

I've informed tvoss about this. With the sanity check we'll know if tests are all green. Thanks for making me aware of this :)

// Create a new namespace and ask the caller to populate it.
// For rationale of forking see this:
// https://lists.linuxfoundation.org/pipermail/containers/2013-August/033386.html