Allow fuse in systemd-nspawn containers #17607

Open
paravoid opened this issue Nov 14, 2020 · 5 comments
Labels
nspawn RFE 🎁 Request for Enhancement, i.e. a feature request

Comments

@paravoid

@poettering, in #2178 (comment) (Dec 2015) wrote:

fuse is currently not supported in nspawn containers, because the kernel interfaces aren't compatible with namespaces/containers. Or at least they weren't the last time I looked, and leaked information from inside the container to the host. If that changed now we can certainly open that up, too.

and in #7669 (comment) (Dec 2017) also wrote:

Well, I'd love to just make fuse available by default, but iirc fuse is still not properly set up for usage in containers, i.e. just making the fuse node available is not safe, as it leaks access on host-side mounts or so, I don't remember the details about that.

If that changes we should just make FUSE available by default. Until then: you can hack around this already if you are willing to ignore the security implications around this. Just use "--bind=/dev/fuse" on the nspawn cmdline, it knows how to deal with device nodes these days.

Evidently, this has actually changed :) See Linux commit torvalds/linux@da315f6 (v4.18-rc1); excerpt from the commit message:

"The most interesting part of this update is user namespace support, mostly done by Eric Biederman. This enables safe unprivileged fuse mounts within a user namespace."

Given @mount is already part of the seccomp allow list for nspawn, and even /sys/fs/fuse/ is exposed, I think that, effectively, this should do it (completely untested):

diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c
index 0842731c18..20e34ab9a6 100644
--- a/src/nspawn/nspawn.c
+++ b/src/nspawn/nspawn.c
@@ -2122,6 +2122,7 @@ static int copy_devnodes(const char *dest) {
                 "random\0"
                 "urandom\0"
                 "tty\0"
+                "fuse\0"
                 "net/tun\0";
 
         _cleanup_umask_ mode_t u;
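
For readers unfamiliar with the idiom, the list being patched is a single string literal whose entries are separated by NUL bytes and which ends at the first empty entry, so appending "fuse\0" is enough to add one more device node name. Below is a minimal sketch of how such a list can be walked. It is not the code from nspawn.c: the helper names `nulstr_has` and `nulstr_count` are hypothetical, and the array holds only the entries visible in the diff, not the full list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative copy of the list, limited to the entries visible in the diff
 * (the real list in nspawn.c has more entries above these). */
static const char devnodes[] =
        "random\0"
        "urandom\0"
        "tty\0"
        "fuse\0"
        "net/tun\0";

/* Hypothetical helper: returns true if name is one of the entries of a
 * NUL-separated list that is terminated by an empty entry. */
static bool nulstr_has(const char *list, const char *name) {
        for (const char *d = list; *d; d += strlen(d) + 1)
                if (strcmp(d, name) == 0)
                        return true;
        return false;
}

/* Hypothetical helper: counts the entries of such a list. */
static size_t nulstr_count(const char *list) {
        size_t n = 0;
        for (const char *d = list; *d; d += strlen(d) + 1)
                n++;
        return n;
}
```

The walk stops at the empty entry because the compiler appends one more NUL after the explicit "\0" that closes "net/tun".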

#2178 (sshfs) and #6553 (glusterfs) have some real-world use cases. I'm interested in fuse-overlayfs (for podman) myself.

@poettering added the "nspawn" and "RFE 🎁 Request for Enhancement, i.e. a feature request" labels on Nov 18, 2020

@Korynkai

Given @mount is already part of the seccomp allow list for nspawn, and even /sys/fs/fuse/ is exposed, I think that, effectively, this should do it (completely untested):

diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c
index 0842731c18..20e34ab9a6 100644
--- a/src/nspawn/nspawn.c
+++ b/src/nspawn/nspawn.c
@@ -2122,6 +2122,7 @@ static int copy_devnodes(const char *dest) {
                 "random\0"
                 "urandom\0"
                 "tty\0"
+                "fuse\0"
                 "net/tun\0";
 
         _cleanup_umask_ mode_t u;

@paravoid Uhhh.... And wow does it work....

This one-line, eight-character simple patch is exactly what we needed, even without being tested. Thank you! (Well, err, I guess we just tested it)

Our use case relies heavily on nspawn, and we've been having difficulties with everything from filesystem binds (when PrivateUsers=pick is involved) to any type of remote filesystem. We wanted to use GlusterFS for our scenario, but since it relies on FUSE, this was a significant issue. Now we're able to use GlusterFS without difficulty.

I am curious as to whether any issues (security or otherwise) will occur with this patch, but we're hopeful considering at least we have it working and no longer need to rely on unnecessarily redundant data consuming large amounts of disk space anymore.

If we come across any issues with this patch, I will post another comment. The only one I can think of currently is the case where the userspace FUSE tools are not present on the container or host system, in which case the device itself would be unnecessary. I do hope, however, that the systemd developers (presumably together with the kernel developers) eventually decide upon a preferred solution, but, well, this should work for now in our systems! Thanks again!

@paravoid
Author

This one-line, eight-character simple patch is exactly what we needed, even without being tested. Thank you! (Well, err, I guess we just tested it)

Thank you for testing this and reporting back!

I am curious as to whether any issues (security or otherwise) will occur with this patch, but we're hopeful considering at least we have it working and no longer need to rely on unnecessarily redundant data consuming large amounts of disk space anymore.

The most obvious one is that on older kernels (for mainline, that's <= 4.17) there was no namespace support for fuse, and this patch has no version check, which I believe would create an insecure configuration. In another issue, @poettering expressed his preference for guarding a similar feature behind a kernel feature flag rather than a version check (to account for backports etc.), but I'm not sure what we could check here to conditionally enable this support.

@poettering
Member

If there's no sane way to check for the feature explicitly, I figure we could query the fuse minor version (FUSE_KERNEL_MINOR_VERSION) from the device and base things on that.

@poettering
Member

(i.e. the point I am making: if we have to do a version check, then it should be a runtime check against the fuse version, and not the kernel version. The more focused and dynamic the better, since people backport features to older kernels all the time.)

Anyway, happy to review/merge a patch for this.

@iam-TJ

iam-TJ commented May 8, 2021

I'm investigating how to complete this issue. Apparently it "only" needs a run-time check on the fuse version, but I've not been able to figure out how that should be done. The options I can see:

  1. link and call into libfuse
  2. direct open of /dev/fuse with a FUSE_INIT check (out.major > 7 || (out.major == 7 && out.minor >= 27))
  3. some other?

Guidance please!
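
Whichever option ends up supplying the numbers, the comparison in option 2 can be kept as a small pure function so it is easy to test on its own. The following is only a sketch under assumptions: the function and macro names are hypothetical, the 7.27 threshold is taken from the expression in the list above and has not been verified against the kernel sources, and how the major/minor values are actually obtained from /dev/fuse is deliberately left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold, copied from the expression in the comment above;
 * not verified against the kernel sources. */
#define FUSE_USERNS_MIN_MAJOR 7u
#define FUSE_USERNS_MIN_MINOR 27u

/* Hypothetical helper: true if protocol version major.minor is at least
 * want_major.want_minor. */
static bool fuse_proto_at_least(uint32_t major, uint32_t minor,
                                uint32_t want_major, uint32_t want_minor) {
        return major > want_major ||
               (major == want_major && minor >= want_minor);
}

/* Hypothetical helper: applies the assumed threshold to a version reported
 * at run time. */
static bool fuse_proto_has_userns(uint32_t major, uint32_t minor) {
        return fuse_proto_at_least(major, minor,
                                   FUSE_USERNS_MIN_MAJOR, FUSE_USERNS_MIN_MINOR);
}
```

A caller would expose the fuse device node only when `fuse_proto_has_userns()` returns true for the version reported at run time, which matches the preference stated earlier in the thread for checking the fuse version rather than the kernel version.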
