Request: if run as uid 0 in unpriv userns, use mount instead of fusermount #697

Closed
jonleivent opened this issue Oct 26, 2022 · 11 comments

@jonleivent

Recent Linux kernels (I think since 4.18?) allow mounts within unprivileged namespaces without using a setuid helper like fusermount. This is especially advantageous within unprivileged sandboxes that have no_new_privs set (for example, flatpaks and other things using bwrap) and hence cannot use setuid binaries like fusermount even if any are installed. Is it possible for gocryptfs to attempt to use mount before falling back on fusermount in such cases?

@rfjakob
Owner

rfjakob commented Dec 29, 2022

Hi, yes it is. There's some brokenness in go-fuse that needs to be shaken out first, working on it: https://review.gerrithub.io/c/hanwen/go-fuse/+/547866

@jonleivent
Author

Another similar case: no namespaces, but /etc/fstab has an unprivileged mount point set up for the user (user,noauto options). That would be another case where trying mount before falling back to fusermount would make sense.

@jumoog

jumoog commented Mar 18, 2023

> Hi, yes it is. There's some brokenness in go-fuse that needs to be shaken out first, working on it: https://review.gerrithub.io/c/hanwen/go-fuse/+/547866

It's merged

rfjakob added a commit that referenced this issue May 17, 2023
Attempt to directly call mount(2) before trying fusermount. This means we
can do without fusermount if running as root.

#697
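
The ordering described in the commit message (direct mount(2) first, fusermount only as a fallback) can be sketched roughly as below. This is a minimal illustration, not the go-fuse or gocryptfs implementation: `mounter`, `mountWithFallback` and `errDenied` are hypothetical names, and the two mount strategies are passed in as plain functions so the control flow can be run without privileges.

```go
package main

import (
	"errors"
	"fmt"
)

// mounter is a hypothetical stand-in for one way of establishing a mount,
// e.g. a direct mount(2) call or spawning the fusermount helper.
type mounter func() error

// errDenied simulates mount(2) failing for lack of privileges.
var errDenied = errors.New("mount: operation not permitted")

// mountWithFallback tries direct first and calls helper only if direct fails.
// It reports which strategy succeeded.
func mountWithFallback(direct, helper mounter) (string, error) {
	errDirect := direct()
	if errDirect == nil {
		return "direct", nil
	}
	if errHelper := helper(); errHelper != nil {
		return "", fmt.Errorf("direct mount failed: %v; helper failed: %v", errDirect, errHelper)
	}
	return "helper", nil
}

func main() {
	// Simulate an unprivileged process: the direct mount is denied,
	// so the helper path is taken.
	how, err := mountWithFallback(
		func() error { return errDenied },
		func() error { return nil },
	)
	fmt.Println(how, err)
}
```

Keeping the helper as a fallback means existing setups that rely on a setuid fusermount would keep working when the direct call is refused.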
@rfjakob
Owner

rfjakob commented May 17, 2023

So, uhmm, I implemented this. gocryptfs now tries direct mount before falling back to fusermount.

HOWEVER, I was unable to get this to work inside a podman container.

The user_namespaces(7) man page (https://man7.org/linux/man-pages/man7/user_namespaces.7.html) says:

 Holding CAP_SYS_ADMIN within the user namespace that owns a
 process's mount namespace allows that process to create bind
 mounts and mount the following types of filesystems:

     * /proc (since Linux 3.8)
     * /sys (since Linux 3.8)
     * devpts (since Linux 3.9)
     * tmpfs(5) (since Linux 3.9)
     * ramfs (since Linux 3.9)
     * mqueue (since Linux 3.9)
     * bpf (since Linux 4.4)
     * overlayfs (since Linux 5.11)

Note the absence of FUSE.

OTOH, I see that this patch was merged in 2018: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4ad769f3c346ec3d458e255548dec26ca5284cf6

And this commit was included in Linux v4.18, as you said:

$ git describe --contains 4ad769f3c346ec3d458e255548dec26ca5284cf6
v4.18-rc1~113^2~3

@rfjakob
Owner

rfjakob commented May 17, 2023

Looks like I needed to pass --privileged to podman run, but I guess --cap-add ... should be used instead. Anyway, lookie lookie:

$ podman run --detach --device /dev/fuse --privileged debian sleep 999d

$ podman ps
CONTAINER ID  IMAGE                            COMMAND     CREATED         STATUS         PORTS       NAMES
034dff6b4d0a  docker.io/library/debian:latest  sleep 999d  11 seconds ago  Up 11 seconds              hardcore_pike

$ ./build-without-openssl.bash
$ podman cp gocryptfs hardcore_pike:/bin/gocryptfs

$ podman exec -it hardcore_pike bash

root@034dff6b4d0a:/# ./gocryptfs -init a
[...]

root@034dff6b4d0a:/# ./gocryptfs a b
Password: 
Decrypting master key
Filesystem mounted and ready.
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchLoggerToSyslog: Unix syslog delivery error

root@034dff6b4d0a:/# df -T
Filesystem     Type           1K-blocks      Used Available Use% Mounted on
overlay        overlay        239238788 127865460  99147896  57% /
tmpfs          tmpfs              65536         0     65536   0% /dev
devtmpfs       devtmpfs            4096         0      4096   0% /dev/lp3
tmpfs          tmpfs            2445496       620   2444876   1% /etc/hosts
shm            tmpfs              64000         0     64000   0% /dev/shm
/a             fuse.gocryptfs 239238788 127865460  99147896  57% /b

rfjakob closed this as completed May 17, 2023
@jonleivent
Author

I don't have a full understanding of what podman --privileged does. Would gocryptfs mounts work there even without this fix?
Looking around to figure out what podman --privileged does: it sounds like it gives the child process pretty much the exact same security environment as the process calling podman, no more, no less. If so, the child could successfully call a suid exe like fusermount. So I am not sure your test case is adequate, even though your fix might be. At least that is my limited understanding of podman --privileged.

Maybe if you temporarily make fusermount non suid, so that gocryptfs fails outside podman, and redo this test? Or add --security-opt=no-new-privileges, assuming that works with --privileged properly?
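
To see which of these conditions applies in a given environment, one could inspect the process and the helper binary directly. The sketch below is illustrative only: it assumes the kernel exposes the `NoNewPrivs:` field in /proc/self/status (Linux 4.10 and later), and the path `/usr/bin/fusermount` is a guess that varies by distribution. `noNewPrivs` and `isSetuid` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// noNewPrivs scans the text of /proc/<pid>/status for the "NoNewPrivs:" field.
// It returns (set, found); found is false when the field is absent.
func noNewPrivs(status string) (set bool, found bool) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "NoNewPrivs:") {
			val := strings.TrimSpace(strings.TrimPrefix(line, "NoNewPrivs:"))
			return val == "1", true
		}
	}
	return false, false
}

// isSetuid reports whether the file at path has the setuid bit set.
func isSetuid(path string) (bool, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	return fi.Mode()&os.ModeSetuid != 0, nil
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	set, found := noNewPrivs(string(data))
	fmt.Printf("no_new_privs: set=%v (field found=%v)\n", set, found)

	// Assumed path; adjust for the system being inspected.
	suid, err := isSetuid("/usr/bin/fusermount")
	fmt.Printf("fusermount setuid=%v err=%v\n", suid, err)
}
```

If no_new_privs is set, or the helper is missing or not setuid, a successful mount would have to come from the direct path.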

@rfjakob
Owner

rfjakob commented May 17, 2023

I guess I missed a detail of my test. fusermount is not installed in a default debian docker container. Also, enabling --fusedebug, we see that the mount syscall succeeds and fusermount is not attempted:

root@034dff6b4d0a:/# fusermount
bash: fusermount: command not found

root@034dff6b4d0a:/# fusermount3
bash: fusermount3: command not found

root@034dff6b4d0a:/# ./gocryptfs -fusedebug a b
Password: 
Decrypting master key
16:39:53.944555 mountDirect: calling syscall.Mount("/a", "/b", "fuse.gocryptfs", 0x6, "fd=8,rootmode=40000,user_id=0,group_id=0,max_read=1048576")
16:39:53.944983 rx 2: INIT n0 {7.38 Ra 131072 SPLICE_MOVE,POSIX_ACL,NO_OPENDIR_SUPPORT,EXPLICIT_INVAL_DATA,READDIRPLUS,ASYNC_DIO,WRITEBACK_CACHE,HANDLE_KILLPRIV,ASYNC_READ,ATOMIC_O_TRUNC,DONT_MASK,AUTO_INVAL_DATA,CACHE_SYMLINKS,IOCTL_DIR,ABORT_ERROR,BIG_WRITES,SPLICE_READ,FLOCK_LOCKS,NO_OPEN_SUPPORT,PARALLEL_DIROPS,MAX_PAGES,POSIX_LOCKS,EXPORT_SUPPORT,SPLICE_WRITE,READDIRPLUS_AUTO,0x70000000} "\x03\x00\x00\x00\x00\x00\x00\x00"... 48b
16:39:53.944999 tx 2:     OK, {7.28 Ra 131072 AUTO_INVAL_DATA,READDIRPLUS,ASYNC_READ,BIG_WRITES,NO_OPEN_SUPPORT,PARALLEL_DIROPS,MAX_PAGES 0/0 Wr 1048576 Tg 0 MaxPages 256}
16:39:53.945212 rx 4: LOOKUP n1 [".go-fuse-epoll-hack"] 20b
16:39:53.945246 tx 4:     OK, {n18446744073709551615 g0 tE=0s tA=0s {M0100644 SZ=0 L=1 0:0 B0*0 i0:18446744073709551615 A 0.000000 M 0.000000 C 0.000000}}
16:39:53.945364 rx 6: OPEN n18446744073709551615 {O_RDONLY,0x8000} 
16:39:53.945373 tx 6:     OK, {Fh 18446744073709551615 }
16:39:53.945435 rx 8: POLL n18446744073709551615 
16:39:53.945471 tx 8:     38=function not implemented
16:39:53.945544 rx 10: FLUSH n18446744073709551615 {Fh 18446744073709551615} 
16:39:53.945554 tx 10:     OK
Filesystem mounted and ready.
16:39:53.945692 rx 12: RELEASE n18446744073709551615 {Fh 18446744073709551615 0x8000  L0} 
16:39:53.945708 tx 12:     OK
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchToSyslog: Unix syslog delivery error
SwitchLoggerToSyslog: Unix syslog delivery error
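
For readers decoding the `mountDirect` log line above: the flags value 0x6 corresponds to MS_NOSUID (0x2) combined with MS_NODEV (0x4), and `rootmode=40000` is the octal value of S_IFDIR. A minimal sketch of how such a data string could be assembled follows. `fuseMountData` is a hypothetical name, not the go-fuse function, and the sketch only formats the string without calling mount(2).

```go
package main

import "fmt"

// Linux mount flag values, matching the 0x6 seen in the log line.
const (
	msNosuid   = 0x2
	msNodev    = 0x4
	mountFlags = msNosuid | msNodev
)

// rootModeDir is S_IFDIR: the root of the mounted filesystem is a directory.
const rootModeDir = 0o40000

// fuseMountData formats the comma-separated data argument handed to mount(2)
// for a FUSE filesystem. fd is the already opened /dev/fuse descriptor.
func fuseMountData(fd int, uid, gid, maxRead uint32) string {
	return fmt.Sprintf("fd=%d,rootmode=%o,user_id=%d,group_id=%d,max_read=%d",
		fd, rootModeDir, uid, gid, maxRead)
}

func main() {
	fmt.Printf("flags=%#x data=%q\n", mountFlags, fuseMountData(8, 0, 0, 1048576))
}
```

With the values from the log (fd 8, uid and gid 0, max_read 1048576) this reproduces the data string shown in the `syscall.Mount` call.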

@jonleivent
Author

Cool! I guess this loses the no allow_other protections, though, which I wasn't thinking much about when I first submitted this, but obviously have been recently. I wonder whether my pam_mount gocryptfs will still use fuse? I don't know what caps pam_mount gives its client mounter, which runs as the user - probably enough to do a normal non-fuse mount over any mountpoint they own.
Maybe having a gocryptfs command line option to force fuse even if normal mount works?
There are two cases: if in a sandbox where the user can use normal mount, then the user is in their own mount namespace and so probably has no worries about spying from other users. But if outside a sandbox, then they may prefer fuse to get the no allow_other protections that normal mounting doesn't offer, even if they somehow have enough caps to run normal mount.

@rfjakob
Owner

rfjakob commented May 18, 2023

It is my understanding that allow_other is independent of whether fusermount or direct mount is used.

Testing in podman confirms that:

root@034dff6b4d0a:/# ls b
hello
root@034dff6b4d0a:/# sudo -u daemon ls /b
ls: cannot access '/b': Permission denied

Also note that there's no allow_other in the mount flags inside the container:

root@034dff6b4d0a:/# mount | grep gocryptfs
/a on /b type fuse.gocryptfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=1048576)

And also, this mount cannot be accessed at all unless you are inside the container - it is not visible outside!
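
The same check on the mount flags can be automated. The sketch below parses a line in the format printed by `mount` and looks for a named option; `mountOptions`, `hasOption` and `sampleLine` are hypothetical names used for illustration, and the sample line is copied from the output above.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleLine is the mount(8) output line quoted in the comment above.
const sampleLine = "/a on /b type fuse.gocryptfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=1048576)"

// mountOptions returns the comma-separated options found inside the
// trailing parentheses of a mount(8) output line, or nil if there are none.
func mountOptions(line string) []string {
	start := strings.LastIndex(line, "(")
	end := strings.LastIndex(line, ")")
	if start < 0 || end < start {
		return nil
	}
	return strings.Split(line[start+1:end], ",")
}

// hasOption reports whether name appears as an option, either bare
// (like "nosuid") or as the key of a key=value pair (like "user_id=0").
func hasOption(line, name string) bool {
	for _, opt := range mountOptions(line) {
		key := opt
		if i := strings.Index(opt, "="); i >= 0 {
			key = opt[:i]
		}
		if key == name {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("allow_other:", hasOption(sampleLine, "allow_other"))
	fmt.Println("nosuid:", hasOption(sampleLine, "nosuid"))
}
```

Matching on whole option names rather than substrings avoids false positives from options that merely contain the searched text.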

@jonleivent
Author

Awesome!

@DrDaveD

DrDaveD commented May 26, 2023

I confirm that this works without fusermount for building and running apptainer encrypted containers unprivileged using the apptainer main (i.e. not yet released) code branch. Thanks!
