Don't map ourselves to root in the user namespace. #3609

zenhack · 2022-02-13T06:40:57Z

Take 2 for fixing #3584; this follows @kentonv's suggestion (2). As it turns out, we were already creating a userns in the right spot, but it doesn't take effect until after a fork(), so we have to do some footwork to make sure the rest of the logic happens in a child process.

Caveat: a few tests still fail on my machine, but for reasons that don't seem to be related: things like:

 ✖ tests/system-api

   - Test system api (18.338s)
   Timed out while waiting for element <.grainlog-contents > pre> to be present for 5000 milliseconds.  - expected "visible" but got: "not found"
       at Object.module.exports.Test system api (/home/isd/src/foreign/sandstorm/tests/tests/system-api.js:38:6)
       at processTicksAndRejections (internal/process/task_queues.js:77:11)

Despite the fact that, when I run this with SHOW_BROWSER=true, the thing it's querying for is there. I'm also seeing these failures on my other branch, which I wasn't before. Before I fuss with this more I want to see if it passes in CI; marking this as a draft for now, but if I don't see those failures in CI I'll mark it as ready for review again.

Also, we do not appear to have tests that actually check that updates work correctly (we should add those), but I did check this manually. With only the first hunk (not mapping to uid 0), updates are indeed broken, but with the full patch they work again, as does launching apps afterwards.

zenhack · 2022-02-13T07:20:33Z

CI looks ok; marking this as ready for review.

kentonv · 2022-02-13T15:49:49Z

src/sandstorm/run-bundle.c++

@@ -1432,6 +1426,25 @@ private:
      tmpfsUidOpts = ",uid=0,gid=0";
    } else {
      unshareUidNamespaceOnce();


I think this normally won't create a user namespace because we already unshared the user namespace in the parent process -- hence the "once".

What we probably need to do here is change unshareUidNamespaceOnce() to unshare every time if getuid() returns non-zero.

There's a further complexity with regard to mounting /proc. In the logic below, we either mount proc if we're in a PID namespace, or we bind-mount the original proc if not. However, if we're in a PID namespace, it is owned by the parent userns, on which we've already lost capabilities. So this doesn't work anymore. I guess we may have to nest the PID namespaces as well, so we can create a new one here so that it's mountable? Ugh.

I think what ends up happening is:

sandstorm start does not create a userns here, as you suggest.

sandstorm continue does create a userns, since there is no prior call to this in that codepath. This is where it counts, after the exec.

I don't know why the fork seems to be necessary.

This raises another concern: If we're spawning a new userns on each exec, is it possible this is going to grow without bound, adding another level of nesting after each update? This probably wouldn't grow that quickly, but I could forsee some future point in time where we hit a limit and everybody's server crashes...

The unsharedUidNamespace flag is actually passed through sandstorm continue via the --userns flag, so in fact a user NS is never created here during either sandstorm start or sandstorm continue. (I think the reason this line exists is for the call site from sandstorm mongo.)

My guess is that this change works for the initial startup because in that case there's no exec() where the capabilities are dropped, but it probably won't work after the first update.

I think we probably need to create both a nested userns and a nested pidns. And I think to do that without creating yet another process in the hierarchy, we might want to use clone() instead of exec() to create the server monitor, so that we can pass CLONE_NEWUSER and CLONE_NEWPID at that time, without affecting the state of the parent process.

kentonv · 2022-02-13T15:52:22Z

src/sandstorm/run-bundle.c++

@@ -1432,6 +1426,25 @@ private:
      tmpfsUidOpts = ",uid=0,gid=0";
    } else {
      unshareUidNamespaceOnce();
+
+      // In order to actually enter the new user namespace, we


I don't think you need to fork to enter a userns. You do need to fork to enter a PID namespace.

kentonv · 2022-02-13T15:55:16Z

This doesn't look to me like it should work... as noted, I don't think we are creating a userns there, and I don't see why the added fork() would change anything in any case. Are you sure you tested it on an affected kernel?

zenhack · 2022-02-14T21:48:40Z

To make sure we are on the same page about the behavior I am observing:

With the full patch:
- if I do ./tests/tmp-sandstorm/sandstorm start, poke around, a bit, everything appears to work.
- After then doing ./tests/tmp-sandstorm/sandstorm update sandstorm-0-fast.tar.xz, it continues to work as expected.
With only the first hunk, i.e. no fork(), everything works on start, but then update breaks it; sandstorm fails to come up after an update, with this in the log:

** Shutting down due to signal
Not killing Dev daemon because it is not running.
** Restarting
WARNING: Auto-updates are disabled by config.
*** Uncaught exception ***
sandstorm/run-bundle.c++:1453: failed: unshare(CLONE_NEWNS): Operation not permitted
stack: 623b78 61df67 61b957 63a3f6 63a2ac 495ea7 495a84
** Server monitor died. Aborting.

Without any change, grains do not start, even before an update, as in the tests on master.

You have talked me into being confused by these observations, but empirically the fork() does seem to fix the problem. I don't know why, and we should obviously not merge this until we do.

zenhack · 2022-02-14T22:00:54Z

Hm, I noticed:

> ./tests/tmp-sandstorm/sandstorm stop
Waiting for PID 277346 to terminate...
Did not terminate after 10 seconds; killing...
Sandstorm server stopped.

If, using the apparently-working patched version I do a sandstorm update to the version on master, it appears to still work after the update, but if I then stop & restart sandstorm as above, then it stops working. So I think what's happening is that the fork is somehow preventing sandstorm from entirely restarting at all.

zenhack · 2022-02-15T01:18:14Z

I took a stab at doing this with clone() and a pid namespace. Per the commit message, it's still not working; for some reason waitpid() fails during shutdown:

** Shutting down due to signal
*** Uncaught exception ***
sandstorm/run-bundle.c++:2042: failed: waitpid(sandstormPid, &status, 0): No child processes
stack: 61be6b 619fc5 61977c 495ea7 495a84

I'm out of steam for the night though, so marking this as a draft and I'll debug later.

zenhack · 2022-02-20T02:19:09Z

I spent more time trying to debug this, and am stumped. Things I have observed:

If we just wait() instead of waitpid(), we still get ECHILD, so it's not that we're specifying the wrong pid or something. According to wait(2), this should only happen if the calling process really has no children that haven't been waited for.
However I have confirmed (by inspecting the process hierarchy manually) that the server monitor definitely is a child of the main process.
If we leave out the wait/waitpid entirely, we do get a zombie process.

I'm not sure what to try next.

kentonv · 2022-03-13T21:50:43Z

I think you're hitting this, from the man page:

   The child termination signal
       When  the child process terminates, a signal may be sent to the parent.  The termina‐
       tion signal is specified in the low byte of flags (clone()) or in cl_args.exit_signal
       (clone3()).   If  this  signal  is specified as anything other than SIGCHLD, then the
       parent process must specify the __WALL or __WCLONE options when waiting for the child
       with wait(2).  If no signal (i.e., zero) is specified, then the parent process is not
       signaled when the child terminates.

To fix this you need to OR SIGCHLD into the flags.

Meanwhile, though, you probably shouldn't use the glibc wrapper here. The glibc wrapper is really intended for thread use cases, and it doesn't allow you to use clone() as a drop-in replacement for fork(). If you invoke the raw syscall, you can do this (so you won't need to allocate a stack).

syscall(SYS_clone, CLONE_NEWUSER | CLONE_NEWPID | SIGCHLD, nullptr, nullptr, nullptr, nullptr);

Doing this is the source of sandstorm-io#3584. Instead, make sure we've fully entered a new user namespace before we have to do anything that would require the capabilities that are dropped on exec(). We also need to be in a new pid namespace, since we try to mount /proc so it needs to be a procfs that we own. We use clone() instead of unshare() for this so we don't have to disturb the process hierarchy.

zenhack · 2022-03-13T22:50:29Z

Aha! Ok, the latest version fixes the problem by adding SIGCHLD, and also invokes the raw system call as you suggest; marking as ready for review.

zenhack · 2022-04-03T21:28:12Z

Hm, it looks like CI is hanging -- it's passing locally...

src/sandstorm/run-bundle.c++

kentonv · 2022-04-10T16:22:24Z

Looks like that worked!

zenhack marked this pull request as draft February 13, 2022 06:42

zenhack mentioned this pull request Feb 13, 2022

Rework the way updates work to fix #3584 #3603

Closed

3 tasks

zenhack marked this pull request as ready for review February 13, 2022 07:20

ocdtrekkie linked an issue Feb 13, 2022 that may be closed by this pull request

Permissions error causing test failures #3584

Closed

ocdtrekkie added bug ready-for-review We think this is ready for review sandstorm-dev Issues hacking on Sandstorm labels Feb 13, 2022

kentonv reviewed Feb 13, 2022

View reviewed changes

zenhack force-pushed the userns-dont-map-root branch from 91b8b3f to 342f266 Compare February 15, 2022 01:16

zenhack marked this pull request as draft February 15, 2022 01:18

zenhack force-pushed the userns-dont-map-root branch from 342f266 to 4c03091 Compare February 20, 2022 00:36

ocdtrekkie removed the ready-for-review We think this is ready for review label Feb 23, 2022

zenhack mentioned this pull request Mar 13, 2022

Fix web-publishing test #3606

Merged

zenhack force-pushed the userns-dont-map-root branch from 4c03091 to 2bb450f Compare March 13, 2022 22:49

zenhack marked this pull request as ready for review March 13, 2022 22:50

zenhack requested a review from kentonv March 13, 2022 22:50

ocdtrekkie added the ready-for-review We think this is ready for review label Mar 13, 2022

Merge remote-tracking branch 'origin/master' into userns-dont-map-root

6d4ed35

zenhack mentioned this pull request Apr 3, 2022

WIP: make upgrading demo accounts available via a config option. #3614

Draft

kentonv reviewed Apr 10, 2022

View reviewed changes

src/sandstorm/run-bundle.c++ Outdated Show resolved Hide resolved

Use clone()'s return rather than parent_tid

a722f22

kentonv merged commit 7968a62 into sandstorm-io:master Apr 10, 2022

zenhack deleted the userns-dont-map-root branch April 10, 2022 21:09

ocdtrekkie removed the ready-for-review We think this is ready for review label Apr 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't map ourselves to root in the user namespace. #3609

Don't map ourselves to root in the user namespace. #3609

zenhack commented Feb 13, 2022

zenhack commented Feb 13, 2022

kentonv Feb 13, 2022

zenhack Feb 13, 2022

kentonv Feb 14, 2022

kentonv Feb 13, 2022

kentonv commented Feb 13, 2022

zenhack commented Feb 14, 2022

zenhack commented Feb 14, 2022

zenhack commented Feb 15, 2022

zenhack commented Feb 20, 2022

kentonv commented Mar 13, 2022

zenhack commented Mar 13, 2022

zenhack commented Apr 3, 2022

kentonv commented Apr 10, 2022

Don't map ourselves to root in the user namespace. #3609

Don't map ourselves to root in the user namespace. #3609

Conversation

zenhack commented Feb 13, 2022

zenhack commented Feb 13, 2022

kentonv Feb 13, 2022

Choose a reason for hiding this comment

zenhack Feb 13, 2022

Choose a reason for hiding this comment

kentonv Feb 14, 2022

Choose a reason for hiding this comment

kentonv Feb 13, 2022

Choose a reason for hiding this comment

kentonv commented Feb 13, 2022

zenhack commented Feb 14, 2022

zenhack commented Feb 14, 2022

zenhack commented Feb 15, 2022

zenhack commented Feb 20, 2022

kentonv commented Mar 13, 2022

zenhack commented Mar 13, 2022

zenhack commented Apr 3, 2022

kentonv commented Apr 10, 2022