Cannot mount devpts or sysfs with a user namespace (as of v0.0.3) #225

Closed
wking opened this issue Aug 24, 2015 · 9 comments

Comments

@wking
Contributor

wking commented Aug 24, 2015

While adding user namespaces to my bundle, I had to drop the devpts and sysfs mounts from runC's default config, and I also had to drop ro from runC's default cgroup-mount options. Restoring them to my config.json.template leads to:

# runc config.json
Timestamp: 2015-08-24 16:47:34.679951846 -0700 PDT
Code: System error

Message: invalid argument

Frames:

---
0: setupRootfs
Package: github.com/opencontainers/runc/libcontainer
File: rootfs_linux.go@37

---
1: Init
Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit)
File: standard_init_linux.go@52

---
2: StartInitialization
Package: github.com/opencontainers/runc/libcontainer.(*LinuxFactory)
File: factory_linux.go@242

---
3: init·1
Package: main
File: run.go@21

---
4: init
Package: main
File: utils.go@177

---
5: main
Package: runtime
File: proc.go@58

---
6: goexit
Package: runtime
File: asm_amd64.s@2232
WARN[0000] signal: killed
FATA[0000] Container start failed: [8] System error: invalid argument
Makefile:11: recipe for target 'run' failed
make: *** [run] Error 1

with the devpts entry,

# runc config.json
Timestamp: 2015-08-24 16:46:33.526868472 -0700 PDT
Code: System error

Message: operation not permitted

Frames:

---
0: setupRootfs
Package: github.com/opencontainers/runc/libcontainer
File: rootfs_linux.go@37

---
1: Init
Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit)
File: standard_init_linux.go@52

---
WARN[0000] signal: killed                               
FATA[0000] Container start failed: [8] System error: operation not permitted 
Makefile:11: recipe for target 'run' failed
make: *** [run] Error 1

with the sysfs entry, and:

# runc config.json
Timestamp: 2015-08-24 16:51:02.5931295 -0700 PDT
Code: System error

Message: operation not permitted

Frames:

---
0: setupRootfs
Package: github.com/opencontainers/runc/libcontainer
File: rootfs_linux.go@37

---
1: Init
Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit)
File: standard_init_linux.go@52

---
2: StartInitialization
Package: github.com/opencontainers/runc/libcontainer.(*LinuxFactory)
File: factory_linux.go@242

---
3: init·1
Package: main
File: run.go@21

---
4: init
Package: main
File: utils.go@
WARN[0000] signal: killed
FATA[0000] Container start failed: [8] System error: operation not permitted 
Makefile:11: recipe for target 'run' failed
make: *** [run] Error 1

with ro in the cgroups entry. I haven't been able to figure out why I'm getting these mount errors. If I drop my user namespacing, the errors go away. Is this a runC bug? A user-namespace limitation? A bug in my config template? I'm happy to help with further digging, but I could use a few hints pointing me in a useful direction. Perhaps this is what @LK4D4 was thinking about when he mentioned reconsidering default mounts for unprivileged functionality.
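
Since the errors only show up with user namespacing enabled, one quick check is whether a given process really is inside a non-initial user namespace. The following is a rough sketch, not part of runC: it parses the text of /proc/self/uid_map, and the helper names `parse_id_map` and `looks_like_initial_userns` are made up for illustration. It relies on the initial namespace normally showing the single identity mapping `0 0 4294967295`. That is only a heuristic, because a privileged parent can write the same full-range mapping into a child namespace.

```python
# Heuristic user-namespace check based on the contents of /proc/self/uid_map.
# Helper names are hypothetical; they are not taken from runC or libcontainer.

FULL_RANGE = [(0, 0, 4294967295)]  # identity mapping usually seen in the initial namespace


def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_id, outside_id, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def looks_like_initial_userns(text):
    """Return True when the mapping is the single full-range identity map."""
    return parse_id_map(text) == FULL_RANGE


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as handle:
            content = handle.read()
    except OSError:
        print("no /proc/self/uid_map here (not Linux, or /proc is not mounted)")
    else:
        print("uid_map entries:", parse_id_map(content))
        print("looks like initial user namespace:", looks_like_initial_userns(content))
```

Running it once on the host and once inside the container should show different mappings when the user namespace is actually in effect.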

@mrunalp
Contributor

mrunalp commented Aug 26, 2015

@wking Thanks for creating the issue. @estesp reported this a few days back on IRC. This is most likely related to a kernel change. I will look into this.

@wking
Contributor Author

wking commented Aug 26, 2015

On Tue, Aug 25, 2015 at 08:10:59PM -0700, Mrunal Patel wrote:

> @estesp reported this a few days back on IRC.

Hmm, which channel/day? The most recent estesp comments in
#opencontainers look like Hangout linking after our last meeting
(2015-08-12).

> This is most likely related to a kernel change. I will look into
> this.

I'm running 4.1.0, if that helps.

@estesp
Contributor

estesp commented Aug 26, 2015

This is the kernel patch which seems to be the culprit for the issue: http://www.spinics.net/lists/linux-fsdevel/msg86256.html

It went into one of the Linux 4.2-rc's, but Ubuntu recently brought it into their 3.16.0-45 kernel update, which is how I happened to catch it. @wking where are you sourcing your 4.1 kernel from? If it comes from a distro, maybe they backported this patch into your kernel? Here is a direct link to the upstream commit, which shows the 4.2 tag/lineage: torvalds/linux@1b852bc#diff-3fbed1fd4d15699b74b30cabf5be8133

@wking
Contributor Author

wking commented Aug 26, 2015

On Tue, Aug 25, 2015 at 08:42:58PM -0700, Phil Estes wrote:

> @wking where are you sourcing your 4.1 kernel from?

I built it locally from 1, tag v4.1, so I don't have
torvalds/linux@1b852bceb or any FS_USERNS_VISIBLE in my source.

If we suspect a kernel issue, I can try bisecting to actually find the
culprit, but that may take a while and I'm happy to leave it to
someone who's got a spare box and more practice with kexec ;).

@mrunalp
Contributor

mrunalp commented Aug 31, 2015

I tested with kernel 4.2 on Fedora rawhide. There are two issues:

  1. ro needs to be removed from the cgroups mount, as @wking mentioned earlier (probably the remount isn't permitted; looking further).
  2. gid=5 needs to be taken out of the devpts mount options.

With these two changes to config.json, I was able to bring up a container in the busybox rootfs.
I will look into this further. @estesp, could you try these two changes and see if they work for you?
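
For anyone who wants to script the two config changes, here is a rough sketch. It assumes each mount entry is a mapping with a `type` key and a comma-separated `options` string. That layout, the sample option strings, and the helper name `drop_mount_options` are all assumptions for illustration, so check them against the config.json your runC version actually generates.

```python
# Sketch: strip specific options from mounts of a given filesystem type.
# The entry layout ("type", "source", "destination", "options") is assumed,
# not taken from a specific runC release.

def drop_mount_options(mounts, fs_type, unwanted):
    """Return a copy of mounts with the unwanted options removed from fs_type entries."""
    cleaned = []
    for mount in mounts:
        entry = dict(mount)  # shallow copy so the caller's list is left untouched
        if entry.get("type") == fs_type:
            kept = [opt for opt in entry.get("options", "").split(",")
                    if opt and opt not in unwanted]
            entry["options"] = ",".join(kept)
        cleaned.append(entry)
    return cleaned


# Hypothetical mount entries, for illustration only.
mounts = [
    {"type": "devpts", "source": "devpts", "destination": "/dev/pts",
     "options": "newinstance,ptmxmode=0666,mode=0620,gid=5"},
    {"type": "cgroup", "source": "cgroup", "destination": "/sys/fs/cgroup",
     "options": "nosuid,noexec,nodev,relatime,ro"},
]

fixed = drop_mount_options(mounts, "devpts", {"gid=5"})
fixed = drop_mount_options(fixed, "cgroup", {"ro"})

for mount in fixed:
    print(mount["destination"], mount["options"])
```

The function returns new entries instead of editing in place, so the original list can still be compared against the result.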

@wking
Contributor Author

wking commented Aug 31, 2015

On Mon, Aug 31, 2015 at 03:39:47PM -0700, Mrunal Patel wrote:

> 2. Take out gid=5 from the devpts mount options.

This works for me on my vanilla 4.1.

@estesp
Contributor

estesp commented Sep 2, 2015

Those changes work for me in runC as well. The original report I got came from someone testing my Docker-based user namespace patchset before and after the Ubuntu kernel update I mentioned, so it might be trickier to get a test scenario there. I could, however, provide a modified default_template.go for the native execdriver that changes the ro mount option and the gid=5 setting for devpts.

@crosbymichael
Member

This has been resolved.

stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
Expand on the definition of our ops
stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
This slipped through the renumbering in 7117ede (Expand on the
definition of our ops, 2015-10-13, opencontainers#225).

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
# digest/hashing target

Most of this has spun off with [1], and I haven't heard of anyone
talking about verifying the on-disk filesystem in a while.  My
personal take is on-disk verification doesn't add much over serialized
verification unless you have a local attacker (or unreliable disk),
and you'll need some careful threat modeling if you want to do
anything productive about the local attacker case.  For some more
on-disk verification discussion, see the thread starting with [2].

# distributable-format target

This spun off with [1].

# lifecycle target

I think this is resolved since 7713efc (Add lifecycle for containers,
2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP
entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230).

# container-action target

Addressed by 7117ede (Expand on the definition of our ops,
2015-10-13, opencontainers#225), although there has been additional discussion in
a7a366b (Remove exec from required runtime functionalities,
2016-04-19, opencontainers#388) and 0430aaf1 (Split create and start, 2016-04-01,
opencontainers#384).

# validation and testing targets

Validation is partly covered by cdcabde (schema: JSON Schema and
validator for `config.json`, 2016-01-19, opencontainers#313) and subsequent JSON
Schema work.  The remainder of these targets are handled by ocitools
[3].

# printable/compiled-spec target

The bulk of this was addressed by 4ee036f (*: printable documents,
2015-12-09, opencontainers#263).  Any remaining polishing of that workflow seems
like a GitHub-issue thing and not a ROADMAP thing.  And publishing
these to opencontainers.org certainly seems like it's outside the
scope of this repository (although I think that such publishing is a
good idea).

[1]: https://github.com/opencontainers/image-spec
[2]: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/NHpSQ19KCAAJ
     Subject: OCI Bundle Digests Summary
     Date: Wed, 14 Oct 2015 17:09:15 +0000
     Message-ID: <CAD2oYtN-9yLLhG_STO3F1h58Bn5QovK+u3wOBa=t+7TQi-hP1Q@mail.gmail.com>
[3]: https://github.com/opencontainers/ocitools

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
This wording is descended from 7117ede (Expand on the definition of
our ops, 2015-10-13, opencontainers#225), but the idea is covered generically by
e53a72b (Clarify the operation is not for command-line api,
2016-05-24, opencontainers#450), so we no longer need a create-specific note.
Especially in the lifecycle docs, where there's already enough going
on without this low-level detail.

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
This wording landed without comment as part of 7117ede (Expand on the
definition of our ops, 2015-10-13, opencontainers#225).  However, I'm not entirely
clear on the exception it's making.  It may be trying to say something
like:

  Just because you were authorized to manage that container when you
  created it doesn't mean you're still authorized to perform operation
  X on it now.  Maybe you've lost privileges in the meantime.

But as far as compliance testing is concerned, the same test harness
will be calling 'create' and the subsequent operations.  That harness
will be reporting MUST violations if the runtime refuses a subsequent
operation, and removing the access-control loophole makes it more
obvious that the runtime's refusal is non-compliant.

Signed-off-by: W. Trevor King <wking@tremily.us>
@goyalankit

@crosbymichael Can you please point me to the commit that fixed this issue? I am hitting this with runc version rc2. Or was it fixed in the upstream kernel? Thanks!
