New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcontainer: allow setgroup in rootless mode #1693

Merged
merged 2 commits into from Apr 24, 2018

Conversation

Projects
None yet
5 participants
@AkihiroSuda
Contributor

AkihiroSuda commented Jan 15, 2018

Previously, /proc/PID/setgroups was always set to "deny" in rootless mode, as recent Linux kernel does not allow unprivileged users to update /proc/PID/uid_map without doing so.

However, we can mitigate this limitation by using newuidmap(1) and newgidmap(1), although it requires these binaries (not runc itself) to have suid bit.

This PR adds a new flag --force-mapping-tool to use these binaries compulsory.

Now apt works within a rootless container, without ptrace hack:

Example config.json: https://gist.github.com/AkihiroSuda/0a46ad9ba6f392b27ac4c8d372721207

Demo (Tested on Ubuntu 17.10, kernel 4.13.0-25-generic x86_64):

$ mkdir rootfs
$ wget -O /tmp/ubuntu-base-17.10-base-amd64.tar.gz http://cdimage.ubuntu.com/ubuntu-base/releases/17.10/release/ubuntu-base-17.10-base-amd64.tar.gz
$ tar xzf /tmp/ubuntu-base-17.10-base-amd64.tar.gz -C ./rootfs || echo "mknod error were ignored"
$ cp /etc/resolv.conf rootfs/etc
$ vi config.json
(you need to set your own uidMappings and gidMappings)
$ runc run --force-mapping-tool foo
root@runc:/# chmod 1777 /tmp
root@runc:/# apt update
root@runc:/# apt install cowsay
root@runc:/# /usr/games/cowsay hello rootless apt
 ____________________
< hello rootless apt >
 --------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

If depending on the mapping tool is problem, we can use ptrace(2) hacks instead: https://github.com/AkihiroSuda/runrootless

}
return buf
}
func (msg *Boolmsg) Len() int {
return unix.NLA_HDRLEN + 1
return unix.NLA_HDRLEN + 4 // alignment

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Jan 15, 2018

Contributor

This alignment issue had not appear because no multiple Boolmsgs had been serialized at once.

@frezbo

This comment has been minimized.

frezbo commented Jan 15, 2018

@AkihiroSuda looks like the newuidmap and newgidmap already has suid bit set, I tested on my local fedora 27 machine, so the suid bit should be ok. I wonder what all privileges these require so we can do a setcap and remove the suid bit.

@craigfurman

This comment has been minimized.

Contributor

craigfurman commented Jan 15, 2018

Interestingly, users do not need an entry in /etc/subgid at all in order to create a user namespace and setuid to 0 without writing deny into /proc/$pid/setgroups. This actually renders negative group ACLs (e.g. denying a directory to a group by using mode 0707) useless.

I raised this as an issue, but there's disagreement on how best to tackle it.

It's a bit of a shame that we have a kernel-level check that's subverted by popular setuid executables. I do see the point that negative group ACLs aren't very useful, but we're kind of in the worst of both worlds now, where we have /proc/ interface complexity (setgroups file) that's subverted by newgidmap!

@cyphar

This comment has been minimized.

Member

cyphar commented Jan 15, 2018

@craigfurman I think the logic is that you need to have setgroups allowed in order for a "normal" Linux distribution to work as-expected in a user namespace. The 707 issue is most definitely quite concerning, but I expect that the argument from the shadow folks will be

If you specify that users can set their GID in /etc/subgid, you are implicitly allowing setgroups and this is an administrative decision that is similar to adding a user to sudoers or changing /etc/passwd such that a user has uid=0.

Personally I can see why it is useful to allow setgroups (I know that the LXC folks use this quite heavily so you can boot full distributions inside containers -- something that would become significantly more difficult without this), though I do somewhat agree that it's more dangerous than it is being sold as.

@craigfurman

This comment has been minimized.

Contributor

craigfurman commented Jan 15, 2018

@cyphar you don't need an entry in /etc/subgid to be able to do this though

@craigfurman

This comment has been minimized.

Contributor

craigfurman commented Jan 15, 2018

And yeah tbh I totally value allowing setgroups more than I value negative group ACLs... it's certainly more useful for us container folks! :)

@cyphar

This comment has been minimized.

Member

cyphar commented Jan 15, 2018

I added a comment to the bug you mentioned with an idea for how we could extend /etc/subgid to fix this issue. I can work on a patch for this soon, and I think that it's probably a good idea to get a CVE assigned.

@craigfurman

This comment has been minimized.

Contributor

craigfurman commented Jan 15, 2018

Nice, thanks for replying to that thread. I'll continue the conversation on there.

@AkihiroSuda

This comment has been minimized.

Contributor

AkihiroSuda commented Jan 24, 2018

@cyphar @craigfurman Does this PR LGTY?

@cyphar

This comment has been minimized.

Member

cyphar commented Mar 17, 2018

The issue @craigfurman discussed has been fixed as part of CVE-2018-7169. So review on this has re-opened now. However it is no longer as obvious whether this actually will help with setgroups(2) in the single-user-being-mapped case (the admin would need to explicitly set cyphar:1000:1 in /etc/subuid).

@AkihiroSuda

This comment has been minimized.

Contributor

AkihiroSuda commented Mar 23, 2018

However it is no longer as obvious whether this actually will help with setgroups(2) in the single-user-being-mapped case (the admin would need to explicitly set cyphar:1000:1 in /etc/subuid).

For single-user-being-mapped case we would use ptrace hack instead.
So I think this PR is still fine.

@cyphar

This comment has been minimized.

Member

cyphar commented Mar 23, 2018

But if we have more than one user being mapped then we would already fall-back to newuidmap and newgidmap (because the write attempts will fail) -- this is what I was confused about.

@AkihiroSuda

This comment has been minimized.

Contributor

AkihiroSuda commented Mar 23, 2018

s/force-mapping-tools/keep-proc-setgroups-untouched/ then?

@cyphar

This comment has been minimized.

Member

cyphar commented Mar 24, 2018

We could do this without a CLI flag actually. You just need to figure out whether the mapping contains more than one user (because the single-user case is now the same with or without the setuid tools). If it does, then we assume we don't want to write anything to /proc/self/setgroups. Otherwise we write as usual.

How does that sound?

libcontainer: fix Boolmsg alignment
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>

@AkihiroSuda AkihiroSuda changed the title from rootless: allow setgroups if `--force-mapping-tool` is set to libcontainer: allow setgroup in rootless mode Mar 26, 2018

@AkihiroSuda

This comment has been minimized.

Contributor

AkihiroSuda commented Mar 26, 2018

@cyphar updated, PTAL

gidMapCount := 0
for _, m := range c.config.GidMappings {
gidMapCount += m.Size
}

This comment has been minimized.

@cyphar

cyphar Mar 26, 2018

Member

This isn't quite right. It will miss if we have a single mapping of more than one group. Really, what we should do is have a helper function that does something like:

func requiresMappingTool(c *configs.Config) bool {
  gidMap := []configs.IDMap{
    {ContainerID: 0, HostID: os.Getegid(), Size: 1},
  }
  return !reflect.DeepEquals(c.GidMappings, gidMap)
}

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018

Contributor

It will miss if we have a single mapping of more than one group.

Why? Does not gidMapCount += m.Size catch such case?

This comment has been minimized.

@cyphar

cyphar Mar 27, 2018

Member

It doesn't catch if the HostID is not os.Getegid. Also I think the DeepEqual is clearer (from the perspective of someone trying to understand in what case we disable setgroups).

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018

Contributor

thanks for pointing out, updated

if (config.is_setgroup)
update_setgroups(child, SETGROUPS_ALLOW);
if (config.is_rootless)
else if (config.is_rootless)
update_setgroups(child, SETGROUPS_DENY);

This comment has been minimized.

@cyphar

cyphar Mar 26, 2018

Member

These four lines could just be shortened to

if (config.is_rootless && !config.is_setgroup)
  update_setgroups(child, SETGROUPS_DENY);

Because writing allow is a noop.

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018

Contributor

done

libcontainer: allow setgroup in rootless mode
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@cyphar

This comment has been minimized.

Member

cyphar commented Mar 27, 2018

LGTM. I've tested it out a bit and it appears to work pretty well. Thanks @AkihiroSuda!

/cc @opencontainers/runc-maintainers

Approved with PullApprove

@AkihiroSuda

This comment has been minimized.

Contributor

AkihiroSuda commented Apr 13, 2018

@opencontainers/runc-maintainers anyone PTAL?

@cyphar

This comment has been minimized.

Member

cyphar commented Apr 21, 2018

@crosbymichael

This comment has been minimized.

Member

crosbymichael commented Apr 24, 2018

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit bdbb9fa into opencontainers:master Apr 24, 2018

2 checks passed

code-review/pullapprove Approved by crosbymichael, cyphar
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@AkihiroSuda AkihiroSuda referenced this pull request May 22, 2018

Closed

setgroups broken #96

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment