Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcontainer: allow setgroup in rootless mode #1693

Merged

Conversation

@AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Jan 15, 2018

Previously, /proc/PID/setgroups was always set to "deny" in rootless mode, as recent Linux kernel does not allow unprivileged users to update /proc/PID/uid_map without doing so.

However, we can mitigate this limitation by using newuidmap(1) and newgidmap(1), although it requires these binaries (not runc itself) to have suid bit.

This PR adds a new flag --force-mapping-tool to use these binaries compulsory.

Now apt works within a rootless container, without ptrace hack:

Example config.json: https://gist.github.com/AkihiroSuda/0a46ad9ba6f392b27ac4c8d372721207

Demo (Tested on Ubuntu 17.10, kernel 4.13.0-25-generic x86_64):

$ mkdir rootfs
$ wget -O /tmp/ubuntu-base-17.10-base-amd64.tar.gz http://cdimage.ubuntu.com/ubuntu-base/releases/17.10/release/ubuntu-base-17.10-base-amd64.tar.gz
$ tar xzf /tmp/ubuntu-base-17.10-base-amd64.tar.gz -C ./rootfs || echo "mknod error were ignored"
$ cp /etc/resolv.conf rootfs/etc
$ vi config.json
(you need to set your own uidMappings and gidMappings)
$ runc run --force-mapping-tool foo
root@runc:/# chmod 1777 /tmp
root@runc:/# apt update
root@runc:/# apt install cowsay
root@runc:/# /usr/games/cowsay hello rootless apt
 ____________________
< hello rootless apt >
 --------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

If depending on the mapping tool is problem, we can use ptrace(2) hacks instead: https://github.com/AkihiroSuda/runrootless

}
return buf
}

func (msg *Boolmsg) Len() int {
return unix.NLA_HDRLEN + 1
return unix.NLA_HDRLEN + 4 // alignment

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Jan 15, 2018
Author Member

This alignment issue had not appear because no multiple Boolmsgs had been serialized at once.

@frezbo
Copy link

@frezbo frezbo commented Jan 15, 2018

@AkihiroSuda looks like the newuidmap and newgidmap already has suid bit set, I tested on my local fedora 27 machine, so the suid bit should be ok. I wonder what all privileges these require so we can do a setcap and remove the suid bit.

@craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Jan 15, 2018

Interestingly, users do not need an entry in /etc/subgid at all in order to create a user namespace and setuid to 0 without writing deny into /proc/$pid/setgroups. This actually renders negative group ACLs (e.g. denying a directory to a group by using mode 0707) useless.

I raised this as an issue, but there's disagreement on how best to tackle it.

It's a bit of a shame that we have a kernel-level check that's subverted by popular setuid executables. I do see the point that negative group ACLs aren't very useful, but we're kind of in the worst of both worlds now, where we have /proc/ interface complexity (setgroups file) that's subverted by newgidmap!

@cyphar
Copy link
Member

@cyphar cyphar commented Jan 15, 2018

@craigfurman I think the logic is that you need to have setgroups allowed in order for a "normal" Linux distribution to work as-expected in a user namespace. The 707 issue is most definitely quite concerning, but I expect that the argument from the shadow folks will be

If you specify that users can set their GID in /etc/subgid, you are implicitly allowing setgroups and this is an administrative decision that is similar to adding a user to sudoers or changing /etc/passwd such that a user has uid=0.

Personally I can see why it is useful to allow setgroups (I know that the LXC folks use this quite heavily so you can boot full distributions inside containers -- something that would become significantly more difficult without this), though I do somewhat agree that it's more dangerous than it is being sold as.

@craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Jan 15, 2018

@cyphar you don't need an entry in /etc/subgid to be able to do this though

@craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Jan 15, 2018

And yeah tbh I totally value allowing setgroups more than I value negative group ACLs... it's certainly more useful for us container folks! :)

@cyphar
Copy link
Member

@cyphar cyphar commented Jan 15, 2018

I added a comment to the bug you mentioned with an idea for how we could extend /etc/subgid to fix this issue. I can work on a patch for this soon, and I think that it's probably a good idea to get a CVE assigned.

@craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Jan 15, 2018

Nice, thanks for replying to that thread. I'll continue the conversation on there.

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Jan 24, 2018

@cyphar @craigfurman Does this PR LGTY?

@cyphar
Copy link
Member

@cyphar cyphar commented Mar 17, 2018

The issue @craigfurman discussed has been fixed as part of CVE-2018-7169. So review on this has re-opened now. However it is no longer as obvious whether this actually will help with setgroups(2) in the single-user-being-mapped case (the admin would need to explicitly set cyphar:1000:1 in /etc/subuid).

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Mar 23, 2018

However it is no longer as obvious whether this actually will help with setgroups(2) in the single-user-being-mapped case (the admin would need to explicitly set cyphar:1000:1 in /etc/subuid).

For single-user-being-mapped case we would use ptrace hack instead.
So I think this PR is still fine.

@cyphar
Copy link
Member

@cyphar cyphar commented Mar 23, 2018

But if we have more than one user being mapped then we would already fall-back to newuidmap and newgidmap (because the write attempts will fail) -- this is what I was confused about.

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Mar 23, 2018

s/force-mapping-tools/keep-proc-setgroups-untouched/ then?

@cyphar
Copy link
Member

@cyphar cyphar commented Mar 24, 2018

We could do this without a CLI flag actually. You just need to figure out whether the mapping contains more than one user (because the single-user case is now the same with or without the setuid tools). If it does, then we assume we don't want to write anything to /proc/self/setgroups. Otherwise we write as usual.

How does that sound?

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:leave-setgroups-allow branch from 9e4fd2f to 29740fe Mar 26, 2018
@AkihiroSuda AkihiroSuda changed the title rootless: allow setgroups if `--force-mapping-tool` is set libcontainer: allow setgroup in rootless mode Mar 26, 2018
@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Mar 26, 2018

@cyphar updated, PTAL

gidMapCount := 0
for _, m := range c.config.GidMappings {
gidMapCount += m.Size
}

This comment has been minimized.

@cyphar

cyphar Mar 26, 2018
Member

This isn't quite right. It will miss if we have a single mapping of more than one group. Really, what we should do is have a helper function that does something like:

func requiresMappingTool(c *configs.Config) bool {
  gidMap := []configs.IDMap{
    {ContainerID: 0, HostID: os.Getegid(), Size: 1},
  }
  return !reflect.DeepEquals(c.GidMappings, gidMap)
}

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018
Author Member

It will miss if we have a single mapping of more than one group.

Why? Does not gidMapCount += m.Size catch such case?

This comment has been minimized.

@cyphar

cyphar Mar 27, 2018
Member

It doesn't catch if the HostID is not os.Getegid. Also I think the DeepEqual is clearer (from the perspective of someone trying to understand in what case we disable setgroups).

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018
Author Member

thanks for pointing out, updated


if (config.is_setgroup)
update_setgroups(child, SETGROUPS_ALLOW);
if (config.is_rootless)
else if (config.is_rootless)
update_setgroups(child, SETGROUPS_DENY);

This comment has been minimized.

@cyphar

cyphar Mar 26, 2018
Member

These four lines could just be shortened to

if (config.is_rootless && !config.is_setgroup)
  update_setgroups(child, SETGROUPS_DENY);

Because writing allow is a noop.

This comment has been minimized.

@AkihiroSuda

AkihiroSuda Mar 27, 2018
Author Member

done

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:leave-setgroups-allow branch from 29740fe to bfffaeb Mar 27, 2018
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:leave-setgroups-allow branch from bfffaeb to 73f3dc6 Mar 27, 2018
@cyphar
Copy link
Member

@cyphar cyphar commented Mar 27, 2018

LGTM. I've tested it out a bit and it appears to work pretty well. Thanks @AkihiroSuda!

/cc @opencontainers/runc-maintainers

Approved with PullApprove

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Apr 13, 2018

@cyphar
Copy link
Member

@cyphar cyphar commented Apr 21, 2018

@crosbymichael
Copy link
Member

@crosbymichael crosbymichael commented Apr 24, 2018

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit bdbb9fa into opencontainers:master Apr 24, 2018
2 checks passed
2 checks passed
code-review/pullapprove Approved by crosbymichael, cyphar
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@haampie
Copy link

@haampie haampie commented Jul 11, 2020

@AkihiroSuda, I think this example has gone broken on recent versions of runc, can you verify that?

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda AkihiroSuda commented Jul 11, 2020

--force-mapping-tool is no longer needed

@haampie
Copy link

@haampie haampie commented Jul 11, 2020

Yeah, I'm aware of that, but I cannot get apt-get update to work inside a rootless container with a setup very similar as posted in your gist above, see #2517. I don't see obvious issues with the configuration, would be interested to know if you can still get your example from above to work in a rootless container.

@haampie
Copy link

@haampie haampie commented Jul 11, 2020

Sorry for the noise, I've figured it out, see the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.