[20.10 backport] Use v2 capabilities in layer archives #42352

Merged: 1 commit into moby:20.10 on Jun 1, 2021

AkihiroSuda
Member

Cherry-pick #41724


- What I did

Fixes #41723

When building images in a user-namespaced container, v3 capabilities are
stored including the root UID of the creator of the user-namespace.

However, this UID is meaningless outside the build environment. If
the image is run in a non-user-namespaced runtime, or if a user-namespaced
runtime uses a different UID, execve(2) will not honour the capabilities
requested by the effective bit because of this mismatch.

Instead, we convert v3 capabilities to v2, dropping the root UID on the
fly.
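
For readers unfamiliar with the on-disk format, here is a minimal sketch of how a `security.capability` xattr value can be decoded. The constants and field offsets follow `linux/include/uapi/linux/capability.h`; the `fileCaps` type and the `parseFileCaps` and `sampleV3` helpers are hypothetical names used for illustration only and are not part of moby. The point is that a v3 value is a v2 value followed by a 4-byte root UID.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Values based on linux/include/uapi/linux/capability.h.
const (
	vfsCapRevisionMask   = 0xFF000000
	vfsCapRevision2      = 0x02000000
	vfsCapRevision3      = 0x03000000
	vfsCapFlagsEffective = 0x000001
	xattrCapsSz2         = 20 // magic_etc + 2 x (permitted, inheritable)
	xattrCapsSz3         = 24 // v2 layout + rootid
)

// fileCaps is a hypothetical decoded view of the xattr (illustration only).
type fileCaps struct {
	Version     int
	Effective   bool
	Permitted   uint64
	Inheritable uint64
	RootID      uint32 // only present in v3
}

// parseFileCaps decodes a v2 or v3 security.capability value.
// All fields are stored as little-endian 32-bit words.
func parseFileCaps(raw []byte) (fileCaps, error) {
	if len(raw) < xattrCapsSz2 {
		return fileCaps{}, errors.New("too short for v2/v3 file capabilities")
	}
	le := binary.LittleEndian
	magic := le.Uint32(raw[0:4])
	fc := fileCaps{
		Effective:   magic&vfsCapFlagsEffective != 0,
		Permitted:   uint64(le.Uint32(raw[4:8])) | uint64(le.Uint32(raw[12:16]))<<32,
		Inheritable: uint64(le.Uint32(raw[8:12])) | uint64(le.Uint32(raw[16:20]))<<32,
	}
	switch magic & vfsCapRevisionMask {
	case vfsCapRevision2:
		fc.Version = 2
	case vfsCapRevision3:
		if len(raw) < xattrCapsSz3 {
			return fileCaps{}, errors.New("v3 xattr is missing the root UID")
		}
		fc.Version = 3
		fc.RootID = le.Uint32(raw[20:24])
	default:
		return fileCaps{}, fmt.Errorf("unsupported revision 0x%08x", magic&vfsCapRevisionMask)
	}
	return fc, nil
}

// sampleV3 builds an example v3 value: effective bit set, bit 10 permitted,
// and the given root UID appended. The UID is an arbitrary example.
func sampleV3(rootID uint32) []byte {
	raw := make([]byte, xattrCapsSz3)
	le := binary.LittleEndian
	le.PutUint32(raw[0:4], vfsCapRevision3|vfsCapFlagsEffective)
	le.PutUint32(raw[4:8], 1<<10)
	le.PutUint32(raw[20:24], rootID)
	return raw
}

func main() {
	fc, err := parseFileCaps(sampleV3(100000))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", fc)
}
```

The `RootID` field is the value that ties the capabilities to the user namespace of the build environment.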

- How I did it

Patched ReadSecurityXattrToTarHeader() to automatically convert v3 capabilities to v2 by switching the version identifier and dropping the root UID data.
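
The conversion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: `v3ToV2` is a hypothetical helper name, and the constants are taken from `linux/include/uapi/linux/capability.h`. Because the magic word is stored little-endian, the revision number sits in byte 3, and the root UID is the trailing 4 bytes beyond the 20-byte v2 layout.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values based on linux/include/uapi/linux/capability.h.
const (
	versionOffset   = 3  // revision byte within the little-endian magic word
	vfsCapRevision2 = 2
	vfsCapRevision3 = 3
	xattrCapsSz2    = 20 // size of a v2 value; v3 appends a 4-byte root UID
)

// v3ToV2 is a hypothetical helper: it returns a v2 copy of a v3
// security.capability value, and returns any other input unchanged.
func v3ToV2(capability []byte) []byte {
	if len(capability) < xattrCapsSz2 || capability[versionOffset] != vfsCapRevision3 {
		return capability
	}
	out := make([]byte, xattrCapsSz2)
	copy(out, capability[:xattrCapsSz2]) // drops the trailing root UID
	out[versionOffset] = vfsCapRevision2 // switch the version identifier
	return out
}

func main() {
	// Example v3 value: effective flag set, bit 10 permitted,
	// root UID 100000 (an arbitrary example value).
	v3 := make([]byte, 24)
	binary.LittleEndian.PutUint32(v3[0:4], 0x03000001)
	binary.LittleEndian.PutUint32(v3[4:8], 1<<10)
	binary.LittleEndian.PutUint32(v3[20:24], 100000)

	fmt.Printf("v3: % x\n", v3)
	fmt.Printf("v2: % x\n", v3ToV2(v3))
}
```

Copying into a new slice keeps the caller's buffer untouched; an in-place rewrite followed by truncation would work equally well when the buffer is not reused.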

- How to verify it

This reproducer can be used.

Compared with the output in issue #41723 this produces:

    default: + docker run --rm capabilities-built-with-no-userns:1.0 /bin/bash -c '(/usr/local/bin/sleep-test infinity & ); sleep 1; grep Cap /proc/$(pgrep sleep-test)/status'
    default: CapInh:	00000000a80425fb
    default: CapPrm:	0000000000000400
    default: CapEff:	0000000000000400
    default: CapBnd:	00000000a80425fb
    default: CapAmb:	0000000000000000
    default: + docker run --rm capabilities-built-with-userns:1.0 /bin/bash -c '(/usr/local/bin/sleep-test infinity & ); sleep 1; grep Cap /proc/$(pgrep sleep-test)/status'
    default: CapInh:	00000000a80425fb
    default: CapPrm:	0000000000000400
    default: CapEff:	0000000000000400
    default: CapBnd:	00000000a80425fb
    default: CapAmb:	0000000000000000

So in the second case, too, execve(2) honoured the effective bit: CapEff is 0x400 (bit 10, CAP_NET_BIND_SERVICE) in both runs.
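
The `Cap*` values in the output above are hexadecimal bitmasks, where bit N corresponds to capability number N in `linux/include/uapi/linux/capability.h`. As a reading aid, the following sketch decodes such a mask into capability names. `decodeCapMask` and the `capNames` table are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// capNames is indexed by capability number, following
// linux/include/uapi/linux/capability.h.
var capNames = []string{
	"CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
	"CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
	"CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
	"CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
	"CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
	"CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
	"CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
	"CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
	"CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
	"CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
	"CAP_CHECKPOINT_RESTORE",
}

// decodeCapMask turns a hex mask as printed in /proc/<pid>/status
// (for example "0000000000000400") into a list of capability names.
func decodeCapMask(hexMask string) ([]string, error) {
	mask, err := strconv.ParseUint(hexMask, 16, 64)
	if err != nil {
		return nil, err
	}
	var names []string
	for bit := 0; bit < 64; bit++ {
		if mask&(uint64(1)<<uint(bit)) == 0 {
			continue
		}
		if bit < len(capNames) {
			names = append(names, capNames[bit])
		} else {
			// Bit not covered by the table above; report it by number.
			names = append(names, "cap_"+strconv.Itoa(bit))
		}
	}
	return names, nil
}

func main() {
	for _, m := range []string{"0000000000000400", "00000000a80425fb"} {
		names, err := decodeCapMask(m)
		if err != nil {
			panic(err)
		}
		fmt.Println(m, names)
	}
}
```

For the masks in the output above, `0000000000000400` decodes to `CAP_NET_BIND_SERVICE` alone, and `00000000a80425fb` decodes to 14 capabilities.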

- Description for the changelog

Capabilities in image layers are stored in v2 format even when built inside a non-root user-namespace.

- A picture of a cute animal (not mandatory but encouraged)

image

Signed-off-by: Eric Mountain <eric.mountain@datadoghq.com>
(cherry picked from commit 95eb490)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Member Author

cc @EricMountain @tonistiigi

Member

@thaJeztah thaJeztah left a comment

LGTM

@mrueg
Contributor

mrueg commented May 31, 2021

This seems to be the last open item for the v20.10.7 milestone. Since v20.10.7 includes a new version of runc with a security fix, I was wondering whether there is anything we can do to help get the release out.

Member

@cpuguy83 cpuguy83 left a comment

LGTM

@cpuguy83 cpuguy83 merged commit b0f5bc3 into moby:20.10 Jun 1, 2021