setns: support user namespaces #13346

Open: copybara-service[bot] wants to merge 1 commit into master from test/cl925507008
copybara-service[bot] commented Jun 2, 2026

setns: support user namespaces

User namespace entries under /proc/[pid]/ns currently render as fake
namespace symlinks. They look like the other namespace files, but opening
them does not produce an nsfs file that setns(2) can use. Rootless
container tools such as buildah and podman rely on that file to re-enter
the pause process's user namespace, so their second lifecycle command
fails with EINVAL.

Make UserNamespace implement vfs.Namespace and give each user namespace
an nsfs inode when it is created. /proc/[pid]/ns/user now uses the
regular namespace symlink path, so opening it returns a joinable namespace
file instead of a fake link target.
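
As an illustration of what a regular namespace symlink looks like from userspace: on Linux, the link target under /proc/[pid]/ns has the form type:[inode], and two tasks are in the same namespace when those inode numbers match. The sketch below parses such a target. The helper name parseNSLink and the sample inode are hypothetical, and it assumes the sandbox reports targets in the same format as Linux.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNSLink splits a namespace link target such as "user:[4026531837]"
// into its namespace type and nsfs inode number.
// Hypothetical helper for illustration; it is not part of this change.
func parseNSLink(target string) (nsType string, inode uint64, err error) {
	name, rest, ok := strings.Cut(target, ":[")
	if !ok || name == "" || !strings.HasSuffix(rest, "]") {
		return "", 0, fmt.Errorf("malformed namespace link %q", target)
	}
	inode, err = strconv.ParseUint(strings.TrimSuffix(rest, "]"), 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("bad inode in %q: %w", target, err)
	}
	return name, inode, nil
}

func main() {
	// The inode value is illustrative only; a real one would come from
	// os.Readlink("/proc/<pid>/ns/user").
	nsType, inode, err := parseNSLink("user:[4026531837]")
	fmt.Println(nsType, inode, err)
}
```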

Setns now accepts CLONE_NEWUSER from both nsfds and pidfds. It
follows the Linux restrictions for user namespace joins by rejecting the
caller's current user namespace, requiring CAP_SYS_ADMIN in the target
user namespace, rejecting multithreaded callers, and rejecting callers with
fs state shared outside the thread group. The capability checks for any
other namespaces in the same setns call use the credentials the caller
would have after joining the user namespace.
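
The restrictions above can be sketched as a small decision function. This is a simplified model, not the code in this change: the caller type, its fields, and checkUserNSJoin are hypothetical, the check order is an assumption, and the error values follow what Linux setns(2) documents (EINVAL for the structural rejections, EPERM for the missing capability) rather than anything confirmed by this description.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the errno values documented for Linux setns(2).
var (
	errInval = errors.New("EINVAL")
	errPerm  = errors.New("EPERM")
)

// caller is a hypothetical, simplified view of the task calling setns.
type caller struct {
	userNS          string          // caller's current user namespace
	threads         int             // threads in the caller's thread group
	fsSharedOutside bool            // fs state shared outside the thread group
	sysAdminIn      map[string]bool // user namespaces where the caller holds CAP_SYS_ADMIN
}

// checkUserNSJoin models the checks applied before joining target.
func checkUserNSJoin(c caller, target string) error {
	switch {
	case target == c.userNS:
		return errInval // cannot re-join the current user namespace
	case c.threads > 1:
		return errInval // multithreaded callers are rejected
	case c.fsSharedOutside:
		return errInval // fs state must not be shared outside the thread group
	case !c.sysAdminIn[target]:
		return errPerm // CAP_SYS_ADMIN is required in the target namespace
	}
	return nil
}

func main() {
	c := caller{userNS: "parent", threads: 1, sysAdminIn: map[string]bool{"child": true}}
	fmt.Println(checkUserNSJoin(c, "child"))  // join permitted
	fmt.Println(checkUserNSJoin(c, "parent")) // rejected: current namespace
}
```

In this model, checks for any other namespace flags in the same call would run only after checkUserNSJoin succeeds, using the credentials the caller would hold in the target user namespace.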

Add a syscall regression test that creates a child user namespace, opens
/proc/<pid>/ns/user, and verifies that setns(CLONE_NEWUSER) succeeds.

Fixes #13314

FUTURE_COPYBARA_INTEGRATE_REVIEW=#13323 from shayonj:issue-13314-userns-setns 8060b5f

copybara-service[bot] added the exported label (Issue was exported automatically) on Jun 2, 2026.
copybara-service[bot] force-pushed the test/cl925507008 branch 3 times, most recently from acc4aa3 to 2a0522c, on June 4, 2026 00:01.
PiperOrigin-RevId: 925507008
Development

Successfully merging this pull request may close these issues.

/proc/[pid]/ns/user is not usable with setns
