feat(dind): gate docker run --privileged / --cap-add behind config#74
Open
luthermonson wants to merge 4 commits into
Open
feat(dind): gate docker run --privileged / --cap-add behind config#74luthermonson wants to merge 4 commits into
luthermonson wants to merge 4 commits into
Conversation
Adds [dind] allow_privileged to gate the elevation stack a sibling
container can request via the fake docker daemon (Privileged=true,
all caps, all devices, seccomp/apparmor off, writable sysfs/cgroupfs).
Requests carrying HostConfig.Privileged=true or HostConfig.CapAdd are
short-circuited with HTTP 403 when the gate is closed, before any
image pull.
Default policy is platform-aware:
* Windows / macOS host → allowed. The dind backing containerd
runs inside a managed VM (WSL2 / Hyper-V / Vz), so an escape
stays inside that VM. KIND clusters and other workloads that
need real privileged continue to work without config changes.
* Linux host → denied. ephemerd runs directly on the
host with no VM fence; a privileged escape is bare-metal-host
compromise. Operators that trust their workloads can opt in
with `allow_privileged = true`.
The TOML key uses a *bool so an empty config block is distinguishable
from an explicit `allow_privileged = false`. Both runtime.Config and
the per-job dind.Server gain an AllowPrivileged field; main.go threads
cfg.Dind.ResolvedAllowPrivileged() through at both runtime.New sites.
Also corrects the package doc comment that previously claimed dind
never produces privileged containers — that was true once, isn't now.
Local CI: lint and tests pass cross-compiled for GOOS=linux. `mage
test` on Windows still hits the documented pkcs11/ocicrypt cgo build
failure for pkg/dind on Windows hosts; GOOS=linux go test -run xxx
compiles the package clean.
…ests
The previous tests built a Server with `&client.Client{}` (zero-value)
to satisfy the early nil-check on handleContainerCreate's client.
That works for the 403 short-circuit cases — the gate fires before
the client is touched — but the "not gated" tests proceed past the
gate and call GetImage on the zero-value client, which has no gRPC
ClientConn and panics. CI caught this in
TestHandleContainerCreate_NonPrivilegedRequestNotGated.
Extract the gate decision into a pure function checkPrivilegedGate
that takes the flag and the parsed HostConfig and returns the
rejection message + blocked flag. The handler becomes a one-liner;
the tests for the "not gated" cases call the pure function directly
and don't need a Server or client at all. The 403 tests still
exercise the full handler since that path is self-contained.
Reordering: gate is a request-shape validation, the nil-client check is a runtime-state check; check shape before state. CI caught this — the 403 tests construct a Server without a client (because the gate doesn't need one) and the original ordering returned 500 first. Also makes the operator-facing behavior more useful: a misconfigured deploy that lost its containerd client still rejects privileged requests with the actionable 403 rather than a generic 500.
After moving the request-shape validation ahead of the runtime nil- client check (so the privileged gate runs without a client), a missing Image field correctly returns 400 from the new ordering instead of 500 from the old nil-client-first ordering. The test name was always "NoImage" — 400 is the right answer; the old 500 assertion was an implementation-detail artifact.
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds
[dind] allow_privilegedto control whether sibling containers created through the fake docker daemon may opt into the full elevation stack (Privileged=true, all caps, all devices, seccomp+apparmor unconfined, writable sysfs/cgroupfs). Requests withHostConfig.Privileged=trueorHostConfig.CapAddare rejected with HTTP 403 before any image pull when the gate is closed.Threat model
A privileged sibling container is effectively root on whatever host runs the containerd that backs dind:
The runner container itself (
pkg/runtime/runtime.go) was already locked down to a minimum capability set + seccomp default. This PR closes the same hole for the dind-spawned siblings.Default policy
The TOML key is a
*boolso missing vs. explicit-false are distinguishable.ResolvedAllowPrivileged()picks the default:windowstruedarwintruelinuxfalseOperators can override explicitly:
Implementation
pkg/config/config.go— newDindConfig.AllowPrivileged *bool+ platform-awareResolvedAllowPrivileged().pkg/dind/dind.go—Server.allowPrivilegedfield +Config.AllowPrivilegedplumbing; fixed misleading package doc that previously claimed dind never spawns privileged containers.pkg/dind/containers.go— early 403 short-circuit inhandleContainerCreatefor bothPrivileged=trueandCapAdd != nil. Runs before image pull so we don't burn bandwidth on rejected requests.pkg/runtime/runtime.go—Config.DindAllowPrivilegedforwarded into each per-jobdind.Server.cmd/ephemerd/main.go— passescfg.Dind.ResolvedAllowPrivileged()at bothruntime.Newsites.Test plan
go test ./pkg/config/...— 5 new tests: platform default, explicit-true/false override, omitted TOML key falls through to platform default, explicit-false on a non-Linux host overrides the default.go test ./pkg/dind/...(compile-checked underGOOS=linux) — 5 new tests: gate-closed rejectsPrivileged=truewith 403 and useful body, gate-closed rejectsCapAdd, non-privileged request not gated, nil HostConfig not gated, gate-open passes Privileged through.GOOS=linux ./bin/golangci-lint run ./...→ 0 issues.mage build:windowsproduces a clean 700 MB binary.trueon Windows) still passes — should be a no-op behavior change on this host.Local CI notes
mage teston Windows hosts trips the documentedmiekg/pkcs11cgo failure for packages that transitively import it (pkg/dindis one).GOOS=linux go test -run xxx ./pkg/dind/...compiles clean — the GHA Linux runner will execute the new tests normally.