feat(security): Restricted Pod Security Standard across embedded workloads #521
Closed
bussyjd wants to merge 1 commit into
Brings every embedded Deployment shipped by obol-stack up to PSS Restricted:
- runAsNonRoot: true with fixed non-zero UID/GID (65532)
- allowPrivilegeEscalation: false
- capabilities.drop: [ALL]
- seccompProfile: RuntimeDefault
- readOnlyRootFilesystem: true (with named emptyDir mounts where Python
needs writeable /tmp and HOME/.cache)
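As a rough illustration of how the settings listed above fit together, a Deployment could carry them as shown here. This is a minimal sketch, not the manifest from this PR: the Deployment name, labels, and image are hypothetical placeholders, and only the securityContext values come from the list above.

```yaml
# Hypothetical Deployment; name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-workload
  template:
    metadata:
      labels:
        app: example-workload
    spec:
      # Pod-level settings, inherited by every container in the pod.
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532
        runAsGroup: 65532
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: registry.example/app:tag  # placeholder image
          # These three fields exist only at container level.
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
```

Note that readOnlyRootFilesystem is not itself required by the Restricted profile; it is additional hardening on top of it.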
PSS labels (enforce=restricted, audit/warn=restricted) added to the x402
and llm namespaces so pods created from future Deployment edits that omit
per-pod securityContext are rejected at admission.
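For reference, namespace-level Pod Security Admission labels generally take the following shape. This is a sketch assuming the x402 namespace is declared as a plain Namespace object; the actual template in this PR may render it differently.

```yaml
# Sketch of a namespace labeled for Pod Security Admission.
apiVersion: v1
kind: Namespace
metadata:
  name: x402
  labels:
    # enforce: violating pods are rejected at admission.
    pod-security.kubernetes.io/enforce: restricted
    # audit and warn: violations are recorded or surfaced to the client.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

The enforce mode is evaluated against Pod objects, so a non-compliant Deployment is accepted with a warning while its pods fail to be created.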
Also switches the serviceoffer-controller Dockerfile from
gcr.io/distroless/static-debian12 (UID 0) to ...:nonroot (UID 65532).
Container escape via a Go runtime CVE on a UID-0 / no-seccomp /
no-cap-drop / RW-rootfs container was the easiest path to host pivot
on k3s single-node; this closes it.
Files touched:
- Dockerfile.serviceoffer-controller (:nonroot base)
- internal/embed/infrastructure/base/templates/x402.yaml
(verifier + controller securityContext blocks, x402 ns PSS label)
- internal/embed/infrastructure/base/templates/llm.yaml
(litellm + x402-buyer securityContext, litellm-tmp + litellm-home
emptyDir mounts with HOME/XDG_CACHE_HOME/HF_HOME redirection,
llm ns PSS label)
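The litellm emptyDir redirection described above could look roughly like this pod-template fragment. The volume names and the HOME path are taken from this PR's description; the container name and the XDG_CACHE_HOME / HF_HOME values are assumptions for illustration only.

```yaml
# Fragment of a pod template spec; not a complete manifest.
containers:
  - name: litellm  # assumed container name
    env:
      - name: HOME
        value: /home/litellm
      - name: XDG_CACHE_HOME
        value: /home/litellm/.cache  # assumed value
      - name: HF_HOME
        value: /home/litellm/.cache/huggingface  # assumed value
    volumeMounts:
      # Writable scratch space on an otherwise read-only root filesystem.
      - name: litellm-tmp
        mountPath: /tmp
      - name: litellm-home
        mountPath: /home/litellm
volumes:
  - name: litellm-tmp
    emptyDir: {}
  - name: litellm-home
    emptyDir: {}
```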
Scope notes:
- local-path-provisioner lives in kube-system (k3d-managed); not
relabeled per PSS guidance to skip system namespaces.
- hermes-obol-agent runtime is generated dynamically by
serviceoffer-controller (internal/serviceoffercontroller/agent_render.go
and internal/hermes/hermes.go), not from the embedded templates;
its init-hermes-perms initContainer legitimately runs as UID 0
for /data chown and is intentionally left out of this PR's scope.
- cloudflared chart (internal/embed/infrastructure/cloudflared/...)
is a separate Helm chart and not in this PR's file list.
What may break:
- LiteLLM with readOnlyRootFilesystem may fail if it writes outside
/tmp or $HOME — watch the next release-smoke for permission-denied
errors and add named emptyDir mounts for any new write paths.
This was referenced May 24, 2026
Superseded by bundle PR #536; closing in favor of the consolidated merge target. Original branch and history preserved.
Why
Architecture review surfaced this as P0 security: `grep` across every embedded Deployment manifest returned zero hits for `securityContext`, `runAsNonRoot`, `readOnlyRootFilesystem`, or `seccompProfile`. The `serviceoffer-controller` Dockerfile uses `gcr.io/distroless/static-debian12` (not `:nonroot`), so it runs as UID 0. Container escape via a Go runtime CVE on a UID-0, no-seccomp, no-cap-drop, RW-rootfs container is the easiest path to host pivot on k3s single-node.
Before
After
Files
- `Dockerfile.serviceoffer-controller`: `:nonroot` base (UID 65532)
- `internal/embed/infrastructure/base/templates/x402.yaml`: verifier + controller pod/container securityContext, x402 ns PSS label
- `internal/embed/infrastructure/base/templates/llm.yaml`: litellm + x402-buyer securityContext, `litellm-tmp` + `litellm-home` emptyDir mounts (HOME / XDG_CACHE_HOME / HF_HOME redirected onto them), llm ns PSS label
Survey
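[Survey table: row and column labels lost; surviving cells, in original order]
- `:nonroot`
- `:nonroot`
- `/tmp`, `$HOME`, HF cache
- `/tmp` + `/home/litellm`
- `:nonroot`
- `/state` (already emptyDir)
- `internal/hermes/hermes.go` (generated)
- `init-hermes-perms` runs as UID 0 for chown
- `/data` (PVC)
- `internal/embed/infrastructure/cloudflared/...`
No third-party image had to stay root.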
What may break
- LiteLLM writes outside `/tmp` / `$HOME` would now fail with EROFS. Mitigated by adding the two emptyDir mounts + `HOME=/home/litellm`, `XDG_CACHE_HOME`, `HF_HOME`. Watch release-smoke for `Read-only file system` errors on first paid call.
- The x402 and llm namespaces are now `enforce: restricted`: pods from a future Deployment edit that omits per-pod `securityContext` get rejected at admission. Intentional.
Test plan
- `go build ./...` clean
- `go test ./internal/embed/... ./internal/x402/... ./internal/serviceoffercontroller/...`: green
- `go test ./...`: only pre-existing failure (`TestWarnIfNoChatModel_EmitsWarnWhenNoModels` in `internal/stack`) reproduces on `origin/main`, unrelated
- `kubectl get pods -A` confirms all run; `kubectl logs` clean for litellm cold start
Reference
- `:nonroot` tag: https://github.com/GoogleContainerTools/distroless#user