fix: add non-root USER directive to all Dockerfiles #73

Merged
thepagent merged 1 commit into main from fix/non-root-user-dockerfile on Apr 7, 2026

thepagent (Collaborator) commented on Apr 6, 2026

Summary

Harden all Dockerfiles with non-root users, proper file ownership, health checks, and network retry logic. Add container-level security context to K8s manifests.

Changes

Dockerfile (kiro / default preset)

  • useradd -m -s /bin/bash -u 1000 agent — explicit UID matching K8s securityContext
  • COPY --chown=agent:agent — correct ownership on copied binary
  • HEALTHCHECK — process-alive check via pgrep
  • curl --retry 3 --retry-delay 5 — retry logic for kiro-cli download
  • Home: /home/agent (debian base, no Node.js needed — keeps image ~200MB smaller)
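
The `curl --retry 3 --retry-delay 5` bullet relies on curl's built-in retry, which re-attempts only on errors curl considers transient. For commands that have no such flag, the same idea can be sketched as a small POSIX shell wrapper. This is an illustration, not code from this PR: the function name `retry` and its argument order are assumptions.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, or give up after
# ATTEMPTS total runs. Sleeps DELAY seconds between runs.
# Note: plain POSIX sh has no local variables, so attempts/delay/n are globals.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1          # every allowed run failed
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 3 5 some-download-command` would run the command at most three times with a five-second pause between runs. That differs slightly from curl, where `--retry 3` allows up to three retries after the initial attempt.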

Dockerfile.claude / Dockerfile.codex / Dockerfile.gemini

  • Use built-in node user (UID 1000) from node:22-bookworm-slim
  • COPY --chown=node:node — correct ownership on copied binary
  • Removed unnecessary mkdir /home/agent — use /home/node directly
  • HEALTHCHECK — process-alive check via pgrep
  • npm install --retry 3 — retry logic for package installs
  • Home: /home/node (node base image, required for npm-based ACP adapters)
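
The process-alive HEALTHCHECK mentioned in these lists amounts to a shell test along the following lines. This is a minimal sketch, assuming `pgrep` is available in the image (on Debian-based images it comes from the `procps` package) and that the process is named `agent-broker`, as the binary path in this PR suggests. The function names `is_running` and `health_probe` are hypothetical.

```shell
# is_running NAME: succeed (exit 0) if a process named exactly NAME exists.
# Assumes pgrep is installed; if it is missing, the check fails the same way
# as "not running".
is_running() {
  pgrep -x "$1" > /dev/null 2>&1
}

# Hypothetical health probe: returns 0 when healthy, 1 when unhealthy.
health_probe() {
  if is_running agent-broker; then
    return 0
  fi
  return 1
}
```

A check like this only confirms that the process exists, not that it is responding to requests.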

K8s & Helm

  • Pod-level securityContext: runAsNonRoot, UID/GID/fsGroup 1000
  • Container-level securityContext: allowPrivilegeEscalation: false, drop ALL capabilities
  • Helm helper agent-broker.agent.home resolves /home/node (codex/claude/gemini) vs /home/agent (default) based on agent.preset
  • Volume mounts, HOME env, and working_dir all use the preset-aware helper
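
The preset-to-home mapping that the `agent-broker.agent.home` helper performs can be mirrored in shell for illustration. This is not the Helm template itself, only a sketch of the same decision. The function name `agent_home` is hypothetical, and any preset other than codex, claude, or gemini is assumed to fall back to /home/agent.

```shell
# agent_home PRESET: print the home directory used for the given preset.
# Node-based presets use the built-in node user's home; everything else
# (assumed here to mean the default kiro preset) uses /home/agent.
agent_home() {
  case "$1" in
    codex|claude|gemini) printf '%s\n' /home/node ;;
    *)                   printf '%s\n' /home/agent ;;
  esac
}
```

For instance, `agent_home codex` prints /home/node, while `agent_home kiro` prints /home/agent.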

Home directory decision

| Preset | Base image | User | Home |
| --- | --- | --- | --- |
| default (kiro) | debian:bookworm-slim | agent | /home/agent |
| codex | node:22-bookworm-slim | node | /home/node |
| claude | node:22-bookworm-slim | node | /home/node |
| gemini | node:22-bookworm-slim | node | /home/node |

Unifying to a single home path was considered but rejected — useradd agent on the node base gets UID 1001 (1000 is taken by node), breaking runAsUser: 1000. PVC is unaffected since both users are UID 1000.
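
Whether UID 1000 is already taken in a base image can be checked by reading its passwd database. The sketch below is illustrative and not part of this PR. The function name `uid_owner` is hypothetical, and it assumes the standard colon-separated passwd format with the UID in the third field.

```shell
# uid_owner FILE UID: print the login name that owns UID in a passwd-format
# file, or print nothing if the UID is unused.
uid_owner() {
  awk -F: -v uid="$2" '$3 == uid { print $1; exit }' "$1"
}
```

Running `uid_owner /etc/passwd 1000` inside each base image shows which account, if any, already holds UID 1000 before `useradd` runs.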

Intentionally deferred

Related

Closes #47
See also #45

Dockerfile (outdated)
rm -rf /tmp/kirocli /tmp/kirocli.zip

RUN mkdir -p /home/agent/.local/share/kiro-cli /home/agent/.kiro
RUN useradd -m -s /bin/bash agent
Contributor:

Does the user created here get UID 1000 by default?

chaodu-agent (Collaborator) left a comment

Nice work on the non-root enforcement! A few things to address before merging.

Dockerfile (outdated)
rm -rf /tmp/kirocli /tmp/kirocli.zip

RUN mkdir -p /home/agent/.local/share/kiro-cli /home/agent/.kiro
RUN useradd -m -s /bin/bash agent

As Neil mentioned — useradd doesn't guarantee UID 1000. If the base image already has a user at UID 1000, this will get a different UID and break the K8s securityContext (runAsUser: 1000).

Suggest explicitly setting it:

RUN useradd -m -s /bin/bash -u 1000 agent


COPY --from=builder /build/target/release/agent-broker /usr/local/bin/agent-broker

USER node

USER node — but WORKDIR is /home/agent (line 18). The node user's home is /home/node, and it may not have write permission to /home/agent.

Either:

  1. Change WORKDIR to /home/node and update paths accordingly, or
  2. Add RUN chown -R node:node /home/agent before USER node

Same issue applies to Dockerfile.codex and Dockerfile.gemini.

@@ -20,6 +20,10 @@ spec:
      labels:
        {{- include "agent-broker.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.podSecurityContext }}
      securityContext:
        {{- toYaml . | nindent 8 }}

Consider also adding container-level securityContext for defense in depth:

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]

Pod-level covers runAsNonRoot/UID, but container-level locks down privilege escalation and capabilities.

chaodu-agent (Collaborator) commented on Apr 6, 2026

Learnings from openclaw/openclaw Dockerfile

Reviewed the OpenClaw Dockerfiles for reference. Key patterns worth adopting:

Action items

  • Dockerfile: add -u 1000 to useradd to set the UID explicitly, matching the K8s securityContext
  • Dockerfile.claude/codex/gemini: add RUN chown -R node:node /home/agent before USER node, or change WORKDIR to /home/node
  • Dockerfile: add --chown=agent:agent to COPY instructions
  • Helm chart: add container-level securityContext (allowPrivilegeEscalation: false + drop ALL capabilities)
  • Pin base images by SHA256 digest instead of tag only
  • Pre-install gh CLI at build time (no runtime apt-get after non-root enforcement)
  • Add HEALTHCHECK instruction
  • Add retry logic for network operations (ref: OpenClaw curl/corepack retry pattern)

Non-root user handling

  • Main Dockerfile uses the built-in node user (uid 1000) from node:24-bookworm
  • All COPY --from use --chown=node:node to ensure correct ownership
  • RUN chown node:node /app before USER node to fix WORKDIR permissions
  • Sandbox Dockerfile uses useradd --create-home --shell /bin/bash sandbox

Takeaway for this PR: Dockerfile.claude/codex/gemini need chown on /home/agent before USER node, and the main Dockerfile should use useradd -u 1000 explicitly.

Build-time only installs

  • All apt-get install happens during build — runtime user cannot install packages
  • Extra packages controlled via build arg: OPENCLAW_DOCKER_APT_PACKAGES
  • Even Chromium and Docker CLI are optional build-time installs

Takeaway: gh CLI should be installed at build time (see #47 comment).

Security hardening

  • Base images pinned by SHA256 digest, not just tag
  • Docker GPG key fingerprint verified before trusting
  • Built-in HEALTHCHECK
  • Default bind to loopback (127.0.0.1)

Build optimization

  • Multi-stage build (4 stages) — runtime image has no build tools or source
  • --mount=type=cache for apt cache layers
  • Retry logic on network operations (curl, corepack)
  • pnpm prune --prod + strip .d.ts/.map files

Other nice patterns

  • COPY --chmod=755 for executables
  • Configurable final user via ARG FINAL_USER=sandbox

thepagent (Collaborator, Author) commented:

Addressed the review feedback in two commits:

9202ad0 — Harden Dockerfiles and securityContext

  • Dockerfile: useradd -u 1000 for explicit UID + --chown=agent:agent on COPY
  • Dockerfile.claude/codex/gemini: --chown=node:node on COPY
  • All Dockerfiles: added HEALTHCHECK (pgrep agent-broker) and retry logic (curl --retry 3, npm --retry 3)
  • Helm chart + k8s manifests: added container-level securityContext (allowPrivilegeEscalation: false, drop ALL capabilities)

caa3619 — Use built-in node user for node-based images

  • Dockerfile.claude/codex/gemini: removed mkdir /home/agent, now use built-in node user with /home/node
  • Helm chart: added agent-broker.agent.home helper that resolves /home/node for codex/claude/gemini presets, /home/agent for default (kiro)
  • PVC unaffected — both users are UID 1000, fsGroup: 1000 ensures correct ownership on mount

Intentionally deferred:

thepagent (Collaborator, Author) commented:

Note on home directory split:

The default (kiro) preset uses /home/agent while codex/claude/gemini use /home/node. This is driven by base image choice:

  • kiro: uses debian:bookworm-slim — no Node.js needed (kiro-cli is a standalone binary), so we create an agent user with useradd -u 1000. This keeps the image ~200MB smaller than a node base.
  • codex/claude/gemini: use node:22-bookworm-slim — Node.js is required for the ACP adapters (npm packages). The node user (UID 1000) and /home/node already exist in this base image.

We considered unifying to /home/agent everywhere, but useradd agent on the node base image would get UID 1001 (since 1000 is taken by node), breaking the runAsUser: 1000 securityContext. Forcing a duplicate UID with -o or renaming the node user via usermod are both hacky.

The Helm helper agent-broker.agent.home resolves the correct path per preset, so this is transparent to users.

thepagent force-pushed the fix/non-root-user-dockerfile branch from caa3619 to e2fa092 on April 7, 2026 06:10
…Context

- Dockerfile: useradd -u 1000, --chown=agent:agent, curl --retry, HEALTHCHECK
- Dockerfile.claude/codex/gemini: use built-in node user /home/node, --chown=node:node, npm --retry, HEALTHCHECK
- Helm chart: podSecurityContext + containerSecurityContext, preset-aware home helper
- k8s manifests: pod + container securityContext
thepagent force-pushed the fix/non-root-user-dockerfile branch from e2fa092 to 1c9fa70 on April 7, 2026 06:17

pahud (Collaborator) left a comment

LGTM

thepagent merged commit 5fdc891 into main on Apr 7, 2026
Reese-max pushed a commit to Reese-max/openab that referenced this pull request Apr 12, 2026
…Context (openabdev#73)

- Dockerfile: useradd -u 1000, --chown=agent:agent, curl --retry, HEALTHCHECK
- Dockerfile.claude/codex/gemini: use built-in node user /home/node, --chown=node:node, npm --retry, HEALTHCHECK
- Helm chart: podSecurityContext + containerSecurityContext, preset-aware home helper
- k8s manifests: pod + container securityContext

Co-authored-by: thepagent <thepagent@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Security: Define a non-root USER in Dockerfile
