fix(docker): pre-create agent hidden directories with correct ownership #773

Open
pro5251 wants to merge 1 commit into openabdev:main from pro5251:feat/fix-docker-permissions-767

pro5251 commented May 8, 2026

What problem does this solve?

This PR fixes a common "Permission Denied" (EACCES) issue in Docker environments when agents (Claude, Copilot, Gemini, etc.) attempt to create their internal hidden configuration/cache directories (e.g., .claude, .copilot) at runtime while running as a non-root user.

As reported in Issue #767, when a bind mount points at a host path that does not yet exist, Docker creates the missing directory on the host as root. Even without mounts, if the home directory in the base image has restrictive permissions, the agent cannot initialize its state at runtime, which leads to silent failures or crashes.

Closes #767

Discord Discussion URL

N/A - This issue was identified and discussed primarily in GitHub Issue #767.

At a Glance

Current Fix Flow:

Runtime Stage (Dockerfile):
1. [root]  mkdir -p /home/node/.agent
2. [root]  chown -R node:node /home/node/.agent  <-- Fix: Ensure ownership
3. [root]  USER node
4. [node]  ENTRYPOINT ["openab", "run"]
           (Agent writes to .agent folder -> SUCCESS)

Problematic Flow (Previous):

Runtime Stage (Dockerfile):
1. [root]  USER node
2. [node]  ENTRYPOINT ["openab", "run"]
           (Agent tries to mkdir .agent)
           -> If parent is root-owned or bind-mount created as root:
           -> EACCES: Permission Denied

Prior Art & Industry Research

OpenClaw:
OpenClaw typically recommends users manually run chown on the host machine (sudo chown -R 1000:1000 ~/.openclaw) or specify user: "${UID}:${GID}" in docker-compose.yml. While effective, it places the burden of permission management on the user's infrastructure setup.

Hermes Agent:
Hermes Agent uses environment variables like HERMES_UID and HERMES_GID to coordinate ownership, or maps volumes to /opt/data with specific UID requirements.

OpenAB Approach:
We chose a "batteries-included" approach by pre-provisioning the necessary hidden directories inside the Dockerfile. This ensures that even if a user starts the container without advanced UID/GID mapping, the default internal storage paths are already owned by the runtime user, significantly reducing "out of the box" permission errors.

Proposed Solution

Modified the following Dockerfiles:

  • Dockerfile.claude
  • Dockerfile.codex
  • Dockerfile.copilot
  • Dockerfile.cursor
  • Dockerfile.gemini
  • Dockerfile.opencode

Added specific mkdir -p and chown commands for each agent's respective hidden directory (.claude, .codex, .copilot, .cursor, .gemini, .opencode) in the runtime stage before switching to the non-root user.
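
As a minimal sketch of that pattern, the runtime stage of Dockerfile.claude might look like the following. The base image tag is an assumption (the PR does not show it; the official node images ship a node user with UID 1000), and the steps that install openab itself are omitted. The other agent images would use the same two commands with their own directory name.

```dockerfile
# Assumed base image; the actual OpenAB base image is not shown in this PR.
FROM node:22-slim

# Still root at this point: create the agent's hidden state directory and
# hand it to the runtime user before dropping privileges.
RUN mkdir -p /home/node/.claude \
    && chown -R node:node /home/node/.claude

# Every instruction after this line, and the container process, runs as node.
USER node

# Entrypoint as described in the PR; installing the openab binary is omitted here.
ENTRYPOINT ["openab", "run"]
```

The ordering is the point of the fix: the chown has to happen while the build is still root, because after USER node the process can no longer change ownership.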

Why this approach?

It provides the most seamless user experience. By pre-creating the directories with correct ownership, we avoid the case where Docker or the application creates them as root. It also aligns with best practices for "slim" and "distroless-like" images, where the runtime user is strictly non-privileged.

Alternatives Considered

  1. Host-side fix documentation: Rejected because it's error-prone for beginners.
  2. Entrypoint script with chown: Rejected because chown requires root privileges, and we want the container to start as a non-root user immediately for better security (avoiding sudo or gosu inside the container if possible).

Validation

  • Verified that built images contain the directories with correct ownership (ls -la /home/node).
  • Verified that agents can start and write to these directories without EACCES errors.
  • Confirmed images still run as non-root (UID 1000).
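
One way to make these checks repeatable is a build-time smoke test placed after the user switch, so that the build fails if the directory is missing, wrongly owned, or not writable. This is a hypothetical addition and not part of the PR; the base image and the .write-test file name are assumptions chosen for illustration.

```dockerfile
# Assumed base image; the official node images provide the node user (UID 1000).
FROM node:22-slim

RUN mkdir -p /home/node/.claude \
    && chown -R node:node /home/node/.claude

USER node

# Runs as node. Any failing command aborts the build:
#   1. the effective UID must be 1000 (non-root),
#   2. the directory must be owned by node:node,
#   3. the runtime user must be able to create and remove a file in it.
RUN test "$(id -u)" = "1000" \
    && test "$(stat -c '%U:%G' /home/node/.claude)" = "node:node" \
    && touch /home/node/.claude/.write-test \
    && rm /home/node/.claude/.write-test
```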

@pro5251 pro5251 requested a review from thepagent as a code owner May 8, 2026 17:37
@github-actions github-actions Bot added the closing-soon PR missing Discord Discussion URL — will auto-close in 3 days label May 8, 2026
github-actions Bot commented May 8, 2026

⚠️ This PR is missing a Discord Discussion URL in the body.

All PRs must reference a prior Discord discussion to ensure community alignment before implementation.

Please edit the PR description to include a link like:

Discord Discussion URL: https://discord.com/channels/...

This PR will be automatically closed in 3 days if the link is not added.

@github-actions github-actions Bot added the pending-screening PR awaiting automated screening label May 8, 2026
@pro5251 pro5251 closed this May 8, 2026
@pro5251 pro5251 deleted the feat/fix-docker-permissions-767 branch May 8, 2026 17:41
@pro5251 pro5251 restored the feat/fix-docker-permissions-767 branch May 8, 2026 17:44
@pro5251 pro5251 reopened this May 8, 2026
@pro5251 pro5251 force-pushed the feat/fix-docker-permissions-767 branch 4 times, most recently from 46426af to ddf78d6 Compare May 8, 2026 17:50
@pro5251 pro5251 force-pushed the feat/fix-docker-permissions-767 branch from ddf78d6 to 57b7898 Compare May 8, 2026 17:51
@pro5251 pro5251 changed the title from "fix(docker): pre-create agent hidden directories with correct ownership" to "(openabdev#773) fix(docker): pre-create agent hidden directories with correct ownership" May 8, 2026
@pro5251 pro5251 changed the title from "(openabdev#773) fix(docker): pre-create agent hidden directories with correct ownership" to "fix(docker): pre-create agent hidden directories with correct ownership" May 8, 2026
shaun-agent (Contributor)

OpenAB PR Screening

This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Click 👍 if you find this useful. Human review will be done within 24 hours. We appreciate your support and contribution 🙏

Screening report

Intent

PR #773 tries to prevent Docker-based OpenAB agent images from failing when runtime agents need to create hidden home-directory state folders such as .claude, .codex, .copilot, .cursor, or .opencode.

The operator-visible problem is EACCES/permission-denied failures when containers run as the non-root node user but the needed agent config/cache directory is missing, root-owned, or created through Docker bind-mount behavior with the wrong ownership.

Feat

This is a Docker runtime fix.

It updates agent-specific Dockerfiles to pre-create each agent’s hidden directory under /home/node and chown it to node:node before switching to the non-root runtime user. The intended behavior is that agents can initialize their own local state without requiring users to manually repair permissions on the host or run containers as root.

Who It Serves

Primary beneficiary: deployers and agent runtime operators using OpenAB Docker images.

Secondary beneficiaries: maintainers and reviewers, because this reduces recurring support/debug burden around agent startup failures in containerized environments.

Rewritten Prompt

Implement a Docker permission fix for OpenAB agent images so each non-root runtime image pre-creates the agent’s expected hidden state directory under /home/node with node:node ownership before USER node.

Update the relevant agent Dockerfiles only. For each image, add a mkdir -p /home/node/<agent-hidden-dir> and chown -R node:node /home/node/<agent-hidden-dir> in the runtime stage before the user switch. Verify that the final image still runs as UID 1000 and that the agent process can write to its hidden directory without EACCES.

Include a short validation note covering ownership, non-root runtime behavior, and agent startup/write behavior.

Merge Pitch

This is worth advancing because it fixes a concrete Docker usability failure with a small, low-risk change. It keeps the security posture intact by preserving non-root runtime execution while making the default images more reliable out of the box.

The likely reviewer concern is whether the fix covers all relevant Dockerfiles consistently. The PR body mentions Dockerfile.gemini, but the changed-file list does not appear to include it, so review should confirm whether Gemini needs the same treatment or whether that mention is stale.

Best-Practice Comparison

OpenClaw principles that are relevant:

  • Durable job persistence: partially relevant, because agent state/config directories must be writable and predictable.
  • Isolated executions: relevant insofar as each container should own its runtime state without depending on privileged startup behavior.
  • Explicit delivery routing, retry/backoff, run logs, and gateway-owned scheduling: not directly relevant to this Docker permission fix.

Hermes Agent principles that are relevant:

  • Fresh session per scheduled run: indirectly relevant if agents initialize per-run state under their home directories.
  • Self-contained prompts for scheduled tasks: not relevant.
  • Gateway daemon tick model, file locking, and atomic writes for persisted state: not directly relevant.

Compared with both systems, this PR is not changing scheduling, persistence semantics, locking, or execution orchestration. It applies a narrower container hygiene principle: runtime writable paths should be provisioned during image build or runtime setup with the same UID/GID that will execute the process.

Implementation Options

Option 1: Conservative Dockerfile-only fix
Keep the PR as a direct Dockerfile patch. Pre-create the known hidden directory for each agent image and chown it before USER node.

Option 2: Balanced shared build pattern
Introduce a consistent Dockerfile pattern or build arg for agent home-state directories, reducing repeated bespoke commands across agent images while keeping the behavior at build time.
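
A rough sketch of what such a shared pattern could look like, assuming a hypothetical build argument named AGENT_STATE_DIR and an assumed base image (neither appears in the PR):

```dockerfile
# Assumed base image; the actual OpenAB base image is not shown in this PR.
FROM node:22-slim

# Hypothetical build argument naming the agent's hidden directory.
# Override per image, for example: --build-arg AGENT_STATE_DIR=.codex
ARG AGENT_STATE_DIR=.claude

# Same two commands as the per-agent fix, parameterized by the argument.
RUN mkdir -p "/home/node/${AGENT_STATE_DIR}" \
    && chown -R node:node "/home/node/${AGENT_STATE_DIR}"

USER node
```

With this shape, each agent Dockerfile would differ only in the default value of the argument, which keeps the ownership logic identical across images while still resolving everything at build time.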

Option 3: Ambitious runtime ownership strategy
Add a controlled entrypoint or init step that validates and repairs writable paths at container start, potentially supporting bind mounts and custom home/cache locations. This could include UID/GID configuration, clearer diagnostics, and failure messages when ownership cannot be corrected.

Comparison Table

| Option | Speed to ship | Complexity | Reliability | Maintainability | User impact | Fit for OpenAB right now |
| --- | --- | --- | --- | --- | --- | --- |
| Conservative Dockerfile-only fix | High | Low | Medium | Medium | High for default images | Strong |
| Balanced shared build pattern | Medium | Medium | Medium-High | High | High | Good if Dockerfile duplication is already painful |
| Ambitious runtime ownership strategy | Low | High | High for bind mounts/custom paths | Medium | Highest for varied deployments | Too broad for this PR |

Recommendation

Advance the conservative Dockerfile-only fix, with one review requirement: confirm coverage across all agent Dockerfiles, especially the possible Dockerfile.gemini mismatch between the PR body and changed-file list.

This is the right merge discussion path because the bug is specific, the fix is small, and it preserves non-root runtime behavior. A follow-up can separately consider a shared Docker build helper or runtime diagnostics if OpenAB continues to see permission failures from bind-mounted host paths or custom agent cache locations.

Development

Successfully merging this pull request may close these issues.

fix(Dockerfile.copilot): /home/node/.copilot owned by root when using Docker bind mount, causing silent exit 1
