
fix(hermes): set file permissions so sandbox user can read copied files #2466

Merged
ericksoa merged 3 commits into main from fix/2191-hermes-dockerfile-permissions
Apr 25, 2026

Conversation

Contributor

@ericksoa ericksoa commented Apr 25, 2026

Summary

Fix Hermes Dockerfile build failure caused by missing file permissions on copied files.
The sandbox user could not read generate-config.ts or the plugin directory, causing
EACCES at build time. Also add CI coverage so Hermes Dockerfile regressions are caught
on every PR.

Related Issue

Fixes #2191

Changes

  • Add chmod 444 on /opt/nemoclaw-generate-config.ts after COPY (fixes the reported EACCES)
  • Add chmod -R a+rX on /opt/nemoclaw-hermes-plugin/ after COPY (same class of bug)
  • Add build-hermes-sandbox-image job to sandbox-images-and-e2e.yaml — builds the
    Hermes image on every PR and verifies file permissions with test -r / test -x checks
  • Add resolve-hermes-base-image composite action (mirrors resolve-sandbox-base-image)
  • Add build-and-push-hermes job to base-image.yaml — publishes Hermes base image to
    GHCR when agents/hermes/Dockerfile.base changes on main
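The two chmod patterns in the list above behave differently, which a small shell sketch can make concrete. It runs against throwaway files rather than the real image; the file names and the restrictive starting modes are assumptions chosen for illustration, not taken from the Dockerfile.

```shell
#!/bin/sh
# Sketch only: reproduce the two chmod patterns on temporary files.
# File names and starting modes are hypothetical stand-ins.
set -eu

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

mkdir -p "$workdir/plugin/lib"
printf 'export {};\n' > "$workdir/generate-config.ts"
printf 'print("hi")\n' > "$workdir/plugin/lib/mod.py"
printf '#!/bin/sh\n' > "$workdir/plugin/run.sh"

# Start from owner-only modes, similar to what a restrictive COPY could leave.
chmod 600 "$workdir/generate-config.ts" "$workdir/plugin/lib/mod.py"
chmod 700 "$workdir/plugin" "$workdir/plugin/lib" "$workdir/plugin/run.sh"

# Single file: read-only for owner, group, and others.
chmod 444 "$workdir/generate-config.ts"

# Tree: add read for everyone; capital X adds execute only to directories
# and to files that already carry an execute bit.
chmod -R a+rX "$workdir/plugin"

# Print the permission string (first 10 characters of ls -ld output).
mode() { ls -ld "$1" | cut -c1-10; }

for p in generate-config.ts plugin plugin/lib/mod.py plugin/run.sh; do
  printf '%s %s\n' "$(mode "$workdir/$p")" "$p"
done
```

The point of `a+rX` over `a+rx` is that plain data files such as `mod.py` become readable without also becoming executable, while directories stay traversable.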

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npx prek run --all-files passes
  • No secrets, API keys, or credentials committed
  • hadolint passes on modified Dockerfile

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Added an action to resolve the Hermes sandbox base image with a fallback when needed
    • Added a Hermes production image build and validation workflow that ensures runtime files and executables are accessible
  • Chores

    • Split base image build to manage Hermes separately
    • Hardened file permissions for Hermes plugin components and made config generation output read-only

The Hermes Dockerfile copies generate-config.ts and the plugin directory
without setting read permissions, causing EACCES when the sandbox user
runs the config generation step. Add chmod after each COPY to match the
pattern already used for the blueprint and startup scripts.

Also add the Hermes image build to PR CI (sandbox-images-and-e2e) with
permission verification checks, publish the Hermes base image from
base-image.yaml, and add a resolve-hermes-base-image composite action.

Fixes #2191
Contributor

coderabbitai Bot commented Apr 25, 2026

Walkthrough

Adds a composite GitHub Action to resolve a Hermes sandbox base image (pull or build fallback), extends CI to build/push a Hermes base image, adds a job that builds a Hermes sandbox image using the resolved base and verifies sandbox user file access, and hardens Hermes Dockerfile file permissions.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Hermes Dockerfile Permission Hardening | agents/hermes/Dockerfile | Adjusts file permissions for Hermes plugin files: makes plugin files world-readable/executable (preserving directory execute bits) and sets generate-config.ts to read-only (0444). |
| Base Image Resolution Action | .github/actions/resolve-hermes-base-image/action.yaml | New composite action resolve-hermes-base-image that attempts to pull ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest; on pull failure emits a workflow warning and builds a local fallback image (nemoclaw-hermes-base-local) from agents/hermes/Dockerfile.base, then writes HERMES_BASE_IMAGE to GITHUB_ENV. |
| Base Image Build Workflow | .github/workflows/base-image.yaml | Adds build-and-push-hermes job triggered on changes to agents/hermes/Dockerfile.base; sets up QEMU/Buildx, logs into GHCR, and builds/pushes multi-arch ghcr.io/nvidia/nemoclaw/hermes-sandbox-base (tags: latest, short SHA). |
| Hermes Sandbox Image Build & Validation | .github/workflows/sandbox-images-and-e2e.yaml | Adds build-hermes-sandbox-image job that builds a production Hermes sandbox image using HERMES_BASE_IMAGE and validates that the sandbox user can access expected config/plugins/blueprint files and execute Hermes binaries. |
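The access validation described for the build-hermes-sandbox-image job could be sketched as a small shell helper built on `test -r` / `test -x`. The helper name `verify_access` is hypothetical and is not taken from the workflow; it only illustrates checking several paths and failing if any one is inaccessible.

```shell
#!/bin/sh
# Sketch only: verify_access is a hypothetical helper, not the workflow's code.
# Usage: verify_access r|x PATH...
# Returns non-zero if any path fails the requested test.
verify_access() {
  flag="$1"
  shift
  failed=0
  for path in "$@"; do
    if test "-$flag" "$path"; then
      echo "ok   -$flag $path"
    else
      echo "FAIL -$flag $path" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

In CI such checks would presumably run inside the built image as the sandbox user, for example through `docker run --rm --user <sandbox-user> <image> sh -c '...'`, where the user name and image tag depend on the actual workflow. Reporting every failing path before returning makes a regression easier to diagnose than stopping at the first error.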

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant Action as resolve-hermes-base-image
    participant GHCR as GHCR Registry
    participant Docker as Docker Build
    participant Env as GITHUB_ENV

    GH->>Action: invoke resolve-hermes-base-image
    Action->>GHCR: attempt pull ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest
    alt pull succeeds
        GHCR-->>Action: pull success
        Action->>Env: write HERMES_BASE_IMAGE=ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest
    else pull fails
        GHCR-->>Action: pull failed
        Action->>GH: emit workflow warning
        Action->>Docker: build image nemoclaw-hermes-base-local (agents/hermes/Dockerfile.base)
        Docker-->>Action: build complete
        Action->>Env: write HERMES_BASE_IMAGE=nemoclaw-hermes-base-local
    end
    Env-->>GH: HERMES_BASE_IMAGE available
    GH->>Docker: build hermes sandbox image using HERMES_BASE_IMAGE
    Docker->>Docker: verify sandbox user file access and Hermes binary executability
    Docker-->>GH: build & validation complete
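The pull-or-build fallback in the diagram can be approximated in shell as follows. This is a sketch under assumptions rather than the action's actual script: the function name, the overridable `DOCKER` variable, and the build context `.` are all hypothetical, while the image names, Dockerfile path, and `HERMES_BASE_IMAGE` variable come from the walkthrough above.

```shell
#!/bin/sh
# Sketch only: approximates the resolve-hermes-base-image flow.
# DOCKER may be set to a stub for dry runs; it defaults to the docker CLI.
REMOTE_IMAGE="ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest"
LOCAL_IMAGE="nemoclaw-hermes-base-local"

resolve_hermes_base_image() {
  docker_cmd="${DOCKER:-docker}"
  if "$docker_cmd" pull "$REMOTE_IMAGE" >/dev/null 2>&1; then
    image="$REMOTE_IMAGE"
  else
    # GitHub Actions workflow command that surfaces a warning annotation.
    echo "::warning::Could not pull $REMOTE_IMAGE; building $LOCAL_IMAGE locally"
    # The build context "." is an assumption; the action may use another.
    "$docker_cmd" build -f agents/hermes/Dockerfile.base -t "$LOCAL_IMAGE" . || return 1
    image="$LOCAL_IMAGE"
  fi
  # Later workflow steps read the resolved tag from the job environment.
  echo "HERMES_BASE_IMAGE=$image" >> "$GITHUB_ENV"
}
```

Writing the resolved tag to `GITHUB_ENV` lets the subsequent image build use the same variable whether the base came from GHCR or from the local fallback.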

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through layers, fixed bits with a cheer,
Pulled a base, or built one when absent here.
Sandbox can read, binaries can run,
A little rabbit dance—CI's job is done! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: fixing file permissions on copied files so the sandbox user can read them. |
| Linked Issues check | ✅ Passed | The PR successfully implements all objectives from issue #2191: fixes file permissions on copied Dockerfile files, adds verification job, and implements fallback base image resolution. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing #2191: Dockerfile permission fixes, CI jobs for verification, and base image resolution infrastructure. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@ericksoa ericksoa self-assigned this Apr 25, 2026
The previous commit renamed build-and-push to build-and-push-openclaw
and removed the IMAGE_NAME env var, which risks breaking branch
protection rules that reference the original job name.

Restore the original OpenClaw job untouched and keep the new Hermes
base image job as a purely additive change.

Fixes #2191
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
.github/workflows/base-image.yaml (1)

92-133: Decouple Hermes publishing from OpenClaw manual dispatch.

build-and-push-hermes currently runs on workflow_dispatch as well. Since dispatch input is OpenClaw-specific, this introduces unnecessary coupling and can fail manual OpenClaw rebuilds for Hermes-only reasons. Consider scoping Hermes publish to push-triggered runs.

Proposed minimal change
   build-and-push-hermes:
-    if: github.repository == 'NVIDIA/NemoClaw'
+    if: github.repository == 'NVIDIA/NemoClaw' && github.event_name == 'push'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/base-image.yaml around lines 92 - 133, The
build-and-push-hermes job is currently running for manual workflow_dispatch runs
and is coupled to OpenClaw inputs; update its run condition so it only triggers
on repository push events (and still restrict to the NVIDIA/NemoClaw repo).
Modify the job's if expression for build-and-push-hermes to require
github.event_name == 'push' (e.g., if: github.repository == 'NVIDIA/NemoClaw' &&
github.event_name == 'push') so Hermes publishing is skipped for
workflow_dispatch/manual OpenClaw dispatches while preserving the repo guard.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3b946c66-fec9-4d39-b482-a16a0e1fc00a

📥 Commits

Reviewing files that changed from the base of the PR and between 0d0dff2 and 20b7f43.

📒 Files selected for processing (1)
  • .github/workflows/base-image.yaml

@ericksoa ericksoa merged commit cc15689 into main Apr 25, 2026
34 checks passed
@miyoungc miyoungc mentioned this pull request Apr 28, 2026
13 tasks
miyoungc added a commit that referenced this pull request Apr 28, 2026
## Summary
Refreshes user-facing docs for the last 24 hours of merged NemoClaw
history and bumps the docs metadata to 0.0.29, the next version after
v0.0.28. The updates are limited to behavior supported by merged PR
descriptions and diffs.

## Changes
- `docs/reference/commands.md`: documented `nemoclaw <name> policy-add
--from-file` and `--from-dir`, including custom preset review guidance,
from #2077 / commit `7720b175`.
- `docs/deployment/deploy-to-remote-gpu.md`: clarified that non-loopback
`CHAT_UI_URL` disables OpenClaw device pairing for remote browser-only
deployments, from #2449 / commit `f5ee8a4d`.
- `docs/inference/inference-options.md`: documented provider-aware
credential retry validation and the NVIDIA-only `nvapi-` prefix check,
from #2389 / commit `6f7f0c6d`.
- `docs/inference/switch-inference-providers.md`: documented
`NEMOCLAW_INFERENCE_INPUTS` for text/image-capable model metadata baked
into `openclaw.json`, from #2441 / commit `f4391892`.
- `docs/reference/troubleshooting.md`: added the Git certificate
verification entry for proxy CA propagation through `GIT_SSL_CAINFO`,
`GIT_SSL_CAPATH`, `CURL_CA_BUNDLE`, and `REQUESTS_CA_BUNDLE`, from #2345
/ commit `fa0dc1ab`.
- `docs/versions1.json` and `docs/project.json`: promoted docs version
`0.0.29`; `docs/versions1.json` omits unpublished `0.0.26`, `0.0.27`,
and `0.0.28` entries.
- `.agents/skills/nemoclaw-user-*`: regenerated derived user skill
references from the updated docs.
- Reviewed with no extra doc changes: #2575 / `d392ec07`, #2565 /
`a3231049`, #1965 / `db1ef3ca`, #1990 / `db665834`, #2495 / `7da86fa3`,
#2496 / `3192f4f4`, #2490 / `8c209058`, #2487 / `1f615e2f`, #2483 /
`5653d33a`, #2482 / `31c782c0`, #2464 / `23bb5703`, #2472 / `a54f9a34`,
and #2437 / `6bc860d7`.
- Skipped per docs policy: #2420 / `7b76df6b` touched the experimental
sandbox config path listed in `docs/.docs-skip`; #2466 / `cc15689c`
touched a skipped term and CI-only sandbox image files.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes — failed locally in installer-integration tests
and one onboard helper timeout; the doc-scoped hook test projects passed
under `prek`.
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only) — build
succeeded, but local Sphinx emitted the existing version-switcher file
read message.
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure
- [x] AI-assisted — tool: Codex

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


## Summary by CodeRabbit

* **New Features**
  * Support for custom YAML presets in policy configuration via
    --from-file and --from-dir.
  * New build-time inference input option to declare accepted modalities
    (text or text,image).

* **Improvements**
  * Credential validation now offers interactive recovery: re-enter key,
    retry, choose another provider, or exit.
  * Clarified provider-specific API key prefix handling (nvapi- only
    applies to NVIDIA keys).

* **Documentation**
  * TLS certificate troubleshooting for inspected networks.
  * Clarified remote dashboard security/device-pairing behavior; command
    docs updated; docs version bumped.

---------

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

Development

Successfully merging this pull request may close these issues.

Wrong permissions in Dockerfile for Hermes Agent sandbox

1 participant