fix(hermes): keep remote secrets out of sandbox surfaces by ericksoa · Pull Request #4771 · NVIDIA/NemoClaw

ericksoa · 2026-06-04T14:36:17Z

Summary

- preserve the pre-existing Hermes remote toolset surface for the API server and enabled messaging platforms, including terminal, file, code_execution, memory, session_search, delegation, and cronjob
- keep platform_toolsets.cli unpinned and avoid no_mcp, so the fix does not disable default Hermes/MCP capability as the security control
- fail Hermes startup when /sandbox/.hermes/.env or the startup process environment contains raw secret-shaped values (*_TOKEN, *_KEY, *_SECRET, *_API, *_PASSWORD, *_CREDENTIAL), while allowing OpenShell resolver placeholders and Slack SDK placeholder aliases
- keep the only raw startup-env exception scoped to OPENCLAW_GATEWAY_TOKEN; known non-secret config names such as API_SERVER_PORT, API_SERVER_HOST, NEMOCLAW_INFERENCE_API, and NEMOCLAW_PROVIDER_KEY are explicitly allowlisted
- move Hermes managed-tool gateway auth off raw TOOL_GATEWAY_USER_TOKEN sandbox env; Hermes now sends the attached OpenShell provider placeholder for NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN, and the host broker accepts the OpenShell-rewritten refresh credential while keeping raw OAuth state host-side
- add a named Hermes sandbox secret-boundary smoke in the sandbox image workflow, plus focused generator/startup/plugin/broker/onboarding regression tests
- merge current main and preserve the new Hermes api_mode routing added there
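
For readers who want the naming rule above in executable form, here is a minimal sketch. It is not the code in `agents/hermes/start.sh`: the function name `is_violation`, the `startup_env` flag, and the treatment of empty values are assumptions made for illustration. Slack SDK placeholder aliases are left out because their format is not given in this description, and the sketch assumes `OPENCLAW_GATEWAY_TOKEN` is exempt only in the startup process environment.

```python
import re

# Suffixes the PR description lists as secret-shaped.
SECRET_SUFFIX = re.compile(r"_(TOKEN|KEY|SECRET|API|PASSWORD|CREDENTIAL)$")

# Non-secret config names the PR description says are explicitly allowlisted.
ALLOWLISTED_NAMES = {
    "API_SERVER_PORT",
    "API_SERVER_HOST",
    "NEMOCLAW_INFERENCE_API",
    "NEMOCLAW_PROVIDER_KEY",
}

# The only raw startup-env exception named in the PR description.
RAW_STARTUP_ENV_EXCEPTIONS = {"OPENCLAW_GATEWAY_TOKEN"}

# Prefix of OpenShell resolver placeholders as quoted in the PR description.
RESOLVER_PREFIX = "openshell:resolve:env:"


def is_violation(name: str, value: str, *, startup_env: bool = False) -> bool:
    """Return True when NAME=VALUE looks like a raw secret (hypothetical helper)."""
    if name in ALLOWLISTED_NAMES:
        return False
    if startup_env and name in RAW_STARTUP_ENV_EXCEPTIONS:
        return False
    if not SECRET_SUFFIX.search(name):
        return False
    if not value:
        # Assumption: an empty value carries no secret, so it is not flagged.
        return False
    # Resolver placeholders are allowed; anything else is treated as raw.
    return not value.startswith(RESOLVER_PREFIX)
```

Under these assumptions, `DEVTEST_API_TOKEN` with a UUID value and `INTERNAL_API` with any non-empty raw value are flagged, while `NEMOCLAW_INFERENCE_API` and a `SLACK_BOT_TOKEN` holding an `openshell:resolve:env:` placeholder pass.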

Regression story

No intentional Hermes tool regression remains in this PR. Slack, Discord, Telegram, WeChat/Weixin, WhatsApp, and the OpenAI-compatible API server keep the remote toolsets they previously had, including terminal, file, code execution, memory, session search, delegation, cron, and default MCP exposure. Managed tool presets still configure their backends; for example nous-code keeps terminal.backend=modal, and nous-audio still adds tts.

The security boundary is enforced by keeping NemoClaw-managed secrets out of sandbox-visible files and startup env. A terminal prompt can still print resolver placeholders such as openshell:resolve:env:* or Slack placeholder aliases, but it should not be able to print NemoClaw/OpenShell-managed raw credential values because those values are not placed in /sandbox/.hermes/.env or passed as TOOL_GATEWAY_USER_TOKEN anymore. Startup also refuses raw secret-shaped process env values, closing the other obvious env/printenv path for terminal-enabled Hermes sandboxes.

Startup rejects generic UUID-shaped values like the reported DEVTEST_API_TOKEN leak and also rejects bare *_API names such as INTERNAL_API. The CD smoke asserts the built Hermes image preserves the expected remote toolsets, has no raw secret-shaped .env values, and refuses injected raw .env and startup-env secrets. Existing sandboxes still need rebuild/recreate to pick up the new generated config/startup behavior; this PR does not try to do emergency credential replacement.

Boundary note: if code outside NemoClaw/OpenShell deliberately writes an arbitrary raw secret into a writable sandbox file after startup, Hermes can still echo that file. This PR closes the NemoClaw-managed paths implicated by #4770 and adds tests to prevent reintroducing that class through generated Hermes config, startup validation, managed-tool gateway auth, or the sandbox image workflow.

Tests

bash -n agents/hermes/start.sh test/e2e/test-hermes-sandbox-secret-boundary.sh
npx vitest run test/hermes-start.test.ts test/generate-hermes-config.test.ts --testTimeout 60000
npx vitest run test/hermes-plugin-handlers.test.ts test/hermes-tool-gateway-broker.test.ts test/onboard.test.ts --testTimeout 60000
npm run checks
npm run lint (passes; reports one unrelated existing warning in src/lib/onboard/child-exit-tracker.test.ts)
earlier: npm run build:cli
earlier: generated Hermes config locally for all messaging channels plus all managed tool gateway presets; confirmed API/Slack include terminal, file, code_execution, memory/session/delegation/cron, tts, and terminal.backend=modal, while .env has no TOOL_GATEWAY_USER_TOKEN and no raw secret-shaped values

Fixes #4770.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

Summary by CodeRabbit

New Features
- Hermes startup enforces a secret boundary (rejects symlinked .env and raw secret-shaped values in file or process env); validated proxy host/port are now forwarded into sandboxes.
Bug Fixes
- Managed-tool broker/gateway auth updated to use a dedicated refresh-token variable and clearer unknown-credential errors to avoid exposing secrets.
Tests
- New e2e and unit tests covering secret-boundary checks, config/toolset generation, and broker/gateway behavior.
Chores
- CI/workflows run secret-boundary E2E job, conditionally supply live messaging secrets, and upload boundary logs.


copy-pr-bot · 2026-06-04T14:36:21Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


coderabbitai · 2026-06-04T14:36:25Z

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b006206-c80c-47e4-b7b3-f3f08a6d64bf

📥 Commits

Reviewing files that changed from the base of the PR and between da27088 and 0bdfa21.

📒 Files selected for processing (4)

.github/workflows/nightly-e2e.yaml
src/lib/onboard.ts
test/helpers/e2e-workflow-contract.ts
test/onboard.test.ts

✅ Files skipped from review due to trivial changes (1)

test/helpers/e2e-workflow-contract.ts

🚧 Files skipped from review as they are similar to previous changes (2)

test/onboard.test.ts
.github/workflows/nightly-e2e.yaml

📝 Walkthrough


Adds Hermes startup secret-boundary validators, refactors tool-gateway credential flow to use a refresh-token env var, updates remote platform toolset assignment, and adds unit and E2E tests plus CI jobs that verify no raw secret-shaped values appear in sandbox .env or process environment.

Changes

Hermes secret boundary enforcement and credential refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Remote platform toolset configuration: `agents/hermes/config/hermes-config.ts`, `test/generate-hermes-config.test.ts` | `REMOTE_PLATFORM_TOOLSETS` introduced and `MESSAGING_PLATFORM_BY_CHANNEL` maps messaging channels to platform keys. Config assigns `platform_toolsets.api_server` and messaging platform toolsets from the remote list; tests assert generated toolsets equal the remote baseline and scan generated `.env` for secret-shaped env violations. |
| Startup environment secret boundary validation: `agents/hermes/start.sh`, `test/hermes-start.test.ts`, `test/e2e/test-hermes-slack-e2e.sh` | Adds `validate_hermes_env_secret_boundary()` (checks `/sandbox/.hermes/.env` for symlink and forbidden raw secret-shaped KEY=VALUE entries) and `validate_hermes_runtime_env_secret_boundary()` (scans process environment). Both run before `refresh_hermes_provider_placeholders()`; tests verify accept/reject rules and that rejected raw values are not echoed. |
| Tool gateway credential refactoring & onboarding: `agents/hermes/host/tool-gateway-broker.ts`, `agents/hermes/plugin/__init__.py`, `src/lib/onboard.ts`, `test/hermes-plugin-handlers.test.ts`, `test/hermes-tool-gateway-broker.test.ts`, `test/onboard.test.ts` | Broker adds `findCredentialState()` to classify presented tokens and derive refresh tokens; `handleProxy` resolves `presentedToken` accordingly. Plugin `_broker_user_token()` now prefers `NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN` or returns an `openshell:resolve:env:` placeholder. Onboarding drops per-sandbox `hermesToolBrokerToken` injection and forwards validated `NEMOCLAW_PROXY_*` proxy settings to sandbox env. Tests updated to match new token flow and request headers. |
| E2E testing and CI workflow integration: `test/e2e/test-hermes-sandbox-secret-boundary.sh`, `.github/workflows/sandbox-images-and-e2e.yaml`, `.github/workflows/nightly-e2e.yaml`, `.github/workflows/e2e-script.yaml`, `test/e2e-script-workflow.test.ts`, `test/validate-e2e-coverage.test.ts` | New E2E script runs in-image Python probes to validate sandbox `.env` and `config.yaml`, validates managed-tool image fragments, and asserts startup rejects injected secret-shaped `.env` and process env entries without echoing values. Reusable E2E workflow gains `messaging_live_secrets` input and conditional secret wiring; workflows run the script and upload failure logs as artifacts; nightly workflow adds `hermes-secret-boundary-e2e` job and wires it into aggregation jobs. |
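
The plugin-side token choice summarized in the table can be illustrated with a short sketch. This is an assumption-laden reading of the summary, not the body of `_broker_user_token()` in `agents/hermes/plugin/__init__.py`: the function name `broker_user_token` and the injectable `environ` parameter are hypothetical, and the real helper may apply further checks.

```python
import os
from typing import Mapping, Optional

REFRESH_TOKEN_ENV = "NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN"
RESOLVER_PREFIX = "openshell:resolve:env:"


def broker_user_token(environ: Optional[Mapping[str, str]] = None) -> str:
    """Pick the credential the sandbox presents to the host broker (sketch)."""
    env = os.environ if environ is None else environ
    value = env.get(REFRESH_TOKEN_ENV, "").strip()
    if value:
        # Prefer whatever is attached under the dedicated refresh-token variable.
        return value
    # Otherwise fall back to a resolver placeholder so no raw token is needed
    # inside the sandbox.
    return RESOLVER_PREFIX + REFRESH_TOKEN_ENV
```

With an empty environment the sketch returns `openshell:resolve:env:NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN`, the same placeholder string the review evidence further down quotes.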

Sequence Diagram(s)

sequenceDiagram
  participant Startup as Hermes startup
  participant EnvValidator as validate_hermes_env_secret_boundary
  participant RuntimeValidator as validate_hermes_runtime_env_secret_boundary
  participant PythonScanner as Python secret scanner
  participant Refresh as refresh_hermes_provider_placeholders
  Startup->>EnvValidator: validate /sandbox/.hermes/.env (symlink, secret-shaped entries)
  EnvValidator->>PythonScanner: scan file for credential-like keys/values
  alt violations found
    PythonScanner-->>EnvValidator: report offending keys/lines
    EnvValidator-->>Startup: exit non-zero
  else
    EnvValidator-->>Startup: proceed
  end
  Startup->>RuntimeValidator: validate process environment for secret-shaped keys
  RuntimeValidator->>PythonScanner: scan process env
  alt violations found
    PythonScanner-->>RuntimeValidator: report offending keys
    RuntimeValidator-->>Startup: exit non-zero
  else
    RuntimeValidator-->>Startup: proceed
  end
  Startup->>Refresh: refresh provider placeholders
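
The ordering in the diagram can also be read as the following sketch. It assumes a simplified rule set (suffix match plus resolver placeholder, no allowlist) and hypothetical names such as `SecretBoundaryError`, `offending_keys`, and `startup`; the real validators live in `agents/hermes/start.sh` and may differ in parsing and messages. The point it shows is that errors carry key names only and that the refresh step never runs after a rejection.

```python
import re
import tempfile
from pathlib import Path

SECRET_SUFFIX = re.compile(r"_(TOKEN|KEY|SECRET|API|PASSWORD|CREDENTIAL)$")
RESOLVER_PREFIX = "openshell:resolve:env:"


class SecretBoundaryError(Exception):
    """Raised with offending key names only, never their values."""


def offending_keys(pairs):
    # Simplified rule: secret-shaped name with a non-empty, non-placeholder value.
    return sorted(
        name
        for name, value in pairs
        if SECRET_SUFFIX.search(name) and value and not value.startswith(RESOLVER_PREFIX)
    )


def validate_env_file(path: Path) -> None:
    if path.is_symlink():
        raise SecretBoundaryError(f"{path.name} must not be a symlink")
    if not path.exists():
        return
    pairs = []
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        pairs.append((name.strip(), value.strip().strip("'\"")))
    bad = offending_keys(pairs)
    if bad:
        raise SecretBoundaryError("raw secret-shaped .env entries: " + ", ".join(bad))


def validate_runtime_env(environ) -> None:
    bad = offending_keys(environ.items())
    if bad:
        raise SecretBoundaryError("raw secret-shaped process env: " + ", ".join(bad))


def startup(env_file: Path, environ, refresh) -> None:
    # Same order as the diagram: file check, process env check, then refresh.
    validate_env_file(env_file)
    validate_runtime_env(environ)
    refresh()


# Usage: an injected raw value is rejected and the refresh step is skipped.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("DEVTEST_API_TOKEN=123e4567-e89b-12d3-a456-426614174000\n")
    calls = []
    try:
        startup(env_file, {}, lambda: calls.append("refresh"))
    except SecretBoundaryError as err:
        message = str(err)
```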

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

NVIDIA/NemoClaw#4718: Modifies agents/hermes/config/hermes-config.ts, related to config construction in the same area.
NVIDIA/NemoClaw#4703: Related startup-sequence changes in agents/hermes/start.sh that touch validations executed before refresh_hermes_provider_placeholders().

Suggested labels

E2E, area: onboarding, area: integrations

Suggested reviewers

cv
laitingsheng

Poem

🐰 I hop the sandbox, sniff and bound,
I nudge the secrets safe and sound,
Placeholders stand where raw tokens lay,
Tests watch the gate both night and day,
CI cheers — no secrets run away!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.03% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(hermes): keep remote secrets out of sandbox surfaces' directly addresses the main change: preventing environment-sourced secrets from being exposed to remote surfaces. |
| Linked Issues check | ✅ Passed | The PR comprehensively implements all coding requirements from `#4770`: secret-boundary validation in .env and process environment, secret-shaped value rejection, allowlisting of non-secrets and resolver placeholders, and managed-tool gateway credential migration. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to issue `#4770`'s objectives: secret-boundary enforcement, credential migration, test coverage, and workflow integration for secret-boundary validation testing. |


github-actions · 2026-06-04T14:39:49Z

E2E Advisor Recommendation

Required E2E: hermes-secret-boundary-e2e, hermes-e2e, hermes-onboard-security-posture-e2e, hermes-slack-e2e, messaging-providers-e2e
Optional E2E: hermes-root-entrypoint-smoke-e2e, hermes-discord-e2e, common-egress-agent-e2e

Dispatch hint: hermes-secret-boundary-e2e,hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e

Auto-dispatched E2E: hermes-e2e, hermes-onboard-security-posture-e2e, hermes-slack-e2e, messaging-providers-e2e via nightly-e2e.yaml at 5800d3783caaaed481f5660d1e061a63ba987202 — nightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

hermes-secret-boundary-e2e (medium): Directly validates the new Hermes sandbox secret-boundary behavior: built image inspection, managed-tool image credential surfaces, remote toolsets, and startup rejection of raw secret-shaped .env/runtime env values.
hermes-e2e (high): Exercises real install → onboard --agent hermes → health probe → live inference, covering the changed Hermes config generation, onboarding create path, and startup secret-boundary validation in an actual sandbox.
hermes-onboard-security-posture-e2e (high): Required because the PR changes Hermes runtime security posture and credential handling during startup/onboard. This validates a full Hermes onboard with non-root host-user and runtime guard assertions.
hermes-slack-e2e (high): Required for the changed Hermes messaging platform_toolsets and Slack credential placeholder/provider path. The PR also touches the Hermes Slack E2E script and CI secret-passing behavior for messaging providers.
messaging-providers-e2e (high): Validates the reusable workflow's new messaging_live_secrets gating plus the provider/placeholder/L7-proxy chain for Telegram, Discord, and Slack credentials. This is important because the PR changes whether live messaging secrets are passed to E2E scripts.

Optional E2E

hermes-root-entrypoint-smoke-e2e (medium): Useful adjacent coverage for the modified Hermes start.sh root/non-root entrypoint paths, layout repair, gateway-user execution, and PID migration. The required Hermes E2Es cover real startup, but this gives faster image-entrypoint confidence.
hermes-discord-e2e (high): Optional confidence for the same messaging platform_toolsets change on another Hermes messaging channel besides Slack.
common-egress-agent-e2e (very high): Optional expensive end-to-end agent-flow coverage for Hermes common-egress policy/tool behavior, including Hermes Nous policy routes. Helpful because managed-tool gateway configuration changed, but not strictly merge-blocking unless managed-tool runtime behavior is the PR's primary risk.

New E2E recommendations

Hermes managed-tool gateway broker runtime (high): Existing coverage now inspects managed-tool image/config and unit-tests the broker, but there is no full E2E that onboards Hermes with managed-tool gateways and verifies a sandbox tool request reaches a hermetic host broker/upstream while raw refresh tokens remain host-only.
- Suggested test: Add a hermetic Hermes managed-tool gateway broker E2E using a fake Nous portal/upstream to validate OpenShell resolver placeholder presentation, host-side refresh, token rotation, and no raw OAuth credential in sandbox env/config/logs.
Reusable E2E workflow secret gating (medium): The reusable e2e-script workflow now conditionally passes live messaging secrets. Unit/contract tests help, but an end-to-end workflow-level negative check would catch accidental secret exposure in future workflow edits.
- Suggested test: Add a lightweight workflow-contract E2E or CI smoke that dispatches a benign script twice, with messaging_live_secrets false/true, and asserts messaging secret env vars are absent unless explicitly enabled.

Dispatch hint

Workflow: nightly-e2e.yaml
jobs input: hermes-secret-boundary-e2e,hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e

github-actions · 2026-06-04T14:39:51Z

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-hermes, ubuntu-repo-cloud-hermes-slack, ubuntu-repo-cloud-hermes-discord
Optional scenario E2E: ubuntu-repo-cloud-openclaw, wsl-repo-cloud-openclaw, macos-repo-cloud-openclaw

Dispatch required scenario E2E:

gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack
gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required scenario E2E

ubuntu-repo-cloud-hermes: Core Hermes configuration, startup, plugin, broker, and onboard code changed. This scenario exercises repo-checkout Hermes onboarding, sandbox startup, gateway health, inference, and Hermes-specific health/history checks on the primary Ubuntu runner.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
ubuntu-repo-cloud-hermes-slack: Hermes messaging platform toolset configuration and sandbox secret-boundary behavior changed, with Slack-specific legacy tests also changed. This scenario exercises Hermes Slack onboarding plus messaging placeholder/no-secret-leak checks.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack
ubuntu-repo-cloud-hermes-discord: Hermes messaging platform mapping and top-level Discord config generation changed. This scenario exercises Hermes Discord onboarding and the messaging provider/placeholder/no-secret-leak path on the primary Ubuntu runner.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord

Optional scenario E2E

ubuntu-repo-cloud-openclaw: src/lib/onboard.ts is shared onboarding code. Although the visible diff is primarily Hermes-related, this adjacent OpenClaw baseline can catch unintended regressions in generic sandbox create/onboard behavior.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
wsl-repo-cloud-openclaw: Optional platform-adjacent coverage for shared onboarding changes on WSL. Special-runner scenario, so not required unless maintainers want cross-platform confidence.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=wsl-repo-cloud-openclaw
macos-repo-cloud-openclaw: Optional platform-adjacent coverage for shared onboarding/install behavior on macOS. Special-runner scenario and Docker-dependent suites are skipped, so keep optional.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=macos-repo-cloud-openclaw

Relevant changed files

agents/hermes/config/hermes-config.ts
agents/hermes/host/tool-gateway-broker.ts
agents/hermes/plugin/__init__.py
agents/hermes/start.sh
src/lib/onboard.ts

github-actions · 2026-06-04T14:42:56Z

PR Review Advisor

Findings: 1 needs attention, 4 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 2 still apply, 1 new item found

Review findings

🛠️ Needs attention

Gate live messaging secrets for target-ref dispatches (.github/workflows/nightly-e2e.yaml:361): The `messaging-providers-e2e` job still opts into live Telegram, Discord, and Slack secrets while running `test/e2e/test-messaging-providers.sh` from `${{ inputs.target_ref || github.ref }}`. On a `workflow_dispatch` run with a non-empty `target_ref`, the reusable runner checks out that target ref and executes code from it with the live messaging secrets in the environment, so target-ref-controlled E2E code can read or exfiltrate those secrets.
- Recommendation: Only provide live messaging secrets when the tested script is from a trusted ref. Use a guard equivalent to the Docker Hub credential guard, for example `github.event_name != 'workflow_dispatch' || inputs.target_ref == ''`, or split live-secret validation into a trusted-ref-only job and keep fake-token coverage for target refs. Add a workflow contract test that models `workflow_dispatch` with non-empty `inputs.target_ref` and asserts the live messaging secrets are blank or not requested.
- Evidence: `messaging-providers-e2e` sets `ref: ${{ inputs.target_ref || github.ref }}` and `messaging_live_secrets: true`; `.github/workflows/e2e-script.yaml` checks out `inputs.ref` into `repo` and sets `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, and `SLACK_APP_TOKEN_REAL` whenever `inputs.messaging_live_secrets` is true. The added test asserts explicit opt-in/default false, but does not assert target-ref withholding.
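
To make the recommended guard concrete, the sketch below models the suggested expression in Python, in the spirit of the workflow contract test the recommendation asks for. The helper names `live_secrets_allowed` and `messaging_secret_env` are hypothetical, and the model only approximates how GitHub Actions evaluates `inputs.target_ref`; it is not a replacement for a test against the real workflow YAML.

```python
LIVE_SECRET_NAMES = (
    "TELEGRAM_BOT_TOKEN_REAL",
    "DISCORD_BOT_TOKEN_REAL",
    "SLACK_BOT_TOKEN_REAL",
    "SLACK_APP_TOKEN_REAL",
)


def live_secrets_allowed(event_name: str, target_ref: str) -> bool:
    """Model of: github.event_name != 'workflow_dispatch' || inputs.target_ref == ''."""
    return event_name != "workflow_dispatch" or target_ref == ""


def messaging_secret_env(event_name, target_ref, opt_in, secrets):
    """Return the env the E2E script would see; blank unless opted in and trusted."""
    allowed = opt_in and live_secrets_allowed(event_name, target_ref)
    return {name: (secrets.get(name, "") if allowed else "") for name in LIVE_SECRET_NAMES}
```

In this model a `workflow_dispatch` run with a non-empty `target_ref` sees four blank variables even when `messaging_live_secrets` is true, which is the property the finding says is currently untested.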

🔎 Worth checking

Source-of-truth review needed: agents/hermes/plugin/__init__.py managed-tool gateway monkeypatches: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `_broker_user_token()` returns `openshell:resolve:env:NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN` and `_install_nous_tool_broker_patch()` monkeypatches multiple Hermes modules.
Source-of-truth review needed: agents/hermes/host/tool-gateway-broker.ts credential compatibility: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `findCredentialState()` checks broker-token state first and refresh-token state second; `handleProxy()` uses the presented token directly for refresh-token matches and resolves host runtime refresh token for broker-token matches.
Hermes chat-output and memory redaction clauses remain out of scope (agents/hermes/start.sh:948): The PR prevents NemoClaw-managed raw secret-shaped values from entering Hermes `.env` or startup process env, which addresses the managed `.env`/startup-env path. It does not implement the linked issue's provenance-based chat redaction or memory persistence behavior, so raw secrets introduced after startup through another writable file/env path could still be echoed or persisted by Hermes.
- Recommendation: Either implement the Hermes output and memory redaction behavior requested by issue #4770 ([Ubuntu 22.04][Security] Hermes agent echoes env-var token verbatim to Slack chat on "print $X_TOKEN" prompt), or explicitly narrow the issue closure criteria to the NemoClaw-managed `.env` and startup-env sources fixed here and track chat-output and memory redaction as separate work. Add behavioral tests for chat response redaction and memory persistence redaction if those clauses are intended to be satisfied by this PR.
- Evidence: The expected result in issue #4770 includes clauses (b) and (c): scrub chat output for any value coming directly from `os.environ`, and replace env-var reads with `<redacted: ENV_VAR_NAME>` in chat and memory. This diff changes generated config, startup validation, managed-tool auth, and tests, but does not modify Hermes terminal output, chat response redaction, or memory persistence paths.
Document removal conditions for Hermes managed-tool compatibility shims (agents/hermes/plugin/__init__.py:221): The PR changes several localized compatibility shims around Hermes managed-tool gateway auth and legacy broker-token handling. The comments explain the invalid upstream state and include regression tests, but the removal condition is still only a broad 'long term' note and the legacy broker-token compatibility path lacks a concrete sunset/version condition.
- Recommendation: Document the specific upstream Hermes setting/version or NemoClaw compatibility milestone that will allow removing the monkeypatches and legacy `TOOL_GATEWAY_USER_TOKEN`/broker-token path. If legacy support must remain indefinitely, state that explicitly and keep the regression tests tied to that contract.
- Evidence: `_broker_user_token()` now returns an OpenShell resolver placeholder by default and `_install_nous_tool_broker_patch()` monkeypatches Hermes managed-tool modules. `tool-gateway-broker.ts` accepts both broker-token and raw refresh-token credential states. Tests cover these paths, but comments do not define a concrete removal trigger for the workaround or legacy compatibility behavior.
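
The broker precedence described in the findings above (broker-token state first, refresh-token state second) can be sketched as follows. The real implementation is TypeScript in `agents/hermes/host/tool-gateway-broker.ts`; this Python version uses a hypothetical `CredentialState` record and hypothetical function names, and the constant-time comparison is an addition for illustration rather than something the PR text confirms.

```python
import hmac
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class CredentialState:
    """Hypothetical host-side record; field names are assumptions."""
    broker_token: str
    refresh_token: str


def _same(a: str, b: str) -> bool:
    # Constant-time comparison on bytes (an illustrative choice).
    return hmac.compare_digest(a.encode(), b.encode())


def find_credential_state(
    states: Iterable[CredentialState], presented: str
) -> Optional[Tuple[str, CredentialState]]:
    states = list(states)
    # Legacy broker-token matches are checked first.
    for state in states:
        if state.broker_token and _same(state.broker_token, presented):
            return ("broker", state)
    # Refresh-token matches are checked second.
    for state in states:
        if state.refresh_token and _same(state.refresh_token, presented):
            return ("refresh", state)
    return None


def resolve_refresh_token(states: Iterable[CredentialState], presented: str) -> str:
    match = find_credential_state(states, presented)
    if match is None:
        # The error names the condition, not the presented value.
        raise PermissionError("unknown tool gateway credential")
    kind, state = match
    # Refresh matches use the presented token directly; broker matches
    # resolve the refresh token held on the host.
    return presented if kind == "refresh" else state.refresh_token
```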

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation** — Workflow target-ref live messaging secret withholding: model `workflow_dispatch` with non-empty `inputs.target_ref` and assert `messaging-providers-e2e` does not expose `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, or `SLACK_APP_TOKEN_REAL`. Unit and smoke coverage is strong for generated Hermes config, startup rejection, plugin patches, and broker behavior, but the changed workflow trusted-code boundary and the issue-level Slack/chat/memory behavior still need targeted behavioral validation.
**Runtime validation** — Reusable E2E runner trusted-ref live secret guard: assert `.github/workflows/e2e-script.yaml` or all callers require both explicit live-secret opt-in and a trusted-ref predicate equivalent to the Docker Hub guard.
**Runtime validation** — Hermes chat redaction provenance path: simulate an env-derived value reaching terminal/tool output and assert the Slack/chat response contains `<redacted: ENV_VAR_NAME>` rather than the raw value.
**Runtime validation** — Hermes memory persistence redaction: simulate terminal/tool output containing an env-derived secret and assert persisted memory/transcript stores only the redacted placeholder.
**Runtime validation** — Post-startup raw secret boundary scope: either test and document that arbitrary raw secrets written after startup remain out of scope, or add a runtime guard where Hermes loads or reads env values.
**Acceptance clause:** When a Slack user prompts the bot to `print` or `check access` of an env-var-named secret (e.g. `DEVTEST_API_TOKEN`, `API_TOKEN`, `CQA_TOKEN`), the agent runs `echo "${X_TOKEN}"` via the terminal tool and posts the **complete plaintext token value** to the Slack channel. — add test evidence or identify existing coverage. The PR keeps terminal enabled and prevents NemoClaw-managed raw secret-shaped values from being present in `/sandbox/.hermes/.env` or startup process env. This prevents the reported managed startup/config path, but does not add a terminal/chat-layer refusal for arbitrary raw secrets introduced after startup.
**Acceptance clause:** The bot boot log declares `Secret redaction: ENABLED (tool output, logs, and chat responses are scrubbed before delivery)` but the redaction layer does not catch generic UUID/GUID-format token values — it apparently only matches known prefixes like `xoxb-` / `sk-`. — add test evidence or identify existing coverage. The new startup guard rejects UUID-like raw values in secret-shaped env names and tests verify the raw value is not printed in startup errors. The Hermes redaction layer itself is not changed.
**Acceptance clause:** Two independent prompts from a single user produced the leak in under a minute. — add test evidence or identify existing coverage. No PR diff evidence exercises the Slack prompt path end-to-end; the new tests target generated config, startup env validation, plugin/broker behavior, and image scans.
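
As a starting point for the redaction tests suggested above, here is a hedged sketch of provenance-based scrubbing in the `<redacted: ENV_VAR_NAME>` style that issue #4770 asks for. Nothing in this PR implements it; the function `redact_env_values` and its `min_length` cutoff are hypothetical, and a real redaction layer would need to handle encodings, partial matches, and values that overlap.

```python
from typing import Mapping


def redact_env_values(text: str, environ: Mapping[str, str], min_length: int = 8) -> str:
    """Replace any env value found in text with <redacted: NAME> (sketch).

    min_length is an assumption that keeps short values such as "1" or "/"
    from being redacted everywhere they happen to appear.
    """
    candidates = [(name, value) for name, value in environ.items() if len(value) >= min_length]
    # Longest values first, so a value containing another is replaced whole.
    candidates.sort(key=lambda item: len(item[1]), reverse=True)
    for name, value in candidates:
        text = text.replace(value, f"<redacted: {name}>")
    return text
```

A behavioral test could feed simulated terminal output through a helper like this and assert that neither the chat response nor the persisted memory contains the raw value.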

Since last review details

Current findings:

Source-of-truth review needed: agents/hermes/plugin/__init__.py managed-tool gateway monkeypatches: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `_broker_user_token()` returns `openshell:resolve:env:NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN` and `_install_nous_tool_broker_patch()` monkeypatches multiple Hermes modules.
Source-of-truth review needed: agents/hermes/host/tool-gateway-broker.ts credential compatibility: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `findCredentialState()` checks broker-token state first and refresh-token state second; `handleProxy()` uses the presented token directly for refresh-token matches and resolves host runtime refresh token for broker-token matches.
Gate live messaging secrets for target-ref dispatches (.github/workflows/nightly-e2e.yaml:361): The `messaging-providers-e2e` job still opts into live Telegram, Discord, and Slack secrets while running `test/e2e/test-messaging-providers.sh` from `${{ inputs.target_ref || github.ref }}`. On a `workflow_dispatch` run with a non-empty `target_ref`, the reusable runner checks out that target ref and executes code from it with the live messaging secrets in the environment, so target-ref-controlled E2E code can read or exfiltrate those secrets.
- Recommendation: Only provide live messaging secrets when the tested script is from a trusted ref. Use a guard equivalent to the Docker Hub credential guard, for example `github.event_name != 'workflow_dispatch' || inputs.target_ref == ''`, or split live-secret validation into a trusted-ref-only job and keep fake-token coverage for target refs. Add a workflow contract test that models `workflow_dispatch` with non-empty `inputs.target_ref` and asserts the live messaging secrets are blank or not requested.
- Evidence: `messaging-providers-e2e` sets `ref: ${{ inputs.target_ref || github.ref }}` and `messaging_live_secrets: true`; `.github/workflows/e2e-script.yaml` checks out `inputs.ref` into `repo` and sets `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, and `SLACK_APP_TOKEN_REAL` whenever `inputs.messaging_live_secrets` is true. The added test asserts explicit opt-in/default false, but does not assert target-ref withholding.
Hermes chat-output and memory redaction clauses remain out of scope (agents/hermes/start.sh:948): The PR prevents NemoClaw-managed raw secret-shaped values from entering Hermes `.env` or startup process env, which addresses the managed `.env`/startup-env path. It does not implement the linked issue's provenance-based chat redaction or memory persistence behavior, so raw secrets introduced after startup through another writable file/env path could still be echoed or persisted by Hermes.
- Recommendation: Either implement the Hermes output and memory redaction behavior requested by issue [Ubuntu 22.04][Security] Hermes agent echoes env-var token verbatim to Slack chat on "print $X_TOKEN" prompt #4770, or explicitly narrow the issue closure criteria to the NemoClaw-managed `.env` and startup-env sources fixed here and track chat-output and memory redaction as separate work. Add behavioral tests for chat response redaction and memory persistence redaction if those clauses are intended to be satisfied by this PR.
- Evidence: Issue [Ubuntu 22.04][Security] Hermes agent echoes env-var token verbatim to Slack chat on "print $X_TOKEN" prompt #4770 expected result includes clauses (b) and (c): scrub chat output for any value coming directly from `os.environ`, and replace env-var reads with `<redacted: ENV_VAR_NAME>` in chat and memory. This diff changes generated config, startup validation, managed-tool auth, and tests, but does not modify Hermes terminal output, chat response redaction, or memory persistence paths.
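To make clauses (b) and (c) concrete, here is a rough sketch of value-based redaction. It is written in TypeScript for illustration only; Hermes itself is not shown in this thread, and `redactEnvValues` and its `minLength` threshold are assumptions rather than anything in the PR or in Hermes.

```typescript
// Hypothetical helper; not part of Hermes or NemoClaw.
// Replaces any occurrence of an environment value with "<redacted: NAME>".
function redactEnvValues(
  text: string,
  env: Record<string, string | undefined>,
  minLength = 8, // assumption: skip short values such as "1" to avoid over-redaction
): string {
  const entries = Object.entries(env)
    .filter((e): e is [string, string] => typeof e[1] === "string" && e[1].length >= minLength)
    // Longest values first, so a value that contains another is replaced whole.
    .sort((a, b) => b[1].length - a[1].length);

  let out = text;
  for (const [name, value] of entries) {
    out = out.split(value).join(`<redacted: ${name}>`);
  }
  return out;
}
```

Applying the same function to both the chat response and anything written to memory would cover both clauses, but where those hooks would live in Hermes is outside what this PR touches.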
Document removal conditions for Hermes managed-tool compatibility shims (agents/hermes/plugin/__init__.py:221): The PR changes several localized compatibility shims around Hermes managed-tool gateway auth and legacy broker-token handling. The comments explain the invalid upstream state and include regression tests, but the removal condition is still only a broad 'long term' note and the legacy broker-token compatibility path lacks a concrete sunset/version condition.
- Recommendation: Document the specific upstream Hermes setting/version or NemoClaw compatibility milestone that will allow removing the monkeypatches and legacy `TOOL_GATEWAY_USER_TOKEN`/broker-token path. If legacy support must remain indefinitely, state that explicitly and keep the regression tests tied to that contract.
- Evidence: `_broker_user_token()` now returns an OpenShell resolver placeholder by default and `_install_nous_tool_broker_patch()` monkeypatches Hermes managed-tool modules. `tool-gateway-broker.ts` accepts both broker-token and raw refresh-token credential states. Tests cover these paths, but comments do not define a concrete removal trigger for the workaround or legacy compatibility behavior.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

…ret-boundary # Conflicts: # agents/hermes/config/hermes-config.ts

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-hermes-sandbox-secret-boundary.sh`:
- Around line 221-224: The test uses GUID-like literal values that trigger
secret scanners; update the E2E payloads used with
assert_startup_rejects_env_entry to use a benign non-secret sentinel (e.g.,
"SENTINEL_VALUE" or a dynamically composed string) instead of the GUID-like
literals so the startup-rejection behavior is still validated without leaking
realistic-looking secrets; make the same replacement for the other occurrence
referenced in the test (the second assert_startup_rejects_env_entry call).

In `@test/generate-hermes-config.test.ts`:
- Around line 183-197: The test "flags bare API-named .env secrets while
allowing API server config" uses a GUID-like literal in rawSecret which triggers
secret scanners; change the fixture to a non-sensitive sentinel (e.g.,
"raw-secret-value") or build the value from harmless fragments so
findRawSecretEnvEntries still sees a non-placeholder raw string; update the
constant referenced as rawSecret in this test so the assertion for
findRawSecretEnvEntries([...]) remains unchanged.

In `@test/hermes-start.test.ts`:
- Around line 596-605: The test uses a hard-coded GUID-like secret in the
"rejects bare API-named raw values without printing the value" spec—replace that
literal with a benign sentinel or construct it at runtime to avoid committing
scanner bait; update the value passed to runHermesEnvSecretBoundary (the envFile
string used in this test and the similar one at the other location around the
645-658 block) to use a non-sensitive token name (e.g., "SENTINEL_TOKEN" or a
runtime concatenation like "token-" + "123") so the test behavior remains the
same but no real-looking GUID is stored in the repo.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 32075eba-4954-4863-815a-545c52db6645

📥 Commits

Reviewing files that changed from the base of the PR and between c281aec and 69e3001.

📒 Files selected for processing (12)

.github/workflows/sandbox-images-and-e2e.yaml
agents/hermes/config/hermes-config.ts
agents/hermes/host/tool-gateway-broker.ts
agents/hermes/plugin/__init__.py
agents/hermes/start.sh
src/lib/onboard.ts
test/e2e/test-hermes-sandbox-secret-boundary.sh
test/generate-hermes-config.test.ts
test/hermes-plugin-handlers.test.ts
test/hermes-start.test.ts
test/hermes-tool-gateway-broker.test.ts
test/onboard.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

agents/hermes/start.sh
agents/hermes/config/hermes-config.ts

coderabbitai · 2026-06-05T09:53:48Z

+assert_startup_rejects_env_entry \
+  "INTERNAL_API=01234567-89ab-cdef-0123-456789abcdef" \
+  "INTERNAL_API" \
+  "01234567-89ab-cdef-0123-456789abcdef"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use non-secret sentinels in the E2E payloads.

These newly added GUID-like literals are already being reported by Betterleaks. This smoke test only needs a raw non-placeholder value to prove startup rejection, so swapping in a benign sentinel string or composing the value dynamically avoids secret-scanner noise without weakening the check.

Also applies to: 229-232

🧰 Tools

🪛 Betterleaks (1.3.1)

[high] 222-222: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-hermes-sandbox-secret-boundary.sh` around lines 221 - 224, The test uses GUID-like literal values that trigger secret scanners; update the E2E payloads used with assert_startup_rejects_env_entry to use a benign non-secret sentinel (e.g., "SENTINEL_VALUE" or a dynamically composed string) instead of the GUID-like literals so the startup-rejection behavior is still validated without leaking realistic-looking secrets; make the same replacement for the other occurrence referenced in the test (the second assert_startup_rejects_env_entry call).

coderabbitai · 2026-06-05T09:53:48Z

+  it("flags bare API-named .env secrets while allowing API server config", () => {
+    const rawSecret = "01234567-89ab-cdef-0123-456789abcdef";
+
+    expect(
+      findRawSecretEnvEntries(
+        [
+          "API_SERVER_PORT=18642",
+          "API_SERVER_HOST=127.0.0.1",
+          `INTERNAL_API=${rawSecret}`,
+          "SERVICE_API=openshell:resolve:env:SERVICE_API",
+          "",
+        ].join("\n"),
+      ),
+    ).toEqual(["INTERNAL_API line 3"]);
+  });


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid committing secret-scanner-shaped fixture values.

This GUID-like literal is already being flagged by Betterleaks in this PR. findRawSecretEnvEntries() only cares that the value is a non-placeholder raw string, so a benign sentinel like raw-secret-value or a value assembled from fragments will preserve the coverage without adding scanner noise.

🧰 Tools

🪛 Betterleaks (1.3.1)

[high] 184-184: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/generate-hermes-config.test.ts` around lines 183 - 197, The test "flags bare API-named .env secrets while allowing API server config" uses a GUID-like literal in rawSecret which triggers secret scanners; change the fixture to a non-sensitive sentinel (e.g., "raw-secret-value") or build the value from harmless fragments so findRawSecretEnvEntries still sees a non-placeholder raw string; update the constant referenced as rawSecret in this test so the assertion for findRawSecretEnvEntries([...]) remains unchanged.
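The suggestion can be illustrated with a stand-in scanner and a fixture assembled from harmless fragments. This is a sketch under stated assumptions: `findRawSecretEntriesSketch` is a hypothetical approximation, not the repository's `findRawSecretEnvEntries()`, and it models only the suffix rule, the `openshell:resolve:env:*` placeholder, and two allowlisted names from the PR summary. Slack placeholder aliases are left out because their format is not shown here.

```typescript
// Hypothetical stand-in; the repo's findRawSecretEnvEntries() may behave differently.
const SECRET_NAME = /_(TOKEN|KEY|SECRET|API|PASSWORD|CREDENTIAL)$/;
const ALLOWLISTED = new Set(["NEMOCLAW_INFERENCE_API", "NEMOCLAW_PROVIDER_KEY"]);
const PLACEHOLDER = /^openshell:resolve:env:[A-Za-z_][A-Za-z0-9_]*$/;

function findRawSecretEntriesSketch(envText: string): string[] {
  const hits: string[] = [];
  envText.split("\n").forEach((line, index) => {
    const match = /^([A-Za-z_][A-Za-z0-9_]*)=(.*)$/.exec(line.trim());
    if (!match) return; // blank lines, comments, malformed entries
    const name = match[1] ?? "";
    const value = match[2] ?? "";
    if (!SECRET_NAME.test(name) || ALLOWLISTED.has(name)) return;
    if (value === "" || PLACEHOLDER.test(value)) return;
    hits.push(`${name} line ${index + 1}`); // report name and position, never the value
  });
  return hits;
}

// Fixture built at runtime from fragments, so no scanner-shaped literal is committed.
const rawSecret = ["raw", "secret", "value"].join("-");

const flagged = findRawSecretEntriesSketch(
  [
    "API_SERVER_PORT=18642",
    "API_SERVER_HOST=127.0.0.1",
    `INTERNAL_API=${rawSecret}`,
    "SERVICE_API=openshell:resolve:env:SERVICE_API",
    "",
  ].join("\n"),
);
console.log(flagged); // [ 'INTERNAL_API line 3' ]
```

Because the check depends only on the value being a non-placeholder string, the sentinel exercises the same branch as the GUID-like literal did.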

coderabbitai · 2026-06-05T09:53:48Z

+  it("rejects bare API-named raw values without printing the value", () => {
+    const rawToken = "01234567-89ab-cdef-0123-456789abcdef";
+    const result = runHermesEnvSecretBoundary({
+      envFile: `INTERNAL_API=${rawToken}\n`,
+    });
+
+    expect(result.status).toBe(1);
+    expect(result.stderr).toContain("INTERNAL_API (line 1)");
+    expect(result.stderr).not.toContain(rawToken);
+  });


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace the hard-coded GUID-like test secrets.

These new fixture values are already being flagged by Betterleaks. The boundary checks here reject any raw non-placeholder value for those keys, so switching to a benign sentinel string—or composing the value at runtime—keeps the test intent without committing scanner bait.

Also applies to: 645-658

🧰 Tools

🪛 Betterleaks (1.3.1)

[high] 597-597: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/hermes-start.test.ts` around lines 596 - 605, The test uses a hard-coded GUID-like secret in the "rejects bare API-named raw values without printing the value" spec—replace that literal with a benign sentinel or construct it at runtime to avoid committing scanner bait; update the value passed to runHermesEnvSecretBoundary (the envFile string used in this test and the similar one at the other location around the 645-658 block) to use a non-sensitive token name (e.g., "SENTINEL_TOKEN" or a runtime concatenation like "token-" + "123") so the test behavior remains the same but no real-looking GUID is stored in the repo.

github-actions · 2026-06-05T10:00:47Z

Selective E2E Results — ❌ Some jobs failed

Run: 27007564674
Target ref: 69e300199bff3cf14a0a3a7d366187d38b829b25
Workflow ref: main
Requested jobs: all (no filter)
Summary: 55 passed, 2 failed, 2 skipped

Job	Result
agent-turn-latency-e2e	✅ success
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-add-remove-e2e	✅ success
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-dashboard-e2e	✅ success
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-onboard-security-posture-e2e	✅ success
hermes-root-entrypoint-smoke-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
issue-3600-gpu-proof-optional-e2e	✅ success
issue-4434-tui-unreachable-inference-e2e	❌ failure
issue-4462-gateway-pinned-approval-characterization-e2e	✅ success
issue-4462-scope-upgrade-approval-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-negative-paths-e2e	❌ failure
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-discord-pairing-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-onboard-security-posture-e2e	✅ success
openclaw-skill-cli-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openclaw-tui-chat-correlation-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
sessions-agents-cli-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	⚠️ cancelled
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success
vm-driver-privileged-exec-routing-e2e	✅ success

Failed jobs: issue-4434-tui-unreachable-inference-e2e, onboard-negative-paths-e2e. Check run artifacts for logs.

github-actions · 2026-06-05T10:03:47Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27008360665
Target ref: bfbe6062db63e93a7195b61985b39f5a5e724703
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
hermes-secret-boundary-e2e	✅ success
hermes-slack-e2e	✅ success

github-actions · 2026-06-05T10:05:55Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27008436572
Target ref: bfbe6062db63e93a7195b61985b39f5a5e724703
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,hermes-onboard-security-posture-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job	Result
hermes-e2e	✅ success
hermes-onboard-security-posture-e2e	✅ success
hermes-root-entrypoint-smoke-e2e	✅ success
hermes-slack-e2e	✅ success

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-script-workflow.test.ts`:
- Around line 65-69: The code incorrectly references a non-existent property
runnerWorkflow.true; replace the fallback expression with a safe
optional-chaining access on runnerWorkflow itself (e.g. compute callInputs from
runnerWorkflow?.on?.workflow_call?.inputs) so the value becomes: const
callInputs = runnerWorkflow?.on?.workflow_call?.inputs ?? {}; update the
occurrence in test/e2e-script-workflow.test.ts where runnerWorkflow and
callInputs are defined.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ac05cf9e-d111-4e28-a064-5953a9ee3bb0

📥 Commits

Reviewing files that changed from the base of the PR and between bfbe606 and da27088.

📒 Files selected for processing (4)

.github/workflows/e2e-script.yaml
.github/workflows/nightly-e2e.yaml
test/e2e-script-workflow.test.ts
test/validate-e2e-coverage.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

.github/workflows/nightly-e2e.yaml

coderabbitai · 2026-06-05T10:45:44Z

+  it("requires an explicit opt-in before exposing live messaging secrets to scripts", () => {
+    const callInputs =
+      runnerWorkflow.on?.workflow_call?.inputs ??
+      runnerWorkflow.true?.workflow_call?.inputs ??
+      {};


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix the type error: runnerWorkflow.true is not a valid property.

Line 68 attempts to access runnerWorkflow.true?.workflow_call?.inputs, but true is not a valid property name. This appears to be a typo.

🐛 Proposed fix

-    const callInputs =
-      runnerWorkflow.on?.workflow_call?.inputs ??
-      runnerWorkflow.true?.workflow_call?.inputs ??
-      {};
+    const callInputs = runnerWorkflow.on?.workflow_call?.inputs ?? {};

🧰 Tools

🪛 GitHub Check: checks

[failure] 68-68:
Property 'true' does not exist on type 'RunnerWorkflow'.

[failure] 67-67:
Property 'on' does not exist on type 'RunnerWorkflow'.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-script-workflow.test.ts` around lines 65 - 69, The code incorrectly references a non-existent property runnerWorkflow.true; replace the fallback expression with a safe optional-chaining access on runnerWorkflow itself (e.g. compute callInputs from runnerWorkflow?.on?.workflow_call?.inputs) so the value becomes: const callInputs = runnerWorkflow?.on?.workflow_call?.inputs ?? {}; update the occurrence in test/e2e-script-workflow.test.ts where runnerWorkflow and callInputs are defined.
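The check output above reports that neither `on` nor `true` exists on `RunnerWorkflow`, so the type itself is part of the problem. The comment treats `runnerWorkflow.true` as a typo. Another possible reading, offered here only as an assumption, is that the fallback was meant to cover YAML 1.1 loaders, which resolve a bare `on` key to boolean `true`. The sketch below types both keys; `RunnerWorkflowSketch` and `workflowCallInputs` are hypothetical names, not the repository's helper.

```typescript
// Hypothetical typing sketch; the repo's RunnerWorkflow type is not shown in this thread.
type WorkflowInput = { type?: string; default?: unknown; required?: boolean };
type WorkflowTriggers = {
  workflow_call?: { inputs?: Record<string, WorkflowInput> };
};

type RunnerWorkflowSketch = {
  on?: WorkflowTriggers;
  // Assumption: a YAML 1.1 loader may emit the `on` key as boolean true,
  // which becomes the string key "true" on the parsed object.
  true?: WorkflowTriggers;
};

function workflowCallInputs(wf: RunnerWorkflowSketch): Record<string, WorkflowInput> {
  return wf.on?.workflow_call?.inputs ?? wf.true?.workflow_call?.inputs ?? {};
}
```

If the parser in use follows YAML 1.2 semantics, the `true` key never appears and the single-expression fix proposed above is sufficient once `on` is declared on the type.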

github-actions · 2026-06-05T10:45:56Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27010209868
Target ref: da2708813bf3c8ac84aeb06894de0f6bc65f92c4
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
hermes-secret-boundary-e2e	✅ success
hermes-slack-e2e	✅ success

github-actions · 2026-06-05T10:50:55Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27010271846
Target ref: da2708813bf3c8ac84aeb06894de0f6bc65f92c4
Workflow ref: main
Requested jobs: hermes-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
hermes-e2e	✅ success
hermes-slack-e2e	✅ success
messaging-providers-e2e	⚠️ cancelled

github-actions · 2026-06-05T10:52:43Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27010496448
Target ref: 2c05e1a3d67a863b7b0496c25a0d185785770efa
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
hermes-secret-boundary-e2e	✅ success
hermes-slack-e2e	✅ success

github-actions · 2026-06-05T11:06:59Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27010609871
Target ref: 2c05e1a3d67a863b7b0496c25a0d185785770efa
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job	Result
hermes-e2e	✅ success
hermes-root-entrypoint-smoke-e2e	✅ success
hermes-slack-e2e	✅ success
messaging-providers-e2e	✅ success

cv · 2026-06-05T16:27:25Z

@ericksoa why is this tagged v0.0.61?

github-actions · 2026-06-05T16:37:48Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27027032111
Target ref: 7e5757e5f93507987fe01dcdd5c5dbd9f67a80b1
Workflow ref: main
Requested jobs: hermes-e2e,hermes-slack-e2e,hermes-discord-e2e,hermes-root-entrypoint-smoke-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job	Result
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-root-entrypoint-smoke-e2e	✅ success
hermes-slack-e2e	✅ success

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 5518-5521: validatePolicyTierEnvEarly() is being run for all
onboarding paths which causes an invalid NEMOCLAW_POLICY_TIER env to abort
interactive runs even though selectPolicyTier() only uses that env in
non-interactive mode; restrict the early validation so it only runs when
isNonInteractive() is true. Concretely, modify the onboarding flow to call
policyTierEnv.resolvePolicyTierFromEnv() and validatePolicyTierEnvEarly() (or
perform the validation) inside the same non-interactive branch that contains
selectPolicyTier() (or wrap the existing validatePolicyTierEnvEarly() invocation
with an isNonInteractive() guard), referencing selectPolicyTier(),
validatePolicyTierEnvEarly(), isNonInteractive(), and
policyTierEnv.resolvePolicyTierFromEnv() to locate and change the logic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b006206-c80c-47e4-b7b3-f3f08a6d64bf

📥 Commits

Reviewing files that changed from the base of the PR and between da27088 and 0bdfa21.

📒 Files selected for processing (4)

.github/workflows/nightly-e2e.yaml
src/lib/onboard.ts
test/helpers/e2e-workflow-contract.ts
test/onboard.test.ts

✅ Files skipped from review due to trivial changes (1)

test/helpers/e2e-workflow-contract.ts

🚧 Files skipped from review as they are similar to previous changes (2)

test/onboard.test.ts
.github/workflows/nightly-e2e.yaml

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

🛑 Comments failed to post (1)

src/lib/onboard.ts (1)
5518-5521: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Scope policy-tier env validation to non-interactive onboarding.

selectPolicyTier() only consumes NEMOCLAW_POLICY_TIER in non-interactive mode, but validatePolicyTierEnvEarly() now runs for every onboard path. A stale invalid export will now abort an interactive onboarding run before the user ever reaches the tier prompt, even though the interactive path ignores that env var.
Suggested fix
-  policyTierEnv.validatePolicyTierEnvEarly();
+  if (isNonInteractive()) {
+    policyTierEnv.validatePolicyTierEnvEarly();
+  }
Also applies to: 6137-6139
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 5518 - 5521, validatePolicyTierEnvEarly() is
being run for all onboarding paths which causes an invalid NEMOCLAW_POLICY_TIER
env to abort interactive runs even though selectPolicyTier() only uses that env
in non-interactive mode; restrict the early validation so it only runs when
isNonInteractive() is true. Concretely, modify the onboarding flow to call
policyTierEnv.resolvePolicyTierFromEnv() and validatePolicyTierEnvEarly() (or
perform the validation) inside the same non-interactive branch that contains
selectPolicyTier() (or wrap the existing validatePolicyTierEnvEarly() invocation
with an isNonInteractive() guard), referencing selectPolicyTier(),
validatePolicyTierEnvEarly(), isNonInteractive(), and
policyTierEnv.resolvePolicyTierFromEnv() to locate and change the logic.
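The behavior the comment asks for can be sketched as a pure function. Everything here is illustrative: `validatePolicyTierEnvSketch` is a hypothetical stand-in for `validatePolicyTierEnvEarly()`, the `nonInteractive` flag stands in for `isNonInteractive()`, and the tier names in `KNOWN_TIERS` are placeholders, not NemoClaw's real tiers.

```typescript
// Placeholder tier names for illustration only; not NemoClaw's actual policy tiers.
const KNOWN_TIERS = new Set(["tier-a", "tier-b"]);

// Returns an error message when validation should abort, or null to continue.
function validatePolicyTierEnvSketch(
  env: Record<string, string | undefined>,
  nonInteractive: boolean,
): string | null {
  // The interactive path ignores the env var, so a stale value must not abort it.
  if (!nonInteractive) return null;

  const raw = env["NEMOCLAW_POLICY_TIER"];
  if (raw === undefined || raw === "") return null; // nothing to validate

  return KNOWN_TIERS.has(raw) ? null : `invalid NEMOCLAW_POLICY_TIER: ${raw}`;
}
```

With this shape, an invalid export fails fast only in non-interactive runs, and interactive onboarding still reaches the tier prompt.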

github-actions · 2026-06-05T18:23:55Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27031834188
Target ref: 0bdfa21eb62a7a4c3dc667c5434ed09f0f06525e
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,hermes-discord-e2e,messaging-providers-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job	Result
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-root-entrypoint-smoke-e2e	✅ success
hermes-slack-e2e	✅ success
messaging-providers-e2e	✅ success

github-actions · 2026-06-05T19:47:58Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27035850282
Target ref: 5800d3783caaaed481f5660d1e061a63ba987202
Workflow ref: main
Requested jobs: hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job	Result
hermes-e2e	✅ success
hermes-onboard-security-posture-e2e	✅ success
hermes-slack-e2e	✅ success
messaging-providers-e2e	✅ success

## Summary

- Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the dev announcement from discussion #4877.
- Fills the source-doc gaps found during release-prep review across inference, policy tiers, command behavior, security boundaries, Hermes dashboard/tooling, runtime context, and troubleshooting.
- Refreshes generated agent skills under `.agents/skills/` from the current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`.

## Source summary

- #4037 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents system-only runtime context that stays out of visible chat.
- #4875 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents try-first sandbox network/filesystem guidance and clearer failure classification.
- #4788 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents shared OpenClaw device-approval policy for startup and connect.
- #4768 -> `docs/reference/network-policies.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`: Documents `weather`, `public-reference`, and Hermes managed-tool gateway preset behavior.
- #3788 and #4864 -> `docs/reference/network-policies.mdx`, `docs/reference/commands.mdx`: Documents non-interactive policy-tier fail-fast behavior and interactive prompt fallback.
- #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware default sandbox resolution for `list`, `status`, and `tunnel` commands.
- #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel status` behavior.
- #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy preset descriptions in `policy-list`.
- #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents package-managed OpenShell gateway service and Docker-driver gateway-marker behavior.
- #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent gateway/dashboard cleanup isolation by sandbox name and port.
- #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU patch rollback behavior.
- #4610 -> `docs/reference/troubleshooting.mdx`, `docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission guidance aligned and removes skipped experimental wording.
- #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling for custom `onboard --from <Dockerfile>` contexts in generated skills.
- #4870 -> `docs/reference/commands.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`: Documents `NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage.
- #4641 -> `docs/inference/inference-options.mdx`, `docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM platform-digest pulls and served-model id adoption.
- #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash coverage.
- #4852 -> `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`: Documents Ollama model fit filtering, 16K context floor, cold-load retry, and failed-model exclusion.
- #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents API-family sync, Hermes `api_mode`, and Bedrock Runtime exception.
- #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents Nemotron managed-inference native tool-search fallback.
- #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents interactive multimodal input prompting.
- #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass normalization in generated troubleshooting coverage.
- #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents prebuilt Hermes dashboard assets and TUI recovery without runtime rebuilds.
- #4854 -> `docs/inference/switch-inference-providers.mdx`, `docs/reference/commands.mdx`: Documents Hermes proxy API-key placeholder preservation during inference switches.
- #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`, `.agents/skills/`: Keeps messaging enrollment behavior aligned with manifest-hook implementation.
- #4771 -> `docs/security/best-practices.mdx`, `docs/security/credential-storage.mdx`: Documents Hermes placeholder-only secret boundary for sandbox-visible runtime files.
- #4787 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents expanded memory scanner examples for OpenAI project keys and Slack app-level tokens.
- #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill install mirroring into the agent home directory.
- #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep structure and generated `.agents/skills/` refresh as the template for this release.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run`
- `npm run docs`
- `git diff --check`
- skip-term scan across `docs/`, `.agents/skills/`, and `skills/`
- `npm run build:cli`
- `npm run typecheck:cli`
- Commit and pre-push hook suites, including markdownlint, gitleaks, env-var docs gate, docs-to-skills verification, and skills YAML tests

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * DeepSeek-V4-Flash now available as default inference model for DGX Station.
  * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection.
  * Added weather and public-reference policy presets for expanded agent capabilities.
  * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts.
* **Bug Fixes**
  * Improved policy tier validation to prevent invalid configurations.
  * Better sandbox cleanup scoping by port to prevent conflicts across deployments.
  * Added GPU patch failure recovery with automatic rollback.
* **Documentation**
  * Expanded troubleshooting guides for inference, security, and sandbox lifecycle.
  * Added .dockerignore best practices for custom deployments.

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>

fix(hermes): keep remote secrets out of sandbox surfaces

1c1a8c9

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

cv and others added 3 commits June 4, 2026 12:26

Merge branch 'main' into fix/hermes-remote-secret-boundary

c281aec

Merge remote-tracking branch 'origin/main' into fix/hermes-remote-secret-boundary

436aa38

Conflicts: agents/hermes/config/hermes-config.ts

fix(hermes): preserve tools across secret boundary

b14e3bb

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

fix(hermes): reject runtime api secrets

69e3001

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
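
The PR summary describes the check behind this commit: startup fails when the environment holds raw values under secret-shaped names, while resolver placeholders, the `OPENCLAW_GATEWAY_TOKEN` exception, and a few known non-secret config names are let through. The following is a minimal sketch of that idea, not the code in this PR. The function names, the suffix-only name matching, and the `openshell:resolve:env:` prefix test are assumptions for illustration, and the Slack SDK placeholder aliases are left out because their format is not shown on this page.

```typescript
// Hypothetical sketch: names and structure are illustrative, not NemoClaw's actual code.

// Secret-shaped name suffixes listed in the PR summary.
const SECRET_SUFFIXES = ["_TOKEN", "_KEY", "_SECRET", "_API", "_PASSWORD", "_CREDENTIAL"];

// Config names the summary calls out as known non-secret.
const NON_SECRET_NAMES = new Set([
  "API_SERVER_PORT",
  "API_SERVER_HOST",
  "NEMOCLAW_INFERENCE_API",
  "NEMOCLAW_PROVIDER_KEY",
]);

// The only raw startup-env exception named in the summary.
const RAW_EXCEPTIONS = new Set(["OPENCLAW_GATEWAY_TOKEN"]);

// Assumed placeholder shape, based on the "openshell:resolve:env:*" example.
const RESOLVER_PREFIX = "openshell:resolve:env:";

function isSecretShaped(name: string): boolean {
  return SECRET_SUFFIXES.some((suffix) => name.endsWith(suffix));
}

// Returns the sorted names of variables that carry a raw value under a secret-shaped name.
function findRawSecrets(env: Record<string, string | undefined>): string[] {
  const offenders: string[] = [];
  for (const [name, value] of Object.entries(env)) {
    if (!value) continue; // unset or empty values carry no secret
    if (!isSecretShaped(name)) continue;
    if (NON_SECRET_NAMES.has(name) || RAW_EXCEPTIONS.has(name)) continue;
    if (value.startsWith(RESOLVER_PREFIX)) continue; // placeholder, not a raw value
    offenders.push(name);
  }
  return offenders.sort();
}

// Throws so that a caller can abort startup, mirroring the fail-closed behavior described.
function assertNoRawSecrets(env: Record<string, string | undefined>): void {
  const offenders = findRawSecrets(env);
  if (offenders.length > 0) {
    throw new Error(`raw secret-shaped env values: ${offenders.join(", ")}`);
  }
}
```

Under these assumptions, `findRawSecrets({ SLACK_BOT_TOKEN: "raw-value", OPENAI_API_KEY: "openshell:resolve:env:OPENAI_API_KEY", OPENCLAW_GATEWAY_TOKEN: "raw-value" })` returns `["SLACK_BOT_TOKEN"]`. The error message reports only variable names, so the failure itself does not echo a secret value.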

ericksoa marked this pull request as ready for review June 5, 2026 09:40

ericksoa requested a review from cv June 5, 2026 09:40

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

test(hermes): add nightly secret boundary coverage

bfbe606

ci(e2e): gate live messaging secrets

da27088

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

test(e2e): type workflow-call inputs

2c05e1a

Merge branch 'main' into fix/hermes-remote-secret-boundary

7e5757e

cv added v0.0.60 Release target and removed v0.0.61 Release target labels Jun 5, 2026

Merge branch 'main' into fix/hermes-remote-secret-boundary

0bdfa21

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

Merge branch 'main' into fix/hermes-remote-secret-boundary

5800d37

cv approved these changes Jun 5, 2026

View reviewed changes

cv enabled auto-merge (squash) June 5, 2026 19:29

cv merged commit e0aa9e3 into main Jun 5, 2026
29 checks passed

cv deleted the fix/hermes-remote-secret-boundary branch June 5, 2026 19:36

miyoungc mentioned this pull request Jun 6, 2026

docs: refresh v0.0.60 release notes #4879

Merged

Conversation

ericksoa commented Jun 4, 2026 • edited by coderabbitai Bot

Summary

Regression story

Tests

Summary by CodeRabbit

copy-pr-bot Bot commented Jun 4, 2026

coderabbitai Bot commented Jun 4, 2026 • edited

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

github-actions Bot commented Jun 4, 2026 • edited

E2E Advisor Recommendation

E2E Recommendation Advisor

Required E2E

Optional E2E

New E2E recommendations

Dispatch hint

github-actions Bot commented Jun 4, 2026 • edited

E2E Scenario Advisor Recommendation

E2E Scenario Advisor

Required scenario E2E

Optional scenario E2E

Relevant changed files

github-actions Bot commented Jun 4, 2026 • edited

PR Review Advisor

🛠️ Needs attention

🔎 Worth checking

🌱 Nice ideas

coderabbitai Bot left a comment

coderabbitai Bot Jun 5, 2026

coderabbitai Bot Jun 5, 2026

coderabbitai Bot Jun 5, 2026

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

coderabbitai Bot left a comment

coderabbitai Bot Jun 5, 2026

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented Jun 5, 2026
