fix(hermes): keep remote secrets out of sandbox surfaces #4771

Merged
cv merged 11 commits into main from fix/hermes-remote-secret-boundary
Jun 5, 2026

Conversation

@ericksoa
Contributor

@ericksoa ericksoa commented Jun 4, 2026

Summary

  • preserve the pre-existing Hermes remote toolset surface for the API server and enabled messaging platforms, including terminal, file, code_execution, memory, session_search, delegation, and cronjob
  • keep platform_toolsets.cli unpinned and avoid no_mcp, so the fix does not disable default Hermes/MCP capability as the security control
  • fail Hermes startup when /sandbox/.hermes/.env or the startup process environment contains raw secret-shaped values (*_TOKEN, *_KEY, *_SECRET, *_API, *_PASSWORD, *_CREDENTIAL), while allowing OpenShell resolver placeholders and Slack SDK placeholder aliases
  • keep the only raw startup-env exception scoped to OPENCLAW_GATEWAY_TOKEN; known non-secret config names such as API_SERVER_PORT, API_SERVER_HOST, NEMOCLAW_INFERENCE_API, and NEMOCLAW_PROVIDER_KEY are explicitly allowlisted
  • move Hermes managed-tool gateway auth off raw TOOL_GATEWAY_USER_TOKEN sandbox env; Hermes now sends the attached OpenShell provider placeholder for NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN, and the host broker accepts the OpenShell-rewritten refresh credential while keeping raw OAuth state host-side
  • add a named Hermes sandbox secret-boundary smoke in the sandbox image workflow, plus focused generator/startup/plugin/broker/onboarding regression tests
  • merge current main and preserve the new Hermes api_mode routing added there
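The name and value rules in the summary above can be sketched in a few lines of Python. This is a minimal illustration, not the code in agents/hermes/start.sh: the function names, the suffix-only matching, the treatment of empty values, and the exact placeholder pattern are assumptions, and the Slack SDK placeholder aliases are left out because their format is not given here.

```python
import re

# Suffixes the PR summary lists as secret-shaped.
SECRET_SUFFIXES = ("_TOKEN", "_KEY", "_SECRET", "_API", "_PASSWORD", "_CREDENTIAL")

# Names the PR summary calls out as known non-secret config.
# API_SERVER_PORT and API_SERVER_HOST do not match the suffix rule used in
# this sketch, so the real matcher may be broader (assumption).
ALLOWLISTED_NAMES = frozenset({
    "API_SERVER_PORT",
    "API_SERVER_HOST",
    "NEMOCLAW_INFERENCE_API",
    "NEMOCLAW_PROVIDER_KEY",
})

# The only raw-value exception, and only for the startup process environment.
RAW_STARTUP_ENV_EXCEPTIONS = frozenset({"OPENCLAW_GATEWAY_TOKEN"})

# Assumed placeholder shape: openshell:resolve:env:<NAME>
RESOLVER_PLACEHOLDER = re.compile(r"openshell:resolve:env:[A-Za-z_][A-Za-z0-9_]*")


def is_secret_shaped(name: str) -> bool:
    """True when the variable name ends in one of the listed suffixes."""
    return name.upper().endswith(SECRET_SUFFIXES)


def is_violation(name: str, value: str, startup_env: bool = False) -> bool:
    """True when a secret-shaped name carries a raw (non-placeholder) value."""
    if name in ALLOWLISTED_NAMES or not is_secret_shaped(name):
        return False
    if startup_env and name in RAW_STARTUP_ENV_EXCEPTIONS:
        return False
    if not value:
        # Assumption: an empty value carries no secret.
        return False
    return RESOLVER_PLACEHOLDER.fullmatch(value) is None
```

Under these assumptions, `is_violation("DEVTEST_API_TOKEN", "<uuid>")` and `is_violation("INTERNAL_API", "abc")` are both true, while a value of `openshell:resolve:env:SLACK_BOT_TOKEN` passes.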

Regression story

No intentional Hermes tool regression remains in this PR. Slack, Discord, Telegram, WeChat/Weixin, WhatsApp, and the OpenAI-compatible API server keep the remote toolsets they previously had, including terminal, file, code execution, memory, session search, delegation, cron, and default MCP exposure. Managed tool presets still configure their backends; for example nous-code keeps terminal.backend=modal, and nous-audio still adds tts.

The security boundary is enforced by keeping NemoClaw-managed secrets out of sandbox-visible files and startup env. A terminal prompt can still print resolver placeholders such as openshell:resolve:env:* or Slack placeholder aliases, but it should not be able to print NemoClaw/OpenShell-managed raw credential values because those values are not placed in /sandbox/.hermes/.env or passed as TOOL_GATEWAY_USER_TOKEN anymore. Startup also refuses raw secret-shaped process env values, closing the other obvious env/printenv path for terminal-enabled Hermes sandboxes.

Startup rejects generic UUID-shaped values like the reported DEVTEST_API_TOKEN leak and also rejects bare *_API names such as INTERNAL_API. The CD smoke asserts that the built Hermes image preserves the expected remote toolsets, has no raw secret-shaped .env values, and refuses injected raw .env and startup-env secrets. Existing sandboxes must still be rebuilt or recreated to pick up the new generated config and startup behavior; this PR does not attempt emergency credential replacement.

Boundary note: if code outside NemoClaw/OpenShell deliberately writes an arbitrary raw secret into a writable sandbox file after startup, Hermes can still echo that file. This PR closes the NemoClaw-managed paths implicated by #4770 and adds tests to prevent reintroducing that class through generated Hermes config, startup validation, managed-tool gateway auth, or the sandbox image workflow.

Tests

  • bash -n agents/hermes/start.sh test/e2e/test-hermes-sandbox-secret-boundary.sh
  • npx vitest run test/hermes-start.test.ts test/generate-hermes-config.test.ts --testTimeout 60000
  • npx vitest run test/hermes-plugin-handlers.test.ts test/hermes-tool-gateway-broker.test.ts test/onboard.test.ts --testTimeout 60000
  • npm run checks
  • npm run lint (passes; reports one unrelated existing warning in src/lib/onboard/child-exit-tracker.test.ts)
  • earlier: npm run build:cli
  • earlier: generated Hermes config locally for all messaging channels plus all managed tool gateway presets; confirmed API/Slack include terminal, file, code_execution, memory/session/delegation/cron, tts, and terminal.backend=modal, while .env has no TOOL_GATEWAY_USER_TOKEN and no raw secret-shaped values

Fixes #4770.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Hermes startup enforces a secret boundary (rejects symlinked .env and raw secret-shaped values in file or process env); validated proxy host/port are now forwarded into sandboxes.
  • Bug Fixes

    • Managed-tool broker/gateway auth updated to use a dedicated refresh-token variable and clearer unknown-credential errors to avoid exposing secrets.
  • Tests

    • New e2e and unit tests covering secret-boundary checks, config/toolset generation, and broker/gateway behavior.
  • Chores

    • CI/workflows run secret-boundary E2E job, conditionally supply live messaging secrets, and upload boundary logs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 4, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b006206-c80c-47e4-b7b3-f3f08a6d64bf

📥 Commits

Reviewing files that changed from the base of the PR and between da27088 and 0bdfa21.

📒 Files selected for processing (4)
  • .github/workflows/nightly-e2e.yaml
  • src/lib/onboard.ts
  • test/helpers/e2e-workflow-contract.ts
  • test/onboard.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/helpers/e2e-workflow-contract.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/onboard.test.ts
  • .github/workflows/nightly-e2e.yaml

📝 Walkthrough

Walkthrough

Adds Hermes startup secret-boundary validators, refactors tool-gateway credential flow to use a refresh-token env var, updates remote platform toolset assignment, and adds unit and E2E tests plus CI jobs that verify no raw secret-shaped values appear in sandbox .env or process environment.

Changes

Hermes secret boundary enforcement and credential refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Remote platform toolset configuration (agents/hermes/config/hermes-config.ts, test/generate-hermes-config.test.ts) | REMOTE_PLATFORM_TOOLSETS introduced and MESSAGING_PLATFORM_BY_CHANNEL maps messaging channels to platform keys. Config assigns platform_toolsets.api_server and messaging platform toolsets from the remote list; tests assert generated toolsets equal the remote baseline and scan generated .env for secret-shaped env violations. |
| Startup environment secret boundary validation (agents/hermes/start.sh, test/hermes-start.test.ts, test/e2e/test-hermes-slack-e2e.sh) | Adds validate_hermes_env_secret_boundary() (checks /sandbox/.hermes/.env for symlink and forbidden raw secret-shaped KEY=VALUE entries) and validate_hermes_runtime_env_secret_boundary() (scans process environment). Both run before refresh_hermes_provider_placeholders(); tests verify accept/reject rules and that rejected raw values are not echoed. |
| Tool gateway credential refactoring & onboarding (agents/hermes/host/tool-gateway-broker.ts, agents/hermes/plugin/__init__.py, src/lib/onboard.ts, test/hermes-plugin-handlers.test.ts, test/hermes-tool-gateway-broker.test.ts, test/onboard.test.ts) | Broker adds findCredentialState() to classify presented tokens and derive refresh tokens; handleProxy resolves presentedToken accordingly. Plugin _broker_user_token() now prefers NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN or returns an openshell:resolve:env: placeholder. Onboarding drops per-sandbox hermesToolBrokerToken injection and forwards validated NEMOCLAW_PROXY_* proxy settings to sandbox env. Tests updated to match new token flow and request headers. |
| E2E testing and CI workflow integration (test/e2e/test-hermes-sandbox-secret-boundary.sh, .github/workflows/sandbox-images-and-e2e.yaml, .github/workflows/nightly-e2e.yaml, .github/workflows/e2e-script.yaml, test/e2e-script-workflow.test.ts, test/validate-e2e-coverage.test.ts) | New E2E script runs in-image Python probes to validate sandbox .env and config.yaml, validates managed-tool image fragments, and asserts startup rejects injected secret-shaped .env and process env entries without echoing values. Reusable E2E workflow gains messaging_live_secrets input and conditional secret wiring; workflows run the script and upload failure logs as artifacts; nightly workflow adds hermes-secret-boundary-e2e job and wires it into aggregation jobs. |
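The plugin-side token selection described in the table can be illustrated with a short Python sketch. The function name `broker_user_token_sketch` and the injectable `env` parameter are hypothetical; only the environment variable name and the placeholder prefix come from the summary, and the real `_broker_user_token()` in agents/hermes/plugin/__init__.py may differ.

```python
import os
from typing import Mapping, Optional

REFRESH_TOKEN_ENV = "NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN"
PLACEHOLDER_PREFIX = "openshell:resolve:env:"


def broker_user_token_sketch(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the credential the sandbox presents to the host broker.

    Prefers whatever the refresh-token variable holds (expected to be an
    OpenShell placeholder inside the sandbox) and otherwise falls back to the
    resolver placeholder for that variable, so no raw token is needed here.
    """
    env = os.environ if env is None else env
    value = env.get(REFRESH_TOKEN_ENV, "").strip()
    return value or PLACEHOLDER_PREFIX + REFRESH_TOKEN_ENV
```

Passing the environment as a parameter keeps the sketch testable without mutating `os.environ`.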

Sequence Diagram(s)

sequenceDiagram
  participant Startup as Hermes startup
  participant EnvValidator as validate_hermes_env_secret_boundary
  participant RuntimeValidator as validate_hermes_runtime_env_secret_boundary
  participant PythonScanner as Python secret scanner
  participant Refresh as refresh_hermes_provider_placeholders
  Startup->>EnvValidator: validate /sandbox/.hermes/.env (symlink, secret-shaped entries)
  EnvValidator->>PythonScanner: scan file for credential-like keys/values
  alt violations found
    PythonScanner-->>EnvValidator: report offending keys/lines
    EnvValidator-->>Startup: exit non-zero
  else
    EnvValidator-->>Startup: proceed
  end
  Startup->>RuntimeValidator: validate process environment for secret-shaped keys
  RuntimeValidator->>PythonScanner: scan process env
  alt violations found
    PythonScanner-->>RuntimeValidator: report offending keys
    RuntimeValidator-->>Startup: exit non-zero
  else
    RuntimeValidator-->>Startup: proceed
  end
  Startup->>Refresh: refresh provider placeholders
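The file-validation branch of the diagram can be sketched as follows. This is an illustrative Python version under stated assumptions: `scan_env_file` and `validate_or_exit` are hypothetical names, the caller supplies the `is_raw_secret` predicate, a missing file is treated as clean, and the parsing handles only simple KEY=VALUE lines. The real validator lives in agents/hermes/start.sh.

```python
import os
import sys
from typing import Callable, List


def scan_env_file(path: str, is_raw_secret: Callable[[str, str], bool]) -> List[str]:
    """Return offending key names (never values) from a KEY=VALUE file."""
    if os.path.islink(path):
        # A symlinked .env could point outside the validated location.
        raise ValueError(f"{path} must not be a symlink")
    if not os.path.exists(path):
        return []  # Assumption: a missing file has nothing to leak.
    offenders = []
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            if key.startswith("export "):
                key = key[len("export "):].strip()
            value = value.strip().strip("'\"")
            if is_raw_secret(key, value):
                offenders.append(key)
    return offenders


def validate_or_exit(path: str, is_raw_secret: Callable[[str, str], bool]) -> None:
    """Exit non-zero on any violation, printing key names only."""
    try:
        offenders = scan_env_file(path, is_raw_secret)
    except ValueError as err:
        print(f"secret boundary: {err}", file=sys.stderr)
        raise SystemExit(1)
    if offenders:
        names = ", ".join(sorted(offenders))
        print(f"secret boundary: raw secret-shaped entries: {names}", file=sys.stderr)
        raise SystemExit(1)
```

Reporting only key names mirrors the tested requirement that rejected raw values are not echoed in startup errors.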

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4718: Modifies agents/hermes/config/hermes-config.ts, related to config construction in the same area.
  • NVIDIA/NemoClaw#4703: Related startup-sequence changes in agents/hermes/start.sh that touch validations executed before refresh_hermes_provider_placeholders().

Suggested labels

E2E, area: onboarding, area: integrations

Suggested reviewers

  • cv
  • laitingsheng

Poem

🐰 I hop the sandbox, sniff and bound,
I nudge the secrets safe and sound,
Placeholders stand where raw tokens lay,
Tests watch the gate both night and day,
CI cheers — no secrets run away!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.03% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(hermes): keep remote secrets out of sandbox surfaces' directly addresses the main change: preventing environment-sourced secrets from being exposed to remote surfaces. |
| Linked Issues check | ✅ Passed | The PR comprehensively implements all coding requirements from #4770: secret-boundary validation in .env and process environment, secret-shaped value rejection, allowlisting of non-secrets and resolver placeholders, and managed-tool gateway credential migration. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to issue #4770's objectives: secret-boundary enforcement, credential migration, test coverage, and workflow integration for secret-boundary validation testing. |


@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

E2E Advisor Recommendation

Required E2E: hermes-secret-boundary-e2e, hermes-e2e, hermes-onboard-security-posture-e2e, hermes-slack-e2e, messaging-providers-e2e
Optional E2E: hermes-root-entrypoint-smoke-e2e, hermes-discord-e2e, common-egress-agent-e2e

Dispatch hint: hermes-secret-boundary-e2e,hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e

Auto-dispatched E2E: hermes-e2e, hermes-onboard-security-posture-e2e, hermes-slack-e2e, messaging-providers-e2e via nightly-e2e.yaml at 5800d3783caaaed481f5660d1e061a63ba987202 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • hermes-secret-boundary-e2e (medium): Directly validates the new Hermes sandbox secret-boundary behavior: built image inspection, managed-tool image credential surfaces, remote toolsets, and startup rejection of raw secret-shaped .env/runtime env values.
  • hermes-e2e (high): Exercises real install → onboard --agent hermes → health probe → live inference, covering the changed Hermes config generation, onboarding create path, and startup secret-boundary validation in an actual sandbox.
  • hermes-onboard-security-posture-e2e (high): Required because the PR changes Hermes runtime security posture and credential handling during startup/onboard. This validates a full Hermes onboard with non-root host-user and runtime guard assertions.
  • hermes-slack-e2e (high): Required for the changed Hermes messaging platform_toolsets and Slack credential placeholder/provider path. The PR also touches the Hermes Slack E2E script and CI secret-passing behavior for messaging providers.
  • messaging-providers-e2e (high): Validates the reusable workflow's new messaging_live_secrets gating plus the provider/placeholder/L7-proxy chain for Telegram, Discord, and Slack credentials. This is important because the PR changes whether live messaging secrets are passed to E2E scripts.

Optional E2E

  • hermes-root-entrypoint-smoke-e2e (medium): Useful adjacent coverage for the modified Hermes start.sh root/non-root entrypoint paths, layout repair, gateway-user execution, and PID migration. The required Hermes E2Es cover real startup, but this gives faster image-entrypoint confidence.
  • hermes-discord-e2e (high): Optional confidence for the same messaging platform_toolsets change on another Hermes messaging channel besides Slack.
  • common-egress-agent-e2e (very high): Optional expensive end-to-end agent-flow coverage for Hermes common-egress policy/tool behavior, including Hermes Nous policy routes. Helpful because managed-tool gateway configuration changed, but not strictly merge-blocking unless managed-tool runtime behavior is the PR's primary risk.

New E2E recommendations

  • Hermes managed-tool gateway broker runtime (high): Existing coverage now inspects managed-tool image/config and unit-tests the broker, but there is no full E2E that onboards Hermes with managed-tool gateways and verifies a sandbox tool request reaches a hermetic host broker/upstream while raw refresh tokens remain host-only.
    • Suggested test: Add a hermetic Hermes managed-tool gateway broker E2E using a fake Nous portal/upstream to validate OpenShell resolver placeholder presentation, host-side refresh, token rotation, and no raw OAuth credential in sandbox env/config/logs.
  • Reusable E2E workflow secret gating (medium): The reusable e2e-script workflow now conditionally passes live messaging secrets. Unit/contract tests help, but an end-to-end workflow-level negative check would catch accidental secret exposure in future workflow edits.
    • Suggested test: Add a lightweight workflow-contract E2E or CI smoke that dispatches a benign script twice, with messaging_live_secrets false/true, and asserts messaging secret env vars are absent unless explicitly enabled.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: hermes-secret-boundary-e2e,hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e

@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-hermes, ubuntu-repo-cloud-hermes-slack, ubuntu-repo-cloud-hermes-discord
Optional scenario E2E: ubuntu-repo-cloud-openclaw, wsl-repo-cloud-openclaw, macos-repo-cloud-openclaw

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required scenario E2E

  • ubuntu-repo-cloud-hermes: Core Hermes configuration, startup, plugin, broker, and onboard code changed. This scenario exercises repo-checkout Hermes onboarding, sandbox startup, gateway health, inference, and Hermes-specific health/history checks on the primary Ubuntu runner.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
  • ubuntu-repo-cloud-hermes-slack: Hermes messaging platform toolset configuration and sandbox secret-boundary behavior changed, with Slack-specific legacy tests also changed. This scenario exercises Hermes Slack onboarding plus messaging placeholder/no-secret-leak checks.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack
  • ubuntu-repo-cloud-hermes-discord: Hermes messaging platform mapping and top-level Discord config generation changed. This scenario exercises Hermes Discord onboarding and the messaging provider/placeholder/no-secret-leak path on the primary Ubuntu runner.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord

Optional scenario E2E

  • ubuntu-repo-cloud-openclaw: src/lib/onboard.ts is shared onboarding code. Although the visible diff is primarily Hermes-related, this adjacent OpenClaw baseline can catch unintended regressions in generic sandbox create/onboard behavior.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • wsl-repo-cloud-openclaw: Optional platform-adjacent coverage for shared onboarding changes on WSL. Special-runner scenario, so not required unless maintainers want cross-platform confidence.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=wsl-repo-cloud-openclaw
  • macos-repo-cloud-openclaw: Optional platform-adjacent coverage for shared onboarding/install behavior on macOS. Special-runner scenario and Docker-dependent suites are skipped, so keep optional.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=macos-repo-cloud-openclaw

Relevant changed files

  • agents/hermes/config/hermes-config.ts
  • agents/hermes/host/tool-gateway-broker.ts
  • agents/hermes/plugin/__init__.py
  • agents/hermes/start.sh
  • src/lib/onboard.ts

@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

PR Review Advisor

Findings: 1 needs attention, 4 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 2 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • Gate live messaging secrets for target-ref dispatches (.github/workflows/nightly-e2e.yaml:361): The `messaging-providers-e2e` job still opts into live Telegram, Discord, and Slack secrets while running `test/e2e/test-messaging-providers.sh` from `${{ inputs.target_ref || github.ref }}`. On a `workflow_dispatch` run with a non-empty `target_ref`, the reusable runner checks out that target ref and executes code from it with the live messaging secrets in the environment, so target-ref-controlled E2E code can read or exfiltrate those secrets.
    • Recommendation: Only provide live messaging secrets when the tested script is from a trusted ref. Use a guard equivalent to the Docker Hub credential guard, for example `github.event_name != 'workflow_dispatch' || inputs.target_ref == ''`, or split live-secret validation into a trusted-ref-only job and keep fake-token coverage for target refs. Add a workflow contract test that models `workflow_dispatch` with non-empty `inputs.target_ref` and asserts the live messaging secrets are blank or not requested.
    • Evidence: `messaging-providers-e2e` sets `ref: ${{ inputs.target_ref || github.ref }}` and `messaging_live_secrets: true`; `.github/workflows/e2e-script.yaml` checks out `inputs.ref` into `repo` and sets `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, and `SLACK_APP_TOKEN_REAL` whenever `inputs.messaging_live_secrets` is true. The added test asserts explicit opt-in/default false, but does not assert target-ref withholding.
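The recommended guard can be modeled as a small predicate, which is also the shape a workflow contract test could assert against. This Python sketch is an illustration only: the function name and parameters are hypothetical, and the real guard would be a GitHub Actions expression in the workflow YAML rather than Python.

```python
def should_pass_live_messaging_secrets(event_name: str, target_ref: str, opt_in: bool) -> bool:
    """Model of the suggested guard for live messaging secrets.

    Mirrors the expression from the recommendation:
    github.event_name != 'workflow_dispatch' || inputs.target_ref == ''
    combined with the explicit messaging_live_secrets opt-in.
    """
    trusted_ref = event_name != "workflow_dispatch" or target_ref == ""
    return opt_in and trusted_ref


# Cases a contract test could enumerate: (event, target_ref, opt_in) -> expected
CASES = [
    (("schedule", "", True), True),
    (("workflow_dispatch", "", True), True),
    (("workflow_dispatch", "some-branch", True), False),  # target-ref code is untrusted
    (("schedule", "", False), False),  # no opt-in, no secrets
]

for args, expected in CASES:
    assert should_pass_live_messaging_secrets(*args) is expected
```

The third case is the one the finding says is currently unguarded: a manual dispatch with a non-empty `target_ref` should not receive the live secrets.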

🔎 Worth checking

  • Source-of-truth review needed: agents/hermes/plugin/__init__.py managed-tool gateway monkeypatches: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `_broker_user_token()` returns `openshell:resolve:env:NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN` and `_install_nous_tool_broker_patch()` monkeypatches multiple Hermes modules.
  • Source-of-truth review needed: agents/hermes/host/tool-gateway-broker.ts credential compatibility: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `findCredentialState()` checks broker-token state first and refresh-token state second; `handleProxy()` uses the presented token directly for refresh-token matches and resolves host runtime refresh token for broker-token matches.
  • Hermes chat-output and memory redaction clauses remain out of scope (agents/hermes/start.sh:948): The PR prevents NemoClaw-managed raw secret-shaped values from entering Hermes `.env` or startup process env, which addresses the managed `.env`/startup-env path. It does not implement the linked issue's provenance-based chat redaction or memory persistence behavior, so raw secrets introduced after startup through another writable file/env path could still be echoed or persisted by Hermes.
  • Document removal conditions for Hermes managed-tool compatibility shims (agents/hermes/plugin/__init__.py:221): The PR changes several localized compatibility shims around Hermes managed-tool gateway auth and legacy broker-token handling. The comments explain the invalid upstream state and include regression tests, but the removal condition is still only a broad 'long term' note and the legacy broker-token compatibility path lacks a concrete sunset/version condition.
    • Recommendation: Document the specific upstream Hermes setting/version or NemoClaw compatibility milestone that will allow removing the monkeypatches and legacy `TOOL_GATEWAY_USER_TOKEN`/broker-token path. If legacy support must remain indefinitely, state that explicitly and keep the regression tests tied to that contract.
    • Evidence: `_broker_user_token()` now returns an OpenShell resolver placeholder by default and `_install_nous_tool_broker_patch()` monkeypatches Hermes managed-tool modules. `tool-gateway-broker.ts` accepts both broker-token and raw refresh-token credential states. Tests cover these paths, but comments do not define a concrete removal trigger for the workaround or legacy compatibility behavior.
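The two-step credential lookup described in the evidence above can be illustrated in Python, even though the broker itself is TypeScript. Everything here is a sketch under assumptions: the `CredentialState` record, its fields, and both function names are hypothetical stand-ins for `findCredentialState()` and the refresh-token resolution in `handleProxy()`.

```python
import hmac
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class CredentialState:
    """Hypothetical host-side record for one sandbox."""
    sandbox: str
    broker_token: str   # legacy per-sandbox broker token
    refresh_token: str  # OAuth refresh token kept host-side


def _same(a: str, b: str) -> bool:
    # Constant-time comparison; bytes avoid the ASCII-only limit for str.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def find_credential_state(
    presented: str, states: Iterable[CredentialState]
) -> Optional[Tuple[str, CredentialState]]:
    """Check broker-token matches first, then refresh-token matches."""
    states = list(states)
    for state in states:
        if state.broker_token and _same(presented, state.broker_token):
            return ("broker-token", state)
    for state in states:
        if state.refresh_token and _same(presented, state.refresh_token):
            return ("refresh-token", state)
    return None


def resolve_refresh_token(presented: str, states: Iterable[CredentialState]) -> str:
    """Pick the refresh token to use upstream for a presented credential."""
    match = find_credential_state(presented, states)
    if match is None:
        # The error text never includes the presented value.
        raise PermissionError("unknown credential")
    kind, state = match
    # Refresh-token match: use the presented token directly.
    # Broker-token match: resolve the host-side refresh token.
    return presented if kind == "refresh-token" else state.refresh_token
```

Keeping both match kinds in one lookup is what makes the legacy broker-token path removable later: deleting the first loop leaves the refresh-token flow intact.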

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation:** Workflow target-ref live messaging secret withholding: model `workflow_dispatch` with non-empty `inputs.target_ref` and assert `messaging-providers-e2e` does not expose `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, or `SLACK_APP_TOKEN_REAL`. Unit and smoke coverage is strong for generated Hermes config, startup rejection, plugin patches, and broker behavior, but the changed workflow trusted-code boundary and the issue-level Slack/chat/memory behavior still need targeted behavioral validation.
  • **Runtime validation:** Reusable E2E runner trusted-ref live secret guard: assert `.github/workflows/e2e-script.yaml` or all callers require both explicit live-secret opt-in and a trusted-ref predicate equivalent to the Docker Hub guard.
  • **Runtime validation:** Hermes chat redaction provenance path: simulate an env-derived value reaching terminal/tool output and assert the Slack/chat response contains `<redacted: ENV_VAR_NAME>` rather than the raw value.
  • **Runtime validation:** Hermes memory persistence redaction: simulate terminal/tool output containing an env-derived secret and assert persisted memory/transcript stores only the redacted placeholder.
  • **Runtime validation:** Post-startup raw secret boundary scope: either test and document that arbitrary raw secrets written after startup remain out of scope, or add a runtime guard where Hermes loads or reads env values.
  • **Acceptance clause:** When a Slack user prompts the bot to `print` or `check access` of an env-var-named secret (e.g. `DEVTEST_API_TOKEN`, `API_TOKEN`, `CQA_TOKEN`), the agent runs `echo "${X_TOKEN}"` via the terminal tool and posts the **complete plaintext token value** to the Slack channel. — add test evidence or identify existing coverage. The PR keeps terminal enabled and prevents NemoClaw-managed raw secret-shaped values from being present in `/sandbox/.hermes/.env` or startup process env. This prevents the reported managed startup/config path, but does not add a terminal/chat-layer refusal for arbitrary raw secrets introduced after startup.
  • **Acceptance clause:** The bot boot log declares `Secret redaction: ENABLED (tool output, logs, and chat responses are scrubbed before delivery)` but the redaction layer does not catch generic UUID/GUID-format token values — it apparently only matches known prefixes like `xoxb-` / `sk-`. — add test evidence or identify existing coverage. The new startup guard rejects UUID-like raw values in secret-shaped env names and tests verify the raw value is not printed in startup errors. The Hermes redaction layer itself is not changed.
  • **Acceptance clause:** Two independent prompts from a single user produced the leak in under a minute. — add test evidence or identify existing coverage. No PR diff evidence exercises the Slack prompt path end-to-end; the new tests target generated config, startup env validation, plugin/broker behavior, and image scans.
Since last review details

Current findings:

  • Source-of-truth review needed: agents/hermes/plugin/__init__.py managed-tool gateway monkeypatches: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `_broker_user_token()` returns `openshell:resolve:env:NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN` and `_install_nous_tool_broker_patch()` monkeypatches multiple Hermes modules.
  • Source-of-truth review needed: agents/hermes/host/tool-gateway-broker.ts credential compatibility: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `findCredentialState()` checks broker-token state first and refresh-token state second; `handleProxy()` uses the presented token directly for refresh-token matches and resolves host runtime refresh token for broker-token matches.
  • Gate live messaging secrets for target-ref dispatches (.github/workflows/nightly-e2e.yaml:361): The `messaging-providers-e2e` job still opts into live Telegram, Discord, and Slack secrets while running `test/e2e/test-messaging-providers.sh` from `${{ inputs.target_ref || github.ref }}`. On a `workflow_dispatch` run with a non-empty `target_ref`, the reusable runner checks out that target ref and executes code from it with the live messaging secrets in the environment, so target-ref-controlled E2E code can read or exfiltrate those secrets.
    • Recommendation: Only provide live messaging secrets when the tested script is from a trusted ref. Use a guard equivalent to the Docker Hub credential guard, for example `github.event_name != 'workflow_dispatch' || inputs.target_ref == ''`, or split live-secret validation into a trusted-ref-only job and keep fake-token coverage for target refs. Add a workflow contract test that models `workflow_dispatch` with non-empty `inputs.target_ref` and asserts the live messaging secrets are blank or not requested.
    • Evidence: `messaging-providers-e2e` sets `ref: ${{ inputs.target_ref || github.ref }}` and `messaging_live_secrets: true`; `.github/workflows/e2e-script.yaml` checks out `inputs.ref` into `repo` and sets `TELEGRAM_BOT_TOKEN_REAL`, `DISCORD_BOT_TOKEN_REAL`, `SLACK_BOT_TOKEN_REAL`, and `SLACK_APP_TOKEN_REAL` whenever `inputs.messaging_live_secrets` is true. The added test asserts explicit opt-in/default false, but does not assert target-ref withholding.
  • Hermes chat-output and memory redaction clauses remain out of scope (agents/hermes/start.sh:948): The PR prevents NemoClaw-managed raw secret-shaped values from entering Hermes `.env` or startup process env, which addresses the managed `.env`/startup-env path. It does not implement the linked issue's provenance-based chat redaction or memory persistence behavior, so raw secrets introduced after startup through another writable file/env path could still be echoed or persisted by Hermes.
  • Document removal conditions for Hermes managed-tool compatibility shims (agents/hermes/plugin/__init__.py:221): The PR changes several localized compatibility shims around Hermes managed-tool gateway auth and legacy broker-token handling. The comments explain the invalid upstream state and include regression tests, but the removal condition is still only a broad 'long term' note and the legacy broker-token compatibility path lacks a concrete sunset/version condition.
    • Recommendation: Document the specific upstream Hermes setting/version or NemoClaw compatibility milestone that will allow removing the monkeypatches and legacy `TOOL_GATEWAY_USER_TOKEN`/broker-token path. If legacy support must remain indefinitely, state that explicitly and keep the regression tests tied to that contract.
    • Evidence: `_broker_user_token()` now returns an OpenShell resolver placeholder by default and `_install_nous_tool_broker_patch()` monkeypatches Hermes managed-tool modules. `tool-gateway-broker.ts` accepts both broker-token and raw refresh-token credential states. Tests cover these paths, but comments do not define a concrete removal trigger for the workaround or legacy compatibility behavior.
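
The guard recommended in the first finding above (`github.event_name != 'workflow_dispatch' || inputs.target_ref == ''`) can be modeled in a few lines for a workflow contract test. This is a minimal sketch, not code from the PR: `DispatchContext` and `mayExposeLiveSecrets` are hypothetical names, and the real workflow would express the condition as a GitHub Actions expression rather than TypeScript.

```typescript
// Hypothetical model of the recommended guard expression:
//   github.event_name != 'workflow_dispatch' || inputs.target_ref == ''
interface DispatchContext {
  eventName: string; // stands in for github.event_name
  targetRef: string; // stands in for inputs.target_ref ('' when unset)
}

// Returns true only when the tested script comes from the workflow's own ref.
function mayExposeLiveSecrets(ctx: DispatchContext): boolean {
  return ctx.eventName !== "workflow_dispatch" || ctx.targetRef === "";
}

// A scheduled run and a dispatch without a target ref keep live secrets;
// a dispatch that checks out an arbitrary target ref does not.
const scheduled = mayExposeLiveSecrets({ eventName: "schedule", targetRef: "" });
const plainDispatch = mayExposeLiveSecrets({ eventName: "workflow_dispatch", targetRef: "" });
const targetRefDispatch = mayExposeLiveSecrets({
  eventName: "workflow_dispatch",
  targetRef: "refs/heads/some-branch",
});
```

A contract test along these lines would assert that the third case resolves to `false`, which is the target-ref withholding the evidence notes as missing.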

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

cv and others added 3 commits June 4, 2026 12:26
@ericksoa ericksoa added the v0.0.61, bug-fix, security, area: security, integration: hermes, integration: slack, platform: ubuntu, UAT, and NV QA labels Jun 5, 2026
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa marked this pull request as ready for review June 5, 2026 09:40
@ericksoa ericksoa requested a review from cv June 5, 2026 09:40

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-hermes-sandbox-secret-boundary.sh`:
- Around line 221-224: The test uses GUID-like literal values that trigger
secret scanners; update the E2E payloads used with
assert_startup_rejects_env_entry to use a benign non-secret sentinel (e.g.,
"SENTINEL_VALUE" or a dynamically composed string) instead of the GUID-like
literals so the startup-rejection behavior is still validated without leaking
realistic-looking secrets; make the same replacement for the other occurrence
referenced in the test (the second assert_startup_rejects_env_entry call).

In `@test/generate-hermes-config.test.ts`:
- Around line 183-197: The test "flags bare API-named .env secrets while
allowing API server config" uses a GUID-like literal in rawSecret which triggers
secret scanners; change the fixture to a non-sensitive sentinel (e.g.,
"raw-secret-value") or build the value from harmless fragments so
findRawSecretEnvEntries still sees a non-placeholder raw string; update the
constant referenced as rawSecret in this test so the assertion for
findRawSecretEnvEntries([...]) remains unchanged.

In `@test/hermes-start.test.ts`:
- Around line 596-605: The test uses a hard-coded GUID-like secret in the
"rejects bare API-named raw values without printing the value" spec—replace that
literal with a benign sentinel or construct it at runtime to avoid committing
scanner bait; update the value passed to runHermesEnvSecretBoundary (the envFile
string used in this test and the similar one at the other location around the
645-658 block) to use a non-sensitive token name (e.g., "SENTINEL_TOKEN" or a
runtime concatenation like "token-" + "123") so the test behavior remains the
same but no real-looking GUID is stored in the repo.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 32075eba-4954-4863-815a-545c52db6645

📥 Commits

Reviewing files that changed from the base of the PR and between c281aec and 69e3001.

📒 Files selected for processing (12)
  • .github/workflows/sandbox-images-and-e2e.yaml
  • agents/hermes/config/hermes-config.ts
  • agents/hermes/host/tool-gateway-broker.ts
  • agents/hermes/plugin/__init__.py
  • agents/hermes/start.sh
  • src/lib/onboard.ts
  • test/e2e/test-hermes-sandbox-secret-boundary.sh
  • test/generate-hermes-config.test.ts
  • test/hermes-plugin-handlers.test.ts
  • test/hermes-start.test.ts
  • test/hermes-tool-gateway-broker.test.ts
  • test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • agents/hermes/start.sh
  • agents/hermes/config/hermes-config.ts

Comment on lines +221 to +224
assert_startup_rejects_env_entry \
  "INTERNAL_API=01234567-89ab-cdef-0123-456789abcdef" \
  "INTERNAL_API" \
  "01234567-89ab-cdef-0123-456789abcdef"

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use non-secret sentinels in the E2E payloads.

These newly added GUID-like literals are already being reported by Betterleaks. This smoke test only needs a raw non-placeholder value to prove startup rejection, so swapping in a benign sentinel string or composing the value dynamically avoids secret-scanner noise without weakening the check.

Also applies to: 229-232

🧰 Tools
🪛 Betterleaks (1.3.1)

[high] 222-222: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-hermes-sandbox-secret-boundary.sh` around lines 221 - 224, The
test uses GUID-like literal values that trigger secret scanners; update the E2E
payloads used with assert_startup_rejects_env_entry to use a benign non-secret
sentinel (e.g., "SENTINEL_VALUE" or a dynamically composed string) instead of
the GUID-like literals so the startup-rejection behavior is still validated
without leaking realistic-looking secrets; make the same replacement for the
other occurrence referenced in the test (the second
assert_startup_rejects_env_entry call).

Comment on lines +183 to +197
it("flags bare API-named .env secrets while allowing API server config", () => {
  const rawSecret = "01234567-89ab-cdef-0123-456789abcdef";

  expect(
    findRawSecretEnvEntries(
      [
        "API_SERVER_PORT=18642",
        "API_SERVER_HOST=127.0.0.1",
        `INTERNAL_API=${rawSecret}`,
        "SERVICE_API=openshell:resolve:env:SERVICE_API",
        "",
      ].join("\n"),
    ),
  ).toEqual(["INTERNAL_API line 3"]);
});

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid committing secret-scanner-shaped fixture values.

This GUID-like literal is already being flagged by Betterleaks in this PR. findRawSecretEnvEntries() only cares that the value is a non-placeholder raw string, so a benign sentinel like raw-secret-value or a value assembled from fragments will preserve the coverage without adding scanner noise.

🧰 Tools
🪛 Betterleaks (1.3.1)

[high] 184-184: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/generate-hermes-config.test.ts` around lines 183 - 197, The test "flags
bare API-named .env secrets while allowing API server config" uses a GUID-like
literal in rawSecret which triggers secret scanners; change the fixture to a
non-sensitive sentinel (e.g., "raw-secret-value") or build the value from
harmless fragments so findRawSecretEnvEntries still sees a non-placeholder raw
string; update the constant referenced as rawSecret in this test so the
assertion for findRawSecretEnvEntries([...]) remains unchanged.
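
To illustrate why a benign sentinel preserves the coverage, here is a minimal sketch of a raw-secret scan over `.env` text. It is a hypothetical stand-in, not the PR's `findRawSecretEnvEntries`: the name suffixes, the allowlist, and the `openshell:resolve:env:` placeholder prefix are taken from the PR summary and the quoted test, while the function name, the treatment of empty values, and the omission of the Slack SDK placeholder aliases are assumptions.

```typescript
// Secret-shaped name suffixes and allowlisted config names, as listed in the PR summary.
const SECRET_NAME = /_(TOKEN|KEY|SECRET|API|PASSWORD|CREDENTIAL)$/;
const ALLOWLIST = new Set([
  "API_SERVER_PORT",
  "API_SERVER_HOST",
  "NEMOCLAW_INFERENCE_API",
  "NEMOCLAW_PROVIDER_KEY",
]);
// Placeholder prefix as it appears in the quoted test fixture.
const PLACEHOLDER_PREFIX = "openshell:resolve:env:";

// Hypothetical scan: reports "NAME line N" for secret-shaped names holding raw values.
function findRawSecretEnvEntriesSketch(envText: string): string[] {
  const hits: string[] = [];
  envText.split("\n").forEach((line, index) => {
    const eq = line.indexOf("=");
    if (eq <= 0) return; // blank line or no NAME=value shape
    const name = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (ALLOWLIST.has(name) || !SECRET_NAME.test(name)) return;
    // Assumption: empty values and resolver placeholders are not raw secrets.
    if (value === "" || value.startsWith(PLACEHOLDER_PREFIX)) return;
    hits.push(`${name} line ${index + 1}`);
  });
  return hits;
}

// Sentinel assembled from harmless fragments, so no GUID-like literal is committed.
const rawSecret = ["raw", "secret", "value"].join("-");
const findings = findRawSecretEnvEntriesSketch(
  [
    "API_SERVER_PORT=18642",
    "API_SERVER_HOST=127.0.0.1",
    `INTERNAL_API=${rawSecret}`,
    "SERVICE_API=openshell:resolve:env:SERVICE_API",
    "",
  ].join("\n"),
);
```

Because the scan keys on the variable name and on whether the value is a placeholder, the shape of the raw value does not matter, which is the property the review comment relies on.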

Comment thread test/hermes-start.test.ts
Comment on lines +596 to +605
it("rejects bare API-named raw values without printing the value", () => {
  const rawToken = "01234567-89ab-cdef-0123-456789abcdef";
  const result = runHermesEnvSecretBoundary({
    envFile: `INTERNAL_API=${rawToken}\n`,
  });

  expect(result.status).toBe(1);
  expect(result.stderr).toContain("INTERNAL_API (line 1)");
  expect(result.stderr).not.toContain(rawToken);
});

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace the hard-coded GUID-like test secrets.

These new fixture values are already being flagged by Betterleaks. The boundary checks here reject any raw non-placeholder value for those keys, so switching to a benign sentinel string—or composing the value at runtime—keeps the test intent without committing scanner bait.

Also applies to: 645-658

🧰 Tools
🪛 Betterleaks (1.3.1)

[high] 597-597: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/hermes-start.test.ts` around lines 596 - 605, The test uses a hard-coded
GUID-like secret in the "rejects bare API-named raw values without printing the
value" spec—replace that literal with a benign sentinel or construct it at
runtime to avoid committing scanner bait; update the value passed to
runHermesEnvSecretBoundary (the envFile string used in this test and the similar
one at the other location around the 645-658 block) to use a non-sensitive token
name (e.g., "SENTINEL_TOKEN" or a runtime concatenation like "token-" + "123")
so the test behavior remains the same but no real-looking GUID is stored in the
repo.
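
The property this test protects, reporting the offending name and line but never the value, can be sketched as follows. This is a hypothetical stand-in for the startup check, not the logic in `agents/hermes/start.sh`: `checkEnvFileSketch` and its message wording are assumptions, and only the bare `*_API` name shape is modeled.

```typescript
// Hypothetical result shape, mirroring the status/stderr fields used in the quoted test.
interface BoundaryResult {
  status: number;
  stderr: string;
}

// Resolver placeholder prefix as it appears in the PR's test fixtures.
const RESOLVER_PREFIX = "openshell:resolve:env:";

// Assumed check: fail on raw *_API values, naming only the variable and line number.
function checkEnvFileSketch(envFile: string): BoundaryResult {
  const offenders: string[] = [];
  envFile.split("\n").forEach((line, index) => {
    const match = /^([A-Za-z_][A-Za-z0-9_]*)=(.+)$/.exec(line);
    if (match === null) return;
    const name = match[1];
    const value = match[2];
    if (/_API$/.test(name) && !value.startsWith(RESOLVER_PREFIX)) {
      // The value is deliberately left out of the message.
      offenders.push(`${name} (line ${index + 1})`);
    }
  });
  if (offenders.length === 0) return { status: 0, stderr: "" };
  return { status: 1, stderr: `raw secret-shaped entries: ${offenders.join(", ")}` };
}

// Sentinel composed at runtime, as the review suggests, instead of a GUID-like literal.
const rawToken = "token-" + "123";
const result = checkEnvFileSketch(`INTERNAL_API=${rawToken}\n`);
```

With a composed sentinel the three assertions in the quoted test (exit status, name with line number, absence of the value) would still exercise the same behavior.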

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 27007564674
Target ref: 69e300199bff3cf14a0a3a7d366187d38b829b25
Workflow ref: main
Requested jobs: all (no filter)
Summary: 55 passed, 2 failed, 2 skipped

Job Result
agent-turn-latency-e2e ✅ success
bedrock-runtime-compatible-anthropic-e2e ✅ success
brave-search-e2e ✅ success
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ⚠️ cancelled
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
credential-migration-e2e ✅ success
credential-sanitization-e2e ✅ success
device-auth-health-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
double-onboard-e2e ✅ success
gpu-double-onboard-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-dashboard-e2e ✅ success
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-inference-switch-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success
inference-routing-e2e ✅ success
issue-2478-crash-loop-recovery-e2e ✅ success
issue-3600-gpu-proof-optional-e2e ✅ success
issue-4434-tui-unreachable-inference-e2e ❌ failure
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ✅ success
kimi-inference-compat-e2e ✅ success
launchable-smoke-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
onboard-negative-paths-e2e ❌ failure
onboard-repair-e2e ✅ success
onboard-resume-e2e ✅ success
openclaw-discord-pairing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
openclaw-onboard-security-posture-e2e ✅ success
openclaw-skill-cli-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
rebuild-openclaw-e2e ✅ success
runtime-overrides-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
sessions-agents-cli-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
state-backup-restore-e2e ✅ success
telegram-injection-e2e ✅ success
token-rotation-e2e ⚠️ cancelled
tunnel-lifecycle-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success
vm-driver-privileged-exec-routing-e2e ✅ success

Failed jobs: issue-4434-tui-unreachable-inference-e2e, onboard-negative-paths-e2e. Check run artifacts for logs.

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27008360665
Target ref: bfbe6062db63e93a7195b61985b39f5a5e724703
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27008436572
Target ref: bfbe6062db63e93a7195b61985b39f5a5e724703
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,hermes-onboard-security-posture-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-script-workflow.test.ts`:
- Around line 65-69: The code incorrectly references a non-existent property
runnerWorkflow.true; replace the fallback expression with a safe
optional-chaining access on runnerWorkflow itself (e.g. compute callInputs from
runnerWorkflow?.on?.workflow_call?.inputs) so the value becomes: const
callInputs = runnerWorkflow?.on?.workflow_call?.inputs ?? {}; update the
occurrence in test/e2e-script-workflow.test.ts where runnerWorkflow and
callInputs are defined.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ac05cf9e-d111-4e28-a064-5953a9ee3bb0

📥 Commits

Reviewing files that changed from the base of the PR and between bfbe606 and da27088.

📒 Files selected for processing (4)
  • .github/workflows/e2e-script.yaml
  • .github/workflows/nightly-e2e.yaml
  • test/e2e-script-workflow.test.ts
  • test/validate-e2e-coverage.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/nightly-e2e.yaml

Comment on lines +65 to +69
it("requires an explicit opt-in before exposing live messaging secrets to scripts", () => {
  const callInputs =
    runnerWorkflow.on?.workflow_call?.inputs ??
    runnerWorkflow.true?.workflow_call?.inputs ??
    {};

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix the type error: runnerWorkflow.true is not a valid property.

Line 68 attempts to access runnerWorkflow.true?.workflow_call?.inputs, but true is not a valid property name. This appears to be a typo.

🐛 Proposed fix
-    const callInputs =
-      runnerWorkflow.on?.workflow_call?.inputs ??
-      runnerWorkflow.true?.workflow_call?.inputs ??
-      {};
+    const callInputs = runnerWorkflow.on?.workflow_call?.inputs ?? {};
🧰 Tools
🪛 GitHub Check: checks

[failure] 68-68:
Property 'true' does not exist on type 'RunnerWorkflow'.


[failure] 67-67:
Property 'on' does not exist on type 'RunnerWorkflow'.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-script-workflow.test.ts` around lines 65 - 69, The code incorrectly
references a non-existent property runnerWorkflow.true; replace the fallback
expression with a safe optional-chaining access on runnerWorkflow itself (e.g.
compute callInputs from runnerWorkflow?.on?.workflow_call?.inputs) so the value
becomes: const callInputs = runnerWorkflow?.on?.workflow_call?.inputs ?? {};
update the occurrence in test/e2e-script-workflow.test.ts where runnerWorkflow
and callInputs are defined.
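
Since the check output also reports that `on` does not exist on `RunnerWorkflow`, the single-expression fix only typechecks if the type declares that key. A minimal sketch, assuming a hypothetical `RunnerWorkflowSketch` shape (the real `RunnerWorkflow` type in the test file is not shown here) and using the `messaging_live_secrets` input named in the earlier advisory review:

```typescript
// Hypothetical input and workflow shapes; not the PR's actual RunnerWorkflow type.
interface WorkflowInput {
  type?: string;
  default?: unknown;
  required?: boolean;
}

interface RunnerWorkflowSketch {
  on?: {
    workflow_call?: {
      inputs?: Record<string, WorkflowInput>;
    };
  };
}

// Reads the reusable-workflow inputs, falling back to an empty map.
function readCallInputs(workflow: RunnerWorkflowSketch): Record<string, WorkflowInput> {
  return workflow.on?.workflow_call?.inputs ?? {};
}

// Illustrative parsed workflow with an opt-in input that defaults to false.
const runnerWorkflow: RunnerWorkflowSketch = {
  on: {
    workflow_call: {
      inputs: {
        messaging_live_secrets: { type: "boolean", default: false },
      },
    },
  },
};

const callInputs = readCallInputs(runnerWorkflow);
const emptyInputs = readCallInputs({});
```

Declaring every level optional keeps the optional chain honest: a workflow without `workflow_call` yields an empty map instead of a runtime error.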

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27010209868
Target ref: da2708813bf3c8ac84aeb06894de0f6bc65f92c4
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27010271846
Target ref: da2708813bf3c8ac84aeb06894de0f6bc65f92c4
Workflow ref: main
Requested jobs: hermes-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
hermes-e2e ✅ success
hermes-slack-e2e ✅ success
messaging-providers-e2e ⚠️ cancelled

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27010496448
Target ref: 2c05e1a3d67a863b7b0496c25a0d185785770efa
Workflow ref: fix/hermes-remote-secret-boundary
Requested jobs: hermes-secret-boundary-e2e,hermes-slack-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27010609871
Target ref: 2c05e1a3d67a863b7b0496c25a0d185785770efa
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success
messaging-providers-e2e ✅ success

cv commented Jun 5, 2026

@ericksoa why is this tagged v0.0.61?

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27027032111
Target ref: 7e5757e5f93507987fe01dcdd5c5dbd9f67a80b1
Workflow ref: main
Requested jobs: hermes-e2e,hermes-slack-e2e,hermes-discord-e2e,hermes-root-entrypoint-smoke-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success

@cv cv added the v0.0.60 label and removed the v0.0.61 label Jun 5, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 5518-5521: validatePolicyTierEnvEarly() is being run for all
onboarding paths which causes an invalid NEMOCLAW_POLICY_TIER env to abort
interactive runs even though selectPolicyTier() only uses that env in
non-interactive mode; restrict the early validation so it only runs when
isNonInteractive() is true. Concretely, modify the onboarding flow to call
policyTierEnv.resolvePolicyTierFromEnv() and validatePolicyTierEnvEarly() (or
perform the validation) inside the same non-interactive branch that contains
selectPolicyTier() (or wrap the existing validatePolicyTierEnvEarly() invocation
with an isNonInteractive() guard), referencing selectPolicyTier(),
validatePolicyTierEnvEarly(), isNonInteractive(), and
policyTierEnv.resolvePolicyTierFromEnv() to locate and change the logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b006206-c80c-47e4-b7b3-f3f08a6d64bf

📥 Commits

Reviewing files that changed from the base of the PR and between da27088 and 0bdfa21.

📒 Files selected for processing (4)
  • .github/workflows/nightly-e2e.yaml
  • src/lib/onboard.ts
  • test/helpers/e2e-workflow-contract.ts
  • test/onboard.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/helpers/e2e-workflow-contract.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/onboard.test.ts
  • .github/workflows/nightly-e2e.yaml


@coderabbitai coderabbitai Bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🛑 Comments failed to post (1)
src/lib/onboard.ts (1)

5518-5521: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Scope policy-tier env validation to non-interactive onboarding.

selectPolicyTier() only consumes NEMOCLAW_POLICY_TIER in non-interactive mode, but validatePolicyTierEnvEarly() now runs for every onboard path. A stale invalid export will now abort an interactive onboarding run before the user ever reaches the tier prompt, even though the interactive path ignores that env var.

Suggested fix
-  policyTierEnv.validatePolicyTierEnvEarly();
+  if (isNonInteractive()) {
+    policyTierEnv.validatePolicyTierEnvEarly();
+  }

Also applies to: 6137-6139

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 5518 - 5521, validatePolicyTierEnvEarly() is
being run for all onboarding paths which causes an invalid NEMOCLAW_POLICY_TIER
env to abort interactive runs even though selectPolicyTier() only uses that env
in non-interactive mode; restrict the early validation so it only runs when
isNonInteractive() is true. Concretely, modify the onboarding flow to call
policyTierEnv.resolvePolicyTierFromEnv() and validatePolicyTierEnvEarly() (or
perform the validation) inside the same non-interactive branch that contains
selectPolicyTier() (or wrap the existing validatePolicyTierEnvEarly() invocation
with an isNonInteractive() guard), referencing selectPolicyTier(),
validatePolicyTierEnvEarly(), isNonInteractive(), and
policyTierEnv.resolvePolicyTierFromEnv() to locate and change the logic.
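
The behavior the suggested guard aims for can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `src/lib/onboard.ts`: `validatePolicyTierEnvSketch` is a hypothetical name, the tier names are placeholders, and the real implementation may abort the process rather than return a message.

```typescript
// Placeholder tier names; the real list lives in the NemoClaw source.
const HYPOTHETICAL_TIERS = ["tier-a", "tier-b"];

// Returns an error message when validation applies and fails, otherwise null.
function validatePolicyTierEnvSketch(
  env: Record<string, string | undefined>,
  nonInteractive: boolean,
): string | null {
  // Interactive runs prompt for the tier and ignore the variable, so skip validation.
  if (!nonInteractive) return null;
  const raw = env.NEMOCLAW_POLICY_TIER;
  if (raw === undefined || raw === "") return null;
  return HYPOTHETICAL_TIERS.includes(raw)
    ? null
    : `invalid NEMOCLAW_POLICY_TIER: ${raw}`;
}

// A stale, invalid export only blocks the non-interactive path.
const staleEnv = { NEMOCLAW_POLICY_TIER: "not-a-tier" };
const interactiveOutcome = validatePolicyTierEnvSketch(staleEnv, false);
const nonInteractiveOutcome = validatePolicyTierEnvSketch(staleEnv, true);
const validOutcome = validatePolicyTierEnvSketch({ NEMOCLAW_POLICY_TIER: "tier-a" }, true);
```

Keeping the mode check inside the validator (or in the caller, as the suggested diff does) means the fail-fast behavior stays scoped to the path that actually consumes the variable.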

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27031834188
Target ref: 0bdfa21eb62a7a4c3dc667c5434ed09f0f06525e
Workflow ref: main
Requested jobs: hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-slack-e2e,hermes-discord-e2e,messaging-providers-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job Result
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success
messaging-providers-e2e ✅ success

@cv cv enabled auto-merge (squash) June 5, 2026 19:29
@cv cv merged commit e0aa9e3 into main Jun 5, 2026
29 checks passed
@cv cv deleted the fix/hermes-remote-secret-boundary branch June 5, 2026 19:36

github-actions Bot commented Jun 5, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 27035850282
Target ref: 5800d3783caaaed481f5660d1e061a63ba987202
Workflow ref: main
Requested jobs: hermes-e2e,hermes-onboard-security-posture-e2e,hermes-slack-e2e,messaging-providers-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-slack-e2e ✅ success
messaging-providers-e2e ✅ success

miyoungc added a commit that referenced this pull request Jun 6, 2026
## Summary
- Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the
dev announcement from discussion #4877.
- Fills the source-doc gaps found during release-prep review across
inference, policy tiers, command behavior, security boundaries, Hermes
dashboard/tooling, runtime context, and troubleshooting.
- Refreshes generated agent skills under `.agents/skills/` from the
current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`.

## Source summary
- #4037 -> `docs/reference/architecture.mdx`,
`docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents
system-only runtime context that stays out of visible chat.
- #4875 -> `docs/reference/architecture.mdx`,
`docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents
try-first sandbox network/filesystem guidance and clearer failure
classification.
- #4788 -> `docs/security/best-practices.mdx`,
`docs/about/release-notes.mdx`: Documents shared OpenClaw
device-approval policy for startup and connect.
- #4768 -> `docs/reference/network-policies.mdx`,
`docs/network-policy/integration-policy-examples.mdx`,
`docs/get-started/quickstart.mdx`,
`docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`:
Documents `weather`, `public-reference`, and Hermes managed-tool gateway
preset behavior.
- #3788 and #4864 -> `docs/reference/network-policies.mdx`,
`docs/reference/commands.mdx`: Documents non-interactive policy-tier
fail-fast behavior and interactive prompt fallback.
- #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware
default sandbox resolution for `list`, `status`, and `tunnel` commands.
- #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel
status` behavior.
- #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy
preset descriptions in `policy-list`.
- #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents
package-managed OpenShell gateway service and Docker-driver
gateway-marker behavior.
- #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent
gateway/dashboard cleanup isolation by sandbox name and port.
- #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU
patch rollback behavior.
- #4610 -> `docs/reference/troubleshooting.mdx`,
`docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission
guidance aligned and removes skipped experimental wording.
- #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling
for custom `onboard --from <Dockerfile>` contexts in generated skills.
- #4870 -> `docs/reference/commands.mdx`,
`docs/manage-sandboxes/runtime-controls.mdx`: Documents
`NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage.
- #4641 -> `docs/inference/inference-options.mdx`,
`docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM
platform-digest pulls and served-model id adoption.
- #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents
stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash
coverage.
- #4852 -> `docs/inference/use-local-inference.mdx`,
`docs/reference/troubleshooting.mdx`: Documents Ollama model fit
filtering, 16K context floor, cold-load retry, and failed-model
exclusion.
- #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents
API-family sync, Hermes `api_mode`, and Bedrock Runtime exception.
- #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents
Nemotron managed-inference native tool-search fallback.
- #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents
interactive multimodal input prompting.
- #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass
normalization in generated troubleshooting coverage.
- #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents
prebuilt Hermes dashboard assets and TUI recovery without runtime
rebuilds.
- #4854 -> `docs/inference/switch-inference-providers.mdx`,
`docs/reference/commands.mdx`: Documents Hermes proxy API-key
placeholder preservation during inference switches.
- #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`.agents/skills/`: Keeps messaging enrollment behavior aligned with
manifest-hook implementation.
- #4771 -> `docs/security/best-practices.mdx`,
`docs/security/credential-storage.mdx`: Documents Hermes
placeholder-only secret boundary for sandbox-visible runtime files.
- #4787 -> `docs/security/best-practices.mdx`,
`docs/about/release-notes.mdx`: Documents expanded memory scanner
examples for OpenAI project keys and Slack app-level tokens.
- #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill
install mirroring into the agent home directory.
- #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep
structure and generated `.agents/skills/` refresh as the template for
this release.
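
The placeholder-only secret boundary documented for #4771 can be illustrated with a small startup check. This is a minimal sketch, not the Hermes implementation. The secret-shaped suffixes, the `OPENCLAW_GATEWAY_TOKEN` exception, and the allowlisted config names are taken from the PR summary. The function names and the `openshell:resolve:` placeholder prefix are hypothetical and stand in for whatever format the OpenShell resolver actually uses.

```python
"""Sketch of a placeholder-only secret check for sandbox-visible env vars."""
from typing import List, Mapping

# Suffixes treated as secret-shaped, per the PR summary.
SECRET_SUFFIXES = ("_TOKEN", "_KEY", "_SECRET", "_API", "_PASSWORD", "_CREDENTIAL")

# Known non-secret config names that the PR summary says are allowlisted.
ALLOWLISTED_NAMES = frozenset({
    "API_SERVER_PORT",
    "API_SERVER_HOST",
    "NEMOCLAW_INFERENCE_API",
    "NEMOCLAW_PROVIDER_KEY",
})

# The only raw startup-env exception named in the PR summary.
RAW_EXCEPTIONS = frozenset({"OPENCLAW_GATEWAY_TOKEN"})

# ASSUMPTION: hypothetical placeholder prefix; the real resolver format may differ.
PLACEHOLDER_PREFIX = "openshell:resolve:"


def is_secret_shaped(name: str) -> bool:
    """Return True when the variable name ends with a secret-shaped suffix."""
    return name.upper().endswith(SECRET_SUFFIXES)


def find_raw_secrets(env: Mapping[str, str]) -> List[str]:
    """Return the sorted names of secret-shaped variables holding raw values."""
    offenders = []
    for name, value in env.items():
        if not is_secret_shaped(name):
            continue
        if name in ALLOWLISTED_NAMES or name in RAW_EXCEPTIONS:
            continue
        # Empty values and resolver placeholders carry no raw secret
        # (treating empty values as safe is a choice made for this sketch).
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            continue
        offenders.append(name)
    return sorted(offenders)
```

A caller could refuse to start when `find_raw_secrets(os.environ)` returns a non-empty list, which mirrors the fail-closed behavior the PR describes for `/sandbox/.hermes/.env` and the startup process environment.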

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run`
- `npm run docs`
- `git diff --check`
- skip-term scan across `docs/`, `.agents/skills/`, and `skills/`
- `npm run build:cli`
- `npm run typecheck:cli`
- Commit and pre-push hook suites, including markdownlint, gitleaks,
env-var docs gate, docs-to-skills verification, and skills YAML tests

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * DeepSeek-V4-Flash now available as default inference model for DGX Station.
  * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection.
  * Added weather and public-reference policy presets for expanded agent capabilities.
  * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts.

* **Bug Fixes**
  * Improved policy tier validation to prevent invalid configurations.
  * Better sandbox cleanup scoping by port to prevent conflicts across deployments.
  * Added GPU patch failure recovery with automatic rollback.

* **Documentation**
  * Expanded troubleshooting guides for inference, security, and sandbox lifecycle.
  * Added .dockerignore best practices for custom deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>

Labels

- `area: security`: Security controls, permissions, secrets, or hardening
- `bug-fix`: PR fixes a bug or regression
- `integration: hermes`: Hermes integration behavior
- `integration: slack`: Slack integration or channel behavior
- `NV QA`: Bugs found by the NVIDIA QA Team
- `platform: ubuntu`: Affects Ubuntu Linux environments
- `security`: Potential vulnerability, unsafe behavior, or access risk
- `UAT`: Issues flagged for User Acceptance Testing
- `v0.0.60`: Release target

Development

Successfully merging this pull request may close these issues.

[Ubuntu 22.04][Security] Hermes agent echoes env-var token verbatim to Slack chat on "print $X_TOKEN" prompt
