fix(hermes): keep remote secrets out of sandbox surfaces #4771
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
📝 Walkthrough
Adds Hermes startup secret-boundary validators, refactors tool-gateway credential flow to use a refresh-token env var, updates remote platform toolset assignment, and adds unit and E2E tests plus CI jobs that verify no raw secret-shaped values appear in sandbox .env or process environment.
Changes
Hermes secret boundary enforcement and credential refactoring
Sequence Diagram(s)
sequenceDiagram
participant Startup as Hermes startup
participant EnvValidator as validate_hermes_env_secret_boundary
participant RuntimeValidator as validate_hermes_runtime_env_secret_boundary
participant PythonScanner as Python secret scanner
participant Refresh as refresh_hermes_provider_placeholders
Startup->>EnvValidator: validate /sandbox/.hermes/.env (symlink, secret-shaped entries)
EnvValidator->>PythonScanner: scan file for credential-like keys/values
alt violations found
PythonScanner-->>EnvValidator: report offending keys/lines
EnvValidator-->>Startup: exit non-zero
else
EnvValidator-->>Startup: proceed
end
Startup->>RuntimeValidator: validate process environment for secret-shaped keys
RuntimeValidator->>PythonScanner: scan process env
alt violations found
PythonScanner-->>RuntimeValidator: report offending keys
RuntimeValidator-->>Startup: exit non-zero
else
RuntimeValidator-->>Startup: proceed
end
Startup->>Refresh: refresh provider placeholders
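The validation order in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the function names, the suffix matching, and the placeholder prefix check are assumptions based on the key patterns and allowlisted names listed in the PR summary. The symlink check and the Slack SDK placeholder aliases are left out because their format is not shown in this conversation.

```typescript
// Hypothetical sketch: names and matching rules are assumptions, not the PR's code.
const SECRET_SUFFIXES = ["_TOKEN", "_KEY", "_SECRET", "_API", "_PASSWORD", "_CREDENTIAL"];

// Non-secret config names that the PR summary says are explicitly allowlisted.
const ALLOWED_KEYS = new Set([
  "API_SERVER_PORT",
  "API_SERVER_HOST",
  "NEMOCLAW_INFERENCE_API",
  "NEMOCLAW_PROVIDER_KEY",
]);

// Resolver placeholders are references to a secret, not the raw value.
const PLACEHOLDER_PREFIX = "openshell:resolve:env:";

function isSecretShapedKey(key: string): boolean {
  if (ALLOWED_KEYS.has(key)) return false;
  return SECRET_SUFFIXES.some((suffix) => key.endsWith(suffix));
}

function isRawValue(value: string | undefined): boolean {
  return value !== undefined && value !== "" && !value.startsWith(PLACEHOLDER_PREFIX);
}

// Scan .env text and report "KEY line N" for each offending entry, never the value.
function scanEnvFile(text: string): string[] {
  const violations: string[] = [];
  text.split("\n").forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) return;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) return;
    const key = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim();
    if (isSecretShapedKey(key) && isRawValue(value)) {
      violations.push(`${key} line ${index + 1}`);
    }
  });
  return violations;
}

// Scan a process environment map and report offending key names only.
function scanProcessEnv(env: Record<string, string | undefined>): string[] {
  return Object.keys(env).filter((key) => isSecretShapedKey(key) && isRawValue(env[key]));
}

// Same order as the diagram: file first, then process env; non-zero on any violation.
function startupExitCode(envFileText: string, env: Record<string, string | undefined>): number {
  if (scanEnvFile(envFileText).length > 0) return 1;
  if (scanProcessEnv(env).length > 0) return 1;
  return 0;
}
```

Under these assumed rules, an entry such as `SERVICE_API=openshell:resolve:env:SERVICE_API` passes because its value is a placeholder, while a raw value under `INTERNAL_API` is reported by key and line number only.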
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
PR Review Advisor
Findings: 1 needs attention, 4 worth checking, 0 nice ideas
This is an automated advisory review. A human maintainer must make the final merge decision.
…ret-boundary # Conflicts: # agents/hermes/config/hermes-config.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/test-hermes-sandbox-secret-boundary.sh`:
- Around line 221-224: The test uses GUID-like literal values that trigger
secret scanners; update the E2E payloads used with
assert_startup_rejects_env_entry to use a benign non-secret sentinel (e.g.,
"SENTINEL_VALUE" or a dynamically composed string) instead of the GUID-like
literals so the startup-rejection behavior is still validated without leaking
realistic-looking secrets; make the same replacement for the other occurrence
referenced in the test (the second assert_startup_rejects_env_entry call).
In `@test/generate-hermes-config.test.ts`:
- Around line 183-197: The test "flags bare API-named .env secrets while
allowing API server config" uses a GUID-like literal in rawSecret which triggers
secret scanners; change the fixture to a non-sensitive sentinel (e.g.,
"raw-secret-value") or build the value from harmless fragments so
findRawSecretEnvEntries still sees a non-placeholder raw string; update the
constant referenced as rawSecret in this test so the assertion for
findRawSecretEnvEntries([...]) remains unchanged.
In `@test/hermes-start.test.ts`:
- Around line 596-605: The test uses a hard-coded GUID-like secret in the
"rejects bare API-named raw values without printing the value" spec—replace that
literal with a benign sentinel or construct it at runtime to avoid committing
scanner bait; update the value passed to runHermesEnvSecretBoundary (the envFile
string used in this test and the similar one at the other location around the
645-658 block) to use a non-sensitive token name (e.g., "SENTINEL_TOKEN" or a
runtime concatenation like "token-" + "123") so the test behavior remains the
same but no real-looking GUID is stored in the repo.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 32075eba-4954-4863-815a-545c52db6645
📒 Files selected for processing (12)
- .github/workflows/sandbox-images-and-e2e.yaml
- agents/hermes/config/hermes-config.ts
- agents/hermes/host/tool-gateway-broker.ts
- agents/hermes/plugin/__init__.py
- agents/hermes/start.sh
- src/lib/onboard.ts
- test/e2e/test-hermes-sandbox-secret-boundary.sh
- test/generate-hermes-config.test.ts
- test/hermes-plugin-handlers.test.ts
- test/hermes-start.test.ts
- test/hermes-tool-gateway-broker.test.ts
- test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- agents/hermes/start.sh
- agents/hermes/config/hermes-config.ts
    assert_startup_rejects_env_entry \
      "INTERNAL_API=01234567-89ab-cdef-0123-456789abcdef" \
      "INTERNAL_API" \
      "01234567-89ab-cdef-0123-456789abcdef"
Use non-secret sentinels in the E2E payloads.
These newly added GUID-like literals are already being reported by Betterleaks. This smoke test only needs a raw non-placeholder value to prove startup rejection, so swapping in a benign sentinel string or composing the value dynamically avoids secret-scanner noise without weakening the check.
Also applies to: 229-232
🧰 Tools
🪛 Betterleaks (1.3.1)
[high] 222-222: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
    it("flags bare API-named .env secrets while allowing API server config", () => {
      const rawSecret = "01234567-89ab-cdef-0123-456789abcdef";

      expect(
        findRawSecretEnvEntries(
          [
            "API_SERVER_PORT=18642",
            "API_SERVER_HOST=127.0.0.1",
            `INTERNAL_API=${rawSecret}`,
            "SERVICE_API=openshell:resolve:env:SERVICE_API",
            "",
          ].join("\n"),
        ),
      ).toEqual(["INTERNAL_API line 3"]);
    });
Avoid committing secret-scanner-shaped fixture values.
This GUID-like literal is already being flagged by Betterleaks in this PR. findRawSecretEnvEntries() only cares that the value is a non-placeholder raw string, so a benign sentinel like raw-secret-value or a value assembled from fragments will preserve the coverage without adding scanner noise.
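One way to follow this suggestion while keeping a UUID-shaped value in the test is to assemble the fixture from fragments at runtime, so the full literal never appears in the source. The helper name below is hypothetical, and whether a particular scanner stops flagging the assembled form is an assumption that would need checking against the scanner actually in use.

```typescript
// Hypothetical fixture helper: the full credential-shaped string exists only at runtime.
function makeUuidShapedFixture(): string {
  // Each fragment is harmless on its own; joined they form an 8-4-4-4-12 hex shape.
  return ["01234567", "89ab", "cdef", "0123", "456789abcdef"].join("-");
}

const rawSecret = makeUuidShapedFixture();

// The .env text handed to the scanner under test is built the same way as before.
const envText = [
  "API_SERVER_PORT=18642",
  `INTERNAL_API=${rawSecret}`,
  "",
].join("\n");
```

If the test does not depend on the UUID shape at all, a plain sentinel such as `raw-secret-value` is simpler and serves the same purpose.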
🧰 Tools
🪛 Betterleaks (1.3.1)
[high] 184-184: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
    it("rejects bare API-named raw values without printing the value", () => {
      const rawToken = "01234567-89ab-cdef-0123-456789abcdef";
      const result = runHermesEnvSecretBoundary({
        envFile: `INTERNAL_API=${rawToken}\n`,
      });

      expect(result.status).toBe(1);
      expect(result.stderr).toContain("INTERNAL_API (line 1)");
      expect(result.stderr).not.toContain(rawToken);
    });
Replace the hard-coded GUID-like test secrets.
These new fixture values are already being flagged by Betterleaks. The boundary checks here reject any raw non-placeholder value for those keys, so switching to a benign sentinel string—or composing the value at runtime—keeps the test intent without committing scanner bait.
Also applies to: 645-658
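The property this test checks, that a rejection names the key and line but never echoes the value, can be made structural. In the sketch below, the violation record has no value field at all, so the formatter cannot leak it. The type, function name, and message wording are hypothetical and not taken from `agents/hermes/start.sh`.

```typescript
// Hypothetical violation record: deliberately carries no value field.
interface EnvViolation {
  key: string;
  line: number;
}

// Build the stderr message from key names and line numbers only.
function formatViolations(violations: EnvViolation[]): string {
  const items = violations.map((v) => `${v.key} (line ${v.line})`).join(", ");
  return `refusing to start: raw secret-shaped .env entries: ${items}`;
}

// Usage: the raw value is known to the caller but never reaches the formatter.
const rawToken = ["token", "123"].join("-");
const message = formatViolations([{ key: "INTERNAL_API", line: 1 }]);
```

Keeping the value out of the data structure means a later change to the message text cannot reintroduce the leak by accident.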
🧰 Tools
🪛 Betterleaks (1.3.1)
[high] 597-597: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
Selective E2E Results: ❌ Some jobs failed. Run: 27007564674
Selective E2E Results: ✅ All requested jobs passed. Run: 27008360665
Selective E2E Results: ✅ All requested jobs passed. Run: 27008436572
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e-script-workflow.test.ts`:
- Around line 65-69: The code incorrectly references a non-existent property
runnerWorkflow.true; replace the fallback expression with a safe
optional-chaining access on runnerWorkflow itself (e.g. compute callInputs from
runnerWorkflow?.on?.workflow_call?.inputs) so the value becomes: const
callInputs = runnerWorkflow?.on?.workflow_call?.inputs ?? {}; update the
occurrence in test/e2e-script-workflow.test.ts where runnerWorkflow and
callInputs are defined.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ac05cf9e-d111-4e28-a064-5953a9ee3bb0
📒 Files selected for processing (4)
- .github/workflows/e2e-script.yaml
- .github/workflows/nightly-e2e.yaml
- test/e2e-script-workflow.test.ts
- test/validate-e2e-coverage.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
    it("requires an explicit opt-in before exposing live messaging secrets to scripts", () => {
      const callInputs =
        runnerWorkflow.on?.workflow_call?.inputs ??
        runnerWorkflow.true?.workflow_call?.inputs ??
        {};
Fix the type error: runnerWorkflow.true is not a valid property.
Line 68 attempts to access runnerWorkflow.true?.workflow_call?.inputs, but true is not a property declared on the RunnerWorkflow type. This appears to be a typo.
🐛 Proposed fix
-  const callInputs =
-    runnerWorkflow.on?.workflow_call?.inputs ??
-    runnerWorkflow.true?.workflow_call?.inputs ??
-    {};
+  const callInputs = runnerWorkflow.on?.workflow_call?.inputs ?? {};
🧰 Tools
🪛 GitHub Check: checks
[failure] 68-68:
Property 'true' does not exist on type 'RunnerWorkflow'.
[failure] 67-67:
Property 'on' does not exist on type 'RunnerWorkflow'.
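One possible reason for a `true` fallback, offered here as an assumption rather than something confirmed in this thread: YAML 1.1 treats a bare `on` as a boolean, so some parsers load a workflow's `on:` key as `true`. If that is the case being guarded against, both keys can be read without touching the `RunnerWorkflow` type by going through a loosely typed record. The helper below is a hypothetical sketch that also avoids the two type errors reported by the check.

```typescript
type WorkflowInputs = Record<string, unknown>;

// Narrow an unknown value to a plain record, or return undefined.
function asRecord(value: unknown): Record<string, unknown> | undefined {
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)
    : undefined;
}

// Read workflow_call inputs whether the parser kept the trigger key as "on"
// or coerced it to the string key "true" (assumed YAML 1.1 behavior).
function readCallInputs(workflow: unknown): WorkflowInputs {
  const doc = asRecord(workflow);
  if (!doc) return {};
  for (const triggerKey of ["on", "true"]) {
    const call = asRecord(asRecord(doc[triggerKey])?.["workflow_call"]);
    const inputs = asRecord(call?.["inputs"]);
    if (inputs) return inputs;
  }
  return {};
}
```

If the repository's YAML parser never produces a `true` key, the single-line fix proposed above is sufficient and this helper is unnecessary.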
Selective E2E Results: ✅ All requested jobs passed. Run: 27010209868
Selective E2E Results: ✅ All requested jobs passed. Run: 27010271846
Selective E2E Results: ✅ All requested jobs passed. Run: 27010496448
Selective E2E Results: ✅ All requested jobs passed. Run: 27010609871
@ericksoa why is this tagged v0.0.61?
Selective E2E Results: ✅ All requested jobs passed. Run: 27027032111
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 5518-5521: validatePolicyTierEnvEarly() is being run for all
onboarding paths which causes an invalid NEMOCLAW_POLICY_TIER env to abort
interactive runs even though selectPolicyTier() only uses that env in
non-interactive mode; restrict the early validation so it only runs when
isNonInteractive() is true. Concretely, modify the onboarding flow to call
policyTierEnv.resolvePolicyTierFromEnv() and validatePolicyTierEnvEarly() (or
perform the validation) inside the same non-interactive branch that contains
selectPolicyTier() (or wrap the existing validatePolicyTierEnvEarly() invocation
with an isNonInteractive() guard), referencing selectPolicyTier(),
validatePolicyTierEnvEarly(), isNonInteractive(), and
policyTierEnv.resolvePolicyTierFromEnv() to locate and change the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4b006206-c80c-47e4-b7b3-f3f08a6d64bf
📒 Files selected for processing (4)
- .github/workflows/nightly-e2e.yaml
- src/lib/onboard.ts
- test/helpers/e2e-workflow-contract.ts
- test/onboard.test.ts
✅ Files skipped from review due to trivial changes (1)
- test/helpers/e2e-workflow-contract.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- test/onboard.test.ts
- .github/workflows/nightly-e2e.yaml
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
🛑 Comments failed to post (1)
src/lib/onboard.ts (1)
5518-5521:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Scope policy-tier env validation to non-interactive onboarding.
selectPolicyTier() only consumes NEMOCLAW_POLICY_TIER in non-interactive mode, but validatePolicyTierEnvEarly() now runs for every onboard path. A stale invalid export will now abort an interactive onboarding run before the user ever reaches the tier prompt, even though the interactive path ignores that env var.
Suggested fix
-  policyTierEnv.validatePolicyTierEnvEarly();
+  if (isNonInteractive()) {
+    policyTierEnv.validatePolicyTierEnvEarly();
+  }
Also applies to: 6137-6139
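The behavior this comment asks for can be illustrated in isolation. The sketch below is not the code in `src/lib/onboard.ts`: the tier names, function names, and error message are hypothetical, and the interactive flag is passed in explicitly instead of being read from `isNonInteractive()`.

```typescript
// Hypothetical tier names; the real set is not shown in this conversation.
const KNOWN_TIERS = ["restricted", "balanced", "open"];

// Throw if the env var is set to something that is not a known tier.
function assertValidTier(raw: string | undefined): void {
  if (raw !== undefined && KNOWN_TIERS.indexOf(raw) === -1) {
    throw new Error(`invalid NEMOCLAW_POLICY_TIER: ${raw}`);
  }
}

// Validate early only when the env var will actually be consumed.
function earlyOnboardChecks(
  nonInteractive: boolean,
  env: Record<string, string | undefined>,
): void {
  if (nonInteractive) {
    assertValidTier(env["NEMOCLAW_POLICY_TIER"]);
  }
  // Interactive runs skip the check because the user is prompted for a tier instead.
}
```

With this guard, a stale invalid export fails fast in non-interactive runs and is ignored in interactive ones, which matches how the comment describes selectPolicyTier() consuming the variable.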
Selective E2E Results: ✅ All requested jobs passed. Run: 27031834188
Selective E2E Results: ✅ All requested jobs passed. Run: 27035850282
## Summary
- Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the dev announcement from discussion #4877.
- Fills the source-doc gaps found during release-prep review across inference, policy tiers, command behavior, security boundaries, Hermes dashboard/tooling, runtime context, and troubleshooting.
- Refreshes generated agent skills under `.agents/skills/` from the current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`.

## Source summary
- #4037 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents system-only runtime context that stays out of visible chat.
- #4875 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents try-first sandbox network/filesystem guidance and clearer failure classification.
- #4788 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents shared OpenClaw device-approval policy for startup and connect.
- #4768 -> `docs/reference/network-policies.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`: Documents `weather`, `public-reference`, and Hermes managed-tool gateway preset behavior.
- #3788 and #4864 -> `docs/reference/network-policies.mdx`, `docs/reference/commands.mdx`: Documents non-interactive policy-tier fail-fast behavior and interactive prompt fallback.
- #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware default sandbox resolution for `list`, `status`, and `tunnel` commands.
- #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel status` behavior.
- #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy preset descriptions in `policy-list`.
- #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents package-managed OpenShell gateway service and Docker-driver gateway-marker behavior.
- #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent gateway/dashboard cleanup isolation by sandbox name and port.
- #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU patch rollback behavior.
- #4610 -> `docs/reference/troubleshooting.mdx`, `docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission guidance aligned and removes skipped experimental wording.
- #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling for custom `onboard --from <Dockerfile>` contexts in generated skills.
- #4870 -> `docs/reference/commands.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`: Documents `NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage.
- #4641 -> `docs/inference/inference-options.mdx`, `docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM platform-digest pulls and served-model id adoption.
- #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash coverage.
- #4852 -> `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`: Documents Ollama model fit filtering, 16K context floor, cold-load retry, and failed-model exclusion.
- #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents API-family sync, Hermes `api_mode`, and Bedrock Runtime exception.
- #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents Nemotron managed-inference native tool-search fallback.
- #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents interactive multimodal input prompting.
- #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass normalization in generated troubleshooting coverage.
- #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents prebuilt Hermes dashboard assets and TUI recovery without runtime rebuilds.
- #4854 -> `docs/inference/switch-inference-providers.mdx`, `docs/reference/commands.mdx`: Documents Hermes proxy API-key placeholder preservation during inference switches.
- #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`, `.agents/skills/`: Keeps messaging enrollment behavior aligned with manifest-hook implementation.
- #4771 -> `docs/security/best-practices.mdx`, `docs/security/credential-storage.mdx`: Documents Hermes placeholder-only secret boundary for sandbox-visible runtime files.
- #4787 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents expanded memory scanner examples for OpenAI project keys and Slack app-level tokens.
- #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill install mirroring into the agent home directory.
- #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep structure and generated `.agents/skills/` refresh as the template for this release.

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run`
- `npm run docs`
- `git diff --check`
- skip-term scan across `docs/`, `.agents/skills/`, and `skills/`
- `npm run build:cli`
- `npm run typecheck:cli`
- Commit and pre-push hook suites, including markdownlint, gitleaks, env-var docs gate, docs-to-skills verification, and skills YAML tests

## Summary by CodeRabbit

## Release Notes
* **New Features**
  * DeepSeek-V4-Flash now available as default inference model for DGX Station.
  * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection.
  * Added weather and public-reference policy presets for expanded agent capabilities.
  * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts.
* **Bug Fixes**
  * Improved policy tier validation to prevent invalid configurations.
  * Better sandbox cleanup scoping by port to prevent conflicts across deployments.
  * Added GPU patch failure recovery with automatic rollback.
* **Documentation**
  * Expanded troubleshooting guides for inference, security, and sandbox lifecycle.
  * Added .dockerignore best practices for custom deployments.

---------
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Summary
- [...] terminal, file, code_execution, memory, session_search, delegation, and cronjob
- [...] platform_toolsets.cli unpinned and avoid no_mcp, so the fix does not disable default Hermes/MCP capability as the security control
- [...] /sandbox/.hermes/.env or the startup process environment contains raw secret-shaped values (*_TOKEN, *_KEY, *_SECRET, *_API, *_PASSWORD, *_CREDENTIAL), while allowing OpenShell resolver placeholders and Slack SDK placeholder aliases
- [...] OPENCLAW_GATEWAY_TOKEN; known non-secret config names such as API_SERVER_PORT, API_SERVER_HOST, NEMOCLAW_INFERENCE_API, and NEMOCLAW_PROVIDER_KEY are explicitly allowlisted
- [...] TOOL_GATEWAY_USER_TOKEN sandbox env; Hermes now sends the attached OpenShell provider placeholder for NEMOCLAW_HERMES_TOOL_GATEWAY_REFRESH_TOKEN, and the host broker accepts the OpenShell-rewritten refresh credential while keeping raw OAuth state host-side
- [...] main and preserve the new Hermes api_mode routing added there

Regression story
No intentional Hermes tool regression remains in this PR. Slack, Discord, Telegram, WeChat/Weixin, WhatsApp, and the OpenAI-compatible API server keep the remote toolsets they previously had, including terminal, file, code execution, memory, session search, delegation, cron, and default MCP exposure. Managed tool presets still configure their backends; for example nous-code keeps terminal.backend=modal, and nous-audio still adds tts.

The security boundary is enforced by keeping NemoClaw-managed secrets out of sandbox-visible files and startup env. A terminal prompt can still print resolver placeholders such as openshell:resolve:env:* or Slack placeholder aliases, but it should not be able to print NemoClaw/OpenShell-managed raw credential values because those values are not placed in /sandbox/.hermes/.env or passed as TOOL_GATEWAY_USER_TOKEN anymore. Startup also refuses raw secret-shaped process env values, closing the other obvious env/printenv path for terminal-enabled Hermes sandboxes.

Startup rejects generic UUID-shaped values like the reported DEVTEST_API_TOKEN leak and also rejects bare *_API names such as INTERNAL_API. The CD smoke asserts the built Hermes image preserves the expected remote toolsets, has no raw secret-shaped .env values, and refuses injected raw .env and startup-env secrets. Existing sandboxes still need rebuild/recreate to pick up the new generated config/startup behavior; this PR does not try to do emergency credential replacement.

Boundary note: if code outside NemoClaw/OpenShell deliberately writes an arbitrary raw secret into a writable sandbox file after startup, Hermes can still echo that file. This PR closes the NemoClaw-managed paths implicated by #4770 and adds tests to prevent reintroducing that class through generated Hermes config, startup validation, managed-tool gateway auth, or the sandbox image workflow.

Tests
- bash -n agents/hermes/start.sh test/e2e/test-hermes-sandbox-secret-boundary.sh
- npx vitest run test/hermes-start.test.ts test/generate-hermes-config.test.ts --testTimeout 60000
- npx vitest run test/hermes-plugin-handlers.test.ts test/hermes-tool-gateway-broker.test.ts test/onboard.test.ts --testTimeout 60000
- npm run checks
- npm run lint (passes; reports one unrelated existing warning in src/lib/onboard/child-exit-tracker.test.ts)
- npm run build:cli
- [...] terminal, file, code_execution, memory/session/delegation/cron, tts, and terminal.backend=modal, while .env has no TOOL_GATEWAY_USER_TOKEN and no raw secret-shaped values

Fixes #4770.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>