fix(debug): validate sandbox name and clean partial tarballs#4506
Conversation
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
📝 WalkthroughWalkthroughAdds pre-collection sandbox-name resolution/validation (flag → NEMOCLAW_SANDBOX_NAME → NEMOCLAW_SANDBOX → SANDBOX_NAME → default), exits with error on unregistered sandboxes, uses atomic partial tarball writes with cleanup on failure, and extends unit, integration, and e2e tests. ChangesSandbox validation and error handling for debug command
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Auto-dispatched E2E: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 2 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
Selective E2E Results — ✅ All requested jobs passedRun: 26624434917
|
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/lib/diagnostics/debug-command.test.ts (1)
72-88: ⚡ Quick winAssert the exit status in the env-sourced failure test.
This case verifies throw/message, but not that the exit code is
1, which is part of the new behavior contract.Patch suggestion
).toThrow("exit"); + expect(exit).toHaveBeenCalledWith(1); expect(runDebug).not.toHaveBeenCalled();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/diagnostics/debug-command.test.ts` around lines 72 - 88, The test for env-sourced failure should also assert the numeric exit status: after calling runDebugCommandWithOptions (which currently expects to throw "exit"), add an assertion that the mocked exit function was called with 1 (e.g., expect(exit).toHaveBeenCalledWith(1)) to verify the new behavior contract; locate the assertions around runDebugCommandWithOptions, runDebug, errorLines and the exit mock in debug-command.test.ts and add the exit call assertion immediately after the existing toThrow and runDebug checks.test/e2e/test-diagnostics.sh (1)
317-320: ⚡ Quick winUse a per-run unknown sandbox name to reduce e2e flake risk.
A hard-coded “unknown” name can collide in persistent/shared environments. A unique name per run keeps this failure-path assertion deterministic.
Patch suggestion
- local bad_name="nemoclaw-e2e-does-not-exist" + local bad_name="nemoclaw-e2e-unknown-$(date +%s)-$$"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-diagnostics.sh` around lines 317 - 320, Replace the hard-coded sandbox name in the failing test so it is unique per run: instead of setting bad_name="nemoclaw-e2e-does-not-exist" generate a per-run unique identifier (e.g., incorporate PID, timestamp, or mktemp/uuid) and assign it to bad_name before invoking the nemoclaw debug command (the invocation using TIMEOUT_CMD and variables bad_output, bad_rc, bad_log should be unchanged); this prevents collisions in shared/persistent test environments while keeping the failure-path assertion deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/diagnostics/debug-command.test.ts`:
- Around line 72-88: The test for env-sourced failure should also assert the
numeric exit status: after calling runDebugCommandWithOptions (which currently
expects to throw "exit"), add an assertion that the mocked exit function was
called with 1 (e.g., expect(exit).toHaveBeenCalledWith(1)) to verify the new
behavior contract; locate the assertions around runDebugCommandWithOptions,
runDebug, errorLines and the exit mock in debug-command.test.ts and add the exit
call assertion immediately after the existing toThrow and runDebug checks.
In `@test/e2e/test-diagnostics.sh`:
- Around line 317-320: Replace the hard-coded sandbox name in the failing test
so it is unique per run: instead of setting
bad_name="nemoclaw-e2e-does-not-exist" generate a per-run unique identifier
(e.g., incorporate PID, timestamp, or mktemp/uuid) and assign it to bad_name
before invoking the nemoclaw debug command (the invocation using TIMEOUT_CMD and
variables bad_output, bad_rc, bad_log should be unchanged); this prevents
collisions in shared/persistent test environments while keeping the failure-path
assertion deterministic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3ec46952-41f8-4bbc-a7b8-03829f3ba686
📒 Files selected for processing (6)
src/lib/diagnostics/debug-command.test.tssrc/lib/diagnostics/debug-command.tssrc/lib/diagnostics/debug.test.tssrc/lib/diagnostics/debug.tstest/cli.test.tstest/e2e/test-diagnostics.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- src/lib/diagnostics/debug.test.ts
- test/cli.test.ts
- src/lib/diagnostics/debug-command.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26625352549
|
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
…ract Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
|
🌿 Preview your docs: https://nvidia-preview-pr-4506.docs.buildwithfern.com/nemoclaw |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
docs/reference/commands.mdx (2)
1149-1149: ⚡ Quick winBreak into separate lines and use active voice.
This paragraph has three sentences on one line, which makes diffs harder to review. It also uses passive constructions ("is supplied", "is written").
Suggested revision (one sentence per line, active voice):
-When `--sandbox` is supplied explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway. An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and no tarball is written. Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox (and warns if that default is stale). +When you supply `--sandbox` explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway. +An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and NemoClaw writes no tarball. +Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox and warns if that default is stale.As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line." and "Active voice required. Flag passive constructions."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/commands.mdx` at line 1149, Split the paragraph into three separate lines (one sentence per line) and rewrite in active voice: first line state that when the --sandbox flag or one of the env vars (NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, SANDBOX_NAME) is provided the CLI requires the name to match a registered sandbox (note that the flag takes precedence then env vars in that order); second line state that an unknown or stale sandbox name causes the command to exit non-zero, log an actionable error naming the sandbox and the source env var when applicable, and prevent writing any tarball (use active verbs like "exit", "log", "prevent"); third line state that when no explicit name is provided, nemoclaw debug falls back to the registry's default sandbox and warns if that default is stale (use "falls back" and "warns" as active verbs) and ensure each sentence is on its own source line to satisfy the one-sentence-per-line guideline.
1147-1147: ⚡ Quick winUse active voice.
The sentence uses passive constructions ("is written", "is preserved").
Suggested revision:
-The tarball is written to a temporary sibling and renamed on success, so a pre-existing file at `--output` is preserved when `tar` fails. +NemoClaw writes the tarball to a temporary sibling and renames it on success, so `tar` failures preserve any pre-existing file at `--output`.As per coding guidelines: Active voice required. Flag passive constructions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/commands.mdx` at line 1147, Rewrite the passive sentence to active voice by naming the actor and using active verbs: change "The tarball is written to a temporary sibling and renamed on success, so a pre-existing file at `--output` is preserved when `tar` fails." to something like "We (or the command) write the tarball to a temporary sibling and rename it on success, so a pre-existing file at `--output` remains untouched if `tar` fails," ensuring you use `tar` and `--output` exactly as shown.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/reference/commands.mdx`:
- Line 1149: Split the paragraph into three separate lines (one sentence per
line) and rewrite in active voice: first line state that when the --sandbox flag
or one of the env vars (NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, SANDBOX_NAME)
is provided the CLI requires the name to match a registered sandbox (note that
the flag takes precedence then env vars in that order); second line state that
an unknown or stale sandbox name causes the command to exit non-zero, log an
actionable error naming the sandbox and the source env var when applicable, and
prevent writing any tarball (use active verbs like "exit", "log", "prevent");
third line state that when no explicit name is provided, nemoclaw debug falls
back to the registry's default sandbox and warns if that default is stale (use
"falls back" and "warns" as active verbs) and ensure each sentence is on its own
source line to satisfy the one-sentence-per-line guideline.
- Line 1147: Rewrite the passive sentence to active voice by naming the actor
and using active verbs: change "The tarball is written to a temporary sibling
and renamed on success, so a pre-existing file at `--output` is preserved when
`tar` fails." to something like "We (or the command) write the tarball to a
temporary sibling and rename it on success, so a pre-existing file at `--output`
remains untouched if `tar` fails," ensuring you use `tar` and `--output` exactly
as shown.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 2fa46822-2c0a-4937-996e-5cc59128b0fe
📒 Files selected for processing (4)
docs/reference/commands.mdxsrc/lib/diagnostics/debug.tssrc/lib/diagnostics/tarball.tstest/cli.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/cli.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26627617919
|
jyaunches
left a comment
There was a problem hiding this comment.
Approving — regression risk is minimal here.
Why low regression risk:
- Blast radius is confined to
nemoclaw debugandsrc/lib/diagnostics/. No runtime, sandbox lifecycle, gateway, or install paths are touched. - The new live-gateway check degrades safely: if
openshell sandbox listfails, validation falls back to registry-only, so a gateway outage won't breakdebugfor already-registered sandboxes. - Atomic tarball uses same-directory partial +
renameSync, so no cross-filesystemEXDEVsurprise. - The behavior change (unknown
--sandboxnow exits 1) is the intended fix for T6029556; the old silent-success path was already broken. diagnostics-e2eis green on the latest commit, including TC-DIAG-06 covering both registered-success and unknown-name-rejection. Unit tests cover env precedence, flag-over-env, default fallback, exit code, and tarball cleanup.
Follow-up (non-blocking): the PR Review Advisor's note about the predictable ${output}.partial.${process.pid} path is a hardening idea, not a regression risk — worth a follow-up issue to switch to an mkdtemp-style exclusive path and add a regression test for pre-existing-collision and rename-failure cleanup, but it shouldn't gate this fix.
## Summary - Adds the v0.0.56 release notes section with links to the deeper docs pages for installer, status, inference, messaging, policy, and lifecycle changes. - Updates source docs for the remaining release-prep gaps around `uv` in the PyPI preset, compact WhatsApp pairing guidance, and `nemoclaw inference set` command boundaries. - Refreshes generated `nemoclaw-user-*` skills and removes skipped experimental command terms from generated skill surfaces. ## Source summary - #4613 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents that public installs and `nemoclaw update` follow the maintained `lkg` tag by default. - #4419 -> `docs/about/release-notes.mdx`: Notes that non-interactive Linux installs can reactivate Docker group membership and continue in one installer run when `sg docker` is available. - #4550 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures live sandbox agent-version probing for status, connect, and upgrade checks. - #4609 -> `docs/inference/use-local-inference.mdx`, `docs/about/release-notes.mdx`: Captures the GPU Docker-driver host-network local-inference reachability gate. - #4607 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents compact WhatsApp QR pairing guidance and gateway/session diagnostics. - #4582 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Reflects Slack credential validation before enabling the channel. - #4554 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/troubleshooting.mdx`, `docs/about/release-notes.mdx`: Keeps Telegram allowlist alias guidance in the generated user skills and release notes. - #4563 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Includes the new `nemoclaw <name> skill remove <skill>` command in command docs and release notes. - #4566 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents the `nemoclaw inference set` redirect boundary when `--provider` or `--model` is missing. - #4323 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures per-sandbox status JSON support. - #4506 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures debug command sandbox-name validation and safer tarball writing. - #4569 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Documents that the `pypi` preset allows `/usr/local/bin/uv`. - #4579 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Captures observable Jira preset validation guidance. - #4229 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents user-data preservation defaults for uninstall. - #4399 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures CPU-only sandbox intent preservation across rebuilds. - #4058 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures safer snapshot restore behavior around existing destinations. - #4155 and #4460 -> skipped by `docs/.docs-skip`: Removed skipped experimental command terms from source docs and generated skill evals instead of documenting those features. ## Verification - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` - `npm run docs` (passes; Fern reports the pre-existing light-mode accent contrast warning) - `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills` (no matches) - `npm run build:cli` (run to refresh local CLI artifacts for the pre-push TypeScript hook) - Commit hooks passed, including `NEMOCLAW_* env-var documentation gate`, `Verify docs-to-skills output`, `markdownlint-cli2`, `gitleaks`, and `Test (skills YAML)`. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Expanded Model Router setup with YAML examples, flow diagrams, and credential handling; strengthened agent-config immutability and integrity guidance; messaging channels updated (Telegram aliases, WhatsApp pairing/diagnostics); CLI docs revised (GPU detection, inference set behavior, uninstall/rebuild preservation); overview rebranded to NemoClaw and added v0.0.56 release notes. * **New Features** * Added `nemoclaw <name> channels status` (messaging diagnostics, JSON); added `nemoclaw <name> skill remove`; Hermes no longer marked experimental; DGX Spark quickstart sandbox-name note. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
nemoclaw debug --sandbox <unknown> --output FILEsilently collected host diagnostics for a non-existent sandbox and wrote a non-empty (~8 KB) tarball with exit 0. Validation inrunDebugCommandWithOptionsonly ran when--sandboxwas omitted, so explicit names from--sandbox,NEMOCLAW_SANDBOX_NAME,NEMOCLAW_SANDBOX, andSANDBOX_NAMEbypassed both registry and live-gateway checks. Per T6029556 the command must exit non-zero, name the unknown sandbox, and leave no partial tarball.Related Issue
Fixes #4494.
Changes
src/commands/debug.ts: newisSandboxKnown(name)mirroringgetDefaultSandbox— requires the name to live in the local registry, and whenopenshell sandbox listsucceeds it must also appear in the live gateway (rejects stale registry entries)src/lib/diagnostics/debug-command.ts: validate any explicit name (flag or env var) before collection; documented precedence is--sandbox>NEMOCLAW_SANDBOX_NAME>NEMOCLAW_SANDBOX>SANDBOX_NAME; on miss, print an actionable error that names the sandbox and the source env var, then exit 1src/lib/diagnostics/debug.ts: drop the redundant env-var re-read inrunDebugso whitespace-only env values cannot bypass the wrapper's trim/validate, and trim the option namesrc/lib/diagnostics/tarball.ts(new): atomic tarball — write tooutput.partial.<pid>, rename on success, andrmSyncthe partial (not the user's--output) ontarfailure so a pre-existing file is preserveddocs/reference/commands.mdx: document the explicit-name precedence, the registry + live-gateway validation contract, the unknown/stale exit-non-zero behaviour, and the atomic tarball semanticsNEMOCLAW_SANDBOX_NAMEprecedence, flag-over-env, default fallback, exit(1) assertion on env failure, atomic tarball preservation of pre-existing files, partial cleanup on failuretest/cli.test.ts:createDebugCommandTestEnvregisters its env-sourced sandbox plus anyextraSandboxNames, and the fakeopenshell sandbox listnow emits those names so the live check passes; new CLI cases cover unknown explicit name (exits non-zero, no tarball) and stale registry entry (registry-known but missing from the live list → rejected)test/e2e/test-diagnostics.sh:TC-DIAG-06confirms a registered--sandboxsucceeds and an unknown name (per-run unique to avoid shared-env collisions) exits non-zero, names the sandbox, says "not registered", and leaves no archiveType of Change
Verification
npx prek run --all-filespassesnpm testpassesnpm run docsbuilds without warnings (doc changes only)Signed-off-by: Tinson Lai tinsonl@nvidia.com
Summary by CodeRabbit
New Features
Bug Fixes
Tests