fix(debug): validate sandbox name and clean partial tarballs by laitingsheng · Pull Request #4506 · NVIDIA/NemoClaw

laitingsheng · 2026-05-29T07:29:55Z

Summary

nemoclaw debug --sandbox <unknown> --output FILE silently collected host diagnostics for a non-existent sandbox and wrote a non-empty (~8 KB) tarball with exit 0. Validation in runDebugCommandWithOptions only ran when --sandbox was omitted, so explicit names from --sandbox, NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, and SANDBOX_NAME bypassed both registry and live-gateway checks. Per T6029556 the command must exit non-zero, name the unknown sandbox, and leave no partial tarball.

Related Issue

Fixes #4494.

Changes

src/commands/debug.ts: new isSandboxKnown(name) mirroring getDefaultSandbox — requires the name to live in the local registry, and when openshell sandbox list succeeds it must also appear in the live gateway (rejects stale registry entries)
src/lib/diagnostics/debug-command.ts: validate any explicit name (flag or env var) before collection; documented precedence is --sandbox > NEMOCLAW_SANDBOX_NAME > NEMOCLAW_SANDBOX > SANDBOX_NAME; on miss, print an actionable error that names the sandbox and the source env var, then exit 1
src/lib/diagnostics/debug.ts: drop the redundant env-var re-read in runDebug so whitespace-only env values cannot bypass the wrapper's trim/validate, and trim the option name
src/lib/diagnostics/tarball.ts (new): atomic tarball — write to output.partial.<pid>, rename on success, and rmSync the partial (not the user's --output) on tar failure so a pre-existing file is preserved
docs/reference/commands.mdx: document the explicit-name precedence, the registry + live-gateway validation contract, the unknown/stale exit-non-zero behaviour, and the atomic tarball semantics
Unit tests: explicit-name validation (flag + env), source labelling, NEMOCLAW_SANDBOX_NAME precedence, flag-over-env, default fallback, exit(1) assertion on env failure, atomic tarball preservation of pre-existing files, partial cleanup on failure
test/cli.test.ts: createDebugCommandTestEnv registers its env-sourced sandbox plus any extraSandboxNames, and the fake openshell sandbox list now emits those names so the live check passes; new CLI cases cover unknown explicit name (exits non-zero, no tarball) and stale registry entry (registry-known but missing from the live list → rejected)
test/e2e/test-diagnostics.sh: TC-DIAG-06 confirms a registered --sandbox succeeds and an unknown name (per-run unique to avoid shared-env collisions) exits non-zero, names the sandbox, says "not registered", and leaves no archive
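
The registry-plus-gateway rule described for `isSandboxKnown` can be sketched roughly as follows. This is an illustration, not the code in `src/commands/debug.ts`: the `SandboxSources` interface, the `LiveListResult` type, and the two-argument signature are assumptions made so the example is self-contained, and `liveList` merely stands in for a call to `openshell sandbox list`.

```typescript
// Hypothetical shapes: the real registry and gateway clients are not shown in this PR.
type LiveListResult = { ok: true; names: string[] } | { ok: false };

interface SandboxSources {
  registryNames: () => string[]; // names in the local registry
  liveList: () => LiveListResult; // stand-in for `openshell sandbox list`
}

function isSandboxKnown(sandboxName: string, sources: SandboxSources): boolean {
  // The name must always be present in the local registry.
  if (!sources.registryNames().includes(sandboxName)) return false;

  const live = sources.liveList();
  // If the gateway cannot be queried, fall back to the registry-only answer.
  if (!live.ok) return true;

  // A registry entry that the live gateway does not report is stale, so reject it.
  return live.names.includes(sandboxName);
}
```

Under these assumptions, a gateway outage does not block `debug` for registered sandboxes, while a stale registry entry is rejected whenever the live list is available.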

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

Summary by CodeRabbit

New Features
- Debug command: explicit sandbox can come from flag or env (clear precedence); names are validated against both the local registry and live gateway, errors cite the env key or name and exit non-zero.
Bug Fixes
- Tarball creation is atomic (PID-suffixed partial then rename), preserves pre-existing files, cleans partials on failure, sets non-zero exit state, and logs warnings (auto-redaction, attach-to-issue guidance).
Tests
- Expanded unit and E2E coverage for sandbox resolution/validation, error messaging, and tarball cleanup.


coderabbitai · 2026-05-29T07:30:08Z

📝 Walkthrough

Walkthrough

Adds pre-collection sandbox-name resolution/validation (flag → NEMOCLAW_SANDBOX_NAME → NEMOCLAW_SANDBOX → SANDBOX_NAME → default), exits with error on unregistered sandboxes, uses atomic partial tarball writes with cleanup on failure, and extends unit, integration, and e2e tests.
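
The precedence chain in the walkthrough (flag, then `NEMOCLAW_SANDBOX_NAME`, then `NEMOCLAW_SANDBOX`, then `SANDBOX_NAME`, then the default) might look like the sketch below. The PR names a `resolveExplicitName` helper but this page does not show its signature, so the parameter list, the `SandboxEnv` and `ExplicitName` types, and the choice to skip whitespace-only values are assumptions for illustration.

```typescript
type SandboxEnv = Record<string, string | undefined>;

interface ExplicitName {
  name: string;
  source: string; // "--sandbox" or the env key that supplied the name
}

// Env vars are consulted in this order once the flag is absent.
const ENV_PRECEDENCE = ["NEMOCLAW_SANDBOX_NAME", "NEMOCLAW_SANDBOX", "SANDBOX_NAME"] as const;

function resolveExplicitName(flag: string | undefined, env: SandboxEnv): ExplicitName | undefined {
  // The flag wins over every env var. Trimming means "   " counts as not supplied (assumption).
  const fromFlag = flag?.trim();
  if (fromFlag) return { name: fromFlag, source: "--sandbox" };

  for (const key of ENV_PRECEDENCE) {
    const value = env[key]?.trim();
    if (value) return { name: value, source: key };
  }

  // No explicit name: the caller falls back to the default sandbox.
  return undefined;
}
```

Returning the source alongside the name is what would let an error message report which env var supplied an unknown sandbox.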

Changes

Sandbox validation and error handling for debug command

| Layer / File(s) | Summary |
| --- | --- |
| Sandbox validation infra & runtime changes: `src/commands/debug.ts`, `src/lib/diagnostics/debug-command.ts`, `src/lib/diagnostics/debug.ts` | Adds `isSandboxKnown` to command deps, extends `RunDebugCommandDeps` with `env`, `errorLine`, and `exit` injection, adds `resolveExplicitName` (flag vs env precedence), validates explicit names with `isSandboxKnown`, and updates `runDebug` option precedence. |
| Unit tests: sandbox selection & error paths: `src/lib/diagnostics/debug-command.test.ts` | Adds tests for: accepting registered explicit sandbox, rejecting unregistered explicit sandbox (non-zero exit + error text mentioning name and `nemoclaw list`), validating env-sourced names and reporting the env key, env precedence (NEMOCLAW_SANDBOX_NAME over others), flag overriding env, and fallback to `getDefaultSandbox`. |
| Tarball creation, atomic write and cleanup: `src/lib/diagnostics/tarball.ts`, `src/lib/diagnostics/debug.ts`, `src/lib/diagnostics/debug.test.ts` | Introduces `createTarball` that writes to a PID-suffixed partial path, runs `tar`, renames atomically on success; on tar or rename failure it logs, attempts best-effort removal of partial files, sets `process.exitCode = 1`, and returns `false`. Tests verify cleanup and preservation of pre-existing output. |
| Integration and E2E test updates: `test/cli.test.ts`, `test/e2e/test-diagnostics.sh` | Test helper can pre-register sandbox names; CLI tests added/updated to assert `debug --sandbox` accepts registered names, rejects unknown/stale names without producing partial tarballs, and suppresses stale-default warnings when appropriate; E2E adds `TC-DIAG-06` for these cases. |
| Docs: debug command behavior: `docs/reference/commands.mdx` | Documents atomic tarball write strategy and explicit sandbox validation semantics (flag/env precedence, live-gateway check, non-zero exit on invalid/stale names). |
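
The write-to-partial-then-rename strategy summarised in the table can be sketched as below. This is not the PR's `createTarball`: the `writeAtomically` name and the injected `write` callback (standing in for the `tar` invocation) are assumptions that keep the example runnable without `tar`, and the sketch only returns `false` on failure where the summary says the real helper also logs and sets `process.exitCode = 1`.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: `write` stands in for running `tar` against the partial path.
function writeAtomically(output: string, write: (partialPath: string) => void): boolean {
  // Sibling path in the same directory, so the rename never crosses filesystems.
  const partial = `${output}.partial.${process.pid}`;
  try {
    write(partial);
    // Only a fully written archive ever replaces the user's --output path.
    fs.renameSync(partial, output);
    return true;
  } catch {
    // Best-effort cleanup of the partial only; a pre-existing `output` is left untouched.
    try {
      fs.rmSync(partial, { force: true });
    } catch {
      // Ignore cleanup errors: the original failure is what the caller reports.
    }
    return false;
  }
}
```

Because the failure path removes only the partial file, a file that already existed at `--output` before the run is preserved when the writer fails.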

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

NemoClaw CLI, Sandbox

Suggested reviewers

cv
ericksoa

Poem

🐰 I peek at env and flag before I start the run,
If a sandbox's unknown, I stop — no tar is spun,
I write my tar to partial, then rename it right,
If things go sideways, I clean up in the night,
Tests hop in line — diagnostics tidy, done! 🎩

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the primary fixes: sandbox name validation and partial tarball cleanup. |
| Linked Issues check | ✅ Passed | All coding requirements from issue `#4494` are met: sandbox validation exits non-zero, actionable error names the unknown sandbox, and partial tarballs are cleaned up. |
| Out of Scope Changes check | ✅ Passed | All changes directly support sandbox validation and tarball cleanup requirements. Documentation and test harness updates are necessary supporting changes. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


github-actions · 2026-05-29T07:32:00Z

E2E Advisor Recommendation

Required E2E: diagnostics-e2e
Optional E2E: docs-validation-e2e

Dispatch hint: diagnostics-e2e

Auto-dispatched E2E: diagnostics-e2e via nightly-e2e.yaml at 362c353edbd2eed75c356176134adb3a6b9a0782 — nightly run

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

diagnostics-e2e (medium-high; onboards a sandbox and requires Docker plus NVIDIA_API_KEY): Directly covers the changed nemoclaw debug runtime flow, including quick/full diagnostics, tarball creation and sanitization, registered sandbox selection, unknown sandbox rejection, and credentials list/reset after onboarding a real sandbox.

Optional E2E

docs-validation-e2e (medium; installs NemoClaw and runs docs validation without remote link checks): Useful to verify the updated command reference remains in sync with CLI/help documentation and docs validation expectations, but the runtime risk is already covered by diagnostics-e2e.

New E2E recommendations

diagnostics-debug-env-sandbox-selection (medium): Unit tests cover NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, and SANDBOX_NAME precedence and rejection, but the diagnostics E2E only exercises the --sandbox flag path. A future E2E could validate one env-sourced success and one env-sourced unknown-name failure against a real onboarded sandbox.
- Suggested test: Add a TC-DIAG case to test/e2e/test-diagnostics.sh for env-sourced sandbox name precedence and rejection before tarball creation.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: diagnostics-e2e

github-actions · 2026-05-29T07:32:01Z

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

None.

Relevant changed files

None.

github-actions · 2026-05-29T07:33:54Z

PR Review Advisor

Findings: 0 needs attention, 2 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 0 still apply, 1 new item found

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

Source-of-truth review needed: src/lib/diagnostics/tarball.ts partial tarball cleanup: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `tarball.ts` computes `const partial = `${output}.partial.${process.pid}``, catches tar/rename failures, and best-effort removes the partial path.
Use an exclusive, unpredictable partial tarball path (src/lib/diagnostics/tarball.ts:26): The tarball helper writes to a predictable sibling path, `${output}.partial.${process.pid}`, before renaming it into place. In a shared writable destination directory, another local process could pre-create that path, including as a symlink on permissive filesystems, before `tar czf` opens it. That leaves a local TOCTOU/file-clobber risk and also leaves the rename-failure cleanup path only indirectly covered.
- Recommendation: Create the temporary tarball with an exclusive, unpredictable name in the destination directory, such as a `mkdtemp`/`mkstemp`-style path, or otherwise ensure `O_CREAT|O_EXCL|O_NOFOLLOW` semantics before passing the path to tar. Add regression coverage for pre-existing partial-path collision or symlink-like collision behavior and for cleanup after `renameSync` failure.
- Evidence: `src/lib/diagnostics/tarball.ts:26` derives `partial` directly from user-supplied `output` and `process.pid`; tests cover tar failure cleanup but not pre-existing partial collisions or rename failure cleanup.
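
One possible shape for the advisor's `mkdtemp`-style recommendation is sketched below. It is a hardening idea under stated assumptions, not code from this PR: `writeViaPrivateTempDir`, the `.debug-partial-` prefix, and the injected `write` callback (standing in for `tar`) are hypothetical. It relies on Node's `fs.mkdtempSync`, which appends random characters to the prefix and creates a fresh directory, so a pre-created file or symlink at a predictable sibling path is never opened.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical hardened variant: stage the archive inside a freshly created private directory.
function writeViaPrivateTempDir(output: string, write: (partialPath: string) => void): boolean {
  const destDir = path.dirname(path.resolve(output));
  // Created next to the destination so the final rename stays on one filesystem.
  // If the destination directory is not writable, this throws before anything is staged.
  const tempDir = fs.mkdtempSync(path.join(destDir, ".debug-partial-"));
  const partial = path.join(tempDir, path.basename(output));
  try {
    write(partial);
    fs.renameSync(partial, output);
    return true;
  } catch {
    // Covers both writer failure and rename failure; the finally block removes staged data.
    return false;
  } finally {
    // Removes the staging directory, including any partial left behind on failure.
    fs.rmSync(tempDir, { recursive: true, force: true });
  }
}
```

Putting cleanup in `finally` means the rename-failure path is handled by the same code as the writer-failure path, which is the coverage gap the finding points out.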

🌱 Nice ideas

None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

github-actions · 2026-05-29T07:42:17Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26624434917
Target ref: 16d5c158d91b4862981171b47d4bb4f2e128f61c
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
diagnostics-e2e	✅ success

coderabbitai

🧹 Nitpick comments (2)

src/lib/diagnostics/debug-command.test.ts (1)

72-88: ⚡ Quick win

Assert the exit status in the env-sourced failure test.

This case verifies throw/message, but not that the exit code is 1, which is part of the new behavior contract.

Patch suggestion

     ).toThrow("exit");
+    expect(exit).toHaveBeenCalledWith(1);
     expect(runDebug).not.toHaveBeenCalled();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/diagnostics/debug-command.test.ts` around lines 72 - 88, The test for
env-sourced failure should also assert the numeric exit status: after calling
runDebugCommandWithOptions (which currently expects to throw "exit"), add an
assertion that the mocked exit function was called with 1 (e.g.,
expect(exit).toHaveBeenCalledWith(1)) to verify the new behavior contract;
locate the assertions around runDebugCommandWithOptions, runDebug, errorLines
and the exit mock in debug-command.test.ts and add the exit call assertion
immediately after the existing toThrow and runDebug checks.
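
The reason an injected `exit` can be asserted at all is that the command receives it as a dependency instead of calling `process.exit` directly. The sketch below shows that pattern with hand-rolled fakes in place of the project's test framework. The `DebugDeps` shape, `runDebugFor`, `checkUnknownSandboxExits`, and the error wording are all hypothetical; only the idea of injecting `errorLine` and `exit` comes from the PR summary.

```typescript
// Hypothetical dependency bundle, loosely modelled on the injected env/errorLine/exit deps.
interface DebugDeps {
  isSandboxKnown: (sandboxName: string) => boolean;
  errorLine: (line: string) => void;
  exit: (code: number) => never;
  runDebug: (sandboxName: string) => void;
}

function runDebugFor(sandboxName: string, deps: DebugDeps): void {
  if (!deps.isSandboxKnown(sandboxName)) {
    // Illustrative wording only; the real message is not shown on this page.
    deps.errorLine(`Sandbox '${sandboxName}' is not registered.`);
    deps.exit(1);
  }
  deps.runDebug(sandboxName);
}

// Drives the failure path with fakes and reports what happened.
function checkUnknownSandboxExits(): { codes: number[]; errors: string[]; ran: boolean } {
  const codes: number[] = [];
  const errors: string[] = [];
  let ran = false;
  const deps: DebugDeps = {
    isSandboxKnown: () => false,
    errorLine: (line) => {
      errors.push(line);
    },
    // A fake exit records the code and throws, so control never reaches runDebug.
    exit: (code: number): never => {
      codes.push(code);
      throw new Error("exit");
    },
    runDebug: () => {
      ran = true;
    },
  };
  try {
    runDebugFor("ghost", deps);
  } catch (err) {
    if (!(err instanceof Error) || err.message !== "exit") throw err;
  }
  return { codes, errors, ran };
}
```

Checking the recorded codes as well as the thrown error is the extra assertion the nitpick asks for: a throw alone would also pass if the command exited with the wrong status.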

test/e2e/test-diagnostics.sh (1)

317-320: ⚡ Quick win

Use a per-run unknown sandbox name to reduce e2e flake risk.

A hard-coded “unknown” name can collide in persistent/shared environments. A unique name per run keeps this failure-path assertion deterministic.

Patch suggestion

-  local bad_name="nemoclaw-e2e-does-not-exist"
+  local bad_name="nemoclaw-e2e-unknown-$(date +%s)-$$"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-diagnostics.sh` around lines 317 - 320, Replace the hard-coded
sandbox name in the failing test so it is unique per run: instead of setting
bad_name="nemoclaw-e2e-does-not-exist" generate a per-run unique identifier
(e.g., incorporate PID, timestamp, or mktemp/uuid) and assign it to bad_name
before invoking the nemoclaw debug command (the invocation using TIMEOUT_CMD and
variables bad_output, bad_rc, bad_log should be unchanged); this prevents
collisions in shared/persistent test environments while keeping the failure-path
assertion deterministic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3ec46952-41f8-4bbc-a7b8-03829f3ba686

📥 Commits

Reviewing files that changed from the base of the PR and between 16d5c15 and 5c545c8.

📒 Files selected for processing (6)

src/lib/diagnostics/debug-command.test.ts
src/lib/diagnostics/debug-command.ts
src/lib/diagnostics/debug.test.ts
src/lib/diagnostics/debug.ts
test/cli.test.ts
test/e2e/test-diagnostics.sh

🚧 Files skipped from review as they are similar to previous changes (3)

src/lib/diagnostics/debug.test.ts
test/cli.test.ts
src/lib/diagnostics/debug-command.ts

github-actions · 2026-05-29T08:03:53Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26625352549
Target ref: 5c545c82dbd20969134a0ed5b536d445ed0673e8
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
diagnostics-e2e	✅ success

github-actions · 2026-05-29T08:45:08Z

🌿 Preview your docs: https://nvidia-preview-pr-4506.docs.buildwithfern.com/nemoclaw

coderabbitai

🧹 Nitpick comments (2)

docs/reference/commands.mdx (2)

1149-1149: ⚡ Quick win

Break into separate lines and use active voice.

This paragraph has three sentences on one line, which makes diffs harder to review. It also uses passive constructions ("is supplied", "is written").

Suggested revision (one sentence per line, active voice):

-When `--sandbox` is supplied explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway. An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and no tarball is written. Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox (and warns if that default is stale).
+When you supply `--sandbox` explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway.
+An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and NemoClaw writes no tarball.
+Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox and warns if that default is stale.

As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line." and "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.mdx` at line 1149, Split the paragraph into three
separate lines (one sentence per line) and rewrite in active voice: first line
state that when the --sandbox flag or one of the env vars
(NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, SANDBOX_NAME) is provided the CLI
requires the name to match a registered sandbox (note that the flag takes
precedence then env vars in that order); second line state that an unknown or
stale sandbox name causes the command to exit non-zero, log an actionable error
naming the sandbox and the source env var when applicable, and prevent writing
any tarball (use active verbs like "exit", "log", "prevent"); third line state
that when no explicit name is provided, nemoclaw debug falls back to the
registry's default sandbox and warns if that default is stale (use "falls back"
and "warns" as active verbs) and ensure each sentence is on its own source line
to satisfy the one-sentence-per-line guideline.

1147-1147: ⚡ Quick win

Use active voice.

The sentence uses passive constructions ("is written", "is preserved").

Suggested revision:

-The tarball is written to a temporary sibling and renamed on success, so a pre-existing file at `--output` is preserved when `tar` fails.
+NemoClaw writes the tarball to a temporary sibling and renames it on success, so `tar` failures preserve any pre-existing file at `--output`.

As per coding guidelines: Active voice required. Flag passive constructions.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.mdx` at line 1147, Rewrite the passive sentence to
active voice by naming the actor and using active verbs: change "The tarball is
written to a temporary sibling and renamed on success, so a pre-existing file at
`--output` is preserved when `tar` fails." to something like "We (or the
command) write the tarball to a temporary sibling and rename it on success, so a
pre-existing file at `--output` remains untouched if `tar` fails," ensuring you
use `tar` and `--output` exactly as shown.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2fa46822-2c0a-4937-996e-5cc59128b0fe

📥 Commits

Reviewing files that changed from the base of the PR and between 65ffe14 and 362c353.

📒 Files selected for processing (4)

docs/reference/commands.mdx
src/lib/diagnostics/debug.ts
src/lib/diagnostics/tarball.ts
test/cli.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

test/cli.test.ts

github-actions · 2026-05-29T08:56:21Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26627617919
Target ref: 362c353edbd2eed75c356176134adb3a6b9a0782
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
diagnostics-e2e	✅ success

jyaunches

Approving — regression risk is minimal here.

Why low regression risk:

Blast radius is confined to nemoclaw debug and src/lib/diagnostics/. No runtime, sandbox lifecycle, gateway, or install paths are touched.
The new live-gateway check degrades safely: if openshell sandbox list fails, validation falls back to registry-only, so a gateway outage won't break debug for already-registered sandboxes.
Atomic tarball uses same-directory partial + renameSync, so no cross-filesystem EXDEV surprise.
The behavior change (unknown --sandbox now exits 1) is the intended fix for T6029556; the old silent-success path was already broken.
diagnostics-e2e is green on the latest commit, including TC-DIAG-06 covering both registered-success and unknown-name-rejection. Unit tests cover env precedence, flag-over-env, default fallback, exit code, and tarball cleanup.

Follow-up (non-blocking): the PR Review Advisor's note about the predictable ${output}.partial.${process.pid} path is a hardening idea, not a regression risk — worth a follow-up issue to switch to an mkdtemp-style exclusive path and add a regression test for pre-existing-collision and rename-failure cleanup, but it shouldn't gate this fix.

## Summary

- Adds the v0.0.56 release notes section with links to the deeper docs pages for installer, status, inference, messaging, policy, and lifecycle changes.
- Updates source docs for the remaining release-prep gaps around `uv` in the PyPI preset, compact WhatsApp pairing guidance, and `nemoclaw inference set` command boundaries.
- Refreshes generated `nemoclaw-user-*` skills and removes skipped experimental command terms from generated skill surfaces.

## Source summary

- #4613 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents that public installs and `nemoclaw update` follow the maintained `lkg` tag by default.
- #4419 -> `docs/about/release-notes.mdx`: Notes that non-interactive Linux installs can reactivate Docker group membership and continue in one installer run when `sg docker` is available.
- #4550 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures live sandbox agent-version probing for status, connect, and upgrade checks.
- #4609 -> `docs/inference/use-local-inference.mdx`, `docs/about/release-notes.mdx`: Captures the GPU Docker-driver host-network local-inference reachability gate.
- #4607 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents compact WhatsApp QR pairing guidance and gateway/session diagnostics.
- #4582 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Reflects Slack credential validation before enabling the channel.
- #4554 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/troubleshooting.mdx`, `docs/about/release-notes.mdx`: Keeps Telegram allowlist alias guidance in the generated user skills and release notes.
- #4563 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Includes the new `nemoclaw <name> skill remove <skill>` command in command docs and release notes.
- #4566 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents the `nemoclaw inference set` redirect boundary when `--provider` or `--model` is missing.
- #4323 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures per-sandbox status JSON support.
- #4506 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures debug command sandbox-name validation and safer tarball writing.
- #4569 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Documents that the `pypi` preset allows `/usr/local/bin/uv`.
- #4579 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Captures observable Jira preset validation guidance.
- #4229 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents user-data preservation defaults for uninstall.
- #4399 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures CPU-only sandbox intent preservation across rebuilds.
- #4058 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures safer snapshot restore behavior around existing destinations.
- #4155 and #4460 -> skipped by `docs/.docs-skip`: Removed skipped experimental command terms from source docs and generated skill evals instead of documenting those features.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `npm run docs` (passes; Fern reports the pre-existing light-mode accent contrast warning)
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills` (no matches)
- `npm run build:cli` (run to refresh local CLI artifacts for the pre-push TypeScript hook)
- Commit hooks passed, including `NEMOCLAW_* env-var documentation gate`, `Verify docs-to-skills output`, `markdownlint-cli2`, `gitleaks`, and `Test (skills YAML)`.

## Summary by CodeRabbit

* **Documentation**
  * Expanded Model Router setup with YAML examples, flow diagrams, and credential handling; strengthened agent-config immutability and integrity guidance; messaging channels updated (Telegram aliases, WhatsApp pairing/diagnostics); CLI docs revised (GPU detection, inference set behavior, uninstall/rebuild preservation); overview rebranded to NemoClaw and added v0.0.56 release notes.
* **New Features**
  * Added `nemoclaw <name> channels status` (messaging diagnostics, JSON); added `nemoclaw <name> skill remove`; Hermes no longer marked experimental; DGX Spark quickstart sandbox-name note.

fix(debug): validate sandbox name and clean partial tarballs

16d5c15

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

laitingsheng added the fix label May 29, 2026

fix(debug): honour NEMOCLAW_SANDBOX_NAME and add diagnostics e2e

5c545c8

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 29, 2026

laitingsheng added 2 commits May 29, 2026 08:19

fix(debug): live-list check, atomic tarball, e2e per-run name

65ffe14

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

fix(debug): extract tarball helper, drop env fallback drift, doc cont…

362c353

…ract Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 29, 2026

laitingsheng added the v0.0.55 Release target and Sandbox labels May 29, 2026

jyaunches self-requested a review May 29, 2026 17:30

jyaunches approved these changes May 29, 2026

jyaunches added R2 v0.0.56 Release target and removed v0.0.55 Release target R2 labels May 29, 2026

cv merged commit f8c85f3 into main May 30, 2026
36 checks passed

cv deleted the fix/4494-debug-validate-sandbox branch May 30, 2026 18:18

miyoungc mentioned this pull request Jun 1, 2026

docs: refresh 0.0.56 release documentation #4618

Merged
