fix(debug): validate sandbox name and clean partial tarballs #4506

Merged
cv merged 4 commits into main from fix/4494-debug-validate-sandbox
May 30, 2026

Conversation

@laitingsheng
Contributor

@laitingsheng laitingsheng commented May 29, 2026

Summary

nemoclaw debug --sandbox <unknown> --output FILE silently collected host diagnostics for a non-existent sandbox and wrote a non-empty (~8 KB) tarball with exit 0. Validation in runDebugCommandWithOptions only ran when --sandbox was omitted, so explicit names from --sandbox, NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, and SANDBOX_NAME bypassed both registry and live-gateway checks. Per T6029556 the command must exit non-zero, name the unknown sandbox, and leave no partial tarball.

Related Issue

Fixes #4494.

Changes

  • src/commands/debug.ts: new isSandboxKnown(name) mirroring getDefaultSandbox — requires the name to live in the local registry, and when openshell sandbox list succeeds it must also appear in the live gateway (rejects stale registry entries)
  • src/lib/diagnostics/debug-command.ts: validate any explicit name (flag or env var) before collection; documented precedence is --sandbox > NEMOCLAW_SANDBOX_NAME > NEMOCLAW_SANDBOX > SANDBOX_NAME; on miss, print an actionable error that names the sandbox and the source env var, then exit 1
  • src/lib/diagnostics/debug.ts: drop the redundant env-var re-read in runDebug so whitespace-only env values cannot bypass the wrapper's trim/validate, and trim the option name
  • src/lib/diagnostics/tarball.ts (new): atomic tarball — write to output.partial.<pid>, rename on success, and rmSync the partial (not the user's --output) on tar failure so a pre-existing file is preserved
  • docs/reference/commands.mdx: document the explicit-name precedence, the registry + live-gateway validation contract, the unknown/stale exit-non-zero behaviour, and the atomic tarball semantics
  • Unit tests: explicit-name validation (flag + env), source labelling, NEMOCLAW_SANDBOX_NAME precedence, flag-over-env, default fallback, exit(1) assertion on env failure, atomic tarball preservation of pre-existing files, partial cleanup on failure
  • test/cli.test.ts: createDebugCommandTestEnv registers its env-sourced sandbox plus any extraSandboxNames, and the fake openshell sandbox list now emits those names so the live check passes; new CLI cases cover unknown explicit name (exits non-zero, no tarball) and stale registry entry (registry-known but missing from the live list → rejected)
  • test/e2e/test-diagnostics.sh: TC-DIAG-06 confirms a registered --sandbox succeeds and an unknown name (per-run unique to avoid shared-env collisions) exits non-zero, names the sandbox, says "not registered", and leaves no archive
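
The documented precedence (`--sandbox` > `NEMOCLAW_SANDBOX_NAME` > `NEMOCLAW_SANDBOX` > `SANDBOX_NAME`) can be sketched roughly as follows. This is a minimal illustration, not the code from `debug-command.ts`: the return shape, the `ENV_KEYS` constant, and the choice to treat a whitespace-only flag as absent are assumptions made for the example.

```typescript
// Hypothetical sketch of explicit-name resolution; names and shapes are assumptions.
type ExplicitName = { name: string; source: string };

// Env vars in the precedence order stated in the PR description.
const ENV_KEYS = ["NEMOCLAW_SANDBOX_NAME", "NEMOCLAW_SANDBOX", "SANDBOX_NAME"] as const;

function resolveExplicitName(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): ExplicitName | undefined {
  // The flag wins when it holds a non-empty value after trimming.
  const fromFlag = flag?.trim();
  if (fromFlag) return { name: fromFlag, source: "--sandbox" };

  // Otherwise take the first env var that is non-empty after trimming,
  // so whitespace-only values cannot count as an explicit name.
  for (const key of ENV_KEYS) {
    const value = env[key]?.trim();
    if (value) return { name: value, source: key };
  }

  // No explicit name: the caller falls back to the default sandbox.
  return undefined;
}
```

Returning the source alongside the name is what would let the error message cite the env key that supplied a bad value.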

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Debug command: explicit sandbox can come from flag or env (clear precedence); names are validated against both the local registry and live gateway, errors cite the env key or name and exit non-zero.
  • Bug Fixes

    • Tarball creation is atomic (PID-suffixed partial then rename), preserves pre-existing files, cleans partials on failure, sets non-zero exit state, and logs warnings (auto-redaction, attach-to-issue guidance).
  • Tests

    • Expanded unit and E2E coverage for sandbox resolution/validation, error messaging, and tarball cleanup.

Review Change Stack

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

Walkthrough

Adds pre-collection sandbox-name resolution/validation (flag → NEMOCLAW_SANDBOX_NAME → NEMOCLAW_SANDBOX → SANDBOX_NAME → default), exits with error on unregistered sandboxes, uses atomic partial tarball writes with cleanup on failure, and extends unit, integration, and e2e tests.

Changes

Sandbox validation and error handling for debug command

| Layer / File(s) | Summary |
| --- | --- |
| Sandbox validation infra & runtime changes: src/commands/debug.ts, src/lib/diagnostics/debug-command.ts, src/lib/diagnostics/debug.ts | Adds isSandboxKnown to command deps, extends RunDebugCommandDeps with env, errorLine, and exit injection, adds resolveExplicitName (flag vs env precedence), validates explicit names with isSandboxKnown, and updates runDebug option precedence. |
| Unit tests: sandbox selection & error paths: src/lib/diagnostics/debug-command.test.ts | Adds tests for: accepting registered explicit sandbox, rejecting unregistered explicit sandbox (non-zero exit + error text mentioning name and nemoclaw list), validating env-sourced names and reporting the env key, env precedence (NEMOCLAW_SANDBOX_NAME over others), flag overriding env, and fallback to getDefaultSandbox. |
| Tarball creation, atomic write and cleanup: src/lib/diagnostics/tarball.ts, src/lib/diagnostics/debug.ts, src/lib/diagnostics/debug.test.ts | Introduces createTarball that writes to a PID-suffixed partial path, runs tar, renames atomically on success; on tar or rename failure it logs, attempts best-effort removal of partial files, sets process.exitCode = 1, and returns false. Tests verify cleanup and preservation of pre-existing output. |
| Integration and E2E test updates: test/cli.test.ts, test/e2e/test-diagnostics.sh | Test helper can pre-register sandbox names; CLI tests added/updated to assert debug --sandbox accepts registered names, rejects unknown/stale names without producing partial tarballs, and suppresses stale-default warnings when appropriate; E2E adds TC-DIAG-06 for these cases. |
| Docs: debug command behavior: docs/reference/commands.mdx | Documents atomic tarball write strategy and explicit sandbox validation semantics (flag/env precedence, live-gateway check, non-zero exit on invalid/stale names). |
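
The atomic write described for `createTarball` (PID-suffixed partial, rename on success, cleanup and `process.exitCode = 1` on failure) might look something like the sketch below. It is an illustration under stated assumptions rather than the contents of `tarball.ts`: the parameter list, the log wording, and the injectable `runTar` callback (added only so the flow can be exercised without a real `tar` binary) are hypothetical.

```typescript
import { execFileSync } from "node:child_process";
import * as fs from "node:fs";

// Hypothetical hook that produces the archive at `partialPath`.
type RunTar = (partialPath: string, sourceDir: string) => void;

// Default behaviour: shell out to tar, as the review comments describe (`tar czf`).
const defaultRunTar: RunTar = (partialPath, sourceDir) => {
  execFileSync("tar", ["czf", partialPath, "-C", sourceDir, "."], { stdio: "inherit" });
};

function createTarball(output: string, sourceDir: string, runTar: RunTar = defaultRunTar): boolean {
  // Sibling path in the same directory as `output`, so the rename stays on one filesystem.
  const partial = `${output}.partial.${process.pid}`;
  try {
    runTar(partial, sourceDir);
    fs.renameSync(partial, output); // replaces `output` only after tar succeeded
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    console.error(`Failed to write ${output}: ${reason}`);
    try {
      // Remove only the partial; a pre-existing file at `output` is left alone.
      fs.rmSync(partial, { force: true });
    } catch {
      // Best effort: the failure has already been reported above.
    }
    process.exitCode = 1;
    return false;
  }
}
```

Because the user's `--output` path is only touched by the final rename, a failing `tar` cannot truncate or replace an archive that was already there.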

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

NemoClaw CLI, Sandbox

Suggested reviewers

  • cv
  • ericksoa

Poem

🐰 I peek at env and flag before I start the run,
If a sandbox's unknown, I stop — no tar is spun,
I write my tar to partial, then rename it right,
If things go sideways, I clean up in the night,
Tests hop in line — diagnostics tidy, done! 🎩

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the primary fixes: sandbox name validation and partial tarball cleanup. |
| Linked Issues check | ✅ Passed | All coding requirements from issue #4494 are met: sandbox validation exits non-zero, actionable error names the unknown sandbox, and partial tarballs are cleaned up. |
| Out of Scope Changes check | ✅ Passed | All changes directly support sandbox validation and tarball cleanup requirements. Documentation and test harness updates are necessary supporting changes. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/4494-debug-validate-sandbox

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions Bot commented May 29, 2026

E2E Advisor Recommendation

Required E2E: diagnostics-e2e
Optional E2E: docs-validation-e2e

Dispatch hint: diagnostics-e2e

Auto-dispatched E2E: diagnostics-e2e via nightly-e2e.yaml at 362c353edbd2eed75c356176134adb3a6b9a0782 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • diagnostics-e2e (medium-high; onboards a sandbox and requires Docker plus NVIDIA_API_KEY): Directly covers the changed nemoclaw debug runtime flow, including quick/full diagnostics, tarball creation and sanitization, registered sandbox selection, unknown sandbox rejection, and credentials list/reset after onboarding a real sandbox.

Optional E2E

  • docs-validation-e2e (medium; installs NemoClaw and runs docs validation without remote link checks): Useful to verify the updated command reference remains in sync with CLI/help documentation and docs validation expectations, but the runtime risk is already covered by diagnostics-e2e.

New E2E recommendations

  • diagnostics-debug-env-sandbox-selection (medium): Unit tests cover NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, and SANDBOX_NAME precedence and rejection, but the diagnostics E2E only exercises the --sandbox flag path. A future E2E could validate one env-sourced success and one env-sourced unknown-name failure against a real onboarded sandbox.
    • Suggested test: Add a TC-DIAG case to test/e2e/test-diagnostics.sh for env-sourced sandbox name precedence and rejection before tarball creation.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: diagnostics-e2e

@github-actions
Contributor

github-actions Bot commented May 29, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions
Contributor

github-actions Bot commented May 29, 2026

PR Review Advisor

Findings: 0 needs attention, 2 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 0 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: src/lib/diagnostics/tarball.ts partial tarball cleanup: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `tarball.ts` computes `const partial = `${output}.partial.${process.pid}``, catches tar/rename failures, and best-effort removes the partial path.
  • Use an exclusive, unpredictable partial tarball path (src/lib/diagnostics/tarball.ts:26): The tarball helper writes to a predictable sibling path, `${output}.partial.${process.pid}`, before renaming it into place. In a shared writable destination directory, another local process could pre-create that path, including as a symlink on permissive filesystems, before `tar czf` opens it. That leaves a local TOCTOU/file-clobber risk and also leaves the rename-failure cleanup path only indirectly covered.
    • Recommendation: Create the temporary tarball with an exclusive, unpredictable name in the destination directory, such as a `mkdtemp`/`mkstemp`-style path, or otherwise ensure `O_CREAT|O_EXCL|O_NOFOLLOW` semantics before passing the path to tar. Add regression coverage for pre-existing partial-path collision or symlink-like collision behavior and for cleanup after `renameSync` failure.
    • Evidence: `src/lib/diagnostics/tarball.ts:26` derives `partial` directly from user-supplied `output` and `process.pid`; tests cover tar failure cleanup but not pre-existing partial collisions or rename failure cleanup.
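
One way to act on the second recommendation is to stage the archive inside a scratch directory created with `fs.mkdtempSync`, which appends random characters to the prefix and fails instead of reusing an existing entry. The sketch below is a possible hardening, not what this PR merged; the helper name `withExclusivePartial`, the `.debug-partial-` prefix, and the `write` callback are all hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: `write` produces the archive at the path it is given.
function withExclusivePartial(output: string, write: (partialPath: string) => void): void {
  const destDir = path.dirname(path.resolve(output));
  // The scratch directory gets an unpredictable name and is freshly created,
  // so another process cannot pre-create the partial path or plant a symlink at it.
  const scratch = fs.mkdtempSync(path.join(destDir, ".debug-partial-"));
  const partial = path.join(scratch, path.basename(output));
  try {
    write(partial);
    // The scratch directory lives under the destination directory,
    // which normally keeps the rename on a single filesystem.
    fs.renameSync(partial, output);
  } finally {
    // Runs on success and on failure, covering both tar and rename errors.
    fs.rmSync(scratch, { recursive: true, force: true });
  }
}
```

Errors from `write` or `renameSync` still propagate to the caller after cleanup, so the caller stays responsible for logging and the exit code.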

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26624434917
Target ref: 16d5c158d91b4862981171b47d4bb4f2e128f61c
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
diagnostics-e2e ✅ success

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/lib/diagnostics/debug-command.test.ts (1)

72-88: ⚡ Quick win

Assert the exit status in the env-sourced failure test.

This case verifies throw/message, but not that the exit code is 1, which is part of the new behavior contract.

Patch suggestion
     ).toThrow("exit");
+    expect(exit).toHaveBeenCalledWith(1);
     expect(runDebug).not.toHaveBeenCalled();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/diagnostics/debug-command.test.ts` around lines 72 - 88, The test for
env-sourced failure should also assert the numeric exit status: after calling
runDebugCommandWithOptions (which currently expects to throw "exit"), add an
assertion that the mocked exit function was called with 1 (e.g.,
expect(exit).toHaveBeenCalledWith(1)) to verify the new behavior contract;
locate the assertions around runDebugCommandWithOptions, runDebug, errorLines
and the exit mock in debug-command.test.ts and add the exit call assertion
immediately after the existing toThrow and runDebug checks.
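
To make the point of this nitpick concrete, the sketch below shows how an injected `exit` that throws lets a test check both that control flow stopped and that the code was 1. It uses a hand-rolled recorder instead of the project's test framework, and the `Deps` shape, `runWithValidation`, and the error wording are assumptions for illustration, not the actual `RunDebugCommandDeps` contract.

```typescript
// Hypothetical dependency shape; the real RunDebugCommandDeps may differ.
interface Deps {
  isSandboxKnown(name: string): boolean;
  errorLine(message: string): void;
  exit(code: number): never;
  runDebug(sandboxName: string): void;
}

// Validate before collecting anything, so a bad name never reaches runDebug.
function runWithValidation(name: string, deps: Deps): void {
  if (!deps.isSandboxKnown(name)) {
    // Illustrative wording only.
    deps.errorLine(`Sandbox '${name}' is not registered. Run 'nemoclaw list' to see known sandboxes.`);
    deps.exit(1);
  }
  deps.runDebug(name);
}

// Test double: records every call and throws from exit, as a mocked exit would.
function makeRecorder(known: string[]) {
  const calls = { exitCodes: [] as number[], errors: [] as string[], ran: [] as string[] };
  const deps: Deps = {
    isSandboxKnown: (n) => known.includes(n),
    errorLine: (m) => {
      calls.errors.push(m);
    },
    exit: (code) => {
      calls.exitCodes.push(code);
      throw new Error("exit");
    },
    runDebug: (n) => {
      calls.ran.push(n);
    },
  };
  return { deps, calls };
}
```

Asserting only on the thrown `"exit"` would pass even if the code called `exit(0)`, which is why the recorded exit code is checked separately.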
test/e2e/test-diagnostics.sh (1)

317-320: ⚡ Quick win

Use a per-run unknown sandbox name to reduce e2e flake risk.

A hard-coded “unknown” name can collide in persistent/shared environments. A unique name per run keeps this failure-path assertion deterministic.

Patch suggestion
-  local bad_name="nemoclaw-e2e-does-not-exist"
+  local bad_name="nemoclaw-e2e-unknown-$(date +%s)-$$"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-diagnostics.sh` around lines 317 - 320, Replace the hard-coded
sandbox name in the failing test so it is unique per run: instead of setting
bad_name="nemoclaw-e2e-does-not-exist" generate a per-run unique identifier
(e.g., incorporate PID, timestamp, or mktemp/uuid) and assign it to bad_name
before invoking the nemoclaw debug command (the invocation using TIMEOUT_CMD and
variables bad_output, bad_rc, bad_log should be unchanged); this prevents
collisions in shared/persistent test environments while keeping the failure-path
assertion deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3ec46952-41f8-4bbc-a7b8-03829f3ba686

📥 Commits

Reviewing files that changed from the base of the PR and between 16d5c15 and 5c545c8.

📒 Files selected for processing (6)
  • src/lib/diagnostics/debug-command.test.ts
  • src/lib/diagnostics/debug-command.ts
  • src/lib/diagnostics/debug.test.ts
  • src/lib/diagnostics/debug.ts
  • test/cli.test.ts
  • test/e2e/test-diagnostics.sh
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/lib/diagnostics/debug.test.ts
  • test/cli.test.ts
  • src/lib/diagnostics/debug-command.ts

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26625352549
Target ref: 5c545c82dbd20969134a0ed5b536d445ed0673e8
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
diagnostics-e2e ✅ success

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
…ract

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
docs/reference/commands.mdx (2)

1149-1149: ⚡ Quick win

Break into separate lines and use active voice.

This paragraph has three sentences on one line, which makes diffs harder to review. It also uses passive constructions ("is supplied", "is written").

Suggested revision (one sentence per line, active voice):

-When `--sandbox` is supplied explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway. An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and no tarball is written. Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox (and warns if that default is stale).
+When you supply `--sandbox` explicitly (via flag or one of `NEMOCLAW_SANDBOX_NAME`, `NEMOCLAW_SANDBOX`, `SANDBOX_NAME` — flag wins, then the env vars in that order), the name must match a registered sandbox; if `openshell sandbox list` succeeds it must also appear in the live gateway.
+An unknown or stale name exits non-zero with an actionable error that names the sandbox and reports the source env var when applicable, and NemoClaw writes no tarball.
+Without an explicit name, `nemoclaw debug` falls back to the registry's default sandbox and warns if that default is stale.

As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line." and "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.mdx` at line 1149, Split the paragraph into three
separate lines (one sentence per line) and rewrite in active voice: first line
state that when the --sandbox flag or one of the env vars
(NEMOCLAW_SANDBOX_NAME, NEMOCLAW_SANDBOX, SANDBOX_NAME) is provided the CLI
requires the name to match a registered sandbox (note that the flag takes
precedence then env vars in that order); second line state that an unknown or
stale sandbox name causes the command to exit non-zero, log an actionable error
naming the sandbox and the source env var when applicable, and prevent writing
any tarball (use active verbs like "exit", "log", "prevent"); third line state
that when no explicit name is provided, nemoclaw debug falls back to the
registry's default sandbox and warns if that default is stale (use "falls back"
and "warns" as active verbs) and ensure each sentence is on its own source line
to satisfy the one-sentence-per-line guideline.

1147-1147: ⚡ Quick win

Use active voice.

The sentence uses passive constructions ("is written", "is preserved").

Suggested revision:

-The tarball is written to a temporary sibling and renamed on success, so a pre-existing file at `--output` is preserved when `tar` fails.
+NemoClaw writes the tarball to a temporary sibling and renames it on success, so `tar` failures preserve any pre-existing file at `--output`.

As per coding guidelines: Active voice required. Flag passive constructions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.mdx` at line 1147, Rewrite the passive sentence to
active voice by naming the actor and using active verbs: change "The tarball is
written to a temporary sibling and renamed on success, so a pre-existing file at
`--output` is preserved when `tar` fails." to something like "We (or the
command) write the tarball to a temporary sibling and rename it on success, so a
pre-existing file at `--output` remains untouched if `tar` fails," ensuring you
use `tar` and `--output` exactly as shown.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2fa46822-2c0a-4937-996e-5cc59128b0fe

📥 Commits

Reviewing files that changed from the base of the PR and between 65ffe14 and 362c353.

📒 Files selected for processing (4)
  • docs/reference/commands.mdx
  • src/lib/diagnostics/debug.ts
  • src/lib/diagnostics/tarball.ts
  • test/cli.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/cli.test.ts

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26627617919
Target ref: 362c353edbd2eed75c356176134adb3a6b9a0782
Workflow ref: main
Requested jobs: diagnostics-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
diagnostics-e2e ✅ success

@laitingsheng laitingsheng added the v0.0.55 and Sandbox labels May 29, 2026
@jyaunches jyaunches self-requested a review May 29, 2026 17:30
Contributor

@jyaunches jyaunches left a comment

Approving — regression risk is minimal here.

Why low regression risk:

  • Blast radius is confined to nemoclaw debug and src/lib/diagnostics/. No runtime, sandbox lifecycle, gateway, or install paths are touched.
  • The new live-gateway check degrades safely: if openshell sandbox list fails, validation falls back to registry-only, so a gateway outage won't break debug for already-registered sandboxes.
  • Atomic tarball uses same-directory partial + renameSync, so no cross-filesystem EXDEV surprise.
  • The behavior change (unknown --sandbox now exits 1) is the intended fix for T6029556; the old silent-success path was already broken.
  • diagnostics-e2e is green on the latest commit, including TC-DIAG-06 covering both registered-success and unknown-name-rejection. Unit tests cover env precedence, flag-over-env, default fallback, exit code, and tarball cleanup.
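
The safe-degradation point in the second bullet can be illustrated with a small sketch of the validation rule: the registry is always required, and the live gateway list is consulted only when it is available. The signature is hypothetical; in particular, modelling a failed `openshell sandbox list` as `listLive` returning `undefined` is an assumption for the example, not the shape of the real `isSandboxKnown`.

```typescript
// Hypothetical sketch: `listLive` returns the live sandbox names,
// or undefined when `openshell sandbox list` could not be run successfully.
function isSandboxKnown(
  name: string,
  registry: ReadonlySet<string>,
  listLive: () => string[] | undefined,
): boolean {
  // The local registry is always required.
  if (!registry.has(name)) return false;

  const live = listLive();
  // Gateway unavailable: fall back to the registry-only answer.
  if (live === undefined) return true;

  // Gateway available: a registry entry missing from the live list is stale.
  return live.includes(name);
}
```

With this shape, a gateway outage cannot turn a registered sandbox into an error, while a stale registry entry is rejected whenever the gateway does respond.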

Follow-up (non-blocking): the PR Review Advisor's note about the predictable ${output}.partial.${process.pid} path is a hardening idea, not a regression risk — worth a follow-up issue to switch to an mkdtemp-style exclusive path and add a regression test for pre-existing-collision and rename-failure cleanup, but it shouldn't gate this fix.

@jyaunches jyaunches added the R2 and v0.0.56 labels and removed the v0.0.55 and R2 labels May 29, 2026
@cv cv merged commit f8c85f3 into main May 30, 2026
36 checks passed
@cv cv deleted the fix/4494-debug-validate-sandbox branch May 30, 2026 18:18
miyoungc added a commit that referenced this pull request Jun 1, 2026
## Summary

- Adds the v0.0.56 release notes section with links to the deeper docs
pages for installer, status, inference, messaging, policy, and lifecycle
changes.
- Updates source docs for the remaining release-prep gaps around `uv` in
the PyPI preset, compact WhatsApp pairing guidance, and `nemoclaw
inference set` command boundaries.
- Refreshes generated `nemoclaw-user-*` skills and removes skipped
experimental command terms from generated skill surfaces.

## Source summary

- #4613 -> `docs/manage-sandboxes/lifecycle.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
that public installs and `nemoclaw update` follow the maintained `lkg`
tag by default.
- #4419 -> `docs/about/release-notes.mdx`: Notes that non-interactive
Linux installs can reactivate Docker group membership and continue in
one installer run when `sg docker` is available.
- #4550 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures live sandbox agent-version
probing for status, connect, and upgrade checks.
- #4609 -> `docs/inference/use-local-inference.mdx`,
`docs/about/release-notes.mdx`: Captures the GPU Docker-driver
host-network local-inference reachability gate.
- #4607 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
compact WhatsApp QR pairing guidance and gateway/session diagnostics.
- #4582 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Reflects
Slack credential validation before enabling the channel.
- #4554 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/troubleshooting.mdx`, `docs/about/release-notes.mdx`:
Keeps Telegram allowlist alias guidance in the generated user skills and
release notes.
- #4563 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Includes the new `nemoclaw <name> skill
remove <skill>` command in command docs and release notes.
- #4566 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Documents the `nemoclaw inference set`
redirect boundary when `--provider` or `--model` is missing.
- #4323 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures per-sandbox status JSON
support.
- #4506 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures debug command sandbox-name
validation and safer tarball writing.
- #4569 -> `docs/network-policy/integration-policy-examples.mdx`,
`docs/about/release-notes.mdx`: Documents that the `pypi` preset allows
`/usr/local/bin/uv`.
- #4579 -> `docs/network-policy/integration-policy-examples.mdx`,
`docs/about/release-notes.mdx`: Captures observable Jira preset
validation guidance.
- #4229 -> `docs/manage-sandboxes/lifecycle.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
user-data preservation defaults for uninstall.
- #4399 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures CPU-only sandbox intent
preservation across rebuilds.
- #4058 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures safer snapshot restore behavior
around existing destinations.
- #4155 and #4460 -> skipped by `docs/.docs-skip`: Removed skipped
experimental command terms from source docs and generated skill evals
instead of documenting those features.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `npm run docs` (passes; Fern reports the pre-existing light-mode
accent contrast warning)
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" .agents/skills` (no matches)
- `npm run build:cli` (run to refresh local CLI artifacts for the
pre-push TypeScript hook)
- Commit hooks passed, including `NEMOCLAW_* env-var documentation
gate`, `Verify docs-to-skills output`, `markdownlint-cli2`, `gitleaks`,
and `Test (skills YAML)`.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Expanded Model Router setup with YAML examples, flow diagrams, and
credential handling; strengthened agent-config immutability and
integrity guidance; messaging channels updated (Telegram aliases,
WhatsApp pairing/diagnostics); CLI docs revised (GPU detection,
inference set behavior, uninstall/rebuild preservation); overview
rebranded to NemoClaw and added v0.0.56 release notes.

* **New Features**
* Added `nemoclaw <name> channels status` (messaging diagnostics, JSON);
added `nemoclaw <name> skill remove`; Hermes no longer marked
experimental; DGX Spark quickstart sandbox-name note.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Labels

fix, Sandbox, v0.0.56

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.04 + Ubuntu 22.04][CLI&UX] nemoclaw debug --sandbox <unknown> --output exits 0 and writes a non-empty tarball instead of erroring

4 participants