Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughDetects and reuses a healthy shared NemoClaw OpenShell gateway, adds live sandbox reconciliation for connect/status flows, introduces recovery-aware gateway startup modes, runs staged build cleanup for blueprint artifacts, and expands unit/CLI/e2e tests covering gateway lifecycle and registry reconciliation. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI as CLI (bin/nemoclaw.js)
participant Onboard as Onboard (bin/lib/onboard.js)
participant OpenShell as OpenShell
participant Registry as Local Registry
rect rgba(200,150,100,0.5)
Note over User,Registry: Initial onboarding may recreate gateway
User->>CLI: onboard
CLI->>Onboard: startGateway()
Onboard->>OpenShell: openshell status / gateway info
Onboard->>OpenShell: destroy/start gateway (if stale)
OpenShell-->>Onboard: gateway ready
Onboard-->>CLI: success (new gateway)
end
rect rgba(100,150,200,0.5)
Note over User,Registry: Reuse healthy shared gateway
User->>CLI: onboard
CLI->>Onboard: startGateway()
Onboard->>OpenShell: openshell status / gateway info
OpenShell-->>Onboard: Connected & named
Onboard->>OpenShell: select existing gateway
Onboard-->>CLI: success (reused gateway)
end
rect rgba(150,200,100,0.5)
Note over User,Registry: Sandbox connect/status reconciliation
User->>CLI: sandbox connect / status
CLI->>Onboard: ensureLiveSandboxOrExit
Onboard->>OpenShell: openshell sandbox get / logs
OpenShell-->>Onboard: live sandbox state
Onboard->>Registry: compare with local registry
alt sandbox live
Onboard->>OpenShell: forward/connect
else registry stale
Onboard->>Registry: remove stale entry
Onboard-->>User: guidance / recovery messaging
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
🚀 Docs preview ready! |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
1487-1488: Centralize this staged-tree scrubber.The same
.venv/.pytest_cache/__pycache__cleanup now lives here and inscripts/setup.sh, so the next artifact tweak has to land in two places. A shared helper/script would keep the legacysetuppath aligned withonboard.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 1487 - 1488, Duplicate cleanup commands for removing .venv, .pytest_cache and __pycache__ are present in onboard.js (run(...) calls) and scripts/setup.sh; extract them into a single centralized "staged-tree scrubber" (either a small shell script like scripts/clean-staged-tree.sh or a JS helper exported from bin/lib/cleanStagedTree.js) and replace the inline run(...) invocations in onboard.js and the matching snippet in scripts/setup.sh to call that central helper. Ensure the central helper performs the same operations (rm -rf for .venv and .pytest_cache and find ... -name __pycache__ -prune -exec rm -rf {} +) and that onboard.js's call uses the existing run(...) wrapper semantics (preserve { ignoreError: true } behavior) or that scripts/setup.sh invokes the script with identical error-tolerant behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/nemoclaw.js`:
- Around line 100-120: The connected check in getNamedGatewayLifecycleState uses
/Connected/i which also matches "Disconnected"; update the connected detection
to match the exact status token/line in cleanStatus (e.g. replace the
/Connected/i.test(cleanStatus) usage with a more specific regex such as one that
anchors to the status line like /^Status:\s*Connected\b/im.test(cleanStatus) or
a whole-word match for "Connected" so "Disconnected" does not match), leaving
the rest of the function logic unchanged and still using the cleanStatus
variable.
In `@test/e2e/test-double-onboard.sh`:
- Around line 411-418: The current acceptance regex for gateway_status_output
incorrectly treats "Removed stale local registry entry" as a valid gateway-stop
outcome; modify the grep pattern in the gateway status check to remove that
phrase so only lifecycle/recovery messages ("Recovered NemoClaw gateway
runtime", "gateway is no longer configured after restart/rebuild", "gateway is
still refusing connections after restart", "gateway trust material rotated after
restart") count as passes, and keep the existing pass/fail behavior
(pass(...)/fail(...)) around gateway_status_output. After this check, add an
explicit assertion that the registry still contains SANDBOX_B by calling the
registry query used elsewhere in the test (the same registry listing helper that
references SANDBOX_B) and fail the test if SANDBOX_B is missing; reference
gateway_status_output, SANDBOX_B, and the pass/fail helpers when implementing
these changes.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 1487-1488: Duplicate cleanup commands for removing .venv,
.pytest_cache and __pycache__ are present in onboard.js (run(...) calls) and
scripts/setup.sh; extract them into a single centralized "staged-tree scrubber"
(either a small shell script like scripts/clean-staged-tree.sh or a JS helper
exported from bin/lib/cleanStagedTree.js) and replace the inline run(...)
invocations in onboard.js and the matching snippet in scripts/setup.sh to call
that central helper. Ensure the central helper performs the same operations (rm
-rf for .venv and .pytest_cache and find ... -name __pycache__ -prune -exec rm
-rf {} +) and that onboard.js's call uses the existing run(...) wrapper
semantics (preserve { ignoreError: true } behavior) or that scripts/setup.sh
invokes the script with identical error-tolerant behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 17d70415-acbc-418a-b174-fa1a4f01e638
📒 Files selected for processing (7)
bin/lib/onboard.jsbin/nemoclaw.jsscripts/setup.shtest/cli.test.jstest/e2e/test-double-onboard.shtest/gateway-cleanup.test.jstest/onboard.test.js
There was a problem hiding this comment.
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
188-203: Consider using regex for ANSI stripping.The manual character-by-character approach works but is harder to maintain and has O(n²) complexity due to string concatenation. The existing
isSandboxReadyfunction at line 171 already uses a regex for ANSI stripping.♻️ Suggested simplification using regex
function stripAnsi(value = "") { - let cleaned = ""; - for (let i = 0; i < value.length; i += 1) { - if (value.charCodeAt(i) === 27 && value[i + 1] === "[") { - i += 2; - while (i < value.length && /[0-9;]/.test(value[i])) { - i += 1; - } - if (value[i] === "m") { - continue; - } - } - cleaned += value[i] || ""; - } - return cleaned; + // eslint-disable-next-line no-control-regex + return value.replace(/\x1b\[[0-9;]*m/g, ""); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 188 - 203, The stripAnsi function currently iterates character-by-character and builds a string which is harder to maintain and can be O(n²); replace its implementation with a single regex-based removal (same approach used in isSandboxReady) to strip ANSI escape sequences and return value.replace(ANSI_REGEX, '') where ANSI_REGEX matches CSI/escape sequences; update stripAnsi to use that regex and ensure it defaults value to an empty string as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 188-203: The stripAnsi function currently iterates
character-by-character and builds a string which is harder to maintain and can
be O(n²); replace its implementation with a single regex-based removal (same
approach used in isSandboxReady) to strip ANSI escape sequences and return
value.replace(ANSI_REGEX, '') where ANSI_REGEX matches CSI/escape sequences;
update stripAnsi to use that regex and ensure it defaults value to an empty
string as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0b3e9fc7-003d-423c-b21f-81102d9fc253
📒 Files selected for processing (1)
bin/lib/onboard.js
* fix: improve gateway lifecycle recovery (NVIDIA#953) * fix: improve gateway lifecycle recovery * docs: fix readme markdown list spacing * fix: tighten gateway lifecycle review follow-ups * fix: simplify tokenized control ui output * fix: restore chat route in control ui urls * refactor: simplify ansi stripping in onboard * fix: shorten control ui url output * fix: move control ui below cli next steps * fix: swap hard/soft ulimit settings in start script (NVIDIA#951) Fixes NVIDIA#949 Co-authored-by: KJ <kejones@nvidia.com> --------- Co-authored-by: KJ <kejones@nvidia.com> Co-authored-by: Emily Wilkins <80470879+epwilkins@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Supersedes #952 and the closed #908. This branch rebuilds the same work as a single verified signed commit so it can satisfy the branch signature requirements.
Summary
nemoclawgateway across repeat onboardingconnectandstatusinstead of trusting stale local registry entriesIssues
Security
openshellwrappersValidation
Brev CPU Validation
Environment:
brev-cpukj-nemoclaw-cpu-20260325-15544743cf8ebValidated on a real disposable Linux host:
openshell gateway stop+openshell gateway start --name nemoclaw, NemoClaw now surfaces a precise post-restart classification instead of a generic transport failureReadyResidual
Connection refusedstate. That remains a gateway/runtime limitation, not something this PR tries to bypass.Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores
Documentation