docs(troubleshooting): add reconnect steps after host reboot #911
cjagwani merged 1 commit into NVIDIA:main
📝 Walkthrough
Added a new troubleshooting section documenting the reconnection workflow after a host reboot. The section provides step-by-step commands to restart the container runtime, verify sandbox state, conditionally restart the OpenShell gateway, and reconnect the client. It also includes a warning about a destructive command.
Estimated code review effort
🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
docs/reference/troubleshooting.md (5)
166-166: Passive voice detected.
"the OpenShell gateway may need to be restarted" is passive. Use active voice: "restart the OpenShell gateway" or "you may need to restart the OpenShell gateway."
As per coding guidelines: Active voice required. Flag passive constructions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` at line 166, The sentence uses passive voice ("the OpenShell gateway may need to be restarted"); change it to active voice by replacing it with an active construction such as "Restart the OpenShell gateway." or "You may need to restart the OpenShell gateway." Update the sentence in troubleshooting.md where the phrase appears so the guidance reads actively.
153-154: Consider removing bold from platform labels.
"Linux:" and "macOS:" use bold on routine category labels. The coding guidelines reserve bold for UI labels, parameter names, and genuine warnings. Plain text or a different formatting approach (e.g., subheadings) would be more appropriate.
As per coding guidelines: "Unnecessary bold on routine instructions ... Bold is reserved for UI labels, parameter names, and genuine warnings." (LLM pattern detected)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 153 - 154, Remove the bold formatting from the routine platform labels "Linux:" and "macOS:" in the troubleshooting note so they are plain text (or convert them to subheadings if preferred); update the lines that currently read "**Linux:** start Docker..." and "**macOS:** open Docker Desktop..." to use regular text or appropriate heading syntax to comply with the guideline that bold is reserved for UI labels, parameter names, and warnings.
148-148: Passive voice detected.
"may not be running" is passive. Rewrite in active voice, for example: "the container runtime, OpenShell gateway, and sandbox may have stopped" or "these components do not run automatically after reboot."
As per coding guidelines: Active voice required. Flag passive constructions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` at line 148, The sentence "After a host reboot, the container runtime, OpenShell gateway, and sandbox may not be running." uses passive voice; rewrite it in active voice by replacing it with a direct statement such as "After a host reboot, the container runtime, OpenShell gateway, and sandbox may have stopped" or "These components do not run automatically after reboot," ensuring the new wording is active and clearly indicates the action/state for the listed components.
151-180: Remove numbering and bold formatting from step headers.
The coding guidelines prohibit numbering steps in this format ("Step 1:", "Step 2:", etc.) and reserve bold for UI labels, parameter names, and genuine warnings, not routine procedural headings.
Replace the numbered bold headers with plain unnumbered subheadings or remove the bold formatting.
As per coding guidelines: "Do not number section titles. Flag 'Section 1: ...', 'Step 3: ...', etc." and "Unnecessary bold on routine instructions ... Bold is reserved for UI labels, parameter names, and genuine warnings." (LLM pattern detected)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 151 - 180, Remove numeric prefixes and bold formatting from the step headers in the troubleshooting section: change "**1. Start the container runtime**", "**2. Check sandbox state**", "**3. Restart the gateway (if needed)**", "**4. Reconnect**", and "**5. Restart auxiliary services (if needed)**" to plain unnumbered subheadings or simple sentences (e.g., "Start the container runtime", "Check sandbox state", "Restart the gateway (if needed)", "Reconnect", "Restart auxiliary services (if needed)") so they follow the guideline that section titles must not be numbered and routine instructions should not be bolded; keep existing code blocks and commands intact (e.g., the `openshell sandbox list` and `openshell gateway start --name nemoclaw` examples).
192-192: Use present tense and remove unnecessary bold.
Two issues:
- "will be lost" uses future tense. The guidelines require present tense for clarity and directness. Rewrite as "workspace files are lost" or "this destroys workspace files."
- "destroys and recreates" uses unnecessary bold. The text already appears in a warning admonition, so additional bold is redundant and follows an LLM-generated pattern.
As per coding guidelines: "Present tense. Flag future tense ('will') in descriptions of current behavior." and "Unnecessary bold on routine instructions ... Bold is reserved for UI labels, parameter names, and genuine warnings." (LLM pattern detected)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` at line 192, Edit the sentence containing "destroys and recreates" and "will be lost": remove the bolding around "destroys and recreates" and change the future tense "will be lost" to present tense (e.g., "workspace files are lost" or "this destroys workspace files"), so the admonition reads without extra emphasis and uses present-tense description.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/completion.js`:
- Around line 6-10: Add the missing "setup" entry to the GLOBAL_COMMANDS array
so tab completion suggests the deprecated setup command (update the array in
completion.js to include "setup"); then update the zsh and fish completion
scripts to include the setup option with a short description like "setup:Legacy
setup (deprecated)" so the completion helpers present the command and
description consistently with the existing "setup" declaration in nemoclaw.js;
ensure all three places (GLOBAL_COMMANDS, zsh completion, fish completion) use
the exact token "setup" and the same description text.
- Line 209: The fish completion's hardcoded variable global_cmds is missing the
"setup" subcommand; update the list assigned to global_cmds in
bin/lib/completion.js to include "setup" (so the array/string containing onboard
list deploy setup-spark start stop status debug uninstall help completion also
contains setup) to keep it in sync with the other completion fixes.
- Around line 109-125: The zsh completion's global_cmds array in
bin/lib/completion.js is missing the deprecated "setup" command; update the
global_cmds=( ... ) array (the global_cmds symbol) to include an entry for
'setup:Deprecated setup' (or similar deprecation text) alongside the other
entries so zsh completion suggests the deprecated setup command.
📒 Files selected for processing (4)
bin/lib/completion.js
bin/nemoclaw.js
docs/reference/troubleshooting.md
test/completion.test.js
Addressed CodeRabbit feedback:
Actionable comments posted: 2
🧹 Nitpick comments (1)
docs/reference/troubleshooting.md (1)
151-180: Replace bold step labels with proper subheadings or plain text labels.
Using bold for routine step headers (`**1. ...**`) matches an LLM-style pattern and reduces consistency with the docs style.
As per coding guidelines: “Unnecessary bold on routine instructions … should be flagged” and “LLM-generated patterns … flag every occurrence” (LLM pattern detected).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 151 - 180, Replace the bolded step labels (e.g., "**1. Start the container runtime**", "**2. Check sandbox state**", "**3. Restart the gateway (if needed)**", "**4. Reconnect**", "**5. Restart auxiliary services (if needed)**") with proper markdown subheadings or plain text labels consistent with docs style (e.g., "1. Start the container runtime" as a level-3 heading or an unbolded numbered line); keep the command examples like "openshell sandbox list", "openshell gateway start --name nemoclaw", and "nemoclaw <name> connect" unchanged, ensuring headings match surrounding section formatting and remove the unnecessary bold formatting that triggered the LLM-style pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/reference/troubleshooting.md`:
- Around line 180-186: Update the heading text to accurately reflect the command
behavior by changing the section title from "Restart auxiliary services (if
needed)" to "Start auxiliary services (if needed)" and keep the body text "start
them again" unchanged; alternatively, if restart semantics are required, replace
the single `nemoclaw start` instruction with the explicit restart sequence
`nemoclaw stop` followed by `nemoclaw start` so the heading and command
(`nemoclaw start`) are consistent.
- Around line 191-193: Update the troubleshooting text to clarify that "nemoclaw
onboard" only destroys and recreates an existing sandbox if the user explicitly
confirms the prompt or sets NEMOCLAW_RECREATE_SANDBOX=1 in non-interactive mode;
reference the onboard flow implemented in bin/lib/onboard.js (the confirmation
prompt for recreation) and reword the sentence to indicate that without explicit
confirmation the existing sandbox is preserved and data loss will not occur.
📒 Files selected for processing (1)
docs/reference/troubleshooting.md
♻️ Duplicate comments (2)
docs/reference/troubleshooting.md (2)
180-186: ⚠️ Potential issue | 🟡 Minor
Heading says “Restart” but the command only starts services.
This is the same heading/behavior mismatch already raised earlier; either change the heading to “Start auxiliary services (if needed)” or document an explicit restart sequence.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 180 - 186, The heading "Restart auxiliary services (if needed)" is misleading because the snippet only shows the start command; update the docs so heading and commands match: either change the heading to "Start auxiliary services (if needed)" or add an explicit restart sequence (e.g., show stop then start or a restart command) under the same heading; specifically edit the section containing the heading "Restart auxiliary services (if needed)" and the command example using "nemoclaw start" to ensure they describe the same action.
191-193: ⚠️ Potential issue | 🟠 Major
Destructive behavior wording for `nemoclaw onboard` is still too absolute.
This repeats the previously raised issue: recreation/destruction is conditional (confirmation or explicit non-interactive recreate), so the warning should reflect that condition to avoid overstating guaranteed data loss.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 191 - 193, The current sentence asserts that `nemoclaw onboard` always "destroys and recreates" the sandbox; update the warning to state this is conditional by noting that `nemoclaw onboard` will destroy and recreate workspace files only if you confirm the prompt or explicitly invoke a non-interactive recreate option (e.g., a --recreate/--force style flag), so users know data loss occurs only when they consent or run with that explicit flag; keep the same list of affected files (SOUL.md, USER.md, IDENTITY.md, AGENTS.md, MEMORY.md) and retain the backup pointer to Back Up and Restore.
🧹 Nitpick comments (1)
docs/reference/troubleshooting.md (1)
153-154: Remove bold styling from platform labels in routine bullets.
`**Linux:**` and `**macOS:**` are routine instruction labels; plain text is preferable here.
LLM pattern detected. As per coding guidelines, “Unnecessary bold on routine instructions … Bold is reserved for UI labels, parameter names, and genuine warnings.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 153 - 154, Remove the unnecessary bold formatting from the routine platform labels by changing the "**Linux:**" and "**macOS:**" bullet prefixes to plain text "Linux:" and "macOS:" in the troubleshooting document; update the two lines that currently read "**Linux:** start Docker if it is not already running (`sudo systemctl start docker`)" and "**macOS:** open Docker Desktop or start Colima (`colima start`)" to use non-bold labels "Linux:" and "macOS:" so the labels follow the guideline reserving bold for UI labels, parameter names, and warnings.
📒 Files selected for processing (1)
docs/reference/troubleshooting.md
Re: review 4009592389 — both duplicate issues were already fixed in b66846d (pushed before this review ran):
Nitpick (bold on Linux:/macOS: labels): keeping as-is, since bold aids scannability in a bullet list with inline code.
🧹 Nitpick comments (1)
docs/reference/troubleshooting.md (1)
166-166: Avoid colon punctuation here unless it introduces a list.
This sentence uses a colon before a command block; per style, use a period instead.
As per coding guidelines: "Colons should only introduce a list. Flag colons used as general punctuation between clauses."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` at line 166, The sentence "If the sandbox is not listed or the command fails, restart the OpenShell gateway:" should end with a period instead of a colon; update that line to "If the sandbox is not listed or the command fails, restart the OpenShell gateway." so the punctuation follows the guideline that colons only introduce lists.
📒 Files selected for processing (1)
docs/reference/troubleshooting.md
LGTM. All CI green. Docs are clear, with platform-specific steps and a proper destructive-action warning. Shell completion is a solid addition, with good coverage across bash/zsh/fish and tests to keep debug flags in sync. Closes #469 cleanly.
66ebc06 to f761dd6
Add a "Reconnect after a host reboot" section with a 5-step recovery sequence: start the container runtime, check sandbox state, restart the gateway if needed, reconnect, and start auxiliary services.

Include a warning that nemoclaw onboard prompts before destroying an existing sandbox and that workspace files are lost on recreation.

Fixes NVIDIA#469

Co-Authored-By: Ritik Kadyan <RitikKadyan@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
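
The recovery sequence described in this commit message could be sketched as a small shell script. This is only an illustration under stated assumptions, not a script from the PR: the names `run`, `reconnect_after_reboot`, `SANDBOX_NAME`, and `DRY_RUN` are hypothetical, and the only real commands used are the ones quoted in the review thread (`sudo systemctl start docker`, `openshell sandbox list`, `openshell gateway start --name nemoclaw`, `nemoclaw <name> connect`, and `nemoclaw start`). The sketch restarts the gateway only when `openshell sandbox list` exits with a non-zero status; it does not check whether the sandbox appears in the listing, because the output format is not shown in this thread.

```shell
#!/bin/sh
# Hedged sketch of the reconnect sequence; not an official NemoClaw script.
# SANDBOX_NAME, DRY_RUN, run, and reconnect_after_reboot are illustrative names.

SANDBOX_NAME="${SANDBOX_NAME:-my-sandbox}"  # hypothetical sandbox name
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reconnect_after_reboot() {
  # Step 1: start the container runtime (Linux; on macOS open Docker Desktop
  # or run `colima start` instead).
  run sudo systemctl start docker

  # Steps 2 and 3: check sandbox state; restart the gateway only if the check fails.
  # This tests the exit status only. It does not parse the listing.
  if ! run openshell sandbox list; then
    run openshell gateway start --name nemoclaw
  fi

  # Step 4: reconnect to the sandbox.
  run nemoclaw "$SANDBOX_NAME" connect

  # Step 5: start auxiliary services if they are needed.
  run nemoclaw start
}
```

With the default `DRY_RUN=1`, calling `reconnect_after_reboot` only prints the commands it would run, which makes it safe to inspect the sequence first. Setting `DRY_RUN=0` executes them. The sketch deliberately leaves out `nemoclaw onboard`, since the thread treats it as a last resort that can destroy workspace files.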
8076acb to 3fd5d2b
Replace external docs.nvidia.com/nemoclaw links with relative paths to the in-repo /docs/ directory. The external links were valid but outdated since PRs NVIDIA#911 and NVIDIA#603 added the corresponding pages to the repository.

Closes NVIDIA#968

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Anh Quang Nguyen <29374105+aprprprr@users.noreply.github.com>
Summary
- `nemoclaw onboard` is destructive (destroys workspace files) and should only be used as a last resort
Closes #469
Related
- `nemoclaw reconnect` command to automate this sequence
Acknowledgment
Credit to @RitikKadyan whose PR #474 first tackled this issue. This PR builds on that work with additional steps (gateway restart) and a data loss warning. Ritik is included as a co-author on the commit.
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit