
fix(onboard): probe-and-pick WSL2 host IP candidates for local inference #1864

Open

gburachas wants to merge 3 commits into main from
fix/ollama-inference-timeout-1472

Conversation

@gburachas

@gburachas gburachas commented Apr 14, 2026

Summary

On WSL2 with Docker Desktop, host.openshell.internal (via Docker's host-gateway) resolves to an unreachable IPv6 ULA or an unroutable gateway IP. That breaks:

  • The onboard-time container reachability probe.
  • Runtime inference routing from inside the sandbox to a host-side Ollama or vLLM.

This PR resolves a reachable IPv4 address on WSL2 + Docker Desktop and uses it for both the OPENAI_BASE_URL written into the gateway and the container reachability probe. It now handles both Ollama/vLLM placements correctly:

  • Server inside WSL → reached via the distro's eth0 IPv4 (the src field of ip -4 -o route get 1.1.1.1).
  • Server on the Windows host (NAT mode) → reached via the WSL2 default gateway (ip -4 -o route show default).

detectWsl2HostIpCandidates() returns an ordered, filtered list (drops loopback, link-local, and common Docker/k8s bridge ranges). validateLocalProvider() probes each candidate with the container reachability check and returns the first that answers as resolvedHostIp. onboard.js uses that winner, prints it, and tries to persist it to the sandbox registry. Non-WSL2 platforms are unchanged — host.openshell.internal remains the default.
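The ordering and filtering in detectWsl2HostIpCandidates() can be pictured with a small sketch. This is a hedged illustration only — the helper names and the exact excluded ranges below are assumptions chosen for demonstration, not the repository's actual code:

```typescript
// Illustrative sketch of the candidate filtering described above.
// EXCLUDED_PREFIXES and both helper names are assumptions, not the PR's code.
const EXCLUDED_PREFIXES = [
  "127.",     // loopback
  "169.254.", // link-local
  "172.17.",  // default docker0 bridge
  "10.42.",   // a common k8s (flannel) pod range
];

function isUsableCandidate(ip: string): boolean {
  // IPv4 only: an IPv6 ULA answer is exactly the failure mode being avoided.
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) return false;
  return !EXCLUDED_PREFIXES.some((prefix) => ip.startsWith(prefix));
}

function orderCandidates(
  eth0Src: string | null,
  defaultGateway: string | null,
): string[] {
  // Ordered list: WSL-local server address first, Windows-host gateway second.
  const ordered: string[] = [];
  for (const ip of [eth0Src, defaultGateway]) {
    if (ip && isUsableCandidate(ip) && !ordered.includes(ip)) ordered.push(ip);
  }
  return ordered;
}

console.log(orderCandidates("172.29.112.45", "172.29.112.1"));
console.log(orderCandidates("127.0.0.1", "fd00::1")); // both dropped
```

The point of the ordering is that a WSL-hosted server is checked before falling back to the Windows-host gateway.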

Host-side prerequisites (for Windows-hosted Ollama)

When Ollama runs on the Windows host (not inside WSL), the fix only helps once the host itself is reachable from WSL. Run these before onboarding (full step-by-step with shell labels lives in docs/reference/troubleshooting.md):

  1. Bind Ollama to all interfaces — PowerShell (Administrator):
    [System.Environment]::SetEnvironmentVariable('OLLAMA_HOST','0.0.0.0:11434','Machine')
    Then restart Ollama from a fresh PowerShell.
  2. Allow inbound TCP 11434 in Windows Defender Firewall — PowerShell (Administrator):
    New-NetFirewallRule -DisplayName "Ollama 11434 (WSL)" -Direction Inbound -Protocol TCP -LocalPort 11434 -Action Allow -Profile Any
  3. Switch WSL2 to mirrored networking mode. NAT mode on recent Windows 11 routes WSL traffic through a separate Hyper-V firewall layer that ignores standard inbound rules (NATInboundRuleNotApplicable). Add to %USERPROFILE%\.wslconfig:
    [wsl2]
    networkingMode=mirrored
    Then wsl --shutdown from PowerShell and reopen WSL.
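Once those prerequisites are in place, the two route lookups named above can be sanity-checked by hand inside WSL. A minimal sketch — the pipelines feed the parsers sample `ip` output so the extraction is visible; the addresses are made up for illustration:

```shell
# Candidate for "server inside WSL": the distro's eth0 IPv4,
# i.e. the src field of `ip -4 -o route get 1.1.1.1`.
echo "1.1.1.1 via 172.29.112.1 dev eth0 src 172.29.112.45 uid 1000" |
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") print $(i + 1) }'

# Candidate for "server on the Windows host": the WSL2 default gateway,
# i.e. the third field of `ip -4 -o route show default`.
echo "default via 172.29.112.1 dev eth0 proto kernel" |
  awk '/^default/ { print $3 }'
```

Replace each `echo ... |` with the real `ip` command to get your own candidates; a quick `curl -4 -sf --max-time 5 http://<candidate>:11434/api/tags` then confirms whether Ollama answers on one of them.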

This PR itself does not change any of the above — it handles the NemoClaw/Docker side once the host is reachable.

Linked issues

Addressed in this PR

  • #1472 — host.openshell.internal unreachable on WSL2 + Docker Desktop; runtime inference routing.
  • #336 — sandbox cannot reach Windows-hosted Ollama; covers points 1 and 3 of the reporter's reproduction.

Related but not fixed here

  • #305 — WSL2 tracking issue; this PR helps the inference path only, not the gateway bootstrap / image-pull / TLS issues listed there.
  • #315 — vLLM-on-WSL2 walkthrough; the sandbox egress iptables/veth workarounds documented there are orthogonal.
  • #246 — Ollama reasoning-model blank-content bug; separate issue, pick a non-reasoning model until it's fixed.

Verification

Verified end-to-end on WSL2 (mirrored networking) + Docker Desktop + Windows-hosted Ollama, with qwen2.5-coder:3b: onboard reaches step 8/8, prints the resolved host IP, and a chat completion from inside the sandbox returns in ~3s with no timeout.

Unit tests: 43/43 pass in src/lib/local-inference.test.ts, covering candidate ordering, filtering (docker/loopback/link-local/IPv6), probe-and-pick fallback to the default gateway, resolvedHostIp return value, and non-WSL2 regression path.
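The probe-and-pick fallback exercised by those tests amounts to a first-success loop over the candidate list. A sketch under stated assumptions — pickFirstReachable and the stub probe are illustrative names, not the PR's identifiers, and the real probe is the docker-run curl reachability check:

```typescript
// Illustrative probe-and-pick loop: try each candidate in order and
// return the first whose probe resolves. Names here are assumptions.
async function pickFirstReachable(
  candidates: string[],
  probe: (ip: string) => Promise<boolean>,
): Promise<string | null> {
  for (const ip of candidates) {
    if (await probe(ip)) return ip; // first candidate that answers wins
  }
  return null; // caller falls back to default host.openshell.internal behavior
}

// Demo with a stub probe: only the gateway-style ".1" address "answers",
// mirroring the Windows-hosted (NAT) placement.
pickFirstReachable(
  ["172.29.112.45", "172.29.112.1"],
  async (ip) => ip.endsWith(".1"),
).then((winner) => console.log("resolvedHostIp:", winner));
```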

Known follow-ups (tracked separately)

  • registry.updateSandbox(sandboxName, {resolvedHostIp, model, provider}) is a no-op at step 4/8 because the sandbox entry is created at step 6/8 (same pre-existing bug affects model and provider).
  • Onboard does not always return cleanly to the shell after step 8/8.

Type of Change

  • Code change for a bug fix.

Testing

  • Unit tests added for new helpers and branches (detectWsl2HostIpCandidates, probe-and-pick fallback, candidate filtering).
  • npm run typecheck:cli passes.
  • Targeted vitest --project cli src/lib/local-inference.test.ts — 43/43 pass.
  • Manual E2E on WSL2 + Docker Desktop + Windows Ollama: onboard completes, sandbox → Ollama chat roundtrip ~3s.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Resolved local inference connectivity on WSL2 + Docker Desktop by probing multiple host IP candidates and routing via the first reachable address.
    • Onboarding validation now prompts on warnings instead of exiting immediately.
  • New Features

    • Automatically persists the detected host IP for subsequent local-inference sessions.
  • Documentation

    • Added detailed troubleshooting for WSL2 + Docker Desktop networking and manual workarounds.
  • Tests

    • Expanded tests covering host-IP detection and validation flows.

gburachas and others added 2 commits April 10, 2026 17:46
#1472)

On WSL2 with Docker Desktop, `host.openshell.internal` resolves to an
unreachable IPv6 ULA or gateway IP.  This breaks both the onboard
container reachability check (step 4/8 hard-exits) and runtime inference
routing (proxy cannot reach upstream Ollama).

Changes:
- detect WSL2 + Docker Desktop at onboard time and resolve the distro's
  eth0 IPv4 via `hostname -I`
- pass the reachable IP to `OPENAI_BASE_URL` and the container
  reachability probe instead of `host.openshell.internal`
- add `-4` flag to curl in the reachability check to force IPv4
- replace hard `process.exit(1)` with a "Continue anyway?" prompt when
  the container reachability check fails
- improve the error message with actionable diagnostic causes

Non-WSL2 platforms (macOS, native Linux, Docker Engine) are unaffected;
`host.openshell.internal` remains the default when no override is needed.

Signed-off-by: Giedrius Burachas <gburachas@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extends the earlier host.openshell.internal override to cover both
Ollama/vLLM server placements on WSL2 + Docker Desktop:

  A. Server inside WSL  — reached via the distro's eth0 IPv4
     (the `src` field of `ip -4 -o route get 1.1.1.1`).
  B. Server on Windows host (NAT mode) — reached only via the WSL2
     default gateway (`ip -4 -o route show default`).

`detectWsl2HostIpCandidates()` now returns an ordered, filtered list
(drops loopback, link-local, and common Docker/k8s bridge ranges).
`validateLocalProvider()` probes each candidate with the container
reachability check and returns the first that answers as
`resolvedHostIp`. `onboard.js` uses that winner for OPENAI_BASE_URL and
the in-container probe, prints the resolved IP, and tries to persist
it to the sandbox registry. Non-WSL2 platforms are unchanged.

Verified end-to-end: onboard with ollama-local + Windows-hosted Ollama
over WSL2 mirrored networking reaches step 8/8, prints a resolved host
IP, and a 5-turn chat completion from inside the sandbox returns in
~3s with no timeout.

Adds a note to docs/CONTRIBUTING.md telling AI contributors not to
stage `.agents/skills/nemoclaw-user-*/` — those files are regenerated
from `docs/` by the pre-commit hook.

Addresses: #1472 (host.openshell.internal unreachable on WSL2 +
Docker Desktop, runtime inference routing), #336 (sandbox cannot reach
Windows-hosted Ollama; covers points 1 and 3 of the reporter's
reproduction).

Related (not fixed here):
  - #305 — WSL2 tracking issue; this PR helps the inference path only,
    not the gateway bootstrap / image-pull / TLS issues listed there.
  - #315 — vLLM-on-WSL2 walkthrough; sandbox egress iptables/veth
    workarounds documented there are orthogonal.
  - #246 — Ollama reasoning-model blank-content bug; separate issue,
    pick a non-reasoning model until it's fixed.

Known follow-ups, not in this PR:
  - `registry.updateSandbox(sandboxName, {resolvedHostIp, ...})` is a
    no-op at step 4/8 because the sandbox entry is created at step 6/8;
    persistence order needs fixing (same pre-existing bug affects
    `model` and `provider` fields).
  - Onboard does not always return cleanly to the shell after step 8/8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Apr 14, 2026

📝 Walkthrough
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 75.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Title check — ✅ Passed. The title clearly and specifically describes the main change: probing and selecting WSL2 host IP candidates to restore local inference.
  • Linked Issues check — ✅ Passed. The PR implements all coding requirements from issue #1472: detects reachable host IPs via probing on WSL2, restores container reachability to local Ollama/vLLM endpoints, and persists the resolved IP for consistent routing.
  • Out of Scope Changes check — ✅ Passed. All changes are directly related to resolving the WSL2 host IP reachability issue: local inference validation logic, onboarding flow, documentation, tests, registry persistence, and contributor guidance.
  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

When Ollama runs on the Windows host (not inside WSL), NemoClaw's
auto-detection only helps once the host itself is actually reachable
from WSL. Document the three prerequisite steps, each labeled with the
shell to run it in (PowerShell vs WSL):

  1. Bind Ollama to 0.0.0.0 via OLLAMA_HOST at Machine scope.
  2. Allow inbound TCP 11434 in Windows Defender Firewall.
  3. Switch WSL2 to mirrored networking mode (NAT mode hits
     NATInboundRuleNotApplicable on the Hyper-V firewall layer).

Also improves the fallback troubleshooting list with explicit shell
labels and the gateway-IP manual override for users who cannot switch
to mirrored mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
docs/CONTRIBUTING.md (1)

36-36: Please align this note with docs style rules (bold/colon/line-splitting).

This line has three style issues: unnecessary bold on routine instruction (LLM pattern detected), a colon not introducing a list, and multiple sentences on one source line.

Suggested edit
-**For AI coding assistants:** Do not `git add` any file under `.agents/skills/nemoclaw-user-*/` — not even when `git status` shows it as modified. The pre-commit hook regenerates and stages those files automatically from `docs/`. Staging them manually makes the commit diff harder to review and can mask out-of-date hand edits. If you changed user-facing behavior, update the matching page under `docs/` and stage only `docs/**/*.md`; the hook does the rest.
+For AI coding assistants, do not `git add` any file under `.agents/skills/nemoclaw-user-*/`, even when `git status` shows it as modified.
+The pre-commit hook regenerates and stages those files automatically from `docs/`.
+Staging them manually makes the commit diff harder to review and can mask out-of-date hand edits.
+If you changed user-facing behavior, update the matching page under `docs/` and stage only `docs/**/*.md`.
+The hook does the rest.

As per coding guidelines: “One sentence per line in source”, “Colons should only introduce a list”, and “Unnecessary bold on routine instructions … flag as suggestions with the note ‘LLM pattern detected.’”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/CONTRIBUTING.md` at line 36, Update the sentence about AI coding
assistants to follow docs style: remove the bold formatting around the routine
instruction, change the colon so it either introduces a list or is replaced with
a period, and split the content so there is one sentence per source line;
reference the exact text mentioning `.agents/skills/nemoclaw-user-*/` and
`docs/` when making the change and add a short suggestion note "LLM pattern
detected" (not bold) after the instruction to indicate it’s a stylistic
suggestion rather than a rule.
docs/reference/troubleshooting.md (1)

158-200: Reflow this section to one sentence per line and remove routine bold emphasis.

The new prose is wrapped mid-sentence across multiple source lines, and bolding phrases like “inside WSL”, “Windows host”, and “mirrored” reads like routine emphasis rather than a warning. LLM pattern detected.

As per coding guidelines, "One sentence per line in source (makes diffs readable)" and "Bold is reserved for UI labels, parameter names, and genuine warnings."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` around lines 158 - 200, The "Local
inference on WSL2 + Docker Desktop" section has wrapped sentences across lines
and uses routine bolding (e.g., "**inside WSL**", "**Windows host**",
"**mirrored**"); reflow every sentence so each ends on its own source line and
remove bold from routine emphasis (leave bold only for UI labels/params/warnings
like OPENAI_BASE_URL, host.openshell.internal, OLLAMA_HOST), preserving content
and examples (ip commands, env vars, resolvedHostIp, NO_PROXY) and ensuring
lists and bullet points remain one sentence per line for clearer diffs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 3405-3409: The call to registry.updateSandbox(...) is writing
using the temporary GATEWAY_NAME before the real sandbox exists (setupInference
runs before createSandbox), so its return is false and the data is lost; fix by
deferring the persistence until after the sandbox is registered (i.e., after
createSandbox completes inside onboard) or by caching the {model, provider,
resolvedHostIp} in session state and applying them when registry.createSandbox /
registry.updateSandbox is called for the real sandbox name; specifically modify
the flow around setupInference(), createSandbox(), and the
registry.updateSandbox call so you either move the registry.updateSandbox
invocation to post-createSandbox or add a session/cache write-read that
registry.updateSandbox consumes once the real sandbox entry exists.
- Around line 3324-3331: On validation failure in the local-provider probe
branches (the blocks that check validation.ok, call prompt(), and call
process.exit(1)), respect the global non-interactive flag instead of always
invoking prompt(): if the non-interactive mode flag (the same boolean used
elsewhere in the wizard, e.g., nonInteractive or flags.nonInteractive) is set
then log the validation.message and immediately call process.exit(1) (no
prompt), otherwise keep the existing interactive prompt flow; apply this same
guard to both places that call prompt() (the shown block and the other branch
around lines 3362-3369).

In `@src/lib/local-inference.ts`:
- Around line 85-96: The curl-based container reachability commands returned by
getLocalProviderContainerReachabilityCheck (cases "vllm-local" and
"ollama-local") lack timeout flags and can block on blackholed IPs; modify
getLocalProviderContainerReachabilityCheck to add the same timeout options used
by getOllamaProbeCommand (e.g., --max-time and optionally --connect-timeout) to
the returned command strings so each probe times out quickly and the caller loop
(the WSL2 IP candidate loop) can proceed promptly.

---

Nitpick comments:
In `@docs/CONTRIBUTING.md`:
- Line 36: Update the sentence about AI coding assistants to follow docs style:
remove the bold formatting around the routine instruction, change the colon so
it either introduces a list or is replaced with a period, and split the content
so there is one sentence per source line; reference the exact text mentioning
`.agents/skills/nemoclaw-user-*/` and `docs/` when making the change and add a
short suggestion note "LLM pattern detected" (not bold) after the instruction to
indicate it’s a stylistic suggestion rather than a rule.

In `@docs/reference/troubleshooting.md`:
- Around line 158-200: The "Local inference on WSL2 + Docker Desktop" section
has wrapped sentences across lines and uses routine bolding (e.g., "**inside
WSL**", "**Windows host**", "**mirrored**"); reflow every sentence so each ends
on its own source line and remove bold from routine emphasis (leave bold only
for UI labels/params/warnings like OPENAI_BASE_URL, host.openshell.internal,
OLLAMA_HOST), preserving content and examples (ip commands, env vars,
resolvedHostIp, NO_PROXY) and ensuring lists and bullet points remain one
sentence per line for clearer diffs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a21e145a-cafe-493c-b807-56b5bd3afcb8

📥 Commits

Reviewing files that changed from the base of the PR and between cd36c58 and 5800da8.

📒 Files selected for processing (6)
  • bin/lib/onboard.js
  • docs/CONTRIBUTING.md
  • docs/reference/troubleshooting.md
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • src/lib/registry.ts

Comment thread bin/lib/onboard.js
Comment on lines 3324 to +3331
 if (!validation.ok) {
   console.error(`  ${validation.message}`);
-  process.exit(1);
+  const answer = (await prompt("  Continue anyway? Inference may fail at runtime. [y/N]: "))
+    .trim()
+    .toLowerCase();
+  if (answer !== "y") {
+    process.exit(1);
+  }

⚠️ Potential issue | 🟠 Major

Keep local-provider probe failures non-interactive in --non-interactive mode.

These branches now call prompt() unconditionally on validation failure. If nemoclaw onboard --non-interactive selects ollama or vllm, a bad probe turns into a hang instead of the hard failure the rest of the wizard uses.

Suggested fix
 if (!validation.ok) {
   console.error(`  ${validation.message}`);
+  if (isNonInteractive()) {
+    process.exit(1);
+  }
   const answer = (await prompt("  Continue anyway? Inference may fail at runtime. [y/N]: "))
     .trim()
     .toLowerCase();
   if (answer !== "y") {
     process.exit(1);
   }
 }

Apply the same guard in both local-provider branches.

Also applies to: 3362-3369

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 3324 - 3331, On validation failure in the
local-provider probe branches (the blocks that check validation.ok, call
prompt(), and call process.exit(1)), respect the global non-interactive flag
instead of always invoking prompt(): if the non-interactive mode flag (the same
boolean used elsewhere in the wizard, e.g., nonInteractive or
flags.nonInteractive) is set then log the validation.message and immediately
call process.exit(1) (no prompt), otherwise keep the existing interactive prompt
flow; apply this same guard to both places that call prompt() (the shown block
and the other branch around lines 3362-3369).

Comment thread bin/lib/onboard.js
Comment on lines +3405 to +3409
registry.updateSandbox(sandboxName, {
  model,
  provider,
  resolvedHostIp: resolvedHostIp || null,
});

⚠️ Potential issue | 🟠 Major

This registry write never reaches the real sandbox entry.

setupInference() still runs before createSandbox(), and onboard() passes GATEWAY_NAME here (Line 4415), not the eventual sandbox name. That makes registry.updateSandbox(...) return false and silently drop model, provider, and resolvedHostIp for normal onboard runs.

Please move this persistence until after the sandbox has been registered, or store it in session state and apply it once the real sandbox entry exists.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 3405 - 3409, The call to
registry.updateSandbox(...) is writing using the temporary GATEWAY_NAME before
the real sandbox exists (setupInference runs before createSandbox), so its
return is false and the data is lost; fix by deferring the persistence until
after the sandbox is registered (i.e., after createSandbox completes inside
onboard) or by caching the {model, provider, resolvedHostIp} in session state
and applying them when registry.createSandbox / registry.updateSandbox is called
for the real sandbox name; specifically modify the flow around setupInference(),
createSandbox(), and the registry.updateSandbox call so you either move the
registry.updateSandbox invocation to post-createSandbox or add a session/cache
write-read that registry.updateSandbox consumes once the real sandbox entry
exists.

Comment on lines +85 to +96
 export function getLocalProviderContainerReachabilityCheck(
   provider: string,
   hostIp?: string,
 ): string | null {
   const addHost = hostIp
     ? `--add-host host.openshell.internal:${hostIp}`
     : "--add-host host.openshell.internal:host-gateway";
   switch (provider) {
     case "vllm-local":
-      return `docker run --rm --add-host host.openshell.internal:host-gateway ${CONTAINER_REACHABILITY_IMAGE} -sf http://host.openshell.internal:8000/v1/models 2>/dev/null`;
+      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 -sf http://host.openshell.internal:8000/v1/models 2>/dev/null`;
     case "ollama-local":
-      return `docker run --rm --add-host host.openshell.internal:host-gateway ${CONTAINER_REACHABILITY_IMAGE} -sf http://host.openshell.internal:11434/api/tags 2>/dev/null`;
+      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 -sf http://host.openshell.internal:11434/api/tags 2>/dev/null`;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and get its total line count
if [ -f "src/lib/local-inference.ts" ]; then
  echo "File exists. Line count:"
  wc -l src/lib/local-inference.ts
else
  echo "File not found"
fi

Repository: NVIDIA/NemoClaw

Length of output: 1870


🏁 Script executed:

# Read the specific lines mentioned in the review (85-96 and 201-210)
echo "=== Lines 85-96 ==="
sed -n '85,96p' src/lib/local-inference.ts

echo ""
echo "=== Lines 201-210 ==="
sed -n '201,210p' src/lib/local-inference.ts

Repository: NVIDIA/NemoClaw

Length of output: 1185


🏁 Script executed:

# Check for SPDX license header at the top of the file
echo "=== First 10 lines (checking for SPDX header) ==="
head -10 src/lib/local-inference.ts

Repository: NVIDIA/NemoClaw

Length of output: 508


🏁 Script executed:

# Search for all curl commands in the file to understand the context
echo "=== All curl/container reachability commands in the file ==="
rg "curl|getLocalProviderContainerReachabilityCheck" src/lib/local-inference.ts -A 3

Repository: NVIDIA/NemoClaw

Length of output: 1505


Add timeout flags to container reachability probe commands.

The getLocalProviderContainerReachabilityCheck() function returns curl commands without timeout flags. When this function is called in a loop for multiple WSL2 host IP candidates (lines 201-210), a blackholed IP can block progress for minutes before attempting the next candidate. This is inconsistent with getOllamaProbeCommand() in the same file, which already includes --max-time.

Suggested fix
 export function getLocalProviderContainerReachabilityCheck(
   provider: string,
   hostIp?: string,
 ): string | null {
   const addHost = hostIp
     ? `--add-host host.openshell.internal:${hostIp}`
     : "--add-host host.openshell.internal:host-gateway";
   switch (provider) {
     case "vllm-local":
-      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 -sf http://host.openshell.internal:8000/v1/models 2>/dev/null`;
+      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 --connect-timeout 2 --max-time 5 -sf http://host.openshell.internal:8000/v1/models 2>/dev/null`;
     case "ollama-local":
-      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 -sf http://host.openshell.internal:11434/api/tags 2>/dev/null`;
+      return `docker run --rm ${addHost} ${CONTAINER_REACHABILITY_IMAGE} -4 --connect-timeout 2 --max-time 5 -sf http://host.openshell.internal:11434/api/tags 2>/dev/null`;
     default:
       return null;
   }
 }

Also applies to: 201-210

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/local-inference.ts` around lines 85 - 96, The curl-based container
reachability commands returned by getLocalProviderContainerReachabilityCheck
(cases "vllm-local" and "ollama-local") lack timeout flags and can block on
blackholed IPs; modify getLocalProviderContainerReachabilityCheck to add the
same timeout options used by getOllamaProbeCommand (e.g., --max-time and
optionally --connect-timeout) to the returned command strings so each probe
times out quickly and the caller loop (the WSL2 IP candidate loop) can proceed
promptly.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/reference/troubleshooting.md`:
- Around line 255-257: The fenced code block containing "```bash" and "ip route
show default | awk '/default/ {print $3}'" needs a blank line immediately before
the opening ``` and a blank line immediately after the closing ``` to satisfy
MD031 (blanks-around-fences); update the nested fenced block in the markdown
list so there is an empty line above and below the triple-backtick fence.
- Around line 176-179: The documentation currently states that the first
successful container-side probe is definitively "injected into both
`OPENAI_BASE_URL` and the reachability check, and persisted to the sandbox
registry entry as `resolvedHostIp`"; change this absolute language to reflect
best-effort behavior by replacing claims of guaranteed persistence with phrases
like "attempts to persist" or "is attempted to be persisted" for
`resolvedHostIp`, and clarify that `OPENAI_BASE_URL` and the reachability check
receive the candidate when the probe succeeds, noting this occurs during the
onboarding flow and that persistence ordering/guarantees are not strict.
- Around line 192-200: The CLI examples currently use `powershell`/`bash` fenced
blocks without the required `$` prompt; update the fenced code blocks to use the
`console` language tag and prefix each command line with a `$` prompt (e.g., for
the PowerShell snippet replace the ```powershell block containing
[System.Environment]::SetEnvironmentVariable('OLLAMA_HOST'...) and the
Get-Process | Where-Object ... | Stop-Process -Force and ollama serve lines with
a ```console block and add `$ ` before each command), and apply the same
transformation to the other referenced blocks (lines 204-207, 211-215, 232-234,
238-241, 255-257) so all CLI examples follow the docs guideline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 59fd9dd1-922e-41f4-974d-b4c09699f1ac

📥 Commits

Reviewing files that changed from the base of the PR and between 5800da8 and 768b4f4.

📒 Files selected for processing (1)
  • docs/reference/troubleshooting.md

Comment on lines +176 to +179
The first candidate whose container-side probe succeeds is injected
into both `OPENAI_BASE_URL` and the reachability check, and persisted
to the sandbox registry entry as `resolvedHostIp`. No manual override
is needed for either Ollama placement.

⚠️ Potential issue | 🟠 Major

Avoid absolute wording about resolvedHostIp persistence.

This states persistence as guaranteed, but current behavior is best-effort in onboarding flow and has a known persistence-ordering follow-up. Please soften this to "attempts to persist" to avoid misleading troubleshooting expectations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` around lines 176 - 179, The documentation
currently states that the first successful container-side probe is definitively
"injected into both `OPENAI_BASE_URL` and the reachability check, and persisted
to the sandbox registry entry as `resolvedHostIp`"; change this absolute
language to reflect best-effort behavior by replacing claims of guaranteed
persistence with phrases like "attempts to persist" or "is attempted to be
persisted" for `resolvedHostIp`, and clarify that `OPENAI_BASE_URL` and the
reachability check receive the candidate when the probe succeeds, noting this
occurs during the onboarding flow and that persistence ordering/guarantees are
not strict.

Comment on lines +192 to +200
```powershell
# Persist across reboots; Machine scope so services also inherit it.
[System.Environment]::SetEnvironmentVariable('OLLAMA_HOST','0.0.0.0:11434','Machine')

# Stop Ollama (tray + server) and start it in a new shell so it picks
# up the new env var. Open a NEW PowerShell window first, then:
Get-Process | Where-Object { $_.ProcessName -like 'ollama*' } | Stop-Process -Force
ollama serve
```

⚠️ Potential issue | 🟡 Minor

Use console fenced blocks with $ prompts for CLI commands.

These CLI examples are tagged as powershell/bash, but the docs rule requires console blocks with $ prompt prefixes for command examples.

Suggested formatting adjustment
-```powershell
+```console
+$ [System.Environment]::SetEnvironmentVariable('OLLAMA_HOST','0.0.0.0:11434','Machine')
 ...
-Get-Process | Where-Object { $_.ProcessName -like 'ollama*' } | Stop-Process -Force
-ollama serve
+$ Get-Process | Where-Object { $_.ProcessName -like 'ollama*' } | Stop-Process -Force
+$ ollama serve

As per coding guidelines, "CLI code blocks must use the `console` language tag with `$` prompt prefix. Flag ```bash or ```shell for CLI examples."


Also applies to: 204-207, 211-215, 232-234, 238-241, 255-257

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @docs/reference/troubleshooting.md around lines 192 - 200, The CLI examples
currently use powershell/bash fenced blocks without the required $ prompt;
update the fenced code blocks to use the console language tag and prefix each
command line with a $ prompt (e.g., for the PowerShell snippet replace the

[System.Environment]::SetEnvironmentVariable('OLLAMA_HOST'...) and the
Get-Process | Where-Object ... | Stop-Process -Force and ollama serve lines with
a ```console block and add `$ ` before each command), and apply the same
transformation to the other referenced blocks (lines 204-207, 211-215, 232-234,
238-241, 255-257) so all CLI examples follow the docs guideline.

</details>

Comment on lines +255 to +257
```bash
ip route show default | awk '/default/ {print $3}'
```
Contributor

⚠️ Potential issue | 🟡 Minor

Add blank lines around the nested fenced code block.

This block trips MD031 (blanks-around-fences) in the list item. Insert blank lines before and after the fence to satisfy markdownlint and keep rendering stable.

As per coding guidelines, "Follow style guide in docs/CONTRIBUTING.md for documentation."
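For instance, the nested fence from the flagged list item passes MD031 once padded with blank lines on both sides (the list-item text here is illustrative, not the exact docs wording):

````markdown
3. Find the WSL2 default gateway:

   ```bash
   ip route show default | awk '/default/ {print $3}'
   ```

   Use that IP in place of `host.openshell.internal`.
````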

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 255-255: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 257-257: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` around lines 255 - 257, the fenced code
block containing "```bash" and "ip route show default | awk '/default/ {print
$3}'" needs a blank line immediately before the opening ``` and a blank line
immediately after the closing ``` to satisfy MD031 (blanks-around-fences);
update the nested fenced block in the markdown list so there is an empty line
above and below the triple-backtick fence.

</details>
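Both route lookups this PR relies on (the `src` address from `ip -4 -o route get 1.1.1.1` for a WSL-hosted server, and the gateway from the default route for a Windows-hosted one) can be exercised offline. A minimal Python sketch; `candidate_ips` is a hypothetical stand-in for `detectWsl2HostIpCandidates`, the sample route lines are illustrative, and the drop-list is an assumption rather than the PR's exact filter set:

```python
import re

def candidate_ips(route_get_output: str, route_default_output: str) -> list[str]:
    """Collect WSL2 host-IP candidates in probe order from `ip route` output."""
    candidates = []
    # Distro eth0 IPv4 (server running inside WSL).
    m = re.search(r"\bsrc (\d+\.\d+\.\d+\.\d+)", route_get_output)
    if m:
        candidates.append(m.group(1))
    # Default-gateway IPv4 (server on the Windows host, NAT mode).
    m = re.search(r"\bvia (\d+\.\d+\.\d+\.\d+)", route_default_output)
    if m:
        candidates.append(m.group(1))
    # Drop loopback, link-local, and common Docker/k8s bridge prefixes, keep order.
    drop = ("127.", "169.254.", "172.17.", "10.42.")
    return [ip for ip in candidates if not ip.startswith(drop)]

print(candidate_ips(
    "1.1.1.1 via 172.24.64.1 dev eth0 src 172.24.70.5 uid 1000",
    "default via 172.24.64.1 dev eth0 proto kernel",
))  # → ['172.24.70.5', '172.24.64.1']
```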

@wscurran added the `bug`, `Platform: Windows/WSL`, `Local Models`, and `fix` labels on Apr 14, 2026
@jieunl24
Contributor

Does this work for both WSL-hosted Ollama and Windows-hosted Ollama?
Windows-hosted Ollama is not a supported path: NemoClaw only shows Ollama as an inference option if Ollama is installed in WSL and the user-selected model is loaded into that WSL Ollama.
That said, Windows-hosted Ollama is already reachable from the sandbox; it just returns a 404 model-not-found when the user-selected model was not loaded on the Windows side.

I wonder whether the reporter was running Ollama on Windows. If not, the expected binding for Ollama on WSL is 127.0.0.1 (the default, so no binding specification is needed), not 0.0.0.0.

```js
"Local Ollama is responding on localhost, but the container reachability check failed for http://host.openshell.internal:11434.\n" +
" Common causes:\n" +
" • Ollama is bound to 127.0.0.1 — set OLLAMA_HOST=0.0.0.0:11434\n" +
" • Docker Desktop on WSL2 resolves host-gateway to IPv6 — try installing Docker Engine natively in WSL2\n" +
```
Contributor

A prerequisite for running NemoClaw on Windows is Docker Desktop.

@jieunl24
Contributor

The premise about host.openshell.internal resolving to IPv6 / un-routable IPs seems incorrect / outdated. It's set as a hostAlias and always resolves to the Docker host-gateway IPv4 address (e.g. 172.29.0.254):

```console
/ # kubectl get po my-assistant -o yaml -n openshell | grep hostAlias -A 6
  hostAliases:
  - hostnames:
    - host.docker.internal
    - host.openshell.internal
    ip: 172.29.0.254
  initContainers:
  - command:
```
("my-assistant" is my sandbox's name.) I then verified the full chain: gateway container → host-gateway IP → Windows host → wslrelay → Ollama. It works end-to-end; the IP is routable and Ollama responds.

```console
/ # kubectl exec -it openshell-0 -n openshell -- sh
$ bash -c 'echo -e "GET /api/tags HTTP/1.0\r\n\r\n" > /dev/tcp/172.29.0.254/11434 && echo "REACHABLE" || echo "UNREACHABLE"'
REACHABLE
$
```
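The probe-and-pick selection this PR describes can be sketched in a few lines; `first_reachable` is a hypothetical stand-in for the candidate loop in `validateLocalProvider` (203.0.113.1 is a TEST-NET address that never answers):

```python
import socket

def first_reachable(candidates, port=11434, timeout=1.0):
    """Probe candidates in order; return the first whose TCP port accepts a
    connection, or None if none answer."""
    for ip in candidates:
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return ip
        except OSError:
            continue
    return None

# Demo against a throwaway local listener; the dead candidate is skipped.
srv = socket.create_server(("127.0.0.1", 0))
port = srv.getsockname()[1]
print(first_reachable(["203.0.113.1", "127.0.0.1"], port=port, timeout=0.3))
srv.close()
```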

The actual root cause is conditional: it only manifests when Ollama binds a dual-stack socket. WSL2's wslrelay.exe doesn't forward dual-stack (AF_INET6) sockets (microsoft/WSL#4851). Go's net.Listen("tcp", "0.0.0.0") creates a dual-stack socket, so Ollama with OLLAMA_HOST=0.0.0.0 binds to *:11434 (AF_INET6) rather than 0.0.0.0:11434 (AF_INET). The relay ignores it, and the connection blackholes.
The fix is simpler: on WSL2, don't set OLLAMA_HOST=0.0.0.0; let Ollama bind to its default 127.0.0.1, which the relay does forward. That fix was merged about two weeks ago (#1104).
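The address-family distinction can be observed directly. A minimal sketch using Python's socket module; the wslrelay forwarding behavior itself can't be reproduced here, so the comments restate the claim above rather than demonstrate it:

```python
import socket

# An explicit IPv4 bind (Ollama's default 127.0.0.1) yields an AF_INET socket,
# which, per the analysis above, wslrelay does forward.
v4 = socket.create_server(("127.0.0.1", 0))
print(v4.family == socket.AF_INET)  # → True
v4.close()

# A wildcard dual-stack bind (what a Go wildcard listen produces) yields a
# single AF_INET6 socket; per the analysis above, wslrelay skips these.
if socket.has_dualstack_ipv6():
    dual = socket.create_server(("", 0), family=socket.AF_INET6,
                                dualstack_ipv6=True)
    print(dual.family == socket.AF_INET6)
    dual.close()
```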

Re: the reporter (#1472): they're on OpenShell 0.0.14; I've verified on OpenShell 0.0.26. Also, as mentioned in my previous comment, the reporter seems to be binding the Ollama host to 0.0.0.0, which causes the dual-stack socket issue above.

There's Windows setup documentation that was added recently (https://github.com/NVIDIA/NemoClaw/blob/main/docs/get-started/windows-setup.md), which also covers how to set up local inference with Ollama.

Kindly verify whether the issue is reproducible with the latest NemoClaw and OpenShell versions, following the installation guide.

@cv added the `v0.0.18` (Release target) label on Apr 16, 2026
@cv
Contributor

cv commented Apr 16, 2026

Friendly AI-generated maintainer note:

Thanks for the WSL2 follow-up. I took a pass through the current branch and I need these blockers addressed before I can move it forward:

  1. Rebase onto current main and port the onboarding changes to the active TypeScript path (src/lib/onboard.ts / current CLI flow), not the older bin/lib/onboard.js path by itself.
  2. Keep --non-interactive failures non-interactive — local-provider validation failures should not prompt.
  3. Move the resolvedHostIp registry write so it targets the real sandbox entry after sandbox creation, not the pre-create gateway placeholder.
  4. Resolve the remaining major CodeRabbit threads, including the docs wording that currently overstates the persistence guarantee.

After that update lands, I can re-run the gate check.


Labels

- `bug`: Something isn't working
- `fix`
- `Local Models`: Running NemoClaw with local models
- `Platform: Windows/WSL`: Support for Windows Subsystem for Linux
- `v0.0.18`: Release target


Development

Successfully merging this pull request may close these issues.

[WSL2] Local Ollama inference timeout — inference.local TLS handshake fails, host.openshell.internal:11434 unreachable from openshell-0 pod
