Enable managed local Ollama inference and add cloud/Ollama model selection (#295)
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Adds local inference support (Ollama, vLLM), centralizes inference/provider config, introduces JS and shell runtime and Docker-socket detection, makes install/setup runtime-aware (macOS validations), adds sandbox config sync and CLI shim handling, switches many commands to an interactive runner, and adds extensive tests and shell helpers.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Installer as install.sh
    participant Runtime as runtime helpers
    participant Docker as ContainerRuntime
    participant CLI as NemoClaw CLI
    User->>Installer: run install.sh
    Installer->>Runtime: detect_docker_host()
    Runtime->>Docker: probe sockets / env
    alt DOCKER_HOST present
        Docker-->>Runtime: return env host
    else Colima socket found
        Docker-->>Runtime: return colima unix socket
    else Docker Desktop found
        Docker-->>Runtime: return docker-desktop socket
    end
    Runtime-->>Installer: runtime & host
    Installer->>Runtime: is_unsupported_macos_runtime?
    alt unsupported (macOS + podman)
        Runtime-->>Installer: fail early
    else supported
        Installer->>CLI: install from GitHub, create shim
        CLI-->>Installer: shim created
    end
    Installer->>User: installation complete
```
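The detection order in the diagram above can be sketched as a small shell helper. This is only an illustrative sketch: the function name and socket paths are assumptions, not the actual `install.sh` code.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the socket-probe order from the diagram.
# The function name and paths are assumptions, not the real installer code.
detect_docker_host_sketch() {
  # 1. An explicit DOCKER_HOST always wins.
  if [ -n "${DOCKER_HOST:-}" ]; then
    printf '%s\n' "$DOCKER_HOST"
    return 0
  fi
  # 2. Colima's per-user socket.
  if [ -S "$HOME/.colima/default/docker.sock" ]; then
    printf 'unix://%s\n' "$HOME/.colima/default/docker.sock"
    return 0
  fi
  # 3. Docker Desktop's per-user socket.
  if [ -S "$HOME/.docker/run/docker.sock" ]; then
    printf 'unix://%s\n' "$HOME/.docker/run/docker.sock"
    return 0
  fi
  # No runtime found.
  return 1
}
```

Note that one of the review comments below argues the Docker Desktop socket should be checked before Colima; the sketch follows the diagram's order as drawn.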
```mermaid
sequenceDiagram
    participant User
    participant Onboard as setupNim
    participant LocalCheck as validateLocalProvider
    participant Models as Ollama/vLLM discovery
    participant ProviderCfg as inference-config
    participant Sandbox as Sandbox Sync Script
    User->>Onboard: start onboarding
    Onboard->>Onboard: present providers (cloud, ollama, vllm, nim)
    alt user selects local (ollama/vllm)
        Onboard->>LocalCheck: validateLocalProvider(provider)
        LocalCheck-->>Onboard: health, baseUrl or error
        Onboard->>Models: getOllamaModelOptions()
        Models-->>Onboard: model list
    else user selects cloud/nim
        Onboard->>ProviderCfg: getProviderSelectionConfig(provider, model)
        ProviderCfg-->>Onboard: endpoint/profile/credentialEnv
    end
    Onboard->>User: prompt model
    User-->>Onboard: select model
    Onboard->>Sandbox: buildSandboxConfigSyncScript(selectionConfig)
    Sandbox-->>Onboard: generated script applied to sandbox
    Onboard->>User: onboarding complete
```
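The `validateLocalProvider` step probes fixed local endpoints (vLLM on port 8000, Ollama on port 11434, per the health checks reviewed below). A minimal sketch of that provider-to-URL mapping, with an illustrative function name:

```shell
# Maps a local provider id to the URL its health check probes.
# The function name is illustrative; the URLs come from the reviewed code.
local_provider_health_url() {
  case "${1:-}" in
    vllm-local)   echo "http://localhost:8000/v1/models" ;;
    ollama-local) echo "http://localhost:11434/api/tags" ;;
    *)            return 1 ;;  # unknown ids fail, not pass
  esac
}
```

Failing closed on unknown ids mirrors the review's point that a typo like `ollama` vs `ollama-local` should be rejected early.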
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Depends on #286
Actionable comments posted: 13
🧹 Nitpick comments (9)
scripts/lib/runtime.sh (2)
215-229: Add timeout to curl health checks to prevent hanging.

Without a timeout, `curl` may hang indefinitely if the service is unresponsive or a firewall drops packets.

♻️ Add timeout to curl commands

```diff
 check_local_provider_health() {
   local provider="${1:-}"
   case "$provider" in
     vllm-local)
-      curl -sf http://localhost:8000/v1/models > /dev/null 2>&1
+      curl -sf --connect-timeout 2 --max-time 5 http://localhost:8000/v1/models > /dev/null 2>&1
       ;;
     ollama-local)
-      curl -sf http://localhost:11434/api/tags > /dev/null 2>&1
+      curl -sf --connect-timeout 2 --max-time 5 http://localhost:11434/api/tags > /dev/null 2>&1
       ;;
     *)
       return 1
       ;;
   esac
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/lib/runtime.sh` around lines 215 - 229, The curl health checks in function check_local_provider_health (cases vllm-local and ollama-local) lack timeouts and can hang; update those curl calls to include a connection and overall timeout (e.g., --connect-timeout and --max-time or -m) so the health check fails quickly on unresponsive services, and keep the existing -s -f output flags intact.
114-117: Consider handling IPv6 loopback (::1) if dual-stack environments are expected.

`is_loopback_ip` only checks for IPv4 `127.*` addresses. In dual-stack environments, `::1` could appear in resolv.conf.

♻️ Optional improvement for IPv6 support

```diff
 is_loopback_ip() {
   local ip="${1:-}"
-  [[ "$ip" == 127.* ]]
+  [[ "$ip" == 127.* || "$ip" == "::1" ]]
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/lib/runtime.sh` around lines 114 - 117, The is_loopback_ip function only matches IPv4 127.*; update it to also recognize IPv6 loopback (::1) by expanding the conditional in is_loopback_ip to check for the literal "::1" (and optionally normalized IPv6 forms) in addition to the existing 127.* pattern so dual-stack environments are handled correctly.

scripts/install.sh (2)
21-97: Intentional code duplication for standalone installer support.

The inline `define_runtime_helpers` duplicates functions from `runtime.sh` to support `curl | bash` installations where the library file isn't available. This is a reasonable trade-off. Consider adding a comment noting this intentional duplication to help future maintainers keep both in sync.
📝 Add sync reminder comment

```diff
 define_runtime_helpers() {
+  # NOTE: These functions are intentionally duplicated from scripts/lib/runtime.sh
+  # to support standalone curl-pipe-bash installation. Keep them in sync.
   socket_exists() {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/install.sh` around lines 21 - 97, Add a short comment above the define_runtime_helpers block stating that the functions (socket_exists, find_colima_docker_socket, find_docker_desktop_socket, detect_docker_host) are intentionally duplicated from runtime.sh to support standalone "curl | bash" installs and include a reminder to keep this copy in sync with runtime.sh when updating those helpers; reference define_runtime_helpers and runtime.sh so future maintainers know where the canonical implementation lives.
365-369: Consider pinning to a release tag if a stable version is available.

Installing from `git+https://github.com/NVIDIA/NemoClaw.git` always fetches the latest `main` branch, which could include breaking changes. While this script is designed for early-stage onboarding and main-branch installs are acceptable for alpha development, consider pinning to a release tag once stable releases are published (e.g., `git+https://github.com/NVIDIA/NemoClaw.git#v1.0.0`, or a commit SHA for reproducibility).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/install.sh` around lines 365 - 369, The install lines currently fetch NemoClaw from the repo root which always pulls the latest main; update the git URL used in both branches of the NODE_MGR conditional (the two npm install -g git+https://github.com/NVIDIA/NemoClaw.git invocations) to pin to a stable release or commit (e.g. append `#vX.Y.Z` or a commit SHA) or introduce a variable (e.g. NEMOCLAW_REF) and use git+https://github.com/NVIDIA/NemoClaw.git#${NEMOCLAW_REF} so both the sudo and non-sudo install paths install the pinned release for reproducible installs.

scripts/setup.sh (1)
144-155: Consider extending Ollama startup wait or adding a retry loop.

The 2-second `sleep` after starting Ollama (line 147) may not be sufficient for cold starts, especially on slower systems or when models need loading.

♻️ Add startup verification loop

```diff
 if ! check_local_provider_health "ollama-local"; then
   info "Starting Ollama service..."
   OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &
-  sleep 2
+  for i in 1 2 3 4 5; do
+    sleep 2
+    check_local_provider_health "ollama-local" && break
+    [ "$i" -eq 5 ] && warn "Ollama may not be fully ready"
+  done
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/setup.sh` around lines 144 - 155, Replace the fixed 2-second sleep after launching Ollama with a startup verification loop that repeatedly calls check_local_provider_health("ollama-local") (or polls get_local_provider_base_url("ollama-local")/health endpoint) with short sleeps and a total timeout/backoff; if health check succeeds proceed to call upsert_provider, otherwise kill the background process and surface an error. Ensure OLLAMA_HOST=0.0.0.0:11434 ollama serve is still started in background, implement a configurable max wait and per-iteration delay, and handle failure path (stop service and return non-zero) so upsert_provider is only called once the local provider is healthy.

bin/lib/inference-config.js (1)
32-42: Consider defining a `DEFAULT_VLLM_MODEL` constant for consistency.

The `vllm-local` case uses the string literal `"vllm-local"` as the default model, while `nvidia-nim` and `ollama-local` use proper model constants. This inconsistency could cause confusion since `"vllm-local"` isn't a valid model identifier.

♻️ Suggested improvement

```diff
+const DEFAULT_VLLM_MODEL = "default"; // or appropriate vLLM model identifier
+
 function getProviderSelectionConfig(provider, model) {
   switch (provider) {
     // ...
     case "vllm-local":
       return {
         endpointType: "custom",
         endpointUrl: INFERENCE_ROUTE_URL,
         ncpPartner: null,
-        model: model || "vllm-local",
+        model: model || DEFAULT_VLLM_MODEL,
         profile: DEFAULT_ROUTE_PROFILE,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/inference-config.js` around lines 32 - 42, Add a DEFAULT_VLLM_MODEL constant and use it instead of the string literal in the "vllm-local" case: create a named constant (e.g., DEFAULT_VLLM_MODEL) near the other model constants in this file and replace the hard-coded "vllm-local" in the returned object (model: model || "vllm-local") with model: model || DEFAULT_VLLM_MODEL so it matches the pattern used for nvidia-nim and ollama-local and avoids an invalid literal.

test/inference-config.test.js (1)
18-64: Good test coverage for core functionality; consider adding edge case tests.

The tests validate the main paths well. For completeness, consider adding tests for:

- `vllm-local` provider configuration
- Unknown provider returning `null`
- `getOpenClawPrimaryModel` with the `nvidia-nim` provider

🧪 Additional test cases to consider

```js
it("maps vllm-local to the sandbox inference route", () => {
  assert.deepEqual(getProviderSelectionConfig("vllm-local"), {
    endpointType: "custom",
    endpointUrl: INFERENCE_ROUTE_URL,
    ncpPartner: null,
    model: "vllm-local",
    profile: DEFAULT_ROUTE_PROFILE,
    credentialEnv: DEFAULT_ROUTE_CREDENTIAL_ENV,
    provider: "vllm-local",
    providerLabel: "Local vLLM",
  });
});

it("returns null for unknown providers", () => {
  assert.equal(getProviderSelectionConfig("unknown-provider"), null);
});
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/inference-config.test.js` around lines 18 - 64, Add edge-case tests: add a test asserting getProviderSelectionConfig("vllm-local") returns the sandbox config (endpointType "custom", endpointUrl INFERENCE_ROUTE_URL, ncpPartner null, model "vllm-local", profile DEFAULT_ROUTE_PROFILE, credentialEnv DEFAULT_ROUTE_CREDENTIAL_ENV, provider "vllm-local", providerLabel "Local vLLM"), add a test asserting getProviderSelectionConfig("unknown-provider") returns null, and add a test calling getOpenClawPrimaryModel("nvidia-nim", "<model>") returns `${MANAGED_PROVIDER_ID}/<model>`; use the existing test file patterns and reference getProviderSelectionConfig, getOpenClawPrimaryModel, INFERENCE_ROUTE_URL, DEFAULT_ROUTE_PROFILE, DEFAULT_ROUTE_CREDENTIAL_ENV, and MANAGED_PROVIDER_ID when asserting expected values.

test/uninstall.test.js (1)
29-46: Clean up temporary directories after the shim-removal test.

The test leaves `mkdtempSync` output behind. Add cleanup to avoid temp-dir accumulation.

♻️ Minimal cleanup pattern

```diff
 it("removes the user-local nemoclaw shim", () => {
   const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-uninstall-shim-"));
-  const shimDir = path.join(tmp, ".local", "bin");
-  const shimPath = path.join(shimDir, "nemoclaw");
-  fs.mkdirSync(shimDir, { recursive: true });
-  fs.writeFileSync(shimPath, "#!/usr/bin/env bash\n", { mode: 0o755 });
-
-  const result = spawnSync(
-    "bash",
-    ["-lc", `HOME="${tmp}" source "${UNINSTALL_SCRIPT}"; remove_nemoclaw_cli`],
-    {
-      cwd: path.join(__dirname, ".."),
-      encoding: "utf-8",
-    },
-  );
-
-  assert.equal(result.status, 0);
-  assert.equal(fs.existsSync(shimPath), false);
+  try {
+    const shimDir = path.join(tmp, ".local", "bin");
+    const shimPath = path.join(shimDir, "nemoclaw");
+    fs.mkdirSync(shimDir, { recursive: true });
+    fs.writeFileSync(shimPath, "#!/usr/bin/env bash\n", { mode: 0o755 });
+
+    const result = spawnSync(
+      "bash",
+      ["-lc", `HOME="${tmp}" source "${UNINSTALL_SCRIPT}"; remove_nemoclaw_cli`],
+      { cwd: path.join(__dirname, ".."), encoding: "utf-8" },
+    );
+
+    assert.equal(result.status, 0);
+    assert.equal(fs.existsSync(shimPath), false);
+  } finally {
+    fs.rmSync(tmp, { recursive: true, force: true });
+  }
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/uninstall.test.js` around lines 29 - 46, The test creates a temporary directory via fs.mkdtempSync assigned to tmp and never removes it; after calling remove_nemoclaw_cli and the assertions on result.status and fs.existsSync(shimPath), delete the tmp directory (use fs.rmSync or fs.rmdirSync with recursive:true and force:true) to clean up the mkdtempSync output so the test doesn't leave behind temp dirs.

test/onboard.test.js (1)
27-27: Avoid order-sensitive JSON assertion in script matching.

This regex couples the test to JSON key ordering/serialization details. Prefer asserting required fields independently to reduce brittle failures.

🔧 More resilient assertions

```diff
-  assert.match(script, /json\.loads\("\{\\\"baseUrl\\\":\\\"https:\/\/inference\.local\/v1\\\",\\\"apiKey\\\":\\\"unused\\\"/);
+  assert.match(script, /json\.loads\("/);
+  assert.match(script, /\\"baseUrl\\":\\"https:\/\/inference\.local\/v1\\"/);
+  assert.match(script, /\\"apiKey\\":\\"unused\\"/);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.js` at line 27, The current assertion in onboard.test.js uses an order-sensitive regex on the entire JSON string (assert.match(script, /json\.loads\(".../)), which is brittle; update the test to assert required fields independently instead — for example, locate the JSON payload in script (the same variable `script`) then either parse the JSON and assert its properties (e.g., .baseUrl and .apiKey) or use separate regex/assertions that check for `"baseUrl":"https://inference.local/v1"` and `"apiKey":"unused"` independently rather than one combined ordered pattern; modify the assertion(s) around `assert.match(script, ...)` to implement this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/local-inference.js`:
- Around line 42-45: validateLocalProvider currently treats a missing
health-check command from getLocalProviderHealthCheck(provider) as success;
change it to fail fast for unknown local provider IDs by returning an error
result (e.g., { ok: false, error: 'Unknown local provider id: <provider>' })
when getLocalProviderHealthCheck(provider) returns falsy. Update the logic in
validateLocalProvider (and any callers expecting the previous shape) so typos
like "ollama" vs "ollama-local" are rejected immediately instead of later when
config/base URL lookups occur.
In `@bin/lib/onboard.js`:
- Around line 49-58: The providerType calculation in
buildSandboxConfigSyncScript incorrectly infers Ollama only when model ===
DEFAULT_OLLAMA_MODEL; instead derive providerType from
selectionConfig.endpointType when selectionConfig.profile === "inference-local".
Update the providerType logic in buildSandboxConfigSyncScript to switch on
selectionConfig.endpointType (e.g., map "ollama" -> "ollama-local", "vllm" ->
"vllm-local", fallback -> "nvidia-nim") rather than inspecting
selectionConfig.model/DEFAULT_OLLAMA_MODEL, and keep the existing
endpointType-based mapping for non-"inference-local" profiles; ensure
getOpenClawPrimaryModel is still called with the resolved providerType and
selectionConfig.model.
In `@bin/lib/platform.js`:
- Around line 59-63: The code currently lists Colima socket candidates before
the Docker Desktop socket, causing detectDockerHost() to pick Colima when both
exist; update the candidate arrays in the macOS branches to place the Docker
Desktop socket (path.join(home, ".docker/run/docker.sock")) before the entries
returned by getColimaDockerSocketCandidates({ home }) so Docker Desktop is not
silently deprioritized, and make the same reordering in the other macOS
candidate block (lines with the second darwin branch).
In `@bin/nemoclaw.js`:
- Around line 122-123: The command injection risk comes from interpolating
untrusted variables (name and sandboxName) directly into shell strings passed to
runInteractive; update the callers that build those commands (the runInteractive
invocations around lines using variables name and sandboxName) to either (a)
validate/whitelist the values of name/sandboxName against an allowed pattern
before use, or (b) avoid shell interpolation entirely by switching to a safer
execution API that accepts an argv array (e.g., use child_process.spawn/execFile
or modify runInteractive to accept command + args) so SSH target and remote
arguments are passed as separate parameters and not concatenated into a single
shell string. Ensure every occurrence (the lines referenced and similar calls)
is fixed to use validation or the argv approach.
In `@install.sh`:
- Around line 73-80: The current ensure_nemoclaw_shim() returns early when
ORIGINAL_PATH already contains NEMOCLAW_SHIM_DIR, but it doesn't verify the
actual shim exists or is executable, causing verify_nemoclaw() to incorrectly
succeed; update ensure_nemoclaw_shim() to, when ORIGINAL_PATH contains
NEMOCLAW_SHIM_DIR, check the shim at "$shim_path" (and/or command -v nemoclaw)
and only return 0 if the shim is present and executable (-x); if missing or not
executable, create the directory, symlink "$npm_bin/nemoclaw" to "$shim_path"
and refresh_path as before; also update verify_nemoclaw() to explicitly check
for an executable nemoclaw on PATH (command -v or test -x "$shim_path") before
returning success, and apply the same existence check to the duplicate logic
around lines 251-265.
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 370-395: The current fallback to DEFAULT_MODELS can assign
cloud-only model IDs to local endpoints; update the model selection logic so
DEFAULT_MODELS is only used for cloud endpoint types (e.g., "build", "ncp") and
not for local providers ("vllm", "nim-local", "ollama"). Specifically, change
how modelOptions is computed (the modelOptions, curatedCloudOptions,
discoveredModelOptions and DEFAULT_MODELS expressions) so that for local
endpointType values you do NOT fall back to DEFAULT_MODELS and instead pass the
discoveredModelOptions (possibly empty) to promptSelect (adjust defaultIndex as
needed); leave the DEFAULT_MODELS fallback only for cloud types and ensure
promptSelect receives only valid options for the chosen endpoint type.
In `@nemoclaw/src/index.ts`:
- Around line 150-177: activeModelEntries currently returns a hardcoded
four-model Nemotron-only array when loadOnboardConfig yields no model, causing
mismatch with the six curated cloud models exposed in commands/onboard.ts;
change activeModelEntries to import and return the shared curated catalog
exported from commands/onboard.ts (e.g., curatedCloudModels or the exported
symbol that lists the six curated models) so both onboarding and the fallback
path use the exact same model list, removing the hardcoded array and referencing
that exported catalog instead.
In `@README.md`:
- Around line 144-145: The README sentence that groups "Ollama and vLLM" as both
experimental is inaccurate; update the phrasing in the Local inference section
so it explicitly states Ollama is no longer experimental while vLLM remains
experimental and note the macOS requirement for OpenShell only applies when
host-routing is needed; edit the sentence mentioning "Local inference options
such as Ollama and vLLM are still experimental" to separately call out "Ollama
(stable/non-experimental) and vLLM (experimental)" and keep the OpenShell
host-routing note as-is.
In `@scripts/nemoclaw-start.sh`:
- Around line 33-35: The code only sets agents.defaults.model.primary when
default_model (from os.environ.get('NEMOCLAW_MODEL')) is present, leaving a
stale primary in cfg when the env var is not provided; modify the logic around
default_model so that when default_model is falsy you explicitly remove/clear
the 'primary' entry under cfg.setdefault('agents', {}).setdefault('defaults',
{}).setdefault('model', {}) (or delete the 'primary' key if present) to ensure
previous runs don’t persist an unintended provider/model.
In `@scripts/smoke-macos-install.sh`:
- Around line 161-177: The feed_install_answers function contains an infinite
while : loop waiting for a log string and needs a timeout to avoid hanging;
modify feed_install_answers to record a start time (e.g., using date +%s), check
elapsed time inside the loop and break with a failure path when elapsed exceeds
a configurable timeout (use a variable like INSTALL_TIMEOUT or default value),
ensure the function still writes the expected 'n\n' to answers_pipe before
exiting and optionally append a diagnostic message to install_log or stderr so
callers can detect the timeout; reference the function name
feed_install_answers, the loop that checks "$install_log" and the
"$answers_pipe" and "$install_log" variables when applying the change.
In `@test/smoke-macos-install.test.js`:
- Around line 91-95: The spawnSync call that runs the interactive smoke script
(the call assigned to result using spawnSync) can hang; add a timeout option
(e.g., timeout: 60000) to the spawnSync options object and fail the test with a
clear message if the child times out (check result.error && result.error.code
=== 'ETIMEDOUT' or result.signal === 'SIGTERM' / result.status === null). Update
the options passed to spawnSync (the object currently containing cwd, encoding,
env) to include the timeout and add a small conditional after the call to
assert/throw if a timeout occurred so CI jobs don't hang when the shell flow
regresses.
- Around line 55-64: The test uses a hard-coded HOME
("/tmp/nemoclaw-smoke-no-runtime") which can leak state between runs; update the
test around the spawnSync call (the spec using SMOKE_SCRIPT and spawnSync) to
create an isolated temporary HOME using os.tmpdir() + fs.mkdtempSync (or
fs.promises.mkdtemp) and set env.HOME to that temp directory for the spawnSync
invocation, and ensure the temp directory is removed/cleaned up after the test
completes.
In `@uninstall.sh`:
- Around line 203-205: The uninstall currently deletes any file at
${NEMOCLAW_SHIM_DIR}/nemoclaw; change it to only remove files the installer
created by (1) if it's a symlink, verify readlink target matches the expected
installer shim target before calling remove_path, and (2) if it's a regular
file, open and check for an installer marker string (e.g. a unique comment or
header the installer adds) and only call remove_path when that marker is
present; reference NEMOCLAW_SHIM_DIR and the remove_path invocation to locate
and update the removal logic.
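The uninstall guard described in the last item above can be sketched as follows. The marker string and function name here are hypothetical, not the installer's actual marker; a fuller version would also verify the symlink's readlink target, as the comment notes.

```shell
# Only remove a shim the installer created: either a symlink, or a regular
# file carrying a hypothetical installer marker. Names are illustrative.
remove_shim_if_ours() {
  local shim="$1"
  local marker="# installed-by: nemoclaw-installer"
  if [ -L "$shim" ]; then
    rm -f "$shim"                       # symlinks assumed installer-owned here
  elif [ -f "$shim" ] && grep -qF "$marker" "$shim"; then
    rm -f "$shim"                       # marker present: safe to remove
  fi                                    # anything else is left untouched
}
```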
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 10ea04ca-dffa-4d29-8c31-c58f30a50909
⛔ Files ignored due to path filters (15)
- `nemoclaw/dist/cli.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/cli.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/onboard.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/onboard.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/commands/onboard.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/slash.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/slash.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/commands/slash.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/index.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/index.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/index.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/onboard/config.d.ts` is excluded by `!**/dist/**`
- `nemoclaw/dist/onboard/config.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/onboard/config.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/onboard/config.js.map` is excluded by `!**/dist/**`, `!**/*.map`
📒 Files selected for processing (32)
- README.md
- bin/lib/credentials.js
- bin/lib/inference-config.js
- bin/lib/local-inference.js
- bin/lib/onboard.js
- bin/lib/platform.js
- bin/lib/runner.js
- bin/nemoclaw.js
- install.sh
- nemoclaw/src/cli.ts
- nemoclaw/src/commands/onboard.ts
- nemoclaw/src/commands/slash.ts
- nemoclaw/src/index.ts
- nemoclaw/src/onboard/config.ts
- scripts/fix-coredns.sh
- scripts/install.sh
- scripts/lib/runtime.sh
- scripts/nemoclaw-start.sh
- scripts/setup.sh
- scripts/smoke-macos-install.sh
- test/credentials.test.js
- test/inference-config.test.js
- test/install-preflight.test.js
- test/local-inference.test.js
- test/onboard-selection.test.js
- test/onboard.test.js
- test/platform.test.js
- test/runner.test.js
- test/runtime-shell.test.js
- test/smoke-macos-install.test.js
- test/uninstall.test.js
- uninstall.sh
```js
const command = getLocalProviderHealthCheck(provider);
if (!command) {
  return { ok: true };
}
```
Fail unknown local provider IDs instead of accepting them.
If getLocalProviderHealthCheck() has no match, validateLocalProvider() currently returns { ok: true }. A typo such as ollama vs ollama-local will pass validation here and only fail later when the base URL/config lookup returns null.
💡 Proposed fix

```diff
 function validateLocalProvider(provider, runCapture) {
   const command = getLocalProviderHealthCheck(provider);
   if (!command) {
-    return { ok: true };
+    return {
+      ok: false,
+      message: `Unsupported local inference provider: ${provider}.`,
+    };
   }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```js
const command = getLocalProviderHealthCheck(provider);
if (!command) {
  return {
    ok: false,
    message: `Unsupported local inference provider: ${provider}.`,
  };
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/local-inference.js` around lines 42 - 45, validateLocalProvider
currently treats a missing health-check command from
getLocalProviderHealthCheck(provider) as success; change it to fail fast for
unknown local provider IDs by returning an error result (e.g., { ok: false,
error: 'Unknown local provider id: <provider>' }) when
getLocalProviderHealthCheck(provider) returns falsy. Update the logic in
validateLocalProvider (and any callers expecting the previous shape) so typos
like "ollama" vs "ollama-local" are rejected immediately instead of later when
config/base URL lookups occur.
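The fail-fast shape the prompt asks for can be sketched standalone. This is only an illustration: the `HEALTH_CHECKS` table, its curl commands, and the provider IDs below are placeholders, not the repository's actual entries in `bin/lib/local-inference.js`.

```javascript
// Hypothetical health-check table; the real one lives in bin/lib/local-inference.js.
const HEALTH_CHECKS = {
  "ollama-local": "curl -sf http://localhost:11434/api/tags",
  "vllm-local": "curl -sf http://localhost:8000/v1/models",
};

function getLocalProviderHealthCheck(provider) {
  // Returns null for any ID not in the table, so callers can fail fast.
  return HEALTH_CHECKS[provider] ?? null;
}

function validateLocalProvider(provider) {
  const command = getLocalProviderHealthCheck(provider);
  if (!command) {
    // Unknown IDs fail immediately instead of deferring to a null config lookup.
    return { ok: false, message: `Unsupported local inference provider: ${provider}.` };
  }
  return { ok: true, command };
}
```

With this shape, a typo such as `ollama` is rejected at validation time rather than surfacing later as a confusing null base URL.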
```js
function buildSandboxConfigSyncScript(selectionConfig) {
  const providerType =
    selectionConfig.profile === "inference-local"
      ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
        ? "ollama-local"
        : "nvidia-nim"
      : selectionConfig.endpointType === "vllm"
        ? "vllm-local"
        : "nvidia-nim";
  const primaryModel = getOpenClawPrimaryModel(providerType, selectionConfig.model);
```
**Derive the sandbox provider from the selected endpoint, not the model name.**

Line 52 only recognizes `DEFAULT_OLLAMA_MODEL` as Ollama. Any other `inference-local` selection then collapses to `nvidia-nim`, so the generated OpenClaw config is wrong for alternate Ollama models and potentially other local-provider selections as well.
Suggested fix

```diff
 function buildSandboxConfigSyncScript(selectionConfig) {
   const providerType =
-    selectionConfig.profile === "inference-local"
-      ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
-        ? "ollama-local"
-        : "nvidia-nim"
-      : selectionConfig.endpointType === "vllm"
+    selectionConfig.endpointType === "ollama"
+      ? "ollama-local"
+      : selectionConfig.endpointType === "vllm"
         ? "vllm-local"
         : "nvidia-nim";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/onboard.js` around lines 49 - 58, The providerType calculation in
buildSandboxConfigSyncScript incorrectly infers Ollama only when model ===
DEFAULT_OLLAMA_MODEL; instead derive providerType from
selectionConfig.endpointType when selectionConfig.profile === "inference-local".
Update the providerType logic in buildSandboxConfigSyncScript to switch on
selectionConfig.endpointType (e.g., map "ollama" -> "ollama-local", "vllm" ->
"vllm-local", fallback -> "nvidia-nim") rather than inspecting
selectionConfig.model/DEFAULT_OLLAMA_MODEL, and keep the existing
endpointType-based mapping for non-"inference-local" profiles; ensure
getOpenClawPrimaryModel is still called with the resolved providerType and
selectionConfig.model.
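The endpoint-keyed mapping the prompt describes can be sketched as a small lookup with an explicit fallback. The field names follow the review's description of `selectionConfig`; the lookup table itself is an assumption, not the repo's code.

```javascript
// Hypothetical mapping keyed on endpointType rather than the model name.
const PROVIDER_BY_ENDPOINT = {
  ollama: "ollama-local",
  vllm: "vllm-local",
};

function resolveProviderType(selectionConfig) {
  // Any unrecognized endpoint falls back to the hosted NIM provider,
  // so alternate Ollama models no longer collapse to nvidia-nim.
  return PROVIDER_BY_ENDPOINT[selectionConfig.endpointType] ?? "nvidia-nim";
}
```

Because the decision no longer inspects the model string, switching an Ollama endpoint to a non-default model leaves the provider unchanged.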
```js
if (platform === "darwin") {
  return [
    ...getColimaDockerSocketCandidates({ home }),
    path.join(home, ".docker/run/docker.sock"),
  ];
```
**Don't silently prefer Colima when multiple macOS sockets exist.**

This candidate order puts Colima ahead of Docker Desktop, and `detectDockerHost()` returns the first path that merely exists. Since `bin/lib/runner.js` exports that into `DOCKER_HOST` immediately, hosts with both runtimes installed get auto-pinned to Colima again instead of surfacing the ambiguity.
💡 Proposed fix

```diff
 function detectDockerHost(opts = {}) {
   const env = opts.env ?? process.env;
   if (env.DOCKER_HOST) {
     return {
       dockerHost: env.DOCKER_HOST,
       source: "env",
       socketPath: null,
     };
   }

   const existsSync = opts.existsSync ?? require("fs").existsSync;
-  for (const socketPath of getDockerSocketCandidates(opts)) {
-    if (existsSync(socketPath)) {
-      return {
-        dockerHost: `unix://${socketPath}`,
-        source: "socket",
-        socketPath,
-      };
-    }
-  }
+  const matches = getDockerSocketCandidates(opts).filter((socketPath) => existsSync(socketPath));
+  if (matches.length !== 1) {
+    return null;
+  }
+
+  const [socketPath] = matches;
+  return {
+    dockerHost: `unix://${socketPath}`,
+    source: "socket",
+    socketPath,
+  };
-
-  return null;
 }
```

Also applies to: 79-88
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/platform.js` around lines 59 - 63, The code currently lists Colima
socket candidates before the Docker Desktop socket, causing detectDockerHost()
to pick Colima when both exist; update the candidate arrays in the macOS
branches to place the Docker Desktop socket (path.join(home,
".docker/run/docker.sock")) before the entries returned by
getColimaDockerSocketCandidates({ home }) so Docker Desktop is not silently
deprioritized, and make the same reordering in the other macOS candidate block
(lines with the second darwin branch).
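The single-match rule from the proposed fix can be exercised in isolation when the candidate list and the existence check are injectable. This is a sketch under that assumption; `pickDockerSocket` is an illustrative helper, not a function in `bin/lib/platform.js`.

```javascript
// Returns one host when exactly one candidate socket exists; null otherwise,
// forcing callers to surface the ambiguity instead of silently picking Colima.
function pickDockerSocket(candidates, existsSync) {
  const matches = candidates.filter((socketPath) => existsSync(socketPath));
  if (matches.length !== 1) {
    return null;
  }
  return { dockerHost: `unix://${matches[0]}`, socketPath: matches[0] };
}
```

A caller receiving `null` can then prompt the user (or require an explicit `DOCKER_HOST`) rather than pinning a runtime at module load.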
```python
default_model = os.environ.get('NEMOCLAW_MODEL')
if default_model:
    cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model', {})['primary'] = default_model
```
**Clear stale primary model state when no override is provided.**

This only updates `agents.defaults.model.primary` when `NEMOCLAW_MODEL` is set, so an old value from a previous run survives in `~/.openclaw/openclaw.json` and can keep forcing the wrong provider/model instead of returning to the managed default path.
💡 Proposed fix

```diff
-default_model = os.environ.get('NEMOCLAW_MODEL')
-if default_model:
-    cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model', {})['primary'] = default_model
+model_cfg = cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model', {})
+default_model = os.environ.get('NEMOCLAW_MODEL')
+if default_model:
+    model_cfg['primary'] = default_model
+else:
+    model_cfg.pop('primary', None)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/nemoclaw-start.sh` around lines 33 - 35, The code only sets
agents.defaults.model.primary when default_model (from
os.environ.get('NEMOCLAW_MODEL')) is present, leaving a stale primary in cfg
when the env var is not provided; modify the logic around default_model so that
when default_model is falsy you explicitly remove/clear the 'primary' entry
under cfg.setdefault('agents', {}).setdefault('defaults',
{}).setdefault('model', {}) (or delete the 'primary' key if present) to ensure
previous runs don’t persist an unintended provider/model.
```sh
feed_install_answers() {
  local answers_pipe="$1"
  local install_log="$2"

  (
    printf '%s\n' "$SANDBOX_NAME"

    while :; do
      if [ -f "$install_log" ] && grep -q "OpenClaw gateway launched inside sandbox" "$install_log"; then
        break
      fi
      sleep 1
    done

    printf 'n\n'
  ) > "$answers_pipe"
}
```
**Add a timeout to prevent an infinite loop in `feed_install_answers`.**

The `while :` loop waits indefinitely for a log message. If `install.sh` fails before writing the expected message, this function will loop forever, consuming CPU.
🛡️ Proposed fix to add timeout

```diff
 feed_install_answers() {
   local answers_pipe="$1"
   local install_log="$2"
+  local timeout=300  # 5 minutes max wait
+  local elapsed=0
   (
     printf '%s\n' "$SANDBOX_NAME"
     while :; do
       if [ -f "$install_log" ] && grep -q "OpenClaw gateway launched inside sandbox" "$install_log"; then
         break
       fi
+      if [ "$elapsed" -ge "$timeout" ]; then
+        echo "[smoke] Timeout waiting for gateway launch message" >&2
+        break
+      fi
       sleep 1
+      elapsed=$((elapsed + 1))
     done
     printf 'n\n'
   ) > "$answers_pipe"
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/smoke-macos-install.sh` around lines 161 - 177, The
feed_install_answers function contains an infinite while : loop waiting for a
log string and needs a timeout to avoid hanging; modify feed_install_answers to
record a start time (e.g., using date +%s), check elapsed time inside the loop
and break with a failure path when elapsed exceeds a configurable timeout (use a
variable like INSTALL_TIMEOUT or default value), ensure the function still
writes the expected 'n\n' to answers_pipe before exiting and optionally append a
diagnostic message to install_log or stderr so callers can detect the timeout;
reference the function name feed_install_answers, the loop that checks
"$install_log" and the "$answers_pipe" and "$install_log" variables when
applying the change.
```js
it("fails when a requested runtime socket is unavailable", () => {
  const result = spawnSync("bash", [SMOKE_SCRIPT, "--runtime", "docker-desktop"], {
    cwd: path.join(__dirname, ".."),
    encoding: "utf-8",
    env: {
      ...process.env,
      NVIDIA_API_KEY: "nvapi-test",
      HOME: "/tmp/nemoclaw-smoke-no-runtime",
    },
  });
```
**Use an isolated temp `HOME` to avoid cross-run flakiness.**

Using a hard-coded `HOME` path can leak state between runs and make this test nondeterministic.
✅ Safer HOME setup

```diff
+const fs = require("node:fs");
+const os = require("node:os");
@@
 it("fails when a requested runtime socket is unavailable", () => {
+  const home = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-smoke-no-runtime-"));
   const result = spawnSync("bash", [SMOKE_SCRIPT, "--runtime", "docker-desktop"], {
     cwd: path.join(__dirname, ".."),
     encoding: "utf-8",
     env: {
       ...process.env,
       NVIDIA_API_KEY: "nvapi-test",
-      HOME: "/tmp/nemoclaw-smoke-no-runtime",
+      HOME: home,
     },
   });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/smoke-macos-install.test.js` around lines 55 - 64, The test uses a
hard-coded HOME ("/tmp/nemoclaw-smoke-no-runtime") which can leak state between
runs; update the test around the spawnSync call (the spec using SMOKE_SCRIPT and
spawnSync) to create an isolated temporary HOME using os.tmpdir() +
fs.mkdtempSync (or fs.promises.mkdtemp) and set env.HOME to that temp directory
for the spawnSync invocation, and ensure the temp directory is removed/cleaned
up after the test completes.
```js
const result = spawnSync("bash", ["-lc", script], {
  cwd: path.join(__dirname, ".."),
  encoding: "utf-8",
  env: { ...process.env, NVIDIA_API_KEY: "nvapi-test" },
});
```
**Add a timeout to the interactive smoke invocation to prevent hung test jobs.**

This `spawnSync` can block indefinitely if the shell flow regresses. A bounded timeout keeps CI reliable.
⏱️ Suggested guardrail

```diff
 const result = spawnSync("bash", ["-lc", script], {
   cwd: path.join(__dirname, ".."),
   encoding: "utf-8",
   env: { ...process.env, NVIDIA_API_KEY: "nvapi-test" },
+  timeout: 30_000,
+  killSignal: "SIGKILL",
 });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/smoke-macos-install.test.js` around lines 91 - 95, The spawnSync call
that runs the interactive smoke script (the call assigned to result using
spawnSync) can hang; add a timeout option (e.g., timeout: 60000) to the
spawnSync options object and fail the test with a clear message if the child
times out (check result.error && result.error.code === 'ETIMEDOUT' or
result.signal === 'SIGTERM' / result.status === null). Update the options passed
to spawnSync (the object currently containing cwd, encoding, env) to include the
timeout and add a small conditional after the call to assert/throw if a timeout
occurred so CI jobs don't hang when the shell flow regresses.
```sh
if [ -L "${NEMOCLAW_SHIM_DIR}/nemoclaw" ] || [ -f "${NEMOCLAW_SHIM_DIR}/nemoclaw" ]; then
  remove_path "${NEMOCLAW_SHIM_DIR}/nemoclaw"
fi
```
**Only remove a shim that this installer owns.**

Line 203 now removes any regular file at `~/.local/bin/nemoclaw`, not just the shim created by this installer. If a user already has their own wrapper or binary there, uninstall will silently delete it. Please verify the symlink target or an installer marker before deleting regular files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@uninstall.sh` around lines 203 - 205, The uninstall currently deletes any
file at ${NEMOCLAW_SHIM_DIR}/nemoclaw; change it to only remove files the
installer created by (1) if it's a symlink, verify readlink target matches the
expected installer shim target before calling remove_path, and (2) if it's a
regular file, open and check for an installer marker string (e.g. a unique
comment or header the installer adds) and only call remove_path when that marker
is present; reference NEMOCLAW_SHIM_DIR and the remove_path invocation to locate
and update the removal logic.
Force-pushed 9669b09 to 20a0e52 (Compare)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
bin/lib/onboard.js (1)
**586-594:** ⚠️ Potential issue | 🟠 Major

**Honor non-interactive Ollama model selection here.**

`requestedModel` is computed earlier, but this branch ignores it and always calls `promptOllamaModel()`. `NEMOCLAW_PROVIDER=ollama` will still block on stdin in CI instead of honoring `NEMOCLAW_MODEL` or the detected default.

🤖 Suggested fix

```diff
 } else if (selected.key === "ollama") {
   if (!ollamaRunning) {
     console.log("  Starting Ollama...");
     run("OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &", { ignoreError: true });
     sleep(2);
   }
   console.log("  ✓ Using Ollama on localhost:11434");
   provider = "ollama-local";
-  model = await promptOllamaModel();
+  if (isNonInteractive()) {
+    const availableModels = getOllamaModelOptions(runCapture);
+    model = requestedModel || getDefaultOllamaModel(runCapture);
+    if (requestedModel && availableModels.length > 0 && !availableModels.includes(requestedModel)) {
+      console.error(`  Unsupported NEMOCLAW_MODEL for Ollama: ${requestedModel}`);
+      process.exit(1);
+    }
+    console.log(`  [non-interactive] Ollama model: ${model}`);
+  } else {
+    model = await promptOllamaModel();
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 586 - 594, the Ollama branch always calls `promptOllamaModel()` and blocks interactive CI; change it to honor the precomputed `requestedModel` and avoid prompting when non-interactive. In the `selected.key === "ollama"` block use the existing `requestedModel` (or `process.env.NEMOCLAW_MODEL`) if present: set `model = requestedModel` (or the env fallback); otherwise, only if interactive (`process.stdin.isTTY`), call `model = await promptOllamaModel()`; keep the existing Ollama start logic (`ollamaRunning`, `run()`, `sleep()`), and then set `provider = "ollama-local"` as before.

nemoclaw/src/commands/onboard.ts (1)
**417-428:** ⚠️ Potential issue | 🟡 Minor

**Duplicate "Provider" label in confirmation summary.**

The confirmation summary displays "Provider" twice: once with the human-readable label (line 418) and once with the technical provider name (line 428). Consider using distinct labels for clarity.

♻️ Suggested fix

```diff
 logger.info(`  Endpoint: ${describeOnboardEndpoint(summaryConfig)}`);
 logger.info(`  Provider: ${summaryConfig.providerLabel}`);
 if (ncpPartner) {
   logger.info(`  NCP Partner: ${ncpPartner}`);
 }
 logger.info(`  Model: ${model}`);
 logger.info(
   `  API Key: ${requiresApiKey ? maskApiKey(apiKey) : "not required (local provider)"}`,
 );
 logger.info(`  Credential: $${credentialEnv}`);
 logger.info(`  Profile: ${profile}`);
-logger.info(`  Provider: ${providerName}`);
+logger.info(`  Provider ID: ${providerName}`);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 417 - 428, the confirmation summary currently logs the provider twice: a human-friendly label using `summaryConfig.providerLabel` and a technical name using `providerName`. Change the final `logger.info` to use a distinct label such as "Provider ID:" instead of "Provider:" so it no longer duplicates `summaryConfig.providerLabel`.
♻️ Duplicate comments (9)
scripts/nemoclaw-start.sh (1)
**33-35:** ⚠️ Potential issue | 🟠 Major

**Clear stale `primary` model state when no override is provided.**

When `NEMOCLAW_MODEL` is absent on a later launch, the previous `agents.defaults.model.primary` survives in `~/.openclaw/openclaw.json`. That keeps forcing the old provider/model instead of returning to the managed default path.

Suggested fix

```diff
-default_model = os.environ.get('NEMOCLAW_MODEL')
-if default_model:
-    cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model', {})['primary'] = default_model
+model_cfg = cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model', {})
+default_model = os.environ.get('NEMOCLAW_MODEL')
+if default_model:
+    model_cfg['primary'] = default_model
+else:
+    model_cfg.pop('primary', None)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/nemoclaw-start.sh` around lines 33 - 35, if `NEMOCLAW_MODEL` is not set, remove any existing `agents.defaults.model.primary` entry so stale model selection doesn't persist: when `default_model` is falsy, delete or pop the `'primary'` key from `cfg['agents']['defaults']['model']`, ensuring the config falls back to the managed default path.

install.sh (1)
**73-75:** ⚠️ Potential issue | 🟠 Major

**Don't report success when `nemoclaw` still isn't runnable.**

If `~/.local/bin` is already on `ORIGINAL_PATH` but the shim is missing, `ensure_nemoclaw_shim()` returns success without creating anything. `verify_nemoclaw()` then falls through the warning path and still returns `0`, so `run_onboard()` can fail later with `nemoclaw: command not found`.

Suggested fix

```diff
-  if [[ ":$ORIGINAL_PATH:" == *":$npm_bin:"* ]] || [[ ":$ORIGINAL_PATH:" == *":$NEMOCLAW_SHIM_DIR:"* ]]; then
+  if [[ ":$ORIGINAL_PATH:" == *":$npm_bin:"* ]]; then
     return 0
   fi
+
+  if [[ -L "$shim_path" || -x "$shim_path" ]]; then
+    refresh_path
+    return 0
+  fi
@@
-  warn "Continuing — nemoclaw is installed but requires a PATH update."
-  return 0
+  error "Installed nemoclaw at $npm_bin/nemoclaw but it is still not invokable in this session."
```

Also applies to: 251-265

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install.sh` around lines 73 - 75, the early return in `ensure_nemoclaw_shim()` uses the presence of `npm_bin` or `NEMOCLAW_SHIM_DIR` in `ORIGINAL_PATH` to report success even when the shim file is missing, so `verify_nemoclaw()` incorrectly returns `0`; change the logic to verify the shim file actually exists (check for the executable at `$NEMOCLAW_SHIM_DIR/nemoclaw` or the expected shim path) before returning success, otherwise create the shim or return non-zero. Update the same pattern in the other occurrence (lines ~251-265) so both places inspect the actual shim file rather than PATH membership, and ensure `verify_nemoclaw()` and `run_onboard()` receive a failure code when the shim is absent.

bin/lib/platform.js (1)
**59-63:** ⚠️ Potential issue | 🟠 Major

**Don't auto-select Colima when Docker Desktop is also installed.**

On macOS, the candidate list puts Colima first and `detectDockerHost()` returns the first existing socket. Because `bin/lib/runner.js` exports that into `DOCKER_HOST` at module load, dual-runtime hosts get pinned to Colima again without any ambiguity surfacing.

Suggested fix

```diff
   if (platform === "darwin") {
     return [
-      ...getColimaDockerSocketCandidates({ home }),
       path.join(home, ".docker/run/docker.sock"),
+      ...getColimaDockerSocketCandidates({ home }),
     ];
   }
```

Also applies to: 80-87

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/platform.js` around lines 59 - 63, the macOS socket candidate ordering currently prefers Colima (via `getColimaDockerSocketCandidates`), causing `detectDockerHost` to pick Colima even when Docker Desktop is installed. Locate the `platform === "darwin"` branch and reorder the candidates so the Docker Desktop socket path(s) come before the Colima entries, so `detectDockerHost` and the module export in runner.js select Docker Desktop first.

test/smoke-macos-install.test.js (2)
**91-95:** ⚠️ Potential issue | 🟠 Major

**Bound the subprocess running the sourced smoke flow.**

This child process can hang indefinitely if the answer-writer logic regresses, which will stall the whole test job. Add `timeout`/`killSignal` here and fail explicitly on `ETIMEDOUT`.

⏱️ Suggested guardrail

```diff
 const result = spawnSync("bash", ["-lc", script], {
   cwd: path.join(__dirname, ".."),
   encoding: "utf-8",
   env: { ...process.env, NVIDIA_API_KEY: "nvapi-test" },
+  timeout: 30_000,
+  killSignal: "SIGKILL",
 });
+assert.notEqual(result.error?.code, "ETIMEDOUT", "sourced smoke helper timed out");
 assert.equal(result.status, 0);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/smoke-macos-install.test.js` around lines 91 - 95, the `spawnSync` invocation running the sourced smoke flow should be bounded to avoid hangs: add `timeout` and `killSignal` options to the options object, then check `result.error` or `result.signal` after the call and explicitly fail the test with a clear message on timeout so the job doesn't hang indefinitely.

**55-64:** ⚠️ Potential issue | 🟡 Minor

**Use an isolated HOME for this runtime-socket check.**

A fixed `/tmp/nemoclaw-smoke-no-runtime` can pick up leftover state from prior runs and make this guardrail nondeterministic. Create a per-test temp HOME and clean it up afterward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/smoke-macos-install.test.js` around lines 55 - 64, the test "fails when a requested runtime socket is unavailable" currently uses a fixed HOME which can leak state between runs; create a unique temporary directory for HOME (e.g., via `fs.mkdtempSync` under `os.tmpdir()`) before calling `spawnSync` with `SMOKE_SCRIPT`, set `env.HOME` to that temp dir, and remove the temp directory in a finally/after hook to ensure isolation and cleanup.

nemoclaw/src/index.ts (1)
**150-177:** ⚠️ Potential issue | 🟠 Major

**Share the curated cloud catalog with this fallback path.**

When `loadOnboardConfig()` is empty, this still exposes the old four-model Nemotron list. A fresh plugin registration won't surface Kimi/GLM/MiniMax/Qwen/GPT-OSS until after onboarding, so the plugin UI drifts immediately from the curated selector. Please source this array from the same shared catalog the onboarding model selector uses instead of hardcoding it again here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/index.ts` around lines 150 - 177, the fallback in `activeModelEntries` currently returns a hardcoded four-model list when `loadOnboardConfig()` is empty; replace that hardcoded array with the shared curated catalog used by the onboarding selector so both UIs stay in sync. Instead of returning the inline `ModelProviderEntry[]` literal, import (or call) the same catalog source that produces the Kimi/GLM/MiniMax/Qwen/GPT-OSS entries and return its entries, ensuring types match `ModelProviderEntry` and preserving `contextWindow`/`maxOutput` fields.

scripts/smoke-macos-install.sh (1)
**161-176:** ⚠️ Potential issue | 🟠 Major

**Add a timeout around the staged-answer wait.**

If `install.sh` reaches the follow-up prompt without ever writing this log line, the writer never sends the final `n` and the whole smoke run blocks forever. Bound the wait and emit a diagnostic before falling through.

⏱️ Suggested guardrail

```diff
 feed_install_answers() {
   local answers_pipe="$1"
   local install_log="$2"
+  local timeout="${INSTALL_TIMEOUT:-300}"
+  local start_ts
+  start_ts="$(date +%s)"
   (
     printf '%s\n' "$SANDBOX_NAME"
     while :; do
       if [ -f "$install_log" ] && grep -q "OpenClaw gateway launched inside sandbox" "$install_log"; then
         break
       fi
+      if [ $(( $(date +%s) - start_ts )) -ge "$timeout" ]; then
+        echo "[smoke] Timed out waiting for gateway launch message" >&2
+        break
+      fi
       sleep 1
     done
     printf 'n\n'
   ) > "$answers_pipe"
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/smoke-macos-install.sh` around lines 161 - 176, the loop in `feed_install_answers` (which writes to `answers_pipe` and reads `install_log`) can block forever waiting for the "OpenClaw gateway launched inside sandbox" string; add a timeout guard with a timestamp check that breaks after a configurable maximum wait, write a diagnostic message before falling through, and ensure the final `printf 'n\n'` is still sent to `answers_pipe` on timeout so callers won't hang.

README.md (1)
**153-153:** ⚠️ Potential issue | 🟡 Minor

**Split Ollama and vLLM status in this note.**

The PR makes Ollama a supported local path, but this sentence still says both Ollama and vLLM are experimental.

📝 Suggested wording

```diff
-Local inference options such as Ollama and vLLM are still experimental. On macOS, they also depend on OpenShell host-routing support in addition to the local service itself being reachable on the host.
+Ollama local inference is supported when host-routing and reachability requirements are met. vLLM local inference remains experimental. On macOS, local inference also depends on OpenShell host-routing support in addition to the local service itself being reachable on the host.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 153, update the sentence so it reflects that Ollama is now a supported local inference option while vLLM remains experimental: split the combined statement into two clauses (or sentences), and keep the macOS/OpenShell host-routing note attached to the option(s) that actually require it.

bin/lib/onboard.js (1)
**72-80:** ⚠️ Potential issue | 🔴 Critical

**Resolve the local provider from `endpointType`.**

This still collapses every `inference-local` selection whose model is not `DEFAULT_OLLAMA_MODEL` to `nvidia-nim`, so alternate Ollama models and vLLM both generate the wrong OpenClaw config. Switch on `selectionConfig.endpointType` here instead of the model name.

🔧 Suggested fix

```diff
 const providerType =
-  selectionConfig.profile === "inference-local"
-    ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
-      ? "ollama-local"
-      : "nvidia-nim"
-    : selectionConfig.endpointType === "vllm"
+  selectionConfig.endpointType === "ollama"
+    ? "ollama-local"
+    : selectionConfig.endpointType === "vllm"
       ? "vllm-local"
       : "nvidia-nim";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 72 - 80, the conditional in `buildSandboxConfigSyncScript` wrongly uses `selectionConfig.model` to choose the local provider for `profile === "inference-local"`, collapsing non-default Ollama models into "nvidia-nim"; change the branch to switch on `selectionConfig.endpointType` (e.g., "ollama" -> "ollama-local", "vllm" -> "vllm-local", fallback -> "nvidia-nim") while preserving the `DEFAULT_OLLAMA_MODEL` constant use elsewhere.
🧹 Nitpick comments (5)
test/runtime-shell.test.js (2)
13-13: Empty afterEach callback is unnecessary.This can be removed since it doesn't perform any cleanup.
♻️ Suggested fix
-afterEach(() => {});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/runtime-shell.test.js` at line 13, The empty afterEach callback in the test file is redundant; remove the standalone afterEach(() => {}); invocation (referencing the afterEach symbol in test/runtime-shell.test.js) to clean up the test suite and leave no-op hooks out of the codebase.
34-47: Consider moving temp directory cleanup to a finally block or afterEach.

If an assertion fails, `fs.rmSync` won't be called, leaving temp directories behind. For test hygiene, wrap cleanup in a `try/finally` or track created directories for cleanup in `afterEach`.

♻️ Example using try/finally

 it("prefers Colima over Docker Desktop", () => {
   const home = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-runtime-shell-"));
-  const colimaSocket = path.join(home, ".colima/default/docker.sock");
-  const dockerDesktopSocket = path.join(home, ".docker/run/docker.sock");
-
-  const result = runShell(`source "${RUNTIME_SH}"; detect_docker_host`, {
-    HOME: home,
-    NEMOCLAW_TEST_SOCKET_PATHS: `${colimaSocket}:${dockerDesktopSocket}`,
-  });
-
-  assert.equal(result.status, 0);
-  assert.equal(result.stdout.trim(), `unix://${colimaSocket}`);
-  fs.rmSync(home, { recursive: true, force: true });
+  try {
+    const colimaSocket = path.join(home, ".colima/default/docker.sock");
+    const dockerDesktopSocket = path.join(home, ".docker/run/docker.sock");
+
+    const result = runShell(`source "${RUNTIME_SH}"; detect_docker_host`, {
+      HOME: home,
+      NEMOCLAW_TEST_SOCKET_PATHS: `${colimaSocket}:${dockerDesktopSocket}`,
+    });
+
+    assert.equal(result.status, 0);
+    assert.equal(result.stdout.trim(), `unix://${colimaSocket}`);
+  } finally {
+    fs.rmSync(home, { recursive: true, force: true });
+  }
 });

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/runtime-shell.test.js` around lines 34 - 47, The test "prefers Colima over Docker Desktop" currently calls fs.rmSync at the end so an assertion failure will leave temp files; modify the test to guarantee cleanup by moving fs.rmSync into a finally block around the assertions (or register the created temp dir for removal in an afterEach), i.e., wrap the runShell/assert block in try { ... } finally { fs.rmSync(home, { recursive: true, force: true }) } or push the path to a list cleaned by an afterEach, referencing the test name and the runShell and fs.rmSync calls to locate the code to change.

scripts/setup.sh (1)
144-148: Background Ollama startup may silently fail.

If `ollama serve` fails to start (e.g., port already in use by another process, permissions issue), the error is swallowed by `> /dev/null 2>&1 &`. Consider adding a post-startup health check.

♻️ Suggested improvement

 if ! check_local_provider_health "ollama-local"; then
   info "Starting Ollama service..."
   OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &
   sleep 2
+  if ! check_local_provider_health "ollama-local"; then
+    warn "Ollama service did not start successfully. You may need to start it manually."
+  fi
 fi

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/setup.sh` around lines 144 - 148, The startup block for Ollama swallows errors and may hide failures; after launching OLLAMA_HOST=0.0.0.0:11434 ollama serve in the background (the section using OLLAMA_HOST and the call to `ollama serve` guarded by `check_local_provider_health "ollama-local"`), add a post-startup health probe that retries the existing check_local_provider_health function for a short loop (e.g., retry every second for 10s) and, on failure, capture and print the Ollama logs (avoid redirecting to /dev/null) and exit with non-zero status so the caller knows startup failed. Ensure the probe uses the same health-check function (check_local_provider_health) and that background process output is redirected to a temporary log file which is printed on error.

scripts/lib/runtime.sh (2)
183-195: Substring matching may cause unexpected ambiguity.

The `grep -F` with `gateway_name` performs substring matching, so a gateway name like "neo" would match both "openshell-cluster-neo" and "openshell-cluster-neotest", causing the function to fail with ambiguity. This is likely acceptable since gateway names should be unique, but exact-match semantics might be safer.

♻️ Optional: Use word-boundary matching

If exact suffix matching is desired:

-matches="$(printf '%s\n' "$containers" | grep -F -- "$gateway_name" || true)"
+matches="$(printf '%s\n' "$containers" | grep -E -- "-${gateway_name}\$" || true)"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/lib/runtime.sh` around lines 183 - 195, The current substring matching when building matches (using gateway_name, matches, match_count, containers) can return ambiguous results; change the filter to perform an exact-line match instead of grep -F substring matching (e.g. use grep -xF -- "$gateway_name" or equivalent awk/grep that compares the whole line) so only container names exactly equal to gateway_name are counted; update the grep invocation that produces matches and leave the subsequent match_count and branching logic intact.
114-128: Loopback detection is IPv4-only.

`is_loopback_ip` and the awk pattern in `first_non_loopback_nameserver` only match IPv4 loopback addresses (127.*). IPv6 loopback (::1) would not be filtered. This is likely fine for the Docker/k3s context where 127.0.0.11 is typical, but worth noting if IPv6 becomes relevant.

♻️ Optional: Add IPv6 loopback handling

 is_loopback_ip() {
   local ip="${1:-}"
-  [[ "$ip" == 127.* ]]
+  [[ "$ip" == 127.* || "$ip" == "::1" ]]
 }

And in `first_non_loopback_nameserver`:

 printf '%s\n' "$resolv_conf" \
-  | awk '$1 == "nameserver" && $2 !~ /^127\./ { print $2; exit }'
+  | awk '$1 == "nameserver" && $2 !~ /^127\./ && $2 != "::1" { print $2; exit }'

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/lib/runtime.sh` around lines 114 - 128, The loopback detection currently only handles IPv4 (127.*); update is_loopback_ip and the nameserver filter in first_non_loopback_nameserver to also treat IPv6 loopback addresses as loopback (e.g. "::1" and IPv4-mapped loopback forms like "::ffff:127.*") so they are excluded; specifically, change is_loopback_ip to return true for addresses matching 127.* or exactly "::1" or starting with "::ffff:127." and update the awk expression in first_non_loopback_nameserver (the nameserver filter) to skip entries where $2 matches /^127\./ or /^::1$/ or /^::ffff:127\./ so IPv6 loopback nameservers are ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/inference-config.js`:
- Around line 32-38: The helper getOpenClawPrimaryModel currently falls back to
the cloud model when model is undefined for the "vllm-local" provider, causing a
local selection to be treated as cloud; update the logic in the vllm-local case
in getOpenClawPrimaryModel so that when model is falsy it returns "vllm-local"
(matching getProviderSelectionConfig) instead of the cloud default, and apply
the same change to the other similar case block (lines corresponding to the
second vllm-local branch) so both branches use "vllm-local" as the default model
when model is omitted.
In `@scripts/install.sh`:
- Around line 365-369: Replace the git-based global install calls that use the
NemoClaw repo with a pinned npm package install: update the branches that check
NODE_MGR and the npm install commands (the lines that call "npm install -g
git+https://github.com/NVIDIA/NemoClaw.git") to instead run "npm install -g
nemoclaw@0.1.0" so installs are version-pinned and do not require git; keep the
existing sudo conditional around the install when NODE_MGR="nodesource".
---
Outside diff comments:
In `@bin/lib/onboard.js`:
- Around line 586-594: The Ollama branch always calls promptOllamaModel() and
blocks interactive CI; change it to honor the precomputed requestedModel and
avoid prompting when non-interactive. In the selected.key === "ollama" block use
the existing requestedModel (or process.env.NEMOCLAW_MODEL) if present: set
model = requestedModel (or env fallback) else, only if process.stdin.isTTY
(interactive), call model = await promptOllamaModel(); keep existing Ollama
start logic (ollamaRunning, run(), sleep()), and then set provider =
"ollama-local" as before.
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 417-428: The confirmation summary currently logs the provider
twice: a human-friendly label using summaryConfig.providerLabel (logged in the
logger.info call that interpolates describeOnboardEndpoint) and a technical name
using providerName (the final logger.info). Change the final logger.info to use
a distinct label such as "Provider Name:" or "Provider (internal):" instead of
"Provider:" so it no longer duplicates summaryConfig.providerLabel; update the
logger.info that references providerName accordingly (the logger.info that
currently logs `Provider: ${providerName}`).
---
Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 72-80: The conditional in buildSandboxConfigSyncScript wrongly
uses selectionConfig.model to choose the local provider for profile ===
"inference-local", collapsing non-default Ollama models into "nvidia-nim";
change the branch to switch on selectionConfig.endpointType (e.g., "ollama",
"vllm", etc.) when profile is "inference-local" so that endpointType determines
"ollama-local", "vllm-local" or "nvidia-nim" as appropriate; update any related
comparisons to use selectionConfig.endpointType instead of selectionConfig.model
while preserving the DEFAULT_OLLAMA_MODEL constant use elsewhere.
In `@bin/lib/platform.js`:
- Around line 59-63: The macOS socket candidate ordering currently prefers
Colima (via getColimaDockerSocketCandidates) causing detectDockerHost to pick
Colima even when Docker Desktop is installed; change the candidate ordering so
Docker Desktop socket(s) are checked before Colima (or update detectDockerHost
to prefer Docker Desktop sockets over Colima). Locate the platform === "darwin"
branch where getColimaDockerSocketCandidates({ home }) is used and reorder or
prepend the Docker Desktop socket candidate(s) (the docker.sock path(s) used by
Docker Desktop) so detectDockerHost and the module-export in runner.js will
select Docker Desktop first.
In `@install.sh`:
- Around line 73-75: The current early-return in ensure_nemoclaw_shim() uses the
presence of npm_bin or NEMOCLAW_SHIM_DIR in ORIGINAL_PATH to return success even
when the actual shim file is missing, causing verify_nemoclaw() to incorrectly
return 0; change the logic so that ensure_nemoclaw_shim() verifies the shim file
exists (check for the executable at $NEMOCLAW_SHIM_DIR/nemoclaw or the expected
shim path) before returning success, and only return 0 if the shim is present,
otherwise create the shim or return non-zero; update the same pattern in the
other occurrence (lines ~251-265) so both places inspect the actual shim file
rather than just PATH membership, and ensure verify_nemoclaw() and run_onboard()
receive a failure code when the shim is absent.
In `@nemoclaw/src/index.ts`:
- Around line 150-177: The fallback in activeModelEntries currently returns a
hardcoded four-model list when loadOnboardConfig() is empty; replace that
hardcoded array with the shared curated catalog used by the onboarding selector
so both UIs stay in sync. Locate the activeModelEntries function and instead of
returning the inline ModelProviderEntry[] literal, import (or call) the existing
shared catalog provider used by the onboarding selector (the same source that
produces the Kimi/GLM/MiniMax/Qwen/GPT-OSS entries) and return its entries;
ensure types match ModelProviderEntry and preserve contextWindow/maxOutput
fields from the catalog.
In `@README.md`:
- Line 153: Update the single README sentence so it reflects that Ollama is now
a supported local inference option while vLLM remains experimental: replace the
combined statement with two clauses (or two sentences) naming Ollama as a
supported/local path and vLLM as still experimental; keep the macOS/OpenShell
host-routing note attached to the option(s) that actually require it (mention
OpenShell host-routing in the vLLM macOS note if applicable).
In `@scripts/nemoclaw-start.sh`:
- Around line 33-35: If NEMOCLAW_MODEL is not set, remove any existing
agents.defaults.model.primary entry so stale model selection doesn't persist;
update the logic around default_model (the block that currently uses
cfg.setdefault('agents', {}).setdefault('defaults', {}).setdefault('model',
{})['primary']) to delete or pop the 'primary' key from
cfg['agents']['defaults']['model'] when default_model is falsy, ensuring the
config falls back to the managed default path.
In `@scripts/smoke-macos-install.sh`:
- Around line 161-176: The loop in feed_install_answers (which writes to
answers_pipe and reads install_log) can block forever waiting for the "OpenClaw
gateway launched inside sandbox" string; add a timeout guard (e.g., max wait
seconds) and a timestamp/check to break after the timeout, write a diagnostic
message to install_log or stdout/stderr before falling through, and ensure the
final printf 'n\n' is still sent to answers_pipe on timeout. Update
feed_install_answers to accept or define a MAX_WAIT variable, use SANDBOX_NAME
and install_log in the diagnostic, and make the loop break on either success or
timeout so callers won’t hang.
In `@test/smoke-macos-install.test.js`:
- Around line 91-95: The spawnSync invocation running the sourced smoke flow
(variable script) should be bounded to avoid hangs: add a timeout (e.g.,
timeout: <milliseconds>) and a killSignal option to the options object passed to
spawnSync, and after the call check result.error or result.signal to detect an
ETIMEDOUT/timeout and explicitly fail the test with a clear message; update the
code around spawnSync (the result assignment and subsequent assertions) to treat
a timeout (result.error.code === 'ETIMEDOUT' or result.signal indicating the
killSignal) as a test failure so the job doesn't hang indefinitely.
- Around line 55-64: The test "fails when a requested runtime socket is
unavailable" currently uses a fixed HOME (/tmp/nemoclaw-smoke-no-runtime) which
can leak state between runs; change the test to create a unique temporary
directory for HOME (e.g., via fs.mkdtempSync or os.tmpdir + mkdtemp) before
calling spawnSync with SMOKE_SCRIPT and set env.HOME to that temp dir, then
remove the temp directory in a finally/after hook to ensure isolation and
cleanup; update the test surrounding symbols it("fails when a requested runtime
socket is unavailable") and the spawnSync call that uses SMOKE_SCRIPT to use
this temp HOME.
---
Nitpick comments:
In `@scripts/lib/runtime.sh`:
- Around line 183-195: The current substring matching when building matches
(using gateway_name, matches, match_count, containers) can return ambiguous
results; change the filter to perform an exact-line match instead of grep -F
substring matching (e.g. use grep -xF -- "$gateway_name" or equivalent awk/grep
that compares the whole line) so only container names exactly equal to
gateway_name are counted; update the grep invocation that produces matches and
leave the subsequent match_count and branching logic intact.
- Around line 114-128: The loopback detection currently only handles IPv4
(127.*); update is_loopback_ip and the nameserver filter in
first_non_loopback_nameserver to also treat IPv6 loopback addresses as loopback
(e.g. "::1" and IPv4-mapped loopback forms like "::ffff:127.*") so they are
excluded; specifically, change is_loopback_ip to return true for addresses
matching 127.* or exactly "::1" or starting with "::ffff:127." and update the
awk expression in first_non_loopback_nameserver (the nameserver filter) to skip
entries where $2 matches /^127\./ or /^::1$/ or /^::ffff:127\./ so IPv6 loopback
nameservers are ignored.
In `@scripts/setup.sh`:
- Around line 144-148: The startup block for Ollama swallows errors and may hide
failures; after launching OLLAMA_HOST=0.0.0.0:11434 ollama serve in the
background (the section using OLLAMA_HOST and the call to `ollama serve` guarded
by `check_local_provider_health "ollama-local"`), add a post-startup health
probe that retries the existing check_local_provider_health function for a short
loop (e.g., retry every second for 10s) and, on failure, capture and print the
Ollama logs (avoid redirecting to /dev/null) and exit with non-zero status so
the caller knows startup failed. Ensure the probe uses the same health-check
function (check_local_provider_health) and that background process output is
redirected to a temporary log file which is printed on error.
In `@test/runtime-shell.test.js`:
- Line 13: The empty afterEach callback in the test file is redundant; remove
the standalone afterEach(() => {}); invocation (referencing the afterEach symbol
in test/runtime-shell.test.js) to clean up the test suite and leave no-op hooks
out of the codebase.
- Around line 34-47: The test "prefers Colima over Docker Desktop" currently
calls fs.rmSync at the end so an assertion failure will leave temp files; modify
the test to guarantee cleanup by moving fs.rmSync into a finally block around
the assertions (or register the created temp dir for removal in an afterEach),
i.e., wrap the runShell/assert block in try { ... } finally { fs.rmSync(home, {
recursive: true, force: true }) } or push the path to a list cleaned by an
afterEach, referencing the test name and the runShell and fs.rmSync calls to
locate the code to change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b99d4f89-1f94-403d-87fe-7bb6ed813421
⛔ Files ignored due to path filters (15)
- `nemoclaw/dist/cli.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/cli.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/onboard.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/onboard.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/commands/onboard.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/slash.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/commands/slash.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/commands/slash.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/index.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/index.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/index.js.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/onboard/config.d.ts` is excluded by `!**/dist/**`
- `nemoclaw/dist/onboard/config.d.ts.map` is excluded by `!**/dist/**`, `!**/*.map`
- `nemoclaw/dist/onboard/config.js` is excluded by `!**/dist/**`
- `nemoclaw/dist/onboard/config.js.map` is excluded by `!**/dist/**`, `!**/*.map`
📒 Files selected for processing (32)
- `README.md`
- `bin/lib/credentials.js`
- `bin/lib/inference-config.js`
- `bin/lib/local-inference.js`
- `bin/lib/onboard.js`
- `bin/lib/platform.js`
- `bin/lib/runner.js`
- `bin/nemoclaw.js`
- `install.sh`
- `nemoclaw/src/cli.ts`
- `nemoclaw/src/commands/onboard.ts`
- `nemoclaw/src/commands/slash.ts`
- `nemoclaw/src/index.ts`
- `nemoclaw/src/onboard/config.ts`
- `scripts/fix-coredns.sh`
- `scripts/install.sh`
- `scripts/lib/runtime.sh`
- `scripts/nemoclaw-start.sh`
- `scripts/setup.sh`
- `scripts/smoke-macos-install.sh`
- `test/credentials.test.js`
- `test/inference-config.test.js`
- `test/install-preflight.test.js`
- `test/local-inference.test.js`
- `test/onboard-selection.test.js`
- `test/onboard.test.js`
- `test/platform.test.js`
- `test/runner.test.js`
- `test/runtime-shell.test.js`
- `test/smoke-macos-install.test.js`
- `test/uninstall.test.js`
- `uninstall.sh`
✅ Files skipped from review due to trivial changes (1)
- test/platform.test.js
🚧 Files skipped from review as they are similar to previous changes (9)
- test/local-inference.test.js
- bin/lib/local-inference.js
- bin/lib/credentials.js
- nemoclaw/src/commands/slash.ts
- test/credentials.test.js
- test/uninstall.test.js
- test/onboard.test.js
- uninstall.sh
- bin/nemoclaw.js
case "vllm-local":
  return {
    endpointType: "custom",
    endpointUrl: INFERENCE_ROUTE_URL,
    ncpPartner: null,
    model: model || "vllm-local",
    profile: DEFAULT_ROUTE_PROFILE,
Keep the primary-model helper aligned with the provider defaults.
getProviderSelectionConfig("vllm-local") defaults to "vllm-local", but getOpenClawPrimaryModel("vllm-local") falls back to the cloud model when model is omitted. That silently misidentifies a local vLLM selection as cloud and reintroduces the implicit fallback this module is supposed to remove.
Suggested fix
function getOpenClawPrimaryModel(provider, model) {
- const resolvedModel =
- model || (provider === "ollama-local" ? DEFAULT_OLLAMA_MODEL : DEFAULT_CLOUD_MODEL);
- return resolvedModel ? `${MANAGED_PROVIDER_ID}/${resolvedModel}` : null;
+ let resolvedModel = model;
+ if (!resolvedModel) {
+ switch (provider) {
+ case "nvidia-nim":
+ resolvedModel = DEFAULT_CLOUD_MODEL;
+ break;
+ case "vllm-local":
+ resolvedModel = "vllm-local";
+ break;
+ case "ollama-local":
+ resolvedModel = DEFAULT_OLLAMA_MODEL;
+ break;
+ default:
+ return null;
+ }
+ }
+
+ return `${MANAGED_PROVIDER_ID}/${resolvedModel}`;
 }

Also applies to: 59-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/inference-config.js` around lines 32 - 38, The helper
getOpenClawPrimaryModel currently falls back to the cloud model when model is
undefined for the "vllm-local" provider, causing a local selection to be treated
as cloud; update the logic in the vllm-local case in getOpenClawPrimaryModel so
that when model is falsy it returns "vllm-local" (matching
getProviderSelectionConfig) instead of the cloud default, and apply the same
change to the other similar case block (lines corresponding to the second
vllm-local branch) so both branches use "vllm-local" as the default model when
model is omitted.
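The per-provider default resolution in the suggested fix above can be exercised on its own. A hypothetical standalone version follows; the constant values are illustrative stand-ins, not the module's real defaults:

```javascript
// Hypothetical stand-ins for the module's constants.
const MANAGED_PROVIDER_ID = "nemoclaw";
const DEFAULT_CLOUD_MODEL = "cloud-default";
const DEFAULT_OLLAMA_MODEL = "ollama-default";

// Resolve the primary model id, defaulting per provider instead of always
// falling back to the cloud model when model is omitted.
function getOpenClawPrimaryModel(provider, model) {
  let resolvedModel = model;
  if (!resolvedModel) {
    switch (provider) {
      case "nvidia-nim":
        resolvedModel = DEFAULT_CLOUD_MODEL;
        break;
      case "vllm-local":
        resolvedModel = "vllm-local";
        break;
      case "ollama-local":
        resolvedModel = DEFAULT_OLLAMA_MODEL;
        break;
      default:
        return null;
    }
  }
  return `${MANAGED_PROVIDER_ID}/${resolvedModel}`;
}
```

An explicit model always wins; only the omitted-model case branches by provider, so the vllm-local selection can no longer be misread as cloud.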
 if [ "$NODE_MGR" = "nodesource" ]; then
-  sudo npm install -g nemoclaw
+  sudo npm install -g git+https://github.com/NVIDIA/NemoClaw.git
 else
-  npm install -g nemoclaw
+  npm install -g git+https://github.com/NVIDIA/NemoClaw.git
 fi
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if NemoClaw has releases/tags that could be used for version pinning
gh release list --repo NVIDIA/NemoClaw --limit 5 2>/dev/null || echo "No releases found or repo not accessible"
git ls-remote --tags https://github.com/NVIDIA/NemoClaw.git 2>/dev/null | head -5 || echo "Could not fetch tags"

Repository: NVIDIA/NemoClaw
Length of output: 99
🏁 Script executed:
#!/bin/bash
# Verify NemoClaw repository state with better error handling
echo "=== Attempting git ls-remote directly ==="
git ls-remote --tags https://github.com/NVIDIA/NemoClaw.git 2>&1
echo ""
echo "=== Checking repository existence ==="
curl -s -I https://api.github.com/repos/NVIDIA/NemoClaw | head -5
echo ""
echo "=== Attempting to fetch specific branch info ==="
curl -s https://api.github.com/repos/NVIDIA/NemoClaw | jq -r '.pushed_at, .default_branch' 2>/dev/null || echo "Could not fetch repository metadata"

Repository: NVIDIA/NemoClaw
Length of output: 416
🏁 Script executed:
#!/bin/bash
# Check if we can at least reference a commit hash as alternative
echo "=== Getting recent commits ==="
git ls-remote --heads https://github.com/NVIDIA/NemoClaw.git main 2>&1 | head -1
echo ""
echo "=== Checking if this is a valid npm package name on npm registry ==="
npm view NVIDIA/NemoClaw 2>&1 | head -10 || echo "Not found in npm registry"

Repository: NVIDIA/NemoClaw
Length of output: 474
Use the published npm package version instead of installing from git.
The repository has no git tags for version pinning. However, nemoclaw@0.1.0 is published on npm and can be installed directly:
if [ "$NODE_MGR" = "nodesource" ]; then
sudo npm install -g nemoclaw@0.1.0
else
npm install -g nemoclaw@0.1.0
fi

This provides reproducible, version-pinned installs without requiring git during installation.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-if [ "$NODE_MGR" = "nodesource" ]; then
-  sudo npm install -g nemoclaw
-  sudo npm install -g git+https://github.com/NVIDIA/NemoClaw.git
-else
-  npm install -g nemoclaw
-  npm install -g git+https://github.com/NVIDIA/NemoClaw.git
-fi
+if [ "$NODE_MGR" = "nodesource" ]; then
+  sudo npm install -g nemoclaw@0.1.0
+else
+  npm install -g nemoclaw@0.1.0
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/install.sh` around lines 365 - 369, Replace the git-based global
install calls that use the NemoClaw repo with a pinned npm package install:
update the branches that check NODE_MGR and the npm install commands (the lines
that call "npm install -g git+https://github.com/NVIDIA/NemoClaw.git") to
instead run "npm install -g nemoclaw@0.1.0" so installs are version-pinned and
do not require git; keep the existing sudo conditional around the install when
NODE_MGR="nodesource".
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bin/lib/onboard.js (1)
587-596: ⚠️ Potential issue | 🟠 Major

Avoid interactive prompts for Ollama model selection in non-interactive mode.

Line 595 and Line 604 always call `promptOllamaModel()`. In `--non-interactive` mode this can block automation even though `requestedModel` is already available.

Suggested fix

 } else if (selected.key === "ollama") {
   if (!ollamaRunning) {
     console.log("  Starting Ollama...");
     run("OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &", { ignoreError: true });
     sleep(2);
   }
   console.log("  ✓ Using Ollama on localhost:11434");
   provider = "ollama-local";
-  model = await promptOllamaModel();
+  if (isNonInteractive()) {
+    const availableModels = getOllamaModelOptions(runCapture);
+    model = requestedModel || getDefaultOllamaModel(runCapture);
+    if (!availableModels.includes(model)) {
+      console.error(`  Unsupported NEMOCLAW_MODEL for Ollama: ${model}`);
+      process.exit(1);
+    }
+    console.log(`  [non-interactive] Ollama model: ${model}`);
+  } else {
+    model = await promptOllamaModel();
+  }
 } else if (selected.key === "install-ollama") {
   console.log("  Installing Ollama via Homebrew...");
   run("brew install ollama", { ignoreError: true });
   console.log("  Starting Ollama...");
   run("OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &", { ignoreError: true });
   sleep(2);
   console.log("  ✓ Using Ollama on localhost:11434");
   provider = "ollama-local";
-  model = await promptOllamaModel();
+  if (isNonInteractive()) {
+    const availableModels = getOllamaModelOptions(runCapture);
+    model = requestedModel || getDefaultOllamaModel(runCapture);
+    if (!availableModels.includes(model)) {
+      console.error(`  Unsupported NEMOCLAW_MODEL for Ollama: ${model}`);
+      process.exit(1);
+    }
+    console.log(`  [non-interactive] Ollama model: ${model}`);
+  } else {
+    model = await promptOllamaModel();
+  }

Also applies to: 596-605
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 587 - 596, The code always calls promptOllamaModel() for the "ollama" (and "install-ollama") branches which blocks in non-interactive runs; change the logic in those branches to use the provided requestedModel (or other non-interactive flag) first and only call promptOllamaModel() when requestedModel is falsy and interactive mode is enabled — i.e., set model = requestedModel || await promptOllamaModel() (or skip prompt when nonInteractive is true), updating both places where promptOllamaModel() is invoked so automation won't hang.
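The precedence the prompt describes reduces to a small ordering rule. A hypothetical sketch follows; `promptFn` stands in for `promptOllamaModel`, and the function and parameter names are illustrative, not the module's real API:

```javascript
// Hypothetical model resolution: an explicit request wins, then the env var,
// and the interactive prompt runs only when a TTY is attached.
async function resolveOllamaModel({ requestedModel, env, isTTY, promptFn }) {
  const preset = requestedModel || env.NEMOCLAW_MODEL;
  if (preset) return preset;
  if (isTTY) return promptFn();
  return null; // non-interactive with nothing requested: caller picks the default
}
```

Automation that sets `NEMOCLAW_MODEL` (or passes a requested model) never reaches the prompt, so non-interactive runs cannot hang.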
♻️ Duplicate comments (1)
bin/lib/onboard.js (1)
73-82: ⚠️ Potential issue | 🔴 Critical

Derive sandbox provider from `endpointType`, not `model` default match.

Line 75 - Line 81 still misclassifies local selections when the Ollama model is not `DEFAULT_OLLAMA_MODEL` (and can also mis-map vLLM local). This produces incorrect OpenClaw provider/model config.

Suggested fix

 function buildSandboxConfigSyncScript(selectionConfig) {
-  const providerType =
-    selectionConfig.profile === "inference-local"
-      ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
-        ? "ollama-local"
-        : "nvidia-nim"
-      : selectionConfig.endpointType === "vllm"
+  const providerType =
+    selectionConfig.endpointType === "ollama"
+      ? "ollama-local"
+      : selectionConfig.endpointType === "vllm"
       ? "vllm-local"
       : "nvidia-nim";
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 73 - 82, The providerType derivation in buildSandboxConfigSyncScript is using selectionConfig.model to decide between "ollama-local" and "nvidia-nim", which misclassifies local endpoints; change the logic to derive providerType from selectionConfig.endpointType (and only fall back to model checks for non-local profiles if needed). Update the providerType assignment so it checks selectionConfig.profile === "inference-local" then uses selectionConfig.endpointType === "ollama" (or the appropriate endpoint string) to pick "ollama-local", selectionConfig.endpointType === "vllm" to pick "vllm-local", otherwise default to "nvidia-nim"; keep the call to getOpenClawPrimaryModel(providerType, selectionConfig.model) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@bin/lib/onboard.js`:
- Around line 587-596: The code always calls promptOllamaModel() for the
"ollama" (and "install-ollama") branches which blocks in non-interactive runs;
change the logic in those branches to use the provided requestedModel (or other
non-interactive flag) first and only call promptOllamaModel() when
requestedModel is falsy and interactive mode is enabled — i.e., set model =
requestedModel || await promptOllamaModel() (or skip prompt when nonInteractive
is true), updating both places where promptOllamaModel() is invoked so
automation won't hang.
---
Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 73-82: The providerType derivation in buildSandboxConfigSyncScript
is using selectionConfig.model to decide between "ollama-local" and
"nvidia-nim", which misclassifies local endpoints; change the logic to derive
providerType from selectionConfig.endpointType (and only fall back to model
checks for non-local profiles if needed). Update the providerType assignment so
it checks selectionConfig.profile === "inference-local" then uses
selectionConfig.endpointType === "ollama" (or the appropriate endpoint string)
to pick "ollama-local", selectionConfig.endpointType === "vllm" to pick
"vllm-local", otherwise default to "nvidia-nim"; keep the call to
getOpenClawPrimaryModel(providerType, selectionConfig.model) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1ec3e656-a1df-45f6-9c47-9ae21f2c46d5
📒 Files selected for processing (3)
- `bin/lib/local-inference.js`
- `bin/lib/onboard.js`
- `test/local-inference.test.js`
🚧 Files skipped from review as they are similar to previous changes (1)
- bin/lib/local-inference.js
♻️ Duplicate comments (2)
bin/lib/onboard.js (1)
74-83: ⚠️ Potential issue | 🟠 Major

Derive the sandbox provider from `endpointType`, not the model name.

The current logic only recognizes `DEFAULT_OLLAMA_MODEL` as Ollama. Any other Ollama model (e.g., "qwen3:32b") would incorrectly collapse to "nvidia-nim", causing the generated OpenClaw config to use wrong provider settings.

💡 Proposed fix — derive from endpointType consistently

 function buildSandboxConfigSyncScript(selectionConfig) {
   const providerType =
-    selectionConfig.profile === "inference-local"
-      ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
-        ? "ollama-local"
-        : "nvidia-nim"
-      : selectionConfig.endpointType === "vllm"
+    selectionConfig.endpointType === "ollama"
+      ? "ollama-local"
+      : selectionConfig.endpointType === "vllm"
       ? "vllm-local"
       : "nvidia-nim";
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 74 - 83, In buildSandboxConfigSyncScript, providerType is derived from the model name (checking DEFAULT_OLLAMA_MODEL) which misidentifies other Ollama models; change the logic to derive providerType from selectionConfig.endpointType instead: when selectionConfig.profile === "inference-local" map selectionConfig.endpointType (e.g., "ollama" -> "ollama-local", "vllm" -> "vllm-local", default -> "nvidia-nim"), then pass that providerType into getOpenClawPrimaryModel as before so OpenClaw config uses the correct provider.

bin/lib/local-inference.js (1)
41-45: ⚠️ Potential issue | 🟡 Minor

Fail unknown local provider IDs instead of silently accepting them.

When `getLocalProviderHealthCheck()` returns `null` for an unrecognized provider, the function currently returns `{ ok: true }`. A typo such as `"ollama"` vs `"ollama-local"` will pass validation here and only fail later when the base URL lookup returns `null`.

💡 Proposed fix

```diff
 function validateLocalProvider(provider, runCapture) {
   const command = getLocalProviderHealthCheck(provider);
   if (!command) {
-    return { ok: true };
+    return {
+      ok: false,
+      message: `Unsupported local inference provider: ${provider}. Valid options: ollama-local, vllm-local.`,
+    };
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/local-inference.js` around lines 41 - 45, The function validateLocalProvider currently treats a null result from getLocalProviderHealthCheck(provider) as success; change it to fail-fast for unknown local provider IDs by returning a failure result (e.g., { ok: false, error: `Unknown local provider: ${provider.id}` } or similar) when getLocalProviderHealthCheck returns null, so validateLocalProvider reports unrecognized providers instead of silently accepting them; update any calling code that expects the previous { ok: true } behavior if necessary.
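A minimal sketch of the fail-fast behavior this comment asks for. The function and provider names come from the review; the health-check table and the curl commands are stand-ins for the real `getLocalProviderHealthCheck` in `bin/lib/local-inference.js`, not its actual implementation:

```javascript
// Hypothetical stand-in for the real lookup: maps known local provider
// IDs to a shell health-check command; unknown IDs return null.
const HEALTH_CHECKS = {
  "ollama-local": "curl -sf http://localhost:11434/api/tags",
  "vllm-local": "curl -sf http://localhost:8000/v1/models",
};

function getLocalProviderHealthCheck(provider) {
  return HEALTH_CHECKS[provider] || null;
}

// Fail fast on unknown IDs instead of returning { ok: true }.
function validateLocalProvider(provider, runCapture) {
  const command = getLocalProviderHealthCheck(provider);
  if (!command) {
    return {
      ok: false,
      message: `Unsupported local inference provider: ${provider}. Valid options: ollama-local, vllm-local.`,
    };
  }
  try {
    runCapture(command); // throws if the health check fails
    return { ok: true };
  } catch (err) {
    return { ok: false, message: `Health check failed: ${err.message}` };
  }
}

console.log(validateLocalProvider("ollama", () => "").ok); // false: typo caught early
console.log(validateLocalProvider("ollama-local", () => "").ok); // true
```

The same assertions double as the unknown-provider regression test suggested for `test/local-inference.test.js`: pass a typo ID with a no-op `runCapture` and assert `ok === false` and that the message names the bad ID.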
🧹 Nitpick comments (4)
bin/lib/onboard.js (2)
66-72: `shellQuote` is duplicated from `local-inference.js`.

Consider importing `shellQuote` from `local-inference.js` instead of redefining it here to maintain a single source of truth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 66 - 72, The function shellQuote is duplicated — remove the local definition of shellQuote in onboard.js and import the single implementation from local-inference.js instead; update references in onboard.js to use the imported shellQuote (leave pythonLiteralJson as-is), ensuring the import statement names the same exported symbol from local-inference.js and that any module/export style (CommonJS require or ES module import) matches the project’s existing pattern.
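For reference, the duplicated helper is likely a variant of the standard POSIX single-quote-escaping idiom; a sketch (the real implementation in `local-inference.js`/`onboard.js` may differ in details):

```javascript
// Standard POSIX single-quote escaping: wrap the value in single quotes
// and replace each embedded ' with '\'' (close quote, escaped quote, reopen).
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

console.log(shellQuote("qwen3:32b"));    // 'qwen3:32b'
console.log(shellQuote("it's a model")); // 'it'\''s a model'
```

Exporting one copy (e.g., `module.exports = { shellQuote }` from `local-inference.js`) and requiring it in `onboard.js` resolves the duplication either way.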
614-628: Potential double assignment of `model` in cloud provider path.

Line 624 assigns `model` from the prompt (when interactive), but line 626 unconditionally reassigns it to `model || requestedModel || DEFAULT_CLOUD_MODEL`. In interactive mode, if the user selected a model at line 624, `model` would already be set and line 626 would be a no-op. However, the intent is unclear; consider consolidating the logic.

💡 Suggested clarification

```diff
 if (provider === "nvidia-nim") {
   if (isNonInteractive()) {
     // In non-interactive mode, NVIDIA_API_KEY must be set via env var
     if (!process.env.NVIDIA_API_KEY) {
       console.error("  NVIDIA_API_KEY is required for cloud provider in non-interactive mode.");
       console.error("  Set it via: NVIDIA_API_KEY=nvapi-... nemoclaw onboard --non-interactive");
       process.exit(1);
     }
+    model = requestedModel || DEFAULT_CLOUD_MODEL;
   } else {
     await ensureApiKey();
-    model = model || (await promptCloudModel()) || DEFAULT_CLOUD_MODEL;
+    model = await promptCloudModel();
   }
-  model = model || requestedModel || DEFAULT_CLOUD_MODEL;
   console.log(`  Using NVIDIA Cloud API with model: ${model}`);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 614 - 628, The code double-assigns model in the nvidia-nim branch; consolidate the logic so model is set exactly once after handling interactive vs non-interactive flows. Specifically, in the interactive branch where ensureApiKey() and promptCloudModel() are used (functions ensureApiKey() and promptCloudModel()), set model = model || (await promptCloudModel()) || requestedModel || DEFAULT_CLOUD_MODEL and remove the later unconditional assignment; in the non-interactive branch ensure NVIDIA_API_KEY is present and then set model = model || requestedModel || DEFAULT_CLOUD_MODEL before logging.

test/local-inference.test.js (1)
51-82: Consider adding a test for unknown provider IDs.

The tests thoroughly cover known providers (ollama-local, vllm-local), but there's no test verifying behavior when an unknown provider (e.g., `"typo-local"`) is passed. This would help document and catch regressions in the current fail-open behavior (or validate a fix).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/local-inference.test.js` around lines 51 - 82, Add a new test case in test/local-inference.test.js that calls validateLocalProvider with an unknown provider id (e.g., "typo-local") and a minimal mock responder; assert that the returned object has ok === false and that result.message mentions the unknown provider id ("typo-local") so the behavior for unknown providers is explicitly tested. Use the same test pattern as the existing cases (it(...), calling validateLocalProvider("typo-local", () => "") and asserting result.ok is false and result.message includes "typo-local").

bin/lib/local-inference.js (1)
117-119: Duplicate `shellQuote` implementation exists in `onboard.js`.

This function is identical to the one defined in `bin/lib/onboard.js` (lines 66-68). Consider exporting it from one location to avoid duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/local-inference.js` around lines 117 - 119, The function shellQuote is duplicated (defined in shellQuote in local-inference.js and the same implementation in onboard.js); extract this utility into a single shared module (e.g., create a new exported helper, e.g., export function shellQuote(...) from a shared utils file), update local-inference.js and onboard.js to import and use that exported shellQuote, and remove the duplicate definitions so only the shared exported shellQuote remains referenced by both modules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d537b395-dc32-401d-b59d-02a758c3d397
📒 Files selected for processing (3)
bin/lib/local-inference.jsbin/lib/onboard.jstest/local-inference.test.js
The Ollama branch in setupNim() called promptOllamaModel() directly, ignoring non-interactive mode. This caused onboard --non-interactive with NEMOCLAW_PROVIDER=ollama to hang waiting for input. Now checks isNonInteractive() and uses NEMOCLAW_MODEL env var or getDefaultOllamaModel() fallback instead of prompting.
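The fixed selection order described in the commit can be sketched as a small helper. The helper names come from the commit message (`isNonInteractive`, `getDefaultOllamaModel`, `promptOllamaModel`); their bodies here are stubs, and `chooseOllamaModel` plus the default model string are hypothetical illustrations rather than the actual `setupNim()` code:

```javascript
// Stubs standing in for the real helpers in bin/lib/onboard.js.
const isNonInteractive = () => process.env.NEMOCLAW_NON_INTERACTIVE === "1";
const getDefaultOllamaModel = () => "qwen3:32b"; // hypothetical default
const promptOllamaModel = async () => {
  throw new Error("would block waiting on stdin"); // the old hang
};

// Non-interactive mode must never reach the prompt: prefer the
// NEMOCLAW_MODEL env var, then the default; prompt only when interactive.
async function chooseOllamaModel() {
  if (isNonInteractive()) {
    return process.env.NEMOCLAW_MODEL || getDefaultOllamaModel();
  }
  return (await promptOllamaModel()) || getDefaultOllamaModel();
}

process.env.NEMOCLAW_NON_INTERACTIVE = "1";
process.env.NEMOCLAW_MODEL = "llama3.1:8b";
chooseOllamaModel().then((m) => console.log(m)); // llama3.1:8b
```

The key property is that in non-interactive mode the prompt function is never awaited, so `onboard --non-interactive` with `NEMOCLAW_PROVIDER=ollama` can no longer hang.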
ericksoa
left a comment
LGTM — thorough work making Ollama first-class. Tested on macOS: 137 unit tests + 22 vitest pass, cloud inference works, non-interactive mode works for both cloud and Ollama. Code review clean — no injection risks, proper quoting, new modules are well-structured. Added a commit (efe9678) for non-interactive Ollama model selection. Nice job @kjw3.
♻️ Duplicate comments (1)
bin/lib/onboard.js (1)
75-82: ⚠️ Potential issue | 🔴 Critical

Derive `providerType` from `selectionConfig.endpointType`, not `selectionConfig.model`.

Line 77 causes non-default Ollama models (and local vLLM under `inference-local`) to fall back to `nvidia-nim`, which misconfigures the synced OpenClaw model/provider mapping.

Proposed fix

```diff
 function buildSandboxConfigSyncScript(selectionConfig) {
-  const providerType =
-    selectionConfig.profile === "inference-local"
-      ? selectionConfig.model === DEFAULT_OLLAMA_MODEL
-        ? "ollama-local"
-        : "nvidia-nim"
-      : selectionConfig.endpointType === "vllm"
-        ? "vllm-local"
-        : "nvidia-nim";
+  const providerType =
+    selectionConfig.endpointType === "ollama"
+      ? "ollama-local"
+      : selectionConfig.endpointType === "vllm"
+        ? "vllm-local"
+        : "nvidia-nim";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 75 - 82, The providerType ternary currently checks selectionConfig.model (and DEFAULT_OLLAMA_MODEL) which causes non-default Ollama models and local vLLM to map incorrectly; change the logic in the providerType computation to branch on selectionConfig.endpointType instead: when selectionConfig.profile === "inference-local" check selectionConfig.endpointType === "ollama" => "ollama-local", else selectionConfig.endpointType === "vllm" => "vllm-local", otherwise fall back to "nvidia-nim"; update the providerType expression that references selectionConfig and DEFAULT_OLLAMA_MODEL to use selectionConfig.endpointType for correct provider mapping.
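The mapping both reviews ask for reduces to a small pure function on `endpointType`; a sketch under the review's assumptions (field names and provider IDs from the review comments, not the merged code):

```javascript
// Map the onboarding selection to a sandbox provider id using
// endpointType alone, so any Ollama model (not just the default)
// resolves to the Ollama provider.
function deriveProviderType(selectionConfig) {
  if (selectionConfig.endpointType === "ollama") return "ollama-local";
  if (selectionConfig.endpointType === "vllm") return "vllm-local";
  return "nvidia-nim";
}

console.log(deriveProviderType({ endpointType: "ollama", model: "qwen3:32b" })); // ollama-local
console.log(deriveProviderType({ endpointType: "vllm" }));                       // vllm-local
console.log(deriveProviderType({ model: "nvidia/nemotron-3-super-120b-a12b" })); // nvidia-nim
```

Because the model name never participates, the "non-default Ollama model collapses to nvidia-nim" bug class is ruled out by construction.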
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
359-364: Update the CoreDNS comment to match current behavior.

Line 359 says "always run", but lines 361-364 execute conditionally. Please align the comment with `shouldPatchCoredns(runtime)` to avoid drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 359 - 364, The comment above the CoreDNS patch is inaccurate; update it to reflect the conditional execution. Change the comment to state that CoreDNS is patched only when shouldPatchCoredns(runtime) returns true (after calling getContainerRuntime()), so it no longer claims “always run”; keep references to getContainerRuntime(), shouldPatchCoredns(runtime), and the run(...) invocation running fix-coredns.sh to make clear the behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0429ed71-ced9-473d-93ea-0f2d84bdf915
📒 Files selected for processing (1)
bin/lib/onboard.js
Keep both the container runtime check (from feat/ollama-local-inference) and the sandbox name parameterization (from main). SCRIPT_DIR/REPO_DIR already defined earlier in the file via runtime.sh sourcing.
…ction (NVIDIA#295)

* centralize runtime detection and installer fallbacks
* fail fast on podman for macos
* tighten local inference setup paths
* add macos install smoke test
* fix smoke macos install cleanup trap
* add runtime selection to macos smoke test
* improve colima coredns upstream detection
* avoid consuming stdin in colima dns probe
* align smoke answers with preset prompt
* close stdin for onboarding child commands
* stage smoke answers through a pipe
* avoid smoke log hangs from inherited stdout
* allow prompt-driven installs to exit under piped stdin
* silence smoke answer feeder shutdown
* tighten macos runtime selection and dns patching
* clarify apple silicon macos support
* make curl installer self-contained
* fix macos installer preflight test on linux ci
* clarify runtime prerequisite wording
* preserve interactive prompts after first answer
* remove openshell gateway volume on uninstall
* restore tty for sandbox connect
* sync sandbox inference config from onboard selection
* sync sandbox config via connect stdin
* register ollama models with explicit provider ids
* add stable nemoclaw shim for nvm installs
* register ollama provider config in sandbox
* route sandbox inference through managed provider
* fix sandbox provider config serialization
* probe container reachability for local inference
* require explicit local provider selection
* align provider identity across managed inference UI
* make validated ollama path first-class
* add curated cloud model selector
* add ollama model selector
* warm selected ollama model and quiet sandbox sync output
* fail onboarding when selected ollama model is unhealthy
* fix: respect non-interactive mode in Ollama model selection

The Ollama branch in setupNim() called promptOllamaModel() directly, ignoring non-interactive mode. This caused onboard --non-interactive with NEMOCLAW_PROVIDER=ollama to hang waiting for input.

Now checks isNonInteractive() and uses NEMOCLAW_MODEL env var or getDefaultOllamaModel() fallback instead of prompting.

---------

Aaron Erickson <aerickson@nvidia.com>
Summary
This PR moves the local inference issue set forward by making the Ollama path usable, explicit, and validated across
Apple Silicon macOS and Linux.
It directly advances:
- #12 by requiring explicit provider selection instead of silently auto-selecting Ollama
- #24 by aligning provider/model identity across onboarding, sandbox config, and plugin UI
- #260 by validating the managed local Ollama path on Apple Silicon macOS
- #246 and #247 by getting the managed Ollama route working end-to-end with real replies instead of blank/broken local sessions
What changed
- `NEMOCLAW_EXPERIMENTAL=1`
- `https://inference.local/v1` route so OpenShell remains in the security path
- `nvidia/nemotron-3-super-120b-a12b`, `moonshotai/kimi-k2.5`, `z-ai/glm5`, `minimaxai/minimax-m2.5`, `qwen/qwen3.5-397b-a17b`, `openai/gpt-oss-120b`
- `ollama list`

Validation
Automated:
- `npm test`
- `npm --prefix nemoclaw run build`

Manual:

- `inference.local`

Remaining follow-up

- #157 remains separate from this PR; this work focuses on managed Ollama and cloud routing rather than Docker ModelRunner
- … path itself is validated
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests