Conversation
…NEMOCLAW_PREFERRED_API override

Backends like SGLang expose /v1/responses and pass the existing non-streaming validation probe, but their streaming mode only emits lifecycle events (created/in_progress/completed) without the granular content deltas OpenClaw requires (output_text.delta, etc.). This causes runtime failures after onboarding succeeds.

Changes:
- Add runStreamingEventProbe() in http-probe.ts that sends a stream:true request and verifies the SSE event stream includes response.output_text.delta
- Integrate the streaming probe into probeOpenAiLikeEndpoint for custom endpoints (probeStreaming: true) — falls back to /chat/completions when streaming events are incomplete
- Add shouldForceCompletionsApi() in validation.ts checking the NEMOCLAW_PREFERRED_API env var so users can bypass /responses entirely
- Wire both into validateCustomOpenAiLikeSelection
- Add unit tests for the new functions (11 new test cases)
- Document NEMOCLAW_PREFERRED_API, the NEMOCLAW_INFERENCE_API_OVERRIDE workaround, and a troubleshooting entry for the runtime failure scenario

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
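The SSE check behind the streaming probe can be sketched as a pure scan over the captured event stream. This is an illustrative sketch, not the actual http-probe.ts implementation: the helper name `findMissingStreamingEvents` and the exact parsing details are assumptions; only the required event name `response.output_text.delta` comes from the commit.

```typescript
// Sketch (hypothetical helper): given the raw body of a stream:true request,
// collect the SSE `event:` names and report which required events never appeared.
function findMissingStreamingEvents(
  sseBody: string,
  requiredEvents: string[] = ["response.output_text.delta"],
): string[] {
  const seen = new Set<string>();
  for (const line of sseBody.split(/\r?\n/)) {
    // SSE event lines look like "event: response.output_text.delta"
    if (line.startsWith("event:")) {
      seen.add(line.slice("event:".length).trim());
    }
  }
  return requiredEvents.filter((name) => !seen.has(name));
}
```

An SGLang-like stream that only emits the three lifecycle events would return `["response.output_text.delta"]`, which is the signal to fall back to /chat/completions.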
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Adds a streaming-event probe for OpenAI-compatible endpoints.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Onboard as Onboarding
    participant Validation as ValidationLogic
    participant Probe as HTTPProbe
    participant Server as OpenAI-Compatible Server
    User->>Onboard: run "nemoclaw onboard"
    Onboard->>Validation: read NEMOCLAW_PREFERRED_API
    alt preference forces completions
        Validation-->>Onboard: skip /v1/responses
        Onboard->>Probe: probe /v1/chat/completions
        Probe->>Server: probe /v1/chat/completions
        Server-->>Probe: OK
    else probe responses first
        Onboard->>Probe: probe /v1/responses (non-stream)
        Probe->>Server: probe /v1/responses
        Server-->>Probe: OK
        Onboard->>Probe: runStreamingEventProbe(/v1/responses)
        Probe->>Server: curl -N /v1/responses (stream)
        Server-->>Probe: SSE events
        Probe->>Probe: parse events, check response.output_text.delta
        alt required event present
            Probe-->>Onboard: {ok: true}
        else missing events
            Probe-->>Onboard: {ok: false, missingEvents: [...]}
            Onboard->>Probe: probe /v1/chat/completions (fallback)
            Probe->>Server: probe /v1/chat/completions
            Server-->>Probe: OK
        end
    end
    Probe-->>Onboard: final probe result
    Onboard-->>User: onboarding result (and image baked config)
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/lib/validation.ts (1)
132-140: Pass the preferred API into this helper instead of reading `process.env` here.

`validation.ts` is documented as a pure, input-driven module, but this addition now depends on process state. Moving the env lookup to `src/lib/onboard.ts` keeps this layer deterministic and easier to reuse/test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/validation.ts` around lines 132 - 140: the helper shouldForceCompletionsApi now reads process.env directly, which breaks the pure, input-driven design of validation.ts; change its signature to accept the preferred API string (e.g., preferredApi: string | undefined) and remove any env access from inside shouldForceCompletionsApi, then perform the trim()/toLowerCase() comparison there; update callers (notably in src/lib/onboard.ts) to read process.env.NEMOCLAW_PREFERRED_API, pass that value into shouldForceCompletionsApi, and adjust tests accordingly so validation.ts remains deterministic and testable.

docs/inference/switch-inference-providers.md (1)
87-89: Rewrite the passive sentence in active voice.
"No image rebuild is needed." reads passively. Say what NemoClaw does or what the reader does instead. As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/inference/switch-inference-providers.md` around lines 87 - 89: rewrite the passive sentence "No image rebuild is needed." in active voice—replace it with a clear actor-based line such as "You do not need to rebuild the image." or "NemoClaw does not require rebuilding the image." Update the sentence near the existing text that mentions patching `openclaw.json` (the line that currently reads "No image rebuild is needed.") so the statement explicitly names the actor.

docs/inference/use-local-inference.md (1)
147-149: Address the reader directly in this paragraph.
"This variable tells the wizard..." / "It works..." is feature-centric wording. The docs style guide asks for second person when you describe what the reader should do or what happens when they set a value. As per coding guidelines, "Second person ('you') when addressing the reader."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/inference/use-local-inference.md` around lines 147 - 149, Update the two sentences that currently start "This variable tells the wizard..." and "It works in both..." to use second-person wording addressing the reader; for example, replace with something like "Set this variable to make the wizard skip the /v1/responses probe and use /v1/chat/completions directly." and "This works in both interactive and non-interactive modes." Ensure the edited sentences mention the endpoints (/v1/responses and /v1/chat/completions) and keep the meaning unchanged while using "you"/direct instruction tone.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/skills/nemoclaw-user-configure-inference/SKILL.md:
- Around line 131-145: The SKILL.md changes were made directly but this artifact
must be regenerated from the canonical docs source; revert the manual edits in
SKILL.md (the nemoclaw-user-configure-inference SKILL.md) and run the project's
skill-generation pipeline/tool that produces .agents/skills/*/SKILL.md from
docs/ (regenerate the skill from docs to reapply the intended content), then
commit the regenerated SKILL.md so the file is produced consistently from docs
rather than edited in place.
In `@src/lib/onboard.ts`:
- Around line 1204-1229: The code treats any runStreamingEventProbe failure
(streamResult.ok === false) as a streaming-incompatibility and falls back
silently; instead, inspect runStreamingEventProbe's failure details (e.g.,
streamResult.reason, streamResult.errorCode, or the text in
streamResult.message) and only perform the streaming-to-chat fallback when the
failure explicitly indicates missing/unsupported SSE events (e.g., reason ===
"missing-events" or message contains "missing events"); for all other non-ok
results from runStreamingEventProbe, surface a validation error (push
failure/log it and abort/return the probe) rather than switching APIs. Use the
runStreamingEventProbe, streamResult.ok, and
streamResult.message/streamResult.reason identifiers to locate and implement
this conditional.
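The distinction the comment asks for — fall back only on a genuine event gap, hard-fail on transport errors — can be sketched as a small pure classifier. The result shape below (`missingEvents`, `message`) is an assumed interface based on the review text, not the actual type in http-probe.ts:

```typescript
// Illustrative probe-result shape; the real type in http-probe.ts may differ.
interface StreamProbeResult {
  ok: boolean;
  missingEvents?: string[]; // populated when the SSE stream parsed but lacked required events
  message?: string;         // transport/HTTP error detail otherwise
}

// Decide what onboarding should do with a streaming probe result:
// fall back to /chat/completions only for a genuine missing-events failure,
// and surface everything else as a hard validation failure.
function classifyStreamProbeFailure(
  result: StreamProbeResult,
): "use-responses" | "fallback-to-completions" | "hard-failure" {
  if (result.ok) return "use-responses";
  if (result.missingEvents && result.missingEvents.length > 0) {
    return "fallback-to-completions";
  }
  return "hard-failure";
}
```

A connection reset or HTTP 500 then stops onboarding with an error instead of silently switching APIs.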
---
Nitpick comments:
In `@docs/inference/switch-inference-providers.md`:
- Around line 87-89: Rewrite the passive sentence "No image rebuild is needed."
in active voice—replace it with a clear actor-based line such as "You do not
need to rebuild the image." or "NemoClaw does not require rebuilding the image."
Update the sentence near the existing text that mentions patching
`openclaw.json` (the line that currently reads "No image rebuild is needed.") so
the statement explicitly names the actor.
In `@docs/inference/use-local-inference.md`:
- Around line 147-149: Update the two sentences that currently start "This
variable tells the wizard..." and "It works in both..." to use second-person
wording addressing the reader; for example, replace with something like "Set
this variable to make the wizard skip the /v1/responses probe and use
/v1/chat/completions directly." and "This works in both interactive and
non-interactive modes." Ensure the edited sentences mention the endpoints
(/v1/responses and /v1/chat/completions) and keep the meaning unchanged while
using "you"/direct instruction tone.
In `@src/lib/validation.ts`:
- Around line 132-140: The helper shouldForceCompletionsApi now reads
process.env directly which breaks the pure, input-driven design of
validation.ts; change its signature to accept the preferred API string (e.g.,
preferredApi: string | undefined) and remove any env access from inside
shouldForceCompletionsApi, then perform the trim()/toLowerCase() & comparison
there; update callers (notably in src/lib/onboard.ts) to read
process.env.NEMOCLAW_PREFERRED_API, pass that value into
shouldForceCompletionsApi, and adjust tests accordingly so validation.ts remains
deterministic and testable.
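A pure version of the helper, as this prompt describes, might look like the following sketch. The accepted value `openai-completions` comes from the PR description; the exact normalization is an assumption:

```typescript
// Pure helper sketch: the caller (e.g. onboard.ts) reads
// process.env.NEMOCLAW_PREFERRED_API and passes the raw value in,
// keeping validation.ts free of process state.
function shouldForceCompletionsApi(preferredApi: string | undefined): boolean {
  // Normalize whitespace and case before comparing against the supported value.
  return preferredApi?.trim().toLowerCase() === "openai-completions";
}
```

A caller in onboard.ts would then invoke `shouldForceCompletionsApi(process.env.NEMOCLAW_PREFERRED_API)`, and tests can exercise the helper without touching the environment.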
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: edd33113-b66d-4071-aa12-59635240f544
📒 Files selected for processing (10)
- .agents/skills/nemoclaw-user-configure-inference/SKILL.md
- .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
- docs/inference/switch-inference-providers.md
- docs/inference/use-local-inference.md
- docs/reference/troubleshooting.md
- src/lib/http-probe.test.ts
- src/lib/http-probe.ts
- src/lib/onboard.ts
- src/lib/validation.test.ts
- src/lib/validation.ts
…file ARG precedence

NEMOCLAW_INFERENCE_API_OVERRIDE only patches openclaw.json at container startup — it does not update the Dockerfile ARG baked into the image. On recreate-sandbox the baked value wins. The reliable fix is a fresh nemoclaw onboard, which re-probes and rebakes the image.

Updated all three doc pages to recommend nemoclaw onboard instead of the override env var, and added a note explaining the limitation.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
- Distinguish transport failures from missing-events in streaming probe fallback: only fall back to /chat/completions when missingEvents is non-empty; surface transport errors as hard validation failures
- Make shouldForceCompletionsApi() pure by accepting the preferred API value as a parameter instead of reading process.env directly, keeping validation.ts free of I/O per its module contract
- Fix passive voice and second-person wording in docs

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
🦞 NemoClaw Functional Review — PR #1833

Verdict: APPROVE

Summary

feat(inference): validate streaming events for /v1/responses and add NEMOCLAW_PREFERRED_API override

Changed files: 10 files

Blocking Issues
Functional Testing
Adversarial Testing (15 tests — 15 pass, 0 concern)

Ran 15 adversarial tests against the live sandbox targeting SSE parsing, event name matching, env var validation, error handling, temp file cleanup, and DoS resilience.

Adversarial Test Report — PR #1833
Tested: 2026-04-13T13:55 UTC

Analysis

What the PR does: Adds streaming event validation for /v1/responses.

Attack surfaces identified:
Adversarial Tests

Test 1: SGLang-like incomplete streaming detected
Test 2: Full valid streaming response passes
Test 3: Empty response body
Test 4: Malformed SSE (no
| Tests | Pass | Concern | Fail |
|---|---|---|---|
| 15 | 15 | 0 | 0 |
Verdict impact: Clean pass across all tests. The streaming event detection, env var validation, error handling, and temp file cleanup are all solid. No concerns.
Security Scan
| Check | Result | Evidence |
|---|---|---|
| Dangerous patterns (eval/exec/proto) | Scanned pr.diff with grep -n | RegExp.exec() — SSE parsing, not child_process.exec (false positive) |
| Hardcoded credentials | Scanned pr.diff with grep -niE | getCredential() function call, not hardcoded (false positive) |
| Dependency changes | Checked package.json diff | (no dependency changes) |
| Permission changes (chmod/chown) | Scanned pr.diff | (none found) |
Notes
This is an automated functional review. Every ✅ above was verified by running the stated command/test.
Manual review is still recommended for business logic, API contracts, and performance.
🦞 Auto-reviewed by Nemo.
…NEMOCLAW_PREFERRED_API override (NVIDIA#1833)

## Summary
- Adds streaming SSE event validation to the `/v1/responses` probe for custom OpenAI-compatible endpoints, catching backends like SGLang that return valid non-streaming responses but emit incomplete streaming events
- Adds `NEMOCLAW_PREFERRED_API=openai-completions` env var to bypass `/v1/responses` probe entirely during onboarding
- Documents both the env var override and the existing `NEMOCLAW_INFERENCE_API_OVERRIDE` workaround for already-onboarded sandboxes

## Context
Community user reported SGLang passes onboarding validation for `/v1/responses` but fails at runtime because its streaming mode only emits 3 lifecycle events (`response.created`, `response.in_progress`, `response.completed`) — missing the granular content deltas OpenClaw requires (`response.output_text.delta`, etc.).

## Test plan
- [ ] Unit tests for `shouldForceCompletionsApi()` (6 cases) and `runStreamingEventProbe()` (5 cases) pass
- [ ] `NEMOCLAW_PREFERRED_API=openai-completions` skips `/v1/responses` probe during custom endpoint onboarding
- [ ] Streaming probe detects SGLang-like incomplete SSE events and falls back to `/chat/completions`
- [ ] Full test suite green

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * Added NEMOCLAW_PREFERRED_API to force Chat Completions (works interactive/non‑interactive) and optionally skip the /v1/responses probe
  * Onboarding now validates streaming events and will automatically fall back to Chat Completions if required events are missing; transport/probe failures produce a hard failure

* **Documentation**
  * New troubleshooting and recovery steps (rerun `nemoclaw onboard` to re‑probe and bake the correct API)
  * Clarified that NEMOCLAW_INFERENCE_API_OVERRIDE only patches startup config and does not update baked image ARGs
  * Minor wording tweak about image rebuilds

* **Tests**
  * Added tests covering streaming probes, cleanup, error cases, and the preference logic
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…NEMOCLAW_PREFERRED_API override (NVIDIA#1833)

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>