fix(sandbox): strip forward-proxy fields when rewriting to https.request#2490
Conversation
…w 4.9 OpenClaw 2026.4.9 added a global SSRF guard that blocks any hostname ending in .local/.localhost/.internal on every model provider HTTP request. NemoClaw routes inference at https://inference.local/v1, so every LLM call inside the sandbox now fails closed. Set the typed ConfiguredModelProviderRequest.allowPrivateNetwork field on the inference provider in the baked openclaw.json. Scoped to that one provider — the rest of openclaw's SSRF policy stays intact. Safe inside the sandbox because OpenShell already enforces egress via its network policy, making openclaw's hostname guard duplicative for this hop. Adds a static regression guard in test/sandbox-provisioning.test.ts so a future Dockerfile refactor cannot silently drop the opt-in. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
📝 Walkthrough

Prevented verification tokens from being embedded in agent prompts, reframed several sandbox assertions to routing-layer checks, added JSON-based agent response validation and error-pattern scanning, and implemented header/state sanitization when rewriting forward-proxy HTTPS requests, plus comprehensive unit and e2e tests for the proxy rewrite.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client (proxy request)
    participant Proxy as Local Proxy Wrapper\n(nemoclaw http.request)
    participant Upstream as Target HTTPS Upstream
    rect rgba(230,240,255,0.5)
        Client->>Proxy: HTTP request with proxy-style URL (http://proxy/...https://target/...)
    end
    rect rgba(240,255,230,0.5)
        Proxy->>Proxy: Detect FORWARD-mode path -> parse https://target URL\nSanitize headers (remove Host, Proxy-*, Connection tokens, hop-by-hop)\nStrip caller TLS/agent fields (agent, auth, servername, etc.)
    end
    rect rgba(255,240,230,0.5)
        Proxy->>Upstream: Issue rewritten https.request with target host, port, path, sanitized headers
        Upstream-->>Proxy: HTTPS response (200 / body)
        Proxy-->>Client: Forward response back to client
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
🧹 Nitpick comments (1)
test/sandbox-provisioning.test.ts (1)
78-85: Tighten the invariant to reduce false positives.

The current `toContain` check can still pass if the same literal is added in an unrelated block. Consider asserting ordering/context near the provider config too.

Suggested test hardening:

```diff
 it("openclaw.json provider config opts into allowPrivateNetwork so inference.local resolves", () => {
 @@
-  expect(src).toContain("'request': {'allowPrivateNetwork': True}");
+  const baseUrlIdx = src.indexOf("'baseUrl': inference_base_url");
+  const requestIdx = src.indexOf("'request': {'allowPrivateNetwork': True}");
+  expect(baseUrlIdx).toBeGreaterThanOrEqual(0);
+  expect(requestIdx).toBeGreaterThan(baseUrlIdx);
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sandbox-provisioning.test.ts` around lines 78 - 85, The test currently uses a loose expect(src).toContain("'request': {'allowPrivateNetwork': True}") which can match unrelated code; tighten it by asserting the allowPrivateNetwork literal appears in the openclaw provider config context — for example, locate the provider identifier string used in the test (e.g., "openclaw" or "openclaw.json") and assert that the substring "'request': {'allowPrivateNetwork': True}" appears after it (or use a single regex that matches the provider block and the request key together, e.g., /openclaw(?:\.json)?[\s\S]*'request':\s*{\s*'allowPrivateNetwork':\s*True\s*}/) so the check targets the provider config rather than any unrelated occurrence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/sandbox-provisioning.test.ts`:
- Around line 78-85: The test currently uses a loose
expect(src).toContain("'request': {'allowPrivateNetwork': True}") which can
match unrelated code; tighten it by asserting the allowPrivateNetwork literal
appears in the openclaw provider config context — for example, locate the
provider identifier string used in the test (e.g., "openclaw" or
"openclaw.json") and assert that the substring "'request':
{'allowPrivateNetwork': True}" appears after it (or use a single regex that
matches the provider block and the request key together, e.g.,
/openclaw(?:\.json)?[\s\S]*'request':\s*{\s*'allowPrivateNetwork':\s*True\s*}/)
so the check targets the provider config rather than any unrelated occurrence.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0f3f1a6c-f95f-4ace-b5cb-322fd7c02fa2
📒 Files selected for processing (2)
- Dockerfile
- test/sandbox-provisioning.test.ts
Address CodeRabbit nitpick on PR #2490: replace the bare `toContain` with an ordered-index check that pins the literal between `'baseUrl': inference_base_url` and `'models': [{**({'compat'` — the python that builds the providers dict. A bare match could pass on any unrelated occurrence elsewhere in the Dockerfile; the ordered indices prove the opt-in lives inside the provider config block. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
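The ordered-index pattern this commit describes can be sketched as a tiny helper; the helper name and the toy source string below are illustrative, not the actual Dockerfile contents:

```typescript
// Hypothetical helper: assert that `literal` appears between two anchors,
// proving it lives inside the bracketed block rather than anywhere in the file.
function literalIsBetween(src: string, before: string, literal: string, after: string): boolean {
  const beforeIdx = src.indexOf(before);
  const literalIdx = src.indexOf(literal);
  const afterIdx = src.indexOf(after);
  return beforeIdx >= 0 && literalIdx > beforeIdx && afterIdx > literalIdx;
}

// Toy stand-in for the Dockerfile's embedded Python that builds the providers dict.
const sample = [
  "'baseUrl': inference_base_url,",
  "'request': {'allowPrivateNetwork': True},",
  "'models': [{**({'compat'",
].join("\n");
```

A bare `toContain`-style substring check would pass even if the literal moved outside the provider block; the ordered indices would not.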
…tokens The "live inference" assertions in cloud-e2e (test-full-e2e.sh) and the Hermes e2e were curl-from-sandbox checks. They prove OpenShell's DNS forwarder + proxy can route inference.local; they never invoke openclaw's HTTP client and never reach openclaw's SSRF guard. That is why every openclaw 4.9 nightly-e2e run on PR #2464 reported [LIVE] Sandbox inference: PASS while real users were getting SsrFBlockedError on the same release. Changes: * test-full-e2e.sh: relabel Phase 4b from [LIVE] to [ROUTING] with a comment pointing at #2490; add Phase 4c, an actual openclaw-mediated turn that runs `openclaw agent --json` over SSH, parses result.payloads[].text, and asserts the model produced "42" for "What is 6 multiplied by 7?". The expected token is not a substring of the prompt, --json routes logs to stderr, stderr is dropped — so prompt-echo on an error path cannot satisfy the grep. * test-hermes-e2e.sh: same relabel for the equivalent curl assertion. * test-sandbox-operations.sh TC-SBX-02: replace `Say exactly: HELLO_E2E` prompt + grep on merged stdout/stderr with the same arithmetic-via-JSON pattern. The previous assertion would match the prompt itself in any error path that quoted it back, including the openclaw 4.9 SSRF rejection — false positive that hid the regression for the entire 4.2 → 4.7 → 4.8 → 4.9 bump series. * verify-sandbox-skill-via-agent.sh: stop embedding ${VERIFY_TOKEN} in the prompt (the agent must read it from SKILL.md — that is the test). Add a guard that refuses SKILL_VERIFY_PROMPT overrides which smuggle the token back in, and a negative assertion on SsrFBlockedError, transport errors, and gateway-unavailable markers before the positive grep. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
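The arithmetic-via-JSON assertion described above can be sketched as follows, assuming the envelope shape the scripts parse (`result.payloads[].text`); the sample envelope is invented for illustration:

```typescript
// Parse the agent's --json envelope and concatenate result.payloads[].text.
// Returns "" when the input is not JSON, mirroring the scripts' fail-closed path.
function extractReply(raw: string): string {
  let doc: any;
  try {
    doc = JSON.parse(raw);
  } catch {
    return "";
  }
  const payloads: Array<{ text?: string }> = doc?.result?.payloads ?? [];
  return payloads.map((p) => p.text ?? "").join("\n");
}

// Invented sample of a successful turn.
const sampleEnvelope = JSON.stringify({
  result: { payloads: [{ text: "6 multiplied by 7 is 42." }] },
});
```

Because "42" is not a substring of the prompt, a prompt-echoing error path cannot satisfy the check.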
🧹 Nitpick comments (2)
test/e2e/test-full-e2e.sh (1)
397-415: Consider extracting the JSON payload parser to a shared function.

The Python snippet for parsing `result.payloads[].text` from the openclaw agent JSON envelope is duplicated in `test-sandbox-operations.sh` (lines 297-309). If more tests adopt this pattern, consider extracting it to `test/e2e/lib/` as a shared helper. Not blocking since there are only two occurrences currently.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-full-e2e.sh` around lines 397 - 415, Extract the duplicated Python JSON parsing snippet that builds agent_reply from agent_response into a shared helper (e.g., create a script or shell function named parse_agent_payloads or parse_agent_reply in test/e2e/lib/), replace the inline Python block in both test-full-e2e.sh and test-sandbox-operations.sh with a call to that helper, and ensure the helper accepts agent_response (or reads stdin) and returns the concatenated payload texts so the existing grep/if logic around agent_reply remains unchanged.

test/e2e/test-sandbox-operations.sh (1)
297-315: Consider logging when JSON parsing fails.

The Python parser silently exits with code 0 when JSON parsing fails (line 302), which causes `$reply` to be empty. While the test will still fail (empty string won't match "42"), the failure message won't distinguish between "no JSON output" and "JSON parsed but no matching payloads." This is acceptable for now since the raw stdout is included in the failure message, but a debug log could help triage.
🔧 Optional: Add parse failure indicator
```diff
 reply=$(echo "$raw" | python3 -c "
 import json, sys
 try:
     doc = json.load(sys.stdin)
 except Exception:
+    print('PARSE_FAILED', file=sys.stderr)
     sys.exit(0)
 result = doc.get('result') or {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-sandbox-operations.sh` around lines 297 - 315, The test's Python JSON extractor (the python3 one-liner that populates the reply variable from raw) currently swallows JSON parse errors and exits silently; change that python block (the invocation that sets reply) to catch the JSON parsing exception and emit a clear parse-failure marker (for example print a sentinel like "JSON_PARSE_ERROR" to stdout or stderr) so the shell can distinguish parse failures from empty payloads, and update the surrounding shell check (the if that inspects $reply and grep for 42) to detect this marker and produce a more specific fail message referencing reply/raw when parsing failed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e/test-full-e2e.sh`:
- Around line 397-415: Extract the duplicated Python JSON parsing snippet that
builds agent_reply from agent_response into a shared helper (e.g., create a
script or shell function named parse_agent_payloads or parse_agent_reply in
test/e2e/lib/), replace the inline Python block in both test-full-e2e.sh and
test-sandbox-operations.sh with a call to that helper, and ensure the helper
accepts agent_response (or reads stdin) and returns the concatenated payload
texts so the existing grep/if logic around agent_reply remains unchanged.
In `@test/e2e/test-sandbox-operations.sh`:
- Around line 297-315: The test's Python JSON extractor (the python3 one-liner
that populates the reply variable from raw) currently swallows JSON parse errors
and exits silently; change that python block (the invocation that sets reply) to
catch the JSON parsing exception and emit a clear parse-failure marker (for
example print a sentinel like "JSON_PARSE_ERROR" to stdout or stderr) so the
shell can distinguish parse failures from empty payloads, and update the
surrounding shell check (the if that inspects $reply and grep for 42) to detect
this marker and produce a more specific fail message referencing reply/raw when
parsing failed.
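The sentinel idea can be sketched like this (the marker name and function are illustrative):

```typescript
// Distinguish "input was not JSON" from "JSON parsed but carried no payload text".
const PARSE_ERROR_MARKER = "JSON_PARSE_ERROR"; // hypothetical sentinel

function extractReplyOrSentinel(raw: string): string {
  try {
    const doc = JSON.parse(raw);
    const payloads: Array<{ text?: string }> = doc?.result?.payloads ?? [];
    return payloads.map((p) => p.text ?? "").join("\n");
  } catch {
    return PARSE_ERROR_MARKER;
  }
}
```

The caller can then branch on the marker and report "no JSON output" instead of a generic empty-reply failure.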
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 25682ebd-5989-4b93-8cd5-91f44bd1a632
📒 Files selected for processing (4)
- test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh
- test/e2e/test-full-e2e.sh
- test/e2e/test-hermes-e2e.sh
- test/e2e/test-sandbox-operations.sh
Adversarial self-review

I ran a deliberate red-team pass on this change before requesting review. Posting the findings here so reviewers don't have to redo the work and so the open follow-ups are tracked. Cross-referenced openclaw HEAD at

Blockers

None. Schema match confirmed at openclaw

Concerns (worth tracking, not blocking)

1. Cross-origin redirect blast radius.
2. Upstream regression magnet.
3. Test brittleness/laxity. The static check in
4. Provider-key fan-out fragility.

Notes
Follow-ups to file after merge
sandbox_exec_for() merges stderr into stdout (2>&1 at line 95) so it can log non-zero exits — but that pollutes the stdout `python3 json.load()` needs to parse, and node-side warnings (UNDICI-EHPA, etc.) from openclaw agent always land on stderr. Result on the first nightly-e2e dispatch: TC-SBX-02 captured `(node:1044) [UNDICI-EHPA] Warning: ...` ahead of any JSON, json.load raised, the except branch produced reply='', and the test failed with `expected '42' in agent reply, got: ''`. cloud-e2e Phase 4c didn't have this problem because it uses a direct ssh invocation with `2>/dev/null` at the source. Mirror that here: open the ssh-config in the helper, call ssh directly, drop stderr at the wire. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
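The failure mode is easy to reproduce in isolation: any non-JSON line ahead of the envelope makes the whole capture unparseable. The strings below are illustrative:

```typescript
// A strict JSON parse fails as soon as stderr noise precedes the envelope.
function isParseableJson(captured: string): boolean {
  try {
    JSON.parse(captured);
    return true;
  } catch {
    return false;
  }
}

const stdoutOnly = '{"result":{"payloads":[{"text":"42"}]}}';
// What 2>&1 produced: a Node warning line merged ahead of the JSON.
const mergedStreams = "(node:1044) [UNDICI-EHPA] Warning: ...\n" + stdoutOnly;
```

Dropping stderr at the wire (2>/dev/null on the ssh invocation) keeps the captured string in the first, parseable shape.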
Follow-up to #2344 (NemoClaw 0.0.24). The http.request → https.request rewrite that resolves the NODE_USE_ENV_PROXY=1 / library-side proxy double-processing was shallow-copying the caller's options into the rewritten https.request. That dragged three classes of forward-proxy-hop fields onto a request that now goes direct to the upstream:

* options.agent — a forward-proxy http.Agent. http.Agent cannot speak TLS, so https.request either ignores it and falls back to a default agent or fails the TLS handshake outright, depending on the caller. This is the most likely root cause of the deepinfra-style report on Discord ("LLM request failed: network connection error" against a custom OpenAI-compatible upstream while NVIDIA-routed providers keep working). NVIDIA Endpoints' DNS-rewritten path through OpenShell doesn't end up in this branch, so the bug stayed invisible.
* options.auth — basic-auth meant for the proxy hop. Leaving it on https.request would Basic-auth the *target* server with proxy credentials.
* Host / Proxy-Authorization / Proxy-Connection / Proxy-Authenticate headers — Host points at the proxy host and the Proxy-* headers are hop-by-hop. Stripping them lets Node regenerate Host from the target's hostname/port and prevents proxy creds from leaking upstream.

Adds test/http-proxy-fix-rewrite.test.ts with seven cases pinning each strip and confirming signal/timeout/TLS fields and target headers (Authorization, Content-Type, …) survive. The existing http-proxy-fix-sync.test.ts byte-for-byte enforcer still passes — the canonical wrapper at nemoclaw-blueprint/scripts/http-proxy-fix.js and the heredoc in scripts/nemoclaw-start.sh were updated together.

Reverts the openclaw `request.allowPrivateNetwork` Dockerfile change that this PR previously carried — empirical investigation showed openclaw 2026.4.9 rejects that key in strict zod and `openclaw doctor --fix` (which the Dockerfile runs at build time) silently strips it back to `"request": {}` before the image ships.
The upstream plumbing that reads `request.allowPrivateNetwork` only landed in 2026.4.10 (commit 0808dd111c, openclaw PR #63671). The change here was a no-op, diagnosing the wrong layer; this commit replaces it with the actual fix at the wrapper layer. The label-rename + arithmetic-via-`--json` test improvements added earlier on this branch (Phase 4c, TC-SBX-02 rewrite, skill-verify hardening, sandbox_exec stderr-merge fix) are kept — they catch regressions independent of the proxy bug. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
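A minimal sketch of the strip this commit describes (this is not the canonical http-proxy-fix.js, just the shape of the fix):

```typescript
// Copy the caller's options, then drop the forward-proxy hop fields so the
// rewritten https.request goes direct to the target:
// - agent: a forward-proxy http.Agent cannot speak TLS
// - auth:  basic-auth for the proxy hop, not the target
// - Host / Proxy-* headers: proxy-pointing and hop-by-hop; Node regenerates Host
type RequestOptions = Record<string, any>;

function stripForwardProxyFields(options: RequestOptions): RequestOptions {
  const rewritten: RequestOptions = { ...options, headers: { ...(options.headers ?? {}) } };
  delete rewritten.agent;
  delete rewritten.auth;
  for (const name of Object.keys(rewritten.headers)) {
    const lower = name.toLowerCase();
    if (lower === "host" || lower.startsWith("proxy-")) delete rewritten.headers[name];
  }
  return rewritten;
}
```

Target-intent fields (Authorization, Content-Type, timeout, TLS material) pass through untouched.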
Spins up a local HTTPS mock on 127.0.0.1 with a self-signed cert generated
via openssl, simulates the FORWARD-mode http.request shape that axios +
HTTPS_PROXY produces (forward-proxy http.Agent attached, proxy basic-auth,
Host pointing at the proxy, Proxy-* headers), and asserts an
OpenAI-compatible chat-completions round trip actually completes.
Bisect proof:
Wrapper without the strip (parent commit reverted):
× test fails with `TypeError: Protocol "https:" not supported.
Expected "http:"` — Node detects the forward-proxy http.Agent
riding into https.request. Same root cause as the deepinfra
"LLM request failed: network connection error" symptom.
Wrapper with the strip (this PR):
✓ test passes; mock receives POST /v1/openai/chat/completions with
Authorization: Bearer real-deepinfra-token preserved, Host:
localhost:<port> regenerated by Node, Proxy-* headers absent,
and the request body intact. Caller gets the mock's PONG reply.
Skipped if openssl is not on PATH; CI runners (ubuntu-latest, macOS) have
it. Cert is generated fresh per test run with 1-day expiry so there is no
long-lived crypto material in the repo.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…sport hints
Adversarial review caught three blockers and seven concerns on the prior
commit. This addresses all of them:
Wrapper (nemoclaw-blueprint/scripts/http-proxy-fix.js + heredoc in
scripts/nemoclaw-start.sh, kept byte-for-byte synced):
* sanitizeHeaders strips the full RFC 7230 §6.1 hop-by-hop set
(Connection, Keep-Alive, Proxy-Authorization, TE, Trailer,
Transfer-Encoding, Upgrade) plus Host, Proxy-Connection, and
Proxy-Authenticate. Also walks the Connection header and strips
every token named there (transitive hop-by-hop). The prior commit
only covered Host + Proxy-*, leaving Connection: close from the
proxy hop riding into a direct-to-target request and defeating
keep-alive on the rewrite. (B1)
* Proxy-Authenticate is response-only per RFC 7235 §4.3 — comment
block updated to call this out so the next reviewer doesn't repeat
the question. Kept in the strip set as belt-and-suspenders for
clients that echo response headers into retry-request options. (B2)
* Strip rewritten.servername and rewritten.checkServerIdentity — TLS
SNI and identity-check pre-computed for the proxy hop must not
survive into the direct-to-target handshake. (C2)
* Strip rewritten.socketPath — Unix-socket proxies (cntlm-style)
exist; routing TLS bytes into the proxy socket would defeat the
rewrite. (C2)
* Strip rewritten.localAddress, lookup, family, hints —
source-binding and DNS hints picked for reachability to the proxy
don't apply to a different upstream. (C3)
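The header half of the sanitization above can be sketched like this; the set and the transitive Connection rule follow the commit text, but this is a sketch, not the shipped wrapper:

```typescript
// RFC 7230 §6.1 hop-by-hop set plus the proxy-pointing extras named above.
const HOP_BY_HOP = [
  "connection", "keep-alive", "proxy-authorization", "te", "trailer",
  "transfer-encoding", "upgrade", "host", "proxy-connection", "proxy-authenticate",
];

function sanitizeHeadersSketch(headers: Record<string, string>): Record<string, string> {
  const drop = new Set(HOP_BY_HOP);
  // Transitive rule: any token named in Connection is itself hop-by-hop.
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "connection") {
      for (const token of value.split(",")) drop.add(token.trim().toLowerCase());
    }
  }
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!drop.has(name.toLowerCase())) out[name] = value;
  }
  return out;
}
```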
Tests:
* test/http-proxy-fix-rewrite.test.ts: new cases pin every new strip
(RFC 7230 §6.1 hop-by-hop including transitive Connection-named
tokens; servername/checkServerIdentity; socketPath/localAddress/
lookup/family/hints). Adds a separate "control" describe block that
constructs the broken pre-fix rewrite shape inline and asserts
https.request synchronously rejects it on Node 22. The bisect now
lives in the repo, not in PR description prose. (B3)
* test/http-proxy-fix-e2e.test.ts: switches from raw process.env
mutation to vi.stubEnv (matches the rewrite suite, prevents env
leaks across test files in the same vitest worker). Saves and
restores the original http.request in beforeEach/afterEach so a
failed test does not leave the wrapped function chained into the
next test file. (C4, C5)
* Cert expiry: 1-day → 7-day so a long-running `vitest --watch`
session does not outrun it. (C6)
* When CI=true and openssl is missing, throws at module load instead
of silent-skipping. CI runners ship openssl; a missing-binary skip
would be a false green. (C7)
Run-time totals: 28 cases across rewrite (12) + sync (6) + e2e (1) +
sandbox-provisioning (10) — all pass; 0 regressions.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
🧹 Nitpick comments (1)
test/http-proxy-fix-rewrite.test.ts (1)
249-260: Correct assertion for explicit port, but type inconsistency worth noting.

The test correctly expects `"8443"` as a string because `URL.port` returns a string for explicit ports. The earlier test at line 97 expects `443` as a number because the production code uses `target.port || 443`, where the empty string falls back to the numeric default. This is correct behavior, but the mixed string/number type could confuse future maintainers. Consider adding a brief comment.
📝 Optional: Add clarifying comment
```diff
 it("uses the explicit target port when one is present in the URL", () => {
   http.request({
     hostname: PROXY_HOST,
     port: 3128,
     path: "https://internal.example.com:8443/v1/x",
     headers: {},
   });
   expect(captured).not.toBeNull();
+  // URL.port returns a string for explicit ports; the production code's
+  // `target.port || 443` preserves the string type here (vs. numeric 443 fallback).
   expect(captured?.port).toBe("8443");
   expect(captured?.hostname).toBe("internal.example.com");
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/http-proxy-fix-rewrite.test.ts` around lines 249 - 260, The test "uses the explicit target port when one is present in the URL" asserts captured?.port === "8443" which is correct because URL.port returns a string while other tests expect numeric 443 due to production code using target.port || 443; add a one-line comment above this test (or above the assertion) explaining that explicit URL ports are string-typed via URL.port and that the code falls back to numeric 443 when URL.port is empty to avoid future confusion referencing the captured variable and the test name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/http-proxy-fix-rewrite.test.ts`:
- Around line 249-260: The test "uses the explicit target port when one is
present in the URL" asserts captured?.port === "8443" which is correct because
URL.port returns a string while other tests expect numeric 443 due to production
code using target.port || 443; add a one-line comment above this test (or above
the assertion) explaining that explicit URL ports are string-typed via URL.port
and that the code falls back to numeric 443 when URL.port is empty to avoid
future confusion referencing the captured variable and the test name.
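The string-vs-number distinction is straightforward to demonstrate with the WHATWG URL class Node ships:

```typescript
// URL.port is always a string; an absent (or scheme-default) port is "".
const explicit = new URL("https://internal.example.com:8443/v1/x");
const implicit = new URL("https://internal.example.com/v1/x");

// The production pattern `target.port || 443`: the empty string is falsy,
// so only the implicit case falls back to the numeric default.
const explicitPort = explicit.port || 443; // stays the string "8443"
const implicitPort = implicit.port || 443; // becomes the number 443
```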
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e060c472-d919-4006-9f36-5474c932e073
📒 Files selected for processing (4)
- nemoclaw-blueprint/scripts/http-proxy-fix.js
- scripts/nemoclaw-start.sh
- test/http-proxy-fix-e2e.test.ts
- test/http-proxy-fix-rewrite.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- nemoclaw-blueprint/scripts/http-proxy-fix.js
- scripts/nemoclaw-start.sh
## Summary

Refreshes user-facing docs for the last 24 hours of merged NemoClaw history and bumps the docs metadata to 0.0.29, the next version after v0.0.28. The updates are limited to behavior supported by merged PR descriptions and diffs.

## Changes

- `docs/reference/commands.md`: documented `nemoclaw <name> policy-add --from-file` and `--from-dir`, including custom preset review guidance, from #2077 / commit `7720b175`.
- `docs/deployment/deploy-to-remote-gpu.md`: clarified that non-loopback `CHAT_UI_URL` disables OpenClaw device pairing for remote browser-only deployments, from #2449 / commit `f5ee8a4d`.
- `docs/inference/inference-options.md`: documented provider-aware credential retry validation and the NVIDIA-only `nvapi-` prefix check, from #2389 / commit `6f7f0c6d`.
- `docs/inference/switch-inference-providers.md`: documented `NEMOCLAW_INFERENCE_INPUTS` for text/image-capable model metadata baked into `openclaw.json`, from #2441 / commit `f4391892`.
- `docs/reference/troubleshooting.md`: added the Git certificate verification entry for proxy CA propagation through `GIT_SSL_CAINFO`, `GIT_SSL_CAPATH`, `CURL_CA_BUNDLE`, and `REQUESTS_CA_BUNDLE`, from #2345 / commit `fa0dc1ab`.
- `docs/versions1.json` and `docs/project.json`: promoted docs version `0.0.29`; `docs/versions1.json` omits unpublished `0.0.26`, `0.0.27`, and `0.0.28` entries.
- `.agents/skills/nemoclaw-user-*`: regenerated derived user skill references from the updated docs.
- Reviewed with no extra doc changes: #2575 / `d392ec07`, #2565 / `a3231049`, #1965 / `db1ef3ca`, #1990 / `db665834`, #2495 / `7da86fa3`, #2496 / `3192f4f4`, #2490 / `8c209058`, #2487 / `1f615e2f`, #2483 / `5653d33a`, #2482 / `31c782c0`, #2464 / `23bb5703`, #2472 / `a54f9a34`, and #2437 / `6bc860d7`.
- Skipped per docs policy: #2420 / `7b76df6b` touched the experimental sandbox config path listed in `docs/.docs-skip`; #2466 / `cc15689c` touched a skipped term and CI-only sandbox image files.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes — failed locally in installer-integration tests and one onboard helper timeout; the doc-scoped hook test projects passed under `prek`.
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only) — build succeeded, but local Sphinx emitted the existing version-switcher file read message.
- [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure

- [x] AI-assisted — tool: Codex

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

## Summary by CodeRabbit

- **New Features**
  - Support for custom YAML presets in policy configuration via `--from-file` and `--from-dir`.
  - New build-time inference input option to declare accepted modalities (text or text,image).
- **Improvements**
  - Credential validation now offers interactive recovery: re-enter key, retry, choose another provider, or exit.
  - Clarified provider-specific API key prefix handling (`nvapi-` only applies to NVIDIA keys).
- **Documentation**
  - TLS certificate troubleshooting for inspected networks.
  - Clarified remote dashboard security/device-pairing behavior; command docs updated; docs version bumped.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Summary
Follow-up to #2344 (NemoClaw 0.0.24). The `http.request` → `https.request` rewrite in `nemoclaw-blueprint/scripts/http-proxy-fix.js` was shallow-copying the caller's options into the rewritten `https.request` call. That dragged forward-proxy-hop fields onto a request that now goes direct to the upstream. This is consistent with the Discord report from mflova after the 0.0.23 → 0.0.24 bump: mflova was on a custom OpenAI-compatible endpoint (deepinfra). NVIDIA Endpoints' DNS-rewritten path through OpenShell doesn't end up in the wrapper's FORWARD-mode branch, which is why upstream regressions there stayed invisible to nightly-e2e.

mflova's reported error is the openclaw client's wrapped surface; on Node 22 the underlying mechanism is a synchronous `TypeError: Protocol "https:" not supported. Expected "http:"` thrown by `https.request` when the forward-proxy `http.Agent` reaches it. On Node 18/20 the same root cause manifests as a TLS handshake failure rather than a synchronous throw — same bug class, different surface.

Bisect proof (committed in this PR)

`test/http-proxy-fix-rewrite.test.ts` carries both halves of the proof in-repo:

- One `describe` block reproduces the broken pre-fix rewrite shape inline (`Object.assign` with no field strips, hop-by-hop headers preserved) and asserts `https.request` rejects it with `TypeError: Protocol "https:" not supported`. That test passes regardless of the wrapper, locking in the bug class.
- The other `describe` block asserts the wrapper produces options that don't match that shape — `agent`, `auth`, `Host`, `Proxy-*`, the rest of the RFC 7230 §6.1 hop-by-hop set, `servername`, `checkServerIdentity`, `socketPath`, `localAddress`, `lookup`, `family`, `hints` are all stripped on rewrite.

Combined: the bug class is real and detectable; the wrapper's job is to never produce that shape. If a future maintainer reverts a strip, the rewrite test breaks; if Node changes the failure surface, the control test breaks. Two-sided coverage without storing a copy of the broken wrapper.
What was wrong

The rewrite was:

Three classes of forward-proxy-hop fields rode along:

- `options.agent` — a forward-proxy `http.Agent`. `http.Agent.protocol === 'http:'`, so on Node 22 `https.request` throws `TypeError: Protocol "https:" not supported. Expected "http:"` synchronously; on Node 18/20 it falls through and the TLS handshake fails. Manifests upstream as "Connection error".
- `options.auth` — basic-auth meant for the proxy hop. Leaving it on `https.request` Basic-auths the target server with proxy credentials.
- Hop-by-hop headers — `Connection`, `Keep-Alive`, `Proxy-Authorization`, `TE`, `Trailer`, `Transfer-Encoding`, `Upgrade`, plus tokens named in `Connection` (transitively hop-by-hop), plus `Host` (proxy-pointing) and `Proxy-Connection` (de facto deprecated). `Connection: close` from the proxy hop defeats keep-alive on the rewrite; `Upgrade` to a non-WS target produces 400/426; `Proxy-*` headers leak proxy credentials upstream.

Plus: TLS identity (`servername`, `checkServerIdentity`), the socket-path proxy hint (`socketPath`), and source-binding / DNS hints (`localAddress`, `lookup`, `family`, `hints`) that all describe the connection to the proxy and don't apply to a different upstream.

The fix
In `nemoclaw-blueprint/scripts/http-proxy-fix.js` (canonical) and the heredoc in `scripts/nemoclaw-start.sh` (byte-for-byte synced via the existing `http-proxy-fix-sync.test.ts`):

- `sanitizeHeaders()` strips the full RFC 7230 §6.1 hop-by-hop set plus `Host`, `Proxy-Connection`, `Proxy-Authenticate`, and any token named in the `Connection` header (transitive per the same RFC). Case-insensitive.
- `delete rewritten.agent`, `delete rewritten.auth`, `delete rewritten.servername`, `delete rewritten.checkServerIdentity`, `delete rewritten.socketPath`, `delete rewritten.localAddress`, `delete rewritten.lookup`, `delete rewritten.family`, `delete rewritten.hints` after the `Object.assign`.
- TLS fields (`ca`/`cert`/`key`/`rejectUnauthorized`), `timeout`, the body, and target-intent headers (`Authorization`, `Content-Type`, …) all survive.

Tests
- `test/http-proxy-fix-rewrite.test.ts` (12 cases) — spy-level rewrite tests pinning every strip; control test reproducing the broken-wrapper shape and asserting the `TypeError`.
- `test/http-proxy-fix-e2e.test.ts` (1 case) — end-to-end. openssl shell-out at module load (skipped locally if openssl is missing; loud-fails under `CI=true`). Spins up an HTTPS mock on `127.0.0.1:0`, builds the FORWARD-mode `http.request` shape with forward-proxy `http.Agent` + proxy basic-auth + proxy-pointing Host, and asserts the round trip completes.
- `test/http-proxy-fix-sync.test.ts` (6 cases, existing) — byte-for-byte parity between the canonical wrapper and the `scripts/nemoclaw-start.sh` heredoc still enforced.

What this PR no longer carries
This branch previously included a Dockerfile change that added `'request': {'allowPrivateNetwork': True}` to the inference provider in the baked `openclaw.json`, and a regression test for it. Empirical investigation showed:

- openclaw 2026.4.9's strict zod schema rejects `request.allowPrivateNetwork` as an unrecognized key (`src/config/zod-schema.core.ts:283-292` in v2026.4.9).
- `openclaw doctor --fix` — which the Dockerfile runs at image build time — silently strips the key, leaving `"request": {}` before the image ships.
- The upstream plumbing that reads `request.allowPrivateNetwork` only landed in 2026.4.10 (upstream PR #63671, commit `0808dd111c`).

So the previous "fix" was a no-op diagnosing the wrong layer. Dockerfile and `test/sandbox-provisioning.test.ts` are restored to `origin/main`.

What this PR keeps from the prior iteration
The label-rename + arithmetic-via-`--json` test improvements added earlier on this branch are kept — they catch regressions independent of the proxy bug:

- `test/e2e/test-full-e2e.sh`: Phase 4b relabelled `[LIVE]` → `[ROUTING]`; new Phase 4c runs `openclaw agent --json` over SSH inside the sandbox and asserts the model answered `42` to "What is 6 multiplied by 7?". Phase 4c is the only assertion in the suite that exercises the openclaw HTTP client end-to-end against `inference.local`.
- `test/e2e/test-hermes-e2e.sh`: same `[LIVE]` → `[ROUTING]` relabel.
- `test/e2e/test-sandbox-operations.sh` TC-SBX-02: replaces `Say exactly: HELLO_E2E` + merged-output grep with the arithmetic-via-`--json` pattern.
- `test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh`: stops embedding `${VERIFY_TOKEN}` in the prompt; refuses overrides that smuggle it back; adds a negative assertion on `SsrFBlockedError` / transport / gateway-unavailable markers.

Type of Change
Verification

- `npx vitest run test/http-proxy-fix-e2e.test.ts test/http-proxy-fix-rewrite.test.ts test/http-proxy-fix-sync.test.ts test/sandbox-provisioning.test.ts` — 28/28 pass.
- `69115de7`.
- `bash -n` and `shellcheck -S warning` clean on `scripts/nemoclaw-start.sh` and the four touched e2e scripts.
- The canonical wrapper and the `scripts/nemoclaw-start.sh` heredoc were updated together.

Per-commit on this branch

- `f162daa5` — initial strip of `agent`/`auth`/`Host`/`Proxy-*` (the fix), spy-level unit tests
- `2f217f8a` — end-to-end test against a local HTTPS mock with self-signed cert
- `69115de7` — wider RFC 7230 §6.1 hop-by-hop + TLS/transport hint strips, in-repo bisect control, `vi.stubEnv` consistency, `http.request` restore, 7-day cert, CI strict-skip

How to validate after merge

The signal job is `cloud-e2e` in `nightly-e2e.yaml` running `test/e2e/test-full-e2e.sh`. Watch for:

For the deepinfra-style failure specifically, the unit + e2e + control tests in this PR pin every strip the wrapper performs and the bug class it prevents. A real proxy + real upstream + real TLS round trip would need a CI job with a non-NVIDIA OpenAI-compatible API key — its own follow-up PR if you want belt-and-suspenders coverage.
Summary by CodeRabbit
Release Notes
Bug Fixes
Tests