
fix(sandbox): strip forward-proxy fields when rewriting to https.request #2490

Merged

ericksoa merged 8 commits into main from fix/openclaw-4-9-inference-ssrf on Apr 27, 2026

Conversation

Contributor

@ericksoa ericksoa commented Apr 26, 2026

Summary

Follow-up to #2344 (NemoClaw 0.0.24). The http.request → https.request rewrite in nemoclaw-blueprint/scripts/http-proxy-fix.js was shallow-copying the caller's options into the rewritten https.request call. That dragged forward-proxy-hop fields onto a request that now goes direct to the upstream. This is consistent with the Discord report from mflova after the 0.0.23 → 0.0.24 bump:

Finally, when I run the bot, I always get an error after sending any message to it: error=LLM request failed: network connection error. rawError=Connection error

Some context: I am using 0.0.23 as the 0.0.24 had a bug with the network.

mflova was on a custom OpenAI-compatible endpoint (deepinfra). NVIDIA Endpoints' DNS-rewritten path through OpenShell doesn't end up in the wrapper's FORWARD-mode branch, which is why upstream regressions there stayed invisible to nightly-e2e. mflova's reported error is the openclaw client's wrapped surface; on Node 22 the underlying mechanism is a synchronous TypeError: Protocol "https:" not supported. Expected "http:" thrown by https.request when the forward-proxy http.Agent reaches it. On Node 18/20 the same root cause manifests as a TLS handshake failure rather than a synchronous throw — same bug class, different surface.

Bisect proof (committed in this PR)

test/http-proxy-fix-rewrite.test.ts carries both halves of the proof in-repo:

  • The "control" describe block reproduces the broken pre-fix rewrite shape inline (Object.assign with no field strips, hop-by-hop headers preserved) and asserts https.request rejects it with TypeError: Protocol "https:" not supported. That test passes regardless of the wrapper, locking in the bug class.
  • The rewrite describe block asserts the wrapper produces options that don't match that shape — agent, auth, Host, Proxy-*, the rest of RFC 7230 §6.1 hop-by-hop, servername, checkServerIdentity, socketPath, localAddress, lookup, family, hints are all stripped on rewrite.

Combined: bug class is real and detectable; wrapper's job is to never produce that shape. If a future maintainer reverts a strip, the rewrite test breaks; if Node changes the failure surface, the control test breaks. Two-sided coverage without storing a copy of the broken wrapper.

What was wrong

The rewrite was:

var rewritten = Object.assign({}, options, {
  hostname: target.hostname, host: target.hostname,
  port: target.port || 443, path: target.pathname + target.search,
  protocol: 'https:',
});
return https.request(rewritten, callback);

Three classes of forward-proxy-hop fields rode along:

  • options.agent — a forward-proxy http.Agent. http.Agent.protocol === 'http:', so on Node 22 https.request throws TypeError: Protocol "https:" not supported. Expected "http:" synchronously; on Node 18/20 it falls through and the TLS handshake fails. Manifests upstream as "Connection error".
  • options.auth — basic-auth meant for the proxy hop. Leaving it on https.request Basic-auths the target server with proxy credentials.
  • Hop-by-hop headers (RFC 7230 §6.1): Connection, Keep-Alive, Proxy-Authorization, TE, Trailer, Transfer-Encoding, Upgrade, plus tokens named in Connection (transitively hop-by-hop), plus Host (proxy-pointing) and Proxy-Connection (de facto deprecated). Connection: close from the proxy hop defeats keep-alive on the rewrite; Upgrade to a non-WS target produces 400/426; Proxy-* leak proxy credentials upstream.

Plus: TLS identity (servername, checkServerIdentity), socket-path proxy hint (socketPath), and source-binding / DNS hints (localAddress, lookup, family, hints) that all describe the connection to the proxy and don't apply to a different upstream.

The fix

nemoclaw-blueprint/scripts/http-proxy-fix.js (canonical) and the heredoc in scripts/nemoclaw-start.sh (byte-for-byte synced via the existing http-proxy-fix-sync.test.ts):

  • sanitizeHeaders() strips the full RFC 7230 §6.1 hop-by-hop set plus Host, Proxy-Connection, Proxy-Authenticate, and any token named in the Connection header (transitive per the same RFC). Case-insensitive.
  • delete rewritten.agent, delete rewritten.auth, delete rewritten.servername, delete rewritten.checkServerIdentity, delete rewritten.socketPath, delete rewritten.localAddress, delete rewritten.lookup, delete rewritten.family, delete rewritten.hints after the Object.assign.
  • Signal (AbortController), TLS material (ca/cert/key/rejectUnauthorized), timeout, body, and target-intent headers (Authorization, Content-Type, …) all survive.

Tests

  • test/http-proxy-fix-rewrite.test.ts (12 cases) — spy-level rewrite tests pinning every strip; control test reproducing the broken-wrapper shape and asserting TypeError.
  • test/http-proxy-fix-e2e.test.ts (1 case) — end-to-end. openssl shell-out at module load (skipped locally if openssl missing; loud-fails under CI=true). Spins up an HTTPS mock on 127.0.0.1:0, builds the FORWARD-mode http.request shape with forward-proxy http.Agent + proxy basic-auth + proxy-pointing Host, asserts the round trip completes.
  • test/http-proxy-fix-sync.test.ts (6 cases, existing) — byte-for-byte parity between the canonical wrapper and the scripts/nemoclaw-start.sh heredoc still enforced.

What this PR no longer carries

This branch previously included a Dockerfile change that added 'request': {'allowPrivateNetwork': True} to the inference provider in the baked openclaw.json, and a regression test for it. Empirical investigation showed:

  • openclaw 2026.4.9's strict zod schema rejects request.allowPrivateNetwork as an unrecognized key (src/config/zod-schema.core.ts:283-292 in v2026.4.9).
  • openclaw doctor --fix — which the Dockerfile runs at image build time — silently strips the key, leaving "request": {} before the image ships.
  • The plumbing that reads request.allowPrivateNetwork only landed in 2026.4.10 (upstream PR #63671, commit 0808dd111c).

So the previous "fix" was a no-op diagnosing the wrong layer. Dockerfile and test/sandbox-provisioning.test.ts are restored to origin/main.

What this PR keeps from the prior iteration

The label-rename + arithmetic-via-`--json` test improvements added earlier on this branch are kept — they catch regressions independent of the proxy bug:

  • test/e2e/test-full-e2e.sh: Phase 4b relabelled [LIVE] → [ROUTING]; new Phase 4c that runs openclaw agent --json over SSH inside the sandbox and asserts the model answered 42 to "What is 6 multiplied by 7?". Phase 4c is the only assertion in the suite that exercises the openclaw HTTP client end-to-end against inference.local.
  • test/e2e/test-hermes-e2e.sh: same [LIVE] → [ROUTING] relabel.
  • test/e2e/test-sandbox-operations.sh TC-SBX-02: replaces the `Say exactly: HELLO_E2E` prompt + merged-output grep with the arithmetic-via-`--json` pattern.
  • test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh: stops embedding ${VERIFY_TOKEN} in the prompt; refuses overrides that smuggle it back; adds a negative assertion on SsrFBlockedError / transport / gateway-unavailable markers.
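The arithmetic-via-`--json` assertion those scripts share can be sketched like this — a JavaScript reconstruction of the embedded Python snippet, with the result.payloads[].text envelope shape taken from the PR description:

```javascript
// Parse the agent's JSON envelope and pull out the model's reply text.
// Returns '' on any parse failure, mirroring the scripts' fail-closed grep.
function extractReply(raw) {
  let doc;
  try {
    doc = JSON.parse(raw);
  } catch {
    return '';
  }
  const payloads = (doc.result && doc.result.payloads) || [];
  return payloads.map((p) => p.text || '').join('\n');
}

// The assertion requires an isolated 42, so "420" or a prompt echo of
// "6 multiplied by 7" cannot satisfy it.
function answered42(raw) {
  return /\b42\b/.test(extractReply(raw));
}

console.log(answered42('{"result":{"payloads":[{"text":"The answer is 42."}]}}')); // true
console.log(answered42('(node:1044) [UNDICI-EHPA] Warning: not JSON'));            // false
```

Because the expected token is never a substring of the prompt and parse failures yield an empty reply, an error path that quotes the prompt back cannot produce a false pass.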

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npx vitest run test/http-proxy-fix-e2e.test.ts test/http-proxy-fix-rewrite.test.ts test/http-proxy-fix-sync.test.ts test/sandbox-provisioning.test.ts — 28/28 pass.
  • Adversarial review's blockers (B1–B3) and concerns (C1–C7) addressed in commit 69115de7.
  • bash -n and shellcheck -S warning clean on scripts/nemoclaw-start.sh and the four touched e2e scripts.
  • Heredoc-sync test still passes; canonical wrapper and scripts/nemoclaw-start.sh heredoc updated together.
  • No secrets, API keys, or credentials committed.

Per-commit on this branch

  • f162daa5 — initial strip of agent/auth/Host/Proxy-* (the fix), spy-level unit tests
  • 2f217f8a — end-to-end test against a local HTTPS mock with self-signed cert
  • 69115de7 — wider RFC 7230 §6.1 hop-by-hop + TLS/transport hint strips, in-repo bisect control, vi.stubEnv consistency, http.request restore, 7-day cert, CI strict-skip

How to validate after merge

The signal job is cloud-e2e in nightly-e2e.yaml running test/e2e/test-full-e2e.sh. Watch for:

PASS: [LIVE] openclaw agent: model answered 6×7=42 through openclaw → inference.local

For the deepinfra-style failure specifically, the unit + e2e + control tests in this PR pin every strip the wrapper performs and the bug class it prevents. A real proxy + real upstream + real TLS round trip would need a CI job with a non-NVIDIA OpenAI-compatible API key — its own follow-up PR if you want belt-and-suspenders coverage.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved HTTP proxy request handling to sanitize headers and remove incompatible routing-specific fields, ensuring proper downstream forwarding.
  • Tests

    • Added comprehensive test coverage for proxy header sanitization and rewrite behavior.
    • Enhanced E2E testing with improved routing-layer assertions and JSON payload extraction from agent responses.

…w 4.9

OpenClaw 2026.4.9 added a global SSRF guard that blocks any hostname ending
in .local/.localhost/.internal on every model provider HTTP request. NemoClaw
routes inference at https://inference.local/v1, so every LLM call inside the
sandbox now fails closed.

Set the typed ConfiguredModelProviderRequest.allowPrivateNetwork field on the
inference provider in the baked openclaw.json. Scoped to that one provider —
the rest of openclaw's SSRF policy stays intact. Safe inside the sandbox
because OpenShell already enforces egress via its network policy, making
openclaw's hostname guard duplicative for this hop.

Adds a static regression guard in test/sandbox-provisioning.test.ts so a
future Dockerfile refactor cannot silently drop the opt-in.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Contributor

coderabbitai Bot commented Apr 26, 2026

📝 Walkthrough

Walkthrough

Prevented verification tokens from being embedded in agent prompts, reframed several sandbox assertions to routing-layer checks, added JSON-based agent response validation and error-pattern scanning, and implemented header/state sanitization when rewriting forward-proxy HTTPS requests, plus comprehensive unit and e2e tests for the proxy rewrite.

Changes

Cohort / File(s) Summary
E2E token/assertion and JSON-response updates
test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh, test/e2e/test-full-e2e.sh, test/e2e/test-hermes-e2e.sh, test/e2e/test-sandbox-operations.sh
Prevent embedding ${VERIFY_TOKEN} in agent prompts and assert SKILL_VERIFY_PROMPT does not contain the literal; convert certain sandbox checks from "LIVE" to routing-layer [ROUTING] assertions; run openclaw agent --json with stderr suppressed, extract result.payloads[].text, validate an isolated 42 via regex; add error-pattern scanning for provider/transport/gateway/connection failures and immediate fail on matches.
Forward-proxy HTTPS rewrite: header & option sanitization
nemoclaw-blueprint/scripts/http-proxy-fix.js, scripts/nemoclaw-start.sh
When rewriting FORWARD-mode proxy requests to https.request, strip hop-by-hop and proxy headers (e.g., Host, Proxy-*, RFC7230 hop headers and tokens from Connection), and delete proxy/agent/TLS/routing-specific fields (agent, auth, servername, checkServerIdentity, socketPath, localAddress, lookup, family, hints) so TLS/SNI and routing are recomputed for the target.
Unit tests for proxy rewrite behavior
test/http-proxy-fix-rewrite.test.ts
Add Vitest unit suite that stubs https.request to assert rewritten options: target hostname/port/path/protocol correctness, removal of proxy/TLS/transport caller fields, stripping of hop-by-hop/proxy headers while preserving Authorization/Content-Type, and preservation of signal/timeout/rejectUnauthorized. Includes pre-fix bisect control asserting Node protocol validation error.
E2E test for proxy rewrite (in-process HTTPS upstream)
test/http-proxy-fix-e2e.test.ts
Add Vitest E2E that spawns an ephemeral HTTPS upstream (self-signed cert via openssl), sets proxy env, loads the wrapper, issues an HTTP-proxy-shaped request, and asserts upstream saw rewritten Host: localhost:<port>, preserved Authorization, stripped proxy-hop headers, correct path /v1/openai/chat/completions, and a 200 response. Skips locally if openssl missing; fails in CI if missing.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client (proxy request)
    participant Proxy as Local Proxy Wrapper\n(nemoclaw http.request)
    participant Upstream as Target HTTPS Upstream

    rect rgba(230,240,255,0.5)
    Client->>Proxy: HTTP request with proxy-style URL (http://proxy/...https://target/...)
    end

    rect rgba(240,255,230,0.5)
    Proxy->>Proxy: Detect FORWARD-mode path -> parse https://target URL\nSanitize headers (remove Host, Proxy-*, Connection tokens, hop-by-hop)\nStrip caller TLS/agent fields (agent, auth, servername, etc.)
    end

    rect rgba(255,240,230,0.5)
    Proxy->>Upstream: Issue rewritten https.request with target host, port, path, sanitized headers
    Upstream-->>Proxy: HTTPS response (200 / body)
    Proxy-->>Client: Forward response back to client
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop and scrub headers, quiet and spry,
Tokens tucked safely, prompts kept shy.
Replies parsed in JSON, forty-two in sight,
Proxies now honest, routing feels right.
Hooray for clean hops — a rabbit's delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main fix: stripping forward-proxy-specific fields when rewriting http.request to https.request in the sandbox proxy wrapper.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
test/sandbox-provisioning.test.ts (1)

78-85: Tighten the invariant to reduce false positives.

The current toContain check can still pass if the same literal is added in an unrelated block. Consider asserting ordering/context near the provider config too.

Suggested test hardening
   it("openclaw.json provider config opts into allowPrivateNetwork so inference.local resolves", () => {
@@
-    expect(src).toContain("'request': {'allowPrivateNetwork': True}");
+    const baseUrlIdx = src.indexOf("'baseUrl': inference_base_url");
+    const requestIdx = src.indexOf("'request': {'allowPrivateNetwork': True}");
+    expect(baseUrlIdx).toBeGreaterThanOrEqual(0);
+    expect(requestIdx).toBeGreaterThan(baseUrlIdx);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/sandbox-provisioning.test.ts` around lines 78 - 85, The test currently
uses a loose expect(src).toContain("'request': {'allowPrivateNetwork': True}")
which can match unrelated code; tighten it by asserting the allowPrivateNetwork
literal appears in the openclaw provider config context — for example, locate
the provider identifier string used in the test (e.g., "openclaw" or
"openclaw.json") and assert that the substring "'request':
{'allowPrivateNetwork': True}" appears after it (or use a single regex that
matches the provider block and the request key together, e.g.,
/openclaw(?:\.json)?[\s\S]*'request':\s*{\s*'allowPrivateNetwork':\s*True\s*}/)
so the check targets the provider config rather than any unrelated occurrence.


📥 Commits

Reviewing files that changed from the base of the PR and between f439189 and 7300c3a.

📒 Files selected for processing (2)
  • Dockerfile
  • test/sandbox-provisioning.test.ts

Address CodeRabbit nitpick on PR #2490: replace the bare `toContain` with
an ordered-index check that pins the literal between `'baseUrl': inference_base_url`
and `'models': [{**({'compat'` — the python that builds the providers
dict. A bare match could pass on any unrelated occurrence elsewhere in
the Dockerfile; the ordered indices prove the opt-in lives inside the
provider config block.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…tokens

The "live inference" assertions in cloud-e2e (test-full-e2e.sh) and the
Hermes e2e were curl-from-sandbox checks. They prove OpenShell's DNS
forwarder + proxy can route inference.local; they never invoke openclaw's
HTTP client and never reach openclaw's SSRF guard. That is why every
openclaw 4.9 nightly-e2e run on PR #2464 reported [LIVE] Sandbox inference:
PASS while real users were getting SsrFBlockedError on the same release.

Changes:

* test-full-e2e.sh: relabel Phase 4b from [LIVE] to [ROUTING] with a comment
  pointing at #2490; add Phase 4c, an actual openclaw-mediated turn that
  runs `openclaw agent --json` over SSH, parses result.payloads[].text, and
  asserts the model produced "42" for "What is 6 multiplied by 7?". The
  expected token is not a substring of the prompt, --json routes logs to
  stderr, stderr is dropped — so prompt-echo on an error path cannot
  satisfy the grep.
* test-hermes-e2e.sh: same relabel for the equivalent curl assertion.
* test-sandbox-operations.sh TC-SBX-02: replace `Say exactly: HELLO_E2E`
  prompt + grep on merged stdout/stderr with the same arithmetic-via-JSON
  pattern. The previous assertion would match the prompt itself in any
  error path that quoted it back, including the openclaw 4.9 SSRF
  rejection — false positive that hid the regression for the entire
  4.2 → 4.7 → 4.8 → 4.9 bump series.
* verify-sandbox-skill-via-agent.sh: stop embedding ${VERIFY_TOKEN} in the
  prompt (the agent must read it from SKILL.md — that is the test). Add
  a guard that refuses SKILL_VERIFY_PROMPT overrides which smuggle the
  token back in, and a negative assertion on SsrFBlockedError, transport
  errors, and gateway-unavailable markers before the positive grep.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa changed the title from "fix(sandbox): allow private-network for inference provider on openclaw 4.9" to "fix(sandbox): make openclaw 4.9 inference work and add tests that prove it" on Apr 26, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
test/e2e/test-full-e2e.sh (1)

397-415: Consider extracting the JSON payload parser to a shared function.

The Python snippet for parsing result.payloads[].text from the openclaw agent JSON envelope is duplicated in test-sandbox-operations.sh (lines 297-309). If more tests adopt this pattern, consider extracting it to test/e2e/lib/ as a shared helper.

Not blocking since there are only two occurrences currently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-full-e2e.sh` around lines 397 - 415, Extract the duplicated
Python JSON parsing snippet that builds agent_reply from agent_response into a
shared helper (e.g., create a script or shell function named
parse_agent_payloads or parse_agent_reply in test/e2e/lib/), replace the inline
Python block in both test-full-e2e.sh and test-sandbox-operations.sh with a call
to that helper, and ensure the helper accepts agent_response (or reads stdin)
and returns the concatenated payload texts so the existing grep/if logic around
agent_reply remains unchanged.
test/e2e/test-sandbox-operations.sh (1)

297-315: Consider logging when JSON parsing fails.

The Python parser silently exits with code 0 when JSON parsing fails (line 302), which causes $reply to be empty. While the test will still fail (empty string won't match "42"), the failure message won't distinguish between "no JSON output" and "JSON parsed but no matching payloads."

This is acceptable for now since the raw stdout is included in the failure message, but a debug log could help triage.

🔧 Optional: Add parse failure indicator
 reply=$(echo "$raw" | python3 -c "
 import json, sys
 try:
     doc = json.load(sys.stdin)
 except Exception:
+    print('PARSE_FAILED', file=sys.stderr)
     sys.exit(0)
 result = doc.get('result') or {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-sandbox-operations.sh` around lines 297 - 315, The test's
Python JSON extractor (the python3 one-liner that populates the reply variable
from raw) currently swallows JSON parse errors and exits silently; change that
python block (the invocation that sets reply) to catch the JSON parsing
exception and emit a clear parse-failure marker (for example print a sentinel
like "JSON_PARSE_ERROR" to stdout or stderr) so the shell can distinguish parse
failures from empty payloads, and update the surrounding shell check (the if
that inspects $reply and grep for 42) to detect this marker and produce a more
specific fail message referencing reply/raw when parsing failed.


📥 Commits

Reviewing files that changed from the base of the PR and between 856f984 and ca03324.

📒 Files selected for processing (4)
  • test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh
  • test/e2e/test-full-e2e.sh
  • test/e2e/test-hermes-e2e.sh
  • test/e2e/test-sandbox-operations.sh

Contributor Author

Adversarial self-review

I ran a deliberate red-team pass on this change before requesting review. Posting the findings here so reviewers don't have to redo the work and so the open follow-ups are tracked. Cross-referenced openclaw HEAD at /Users/aerickson/Documents/NemoDev/openclaw-4-9-compat/openclaw for the SSRF and provider-config code paths.

Blockers

None. Schema match confirmed at openclaw src/config/zod-schema.core.ts:362 (strict, accepts request.allowPrivateNetwork); per-provider scoping confirmed at src/agents/provider-request-config.ts:601-664 and src/agents/provider-transport-fetch.ts:153,196; allowPrivateNetwork only flows into the per-fetch policy when the configured model's transport carries it. No global disable.

Concerns (worth tracking, not blocking)

1. Cross-origin redirect blast radius. openclaw/src/infra/net/fetch-guard.ts:333-457 runs the redirect loop with the same closed-over params.policy for every hop. If a compromised or misconfigured inference.local ever returns a 302 to 169.254.169.254, 127.0.0.1, 192.168.0.1, or any other private IP, the SSRF guard will accept it because we set allowPrivateNetwork: true up-front. Mitigation today: OpenShell's network policy governs real egress, making this a defense-in-depth concern rather than an exploitable one. Better fix when upstream supports it: switch from allowPrivateNetwork: true to allowedHostnames: ['inference.local']. The internal SsrFPolicy already supports the latter (openclaw/src/infra/net/ssrf.ts:43-44,86); ConfiguredModelProviderRequest (openclaw/src/config/types.provider-request.ts:38-47) does not yet expose it. Tracked as a follow-up — I'll file the upstream openclaw issue separately.

2. Upstream regression magnet. openclaw/CHANGELOG.md:1324 records that models.providers.*.request.allowPrivateNetwork already broke between v2026.4.13 and v2026.4.14 for STT and had to be patched back. The same key has been touched in #71723 (Google), #66692 (audio), #58529 (Mattermost), #63671 (introduction). NemoClaw is pinned at Dockerfile.base:155 OPENCLAW_VERSION=2026.4.9; the Dockerfile-text test in this PR cannot detect a runtime regression on bump. Hard prerequisite for any future openclaw bump: the new Phase 4c assertion in test/e2e/test-full-e2e.sh must pass on cloud-e2e. That's the only test in the suite that proves openclaw can complete a turn against inference.local.

3. Test brittleness/laxity. The static check in test/sandbox-provisioning.test.ts:75-95 matches the literal string 'request': {'allowPrivateNetwork': True} with anchored ordering. It will fail on plausible benign refactors — Truetrue (canonical Python here, but a json.dumps-based rewrite would emit true), single→double quote reformatting, or renaming inference_base_url. Conversely, if a future maintainer moves the literal into a different provider stanza later in the file, the index-ordering check still passes because indexOf returns the first hit. A behavioral test that loads the rendered JSON and asserts config.models.providers[<key>].request.allowPrivateNetwork === true would be both stricter and more robust. Phase 4c covers the runtime case, so this is a follow-up rather than a blocker.

4. Provider-key fan-out fragility. src/lib/onboard.ts:1414-1466 shows every branch of getSandboxInferenceConfig() happens to point inferenceBaseUrl at https://inference.local{...}, so the single request stanza in the Dockerfile covers all current onboard paths. If a future onboarding branch uses a different inferenceBaseUrl ending in .local/.internal, the Dockerfile fix silently no-ops for it. Worth a comment in getSandboxInferenceConfig() warning that any new branch must revisit the Dockerfile providers dict — small follow-up.

Notes

  • Schema confirmed. request.allowPrivateNetwork is at openclaw/src/config/zod-schema.core.ts:362 (strict object, not nested under request.transport or similar).
  • Runtime config tampering: low risk. The second python block at Dockerfile:371-377 mutates the same ~/.openclaw/openclaw.json to inject the gateway token, then chmods 0600; a chown to root and Landlock-lock follow. The request key survives because json.load/json.dump round-trip preserves it. openclaw doctor --fix runs at Dockerfile:363 before the second python block — it would reject any unknown key under strict zod, but request is a known key.
  • Docker quoting. The new line 'request': {'allowPrivateNetwork': True}, \ uses the same single-quote-inside-double-quote-heredoc pattern as every other line in the RUN python3 -c "..." block. No new escaping fragility.
  • Public posture clean. Diff, commit messages (7300c3ab, 856f9841, ca03324f), and PR body contain no internal bug-tracker IDs, no internal URLs, no AI attribution. PR body accurately characterizes the scope ("scoped to that one provider") which matches provider-request-config.ts:662.
  • End-to-end validation in flight. Manually dispatched nightly-e2e.yaml on this branch (run 24969506775). cloud-e2e already passed with the new Phase 4c line: PASS: [LIVE] openclaw agent: model answered 6×7=42 through openclaw → inference.local (job log). sandbox-operations-e2e cross-check (TC-SBX-02) still in progress.

Follow-ups to file after merge

  • Behavioral test that loads the rendered openclaw.json from a built image and asserts the key on the parsed object.
  • Comment in getSandboxInferenceConfig() warning about Dockerfile coupling.
  • Upstream openclaw issue: expose allowedHostnames/hostnameAllowlist on ConfiguredModelProviderRequest. Switching to allowedHostnames: ['inference.local'] would close concern #1.

sandbox_exec_for() merges stderr into stdout (2>&1 at line 95) so it can
log non-zero exits — but that pollutes the stdout `python3 json.load()`
needs to parse, and node-side warnings (UNDICI-EHPA, etc.) from openclaw
agent always land on stderr. Result on the first nightly-e2e dispatch:
TC-SBX-02 captured `(node:1044) [UNDICI-EHPA] Warning: ...` ahead of any
JSON, json.load raised, the except branch produced reply='', and the test
failed with `expected '42' in agent reply, got: ''`.

cloud-e2e Phase 4c didn't have this problem because it uses a direct ssh
invocation with `2>/dev/null` at the source. Mirror that here: open the
ssh-config in the helper, call ssh directly, drop stderr at the wire.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Follow-up to #2344 (NemoClaw 0.0.24). The http.request → https.request
rewrite that resolves the NODE_USE_ENV_PROXY=1 / library-side proxy
double-processing was shallow-copying the caller's options into the
rewritten https.request. That dragged three classes of forward-proxy-hop
fields onto a request that now goes direct to the upstream:

  * options.agent — a forward-proxy http.Agent. http.Agent cannot speak
    TLS, so https.request either ignores it and falls back to a default
    agent or fails the TLS handshake outright, depending on the caller. This
    is the most likely root cause of the deepinfra-style report on
    Discord ("LLM request failed: network connection error" against a
    custom OpenAI-compatible upstream while NVIDIA-routed providers
    keep working). NVIDIA Endpoints' DNS-rewritten path through OpenShell
    doesn't end up in this branch, so the bug stayed invisible.

  * options.auth — basic-auth meant for the proxy hop. Leaving it on
    https.request would Basic-auth the *target* server with proxy
    credentials.

  * Host / Proxy-Authorization / Proxy-Connection / Proxy-Authenticate
    headers — Host points at the proxy host and the Proxy-* headers are
    hop-by-hop. Stripping them lets Node regenerate Host from the
    target's hostname/port and prevents proxy creds from leaking
    upstream.
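The three strips above can be sketched as follows. This is an illustrative shape, not the canonical nemoclaw-blueprint/scripts/http-proxy-fix.js source; the function name and structure are made up for the sketch:

```javascript
// Sketch: strip forward-proxy-hop fields from shallow-copied options
// before handing them to https.request for a direct connection.
function stripForwardProxyFields(options) {
  const rewritten = { ...options, headers: { ...(options.headers || {}) } };

  // A forward-proxy http.Agent cannot speak TLS; let https.request
  // pick its own default agent for the direct connection.
  delete rewritten.agent;

  // Basic-auth meant for the proxy hop must not Basic-auth the target.
  delete rewritten.auth;

  // Host points at the proxy; Proxy-* headers are hop-by-hop.
  for (const name of Object.keys(rewritten.headers)) {
    const lower = name.toLowerCase();
    if (
      lower === "host" ||
      lower === "proxy-authorization" ||
      lower === "proxy-connection" ||
      lower === "proxy-authenticate"
    ) {
      delete rewritten.headers[name];
    }
  }
  return rewritten;
}
```

Target-scoped fields (Authorization, Content-Type, timeouts, TLS options) pass through untouched, which is what the seven rewrite-test cases pin.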

Adds test/http-proxy-fix-rewrite.test.ts with seven cases pinning each
strip and confirming signal/timeout/TLS fields and target headers
(Authorization, Content-Type, …) survive. The existing
http-proxy-fix-sync.test.ts byte-for-byte enforcer still passes — the
canonical wrapper at nemoclaw-blueprint/scripts/http-proxy-fix.js and
the heredoc in scripts/nemoclaw-start.sh were updated together.

Reverts the openclaw `request.allowPrivateNetwork` Dockerfile change
that this PR previously carried — empirical investigation showed
openclaw 2026.4.9 rejects that key in strict zod and `openclaw doctor
--fix` (which the Dockerfile runs at build time) silently strips it
back to `"request": {}` before the image ships. The upstream plumbing
that reads `request.allowPrivateNetwork` only landed in 2026.4.10
(commit 0808dd111c, openclaw PR #63671). The change here was a no-op
that diagnosed the wrong layer; this commit replaces it with the actual
fix at the wrapper layer.

The label-rename + arithmetic-via-`--json` test improvements added
earlier on this branch (Phase 4c, TC-SBX-02 rewrite, skill-verify
hardening, sandbox_exec stderr-merge fix) are kept — they catch
regressions independent of the proxy bug.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa changed the title fix(sandbox): make openclaw 4.9 inference work and add tests that prove it fix(sandbox): strip forward-proxy fields when rewriting to https.request Apr 27, 2026
Spins up a local HTTPS mock on 127.0.0.1 with a self-signed cert generated
via openssl, simulates the FORWARD-mode http.request shape that axios +
HTTPS_PROXY produces (forward-proxy http.Agent attached, proxy basic-auth,
Host pointing at the proxy, Proxy-* headers), and asserts an
OpenAI-compatible chat-completions round trip actually completes.

Bisect proof:

  Wrapper without the strip (parent commit reverted):
    × test fails with `TypeError: Protocol "https:" not supported.
      Expected "http:"` — Node detects the forward-proxy http.Agent
      riding into https.request. Same root cause as the deepinfra
      "LLM request failed: network connection error" symptom.

  Wrapper with the strip (this PR):
    ✓ test passes; mock receives POST /v1/openai/chat/completions with
      Authorization: Bearer real-deepinfra-token preserved, Host:
      localhost:<port> regenerated by Node, Proxy-* headers absent,
      and the request body intact. Caller gets the mock's PONG reply.

Skipped if openssl is not on PATH; CI runners (ubuntu-latest, macOS) have
it. Cert is generated fresh per test run with 1-day expiry so there is no
long-lived crypto material in the repo.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…sport hints

Adversarial review caught three blockers and seven concerns on the prior
commit. This addresses all of them:

Wrapper (nemoclaw-blueprint/scripts/http-proxy-fix.js + heredoc in
scripts/nemoclaw-start.sh, kept byte-for-byte synced):

  * sanitizeHeaders strips the full RFC 7230 §6.1 hop-by-hop set
    (Connection, Keep-Alive, Proxy-Authorization, TE, Trailer,
    Transfer-Encoding, Upgrade) plus Host, Proxy-Connection, and
    Proxy-Authenticate. Also walks the Connection header and strips
    every token named there (transitive hop-by-hop). The prior commit
    only covered Host + Proxy-*, leaving Connection: close from the
    proxy hop riding into a direct-to-target request and defeating
    keep-alive on the rewrite. (B1)
  * Proxy-Authenticate is response-only per RFC 7235 §4.3 — comment
    block updated to call this out so the next reviewer doesn't repeat
    the question. Kept in the strip set as belt-and-suspenders for
    clients that echo response headers into retry-request options. (B2)
  * Strip rewritten.servername and rewritten.checkServerIdentity — TLS
    SNI and identity-check pre-computed for the proxy hop must not
    survive into the direct-to-target handshake. (C2)
  * Strip rewritten.socketPath — Unix-socket proxies (cntlm-style)
    exist; routing TLS bytes into the proxy socket would defeat the
    rewrite. (C2)
  * Strip rewritten.localAddress, lookup, family, hints —
    source-binding and DNS hints picked for reachability to the proxy
    don't apply to a different upstream. (C3)
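A sketch of the resulting header sanitization, including the transitive Connection-token walk described in the first bullet (illustrative only, not the canonical wrapper source):

```javascript
// Fixed strip set: RFC 7230 §6.1 hop-by-hop headers plus Host and the
// Proxy-* fields, all compared case-insensitively.
const FIXED_STRIP = new Set([
  "host", "connection", "keep-alive", "proxy-authorization",
  "proxy-authenticate", "proxy-connection", "te", "trailer",
  "transfer-encoding", "upgrade",
]);

function sanitizeHeaders(headers = {}) {
  const strip = new Set(FIXED_STRIP);

  // Transitive hop-by-hop: the Connection header can name additional
  // headers that are hop-scoped, e.g. "Connection: close, X-Proxy-Hint".
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "connection" && typeof value === "string") {
      for (const token of value.split(",")) {
        strip.add(token.trim().toLowerCase());
      }
    }
  }

  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!strip.has(name.toLowerCase())) out[name] = value;
  }
  return out;
}
```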

Tests:

  * test/http-proxy-fix-rewrite.test.ts: new cases pin every new strip
    (RFC 7230 §6.1 hop-by-hop including transitive Connection-named
    tokens; servername/checkServerIdentity; socketPath/localAddress/
    lookup/family/hints). Adds a separate "control" describe block that
    constructs the broken pre-fix rewrite shape inline and asserts
    https.request synchronously rejects it on Node 22. The bisect now
    lives in the repo, not in PR description prose. (B3)
  * test/http-proxy-fix-e2e.test.ts: switches from raw process.env
    mutation to vi.stubEnv (matches the rewrite suite, prevents env
    leaks across test files in the same vitest worker). Saves and
    restores the original http.request in beforeEach/afterEach so a
    failed test does not leave the wrapped function chained into the
    next test file. (C4, C5)
  * Cert expiry: 1-day → 7-day so a long-running `vitest --watch`
    session does not outrun it. (C6)
  * When CI=true and openssl is missing, throws at module load instead
    of silent-skipping. CI runners ship openssl; a missing-binary skip
    would be a false green. (C7)

Run-time totals: 29 cases across rewrite (12) + sync (6) + e2e (1) +
sandbox-provisioning (10) — all pass; 0 regressions.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
test/http-proxy-fix-rewrite.test.ts (1)

249-260: Correct assertion for explicit port, but type inconsistency worth noting.

The test correctly expects "8443" as a string because URL.port returns a string for explicit ports. The earlier test at line 97 expects 443 as a number because the production code uses target.port || 443, where the empty string falls back to the numeric default.

This is correct behavior but the mixed string/number type could confuse future maintainers. Consider adding a brief comment.

📝 Optional: Add clarifying comment
   it("uses the explicit target port when one is present in the URL", () => {
     http.request({
       hostname: PROXY_HOST,
       port: 3128,
       path: "https://internal.example.com:8443/v1/x",
       headers: {},
     });

     expect(captured).not.toBeNull();
+    // URL.port returns a string for explicit ports; the production code's
+    // `target.port || 443` preserves the string type here (vs. numeric 443 fallback).
     expect(captured?.port).toBe("8443");
     expect(captured?.hostname).toBe("internal.example.com");
   });
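For context, the string-vs-number split is observable directly with Node's WHATWG URL (a minimal sketch, independent of the production code):

```javascript
// URL.port is a string for explicit ports and "" when the port is
// implied by the scheme, so `target.port || 443` yields either the
// string port or the numeric fallback.
const explicit = new URL("https://internal.example.com:8443/v1/x");
const implied = new URL("https://internal.example.com/v1/x");

console.log(explicit.port || 443, typeof (explicit.port || 443)); // → 8443 string
console.log(implied.port || 443, typeof (implied.port || 443));   // → 443 number
```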
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/http-proxy-fix-rewrite.test.ts` around lines 249 - 260, The test "uses
the explicit target port when one is present in the URL" asserts captured?.port
=== "8443" which is correct because URL.port returns a string while other tests
expect numeric 443 due to production code using target.port || 443; add a
one-line comment above this test (or above the assertion) explaining that
explicit URL ports are string-typed via URL.port and that the code falls back to
numeric 443 when URL.port is empty to avoid future confusion referencing the
captured variable and the test name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e060c472-d919-4006-9f36-5474c932e073

📥 Commits

Reviewing files that changed from the base of the PR and between cade7f1 and 69115de.

📒 Files selected for processing (4)
  • nemoclaw-blueprint/scripts/http-proxy-fix.js
  • scripts/nemoclaw-start.sh
  • test/http-proxy-fix-e2e.test.ts
  • test/http-proxy-fix-rewrite.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • nemoclaw-blueprint/scripts/http-proxy-fix.js
  • scripts/nemoclaw-start.sh

@ericksoa ericksoa merged commit 8c20905 into main Apr 27, 2026
55 of 58 checks passed
@ericksoa ericksoa deleted the fix/openclaw-4-9-inference-ssrf branch April 27, 2026 02:43
@miyoungc miyoungc mentioned this pull request Apr 28, 2026
13 tasks
miyoungc added a commit that referenced this pull request Apr 28, 2026
## Summary
Refreshes user-facing docs for the last 24 hours of merged NemoClaw
history and bumps the docs metadata to 0.0.29, the next version after
v0.0.28. The updates are limited to behavior supported by merged PR
descriptions and diffs.

## Changes
- `docs/reference/commands.md`: documented `nemoclaw <name> policy-add
--from-file` and `--from-dir`, including custom preset review guidance,
from #2077 / commit `7720b175`.
- `docs/deployment/deploy-to-remote-gpu.md`: clarified that non-loopback
`CHAT_UI_URL` disables OpenClaw device pairing for remote browser-only
deployments, from #2449 / commit `f5ee8a4d`.
- `docs/inference/inference-options.md`: documented provider-aware
credential retry validation and the NVIDIA-only `nvapi-` prefix check,
from #2389 / commit `6f7f0c6d`.
- `docs/inference/switch-inference-providers.md`: documented
`NEMOCLAW_INFERENCE_INPUTS` for text/image-capable model metadata baked
into `openclaw.json`, from #2441 / commit `f4391892`.
- `docs/reference/troubleshooting.md`: added the Git certificate
verification entry for proxy CA propagation through `GIT_SSL_CAINFO`,
`GIT_SSL_CAPATH`, `CURL_CA_BUNDLE`, and `REQUESTS_CA_BUNDLE`, from #2345
/ commit `fa0dc1ab`.
- `docs/versions1.json` and `docs/project.json`: promoted docs version
`0.0.29`; `docs/versions1.json` omits unpublished `0.0.26`, `0.0.27`,
and `0.0.28` entries.
- `.agents/skills/nemoclaw-user-*`: regenerated derived user skill
references from the updated docs.
- Reviewed with no extra doc changes: #2575 / `d392ec07`, #2565 /
`a3231049`, #1965 / `db1ef3ca`, #1990 / `db665834`, #2495 / `7da86fa3`,
#2496 / `3192f4f4`, #2490 / `8c209058`, #2487 / `1f615e2f`, #2483 /
`5653d33a`, #2482 / `31c782c0`, #2464 / `23bb5703`, #2472 / `a54f9a34`,
and #2437 / `6bc860d7`.
- Skipped per docs policy: #2420 / `7b76df6b` touched the experimental
sandbox config path listed in `docs/.docs-skip`; #2466 / `cc15689c`
touched a skipped term and CI-only sandbox image files.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
<!-- Check each item you ran and confirmed. Leave unchecked items you
skipped. -->
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes — failed locally in installer-integration tests
and one onboard helper timeout; the doc-scoped hook test projects passed
under `prek`.
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only) — build
succeeded, but local Sphinx emitted the existing version-switcher file
read message.
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure
<!-- If an AI agent authored or co-authored this PR, check the box and
name the tool. Remove this section for fully human-authored PRs. -->
- [x] AI-assisted — tool: Codex

---
<!-- DCO sign-off required by CI. Run: git config user.name && git
config user.email -->
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Support for custom YAML presets in policy configuration via
--from-file and --from-dir.
* New build-time inference input option to declare accepted modalities
(text or text,image).

* **Improvements**
* Credential validation now offers interactive recovery: re-enter key,
retry, choose another provider, or exit.
* Clarified provider-specific API key prefix handling (nvapi- only
applies to NVIDIA keys).

* **Documentation**
  * TLS certificate troubleshooting for inspected networks.
* Clarified remote dashboard security/device-pairing behavior; command
docs updated; docs version bumped.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>