
fix(sandbox): strip forward-proxy fields when rewriting to https.request #2490

Merged

ericksoa merged 8 commits into main from fix/openclaw-4-9-inference-ssrf on Apr 27, 2026

Conversation

Contributor

@ericksoa ericksoa commented Apr 26, 2026

Summary

Follow-up to #2344 (NemoClaw 0.0.24). The http.request → https.request rewrite in nemoclaw-blueprint/scripts/http-proxy-fix.js was shallow-copying the caller's options into the rewritten https.request call. That dragged forward-proxy-hop fields onto a request that now goes direct to the upstream. This is consistent with the Discord report from mflova after the 0.0.23 → 0.0.24 bump:

Finally, when I run the bot, I always get an error after sending any message to it: error=LLM request failed: network connection error. rawError=Connection error

Some context: I am using 0.0.23 as the 0.0.24 had a bug with the network.

mflova was on a custom OpenAI-compatible endpoint (deepinfra). NVIDIA Endpoints' DNS-rewritten path through OpenShell doesn't end up in the wrapper's FORWARD-mode branch, which is why upstream regressions there stayed invisible to nightly-e2e. mflova's reported error is the openclaw client's wrapped surface; on Node 22 the underlying mechanism is a synchronous TypeError: Protocol "https:" not supported. Expected "http:" thrown by https.request when the forward-proxy http.Agent reaches it. On Node 18/20 the same root cause manifests as a TLS handshake failure rather than a synchronous throw — same bug class, different surface.

Bisect proof (committed in this PR)

test/http-proxy-fix-rewrite.test.ts carries both halves of the proof in-repo:

  • The "control" describe block reproduces the broken pre-fix rewrite shape inline (Object.assign with no field strips, hop-by-hop headers preserved) and asserts https.request rejects it with TypeError: Protocol "https:" not supported. That test passes regardless of the wrapper, locking in the bug class.
  • The rewrite describe block asserts the wrapper produces options that don't match that shape — agent, auth, Host, Proxy-*, the rest of RFC 7230 §6.1 hop-by-hop, servername, checkServerIdentity, socketPath, localAddress, lookup, family, hints are all stripped on rewrite.

Combined: bug class is real and detectable; wrapper's job is to never produce that shape. If a future maintainer reverts a strip, the rewrite test breaks; if Node changes the failure surface, the control test breaks. Two-sided coverage without storing a copy of the broken wrapper.

What was wrong

The rewrite was:

var rewritten = Object.assign({}, options, {
  hostname: target.hostname, host: target.hostname,
  port: target.port || 443, path: target.pathname + target.search,
  protocol: 'https:',
});
return https.request(rewritten, callback);

Three classes of forward-proxy-hop fields rode along:

  • options.agent — a forward-proxy http.Agent. http.Agent.protocol === 'http:', so on Node 22 https.request throws TypeError: Protocol "https:" not supported. Expected "http:" synchronously; on Node 18/20 it falls through and the TLS handshake fails. Manifests upstream as "Connection error".
  • options.auth — basic-auth meant for the proxy hop. Leaving it on https.request Basic-auths the target server with proxy credentials.
  • Hop-by-hop headers (RFC 7230 §6.1): Connection, Keep-Alive, Proxy-Authorization, TE, Trailer, Transfer-Encoding, Upgrade, plus tokens named in Connection (transitively hop-by-hop), plus Host (proxy-pointing) and Proxy-Connection (de facto deprecated). Connection: close from the proxy hop defeats keep-alive on the rewrite; Upgrade to a non-WS target produces 400/426; Proxy-* leak proxy credentials upstream.

Plus: TLS identity (servername, checkServerIdentity), socket-path proxy hint (socketPath), and source-binding / DNS hints (localAddress, lookup, family, hints) that all describe the connection to the proxy and don't apply to a different upstream.

The fix

nemoclaw-blueprint/scripts/http-proxy-fix.js (canonical) and the heredoc in scripts/nemoclaw-start.sh (byte-for-byte synced via the existing http-proxy-fix-sync.test.ts):

  • sanitizeHeaders() strips the full RFC 7230 §6.1 hop-by-hop set plus Host, Proxy-Connection, Proxy-Authenticate, and any token named in the Connection header (transitive per the same RFC). Case-insensitive.
  • delete rewritten.agent, delete rewritten.auth, delete rewritten.servername, delete rewritten.checkServerIdentity, delete rewritten.socketPath, delete rewritten.localAddress, delete rewritten.lookup, delete rewritten.family, delete rewritten.hints after the Object.assign.
  • Signal (AbortController), TLS material (ca/cert/key/rejectUnauthorized), timeout, body, and target-intent headers (Authorization, Content-Type, …) all survive.

Tests

  • test/http-proxy-fix-rewrite.test.ts (12 cases) — spy-level rewrite tests pinning every strip; control test reproducing the broken-wrapper shape and asserting TypeError.
  • test/http-proxy-fix-e2e.test.ts (1 case) — end-to-end. openssl shell-out at module load (skipped locally if openssl missing; loud-fails under CI=true). Spins up an HTTPS mock on 127.0.0.1:0, builds the FORWARD-mode http.request shape with forward-proxy http.Agent + proxy basic-auth + proxy-pointing Host, asserts the round trip completes.
  • test/http-proxy-fix-sync.test.ts (6 cases, existing) — byte-for-byte parity between the canonical wrapper and the scripts/nemoclaw-start.sh heredoc still enforced.

What this PR no longer carries

This branch previously included a Dockerfile change that added 'request': {'allowPrivateNetwork': True} to the inference provider in the baked openclaw.json, and a regression test for it. Empirical investigation showed:

  • openclaw 2026.4.9's strict zod schema rejects request.allowPrivateNetwork as an unrecognized key (src/config/zod-schema.core.ts:283-292 in v2026.4.9).
  • openclaw doctor --fix — which the Dockerfile runs at image build time — silently strips the key, leaving "request": {} before the image ships.
  • The plumbing that reads request.allowPrivateNetwork only landed in 2026.4.10 (upstream PR #63671, commit 0808dd111c).

So the previous "fix" was a no-op diagnosing the wrong layer. Dockerfile and test/sandbox-provisioning.test.ts are restored to origin/main.

What this PR keeps from the prior iteration

The label-rename + arithmetic-via-`--json` test improvements added earlier on this branch are kept — they catch regressions independent of the proxy bug:

  • test/e2e/test-full-e2e.sh: Phase 4b relabelled [LIVE] → [ROUTING]; new Phase 4c that runs openclaw agent --json over SSH inside the sandbox and asserts the model answered 42 to "What is 6 multiplied by 7?". Phase 4c is the only assertion in the suite that exercises the openclaw HTTP client end-to-end against inference.local.
  • test/e2e/test-hermes-e2e.sh: same [LIVE] → [ROUTING] relabel.
  • test/e2e/test-sandbox-operations.sh TC-SBX-02: replaces the `Say exactly: HELLO_E2E` prompt + merged-output grep with the arithmetic-via-`--json` pattern.
  • test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh: stops embedding ${VERIFY_TOKEN} in the prompt; refuses overrides that smuggle it back; adds a negative assertion on SsrFBlockedError / transport / gateway-unavailable markers.
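The arithmetic-via-`--json` assertion those scripts share can be sketched like this — a JavaScript reconstruction of the embedded Python snippet, with the result.payloads[].text envelope shape taken from the PR description:

```javascript
// Parse the agent's JSON envelope and pull out the model's reply text.
// Returns '' on any parse failure, mirroring the scripts' fail-closed grep.
function extractReply(raw) {
  let doc;
  try {
    doc = JSON.parse(raw);
  } catch {
    return '';
  }
  const payloads = (doc.result && doc.result.payloads) || [];
  return payloads.map((p) => p.text || '').join('\n');
}

// The assertion requires an isolated 42, so "420" or a prompt echo of
// "6 multiplied by 7" cannot satisfy it.
function answered42(raw) {
  return /\b42\b/.test(extractReply(raw));
}

console.log(answered42('{"result":{"payloads":[{"text":"The answer is 42."}]}}')); // true
console.log(answered42('(node:1044) [UNDICI-EHPA] Warning: not JSON'));            // false
```

Because the expected token is never a substring of the prompt and parse failures yield an empty reply, an error path that quotes the prompt back cannot produce a false pass.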

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npx vitest run test/http-proxy-fix-e2e.test.ts test/http-proxy-fix-rewrite.test.ts test/http-proxy-fix-sync.test.ts test/sandbox-provisioning.test.ts — 28/28 pass.
  • Adversarial review's blockers (B1–B3) and concerns (C1–C7) addressed in commit 69115de7.
  • bash -n and shellcheck -S warning clean on scripts/nemoclaw-start.sh and the four touched e2e scripts.
  • Heredoc-sync test still passes; canonical wrapper and scripts/nemoclaw-start.sh heredoc updated together.
  • No secrets, API keys, or credentials committed.

Per-commit on this branch

  • f162daa5 — initial strip of agent/auth/Host/Proxy-* (the fix), spy-level unit tests
  • 2f217f8a — end-to-end test against a local HTTPS mock with self-signed cert
  • 69115de7 — wider RFC 7230 §6.1 hop-by-hop + TLS/transport hint strips, in-repo bisect control, vi.stubEnv consistency, http.request restore, 7-day cert, CI strict-skip

How to validate after merge

The signal job is cloud-e2e in nightly-e2e.yaml running test/e2e/test-full-e2e.sh. Watch for:

PASS: [LIVE] openclaw agent: model answered 6×7=42 through openclaw → inference.local

For the deepinfra-style failure specifically, the unit + e2e + control tests in this PR pin every strip the wrapper performs and the bug class it prevents. A real proxy + real upstream + real TLS round trip would need a CI job with a non-NVIDIA OpenAI-compatible API key — its own follow-up PR if you want belt-and-suspenders coverage.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved HTTP proxy request handling to sanitize headers and remove incompatible routing-specific fields, ensuring proper downstream forwarding.
  • Tests

    • Added comprehensive test coverage for proxy header sanitization and rewrite behavior.
    • Enhanced E2E testing with improved routing-layer assertions and JSON payload extraction from agent responses.

…w 4.9

OpenClaw 2026.4.9 added a global SSRF guard that blocks any hostname ending
in .local/.localhost/.internal on every model provider HTTP request. NemoClaw
routes inference at https://inference.local/v1, so every LLM call inside the
sandbox now fails closed.

Set the typed ConfiguredModelProviderRequest.allowPrivateNetwork field on the
inference provider in the baked openclaw.json. Scoped to that one provider —
the rest of openclaw's SSRF policy stays intact. Safe inside the sandbox
because OpenShell already enforces egress via its network policy, making
openclaw's hostname guard duplicative for this hop.

Adds a static regression guard in test/sandbox-provisioning.test.ts so a
future Dockerfile refactor cannot silently drop the opt-in.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Contributor

coderabbitai Bot commented Apr 26, 2026

📝 Walkthrough

Walkthrough

Prevented verification tokens from being embedded in agent prompts, reframed several sandbox assertions to routing-layer checks, added JSON-based agent response validation and error-pattern scanning, and implemented header/state sanitization when rewriting forward-proxy HTTPS requests, plus comprehensive unit and e2e tests for the proxy rewrite.

Changes

Cohort / File(s) Summary
E2E token/assertion and JSON-response updates
test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh, test/e2e/test-full-e2e.sh, test/e2e/test-hermes-e2e.sh, test/e2e/test-sandbox-operations.sh
Prevent embedding ${VERIFY_TOKEN} in agent prompts and assert SKILL_VERIFY_PROMPT does not contain the literal; convert certain sandbox checks from "LIVE" to routing-layer [ROUTING] assertions; run openclaw agent --json with stderr suppressed, extract result.payloads[].text, validate an isolated 42 via regex; add error-pattern scanning for provider/transport/gateway/connection failures and immediate fail on matches.
Forward-proxy HTTPS rewrite: header & option sanitization
nemoclaw-blueprint/scripts/http-proxy-fix.js, scripts/nemoclaw-start.sh
When rewriting FORWARD-mode proxy requests to https.request, strip hop-by-hop and proxy headers (e.g., Host, Proxy-*, RFC7230 hop headers and tokens from Connection), and delete proxy/agent/TLS/routing-specific fields (agent, auth, servername, checkServerIdentity, socketPath, localAddress, lookup, family, hints) so TLS/SNI and routing are recomputed for the target.
Unit tests for proxy rewrite behavior
test/http-proxy-fix-rewrite.test.ts
Add Vitest unit suite that stubs https.request to assert rewritten options: target hostname/port/path/protocol correctness, removal of proxy/TLS/transport caller fields, stripping of hop-by-hop/proxy headers while preserving Authorization/Content-Type, and preservation of signal/timeout/rejectUnauthorized. Includes pre-fix bisect control asserting Node protocol validation error.
E2E test for proxy rewrite (in-process HTTPS upstream)
test/http-proxy-fix-e2e.test.ts
Add Vitest E2E that spawns an ephemeral HTTPS upstream (self-signed cert via openssl), sets proxy env, loads the wrapper, issues an HTTP-proxy-shaped request, and asserts upstream saw rewritten Host: localhost:<port>, preserved Authorization, stripped proxy-hop headers, correct path /v1/openai/chat/completions, and a 200 response. Skips locally if openssl missing; fails in CI if missing.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client (proxy request)
    participant Proxy as Local Proxy Wrapper\n(nemoclaw http.request)
    participant Upstream as Target HTTPS Upstream

    rect rgba(230,240,255,0.5)
    Client->>Proxy: HTTP request with proxy-style URL (http://proxy/...https://target/...)
    end

    rect rgba(240,255,230,0.5)
    Proxy->>Proxy: Detect FORWARD-mode path -> parse https://target URL\nSanitize headers (remove Host, Proxy-*, Connection tokens, hop-by-hop)\nStrip caller TLS/agent fields (agent, auth, servername, etc.)
    end

    rect rgba(255,240,230,0.5)
    Proxy->>Upstream: Issue rewritten https.request with target host, port, path, sanitized headers
    Upstream-->>Proxy: HTTPS response (200 / body)
    Proxy-->>Client: Forward response back to client
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop and scrub headers, quiet and spry,
Tokens tucked safely, prompts kept shy.
Replies parsed in JSON, forty-two in sight,
Proxies now honest, routing feels right.
Hooray for clean hops — a rabbit's delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main fix: stripping forward-proxy-specific fields when rewriting http.request to https.request in the sandbox proxy wrapper.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
test/sandbox-provisioning.test.ts (1)

78-85: Tighten the invariant to reduce false positives.

The current toContain check can still pass if the same literal is added in an unrelated block. Consider asserting ordering/context near the provider config too.

Suggested test hardening
   it("openclaw.json provider config opts into allowPrivateNetwork so inference.local resolves", () => {
@@
-    expect(src).toContain("'request': {'allowPrivateNetwork': True}");
+    const baseUrlIdx = src.indexOf("'baseUrl': inference_base_url");
+    const requestIdx = src.indexOf("'request': {'allowPrivateNetwork': True}");
+    expect(baseUrlIdx).toBeGreaterThanOrEqual(0);
+    expect(requestIdx).toBeGreaterThan(baseUrlIdx);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/sandbox-provisioning.test.ts` around lines 78 - 85, The test currently
uses a loose expect(src).toContain("'request': {'allowPrivateNetwork': True}")
which can match unrelated code; tighten it by asserting the allowPrivateNetwork
literal appears in the openclaw provider config context — for example, locate
the provider identifier string used in the test (e.g., "openclaw" or
"openclaw.json") and assert that the substring "'request':
{'allowPrivateNetwork': True}" appears after it (or use a single regex that
matches the provider block and the request key together, e.g.,
/openclaw(?:\.json)?[\s\S]*'request':\s*{\s*'allowPrivateNetwork':\s*True\s*}/)
so the check targets the provider config rather than any unrelated occurrence.


📥 Commits

Reviewing files that changed from the base of the PR and between f439189 and 7300c3a.

📒 Files selected for processing (2)
  • Dockerfile
  • test/sandbox-provisioning.test.ts

Address CodeRabbit nitpick on PR #2490: replace the bare `toContain` with
an ordered-index check that pins the literal between `'baseUrl': inference_base_url`
and `'models': [{**({'compat'` — the python that builds the providers
dict. A bare match could pass on any unrelated occurrence elsewhere in
the Dockerfile; the ordered indices prove the opt-in lives inside the
provider config block.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…tokens

The "live inference" assertions in cloud-e2e (test-full-e2e.sh) and the
Hermes e2e were curl-from-sandbox checks. They prove OpenShell's DNS
forwarder + proxy can route inference.local; they never invoke openclaw's
HTTP client and never reach openclaw's SSRF guard. That is why every
openclaw 4.9 nightly-e2e run on PR #2464 reported [LIVE] Sandbox inference:
PASS while real users were getting SsrFBlockedError on the same release.

Changes:

* test-full-e2e.sh: relabel Phase 4b from [LIVE] to [ROUTING] with a comment
  pointing at #2490; add Phase 4c, an actual openclaw-mediated turn that
  runs `openclaw agent --json` over SSH, parses result.payloads[].text, and
  asserts the model produced "42" for "What is 6 multiplied by 7?". The
  expected token is not a substring of the prompt, --json routes logs to
  stderr, stderr is dropped — so prompt-echo on an error path cannot
  satisfy the grep.
* test-hermes-e2e.sh: same relabel for the equivalent curl assertion.
* test-sandbox-operations.sh TC-SBX-02: replace `Say exactly: HELLO_E2E`
  prompt + grep on merged stdout/stderr with the same arithmetic-via-JSON
  pattern. The previous assertion would match the prompt itself in any
  error path that quoted it back, including the openclaw 4.9 SSRF
  rejection — false positive that hid the regression for the entire
  4.2 → 4.7 → 4.8 → 4.9 bump series.
* verify-sandbox-skill-via-agent.sh: stop embedding ${VERIFY_TOKEN} in the
  prompt (the agent must read it from SKILL.md — that is the test). Add
  a guard that refuses SKILL_VERIFY_PROMPT overrides which smuggle the
  token back in, and a negative assertion on SsrFBlockedError, transport
  errors, and gateway-unavailable markers before the positive grep.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa changed the title from "fix(sandbox): allow private-network for inference provider on openclaw 4.9" to "fix(sandbox): make openclaw 4.9 inference work and add tests that prove it" on Apr 26, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
test/e2e/test-full-e2e.sh (1)

397-415: Consider extracting the JSON payload parser to a shared function.

The Python snippet for parsing result.payloads[].text from the openclaw agent JSON envelope is duplicated in test-sandbox-operations.sh (lines 297-309). If more tests adopt this pattern, consider extracting it to test/e2e/lib/ as a shared helper.

Not blocking since there are only two occurrences currently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-full-e2e.sh` around lines 397 - 415, Extract the duplicated
Python JSON parsing snippet that builds agent_reply from agent_response into a
shared helper (e.g., create a script or shell function named
parse_agent_payloads or parse_agent_reply in test/e2e/lib/), replace the inline
Python block in both test-full-e2e.sh and test-sandbox-operations.sh with a call
to that helper, and ensure the helper accepts agent_response (or reads stdin)
and returns the concatenated payload texts so the existing grep/if logic around
agent_reply remains unchanged.
test/e2e/test-sandbox-operations.sh (1)

297-315: Consider logging when JSON parsing fails.

The Python parser silently exits with code 0 when JSON parsing fails (line 302), which causes $reply to be empty. While the test will still fail (empty string won't match "42"), the failure message won't distinguish between "no JSON output" and "JSON parsed but no matching payloads."

This is acceptable for now since the raw stdout is included in the failure message, but a debug log could help triage.

🔧 Optional: Add parse failure indicator
 reply=$(echo "$raw" | python3 -c "
 import json, sys
 try:
     doc = json.load(sys.stdin)
 except Exception:
+    print('PARSE_FAILED', file=sys.stderr)
     sys.exit(0)
 result = doc.get('result') or {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-sandbox-operations.sh` around lines 297 - 315, The test's
Python JSON extractor (the python3 one-liner that populates the reply variable
from raw) currently swallows JSON parse errors and exits silently; change that
python block (the invocation that sets reply) to catch the JSON parsing
exception and emit a clear parse-failure marker (for example print a sentinel
like "JSON_PARSE_ERROR" to stdout or stderr) so the shell can distinguish parse
failures from empty payloads, and update the surrounding shell check (the if
that inspects $reply and grep for 42) to detect this marker and produce a more
specific fail message referencing reply/raw when parsing failed.


📥 Commits

Reviewing files that changed from the base of the PR and between 856f984 and ca03324.

📒 Files selected for processing (4)
  • test/e2e/e2e-cloud-experimental/features/skill/verify-sandbox-skill-via-agent.sh
  • test/e2e/test-full-e2e.sh
  • test/e2e/test-hermes-e2e.sh
  • test/e2e/test-sandbox-operations.sh

Contributor Author

Adversarial self-review

I ran a deliberate red-team pass on this change before requesting review. Posting the findings here so reviewers don't have to redo the work and so the open follow-ups are tracked. Cross-referenced openclaw HEAD at /Users/aerickson/Documents/NemoDev/openclaw-4-9-compat/openclaw for the SSRF and provider-config code paths.

Blockers

None. Schema match confirmed at openclaw src/config/zod-schema.core.ts:362 (strict, accepts request.allowPrivateNetwork); per-provider scoping confirmed at src/agents/provider-request-config.ts:601-664 and src/agents/provider-transport-fetch.ts:153,196; allowPrivateNetwork only flows into the per-fetch policy when the configured model's transport carries it. No global disable.

Concerns (worth tracking, not blocking)

1. Cross-origin redirect blast radius. openclaw/src/infra/net/fetch-guard.ts:333-457 runs the redirect loop with the same closed-over params.policy for every hop. If a compromised or misconfigured inference.local ever returns a 302 to 169.254.169.254, 127.0.0.1, 192.168.0.1, or any other private IP, the SSRF guard will accept it because we set allowPrivateNetwork: true up-front. Mitigation today: OpenShell's network policy governs real egress, making this a defense-in-depth concern rather than an exploitable one. Better fix when upstream supports it: switch from allowPrivateNetwork: true to allowedHostnames: ['inference.local']. The internal SsrFPolicy already supports the latter (openclaw/src/infra/net/ssrf.ts:43-44,86); ConfiguredModelProviderRequest (openclaw/src/config/types.provider-request.ts:38-47) does not yet expose it. Tracked as a follow-up — I'll file the upstream openclaw issue separately.

2. Upstream regression magnet. openclaw/CHANGELOG.md:1324 records that models.providers.*.request.allowPrivateNetwork already broke between v2026.4.13 and v2026.4.14 for STT and had to be patched back. The same key has been touched in #71723 (Google), #66692 (audio), #58529 (Mattermost), #63671 (introduction). NemoClaw is pinned at Dockerfile.base:155 OPENCLAW_VERSION=2026.4.9; the Dockerfile-text test in this PR cannot detect a runtime regression on bump. Hard prerequisite for any future openclaw bump: the new Phase 4c assertion in test/e2e/test-full-e2e.sh must pass on cloud-e2e. That's the only test in the suite that proves openclaw can complete a turn against inference.local.

3. Test brittleness/laxity. The static check in test/sandbox-provisioning.test.ts:75-95 matches the literal string 'request': {'allowPrivateNetwork': True} with anchored ordering. It will fail on plausible benign refactors — Truetrue (canonical Python here, but a json.dumps-based rewrite would emit true), single→double quote reformatting, or renaming inference_base_url. Conversely, if a future maintainer moves the literal into a different provider stanza later in the file, the index-ordering check still passes because indexOf returns the first hit. A behavioral test that loads the rendered JSON and asserts config.models.providers[<key>].request.allowPrivateNetwork === true would be both stricter and more robust. Phase 4c covers the runtime case, so this is a follow-up rather than a blocker.

4. Provider-key fan-out fragility. src/lib/onboard.ts:1414-1466 shows every branch of getSandboxInferenceConfig() happens to point inferenceBaseUrl at https://inference.local{...}, so the single request stanza in the Dockerfile covers all current onboard paths. If a future onboarding branch uses a different inferenceBaseUrl ending in .local/.internal, the Dockerfile fix silently no-ops for it. Worth a comment in getSandboxInferenceConfig() warning that any new branch must revisit the Dockerfile providers dict — small follow-up.

Notes

  • Schema confirmed. request.allowPrivateNetwork is at openclaw/src/config/zod-schema.core.ts:362 (strict object, not nested under request.transport or similar).
  • Runtime config tampering: low risk. The second python block at Dockerfile:371-377 mutates the same ~/.openclaw/openclaw.json to inject the gateway token, then chmods 0600; a chown to root and Landlock-lock follow. The request key survives because json.load/json.dump round-trip preserves it. openclaw doctor --fix runs at Dockerfile:363 before the second python block — it would reject any unknown key under strict zod, but request is a known key.
  • Docker quoting. The new line 'request': {'allowPrivateNetwork': True}, \ uses the same single-quote-inside-double-quote-heredoc pattern as every other line in the RUN python3 -c "..." block. No new escaping fragility.
  • Public posture clean. Diff, commit messages (7300c3ab, 856f9841, ca03324f), and PR body contain no internal bug-tracker IDs, no internal URLs, no AI attribution. PR body accurately characterizes the scope ("scoped to that one provider") which matches provider-request-config.ts:662.
  • End-to-end validation in flight. Manually dispatched nightly-e2e.yaml on this branch (run 24969506775). cloud-e2e already passed with the new Phase 4c line: PASS: [LIVE] openclaw agent: model answered 6×7=42 through openclaw → inference.local (job log). sandbox-operations-e2e cross-check (TC-SBX-02) still in progress.

Follow-ups to file after merge

  • Behavioral test that loads the rendered openclaw.json from a built image and asserts the key on the parsed object.
  • Comment in getSandboxInferenceConfig() warning about Dockerfile coupling.
  • Upstream openclaw issue: expose allowedHostnames/hostnameAllowlist on ConfiguredModelProviderRequest. Switching to allowedHostnames: ['inference.local'] would close concern #1.

sandbox_exec_for() merges stderr into stdout (2>&1 at line 95) so it can
log non-zero exits — but that pollutes the stdout `python3 json.load()`
needs to parse, and node-side warnings (UNDICI-EHPA, etc.) from openclaw
agent always land on stderr. Result on the first nightly-e2e dispatch:
TC-SBX-02 captured `(node:1044) [UNDICI-EHPA] Warning: ...` ahead of any
JSON, json.load raised, the except branch produced reply='', and the test
failed with `expected '42' in agent reply, got: ''`.

cloud-e2e Phase 4c didn't have this problem because it uses a direct ssh
invocation with `2>/dev/null` at the source. Mirror that here: open the
ssh-config in the helper, call ssh directly, drop stderr at the wire.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Follow-up to #2344 (NemoClaw 0.0.24). The http.request → https.request
rewrite that resolves the NODE_USE_ENV_PROXY=1 / library-side proxy
double-processing was shallow-copying the caller's options into the
rewritten https.request. That dragged three classes of forward-proxy-hop
fields onto a request that now goes direct to the upstream:

  * options.agent — a forward-proxy http.Agent. http.Agent cannot speak
    TLS, so https.request either ignores it and falls back to a default
    agent or fails the TLS handshake outright, depending on the caller. This
    is the most likely root cause of the deepinfra-style report on
    Discord ("LLM request failed: network connection error" against a
    custom OpenAI-compatible upstream while NVIDIA-routed providers
    keep working). NVIDIA Endpoints' DNS-rewritten path through OpenShell
    doesn't end up in this branch, so the bug stayed invisible.

  * options.auth — basic-auth meant for the proxy hop. Leaving it on
    https.request would Basic-auth the *target* server with proxy
    credentials.

  * Host / Proxy-Authorization / Proxy-Connection / Proxy-Authenticate
    headers — Host points at the proxy host and the Proxy-* headers are
    hop-by-hop. Stripping them lets Node regenerate Host from the
    target's hostname/port and prevents proxy creds from leaking
    upstream.
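The three strips above can be sketched as follows. This is an illustrative shape, not the canonical nemoclaw-blueprint/scripts/http-proxy-fix.js source; the function name and structure are made up for the sketch:

```javascript
// Sketch: strip forward-proxy-hop fields from shallow-copied options
// before handing them to https.request for a direct connection.
function stripForwardProxyFields(options) {
  const rewritten = { ...options, headers: { ...(options.headers || {}) } };

  // A forward-proxy http.Agent cannot speak TLS; let https.request
  // pick its own default agent for the direct connection.
  delete rewritten.agent;

  // Basic-auth meant for the proxy hop must not Basic-auth the target.
  delete rewritten.auth;

  // Host points at the proxy; Proxy-* headers are hop-by-hop.
  for (const name of Object.keys(rewritten.headers)) {
    const lower = name.toLowerCase();
    if (
      lower === "host" ||
      lower === "proxy-authorization" ||
      lower === "proxy-connection" ||
      lower === "proxy-authenticate"
    ) {
      delete rewritten.headers[name];
    }
  }
  return rewritten;
}
```

Target-scoped fields (Authorization, Content-Type, timeouts, TLS options) pass through untouched, which is what the seven rewrite-test cases pin.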

Adds test/http-proxy-fix-rewrite.test.ts with seven cases pinning each
strip and confirming signal/timeout/TLS fields and target headers
(Authorization, Content-Type, …) survive. The existing
http-proxy-fix-sync.test.ts byte-for-byte enforcer still passes — the
canonical wrapper at nemoclaw-blueprint/scripts/http-proxy-fix.js and
the heredoc in scripts/nemoclaw-start.sh were updated together.

Reverts the openclaw `request.allowPrivateNetwork` Dockerfile change
that this PR previously carried — empirical investigation showed
openclaw 2026.4.9 rejects that key in strict zod and `openclaw doctor
--fix` (which the Dockerfile runs at build time) silently strips it
back to `"request": {}` before the image ships. The upstream plumbing
that reads `request.allowPrivateNetwork` only landed in 2026.4.10
(commit 0808dd111c, openclaw PR #63671). The change here was a no-op
that diagnosed the wrong layer; this commit replaces it with the actual
fix at the wrapper layer.

The label-rename + arithmetic-via-`--json` test improvements added
earlier on this branch (Phase 4c, TC-SBX-02 rewrite, skill-verify
hardening, sandbox_exec stderr-merge fix) are kept — they catch
regressions independent of the proxy bug.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa changed the title fix(sandbox): make openclaw 4.9 inference work and add tests that prove it fix(sandbox): strip forward-proxy fields when rewriting to https.request Apr 27, 2026
Spins up a local HTTPS mock on 127.0.0.1 with a self-signed cert generated
via openssl, simulates the FORWARD-mode http.request shape that axios +
HTTPS_PROXY produces (forward-proxy http.Agent attached, proxy basic-auth,
Host pointing at the proxy, Proxy-* headers), and asserts an
OpenAI-compatible chat-completions round trip actually completes.

Bisect proof:

  Wrapper without the strip (parent commit reverted):
    × test fails with `TypeError: Protocol "https:" not supported.
      Expected "http:"` — Node detects the forward-proxy http.Agent
      riding into https.request. Same root cause as the deepinfra
      "LLM request failed: network connection error" symptom.

  Wrapper with the strip (this PR):
    ✓ test passes; mock receives POST /v1/openai/chat/completions with
      Authorization: Bearer real-deepinfra-token preserved, Host:
      localhost:<port> regenerated by Node, Proxy-* headers absent,
      and the request body intact. Caller gets the mock's PONG reply.

Skipped if openssl is not on PATH; CI runners (ubuntu-latest, macOS) have
it. Cert is generated fresh per test run with 1-day expiry so there is no
long-lived crypto material in the repo.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…sport hints

Adversarial review caught three blockers and seven concerns on the prior
commit. This addresses all of them:

Wrapper (nemoclaw-blueprint/scripts/http-proxy-fix.js + heredoc in
scripts/nemoclaw-start.sh, kept byte-for-byte synced):

  * sanitizeHeaders strips the full RFC 7230 §6.1 hop-by-hop set
    (Connection, Keep-Alive, Proxy-Authorization, TE, Trailer,
    Transfer-Encoding, Upgrade) plus Host, Proxy-Connection, and
    Proxy-Authenticate. Also walks the Connection header and strips
    every token named there (transitive hop-by-hop). The prior commit
    only covered Host + Proxy-*, leaving Connection: close from the
    proxy hop riding into a direct-to-target request and defeating
    keep-alive on the rewrite. (B1)
  * Proxy-Authenticate is response-only per RFC 7235 §4.3 — comment
    block updated to call this out so the next reviewer doesn't repeat
    the question. Kept in the strip set as belt-and-suspenders for
    clients that echo response headers into retry-request options. (B2)
  * Strip rewritten.servername and rewritten.checkServerIdentity — TLS
    SNI and identity-check pre-computed for the proxy hop must not
    survive into the direct-to-target handshake. (C2)
  * Strip rewritten.socketPath — Unix-socket proxies (cntlm-style)
    exist; routing TLS bytes into the proxy socket would defeat the
    rewrite. (C2)
  * Strip rewritten.localAddress, lookup, family, hints —
    source-binding and DNS hints picked for reachability to the proxy
    don't apply to a different upstream. (C3)
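A sketch of the resulting header sanitization, including the transitive Connection-token walk described in the first bullet (illustrative only, not the canonical wrapper source):

```javascript
// Fixed strip set: RFC 7230 §6.1 hop-by-hop headers plus Host and the
// Proxy-* fields, all compared case-insensitively.
const FIXED_STRIP = new Set([
  "host", "connection", "keep-alive", "proxy-authorization",
  "proxy-authenticate", "proxy-connection", "te", "trailer",
  "transfer-encoding", "upgrade",
]);

function sanitizeHeaders(headers = {}) {
  const strip = new Set(FIXED_STRIP);

  // Transitive hop-by-hop: the Connection header can name additional
  // headers that are hop-scoped, e.g. "Connection: close, X-Proxy-Hint".
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "connection" && typeof value === "string") {
      for (const token of value.split(",")) {
        strip.add(token.trim().toLowerCase());
      }
    }
  }

  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!strip.has(name.toLowerCase())) out[name] = value;
  }
  return out;
}
```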

Tests:

  * test/http-proxy-fix-rewrite.test.ts: new cases pin every new strip
    (RFC 7230 §6.1 hop-by-hop including transitive Connection-named
    tokens; servername/checkServerIdentity; socketPath/localAddress/
    lookup/family/hints). Adds a separate "control" describe block that
    constructs the broken pre-fix rewrite shape inline and asserts
    https.request synchronously rejects it on Node 22. The bisect now
    lives in the repo, not in PR description prose. (B3)
  * test/http-proxy-fix-e2e.test.ts: switches from raw process.env
    mutation to vi.stubEnv (matches the rewrite suite, prevents env
    leaks across test files in the same vitest worker). Saves and
    restores the original http.request in beforeEach/afterEach so a
    failed test does not leave the wrapped function chained into the
    next test file. (C4, C5)
  * Cert expiry: 1-day → 7-day so a long-running `vitest --watch`
    session does not outrun it. (C6)
  * When CI=true and openssl is missing, throws at module load instead
    of silent-skipping. CI runners ship openssl; a missing-binary skip
    would be a false green. (C7)

Run-time totals: 29 cases across rewrite (12) + sync (6) + e2e (1) +
sandbox-provisioning (10) — all pass; 0 regressions.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
test/http-proxy-fix-rewrite.test.ts (1)

249-260: Correct assertion for explicit port, but type inconsistency worth noting.

The test correctly expects "8443" as a string because URL.port returns a string for explicit ports. The earlier test at line 97 expects 443 as a number because the production code uses target.port || 443, where the empty string falls back to the numeric default.

This is correct behavior but the mixed string/number type could confuse future maintainers. Consider adding a brief comment.

📝 Optional: Add clarifying comment
   it("uses the explicit target port when one is present in the URL", () => {
     http.request({
       hostname: PROXY_HOST,
       port: 3128,
       path: "https://internal.example.com:8443/v1/x",
       headers: {},
     });

     expect(captured).not.toBeNull();
+    // URL.port returns a string for explicit ports; the production code's
+    // `target.port || 443` preserves the string type here (vs. numeric 443 fallback).
     expect(captured?.port).toBe("8443");
     expect(captured?.hostname).toBe("internal.example.com");
   });
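For context, the string-vs-number split is observable directly with Node's WHATWG URL (a minimal sketch, independent of the production code):

```javascript
// URL.port is a string for explicit ports and "" when the port is
// implied by the scheme, so `target.port || 443` yields either the
// string port or the numeric fallback.
const explicit = new URL("https://internal.example.com:8443/v1/x");
const implied = new URL("https://internal.example.com/v1/x");

console.log(explicit.port || 443, typeof (explicit.port || 443)); // → 8443 string
console.log(implied.port || 443, typeof (implied.port || 443));   // → 443 number
```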
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/http-proxy-fix-rewrite.test.ts` around lines 249 - 260, The test "uses
the explicit target port when one is present in the URL" asserts captured?.port
=== "8443" which is correct because URL.port returns a string while other tests
expect numeric 443 due to production code using target.port || 443; add a
one-line comment above this test (or above the assertion) explaining that
explicit URL ports are string-typed via URL.port and that the code falls back to
numeric 443 when URL.port is empty to avoid future confusion referencing the
captured variable and the test name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e060c472-d919-4006-9f36-5474c932e073

📥 Commits

Reviewing files that changed from the base of the PR and between cade7f1 and 69115de.

📒 Files selected for processing (4)
  • nemoclaw-blueprint/scripts/http-proxy-fix.js
  • scripts/nemoclaw-start.sh
  • test/http-proxy-fix-e2e.test.ts
  • test/http-proxy-fix-rewrite.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • nemoclaw-blueprint/scripts/http-proxy-fix.js
  • scripts/nemoclaw-start.sh

@ericksoa ericksoa merged commit 8c20905 into main Apr 27, 2026
55 of 58 checks passed
@ericksoa ericksoa deleted the fix/openclaw-4-9-inference-ssrf branch April 27, 2026 02:43
@miyoungc miyoungc mentioned this pull request Apr 28, 2026
13 tasks
miyoungc added a commit that referenced this pull request Apr 28, 2026
## Summary
Refreshes user-facing docs for the last 24 hours of merged NemoClaw
history and bumps the docs metadata to 0.0.29, the next version after
v0.0.28. The updates are limited to behavior supported by merged PR
descriptions and diffs.

## Changes
- `docs/reference/commands.md`: documented `nemoclaw <name> policy-add
--from-file` and `--from-dir`, including custom preset review guidance,
from #2077 / commit `7720b175`.
- `docs/deployment/deploy-to-remote-gpu.md`: clarified that non-loopback
`CHAT_UI_URL` disables OpenClaw device pairing for remote browser-only
deployments, from #2449 / commit `f5ee8a4d`.
- `docs/inference/inference-options.md`: documented provider-aware
credential retry validation and the NVIDIA-only `nvapi-` prefix check,
from #2389 / commit `6f7f0c6d`.
- `docs/inference/switch-inference-providers.md`: documented
`NEMOCLAW_INFERENCE_INPUTS` for text/image-capable model metadata baked
into `openclaw.json`, from #2441 / commit `f4391892`.
- `docs/reference/troubleshooting.md`: added the Git certificate
verification entry for proxy CA propagation through `GIT_SSL_CAINFO`,
`GIT_SSL_CAPATH`, `CURL_CA_BUNDLE`, and `REQUESTS_CA_BUNDLE`, from #2345
/ commit `fa0dc1ab`.
- `docs/versions1.json` and `docs/project.json`: promoted docs version
`0.0.29`; `docs/versions1.json` omits unpublished `0.0.26`, `0.0.27`,
and `0.0.28` entries.
- `.agents/skills/nemoclaw-user-*`: regenerated derived user skill
references from the updated docs.
- Reviewed with no extra doc changes: #2575 / `d392ec07`, #2565 /
`a3231049`, #1965 / `db1ef3ca`, #1990 / `db665834`, #2495 / `7da86fa3`,
#2496 / `3192f4f4`, #2490 / `8c209058`, #2487 / `1f615e2f`, #2483 /
`5653d33a`, #2482 / `31c782c0`, #2464 / `23bb5703`, #2472 / `a54f9a34`,
and #2437 / `6bc860d7`.
- Skipped per docs policy: #2420 / `7b76df6b` touched the experimental
sandbox config path listed in `docs/.docs-skip`; #2466 / `cc15689c`
touched a skipped term and CI-only sandbox image files.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
<!-- Check each item you ran and confirmed. Leave unchecked items you
skipped. -->
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes — failed locally in installer-integration tests
and one onboard helper timeout; the doc-scoped hook test projects passed
under `prek`.
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only) — build
succeeded, but local Sphinx emitted the existing version-switcher file
read message.
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure
<!-- If an AI agent authored or co-authored this PR, check the box and
name the tool. Remove this section for fully human-authored PRs. -->
- [x] AI-assisted — tool: Codex

---
<!-- DCO sign-off required by CI. Run: git config user.name && git
config user.email -->
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Support for custom YAML presets in policy configuration via
--from-file and --from-dir.
* New build-time inference input option to declare accepted modalities
(text or text,image).

* **Improvements**
* Credential validation now offers interactive recovery: re-enter key,
retry, choose another provider, or exit.
* Clarified provider-specific API key prefix handling (nvapi- only
applies to NVIDIA keys).

* **Documentation**
  * TLS certificate troubleshooting for inspected networks.
* Clarified remote dashboard security/device-pairing behavior; command
docs updated; docs version bumped.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>