
feat(harness): web::fetch worker with SSRF-hardened HTTP client#202

Merged
andersonleal merged 3 commits into main from feat/harness-web-fetch-worker on May 29, 2026


@andersonleal
Collaborator

Summary

Adds a standalone harness worker exposing web::fetch so the agent can fetch URLs through a structured, server-guarded envelope instead of reaching for shell::exec + curl. Scoped entirely to harness/src/web/ + harness/tests/web/ — no changes to existing workers.

What it does

web::fetch takes { url, method?, headers?, body?, json?, timeout_ms?, max_bytes?, follow_redirects?, response_format? } and returns { ok, status, headers, body, … } (or { ok:false, error, message }), with size/timeout caps and SSRF protection enforced server-side.
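A minimal sketch of the envelope as TypeScript types, to make the two response shapes easier to handle. The field names come from the description above; the field types, the `summarize` helper, and the example error code are assumptions for illustration and are not taken from schemas.ts.

```typescript
// Field names follow the PR description; the concrete types are assumptions.
type ResponseFormat = "text" | "base64" | "json";

interface WebFetchRequest {
  url: string; // the only required field
  method?: string;
  headers?: Record<string, string>;
  body?: string;
  json?: unknown;
  timeout_ms?: number;
  max_bytes?: number;
  follow_redirects?: boolean;
  response_format?: ResponseFormat;
}

interface WebFetchSuccess {
  ok: true;
  status: number;
  headers: Record<string, string>;
  body: unknown;
  bytes_truncated?: boolean;
}

interface WebFetchFailure {
  ok: false;
  error: string;
  message: string;
}

type WebFetchResult = WebFetchSuccess | WebFetchFailure;

// Hypothetical helper: branch on `ok` first, then read the fields that
// exist on that branch of the union.
function summarize(result: WebFetchResult): string {
  if (result.ok === false) {
    return `fetch failed (${result.error}): ${result.message}`;
  }
  const suffix = result.bytes_truncated ? " (truncated)" : "";
  return `HTTP ${result.status}${suffix}`;
}

// The smallest valid request under these assumed types.
const minimal: WebFetchRequest = { url: "https://example.com/" };
```

Modelling the result as a discriminated union on `ok` lets the compiler reject reads of `status` on the failure branch, where the envelope carries only `error` and `message`.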

Security model (ssrf.ts + fetch.ts)

  • Resolve-once / validate-all / pin-to-IP: DNS is resolved once; every resolved address is checked against the blocklist; the request is then dialed against the validated IP (pinned lookup + TLS servername) so there's no DNS-rebinding window between check and connect.
  • Blocklist: private (RFC1918), link-local, loopback (allowed by default for dev, configurable), and cloud-metadata ranges — including ::ffff:-mapped IPv4 in both dotted and hex forms.
  • Redirects: each hop is re-validated against the blocklist; Authorization/Cookie are stripped on cross-host redirects and on any https → http downgrade.
  • Bounded: byte-capped response reader (bytes_truncated flag) and per-request timeout.
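The blocklist check described above can be sketched as a pure function. This is an illustration, not the contents of ssrf.ts: the function names, the fail-closed behavior on unparseable input, and the IPv6 link-local and unique-local checks are assumptions. It handles only the compressed `::ffff:` spelling of mapped addresses.

```typescript
// Parse a dotted-quad IPv4 string into four octets, or null if malformed.
function parseIPv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// Extract the embedded IPv4 from "::ffff:10.0.0.1" (dotted) or
// "::ffff:a00:1" (hex). Returns null if the tail is malformed.
function unmapIPv4(ip: string): string | null {
  const tail = ip.slice("::ffff:".length);
  if (tail.includes(".")) return parseIPv4(tail) ? tail : null;
  const hex = /^([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(tail);
  if (!hex) return null;
  const hi = parseInt(hex[1], 16);
  const lo = parseInt(hex[2], 16);
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join(".");
}

function isBlockedIPv4(ip: string, allowLoopback: boolean): boolean {
  const o = parseIPv4(ip);
  if (!o) return true; // fail closed on anything unparseable (assumption)
  const [a, b] = o;
  if (a === 127) return !allowLoopback; // loopback 127.0.0.0/8
  if (a === 10) return true; // RFC1918 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC1918 172.16.0.0/12
  if (a === 192 && b === 168) return true; // RFC1918 192.168.0.0/16
  // 169.254.0.0/16 link-local, which also contains the common
  // 169.254.169.254 cloud-metadata address.
  if (a === 169 && b === 254) return true;
  return false;
}

// Loopback is allowed by default, matching the dev default described above.
function isBlockedAddress(ip: string, allowLoopback = true): boolean {
  const lower = ip.toLowerCase();
  if (lower.startsWith("::ffff:")) {
    const mapped = unmapIPv4(lower);
    return mapped === null || isBlockedIPv4(mapped, allowLoopback);
  }
  if (!lower.includes(":")) return isBlockedIPv4(lower, allowLoopback);
  if (lower === "::1") return !allowLoopback; // IPv6 loopback
  if (/^fe[89ab][0-9a-f]:/.test(lower)) return true; // fe80::/10 link-local
  if (/^f[cd][0-9a-f]{2}:/.test(lower)) return true; // fc00::/7 unique local
  return false;
}
```

Under the validate-all rule, a caller would run this over every address returned by the single DNS resolution and refuse the request if any one of them is blocked.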

Shape

  • schemas.ts — zod ingress + zodToJsonSchema export; case-insensitive method; json payload auto-stringify + content-type; text / base64 / json response formats.
  • main.ts + iii.worker.yaml + register.ts — standalone deployable worker registering web::fetch.
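The ingress rules listed for schemas.ts can be sketched without zod as a plain normalizer. Everything here is hypothetical: the function name, the GET default, lower-casing request header names, `json` taking precedence over `body`, and leaving a caller-supplied content-type untouched are assumptions; the PR implements its version with zod.

```typescript
interface RawRequest {
  url: string;
  method?: string;
  headers?: Record<string, string>;
  body?: string;
  json?: unknown;
}

interface NormalizedRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
}

function normalizeRequest(raw: RawRequest): NormalizedRequest {
  // Case-insensitive method: "post" and "POST" normalize to the same value.
  const method = (raw.method ?? "GET").toUpperCase();

  // Lower-case header names so the content-type check below is reliable.
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(raw.headers ?? {})) {
    headers[name.toLowerCase()] = value;
  }

  const out: NormalizedRequest = { url: raw.url, method, headers };

  // GET and HEAD carry no payload, so body/json are ignored for them.
  if (method === "GET" || method === "HEAD") return out;

  if (raw.json !== undefined) {
    // Auto-stringify the json payload and set a content-type if none given.
    out.body = JSON.stringify(raw.json);
    if (!("content-type" in headers)) {
      headers["content-type"] = "application/json";
    }
  } else if (raw.body !== undefined) {
    out.body = raw.body;
  }
  return out;
}
```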

Test plan

  • tsc -b clean
  • vitest run tests/web/: 62/62 pass. Covers ssrf unit (mapped-IPv4 hex/dotted, all ranges), fetch guard + helper surface (stripCrossOriginAuth, readIncomingCapped), handler, and a real http.createServer loopback integration suite (IP pinning, per-hop re-validation, byte cap, timeout, POST/JSON round-trip)
  • biome check (2.4.10) clean
  • Manual: live https:// smoke (servername/cert-identity path is correct-by-construction + integration-tested over plaintext loopback; a trusted-cert HTTPS server isn't feasible in unit tests)
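For readers who want a concrete picture of the redirect header rule that the helper tests cover, here is a minimal sketch. The PR's helper is named stripCrossOriginAuth, but its signature is not shown on this page, so `stripAuthOnRedirect` and its parameters are hypothetical. Comparing `URL.host` (hostname plus port) as the meaning of "cross-host" is also an assumption.

```typescript
// Return a copy of `headers` that is safe to send to the redirect target.
// Authorization and Cookie are dropped when the hop changes host or
// downgrades from https to http; otherwise headers pass through unchanged.
function stripAuthOnRedirect(
  headers: Record<string, string>,
  fromUrl: string,
  toUrl: string,
): Record<string, string> {
  const from = new URL(fromUrl);
  const to = new URL(toUrl);
  const crossHost = from.host !== to.host;
  const downgrade = from.protocol === "https:" && to.protocol === "http:";
  if (!crossHost && !downgrade) return { ...headers };

  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (lower === "authorization" || lower === "cookie") continue;
    out[name] = value;
  }
  return out;
}
```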

Notes

  • Not yet wired into the combined harness/src/index.ts — it's a standalone worker (own main.ts + manifest), runnable via node dist/web/main.js. Wiring it into the bundled harness can be a follow-up if desired.

A standalone harness worker exposing `web::fetch` so agents fetch URLs
through a structured, guarded envelope instead of reaching for `shell::exec`
+ curl.

- SSRF guard (ssrf.ts): resolve-once / validate-all / pin-to-IP; blocks
  private, loopback (configurable), link-local, and cloud-metadata ranges,
  including `::ffff:`-mapped IPv4 in both dotted and hex forms; each redirect
  hop is re-validated against the resolved IP.
- Transport (fetch.ts): node:http/https with a pinned DNS lookup +
  servername so the validated IP is the one actually dialed (no DNS-rebind
  window); strips Authorization/Cookie on cross-host redirects and
  https->http downgrades; byte-capped, timeout-bounded response reader.
- Schemas (schemas.ts): zod ingress + JSON-schema export; case-insensitive
  method; json payload auto-stringify; text/base64/json response formats.
- Standalone worker: main.ts + iii.worker.yaml + register.ts (registers
  `web::fetch`); not yet wired into the combined index.ts.
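The pinned-lookup idea from the transport bullet can be sketched as a function with the same shape as the `lookup` option that Node's `http.request` and `https.request` accept. This is an assumption-laden illustration rather than the code in fetch.ts: `createPinnedLookup` is a hypothetical name, and the local types below stand in for Node's own so the block has no imports.

```typescript
// Local stand-ins for the shapes Node passes to a custom `lookup` function.
type LookupAddress = { address: string; family: number };
type LookupCallback = (
  err: Error | null,
  address: string | LookupAddress[],
  family?: number,
) => void;
type LookupOptions = { all?: boolean } | number | undefined;

// Build a lookup function that ignores the hostname and always answers
// with the already-validated IP, so the socket cannot be pointed at a
// different address by a second DNS query.
function createPinnedLookup(pinnedIp: string) {
  const family = pinnedIp.includes(":") ? 6 : 4;
  return (_hostname: string, options: LookupOptions, callback: LookupCallback): void => {
    const wantsAll = typeof options === "object" && options !== null && options.all === true;
    if (wantsAll) {
      // With `all: true` the callback expects an array of addresses.
      callback(null, [{ address: pinnedIp, family }]);
    } else {
      callback(null, pinnedIp, family);
    }
  };
}

// Intended use (not executed here; a cast to Node's lookup type may be needed):
//   https.request({
//     host: "example.com",
//     servername: "example.com", // TLS still verifies the hostname
//     lookup: createPinnedLookup(validatedIp),
//   });
```

Keeping `servername` set to the original hostname is what lets certificate verification keep working even though the connection is dialed by IP.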

Tests: 62 pass — ssrf unit, fetch guard/helper surface, handler, and a
loopback http.createServer integration suite (pinning, per-hop re-validation,
byte cap, timeout, POST). tsc -b clean; biome (2.4.10) clean.
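To illustrate the byte cap that the integration suite exercises, here is a sketch of a capped reader over any async iterable of byte chunks. The PR's helper is called readIncomingCapped; its signature is not shown here, so `readCapped` and its return shape are assumptions. Only the `bytes_truncated` flag name comes from the description.

```typescript
// Read chunks until the source ends or `maxBytes` is reached.
// When the cap is hit, the body is cut at exactly `maxBytes`, the flag is
// set, and the rest of the stream is left unread.
async function readCapped(
  chunks: AsyncIterable<Uint8Array>,
  maxBytes: number,
): Promise<{ bytes: Uint8Array; bytes_truncated: boolean }> {
  const kept: Uint8Array[] = [];
  let total = 0;
  let truncated = false;

  for await (const chunk of chunks) {
    const room = maxBytes - total;
    if (chunk.length > room) {
      if (room > 0) kept.push(chunk.subarray(0, room));
      total += Math.max(room, 0);
      truncated = true;
      break; // leaving the loop also asks the iterator to stop producing
    }
    kept.push(chunk);
    total += chunk.length;
  }

  // Concatenate the kept chunks into one buffer.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const part of kept) {
    bytes.set(part, offset);
    offset += part.length;
  }
  return { bytes, bytes_truncated: truncated };
}
```

Truncation is reported as a flag on an otherwise successful read rather than as an error, which matches the truncation-vs-error distinction the skill doc describes.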
@vercel

vercel Bot commented May 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
workers Ready Ready Preview, Comment May 29, 2026 1:01pm


@coderabbitai

coderabbitai Bot commented May 29, 2026

Warning

Review limit reached

@andersonleal, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 53 minutes and 33 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7c8a3aba-0200-464c-a027-fc697c8368cb

📥 Commits

Reviewing files that changed from the base of the PR and between 187a56a and 1885aaf.

📒 Files selected for processing (13)
  • harness/src/web/config.ts
  • harness/src/web/fetch.ts
  • harness/src/web/handlers/fetch.ts
  • harness/src/web/iii.worker.yaml
  • harness/src/web/main.ts
  • harness/src/web/register.ts
  • harness/src/web/schemas.ts
  • harness/src/web/skills/index.md
  • harness/src/web/ssrf.ts
  • harness/tests/web/fetch.integration.test.ts
  • harness/tests/web/fetch.test.ts
  • harness/tests/web/handler.test.ts
  • harness/tests/web/ssrf.test.ts

@github-actions
Contributor

github-actions Bot commented May 29, 2026

skill-check — worker

0 verified, 13 skipped (no docs/).

Layer Result
structure
vale
ai
render

Note

17 stale rendered artifact(s) detected on main, unrelated to this PR. This PR is fine; the drift was already there. A maintainer should open a chore PR to re-render these.

  • shell/README.md
  • shell/skill.md
  • shell/skills/chmod.md
  • shell/skills/exec.md
  • shell/skills/exec_bg.md
  • shell/skills/grep.md
  • shell/skills/kill.md
  • shell/skills/list.md
  • shell/skills/ls.md
  • shell/skills/mkdir.md
  • shell/skills/mv.md
  • shell/skills/read.md
  • …and 5 more (see the workflow logs)

Single self-contained index.md skill for the web worker, mirroring the
sandbox skill format: callable id, when-to-use table, live-schema pointer,
the request/response envelope, the json-vs-body and response_format rules,
truncation-vs-error semantics, and the SSRF guard (blocked ranges incl.
::ffff:-mapped IPv4, pin-to-IP, per-hop redirect re-check, cross-origin auth
stripping). Lives at harness/src/web/skills/ alongside the worker source,
same convention as coder/database/shell.

Rewrite for the actual consumer (an agent calling web::fetch on the first
try), applying DX principles to the doc:

- Lead with the minimal call (url is the only required field).
- Add the ok-vs-status rule: HTTP 4xx/5xx are ok:true (a completed fetch),
  ok:false is only fetch-level failure. Fixes the prior misleading
  `status:502` in the ok:false example (executeFetch never sets status on
  errors).
- Add an error -> cause -> fix table for every ok:false code.
- Decision table up top; request fields as a table with defaults + gotchas.
- Document verified behaviors: response header keys are lower-cased,
  set-cookie joined with ', '; GET/HEAD ignore body/json; response_format
  json doesn't parse a truncated body.
- Tighten prose for token budget (system-prompt injection).
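Two of the documented behaviors above lend themselves to a short sketch: the ok-vs-status rule and the response header normalization. Both functions are hypothetical illustrations, and the three category labels are invented for this example; only the rules themselves (4xx/5xx are `ok:true`, header keys lower-cased, set-cookie joined with `', '`) come from the commit message.

```typescript
type FetchResult =
  | { ok: true; status: number }
  | { ok: false; error: string; message: string };

// `ok` says whether the fetch completed; `status` says what the server
// answered. A 502 from the upstream is still a completed fetch.
function classify(result: FetchResult): "success" | "http_error" | "fetch_failure" {
  if (result.ok === false) return "fetch_failure";
  return result.status >= 400 ? "http_error" : "success";
}

// Lower-case header names and join multi-value headers (such as
// set-cookie, which Node exposes as an array) with ", ".
function normalizeHeaders(
  raw: Record<string, string | string[] | undefined>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(raw)) {
    if (value === undefined) continue;
    out[name.toLowerCase()] = Array.isArray(value) ? value.join(", ") : value;
  }
  return out;
}
```

A caller that retries on failure would typically key the retry decision on `classify` rather than on `ok` alone, since an `ok:true` response can still carry a 5xx status.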
@andersonleal andersonleal merged commit 974dde3 into main May 29, 2026
14 checks passed
andersonleal added a commit that referenced this pull request May 29, 2026
The web::fetch worker (#202) shipped its own files and standalone
main.ts but was never wired into src/index.ts, so `pnpm dev:all` and
`start:all` (which run the composite all-in-one process) never started
it. Add the registration so the worker boots with the rest.

Also backfill the missing dev:web / iii-web wiring in package.json, plus
the same gap for provider-llamacpp, and add a regression test asserting
every runnable worker folder (src/*/main.ts) is wired into the composite
manifest, dev scripts, and bin entries.

3 participants