feat(security): SSRF guard for OTel export endpoint (CRITICAL-01) by risjai · Pull Request #134 · agentoptics/rewind

risjai · 2026-04-21T05:57:21Z

Summary

Addresses CRITICAL-01 from the security audit: POST /api/sessions/{id}/export/otel accepted a user-supplied endpoint URL and made outbound HTTP/gRPC requests with no IP-range validation. An attacker with API access (now gated by PR #133's auth, but defense-in-depth) could target cloud metadata endpoints (169.254.169.254), internal services, or loopback.

Also includes the full docs/security-audit.md (first commit to master) with CRITICAL-01, CRITICAL-02, and MEDIUM-09 marked as addressed.

Implementation

New file: `crates/rewind-web/src/url_guard.rs`

A standalone SSRF guard module that:

Parses the URL to extract host and port (minimal parser — no url crate dep)
If the host is an IP literal, validates it directly
If the host is a name, resolves via ToSocketAddrs and validates all resolved IPs
Rejects any IP that falls in a blocked range, returning a user-facing error string

Blocked ranges (comprehensive):

IPv4	IPv6
RFC 1918 (10/8, 172.16/12, 192.168/16)	Loopback (::1)
Loopback (127/8)	Unspecified (::)
Link-local (169.254/16 — cloud metadata)	Multicast (ff00::/8)
Unspecified (0.0.0.0)	Unique-local (fc00::/7)
Multicast (224/4)	Link-local (fe80::/10)
Broadcast (255.255.255.255)	v4-mapped (::ffff:priv — recurses into v4 check)
Documentation (192.0.2/24, 198.51.100/24, 203.0.113/24)	v4-compatible deprecated (::/96)
Shared-address-space / CGNAT (100.64/10)	Documentation (2001:db8::/32)
Benchmarking (198.18/15)

Changed: `crates/rewind-web/src/api.rs`

The old validation:

if !endpoint.starts_with("http://") && !endpoint.starts_with("https://") { ... }

Replaced with:

crate::url_guard::validate_export_endpoint(&export_endpoint)
    .map_err(|e| (StatusCode::BAD_REQUEST, e))?;

This runs before session lookup or any outbound connection attempt.

Updated: `docs/security-audit.md`

CRITICAL-01: marked ✅ fixed with DNS-rebinding caveat documented
CRITICAL-02: marked ✅ fixed (PR feat(auth): fail-closed auth for non-loopback web server binds (CRITICAL-02) #133)
MEDIUM-09: marked ⚠️ partially fixed (WS path closed, loopback-CSRF follow-up)
§5 ship order table: status column added, PRs Fix CI badge to reference master branch #1 and Add CLI auto-bootstrap to pip install #2 marked shipped

Known limitation: DNS rebinding

The guard resolves the hostname once and validates the resulting IPs. The downstream opentelemetry-otlp client may re-resolve at connection time, so a malicious resolver could theoretically rebind between validation and connection. Fully closing this gap requires a pinned resolver, which isn't supported by opentelemetry-otlp's current API. The remaining window is narrow and documented inline in url_guard.rs.

Tests

23 url_guard unit tests covering:

Every blocked IPv4 range (RFC 1918, link-local, loopback, unspecified, multicast, broadcast, documentation, CGNAT, benchmarking)
Every blocked IPv6 range (loopback, unspecified, multicast, unique-local, link-local, v4-mapped, documentation)
Public unicast allowed (v4 + v6)
URL parser: HTTP/HTTPS ports, explicit ports, IPv6 brackets, userinfo stripping, malformed rejection

4 integration tests in api_tests.rs:

test_export_otel_rejects_loopback_endpoint → 400
test_export_otel_rejects_cloud_metadata_endpoint → 400
test_export_otel_rejects_rfc1918_endpoint → 400 (tests 10.x, 192.168.x, 172.16.x)
test_export_otel_returns_404_for_missing_session → updated to use public IP endpoint

Full suite: 370+ tests, 0 failures, clippy clean on touched files, UI build + 113 vitest tests pass.

Test plan

cargo test --workspace — all pass
cargo clippy -p rewind-web --all-targets -- -D warnings — zero errors in url_guard or api_tests
npm run build && npm test — 113/113 UI tests, prod build succeeds
Unit: 23 IP-range category tests cover every blocked range
Integration: loopback, metadata, RFC 1918 all return 400 before session lookup
Existing test_export_otel_returns_404_for_missing_session passes with public endpoint
Existing test_export_otel_no_config_returns_501 still passes

Ship order

#	Fix	Status
1	Fail-closed auth (CRITICAL-02)	✅ PR #133 merged
2	SSRF guard on OTel export (CRITICAL-01)	✅ This PR
3	Blob redaction + hop-by-hop filtering (HIGH-01/06, MEDIUM-06/08)	⏳ Next
4	`query_raw` PRAGMA lockdown (HIGH-02)	⏳ Planned

No version bump — deferred until all 4 security PRs ship.

🤖 Generated with Claude Code

…AL-01) Addresses CRITICAL-01 from the security audit: the POST /api/sessions/{id}/ export/otel endpoint accepted a user-supplied URL and made outbound HTTP/gRPC requests with no IP-range validation. An attacker with API access could target cloud metadata endpoints (169.254.169.254), internal services, or loopback. Changes: - New `crates/rewind-web/src/url_guard.rs`: validates export endpoints by resolving the hostname and rejecting any IP in a blocked range before the outbound connection. Covers: RFC 1918 (10/8, 172.16/12, 192.168/16), link-local (169.254/16, fe80::/10), loopback (127/8, ::1), unspecified, multicast, broadcast, documentation (192.0.2/24 etc, 2001:db8::/32), shared-address-space (100.64/10), benchmarking (198.18/15), unique-local v6 (fc00::/7), and v4-mapped v6 (::ffff:priv). 23 unit tests. - `api.rs::export_otel` now calls `validate_export_endpoint()` before any session lookup or outbound request. Returns 400 with SSRF error message. - 4 new integration tests in api_tests.rs: loopback rejection, cloud metadata rejection, RFC 1918 rejection, existing 404 test updated to use a public endpoint IP. - `docs/security-audit.md` updated: CRITICAL-01 + CRITICAL-02 marked as fixed (PR #133 + #134), MEDIUM-09 marked as partially fixed, ship order table updated with status column. Known limitation: DNS rebinding between validation and the opentelemetry-otlp library's connection-time re-resolution. Documented inline in url_guard.rs. Requires upstream API changes to fully close. This is PR #2 of the 5-PR audit ship order. No version bump (deferred until all 4 security PRs ship per team decision). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-04-21T05:57:26Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
rewind	Ready	Preview, Comment	Apr 21, 2026 7:14am

risjai

Code Review — PR #134: SSRF guard for OTel export (CRITICAL-01)

Verdict: Well-structured guard with strong IP-range coverage. I'd block merge on the octal IP parsing differential and the missing Teredo/6to4 ranges — both are exploitable bypasses. The URL parser also has a class of parser-differential bugs. Details below.

Overview

Adds crates/rewind-web/src/url_guard.rs — a standalone SSRF validation module. The old starts_with("http://") check in api.rs::export_otel is replaced by validate_export_endpoint() which parses the URL, resolves hostnames via ToSocketAddrs, and rejects any resolved IP in a private/reserved range. 23 unit tests, 4 integration tests. The DNS rebinding limitation is documented.

Good scope, good defense-in-depth posture (this runs even though PR #133 now gates the endpoint behind auth).

🔴 Must fix before merge

1. Octal/hex/decimal IP bypass via `to_socket_addrs` parser differential

When host.parse::<IpAddr>() fails (Rust's parser rejects non-standard forms like 0177.0.0.1, 0x7f000001, 2130706433), the code falls through to to_socket_addrs(), which delegates to the platform's getaddrinfo(3). On some libc implementations:

0177.0.0.1 resolves as 177.0.0.1 (octal digits interpreted as decimal) — passes the guard as public, but the downstream HTTP client (hyper via reqwest) may interpret the octets differently via inet_aton, connecting to 127.0.0.1
The safety of the guard depends on getaddrinfo and the HTTP client agreeing on IP interpretation, which is not guaranteed across OSes or library versions

Fix: After host.parse::<IpAddr>() fails, reject any host that looks like a numeric IP (all digits/dots/hex). Only pass through hostnames with at least one alphabetic character:

fn looks_like_numeric_ip(host: &str) -> bool {
    let h = host.strip_prefix("0x").unwrap_or(host);
    !h.is_empty() && h.chars().all(|c| c.is_ascii_hexdigit() || c == '.' || c == 'x' || c == 'X')
}
// In validate_export_endpoint, between the IpAddr::parse and to_socket_addrs:
if looks_like_numeric_ip(&host) {
    return Err(format!("Endpoint host '{host}' looks like a non-standard IP literal (rejected for SSRF safety)"));
}

2. Missing Teredo (2001:0000::/32) and 6to4 (2002::/16) blocks

Both IPv6 transition mechanisms embed a destination IPv4 that a relay/gateway will route to. An attacker can encode 127.0.0.1 or 169.254.169.254 inside a Teredo or 6to4 address:

Teredo: 2001:0000:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx — the embedded IPv4 is in the last 32 bits (obfuscated/inverted)
6to4: 2002:7f00:0001:: encodes 127.0.0.1 directly in bits 16-47

These are exploitable on hosts with Teredo/6to4 enabled (common in mixed-stack environments — exactly the K8s scenario this guard targets).

Fix: Add to is_blocked_v6:

|| is_teredo(ip)    // 2001:0000::/32 — embeds routable IPv4
|| is_6to4(ip)      // 2002::/16 — embeds destination IPv4

fn is_teredo(ip: Ipv6Addr) -> bool {
    let s = ip.segments();
    s[0] == 0x2001 && s[1] == 0x0000
}
fn is_6to4(ip: Ipv6Addr) -> bool {
    ip.segments()[0] == 0x2002
}

Note: is_documentation_v6 blocks 2001:0db8::/32 but not 2001:0000::/32. These are different allocations; Teredo is the dangerous one.

🟡 Should address

3. URL parser does not reject backslash, whitespace, or percent-encoded authority

parse_host_port does not strip or reject \, \t, \n, \r, or %-encoded characters in the authority component. Classic parser-differential attacks:

http://evil.com\@127.0.0.1/ — the guard's rsplit_once('@') extracts 127.0.0.1, checks it (blocked!), but some HTTP clients following WHATWG URL parsing treat \ as a path separator and connect to evil.com
http://evil.com%5C@127.0.0.1/ — percent-encoded variant of the same

In this specific case the guard would correctly reject the loopback IP, so the parser differential is "opposite direction" (guard is more restrictive than the client). But the mismatch still undermines trust in the guard's invariants — a future change could flip the direction.

Fix: Reject any authority containing \, %, or control characters:

if authority.bytes().any(|b| b == b'\\' || b == b'%' || b < 0x20) {
    return None;
}

Or, better yet, add the url crate as a dependency (it's already transitively present via reqwest) and use its WHATWG-compliant parser instead of the hand-rolled one. This eliminates the entire class of parser-differential bugs.

4. `to_socket_addrs` is blocking I/O in an async context

(host.as_str(), port).to_socket_addrs() calls getaddrinfo(3), which is a blocking syscall. In api.rs::export_otel (an async handler), this blocks the tokio worker thread for the duration of DNS resolution. Under load or with a slow resolver, this starves the runtime.

Fix: Wrap in tokio::task::spawn_blocking, or use tokio::net::lookup_host instead:

let addrs: Vec<_> = tokio::net::lookup_host((host.as_str(), port))
    .await
    .map_err(|e| format!("Failed to resolve '{host}': {e}"))?
    .collect();

This requires making validate_export_endpoint async. Alternatively, keep it sync and call it from spawn_blocking at the call site in api.rs.

🟢 Nits / optional

docs/security-audit.md is 808 lines added as a new file. But the file already exists on master (added in an earlier session). Confirm this isn't a merge conflict waiting to happen — the system reminder says the file was modified, which suggests it was edited locally but never committed to master. If master already has it, this PR should modify, not create.
Integration tests duplicate boilerplate. All 4 new integration tests (test_export_otel_rejects_loopback_endpoint, _cloud_metadata_, _rfc1918_) share identical AppState setup. Extract a setup_otel_test() helper (like the existing setup() in the file). ~30 LOC saved, easier to maintain.
parse_host_port strips query and fragment, but the URL is passed through unchanged. If someone passes http://public.example.com/v1/traces#@169.254.169.254, the guard checks public.example.com (allowed) but the fragment @169.254.169.254 is sent to the HTTP client. This isn't exploitable since HTTP clients strip fragments before sending, but it's worth a comment noting the invariant.
The url crate is already in the dep tree via reqwest → url 2.x. Using it directly in Cargo.toml adds no new code to the binary. The hand-rolled parser works but is a maintenance burden — every time a new bypass vector is found, you're re-implementing URL parsing logic that the url crate already handles.

Test coverage

Strong for the IP-range checking. Gaps:

No test for octal/hex/decimal IP forms — the critical bypass in #1
No test for Teredo/6to4 addresses — the high bypass in #2
No test for backslash/percent in authority — #3
No async DNS test — the #[ignore] test for public hosts is fine, but there's no test confirming that to_socket_addrs on a numeric-looking non-IP-parseable string is handled correctly
e2e_allows_public_host is #[ignore] — good (CI sandboxes), but consider adding a test with a mocked DNS resolver for deterministic coverage

Security cross-check against the audit

Audit requirement	Status
Reject RFC 1918	✅
Reject link-local / cloud metadata (169.254.x.x)	✅
Reject loopback	✅
Reject unspecified	✅
Reject multicast/broadcast	✅
Reject documentation ranges	✅
Reject CGNAT/shared (100.64/10)	✅
Reject benchmarking (198.18/15)	✅
Resolve hostname before connecting	✅
Validate ALL resolved IPs (not just first)	✅
IPv4-mapped IPv6 (::ffff:x.x.x.x)	✅
IPv4-compatible deprecated (::x.x.x.x)	✅
Unique-local (fc00::/7)	✅
Link-local v6 (fe80::/10)	✅
Teredo (2001:0000::/32)	❌ Missing — exploitable
6to4 (2002::/16)	❌ Missing — exploitable
Octal/hex IP forms	❌ Missing — platform-dependent bypass
DNS rebinding	⚠️ Documented as known limitation
Parser differential (backslash/percent)	⚠️ Not handled

Summary

Block on: Octal/hex IP bypass (#1), Teredo/6to4 ranges (#2)
Before merge: Authority sanitization (#3), async DNS (#4)
Follow-up: Replace hand-rolled parser with url crate, add bypass regression test suite

The guard's IP-range coverage is excellent for standard representations — 9 IPv4 categories and 8 IPv6 categories is more thorough than most SSRF guards I've reviewed. The two blocking issues are both "non-standard representation bypasses" which is the #1 class of SSRF guard evasion. Once those are patched, this is solid.

Addresses all blocking and should-fix items from the code review. 🔴 Blockers resolved: 1. Octal/hex/decimal IP bypass: added `looks_like_numeric_ip()` that rejects hosts consisting entirely of hex digits, dots, and x/X (e.g., 0177.0.0.1, 0x7f000001, 2130706433) BEFORE passing to DNS. Closes the parser-differential between Rust's IpAddr::parse (strict) and getaddrinfo(3) (platform-dependent lax parsing). 2. Teredo (2001:0000::/32) and 6to4 (2002::/16) blocks added to is_blocked_v6. Both IPv6 transition mechanisms embed a routable IPv4 that a relay/gateway will connect to. An attacker could encode 127.0.0.1 or 169.254.169.254 inside these addresses. 🟡 Should-fix resolved: 3. Authority sanitization: parse_host_port now rejects any authority containing backslash, percent-encoding, or control characters. Eliminates the entire class of parser-differential attacks where the guard and HTTP client disagree on the host. 4. Async DNS: validate_export_endpoint is now async, using tokio::net::lookup_host instead of std::net::ToSocketAddrs. No longer blocks the tokio worker thread during DNS resolution. New tests: 15 additional (Teredo, 6to4, octal/hex/decimal bypass, backslash, percent, control chars, end-to-end for each bypass class). Total url_guard tests: 38 (was 23). All pass. Clippy clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

risjai · 2026-04-21T07:14:07Z

Pushed 5bd16fa addressing all blockers + should-fix items.

🔴 Blockers resolved:

#	Issue	Fix
1	Octal/hex/decimal IP bypass	`looks_like_numeric_ip()` rejects hosts that are all hex-digits/dots/x before DNS resolution. Tests: `e2e_rejects_octal_ip_bypass`, `e2e_rejects_hex_ip_bypass`, `e2e_rejects_decimal_ip_bypass`
2	Missing Teredo/6to4	`is_teredo()` blocks `2001:0000::/32`, `is_6to4()` blocks `2002::/16`. Tests: `blocks_teredo`, `blocks_6to4`, `e2e_rejects_teredo_v6`, `e2e_rejects_6to4_v6`

🟡 Should-fix resolved:

#	Issue	Fix
3	Backslash/percent/control chars in authority	`parse_host_port` now rejects `\`, `%`, and bytes < 0x20 in the authority. Tests: `rejects_backslash_in_authority`, `rejects_percent_in_authority`, `rejects_control_chars_in_authority`, `e2e_rejects_backslash_bypass`
4	Blocking DNS	`validate_export_endpoint` is now `async`, using `tokio::net::lookup_host` instead of `std::net::ToSocketAddrs`

Test count: 38 url_guard unit tests (was 23), 30 api integration tests, all green. Clippy clean.

Follow-up (not in this PR): Replace hand-rolled URL parser with url crate — reviewer is right that it's already transitively present and would eliminate the parser-differential class entirely. Scoping as a separate PR to keep this one focused on the IP-range guard.

risjai · 2026-04-21T07:19:42Z

Follow-up review — fix commit `5bd16fa7`

All 4 review items addressed. One minor false-positive in the new check, otherwise approve.

Status vs original review

#	Original ask	Status	Notes
1	Octal/hex/decimal IP bypass	✅	`looks_like_numeric_ip()` at url_guard.rs:73-78 rejects non-standard numeric forms before DNS. Tests: `rejects_octal_ip_form`, `rejects_hex_ip_form`, `rejects_decimal_ip_form`, plus 3 end-to-end.
2	Teredo/6to4 ranges	✅	`is_teredo` and `is_6to4` at url_guard.rs:170-182. Both check segment prefixes correctly. Tests: `blocks_teredo`, `blocks_6to4`, `e2e_rejects_teredo_v6`, `e2e_rejects_6to4_v6`.
3	Authority sanitization	✅	`parse_host_port` rejects `\`, `%`, and `< 0x20` at url_guard.rs:202-206. Tests: `rejects_backslash_in_authority`, `rejects_percent_in_authority`, `rejects_control_chars_in_authority`, `e2e_rejects_backslash_bypass`.
4	Async DNS	✅	`validate_export_endpoint` is now `async`, uses `tokio::net::lookup_host`. Call site in `api.rs` has `.await`. All tests migrated to `#[tokio::test]`.

New issue in the fix

🟡 `looks_like_numeric_ip` false-positive on all-hex hostnames

The check h.bytes().all(|b| b.is_ascii_hexdigit() || b == b'.' || b == b'x' || b == b'X') rejects any host composed entirely of hex chars and dots. This means:

cafe.babe → rejected (all hex digits + dot) — but this is a valid hostname
dead.beef → rejected
0xfeed.example.com → allowed (contains non-hex alphabetics like l, p)
aabb.cc → rejected (.cc is a real TLD, aabb is all hex)

These are extremely unlikely as OTel collector hostnames, so this is a marginal over-restriction, not a security bypass. But it's worth noting for correctness. A tighter check: require that the string starts with a digit or 0x/0X prefix — real numeric IPs always start with a digit, while hostnames starting with a letter are never ambiguous:

fn looks_like_numeric_ip(host: &str) -> bool {
    let first = host.bytes().next().unwrap_or(0);
    if !(first.is_ascii_digit()) { return false; }
    let h = host.strip_prefix("0x").or_else(|| host.strip_prefix("0X")).unwrap_or(host);
    !h.is_empty() && h.bytes().all(|b| b.is_ascii_hexdigit() || b == b'.')
}

This still catches 0177.0.0.1, 0x7f000001, 2130706433 but allows cafe.babe. Not a blocker — the current version is safe (over-rejects, never under-rejects).

Verification

Confirmed is_teredo checks segments()[0] == 0x2001 && segments()[1] == 0x0000 — correct for 2001:0000::/32
Confirmed is_6to4 checks segments()[0] == 0x2002 — correct for 2002::/16
Confirmed looks_like_numeric_ip("0177.0.0.1") → true, looks_like_numeric_ip("example.com") → false
Confirmed authority sanitization fires before userinfo stripping (correct order — backslash in userinfo portion is caught)
Confirmed validate_export_endpoint return type changed from Result<&str, String> to Result<(), String> — call site doesn't use the returned URL, so this is a clean simplification
Test count: 38 url_guard tests (was 23), all converted to #[tokio::test] where async

Verdict

Approve. All blocking bypasses closed. The cafe.babe false-positive is over-restriction, not under-restriction — safe to ship and tighten later if anyone hits it. Good work on the turnaround.

risjai commented Apr 21, 2026

View reviewed changes

vercel Bot deployed to Preview April 21, 2026 07:14 View deployment

risjai merged commit cc0a3f4 into master Apr 21, 2026
7 checks passed

risjai deleted the feat/ssrf-otel-export branch April 21, 2026 07:26

This was referenced Apr 21, 2026

feat(security): blob redaction, hop-by-hop filtering, query_raw lockdown (#3+4) #135

Merged

chore: bump versions — Rust 0.12.9→0.12.10, Python SDK 0.14.7→0.14.8 #136

Merged

feat(security): filesystem permissions + owner validation (#5) #137

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(security): SSRF guard for OTel export endpoint (CRITICAL-01)#134

feat(security): SSRF guard for OTel export endpoint (CRITICAL-01)#134
risjai merged 2 commits into
masterfrom
feat/ssrf-otel-export

risjai commented Apr 21, 2026

Uh oh!

vercel Bot commented Apr 21, 2026 •

edited

Loading

Uh oh!

risjai left a comment

Uh oh!

risjai commented Apr 21, 2026

Uh oh!

risjai commented Apr 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

risjai commented Apr 21, 2026

Summary

Implementation

New file: crates/rewind-web/src/url_guard.rs

Changed: crates/rewind-web/src/api.rs

Updated: docs/security-audit.md

Known limitation: DNS rebinding

Tests

Test plan

Ship order

Uh oh!

vercel Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

risjai left a comment

Choose a reason for hiding this comment

Code Review — PR #134: SSRF guard for OTel export (CRITICAL-01)

Overview

🔴 Must fix before merge

1. Octal/hex/decimal IP bypass via to_socket_addrs parser differential

2. Missing Teredo (2001:0000::/32) and 6to4 (2002::/16) blocks

🟡 Should address

3. URL parser does not reject backslash, whitespace, or percent-encoded authority

4. to_socket_addrs is blocking I/O in an async context

🟢 Nits / optional

Test coverage

Security cross-check against the audit

Summary

Uh oh!

risjai commented Apr 21, 2026

Uh oh!

risjai commented Apr 21, 2026

Follow-up review — fix commit 5bd16fa7

Status vs original review

New issue in the fix

🟡 looks_like_numeric_ip false-positive on all-hex hostnames

Verification

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

New file: `crates/rewind-web/src/url_guard.rs`

Changed: `crates/rewind-web/src/api.rs`

Updated: `docs/security-audit.md`

vercel Bot commented Apr 21, 2026 •

edited

Loading

1. Octal/hex/decimal IP bypass via `to_socket_addrs` parser differential

4. `to_socket_addrs` is blocking I/O in an async context

Follow-up review — fix commit `5bd16fa7`

🟡 `looks_like_numeric_ip` false-positive on all-hex hostnames