feat(go/ressrf-static): native Go SSRF evaluator (no WASM) #25

Draft
arajkumar wants to merge 12 commits into main from arajkumar/exciting-ishizaka-5b196e

@arajkumar arajkumar commented May 20, 2026

Summary

Adds a new sibling package go/ressrf-static — a pure-Go SSRF evaluator that
re-implements the full ressrf-core algorithm in Go and consumes the same
JSON policy data via //go:embed. The existing WASM-backed go/ressrf
package is untouched; the two coexist behind a behavioral parity gate.

Motivation. The user observed that the host-side Go bindings already do
their own URL parsing and net.ParseIP work before crossing the WASM
boundary, so the "shared parser" argument for WASM is weaker than it first
appears. Replacing wazero with a native evaluator drops a runtime dependency,
shrinks binaries, and makes future Java support a straight port instead of
a WASM-runtime integration. The cost is that cross-language conformance
vectors become load-bearing — and the new TestParityIPLevel /
TestParityURLLevel gates (129 cases) keep both backends in lockstep.

What landed

Core evaluator (mirrors crates/ressrf-core/src/* in Go):

  • cidr.go — strict CIDR parser (rejects octal/hex/zone-IDs/host-bits), IPv4-mapped IPv6 normalization, CIDRSet linear-scan containment.
  • uri_validator.go — scheme allowlist, userinfo bypass guard (@ + %40), domain-suffix boundary matching, control-char rejection (raw + %00/%0d/%0a), punycode passthrough, embedded IP-literal extraction, BS-5 backslash normalization, BS-6 UNC rejection.
  • url_rules.go — host glob (* = one label), path glob (* = one segment, ** = any depth, recursive backtrack), regex via Go regexp/RE2; deny-first → no-allows-NoMatch → first-allow-wins → none-matched-Denied precedence.
  • policy.go: PolicyBuilder fluent API. IsAllowed orchestrates URL rules → URI validator → bare-IP IP check. IsNetworkAllowed and ValidateHost handle IP-level decisions with preset-aware allow/deny semantics; empty IP list → dns_empty_response (Rust parity).
  • cloud.go + cloudmod/ + cloud/{aws,azure,gcp,all}/: tree-shakable cloud provider modules, see below.
  • audit.go: AuditSink interface + typed event variants (HostValidated, URLValidated, ConnectionAttempt, RedirectIntercepted, PolicyCreated) + RecordingSink test helper.
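
The url_rules.go precedence listed above (deny-first, no-allows-NoMatch, first-allow-wins, none-matched-Denied) can be sketched roughly as follows. This is an illustration only, not the package's code: the `Rule` type, its `Match` callback, and `prefixRule` are hypothetical stand-ins for the real scheme/host/path/regex matching.

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is the outcome of evaluating one URL against a ruleset.
type Decision int

const (
	NoMatch Decision = iota
	Allowed
	AllowedBypassIP
	Denied
)

func (d Decision) String() string {
	return [...]string{"NoMatch", "Allowed", "AllowedBypassIP", "Denied"}[d]
}

// Rule is a hypothetical, simplified rule. Match stands in for the
// scheme/host/path/regex matching a real URL rule would perform.
type Rule struct {
	Match         func(url string) bool
	BypassIPCheck bool
}

// Evaluate applies the four-step precedence in order.
func Evaluate(denies, allows []Rule, url string) Decision {
	// 1. Any matching deny rule wins outright.
	for _, d := range denies {
		if d.Match(url) {
			return Denied
		}
	}
	// 2. With no allow rules configured, the ruleset has no opinion.
	if len(allows) == 0 {
		return NoMatch
	}
	// 3. The first matching allow rule decides.
	for _, a := range allows {
		if a.Match(url) {
			if a.BypassIPCheck {
				return AllowedBypassIP
			}
			return Allowed
		}
	}
	// 4. Allow rules exist but none matched: default deny.
	return Denied
}

// prefixRule builds a toy rule that matches on a URL prefix.
func prefixRule(prefix string, bypass bool) Rule {
	return Rule{
		Match:         func(u string) bool { return strings.HasPrefix(u, prefix) },
		BypassIPCheck: bypass,
	}
}

func main() {
	denies := []Rule{prefixRule("https://internal.", false)}
	allows := []Rule{prefixRule("https://api.example.com/", false)}
	for _, u := range []string{
		"https://internal.example.com/admin",
		"https://api.example.com/v1/items",
		"https://other.example.com/",
	} {
		fmt.Println(u, "=>", Evaluate(denies, allows, u))
	}
}
```

Returning NoMatch when no allow rules exist is what lets the caller fall through to the URI validator and IP checks instead of treating an empty ruleset as a blanket deny.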

Protocol adapters (mirror go/ressrf/protocol_*.go):

  • protocol_tcp.go: SafeDialer / DialContext with a Control hook that runs after Go's DNS resolution, defeating DNS rebinding.
  • protocol_http.go: HTTPTransport / HTTPClient / CheckRedirect with per-hop policy enforcement, max-redirects cap, and configurable HTTPS→HTTP downgrade rejection.
  • protocol_ssh.go: SSHDial (defense in depth: URL-layer + post-DNS IP check via SafeDialer).

Data plumbing:

  • embed.go: //go:embed config/ip_ranges.json (IANA tiers always needed).
  • gen.go — four //go:generate directives copying the canonical files from crates/ressrf-core/config/. CI drift gate: go generate then git diff --exit-code.
  • disabled.go, errors.go — process-wide Disabled() toggle + ErrBlocked sentinel matching go/ressrf so callers can swap imports.

Tree-shakable cloud providers (cloud/aws, cloud/azure, cloud/gcp, cloud/all):

  • Each provider lives in its own sub-package with its own //go:embed. Only the providers you actually import are linked.
  • Build the policy with b.WithCloudModule(aws.Module()) (type-safe — forgetting an import is a compile error).
  • Provider sub-packages also expose DeniedSuffixes() / ServiceSuffixes() so callers can opt into domain-level cloud denial via WithDeniedSuffixes / WithTrustedSuffixes.
  • Measured size impact (minimal program): aws-only = 6.7 MB; aws+azure+gcp = 9.9 MB → 3.2 MB savings when only AWS is needed (matches the Azure dataset size — the linker actually prunes it).
  • A tiny cloudmod package holds the CloudModule value type to keep the dependency graph acyclic (sub-packages depend on cloudmod, not on ressrf-static). ressrfstatic.CloudModule is a type alias of cloudmod.Module so the user-facing name stays canonical.

Testing:

  • All 8 cross-language vector files under tests/vectors/*.json are wired in. 354 subtests, zero failures.
  • TestParityIPLevel + TestParityURLLevel (129 cases) run the same inputs through both go/ressrf (WASM) and go/ressrf-static (native) and assert identical allow/block outcomes. Zero divergences.
  • bench_test.go mirrors go/ressrf/bench_test.go for direct comparison.

Tooling:

  • New top-level go.work enables side-by-side development of the two modules; required for the cross-module parity test.

Benchmark numbers (Apple M1 Max, Go 1.26, -benchtime=5s on both sides)

| Op | Native | WASM | Speedup |
|---|---|---|---|
| PolicyBuild | 56 µs | 7.8 ms | ~140× |
| IsAllowed (allowed) | 1.29 µs | 2.40 µs | ~1.9× |
| IsAllowed (blocked) | 815 ns | 3.08 µs | ~3.8× |
| IsNetworkAllowed | 481 ns | 1.51 µs | ~3.1× |

Per-check operations are 2–4× faster. The big win is PolicyBuild (~140×):
the WASM-backed version pays for module instantiation each time. For
long-running services that build a policy once, this barely matters; for
short-lived CLIs or per-request policy construction, it's significant. The
non-performance benefits (no wazero runtime, no embedded ~1 MB core.wasm,
smaller binaries that can drop unused cloud datasets via the sub-package
split) are arguably more important than the speedup.

Earlier in PR review: I posted numbers showing ~50–60× speedup on per-check
ops. Those were wrong — the WASM-side benchmark was capped at -benchtime=2x
(literally 2 iterations) which produced cold-start-dominated noise. Commit
a6b2e9b corrects this in the README; this PR body reflects the corrected
numbers.

Reproduce with:

go test -bench=. -benchmem -run=^$ -benchtime=5s ./go/ressrf-static/...
go test -bench=. -benchmem -run=^$ -benchtime=5s ./go/ressrf/...

Quick start

import (
    ressrfstatic "github.com/timescale/ressrf/go/ressrf-static" // package name differs from the last path element
    "github.com/timescale/ressrf/go/ressrf-static/cloud/aws" // only the providers you need
)

p, _ := ressrfstatic.NewPolicyBuilder(ressrfstatic.PresetExternalOnly).
    WithCloudModule(aws.Module()).
    Build()

p.IsAllowed(ctx, "http://169.254.169.254/")             // *BlockedError
p.IsNetworkAllowed([]string{"10.0.0.1"})                // *BlockedError
client := p.HTTPClient(nil)                              // SSRF-safe http.Client
dialer := p.SafeDialer()                                 // SSRF-safe net.Dialer
sshClient, _ := p.SSHDial(ctx, "host:22", sshConfig)     // SSRF-safe ssh.Client

Reviewer notes

  • The Python generator (scripts/generate_ip_ranges.py) is reused as-is — this PR explicitly does not port it. The Go package consumes the same crates/ressrf-core/config/*.json outputs via four //go:generate cp directives + drift gate.
  • Cloud modules are deny-CIDRs-only (matches what crates/ressrf-wasm/src/lib.rs does via builder.with_cloud(...)). Domain-level cloud denial is an opt-in via WithDeniedSuffixes(aws.DeniedSuffixes()...). This keeps the parity gate clean — diverging here would make WASM↔native comparison impossible.
  • Audit sink shape differs between this package (typed event variants) and go/ressrf (flat AuditEvent{Kind, Fields json.RawMessage}). Both implement the spirit of audit.rs; the API divergence was acceptable because audit isn't part of the parity gate.
  • go/ressrf is untouched. This is a strict additive change. Recommendation is to keep both packages side-by-side for at least one release while ressrf-static gets production miles.
  • Regex caveat: URL-rule regexes are compiled with Go regexp (RE2). The Rust core uses the regex crate, which is also linear-time. Same feature set for the common cases; Rust-specific Unicode classes might need translation if encountered (add a vector if you find one).

Test plan

  • go test ./go/ressrf-static/... -count=1 — 354 subtests pass
  • go test -run TestParity ./go/ressrf-static/... — 129 parity cases, zero divergences
  • go generate ./go/ressrf-static/... && git diff --exit-code go/ressrf-static/{config,cloud}/ — drift gate clean
  • go vet ./go/ressrf-static/... — clean
  • Binary-size check (minimal main): aws-only = 6.7 MB, all-providers = 9.9 MB → 3.2 MB savings when only AWS is needed
  • go test -bench=. -benchmem -benchtime=5s ./go/ressrf-static/... and same on ./go/ressrf/... — equal-budget benchmarks, numbers documented above
  • go list -m -deps ./go/ressrf-static/... | grep -i wazero — empty (no WASM runtime dependency)
  • Reviewer: run the parity gate on your machine and confirm zero divergences before merging
  • Reviewer: skim README.md for the architecture / caveats overview

Out of scope

  • Removing or deprecating go/ressrf (keep both for at least one release for diff testing).
  • Python/Node/Java bindings — this PR only proves the Go side.
  • Changing the JSON data files or adding new presets.
  • Switching CI's monthly refresh from Python to Go.

arajkumar added 8 commits May 20, 2026 13:18
Adds go/ressrf-static as a sibling module to go/ressrf. Embeds the four
canonical policy JSON files from crates/ressrf-core/config/ via //go:embed,
exposes them via typed LoadIPRanges() and LoadCloud() loaders, and wires
a //go:generate directive that copies the canonical files into ./config/
so CI can detect drift with git diff --exit-code.

Per-user CLAUDE.md: no Co-Authored-By footer.
Ports crates/ressrf-core/src/cidr.rs to Go:
- ParseCIDR (strict — rejects octal/hex octets, zone IDs, host bits set)
- ParseCIDRLoose (auto-masks host bits, mirrors Rust parse_loose)
- CIDR.Contains with IPv4-mapped IPv6 normalization so 1.2.3.4 and
  ::ffff:1.2.3.4 match the same range
- CIDRSet for append-only first-match lookups
- IsIPLiteral / IsAmbiguousIP helpers for the URI validator (next task)

Drives the implementation off tests/vectors/cidr_containment.json (53 cases)
and ipv4_ipv6_mapping.json (7 cases) — all pass. Vector files copied into
testdata/vectors/ since //go:embed cannot escape the module root.
Ports crates/ressrf-core/src/uri_validator.rs to Go:
- URIValidator with NewURIValidator / AddTrustedSuffixes / AddDeniedSuffixes /
  SetRejectDoubleDash / ValidateURL.
- parseURLParts handles BS-3 (NUL/CR/LF rejection + percent-encoded forms),
  BS-5 (backslash normalization after scheme), BS-6 (UNC paths).
- pseudoScheme rejects javascript:/data:/file://host.
- hasUserinfoBypass detects raw @ + %40 in authority.
- domainMatchesSuffix is boundary-aware and trailing-dot tolerant.

Also adds:
- errors.go: BlockedError + DenyReason taxonomy mirroring the Rust enum.
- policy.go: minimal Policy + NewExternalOnlyPolicy that loads the embedded
  IANA/override/CSP-metadata deny ranges and exposes IsNetworkAllowed.
  Task 7 will add the full builder, URL rules, cloud modules, audit, and
  IsAllowed orchestration.
- conformance_test.go: generic vectorFile[V] + runVectors helper used by
  feature-specific test files. All eight vector files embedded once here.

21 url_validation.json cases pass + 9 hand-written URI smoke cases.
Reason-type strictness in the test runner is relaxed to match the Rust
conformance runner (which only asserts on err presence).
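
Two of the checks named in this commit are small enough to sketch. The functions below borrow the names `domainMatchesSuffix` and `hasUserinfoBypass` from the commit message, but the bodies are a guess at the described behaviour (boundary-aware, trailing-dot tolerant, raw `@` plus `%40`), not a copy of the implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// domainMatchesSuffix reports whether host equals suffix or is a subdomain
// of it. The "." boundary prevents "evilexample.com" from matching
// "example.com"; a trailing dot on the host is tolerated.
func domainMatchesSuffix(host, suffix string) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	suffix = strings.ToLower(strings.Trim(suffix, "."))
	if host == "" || suffix == "" {
		return false
	}
	return host == suffix || strings.HasSuffix(host, "."+suffix)
}

// hasUserinfoBypass reports whether the authority contains a raw or
// percent-encoded '@', as in "trusted.com@evil.com".
func hasUserinfoBypass(authority string) bool {
	return strings.Contains(authority, "@") || strings.Contains(authority, "%40")
}

func main() {
	fmt.Println(domainMatchesSuffix("api.example.com", "example.com"))
	fmt.Println(domainMatchesSuffix("evilexample.com", "example.com"))
	fmt.Println(domainMatchesSuffix("example.com.", "example.com"))
	fmt.Println(hasUserinfoBypass("trusted.com@evil.com"))
	fmt.Println(hasUserinfoBypass("trusted.com%40evil.com"))
}
```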
Ports crates/ressrf-core/src/url_rules.rs to Go:
- URLRule (Scheme/Host/Path/Regex/BypassIPCheck) with cross-language
  JSON-compatible field tags.
- URLRuleset with Compile() (pre-builds Go regexp.Regexp for any Regex
  rules) and Evaluate(url) returning {NoMatch, Allowed, AllowedBypassIP,
  Denied}.
- parseURLComponents shared with the URI validator's URL parsing style
  (handles [v6]:port and v4:port).
- globMatchHost: case-insensitive, * = exactly one DNS label.
- globMatchPath: * = one segment, ** = zero-or-more (recursive backtrack
  matching Rust impl).

Evaluation precedence per spec: any deny match -> Denied; no allows -> NoMatch;
first allow match -> AllowedBypassIP/Allowed; allows but no match -> Denied.

Drives off tests/vectors/url_rules.json (14 cases) + 14 hand-written glob
unit cases + a regex-compile-error case. All pass.
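
The glob semantics above (host `*` = exactly one DNS label; path `*` = one segment, `**` = zero or more, with recursive backtracking) might look roughly like this. It is a sketch written from the description, so details such as ignoring leading and trailing slashes are assumptions, and `splitPath` / `matchSegments` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// globMatchHost is case-insensitive; "*" matches exactly one non-empty label.
func globMatchHost(pattern, host string) bool {
	p := strings.Split(strings.ToLower(pattern), ".")
	h := strings.Split(strings.ToLower(host), ".")
	if len(p) != len(h) {
		return false
	}
	for i := range p {
		if p[i] == "*" {
			if h[i] == "" {
				return false
			}
			continue
		}
		if p[i] != h[i] {
			return false
		}
	}
	return true
}

// splitPath breaks a path into segments, ignoring leading/trailing slashes
// (an assumption of this sketch).
func splitPath(p string) []string {
	p = strings.Trim(p, "/")
	if p == "" {
		return nil
	}
	return strings.Split(p, "/")
}

// matchSegments matches pattern segments against path segments.
func matchSegments(pat, segs []string) bool {
	if len(pat) == 0 {
		return len(segs) == 0
	}
	if pat[0] == "**" {
		// Backtrack: let "**" consume 0, 1, 2, ... segments.
		for i := 0; i <= len(segs); i++ {
			if matchSegments(pat[1:], segs[i:]) {
				return true
			}
		}
		return false
	}
	if len(segs) == 0 {
		return false
	}
	if pat[0] != "*" && pat[0] != segs[0] {
		return false
	}
	return matchSegments(pat[1:], segs[1:])
}

// globMatchPath applies the segment matcher to whole paths.
func globMatchPath(pattern, path string) bool {
	return matchSegments(splitPath(pattern), splitPath(path))
}

func main() {
	fmt.Println(globMatchHost("*.example.com", "api.example.com"))
	fmt.Println(globMatchHost("*.example.com", "a.b.example.com"))
	fmt.Println(globMatchPath("/api/*/users", "/api/v1/users"))
	fmt.Println(globMatchPath("/api/**/users", "/api/a/b/users"))
}
```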
…tion

Task 7 (Policy + builder + IsAllowed):
- Rewrites the minimal Policy from Task 4 with a complete fluent
  PolicyBuilder mirroring go/ressrf signatures: WithAllowedCIDRs,
  WithDeniedCIDRs, WithCloudProviders, WithURLAllow, WithURLDeny,
  WithURLRuleset, WithAuditSink, WithTrustedSuffixes, WithDeniedSuffixes,
  Build.
- Policy.IsAllowed orchestration: URL rules first (deny/allow/bypass-IP),
  then URI structural validation + bare-IP IP check. Mirrors
  ressrf_policy_is_request_allowed in crates/ressrf-wasm/src/lib.rs.
- Policy.IsNetworkAllowed: empty IP list rejected with DnsEmptyResponse
  (Rust parity, crates/ressrf-core/src/policy.rs:316-318); preset-aware
  allow/deny semantics.
- audit.go: AuditSink interface + event types (HostValidated, URLValidated,
  ConnectionAttempt, RedirectIntercepted, PolicyCreated) wired through
  decision points. RecordingSink helper for tests. Task 8 will add vector
  coverage.

Task 6 (Cloud provider expansion):
- cloud.go::applyCloudModule adds DENY CIDRs only, matching WASM behavior
  (crates/ressrf-wasm/src/lib.rs only calls builder.with_cloud which is
  CIDR-only — domain suffixes are opt-in per Rust core layering).
- CloudDeniedSuffixesFor / CloudServiceSuffixesFor expose the suffix lists
  for callers who want to pipe them into WithDeniedSuffixes /
  WithTrustedSuffixes — preserves WASM parity while offering stronger
  domain-level denial as opt-in.

Driven by tests/vectors/policy_decisions.json (37 IP-level cases — all
pass) plus orchestration smokes for URL-rule deny, bypass_ip_check,
allow-overrides-deny, internal-only default-deny.
Adds the remaining pieces needed by audit_events.json:
- MatchReason field on HostValidated and URLValidated (carries the CIDR
  string or denied-suffix that matched, used by match_reason_contains
  assertions in vectors).
- Policy.ValidateHost(host, ips) for callers with hostname context (TCP
  dialer, HTTP transport). IsNetworkAllowed remains for IP-only callers
  and now emits HostValidated with an empty Host field.
- audit_test.go: drives audit_events.json (10 cases). 6 pass now
  (policy_created, host_validated x2, url_validated x2, no_sink).
  The 4 connection_attempt/redirect_intercepted cases skip with a clear
  marker until Tasks 9-11 wire the protocol adapters.

Total subtests now: 182. go vet clean.
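
To make the "typed event variants plus RecordingSink" shape concrete, a minimal sketch follows. The event names and the `MatchReason` field come from the commit messages; the `Event` interface, the `Record` / `Events` method names, and the remaining fields are assumptions and may differ from the real audit.go.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is implemented by every typed audit event (hypothetical interface).
type Event interface{ Kind() string }

// HostValidated is emitted for IP-level decisions.
type HostValidated struct {
	Host        string // empty for IP-only callers
	Allowed     bool
	MatchReason string // CIDR or denied suffix that matched
}

func (HostValidated) Kind() string { return "host_validated" }

// URLValidated is emitted for URL-level decisions.
type URLValidated struct {
	URL         string
	Allowed     bool
	MatchReason string
}

func (URLValidated) Kind() string { return "url_validated" }

// AuditSink receives events from decision points.
type AuditSink interface{ Record(Event) }

// RecordingSink stores events in memory for test assertions.
type RecordingSink struct {
	mu     sync.Mutex
	events []Event
}

func (s *RecordingSink) Record(e Event) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, e)
}

// Events returns a copy so callers cannot mutate the sink's slice.
func (s *RecordingSink) Events() []Event {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]Event(nil), s.events...)
}

func main() {
	sink := &RecordingSink{}
	var _ AuditSink = sink // compile-time interface check

	sink.Record(HostValidated{Allowed: false, MatchReason: "169.254.0.0/16"})
	sink.Record(URLValidated{URL: "https://example.com/", Allowed: true})

	for _, e := range sink.Events() {
		switch ev := e.(type) {
		case HostValidated:
			fmt.Printf("%s allowed=%v reason=%q\n", ev.Kind(), ev.Allowed, ev.MatchReason)
		case URLValidated:
			fmt.Printf("%s url=%s allowed=%v\n", ev.Kind(), ev.URL, ev.Allowed)
		}
	}
}
```

Typed variants let a test assert on `MatchReason` with a plain type switch instead of decoding a raw JSON field bag.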
Task 9 - TCP:
- SafeDialer / SafeDialerWithTimeout / DialContext mirroring
  go/ressrf/protocol_tcp.go. The Control hook runs after Go's DNS
  resolution: it strict-parses the address, calls IsNetworkAllowed,
  emits ConnectionAttempt audit events on both allow and deny paths.

Task 10 - HTTP:
- HTTPTransport / HTTPClient. RoundTrip calls IsAllowed on req.URL
  before delegating to the cloned base transport; DialContext also
  re-validates after splitting host:port (defense in depth, matches
  go/ressrf).
- checkRedirect plugged in as Client.CheckRedirect: per-hop IsAllowed,
  configurable max-redirects (default 10), HTTPS->HTTP downgrade
  rejection that respects new WithAllowPlaintextHTTP builder option
  (matches Rust ProtocolRules.allow_plaintext_http).
- Emits RedirectIntercepted audit events.

Task 11 - SSH:
- SSHDial defense-in-depth: synthetic https:// URL through IsAllowed +
  SafeDialer for the TCP layer. Uses golang.org/x/crypto/ssh.

Supporting plumbing:
- disabled.go: process-wide Disabled() toggle + DisableForTests for
  parity with go/ressrf (callers can swap imports).
- errors.go: ErrBlocked sentinel + BlockedError.Is/Unwrap so
  errors.Is(err, ErrBlocked) works.
- destinationURLForHost shared between TCP/HTTP/SSH.

Drives tests/vectors/redirect_chains.json (12 cases - all pass) plus
hand-written TCP / HTTP / SSH integration cases against local
httptest.Server / net.Listen. 194 subtests + 29 top-level tests.
go vet clean.
Task 12 - Parity gate:
- parity_test.go: in-package test (so it can read the existing embedded
  vector JSON) that imports the WASM-backed go/ressrf module under an
  alias 'wasm' and runs both backends through the same inputs.
- TestParityIPLevel: 37 policy_decisions.json cases through both
  IsNetworkAllowed implementations. Zero divergences.
- TestParityURLLevel: 92 ssrf_techniques.json URLs through both IsAllowed
  implementations using ExternalOnly preset. Zero divergences.
- Config drift gate (the go generate / git diff loop) was wired up in
  Task 0; verified passing here.
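
The core of a parity gate like this is small: feed identical inputs to two implementations and report any disagreement. The sketch below uses two toy evaluators in place of the WASM and native backends; `Evaluator`, `Divergence`, `Compare`, and both evaluator functions are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Evaluator reports whether an IP string is allowed.
type Evaluator func(ip string) bool

// Divergence records one input on which two evaluators disagree.
type Divergence struct {
	Input string
	A, B  bool
}

// Compare runs every input through both evaluators and collects disagreements.
func Compare(a, b Evaluator, inputs []string) []Divergence {
	var out []Divergence
	for _, in := range inputs {
		if ra, rb := a(in), b(in); ra != rb {
			out = append(out, Divergence{Input: in, A: ra, B: rb})
		}
	}
	return out
}

var loopbackV4 = netip.MustParsePrefix("127.0.0.0/8")

// blockLoopbackV4 is a toy evaluator that only knows about 127.0.0.0/8.
func blockLoopbackV4(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	return err == nil && !loopbackV4.Contains(addr)
}

// blockLoopback is a second toy evaluator that also covers ::1.
func blockLoopback(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	return err == nil && !addr.IsLoopback()
}

func main() {
	inputs := []string{"127.0.0.1", "8.8.8.8", "not-an-ip", "::1"}
	for _, d := range Compare(blockLoopbackV4, blockLoopback, inputs) {
		fmt.Printf("divergence on %q: a=%v b=%v\n", d.Input, d.A, d.B)
	}
}
```

In the toy run, `::1` is the only input the two evaluators disagree on, which is exactly the kind of gap a shared vector file is meant to surface.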

Task 13 - Benchmarks + README:
- bench_test.go mirrors go/ressrf/bench_test.go (PolicyBuild / IsAllowed /
  IsAllowedBlocked / IsNetworkAllowed) so the two can be compared with
  identical command lines.

  Apple M1 Max numbers (native vs WASM):
    PolicyBuild         55  us  vs  148    ms  (~2670x)
    IsAllowed            1.3 us  vs   75.8 us  (~59x)
    IsAllowedBlocked     740 ns  vs   41.0 us  (~55x)
    IsNetworkAllowed     451 ns  vs   22.5 us  (~50x)

  Per-check ops are ~50-60x faster; build is ~3 orders of magnitude
  faster (no WASM module load).

- README.md: quick start, architecture, maintenance procedure,
  conformance coverage table, benchmark table with side-by-side WASM
  comparison, caveats (cloud-modules-are-CIDR-only, regex engine
  caveats, audit-shape divergence).
@arajkumar arajkumar marked this pull request as draft May 20, 2026 09:21
arajkumar added 4 commits May 20, 2026 16:23
The original benchmarks in this README were based on -benchtime=2x on the
WASM side (only 2 iterations) versus default 1s on the native side. With
proper 5s-per-bench sampling on both sides:

- Per-check ops: 2-4x faster (was: claimed 50-60x)
- PolicyBuild:   ~140x faster (was: claimed 2670x)

The bigger benefits are still real (no wazero, no embedded core.wasm,
smaller binaries), but the per-call speedup story is much more modest
than originally claimed.
…kages

Replaces the single embed-all approach with one sub-package per provider
so the Go linker can prune unused providers from consuming binaries. The
Azure dataset alone is ~3.1 MB.

Layout:
  go/ressrf-static/
  +- cloud/
  |  +- aws/   (~440 KB JSON)
  |  +- azure/ (~3.1 MB JSON)
  |  +- gcp/   (~28 KB JSON)
  |  +- all/   (convenience aggregator)
  +- cloudmod/ (CloudModule value type, no payload)

API change (intentional, in this fresh package):
- Removes PolicyBuilder.WithCloudProviders(...string) and the
  CloudDeniedSuffixesFor / CloudServiceSuffixesFor helpers.
- Adds PolicyBuilder.WithCloudModule(CloudModule) and
  WithCloudModules(...CloudModule). Construct CloudModules via the
  provider sub-package, e.g. aws.Module() / aws.DeniedSuffixes() /
  aws.ServiceSuffixes(). Forgetting an import is a compile error, not a
  runtime error.

Internal:
- cloudmod is a tiny package containing just the CloudModule value type
  (Name + JSON bytes). It exists to break the cycle that would otherwise
  arise between the main package's internal tests and the provider
  sub-packages. ressrfstatic.CloudModule is a type alias of cloudmod.Module
  so callers see the canonical name.
- applyCloudModule takes a CloudModule directly instead of looking up by
  string name. ParseCloudFile is now the public parser entry point.
- gen.go has four //go:generate directives (one per file) for the new
  copy targets.

Measured size impact (minimal main importing only cloud/aws vs cloud/all):
- aws-only:    6.7 MB
- all (aws+azure+gcp): 9.9 MB
- savings:     3.2 MB when only AWS is needed

Verification:
- 354 subtests pass (was 323 — added Module sanity tests).
- WASM<->native parity gate still green: 129/129 cases agree.
- go vet clean; go generate produces no drift.
…ard pattern)

Adds a concrete recipe showing how a long-running service should build the
Policy once at process start (using sync.Once) and share it across requests.
Modelled after the netguard pattern from timescale/tiger-connect#366 —
covers the pre-flight ValidateURL helper, the SafeDialer / SafeDialContext
helpers for pgx/Kafka/HTTP, and the ErrBlocked re-export so existing
errors.Is(err, ErrBlocked) call sites keep working.

Also calls out two perf wins explicitly:
- amortized Build() cost (~56us paid once instead of per-request)
- one fewer DNS lookup per request when the singleton's SafeDialer is used
  (the Control hook runs after Go's own DNS resolution, so an upstream
  net.LookupIP pre-flight is redundant).
README:
- Drop the netguard wrapper indirection from the long-running-service
  recipe. Now shows: one sync.Once getter for the *Policy, then call
  sites use policy.IsAllowed / policy.DialContext / policy.HTTPClient /
  policy.SSHDial directly. No wrapper functions in between.

E2E tests (protocol_tcp_test.go, protocol_http_test.go,
protocol_ssh_test.go):
- Switch to github.com/stretchr/testify/require for assertions (used
  consistently; setup vs. assertion no longer have different style).
- Idiomatic Go cleanups:
  - t.Cleanup(...) instead of defer for resource teardown that survives
    a require.* fail-fast.
  - startLoopbackListener / insecureSSHConfig helpers eliminate the
    boilerplate each test was duplicating.
  - dialCtx helper centralizes the short timeout so a blocked dial
    fails in 2 s instead of ~75 s.
  - Split TestRedirectChainVectors into named helpers
    (buildRedirectPolicy, walkRedirectHop) so the loop body is one
    screen tall and each cross-hop rule is self-documenting.
  - require.ErrorIs(err, ErrBlocked) instead of errors.Is +
    t.Errorf — the sentinel is now the explicit assertion.
- 354 subtests still pass; go vet clean.
gonzaloserrano added a commit that referenced this pull request May 27, 2026
A 601-line hand-styled HTML comparison page that lived in
go-native/ressrf/docs/ as porting-decision rationale during the
native-port standalone repo's life. It was imported as-is when the
port moved into this monorepo.

Two problems with keeping it:
  - It's in the package's docs/ directory, so pkg.go.dev and any
    repo browser sees a 601-line page about an unrelated upstream
    PR alongside the legitimate how-it-works walkthrough.
  - "PR #25" will close or merge; after that the comparison becomes
    orphan-context nobody will know is safe to delete.

The comparison's purpose was to justify the port. The port has
happened (merged into timescale/ressrf as the go-native binding);
future readers wanting "why a native port?" can read the
integration PR's description rather than a 601-line HTML page in
docs/. git history preserves the file for anyone who needs it.

No inbound references confirmed via grep across *.md / *.go / *.html
/ *.yml.

Addresses td-code-review finding #8.