Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docs/PROJECT-MEMORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ Stateless WebSocket router between mobile clients and pyry binaries. Internet-ex
- **Type-owned background goroutine: `ticker + done chan + WaitGroup`, idempotent + synchronous `Close`.** When a type owns a long-lived bookkeeping goroutine (sweep/eviction loop, not a per-conn read pump), start it from the constructor and stop it via the type's own `Close()`. Shape: `closeOnce.Do(func(){ close(done) })` then `wg.Wait()` **outside** the once — putting `wg.Wait()` inside would let a second concurrent caller return early while shutdown is still in flight. The wiring site registers `defer x.Close()` at the composition root. Adopted in `IPRateLimiter` (#50). Distinct from "Per-conn goroutines exit cleanly via LIFO defers" — that pattern is for goroutines whose lifecycle is owned by an HTTP handler; this one is for goroutines whose lifecycle is owned by the type itself.
- **When one timestamp must serve two semantic purposes, split it.** In `IPRateLimiter` (#50), refill math needs an anchor that advances in token-sized chunks (preserving fractional elapsed time for the next call) while eviction needs an anchor that updates on every poll (so an actively-hammered/denied bucket isn't dropped and recycled for a fresh burst). A single `last`-of-anything cannot do both without one purpose corrupting the other. Two fields (`refillLast`, `pollLast`) decouple them at zero structural cost and make a load-bearing security invariant (no free-burst-recycling) structurally true rather than carefully maintained. Generalisable: whenever a timestamp's "should it update here?" question gets conflicting answers from different callers, the answer is two timestamps.
- **Injectable wall-clock as an unexported `now func() time.Time` field, defaulting to `time.Now`.** When a type's correctness depends on `time.Now` and tests need determinism without sleeping, give the struct a package-private `now func() time.Time` field set to `time.Now` in the constructor; in-package tests assign a fake (`l.now = fakeClock.Now`) directly. No exported `WithClock` option, no clock interface, no constructor variants. The single eviction goroutine reads `l.now` on every tick; assignment before the first tick is race-free in practice and `-race` tests cover the steady-state. Adopted in `IPRateLimiter` (#50); applies to any future timer-driven primitive whose tests need fake-clock determinism.
- **Empty-string return as the "deny" signal on a pure security primitive — no `(string, error)` ceremony.** When every caller would handle the error identically (treat as deny), an `(value, error)` pair pushes a meaningless `if err != nil` into every call site. Instead: document the empty-string contract in the function's docstring, name the deny semantics explicitly, and let the wiring ticket enforce it in code. The helper performs no policy; the caller's `if ip == "" { deny }` is the policy. Adopted in `ClientIP` (#51). Applies to any pure security primitive whose output is keyed (rate-limit key, bucket id) and whose only failure mode is "no usable input."
- **Two helpers diverge on contract, not on mechanics → add a new export, don't mutate the existing one.** `internal/relay/server_endpoint.go`'s unexported `remoteHost` (logging helper: falls back to `r.RemoteAddr` verbatim on parse error, no XFF awareness) and the new `ClientIP` (security primitive: empty-string-as-deny, opt-in XFF trust) share the same `net.SplitHostPort` mechanics but have structurally different contracts. Mutating `remoteHost` to add XFF and change its empty-string semantics would have broken its log-line callers; adding a new export preserves both contracts and lets the wiring ticket decide whether logging migrates. Pattern: when a new caller needs the same mechanics with a *different* contract (deny semantics, validation strictness, return shape), the answer is a new export, not a parameter on the old one. Adopted in `ClientIP` (#51).

## Conventions

Expand Down
34 changes: 34 additions & 0 deletions docs/knowledge/codebase/51.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# Ticket #51 — client-IP extraction helper for HTTP requests

Pure helper that produces the source IP for a request, in a form suitable for keying a rate-limit bucket. Standalone primitive: ships the function and its test table only, no callers wired. Feeds the per-IP rate-limiter (#50) once the wiring ticket lands; the split exists so the trust-model nuance (when to read `X-Forwarded-For`, when to ignore it) gets its own review pass before any handler depends on it.

## Implementation

- **`internal/relay/client_ip.go`** — exported `ClientIP(r *http.Request, trustForwardedFor bool) string`. Pure function, no I/O, no state.
- `trustForwardedFor=false`: returns `net.SplitHostPort(r.RemoteAddr)` host portion. Empty string on parse error.
- `trustForwardedFor=true`: takes the left-most `X-Forwarded-For` entry (`strings.IndexByte(',')` + `strings.TrimSpace`), falls back to `RemoteAddr` if the header is absent, empty, or yields no non-empty first entry.
- Empty-string return is the only "no usable source" signal — callers MUST treat as deny. No `(string, error)` pair; every caller would handle the error identically.
- No canonicalisation: the returned string is the raw host portion as it appears on the wire (no IPv6 lower-casing, no zone-id stripping). Two different zones (`fe80::1%eth0` vs `fe80::1%eth1`) legitimately denote different interfaces; collapsing them is the wrong default. Docstring names the consequence — log emitters must `strconv.Quote` to avoid log injection from embedded control bytes.
- **`internal/relay/client_ip_test.go`** — 13-row table covering `RemoteAddr` (IPv4 with port, IPv6 bracketed, IPv6 loopback, malformed no-port, empty), XFF-disabled-header-ignored, XFF-enabled (single, multi-entry, leading whitespace), and the four fallback rows (absent header, empty header, whitespace-only first entry, both sources malformed → `""`). Constructed via `httptest.NewRequest` with `r.RemoteAddr` and `r.Header` set directly — no real listener.
- **No changes to `cmd/pyrycode-relay/main.go`** — per the AC, this is the primitive only. The unexported `remoteHost` helper in `server_endpoint.go:129-136` is **unchanged**: it remains the logging helper (falls back to `r.RemoteAddr` verbatim on parse error). The two helpers diverge on contract (logging vs rate-limit key), so the new helper is a new export rather than a mutation of `remoteHost`. Whether to migrate logging is the wiring ticket's call.

## Trust model

The bool parameter is the entire policy surface. The helper itself performs no trust decision; the caller passes the bit it has determined from operator configuration. Two named threats:

- **Direct-facing deployment, `trustForwardedFor=true` by mistake.** Adversarial phone forges `X-Forwarded-For: 127.0.0.1`; every request reports `127.0.0.1`; rate-limit collapses to one bucket. Self-DoS via misconfiguration, not privilege escalation. The wiring ticket is responsible for defaulting the bit to `false`.
- **Trusted-proxy deployment, proxy misconfigured.** Phone sets `X-Forwarded-For: victim-ip` *before* the proxy; the proxy must overwrite or append. The relay-side helper cannot defend against an upstream that doesn't sanitise — that's the proxy's responsibility. Documented in the docstring.

## Out of scope

- Wiring into any handler / `cmd/pyrycode-relay/main.go` — the rate-limit wiring ticket's job.
- Trusted-proxy CIDR allowlist (right-most-trusted hop chain). Today's bool is flat all-or-nothing. The hop-aware variant is a future ticket once a real proxy is chosen.
- `X-Real-IP` support. Separate (proxy-specific) convention; deferred to the wiring ticket.
- Multi-header `X-Forwarded-For` (two separate `X-Forwarded-For:` headers on one request). `r.Header.Get` returns only the first; if a real proxy is observed emitting two-header XFF, the fix is `r.Header.Values` + `strings.Join` — two lines, one new test row. Deferred until observed (per *Evidence-Based Fix Selection*).
- Canonicalisation, zone-id stripping, control-byte sanitisation — deliberate non-policy on a security primitive.

## Cross-links

- [Threat model § DoS resistance](../../threat-model.md) — the "Future hardening: per-IP rate limit" line this helper enables.
- [Codebase: #50 per-IP rate-limiter](50.md) — the sibling primitive this helper produces keys for.
- [Lesson: `r.Header.Get` returns the first header only](../../lessons.md) — the multi-header XFF gotcha deferred above.
12 changes: 12 additions & 0 deletions docs/lessons.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,18 @@ After crediting `tokensAdd := int(elapsed / refillEvery)` tokens, the natural-lo

The natural shape for "idempotent close that stops one goroutine" is `closeOnce.Do(func(){ close(done); wg.Wait() })`. That works for a single caller, but a second concurrent `Close()` skips the `Do` body entirely and returns immediately — while the goroutine is still running and the first caller is still inside `wg.Wait()`. Caller two now races past `Close` thinking shutdown is complete. Hoist `wg.Wait()` outside the once: `closeOnce.Do(func(){ close(done) }); wg.Wait()`. Now both callers block on the WaitGroup until the goroutine exits; both return synchronously. Matches the "Close is idempotent and synchronous" expectation on `WSConn.Close`. Source: `IPRateLimiter.Close` (#50).

## `r.Header.Get("X-Forwarded-For")` returns only the FIRST header value when XFF arrives as multiple separate headers

`r.Header` is a `map[string][]string`. `Get` returns `values[0]`. If a request carries `X-Forwarded-For: a` and a second `X-Forwarded-For: b` as distinct headers (legal per RFC 7230 §3.2.2 for list-valued headers), `Get` returns only `a` and the helper takes the left-most token of that. Most real proxies emit a single comma-joined `X-Forwarded-For`, so this is structurally fine *today*; the de-facto-correct fix when a real proxy emits two-header XFF is `r.Header.Values("X-Forwarded-For")` + `strings.Join(..., ",")` before splitting. Documented as deferred in `ClientIP`'s spec under § Open questions; not fixed pre-emptively per *Evidence-Based Fix Selection*. Source: `internal/relay/client_ip.go` (#51).

## For allocation-free "first comma-separated token" parsing, use `strings.IndexByte` + slice, not `strings.SplitN`

`strings.SplitN(s, ",", 2)` allocates a `[]string` of length 1 or 2 every call. At rate-limit-decision frequency (one call per WS upgrade), the allocation isn't free. The equivalent `if i := strings.IndexByte(s, ','); i >= 0 { s = s[:i] }` is a single slice-header rewrite — no heap traffic, identical semantics, identical readability. Use `SplitN` when you need *both* halves; use `IndexByte` + slice when you only need the prefix. Source: `ClientIP` (#51).

## `net.SplitHostPort` is the right tool for "host portion of a `host:port` string" — handles `[ipv6]:port` and errors cleanly on malformed input

Manual `strings.LastIndex(s, ":")` parsing breaks on bracketed IPv6 literals (`[::1]:443`). `net.SplitHostPort` handles both forms, strips the brackets from IPv6 (returning `::1`, not `[::1]`), and returns a non-nil error on missing colon or empty input — which doubles as the discriminator for "I have a usable source IP" vs "I do not." Prefer it over hand-rolled splitting whenever both IPv4 and IPv6 are in scope (i.e. almost always). Same call sites: `ClientIP` (#51), `EnforceHost` (#9), the unexported `remoteHost` logging helper (`server_endpoint.go`).

## Map iteration with `delete(m, k)` is spec-safe; `for range` snapshot is not required

The Go spec guarantees that deleting the current key during `for k, v := range m { ... delete(m, k) ... }` is safe — the deleted entry is omitted from subsequent iterations, and entries not yet visited may or may not be visited (which is fine when you're filtering by a predicate that any-order eviction satisfies). No need to build a `toDelete` slice first. Source: `IPRateLimiter.sweep` (#50); confirmed by the Go spec §"For statements with `range` clause".
Loading