Skip to content

relay: structured-log compliance test — assert log calls never carry forbidden fields #36

@ilmoniemi

Description

@ilmoniemi

User Story

As a relay maintainer, I want a Go test that fails when any structured-log call carries a key outside an explicit allowlist, so that the docs/threat-model.md § Log hygiene rule is enforced at go test time rather than during production-log review.

Context

docs/threat-model.md § Log hygiene (lines 42–55) enumerates fields the relay must NEVER log — token values, raw inner-frame payloads, full request headers, full URL paths that may carry tokens — and the fields it MAY log. Today this is enforced by code review only. A single drift in a future PR can reintroduce a leak that nobody notices until production logs are audited. docs/lessons.md:45 captures the existing "defence is layered: spec, code review, structural absence" posture; this ticket adds the missing fourth layer: a deterministic test.

The relay currently has 14 logger.{Info,Warn,Error,Debug} call sites across internal/relay/client_endpoint.go, internal/relay/forward.go, and internal/relay/server_endpoint.go. Keys appear both inline with the message argument and on subsequent lines (multi-line calls are common).

Acceptance Criteria

  • A new test under internal/relay/ walks every non-test .go file in the package, locates each slog.Logger call (Info, Warn, Error, Debug) — including multi-line calls with keys on subsequent lines — extracts the string-literal keys, and fails if any key is absent from the allowlist.
  • An allowlist source (e.g. internal/relay/log_allowlist.go) holds the canonical set of permitted keys; the test reads from it so adding a key forces editing this file in the same commit.
  • The test passes against the current main codebase — every existing key is added to the allowlist as part of this ticket.
  • On failure, the message names file path, line number, and the offending key, so a developer sees exactly what to fix without rerunning under -v.
  • docs/threat-model.md § Log hygiene gains a one-line cross-reference pointing readers to the allowlist source as the canonical enforcement point.

Technical Notes

Two viable approaches; architect picks:

  1. AST walk (go/ast + go/parser) — precise; handles multi-line calls cleanly and can also flag non-literal key arguments as suspect (a slog.String(dynamicKey, v) call defeats an allowlist).
  2. Regex / structural search — cheap to write, brittle on multi-line calls. Listed for completeness; AST is the expected pick.

The allowlist is the canonical rule source; the threat-model entry becomes a pointer. This is deliberate: the allowlist sits next to the log call sites, so the friction of touching it lands on the developer adding a key — exactly as the threat model intends.

The architect should also decide how the test treats non-string-literal keys (fail loud vs. skip with a TODO) and whether the allowlist lives in a normal .go source or a _test.go fixture; the body's log_allowlist.go suggestion is illustrative, not binding.

Out of Scope

  • Runtime log redaction middleware (fix at the call site, not after the fact).
  • Multi-instance log aggregation (deployment concern, not source-code concern).
  • Extending the test to cmd/pyrycode-relay/ — that surface has handler construction only, no log call sites worth gating; revisit if it grows.

Size Estimate

S. One new test file (AST walk + assertions, ~100–150 lines), one new allowlist source (~30 lines), one-line doc cross-reference. Three files touched, 5 AC, single package — within S limits.

Metadata

Metadata

Assignees

No one assigned

    Labels

    security-sensitiveTouches auth, crypto, or internet-exposed input pathssize:sSmall ticket: <100 lines production code

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions