feat(relay): upgrade-attempt and register-failure counters (#57) #90
Adds two Prometheus CounterVecs to the WebSocket-upgrade call sites so operators can distinguish flood patterns (malformed headers, 4409 conflicts, 4404/4429 phone failures, rate-limit denies) without log-grepping.

- internal/relay/metrics_upgrade.go: new UpgradeMetrics type, nine event-named nil-safe methods, two WrapXxxRateLimitDeny status-code observers. All 16 label cells pre-bound; 3 unreachable cells exposed at 0 for a stable scrape shape.
- internal/relay/server_endpoint.go: ServerHandler gains a *UpgradeMetrics parameter; three increment sites (header-reject, 4409 conflict, accept).
- internal/relay/client_endpoint.go: ClientHandler gains a *UpgradeMetrics parameter; four increment sites (header-reject, 4404, 4429, accept).
- cmd/pyrycode-relay/main.go: constructs UpgradeMetrics, stacks WrapServerRateLimitDeny / WrapClientRateLimitDeny OUTSIDE the rate-limit middleware so the deny observer sees the 429.
- internal/relay/metrics_upgrade_test.go: five tests covering every terminal path, the co-increment invariant, the full 16-series scrape, and the no-DefaultRegisterer-leak structural defence.
- Existing handler tests: seven mechanical `, nil` appends, no behaviour change.
- docs/knowledge/codebase/57.md: codebase summary.

make vet, make test -race, make build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
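The wrap order in main.go matters because a middleware can only observe status codes written by handlers nested inside it. A minimal stdlib-only sketch of that idea follows; `statusRecorder`, `wrapRateLimitDeny`, `denyAll`, and `runDenyDemo` are hypothetical stand-ins, not the names or signatures used in this PR, and the toy limiter rejects every request. The sketch also omits `http.Hijacker` support, which a real WebSocket path would need.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusRecorder remembers the first status code written through it.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	if r.status == 0 {
		r.status = code
	}
	r.ResponseWriter.WriteHeader(code)
}

// wrapRateLimitDeny is a hypothetical stand-in for the deny observers:
// it calls onDeny whenever the wrapped chain answered 429.
func wrapRateLimitDeny(next http.Handler, onDeny func()) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w}
		next.ServeHTTP(rec, req)
		if rec.status == http.StatusTooManyRequests {
			onDeny()
		}
	})
}

// denyAll is a toy rate limiter (assumption) that rejects every request
// and never reaches the handler it wraps.
func denyAll(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "rate limited", http.StatusTooManyRequests)
	})
}

// runDenyDemo sends one request through observer(limiter(handler)) and
// reports the response status and the number of denies observed.
func runDenyDemo() (status, denies int) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// The observer is the OUTERMOST layer, so the limiter's 429 passes through it.
	chain := wrapRateLimitDeny(denyAll(inner), func() { denies++ })
	rr := httptest.NewRecorder()
	chain.ServeHTTP(rr, httptest.NewRequest(http.MethodGet, "/", nil))
	return rr.Code, denies
}

func main() {
	status, denies := runDenyDemo()
	fmt.Println(status, denies) // prints "429 1"
}
```

If the observer were stacked inside the limiter instead, the limiter would answer before the observer ever ran, and the deny would go uncounted.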
Code Review: #57
Decision: PASS
Findings
None blocking. Two notes:
Summary
Clean implementation of the architect's design. Two
Wire-up:
Blast-radius check (codegraph_callers + grep): 9 prod+test call sites of
Tests: five tests in
Security goggles (
…EX entry (#57) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
What
Adds two Prometheus CounterVecs at the WebSocket-upgrade call sites:

- `pyrycode_relay_ws_upgrade_attempts_total{endpoint, outcome}`: 2 endpoints × 6 outcomes, 12 cells pre-bound (9 reachable, 3 structurally unreachable exposed at 0 for a stable scrape shape).
- `pyrycode_relay_register_failures_total{kind}`: 4 cells, co-incremented in lockstep with their paired `outcome=reject_*` cells via composite event-named methods.

Total: 16 series, label cardinality fixed and budgeted.
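To illustrate what "co-incremented via composite event-named methods" can look like, here is a rough sketch that uses plain atomic counters in place of Prometheus cells so it runs with the standard library alone. The type `upgradeMetrics`, its field and method names, and `runCoIncrementDemo` are all hypothetical; the PR's actual method names and `kind` label values are not shown on this page.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cell stands in for one pre-bound label combination of a CounterVec.
type cell struct{ n atomic.Int64 }

func (c *cell) inc() { c.n.Add(1) }

// upgradeMetrics is a hypothetical miniature of UpgradeMetrics holding
// only the cells needed to show one plain and one composite event.
type upgradeMetrics struct {
	serverAccept    cell // attempts{endpoint="server", outcome="accept"}
	serverReject409 cell // attempts{endpoint="server", outcome="reject_409"}
	failConflict    cell // register_failures{kind=...}; the kind value is assumed
}

// ServerAccepted records a successful upgrade. Safe on a nil receiver,
// so callers that pass nil metrics need no guard of their own.
func (m *upgradeMetrics) ServerAccepted() {
	if m == nil {
		return
	}
	m.serverAccept.inc()
}

// ServerConflict is a composite event: a single call bumps both paired
// cells, so the two counters cannot drift apart.
func (m *upgradeMetrics) ServerConflict() {
	if m == nil {
		return
	}
	m.serverReject409.inc()
	m.failConflict.inc()
}

// runCoIncrementDemo exercises the nil path and the composite path.
func runCoIncrementDemo() (accepts, rejects, failures int64) {
	var disabled *upgradeMetrics
	disabled.ServerConflict() // nil receiver: a no-op, not a panic

	m := &upgradeMetrics{}
	m.ServerAccepted()
	m.ServerConflict()
	m.ServerConflict()
	return m.serverAccept.n.Load(), m.serverReject409.n.Load(), m.failConflict.n.Load()
}

func main() {
	fmt.Println(runCoIncrementDemo()) // prints "1 2 2"
}
```

Because call sites name the event rather than the counters, the pairing between an `outcome=reject_*` cell and its failure kind lives in one place.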
Issue
Closes #57. Slice of the #56 metrics rollout (after #59 scaffolding, #61 gauges, #60 listener, #58 frame/grace counters).
Testing
`internal/relay/metrics_upgrade_test.go`: five tests:

- `TestUpgradeMetrics_ServerEndpoint_TerminalPaths` table-drives the four server terminals (`accept`, `reject_headers`, `reject_409`, `reject_rate_limit`) and asserts the exercised cell at 1 plus every other `endpoint=server` cell at 0 (no-double-increment proof).
- `TestUpgradeMetrics_ClientEndpoint_TerminalPaths` does the analogous check over the five client terminals (`accept`, `reject_headers`, `reject_404`, `reject_429`, `reject_rate_limit`).
- `TestUpgradeMetrics_RegisterFailures_CoIncrement` drives each composite method once and pins the "sum on kind equals sum on the matching outcome" invariant.
- `TestUpgradeMetrics_AllSixteenSeries_Exposed` advances every reachable method, then asserts all 16 series exist in the scrape (9 reachable at 1, 3 unreachable at 0, 4 failure kinds at their composite totals).
- `TestUpgradeMetrics_NoGlobalRegistrarLeak` pins ADR-0008 § Scope of use locally.

`, nil` appends in existing handler tests, no behaviour change.

`go test -race ./...`, `go vet ./...`, `go build ./...` all clean.

Architecture compliance
`docs/specs/architecture/57-upgrade-and-register-failure-counters.md` verbatim:

- `NewRateLimitMiddleware` because the latter would cross the 11-call-site edit-fan-out red line).
- `*UpgradeMetrics` parameter on `ServerHandler` / `ClientHandler` (7 mechanical test-side appends, under the 10-call-site red line).
- `statusObserverResponseWriter` implements `http.Hijacker` by delegation so `websocket.Accept` downstream can still upgrade on the success branch.
- `docs/knowledge/codebase/57.md`. `docs/knowledge/INDEX.md` left to the documentation phase per the developer-agent contract.

🤖 Generated with Claude Code
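The `http.Hijacker` point is worth a small illustration: embedding an `http.ResponseWriter` in a wrapper struct does not promote `Hijack`, because that method is not part of the `ResponseWriter` interface, so a wrapper has to forward it explicitly. The sketch below shows one way to delegate; `statusObserver`, `fakeHijackable`, and `runHijackDemo` are hypothetical names, and the fake writer backed by `net.Pipe` exists only so the example runs without a real server.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// statusObserver is a hypothetical status-capturing wrapper.
type statusObserver struct {
	http.ResponseWriter
	status int
}

func (o *statusObserver) WriteHeader(code int) {
	if o.status == 0 {
		o.status = code
	}
	o.ResponseWriter.WriteHeader(code)
}

// Hijack delegates to the wrapped writer when it supports hijacking,
// and reports an error otherwise.
func (o *statusObserver) Hijack() (net.Conn, *bufio.ReadWriter, error) {
	hj, ok := o.ResponseWriter.(http.Hijacker)
	if !ok {
		return nil, nil, errors.New("underlying ResponseWriter does not support hijacking")
	}
	return hj.Hijack()
}

// fakeHijackable is a test double: a ResponseWriter that hands out conn.
type fakeHijackable struct {
	http.ResponseWriter
	conn net.Conn
}

func (f fakeHijackable) Hijack() (net.Conn, *bufio.ReadWriter, error) {
	return f.conn, nil, nil
}

// runHijackDemo checks both branches: delegation succeeds over a
// hijackable writer and fails cleanly over a non-hijackable one.
func runHijackDemo() (delegated, refused bool) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	wrapped := &statusObserver{ResponseWriter: fakeHijackable{
		ResponseWriter: httptest.NewRecorder(),
		conn:           server,
	}}
	conn, _, err := wrapped.Hijack()
	delegated = err == nil && conn == server

	// httptest.ResponseRecorder does not implement http.Hijacker.
	plain := &statusObserver{ResponseWriter: httptest.NewRecorder()}
	_, _, err = plain.Hijack()
	refused = err != nil
	return delegated, refused
}

func main() {
	fmt.Println(runHijackDemo()) // prints "true true"
}
```

A WebSocket library that type-asserts the writer to `http.Hijacker` will then find the method on the wrapper and reach the real connection through it.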