Skip to content

feat(relay): /v1/client WS upgrade with header gate and phone register (#5)#24

Merged
ilmoniemi merged 4 commits into
mainfrom
feature/5
May 9, 2026
Merged

feat(relay): /v1/client WS upgrade with header gate and phone register (#5)#24
ilmoniemi merged 4 commits into
mainfrom
feature/5

Conversation

@ilmoniemi
Copy link
Copy Markdown
Contributor

What

Implements /v1/client, the phone-side WebSocket upgrade endpoint. Validates X-Pyrycode-Server, X-Pyrycode-Token, and User-Agent headers pre-upgrade; on success registers the phone in the registry under the requested server-id and holds the connection open until the peer (or the registry, on binary-grace expiry) closes it. Closes WS code 4404 ("no server with that id") when no binary holds the slot.

The relay treats x-pyrycode-token as opaque — read once for the presence check, then out of scope. Never parsed, compared, or logged. Token verification is the binary's job.

Issue

Closes #5.

Files

  • New: internal/relay/client_endpoint.go — exports ClientHandler(reg, logger). Reuses randHex8 and remoteHost from server_endpoint.go verbatim.
  • New: internal/relay/client_endpoint_test.go — six tests, mirrors server_endpoint_test.go harness (startClient, dialWithClient, validClientHeaders, seedBinary).
  • Edit: cmd/pyrycode-relay/main.go — one-line mux registration next to /v1/server.

Testing

Six new tests covering each AC bullet:

  • TestClientEndpoint_ValidUpgrade_RegistersPhone — phone registers under the binary's server-id; ConnID() is client-<id>-<8 hex chars>.
  • TestClientEndpoint_HeaderGate_400 — table-driven over the three required headers (missing or empty); each row asserts 400 + zero registry mutations.
  • TestClientEndpoint_NoBinary_4404 — no binary seeded → close 4404 with the literal reason "no server with that id".
  • TestClientEndpoint_PeerClose_UnregistersPhone — peer close removes the phone from the registry; binary entry untouched.
  • TestClientEndpoint_MultiplePhones_IndependentLifecycle — three phones register; closing the middle leaves the other two; closing in any order works.
  • TestClientEndpoint_DeviceNameOptional_HandlerAccepts — handler accepts requests with and without X-Pyrycode-Device-Name.

Verification:

  • go vet ./... — clean.
  • go test -race ./... — all packages pass.
  • go build ./cmd/pyrycode-relay — builds.
  • gosec ./... — 0 issues.
  • govulncheck ./... — same set of pre-existing stdlib/golang.org/x/net findings as feature/5 HEAD before this PR (verified by stashing the change). Unrelated to this ticket; an env/runtime-upgrade concern surfaced by make lint.

Architecture compliance

  • Public API: single export ClientHandler(*Registry, *slog.Logger) http.Handler — matches the spec exactly. No new types, no new sentinel errors.
  • Header validation runs before websocket.Accept; missing/empty required header → empty-body 400; registry untouched.
  • 4404 close path uses c.Close(websocket.StatusCode(4404), "no server with that id") directly on the underlying *websocket.Conn (stillborn WSConn pattern, same as /v1/server's 4409). WSConn.Close always emits StatusNormalClosure, so the application code path bypasses the wrapper.
  • defer { UnregisterPhone; wsconn.Close; log } is registered AFTER RegisterPhone returns nil — the "defer-after-successful-claim" rule from PROJECT-MEMORY.
  • connID := "client-" + serverID + "-" + randHex8() — opaque routing key; crypto/rand for the suffix; never relied on for security.
  • Logged fields exactly match the spec's enumerated set: phone_registeredserver_id, conn_id, device_name, remote; phone_register_no_serverserver_id, remote; phone_unregisteredserver_id, conn_id. Token, user-agent, and full-header dumps are never logged.
  • <-readCtx.Done() placeholder via c.CloseRead(r.Context()) — to be replaced by the frame loop in relay: frame forwarding loop — wrap/unwrap routing envelope, route by server-id and conn_id #6. Drains control frames so the conn observes peer-close.

Security review compliance

The architect's ## Security review PASS verdict named two posture-level reminders, both honoured by this diff:

  1. Token never reaches a log statement, error response, or formatted error string. The token-bearing local goes out of scope after the presence check and is never used downstream. (Verifiable by grep -n token internal/relay/client_endpoint.go — the only reference is in the pre-upgrade gate.)
  2. Defer registered AFTER RegisterPhone succeeds; cleanup never schedules against a slot the handler does not own.

Connection caps and per-header byte caps are documented residual DoS risks in docs/threat-model.md § "DoS resistance" — out of scope per the spec.

ilmoniemi and others added 2 commits May 9, 2026 23:24
#5)

Implements the phone-side handshake from protocol-mobile.md § Phone → relay
→ binary. Validates x-pyrycode-server, x-pyrycode-token, user-agent before
upgrade; on success registers the phone in the registry under the requested
server-id. Closes 4404 if no binary holds the slot. Token is read for
presence only — never parsed, compared, or logged.

Mirrors the structural shape of /v1/server (#4): validate-pre-upgrade,
defer-after-successful-claim, application close codes emitted on the
underlying *websocket.Conn for the stillborn-WSConn 4404 path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi
Copy link
Copy Markdown
Contributor Author

Code Review: #5

Decision: PASS

Findings

  • [NIT] internal/relay/client_endpoint.go:35-36 — comment says "token goes out of scope here unread" but the token is read (in the presence check on line 31). Reading the comment as "not read further after the gate" is the obviously-intended meaning, but the literal phrasing trips the reader for a beat. Could be tightened to e.g. "token is not read again past the gate — never parsed, compared, or logged." Optional.

Summary

Spec-faithful implementation. The handler is a structural twin of /v1/server (#4): same shape (validate → upgrade → wrap → register → defer-after-claim → CloseRead/wait), same import grouping, same OriginPatterns: ["*"] rationale documented inline, same stillborn-WSConn pattern for the application close code (_ = c.Close(websocket.StatusCode(4404), …) directly on the underlying conn). randHex8 and remoteHost are reused verbatim — zero duplication.

Verified against the architect's ## Security review section:

  • Token never logged. grep -n token internal/relay/client_endpoint.go confirms the only references are the r.Header.Get read, the presence check, and a comment. The token does not appear in any logger.Info field, error string, or response body on any code path (success, 4404, generic register-error, defer).
  • Defer registered AFTER RegisterPhone succeeds (line 73, after the early returns at lines 61 and 64) — matches the PROJECT-MEMORY rule sharpened in relay: /v1/server — defer binary release by 30s grace window #21.
  • Logged field set matches the spec table exactly: phone_registered{server_id, conn_id, device_name, remote}, phone_register_no_server{server_id, remote}, phone_unregistered{server_id, conn_id}. No User-Agent, no full-header dumps, no payload echoes.
  • 4404 close-reason is the literal "no server with that id" — does not echo serverID back, so the close frame leaks nothing on the enumeration path beyond the protocol-spec-mandated existence signal.

Tests cover all six AC bullets 1:1 (valid upgrade + ConnID prefix/8-hex suffix, header gate table over the three required headers × {missing, empty}, 4404 with literal reason, peer close → unregister + binary untouched, three-phone independent lifecycle, optional device-name accepted with and without). Test harness mirrors server_endpoint_test.go (startClient/dialWithClient/validClientHeaders/seedBinary/waitForPhones); fakeConn reused from registry_test.go.

Verification:

  • go vet ./... — clean.
  • go test -race -count=1 ./internal/relay/ — all pass.
  • go build ./cmd/pyrycode-relay — clean.
  • Symmetric _ = c.Close(websocket.StatusCode(4404), …) placement on the registry-error early-return path matches server_endpoint.go:68.

The grace-window interaction with #20 (registry's handleGraceExpiry closing phones via wsconn.Close() → cancels per-conn ctx → CloseRead's read goroutine errors → <-readCtx.Done() fires → defer runs → UnregisterPhone no-ops on already-deleted entry) is structurally idempotent on every step; the spec walks this end-to-end and the implementation honours it without explicit code (it's the absence of any pre-check that makes it safe).

Connection caps and per-header byte caps are documented residual DoS risks in docs/threat-model.md § "DoS resistance"; out of scope per the spec.

ilmoniemi and others added 2 commits May 9, 2026 23:31
Add evergreen feature doc for /v1/client. Update INDEX, PROJECT-MEMORY's
"What's built" row, and sharpen patterns where #5 makes them concrete:
validate-pre-upgrade, application-close-codes-on-underlying-conn, and the
defer-after-successful-claim ordering. Add a new pattern for "courier
credentials" — relay-handled secrets it does not validate, presence-checked
then discarded, never logged or echoed (the X-Pyrycode-Token discipline).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi ilmoniemi merged commit 145b65f into main May 9, 2026
2 checks passed
@ilmoniemi ilmoniemi deleted the feature/5 branch May 9, 2026 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

relay: WS upgrade for /v1/client — accept phone connection, look up server-id, register

1 participant