From cd18ca87ba1259213c351a60a8d65dbee7f1995d Mon Sep 17 00:00:00 2001
From: SyniRon <66834451+SyniRon@users.noreply.github.com>
Date: Thu, 28 May 2026 23:24:47 -0400
Subject: [PATCH] docs: seed CONTEXT.md, ADRs, and agent docs

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 CONTEXT.md                                    | 108 ++++++++++++++++++
 ...001-split-process-grpc-and-http-gateway.md |  31 +++++
 .../0002-intra-process-plaintext-grpc-dial.md |  24 ++++
 ...response-cache-with-update-time-polling.md |  37 ++++++
 docs/adr/0004-scope-based-bearer-auth.md      |  47 ++++++++
 docs/agents/domain.md                         |  51 +++++++++
 docs/agents/issue-tracker.md                  |  22 ++++
 docs/agents/triage-labels.md                  |  15 +++
 8 files changed, 335 insertions(+)
 create mode 100644 CONTEXT.md
 create mode 100644 docs/adr/0001-split-process-grpc-and-http-gateway.md
 create mode 100644 docs/adr/0002-intra-process-plaintext-grpc-dial.md
 create mode 100644 docs/adr/0003-redis-response-cache-with-update-time-polling.md
 create mode 100644 docs/adr/0004-scope-based-bearer-auth.md
 create mode 100644 docs/agents/domain.md
 create mode 100644 docs/agents/issue-tracker.md
 create mode 100644 docs/agents/triage-labels.md

diff --git a/CONTEXT.md b/CONTEXT.md
new file mode 100644
index 0000000..69f3957
--- /dev/null
+++ b/CONTEXT.md
@@ -0,0 +1,108 @@
+# Context
+
+Domain language used by the 7Cav API. The proto files
+(`proto/milpacs.proto`, `proto/tickets.proto`) are the contract; this
+document covers the concepts and any nuance that isn't obvious from
+reading the schema.
+
+## Source data
+
+The API is a read layer over a [XenForo](https://xenforo.com) forum's
+MySQL database, augmented by the `NF Rosters` add-on (which contributes
+the `xf_nf_rosters_*` tables that hold milpac records) and the
+`Cav7/ApiKeyManager` add-on (which contributes the API-key and scope
+tables). The API itself owns no schema; it queries upstream tables and
+maps them to its own proto types.
+
+## Milpac
+
+A "milpac" is a member's military-personnel-record entry: rank,
+position, awards, service record, and the identifiers used to look them
+up across the wider 7Cav stack. `MilpacService` is the surface that
+serves them.
+
+## Profile shapes
+
+A member's milpac is served in three shapes by different RPCs, each
+tuned to a known consumer:
+
+- **`Profile`** — full view: rank, positions, awards, records, and the
+  connected-account identifiers.
+- **`LiteProfile`** — slim view: enough to render a roster row, without
+  the per-member relational payload.
+- **`S1UniformsProfile`** — view for the uniforms tool; includes
+  uniform-relevant fields and omits the rest.
+
+The three are not subsets of one type; they are hand-mapped from the
+same upstream rows into distinct proto messages.
+
+## Roster and RosterType
+
+A `Roster` is a collection of members grouped by unit, course, or
+status. `RosterType` is an enum identifying which roster is requested;
+its numeric values are used as the `roster_id` foreign key in the
+upstream tables. The three roster RPCs (`GetRoster`, `GetLiteRoster`,
+`GetS1UniformsRoster`) return the same set of members in the
+corresponding profile shape above.
+
+## Rank, Position, PositionGroup
+
+A `Rank` is a pay-grade entry from the upstream rank catalog. A
+`Position` is an org-chart slot; positions are grouped into
+`PositionGroup`s for hierarchical browsing. `RankExpanded` and
+`PositionExpanded` are the variants that include relational fields the
+plain message omits.
+
+## Record and Award
+
+A `Record` is an entry on a member's service history (joins, promotions,
+transfers, etc.); `RecordType` enumerates the categories. An `Award` is
+a decoration entry.
+
+## AWOL
+
+An entry on the AWOL list — members flagged as absent without leave.
+Served by `GetAwol`, used by status-tracking consumers.
+
+## Connected accounts
+
+Members are looked up by 7Cav user id, by username, and by external
+account identifiers maintained by the forum's connected-account
+integrations:
+
+- **Discord** — Discord user id.
+- **Gamertag** — Xbox / PlayStation handle.
+- **Keycloak** — legacy SSO identifier. The Keycloak auth path has been
+  removed; the lookup RPC is slated for removal and should not be
+  used in new code.
+
+## Tickets
+
+`TicketsService` exposes the forum's ticket system (powered by the
+`NF Tickets` add-on) as a read-only API.
+
+- **`Ticket`** — a thread: title, status, category, participants,
+  message count, timestamps. `forum_url` is populated when the API is
+  configured with the public forum base URL.
+- **`Message`** — one post within a ticket, addressed by `position`
+  (0-indexed within the thread).
+- **`Category`** — a top-level grouping for tickets; carries a current
+  ticket count.
+- **`TicketParticipant`** — a member-to-ticket association with a role.
+
+`ListTicketMessages` paginates with an opaque cursor whose semantics are
+"next `position` to include" (inclusive lower bound), so `position=0`
+is reachable.
+
+## API key and scope
+
+Clients authenticate with a `Bearer` token (case-insensitive prefix per
+RFC 7235). Tokens are issued by the forum admin UI, not by this
+service. Each token carries a set of named scopes. Current scopes:
+
+- **`read`** — gates the milpac surface (profiles, rosters, ranks,
+  positions, AWOL).
+- **`read:tickets`** — gates the tickets surface.
+
+Scope membership is checked per-handler; a token with `read` cannot
+read tickets, and vice versa.
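Reviewer note: the "API key and scope" rules above (case-insensitive `Bearer` prefix, exact scope matching) could be sketched roughly as follows. This is a minimal illustration, not the repo's implementation; `parseBearer`, `hasScope`, and `errMalformed` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errMalformed is a hypothetical sentinel error for this sketch.
var errMalformed = errors.New("malformed authorization header")

// parseBearer extracts the token from an Authorization header value.
// The "Bearer" scheme is compared case-insensitively.
func parseBearer(header string) (string, error) {
	scheme, token, ok := strings.Cut(header, " ")
	if !ok || !strings.EqualFold(scheme, "Bearer") {
		return "", errMalformed
	}
	token = strings.TrimSpace(token)
	if token == "" {
		return "", errMalformed
	}
	return token, nil
}

// hasScope reports whether the granted set contains the required scope.
// Matching is exact: "read" does not imply "read:tickets", and vice versa.
func hasScope(granted map[string]bool, required string) bool {
	return granted[required]
}

func main() {
	token, err := parseBearer("bearer abc123")
	fmt.Println(token, err)

	// A key that was granted only the "read" scope.
	granted := map[string]bool{"read": true}
	fmt.Println(hasScope(granted, "read"), hasScope(granted, "read:tickets"))
}
```

Treating scopes as an exact-match set, rather than a hierarchy, is what makes `read` and `read:tickets` independent of each other.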
diff --git a/docs/adr/0001-split-process-grpc-and-http-gateway.md b/docs/adr/0001-split-process-grpc-and-http-gateway.md
new file mode 100644
index 0000000..f38902e
--- /dev/null
+++ b/docs/adr/0001-split-process-grpc-and-http-gateway.md
@@ -0,0 +1,31 @@
+# ADR 0001: Split-process gRPC server + HTTP/JSON gateway from a single proto
+
+## Context
+
+The API exposes the same surface over two transports: a gRPC service for
+machine clients that want typed RPCs, and an HTTP/JSON surface for clients
+that don't. We want one source of truth for the schema, the routes, and the
+generated client code, with no risk of the two surfaces drifting.
+
+## Decision
+
+Define the service in `proto/milpacs.proto` (and additional `.proto` files
+per service). The gRPC server is hand-implemented against the generated
+`*.pb.go`. The HTTP surface is generated by `grpc-gateway` via
+`google.api.http` annotations on each RPC, which produces a reverse-proxy
+handler (`*.pb.gw.go`) that translates REST → gRPC. The OpenAPI document
+is generated by `protoc-gen-openapiv2` from the same proto.
+
+Both surfaces run inside one binary as two TCP listeners — gRPC on one
+port, the gateway on another. The gateway dials the gRPC server in-process.
+
+## Consequences
+
+- Adding an endpoint is a single edit (proto annotation + handler method);
+  the HTTP route and OpenAPI entry fall out of code generation.
+- The gateway is a thin translator with no business logic; all behavior
+  lives in the gRPC handlers.
+- The OpenAPI document is always in sync with the gRPC service because
+  both are generated from the same source.
+- The two-listener layout assumes a reverse proxy terminates TLS in front
+  of the gateway; see ADR 0002.
diff --git a/docs/adr/0002-intra-process-plaintext-grpc-dial.md b/docs/adr/0002-intra-process-plaintext-grpc-dial.md
new file mode 100644
index 0000000..bfe1389
--- /dev/null
+++ b/docs/adr/0002-intra-process-plaintext-grpc-dial.md
@@ -0,0 +1,24 @@
+# ADR 0002: Intra-process plaintext gRPC dial; TLS terminates at the reverse proxy
+
+## Context
+
+The binary runs two listeners (see ADR 0001): the gRPC server on one port
+and the HTTP/JSON gateway on another. The gateway dials the gRPC server
+to translate each REST call into a gRPC call. Public traffic to the
+gateway arrives over HTTPS; we need to decide whether the in-process
+gateway → gRPC hop should also be TLS.
+
+## Decision
+
+The gateway dials the local gRPC server with plaintext credentials. TLS
+is terminated by a reverse proxy in front of the gateway; the API binary
+itself does not handle TLS.
+
+## Consequences
+
+- No certificate management inside the binary; deployment topology owns it.
+- The two listeners are co-located in one process by design — the
+  plaintext hop never traverses an untrusted network.
+- If end-to-end TLS were ever required (e.g., running the gRPC server in
+  a separate process or on a separate host), both the gateway dial and
+  the gRPC server setup would need to grow TLS credentials.
diff --git a/docs/adr/0003-redis-response-cache-with-update-time-polling.md b/docs/adr/0003-redis-response-cache-with-update-time-polling.md
new file mode 100644
index 0000000..9984d92
--- /dev/null
+++ b/docs/adr/0003-redis-response-cache-with-update-time-polling.md
@@ -0,0 +1,37 @@
+# ADR 0003: Redis response cache invalidated by polling MySQL UPDATE_TIME
+
+## Context
+
+The API is a read layer over a MySQL database it does not own. Almost
+every endpoint is a GET that runs joins and preloads against tables that
+change infrequently relative to read volume. We need a cache layer that
+keeps responses fast without serving data that has gone stale after a
+write — but the writes happen in the upstream application, not in this
+service, so we cannot bust the cache from the write path.
+
+## Decision
+
+Cache successful HTTP GET responses in Redis, keyed by request path, with
+a long TTL (currently 6h). The cache middleware sits inside the auth
+middleware so authentication still runs on every request, and serves
+cached bodies pre-gzipped.
+
+Invalidation runs in a background goroutine that polls
+`information_schema.tables` on a fixed interval (currently 10 minutes)
+for a fixed set of monitored tables. If any monitored table's
+`UPDATE_TIME` is newer than the cached snapshot, the goroutine issues a
+Redis `FlushAll`. There is no per-record bust and no per-endpoint bust.
+
+## Consequences
+
+- Reads are cheap and uniform; the long TTL is safe because the polling
+  goroutine flushes everything when upstream data changes.
+- Stale data is possible for up to one polling interval after an upstream
+  write — acceptable because the read surface is not transactional.
+- The blast radius of any write is one `FlushAll`; this is intentional
+  and keeps the invalidation logic trivial.
+- Adding a new endpoint that reads from a previously-unmonitored upstream
+  table requires registering that table in the monitored set, or stale
+  responses linger for up to one TTL.
+- Cache middleware logging is intentionally verbose at INFO (HIT / MISS
+  per request); it is the current substitute for request analytics.
diff --git a/docs/adr/0004-scope-based-bearer-auth.md b/docs/adr/0004-scope-based-bearer-auth.md
new file mode 100644
index 0000000..b5cc675
--- /dev/null
+++ b/docs/adr/0004-scope-based-bearer-auth.md
@@ -0,0 +1,47 @@
+# ADR 0004: Scope-based bearer auth validated against the upstream DB on every request
+
+## Context
+
+The API needs to authenticate machine clients and gate individual
+endpoints by capability (e.g., a key that can read profile data should
+not necessarily be able to read tickets). Keys are issued and managed in
+the upstream forum application via its admin UI, not by this service.
+We need an auth model where revocation and capability changes made in
+the upstream UI take effect immediately, without a deploy or a cache
+invalidation in this service.
+
+## Decision
+
+Clients send a `Bearer` token (case-insensitive prefix per RFC 7235),
+which is validated on every request by a single JOIN over three upstream
+tables: the key table, the key-to-scope mapping, and the scope catalog.
+A row is returned per scope granted to the key. Zero rows → 401.
+
+The validation result carries the set of granted scope names. Each
+handler self-gates by calling a `RequireScope(ctx, "")`
+helper at the top of its body; missing scope → permission-denied.
+
+The scope catalog (which scopes exist, what they mean) is owned by the
+upstream application's admin UI. This service treats the catalog as
+read-only and never writes to it.
+
+The bearer token has a hard length cap to bound the cost of the lookup
+on malformed input.
+
+## Consequences
+
+- Revocation, key creation, and scope changes are instant: the next
+  request hits the new state. No restart, no cache flush.
+- Every request pays for one DB round-trip on the auth path. This is
+  acceptable for the current load and keeps the model simple; if it
+  becomes a bottleneck, a short-TTL in-process cache keyed by token hash
+  is the obvious next step.
+- The gRPC `UnaryInterceptor` and the HTTP gateway middleware must both
+  extract and validate the bearer token; they share a parsing helper to
+  keep the prefix handling consistent across surfaces.
+- Adding a new endpoint requires picking the scope it gates on (or
+  introducing a new scope in the upstream admin UI first) and adding a
+  `RequireScope` call at the top of the handler.
+- Because auth touches the same DB as the data path, an outage in that
+  DB surfaces as 401 from this service, not as a 5xx — fault-injection
+  smokes against handler-level error mapping must account for this.
diff --git a/docs/agents/domain.md b/docs/agents/domain.md
new file mode 100644
index 0000000..c97d6a6
--- /dev/null
+++ b/docs/agents/domain.md
@@ -0,0 +1,51 @@
+# Domain Docs
+
+How the engineering skills should consume this repo's domain documentation when exploring the codebase.
+
+## Before exploring, read these
+
+- **`CONTEXT.md`** at the repo root, or
+- **`CONTEXT-MAP.md`** at the repo root if it exists — it points at one `CONTEXT.md` per context. Read each one relevant to the topic.
+- **`docs/adr/`** — read ADRs that touch the area you're about to work in. In multi-context repos, also check `src//docs/adr/` for context-scoped decisions.
+
+If any of these files don't exist, **proceed silently**. Don't flag their absence; don't suggest creating them upfront. The producer skill (`/grill-with-docs`) creates them lazily when terms or decisions actually get resolved.
+
+## File structure
+
+Single-context repo (most repos):
+
+```
+/
+├── CONTEXT.md
+├── docs/adr/
+│   ├── 0001-event-sourced-orders.md
+│   └── 0002-postgres-for-write-model.md
+└── src/
+```
+
+Multi-context repo (presence of `CONTEXT-MAP.md` at the root):
+
+```
+/
+├── CONTEXT-MAP.md
+├── docs/adr/                          ← system-wide decisions
+└── src/
+    ├── ordering/
+    │   ├── CONTEXT.md
+    │   └── docs/adr/                  ← context-specific decisions
+    └── billing/
+        ├── CONTEXT.md
+        └── docs/adr/
+```
+
+## Use the glossary's vocabulary
+
+When your output names a domain concept (in an issue title, a refactor proposal, a hypothesis, a test name), use the term as defined in `CONTEXT.md`. Don't drift to synonyms the glossary explicitly avoids.
+
+If the concept you need isn't in the glossary yet, that's a signal — either you're inventing language the project doesn't use (reconsider) or there's a real gap (note it for `/grill-with-docs`).
+
+## Flag ADR conflicts
+
+If your output contradicts an existing ADR, surface it explicitly rather than silently overriding:
+
+> _Contradicts ADR-0007 (event-sourced orders) — but worth reopening because…_
diff --git a/docs/agents/issue-tracker.md b/docs/agents/issue-tracker.md
new file mode 100644
index 0000000..cce77ec
--- /dev/null
+++ b/docs/agents/issue-tracker.md
@@ -0,0 +1,22 @@
+# Issue tracker: GitHub
+
+Issues and PRDs for this repo live as GitHub issues. Use the `gh` CLI for all operations.
+
+## Conventions
+
+- **Create an issue**: `gh issue create --title "..." --body "..."`. Use a heredoc for multi-line bodies.
+- **Read an issue**: `gh issue view --comments`, filtering comments by `jq` and also fetching labels.
+- **List issues**: `gh issue list --state open --json number,title,body,labels,comments --jq '[.[] | {number, title, body, labels: [.labels[].name], comments: [.comments[].body]}]'` with appropriate `--label` and `--state` filters.
+- **Comment on an issue**: `gh issue comment --body "..."`
+- **Apply / remove labels**: `gh issue edit --add-label "..."` / `--remove-label "..."`
+- **Close**: `gh issue close --comment "..."`
+
+Infer the repo from `git remote -v` — `gh` does this automatically when run inside a clone.
+
+## When a skill says "publish to the issue tracker"
+
+Create a GitHub issue.
+
+## When a skill says "fetch the relevant ticket"
+
+Run `gh issue view --comments`.
diff --git a/docs/agents/triage-labels.md b/docs/agents/triage-labels.md
new file mode 100644
index 0000000..b716855
--- /dev/null
+++ b/docs/agents/triage-labels.md
@@ -0,0 +1,15 @@
+# Triage Labels
+
+The skills speak in terms of five canonical triage roles. This file maps those roles to the actual label strings used in this repo's issue tracker.
+
+| Label in mattpocock/skills | Label in our tracker | Meaning                                  |
+| -------------------------- | -------------------- | ---------------------------------------- |
+| `needs-triage`             | `needs-triage`       | Maintainer needs to evaluate this issue  |
+| `needs-info`               | `needs-info`         | Waiting on reporter for more information |
+| `ready-for-agent`          | `ready-for-agent`    | Fully specified, ready for an AFK agent  |
+| `ready-for-human`          | `ready-for-human`    | Requires human implementation            |
+| `wontfix`                  | `wontfix`            | Will not be actioned                     |
+
+When a skill mentions a role (e.g. "apply the AFK-ready triage label"), use the corresponding label string from this table.
+
+Edit the right-hand column to match whatever vocabulary you actually use.