108 changes: 108 additions & 0 deletions CONTEXT.md
@@ -0,0 +1,108 @@
# Context

Domain language used by the 7Cav API. The proto files
(`proto/milpacs.proto`, `proto/tickets.proto`) are the contract; this
document covers the concepts and any nuance that isn't obvious from
reading the schema.

## Source data

The API is a read layer over a [XenForo](https://xenforo.com) forum's
MySQL database, augmented by the `NF Rosters` add-on (which contributes
the `xf_nf_rosters_*` tables that hold milpac records) and the
`Cav7/ApiKeyManager` add-on (which contributes the API-key and scope
tables). The API itself owns no schema; it queries upstream tables and
maps them to its own proto types.

## Milpac

A "milpac" is a member's military-personnel-record entry: rank,
position, awards, service record, and the identifiers used to look them
up across the wider 7Cav stack. `MilpacService` is the surface that
serves them.

## Profile shapes

A member's milpac is served in three shapes by different RPCs, each
tuned to a known consumer:

- **`Profile`** — full view: rank, positions, awards, records, and the
connected-account identifiers.
- **`LiteProfile`** — slim view: enough to render a roster row, without
the per-member relational payload.
- **`S1UniformsProfile`** — view for the uniforms tool; includes
uniform-relevant fields and omits the rest.

The three are not subsets of one type; they are hand-mapped from the
same upstream rows into distinct proto messages.

## Roster and RosterType

A `Roster` is a collection of members grouped by unit, course, or
status. `RosterType` is an enum identifying which roster is requested;
its numeric values are used as the `roster_id` foreign key in the
upstream tables. The three roster RPCs (`GetRoster`, `GetLiteRoster`,
`GetS1UniformsRoster`) return the same set of members in the
corresponding profile shape above.

## Rank, Position, PositionGroup

A `Rank` is a pay-grade entry from the upstream rank catalog. A
`Position` is an org-chart slot; positions are grouped into
`PositionGroup`s for hierarchical browsing. `RankExpanded` and
`PositionExpanded` are the variants that include relational fields the
plain message omits.

## Record and Award

A `Record` is an entry on a member's service history (joins, promotions,
transfers, etc.); `RecordType` enumerates the categories. An `Award` is
a decoration entry.

## AWOL

An entry on the AWOL list — members flagged as absent without leave.
Served by `GetAwol`, used by status-tracking consumers.

## Connected accounts

Members are looked up by 7Cav user id, by username, and by external
account identifiers maintained by the forum's connected-account
integrations:

- **Discord** — Discord user id.
- **Gamertag** — Xbox / PlayStation handle.
- **Keycloak** — legacy SSO identifier. The Keycloak auth path has been
removed, and the lookup RPC is deprecated and scheduled for deletion;
do not use it in new code.

## Tickets

`TicketsService` exposes the forum's ticket system (powered by the
`NF Tickets` add-on) as a read-only API.

- **`Ticket`** — a thread: title, status, category, participants,
message count, timestamps. `forum_url` is populated when the API is
configured with the public forum base URL.
- **`Message`** — one post within a ticket, addressed by `position`
(0-indexed within the thread).
- **`Category`** — a top-level grouping for tickets; carries a current
ticket count.
- **`TicketParticipant`** — a member-to-ticket association with a role.

`ListTicketMessages` paginates with an opaque cursor that encodes the
next `position` to include (an inclusive lower bound), so `position=0`
is reachable.
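
The cursor rule above can be sketched as follows. This is a minimal illustration, not the service's actual encoding: the base64-of-decimal cursor format, the `Message` struct, and the `EncodeCursor`, `DecodeCursor`, and `Page` names are all assumptions made for the example. An empty cursor is treated as "start at position 0".

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
)

// Message is a hypothetical stand-in for the proto Message type.
type Message struct {
	Position int
	Body     string
}

// EncodeCursor wraps the next position to include in an opaque string.
// The base64 encoding is an assumption for this sketch.
func EncodeCursor(next int) string {
	return base64.RawURLEncoding.EncodeToString([]byte(strconv.Itoa(next)))
}

// DecodeCursor returns the inclusive lower bound; "" means position 0.
func DecodeCursor(c string) (int, error) {
	if c == "" {
		return 0, nil
	}
	raw, err := base64.RawURLEncoding.DecodeString(c)
	if err != nil {
		return 0, fmt.Errorf("invalid cursor: %w", err)
	}
	n, err := strconv.Atoi(string(raw))
	if err != nil || n < 0 {
		return 0, errors.New("invalid cursor")
	}
	return n, nil
}

// Page returns up to size messages with Position >= the cursor value,
// plus the cursor for the following page ("" when nothing is left).
// msgs is assumed to be sorted by Position.
func Page(msgs []Message, cursor string, size int) ([]Message, string, error) {
	if size < 1 {
		return nil, "", errors.New("page size must be positive")
	}
	start, err := DecodeCursor(cursor)
	if err != nil {
		return nil, "", err
	}
	var page []Message
	for _, m := range msgs {
		if m.Position < start {
			continue
		}
		if len(page) == size {
			// m is the first message left out, so it is the next one to include.
			return page, EncodeCursor(m.Position), nil
		}
		page = append(page, m)
	}
	return page, "", nil
}

func main() {
	msgs := []Message{{Position: 0, Body: "a"}, {Position: 1, Body: "b"}, {Position: 2, Body: "c"}}
	cursor := ""
	for {
		page, next, err := Page(msgs, cursor, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(page, "next cursor:", next)
		if next == "" {
			break
		}
		cursor = next
	}
}
```

Because the bound is inclusive, the first page naturally contains the message at position 0, and the returned cursor always names the first message that was not delivered.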

## API key and scope

Clients authenticate with a `Bearer` token (case-insensitive prefix per
RFC 7235). Tokens are issued by the forum admin UI, not by this
service. Each token carries a set of named scopes. Current scopes:

- **`read`** — gates the milpac surface (profiles, rosters, ranks,
positions, AWOL).
- **`read:tickets`** — gates the tickets surface.

Scope membership is checked per-handler; a token with only `read`
cannot read tickets, and a token with only `read:tickets` cannot read
milpac data.
31 changes: 31 additions & 0 deletions docs/adr/0001-split-process-grpc-and-http-gateway.md
@@ -0,0 +1,31 @@
# ADR 0001: Split-process gRPC server + HTTP/JSON gateway from a single proto

## Context

The API exposes the same surface over two transports: a gRPC service for
machine clients that want typed RPCs, and an HTTP/JSON surface for clients
that don't. We want one source of truth for the schema, the routes, and the
generated client code, with no risk of the two surfaces drifting.

## Decision

Define the service in `proto/milpacs.proto` (and additional `.proto` files
per service). The gRPC server is hand-implemented against the generated
`*.pb.go`. The HTTP surface is generated by `grpc-gateway` via
`google.api.http` annotations on each RPC, which produces a reverse-proxy
handler (`*.pb.gw.go`) that translates REST → gRPC. The OpenAPI document
is generated by `protoc-gen-openapiv2` from the same proto.

Both surfaces run inside one binary as two TCP listeners — gRPC on one
port, the gateway on another. The gateway dials the gRPC server in-process.

## Consequences

- Adding an endpoint touches only the proto (the RPC and its
`google.api.http` annotation) and the handler method; the HTTP route
and OpenAPI entry fall out of code generation.
- The gateway is a thin translator with no business logic; all behavior
lives in the gRPC handlers.
- The OpenAPI document is always in sync with the gRPC service because
both are generated from the same source.
- The two-listener layout assumes a reverse proxy terminates TLS in front
of the gateway; see ADR 0002.
24 changes: 24 additions & 0 deletions docs/adr/0002-intra-process-plaintext-grpc-dial.md
@@ -0,0 +1,24 @@
# ADR 0002: Intra-process plaintext gRPC dial; TLS terminates at the reverse proxy

## Context

The binary runs two listeners (see ADR 0001): the gRPC server on one port
and the HTTP/JSON gateway on another. The gateway dials the gRPC server
to translate each REST call into a gRPC call. Public traffic to the
gateway arrives over HTTPS; we need to decide whether the in-process
gateway → gRPC hop should also be TLS.

## Decision

The gateway dials the local gRPC server with plaintext credentials. TLS
is terminated by a reverse proxy in front of the gateway; the API binary
itself does not handle TLS.

## Consequences

- No certificate management inside the binary; deployment topology owns it.
- The two listeners are co-located in one process by design — the
plaintext hop never traverses an untrusted network.
- If end-to-end TLS were ever required (e.g., running the gRPC server in
a separate process or on a separate host), both the gateway dial and
the gRPC server setup would need to grow TLS credentials.
37 changes: 37 additions & 0 deletions docs/adr/0003-redis-response-cache-with-update-time-polling.md
@@ -0,0 +1,37 @@
# ADR 0003: Redis response cache invalidated by polling MySQL UPDATE_TIME

## Context

The API is a read layer over a MySQL database it does not own. Almost
every endpoint is a GET that runs joins and preloads against tables that
change infrequently relative to read volume. We need a cache layer that
keeps responses fast without serving data that has gone stale after a
write — but the writes happen in the upstream application, not in this
service, so we cannot bust the cache from the write path.

## Decision

Cache successful HTTP GET responses in Redis, keyed by request path, with
a long TTL (currently 6h). The cache middleware sits inside the auth
middleware so authentication still runs on every request, and serves
cached bodies pre-gzipped.
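
The layering described above can be sketched with plain `net/http` middleware. This is an illustration under stated assumptions, not the service's code: an in-memory map stands in for Redis, TTL and gzip handling are omitted, `requireAuth` is a placeholder that only checks for a non-empty header, and the `/roster` path and all helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// memCache is an in-memory stand-in for Redis (assumption for this sketch).
type memCache struct {
	mu sync.Mutex
	m  map[string][]byte
}

func (c *memCache) get(k string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	b, ok := c.m[k]
	return b, ok
}

func (c *memCache) set(k string, b []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = b
}

// capture records the status and body that the wrapped handler writes.
type capture struct {
	http.ResponseWriter
	status int
	body   []byte
}

func (c *capture) WriteHeader(code int) {
	c.status = code
	c.ResponseWriter.WriteHeader(code)
}

func (c *capture) Write(b []byte) (int, error) {
	c.body = append(c.body, b...)
	return c.ResponseWriter.Write(b)
}

// cacheGET caches 200 responses to GET requests, keyed by request path.
func cacheGET(cache *memCache, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			next.ServeHTTP(w, r)
			return
		}
		if body, ok := cache.get(r.URL.Path); ok {
			w.Write(body) // HIT: serve the stored body
			return
		}
		rec := &capture{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, r) // MISS: run the real handler
		if rec.status == http.StatusOK {
			cache.set(r.URL.Path, rec.body)
		}
	})
}

// requireAuth is a placeholder check; it only looks for a non-empty header.
func requireAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newDemoStack wires auth outside the cache and counts backend calls.
func newDemoStack() (http.Handler, *int) {
	calls := 0
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		calls++
		fmt.Fprint(w, "payload")
	})
	cache := &memCache{m: map[string][]byte{}}
	return requireAuth(cacheGET(cache, backend)), &calls
}

// fetch issues a GET against h and returns the status code and body.
func fetch(h http.Handler, path, auth string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, req)
	return rr.Code, rr.Body.String()
}

func main() {
	h, calls := newDemoStack()
	fetch(h, "/roster", "Bearer demo")
	status, body := fetch(h, "/roster", "Bearer demo")
	fmt.Println(status, body, "backend calls:", *calls)
	status, _ = fetch(h, "/roster", "")
	fmt.Println("without token:", status)
}
```

Because `requireAuth` wraps `cacheGET`, a request without credentials is rejected even when the path is already cached.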

Invalidation runs in a background goroutine that polls
`information_schema.tables` on a fixed interval (currently 10 minutes)
for a fixed set of monitored tables. If any monitored table's
`UPDATE_TIME` is newer than the cached snapshot, the goroutine issues a
Redis `FlushAll`. There is no per-record bust and no per-endpoint bust.
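
A minimal sketch of the invalidation loop, assuming the `UPDATE_TIME` values have already been read into a map. The `Snapshot` type, the `read` and `flushAll` callbacks, the `ts` helper, and the table names `table_a` / `table_b` are hypothetical; the real service would query `information_schema.tables` and call Redis where the callbacks are.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"time"
)

// Snapshot maps a monitored table name to its last seen UPDATE_TIME.
type Snapshot map[string]time.Time

// changedTables lists tables whose UPDATE_TIME in current is newer than
// in prev. A table missing from prev also counts as changed; that choice
// is an assumption of this sketch.
func changedTables(prev, current Snapshot) []string {
	var changed []string
	for table, updated := range current {
		last, seen := prev[table]
		if !seen || updated.After(last) {
			changed = append(changed, table)
		}
	}
	sort.Strings(changed)
	return changed
}

// pollOnce reads the current snapshot, flushes the whole cache if any
// monitored table changed, and returns the snapshot for the next round.
func pollOnce(prev Snapshot, read func() Snapshot, flushAll func()) Snapshot {
	current := read()
	if len(changedTables(prev, current)) > 0 {
		flushAll()
	}
	return current
}

// runPoller repeats pollOnce on a fixed interval until ctx is cancelled.
func runPoller(ctx context.Context, interval time.Duration, read func() Snapshot, flushAll func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	snap := read() // baseline, so startup alone does not trigger a flush
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			snap = pollOnce(snap, read, flushAll)
		}
	}
}

// ts builds a timestamp for the example data.
func ts(hour, minute int) time.Time {
	return time.Date(2024, 1, 1, hour, minute, 0, 0, time.UTC)
}

func main() {
	prev := Snapshot{"table_a": ts(10, 0), "table_b": ts(10, 0)}
	read := func() Snapshot {
		return Snapshot{"table_a": ts(10, 0), "table_b": ts(10, 5)}
	}
	pollOnce(prev, read, func() { fmt.Println("flushing entire cache") })
}
```

Keeping the comparison in a pure function (`changedTables`) makes the "any change flushes everything" rule easy to test without MySQL or Redis.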

## Consequences

- Reads are cheap and uniform; the long TTL is safe because the polling
goroutine flushes everything when upstream data changes.
- Stale data is possible for up to one polling interval after an upstream
write — acceptable because the read surface is not transactional.
- The blast radius of any write is one `FlushAll`; this is intentional
and keeps the invalidation logic trivial.
- Adding a new endpoint that reads from a previously-unmonitored upstream
table requires registering that table in the monitored set, or stale
responses linger for up to one TTL.
- Cache middleware logging is intentionally verbose at INFO (HIT / MISS
per request); it is the current substitute for request analytics.
47 changes: 47 additions & 0 deletions docs/adr/0004-scope-based-bearer-auth.md
@@ -0,0 +1,47 @@
# ADR 0004: Scope-based bearer auth validated against the upstream DB on every request

## Context

The API needs to authenticate machine clients and gate individual
endpoints by capability (e.g., a key that can read profile data should
not necessarily be able to read tickets). Keys are issued and managed in
the upstream forum application via its admin UI, not by this service.
We need an auth model where revocation and capability changes made in
the upstream UI take effect immediately, without a deploy or a cache
invalidation in this service.

## Decision

Clients send a `Bearer` token (case-insensitive prefix per RFC 7235),
which is validated on every request by a single JOIN over three upstream
tables: the key table, the key-to-scope mapping, and the scope catalog.
A row is returned per scope granted to the key. Zero rows → 401.

The validation result carries the set of granted scope names. Each
handler self-gates by calling a `RequireScope(ctx, "<scope_name>")`
helper at the top of its body; missing scope → permission-denied.
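
The self-gating pattern might look like the sketch below. Only the `RequireScope(ctx, "<scope_name>")` call shape comes from this ADR; the context key, `WithScopes`, `authenticatedContext`, the `getTicket` handler, and the plain sentinel error are assumptions for illustration (the real helper would return a gRPC permission-denied status).

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrPermissionDenied stands in for a gRPC permission-denied status.
var ErrPermissionDenied = errors.New("permission denied: missing scope")

// scopesKey is the private context key for the granted scope set.
type scopesKey struct{}

// WithScopes stores the scopes granted to the validated token.
func WithScopes(ctx context.Context, scopes []string) context.Context {
	set := make(map[string]struct{}, len(scopes))
	for _, s := range scopes {
		set[s] = struct{}{}
	}
	return context.WithValue(ctx, scopesKey{}, set)
}

// RequireScope fails unless the context carries the named scope.
func RequireScope(ctx context.Context, scope string) error {
	set, _ := ctx.Value(scopesKey{}).(map[string]struct{})
	if _, ok := set[scope]; !ok {
		return fmt.Errorf("%w: %q", ErrPermissionDenied, scope)
	}
	return nil
}

// authenticatedContext mimics what the auth layer would hand to a handler.
func authenticatedContext(scopes ...string) context.Context {
	return WithScopes(context.Background(), scopes)
}

// getTicket is a hypothetical handler that gates itself first.
func getTicket(ctx context.Context) (string, error) {
	if err := RequireScope(ctx, "read:tickets"); err != nil {
		return "", err
	}
	return "ticket payload", nil
}

func main() {
	_, err := getTicket(authenticatedContext("read"))
	fmt.Println("token with read only:", err)

	body, err := getTicket(authenticatedContext("read:tickets"))
	fmt.Println("token with read:tickets:", body, err)
}
```

A context that carries no scope set at all is treated the same as one missing the requested scope, so a handler reached without authentication fails closed.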

The scope catalog (which scopes exist, what they mean) is owned by the
upstream application's admin UI. This service treats the catalog as
read-only and never writes to it.

The bearer token has a hard length cap to bound the cost of the lookup
on malformed input.
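
A sketch of the header parsing this implies: a case-insensitive scheme match followed by a length check before any database work. The `ParseBearer` name, the error values, and the 512-byte cap are assumptions; the ADR does not state the actual limit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxTokenLen is a hypothetical cap; the real limit is not given here.
const maxTokenLen = 512

var (
	ErrNotBearer    = errors.New("authorization header is not a bearer token")
	ErrTokenTooLong = errors.New("bearer token exceeds length cap")
)

// ParseBearer extracts the token from an Authorization header value.
// The scheme is matched case-insensitively, and oversized tokens are
// rejected before they can reach the lookup query.
func ParseBearer(header string) (string, error) {
	scheme, token, found := strings.Cut(header, " ")
	if !found || !strings.EqualFold(scheme, "Bearer") {
		return "", ErrNotBearer
	}
	token = strings.TrimSpace(token)
	if token == "" {
		return "", ErrNotBearer
	}
	if len(token) > maxTokenLen {
		return "", ErrTokenTooLong
	}
	return token, nil
}

func main() {
	for _, h := range []string{"Bearer abc123", "bearer abc123", "Basic abc123", "Bearer "} {
		token, err := ParseBearer(h)
		fmt.Printf("%q -> token=%q err=%v\n", h, token, err)
	}
}
```

Sharing one function like this between the gRPC interceptor and the HTTP middleware keeps the prefix handling identical on both surfaces.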

## Consequences

- Revocation, key creation, and scope changes are instant: the next
request hits the new state. No restart, no cache flush.
- Every request pays for one DB round-trip on the auth path. This is
acceptable for the current load and keeps the model simple; if it
becomes a bottleneck, a short-TTL in-process cache keyed by token hash
is the obvious next step.
- The gRPC `UnaryInterceptor` and the HTTP gateway middleware must both
extract and validate the bearer token; they share a parsing helper to
keep the prefix handling consistent across surfaces.
- Adding a new endpoint requires picking the scope it gates on (or
introducing a new scope in the upstream admin UI first) and adding a
`RequireScope` call at the top of the handler.
- Because auth touches the same DB as the data path, an outage in that
DB surfaces as 401 from this service, not as a 5xx — fault-injection
smokes against handler-level error mapping must account for this.
51 changes: 51 additions & 0 deletions docs/agents/domain.md
@@ -0,0 +1,51 @@
# Domain Docs

How the engineering skills should consume this repo's domain documentation when exploring the codebase.

## Before exploring, read these

- **`CONTEXT.md`** at the repo root, or
- **`CONTEXT-MAP.md`** at the repo root if it exists — it points at one `CONTEXT.md` per context. Read each one relevant to the topic.
- **`docs/adr/`** — read ADRs that touch the area you're about to work in. In multi-context repos, also check `src/<context>/docs/adr/` for context-scoped decisions.

If any of these files doesn't exist, **proceed silently**. Don't flag its absence; don't suggest creating it upfront. The producer skill (`/grill-with-docs`) creates them lazily when terms or decisions actually get resolved.

## File structure

Single-context repo (most repos):

```
/
├── CONTEXT.md
├── docs/adr/
│   ├── 0001-event-sourced-orders.md
│   └── 0002-postgres-for-write-model.md
└── src/
```

Multi-context repo (presence of `CONTEXT-MAP.md` at the root):

```
/
├── CONTEXT-MAP.md
├── docs/adr/                  ← system-wide decisions
└── src/
    ├── ordering/
    │   ├── CONTEXT.md
    │   └── docs/adr/          ← context-specific decisions
    └── billing/
        ├── CONTEXT.md
        └── docs/adr/
```

## Use the glossary's vocabulary

When your output names a domain concept (in an issue title, a refactor proposal, a hypothesis, a test name), use the term as defined in `CONTEXT.md`. Don't drift to synonyms the glossary explicitly avoids.

If the concept you need isn't in the glossary yet, that's a signal — either you're inventing language the project doesn't use (reconsider) or there's a real gap (note it for `/grill-with-docs`).

## Flag ADR conflicts

If your output contradicts an existing ADR, surface it explicitly rather than silently overriding:

> _Contradicts ADR-0007 (event-sourced orders) — but worth reopening because…_
22 changes: 22 additions & 0 deletions docs/agents/issue-tracker.md
@@ -0,0 +1,22 @@
# Issue tracker: GitHub

Issues and PRDs for this repo live as GitHub issues. Use the `gh` CLI for all operations.

## Conventions

- **Create an issue**: `gh issue create --title "..." --body "..."`. Use a heredoc for multi-line bodies.
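- **Read an issue**: `gh issue view <number> --comments`. When you need to filter comments with `jq` or fetch labels as well, request JSON instead, e.g. `gh issue view <number> --json title,body,labels,comments`, and add a `--jq` expression.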
- **List issues**: `gh issue list --state open --json number,title,body,labels,comments --jq '[.[] | {number, title, body, labels: [.labels[].name], comments: [.comments[].body]}]'` with appropriate `--label` and `--state` filters.
- **Comment on an issue**: `gh issue comment <number> --body "..."`
- **Apply / remove labels**: `gh issue edit <number> --add-label "..."` / `--remove-label "..."`
- **Close**: `gh issue close <number> --comment "..."`

Infer the repo from `git remote -v` — `gh` does this automatically when run inside a clone.

## When a skill says "publish to the issue tracker"

Create a GitHub issue.

## When a skill says "fetch the relevant ticket"

Run `gh issue view <number> --comments`.
15 changes: 15 additions & 0 deletions docs/agents/triage-labels.md
@@ -0,0 +1,15 @@
# Triage Labels

The skills speak in terms of five canonical triage roles. This file maps those roles to the actual label strings used in this repo's issue tracker.

| Label in mattpocock/skills | Label in our tracker | Meaning |
| -------------------------- | -------------------- | ---------------------------------------- |
| `needs-triage` | `needs-triage` | Maintainer needs to evaluate this issue |
| `needs-info` | `needs-info` | Waiting on reporter for more information |
| `ready-for-agent` | `ready-for-agent` | Fully specified, ready for an AFK agent |
| `ready-for-human` | `ready-for-human` | Requires human implementation |
| `wontfix` | `wontfix` | Will not be actioned |

When a skill mentions a role (e.g. "apply the AFK-ready triage label"), use the corresponding label string from this table.

Edit the "Label in our tracker" column to match whatever vocabulary you actually use.