Goal
Document the usage_events + messages + sessions schema as a publicly versioned spec under docs/specs/session-schema-v1.md, so other tools can write to it / read from it without reverse-engineering. Position StackUnderflow's schema as "OpenTelemetry for AI coding sessions".
Why now
Supporting 16 adapters is rare, and the schema is already de facto stable across v0.7.x. Codifying it as a published spec is the low-cost path to becoming the standard exchange format.
Schema
None — this is documentation. The spec doc references the current SQL CREATE TABLE statements verbatim (and pins the schema_version to 14).
User-visible surface
- New doc docs/specs/session-schema-v1.md with:
  - Overview + design principles (local-first, additive migrations, normalised costs)
  - Full table schemas (DDL) for messages, sessions, projects, usage_events, the 8 marts
  - The 16 normalizer contracts (input shape per provider → usage_events row)
  - The cost_source enum semantics (live / rate_card / estimated / unknown)
  - Conformance test guide (how someone else's tool can validate they're writing valid usage_events)
  - Versioning policy (additive only; new columns require a new schema version)
- New doc docs/specs/adapter-contract.md: what implementing a new adapter requires (the SourceAdapter Protocol, enumerate(), read(since_ts) contract).
- README mention + a short "Spec" link in the navigation.
- HANDOFF architecture map gets a "spec docs" pointer.
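As a starting point for adapter-contract.md, the adapter contract could be sketched roughly as below. Only the names SourceAdapter, enumerate() and read(since_ts) come from this issue; the return types, the SourceRef handle, the record shape and the InMemoryAdapter class are assumptions made for illustration and should be checked against the real code before being published.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol, runtime_checkable


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical handle for one session source (e.g. a log file)."""
    provider: str
    path: str


@runtime_checkable
class SourceAdapter(Protocol):
    # Method names are from the issue; the signatures are assumptions.
    def enumerate(self) -> Iterable[SourceRef]: ...
    def read(self, since_ts: float) -> Iterator[dict]: ...


class InMemoryAdapter:
    """Toy adapter, used only to show the shape of the contract."""

    def __init__(self, records: list[dict]) -> None:
        self._records = records

    def enumerate(self) -> Iterable[SourceRef]:
        return [SourceRef(provider="example", path="<memory>")]

    def read(self, since_ts: float) -> Iterator[dict]:
        # Yield only records strictly newer than the checkpoint.
        for rec in self._records:
            if rec["ts"] > since_ts:
                yield rec


adapter = InMemoryAdapter([{"ts": 1.0}, {"ts": 5.0}])
print(isinstance(adapter, SourceAdapter))
print([rec["ts"] for rec in adapter.read(since_ts=2.0)])
```

Whether read() should be strict or inclusive at since_ts is exactly the kind of detail the contract doc needs to pin down; the sketch picks strict only to have a concrete behaviour.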
Implementation plan
- Read stackunderflow/store/migrations/*.sql and stackunderflow/etl/normalize/base.py to capture the current shape.
- Write docs/specs/session-schema-v1.md (~800-1500 words).
- Write docs/specs/adapter-contract.md (~500-1000 words).
- Add cross-links from README and docs/HANDOFF.md.
- Optional: add a tiny conformance test in tests/stackunderflow/store/test_schema_v1_spec.py that asserts every column listed in the spec doc exists in the live schema.
Tests
- Conformance test (optional but recommended): parses the schema doc, runs PRAGMA table_info(<table>) on a clean store, asserts column lists match.
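One way the conformance check could work, assuming the spec doc embeds the CREATE TABLE statements verbatim: pull the statements out of the doc, replay them into a scratch in-memory SQLite database, and compare PRAGMA table_info output against the live store. The helper names and the cost_usd column are hypothetical, and the regex is a deliberately naive sketch that would need hardening for DDL containing semicolons inside string literals.

```python
import re
import sqlite3

# Naive matcher: each statement runs from CREATE TABLE to the next semicolon.
CREATE_RE = re.compile(r"CREATE TABLE.*?;", re.IGNORECASE | re.DOTALL)


def columns_by_table(conn: sqlite3.Connection) -> dict:
    """Map each user table to its column names, via PRAGMA table_info."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    # Column 1 of each table_info row is the column name.
    return {
        t: [row[1] for row in conn.execute(f'PRAGMA table_info("{t}")')]
        for t in tables
    }


def spec_columns(spec_text: str) -> dict:
    """Replay the DDL found in the spec text into a scratch database."""
    scratch = sqlite3.connect(":memory:")
    for stmt in CREATE_RE.findall(spec_text):
        scratch.execute(stmt)
    return columns_by_table(scratch)


def missing_columns(spec_text: str, live: sqlite3.Connection) -> list:
    """Return 'table.column' for every spec column absent from the live store."""
    live_cols = columns_by_table(live)
    return [
        f"{table}.{col}"
        for table, cols in spec_columns(spec_text).items()
        for col in cols
        if col not in live_cols.get(table, [])
    ]


# Toy data: the live store lacks a column that the spec lists.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE usage_events (id INTEGER PRIMARY KEY, cost_source TEXT)")
spec = (
    "Prose around the DDL.\n"
    "CREATE TABLE usage_events "
    "(id INTEGER PRIMARY KEY, cost_source TEXT, cost_usd REAL);\n"
)
print(missing_columns(spec, live))
```

In the real test, the live connection would come from a freshly migrated store and the assertion would be that missing_columns(...) is empty. Replaying the DDL avoids writing a separate parser for column names.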
Hard parts
- The schema isn't perfectly stable — pin to schema_version=14 and call out which parts may evolve.
- Cost-source semantics need to be precise — they're the highest-impact field for downstream consumers.
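To make the cost_source vocabulary concrete for downstream consumers, the spec could ship a small validator along these lines. The four values are the ones listed in this issue; the CostSource class and parse_cost_source helper are hypothetical names. The sketch fixes only the allowed vocabulary, since the exact meaning of each value is for the spec text to define.

```python
from enum import Enum


class CostSource(str, Enum):
    """Allowed cost_source values, as listed in the issue."""
    LIVE = "live"
    RATE_CARD = "rate_card"
    ESTIMATED = "estimated"
    UNKNOWN = "unknown"


def parse_cost_source(raw: str) -> CostSource:
    """Validate a raw cost_source string, rejecting anything outside the enum."""
    try:
        return CostSource(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in CostSource)
        raise ValueError(
            f"invalid cost_source {raw!r}; expected one of: {allowed}"
        ) from None


print(parse_cost_source("rate_card"))
```

Rejecting unrecognised strings, rather than silently mapping them to unknown, keeps a writer's typo from looking like a legitimate "cost not known" row. Whether the real store should behave that way is a design decision for the spec.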
Out of scope
- Proposing the spec to an external body (W3C, OpenTelemetry, etc.): too large for one issue.
- Cross-language SDKs.
- Webhook / push API for tools to write into the store (separate spec).
Dependencies
Estimated effort
Size XS — single agent, ~30 min. (Docs-heavy but the source of truth is already in the SQL files.)
Hard rules
- DO NOT touch versions / CHANGELOG headings.
- Branch: docs/session-schema-spec off main.