Spec 12: open session-schema spec — publish the event schema as an open standard #91

@0bserver07

Goal

Document the usage_events, messages, and sessions schema as a publicly versioned spec under docs/specs/session-schema-v1.md, so other tools can read from it and write to it without reverse-engineering the tables. Position StackUnderflow's schema as "OpenTelemetry for AI coding sessions".

Why now

Shipping 16 adapters is rare, and the schema is already de facto stable across v0.7.x. Codifying it as a published spec is the low-cost path to becoming the standard exchange format.

Schema

None — this is documentation. The spec doc references the current SQL CREATE TABLE statements verbatim (and pins the schema_version to 14).

User-visible surface

  • New doc docs/specs/session-schema-v1.md with:
    • Overview + design principles (local-first, additive migrations, normalised costs)
    • Full table schemas (DDL) for messages, sessions, projects, usage_events, the 8 marts
    • The 16 normalizer contracts (input shape per provider → usage_events row)
    • The cost_source enum semantics (live / rate_card / estimated / unknown)
    • Conformance test guide (how someone else's tool can validate they're writing valid usage_events)
    • Versioning policy (additive only; new columns require a new schema version)
  • New doc docs/specs/adapter-contract.md — what implementing a new adapter requires (the SourceAdapter Protocol, enumerate(), read(since_ts) contract).
  • README mention + a short "Spec" link in the navigation.
  • HANDOFF architecture map gets a "spec docs" pointer.
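
A minimal sketch of what adapter-contract.md might describe, assuming the SourceAdapter Protocol exposes enumerate() and read(since_ts) roughly as named above. The SessionRef and RawRecord types, the name attribute, the float-seconds timestamp, and the exclusive since_ts cut-off are all illustrative assumptions, not the project's actual signatures.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class SessionRef:
    """Hypothetical handle for one session discovered by an adapter."""
    provider: str
    session_id: str
    path: str


@dataclass(frozen=True)
class RawRecord:
    """Hypothetical provider-native record, before normalization."""
    session_id: str
    ts: float  # seconds since the Unix epoch (assumed unit)
    payload: dict


@runtime_checkable
class SourceAdapter(Protocol):
    """Assumed shape of the contract; the real signatures may differ."""

    name: str

    def enumerate(self) -> Iterable[SessionRef]:
        """List the sessions this adapter can see."""
        ...

    def read(self, since_ts: Optional[float] = None) -> Iterator[RawRecord]:
        """Yield records, optionally only those newer than since_ts."""
        ...


class InMemoryAdapter:
    """Toy adapter used only to exercise the Protocol."""

    name = "in-memory"

    def __init__(self, records):
        self._records = sorted(records, key=lambda r: r.ts)

    def enumerate(self):
        seen = []
        for record in self._records:
            if record.session_id not in seen:
                seen.append(record.session_id)
        return [SessionRef(self.name, sid, f"mem://{sid}") for sid in seen]

    def read(self, since_ts=None):
        for record in self._records:
            # Assumption: since_ts is an exclusive lower bound.
            if since_ts is None or record.ts > since_ts:
                yield record
```

Whether since_ts is inclusive or exclusive is exactly the kind of detail the contract doc should state, since it decides whether an incremental read re-emits the last record it already saw.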

Implementation plan

  1. Read stackunderflow/store/migrations/*.sql and stackunderflow/etl/normalize/base.py to capture the current shape.
  2. Write docs/specs/session-schema-v1.md (~800-1500 words).
  3. Write docs/specs/adapter-contract.md (~500-1000 words).
  4. Add cross-links from README and docs/HANDOFF.md.
  5. Optional: add a tiny conformance test in tests/stackunderflow/store/test_schema_v1_spec.py that asserts every column listed in the spec doc exists in the live schema.

Tests

  • Conformance test (optional but recommended): parses the schema doc, runs PRAGMA table_info(<table>) on a clean store, asserts column lists match.
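
One way the conformance check could be sketched, assuming the spec doc embeds its DDL in fenced sql blocks. Rather than parsing column names with a regex, this version runs the spec's own CREATE TABLE statements in a scratch in-memory database and compares PRAGMA table_info output on both sides. The function names are hypothetical, and the live store is represented by any open sqlite3 connection.

```python
import re
import sqlite3


def table_columns(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each user table to its column names, in declaration order."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return {
        table: [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for table in tables
    }


def spec_columns(spec_markdown: str) -> dict[str, list[str]]:
    """Execute every fenced sql block from the spec in a scratch database."""
    scratch = sqlite3.connect(":memory:")
    # `{3} matches a literal triple-backtick fence opener and closer.
    for block in re.findall(r"`{3}sql\n(.*?)`{3}", spec_markdown, flags=re.DOTALL):
        scratch.executescript(block)
    return table_columns(scratch)


def missing_from_live(spec: dict, live: dict) -> list[str]:
    """Report spec tables and columns that the live schema lacks."""
    problems = []
    for table, columns in spec.items():
        if table not in live:
            problems.append(f"missing table: {table}")
            continue
        for column in columns:
            if column not in live[table]:
                problems.append(f"missing column: {table}.{column}")
    return problems
```

This variant only checks that everything the spec lists exists in the live schema, so a store that has gained additive columns since the pinned version still passes. A stricter test could assert that the two dictionaries are equal.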

Hard parts

  • The schema isn't perfectly stable — pin to schema_version=14 and call out which parts may evolve.
  • Cost-source semantics need to be precise — they're the highest-impact field for downstream consumers.
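
To make the cost_source point concrete, here is a small sketch of how a downstream writer might validate the field before inserting a usage_events row. Only the four values come from this issue. The CostSource class and parse_cost_source helper are hypothetical names, and the meaning of each value is deliberately left for the spec doc to define.

```python
from enum import Enum


class CostSource(str, Enum):
    """The four cost_source values named in this issue (semantics: see spec)."""

    LIVE = "live"
    RATE_CARD = "rate_card"
    ESTIMATED = "estimated"
    UNKNOWN = "unknown"


def parse_cost_source(value) -> CostSource:
    """Return the matching enum member, or raise ValueError with the allowed set."""
    try:
        return CostSource(value)
    except ValueError:
        allowed = [member.value for member in CostSource]
        raise ValueError(
            f"invalid cost_source {value!r}; expected one of {allowed}"
        ) from None
```

Rejecting unrecognised strings, rather than silently mapping them to unknown, is a design choice in this sketch. The spec should say which behaviour conforming writers and readers are expected to follow.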

Out of scope

  • Proposing the spec to an external body (W3C, OpenTelemetry, etc.): too large for one issue.
  • Cross-language SDKs.
  • Webhook / push API for tools to write into the store (separate spec).

Dependencies

  • None.

Estimated effort

Size XS — single agent, ~30 min. (Docs-heavy but the source of truth is already in the SQL files.)

Hard rules

  • DO NOT touch versions / CHANGELOG headings.
  • Branch: docs/session-schema-spec off main.

Metadata

Labels

  • documentation: Improvements or additions to documentation
  • size-xs: ~10 min agent run
  • spec: Spec/feature for an agent to implement
  • wave-1: Wave 1: independent foundations
