
feat: persist Record.speed to messages table (close fast-mode SQLite gap)#48

Merged
0bserver07 merged 1 commit into main from feat/fast-mode-store-schema on May 1, 2026

@0bserver07
Owner

Summary

PR #44 added detection of the Anthropic Opus priority/fast tier (service_tier="priority") to the in-process pipeline: Record.speed, compute_cost(..., speed=...), and aggregator collectors keyed by (model, speed). The SQLite messages table had no speed column, however, so every SQL-driven cost path silently re-billed fast records at the standard 1× rate. This PR closes that gap.

Verified gap

Synthetic Opus session with one priority + one standard message of identical token counts (1000 in, 500 out each):

| | Before this PR | After this PR |
|---|---|---|
| Standard slice | $0.0525 | $0.0525 |
| Fast slice | $0.0525 (wrong, billed standard) | $0.3150 (6× multiplier applied) |
| Total reported | $0.1050 | $0.3675 |

A 3.5× understatement on the 50/50 split. Pure-fast sessions were under-reported by 6×.
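
The figures in the table can be reproduced with a short calculation. This is only a sketch: the per-million-token rates are assumptions chosen because they yield the $0.0525 standard figure, `message_cost` is a hypothetical stand-in for compute_cost, and the speed value "fast" is assumed (the PR only confirms 'standard' as the default).

```python
# Assumed rates in USD per million tokens; they reproduce the $0.0525
# standard figure above but are not taken from the project's pricer.
INPUT_USD_PER_MTOK = 15.00
OUTPUT_USD_PER_MTOK = 75.00
FAST_MULTIPLIER = 6  # the 6x multiplier stated in the PR description


def message_cost(input_tokens: int, output_tokens: int, speed: str = "standard") -> float:
    """Hypothetical stand-in for compute_cost(..., speed=...)."""
    base = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    # "fast" as the non-default speed value is an assumption.
    return base * (FAST_MULTIPLIER if speed == "fast" else 1)


standard = message_cost(1000, 500, "standard")
fast = message_cost(1000, 500, "fast")

before = standard + standard  # fast record wrongly billed as standard
after = standard + fast       # fast record billed with the multiplier

print(f"before=${before:.4f} after=${after:.4f} ratio={after / before:.1f}x")
```

The ratio of the two totals is the 3.5× understatement quoted above; with only fast messages the ratio would equal the multiplier itself.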

Changes

  • Migration v003_messages_speed.sql adds messages.speed TEXT NOT NULL DEFAULT 'standard'. Existing rows backfill to 'standard' via the DEFAULT. This is the conservative direction: a pre-migration priority record stays under-charged at the standard rate (the bug being fixed), whereas over-charging existing records would be worse.
  • schema.apply is now reentrant for ALTER TABLE migrations — guards via PRAGMA table_info(messages) so a partial-application state (column added by hand, or a previous run crashed before bumping user_version) recovers cleanly. CURRENT_VERSION = 3.
  • Writer (stackunderflow/ingest/writer.py) binds rec.speed into the new column.
  • Cost-bearing SQL paths all bucket by (model, speed) and thread speed= into compute_cost:
    • store/queries.get_global_stats and cross_project_daily_totals
    • services/compare._fetch_messages
    • services/yield_tracker._compute_cost_for_session
    • reports/export._load_messages_grouped + _models_from_messages
    • reports/aggregate.build_report
    • routes/commands._interaction_to_command (mixed-tier sessions)
  • MessageRow typed dataclass gains speed: str = "standard".
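
The reentrant migration guard described above could look roughly like the following. This is a minimal sketch, not the project's schema.apply: `column_exists` and `apply_v003` are hypothetical names, and the table is reduced to two columns for illustration.

```python
import sqlite3

CURRENT_VERSION = 3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def apply_v003(conn: sqlite3.Connection) -> None:
    """Hypothetical reentrant v003 step (not the project's actual code)."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version >= CURRENT_VERSION:
        return  # already migrated
    # Skip the ALTER if the column is already there, e.g. added by hand or
    # left behind by a run that crashed before bumping user_version.
    if not column_exists(conn, "messages", "speed"):
        conn.execute(
            "ALTER TABLE messages ADD COLUMN speed TEXT NOT NULL DEFAULT 'standard'"
        )
    # PRAGMA values cannot be bound as parameters, hence the f-string.
    conn.execute(f"PRAGMA user_version = {CURRENT_VERSION}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, model TEXT)")
conn.execute("INSERT INTO messages (model) VALUES ('opus')")
conn.execute("PRAGMA user_version = 2")

apply_v003(conn)
apply_v003(conn)  # second call returns early without touching the schema
```

Checking the column before altering means the version bump can always run, so a half-applied database converges on the same end state as a clean one.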

Test plan

  • pytest tests/ -q: 1127 passed, 2 skipped (1115 baseline + 12 new)
  • ruff check clean on all modified files
  • End-to-end smoke: the real ClaudeAdapter ingests synthetic JSONL with service_tier="priority", the speed column round-trips, and get_global_stats reports the 6×-multiplied cost
  • Migration is idempotent: schema.apply() on a DB where the speed column already exists adds nothing and still bumps PRAGMA user_version to 3
  • Sonnet-on-fast-tier still bills 1× (only Opus families get the multiplier per the AnthropicPricer contract)
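
The bucketing checked by these tests can be sketched as a GROUP BY over (model, speed) followed by a per-bucket cost call. Everything here is illustrative: the token column names, the short model keys, the rates, the "fast" speed value, and the `bucket_cost` helper are assumptions, not the project's schema or pricer.

```python
import sqlite3

# Assumed rates (USD per million tokens: input, output), illustration only.
RATES = {"opus": (15.0, 75.0), "sonnet": (3.0, 15.0)}
FAST_MULTIPLIER = 6


def bucket_cost(model: str, speed: str, tokens_in: int, tokens_out: int) -> float:
    """Hypothetical stand-in for compute_cost(..., speed=...)."""
    rate_in, rate_out = RATES[model]
    base = (tokens_in * rate_in + tokens_out * rate_out) / 1_000_000
    # Only Opus gets the multiplier, matching the PR's note on the pricer.
    return base * (FAST_MULTIPLIER if model == "opus" and speed == "fast" else 1)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (model TEXT, speed TEXT NOT NULL DEFAULT 'standard',"
    " input_tokens INTEGER, output_tokens INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        ("opus", "standard", 1000, 500),
        ("opus", "fast", 1000, 500),
        ("sonnet", "fast", 1000, 500),
    ],
)

# Grouping by speed as well as model keeps fast and standard tokens apart,
# so each bucket can be priced at its own rate.
rows = conn.execute(
    "SELECT model, speed, SUM(input_tokens), SUM(output_tokens)"
    " FROM messages GROUP BY model, speed"
).fetchall()
costs = {(m, s): bucket_cost(m, s, tin, tout) for m, s, tin, tout in rows}
```

Grouping by model alone would merge the two Opus rows into one bucket, which is how the fast slice ended up priced at the standard rate before this PR.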

12 new tests across:

  • tests/stackunderflow/store/test_migration_v003.py (5: column shape, default backfill, idempotent re-apply, version bump, default-on-bare-INSERT)
  • tests/stackunderflow/store/test_queries.py (4: fast-mode arithmetic, standard-only no-regression, Sonnet-no-multiplier, cross_project_daily_totals carries speed)
  • tests/stackunderflow/ingest/test_fast_mode_end_to_end.py (3: adapter→writer→DB→query round-trip, the get_global_stats cost story, get_project_stats full-pipeline cost story)

Completes the fast-mode work end-to-end through the dashboard.

…gap)

PR #44 added Anthropic Opus priority/fast tier (service_tier="priority")
detection in the in-process pipeline (Record.speed, compute_cost speed=...,
aggregator collectors keyed by (model, speed)) but the SQLite messages
table had no speed column, so every SQL-driven cost path silently
re-billed fast records at the standard 1x rate.

Verified gap with a synthetic Opus session containing one priority + one
standard message of identical token counts: SQL-driven cost returned
$0.1050 (both billed standard) when it should have returned $0.3675
(fast slice 6x multiplied) — a 3.5x understatement on the 50/50 split,
6x for pure-fast sessions.

Changes:
- New migration v003_messages_speed.sql adds messages.speed TEXT NOT NULL
  DEFAULT 'standard'. Existing rows backfill to 'standard' via the
  DEFAULT (the conservative direction — under-charging a priority record
  at standard is the bug we're fixing; over-charging would be worse).
- schema.apply guards ALTER TABLE migrations with PRAGMA table_info so a
  partial-application state (column added by hand or previous run crashed
  before bumping user_version) recovers cleanly. CURRENT_VERSION = 3.
- ingest/writer.py binds rec.speed into the new column.
- store/queries.get_global_stats groups by (day, model, speed) and threads
  speed= into compute_cost. cross_project_daily_totals appends speed to
  the result tuple.
- services/compare._fetch_messages, services/yield_tracker, reports/export
  (_load_messages_grouped + _models_from_messages), reports/aggregate, and
  routes/commands._interaction_to_command all bucket by (model, speed).
- store/types.MessageRow gains speed: str = "standard".

Tests: 12 new across test_migration_v003.py (column shape, backfill,
idempotent re-apply, version bump), test_queries.py (speed-aware
get_global_stats arithmetic, standard-only no-regression, Sonnet-fast
no-multiplier, cross_project_daily_totals carries speed), and
test_fast_mode_end_to_end.py (full adapter→writer→DB→query round-trip
on a synthetic Claude JSONL with service_tier="priority"). 1127 passed,
2 skipped (1115 baseline + 12 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0bserver07 merged commit e5bbe93 into main on May 1, 2026
9 checks passed