fix: dedupe /api/projects across providers #5
Merged
Schema has UNIQUE(provider, slug) so a project used through both Claude and Codex got two rows. Frontend rendered them as separate projects with the same dir_name, breaking sort and showing duplicates (e.g. SutroYaro once with $4645, again with $0).

Group rows by slug in /api/projects, merge stats additively (sum tokens / commands / cost; min first_message_date; max last_message_date; weighted-mean averages by command count). Schema unchanged; fix is presentation-layer only.

Verified on this machine: 166 → 159 projects, 7 → 0 duplicate slugs. chimera total_cost: $7,379 (claude only) → $7,414 (claude + codex).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0bserver07 added a commit that referenced this pull request on May 20, 2026
* fix(api): dedupe /api/projects across providers (claude + codex)

  Schema has UNIQUE(provider, slug) so a project used through both Claude and Codex got two rows. Frontend rendered them as separate projects with the same dir_name, breaking sort and showing duplicates (e.g. SutroYaro once with $4645, again with $0).

  Group rows by slug in /api/projects, merge stats additively (sum tokens / commands / cost; min first_message_date; max last_message_date; weighted-mean averages by command count). Schema unchanged; fix is presentation-layer only.

  Verified on this machine: 166 → 159 projects, 7 → 0 duplicate slugs. chimera total_cost: $7,379 (claude only) → $7,414 (claude + codex).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* release: 0.3.3 - dedupe projects across providers

* chore: bump version to 0.3.3

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0bserver07 added a commit that referenced this pull request on May 20, 2026
Ships HANDOFF §"What's left" #5: a UNION-ALL VIEW over per-month ``messages_YYYYMM`` partition tables behind the existing ``messages`` name. Future-proofs the store at multi-year scale without touching read code.

What lands
----------

* ``store/migrations/v008_messages_partitioning.py``: idempotent .py migration. Discovers existing months by ``substr(timestamp, 1, 7)``, splits rows into ``messages_YYYYMM`` (or ``messages_unknown`` for malformed timestamps), rebuilds ``usage_events`` to drop the FK on ``source_message_fk`` (FKs to a view aren't enforceable), drops the base ``messages`` table, recreates it as a UNION-ALL view, and installs ``_messages_id_seq`` + an INSTEAD OF trigger.
* ``ingest/writer.py``: ``_partition_for(ts)`` routes inserts to the right partition; ``_ensure_partition`` lazily creates new month tables + rebuilds the view + trigger. The INSTEAD OF trigger is the slow path (raw ``INSERT INTO messages`` from tests / tooling); production writes bypass it.
* ``docs/specs/messages-partitioning.md``: design choice (Option A view, not Option B ATTACH), rollback plan, ops rollout for the maintainer's 1.9 GB store.
* ``tests/stackunderflow/store/test_partitioning.py``: 12 tests cover migration on fresh + seeded DBs, FK-drop verification, writer routing across months, future-month auto-creation, malformed-ts routing, normalize hook end-to-end, backfill end-to-end, the trigger's explicit-id path.
* Spot fixes to existing tests where ``cur.lastrowid`` after ``INSERT INTO messages`` is now meaningless (the trigger's nested insert id doesn't propagate); they read from ``_messages_id_seq`` instead.

Constraint compliance
---------------------

* Did NOT touch ``~/.stackunderflow/store.db``. Migration is reviewed + applied manually by the maintainer per the spec doc.
* Pre: 1598 passing, 2 skipped, 11 deselected.
* Post: 1610 passing (12 new partition tests), 2 skipped, 11 deselected.
* ``ruff check`` clean on the new files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
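For readers unfamiliar with the view-over-partitions pattern this commit describes, here is a minimal sketch using Python's `sqlite3`. It is not the code in `ingest/writer.py`: the function names `partition_for`, `ensure_partition`, and `insert_message`, the `body` column, and the caller-supplied ids are assumptions made for illustration, and the `_messages_id_seq` table and INSTEAD OF trigger are left out.

```python
import re
import sqlite3
from typing import Optional

# Matches the leading "YYYY-MM" of an ISO-8601 timestamp.
_MONTH_RE = re.compile(r"^(\d{4})-(\d{2})")


def partition_for(ts: Optional[str]) -> str:
    """Map a timestamp to its month table; malformed values go to messages_unknown."""
    m = _MONTH_RE.match(ts or "")
    if not m or not 1 <= int(m.group(2)) <= 12:
        return "messages_unknown"
    return f"messages_{m.group(1)}{m.group(2)}"


def ensure_partition(conn: sqlite3.Connection, table: str) -> None:
    """Create the partition if missing, then rebuild the UNION ALL view."""
    # `table` only ever comes from partition_for, so it is safe to interpolate.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(id INTEGER PRIMARY KEY, timestamp TEXT, body TEXT)"  # hypothetical columns
    )
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name GLOB 'messages_*' ORDER BY name"
        )
    ]
    union = " UNION ALL ".join(f"SELECT id, timestamp, body FROM {t}" for t in tables)
    conn.execute("DROP VIEW IF EXISTS messages")
    conn.execute(f"CREATE VIEW messages AS {union}")


def insert_message(conn: sqlite3.Connection, msg_id: int, ts: str, body: str) -> str:
    """Write directly to the right partition and return its table name."""
    table = partition_for(ts)
    ensure_partition(conn, table)
    conn.execute(
        f"INSERT INTO {table} (id, timestamp, body) VALUES (?, ?, ?)",
        (msg_id, ts, body),
    )
    return table


conn = sqlite3.connect(":memory:")
routed = [
    insert_message(conn, 1, "2026-04-30T23:59:00Z", "a"),
    insert_message(conn, 2, "2026-05-20T10:00:00Z", "b"),
    insert_message(conn, 3, "not-a-date", "c"),
]
# Readers keep querying the old name; the view spans every partition.
all_ids = conn.execute("SELECT id FROM messages ORDER BY id").fetchall()
```

Because reads go through the `messages` view, existing queries keep working while writes land in month-sized tables, which is the property the commit message calls "without touching read code".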
0bserver07 added a commit that referenced this pull request on May 20, 2026
HANDOFF #5 asked whether /api/cost-data's command_costs block could migrate from the aggregator to command_mart, mirroring the Wave 5 tool_costs migration. Investigation confirms the shape mismatch the HANDOFF flagged is structural, not stale:

- aggregator: list of per-Interaction rows (interaction_id, session_id, prompt_preview, timestamp, tools_used, steps, models_used, had_error, cost, tokens), top 50 desc by cost
- command_mart: (day, project_id, command_name) rollup with {event_count, cost_usd, tokens_in, tokens_out, session_count}

command_mart_for_project returns sums over command_name; the helper is already wired and feeds reports/optimize.py + the CLI report command. It is NOT a drop-in source for this route's response shape because the mart's grain discards the per-Interaction fields the frontend's CommandCostList (CommandCost[] in analytics.ts) reads. Extending the helper cannot recover what the grain doesn't store.

Changes:

- routes/cost.py: expand the _overlay_mart_rollups docstring to spell out the structural reason command_costs stays aggregator-driven
- tests/stackunderflow/routes/test_cost_command_mart_overlay.py: three new tests lock in the verified behaviour: populated command_mart does not swap out aggregator output, empty command_mart does the same, and the helper's rollup shape is asserted (no per-Interaction fields)

A future per-Interaction-grain mart (e.g. interaction_mart) could power this overlay; the new tests will need updating then.

Tests: 2312 → 2315 passing (+3). ruff baseline preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem
Same project, used through both Claude and Codex, appears twice in the projects list. Sort by Est. Cost looks broken because each duplicate has different stats (one row has the Claude cost, the other has Codex cost or $0).
Reproduced live: 166 total entries with 7 duplicate `dir_name`s including SutroYaro, chimera, StackUnderflow.
Cause
`projects` schema has `UNIQUE (provider, slug)`, so the same slug can exist once per provider. The Codex adapter (added in v0.3.1) registered the same projects under `provider='codex'`, producing a second row per slug. The `/api/projects` endpoint passes them through as-is.
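To make the cause concrete, here is a small sketch against an in-memory SQLite database. Only the `UNIQUE (provider, slug)` constraint and the `provider` / `slug` column names come from this PR; the rest of the table definition and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE projects (
        id INTEGER PRIMARY KEY,      -- hypothetical surrogate key
        provider TEXT NOT NULL,
        slug TEXT NOT NULL,
        UNIQUE (provider, slug)
    )
    """
)

# The same slug under two providers is accepted by the constraint.
conn.executemany(
    "INSERT INTO projects (provider, slug) VALUES (?, ?)",
    [("claude", "chimera"), ("codex", "chimera"), ("claude", "solo")],
)

# Only an exact (provider, slug) repeat is rejected.
try:
    conn.execute("INSERT INTO projects (provider, slug) VALUES ('claude', 'chimera')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Slugs that would show up more than once in an ungrouped project list.
duplicate_slugs = conn.execute(
    "SELECT slug, COUNT(*) FROM projects "
    "GROUP BY slug HAVING COUNT(*) > 1 ORDER BY slug"
).fetchall()
```

A query of this shape is one way to count duplicate slugs before and after a change like this one.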
Fix
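Group `project_rows` by `slug` in `routes/projects.py` and merge:

- tokens, commands, cost: summed
- `first_message_date`: min
- `last_message_date`: max
- averages: weighted mean by command count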
Schema unchanged. Presentation-layer fix only.
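A minimal sketch of the merge rules from the commit message (sum tokens / commands / cost, min first date, max last date, command-weighted averages). This is not the code in `routes/projects.py`: the function name `merge_project_rows`, the field names other than `slug`, `provider`, `first_message_date`, and `last_message_date`, and the added `providers` list are assumptions, and dates are assumed to be ISO-8601 strings so that string comparison orders them correctly.

```python
from collections import defaultdict
from typing import Any

# Hypothetical stat keys; the real response fields may be named differently.
SUM_FIELDS = ("total_tokens", "total_commands", "total_cost")
AVG_FIELDS = ("avg_tokens_per_command",)


def merge_project_rows(project_rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse rows that share a slug into one merged row per project."""
    by_slug: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for row in project_rows:
        by_slug[row["slug"]].append(row)

    merged = []
    for rows in by_slug.values():
        out = dict(rows[0])  # copy non-stat fields from the first row
        out.pop("provider", None)
        out["providers"] = sorted({r["provider"] for r in rows})

        # Additive stats are summed across providers.
        for field in SUM_FIELDS:
            out[field] = sum(r.get(field) or 0 for r in rows)

        # Date range spans the earliest first message to the latest last message.
        firsts = [r["first_message_date"] for r in rows if r.get("first_message_date")]
        lasts = [r["last_message_date"] for r in rows if r.get("last_message_date")]
        out["first_message_date"] = min(firsts) if firsts else None
        out["last_message_date"] = max(lasts) if lasts else None

        # Averages are weighted by each row's own command count.
        total_commands = out["total_commands"]
        for field in AVG_FIELDS:
            weighted = sum(
                (r.get(field) or 0) * (r.get("total_commands") or 0) for r in rows
            )
            out[field] = weighted / total_commands if total_commands else 0.0

        merged.append(out)
    return merged


rows = [
    {"provider": "claude", "slug": "chimera", "total_tokens": 1000,
     "total_commands": 30, "total_cost": 7379.0, "avg_tokens_per_command": 100.0,
     "first_message_date": "2026-01-05", "last_message_date": "2026-05-01"},
    {"provider": "codex", "slug": "chimera", "total_tokens": 200,
     "total_commands": 10, "total_cost": 35.0, "avg_tokens_per_command": 20.0,
     "first_message_date": "2026-03-01", "last_message_date": "2026-05-19"},
]
merged = merge_project_rows(rows)
```

The sample costs mirror the chimera figures reported in this PR ($7,379 claude only, $7,414 combined); the token, command, and date values are made up. Weighting averages by command count keeps a provider with very few commands from skewing the merged value.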
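Verification
Verified on this machine: 166 → 159 projects, 7 → 0 duplicate slugs. chimera `total_cost`: $7,379 (claude only) → $7,414 (claude + codex).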
🤖 Generated with Claude Code