feat(gain): colony gain drift + savings_drift_report MCP tool #575

Merged
NagyVikt merged 2 commits into main from worktree-agent-a7b5923da2c1c97e3
May 15, 2026

Conversation

@NagyVikt
Collaborator

Summary

Adds a long-run regression detector that flags tools whose median tokens-per-call has drifted up or down between a baseline window and a recent window. Pure read path against existing `mcp_metrics` — no schema change.

  • CLI: `colony gain drift [--baseline-days 14] [--recent-days 3] [--min-calls 20] [--threshold 1.25] [--down-threshold 0.75]`
  • MCP: `savings_drift_report` (mirrors `savings_report` shape)
  • Storage: `mcpTokenDriftPerOperation()` + `mcpMetricsMinTs()`
  • Classifier: `up_drift` / `down_drift` / `new_tool` / `gone` / `insufficient_data` / `stable`
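The classifier labels above could be assigned roughly as follows. This is a minimal sketch, not the merged implementation: the `WindowStats` shape, the `classifyDrift` name, the order in which the rules are checked, and the use of inclusive comparisons are all assumptions. The option values are the ones shown in the CLI usage line, assumed here to be the defaults.

```typescript
// Hypothetical sketch: names, shapes, and rule ordering are assumptions,
// not the code merged in this PR.
type DriftClass =
  | "up_drift"
  | "down_drift"
  | "new_tool"
  | "gone"
  | "insufficient_data"
  | "stable";

interface WindowStats {
  calls: number;            // calls observed in the window
  medianTpc: number | null; // median tokens-per-call; null when the window is empty
}

interface DriftOptions {
  minCalls: number;      // --min-calls
  threshold: number;     // --threshold (recent/baseline ratio that counts as up-drift)
  downThreshold: number; // --down-threshold (ratio that counts as down-drift)
}

// Values taken from the CLI usage line; assumed to be the defaults.
const DEFAULTS: DriftOptions = { minCalls: 20, threshold: 1.25, downThreshold: 0.75 };

function classifyDrift(
  baseline: WindowStats,
  recent: WindowStats,
  opts: DriftOptions = DEFAULTS,
): DriftClass {
  // A tool seen in only one window is new or gone (assumed to win over min-calls).
  if (baseline.calls === 0 && recent.calls === 0) return "insufficient_data";
  if (baseline.calls === 0) return "new_tool";
  if (recent.calls === 0) return "gone";

  // Too few calls in either window to trust the median.
  if (baseline.calls < opts.minCalls || recent.calls < opts.minCalls) {
    return "insufficient_data";
  }
  if (baseline.medianTpc === null || recent.medianTpc === null || baseline.medianTpc <= 0) {
    return "insufficient_data";
  }

  // Compare medians as a ratio of recent to baseline.
  const ratio = recent.medianTpc / baseline.medianTpc;
  if (ratio >= opts.threshold) return "up_drift";
  if (ratio <= opts.downThreshold) return "down_drift";
  return "stable";
}
```

Under these assumptions, a tool whose median moves from 400 to 520 tokens per call (ratio 1.3) with enough calls in both windows would be reported as `up_drift`.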

Closes `⏳ Long-run regression detector that flags when a tool's median tokens-per-call drifts up` under README §v0.x "Receipts and observability".

OpenSpec

`openspec/changes/gain-drift-detector-2026-05-16/CHANGE.md`

Design adaptation

The original Plan used a correlated `LIMIT 1 OFFSET (COUNT(*)-1)/2` subquery for the per-operation median, but SQLite rejects references to an outer aggregate inside a scalar subquery's `OFFSET`. Switched to `ROW_NUMBER() OVER (PARTITION BY operation ORDER BY tpc)` with a CTE join: same semantics, cleaner, and supported by the bundled better-sqlite3 (verified on SQLite 3.49.2).
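For reviewers who want to see the shape of the row-number approach, here is a rough sketch. The `calls` relation and its `operation` / `tpc` columns are hypothetical stand-ins, and the sketch uses a window `COUNT(*)` where the PR uses a CTE join, so treat the SQL as illustrative only (it is not executed below). The TypeScript function is an in-memory equivalent that picks the same row: the lower median, at zero-based index `(n - 1) / 2` with integer division, which matches the original `OFFSET (COUNT(*)-1)/2`.

```typescript
// Illustrative SQL only; table and column names are hypothetical and the
// string is not executed in this sketch.
const MEDIAN_TPC_SQL = `
WITH ranked AS (
  SELECT operation,
         tpc,
         ROW_NUMBER() OVER (PARTITION BY operation ORDER BY tpc) AS rn,
         COUNT(*)     OVER (PARTITION BY operation)              AS cnt
  FROM calls
)
SELECT operation, tpc AS median_tpc, cnt AS calls
FROM ranked
WHERE rn = (cnt - 1) / 2 + 1
`;

interface CallRow {
  operation: string;
  tpc: number; // tokens-per-call for a single call
}

interface MedianRow {
  medianTpc: number;
  calls: number;
}

// In-memory equivalent of the query above: group by operation, sort by tpc,
// and take the element at zero-based index floor((n - 1) / 2).
function lowerMedianPerOperation(rows: CallRow[]): Map<string, MedianRow> {
  const groups = new Map<string, number[]>();
  for (const row of rows) {
    const bucket = groups.get(row.operation) ?? [];
    bucket.push(row.tpc);
    groups.set(row.operation, bucket);
  }

  const out = new Map<string, MedianRow>();
  groups.forEach((values, operation) => {
    const sorted = values.slice().sort((a, b) => a - b);
    const idx = Math.floor((sorted.length - 1) / 2); // same row as rn = (cnt - 1) / 2 + 1
    out.set(operation, { medianTpc: sorted[idx]!, calls: sorted.length });
  });
  return out;
}
```

With an even number of calls this selects the lower of the two middle values rather than averaging them, which is what the original `OFFSET` expression would also have returned.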

Test plan

  • `pnpm --filter @colony/storage test` — 165 pass (18 mcp-metrics, +7 new)
  • `pnpm --filter colonyq typecheck` — clean
  • `pnpm --filter colonyq test -- gain-drift` — 5 pass
  • `pnpm --filter @colony/mcp-server test` — 299 pass
  • `pnpm --filter colonyq build` — clean

Merge order

Touches `packages/storage/src/storage.ts`. If #573 and #574 (scenarios) merge first, this should be third. The coach-mode PR also extends `storage.ts` independently — whichever lands second needs a trivial rebase.

🤖 Generated with Claude Code

NagyVikt and others added 2 commits May 16, 2026 01:33
- Median tokens-per-call comparison across non-overlapping windows
- Classifies up_drift / down_drift / new_tool / gone / insufficient_data / stable
- No schema change — reads existing mcp_metrics table

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@NagyVikt NagyVikt merged commit a83eeea into main May 15, 2026
1 of 3 checks passed