Full spec: docs/hardening-roadmap-2026-04-16.md#h-7
Description
Two layers of fiction stack: ESTIMATED_COST_PER_SCORING_CALL = 0.005 (handlers.ts:315) and AI_COST_PER_CANDIDATE = 0.01 (budget-estimator.ts:14) are flat constants. Neither varies by model, input length, or cache hits. Budget gate trusts these, so Opus 4.7 runs silently blow the cap.
Current State
- … usage in return types.
Suggested Fix
- AIProvider.chat() / .structuredOutput() return types with usage: { inputTokens, outputTokens, cachedTokens }.
- … (response.usage).
- ModelPricing map in @sourcerer/ai/pricing.ts keyed by model ID → { inputPer1M, outputPer1M, cacheReadPer1M }. Seed: claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5, gpt-4o, gpt-4o-mini.
- costIncurred in PhaseResult.
- … model, use ModelPricing + rough heuristic (~1K in, ~500 out per call) instead of flat 0.01.
Verification
- pnpm build passes
- pnpm test passes
- pnpm typecheck clean
- run-meta.json cost ≈ sum of per-call costs (±2%)
- … maxCostUsd: 0.01
- sourcerer runs show <id> displays token counts per phase
Automation Hints
scope: packages/core/src/ai.ts, packages/ai/src, packages/scoring/src, apps/cli/src/handlers.ts, apps/cli/src/budget-estimator.ts
do-not-touch: data adapters
approach: refactor-types
risk: medium
max-files-changed: 12
blocked-by: none
bail-if: scoring tests fail
Priority
High
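Illustrative sketch
To make the Suggested Fix concrete, here is a minimal sketch of how a pricing map keyed by model ID could turn reported token usage into a per-call cost, and how the ~1K in / ~500 out heuristic could replace the flat 0.01 estimate. Only ModelPricing, the usage fields, and the rate field names come from this issue. The function names (computeCallCost, estimateCandidateCost, costMatches) are hypothetical, the rates are placeholders rather than real vendor prices, and the sketch assumes inputTokens excludes cachedTokens, which may not hold for every provider.

```typescript
// Illustrative sketch only. Names not quoted in the issue are hypothetical.

/** Token usage as proposed for the AIProvider return types. */
interface Usage {
  inputTokens: number;  // ASSUMPTION: excludes tokens read from the cache
  outputTokens: number;
  cachedTokens: number; // tokens served from the prompt cache
}

/** Rates in USD per one million tokens. */
interface ModelRates {
  inputPer1M: number;
  outputPer1M: number;
  cacheReadPer1M: number;
}

// PLACEHOLDER rates, NOT real vendor prices. Replace with published pricing.
const ModelPricing: Record<string, ModelRates> = {
  "claude-opus-4-7": { inputPer1M: 10, outputPer1M: 50, cacheReadPer1M: 1 },
  "claude-sonnet-4-6": { inputPer1M: 2, outputPer1M: 10, cacheReadPer1M: 0.2 },
  "claude-haiku-4-5": { inputPer1M: 1, outputPer1M: 5, cacheReadPer1M: 0.1 },
  "gpt-4o": { inputPer1M: 2, outputPer1M: 10, cacheReadPer1M: 1 },
  "gpt-4o-mini": { inputPer1M: 0.1, outputPer1M: 0.5, cacheReadPer1M: 0.05 },
};

/** Cost in USD of one call, derived from the usage the provider reported. */
function computeCallCost(model: string, usage: Usage): number {
  const rates = ModelPricing[model];
  if (!rates) {
    // Fail loudly instead of silently falling back to a flat constant.
    throw new Error(`No pricing entry for model "${model}"`);
  }
  return (
    (usage.inputTokens * rates.inputPer1M +
      usage.outputTokens * rates.outputPer1M +
      usage.cachedTokens * rates.cacheReadPer1M) /
    1_000_000
  );
}

/** Pre-run estimate using the rough heuristic of ~1K in, ~500 out per call. */
function estimateCandidateCost(model: string, callsPerCandidate = 1): number {
  const heuristic: Usage = { inputTokens: 1000, outputTokens: 500, cachedTokens: 0 };
  return callsPerCandidate * computeCallCost(model, heuristic);
}

/** Checks that a reported total is within a relative tolerance of the per-call sum. */
function costMatches(reportedUsd: number, perCallUsd: number[], tolerance = 0.02): boolean {
  const total = perCallUsd.reduce((sum, cost) => sum + cost, 0);
  return Math.abs(reportedUsd - total) <= tolerance * total;
}
```

With these placeholder rates, the heuristic estimate for claude-opus-4-7 differs from the one for claude-haiku-4-5 by a factor of ten, which is the kind of spread a single flat constant cannot represent. The costMatches helper mirrors the ±2% verification item and could back a test that compares the run-meta.json total against the summed per-call costs.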