Spec 18: mode recommender (heuristic v1) — 'this task fits a Sonnet, you used Opus'

## Goal
Recommend the cheapest model that fits the task, based on the user's own past sessions. v1 is a heuristic — pattern-match the prompt + tool-call shape against historical sessions, suggest the model that solved similar tasks at lowest cost.

## Why now
Every team is overpaying for the wrong model on the wrong task today and nobody can prove it. A heuristic v1 is shippable now without the full benchmark engine (Spec 26 — that's the v2).

## Schema
**v016 — `mode_recommendations`** table:
- `id INTEGER PRIMARY KEY`
- `task_pattern_hash TEXT` (md5 of normalized prompt features)
- `recommended_model TEXT`
- `confidence REAL` ([0, 1])
- `evidence_session_ids TEXT` (JSON array)
- `created_ts TEXT`
- `last_used_ts TEXT`
- `INDEX (task_pattern_hash)`

Additive, `IF NOT EXISTS`-guarded.

## User-visible surface
- **CLI**: `stackunderflow recommend mode --prompt "<text>" [--current-model X]` returns recommendation + cost-delta.
- **MCP tool**: `recommend_mode(prompt, current_model?)` returns same.
- **Meta-agent tool**: same.
- **UI**: optional — defer to v2.

## Implementation plan
1. New service `stackunderflow/services/mode_recommender.py`:
   - `_extract_features(prompt)` — pull intent (build/fix/refactor/test/explore), language hints, file-mention count, code-block count.
   - `_hash_features(features) -> str` — stable hash for caching.
   - `_find_similar_past_sessions(conn, features, limit=20)` — substring + intent match.
   - `recommend(conn, prompt, current_model=None) -> dict`.
2. Cache results in `mode_recommendations` (24h TTL).
3. CLI + MCP + meta-agent wiring.

## Tests
- Feature-extraction snapshots (prompt → feature dict).
- Recommendation shape on a seeded store with 5+ historical sessions.
- Cache hit + miss + TTL eviction.
- Empty-store returns `confidence=0.0` with a clean message ("no historical data").

## Hard parts
- Defining "similar task" is squishy. Start simple: same intent + token-count band + language overlap. Document the heuristic; mark it v1 explicitly so users don't expect ML.
- Cost-delta math needs to use post-v0.8.0 `usage_events.cost_usd` (not the legacy compute_cost path).

## Out of scope
- LLM-based feature extraction (defer to v2).
- Per-team / per-project recommenders (defer).
- The full comparative benchmark (Spec 26).

## Dependencies
- None blocking. Builds on v0.8.0 cost-fix.

## Estimated effort
**Size M** — single agent, ~1 hr.

## Hard rules
- DO NOT touch versions / CHANGELOG headings.
- Pre-assigned schema slot: **v016**.
- Branch: `feat/mode-recommender-v1` off main.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Spec 18: mode recommender (heuristic v1) — 'this task fits a Sonnet, you used Opus' #88

Goal

Why now

Schema

User-visible surface

Implementation plan

Tests

Hard parts

Out of scope

Dependencies

Estimated effort

Hard rules

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Spec 18: mode recommender (heuristic v1) — 'this task fits a Sonnet, you used Opus' #88

Description

Goal

Why now

Schema

User-visible surface

Implementation plan

Tests

Hard parts

Out of scope

Dependencies

Estimated effort

Hard rules

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions