
feat(#51): AI-powered model recommendations for cost optimization #73

Open
GalDayan wants to merge 6 commits into main from feature/issue-51-model-recommendations

Conversation

@GalDayan
Contributor

Summary

Implements the backend for AI-powered model recommendations (closes #51).

What's included

3-layer architecture:

  1. Classifier (modelClassifier.ts) — Analyzes session complexity (simple/moderate/complex) using heuristics:

    • Message length and count
    • Tool usage patterns (file ops, code execution, sub-agents)
    • Session duration and model diversity
  2. Mapper (modelRecommendations.ts) — Maps complexity to cheaper model alternatives:

    • Opus → Haiku for simple tasks, Sonnet for moderate
    • GPT-4 → GPT-3.5 for simple, GPT-4-Turbo for moderate
    • Calculates potential cost savings with rounding
  3. API Endpoints (routes.ts):

    • GET /api/recommendations/:sessionId — Per-session recommendation
    • GET /api/recommendations/summary — Bulk savings across all agents (the money view)
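
A minimal sketch of how the classifier and mapper layers might fit together. The thresholds (100/300 characters, 0.3/1.0 tool ratio, 20 messages, 30 minutes) are borrowed from the boundary tests listed in a later commit on this PR, but the way they are combined into a score is an assumption, as are the names `SessionFeatures`, `classify`, and `recommend` and the model labels. The actual modelClassifier.ts and modelRecommendations.ts may differ.

```typescript
type Complexity = "simple" | "moderate" | "complex";

// Hypothetical feature bundle; the real SessionDetail shape is not shown in this PR.
interface SessionFeatures {
  avgMessageChars: number;
  messageCount: number;
  toolCallsPerMessage: number;
  durationMinutes: number;
}

// Assumed scoring: message length and tool ratio add 0-2 points each,
// message count and duration add 0-1 point each.
function classify(f: SessionFeatures): Complexity {
  let score = 0;
  score += f.avgMessageChars > 300 ? 2 : f.avgMessageChars > 100 ? 1 : 0;
  score += f.toolCallsPerMessage > 1.0 ? 2 : f.toolCallsPerMessage > 0.3 ? 1 : 0;
  score += f.messageCount > 20 ? 1 : 0;
  score += f.durationMinutes > 30 ? 1 : 0;
  if (score >= 4) return "complex";
  if (score >= 2) return "moderate";
  return "simple";
}

// Mapping taken from the PR description; the returned ids are illustrative labels.
function recommend(currentModel: string, complexity: Complexity): string {
  const m = currentModel.toLowerCase();
  if (complexity === "complex") return currentModel; // never downgrade complex work
  if (m.includes("opus")) return complexity === "simple" ? "claude-haiku" : "claude-sonnet";
  if (m.includes("gpt-4")) return complexity === "simple" ? "gpt-3.5" : "gpt-4-turbo";
  return currentModel; // unknown, empty, or already-cheap model: no change
}
```

Returning the current model unchanged for unknown or empty names keeps the mapper total, which is one way to cover the "unknown/empty model names" edge case.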

Edge cases handled

  • Zero-cost sessions
  • Unknown/empty model names
  • Sessions with no messages
  • Savings rounding to 4 decimal places
  • Route ordering (static /summary before param /:sessionId)
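
To illustrate why the static /summary route has to be registered before the param route /:sessionId, here is a tiny first-match-wins matcher. It is not Express, only a sketch of the same ordering rule; `Route`, `matchRoute`, and the route names are hypothetical.

```typescript
interface Route {
  pattern: string; // e.g. "/recommendations/:sessionId"
  name: string;
}

// Returns the name of the first route whose segments all match the path.
// A pattern segment starting with ":" matches any non-empty path segment.
function matchRoute(routes: Route[], path: string): string | undefined {
  const parts = path.split("/");
  for (const route of routes) {
    const segs = route.pattern.split("/");
    if (segs.length !== parts.length) continue;
    const ok = segs.every((seg, i) => {
      const part = parts[i] ?? "";
      return seg.charAt(0) === ":" ? part.length > 0 : seg === part;
    });
    if (ok) return route.name;
  }
  return undefined;
}

const summary: Route = { pattern: "/recommendations/summary", name: "summary" };
const bySession: Route = { pattern: "/recommendations/:sessionId", name: "bySession" };

// Param route first: "summary" is swallowed as a sessionId.
const wrongOrder = matchRoute([bySession, summary], "/recommendations/summary");
// Static route first: the summary handler wins.
const rightOrder = matchRoute([summary, bySession], "/recommendations/summary");
```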

Tests

  • 23 unit tests (vitest) — 10 classifier, 13 recommendations
  • All passing, clean tsc --noEmit

What's next

  • Frontend: Anas building the Recommendations dashboard tab/card
  • v1 is heuristic-based — no auto-switching, no per-message analysis

cc @GalDayan

…API endpoints

- modelClassifier.ts: session complexity classifier (simple/moderate/complex)
  based on message length, tool usage patterns, and feature analysis
- modelRecommendations.ts: maps complexity to cheaper model alternatives
  with cost savings estimates
- routes.ts: GET /api/recommendations/:sessionId and /api/recommendations/summary

Implements backend for AI-powered model recommendations (issue #51)
…ge cases

- 23 tests covering classifier and recommendation mapper
- Edge cases: zero-cost sessions, empty models, no messages, unknown models
- Savings rounding to 4 decimal places
- Bulk recommendations filter completed sessions, respect limits
- Added vitest as test framework
- All tests passing, build clean
Prevents Express from matching 'summary' as a sessionId param.
Copilot AI review requested due to automatic review settings March 25, 2026 03:23
Copilot AI left a comment

Pull request overview

Adds backend support for AI-assisted “cheaper model” recommendations by classifying session complexity and estimating potential cost savings, exposed via new recommendations API endpoints.

Changes:

  • Introduces heuristic session complexity classification (simple|moderate|complex).
  • Adds model recommendation + savings estimation logic for single sessions and recent-session summaries.
  • Exposes new API routes for per-session and summary recommendations, and adds vitest unit tests.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| backend/src/routes.ts | Adds /api/recommendations/summary and /api/recommendations/:sessionId endpoints. |
| backend/src/modelRecommendations.ts | Implements complexity→model mapping and savings estimation (single + bulk). |
| backend/src/modelClassifier.ts | Implements heuristic session complexity classifier (detail + summary). |
| backend/src/tests/modelRecommendations.test.ts | Adds unit tests for recommendation mapping and savings edge cases. |
| backend/src/tests/modelClassifier.test.ts | Adds unit tests for complexity classification heuristics. |
| backend/package.json | Adds vitest scripts and dev dependency. |
| backend/package-lock.json | Locks vitest and transitive dependencies. |
Files not reviewed (1)
  • backend/package-lock.json: Language not supported


Comment on lines +149 to +154
// Calculate potential savings
const currentCost = session.costUsd;
const estimatedNewCost = estimateSessionCost(session.tokenCount, recommendation.model);
const savings = Math.max(0, currentCost - estimatedNewCost);
const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;

Copilot AI Mar 25, 2026

Savings are computed even when the recommendation doesn’t actually change the model (or changes only the label, e.g. anthropic/claude-haiku-3 → claude-haiku). Because estimateSessionCost is only a rough estimate, this can report non-zero “savings” for a no-op recommendation. Consider short-circuiting: if the recommended tier/normalized model is the same as the current tier, set savings to 0 (and percentage to 0) instead of estimating.
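
One possible shape for that short-circuit, as a sketch only. The tier keywords, `modelTier`, and `computeSavings` are hypothetical names introduced here; the PR's own normalization logic is not shown and may work differently.

```typescript
interface Savings {
  costUsd: number;
  percentage: number;
}

// Assumed tier keywords; longer keywords come first so "gpt-4-turbo" is not read as "gpt-4".
const TIER_KEYWORDS = ["haiku", "sonnet", "opus", "gpt-3.5", "gpt-4-turbo", "gpt-4"];

// Maps a model id to a coarse tier so that label-only differences compare equal.
function modelTier(model: string): string {
  const m = model.toLowerCase();
  return TIER_KEYWORDS.find((k) => m.includes(k)) ?? m;
}

function computeSavings(
  currentModel: string,
  recommendedModel: string,
  currentCost: number,
  estimatedNewCost: number,
): Savings {
  // No-op recommendation or zero-cost session: report no savings instead of estimating.
  if (currentCost <= 0 || modelTier(currentModel) === modelTier(recommendedModel)) {
    return { costUsd: 0, percentage: 0 };
  }
  const costUsd = Math.max(0, currentCost - estimatedNewCost);
  return { costUsd, percentage: (costUsd / currentCost) * 100 };
}
```

With this guard, `computeSavings("anthropic/claude-haiku-3", "claude-haiku", 1.0, 0.8)` reports zero savings even though the rough estimate is lower than the recorded cost.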

Copilot uses AI. Check for mistakes.
Comment on lines +182 to +189
const complexity = classifySessionSummary(session);
const recommendation = mapComplexityToModel(complexity, session.model);

const currentCost = session.costUsd;
const estimatedNewCost = estimateSessionCost(session.tokenCount, recommendation.model);
const savings = Math.max(0, currentCost - estimatedNewCost);
const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;

Copilot AI Mar 25, 2026

In bulk mode, savings are estimated even when mapComplexityToModel effectively recommends the same model tier (e.g. already on Haiku / GPT-3.5, or default “no change”). Since estimateSessionCost is approximate, this can produce misleading positive savings for a no-op. Consider detecting “no real change” (e.g. comparing normalized model ids) and forcing savings/percentage to 0 in those cases.

Comment thread backend/src/routes.ts
router.get("/recommendations/summary", async (req: Request, res: Response) => {
  try {
    const profile = req.query.profile as string | undefined;
    const limit = parseInt(req.query.limit as string) || 20;
Copilot AI Mar 25, 2026

limit is parsed without a radix and without clamping/validation. Elsewhere in this file (e.g. the earlier /sessions listing) limit is parsed with base 10 and clamped to a safe range. Consider matching that pattern here (e.g. parseInt(..., 10) + min/max bounds) to avoid negative/NaN/overly large limits affecting behavior/perf.

Suggested change
- const limit = parseInt(req.query.limit as string) || 20;
+ const rawLimit = parseInt(req.query.limit as string, 10);
+ const limit =
+   !Number.isNaN(rawLimit) ? Math.max(1, Math.min(rawLimit, 100)) : 20;
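
The same parse-then-clamp idea can be pulled into a small helper, which makes the edge cases easy to check. `parseLimit` is a hypothetical name, and the default of 20 and bounds of 1 to 100 are simply the values from the suggestion above, not confirmed project constants.

```typescript
// Parses a query-string limit with radix 10, falling back to a default
// when the value is missing or not numeric, then clamping to [min, max].
function parseLimit(raw: unknown, fallback = 20, min = 1, max = 100): number {
  const n = typeof raw === "string" ? parseInt(raw, 10) : NaN;
  return Number.isNaN(n) ? fallback : Math.max(min, Math.min(n, max));
}

const examples = {
  missing: parseLimit(undefined), // falls back to the default
  garbage: parseLimit("abc"),     // NaN also falls back
  negative: parseLimit("-5"),     // clamped up to min
  huge: parseLimit("1000"),       // clamped down to max
  hex: parseLimit("0x10"),        // radix 10 stops at "x", parses 0, clamped to min
};
```

Note one behavioral difference from the `|| 20` version: an explicit `limit=0` is clamped to 1 here rather than replaced by the default.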

Comment on lines +1 to +2
import { SessionDetail, SessionSummary } from "./sessions.js";
import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier.js";
Copilot AI Mar 25, 2026

This relative import uses a .js extension ("./sessions.js"), but the source file is sessions.ts and the project is compiled as module: "commonjs" without NodeNext-style module resolution. This is likely to fail type-check/build with "Cannot find module". Use the same extensionless style used across the backend (e.g. "./sessions").

Suggested change
- import { SessionDetail, SessionSummary } from "./sessions.js";
- import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier.js";
+ import { SessionDetail, SessionSummary } from "./sessions";
+ import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier";

@@ -0,0 +1,148 @@
import { SessionDetail, SessionSummary } from "./sessions.js";
Copilot AI Mar 25, 2026

This relative import uses a .js extension ("./sessions.js"), but the codebase generally uses extensionless relative imports and the backend tsconfig is module: "commonjs" (not NodeNext). To avoid module resolution/build issues, import "./sessions" instead.

Suggested change
- import { SessionDetail, SessionSummary } from "./sessions.js";
+ import { SessionDetail, SessionSummary } from "./sessions";

Comment on lines +190 to +213
    return {
      sessionId: session.id,
      title: session.title,
      recommendation: {
        currentModel: session.model,
        recommendedModel: recommendation.model,
        complexity,
        confidence: 0.6, // Lower confidence without full detail
        reasons: [recommendation.reason],
        potentialSavings: {
          costUsd: savings,
          percentage: savingsPercentage,
        },
      },
    };
  })
  .filter((r) => r.recommendation.potentialSavings.costUsd > 0.001); // Filter out negligible savings

const totalSavings = recommendations.reduce((sum, r) => sum + r.recommendation.potentialSavings.costUsd, 0);

return {
  totalSessions: recentSessions.length,
  potentialTotalSavings: totalSavings,
  recommendations,
Copilot AI Mar 25, 2026

Bulk recommendations return potentialSavings.costUsd and percentage without rounding, while the per-session endpoint rounds (4dp for cost / 1dp for percentage) and the PR description calls out rounding as an edge case. Consider rounding here as well (and rounding potentialTotalSavings) to keep the API consistent and avoid long floating-point decimals in the UI.
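
A sketch of what consistent rounding in the bulk path could look like, using the 4-decimal cost and 1-decimal percentage precision this comment attributes to the per-session endpoint. `roundTo`, `roundSavings`, and `totalSavings` are hypothetical helper names, not functions from the PR.

```typescript
interface PotentialSavings {
  costUsd: number;
  percentage: number;
}

// Rounds to a fixed number of decimal places.
function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Applies the per-session precision: 4dp for cost, 1dp for percentage.
function roundSavings(s: PotentialSavings): PotentialSavings {
  return { costUsd: roundTo(s.costUsd, 4), percentage: roundTo(s.percentage, 1) };
}

// Sums the costs, then rounds the total so float noise does not leak into the API.
function totalSavings(items: PotentialSavings[]): number {
  return roundTo(items.reduce((sum, s) => sum + s.costUsd, 0), 4);
}
```

Rounding the total separately matters because summing values such as 0.1 and 0.2 yields 0.30000000000000004 in floating point, which `totalSavings` brings back to 0.3.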

- Add BulkRecommendationSummary, SessionRecommendation types
- Add API functions: getRecommendationSummary, getSessionRecommendation
- Create RecommendationsTab component with summary card + session list
- Wire tab into dashboard page with fetch effect
- Responsive desktop/mobile layouts, complexity badges, savings display
- Shows total potential savings, avg per session, individual recommendations
- SessionClient: stack stats on mobile, full-width messages, responsive timeline dots
- ProjectClient: hide dividers on mobile, stack stats, truncate model names, responsive timeline
- Both: sm: breakpoints, break-words for long text, min-h touch targets on buttons
- Pattern matches dashboard tabs (mobile cards, desktop layouts)
@GalDayan
Contributor Author

Frontend review

Reviewed the UI integration:

  • RecommendationsTab renders cleanly — summary card + session list with responsive layouts
  • Types align with backend response shapes
  • API functions use same fetchJson pattern as existing endpoints
  • Tab wired with lazy-fetch on tab switch (no unnecessary API calls)
  • Complexity badges, savings display, confidence indicators all working
  • Mobile/desktop layouts follow existing dashboard patterns

Ready for merge from the frontend side.

…ndary thresholds, GPT/Gemini paths, savings precision

- Classifier: boundary thresholds (100/300 char, 0.3/1.0 tool ratio, 20/21 msg count, 30/31 min duration)
- Classifier: case-sensitive tool names (Read/Write/Edit), null content handling, confidence caps
- Recommendations: GPT-4 → gpt-3.5/gpt-4-turbo paths, Gemini pro → flash paths
- Recommendations: cheap model upgrade hints for complex tasks (haiku, gpt-3.5)
- Bulk: negligible savings filter, mixed model families, idle sessions, confidence level
- Savings: non-negative guarantee, percentage validation for known cost reductions

Coverage: 23 → 88 tests (4 test files)
Contributor Author

@GalDayan GalDayan left a comment

🎨 Frontend Review — PR #73 (Model Recommendations)

Reviewed all 6 frontend files. Overall: solid work, merge-ready. A few notes:

✅ What's Good

  • RecommendationsTab.tsx — Clean component structure. Skeleton loading states are well done (matching existing patterns). Empty state handles both zero-sessions and all-optimal cases. Nice touch.
  • Responsive layouts — The dual desktop/mobile layout in RecommendationsTab is good UX. The responsive fixes in ProjectClient and SessionClient (flex-col on mobile, hidden dividers, break-words) are exactly the kind of polish we need.
  • Types: types.ts additions are well-typed. SessionComplexity as a union type is clean. BulkRecommendationSummary shape matches the API contract.
  • API layer: api.ts functions follow existing patterns (fetchJson, URLSearchParams). Good.
  • Tab integration — The useEffect with cancellation token for recommendations fetch is correct. Only fetches when tab is active — no wasted calls.

🟡 Minor Observations (non-blocking)

  1. formatModel() could live in a shared util — It's useful outside RecommendationsTab (e.g., session detail views). Not urgent, but worth extracting later.

  2. Confidence dot — The inline style={{ backgroundColor: ... }} with ternary chains works but could be a complexityConfig-style lookup for consistency. Fine for now.

  3. overflow-x-auto on tab bar — Smart addition for the 5th tab. Good catch preventing horizontal overflow on narrow viewports.

  4. getSessionRecommendation() in api.ts — Exported but not used anywhere in this PR. Presumably for the per-session view later? No issue, just noting it.

  5. Average percentage calculation — Line ~140: Math.round() on the reduce result — the division by recommendations.length is outside Math.round(). Works correctly, just dense. A variable would improve readability.
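
For observation 5, a sketch of how named intermediate values could make the average easier to read. The component code is not shown in this thread, so `averageSavingsPercentage` and its input shape are assumptions; the order of operations (round the sum, then divide) follows the description above.

```typescript
// Averages savings percentages, rounding the summed value before dividing,
// as observation 5 describes. Returns 0 for an empty list to avoid dividing by zero.
function averageSavingsPercentage(percentages: number[]): number {
  if (percentages.length === 0) return 0;
  const total = percentages.reduce((sum, p) => sum + p, 0);
  const roundedTotal = Math.round(total);
  return roundedTotal / percentages.length;
}
```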

🟢 Verdict

Ship it. Clean component, good responsive handling, proper TypeScript, follows existing patterns. The responsive fixes in ProjectClient/SessionClient are a nice bonus — those pages needed the mobile love.

— Anas 🎨



Development

Successfully merging this pull request may close these issues.

Feature: AI-powered model recommendations for cost optimization

2 participants