feat(costs): per-task and per-agent breakdowns + task board cost badge (#558) #591
Conversation
Closes #558. Extends the /costs page (added in #557) with two analytics sections and adds an inline cost badge to each task card on the /tasks board.

Backend (Python):
- New TokenRepository.get_top_tasks_by_cost(days, limit) and get_costs_by_agent(days), aggregating the workspace token_usage table.
- New endpoints under /api/v2/costs:
  - GET /tasks → top 10 tasks with titles, agent, tokens, cost
  - GET /by-agent → per-agent rollup + total input/output tokens
- Title resolution falls back to a placeholder when token_usage references a task no longer present in the workspace, so the table never blanks out.
- TokenUsage.task_id widened to Optional[Union[int, str]] so v2 UUID task IDs are preserved end-to-end (react_agent.py was int()-casting and storing NULL for every v2 record, blocking per-task analytics).

Frontend (TypeScript / Next.js):
- New types TaskCostEntry, TaskCostsResponse, AgentCostEntry, AgentCostsResponse and matching costsApi.getTopTasks / getByAgent.
- New CostsView sections: TopTasksTable and AgentCostBars (pure-Tailwind horizontal bars + input/output token split row, no charting library).
- TaskCard renders a small MoneyBag02Icon + cost badge with a tooltip showing input/output token breakdown when costMap has a positive entry for that task. costMap threads through TaskBoardView → Content → Column as an optional prop; non-breaking.
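The int()-cast failure mode called out above can be sketched in a few lines. This is a hedged illustration of the bug described, not the actual react_agent.py code — the function names are invented for the demo:

```python
# Illustrative repro of the task-ID data loss: int()-casting a v2 UUID string
# raises ValueError, which the old path swallowed into None (stored as NULL);
# passing the ID through verbatim preserves both v1 (int) and v2 (UUID) IDs.
from typing import Optional, Union


def old_task_id(raw: Union[int, str]) -> Optional[int]:
    try:
        return int(raw)  # fine for v1 integer IDs
    except (TypeError, ValueError):
        return None      # every v2 UUID collapsed to NULL


def new_task_id(raw: Union[int, str]) -> Union[int, str]:
    return raw  # preserved end-to-end


uuid_id = "3f2b8c1e-9d4a-4f6b-8a21-5c7e0d9b1a2f"
print(old_task_id(uuid_id))  # None — the per-task analytics data loss
print(new_task_id(uuid_id))  # the UUID survives
```

With the old helper, every v2 record aggregates under NULL, which is why per-task analytics stayed permanently empty until this fix.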
## Walkthrough

This PR implements cost analytics infrastructure and UI to help users identify expensive tasks and cost drivers by agent. It updates the core task ID model to support both integer and UUID-string identifiers (v1 and v2 workspace schemas), adds backend repository queries for per-task and per-agent cost aggregation, exposes two new REST API endpoints, and integrates cost visualization into the costs page and task board.

## Changes

Cost Analytics Infrastructure
Frontend Cost Analytics UI
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Warning: there were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint skipped: no ESLint configuration detected in root package.json.
# CodeFRAME Development Guidelines

Last updated: 2026-05-11

## Product Vision

CodeFrame is a project delivery system: Think → Build → Prove → Ship. It owns the edges of the AI coding pipeline — everything BEFORE code gets written (PRD, specification, task decomposition) and everything AFTER (verification gates, quality memory, deployment). The actual code writing is delegated to frontier coding agents (Claude Code, Codex, OpenCode). CodeFrame does not compete with coding agents. It orchestrates them.

Status: CLI ✅ | Server ✅ | ReAct agent ✅ | Web UI ✅ | Agent adapters ✅ | Multi-provider LLM ✅ | Next: Phase 4A

If you are an agent working in this repo: do not improvise architecture. Follow the documents listed below.

## Primary Contract (MUST FOLLOW)
Rule 0: If a change does not directly support the Think → Build → Prove → Ship pipeline, do not implement it.

## Current Focus: Phase 4A

Phase 5.1 is complete — Settings page now ships three working tabs: Agent (#554), API Keys (#555), and PROOF9 Defaults + Workspace Config (#556). Backend: Phase 3.5C is complete.

Next, in order:
## Architecture Rules (non-negotiable)

1) Core must be headless
Core is allowed to: read/write durable state (SQLite/filesystem), run orchestration/worker loops, emit events to an append-only event log, call adapters via interfaces (LLM, git, fs).

2) CLI must not require a server

Golden Path commands must work from the CLI with no server running. FastAPI is optional, started explicitly.

3) Agent state transitions flow through runtime
This separation prevents duplicate state transitions (e.g., DONE→DONE errors).

4) Legacy can be read, not depended on
5) Keep commits runnable

At all times:

## Current State

v2 Architecture
Phase 3 Web UI (actively developed — not legacy): Next.js 16 App Router, TypeScript, Shadcn/UI, Tailwind CSS, Hugeicons, XTerm.js, WebSocket + SSE.

Shipped pages:

Testing:

## What's implemented

Full feature list in

## Repository Structure

## Commands

### Python / CLI

uv run pytest # All tests
uv run pytest -m v2 # v2 tests only
uv run pytest tests/core/ # Core module tests
uv run pytest tests/lifecycle/ # Lifecycle tests (no live API calls — uses MockProvider)
uv run ruff check .
# Web UI
cd web-ui && npm test
cd web-ui && npm run build

### Golden Path CLI

# Workspace
cf init <repo> [--detect | --tech-stack "..." | --tech-stack-interactive]
cf status
# PRD
cf prd add <file.md>
cf prd show
# Tasks
cf tasks generate
cf tasks list [--status READY]
cf tasks show <id>
# Work — single task
cf work start <task-id> [--execute] [--engine react|plan] [--verbose] [--dry-run]
cf work start <task-id> --execute --stall-timeout 120 --stall-action retry|blocker|fail
cf work start <task-id> --execute --llm-provider openai --llm-model gpt-4o
cf work stop <task-id>
cf work resume <task-id>
cf work follow <task-id> [--tail 50]
cf work diagnose <task-id>
# Work — batch
cf work batch run [<id>...] [--all-ready] [--engine react|plan]
cf work batch run --strategy serial|parallel|auto [--max-parallel 4] [--retry 3]
cf work batch run --all-ready --llm-provider openai --llm-model qwen2.5-coder:7b
cf work batch status|cancel|resume [batch_id]
# Blockers
cf blocker list
cf blocker show <id>
cf blocker answer <id> "answer"
# Quality / State
cf review && cf patch export && cf commit
cf checkpoint create|list|restore
cf summary
# Environment
cf env check|install|doctor
# GitHub PR
cf pr create|status|checks|merge

Note:

## What NOT to do
## Testing / Demoing

Quality check (covers both backend and web UI):

uv run pytest && uv run ruff check .
cd web-ui && npm test && npm run build

New v2 tests: add

Demoing against a sample project (e.g.,
| Doc | Purpose |
|---|---|
| docs/VISION.md | North star: Think → Build → Prove → Ship thesis |
| docs/PRODUCT_ROADMAP.md | Current roadmap — Phase 3.5/4/5 web product completeness |
| docs/GOLDEN_PATH.md | CLI-first workflow contract |
| docs/CLI_WIREFRAME.md | Command → module mapping |
| docs/AGENT_SYSTEM_REFERENCE.md | Component table, model selection, execution flows, self-correction |
| docs/REACT_AGENT_ARCHITECTURE.md | ReAct deep-dive: tools, editor, token management |
| docs/PHASE_3_UI_ARCHITECTURE.md | Web UI architecture (Next.js, pages, components) |
| docs/PHASE_2_DEVELOPER_GUIDE.md | Server layer + v2 router patterns |
| docs/PHASE_2_CLI_API_MAPPING.md | CLI to API endpoint mapping |
| docs/QUICKSTART.md | User-facing quickstart guide |

Archived (completed plans, old gap analyses): docs/archive/

Legacy (v1 reference only): SPRINTS.md, sprints/, specs/, CODEFRAME_SPEC.md
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/ui/test_costs_v2.py (1)
321-324: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add upper-bound validation coverage for `/api/v2/costs/by-agent` `days`.

Line 321 currently verifies only the lower bound. Add a `days > 90` rejection test to prevent range-validation regressions on this endpoint.

Proposed test addition:
class TestCostsTasksDaysValidation:
@@
     def test_by_agent_below_minimum_rejected(self, test_client):
         response = test_client.get("/api/v2/costs/by-agent?days=0")
         assert response.status_code == 422
+
+    def test_by_agent_above_maximum_rejected(self, test_client):
+        response = test_client.get("/api/v2/costs/by-agent?days=400")
+        assert response.status_code == 422

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ui/test_costs_v2.py` around lines 321 - 324, Add a complementary test to cover the upper-bound validation for the /api/v2/costs/by-agent endpoint: create a new test function (e.g., test_by_agent_above_maximum_rejected) next to test_by_agent_below_minimum_rejected that uses test_client to GET "/api/v2/costs/by-agent?days=91" and asserts response.status_code == 422 so days > 90 is rejected; ensure the test uses the same test_client fixture and naming pattern as the existing test.
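For reference, the 1–90 day-window contract these tests exercise reduces to a simple range check. The sketch below mirrors it in plain Python; the real endpoints presumably enforce it with FastAPI's `Query(ge=1, le=90)`, which is what turns out-of-range values into 422 responses:

```python
# Standalone mirror of the assumed days-window validation (1 <= days <= 90).
# The bounds come from the review discussion; the helper name is illustrative.
def validate_days(days: int, lo: int = 1, hi: int = 90) -> int:
    if not lo <= days <= hi:
        raise ValueError(f"days must be in [{lo}, {hi}], got {days}")
    return days


print(validate_days(30))  # 30
for bad in (0, 91, 400):
    try:
        validate_days(bad)
    except ValueError as exc:
        print(exc)
```

Testing both bounds (0 and 400) at the endpoint level is what protects against a regression in either direction.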
🧹 Nitpick comments (1)
codeframe/persistence/repositories/token_repository.py (1)
369-447: ⚡ Quick win

N+1 query pattern in agent lookup.
The method executes one query per task to find the most-used agent (lines 423-437). For the default limit of 10, this results in 11 total queries. While acceptable for an analytics endpoint, consider consolidating into a single query using a window function or correlated subquery if this becomes a performance concern.
Example optimization:
SELECT task_id, agent_id, input_tokens, output_tokens, total_cost_usd
FROM (
    SELECT task_id, agent_id,
           SUM(input_tokens) AS input_tokens,
           SUM(output_tokens) AS output_tokens,
           SUM(estimated_cost_usd) AS total_cost_usd,
           COUNT(*) AS calls,
           ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) AS rn
    FROM token_usage
    WHERE task_id IS NOT NULL AND timestamp >= ? AND timestamp < ?
    GROUP BY task_id, agent_id
)
WHERE rn = 1
ORDER BY total_cost_usd DESC
LIMIT ?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codeframe/persistence/repositories/token_repository.py` around lines 369 - 447, get_top_tasks_by_cost suffers an N+1 query when resolving the most-used agent per task (the per-row cursor.execute block that queries token_usage for each task_id); replace the per-task lookup with a single consolidated query that computes per-task aggregates and selects the top agent per task (e.g., using a window function ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) or a correlated subquery) so the method returns task_id, agent_id, input_tokens, output_tokens, total_cost_usd in one query and eliminates the looped agent lookup.
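The consolidated query above runs as-is on SQLite (window functions require SQLite ≥ 3.25). Here is a runnable sketch against a throwaway in-memory `token_usage` table — the data is illustrative and the timestamp filter is omitted for brevity; note that, matching the review snippet, the aggregates cover only the dominant agent's rows for each task:

```python
import sqlite3

# Demo of the single-query dominant-agent lookup on an in-memory table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE token_usage (
        task_id TEXT, agent_id TEXT,
        input_tokens INTEGER, output_tokens INTEGER,
        estimated_cost_usd REAL
    );
    INSERT INTO token_usage VALUES
        ('t1', 'react', 100, 50, 0.02),
        ('t1', 'react', 200, 80, 0.03),
        ('t1', 'plan',   10,  5, 0.01),
        ('t2', 'plan',  300, 90, 0.10);
""")
rows = conn.execute("""
    SELECT task_id, agent_id, input_tokens, output_tokens, total_cost_usd
    FROM (
        SELECT task_id, agent_id,
               SUM(input_tokens)       AS input_tokens,
               SUM(output_tokens)      AS output_tokens,
               SUM(estimated_cost_usd) AS total_cost_usd,
               ROW_NUMBER() OVER (PARTITION BY task_id
                                  ORDER BY COUNT(*) DESC) AS rn
        FROM token_usage
        WHERE task_id IS NOT NULL
        GROUP BY task_id, agent_id
    )
    WHERE rn = 1
    ORDER BY total_cost_usd DESC
    LIMIT 10
""").fetchall()
print(rows)  # t2 first (higher cost); t1 attributed to its dominant agent
```

One query replaces the N+1 loop, and the `rn = 1` filter picks the most-used agent per task exactly as the looped version did.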
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@codeframe/ui/routers/costs_v2.py`:
- Around line 180-270: Router file contains business logic (DB opening,
aggregation, task-title lookup) inside functions _open_workspace_conn,
_token_usage_exists, _query_top_tasks and _query_costs_by_agent; move that logic
into a new core service. Create functions in codeframe/core (e.g.,
costs_service.get_top_tasks(db_path, workspace, days, limit) and
costs_service.get_costs_by_agent(db_path, days)) that encapsulate connection
handling, TokenRepository calls, token table existence checks, and task title
resolution (use _placeholder_task_title and tasks_module.get inside the core
service), then import and call those service functions from the router so the
router only adapts request/response. Keep the same behavior on errors (return
empty lists/dicts and log via logger), preserve the existing function names when
refactoring for easy replacement, and update imports/tests to reference the new
costs_service APIs.
In `@web-ui/src/components/tasks/TaskBoardView.tsx`:
- Around line 42-53: costMap is currently built from costsApi.getTopTasks (via
useSWR) which only returns the top-10 tasks, so tasks outside that set never get
badges; change the data fetch to request costs for the actual visible task IDs
(or an API that returns all tasks' costs) and rebuild costMap from that complete
result. Concretely: replace the useSWR call that invokes
costsApi.getTopTasks(workspacePath) with a call that passes the list of visible
task IDs (e.g. costsApi.getCostsForTasks(workspacePath, visibleTaskIds) or an
equivalent endpoint that returns all task costs), use a SWR key that includes
the visibleTaskIds array, and update the useMemo dependency from
[costData?.tasks] to include visibleTaskIds so the Map<string, TaskCostEntry>
built in the costMap logic contains entries for every visible task ID.
In `@web-ui/src/components/tasks/TaskCard.tsx`:
- Around line 13-24: The badge shows "$0.00" for micro-costs; update
formatBadgeCost to treat values < 0.01 specially (return the label "< $0.01"
instead of toFixed(2)), and add a helper (e.g., formatFullCost) that renders
full precision (up to 6 decimals) for use in the badge tooltip; then update the
TaskCard badge label and its tooltip to use formatBadgeCost for the visible
badge and formatFullCost for the tooltip so tiny non-zero costs are not
represented as "$0.00".
- Around line 178-205: The cost badge currently shows for any positive cost but
can display "$0.00" for tiny amounts; update the display condition in the
TaskCard rendering (the block that checks showCostBadge and costEntry and uses
formatBadgeCost) to only render the badge when the value would round to at least
$0.01 (e.g., costEntry.total_cost_usd >= 0.01 or by checking
formatBadgeCost(costEntry.total_cost_usd) !== "$0.00"); if you choose to hide
the badge for these tiny amounts, keep the TooltipContent but show
higher-precision cost (e.g., 4+ decimal places) there so users can see the
actual cost.
---
Outside diff comments:
In `@tests/ui/test_costs_v2.py`:
- Around line 321-324: Add a complementary test to cover the upper-bound
validation for the /api/v2/costs/by-agent endpoint: create a new test function
(e.g., test_by_agent_above_maximum_rejected) next to
test_by_agent_below_minimum_rejected that uses test_client to GET
"/api/v2/costs/by-agent?days=91" and asserts response.status_code == 422 so days
> 90 is rejected; ensure the test uses the same test_client fixture and naming
pattern as the existing test.
---
Nitpick comments:
In `@codeframe/persistence/repositories/token_repository.py`:
- Around line 369-447: get_top_tasks_by_cost suffers an N+1 query when resolving
the most-used agent per task (the per-row cursor.execute block that queries
token_usage for each task_id); replace the per-task lookup with a single
consolidated query that computes per-task aggregates and selects the top agent
per task (e.g., using a window function ROW_NUMBER() OVER (PARTITION BY task_id
ORDER BY COUNT(*) DESC) or a correlated subquery) so the method returns task_id,
agent_id, input_tokens, output_tokens, total_cost_usd in one query and
eliminates the looped agent lookup.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c238e7d1-e5e7-48fd-8f96-b9099b5cd3a9
📒 Files selected for processing (18)
- codeframe/core/models.py
- codeframe/core/react_agent.py
- codeframe/lib/metrics_tracker.py
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
- tests/persistence/test_token_repository_costs.py
- tests/ui/test_costs_v2.py
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/__tests__/components/costs/CostsPage.test.tsx
- web-ui/src/app/costs/page.tsx
- web-ui/src/components/costs/AgentCostBars.tsx
- web-ui/src/components/costs/TopTasksTable.tsx
- web-ui/src/components/tasks/TaskBoardContent.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/components/tasks/TaskColumn.tsx
- web-ui/src/lib/api.ts
- web-ui/src/types/index.ts
def _open_workspace_conn(db_path: str) -> Optional[sqlite3.Connection]:
    """Open the workspace DB or return None if it cannot be read.

    Mirrors _query_costs's tolerance for fresh/locked workspaces: callers
    fall back to an empty response rather than 500'ing the dashboard.
    """
    try:
        conn = sqlite3.connect(db_path)
        conn.row_factory = sqlite3.Row
        return conn
    except sqlite3.Error as e:
        logger.warning("costs: failed to open %s: %s", db_path, e)
        return None


def _token_usage_exists(conn: sqlite3.Connection) -> bool:
    cursor = conn.cursor()
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='token_usage'"
    )
    return cursor.fetchone() is not None


def _query_top_tasks(
    db_path: str, workspace: Workspace, days: int, limit: int = 10,
) -> List[Dict[str, Any]]:
    """Aggregate per-task cost and join titles via workspace.tasks.

    Returns a list of dicts ready for serialization into ``TaskCostEntry``.
    """
    conn = _open_workspace_conn(db_path)
    if conn is None:
        return []

    try:
        if not _token_usage_exists(conn):
            return []
        try:
            repo = TokenRepository(sync_conn=conn)
            rows = repo.get_top_tasks_by_cost(days=days, limit=limit)
        except sqlite3.Error as e:
            logger.warning("costs/tasks: query failed on %s: %s", db_path, e)
            return []
    finally:
        conn.close()

    entries: List[Dict[str, Any]] = []
    for row in rows:
        raw_id = row["task_id"]
        task_id_str = str(raw_id) if raw_id is not None else ""
        title = _placeholder_task_title(task_id_str)
        try:
            task = tasks_module.get(workspace, task_id_str)
            if task is not None:
                title = task.title
        except Exception:
            # Lookup failures are non-fatal — keep the placeholder title.
            logger.debug("costs/tasks: task lookup failed for %s", task_id_str, exc_info=True)

        entries.append({
            "task_id": task_id_str,
            "task_title": title,
            "agent_id": row["agent_id"],
            "input_tokens": row["input_tokens"],
            "output_tokens": row["output_tokens"],
            "total_cost_usd": row["total_cost_usd"],
        })

    return entries


def _query_costs_by_agent(db_path: str, days: int) -> Dict[str, Any]:
    """Aggregate per-agent cost over the window."""
    empty = {"by_agent": [], "total_input_tokens": 0, "total_output_tokens": 0}

    conn = _open_workspace_conn(db_path)
    if conn is None:
        return empty

    try:
        if not _token_usage_exists(conn):
            return empty
        try:
            repo = TokenRepository(sync_conn=conn)
            return repo.get_costs_by_agent(days=days)
        except sqlite3.Error as e:
            logger.warning("costs/by-agent: query failed on %s: %s", db_path, e)
            return empty
    finally:
        conn.close()
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Move aggregation and lookup logic out of the router module.
The connection management, repository aggregation, and task-title resolution here are core/business logic. Keep router handlers as thin request/response adapters and move this logic into codeframe/core/ service code.
As per coding guidelines, codeframe/{core,cli,ui}/**/*.py: "Core domain logic must be implemented in codeframe/core/" and codeframe/ui/routers/**/*.py: "FastAPI routers must be thin adapters over core. Do not implement business logic inside routers."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@codeframe/ui/routers/costs_v2.py` around lines 180 - 270, Router file
contains business logic (DB opening, aggregation, task-title lookup) inside
functions _open_workspace_conn, _token_usage_exists, _query_top_tasks and
_query_costs_by_agent; move that logic into a new core service. Create functions
in codeframe/core (e.g., costs_service.get_top_tasks(db_path, workspace, days,
limit) and costs_service.get_costs_by_agent(db_path, days)) that encapsulate
connection handling, TokenRepository calls, token table existence checks, and
task title resolution (use _placeholder_task_title and tasks_module.get inside
the core service), then import and call those service functions from the router
so the router only adapts request/response. Keep the same behavior on errors
(return empty lists/dicts and log via logger), preserve the existing function
names when refactoring for easy replacement, and update imports/tests to
reference the new costs_service APIs.
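A minimal sketch of the suggested extraction, assuming the `costs_service` module and function names proposed in the review (they are not confirmed by the codebase); the router handler would then reduce to a one-line call into core:

```python
import logging
import sqlite3
from typing import Any, Dict

logger = logging.getLogger(__name__)

# Assumed empty payload shape, taken from the router code quoted above.
_EMPTY: Dict[str, Any] = {
    "by_agent": [], "total_input_tokens": 0, "total_output_tokens": 0,
}


def get_costs_by_agent(db_path: str, days: int) -> Dict[str, Any]:
    """Core-service entry point: per-agent rollup, or an empty payload
    instead of a 500 for fresh/unreadable workspaces."""
    try:
        conn = sqlite3.connect(db_path)
    except sqlite3.Error as e:
        logger.warning("costs/by-agent: failed to open %s: %s", db_path, e)
        return dict(_EMPTY)
    try:
        cur = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type='table' AND name='token_usage'"
        )
        if cur.fetchone() is None:
            return dict(_EMPTY)  # no usage recorded yet
        # Real code would delegate to the repository here, e.g.:
        #   return TokenRepository(sync_conn=conn).get_costs_by_agent(days=days)
        return dict(_EMPTY)
    finally:
        conn.close()


print(get_costs_by_agent(":memory:", 30))
```

With this split, the FastAPI handler only adapts the request and response, which is what the "thin adapters over core" guideline asks for.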
## Code Review — PR 591: Per-task and per-agent cost breakdowns + task board cost badge

This is a solid PR. The v2 UUID task-ID bug fix is the most impactful change here, and the analytics additions are well-scoped and tested. A few items worth addressing before merge.

### What is working well
### Issues to address

1. Cost badge shows $0.00 for sub-cent tasks

File: web-ui/src/components/tasks/TaskCard.tsx

formatBadgeCost uses toFixed(2) for values under $1, so a cost of 0.0012 renders as $0.00 even though showCostBadge is true (the guard is > 0, not >= 0.005). A task that actually consumed tokens would show a misleading zero badge. TopTasksTable and AgentCostBars use 4-6 decimal places via minimumFractionDigits: 4, which is appropriate for AI costs. Consider matching that precision in the badge, or at minimum tightening the guard to match the display threshold (e.g. >= 0.005).

2. N+1 queries in get_top_tasks_by_cost

File: codeframe/persistence/repositories/token_repository.py, lines 154-179

The outer query fetches the top-N tasks, then a second query per row finds the dominant agent — up to 11 queries total for the default limit=10. Fine at current scale, but the same result is achievable in a single CTE. Not a blocker, but worth a TODO comment on the current loop for when the limit grows.

3. _open_workspace_conn duplicates logic already in _query_costs

File: codeframe/ui/routers/costs_v2.py

_open_workspace_conn is a clean extraction, but the original _query_costs function still opens its own connection inline — two code paths for the same operation. Worth consolidating in a follow-up to keep the file consistent.

4. SWR key inconsistency between TaskBoardView and CostsPage

File: web-ui/src/components/tasks/TaskBoardView.tsx

TaskBoardView uses a string key while CostsPage uses an array key. They never share the SWR cache. This is intentionally fine (the board always wants 30 days; the page uses the user-selected range), but a brief comment acknowledging the intentional scope difference would help future readers.

### Minor notes
### Test gap

Neither TopTasksTable nor AgentCostBars has a direct component test; they are only exercised via mocks in CostsPage.test.tsx. The loading-skeleton render paths (data-testid="top-tasks-loading" and data-testid="agent-bars-loading") are untested. Low priority, but worth adding before the /costs page is customer-facing.

### Verdict

The UUID fix alone justifies merging. The analytics additions are clean and complete. Addressing the $0.00 badge edge case (item 1) before merge would prevent user confusion; the rest are low-priority follow-ups.
- TaskCard badge: format sub-cent costs at 4dp instead of collapsing to $0.00. Adds a regression test against the $0.0042 case.
- /api/v2/costs/tasks: expose a `limit` query param (1..1000, default 10). Analytics view keeps the top-10 default; TaskBoardView now requests limit=1000 so the badge map covers every task with spend, not just the top 10 (a board task outside the top-10 would otherwise never show).
- react_agent: simplify task_id persistence — the upstream caller never passes None, so drop the unreachable branch.
- TokenRepository: TODO note on the N+1 dominant-agent lookup. Fine at current scale; flagged for a future CTE refactor.
- New direct unit tests for TopTasksTable and AgentCostBars covering empty/loading/data states and the zero-total token-split edge case.
Thanks for the reviews. Addressed in
Deferred to follow-up (logged here for the record):
Tests: 845 web-ui passing (was 834), 44 backend tests for the new endpoints, ruff clean.
## Code Review — PR #591: Per-task and per-agent cost breakdowns + task board cost badge

This is a well-scoped feature delivery. The bug fix for the UUID task ID data-loss in

### Backend

Bug fix (react_agent.py) — correct

N+1 query in

WITH ranked AS (
SELECT task_id, agent_id, COUNT(*) AS calls,
ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) AS rn
FROM token_usage WHERE ...
GROUP BY task_id, agent_id
)
SELECT t.task_id, t.input_tokens, t.output_tokens, t.total_cost_usd, r.agent_id
FROM agg_tasks t JOIN ranked r ON r.task_id = t.task_id AND r.rn = 1

Not a blocker at limit=1000 against SQLite, just worth tracking.
except (sqlite3.Error, OSError) as e:
logger.warning(...)
return None
Frontend
const maxCost = Math.max(...agents.map((a) => a.total_cost_usd), 0);

Spreading into

const maxCost = agents.reduce((m, a) => Math.max(m, a.total_cost_usd), 0);
Dead onKeyDown={(e) => e.stopPropagation()} — the containing

SWR key format inconsistency between task board and costs page

### Tests

Coverage is solid:
One gap: no test for the

### Summary
## Follow-up Review — PR #591 (bae5bac)

All items I flagged in the initial review have been addressed. Acknowledging the responses and noting one small gap still open.

### Prior issues resolved
One still-open gap (minor)
The class covers

Suggested addition to

def test_by_agent_above_maximum_rejected(self, test_client):
response = test_client.get("/api/v2/costs/by-agent?days=400")
    assert response.status_code == 422

Not a blocker — can land in the same follow-up as the other deferred items.

### Verdict

The UUID fix alone justifies merging. All blocking feedback from the first round has been resolved. The analytics additions are clean, well-tested, and non-breaking. Ready to merge.
🧹 Nitpick comments (2)
web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx (2)
54-58: ⚡ Quick win

Add one link test with a non-URL-safe task ID.
Current coverage only checks a URL-safe ID, so encoding regressions won’t be caught.
Add special-character ID coverage
  it('links the task title to the tasks page filtered by id', () => {
    render(<TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />);
    const link = screen.getByRole('link', { name: /build login flow/i });
    expect(link).toHaveAttribute('href', '/tasks?selected=abc-123');
  });
+
+ it('URL-encodes task ids in the task link', () => {
+   render(<TopTasksTable tasks={[makeEntry({ task_id: 'abc/123 test' })]} />);
+   const link = screen.getByRole('link', { name: /build login flow/i });
+   expect(link).toHaveAttribute('href', '/tasks?selected=abc%2F123%20test');
+ });

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx` around lines 54 - 58, Add a test in the TopTasksTable.test.tsx that uses TopTasksTable with a task whose task_id contains special/non-URL-safe characters (e.g., spaces, slashes, punctuation) and assert the rendered link’s href is '/tasks?selected=' plus the percent-encoded form of that id (use the same identifier string in the test and compare to encodeURIComponent(id) to avoid hardcoding). Locate the existing test that checks the link (the one rendering <TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />) and add a sibling it(...) that supplies a non-URL-safe task_id and expects the link href to equal '/tasks?selected=' + encodeURIComponent(task_id).
30-45: ⚡ Quick win

Add explicit token and cost assertions in the row-render test.
This case currently skips verification of token and cost cells, so those columns can regress unnoticed.
Suggested test hardening
  it('renders one row per task with title, agent, tokens, and cost', () => {
    render(
      <TopTasksTable
        tasks={[
          makeEntry({ task_id: 't-1', task_title: 'Foo', total_cost_usd: 0.50 }),
          makeEntry({ task_id: 't-2', task_title: 'Bar', total_cost_usd: 0.10 }),
        ]}
      />
    );
    const table = screen.getByTestId('top-tasks-table');
    expect(table).toBeInTheDocument();
    expect(screen.getByText('Foo')).toBeInTheDocument();
    expect(screen.getByText('Bar')).toBeInTheDocument();
+   expect(table).toHaveTextContent(/1,?234/);
+   expect(table).toHaveTextContent(/567/);
+   expect(table).toHaveTextContent(/\$0\.50/);
+   expect(table).toHaveTextContent(/\$0\.10/);
    // Both agent IDs render
    expect(screen.getAllByText('react-agent').length).toBeGreaterThanOrEqual(2);
  });

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx` around lines 30 - 45, The test for TopTasksTable currently only checks titles and agent IDs but doesn't assert token and cost cells; update the 'renders one row per task with title, agent, tokens, and cost' test to additionally assert the token count and formatted cost cells for each makeEntry row (use the same sample entries passed to TopTasksTable and assert e.g. the rendered token values and formatted USD strings like "$0.50" and "$0.10" are present), locating cells via screen.getByText or by querying within the table/testid 'top-tasks-table' so the tokens and total_cost_usd columns cannot regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx`:
- Around line 54-58: Add a test in the TopTasksTable.test.tsx that uses
TopTasksTable with a task whose task_id contains special/non-URL-safe characters
(e.g., spaces, slashes, punctuation) and assert the rendered link’s href is
'/tasks?selected=' plus the percent-encoded form of that id (use the same
identifier string in the test and compare to encodeURIComponent(id) to avoid
hardcoding). Locate the existing test that checks the link (the one rendering
<TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />) and add a sibling
it(...) that supplies a non-URL-safe task_id and expects the link href to equal
'/tasks?selected=' + encodeURIComponent(task_id).
- Around line 30-45: The test for TopTasksTable currently only checks titles and
agent IDs but doesn't assert token and cost cells; update the 'renders one row
per task with title, agent, tokens, and cost' test to additionally assert the
token count and formatted cost cells for each makeEntry row (use the same sample
entries passed to TopTasksTable and assert e.g. the rendered token values and
formatted USD strings like "$0.50" and "$0.10" are present), locating cells via
screen.getByText or by querying within the table/testid 'top-tasks-table' so the
tokens and total_cost_usd columns cannot regress.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6432fea7-ac45-4507-8fe1-ebe2c4f34773
📒 Files selected for processing (11)
- CLAUDE.md
- codeframe/core/react_agent.py
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
- docs/PRODUCT_ROADMAP.md
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/__tests__/components/costs/AgentCostBars.test.tsx
- web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/lib/api.ts
✅ Files skipped from review due to trivial changes (3)
- docs/PRODUCT_ROADMAP.md
- web-ui/src/__tests__/components/costs/AgentCostBars.test.tsx
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (6)
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/lib/api.ts
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
Closes #558.
Summary
- Extends the /costs page (built on top of the summary added in [Phase 5.2] Cost analytics page: total spend and time range #557).
- Adds a cost badge to the /tasks board with a hover tooltip showing input/output token counts.
- Fixes a data-loss bug: react_agent.py int-cast UUID task IDs and silently stored NULL in token_usage, leaving per-task analytics permanently empty.
Backend
- TokenRepository.get_top_tasks_by_cost(days, limit) — group by task_id, return top N with the most-used agent per task.
- TokenRepository.get_costs_by_agent(days) — group by agent_id, return per-agent rollup plus total input/output tokens.
- GET /api/v2/costs/tasks — top 10 tasks; titles resolved via tasks.get, with a placeholder for tasks that no longer exist.
- GET /api/v2/costs/by-agent — per-agent rollup.
- TokenUsage.task_id widened to Optional[Union[int, str]].
- react_agent now passes the task ID through verbatim (UUID for v2, int for v1) instead of dropping it.

Frontend (Next.js / Tailwind / Hugeicons)
- costsApi.getTopTasks / getByAgent.
- TopTasksTable and AgentCostBars components (pure Tailwind horizontal bars + input/output token split row — no charting library required, matches the existing stack).
- /costs page wires both sections under the existing summary cards and chart, sharing the time-range selector.
- TaskCard adds a MoneyBag02Icon + cost badge when the task has a positive cost entry.
- costMap flows from TaskBoardView → TaskBoardContent → TaskColumn → TaskCard as an optional prop (non-breaking).

Test plan
- uv run pytest tests/persistence/test_token_repository_costs.py tests/ui/test_costs_v2.py — 44 passing
- uv run ruff check codeframe/ — clean
- cd web-ui && npm test — 834 passing (66 suites)
- cd web-ui && npm run build — clean, all 15 routes build
- cf-test/: page renders top-10 table, agent bars, token split; task board cards show badges where cost data exists

Notes
- tests/integration/test_worker_agent_*, tests/testing/test_self_correction_* reproduce on main without these changes — they require live API behavior and have been flaky before.

Summary by CodeRabbit