feat(costs): per-task and per-agent breakdowns + task board cost badge (#558) #591
Conversation
Closes #558. Extends the /costs page (added in #557) with two analytics sections and adds an inline cost badge to each task card on the /tasks board.

Backend (Python):
- New TokenRepository.get_top_tasks_by_cost(days, limit) and get_costs_by_agent(days), aggregating the workspace token_usage table.
- New endpoints under /api/v2/costs:
  - GET /tasks → top 10 tasks with titles, agent, tokens, cost
  - GET /by-agent → per-agent rollup + total input/output tokens
- Title resolution falls back to a placeholder when token_usage references a task no longer present in the workspace, so the table never blanks out.
- TokenUsage.task_id widened to Optional[Union[int, str]] so v2 UUID task IDs are preserved end-to-end (react_agent.py was int()-casting and storing NULL for every v2 record, blocking per-task analytics).

Frontend (TypeScript / Next.js):
- New types TaskCostEntry, TaskCostsResponse, AgentCostEntry, AgentCostsResponse and matching costsApi.getTopTasks / getByAgent.
- New CostsView sections: TopTasksTable and AgentCostBars (pure-Tailwind horizontal bars + input/output token split row, no charting library).
- TaskCard renders a small MoneyBag02Icon + cost badge with a tooltip showing input/output token breakdown when costMap has a positive entry for that task. costMap threads through TaskBoardView → Content → Column as an optional prop; non-breaking.
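The int()-cast failure mode called out above can be sketched in a few lines. This is a hedged illustration of the bug described, not the actual react_agent.py code — the function names are invented for the demo:

```python
# Illustrative repro of the task-ID data loss: int()-casting a v2 UUID string
# raises ValueError, which the old path swallowed into None (stored as NULL);
# passing the ID through verbatim preserves both v1 (int) and v2 (UUID) IDs.
from typing import Optional, Union


def old_task_id(raw: Union[int, str]) -> Optional[int]:
    try:
        return int(raw)  # fine for v1 integer IDs
    except (TypeError, ValueError):
        return None      # every v2 UUID collapsed to NULL


def new_task_id(raw: Union[int, str]) -> Union[int, str]:
    return raw  # preserved end-to-end


uuid_id = "3f2b8c1e-9d4a-4f6b-8a21-5c7e0d9b1a2f"
print(old_task_id(uuid_id))  # None — the per-task analytics data loss
print(new_task_id(uuid_id))  # the UUID survives
```

With the old helper, every v2 record aggregates under NULL, which is why per-task analytics stayed permanently empty until this fix.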
## Walkthrough

This PR implements cost analytics infrastructure and UI to help users identify expensive tasks and cost drivers by agent. It updates the core task ID model to support both integer and UUID-string identifiers (v1 and v2 workspace schemas), adds backend repository queries for per-task and per-agent cost aggregation, exposes two new REST API endpoints, and integrates cost visualization into the costs page and task board.

## Changes

Cost Analytics Infrastructure
Frontend Cost Analytics UI
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Warning: there were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint skipped: no ESLint configuration detected in root package.json.
# CodeFRAME Development Guidelines

Last updated: 2026-05-11

## Product Vision

CodeFrame is a project delivery system: Think → Build → Prove → Ship. It owns the edges of the AI coding pipeline — everything BEFORE code gets written (PRD, specification, task decomposition) and everything AFTER (verification gates, quality memory, deployment). The actual code writing is delegated to frontier coding agents (Claude Code, Codex, OpenCode). CodeFrame does not compete with coding agents. It orchestrates them.

Status: CLI ✅ | Server ✅ | ReAct agent ✅ | Web UI ✅ | Agent adapters ✅ | Multi-provider LLM ✅ | Next: Phase 4A

If you are an agent working in this repo: do not improvise architecture. Follow the documents listed below.

## Primary Contract (MUST FOLLOW)
Rule 0: If a change does not directly support the Think → Build → Prove → Ship pipeline, do not implement it.

## Current Focus: Phase 4A

Phase 5.1 is complete — Settings page now ships three working tabs: Agent (#554), API Keys (#555), and PROOF9 Defaults + Workspace Config (#556). Backend: Phase 3.5C is complete.

Next, in order:
## Architecture Rules (non-negotiable)

1) Core must be headless
Core is allowed to: read/write durable state (SQLite/filesystem), run orchestration/worker loops, emit events to an append-only event log, call adapters via interfaces (LLM, git, fs).

2) CLI must not require a server

Golden Path commands must work from the CLI with no server running. FastAPI is optional, started explicitly.

3) Agent state transitions flow through runtime
This separation prevents duplicate state transitions (e.g., DONE→DONE errors).

4) Legacy can be read, not depended on
5) Keep commits runnable

At all times:

## Current State

v2 Architecture
Phase 3 Web UI (actively developed — not legacy): Next.js 16 App Router, TypeScript, Shadcn/UI, Tailwind CSS, Hugeicons, XTerm.js, WebSocket + SSE.

Shipped pages:

Testing:

## What's implemented

Full feature list in

## Repository Structure

## Commands

### Python / CLI

uv run pytest # All tests
uv run pytest -m v2 # v2 tests only
uv run pytest tests/core/ # Core module tests
uv run pytest tests/lifecycle/ # Lifecycle tests (no live API calls — uses MockProvider)
uv run ruff check .
# Web UI
cd web-ui && npm test
cd web-ui && npm run build

### Golden Path CLI

# Workspace
cf init <repo> [--detect | --tech-stack "..." | --tech-stack-interactive]
cf status
# PRD
cf prd add <file.md>
cf prd show
# Tasks
cf tasks generate
cf tasks list [--status READY]
cf tasks show <id>
# Work — single task
cf work start <task-id> [--execute] [--engine react|plan] [--verbose] [--dry-run]
cf work start <task-id> --execute --stall-timeout 120 --stall-action retry|blocker|fail
cf work start <task-id> --execute --llm-provider openai --llm-model gpt-4o
cf work stop <task-id>
cf work resume <task-id>
cf work follow <task-id> [--tail 50]
cf work diagnose <task-id>
# Work — batch
cf work batch run [<id>...] [--all-ready] [--engine react|plan]
cf work batch run --strategy serial|parallel|auto [--max-parallel 4] [--retry 3]
cf work batch run --all-ready --llm-provider openai --llm-model qwen2.5-coder:7b
cf work batch status|cancel|resume [batch_id]
# Blockers
cf blocker list
cf blocker show <id>
cf blocker answer <id> "answer"
# Quality / State
cf review && cf patch export && cf commit
cf checkpoint create|list|restore
cf summary
# Environment
cf env check|install|doctor
# GitHub PR
cf pr create|status|checks|merge

Note:

## What NOT to do
## Testing / Demoing

Quality check (covers both backend and web UI):

uv run pytest && uv run ruff check .
cd web-ui && npm test && npm run build

New v2 tests: add

Demoing against a sample project (e.g.,
| Doc | Purpose |
|---|---|
| docs/VISION.md | North star: Think → Build → Prove → Ship thesis |
| docs/PRODUCT_ROADMAP.md | Current roadmap — Phase 3.5/4/5 web product completeness |
| docs/GOLDEN_PATH.md | CLI-first workflow contract |
| docs/CLI_WIREFRAME.md | Command → module mapping |
| docs/AGENT_SYSTEM_REFERENCE.md | Component table, model selection, execution flows, self-correction |
| docs/REACT_AGENT_ARCHITECTURE.md | ReAct deep-dive: tools, editor, token management |
| docs/PHASE_3_UI_ARCHITECTURE.md | Web UI architecture (Next.js, pages, components) |
| docs/PHASE_2_DEVELOPER_GUIDE.md | Server layer + v2 router patterns |
| docs/PHASE_2_CLI_API_MAPPING.md | CLI to API endpoint mapping |
| docs/QUICKSTART.md | User-facing quickstart guide |

Archived (completed plans, old gap analyses): docs/archive/

Legacy (v1 reference only): SPRINTS.md, sprints/, specs/, CODEFRAME_SPEC.md
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/ui/test_costs_v2.py (1)
321-324: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add upper-bound validation coverage for `/api/v2/costs/by-agent` `days`.

Line 321 currently verifies only the lower bound. Add a `days > 90` rejection test to prevent range-validation regressions on this endpoint.

Proposed test addition:
class TestCostsTasksDaysValidation:
@@
     def test_by_agent_below_minimum_rejected(self, test_client):
         response = test_client.get("/api/v2/costs/by-agent?days=0")
         assert response.status_code == 422
+
+    def test_by_agent_above_maximum_rejected(self, test_client):
+        response = test_client.get("/api/v2/costs/by-agent?days=400")
+        assert response.status_code == 422

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ui/test_costs_v2.py` around lines 321 - 324, Add a complementary test to cover the upper-bound validation for the /api/v2/costs/by-agent endpoint: create a new test function (e.g., test_by_agent_above_maximum_rejected) next to test_by_agent_below_minimum_rejected that uses test_client to GET "/api/v2/costs/by-agent?days=91" and asserts response.status_code == 422 so days > 90 is rejected; ensure the test uses the same test_client fixture and naming pattern as the existing test.
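For reference, the 1–90 day-window contract these tests exercise reduces to a simple range check. The sketch below mirrors it in plain Python; the real endpoints presumably enforce it with FastAPI's `Query(ge=1, le=90)`, which is what turns out-of-range values into 422 responses:

```python
# Standalone mirror of the assumed days-window validation (1 <= days <= 90).
# The bounds come from the review discussion; the helper name is illustrative.
def validate_days(days: int, lo: int = 1, hi: int = 90) -> int:
    if not lo <= days <= hi:
        raise ValueError(f"days must be in [{lo}, {hi}], got {days}")
    return days


print(validate_days(30))  # 30
for bad in (0, 91, 400):
    try:
        validate_days(bad)
    except ValueError as exc:
        print(exc)
```

Testing both bounds (0 and 400) at the endpoint level is what protects against a regression in either direction.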
🧹 Nitpick comments (1)
codeframe/persistence/repositories/token_repository.py (1)
369-447: ⚡ Quick win

N+1 query pattern in agent lookup.
The method executes one query per task to find the most-used agent (lines 423-437). For the default limit of 10, this results in 11 total queries. While acceptable for an analytics endpoint, consider consolidating into a single query using a window function or correlated subquery if this becomes a performance concern.
Example optimization:
SELECT task_id, agent_id, input_tokens, output_tokens, total_cost_usd
FROM (
    SELECT task_id, agent_id,
           SUM(input_tokens) AS input_tokens,
           SUM(output_tokens) AS output_tokens,
           SUM(estimated_cost_usd) AS total_cost_usd,
           COUNT(*) AS calls,
           ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) AS rn
    FROM token_usage
    WHERE task_id IS NOT NULL AND timestamp >= ? AND timestamp < ?
    GROUP BY task_id, agent_id
)
WHERE rn = 1
ORDER BY total_cost_usd DESC
LIMIT ?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codeframe/persistence/repositories/token_repository.py` around lines 369 - 447, get_top_tasks_by_cost suffers an N+1 query when resolving the most-used agent per task (the per-row cursor.execute block that queries token_usage for each task_id); replace the per-task lookup with a single consolidated query that computes per-task aggregates and selects the top agent per task (e.g., using a window function ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) or a correlated subquery) so the method returns task_id, agent_id, input_tokens, output_tokens, total_cost_usd in one query and eliminates the looped agent lookup.
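The consolidated query above runs as-is on SQLite (window functions require SQLite ≥ 3.25). Here is a runnable sketch against a throwaway in-memory `token_usage` table — the data is illustrative and the timestamp filter is omitted for brevity; note that, matching the review snippet, the aggregates cover only the dominant agent's rows for each task:

```python
import sqlite3

# Demo of the single-query dominant-agent lookup on an in-memory table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE token_usage (
        task_id TEXT, agent_id TEXT,
        input_tokens INTEGER, output_tokens INTEGER,
        estimated_cost_usd REAL
    );
    INSERT INTO token_usage VALUES
        ('t1', 'react', 100, 50, 0.02),
        ('t1', 'react', 200, 80, 0.03),
        ('t1', 'plan',   10,  5, 0.01),
        ('t2', 'plan',  300, 90, 0.10);
""")
rows = conn.execute("""
    SELECT task_id, agent_id, input_tokens, output_tokens, total_cost_usd
    FROM (
        SELECT task_id, agent_id,
               SUM(input_tokens)       AS input_tokens,
               SUM(output_tokens)      AS output_tokens,
               SUM(estimated_cost_usd) AS total_cost_usd,
               ROW_NUMBER() OVER (PARTITION BY task_id
                                  ORDER BY COUNT(*) DESC) AS rn
        FROM token_usage
        WHERE task_id IS NOT NULL
        GROUP BY task_id, agent_id
    )
    WHERE rn = 1
    ORDER BY total_cost_usd DESC
    LIMIT 10
""").fetchall()
print(rows)  # t2 first (higher cost); t1 attributed to its dominant agent
```

One query replaces the N+1 loop, and the `rn = 1` filter picks the most-used agent per task exactly as the looped version did.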
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@codeframe/ui/routers/costs_v2.py`:
- Around line 180-270: Router file contains business logic (DB opening,
aggregation, task-title lookup) inside functions _open_workspace_conn,
_token_usage_exists, _query_top_tasks and _query_costs_by_agent; move that logic
into a new core service. Create functions in codeframe/core (e.g.,
costs_service.get_top_tasks(db_path, workspace, days, limit) and
costs_service.get_costs_by_agent(db_path, days)) that encapsulate connection
handling, TokenRepository calls, token table existence checks, and task title
resolution (use _placeholder_task_title and tasks_module.get inside the core
service), then import and call those service functions from the router so the
router only adapts request/response. Keep the same behavior on errors (return
empty lists/dicts and log via logger), preserve the existing function names when
refactoring for easy replacement, and update imports/tests to reference the new
costs_service APIs.
In `@web-ui/src/components/tasks/TaskBoardView.tsx`:
- Around line 42-53: costMap is currently built from costsApi.getTopTasks (via
useSWR) which only returns the top-10 tasks, so tasks outside that set never get
badges; change the data fetch to request costs for the actual visible task IDs
(or an API that returns all tasks' costs) and rebuild costMap from that complete
result. Concretely: replace the useSWR call that invokes
costsApi.getTopTasks(workspacePath) with a call that passes the list of visible
task IDs (e.g. costsApi.getCostsForTasks(workspacePath, visibleTaskIds) or an
equivalent endpoint that returns all task costs), use a SWR key that includes
the visibleTaskIds array, and update the useMemo dependency from
[costData?.tasks] to include visibleTaskIds so the Map<string, TaskCostEntry>
built in the costMap logic contains entries for every visible task ID.
In `@web-ui/src/components/tasks/TaskCard.tsx`:
- Around line 13-24: The badge shows "$0.00" for micro-costs; update
formatBadgeCost to treat values < 0.01 specially (return the label "< $0.01"
instead of toFixed(2)), and add a helper (e.g., formatFullCost) that renders
full precision (up to 6 decimals) for use in the badge tooltip; then update the
TaskCard badge label and its tooltip to use formatBadgeCost for the visible
badge and formatFullCost for the tooltip so tiny non-zero costs are not
represented as "$0.00".
- Around line 178-205: The cost badge currently shows for any positive cost but
can display "$0.00" for tiny amounts; update the display condition in the
TaskCard rendering (the block that checks showCostBadge and costEntry and uses
formatBadgeCost) to only render the badge when the value would round to at least
$0.01 (e.g., costEntry.total_cost_usd >= 0.01 or by checking
formatBadgeCost(costEntry.total_cost_usd) !== "$0.00"); if you choose to hide
the badge for these tiny amounts, keep the TooltipContent but show
higher-precision cost (e.g., 4+ decimal places) there so users can see the
actual cost.
---
Outside diff comments:
In `@tests/ui/test_costs_v2.py`:
- Around line 321-324: Add a complementary test to cover the upper-bound
validation for the /api/v2/costs/by-agent endpoint: create a new test function
(e.g., test_by_agent_above_maximum_rejected) next to
test_by_agent_below_minimum_rejected that uses test_client to GET
"/api/v2/costs/by-agent?days=91" and asserts response.status_code == 422 so days
> 90 is rejected; ensure the test uses the same test_client fixture and naming
pattern as the existing test.
---
Nitpick comments:
In `@codeframe/persistence/repositories/token_repository.py`:
- Around line 369-447: get_top_tasks_by_cost suffers an N+1 query when resolving
the most-used agent per task (the per-row cursor.execute block that queries
token_usage for each task_id); replace the per-task lookup with a single
consolidated query that computes per-task aggregates and selects the top agent
per task (e.g., using a window function ROW_NUMBER() OVER (PARTITION BY task_id
ORDER BY COUNT(*) DESC) or a correlated subquery) so the method returns task_id,
agent_id, input_tokens, output_tokens, total_cost_usd in one query and
eliminates the looped agent lookup.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c238e7d1-e5e7-48fd-8f96-b9099b5cd3a9
📒 Files selected for processing (18)
- codeframe/core/models.py
- codeframe/core/react_agent.py
- codeframe/lib/metrics_tracker.py
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
- tests/persistence/test_token_repository_costs.py
- tests/ui/test_costs_v2.py
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/__tests__/components/costs/CostsPage.test.tsx
- web-ui/src/app/costs/page.tsx
- web-ui/src/components/costs/AgentCostBars.tsx
- web-ui/src/components/costs/TopTasksTable.tsx
- web-ui/src/components/tasks/TaskBoardContent.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/components/tasks/TaskColumn.tsx
- web-ui/src/lib/api.ts
- web-ui/src/types/index.ts
def _open_workspace_conn(db_path: str) -> Optional[sqlite3.Connection]:
    """Open the workspace DB or return None if it cannot be read.

    Mirrors _query_costs's tolerance for fresh/locked workspaces: callers
    fall back to an empty response rather than 500'ing the dashboard.
    """
    try:
        conn = sqlite3.connect(db_path)
        conn.row_factory = sqlite3.Row
        return conn
    except sqlite3.Error as e:
        logger.warning("costs: failed to open %s: %s", db_path, e)
        return None


def _token_usage_exists(conn: sqlite3.Connection) -> bool:
    cursor = conn.cursor()
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='token_usage'"
    )
    return cursor.fetchone() is not None


def _query_top_tasks(
    db_path: str, workspace: Workspace, days: int, limit: int = 10,
) -> List[Dict[str, Any]]:
    """Aggregate per-task cost and join titles via workspace.tasks.

    Returns a list of dicts ready for serialization into ``TaskCostEntry``.
    """
    conn = _open_workspace_conn(db_path)
    if conn is None:
        return []

    try:
        if not _token_usage_exists(conn):
            return []
        try:
            repo = TokenRepository(sync_conn=conn)
            rows = repo.get_top_tasks_by_cost(days=days, limit=limit)
        except sqlite3.Error as e:
            logger.warning("costs/tasks: query failed on %s: %s", db_path, e)
            return []
    finally:
        conn.close()

    entries: List[Dict[str, Any]] = []
    for row in rows:
        raw_id = row["task_id"]
        task_id_str = str(raw_id) if raw_id is not None else ""
        title = _placeholder_task_title(task_id_str)
        try:
            task = tasks_module.get(workspace, task_id_str)
            if task is not None:
                title = task.title
        except Exception:
            # Lookup failures are non-fatal — keep the placeholder title.
            logger.debug("costs/tasks: task lookup failed for %s", task_id_str, exc_info=True)

        entries.append({
            "task_id": task_id_str,
            "task_title": title,
            "agent_id": row["agent_id"],
            "input_tokens": row["input_tokens"],
            "output_tokens": row["output_tokens"],
            "total_cost_usd": row["total_cost_usd"],
        })

    return entries


def _query_costs_by_agent(db_path: str, days: int) -> Dict[str, Any]:
    """Aggregate per-agent cost over the window."""
    empty = {"by_agent": [], "total_input_tokens": 0, "total_output_tokens": 0}

    conn = _open_workspace_conn(db_path)
    if conn is None:
        return empty

    try:
        if not _token_usage_exists(conn):
            return empty
        try:
            repo = TokenRepository(sync_conn=conn)
            return repo.get_costs_by_agent(days=days)
        except sqlite3.Error as e:
            logger.warning("costs/by-agent: query failed on %s: %s", db_path, e)
            return empty
    finally:
        conn.close()
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Move aggregation and lookup logic out of the router module.
The connection management, repository aggregation, and task-title resolution here are core/business logic. Keep router handlers as thin request/response adapters and move this logic into codeframe/core/ service code.
As per coding guidelines, codeframe/{core,cli,ui}/**/*.py: "Core domain logic must be implemented in codeframe/core/" and codeframe/ui/routers/**/*.py: "FastAPI routers must be thin adapters over core. Do not implement business logic inside routers."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@codeframe/ui/routers/costs_v2.py` around lines 180 - 270, Router file
contains business logic (DB opening, aggregation, task-title lookup) inside
functions _open_workspace_conn, _token_usage_exists, _query_top_tasks and
_query_costs_by_agent; move that logic into a new core service. Create functions
in codeframe/core (e.g., costs_service.get_top_tasks(db_path, workspace, days,
limit) and costs_service.get_costs_by_agent(db_path, days)) that encapsulate
connection handling, TokenRepository calls, token table existence checks, and
task title resolution (use _placeholder_task_title and tasks_module.get inside
the core service), then import and call those service functions from the router
so the router only adapts request/response. Keep the same behavior on errors
(return empty lists/dicts and log via logger), preserve the existing function
names when refactoring for easy replacement, and update imports/tests to
reference the new costs_service APIs.
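A minimal sketch of the suggested extraction, assuming the `costs_service` module and function names proposed in the review (they are not confirmed by the codebase); the router handler would then reduce to a one-line call into core:

```python
import logging
import sqlite3
from typing import Any, Dict

logger = logging.getLogger(__name__)

# Assumed empty payload shape, taken from the router code quoted above.
_EMPTY: Dict[str, Any] = {
    "by_agent": [], "total_input_tokens": 0, "total_output_tokens": 0,
}


def get_costs_by_agent(db_path: str, days: int) -> Dict[str, Any]:
    """Core-service entry point: per-agent rollup, or an empty payload
    instead of a 500 for fresh/unreadable workspaces."""
    try:
        conn = sqlite3.connect(db_path)
    except sqlite3.Error as e:
        logger.warning("costs/by-agent: failed to open %s: %s", db_path, e)
        return dict(_EMPTY)
    try:
        cur = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type='table' AND name='token_usage'"
        )
        if cur.fetchone() is None:
            return dict(_EMPTY)  # no usage recorded yet
        # Real code would delegate to the repository here, e.g.:
        #   return TokenRepository(sync_conn=conn).get_costs_by_agent(days=days)
        return dict(_EMPTY)
    finally:
        conn.close()


print(get_costs_by_agent(":memory:", 30))
```

With this split, the FastAPI handler only adapts the request and response, which is what the "thin adapters over core" guideline asks for.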
## Code Review — PR 591: Per-task and per-agent cost breakdowns + task board cost badge

This is a solid PR. The v2 UUID task-ID bug fix is the most impactful change here, and the analytics additions are well-scoped and tested. A few items worth addressing before merge.

### What is working well
### Issues to address

1. Cost badge shows $0.00 for sub-cent tasks

File: web-ui/src/components/tasks/TaskCard.tsx

formatBadgeCost uses toFixed(2) for values under $1, so a cost of 0.0012 renders as $0.00 even though showCostBadge is true (the guard is > 0, not >= 0.005). A task that actually consumed tokens would show a misleading zero badge. TopTasksTable and AgentCostBars use 4-6 decimal places via minimumFractionDigits: 4, which is appropriate for AI costs. Consider matching that precision in the badge, or at minimum tightening the guard to match the display threshold (e.g. >= 0.005).

2. N+1 queries in get_top_tasks_by_cost

File: codeframe/persistence/repositories/token_repository.py, lines 154-179

The outer query fetches the top-N tasks, then a second query per row finds the dominant agent — up to 11 queries total for the default limit=10. Fine at current scale, but the same result is achievable in a single CTE. Not a blocker, but worth a TODO comment on the current loop for when the limit grows.

3. _open_workspace_conn duplicates logic already in _query_costs

File: codeframe/ui/routers/costs_v2.py

_open_workspace_conn is a clean extraction, but the original _query_costs function still opens its own connection inline — two code paths for the same operation. Worth consolidating in a follow-up to keep the file consistent.

4. SWR key inconsistency between TaskBoardView and CostsPage

File: web-ui/src/components/tasks/TaskBoardView.tsx

TaskBoardView uses a string key while CostsPage uses an array key. They never share the SWR cache. This is intentionally fine (the board always wants 30 days; the page uses the user-selected range), but a brief comment acknowledging the intentional scope difference would help future readers.

### Minor notes
### Test gap

Neither TopTasksTable nor AgentCostBars has a direct component test; they are only exercised via mocks in CostsPage.test.tsx. The loading-skeleton render paths (data-testid="top-tasks-loading" and data-testid="agent-bars-loading") are untested. Low priority, but worth adding before the /costs page is customer-facing.

### Verdict

The UUID fix alone justifies merging. The analytics additions are clean and complete. Addressing the $0.00 badge edge case (item 1) before merge would prevent user confusion; the rest are low-priority follow-ups.
- TaskCard badge: format sub-cent costs at 4dp instead of collapsing to $0.00. Adds a regression test against the $0.0042 case.
- /api/v2/costs/tasks: expose a `limit` query param (1..1000, default 10). Analytics view keeps the top-10 default; TaskBoardView now requests limit=1000 so the badge map covers every task with spend, not just the top 10 (a board task outside the top-10 would otherwise never show).
- react_agent: simplify task_id persistence — the upstream caller never passes None, so drop the unreachable branch.
- TokenRepository: TODO note on the N+1 dominant-agent lookup. Fine at current scale; flagged for a future CTE refactor.
- New direct unit tests for TopTasksTable and AgentCostBars covering empty/loading/data states and the zero-total token-split edge case.
Thanks for the reviews. Addressed in
Deferred to follow-up (logged here for the record):
Tests: 845 web-ui passing (was 834), 44 backend tests for the new endpoints, ruff clean.
## Code Review — PR #591: Per-task and per-agent cost breakdowns + task board cost badge

This is a well-scoped feature delivery. The bug fix for the UUID task ID data-loss in

### Backend

Bug fix (react_agent.py) — correct

N+1 query in

WITH ranked AS (
SELECT task_id, agent_id, COUNT(*) AS calls,
ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY COUNT(*) DESC) AS rn
FROM token_usage WHERE ...
GROUP BY task_id, agent_id
)
SELECT t.task_id, t.input_tokens, t.output_tokens, t.total_cost_usd, r.agent_id
FROM agg_tasks t JOIN ranked r ON r.task_id = t.task_id AND r.rn = 1

Not a blocker at limit=1000 against SQLite, just worth tracking.
except (sqlite3.Error, OSError) as e:
logger.warning(...)
return None
Frontend
const maxCost = Math.max(...agents.map((a) => a.total_cost_usd), 0);

Spreading into

const maxCost = agents.reduce((m, a) => Math.max(m, a.total_cost_usd), 0);
Dead onKeyDown={(e) => e.stopPropagation()} — the containing

SWR key format inconsistency between task board and costs page

### Tests

Coverage is solid:
One gap: no test for the

### Summary
## Follow-up Review — PR #591 (bae5bac)

All items I flagged in the initial review have been addressed. Acknowledging the responses and noting one small gap still open.

### Prior issues resolved
One still-open gap (minor)
The class covers

Suggested addition to

def test_by_agent_above_maximum_rejected(self, test_client):
response = test_client.get("/api/v2/costs/by-agent?days=400")
    assert response.status_code == 422

Not a blocker — can land in the same follow-up as the other deferred items.

### Verdict

The UUID fix alone justifies merging. All blocking feedback from the first round has been resolved. The analytics additions are clean, well-tested, and non-breaking. Ready to merge.
🧹 Nitpick comments (2)
web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx (2)
54-58: ⚡ Quick win

Add one link test with a non-URL-safe task ID.
Current coverage only checks a URL-safe ID, so encoding regressions won’t be caught.
Add special-character ID coverage
  it('links the task title to the tasks page filtered by id', () => {
    render(<TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />);
    const link = screen.getByRole('link', { name: /build login flow/i });
    expect(link).toHaveAttribute('href', '/tasks?selected=abc-123');
  });
+
+ it('URL-encodes task ids in the task link', () => {
+   render(<TopTasksTable tasks={[makeEntry({ task_id: 'abc/123 test' })]} />);
+   const link = screen.getByRole('link', { name: /build login flow/i });
+   expect(link).toHaveAttribute('href', '/tasks?selected=abc%2F123%20test');
+ });

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx` around lines 54 - 58, Add a test in the TopTasksTable.test.tsx that uses TopTasksTable with a task whose task_id contains special/non-URL-safe characters (e.g., spaces, slashes, punctuation) and assert the rendered link’s href is '/tasks?selected=' plus the percent-encoded form of that id (use the same identifier string in the test and compare to encodeURIComponent(id) to avoid hardcoding). Locate the existing test that checks the link (the one rendering <TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />) and add a sibling it(...) that supplies a non-URL-safe task_id and expects the link href to equal '/tasks?selected=' + encodeURIComponent(task_id).
30-45: ⚡ Quick win

Add explicit token and cost assertions in the row-render test.
This case currently skips verification of token and cost cells, so those columns can regress unnoticed.
Suggested test hardening
  it('renders one row per task with title, agent, tokens, and cost', () => {
    render(
      <TopTasksTable
        tasks={[
          makeEntry({ task_id: 't-1', task_title: 'Foo', total_cost_usd: 0.50 }),
          makeEntry({ task_id: 't-2', task_title: 'Bar', total_cost_usd: 0.10 }),
        ]}
      />
    );
    const table = screen.getByTestId('top-tasks-table');
    expect(table).toBeInTheDocument();
    expect(screen.getByText('Foo')).toBeInTheDocument();
    expect(screen.getByText('Bar')).toBeInTheDocument();
+   expect(table).toHaveTextContent(/1,?234/);
+   expect(table).toHaveTextContent(/567/);
+   expect(table).toHaveTextContent(/\$0\.50/);
+   expect(table).toHaveTextContent(/\$0\.10/);
    // Both agent IDs render
    expect(screen.getAllByText('react-agent').length).toBeGreaterThanOrEqual(2);
  });

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx` around lines 30 - 45, The test for TopTasksTable currently only checks titles and agent IDs but doesn't assert token and cost cells; update the 'renders one row per task with title, agent, tokens, and cost' test to additionally assert the token count and formatted cost cells for each makeEntry row (use the same sample entries passed to TopTasksTable and assert e.g. the rendered token values and formatted USD strings like "$0.50" and "$0.10" are present), locating cells via screen.getByText or by querying within the table/testid 'top-tasks-table' so the tokens and total_cost_usd columns cannot regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx`:
- Around line 54-58: Add a test in the TopTasksTable.test.tsx that uses
TopTasksTable with a task whose task_id contains special/non-URL-safe characters
(e.g., spaces, slashes, punctuation) and assert the rendered link’s href is
'/tasks?selected=' plus the percent-encoded form of that id (use the same
identifier string in the test and compare to encodeURIComponent(id) to avoid
hardcoding). Locate the existing test that checks the link (the one rendering
<TopTasksTable tasks={[makeEntry({ task_id: 'abc-123' })]} />) and add a sibling
it(...) that supplies a non-URL-safe task_id and expects the link href to equal
'/tasks?selected=' + encodeURIComponent(task_id).
- Around line 30-45: The test for TopTasksTable currently only checks titles and
agent IDs but doesn't assert token and cost cells; update the 'renders one row
per task with title, agent, tokens, and cost' test to additionally assert the
token count and formatted cost cells for each makeEntry row (use the same sample
entries passed to TopTasksTable and assert e.g. the rendered token values and
formatted USD strings like "$0.50" and "$0.10" are present), locating cells via
screen.getByText or by querying within the table/testid 'top-tasks-table' so the
tokens and total_cost_usd columns cannot regress.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6432fea7-ac45-4507-8fe1-ebe2c4f34773
📒 Files selected for processing (11)
- CLAUDE.md
- codeframe/core/react_agent.py
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
- docs/PRODUCT_ROADMAP.md
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/__tests__/components/costs/AgentCostBars.test.tsx
- web-ui/src/__tests__/components/costs/TopTasksTable.test.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/lib/api.ts
✅ Files skipped from review due to trivial changes (3)
- docs/PRODUCT_ROADMAP.md
- web-ui/src/__tests__/components/costs/AgentCostBars.test.tsx
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (6)
- web-ui/__tests__/components/tasks/TaskCard.test.tsx
- web-ui/src/lib/api.ts
- web-ui/src/components/tasks/TaskCard.tsx
- web-ui/src/components/tasks/TaskBoardView.tsx
- codeframe/persistence/repositories/token_repository.py
- codeframe/ui/routers/costs_v2.py
Closes #558.
Summary
- Extends the /costs page (built on top of the summary added in [Phase 5.2] Cost analytics page: total spend and time range #557).
- Adds a cost badge to the /tasks board with a hover tooltip showing input/output token counts.
- Fixes a data-loss bug: react_agent.py int-cast UUID task IDs and silently stored NULL in token_usage, leaving per-task analytics permanently empty.
Backend
- TokenRepository.get_top_tasks_by_cost(days, limit) — group by task_id, return top N with the most-used agent per task.
- TokenRepository.get_costs_by_agent(days) — group by agent_id, return per-agent rollup plus total input/output tokens.
- GET /api/v2/costs/tasks — top 10 tasks; titles resolved via tasks.get, with a placeholder for tasks that no longer exist.
- GET /api/v2/costs/by-agent — per-agent rollup.
- TokenUsage.task_id widened to Optional[Union[int, str]].
- react_agent now passes the task ID through verbatim (UUID for v2, int for v1) instead of dropping it.

Frontend (Next.js / Tailwind / Hugeicons)
- costsApi.getTopTasks / getByAgent.
- TopTasksTable and AgentCostBars components (pure Tailwind horizontal bars + input/output token split row — no charting library required, matches the existing stack).
- /costs page wires both sections under the existing summary cards and chart, sharing the time-range selector.
- TaskCard adds a MoneyBag02Icon + cost badge when the task has a positive cost entry.
- costMap flows from TaskBoardView → TaskBoardContent → TaskColumn → TaskCard as an optional prop (non-breaking).

Test plan
- uv run pytest tests/persistence/test_token_repository_costs.py tests/ui/test_costs_v2.py — 44 passing
- uv run ruff check codeframe/ — clean
- cd web-ui && npm test — 834 passing (66 suites)
- cd web-ui && npm run build — clean, all 15 routes build
- cf-test/: page renders top-10 table, agent bars, token split; task board cards show badges where cost data exists

Notes
- tests/integration/test_worker_agent_*, tests/testing/test_self_correction_* reproduce on main without these changes — they require live API behavior and have been flaky before.

Summary by CodeRabbit