feat(studio): achieve full convex-evals feature parity #811

Merged
christso merged 3 commits into main from feat/810-studio-parity
Mar 28, 2026

christso commented Mar 28, 2026

Summary

Implements all 5 gaps from #810 plus low-priority items to achieve full convex-evals feature parity in AgentV Studio:

  • Gap 1 (High): File tree in Output/Task tabs — split layout with collapsible FileTree + Monaco editor
  • Gap 2 (High): Category drill-down page at /runs/:runId/category/:category with scoped stats and eval list
  • Gap 3 (Medium): Landing page tabs — Recent Runs, Experiments, Targets with URL param persistence
  • Gap 4 (Medium): Experiment detail page at /experiments/:experimentName with aggregate stats
  • Gap 5 (Medium): Breadcrumb navigation derived from TanStack Router matches
  • Low priority: Step timing badges, Target/Experiment columns in run list, run metadata enrichment
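A minimal sketch of how breadcrumbs can be derived from an ordered list of route matches (Gap 5). It does not use the TanStack Router API directly: `RouteMatchLike`, `Crumb`, and `buildBreadcrumbs` are hypothetical names, and the sketch only assumes that each match exposes a resolved `pathname`. The labels in `Breadcrumbs.tsx` may be computed differently.

```typescript
// Hypothetical subset of a router match: only the resolved pathname is assumed.
interface RouteMatchLike {
  pathname: string;
}

interface Crumb {
  label: string;
  href: string;
}

// Turn root-to-leaf matches into breadcrumb entries, one per distinct path.
function buildBreadcrumbs(matches: RouteMatchLike[]): Crumb[] {
  const crumbs: Crumb[] = [];
  for (const match of matches) {
    // Normalise trailing slashes so "/runs/" and "/runs" collapse to one crumb.
    const href = match.pathname.replace(/\/+$/, "") || "/";
    if (crumbs.some((c) => c.href === href)) continue;
    const segments = href.split("/").filter((s) => s.length > 0);
    const last = segments[segments.length - 1];
    // Label each crumb with its last path segment; the root becomes "Home".
    crumbs.push({ label: last ? decodeURIComponent(last) : "Home", href });
  }
  return crumbs;
}
```

Keeping the derivation a pure function of the match list means it can be unit-tested without mounting a router.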

New files

  • FileTree.tsx, Breadcrumbs.tsx, ExperimentsTab.tsx, TargetsTab.tsx
  • Routes: $runId_.category.$category.tsx, $experimentName.tsx

Modified files

  • serve.ts — 4 new API endpoints (file tree, file content, experiments, targets)
  • EvalDetail.tsx — split layout for Output/Task tabs with file tree
  • RunDetail.tsx — category cards now navigate to drill-down pages
  • Sidebar.tsx — context-aware for category and experiment pages
  • Layout.tsx — breadcrumbs above content area
  • index.tsx — tabbed landing page
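The tabbed landing page keeps the selected tab in a URL parameter. The sketch below shows one way to read and write such a parameter with the standard `URLSearchParams` API. The parameter name `tab`, the tab identifiers, and the helpers `readTab` and `writeTab` are assumptions for illustration, not names taken from `index.tsx`.

```typescript
// Hypothetical tab identifiers for Recent Runs, Experiments, and Targets.
type LandingTab = "runs" | "experiments" | "targets";

const LANDING_TABS: LandingTab[] = ["runs", "experiments", "targets"];
const DEFAULT_TAB: LandingTab = "runs";

// Read the active tab from a query string, falling back to the default
// when the parameter is missing or not a known tab.
function readTab(search: string): LandingTab {
  const raw = new URLSearchParams(search).get("tab");
  for (const tab of LANDING_TABS) {
    if (tab === raw) return tab;
  }
  return DEFAULT_TAB;
}

// Return a new query string with the tab set, preserving other parameters.
// The default tab is omitted so the landing URL stays clean.
function writeTab(search: string, tab: LandingTab): string {
  const params = new URLSearchParams(search);
  if (tab === DEFAULT_TAB) {
    params.delete("tab");
  } else {
    params.set("tab", tab);
  }
  const query = params.toString();
  return query ? "?" + query : "";
}
```

Validating the parameter on read means a stale or hand-edited URL degrades to the default tab instead of rendering an empty page.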

Test plan

  • Studio builds successfully
  • All 353 tests pass
  • Typecheck passes
  • Lint clean
  • All pre-commit hooks pass
  • Manual E2E with synthetic test data, browser-verified all screens:
    • Landing page with 3 tabs (Recent Runs, Experiments, Targets)
    • Experiments tab with table
    • Targets tab with table and evals passed/total
    • Experiment detail page with stat cards, sidebar, run list
    • Run detail with category cards linking to drill-down pages
    • Category drill-down page with scoped stats and filtered eval list
    • Eval detail with file tree split layout (Output tab)
    • File selection with syntax highlighting (JSON, Markdown)
    • Breadcrumb navigation on all sub-pages
    • Context-aware sidebar (runs, evals, categories, experiments)

Closes #810

cloudflare-workers-and-pages bot commented Mar 28, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 7e627f9
Status: ⚡️ Build in progress...

christso marked this pull request as ready for review March 28, 2026 11:26
christso and others added 3 commits March 28, 2026 11:59

Implements all 5 gaps from #810 plus low-priority items:

- Gap 1: File tree in Output/Task tabs with split layout (FileTree + Monaco)
- Gap 2: Category drill-down page at /runs/:runId/category/:category
- Gap 3: Landing page tabs (Recent Runs, Experiments, Targets)
- Gap 4: Experiment detail page at /experiments/:experimentName
- Gap 5: Breadcrumb navigation derived from TanStack Router matches

Low priority:
- Step timing badges on assertions
- Target/Experiment columns in run list
- Run metadata enrichment in API

New API endpoints: /api/experiments, /api/targets,
/api/runs/:filename/evals/:evalId/files,
/api/runs/:filename/evals/:evalId/files/*
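
For the file tree endpoint, a server typically turns a flat list of relative paths into a nested structure that a collapsible tree can render. The following is a sketch under that assumption only: the `FileNode` shape and `buildFileTree` are hypothetical and do not describe the actual response format of `/api/runs/:filename/evals/:evalId/files`.

```typescript
// Hypothetical node shape for a collapsible file tree.
interface FileNode {
  name: string;
  path: string; // path relative to the eval's output root
  type: "file" | "dir";
  children: FileNode[];
}

// Sort directories before files, then alphabetically, at every level.
function sortNodes(nodes: FileNode[]): void {
  nodes.sort((a, b) => {
    if (a.type !== b.type) return a.type === "dir" ? -1 : 1;
    return a.name.localeCompare(b.name);
  });
  for (const node of nodes) sortNodes(node.children);
}

// Build a nested tree from flat, "/"-separated relative file paths.
function buildFileTree(paths: string[]): FileNode[] {
  const roots: FileNode[] = [];
  for (const fullPath of paths) {
    const segments = fullPath.split("/").filter((s) => s.length > 0);
    let level = roots;
    let prefix = "";
    segments.forEach((segment, i) => {
      prefix = prefix ? prefix + "/" + segment : segment;
      const isLeaf = i === segments.length - 1;
      // Reuse an existing node at this level, or create it on first sight.
      let node: FileNode | undefined = level.filter((n) => n.name === segment)[0];
      if (!node) {
        node = { name: segment, path: prefix, type: isLeaf ? "file" : "dir", children: [] };
        level.push(node);
      }
      level = node.children;
    });
  }
  sortNodes(roots);
  return roots;
}
```

The `path` on each file node is what a client would append to the wildcard route to fetch that file's content.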

Closes #810

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Use LightweightResultRecord's experiment field directly instead of
unsafe Record<string, unknown> casts. Fix import ordering and formatting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
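
To illustrate the typed-field approach from the commit above, here is a sketch that groups records by a declared `experiment` field with no `Record<string, unknown>` casts. `ResultRecordLike` is a hypothetical stand-in for `LightweightResultRecord`; the `passed` field, the `"default"` fallback name, and `summarizeByExperiment` are assumptions made for the example.

```typescript
// Hypothetical stand-in for LightweightResultRecord; only `experiment`
// is named in the PR, `passed` is assumed for illustration.
interface ResultRecordLike {
  experiment?: string;
  passed: boolean;
}

interface ExperimentSummary {
  experiment: string;
  eval_count: number;
  passed_count: number;
}

// Aggregate pass counts per experiment using the typed field directly.
function summarizeByExperiment(records: ResultRecordLike[]): ExperimentSummary[] {
  const summaries: ExperimentSummary[] = [];
  for (const record of records) {
    // Assumed fallback bucket for records without an experiment.
    const experimentName = record.experiment ?? "default";
    let summary: ExperimentSummary | undefined = summaries.filter(
      (s) => s.experiment === experimentName,
    )[0];
    if (!summary) {
      summary = { experiment: experimentName, eval_count: 0, passed_count: 0 };
      summaries.push(summary);
    }
    summary.eval_count += 1;
    if (record.passed) summary.passed_count += 1;
  }
  return summaries;
}
```

Because `experiment` is part of the declared type, a rename or type change surfaces as a compile error rather than a silent `undefined` at runtime.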

TargetSummary type used `passed`/`total` fields but the API returns
`passed_count`/`eval_count`. Aligned the type and component.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
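
A sketch of what the aligned type from the last commit might look like, together with a helper that renders the "evals passed/total" cell in the Targets tab. Only `passed_count` and `eval_count` come from the commit message; the `name` field and `formatPassed` are hypothetical.

```typescript
// Field names passed_count and eval_count follow the commit message;
// `name` is an assumed identifier field.
interface TargetSummary {
  name: string;
  passed_count: number;
  eval_count: number;
}

// Format a "passed/total (rate%)" label, guarding against division by zero.
function formatPassed(target: TargetSummary): string {
  const rate =
    target.eval_count === 0
      ? 0
      : Math.round((target.passed_count / target.eval_count) * 100);
  return `${target.passed_count}/${target.eval_count} (${rate}%)`;
}
```

With the type matching the API payload, the component reads the counts directly instead of showing `undefined/undefined`, which is the likely symptom of the earlier `passed`/`total` mismatch.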

christso force-pushed the feat/810-studio-parity branch from a626e62 to 7e627f9 March 28, 2026 12:01
christso merged commit b892eab into main Mar 28, 2026
1 of 2 checks passed
christso deleted the feat/810-studio-parity branch March 28, 2026 12:02