chore: remove legacy flat results.jsonl support and align naming around dataset/testcase #938

@christso

Summary

AgentV should finish the transition to canonical run workspaces and consistent terminology.

The current active design uses:

  • .agentv/results/runs/<run-id>/index.jsonl as the canonical persisted run layout
  • dataset as the persisted result field and intended filter term
  • test case as the more accurate concept name for individual cases

A follow-up cleanup should remove the remaining support for legacy flat results.jsonl inputs and align the mixed internal naming that still uses terms like suite, evalSetName, and evalCase.

Why

This reduces design drift and avoids carrying legacy compatibility into new commands like agentv trend.

Scope

Remove legacy result layout support

Audit CLI/result-loading surfaces and remove support for legacy flat result file inputs where we still accept them.

Target canonical input shape:

.agentv/results/runs/<run-id>/index.jsonl
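As a rough illustration of what "only accept the canonical shape" could look like at a result-loading boundary, here is a minimal sketch. The helper name parseCanonicalRunPath and its error message are hypothetical and not taken from the AgentV codebase; the only detail drawn from this issue is the path layout itself.

```typescript
// Hypothetical helper, not AgentV's actual API.
// Matches paths ending in .agentv/results/runs/<run-id>/index.jsonl
const CANONICAL_INDEX = /(^|\/)\.agentv\/results\/runs\/([^/]+)\/index\.jsonl$/;

function parseCanonicalRunPath(inputPath: string): { runId: string } {
  // Normalize Windows separators so the same pattern applies everywhere.
  const normalized = inputPath.replace(/\\/g, "/");
  const runId = CANONICAL_INDEX.exec(normalized)?.[2];
  if (!runId) {
    // Legacy flat files such as results.jsonl are rejected here
    // instead of being silently accepted.
    throw new Error(
      `Expected .agentv/results/runs/<run-id>/index.jsonl, got: ${inputPath}`
    );
  }
  return { runId };
}
```

Rejecting non-canonical paths in one place would also give a natural spot for a deliberate compatibility boundary, if one turns out to be needed.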

Align terminology

Proposed renames:

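| Current                             | Proposed                   | Why                                                      |
|-------------------------------------|----------------------------|----------------------------------------------------------|
| suite                               | dataset                    | aligns with persisted result field and CLI filter term   |
| evalSetName                         | datasetName                | same reason                                              |
| evalCase                            | testCase                   | object is a test case, not an eval run                   |
| evalCases                           | testCases                  | same reason                                              |
| rawTestcases                        | rawTestCases               | same reason plus consistent casing                       |
| legacy evalId fallback naming       | testId only where possible | reduce dual-term confusion                               |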

Non-Goals

  • Do not change the external wire format away from dataset/test_id
  • Do not block feature work like #913
  • Do not bundle risky behavioral changes unrelated to result input layout or naming cleanup
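One way to read the first non-goal: internal identifiers can be renamed while the persisted field names stay exactly dataset and test_id. The sketch below illustrates that separation. The type and function names (WireResultRecord, TestCaseRef, fromWire, toWire) are hypothetical and exist only for illustration; the two wire field names are the only details taken from this issue.

```typescript
// Persisted shape: field names stay as-is (dataset / test_id).
interface WireResultRecord {
  dataset: string;
  test_id: string;
}

// Hypothetical internal shape using the proposed naming.
interface TestCaseRef {
  datasetName: string;
  testId: string;
}

// Translate at the I/O boundary so renames never leak into the wire format.
function fromWire(record: WireResultRecord): TestCaseRef {
  return { datasetName: record.dataset, testId: record.test_id };
}

function toWire(ref: TestCaseRef): WireResultRecord {
  return { dataset: ref.datasetName, test_id: ref.testId };
}
```

Keeping the translation in a pair of small functions means an internal rename only touches the internal type and these two mappers.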

Acceptance Signals

  • remaining legacy flat result-file compatibility is removed or explicitly isolated behind a deliberate compatibility boundary
  • new commands only document and accept canonical run workspace inputs
  • variable naming in touched areas is aligned toward dataset and testCase
  • docs and code comments use consistent terminology
