chore: remove legacy flat results.jsonl support and align naming around dataset/testcase

## Summary

AgentV should finish the transition to canonical run workspaces and consistent terminology.

The current design uses:

- `.agentv/results/runs/<run-id>/index.jsonl` as the canonical persisted run layout
- `dataset` as the persisted result field and intended filter term
- `test case` as the more accurate concept name for individual cases

Follow-up cleanup should remove remaining support for legacy flat `results.jsonl` inputs and align mixed internal naming that still uses terms like `suite`, `evalSetName`, and `evalCase`.

## Why

Removing the legacy input path and unifying the naming reduces design drift and keeps legacy compatibility out of new commands like `agentv trend`.

## Scope

### Remove legacy result layout support

Audit CLI/result-loading surfaces and remove support for legacy flat result file inputs where we still accept them.

Target canonical input shape:

```text
.agentv/results/runs/<run-id>/index.jsonl
```
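As a rough illustration of what "accept only the canonical shape" could look like at a CLI boundary, here is a minimal sketch. The helper name `parseCanonicalRunPath` and the regex are assumptions made for this example, not existing AgentV code; only the path layout itself comes from this issue.

```typescript
// Hypothetical helper, not part of AgentV's actual API.
// Matches <anything>/.agentv/results/runs/<run-id>/index.jsonl
const CANONICAL_INDEX =
  /(?:^|\/)\.agentv\/results\/runs\/([^/]+)\/index\.jsonl$/;

function parseCanonicalRunPath(inputPath: string): { runId: string } | null {
  // Normalize Windows separators so one pattern covers both platforms.
  const normalized = inputPath.replace(/\\/g, "/");
  const runId = CANONICAL_INDEX.exec(normalized)?.[1];
  // A legacy flat `results.jsonl` input does not match and yields null.
  return runId ? { runId } : null;
}
```

A command could call this on each input argument and fail with a clear error when it returns `null`, instead of silently falling back to the legacy flat layout.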

### Align terminology

Proposed renames:

| Current | Proposed | Why |
|--------|----------|-----|
| `suite` | `dataset` | aligns with persisted result field and CLI filter term |
| `evalSetName` | `datasetName` | same reason |
| `evalCase` | `testCase` | object is a test case, not an eval run |
| `evalCases` | `testCases` | same reason |
| `rawTestcases` | `rawTestCases` | same reason plus consistent casing |
| legacy `evalId` fallback naming | `testId` only where possible | reduce dual-term confusion |
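The renames above are internal only; the persisted fields stay `dataset` and `test_id`. One way to keep that separation explicit is a single mapping point between the wire record and the in-memory type. The sketch below assumes a hypothetical `TestCaseResult` type and `fromWire`/`toWire` helpers; real result records carry more fields than the two shown here.

```typescript
// Hypothetical internal shape using the proposed names.
interface TestCaseResult {
  datasetName: string;
  testId: string;
}

// Parse one JSONL line; the wire keys `dataset` and `test_id` are unchanged.
function fromWire(line: string): TestCaseResult {
  const raw: unknown = JSON.parse(line);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("expected a JSON object");
  }
  const { dataset, test_id } = raw as Record<string, unknown>;
  if (typeof dataset !== "string" || typeof test_id !== "string") {
    throw new Error("record is missing `dataset` or `test_id`");
  }
  return { datasetName: dataset, testId: test_id };
}

// Serialize back to the same wire keys, so the rename never leaks to disk.
function toWire(result: TestCaseResult): string {
  return JSON.stringify({ dataset: result.datasetName, test_id: result.testId });
}
```

Keeping the translation in one place means a later search for `evalId` or `suite` in the codebase should only turn up code that still needs renaming, not wire-format handling.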

## Non-Goals

- Do not change external wire format away from `dataset`/`test_id`
- Do not block feature work like `#913`
- Do not bundle risky behavioral changes unrelated to result input layout or naming cleanup

## Acceptance Signals

- remaining legacy flat result-file compatibility is removed or explicitly isolated behind a deliberate compatibility boundary
- new commands only document and accept canonical run workspace inputs
- variable naming in touched areas is aligned toward `dataset` and `testCase`
- docs and code comments use consistent terminology


