Add *.eval.ts auto-discovery so TS-first eval authors don't need a hand-wired runner #1116

## Problem

Running evals authored in TypeScript today requires writing a `run.ts` that imports case modules and calls `evaluate(...)` explicitly. YAML evals are auto-discovered by the CLI; TS evals aren't.

Evidence: `apps/cli/src/commands/eval/commands/run.ts:23` describes the positional args as *"Path(s) or glob(s) to evaluation .yaml file(s)"*. Globbing works for YAML / JSONL / JSON only. The `sdk-config-file` example covers `agentv.config.ts` discovery but that's the **config**, not eval case files.

That required boilerplate is the main reason the TS authoring path feels second-class next to YAML.

## Proposal

A discovery convention:

- CLI discovers `**/*.eval.ts` (and `**/*.eval.js` after a build step, if relevant) the same way it discovers `EVAL.yaml` / `*.eval.yaml`.
- Each discovered module default-exports (or named-exports) an `EvalConfig` value, and the CLI runs it with the same reporter, inspect, and compare tooling as YAML evals.
- Config discovery precedence and `--filter` / `--tag` / `--only` flags apply uniformly across YAML and TS evals.
- Runtime: use whatever loader the runtime supports for TS modules (Bun direct import, or tsx / jiti for Node). Document the expectation.
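As a rough illustration of the convention above, the sketch below classifies a path by its eval suffix and resolves the value a discovered module exports. The helper names `classifyEvalFile` and `pickEvalExport` are hypothetical, not part of agentv, and the rule for named exports (accept one only when exactly one object-valued export exists) is an assumption made for the example.

```typescript
// Hypothetical helpers for illustration only; not agentv's implementation.
type EvalKind = "yaml" | "ts" | "js";

// Classify a file path by the discovery suffixes named in the proposal.
function classifyEvalFile(path: string): EvalKind | null {
  const base = path.split(/[\\/]/).pop() ?? "";
  if (base === "EVAL.yaml" || base.endsWith(".eval.yaml")) return "yaml";
  if (base.endsWith(".eval.ts")) return "ts";
  if (base.endsWith(".eval.js")) return "js";
  return null;
}

// Resolve the eval value from an imported module namespace.
// Prefers the default export; otherwise accepts a named export only when
// exactly one object-valued export exists (an assumed disambiguation rule).
function pickEvalExport(mod: Record<string, unknown>): unknown {
  if (mod["default"] !== undefined) return mod["default"];
  const candidates = Object.entries(mod)
    .filter(([key, value]) => key !== "default" && typeof value === "object" && value !== null)
    .map(([, value]) => value);
  return candidates.length === 1 ? candidates[0] : undefined;
}
```

In a real loader the module namespace would come from `await import(path)`, which works directly under Bun; under Node a loader such as tsx or jiti would be needed first, as noted in the runtime bullet.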

## Acceptance criteria

- `agentv run` picks up `*.eval.ts` files with no extra flags.
- A TS eval and an equivalent YAML eval in the same workspace produce identical trace / inspect output.
- Example under `examples/features/` demonstrates a mixed YAML + TS suite.
- Docs updated with discovery rules and the runtime expectation for executing `.ts` files.
- `--workers`, `--threshold`, `--tag`, `--exclude-tag`, `--filter`, `--retry-errors`, `--output`, and the cache/output-dir behaviour all work identically for TS-authored suites.
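One way to get identical flag behaviour is to normalize YAML and TS suites into a single shape before any selection logic runs, so the flags never see where a suite came from. The sketch below shows this for `--tag` and `--exclude-tag` only. The `DiscoveredSuite` and `TagFlags` types and the `selectSuites` function are hypothetical, and the "match any listed tag" semantics is an assumption rather than documented agentv behaviour.

```typescript
// Hypothetical shapes for illustration; agentv's real types may differ.
interface DiscoveredSuite {
  source: "yaml" | "ts"; // where the suite was discovered
  name: string;
  tags: string[];
}

interface TagFlags {
  tag?: string[];        // mirrors --tag
  excludeTag?: string[]; // mirrors --exclude-tag
}

// Apply tag selection without ever inspecting `source`, so YAML and TS
// suites are treated the same way by construction.
function selectSuites(suites: DiscoveredSuite[], flags: TagFlags): DiscoveredSuite[] {
  const include = flags.tag ?? [];
  const exclude = flags.excludeTag ?? [];
  return suites.filter((suite) => {
    // Assumed semantics: a suite matches if it carries any included tag.
    const included = include.length === 0 || include.some((t) => suite.tags.includes(t));
    const excluded = exclude.some((t) => suite.tags.includes(t));
    return included && !excluded;
  });
}
```

Because the filter reads only `name` and `tags`, a parity test can assert that two suites differing only in `source` are always selected or rejected together.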

## Non-goals

- Does not add new `EvalConfig` fields — depends on #1115 for `beforeAll` / `budgetUsd` / multi-turn parity first.
- Does not replace the programmatic `evaluate()` function; discovered `.eval.ts` files are run through it under the hood.

## Depends on

- #1115 — without the programmatic API gap closed first, auto-discovery would surface a TS authoring path that's still missing `beforeAll` / `budgetUsd` / multi-turn support.

## Motivation

Closes the DX gap with hand-rolled TS harnesses while keeping agentv's framework advantages — cost tracking, inspect, compare, grader variety. The current cliff is: "use YAML, or write your own runner script." Neither is what a TS-first user wants.
