Close programmatic TS API gap: add beforeAll, budgetUsd, and multi-turn fields to EvalConfig / EvalTestInput #1115

## Problem

The programmatic SDK entry point `evaluate({ tests, target, ... })` is the path TypeScript-first eval authors use instead of YAML. It works today (see `examples/features/sdk-programmatic-api/evaluate.ts`) and gives real type-safety wins — autocomplete, refactor rename, compile errors on typos — that YAML can't match.

But several first-class YAML features are not exposed on the public programmatic types, which forces authors to fall back to YAML the moment they need any of them.

Verified against current `main`:

- `EvalConfig` (`packages/core/src/evaluation/evaluate.ts:138`) does not expose `beforeAll` or `budgetUsd`. Both are supported in YAML, as the suite-level `beforeAll` command and `execution.total_budget_usd` respectively.
- `EvalTestInput` (`packages/core/src/evaluation/evaluate.ts:83`) does not expose `turns` or `aggregation` — both are typed on the internal `EvalTest` and usable from YAML, but not from the programmatic API.

## Proposal

Add the following to the public TS types, wiring through to the same orchestrator paths used by the YAML loader:

### `EvalConfig`

- `beforeAll?: string | readonly string[]` — command(s) to run before the suite. Same semantics as YAML `beforeAll`.
- `budgetUsd?: number` — suite-level cost cap. Same semantics as YAML `execution.total_budget_usd` (renamed to `execution.budget_usd` if #1114 lands first).
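
A minimal sketch of how the two fields above might be typed and normalised before reaching the orchestrator. `EvalConfigAdditions`, `normalizeBeforeAll`, and `checkBudgetUsd` are hypothetical names used only for illustration, not existing exports, and rejecting negative or non-finite caps is an assumption rather than confirmed YAML-loader behaviour.

```typescript
// Hypothetical slice of EvalConfig containing only the two proposed fields.
interface EvalConfigAdditions {
  /** Command(s) to run once before the suite. */
  readonly beforeAll?: string | readonly string[];
  /** Suite-level cost cap in USD. */
  readonly budgetUsd?: number;
}

// Hypothetical helper: collapse `string | readonly string[]` into a list of commands.
function normalizeBeforeAll(
  value: EvalConfigAdditions["beforeAll"],
): readonly string[] {
  if (value === undefined) return [];
  return typeof value === "string" ? [value] : value;
}

// Hypothetical helper: assumes a cap must be a non-negative finite number.
function checkBudgetUsd(value: number | undefined): number | undefined {
  if (value === undefined) return undefined;
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError(`budgetUsd must be a non-negative finite number, got ${value}`);
  }
  return value;
}
```

Accepting both a single string and a list keeps the common one-command case terse while letting the rest of the pipeline work with a single array shape.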

### `EvalTestInput`

- `turns?: readonly ...[]` — multi-turn conversation definition (match the existing internal `EvalTest` shape).
- `aggregation?: ...` — aggregation strategy across turns (match existing internal shape).
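
A sketch of the `EvalTestInput` additions under the same caveat: the concrete turn and aggregation shapes live on the internal `EvalTest` and are not reproduced here, so type parameters stand in for them. `EvalTestInputAdditions`, `isMultiTurn`, and the `{ input: string }` turn used in the usage lines are hypothetical placeholders.

```typescript
// Type parameters stand in for the internal EvalTest shapes, which this
// sketch deliberately does not guess at.
interface EvalTestInputAdditions<TTurn, TAggregation> {
  /** Multi-turn conversation definition. */
  readonly turns?: readonly TTurn[];
  /** Aggregation strategy across turns. */
  readonly aggregation?: TAggregation;
}

// Hypothetical helper: a test counts as multi-turn when it defines at least one turn.
function isMultiTurn<TTurn, TAggregation>(
  test: EvalTestInputAdditions<TTurn, TAggregation>,
): boolean {
  return test.turns !== undefined && test.turns.length > 0;
}

// Usage with an illustrative (not real) turn shape and aggregation value.
const example: EvalTestInputAdditions<{ input: string }, string> = {
  turns: [{ input: "Hello" }, { input: "And a follow-up" }],
  aggregation: "placeholder-strategy",
};
const exampleIsMultiTurn = isMultiTurn(example);
```

In the real change the placeholders would be replaced by re-exporting or referencing the existing internal types, so the programmatic and YAML paths cannot drift apart.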

## Acceptance criteria

- `EvalConfig` accepts `beforeAll` and `budgetUsd` and they route to the same code paths as YAML (including `budget_exceeded` emission on breach).
- `EvalTestInput` accepts `turns` and `aggregation`; a multi-turn test can be authored purely in TS with no YAML.
- Existing YAML loader behaviour unchanged; programmatic and YAML paths share the same underlying `EvalTest` / suite shape.
- One example added under `examples/features/` exercising all four fields in TS. (The `multi-turn-conversation` example can be cloned to a TS-authored variant.)
- Docs for the programmatic API list these fields as supported.
- No existing tests regress; new tests cover each field end-to-end via `evaluate()`.
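
The shared-shape criterion could be checked with a parity test along these lines. This is a sketch only: `YamlSuiteLike`, `ProgrammaticLike`, `SuiteOptions`, and the two mapping functions are hypothetical stand-ins for the real loader and orchestrator types, and only the field names `beforeAll`, `budgetUsd`, and `execution.total_budget_usd` come from this issue.

```typescript
// Hypothetical view of the relevant YAML fields after parsing.
interface YamlSuiteLike {
  beforeAll?: string | readonly string[];
  execution?: { total_budget_usd?: number };
}

// Hypothetical view of the proposed programmatic fields.
interface ProgrammaticLike {
  beforeAll?: string | readonly string[];
  budgetUsd?: number;
}

// Hypothetical common shape both paths would feed to the orchestrator.
interface SuiteOptions {
  beforeAll: readonly string[];
  budgetUsd: number | undefined;
}

function toList(v: string | readonly string[] | undefined): readonly string[] {
  if (v === undefined) return [];
  return typeof v === "string" ? [v] : v;
}

function fromYaml(y: YamlSuiteLike): SuiteOptions {
  return { beforeAll: toList(y.beforeAll), budgetUsd: y.execution?.total_budget_usd };
}

function fromProgrammatic(p: ProgrammaticLike): SuiteOptions {
  return { beforeAll: toList(p.beforeAll), budgetUsd: p.budgetUsd };
}

// Equivalent inputs on each path should produce the same suite options.
const viaYaml = fromYaml({ beforeAll: "npm run seed", execution: { total_budget_usd: 5 } });
const viaTs = fromProgrammatic({ beforeAll: ["npm run seed"], budgetUsd: 5 });
const sameShape = JSON.stringify(viaYaml) === JSON.stringify(viaTs);
```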

## Non-goals

- Auto-discovery of `*.eval.ts` files (tracked separately).
- Renaming existing fields.
- Adding new grader types.

## Motivation

Closing this gap makes the TS authoring path a real peer to YAML rather than a subset. Today a user who starts programmatically and hits any of these features has to rewrite their suite in YAML — a needless migration cliff.
