feat(core): add beforeAll, budgetUsd, turns, aggregation to programmatic API by christso · Pull Request #1119 · EntityProcess/agentv

christso · 2026-04-16T00:05:17Z

Summary

Closes the programmatic TS API gap by exposing four YAML-only features on the public SDK types.

Closes #1115

Changes

Types (`packages/core/src/evaluation/evaluate.ts`)

- `EvalConfig.beforeAll` (`string | string[]`): command(s) run before the suite; converted to a `WorkspaceHookConfig` that executes via `sh -c`
- `EvalConfig.budgetUsd` (`number`): suite-level cost cap, passed to the orchestrator
- `EvalTestInput.turns` (`ConversationTurnInput[]`): multi-turn conversation definition
- `EvalTestInput.aggregation` (`'mean' | 'min' | 'max'`): how per-turn scores are aggregated
- `EvalTestInput.mode` (`'conversation'`): inferred automatically when `turns[]` is present
- `EvalTestInput.input`: now optional (not needed in conversation mode)
- New `ConversationTurnInput` interface exported from `@agentv/core`
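
To make the shape of these fields concrete, here is a minimal sketch using local stand-in types rather than the real exports from `@agentv/core`. The turn fields shown, the simplification of turn input to a plain string, and the helper names `resolveMode` and `aggregateTurnScores` are assumptions for illustration only; the PR text does not show the actual definitions.

```typescript
// Sketch only: local stand-ins for the types described in this PR.
// The real definitions live in packages/core/src/evaluation/evaluate.ts.
type ConversationAggregation = 'mean' | 'min' | 'max';

interface ConversationTurnInput {
  input: string; // simplified; the PR keeps turn inputs as TestMessageContent
  expectedOutput?: string;
}

interface EvalTestInput {
  input?: string; // optional: omitted when turns[] is used
  turns?: ConversationTurnInput[];
  aggregation?: ConversationAggregation;
  mode?: 'conversation';
}

// Hypothetical helper: an explicit mode wins, otherwise the presence of
// turns[] implies conversation mode.
function resolveMode(test: EvalTestInput): 'conversation' | undefined {
  return test.mode ?? (test.turns !== undefined ? 'conversation' : undefined);
}

// Hypothetical helper: collapses per-turn scores into one test score.
function aggregateTurnScores(
  scores: number[],
  strategy: ConversationAggregation,
): number {
  if (scores.length === 0) {
    throw new Error('no turn scores to aggregate');
  }
  if (strategy === 'min') return Math.min(...scores);
  if (strategy === 'max') return Math.max(...scores);
  // 'mean': arithmetic average of all turn scores
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```

With this sketch, `'min'` makes a conversation only as good as its weakest turn, while `'mean'` lets a strong turn offset a weak one.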

Conversion Logic

- `toBeforeAllHook()`: converts a string or string array to a `WorkspaceHookConfig`
- `toMessageArray()` / `extractQuestion()`: helpers extracted for input normalization
- `convertAssertions()`: extracted from inline code so it can be reused in turn conversion
- Turn inputs are kept as `TestMessageContent` (matching YAML parser behavior)
- Validation: throws if `input` is missing on a non-conversation test
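
The conversion steps above can be sketched roughly as follows. This is not the code from the PR: the `WorkspaceHookConfig` shape, the `Message` type, the choice to chain array commands with `&&`, and the `requireInput` helper are all assumptions made for illustration. Only the function names `toBeforeAllHook` and `toMessageArray` and the use of `sh -c` come from the description.

```typescript
// Hypothetical shape: the real WorkspaceHookConfig is not shown in the PR.
interface WorkspaceHookConfig {
  command: string[];
}

// Hypothetical message shape used only for this sketch.
type Message = { role: 'user' | 'assistant' | 'system'; content: string };

// Wraps the setup command(s) in `sh -c`.
// Assumption: an array is chained with && so a failing step stops the rest.
function toBeforeAllHook(beforeAll: string | string[]): WorkspaceHookConfig {
  const script = Array.isArray(beforeAll) ? beforeAll.join(' && ') : beforeAll;
  return { command: ['sh', '-c', script] };
}

// Normalizes a bare string into a single user message; arrays pass through.
function toMessageArray(input: string | Message[]): Message[] {
  return typeof input === 'string' ? [{ role: 'user', content: input }] : input;
}

// Mirrors the validation rule: input is required unless turns[] is set.
function requireInput(test: { input?: string | Message[]; turns?: unknown[] }): void {
  if (test.input === undefined && test.turns === undefined) {
    throw new Error('input is required when turns[] is not set');
  }
}
```

Passing the script through `sh -c` lets callers write shell syntax such as pipes or environment assignments in `beforeAll` without the SDK having to parse it.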

Example

examples/features/sdk-programmatic-api-advanced/ — exercises all four new fields

Tests (`packages/core/test/evaluation/evaluate-programmatic-api.test.ts`)

11 tests covering: budgetUsd, turns with explicit/inferred mode, expectedOutput on turns, message array turns, aggregation, beforeAll (string and array), combined usage, standard single-turn, and missing-input validation.

E2E Verification

Green (all new tests pass):

$ bun test packages/core/test/evaluation/evaluate-programmatic-api.test.ts
 11 pass, 0 fail

Test Results

472/472 pass, 0 failures (workspace total)

…tic API (#1115)

Close the programmatic TS API gap by adding four YAML-first features to the public SDK types:

- EvalConfig.beforeAll: string | string[] - suite-level setup command
- EvalConfig.budgetUsd: number - cost cap passed to orchestrator
- EvalTestInput.turns: ConversationTurnInput[] - multi-turn conversations
- EvalTestInput.aggregation: ConversationAggregation - score strategy
- EvalTestInput.mode: "conversation" - inferred automatically from turns[]

New ConversationTurnInput type mirrors YAML turn structure with camelCase. Input field on EvalTestInput is now optional (omit when using turns[]).

Includes 10 new tests, an advanced SDK example, and full lint/build pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

christso mentioned this pull request Apr 16, 2026

feat(cli): add *.eval.ts auto-discovery #1120

Merged

christso merged commit 0ee2e93 into main Apr 16, 2026
4 checks passed

christso deleted the feat/1115-programmatic-api branch April 16, 2026 03:40

christso merged 1 commit into main from feat/1115-programmatic-api
