Objective
Add an agentv eval validate command that checks whether an eval follows self-contained conventions, and a --fix flag that automatically inlines missing dependencies to make the eval portable in place.
Rationale
If evals follow conventions (relative paths, committed workspace templates, local scripts, local targets), the eval directory is already a portable artifact. Rather than building a separate bundle compiler, validate conventions and autofix violations in place.
This is analogous to how agent-skills validates skill structure and conventions.
Current problem: AgentV's examples/ use a shared parent targets.yaml (e.g., examples/features/.agentv/targets.yaml). This means no single example directory is self-contained — they all depend on a parent directory for target resolution. The --fix option resolves this by inlining the resolved targets locally.
Design
Validate mode
agentv eval validate my-eval.yaml
Checks:
- Relative paths: all tests:, workspace:, command: references resolve relative to the eval file (no absolute paths)
- Files exist: workspace template directory, code-grader scripts, and hook scripts all exist at their referenced paths
- Local targets: targets used by the eval are defined locally (not inherited from a parent .agentv/targets.yaml)
- No dangling env vars: warn on ${{ VAR }} references that aren't documented or have no default
- Target resolution: use_target chains resolve without circular references
- Provenance: warn if the eval is not in a git repo (no commit hash for reproducibility)
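The first two checks could look roughly like the sketch below. It is a minimal illustration in Python, not AgentV's implementation: the helper name check_reference and its messages are hypothetical, and it assumes each reference is a plain path string.

```python
import os
import tempfile
from pathlib import Path


def check_reference(eval_file, ref):
    """Return None if `ref` is acceptable, else a short problem description.

    Hypothetical helper: `ref` is assumed to be a path string taken from a
    field such as tests: or workspace: in the eval file.
    """
    if os.path.isabs(ref):
        return f"absolute path not allowed: {ref}"
    # Relative references are resolved against the eval file's directory.
    resolved = Path(eval_file).parent / ref
    if not resolved.exists():
        return f"missing file: {ref}"
    return None


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    eval_file = Path(tmp) / "my-eval.yaml"
    eval_file.write_text("# placeholder eval\n", encoding="utf-8")
    (Path(tmp) / "cases.jsonl").write_text("", encoding="utf-8")

    ok = check_reference(eval_file, "./cases.jsonl")
    missing = check_reference(eval_file, "./scripts/verify.sh")
    absolute = check_reference(eval_file, os.path.abspath(tmp))
```

Resolving against the eval file's own directory, rather than the current working directory, is what lets the check give the same answer wherever the command is run from.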
Output:
✓ my-eval.yaml
✓ All test data paths resolve (tests: ./cases.jsonl)
✓ Workspace template exists (workspace: ./template/)
✓ Code-grader scripts exist (./scripts/verify.sh)
✗ Target "claude" resolved from parent .agentv/targets.yaml — not self-contained
⚠ ENV_VAR referenced but not documented
✓ Git repo detected (commit: abc123)
1 error, 1 warning. Run with --fix to make this eval self-contained.
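The warning line in the sample output comes from the env var check. One possible way to find ${{ VAR }} references is a regular-expression scan, sketched below in Python. The function name, the assumption that variable names are identifier-like, and the idea that the caller supplies the set of documented names are all illustrative; the proposal does not say where documentation or defaults live.

```python
import re

# Matches ${{ NAME }} with optional whitespace inside the braces.
# Assumes variable names look like identifiers (letters, digits, underscore).
ENV_REF = re.compile(r"\$\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def undocumented_env_vars(eval_text, documented):
    """Return referenced variable names not in `documented`, in first-seen order."""
    found = ENV_REF.findall(eval_text)
    unique = list(dict.fromkeys(found))  # drop duplicates, keep order
    return [name for name in unique if name not in documented]


sample = "api_key: ${{ API_KEY }}\nmodel: ${{MODEL}}\nretry_key: ${{ API_KEY }}\n"
warnings = undocumented_env_vars(sample, documented={"MODEL"})
```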
Fix mode
agentv eval validate my-eval.yaml --fix
What --fix does:
- Inlines targets: resolves use_target chains from parent .agentv/targets.yaml files and writes a local targets.yaml (or inlines into EVAL.yaml) with the fully resolved definitions
- Resolves relative paths: rewrites any paths that depend on parent directory structure to be relative to the eval file
- Reports what it fixed: prints each change made
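Inlining targets means following each use_target chain to a concrete definition and rejecting circular chains along the way. The Python sketch below shows one way to do that. It assumes targets are already loaded into a dict mapping a name to a definition dict with an optional "use_target" key; that shape, the function names, and the sample data are hypothetical and not taken from AgentV's schema.

```python
def resolve_target(name, targets):
    """Follow use_target links from `name` to a concrete definition.

    Returns (resolved_name, definition). Raises ValueError for an unknown
    target or a circular chain.
    """
    seen = []
    current = name
    while True:
        if current in seen:
            cycle = " -> ".join(seen + [current])
            raise ValueError(f"circular use_target chain: {cycle}")
        if current not in targets:
            raise ValueError(f"unknown target: {current}")
        seen.append(current)
        next_name = targets[current].get("use_target")
        if next_name is None:
            return current, targets[current]
        current = next_name


def inline_targets(used, targets):
    """Map each name the eval uses to a copy of its fully resolved definition."""
    return {name: dict(resolve_target(name, targets)[1]) for name in used}


# Made-up data: "default" aliases "claude"; "a" and "b" point at each other.
parent_targets = {
    "default": {"use_target": "claude"},
    "claude": {"provider": "example"},
    "a": {"use_target": "b"},
    "b": {"use_target": "a"},
}
local_targets = inline_targets(["default"], parent_targets)
```

The same chain walk serves the "Target resolution" check in validate mode, so both modes could share it.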
Properties:
- Modifies the eval in place — no separate output directory
- Idempotent — running twice changes nothing
- Composes with git — commit the fixed eval and it's portable
- Does NOT copy workspace templates or scripts — those must already be alongside the eval
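One simple way to get the idempotency property is to write a file only when its content would actually change, so a second run has nothing to report. The Python sketch below illustrates the idea; write_if_changed is a hypothetical helper and the file content is a placeholder, not a real targets.yaml.

```python
import tempfile
from pathlib import Path


def write_if_changed(path, content):
    """Write `content` to `path` only if it differs; return True if written."""
    path = Path(path)
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.write_text(content, encoding="utf-8")
    return True


# Running the same "fix" twice: the first call writes, the second is a no-op.
with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "targets.yaml"
    placeholder = "# placeholder for resolved target definitions\n"
    first_run_changed = write_if_changed(local_file, placeholder)
    second_run_changed = write_if_changed(local_file, placeholder)
```

The boolean return value can also drive the "reports what it fixed" output, since only real changes would be printed.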
Exit codes
- 0: all checks pass (or all fixable issues were fixed with --fix)
- 1: validation errors (non-portable eval, not auto-fixable)
- Warnings don't fail validation
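The exit code rules above can be expressed as a small function. This Python sketch assumes each finding records a severity and whether --fix can repair it; the Finding type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str   # "error" or "warning"
    fixable: bool   # True if --fix can repair it
    message: str


def exit_code(findings, fix=False):
    """Return 0 or 1 following the rules listed above."""
    for finding in findings:
        if finding.severity != "error":
            continue  # warnings never fail validation
        if fix and finding.fixable:
            continue  # assumed to have been repaired by --fix
        return 1
    return 0


findings = [
    Finding("error", True, 'Target "claude" resolved from parent'),
    Finding("warning", False, "ENV_VAR referenced but not documented"),
]
without_fix = exit_code(findings)
with_fix = exit_code(findings, fix=True)
```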
Non-goals
- Not a schema validator (that already exists via eval schema validation)
- Not a bundle compiler — there is no separate output artifact. If validation passes, the eval directory IS the bundle.
- Not enforcing conventions on all evals — this is opt-in for benchmark-grade portability
- Not auto-copying workspace templates or scripts; those must be local already. --fix only handles target resolution and path rewriting.
Related
Acceptance signals
- agentv eval validate checks all listed conventions
- agentv eval validate --fix inlines resolved targets and rewrites paths in place
- --fix is idempotent