feat(targets): remove workspace_template, add target-level hooks by christso · Pull Request #1095 · EntityProcess/agentv

christso · 2026-04-14T03:26:47Z

Summary

Remove workspace_template from target schema — unused field (zero references in any eval file). Removed from BASE_TARGET_SCHEMA, TargetDefinition, all provider resolved configs, orchestrator, validators, and docs.
Add target-level hooks — eval files can now define per-target setup/teardown hooks in execution.targets using object form, enabling single-file harness variant comparison (e.g., baseline vs with-plugins vs with-guidelines).

Target hooks example

execution:
  targets:
    - baseline                          # string shorthand (no hooks)
    - name: with-skills                 # object form with hooks
      use_target: default
      hooks:
        before_each:
          command: ["setup-plugins.sh", "skills"]
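
The two entry forms above could be normalized into one shape before hooks are threaded through. The following is a minimal sketch, not the PR's implementation: the field layout of `TargetHooksConfig` and `EvalTargetRef` is inferred from the YAML example only, and `HookCommand` and `normalizeTargetRef` are hypothetical names introduced for illustration.

```typescript
// Assumed shapes, inferred from the YAML example; the real types in
// packages/core/src/evaluation/types.ts may differ.
interface HookCommand {
  command: string[];
}

interface TargetHooksConfig {
  before_all?: HookCommand;
  before_each?: HookCommand;
  after_each?: HookCommand;
  after_all?: HookCommand;
}

interface EvalTargetRef {
  name: string;
  use_target?: string;
  hooks?: TargetHooksConfig;
}

// Hypothetical helper: a bare string becomes a ref with no hooks,
// an object entry is passed through unchanged.
function normalizeTargetRef(entry: string | EvalTargetRef): EvalTargetRef {
  return typeof entry === "string" ? { name: entry } : entry;
}

// The same two entries as in the YAML example.
const entries: Array<string | EvalTargetRef> = [
  "baseline",
  {
    name: "with-skills",
    use_target: "default",
    hooks: { before_each: { command: ["setup-plugins.sh", "skills"] } },
  },
];

const targetRefs: EvalTargetRef[] = entries.map(normalizeTargetRef);
```

With this shape, downstream code only has to check whether `hooks` is present instead of branching on the entry type each time.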

Hook execution order

Target hooks nest inside workspace hooks (standard setup/teardown nesting):

Workspace before_all → Target before_all
Per test: Workspace before_each → Target before_each → test → Target after_each → Workspace after_each
Target after_all → Workspace after_all
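
The nesting above can be traced with a small sketch. It only records the order of events for a single target and does not reflect how the orchestrator actually invokes hooks; `traceHookOrder` and the `scope.point` label format are assumptions made for illustration.

```typescript
// The four lifecycle points named in this PR; everything else is illustrative.
type HookPoint = "before_all" | "before_each" | "after_each" | "after_all";
type HookScope = "workspace" | "target";

// Hypothetical helper: returns the order in which hooks and tests would
// fire for one target, following the nesting described above.
function traceHookOrder(testIds: string[]): string[] {
  const trace: string[] = [];
  const fire = (scope: HookScope, point: HookPoint): void => {
    trace.push(`${scope}.${point}`);
  };

  // Setup runs outermost first: workspace, then target.
  fire("workspace", "before_all");
  fire("target", "before_all");

  for (const id of testIds) {
    fire("workspace", "before_each");
    fire("target", "before_each");
    trace.push(`test:${id}`);
    // Teardown runs innermost first: target, then workspace.
    fire("target", "after_each");
    fire("workspace", "after_each");
  }

  fire("target", "after_all");
  fire("workspace", "after_all");
  return trace;
}

console.log(traceHookOrder(["t1", "t2"]).join("\n"));
```

Setup hooks fire from the outer scope inward and teardown hooks fire in reverse, so a target hook always sees a workspace that the workspace-level hooks have already prepared.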

Changes

packages/core/src/evaluation/types.ts — new TargetHooksConfig, EvalTargetRef types
packages/core/src/evaluation/validation/eval-file.schema.ts — execution.targets accepts (string | EvalTargetRef)[]
packages/core/src/evaluation/loaders/config-loader.ts — new extractTargetRefsFromSuite()
packages/core/src/evaluation/orchestrator.ts — 4 lifecycle hook execution points
apps/cli/src/commands/eval/targets.ts — synthetic target injection, hooks threading
apps/cli/src/commands/eval/run-eval.ts — pass targetRefs and targetHooks through
Docs, tests, eval-schema.json updated

Test plan

bun run build — passes
bun run typecheck — passes
bun run lint — passes
bun run test — 2157 tests pass (1642 core + 67 eval + 448 cli)
bun run validate:examples — 56/56 valid
Manual UAT with real eval file using target hooks

🤖 Generated with Claude Code

Remove the unused `workspace_template` field from target schema and add per-target hooks support in eval files. Target hooks let a single eval file compare different harness configurations (e.g., baseline vs with-plugins) by running setup/teardown scripts per target variant.

- Remove `workspace_template` from BASE_TARGET_SCHEMA, TargetDefinition, all provider resolved configs, orchestrator, and validators
- Add `TargetHooksConfig` and `EvalTargetRef` types
- Extend `execution.targets` in eval files to accept objects with hooks
- Execute target hooks at 4 lifecycle points: before_all, before_each, after_each, after_all (nested inside workspace hooks)
- Update docs and regenerate eval-schema.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-04-14T03:27:08Z

Deploying agentv with Cloudflare Pages

Latest commit:	`db1a83a`
Status:	✅ Deploy successful!
Preview URL:	https://df322415.agentv.pages.dev
Branch Preview URL:	https://feat-1094-target-hooks.agentv.pages.dev


- Accept string command shorthand in WorkspaceHookSchema (matches docs and parseHookConfig behavior)
- Restore deprecation warning when workspace_template appears in targets.yaml (downgraded from error to warning for migration help)
- Fix misleading "runs ONCE" comment on target before_all

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace the removed workspace_template field with the target-level hooks pattern from #1095. A single base 'claude' target is defined in targets.yaml, and the eval file's execution.targets uses before_each hooks to copy variant-specific plugin configs into the workspace.

Also fixes:
- Use 'id' instead of deprecated 'case' in test definitions
- Use full commit hash with resolve: local for base_commit
- Remove shallow clone (depth: 1) that prevented commit checkout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…luation (#1091)

* docs(showcase): add bug-fix-benchmark example for SWE-bench style evaluation

  Add a showcase example demonstrating how to evaluate coding agents on real-world bug fixes using public GitHub repositories with Docker workspace isolation and commit-pinned repos.

  Includes:
  - EVAL.yaml with example test cases (null-check, fallback, property-access bugs)
  - targets.yaml showing all auth options (subscription, API key, mock)
  - mock-agent.sh for testing without API keys
  - import-swebench.sh for importing SWE-bench dataset instances

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(showcase): add multi-plugin benchmark with baseline comparison

  Add workspace templates for comparing agent performance with and without engineering plugins: superpowers, compound-engineering, agent-skills.

  - Add workspaces/ with per-plugin .claude/settings.json configs
  - Update targets.yaml with claude-baseline, claude-superpowers, claude-compound, claude-agent-skills targets
  - Replace hypothetical test cases with real issue #912 bug fix task
  - Add scripts/setup-plugins.sh for plugin installation
  - Update README with comparison workflow and plugin details

  Closes #919

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(showcase): use bypassPermissions in all workspace settings

  Use defaultMode: bypassPermissions instead of listing individual Bash allow rules, matching how the agentv dev environment is configured.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(showcase): use target-level hooks instead of workspace_template

  Replace the removed workspace_template field with the target-level hooks pattern from #1095. A single base 'claude' target is defined in targets.yaml, and the eval file's execution.targets uses before_each hooks to copy variant-specific plugin configs into the workspace.

  Also fixes:
  - Use 'id' instead of deprecated 'case' in test definitions
  - Use full commit hash with resolve: local for base_commit
  - Remove shallow clone (depth: 1) that prevented commit checkout

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(showcase): use claude-cli provider with configurable executable

  Switch from provider: claude to provider: claude-cli with an executable field that reads from CLAUDE_EXECUTABLE env var (defaults to "claude"). This allows using custom CLI binaries like claude-zai.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(showcase): remove unused import-swebench.sh script

  The script was speculative and non-functional (used deprecated fields, hardcoded docker config, broken template variables). Not needed for the benchmark showcase.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(showcase): remove local targets.yaml, use inline rubric assertions

  - Remove local .agentv/targets.yaml; use repo root targets instead (targets don't merge, closest shadows; local one forced duplicating grader targets unnecessarily)
  - Replace llm-grader assertion with inline rubric strings (auto-unwrapped to rubrics evaluator)
  - Remove unused scripts: mock-agent.sh (broken with workspace repos), setup-plugins.sh (orphaned, settings.json already checked in)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(showcase): use AGENT_TARGET, reset workspace, drop fragile contains assertion

  - Replace hardcoded use_target: claude with ${{ AGENT_TARGET }} so the benchmark works with any provider via env var
  - Add workspace.hooks.before_each.reset: fast for proper isolation between pool slot reuse across plugin variants
  - Remove contains: effectiveCwd assertion (checks response text, not the diff); rubrics already validate the fix via file_changes

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

christso marked this pull request as ready for review April 14, 2026 04:31

christso merged commit b3760ce into main Apr 14, 2026
4 checks passed

christso deleted the feat/1094-target-hooks branch April 14, 2026 04:31

christso mentioned this pull request Apr 14, 2026

docs(showcase): add bug-fix-benchmark example for SWE-bench style evaluation #1091

Merged

7 tasks

christso merged 2 commits into main from feat/1094-target-hooks
