fix(pipeline): grade built-in deterministic assertions in subagent mode by christso · Pull Request #1085 · EntityProcess/agentv

christso · 2026-04-13T21:01:50Z

Summary

pipeline input now writes built-in deterministic assertion configs (contains, regex, equals, etc.) to a builtin_graders/ directory alongside the existing code_graders/ and llm_graders/ directories
pipeline grade evaluates these built-in assertions in-process against response.md and writes the results to code_grader_results/, so that pipeline bench merges them into the final score
pipeline bench reads the type field from each result JSON instead of hardcoding 'code-grader', preserving the original assertion type in the output
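
For readers unfamiliar with deterministic assertions, here is a minimal sketch of in-process grading of the kind described above. It is not agentv's actual implementation: the `BuiltinAssertion` shape, the `gradeBuiltin` function, the `value` and `negate` field names, and the whitespace trimming in `equals` are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; agentv's real config schema may differ.
type BuiltinAssertionType =
  | "contains"
  | "regex"
  | "equals"
  | "starts-with"
  | "ends-with"
  | "is-json";

interface BuiltinAssertion {
  type: BuiltinAssertionType;
  value?: string; // expected text or regex source; unused by is-json
  negate?: boolean; // when true, invert the outcome
}

interface GraderResult {
  type: BuiltinAssertionType; // kept so a later stage can report the original type
  score: 0 | 1;
}

function isJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// One pure predicate per assertion type; no subprocess or LLM call is needed.
const checks: Record<BuiltinAssertionType, (response: string, value: string) => boolean> = {
  contains: (response, value) => response.includes(value),
  regex: (response, value) => new RegExp(value).test(response),
  // Assumption: surrounding whitespace is ignored when comparing for equality.
  equals: (response, value) => response.trim() === value.trim(),
  "starts-with": (response, value) => response.startsWith(value),
  "ends-with": (response, value) => response.endsWith(value),
  "is-json": (response) => isJson(response),
};

function gradeBuiltin(response: string, assertion: BuiltinAssertion): GraderResult {
  const matched = checks[assertion.type](response, assertion.value ?? "");
  const passed = assertion.negate ? !matched : matched;
  return { type: assertion.type, score: passed ? 1 : 0 };
}

// Usage: grade a response string (standing in for the contents of response.md).
const response = "Hello, world! Order #42 confirmed.";
const results: GraderResult[] = [
  gradeBuiltin(response, { type: "contains", value: "world" }),
  gradeBuiltin(response, { type: "regex", value: "#\\d+" }),
  gradeBuiltin(response, { type: "contains", value: "error", negate: true }),
];
```

Keeping each check as a pure function of the response text is what makes these assertions cheap to evaluate in-process, in contrast to code graders or LLM graders.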

Red/Green Evidence

Red (main) — builtin assertions silently ignored:

Graded 0 code-grader(s): 0 passed
Benchmark: 1 test(s), pass_rate=0
score: 0, scores: []

Green (this branch) — builtin assertions evaluated correctly:

Graded 2 built-in assertion(s): 2/2 passed
Benchmark: 1 test(s), pass_rate=1
score: 1, scores: [{type: "contains", score: 1}, {type: "regex", score: 1}]
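
The scores array in the green output keeps each assertion's own type. The sketch below shows one way a merge step could preserve it. The `toScoreEntry` and `mergeScores` names, the fallback to 'code-grader' when a result has no type field, and the use of a plain mean for the aggregate score are assumptions, not confirmed details of pipeline bench.

```typescript
// Hypothetical merge step; names and aggregation rule are illustrative assumptions.
interface ScoreEntry {
  type: string;
  score: number;
}

function toScoreEntry(rawJson: string): ScoreEntry {
  const parsed = JSON.parse(rawJson) as { type?: unknown; score?: unknown };
  // Prefer the type recorded in the result file; fall back to the old
  // hardcoded label only when the field is missing (assumed behavior).
  const type = typeof parsed.type === "string" ? parsed.type : "code-grader";
  const score = typeof parsed.score === "number" ? parsed.score : 0;
  return { type, score };
}

function mergeScores(rawResults: string[]): { score: number; scores: ScoreEntry[] } {
  const scores = rawResults.map(toScoreEntry);
  const total = scores.reduce((sum, entry) => sum + entry.score, 0);
  // Assumption: the aggregate is an unweighted mean, and an empty list scores 0.
  const score = scores.length > 0 ? total / scores.length : 0;
  return { score, scores };
}

// Usage: two built-in results plus one result file without a type field.
const merged = mergeScores([
  '{"type":"contains","score":1}',
  '{"type":"regex","score":1}',
  '{"score":1}',
]);
```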

Test plan

Unit tests: 5 new tests covering contains pass/fail, regex, negate, and input extraction
Full test suite: 2133 tests pass, zero regressions
Manual e2e: red/green verified with real pipeline run
Pre-push hooks: build, typecheck, lint, test, validate all pass

Closes #1075

🤖 Generated with Claude Code

pipeline grade now evaluates contains, regex, equals, starts-with, ends-with, is-json, and other built-in assertion types against response.md. Previously these were silently ignored, producing score: 0 for tests with only deterministic assertions.

Closes #1075

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-04-13T21:02:34Z

Deploying agentv with Cloudflare Pages

Latest commit:	`729b85c`
Status:	✅ Deploy successful!
Preview URL:	https://a33d76a4.agentv.pages.dev
Branch Preview URL:	https://fix-1075-builtin-graders.agentv.pages.dev

christso merged commit 64fdff9 into main Apr 13, 2026
4 checks passed

christso deleted the fix/1075-builtin-graders branch April 13, 2026 21:20

christso mentioned this pull request Apr 13, 2026

refactor(pipeline): unify builtin grader configs into code_graders #1086

Merged

2 tasks

christso merged 1 commit into main from fix/1075-builtin-graders
