fix(pipeline): grade built-in deterministic assertions in subagent mode #1085

Merged
christso merged 1 commit into main from fix/1075-builtin-graders
Apr 13, 2026

@christso
Collaborator

Summary

  • pipeline input now writes built-in deterministic assertion configs (contains, regex, equals, etc.) to a builtin_graders/ directory alongside the existing code_graders/ and llm_graders/ directories
  • pipeline grade evaluates these built-in assertions in-process against response.md and writes the results to code_grader_results/, so pipeline bench merges them into the final score
  • pipeline bench reads the type field from each result JSON instead of hardcoding 'code-grader', preserving the original assertion type in the output
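
For readers unfamiliar with the grading step, here is a minimal sketch of what in-process evaluation of a built-in assertion could look like. It is written in Python purely for illustration and is not the code from this PR. The config shape (`type`, `value`, `negate`), the function name `evaluate_assertion`, and the exact matching semantics (for example, whitespace trimming for `equals`) are assumptions.

```python
import json
import re


def evaluate_assertion(config: dict, response: str) -> dict:
    """Evaluate one deterministic assertion against the response text.

    Hypothetical config shape: {"type": ..., "value": ..., "negate": bool}.
    """
    kind = config["type"]
    value = config.get("value", "")

    if kind == "contains":
        passed = value in response
    elif kind == "regex":
        passed = re.search(value, response) is not None
    elif kind == "equals":
        # Assumption: compare after trimming surrounding whitespace.
        passed = response.strip() == value.strip()
    elif kind == "starts-with":
        passed = response.startswith(value)
    elif kind == "ends-with":
        passed = response.endswith(value)
    elif kind == "is-json":
        try:
            json.loads(response)
            passed = True
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            passed = False
    else:
        raise ValueError(f"unsupported assertion type: {kind}")

    # A negated assertion passes exactly when the plain check fails.
    if config.get("negate", False):
        passed = not passed

    # Keep the original assertion type so later stages can report it.
    return {"type": kind, "score": 1 if passed else 0}
```

Calling `evaluate_assertion({"type": "contains", "value": "hello"}, "hello world")` in this sketch yields a result whose `type` is `"contains"` and whose `score` is `1`.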

Red/Green Evidence

Red (main) — builtin assertions silently ignored:

Graded 0 code-grader(s): 0 passed
Benchmark: 1 test(s), pass_rate=0
score: 0, scores: []

Green (this branch) — builtin assertions evaluated correctly:

Graded 2 built-in assertion(s): 2/2 passed
Benchmark: 1 test(s), pass_rate=1
score: 1, scores: [{type: "contains", score: 1}, {type: "regex", score: 1}]
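
The difference between the two runs can be illustrated with a small sketch of the merge step. This is not the actual pipeline bench code: the function name `merge_scores`, the use of a plain mean for the overall score, and the fallback to `'code-grader'` when a result has no `type` field are all assumptions made for this example.

```python
def merge_scores(results: list[dict]) -> dict:
    """Combine per-assertion results into one score entry for a test."""
    scores = [
        # Preserve the recorded type; the fallback label is an assumption.
        {"type": r.get("type", "code-grader"), "score": r["score"]}
        for r in results
    ]
    # Assumption: overall score is the mean, and 0 when nothing was graded.
    overall = sum(s["score"] for s in scores) / len(scores) if scores else 0
    return {"score": overall, "scores": scores}
```

With no graded results, as in the red run, this sketch returns a score of 0 and an empty `scores` list. With one passing `contains` result and one passing `regex` result it returns a score of 1 and keeps both assertion types.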

Test plan

  • Unit tests: 5 new tests covering contains pass/fail, regex, negate, and input extraction
  • Full test suite: 2133 tests pass, zero regressions
  • Manual e2e: red/green verified with real pipeline run
  • Pre-push hooks: build, typecheck, lint, test, validate all pass

Closes #1075

🤖 Generated with Claude Code

pipeline grade now evaluates contains, regex, equals, starts-with,
ends-with, is-json, and other built-in assertion types against response.md.
Previously these were silently ignored, producing score: 0 for tests
with only deterministic assertions.

Closes #1075

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cloudflare-workers-and-pages

Deploying agentv with Cloudflare Pages

Latest commit: 729b85c
Status: ✅  Deploy successful!
Preview URL: https://a33d76a4.agentv.pages.dev
Branch Preview URL: https://fix-1075-builtin-graders.agentv.pages.dev

@christso christso merged commit 64fdff9 into main Apr 13, 2026
4 checks passed
@christso christso deleted the fix/1075-builtin-graders branch April 13, 2026 21:20

Development

Successfully merging this pull request may close these issues.

pipeline grade: add built-in grader support for contains/regex/contains-all assertions in subagent mode
