fix(dry-run): return schema-valid mock responses to prevent grader crashes#1090

Merged
christso merged 2 commits into main from
fix/1088-dry-run-mock-schema-valid
Apr 13, 2026

@christso
Collaborator

Summary

  • VSCode provider dry-run mode returned output: [], which produced an empty candidate string that caused all built-in graders to fail or report errors
  • Fixed both invoke() and invokeBatch() to return output: [{ role: 'assistant', content: '{}' }] plus zeroed tokenUsage: { input: 0, output: 0 }
  • Updated AGENTS.md to reflect that dry-run now returns schema-valid responses
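
The response shape described above can be sketched as follows. This is a minimal illustration, not the code in vscode-provider.ts: the type names and the helpers `buildDryRunResponse` and `buildDryRunBatch` are hypothetical, and only the `output` and `tokenUsage` values are taken from this PR.

```typescript
// Hypothetical types for illustration; the real provider interfaces may differ.
interface OutputMessage {
  role: 'assistant' | 'user' | 'system';
  content: string;
}

interface TokenUsage {
  input: number;
  output: number;
}

interface ProviderResponse {
  output: OutputMessage[];
  tokenUsage?: TokenUsage;
}

// Builds the schema-valid mock described in the summary:
// a single assistant message containing '{}' plus zeroed token usage.
function buildDryRunResponse(): ProviderResponse {
  return {
    output: [{ role: 'assistant', content: '{}' }],
    tokenUsage: { input: 0, output: 0 },
  };
}

// Batch variant (assumed shape): one independent mock response per request.
function buildDryRunBatch(count: number): ProviderResponse[] {
  return Array.from({ length: count }, () => buildDryRunResponse());
}
```

Building a fresh object per batch entry avoids sharing one mutable response across requests, which is a design choice of this sketch rather than something the PR states.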

Before / After

Red (before fix): Dry-run response has output: [], so extractLastAssistantContent returns '':

  • is-json grader: fails — '' is not valid JSON
  • execution-metrics with max_tokens: fails — "Token usage data not available"
  • contains/equals/regex: trivially fail on empty string

Green (after fix): Dry-run returns output: [{ role: 'assistant', content: '{}' }] with tokenUsage: { input: 0, output: 0 }:

  • is-json grader: passes — '{}' is valid JSON
  • execution-metrics: has zeroed token data, so it evaluates normally
  • contains/equals/regex: operate on a non-empty string, no crash
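
The red/green difference for the is-json grader can be reproduced with a small sketch. The `extractLastAssistantContent` below is an assumed re-implementation based only on the behavior described in this PR (it returns '' when there is no assistant message), and `isJson` is a hypothetical stand-in for the grader's check.

```typescript
interface Message {
  role: string;
  content: string;
}

// Assumed behavior: return the content of the last assistant message,
// or an empty string when none exists.
function extractLastAssistantContent(output: Message[]): string {
  for (let i = output.length - 1; i >= 0; i--) {
    const msg = output[i];
    if (msg && msg.role === 'assistant') {
      return msg.content;
    }
  }
  return '';
}

// Hypothetical stand-in for the is-json check: valid iff JSON.parse succeeds.
function isJson(candidate: string): boolean {
  try {
    JSON.parse(candidate);
    return true;
  } catch {
    return false;
  }
}

// Red: empty output yields an empty candidate, which is not valid JSON.
const redCandidate = extractLastAssistantContent([]);
console.log(isJson(redCandidate)); // false

// Green: the mock assistant message yields '{}', which parses as JSON.
const greenCandidate = extractLastAssistantContent([
  { role: 'assistant', content: '{}' },
]);
console.log(isJson(greenCandidate)); // true
```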

Test plan

  • 4 regression tests added in vscode-provider-dry-run.test.ts covering invoke() and invokeBatch() for all three properties: non-empty output, valid JSON content, zeroed tokenUsage
  • @agentv/core test suite: 1635 pass, 0 fail
  • Lint passes
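
The three properties the regression tests cover can be expressed as one shape check. This is a sketch under assumptions, not the contents of vscode-provider-dry-run.test.ts: `DryRunResponse` and `findShapeViolations` are hypothetical names introduced for illustration.

```typescript
// Hypothetical response type; only the field names come from this PR.
interface DryRunResponse {
  output: { role: string; content: string }[];
  tokenUsage?: { input: number; output: number };
}

// Returns a description of each violated property; an empty array means
// the response satisfies all three properties listed in the test plan.
function findShapeViolations(response: DryRunResponse): string[] {
  const violations: string[] = [];

  // Property 1: non-empty output.
  if (response.output.length === 0) {
    violations.push('output is empty');
  }

  // Property 2: the last message's content parses as JSON.
  const last = response.output[response.output.length - 1];
  try {
    JSON.parse(last?.content ?? '');
  } catch {
    violations.push('content is not valid JSON');
  }

  // Property 3: tokenUsage is present and zeroed.
  const usage = response.tokenUsage;
  if (!usage || usage.input !== 0 || usage.output !== 0) {
    violations.push('tokenUsage is not zeroed');
  }

  return violations;
}
```

Applied to the pre-fix shape `{ output: [] }`, this check reports all three violations; applied to the fixed shape, it reports none.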

Closes #1088

🤖 Generated with Claude Code

christso and others added 2 commits April 13, 2026 22:57
…ashes

VSCode provider dry-run mode returned empty `output: []`, causing
evaluators to receive an empty candidate string:
- `is-json` would fail (empty string is not valid JSON)
- `contains`/`equals`/`regex` would fail trivially
- `execution-metrics` would report "Token usage data not available"

Fix: return `output: [{ role: 'assistant', content: '{}' }]` plus
zeroed `tokenUsage: { input: 0, output: 0 }` in both `invoke()` and
`invokeBatch()` dry-run paths. The `'{}'` response is valid JSON and
a non-empty string, satisfying all built-in graders without crashing.

Closes #1088

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cloudflare-workers-and-pages

Deploying agentv with Cloudflare Pages

Latest commit: cdcee62
Status: ✅  Deploy successful!
Preview URL: https://3438a616.agentv.pages.dev
Branch Preview URL: https://fix-1088-dry-run-mock-schema.agentv.pages.dev

@christso
Collaborator Author

Red/Green UAT Evidence

Red (without fix): Stashed vscode-provider.ts fix and ran regression tests:
```
(fail) VSCodeProvider dry-run response shape > returns non-empty output so graders do not crash
(fail) VSCodeProvider dry-run response shape > returns valid JSON content so is-json grader passes
(fail) VSCodeProvider dry-run response shape > returns zeroed tokenUsage so execution-metrics grader does not report missing data
(fail) VSCodeProvider dry-run response shape > batch invoke returns the same schema-valid shape per response
1631 pass / 4 fail
```

The failures confirm the bug: output: [] → empty candidate → is-json fails, tokenUsage undefined.

Green (with fix): Applied vscode-provider.ts changes:
```
1635 pass / 0 fail
```

All 4 regression tests pass. The dry-run response now carries '{}' (valid JSON) as its assistant content, with tokenUsage: { input: 0, output: 0 }.

@christso christso marked this pull request as ready for review April 13, 2026 23:03
@christso christso merged commit e044257 into main Apr 13, 2026
4 checks passed
@christso christso deleted the fix/1088-dry-run-mock-schema-valid branch April 13, 2026 23:03