fix(dry-run): return schema-valid mock responses to prevent grader crashes#1090
fix(dry-run): return schema-valid mock responses to prevent grader crashes#1090
Conversation
…ashes
VSCode provider dry-run mode returned empty `output: []`, causing
evaluators to receive an empty candidate string:
- `is-json` would fail (empty string is not valid JSON)
- `contains`/`equals`/`regex` would fail trivially
- `execution-metrics` would report "Token usage data not available"
Fix: return `output: [{ role: 'assistant', content: '{}' }]` plus
zeroed `tokenUsage: { input: 0, output: 0 }` in both `invoke()` and
`invokeBatch()` dry-run paths. The `'{}'` response is valid JSON and
a non-empty string, satisfying all built-in graders without crashing.
Closes #1088
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deploying agentv with
|
| Latest commit: |
cdcee62
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://3438a616.agentv.pages.dev |
| Branch Preview URL: | https://fix-1088-dry-run-mock-schema.agentv.pages.dev |
Red/Green UAT EvidenceRed (without fix): Stashed The failures confirm the bug: Green (with fix): Applied All 4 regression tests pass. The dry-run response now returns |
Summary
output: [], which produced an empty candidate string that caused all built-in graders to fail or report errorsinvoke()andinvokeBatch()to returnoutput: [{ role: 'assistant', content: '{}' }]plus zeroedtokenUsage: { input: 0, output: 0 }Before / After
Red (before fix): Dry-run response has
output: [], soextractLastAssistantContentreturns'':is-jsongrader: fails —''is not valid JSONexecution-metricswithmax_tokens: fails — "Token usage data not available"contains/equals/regex: trivially fail on empty stringGreen (after fix): Dry-run returns
output: [{ role: 'assistant', content: '{}' }]withtokenUsage: { input: 0, output: 0 }:is-jsongrader: passes —'{}'is valid JSONexecution-metrics: has zeroed token data, so it evaluates normallycontains/equals/regex: operate on a non-empty string, no crashTest plan
vscode-provider-dry-run.test.tscoveringinvoke()andinvokeBatch()for all three properties: non-empty output, valid JSON content, zeroed tokenUsage@agentv/coretest suite: 1635 pass, 0 failCloses #1088
🤖 Generated with Claude Code