docs(targets): add CLI Provider page + oracle-validation pattern #1146
Merged
The cli provider was effectively undocumented — configuration.mdx had
one line in the provider table and a two-line example, with nothing on
template placeholders, the {OUTPUT_FILE} contract, batch mode, or what
errors look like. Grep for "oracle" / "type: cli" / "cli provider"
across apps/web turned up essentially zero hits, so the oracle-
validation composition pattern AGENTS.md §3 cites as an example of
"compose, don't add a feature" was invisible to users.
New page: docs/targets/cli-provider.mdx. Covers:
- minimal worked example
- the command contract (template rendered per case; command writes to
{OUTPUT_FILE}; AgentV reads that file)
- every template placeholder (PROMPT, PROMPT_FILE, OUTPUT_FILE, FILES,
EVAL_ID, ATTEMPT)
- the JSON output schema and the plain-text fallback
- full configuration-field table (including healthcheck, workers,
provider_batching, grader_target, keep_temp_files)
- a Batching section explaining when to enable it
- a dedicated "Pattern: Oracle validation" section with a worked config
that uses {EVAL_ID} to look up per-case fixtures, and a CLI workflow
(run oracle first; if it's not 100% the grader is the bug)
- a Debugging checklist
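To make the command contract concrete, here is a minimal sketch of per-case template rendering. It is not AgentV's actual renderer: `renderCommand`, the `fixtures/` layout, and the file names are hypothetical, and the sketch does no shell quoting, which a real renderer may well do.

```typescript
// Hypothetical sketch of placeholder substitution; not the code in cli.ts.
type PlaceholderValues = Partial<Record<string, string>>;

// Replace each {NAME} token with its value; unknown tokens are left untouched.
function renderCommand(template: string, values: PlaceholderValues): string {
  return template.replace(
    /\{([A-Z_]+)\}/g,
    (token: string, name: string) => values[name] ?? token,
  );
}

// Oracle-style template: copy the known-good answer for this case into the
// output file. The fixtures/ directory and naming scheme are invented here.
const oracleTemplate = "cp fixtures/{EVAL_ID}.expected.txt {OUTPUT_FILE}";

const rendered = renderCommand(oracleTemplate, {
  EVAL_ID: "case-001",
  OUTPUT_FILE: "/tmp/agentv-out.txt",
});

console.log(rendered);
// cp fixtures/case-001.expected.txt /tmp/agentv-out.txt
```

Because `{EVAL_ID}` is filled in per case, a single template like this can serve every case's fixture. If a target built this way scores below 100%, the grader is the first place to look.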
Verified against packages/core/src/evaluation/providers/cli.ts:
- {EVAL_ID} expands to request.evalCaseId at command-render time
(line 721), so it's available per case and works for the oracle
per-fixture pattern.
- parseOutputContent falls back to plain-text wrapping when JSON parse
fails or the JSON lacks output/text (lines 482, 489, 522).
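The fallback rule above can be sketched roughly as follows. This is an illustration only: `parseOutputSketch` and the `usedFallback` flag are hypothetical names, and the sketch assumes `output` and `text` are plain strings with `output` preferred, which the real `parseOutputContent` may not match.

```typescript
// Hypothetical sketch of the plain-text fallback; not the code in cli.ts.
interface ParsedOutput {
  text: string;
  usedFallback: boolean;
}

function parseOutputSketch(raw: string): ParsedOutput {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const record = parsed as Record<string, unknown>;
      // Assumption: prefer `output`, then `text`; only string values count.
      const value = record["output"] ?? record["text"];
      if (typeof value === "string") {
        return { text: value, usedFallback: false };
      }
    }
  } catch {
    // Not valid JSON: fall through to the plain-text path.
  }
  // JSON parse failed or neither field was usable: wrap the raw content.
  return { text: raw, usedFallback: true };
}

console.log(parseOutputSketch('{"text":"42"}'));    // structured path
console.log(parseOutputSketch("plain answer"));     // fallback: invalid JSON
console.log(parseOutputSketch('{"cost_usd":0.1}')); // fallback: no output/text
```

The practical upshot is that a command may write bare text to the output file and still be graded, which keeps simple oracle scripts short.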
Sidebar order: inserted CLI Provider at order 4 (LLM=2, Coding
Agents=3, CLI=4) and bumped Retry to 5, Custom Providers to 6. The
"Supported Providers" row in configuration.mdx now links to the new
page.
Node 18.19.1 in this dev environment can't run the Astro build
(requires 18.20.8+), but biome check passes on all 598 files and the URL
pattern matches the rest of the docs site. Cloudflare Pages will build
on PR open.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deploying agentv with Cloudflare Pages
| Latest commit: | fa4cfb6 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://e2fa7fdb.agentv.pages.dev |
| Branch Preview URL: | https://docs-cli-target-oracle.agentv.pages.dev |
Summary
Close a discoverability gap surfaced in the #1145 retro: the `cli` provider was effectively undocumented (one line in `configuration.mdx`'s provider table, a two-line example, nothing about template placeholders or the `{OUTPUT_FILE}` contract), and AGENTS.md §3 cites "oracle validation" as the canonical example of "compose, don't add a feature", but the composition recipe lived nowhere users would see it.
Adds a new `docs/targets/cli-provider.mdx` covering:
- The command contract (the command writes to `{OUTPUT_FILE}`; AgentV reads that file, not stdout).
- Every template placeholder: `{PROMPT}`, `{PROMPT_FILE}`, `{OUTPUT_FILE}`, `{FILES}`, `{EVAL_ID}`, `{ATTEMPT}`.
- The JSON output schema (`output` / `text` / `token_usage` / `cost_usd` / `duration_ms`) and the plain-text fallback.
- The configuration fields: `healthcheck`, `workers`, `provider_batching`, `grader_target`, `keep_temp_files`, `verbose`, `cwd`, `files_format`, `timeout_seconds`.
- The oracle-validation pattern (a worked config that uses `{EVAL_ID}` to look up per-case fixtures) and the workflow: run the oracle first; if it doesn't score 100%, your grader is the bug.
- A debugging checklist for `cli` targets.
Also:
- Bumps the sidebar order of `retry.mdx` (4 → 5) and `custom-providers.mdx` (5 → 6) so CLI Provider slots at 4, keeping provider-specific pages grouped (LLM=2, Coding Agents=3, CLI=4) before operational pages.
- Links the new page from the `cli` row in `configuration.mdx`'s Supported Providers table.
Verification
- Checked against `packages/core/src/evaluation/providers/cli.ts`: `{EVAL_ID}` expands to `request.evalCaseId` at render time, so the per-case fixture pattern in the oracle example really works.
- `parseOutputContent` falls back to plain-text wrapping when JSON parse fails or the JSON lacks `output` / `text`.
- `biome check` clean on all 598 files.
Test plan
Why this is a non-draft PR
Docs-only, zero code changes, zero behavior changes. Nothing to UAT beyond visual inspection of the rendered output, which the Cloudflare preview will provide.