Problem
wordpress.bench is registered and smoke-tested, but the generic benchmarking contract is not documented as a first-class WP Codebox capability. Benchmark consumers currently need to infer behavior from command definitions, examples, and smoke tests.
A reusable primitive needs clear docs and CLI helpers around definition, execution, extraction, and comparison while keeping product-specific scoring outside WP Codebox.
Desired shape
Add docs and helper commands such as:
- benchmark contract documentation
- workload authoring examples
- metric and sample contract documentation
- artifact/provenance documentation
- wp-codebox bench run, or an equivalent recipe wrapper
- wp-codebox bench summarize
- wp-codebox bench compare
- wp-codebox artifacts bench-results
- human table output for scenarios/metrics
- JSON output for automation
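To make the summarize idea concrete, here is a minimal sketch of the aggregation a bench summarize helper might perform. It is not the actual wordpress.bench output contract: the sample record shape (scenario, metric, value) and the names summarize and render_table are assumptions made up for illustration. It groups samples by scenario and metric, then emits the same rows as either JSON or a plain table.

```python
import json
import statistics
from collections import defaultdict

# Hypothetical sample records. The real wordpress.bench sample shape is not
# defined in this issue; these field names are assumptions for illustration.
SAMPLES = [
    {"scenario": "home", "metric": "ttfb_ms", "value": 120.0},
    {"scenario": "home", "metric": "ttfb_ms", "value": 100.0},
    {"scenario": "home", "metric": "ttfb_ms", "value": 110.0},
    {"scenario": "admin", "metric": "ttfb_ms", "value": 300.0},
]

HEADERS = ["scenario", "metric", "samples", "min", "median", "mean", "max"]


def summarize(samples):
    """Group samples by (scenario, metric) and compute descriptive statistics."""
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample["scenario"], sample["metric"])].append(float(sample["value"]))

    rows = []
    for (scenario, metric), values in sorted(groups.items()):
        rows.append({
            "scenario": scenario,
            "metric": metric,
            "samples": len(values),
            "min": min(values),
            "median": statistics.median(values),
            "mean": statistics.fmean(values),
            "max": max(values),
        })
    return rows


def render_table(rows):
    """Render summary rows as a fixed-width text table for human readers."""
    def fmt(value):
        return f"{value:.2f}" if isinstance(value, float) else str(value)

    cells = [[fmt(row[h]) for h in HEADERS] for row in rows]
    widths = [
        max([len(h)] + [len(line[i]) for line in cells])
        for i, h in enumerate(HEADERS)
    ]
    lines = ["  ".join(h.ljust(w) for h, w in zip(HEADERS, widths))]
    for line in cells:
        lines.append("  ".join(c.ljust(w) for c, w in zip(line, widths)))
    return "\n".join(lines)


if __name__ == "__main__":
    summary = summarize(SAMPLES)
    print(render_table(summary))          # human table output
    print(json.dumps(summary, indent=2))  # JSON output for automation
```

Both outputs come from the same row list, so the human and JSON views cannot drift apart. The sketch reports descriptive statistics only and applies no thresholds or scores.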
Acceptance criteria
- Documentation explains what WP Codebox owns and what callers own.
- Documentation covers the PHP, ability, WP-CLI, and browser workload types, and is extended as future workload types land.
- CLI helpers can summarize benchmark results from recipe-run output or artifact bundles.
- CLI helpers expose JSON and human-readable output.
- No product-specific benchmark suites, rewards, graders, or scoring policy are documented as WP Codebox responsibilities.
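As a rough illustration of where the ownership boundary could sit, the sketch below shows a compare step that reports per-metric deltas between two summaries and stops there. The row shape (scenario, metric, median) and the function name compare are hypothetical, not an existing WP Codebox API. It deliberately returns no pass/fail verdict, since thresholds and scoring would stay with the caller.

```python
import json

# Hypothetical summary rows for two runs; the field names are assumptions
# for illustration, not the real bench result contract.
BASELINE = [
    {"scenario": "home", "metric": "ttfb_ms", "median": 100.0},
    {"scenario": "cart", "metric": "ttfb_ms", "median": 200.0},
]
CANDIDATE = [
    {"scenario": "home", "metric": "ttfb_ms", "median": 125.0},
    {"scenario": "admin", "metric": "ttfb_ms", "median": 300.0},
]


def compare(baseline, candidate, stat="median"):
    """Report absolute and percent deltas per (scenario, metric).

    No verdict is produced: deciding whether a delta is acceptable is
    treated as caller-owned policy in this sketch.
    """
    base = {(r["scenario"], r["metric"]): r[stat] for r in baseline}
    cand = {(r["scenario"], r["metric"]): r[stat] for r in candidate}

    rows = []
    for key in sorted(base.keys() | cand.keys()):
        b, c = base.get(key), cand.get(key)
        # A delta only exists when the metric is present in both runs.
        delta = c - b if b is not None and c is not None else None
        # Percent change is undefined for a missing or zero baseline.
        percent = delta * 100.0 / b if delta is not None and b != 0 else None
        rows.append({
            "scenario": key[0],
            "metric": key[1],
            "baseline": b,
            "candidate": c,
            "delta": delta,
            "delta_percent": percent,
        })
    return rows


if __name__ == "__main__":
    print(json.dumps(compare(BASELINE, CANDIDATE), indent=2))
```

Metrics present in only one run are kept with null deltas rather than dropped, so a consumer can tell a removed scenario from an unchanged one.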
AI assistance
- AI assistance: Yes
- Tool(s): OpenCode (GPT-5.5)
- Used for: Drafted tracker from repository inspection and benchmark capability gap analysis; Chris remains responsible for prioritization and review.