Part of #32 (Epic: MCPAQL 2026 product launch). Phase 1a deliverable.
Depends on:
What
Publish the GitHub MCP measured-comparison case study at mcpaql.org/research/github-mcp-case-study/. Combines:
The new LLM-correctness benchmark numbers (first-call success, turns, tokens-to-completion, recovery) from the parity-runner extension
Methodology documentation
Reproducibility notes (link to harness, harness invocation, environment)
Why
This is the first piece of measured-evidence marketing for MCPAQL. Two numbers live or die here:
Token reduction (96%, already attested in ORG-README.md)
LLM correctness delta (new, from this case study)
Once published, the new .com homepage (Phase 1b) cites this URL as the proof page. Without the case study live, the homepage is making claims it can't substantiate.
How
/research/github-mcp-case-study/ on the .org build
/spec/* pages and the future homepage placeholder
Phase
1a (this week). Blocking for the Phase 1b homepage launch.