v1.0-arxiv

aray-17 released this 15 Jun 02:46 (tag v1.0-arxiv, commit a26bda6)

Public artifact for the paper "The Economics of Coding Agents: Calibrating the Cost-Quality Frontier" (Aninda Ray, 2026). This release marks the exact code state cited in the paper.

Reproduce the paper's claims offline — no Docker, API keys, or model calls:

python3 benchmarks/verify_criteria.py   # PASS/FAIL per claim; exits 0 iff all 12 reproduce
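Because the verifier signals success only through its exit status, it can be wrapped in a small gate for CI or a pre-submission check. The following is a minimal sketch, not part of the release: the helper name `claims_reproduce` is hypothetical, and it assumes the command is run from the repository root. It relies on nothing beyond the documented behaviour that the script exits 0 if and only if every claim reproduces.

```python
import subprocess
import sys

# Hypothetical helper (not shipped with the release): runs the verifier
# and reports whether it exited with status 0, i.e. all claims reproduced.
def claims_reproduce(cmd=("python3", "benchmarks/verify_criteria.py")):
    """Return True iff the given command exits with status 0."""
    result = subprocess.run(list(cmd))
    return result.returncode == 0

# Example use in a CI step (assumes the current directory is the repo root):
#     sys.exit(0 if claims_reproduce() else 1)
```

Passing the command as a parameter keeps the helper usable with a different interpreter path or working copy, without changing how the exit status is interpreted.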

Browse the evidence interactively:

bash benchmarks/explore.sh

See CLAIMS.md for per-claim evidence and paper/paper.pdf for full methodology.

This is a pre-release pending arXiv submission; the arXiv ID and any final update to paper.pdf will follow.