Agent Context Evals v0.11.0
Status: public fresh-clone reproducibility receipt for the v0.10 local
machine-evidence path. No public benchmark claim.
What Shipped
- Fresh-clone reproducibility receipt:
  release/v0.11-fresh-clone-reproducibility/fresh-clone-receipt.json.
- Manual release-operation generator:
  scripts/run_fresh_clone_reproducibility.py.
- CI-safe receipt gate:
  scripts/check_v11_fresh_clone_receipt.py.
- 5-minute local run feedback request:
  https://github.com/ctxgov/agent-context-evals/issues/22.
Reproduce
python3 scripts/check_v11_fresh_clone_receipt.py
python3 -m unittest tests.test_v11_fresh_clone_reproducibility -v
To regenerate the receipt, run the explicit release-operation path. It uses
network access to clone the public repository into a temporary directory:
python3 scripts/run_fresh_clone_reproducibility.py --ref v0.10.0
The checked-in receipt records a fresh clone of the v0.10.0 public local
evidence path, then runs:
python3 scripts/check_v10_saved_trace_readiness.py
python3 -m unittest discover -s tests -v
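The fresh-clone flow described in this section can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the contents of scripts/run_fresh_clone_reproducibility.py. The repository URL is inferred from the issue link above, the shallow `git clone --depth 1 --branch <ref>` strategy is an assumption, and `fresh_clone_check`, `REPO_URL`, and `CHECK_COMMANDS` are hypothetical names. The sketch does not write a receipt, because the receipt format is not described here.

```python
import subprocess
import tempfile
from pathlib import Path

# Assumption: repository URL inferred from the issue link in these notes.
REPO_URL = "https://github.com/ctxgov/agent-context-evals"

# The two commands these notes list as run inside the fresh clone.
CHECK_COMMANDS = [
    ["python3", "scripts/check_v10_saved_trace_readiness.py"],
    ["python3", "-m", "unittest", "discover", "-s", "tests", "-v"],
]


def fresh_clone_check(ref, runner=subprocess.run):
    """Clone `ref` into a temporary directory and run each check there.

    Returns a list of (command, return code) pairs and stops at the first
    non-zero return code. `runner` is injectable so the flow can be
    exercised without network access. Hypothetical helper, not the
    project's own generator.
    """
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        clone_dir = Path(tmp) / "clone"
        # Assumed strategy: shallow clone of a single tag or branch.
        clone_cmd = ["git", "clone", "--depth", "1", "--branch", ref,
                     REPO_URL, str(clone_dir)]
        # The clone runs in the current directory; checks run in the clone.
        steps = [(clone_cmd, None)] + [(cmd, clone_dir) for cmd in CHECK_COMMANDS]
        for cmd, cwd in steps:
            code = runner(cmd, cwd=cwd).returncode
            results.append((cmd, code))
            if code != 0:
                break
    return results
```

Calling `fresh_clone_check("v0.10.0")` with the default runner would need git and network access, which matches the note above that regeneration is a manual release operation rather than a CI step.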
Boundary
- No public benchmark claim.
- No security claim.
- No provider/model call.
- No adoption claim.
- No human reviewer claim.
- No package publication claim.
- No stable protocol claim.
- No hosted runtime or live adapter claim.
This release publishes a reproducibility receipt and a local receipt gate only.
It does not perform provider/model calls, package publication, hosted runtime
changes, target writes, reviewer outreach, adoption measurement, or public
benchmark reporting.