Releases: dsj1984/mandrel-bench
mandrel-bench: v0.5.0
26 Jun 11:47
0.5.0 (2026-06-24)
Added
bench: differential-trap spike apparatus — auth-trap scenario (refs #57) (#63) (b60e2f1)
results: 1.75.0 cohort — mandrel@1.75.0 / claude-opus-4-8 (#62) (e888d0c)
Fixed
bench: git-exclude the framework overlay so it never enters the deliverable diff (#58) (97e5e1e)
bench: security scanner measured the overlaid framework, not the deliverable (#53) (54802d8)
bench: stop counting test-fixture creds as secrets; proportional secret penalty (refs #55) (#59) (e1a7c40)
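The secret-scoring fix above could look roughly like the sketch below. It is an illustration only: the `Finding` shape, the fixture path pattern, the `perFinding` weight, and the cap at 1 are all assumptions, not the harness's actual rules. It shows the two ideas named in the entry: findings under test-fixture paths are ignored, and the penalty grows with the number of remaining findings instead of being all-or-nothing.

```typescript
// Hypothetical finding shape; the real scanner output is not shown in these notes.
type Finding = { path: string; rule: string };

// Assumed convention: anything under test/, tests/, __tests__/ or fixtures/ is a fixture.
const FIXTURE_PATTERN = /(^|\/)(test|tests|__tests__|fixtures)\//;

function isFixture(path: string): boolean {
  return FIXTURE_PATTERN.test(path);
}

// Penalty in [0, 1], proportional to the count of non-fixture findings.
// perFinding is an assumed weight, not a value taken from the project.
function secretPenalty(findings: Finding[], perFinding = 0.25): number {
  const real = findings.filter((f) => !isFixture(f.path));
  return Math.min(1, real.length * perFinding);
}
```

With these assumptions, a credential in `tests/fixtures/creds.json` contributes nothing, while two findings under `src/` yield a penalty of 0.5.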
mandrel-bench: v0.4.0
19 Jun 18:14
0.4.0 (2026-06-19)
Added
agents: add durable /benchmark workflow under .agents/local (#45) (c43ee5f)
bench: instrument the standalone path so its value dims are measured (#48) (#51) (bd5e517)
Epic #32 (#43) (955684a)
project-api as the 1.75.0 Epic rung + first complete 1.75.0 cohort (closes #50) (#52) (e152ab3)
Fixed
bench: skip npm audit without a lockfile; allow project-api scenario (#49) (57e40b3)
score: null (not a default) for ledger-derived dims when no ledger (#47) (c3f4a32)
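The `score` fix above separates "not measured" from "measured as zero". A minimal sketch of that idea follows. The `Ledger` and `Scores` types and the `totalCostUsd` dimension are hypothetical stand-ins, since the real ledger schema and dimension names do not appear in these notes.

```typescript
// Hypothetical ledger and score shapes, for illustration only.
type Ledger = { entries: { costUsd: number }[] };
type Scores = { totalCostUsd: number | null };

function scoreLedgerDims(ledger: Ledger | undefined): Scores {
  if (ledger === undefined) {
    // No ledger: report null so the dimension reads as "not measured"
    // instead of silently becoming a default such as 0.
    return { totalCostUsd: null };
  }
  const total = ledger.entries.reduce((sum, e) => sum + e.costUsd, 0);
  return { totalCostUsd: total };
}

// Aggregation skips nulls, so unmeasured runs do not drag the mean toward 0.
function mean(values: (number | null)[]): number | null {
  const measured = values.filter((v): v is number => v !== null);
  if (measured.length === 0) return null;
  return measured.reduce((a, b) => a + b, 0) / measured.length;
}
```

Under these assumptions, `mean([2, null, 4])` is 3, whereas substituting a default of 0 for the missing value would have given 2.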
mandrel-bench: v0.3.0
18 Jun 00:47
0.3.0 (2026-06-18)
Added
bench: batch-ready run orchestrator — resumable, cost-bounded loop (refs #22) (#24) (9d4d871)
bench: drive the mandrel arm via /plan --idea --yes (headless, fresh Epic per run) (#28) (81d5093)
bench: make mandrel-arm runs clean and repeatable (#27) (4aaf208)
restructure results/ into per-cohort directories and add a generated zero-dep results.html dashboard (#17) (#19) (dfe8c13)
results: first N=8 baseline cohort — mandrel@1.72.0 / claude-opus-4-8 (refs #23) (#29) (5100d9d)
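A resumable, cost-bounded loop of the kind named in the orchestrator entry above can be sketched as follows. This is not the project's implementation: `runBatch`, `RunResult`, and the `execute` callback are hypothetical names, and the budget check (stop before starting a run once spend has reached the budget) is one assumed policy among several possible ones.

```typescript
// Hypothetical result shape; the real cohort store is not shown in these notes.
type RunResult = { id: string; costUsd: number };

function runBatch(
  plannedIds: string[],
  completed: Map<string, RunResult>, // results already on disk from an earlier invocation
  budgetUsd: number,
  execute: (id: string) => RunResult,
): Map<string, RunResult> {
  // Spend from earlier invocations counts against the same budget.
  let spent = 0;
  completed.forEach((r) => {
    spent += r.costUsd;
  });

  for (const id of plannedIds) {
    if (completed.has(id)) continue; // resumable: never redo a finished run
    if (spent >= budgetUsd) break; // cost-bounded: stop before starting another run
    const result = execute(id);
    completed.set(id, result);
    spent += result.costUsd;
  }
  return completed;
}
```

Because the check happens before each run starts, the final run can push total spend slightly past the budget. A stricter variant would need a per-run cost estimate.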
Fixed
bench: render the value-add report over the full cohort store (resume-safe) (#31) (e564b3d)
bench: sanitize GITHUB_TOKEN before gh in resetSandboxBaseline (#30) (a50cfe5)
mandrel-bench: v0.2.0
17 Jun 01:23
0.2.0 (2026-06-17)
Added
bench: wire harness end-to-end + first benchmark result (Epic #2) (#15) (e21c42f)
bootstrap mandrel-bench — re-home self-benchmark harness from mandrel#4211 (1287546)
Fixed
ci: exclude generated CHANGELOG from markdownlint (refs #14) (#18) (88aba4c)
ci: green up test discovery, markdown lint, and biome config (d6d3e9e)
docs: stop MD004 reading "+ noise-band" as a list bullet (f4ddd0f)