Releases · dsj1984/mandrel-bench · GitHub

26 Jun 11:47

mandrel-bench-v0.5.0

mandrel-bench: v0.5.0 Latest

0.5.0 (2026-06-24)

Added

bench: differential-trap spike apparatus — auth-trap scenario (refs #57) (#63) (b60e2f1)
results: 1.75.0 cohort — mandrel@1.75.0 / claude-opus-4-8 (#62) (e888d0c)

Fixed

bench: git-exclude the framework overlay so it never enters the deliverable diff (#58) (97e5e1e)
bench: security scanner measured the overlaid framework, not the deliverable (#53) (54802d8)
bench: stop counting test-fixture creds as secrets; proportional secret penalty (refs #55) (#59) (e1a7c40)

19 Jun 18:14

mandrel-bench-v0.4.0

mandrel-bench: v0.4.0

0.4.0 (2026-06-19)

Added

agents: add durable /benchmark workflow under .agents/local (#45) (c43ee5f)
bench: instrument the standalone path so its value dims are measured (#48) (#51) (bd5e517)
Epic #32 (#43) (955684a)
project-api as the 1.75.0 Epic rung + first complete 1.75.0 cohort (closes #50) (#52) (e152ab3)

Fixed

bench: skip npm audit without a lockfile; allow project-api scenario (#49) (57e40b3)
score: null (not a default) for ledger-derived dims when no ledger (#47) (c3f4a32)

18 Jun 00:47

mandrel-bench-v0.3.0

mandrel-bench: v0.3.0

0.3.0 (2026-06-18)

Added

bench: batch-ready run orchestrator — resumable, cost-bounded loop (refs #22) (#24) (9d4d871)
bench: drive the mandrel arm via /plan --idea --yes (headless, fresh Epic per run) (#28) (81d5093)
bench: make mandrel-arm runs clean and repeatable (#27) (4aaf208)
restructure results/ into per-cohort directories and add a generated zero-dep results.html dashboard (#17) (#19) (dfe8c13)
results: first N=8 baseline cohort — mandrel@1.72.0 / claude-opus-4-8 (refs #23) (#29) (5100d9d)

Fixed

bench: render the value-add report over the full cohort store (resume-safe) (#31) (e564b3d)
bench: sanitize GITHUB_TOKEN before gh in resetSandboxBaseline (#30) (a50cfe5)

17 Jun 01:23

mandrel-bench-v0.2.0

mandrel-bench: v0.2.0

0.2.0 (2026-06-17)

Added

bench: wire harness end-to-end + first benchmark result (Epic #2) (#15) (e21c42f)
bootstrap mandrel-bench — re-home self-benchmark harness from mandrel#4211 (1287546)

Fixed

ci: exclude generated CHANGELOG from markdownlint (refs #14) (#18) (88aba4c)
ci: green up test discovery, markdown lint, and biome config (d6d3e9e)
docs: stop MD004 reading "+ noise-band" as a list bullet (f4ddd0f)
