v0.5.0 — Slice-1 calibration pack + Stage-A hardening
Slice 1 — prism now measures its locks. prism eval runs the lenses over a labeled corpus and reports per-lens precision/recall/MCC, the inter-lens diversity matrix (Krippendorff α + pairwise Cohen κ), submodular coverage-gain, verdict accuracy, and confidence calibration — each with an honest CI.
The first real run (local mistral-small:24b, public split) surfaced a genuine gap in a core lock: the runtime finding-set ρ gate reads 0.0 for every lens pair while decision-level Cohen κ is 0.73–0.81 — the ρ ≤ 0.25 gate is blind to the lens correlation κ reveals. Finding that is the point of the slice. Full results + method: eval/RESULTS.md, design/07.
Stage-A security hardening (same minor): closed the Lock-2 reasoning-strip bypass class (incl. the attribute-tolerant <thinking signature="…"> Anthropic form), provider timeout circuit-breaking, versioned receipt canonicalization (schema v5 — cross-tool Ed25519 verify), and three bounded-memory fixes on the HTTP surface.
Install: uv tool install prism-verify · npx @mcptoolshop/prism-verify · pip install "prism-verify[all]"
Docs: Handbook → Calibration & benchmark.