0.7.0 - 2026-05-26
- Updated the canonical skills/vowline/SKILL.md to the benchmark-selected 6057fa00a6c074fe2af5f28b0a11062e69e154acf6296d4c65eb8b63f1cd637c contract: future-client binding, exact final-surface predicates, public-label preservation, source-backed entity and metric universes, native structured tools, scoped cleanup paths, and bounded faithful proof.
- Refreshed README around the 0.7.0 contract so the public summary points at the current canonical skill rather than the historical 0.6.0 operating table.
- Added a current 0.7.0 deterministic benchmark summary and chart under docs/benchmarks/, reporting the exact same-hash 6057fa00 run set as 12 / 13 checked benchmark tasks passed across SkillsBench, IFEval, LiveBench, and Terminal-Bench, with nginx-request-logging still open.
- Reclassified the prior 8e427bb4 benchmark page as a historical 0.6.0 baseline instead of the current best-so-far claim.
Benchmark note: the 12 / 13 result is mostly single-run deterministic verifier evidence, not a pass@2 score or repeated-stability claim.