v2.9.1 — Stage A health pass + specialists design lock

mcp-tool-shop released this 10 Jun 09:23

Fixed — health pass (134 verified findings, adversarially confirmed)

  • roleos swarm and roleos audit work again; both previously crashed on every invocation. The validated manifest now drives run construction: swarm runs carry stage/domain/gate metadata (swarm status groups by stage, swarm approve persists gate approvals), and audit runs scale 2N+K+3 with the manifest.
  • Pack-level runs are built from real roles — catalog-valid steps, artifacts by role lookup, final review gate restored (Critic Reviewer → verdict; Judge → judge-report for brainstorm).
  • Reject verdicts route to the producing role (was: back to the reviewer).
  • Specialist quota window actually slides (route-tagged v2 state, tolerant migration), so there is no more permanent lockout.
  • Citation gate counts distinct identifiers (house "arXiv ID + URL" format no longer flags).
  • Generated capability-gate hook is self-contained and fail-closed in npx/global installs.
  • Docs tell the truth again — install step in quick start, 61-role table, opt-in egress threat model, real npm package name on the landing page, source-verified handbook counts, genericized starter-pack, pinned npm on the OIDC publish path.
  • 6 new regression suites pin the above as contracts. Suite: 1404 → 1435 tests (1432 pass, 3 deliberate skips), 0 fail.
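
For readers unfamiliar with the quota fix above, here is a minimal sketch of a sliding window keyed by route, with a tolerant loader that falls back to empty state when the stored shape is not recognized. All names here (QuotaState, loadState, tryConsume, windowMs, limit) are hypothetical and chosen for illustration; this is one possible reading of "route-tagged v2 state, tolerant migration", not the actual roleos implementation or state format.

```typescript
// Hypothetical sketch; not the actual roleos state format or API.
interface QuotaState {
  version: 2;
  // Invocation timestamps (ms since epoch), keyed by route tag.
  byRoute: Record<string, number[]>;
}

function emptyState(): QuotaState {
  return { version: 2, byRoute: {} };
}

// Tolerant load (assumed behavior): anything that is not recognizably
// v2 state is replaced by empty state instead of throwing.
function loadState(raw: unknown): QuotaState {
  if (typeof raw === "object" && raw !== null &&
      (raw as { version?: unknown }).version === 2) {
    const byRoute = (raw as { byRoute?: unknown }).byRoute;
    if (typeof byRoute === "object" && byRoute !== null) {
      return { version: 2, byRoute: byRoute as Record<string, number[]> };
    }
  }
  return emptyState();
}

// Returns true and records the call if the route is under its limit for
// the window ending at `now`; returns false and records nothing otherwise.
function tryConsume(
  state: QuotaState,
  route: string,
  now: number,
  windowMs: number,
  limit: number,
): boolean {
  // Drop timestamps that have aged out so old calls stop counting.
  const recent = (state.byRoute[route] ?? []).filter((t) => now - t < windowMs);
  const allowed = recent.length < limit;
  if (allowed) {
    recent.push(now);
  }
  state.byRoute[route] = recent;
  return allowed;
}
```

Because expired timestamps are pruned on every call, a route that hits its limit becomes usable again once its oldest calls leave the window, which is the property the "no more permanent lockout" note describes.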
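
The citation-gate fix above can be illustrated the same way. The sketch below counts distinct arXiv identifiers so that a citation written as both an ID and its URL counts once. The function name, the regular expression, and the normalization (dropping a trailing version suffix) are assumptions made for this example, and the IDs in the usage lines are placeholders, not real papers.

```typescript
// Hypothetical sketch; the pattern and normalization are assumptions,
// not the actual roleos citation gate.
function countDistinctArxivIds(text: string): number {
  // Matches "arXiv:NNNN.NNNNN" and arxiv.org/abs|pdf URLs, capturing the
  // bare identifier and ignoring an optional version suffix such as "v2".
  const re = /(?:arxiv:\s*|arxiv\.org\/(?:abs|pdf)\/)(\d{4}\.\d{4,5})(?:v\d+)?/gi;
  const seen: Record<string, true> = {};
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const id = m[1];
    if (id) {
      seen[id] = true; // the same paper cited twice collapses to one key
    }
  }
  return Object.keys(seen).length;
}

// Placeholder identifiers for illustration only.
const house = "arXiv:2301.00001 https://arxiv.org/abs/2301.00001";
const two = "arXiv:2301.00001v2 and arXiv:2301.00002";
console.log(countDistinctArxivIds(house), countDistinctArxivIds(two));
```

Counting keys rather than raw matches is what keeps an "ID + URL" pair from being treated as two separate citations.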

Added

  • design/specialists-layer.md — design lock for the specialists progression layer (grade bands, the Record, cross-training, techniques, operating profiles, form). Research-grounded by a 40-finding study-swarm; citation-verified (19 identifiers checked, 0 fabricated, 0 misattributed). Implementation lands in future minors.

Architecture grounded in (selection): Deci/Koestner/Ryan 1999 (competence feedback vs rewards) · Moldon et al. 2021 (streaks induce junk work) · Ilharco et al. 2022 (task arithmetic) · Yadav et al. 2023 (TIES interference) · Boubdir et al. 2023 (Elo pitfalls) · Schemmer et al. 2023 (appropriate reliance) · full bibliography in the design doc.

🤖 Generated with Claude Code