Releases · mcp-tool-shop-org/role-os

10 Jun 09:23

mcp-tool-shop

v2.9.1

6a03494

v2.9.1 — Stage A health pass + specialists design lock Latest


Fixed — health pass (134 verified findings, adversarially confirmed)

roleos swarm and roleos audit work again. Both crashed on every invocation. The validated manifest now drives run construction: swarm runs carry stage/domain/gate metadata (swarm status groups by stage, swarm approve persists gate approvals), audit runs scale 2N+K+3 with the manifest.
Pack-level runs are built from real roles — catalog-valid steps, artifacts by role lookup, final review gate restored (Critic Reviewer → verdict; Judge → judge-report for brainstorm).
Reject verdicts route to the producing role (was: back to the reviewer).
Specialist quota window actually slides (route-tagged v2 state, tolerant migration), so no more permanent lockout.
Citation gate counts distinct identifiers (house "arXiv ID + URL" format no longer flags).
Generated capability-gate hook is self-contained and fail-closed in npx/global installs.
Docs tell the truth again — install step in quick start, 61-role table, opt-in egress threat model, real npm package name on the landing page, source-verified handbook counts, genericized starter-pack, pinned npm on the OIDC publish path.
6 new regression suites pin the above as contracts. Suite: 1404 → 1435 tests (1432 pass, 3 deliberate skips), 0 fail.
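
The sliding quota window mentioned above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `QuotaState` shape, the `tryConsume` name, and the per-route timestamp list are hypothetical and are not taken from the role-os source.

```typescript
// Hypothetical sketch of a sliding quota window; not role-os internals.
interface QuotaState {
  version: 2;
  // Timestamps (ms) of past specialist calls, keyed by route tag.
  callsByRoute: Record<string, number[]>;
}

function tryConsume(
  state: QuotaState,
  route: string,
  now: number,
  windowMs: number,
  limit: number,
): boolean {
  // Drop calls that have aged out of the window. This pruning is what
  // makes the window slide instead of locking out permanently.
  const recent = (state.callsByRoute[route] ?? []).filter(
    (t) => now - t < windowMs,
  );
  if (recent.length >= limit) {
    state.callsByRoute[route] = recent;
    return false; // quota exhausted for now; retry once old calls expire
  }
  recent.push(now);
  state.callsByRoute[route] = recent;
  return true;
}
```

A window that never prunes old timestamps would produce the permanent lockout the fix describes; the filter step is the part that matters.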

Added

design/specialists-layer.md — design lock for the specialists progression layer (grade bands, the Record, cross-training, techniques, operating profiles, form). Research-grounded by a 40-finding study-swarm; citation-verified (19 identifiers checked, 0 fabricated, 0 misattributed). Implementation lands in future minors.

Architecture grounded in (selection): Deci/Koestner/Ryan 1999 (competence feedback vs rewards) · Moldon et al. 2021 (streaks induce junk work) · Ilharco et al. 2022 (task arithmetic) · Yadav et al. 2023 (TIES interference) · Boubdir et al. 2023 (Elo pitfalls) · Schemmer et al. 2023 (appropriate reliance) · full bibliography in the design doc.

🤖 Generated with Claude Code


09 Jun 12:01

mcp-tool-shop

v2.9.0

a5aae90

v2.9.0 — Crew Dossier + Operating Posture dispatch wiring

Crew Dossier + Operating Posture

A character sheet for every role that doubles as run-time config.

Six aptitudes (rigor / pace / range / skepticism / autonomy / candor, 0–5) mapped to real dispatch knobs, an 8-archetype disposition layer carrying a behavioral instruction, a painted portrait, and a grade — for all 64 roles (the 61 roles + 3 specialty auditors).
Operating Posture dispatch wiring (opt-in, non-breaking): buildRolePrompt injects the disposition's instruction + a posture line from the role's aptitudes when a dossier exists; roles without one are byte-identical to before. Runtime data ships in src/role-dossiers.json.
A self-contained crew gallery (dossier/dossier.html) — each role's radar shows its tuned build vs its canonical ideal.
Aptitude profiles calibrated like an instrument: a cloud-model panel (per-axis median consensus) + a different-family external-verifier pass → 64 unique, knob-faithful fingerprints.
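
As a rough illustration of the opt-in wiring described above, the sketch below appends a disposition instruction and a posture line only when a dossier exists. The `Dossier` shape, the `buildRolePromptSketch` name, and the posture-line format are assumptions; the real `buildRolePrompt` and the schema of `src/role-dossiers.json` may differ.

```typescript
// Hypothetical shapes; the real dossier schema is not shown in these notes.
type Aptitude =
  | "rigor" | "pace" | "range" | "skepticism" | "autonomy" | "candor";

interface Dossier {
  dispositionInstruction: string;
  aptitudes: Record<Aptitude, number>; // each scored 0 to 5
}

function buildRolePromptSketch(basePrompt: string, dossier?: Dossier): string {
  // No dossier: return the base prompt untouched (the non-breaking path).
  if (!dossier) return basePrompt;
  // Render aptitudes in a stable (alphabetical) order so output is deterministic.
  const posture = (Object.keys(dossier.aptitudes) as Aptitude[])
    .sort()
    .map((k) => `${k}=${dossier.aptitudes[k]}`)
    .join(", ");
  return `${basePrompt}\n\n${dossier.dispositionInstruction}\nPosture: ${posture}`;
}
```

Returning the input string unchanged on the no-dossier path is one simple way to keep existing roles byte-identical.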

Full suite 1404 tests green. Handbook: https://mcp-tool-shop-org.github.io/role-os/handbook/crew-dossier/


08 Jun 15:32

mcp-tool-shop

v2.8.0

d149e4a

v2.8.0 — capability gate + conformance live-catalog rollout

Added

Capability gate — deterministic least-privilege on irreversible tool calls. A gated set of irreversible / world-touching actions (npm/PyPI publish, gh release / pr / repo edit, git push, Pages deploy), a director-authored .claude/role-os/capabilities.json grant manifest, and capabilityGate(). Opt-in (ROLEOS_CAPABILITY_GATE, default OFF → pure no-op), fail-closed for the gated set, deterministic (no model). Wired into onPreToolUse (deny path) + the generated PreToolUse hook (exit 2), alongside the advisory / fail-open conformance floor. Bounds what a wrong verdict — an honest mistake or an injected one — can DO; the preventive complement to the named-compensator rule (POLA / CaMeL).
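
The gate's decision logic (opt-in, no-op when off, fail-closed for the gated set) might look roughly like the sketch below. It is illustrative only: the `gateSketch` signature, the `GrantManifest` shape, the action strings, and the enabling value `"1"` are assumptions, not the real `capabilityGate()` API or the `capabilities.json` schema.

```typescript
// Hypothetical sketch; not the real capabilityGate() or manifest schema.
type Env = Record<string, string | undefined>;
interface GrantManifest {
  grants: Record<string, string[]>; // role name -> actions it may perform
}
type Decision = { allow: true } | { allow: false; reason: string };

// Illustrative subset of the irreversible actions listed in the notes.
const GATED_ACTIONS = ["npm publish", "git push", "gh release"];

function gateSketch(
  env: Env,
  role: string,
  action: string,
  manifest: GrantManifest | null,
): Decision {
  // Default OFF: with the flag unset, the gate is a pure no-op.
  // (The value "1" is assumed here.)
  if (env["ROLEOS_CAPABILITY_GATE"] !== "1") return { allow: true };
  // Actions outside the gated set pass through.
  if (GATED_ACTIONS.indexOf(action) === -1) return { allow: true };
  // Fail closed: a missing manifest or a missing grant means deny.
  if (manifest === null) return { allow: false, reason: "no grant manifest" };
  const granted = manifest.grants[role] ?? [];
  return granted.indexOf(action) !== -1
    ? { allow: true }
    : { allow: false, reason: `${role} has no grant for "${action}"` };
}

// The notes say the generated PreToolUse hook exits 2 on the deny path.
function hookExitCode(d: Decision): number {
  return d.allow ? 0 : 2;
}
```

The check is a pure function of the environment, the manifest, and the requested action, which is one way to keep it deterministic with no model in the loop.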

Changed

Wedge #1 conformance — live tool-contracts catalog rollout. The deterministic schema + computable-contract floor runs at the live onPreToolUse seam against .claude/role-os/tool-contracts.json (advisory, fail-open), and generated hook scripts emit the current Claude Code wire protocol.

Full changelog: CHANGELOG.md.


06 Jun 07:10

mcp-tool-shop

v2.7.1

4ce9bc9

v2.7.1 — budget consult docs

Documentation release.

The README and a new handbook page now cover budget-aware dispatch — Role OS can consult a local Token Budget Analyst for each dispatch step and attach an advisory spend forecast (opt-in ROLEOS_BUDGET_CONSULT, fail-open to a deterministic baseline, never blocks a dispatch). No code changes from 2.7.0.


06 Jun 05:45

mcp-tool-shop

v2.7.0

ef2604b

role-os v2.7.0

Token Budget Analyst — production budget consult (opt-in, default-off).

consultBudgetForManifest / buildDispatchManifestWithBudget consult the budgeter specialist per dispatch step, attaching an advisory budget forecast + receipt to each step. Enable with ROLEOS_BUDGET_CONSULT=1; fail-open to the deterministic baseline max(ctx*1.5, 50000) (not Claude); advisory — it never blocks a dispatch. Also lands the budgeter dataset tooling under tools/token-budget-dataset/.
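
A minimal sketch of the fail-open behaviour follows. It assumes `ctx` is the step's context size in tokens. `forecastStep`, `Forecast`, and the `consult` callback are hypothetical stand-ins, not the signatures of `consultBudgetForManifest` or `buildDispatchManifestWithBudget`.

```typescript
// Deterministic baseline quoted in the release note: max(ctx * 1.5, 50000).
function baselineBudget(ctxTokens: number): number {
  return Math.max(ctxTokens * 1.5, 50000);
}

interface Forecast {
  tokens: number;
  source: "specialist" | "baseline";
}

// `consult` stands in for the call to the budgeter specialist (hypothetical).
function forecastStep(
  ctxTokens: number,
  enabled: boolean,
  consult: (ctx: number) => number,
): Forecast | undefined {
  // Opt-in, default off: no forecast is attached when disabled.
  if (!enabled) return undefined;
  try {
    return { tokens: consult(ctxTokens), source: "specialist" };
  } catch {
    // Fail open: any consult error falls back to the baseline, and the
    // dispatch itself is never blocked.
    return { tokens: baselineBudget(ctxTokens), source: "baseline" };
  }
}
```

With this baseline, a 10,000-token context yields the 50,000 floor, while a 40,000-token context yields 60,000.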

Full notes: CHANGELOG.md. 1334 tests green. Compensator: roleos specialist rollback.


03 Jun 18:56

mcp-tool-shop

v2.6.0

98d41e1

role-os v2.6.0 — local panel judges against prism's full abstract

verify-citations --local-panel now judges against prism's full abstract, not just one span.

The local entailment panel previously re-checked each supported citation against only prism's source_title + the single supporting_span the groundedness lens surfaced. A faithful claim the whole abstract entails — but no single span does — was escalated as a panel disagreement. buildEvidence now prefers prism's full source_abstract (surfaced by prism v1.0+), falling back to the span on older prism builds — so faithful claims land cleanly while genuine false-confirms are still caught. gateCitations threads source_abstract through; backward-compatible, no API change.
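
The preference order described above can be sketched as follows. The field names `source_title`, `supporting_span`, and `source_abstract` come from the note. The `PrismResult` type, the `buildEvidenceSketch` name, the output format, and treating a blank abstract as missing are assumptions.

```typescript
// Hypothetical result shape; only the three field names come from the notes.
interface PrismResult {
  source_title: string;
  supporting_span: string;
  source_abstract?: string; // absent on older prism builds
}

function buildEvidenceSketch(r: PrismResult): string {
  // Prefer the full abstract; fall back to the single span when the
  // abstract is missing (or blank, which is an assumption made here).
  const abstract = r.source_abstract?.trim();
  const body = abstract ? abstract : r.supporting_span;
  return `${r.source_title}\n${body}`;
}
```

Because the abstract field is optional, callers on older prism builds get the previous span-only evidence without any change on their side.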

Pairs with prism-verify 1.0.0 (which surfaces source_abstract) and tensor-engine-knowledge wave-9. Full suite: 1199 tests green. Published via npm Trusted Publishing (provenance).

Install: npm install -g role-os · npx role-os


03 Jun 14:03

mcp-tool-shop

v2.5.0

0cc87e2

v2.5.0 — verify-citations --local-panel

A second, family-different verifier seat for the citation gate.

roleos verify-citations --local-panel adds a local 3-seat entailment panel (Qwen3-4B + Qwen3-14B + Mistral-Nemo-12B) that re-checks each citation an external verifier marked supported, and escalates to human review on disagreement — it can only tighten the gate, never loosen it. Runs entirely on local models, zero cost.
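
The tighten-only property can be shown with a small sketch. The verdict labels, the `applyPanel` name, and the unanimity rule are assumptions for illustration; the notes say only that the panel re-checks supported citations and escalates on disagreement.

```typescript
// Hypothetical labels; the real verdict vocabulary may differ.
type Verdict = "supported" | "insufficient" | "needs_human_review";
type SeatVote = "supported" | "not_supported";

function applyPanel(external: Verdict, votes: SeatVote[]): Verdict {
  // The panel only re-checks citations the external verifier marked
  // supported; every other verdict passes through unchanged.
  if (external !== "supported") return external;
  // Assumed rule: keep "supported" only if every seat agrees. Anything
  // else is treated as disagreement and escalated to a human.
  const unanimous =
    votes.length > 0 && votes.every((v) => v === "supported");
  return unanimous ? "supported" : "needs_human_review";
}
```

Since the function never maps a non-supported verdict to "supported", the panel can tighten the gate but cannot loosen it.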

Why it matters: the panel's measured property is zero false-confirms — it never stamps a false claim "supported." On a real 16-case arXiv citation set, one model false-confirmed a claim that inverted a paper's finding; the panel held it at insufficient.

Opt in with --local-panel (off by default; needs a local llama-swap + offload). +16 tests, 1196 total. See CHANGELOG.md for details.


25 Mar 01:54

mcp-tool-shop

v1.2.0

2340666

v1.2.0 — Pack Promotion

Calibrated packs promoted to default entry. Auto-selection, mismatch detection, alternative suggestion, free-routing fallback. See CHANGELOG.md.


25 Mar 01:20

mcp-tool-shop

v1.1.0

e73bf06

v1.1.0 — Full Spine Complete

See CHANGELOG.md for full notes. 31 roles, 7 proven team packs, 212 tests, 35 execution trials.

