diff --git a/SKILL.md b/SKILL.md
index f7e9ae3..df6172e 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -57,8 +57,8 @@ bstack is a *portable harness metalayer* — it composes existing skills into a

 bstack ships two complementary layers:

-- **Substrate** (this skill, `/bstack`): the 19 primitives + 29 skills + governance + hooks + `.control/policy.yaml`. This is what `/bstack bootstrap` installs. The substrate is the *capability* — what's available in the workspace.
-- **Mode** (`broomva/autonomous`): the canonical *behavior* that runs on top of the substrate. When the user says "go" / "proceed" / "be autonomous", `/autonomous` fires the 19-reflex pipeline that uses every primitive in sequence.
+- **Substrate** (this skill, `/bstack`): the 20 primitives + 30 skills + governance + hooks + `.control/policy.yaml`. This is what `/bstack bootstrap` installs. The substrate is the *capability* — what's available in the workspace.
+- **Mode** (`broomva/autonomous`): the canonical *behavior* that runs on top of the substrate. When the user says "go" / "proceed" / "be autonomous", `/autonomous` fires the 20-reflex pipeline that uses every primitive in sequence.

 Installing the substrate without the mode = the workspace has primitives but no entry point to engage them. Invoking the mode without the substrate = wishful thinking. Compounded: `/bstack bootstrap` installs the substrate, then `/autonomous` is the standing operating mode for substantive work units.

@@ -71,7 +71,7 @@ npx skills add broomva/bstack
 Then, in your agent session:

 ```
-/bstack bootstrap → install 28 skills + scaffold governance + wire hooks + run doctor
+/bstack bootstrap → install 30 skills + scaffold governance + wire hooks + run doctor
 /bstack doctor → verify primitive contract compliance (always exits 0)
 /bstack repair → fix specific gaps surfaced by doctor (asks before writing)
 /bstack status → show which skills are installed vs missing
@@ -81,7 +81,7 @@ Then, in your agent session:

 ## What bstack enforces

-The sixteen primitives. Each closes one specific failure mode that drifts into entropy in unsupervised sessions:
+The twenty primitives. Each closes one specific failure mode that drifts into entropy in unsupervised sessions:

 | # | Primitive | Closes |
 |---|---|---|
@@ -245,7 +245,7 @@ Future sessions inspect this for state. `bootstrap_status: failed` is captured t

 1. Installs all 30 skills via `npx skills add broomva/` — `broomva/autonomous` is the first in the roster (canonical operating mode)
 2. **Scaffolds missing governance files** from `assets/templates/`:
-   - `CLAUDE.md` (workspace invariants + RCS hierarchy + primitive table P1–P16 + §Ritual vs Substance)
+   - `CLAUDE.md` (workspace invariants + RCS hierarchy + primitive table P1–P20 + §Ritual vs Substance)
    - `AGENTS.md` (operational rules + per-primitive sections + reflexive triggers for all reasoning-enforced primitives)
    - `.control/policy.yaml` (ci_watch / ci_heal / auto_merge / gates G1–G11)
    - `.claude/settings.json` (P1, P2, P7 hook wiring)
@@ -264,9 +264,9 @@ Future sessions inspect this for state. `bootstrap_status: failed` is captured t
 `scripts/doctor.sh`. Eight check sections:

 1. Governance files exist (CLAUDE.md, AGENTS.md, .control/policy.yaml)
-2. CLAUDE.md primitives table has all P1–P16 rows + correct count header ("Sixteen irreducible…")
-3. AGENTS.md has each primitive section (`### P1:` through `### P16:`)
-4. Reflexive Trigger Rules present for P6, P9, P10, P11, P12, P13, P14, P15, P16 (the reasoning-enforced primitives)
+2. CLAUDE.md primitives table has all P1–P20 rows + correct count header ("Twenty irreducible…")
+3. AGENTS.md has each primitive section (`### P1:` or `### P1 — Short: Long` format through `### P20`)
+4. Reflexive Trigger Rules present for P6, P9, P10, P11, P12, P13, P14, P15, P16, P17, P18, P19, P20 (the reasoning-enforced primitives)
 5. `.control/policy.yaml` has required blocks (`ci_watch:`, `ci_heal:`, `auto_merge:`)
 6. `.claude/settings.json` wires the expected hook scripts (P1, P2, P7)
 7. Each primitive's mechanism is reachable on disk
@@ -292,14 +292,14 @@ Re-run the preamble. For each skill show: name, layer, installed/missing. Then r

 `scripts/revamp.sh`. Triggers complete workspace reconfiguration:

-1. Reinstall all 28 skills (force mode)
+1. Reinstall all 30 skills (force mode)
 2. Regenerate governance files from templates (asks before overwriting)
 3. Rewire hooks (git pre-commit + Claude Code Stop/Notification/PreToolUse/SessionStart)
 4. Force-run conversation bridge across all projects
 5. Run full control audit
 6. Update AGENTS.md with current state

-## Stack layers (28 skills)
+## Stack layers (30 skills)

 For the full skill roster + descriptions, see [references/skills-roster.md](references/skills-roster.md). For the layered architecture, see [references/stack-architecture.md](references/stack-architecture.md). For the full primitive contract with reflexive triggers, see [references/primitives.md](references/primitives.md).

@@ -316,7 +316,7 @@ bstack is the *measurement substrate* for the agentic-control-kernel. The harnes
 | Bridge operational | fresh < 24h | `~/.cache/broomva-bridge-stamp` mtime |
 | Control audit | 5/5 sections | `make control-audit` exit code |
 | Conversations indexed | ≥1 session | `docs/conversations/Conversations.md` exists |
-| **Primitive contract** | **13/13** | **`bstack doctor` exit code** |
+| **Primitive contract** | **20/20** | **`bstack doctor` exit code** |

 ## When to use bstack

@@ -390,9 +390,9 @@ This is the f₃ dynamics function at L3 of the RCS hierarchy. See [references/p

 ## See also

-- [references/primitives.md](references/primitives.md) — full P1–P13 reference with reflexive triggers
+- [references/primitives.md](references/primitives.md) — full P1–P20 reference with reflexive triggers
 - [references/prompts-integration.md](references/prompts-integration.md) — when/how to leverage the broomva.tech prompts library (5-step auto-tracing mandate, discovery, common traps)
-- [references/skills-roster.md](references/skills-roster.md) — all 28 skills with install commands
+- [references/skills-roster.md](references/skills-roster.md) — all 30 skills with install commands
 - [references/stack-architecture.md](references/stack-architecture.md) — layer dependency diagram
 - [references/quickstart.md](references/quickstart.md) — 5-minute install walkthrough
 - [bstack-upgrade/SKILL.md](bstack-upgrade/SKILL.md) — version-upgrade flow
diff --git a/assets/templates/AGENTS.md.template b/assets/templates/AGENTS.md.template
index cee3b5d..9e5abd2 100644
--- a/assets/templates/AGENTS.md.template
+++ b/assets/templates/AGENTS.md.template
@@ -325,7 +325,7 @@ Compositions are dynamic: Persist iterations can invoke `/goal` for sub-tasks; `
 | **B** Cross-context same-model | Fresh `Agent` subagent under devil's-advocate brief | Always available |
 | **C** Composed existing skills | `superpowers:constructive-dissent`, `devils-advocate`, `pr-review-toolkit:*`, `critique`, `premortem`, `plan-*-review` | Always — the toolkit P20 makes mandatory |

-Scoring: anti-slop ≥7/10 to pass; max 3 fix rounds; verdict logged in PR comments. Implementation: `broomva/cross-review` skill. The gate fires *before* P4 auto-merge — not after merge as code review.
+Scoring: anti-slop ≥7/10 to pass; max 3 fix rounds; verdict logged in PR comments + Linear ticket (if workspace uses Linear). Implementation: `broomva/cross-review` skill. The gate fires *before* P4 auto-merge — not after merge as code review.

 **Invariant**: Substantive PRs (>200 LOC OR public API change OR multi-file OR governance-class) cannot merge without cross-model adversarial verdict ≥7/10. Self-review by the writing model is forbidden as the *sole* verdict.

diff --git a/assets/templates/CLAUDE.md.template b/assets/templates/CLAUDE.md.template
index b0938f6..6b0b496 100644
--- a/assets/templates/CLAUDE.md.template
+++ b/assets/templates/CLAUDE.md.template
@@ -33,7 +33,7 @@ Each primitive carries a **short name** for agent prose. When referencing a prim
 | P17 | **Lens** — Lens-Routed Request Articulation (`broomva/role-x` skill) | Reflexive rule: every substantive user input passes through `role/x` intake — select lens(es) from `roles/.md` registry by scoring signals, load substantive context, decide mode (`augment` / `rewrite` / `decompose`); P5 fan-out becomes typed graph. | No `act as X` persona rewrites — lenses load substantive context only. Lens selection is logged. Mode decision is surfaced unless `augment`. |
 | P18 | **Audience** — Format-Follows-Audience Discipline | Reflexive rule: format follows audience. Agent-readable (LLM, system-prompt loaded, in-repo reference) → **markdown**. Human-readable (decisions, review, exploration) → **HTML**. Both (README, CHANGELOG, GitHub-browseable) → markdown (GitHub renders). ASCII pseudo-diagrams + unicode-color-approximation + >100-line markdown specs without HTML companion are explicit anti-patterns. Specs/plans/ADRs land in `docs/specs/`, `docs/plans/`, `docs/adrs/` as `.html`. | Format follows audience, not habit. Markdown's expressiveness ceiling means humans bounce off agent-produced specs at ~100 lines; HTML's information density carries the load. The 2-4× HTML generation cost is paid only on artifacts a human will actually read. |
 | P19 | **Orchestrate** — Orchestration-Mechanism Selection Discipline | At pre-flight of substantive autonomous work, apply 2×2 (within-session/across-session × external-event/internal-condition) to map work shape to mechanism: `/goal ` (internal+in-session), Wait (P9) `p9 watch --background` (external+in-session), `/loop ` (internal+across-session), Persist (P12) `persist iterate PROMPT.md` (external+across-session). Compose dynamically. | No autonomous-continuation work without explicit mechanism choice + 2×2 quadrant citation. "Continue please" / waiting for user prompts mid-arc is ritual and forbidden. |
-| P20 | **Cross-Review** — Cross-Model Adversarial Review Gate (`broomva/cross-review` skill) | Before substantive PRs merge, fire cross-model adversarial gate. Three strata: A (true cross-vendor via `codex exec`), B (fresh-context subagent under devil's-advocate brief), C (composed adversarial-review skills — `superpowers:constructive-dissent`, `devils-advocate`, `pr-review-toolkit:*`, `critique`, `premortem`). Anti-slop score ≥7/10; max 3 fix rounds; verdict logged in PR. Fires *before* P4 auto-merge. | Substantive PRs (>200 LOC OR public API OR multi-file OR governance-class) cannot merge without cross-model verdict ≥7/10. Self-review by the writing model as sole verdict is forbidden. |
+| P20 | **Cross-Review** — Cross-Model Adversarial Review Gate (`broomva/cross-review` skill) | Before substantive PRs merge, fire cross-model adversarial gate. Three strata: A (true cross-vendor via `codex exec`), B (fresh-context subagent under devil's-advocate brief), C (composed adversarial-review skills — `superpowers:constructive-dissent`, `devils-advocate`, `pr-review-toolkit:*`, `critique`, `premortem`). Anti-slop score ≥7/10; max 3 fix rounds; verdict logged in PR comments + Linear ticket (if workspace uses Linear). Fires *before* P4 auto-merge. | Substantive PRs (>200 LOC OR public API OR multi-file OR governance-class) cannot merge without cross-model verdict ≥7/10. Self-review by the writing model as sole verdict is forbidden. |

 > **Naming note.** Skill repo names are stable and don't always match primitive numbers. P6's skill repo is `broomva/bookkeeping` (named for the function). P9's skill repo is `broomva/p9` — Wait was historically the ninth primitive, so the name matches. Renaming any skill repo would break every `npx skills add` install, so skill names stay; primitive numbers are sequential identifiers in the bstack itself.

diff --git a/tests/template_lockstep.test.sh b/tests/template_lockstep.test.sh
index b75f578..34bab78 100755
--- a/tests/template_lockstep.test.sh
+++ b/tests/template_lockstep.test.sh
@@ -168,6 +168,73 @@ assert_contains "CLAUDE.md.template has Plugin Skill Precedence section" \
 assert_contains "AGENTS.md.template has Plugin Skill Precedence section" \
   "$AGENTS_TPL_TEXT" "## Plugin Skill Precedence"

+# ── 7. Scaffold-and-doctor compliance ──────────────────────────────────
+# The REAL guardrail: scaffold a workspace from the templates, then run
+# doctor.sh against it. This catches the failure mode where templates are
+# internally consistent (count, table, index — checks 1-6 above) but use a
+# heading format doctor.sh's regex doesn't recognize. That bug surfaced on
+# the P20 cross-review of this very PR's first pass and motivated this
+# assertion: lockstep-vs-validator, not just lockstep-vs-self.
+echo ""
+echo "=== Scaffold-and-doctor compliance ==="
+
+SCAFFOLD_DIR="$(mktemp -d)"
+trap 'rm -rf "$SCAFFOLD_DIR"' EXIT
+
+# Mirror what scripts/bootstrap.sh's scaffold_governance_file() does.
+mkdir -p "$SCAFFOLD_DIR/.control" "$SCAFFOLD_DIR/.claude" "$SCAFFOLD_DIR/scripts"
+WSNAME="$(basename "$SCAFFOLD_DIR")"
+sed "s/{{WORKSPACE_NAME}}/$WSNAME/g" "$CLAUDE_TPL" > "$SCAFFOLD_DIR/CLAUDE.md"
+sed "s/{{WORKSPACE_NAME}}/$WSNAME/g" "$AGENTS_TPL" > "$SCAFFOLD_DIR/AGENTS.md"
+sed "s|\${BROOMVA_WORKSPACE}|$SCAFFOLD_DIR|g" \
+  "$BSTACK_REPO/assets/templates/settings.json.snippet" > "$SCAFFOLD_DIR/.claude/settings.json"
+cp "$BSTACK_REPO/assets/templates/policy.yaml.template" "$SCAFFOLD_DIR/.control/policy.yaml"
+
+# Doctor checks mechanism reachability — placeholder scripts so those checks
+# don't false-positive unrelated to template content.
+touch "$SCAFFOLD_DIR/scripts/conversation-bridge-hook.sh"
+touch "$SCAFFOLD_DIR/scripts/control-gate-hook.sh"
+touch "$SCAFFOLD_DIR/scripts/skill-freshness-hook.sh"
+touch "$SCAFFOLD_DIR/scripts/branch-janitor.sh"
+chmod +x "$SCAFFOLD_DIR/scripts/"*.sh
+git -C "$SCAFFOLD_DIR" init -q >/dev/null 2>&1
+
+DOCTOR_OUT="$SCAFFOLD_DIR/.doctor.out"
+BROOMVA_WORKSPACE="$SCAFFOLD_DIR" bash "$DOCTOR_SH" --quiet > "$DOCTOR_OUT" 2>&1 || true
+
+# Count only template-content gaps: primitive sections + reflexive rules.
+# Other gaps (skill repos on disk, .control blocks, etc.) are not template concerns.
+PRIMSEC_GAPS="$(grep -cE "AGENTS\.md missing '### P[0-9]+" "$DOCTOR_OUT" || true)"
+REFLEX_GAPS="$(grep -cE "missing 'Reflexive Trigger Rule'" "$DOCTOR_OUT" || true)"
+PRIMSEC_GAPS="${PRIMSEC_GAPS:-0}"
+REFLEX_GAPS="${REFLEX_GAPS:-0}"
+
+assert_eq "Scaffold-and-doctor: no missing primitive sections" "$PRIMSEC_GAPS" "0"
+
+# Reflexive trigger rules: detection in WARN mode pending a separate PR to
+# realign template P7/P8/P9 ordering to match doctor.sh REFLEXIVE_PRIMS +
+# references/primitives.md (workspace canonical: P7=Wait, P8=Freshness,
+# P9=Janitor). The templates currently have P7=Freshness/P8=Janitor/P9=Wait
+# (pre-existing drift introduced by #16). The next PR fixes the templates;
+# this PR ships the detector. Once the realignment lands, flip this back
+# to a hard assertion.
+#
+# TODO (follow-up PR after this one merges): after template P7-P9 realignment
+# branch `feat/bstack-template-realign-p7-p9-fail-mode-flip`, change this to:
+# assert_eq "Scaffold-and-doctor: no missing reflexive trigger rules" "$REFLEX_GAPS" "0"
+if [ "$REFLEX_GAPS" -gt 0 ]; then
+  echo " [warn] Scaffold-and-doctor: $REFLEX_GAPS missing reflexive trigger rule(s) — pre-existing template drift; tracked in follow-up PR. Detector working; fix not in this PR."
+else
+  echo " [ok] Scaffold-and-doctor: no missing reflexive trigger rules: $REFLEX_GAPS"
+  PASS=$((PASS + 1))
+fi
+
+if [ "$PRIMSEC_GAPS" != "0" ] || [ "$REFLEX_GAPS" != "0" ]; then
+  echo ""
+  echo " [doctor output excerpt — first 30 lines]"
+  head -30 "$DOCTOR_OUT" | sed 's/^/ /'
+fi
+
 # ── Summary ──────────────────────────────────────────────────────────────
 echo ""
 echo "=== Lockstep test summary ==="
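
Review note on the gap-counting idiom in section 7: the sketch below isolates the `grep -cE ... || true` pattern so it can be run on its own. It is a minimal illustration, not the test itself. The sample doctor output and the `count_gaps` helper are hypothetical; only the two regexes are copied from the diff, and the real wording of `doctor.sh` messages may differ.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical doctor output; the real lines are whatever doctor.sh prints.
sample_out="$(mktemp)"
trap 'rm -f "$sample_out"' EXIT
cat > "$sample_out" <<'EOF'
[gap] AGENTS.md missing '### P18' section
[gap] AGENTS.md missing '### P20' section
[ok] .control/policy.yaml has ci_watch:
EOF

# count_gaps PATTERN FILE (hypothetical helper): print the number of
# lines in FILE that match PATTERN.
# grep -c prints 0 but exits 1 when nothing matches, so `|| true`
# keeps `set -e` from aborting on the clean case.
count_gaps() {
  grep -cE "$1" "$2" || true
}

primsec_gaps="$(count_gaps "AGENTS\.md missing '### P[0-9]+" "$sample_out")"
reflex_gaps="$(count_gaps "missing 'Reflexive Trigger Rule'" "$sample_out")"

echo "primitive-section gaps: $primsec_gaps"
echo "reflexive-rule gaps: $reflex_gaps"
```

With this sample the script reports 2 primitive-section gaps and 0 reflexive-rule gaps. Because `grep -c` always prints a count, the `${VAR:-0}` defaults in the diff act as a second safety net and do not carry the clean case on their own.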