Merged
11 changes: 6 additions & 5 deletions SKILL.md
@@ -1,7 +1,7 @@
---
name: bstack
description: |
The Broomva Stack — twelve irreducible primitives (P1–P12) that turn any
The Broomva Stack — thirteen irreducible primitives (P1–P13) that turn any
agent-driven workspace into a self-operating system, plus 28 curated skills
that ship with the stack. The primitives are not optional features; they are
the substrate. P1 captures every session as episodic memory. P2 gates
@@ -25,7 +25,7 @@ description: |

# bstack — The Broomva Stack

**Twelve irreducible primitives. Twenty-nine curated skills. One self-operating workspace.**
**Thirteen irreducible primitives. Twenty-nine curated skills. One self-operating workspace.**

bstack is a *portable harness metalayer* — it composes existing skills into a binding primitive contract that the agent enforces by reasoning, the doctor enforces by checking, and the bootstrap enforces by scaffolding.

@@ -64,6 +64,7 @@ The twelve primitives. Each closes one specific failure mode that drifts into en
| **P10** | Worktree Hygiene Discipline | dirty-tree drift across the PR lifecycle |
| **P11** | Empirical Feedback Loop | shipping code that compiles but doesn't work |
| **P12** | Persistent Loop Discipline (`broomva/persist` skill) | long-horizon work decaying as the context window rots |
| **P13** | Dream Cycle Discipline | tier-crossing consolidation corrupting upper-tier rules without replay (the *shadow dream* failure mode) |

Full reference: see [references/primitives.md](references/primitives.md).

@@ -138,9 +139,9 @@ Report results. If any checks fail, fix them before proceeding.
`scripts/doctor.sh`. Seven check sections:

1. Governance files exist (CLAUDE.md, AGENTS.md, .control/policy.yaml)
2. CLAUDE.md primitives table has all P1–P12 rows + correct count header
2. CLAUDE.md primitives table has all P1–P13 rows + correct count header
3. AGENTS.md has each primitive section (`### P1:` through `### P11:`)
4. Reflexive Trigger Rules present for P6, P7, P10, P11, P12 (the reasoning-enforced primitives)
4. Reflexive Trigger Rules present for P6, P7, P10, P11, P12, P13 (the reasoning-enforced primitives)
5. `.control/policy.yaml` has required blocks (`ci_watch:`, `ci_heal:`, `auto_merge:`)
6. `.claude/settings.json` wires the expected hook scripts (P1, P2, P8)
7. Each primitive's mechanism is reachable on disk
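Checks 2 and 5 amount to simple text scans. The real checks live in `scripts/doctor.sh`; the Python below is only an illustrative sketch, and the function names `missing_primitive_rows` and `missing_policy_blocks` are hypothetical. It assumes each primitive row starts with `| P<n> |` (optionally bold) and that each required policy block appears at the start of a line.

```python
import re

# Block names taken from check 5 above.
REQUIRED_POLICY_BLOCKS = ("ci_watch:", "ci_heal:", "auto_merge:")


def missing_primitive_rows(claude_md: str, count: int = 13) -> list[str]:
    """Return the primitive IDs (P1..P<count>) that have no table row."""
    # Assumption: a row looks like "| P7 | ..." or "| **P7** | ...".
    found = set(
        re.findall(r"^\|\s*\*{0,2}(P\d+)\*{0,2}\s*\|", claude_md, flags=re.MULTILINE)
    )
    return [f"P{i}" for i in range(1, count + 1) if f"P{i}" not in found]


def missing_policy_blocks(policy_yaml: str) -> list[str]:
    """Return the required block names that no line of the policy starts with."""
    lines = [line.strip() for line in policy_yaml.splitlines()]
    return [
        block
        for block in REQUIRED_POLICY_BLOCKS
        if not any(line.startswith(block) for line in lines)
    ]
```

An empty list from either function means the corresponding check passes; anything else names exactly what to fix before proceeding.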
@@ -214,7 +215,7 @@ This is the f₃ dynamics function at L3 of the RCS hierarchy. See [references/p

## See also

- [references/primitives.md](references/primitives.md) — full P1–P12 reference with reflexive triggers
- [references/primitives.md](references/primitives.md) — full P1–P13 reference with reflexive triggers
- [references/skills-roster.md](references/skills-roster.md) — all 28 skills with install commands
- [references/stack-architecture.md](references/stack-architecture.md) — layer dependency diagram
- [references/quickstart.md](references/quickstart.md) — 5-minute install walkthrough
38 changes: 37 additions & 1 deletion assets/templates/AGENTS.md.template
@@ -134,19 +134,55 @@ The agent picks the right subset, runs as parallel watchers, and **captures evid
5. When CI or tests fail — capture full context first before attempting a fix.
6. At session end — produce a *dogfood receipt*.

### P12: Persistent Loop Discipline (Cross-Context Restart Loop)

**What**: Reflexive discipline binding every agent to *restart the context window when it rots*, while preserving state in the filesystem. Long-horizon work (>1h, the METR 80%-reliability ceiling) decays inside a single conversation as the context window passes ~100K tokens.

**How**: `python3 skills/persist/scripts/persist.py iterate <PROMPT.md>` substrate. Each iteration spawns a fresh agent context; state persists in PROMPT.md + git tree + state.jsonl. Validation backpressure from compilers/tests/linters, not model self-grading.

**Invariant**: state lives in the filesystem. Each iteration is a fresh subprocess. Budget: default 50 iterations / 14400s wall-clock.

**Reflexive Trigger Rule**: P12 is a reflex. Apply without being prompted:

1. Before any work that may exceed ~1h of unsupervised agent time — write PROMPT.md, call `persist iterate`. Don't try >1h work in-context.
2. When session token usage crosses ~100K — restart, don't continue in the rotted context.
3. When the same fix has been attempted ≥3 times without convergence — stop in-context; spawn fresh persist loop.
4. When orchestrating long-horizon work — default to persist + periodic checkpoints; compose with P5 worktrees.
5. When the user says "run this in the background for an hour" — that's persist territory.

### P13: Dream Cycle Discipline (Tier-Crossing Consolidation)

**What**: Reflexive discipline binding every agent to apply the **5-phase dream shape** (*gather → replay → prune → consolidate → index*) for any consolidation that crosses a cadence-tier boundary. Closes the *shadow dream* corruption mode.

**How**: Reasoning-enforced. Composes with P6 (`bookkeeping replay` is the canonical reference instance) and future Life primitives (askesis T1→T2, anamnesis T2→T3, when shipped).

**Invariant**: any consolidation that crosses a cadence-tier boundary MUST replay against a frozen substrate before committing. Without replay, dense lower-tier signal corrupts sparse upper-tier rules.

**Reflexive Trigger Rule**: P13 is a reflex. Apply without being prompted:

1. Before any consolidation that promotes lower-tier signal to upper-tier rules — verify the consolidation primitive has a replay phase.
2. For knowledge-graph promotion — use `bookkeeping replay` (not `bookkeeping run`) for substantial promotion runs.
3. For governance changes (L3 tier) — every PR is a dream cycle: gather (PR description), replay (worktree + CI + doctor), prune (CI failures), consolidate (squash merge), index (commit history).
4. When designing a NEW consolidation primitive — implement the 5-phase shape from day 1; don't ship shadow-dream form.
5. When you observe a new dream instance shipping — record it for the rule-of-three counter.

The morpheus crate (shared abstraction across implementations) is deferred per rule-of-three until ≥2 dream instances ship beyond P6.

---

These eleven primitives compose into the full autonomous development loop:
These thirteen primitives compose into the full autonomous development loop:

```
User intent → Linear ticket (P3) → Agent dispatched (P5)
→ Prior context loaded (P1) [+ P8 freshness check] [+ P10 cleanup audit]
→ Safety gates active (P2)
→ P10 worktree decision → P11 validation plan
→ IF long-horizon → P12 persist loop with PROMPT.md + budget
→ Code written + parallel watchers (P11 log-tails) → PR created (P4)
→ CI watched + heal loop (P7)
→ P11 deploy verification (preview URL, screenshots, browser session)
→ Merge → P10 post-merge cleanup via P9 janitor → Deploy
→ P13 dream cycle for any consolidation (P6 replay first; future Life dreams compose here)
→ P11 dogfood receipt → Session captured (P1) → Knowledge bookkept (P6)
→ System improved (EGRI)
```
4 changes: 3 additions & 1 deletion assets/templates/CLAUDE.md.template
@@ -6,7 +6,7 @@ This workspace is governed by **bstack** — eleven irreducible primitives (P1

## Bstack Core Automation Primitives

Eleven irreducible building blocks that make this workspace self-operating. All are always active. Full specification in `AGENTS.md`.
Thirteen irreducible building blocks that make this workspace self-operating. All are always active. Full specification in `AGENTS.md`.

| # | Primitive | Mechanism | Invariant |
|---|-----------|-----------|-----------|
@@ -21,6 +21,8 @@ Eleven irreducible building blocks that make this workspace self-operating. All
| P9 | Branch + Worktree Janitor | `make janitor` → detects squash-merged branches + dead worktrees, removes safely | Default `--dry-run`; never touches protected branches |
| P10 | Worktree Hygiene Discipline | Reflexive rule: decide worktree-or-not before first file; keep `git status` clean; auto-run P9 janitor after every merge | A clean tree is the only reliable reset point |
| P11 | Empirical Feedback Loop | Reflexive rule: validate by *interacting* — log-tails, browser E2E, screenshots, deploy verification, multi-level test composition | Reasoning isn't validation; interaction is |
| P12 | Persistent Loop Discipline (`broomva/persist` skill) | Reflexive rule: cross-context restart loop — state in filesystem (PROMPT.md + git tree), each iteration spawns a fresh agent context | At long-horizon work (>1h), in-context loops decay; restart fresh, backpressure from compilers/tests |
| P13 | Dream Cycle Discipline | Reflexive rule: any consolidation that crosses a cadence-tier boundary MUST follow the 5-phase shape (gather → replay → prune → consolidate → index) | Replay against frozen substrate is the runtime form of stop-gradient; without it, dense lower-tier signal corrupts sparse upper-tier rules |

> **Naming note.** Skill names are historical and do not always match primitive numbers. P6's skill repo is `broomva/bookkeeping`. P7's skill repo is `broomva/p9` (named when it was the ninth primitive; renaming would break every `npx skills add` install). Primitive numbers are sequential identifiers in the bstack itself; skills are independent npm-style packages with stable names. `bstack doctor` checks AGENTS.md compliance with this and all other primitives.

92 changes: 84 additions & 8 deletions references/primitives.md
@@ -15,6 +15,8 @@ The eleven primitives that make a workspace self-operating. This is the canonica
- [P9 — Branch + Worktree Janitor](#p9--branch--worktree-janitor)
- [P10 — Worktree Hygiene Discipline](#p10--worktree-hygiene-discipline)
- [P11 — Empirical Feedback Loop](#p11--empirical-feedback-loop)
- [P12 — Persistent Loop Discipline](#p12--persistent-loop-discipline)
- [P13 — Dream Cycle Discipline](#p13--dream-cycle-discipline)
- [Cohesion narrative](#cohesion-narrative)
- [RCS L3 stability constraint](#rcs-l3-stability-constraint)

@@ -202,28 +204,102 @@ Mental checklist: *Did I interact with it? Did I capture evidence? Was the evide

---

## P12 — Persistent Loop Discipline

**Closes**: long-horizon work decaying as the context window rots past ~100K tokens (the *"Dumb Zone"*). METR's Time Horizon 1.1 (Jan 2026) puts the **80%-reliability deployable horizon at ~1h on Opus 4.6** — a 14× reliability gap vs the 14.5h 50%-horizon. Above 1h, in-context loops fail.

**Skill name note**: P12's skill repo is `broomva/persist` — non-anthropomorphized rename of the pattern Geoffrey Huntley popularized as the "Ralph loop" (Jan 2026).

**How**: `python3 skills/persist/scripts/persist.py iterate <PROMPT.md>` substrate. Each iteration spawns a fresh agent context. State persists in the filesystem (PROMPT.md + git tree + state.jsonl). Validation backpressure from compilers/tests/linters, not model self-grading. Five-state machine: `SPAWNED → ITERATING (self-loop) → SUCCESS | BUDGET_EXHAUSTED | ABANDONED`. Default budget: 50 iterations / 14400s wall-clock (METR's 80%-horizon ceiling).

**Invariant**: state lives in the filesystem. Each iteration starts from PROMPT.md content, not conversation history. Validation backpressure is external. Each iteration is a fresh subprocess.
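The five-state machine and default budget can be sketched as below. This is not the code in `persist.py`. The `iterate` callback is a hypothetical stand-in for "spawn a fresh agent subprocess, then run external validation", and `should_abandon` is an assumed hook, since the text does not say what triggers `ABANDONED`.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    SPAWNED = "SPAWNED"
    ITERATING = "ITERATING"
    SUCCESS = "SUCCESS"
    BUDGET_EXHAUSTED = "BUDGET_EXHAUSTED"
    ABANDONED = "ABANDONED"


def run_loop(
    iterate: Callable[[int], bool],
    max_iterations: int = 50,        # default iteration budget from the text
    max_seconds: float = 14400.0,    # default wall-clock budget from the text
    should_abandon: Callable[[], bool] = lambda: False,  # hypothetical hook
    clock: Callable[[], float] = time.monotonic,
) -> tuple[State, list[State]]:
    """Drive the loop and return (final state, every state visited)."""
    history = [State.SPAWNED]
    start = clock()
    for i in range(max_iterations):
        if clock() - start >= max_seconds:
            break  # wall-clock budget spent
        if should_abandon():
            history.append(State.ABANDONED)
            return State.ABANDONED, history
        history.append(State.ITERATING)  # self-loop: one entry per iteration
        if iterate(i):  # True means external validation passed
            history.append(State.SUCCESS)
            return State.SUCCESS, history
    history.append(State.BUDGET_EXHAUSTED)
    return State.BUDGET_EXHAUSTED, history
```

The loop itself holds no conversation state. Everything an iteration needs would have to come from the filesystem, which is the invariant stated above.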

### P12 Reflexive Trigger Rule (binding on every agent)

P12 is a reflex, not a request. Apply without being prompted:

1. Before any work that may exceed ~1h of unsupervised agent time — write PROMPT.md, call `persist iterate`. Don't try >1h work in-context.
2. When session token usage crosses ~100K — restart, don't continue in the rotted context.
3. When the same fix has been attempted ≥3 times without convergence — stop in-context; spawn fresh persist loop.
4. When orchestrating long-horizon work — default to persist + periodic checkpoints; compose with P5 (one persist loop per worktree) and P7 (each iteration's PR uses `p9 watch`).
5. When the user says "run this in the background for an hour" — that's persist territory.
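Triggers 1 to 3 are threshold checks, so they can be written as a small predicate. The sketch below is an assumption-laden illustration: `SessionSignals` and `persist_triggers` are hypothetical names, and the approximate thresholds from the rules (~1h, ~100K tokens, ≥3 attempts) are treated as hard constants.

```python
from dataclasses import dataclass

HORIZON_HOURS = 1.0     # "~1h of unsupervised agent time"
TOKEN_LIMIT = 100_000   # "~100K" session tokens
MAX_ATTEMPTS = 3        # same fix attempted this many times


@dataclass
class SessionSignals:
    estimated_hours: float
    tokens_used: int
    repeated_fix_attempts: int


def persist_triggers(s: SessionSignals) -> list[str]:
    """Return the reasons, if any, to leave the current context for a persist loop."""
    reasons = []
    if s.estimated_hours > HORIZON_HOURS:
        reasons.append("long-horizon")
    if s.tokens_used >= TOKEN_LIMIT:
        reasons.append("context-rot")
    if s.repeated_fix_attempts >= MAX_ATTEMPTS:
        reasons.append("non-convergence")
    return reasons
```

Any non-empty result means: write PROMPT.md and hand the work to `persist iterate` rather than continuing in-context.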

---

## P13 — Dream Cycle Discipline

**Closes**: the *shadow dream* corruption mode — consolidation runs that gather + consolidate + index without the **replay** phase. Without replay, dense lower-tier signal corrupts sparse upper-tier rules. Pattern documented in `research/entities/concept/multi-tier-dreaming.md` (scored 9/9, promoted 2026-04-30).

**How**: Reasoning-enforced. P13 has no dedicated substrate skill — it composes with primitives that already implement the dream shape:

| Tier crossing | Implementation | Status |
|---|---|---|
| Knowledge graph (raw → promoted entities) | P6 with `bookkeeping replay` | **Reference instance — shipped 2026-05-06** |
| Agent traces → plans (T0→T1) | Life autonomic compression | shipped (eager / shadow form) |
| Trace bundles → prompt/tool diffs (T1→T2) | Life askesis | designed, not yet shipped |
| Diffs → governance amendments (T2→T3) | Life anamnesis | proposed, not yet shipped |

The 5-phase canonical shape:

| Phase | Function |
|---|---|
| **Gather** | Collect a bounded bundle of dense lower-tier signal as a frozen, addressable artifact. |
| **Replay** | Re-execute the bundle against a *frozen substrate* — sandbox, world model, retrieval cache. |
| **Prune** | Reject replayed signal that fails the gate: no improvement, schema violation, regression. |
| **Consolidate** | Commit the kept signal as a sparse, structured update to the upper-tier substrate. Atomic and versioned. |
| **Index** | Re-validate the upper-tier resource graph: reference integrity, contradiction detection, garbage-collect orphans. |

Three independent observations converge on this shape — biological REM sleep, Anthropic's `/dream` skill, Physical Intelligence's knowledge-insulation training (Driess et al. 2025). The replay phase is the runtime form of stop-gradient.

**Invariant**: any agent-driven consolidation that crosses a cadence-tier boundary MUST replay against a frozen substrate before committing. If a consolidation primitive doesn't have a replay phase, it's a *shadow dream* and is unsafe — the agent's job is to either (a) use the dream-cycle form, or (b) explicitly justify why this consolidation is single-tier and doesn't need replay.
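The five phases and the replay invariant can be sketched generically. This is not the `bookkeeping replay` implementation. The function `dream_cycle` and its four callbacks (`replay`, `keep`, `apply`, `index`) are hypothetical, and the substrate is assumed to be a plain dictionary that `copy.deepcopy` can snapshot.

```python
import copy
from typing import Any, Callable, Iterable

Substrate = dict[str, Any]


def dream_cycle(
    signals: Iterable[Any],
    substrate: Substrate,
    replay: Callable[[Any, Substrate], Any],
    keep: Callable[[Any], bool],
    apply: Callable[[Substrate, Any], None],
    index: Callable[[Substrate], list[str]],
) -> tuple[Substrate, list[str]]:
    """Run gather -> replay -> prune -> consolidate -> index.

    Returns (candidate substrate, index problems). The live substrate is
    never mutated; the caller swaps in the candidate only if it is clean.
    """
    bundle = tuple(signals)                      # gather: bounded, frozen bundle
    frozen = copy.deepcopy(substrate)            # snapshot used for replay
    replayed = [replay(item, copy.deepcopy(frozen)) for item in bundle]  # replay
    kept = [result for result in replayed if keep(result)]               # prune
    candidate = copy.deepcopy(substrate)         # consolidate on a copy
    for result in kept:
        apply(candidate, result)
    problems = index(candidate)                  # index: re-validate the result
    return candidate, problems
```

Dropping the `replay` line and applying `bundle` directly would give the shadow-dream form: nothing would test lower-tier signal against the frozen snapshot before it reaches the upper tier.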

**Deferred per rule-of-three**: the `morpheus` crate (shared abstraction across implementations). Extract only when ≥2 dream instances ship end-to-end beyond P6+replay. Currently at 1.

### P13 Reflexive Trigger Rule (binding on every agent)

P13 is a reflex, not a request. Apply without being prompted:

1. Before any consolidation that promotes lower-tier signal to upper-tier rules — verify the consolidation primitive has a replay phase. If not, request the replay-extension before consolidating, or document why this case is single-tier.
2. For knowledge-graph promotion — use `bookkeeping replay` (not `bookkeeping run`) for substantial promotion runs.
3. For governance changes (L3 tier) — every PR in this workspace is a dream cycle: gather (PR description), replay (worktree + CI + doctor), prune (CI failures, doctor gaps), consolidate (squash merge), index (commit history).
4. When designing a NEW consolidation primitive — implement the 5-phase shape from day 1; don't ship shadow-dream form.
5. When you observe a new dream instance shipping — record it in `multi-tier-dreaming.md` (the rule-of-three counter for morpheus extraction).

---

## Cohesion narrative

P11 is the **cohesion glue** for the whole stack:
P11, P12, and P13 are structural siblings at different scales:

| Primitive | Discipline | Surface | Evidence | Scale |
|---|---|---|---|---|
| **P11** Empirical Feedback | "validate by interacting" | live deployed system | screenshots, logs, browser session | in-session (≤1h) |
| **P12** Persistent Loop | "restart fresh when context rots" | filesystem (PROMPT.md + git) | state.jsonl + each iteration's evidence | cross-session (>1h) |
| **P13** Dream Cycle | "consolidate by replaying" | frozen substrate | diff against frozen snapshot | tier-crossing |

The whole stack composes:

- **P4** (PR Pipeline) and **P7** (CI Watcher) catch what CI sees; **P11** catches what CI can't.
- **P10** (Worktree Hygiene) keeps the working tree clean enough for empirical checks to be meaningful.
- **P6** (Bookkeeping) records the validation evidence as durable context.
- **P1** (Conversation Bridge) preserves the dogfood receipt across sessions.
- **P8** (Skill Freshness) ensures the validation tools the agent reaches for (gstack, agent-browser, dogfood, qa) are themselves current.
- **P9** (Janitor) ensures cleanup state is automatic so the next P10/P11 cycle starts from zero.
- **P4** (PR Pipeline) and **P7** (CI Watcher) catch what CI sees; **P11** catches what CI can't; **P13** catches what consolidation without replay can't.
- **P10** (Worktree Hygiene) keeps the working tree clean enough for empirical checks to be meaningful — same shape as P13's "frozen substrate" requirement at the knowledge layer.
- **P6** (Bookkeeping) is the first concrete implementation of P13's discipline — `bookkeeping replay` is the canonical reference dream cycle.
- **P12** (Persist) is the substrate for long-horizon work that needs P11/P13 discipline across many iterations.
- **P1** (Conversation Bridge) preserves dogfood receipts and dream-cycle audit trails across sessions.
- **P8** (Skill Freshness) ensures the validation/replay tools (gstack, bookkeeping, persist) are themselves current.
- **P9** (Janitor) ensures cleanup state is automatic so the next cycle starts from zero.

The eleven primitives compose into the full autonomous development loop:
The thirteen primitives compose into the full autonomous development loop:

```
User intent → Linear ticket (P3) → Agent dispatched (P5)
→ Prior context loaded (P1) [+ P8 freshness check] [+ P10 cleanup audit]
→ Safety gates active (P2)
→ P10 worktree decision → P11 validation plan
→ IF long-horizon → P12 persist loop with PROMPT.md + budget
→ Code written + parallel watchers (P11 log-tails) → PR created (P4)
→ CI watched + heal loop (P7)
→ P11 deploy verification (preview URL, screenshots, browser session)
→ Merge → P10 post-merge cleanup via P9 janitor → Deploy
→ P13 dream cycle for any consolidation (P6 replay first; future Life dreams compose here)
→ P11 dogfood receipt → Session captured (P1) → Knowledge bookkept (P6)
→ System improved (EGRI)
```