🏛️ feat: Lego templates — plugin ships zero global agents by ZaxShen · Pull Request #63 · trustmybot/plugin

ZaxShen · 2026-04-25T01:53:13Z

Summary

Implements #60 — the Lego templates restructure.

Plugin ships ZERO global agents. Bro is the only persona (in CLAUDE.md). Every other agent (swe, pr-reviewer, architect, cto, ceo, pm) is a Lego template that bro copies into <project>/.claude/agents/ on demand, with explicit Human approval. Project customization happens by extending the agent's skills: array via tmb_skill-creator — never by editing the agent body.

Lego rule

Layer	Mutability
Template body	Immutable — bro copies verbatim, never edits
`skills:` array	Additive — extended by `tmb_skill-creator`
Spawn prompt	Ephemeral — fresh per call

What's in this PR

New templates (6 agents, ≤25 lines body each)

templates/agents/ — swe, pr-reviewer, architect, cto, ceo, pm. Each is identity + role + boundary + skills: []. The 30-line Lego cap is enforced by tests/lint/agent-line-budget.sh.

New skill templates (7 default skills)

templates/skills/ — swe-checklist, review-protocol, review-findings, code-quality, docs-conventions, git-conventions, naming-conventions. Bootstrap copies these into the project alongside swe + pr-reviewer.

`tmb_` prefix on all 14 plugin protocol skills

Plus 2 NEW protocol skills:

tmb_bootstrap — first-run template copy (with Human approval)
tmb_skill-creator — generates project skills, extends agent skills: arrays (NEVER edits agent body)

tmb_agent-creator rewritten with template-copy mode (PRIMARY) + from-scratch fallback.

CLAUDE.md updates

Bootstrap-check added to bro's first-action chain
Code-touching chain references tmb_* skill names
Routing table now describes the template-copy-on-demand flow
New 'Templates shipped with the plugin' section enumerates all 6

MCP role enforcement (still in place from #59)

task_create_batch, issue_create, issue_close → bro only
task_update_status → bro + swe
validation_record → pr-reviewer only
discussion_append → bro + swe + pr-reviewer + any consultant

Layer 1 + 2 status: GREEN

Onboarding contract lint: PASS
Lego cap (≤30 lines on templates/agents/): PASS — all ≤25 lines
MCP server unit tests: 0 failures
MCP integration suite: 0 failures
Hook tests: PASS

Test plan

Layer 1 lint
Layer 2 MCP unit + integration + hook tests
Layer 3 dogfood — see tests/manual/scenarios.md. Specifically:
- 1.1 (onboarding) — should NOW also trigger tmb_bootstrap after onboarding completes (project gets swe + pr-reviewer + default skills copied)
- 2.1.bro (canonical simple-task smoke test) — should work end-to-end with the project-local swe + pr-reviewer
- C.1 (consultant invocation) — should now trigger tmb_agent-creator template-copy mode for architect/cto/ceo/pm

Followup

Open question: should agents have a shared skill bag (inheritance) or pure per-agent listing? #61 — open question on agent-shared skill bags
Open question: should plugin templates be split into a separate marketplace channel? #62 — open question on splitting templates into a separate marketplace channel

🤖 Generated with Claude Code

Major restructure to the agent-factory model. The plugin now ships ONLY bro (a persona in CLAUDE.md) and a set of templates + skills. Every other agent — swe, pr-reviewer, architect, cto, ceo, pm — is a Lego template that bro copies into <project>/.claude/agents/ on demand. ### Lego philosophy > Agents are studs. Skills are bricks. Spawn-prompts are assembly > instructions. You don't edit a Lego block — you snap blocks together. Three-layer composition rule, never confused: | Layer | Mutability | |---|---| | Template body (`templates/agents/<name>.md`) | **Immutable** — bro copies verbatim, never edits | | `skills:` frontmatter array on the project copy | **Additive** — extended by `tmb_skill-creator`, never replaced | | Spawn prompt | **Ephemeral** — fresh per Task-tool invocation | ### Templates shipped `templates/agents/` — 6 minimal Lego blocks (each ≤25 lines body, enforced by `tests/lint/agent-line-budget.sh` 30-line cap): - swe.md — executor, atomic close - pr-reviewer.md — pre-commit gate - architect.md — system-design consultant (analysis-only) - cto.md — technical strategy consultant - ceo.md — product scope consultant - pm.md — product strategy consultant `templates/skills/` — 7 default skill bag copied alongside swe + pr-reviewer during bootstrap (swe-checklist, review-protocol, review-findings, code-quality, docs-conventions, git-conventions, naming-conventions). ### Plugin protocol skills — `tmb_` prefix on all 16 To prevent name collisions with project-local skills, every plugin-shipped skill now has the `tmb_` prefix. Project-local skills stay unprefixed and live under `<project>/.claude/skills/`. Renamed: - agent-creator → tmb_agent-creator (template-copy mode + scratch fallback) - architect-workflow → tmb_architect-workflow - branch-id-proposal → tmb_branch-id-proposal - create-hook → tmb_create-hook - feedback-loop → tmb_feedback-loop - first-run-onboarding → tmb_first-run-onboarding - lazy-regen-check → tmb_lazy-regen-check - project-prescan → tmb_project-prescan - refresh-architecture → tmb_refresh-architecture - roundtable → tmb_roundtable - roundtable-cleanup → tmb_roundtable-cleanup - swe-spawn-workflow → tmb_swe-spawn-workflow - tmb-reonboard → tmb_reonboard (hyphen → underscore for consistency) - validate-swe-output → tmb_validate-swe-output Plus two NEW protocol skills: - **tmb_bootstrap** — first-run flow. Bro detects `.claude/agents/swe.md` missing → AskUserQuestion approval → copies templates verbatim into the project. Logs to ledger. Idempotent (skips files that already exist). - **tmb_skill-creator** — generates a new project-local skill and extends the relevant agent's `skills:` array. NEVER edits the agent body — the only allowed mutation to a project agent is appending to its skills list. Hard rule. ### tmb_agent-creator: two modes Rewritten with clear branching: - **Template-copy mode (PRIMARY)** — when the requested name matches a shipped template (`templates/agents/<name>.md` exists), copy verbatim with Human approval. The 4 consultant templates (architect/cto/ceo/pm) always trigger this mode. - **From-scratch mode (FALLBACK)** — when no template matches (e.g. `legal-reviewer`, `security-reviewer`), draft a Lego-shaped prompt with Human approval and write it. Reserved names tightened to `bro`, `swe`, `pr-reviewer` (architect is no longer reserved — it's a common consultant template). ### CLAUDE.md updates - Added bootstrap-check to bro's first-action chain (after onboarding, before resume). - Code-touching chain now references `tmb_*` skill names. - Routing table explicitly says "check .claude/agents/<name>.md, if absent invoke tmb_agent-creator (template-copy if available, scratch otherwise)". - New "Templates shipped with the plugin" section enumerates the 6 templates and their copy triggers. ### Layout ``` plugin/ ├── agents/ ← EMPTY (.gitkeep only; bro is in CLAUDE.md) ├── templates/ │ ├── agents/{swe,pr-reviewer,architect,cto,ceo,pm}.md │ └── skills/{swe-checklist,review-*,code-quality,*-conventions,naming-conventions}/ ├── skills/ ← all `tmb_*` plugin protocol skills └── CLAUDE.md ``` ### Tests - `tests/lint/agent-line-budget.sh` rewritten — checks plugin/agents/ is empty and templates/agents/*.md within the 30-line Lego cap. - `tests/lint/onboarding-skill-contract.sh` paths updated to `tmb_*`. - All MCP integration + unit tests still pass after the renames. ### Layer 1 + 2 status: GREEN ``` Onboarding contract lint: PASS Lego cap on templates/agents: PASS (all ≤25 lines) MCP server unit tests: 0 failures MCP integration suite: 0 failures Hook tests: PASS ``` ### Followup - #61 — open question on agent-shared skill bags (per-agent listing wins for now) - #62 — open question on splitting templates into a separate marketplace channel Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ZaxShen · 2026-04-25T03:11:03Z

Layer 3 dogfood verification — end-to-end PASS

Walked the canonical scenario in fresh Mode B (`/tmp/tmb-smoke`) against this branch. First time the bro→swe→pr-reviewer chain has completed end-to-end with a clean lifecycle.

Trigger

Wiped `/tmp/tmb-smoke`, fresh `git init`, fresh Mode B session, fresh DB. Single ask: `@bro write a cli todo by Python`.

Trajectory log (from MCP DB monitor)

```
19:29:53 identity ZaxShen written
19:29:56 config branching_model="github-flow", pr_target="main"
19:29:56 config protected_branches=["main"] ← raw JSON array, NOT "[\"main\"]" — config_set defense fix verified
19:54:43 issue 1 created (objective: "Build a Python CLI todo app")
19:55:13 discussion 2 (note): "Beginning planning on branch_id feat/cli-todo, triage: simple"
← updated tmb_branch-id-proposal wording (was "Routed to architect" pre-cleanup)
19:55:30 ledger 1 (scope_gate_waived): "simple-triage personal CLI: defaulted to argparse + unittest..."
← simple-fast-lane waiver fired correctly
19:55:31 task 1 (feat/cli-todo, pending) created via task_create_batch
19:55:34 ledger 2 (planning_complete)
──────── 51s planning phase total ────────
19:59:10 ledger 3 (tmb_bootstrap_complete): "Copied 2 agent templates (swe, pr-reviewer) + 7 skill templates"
← bro detected missing .claude/agents/swe.md, invoked tmb_bootstrap, AskUserQuestion approved, copied verbatim
20:03:48 task 1 → completed (SWE atomic close with commit_sha)
← worktree at .claude/worktrees/agent-acb891639c9f96486, committed:
── caeb93d ✨ feat(cli-todo): minimal argparse + JSON-store todo CLI
── files: todo.py, test_todo.py, README.md
20:06:44 validation 1 (task_id=1, attempt_n=1, verdict=pass) ← pr-reviewer signed off
20:06:44 task 1 → closed (bro flipped status atomically)
20:08:11 issue 1 → closed (bro called issue_close after all tasks closed)
```

Design checkpoints — all green

Checkpoint	Status
Onboarding on fresh DB (identity + 3 config writes)	✓
`protected_branches` stored as raw `["main"]` array (not double-encoded `"[\"main\"]"` — config_set defense)	✓
Updated routing-note wording ("Beginning planning on..." not "Routed to architect on...")	✓
Simple-fast-lane: no env probe, no ceremonial Q+A, defaults table picked	✓
Scope-gate waiver with valid reason logged to ledger	✓
`tmb_bootstrap` auto-invoked on missing `.claude/agents/swe.md`	✓
Template copy verbatim (2 agents + 7 skills, no body edits)	✓
Bro→swe spawn finds project-local agent	✓
Worktree isolation + atomic commit + status='completed'	✓
Bro→pr-reviewer spawn + `validation_record(verdict='pass')`	✓
Bro flips task → `closed` after pass verdict	✓
Bro flips issue → `closed` after all tasks closed	✓
Zero `forbidden` events (MCP `requireRoles` enforcement clean)	✓
Every state transition logged to ledger (replayable)	✓

Lifecycle final state

```
issue.status: open → closed
task.status: pending → completed → closed
verdict: pass (pr-reviewer)
ledger events: scope_gate_waived → planning_complete → tmb_bootstrap_complete
```

Every state machine ended in a terminal position. No orphan `running` tasks. No missing verdicts. No abandoned ledger entries.

Comparison to baselines

Run	Total time	Outcome
Original `bro→architect→swe` chain (pre-#58)	7+ minutes	Hung at SWE spawn, never completed, Ctrl-C didn't interrupt
Pure Claude Code on the same ask (no plugin)	~32s	Code generated, no audit, no governance
This run (this PR's Lego architecture)	~12min total (one-time bootstrap + chain), `~8.5min` projected for subsequent tasks in same project	Closed cleanly with full audit + role enforcement

The runtime is meaningfully heavier than pure Claude — multi-agent + audit trail + gates aren't free — but the chain now actually completes, every transition is replayable from SQLite, and consultants can never write workflow state by construction. Subsequent tasks in the same project skip bootstrap entirely.

What this verifies for #60

✅ Plugin ships zero global agents (`agents/` is empty)
✅ `tmb_bootstrap` skill copies templates verbatim, no body edits
✅ All `tmb_*` skill renames work — no name collisions, all references resolve
✅ Bro's first-action chain correctly invokes `tmb_bootstrap` when `.claude/agents/swe.md` absent
✅ Lego rule enforced in practice: agent files copied verbatim from templates, no edits
✅ Layer 3 dogfood confirms what Layers 1 + 2 asserted statically

Open follow-ups (not blockers for this PR)

The 3.5min interactive bootstrap pause is wall-clock from the AskUserQuestion approval round-trip; that's user-bound, not chain-bound.
pr-reviewer ran in ~3min including bro's wrapper time; could potentially be tightened in a future iteration but doesn't block.
No `forbidden` events surfaced because the chain stayed in role; the negative-path tests in role-matrix.test.mjs cover the rejection case at Layer 2.

Recommend merging. Lego templates work end-to-end in dogfood.

…vial edits Implements the Round 1 latency-reduction plan from #64's DISCUSSION.md. Per Zax's review of the proposals (DISCUSSION.md in TMB workspace): ### Doctrine changes **Bro is the task gate.** Bro auto-flips a task to `status='closed'` as soon as SWE returns with `status='completed'` and a `commit_sha`. No pr-reviewer at task close. **pr-reviewer is the push gate.** It fires only at `git push` time over the batch of unsigned commits about to ship. Triggered by the `git-push-guard.sh` PreToolUse hook + the Human invoking the magic phrase `@bro review before push`. Cost amortized across multiple tasks per push instead of paid per task. **Direct Mode for trivial single-file changes.** Bro is now allowed to Edit a file directly when ALL of: single file, ≤3 lines diff, no public API change, no test required, no docs/trustmybot/architecture/ touched. Logged as `direct_mode_used` in the ledger. Anything bigger falls back to the default chain. ### File changes - **CLAUDE.md** — rewrote `## Role`, `## First-action chain`, `## Code-touching asks`, `## Routing`. Dropped the bootstrap-check step (onboarding now handles template copy inline). Added `## Direct Mode` and `## Push gate` sections. Catchphrase now bound to the push gate's pass verdict. - **skills/tmb_architect-workflow/SKILL.md** — rewrote the main sequence: bro flips task → closed immediately after SWE returns. No pr-reviewer step. Added explicit instruction: emit `task_create_batch` + `Task(swe)` + `ledger_log(planning_complete)` as multiple tool_use blocks in one assistant response so CC executes them concurrently. Same change to the simple-fast-lane handoff step. - **skills/tmb_first-run-onboarding/SKILL.md** — added Step 5: silently copy templates/agents/{swe,pr-reviewer}.md + templates/skills/* into the project as part of onboarding's mandatory write sequence. No extra AskUserQuestion. Logs the same `tmb_bootstrap_complete` event for audit consistency. Frontmatter `allowed-tools` extended with `Read, Write, ledger_log`. - **skills/tmb_bootstrap/SKILL.md** — relegated to RECOVERY skill. Triggers only on the rare edge case where identity is set but `.claude/agents/swe.md` is missing (e.g. user hand-deleted between sessions). Onboarding handles the normal path now. - **templates/agents/swe.md** — dropped the chain-of-thought block. Removed `swe-checklist` from eager `skills:` (now lazy-load via Skill tool only when verification needs interpretation). Instructed to batch `task_get` + `git worktree add` in the first response so they run concurrently. Net 22 → 21 lines. - **templates/agents/pr-reviewer.md** — reframed as the push gate. Body explicitly notes it fires over a batch of unsigned tasks, not per individual task close. Skills array empty (project attaches review skills via tmb_skill-creator). - **scripts/hooks/git-push-guard.sh (NEW)** — replaces `require-review-sign.sh`. PreToolUse on Bash matching `git push`. Uses `git log @{u}..HEAD` to determine commits being pushed; queries the trajectory DB for tasks with matching commit_sha that lack a `validation_attempts.verdict='pass'` row. Blocks with a structured message naming the specific unsigned tasks and instructing `@bro review before push`. - **scripts/hooks/lib/query-task.sh** — `tmb_unsigned_tasks` updated: now queries `status='closed' AND commit_sha IS NOT NULL` (was `status='completed'`). Reflects the new doctrine where bro auto- closes immediately and pr-reviewer fires later. - **scripts/hooks/require-review-sign.sh + test** — DELETED, replaced by git-push-guard.sh. - **hooks/hooks.json** — swapped `require-review-sign.sh` → `git-push-guard.sh` in the PreToolUse Bash matcher. - **docs/PERFORMANCE.md (NEW)** — records target, baseline (#63 12-min run), Tier 1/2/3 doctrine of what's safe to trim, and the changes shipped here. Re-evaluation triggers spelled out. - **docs/architecture/FLOWS.md** — Flow 2 (simple task) sequence diagram redrawn for the new chain (parallel batched writes, bro closes inline, no pr-reviewer step). Flow 6 (PR review) reframed as the push gate flow with parallel pr-reviewer spawns over multiple unsigned tasks. ### Layer 1 + 2 status ``` Onboarding contract lint: PASS Lego cap (≤30 lines templates/): PASS (max 26) MCP server unit + integration: PASS Hook tests (git-guards, task-spec): PASS ``` The deleted require-review-sign.sh test file went away cleanly; git- push-guard.sh has no dedicated test yet (Layer 3 dogfood will exercise it via real `git push`). ### Expected impact (estimate, to be verified in Layer 3) - Per-task time: ~12 min → ~3–5 min on simple-triage path - Direct Mode trivial edits: ~10–20s, near pure-Claude - Per-push pr-reviewer pass: ~2–3 min flat regardless of unsigned-task count (parallel spawns) Layer 3 verification next. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mode A retest in /tmp/tmb-scratch landed the file copies correctly but left no ledger row for tmb_bootstrap_complete. The chain still worked end-to-end (~2:16 wall-clock, 5× faster than the #63 baseline) but the audit trail lost the "onboarding ran here" anchor that downstream skills + tests rely on. Root cause: in the prior version of Step 5, the ledger_log call was described after the user-visible status message, framed as "Then log the copy event:" — bro's LLM treated that as a courtesy step instead of mandatory and skipped it once the human-facing line was emitted. ### Changes - **skills/tmb_first-run-onboarding** — Step 5 split into 5a (file copy), 5b (REQUIRED ledger_log + ledger_list verify, framed as "not optional" with a dedicated header), 5c (status message). Mandatory MCP write sequence at the top of the skill renumbered to include all 6 items (was "ALL FOUR" — stale after Step 5 was added). Closing message now explicitly waits for ledger_list to confirm the audit row landed. Frontmatter allowed-tools extended with ledger_list. - **tests/lint/onboarding-skill-contract.sh** — added 4 new assertions to catch this regression class: - skill must reference `tmb_bootstrap_complete` - skill must reference `ledger_log` (the write tool) - skill must reference `ledger_list` (the verify tool) - skill must contain the literal "not optional" so the LLM can't read the ledger step as courtesy ### Layer 1 + 2 status ``` Onboarding contract lint: PASS (4 new assertions added) Lego cap (≤30 lines templates/): PASS MCP server unit + integration: PASS Hook tests: PASS ``` Next Mode A run should produce a `tmb_bootstrap_complete` ledger row within onboarding, before the closing message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PERFORMANCE.md was 2/3 PR-cycle ephemera (PR #63 baseline timeline, "changes shipped in #64" table, "verification pending" TODO) and 1/3 evergreen doctrine (target latency band, Tier 1/2/3 trim guidance, re-eval triggers). The ephemera was rotting on every perf cycle. Moved the evergreen 1/3 into CONTRIBUTING.md § Performance. Deleted the doc. Filed #75 as a standing re-baseline issue — opens when a re-eval trigger fires, closes when the timing posts back to green. Updated references: - README.md → links to CONTRIBUTING.md#performance - CLAUDE.md → same - FILES.md → drops the doc from the file map - CHANGELOG.md → notes the relocation in v0.1.2 and updates the v0.1.0 historical references so they're not broken links Doctrine durable in CONTRIBUTING.md. Measurements ephemeral in issues + git. No more standing doc that grows stale. Closes the perf-doc-staleness side of the v0.1.2 cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ZaxShen self-assigned this Apr 25, 2026

ZaxShen merged commit 6715f89 into dev Apr 25, 2026
1 check passed

ZaxShen deleted the feat/lego-templates branch April 25, 2026 03:14

ZaxShen mentioned this pull request Apr 25, 2026

🚧 chore: latency reduction — proposals on review #64

Merged

This was referenced Apr 25, 2026

Move swe + pr-reviewer to templates/ — plugin ships nothing under agents/ #60

Closed

Keep agent prompts ≤ 200 lines: extract rarely-triggered flows into skills #31

Closed

Re-baseline Layer 3 dogfood after major chain changes (perf re-eval) #75

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🏛️ feat: Lego templates — plugin ships zero global agents#63

🏛️ feat: Lego templates — plugin ships zero global agents#63
ZaxShen merged 1 commit into
devfrom
feat/lego-templates

ZaxShen commented Apr 25, 2026

Uh oh!

ZaxShen commented Apr 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZaxShen commented Apr 25, 2026

Summary

Lego rule

What's in this PR

New templates (6 agents, ≤25 lines body each)

New skill templates (7 default skills)

tmb_ prefix on all 14 plugin protocol skills

CLAUDE.md updates

MCP role enforcement (still in place from #59)

Layer 1 + 2 status: GREEN

Test plan

Followup

Uh oh!

ZaxShen commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Layer 3 dogfood verification — end-to-end PASS

Trigger

Trajectory log (from MCP DB monitor)

Design checkpoints — all green

Lifecycle final state

Comparison to baselines

What this verifies for #60

Open follow-ups (not blockers for this PR)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`tmb_` prefix on all 14 plugin protocol skills

ZaxShen commented Apr 25, 2026 •

edited

Loading