Skip to content

🚧 chore: latency reduction β€” proposals on review#64

Merged
ZaxShen merged 8 commits into
devfrom
chore/latency-reduction
Apr 25, 2026
Merged

🚧 chore: latency reduction β€” proposals on review#64
ZaxShen merged 8 commits into
devfrom
chore/latency-reduction

Conversation

@ZaxShen
Copy link
Copy Markdown
Contributor

@ZaxShen ZaxShen commented Apr 25, 2026

STATUS: DRAFT β€” proposals on Human review, no code shipped yet.

Follow-up to #63 (verified Layer 3 dogfood at ~12min wall-clock for a CLI todo, vs ~32s for pure Claude). This PR is the placeholder for the latency-reduction work; actual changes land after Zax approves the proposals in TMB/bro/DISCUSSION.md (workspace-local β€” visible here only as the embedded summary below).

Latency breakdown (from #63 Layer 3 verified run)

19:54:43  issue created
19:55:34  planning_complete           Ξ” 51s          (simple-fast-lane)
19:59:10  tmb_bootstrap_complete      Ξ” 3min 36s     (interactive β€” one-time per project)
20:03:48  task β†’ completed            Ξ” 4min 38s     (sonnet SWE)
20:06:44  validation pass + closed    Ξ” 2min 56s     (opus pr-reviewer)
20:08:11  issue β†’ closed              Ξ” 1min 27s

Five candidate optimizations (ranked by impact-to-risk ratio)

# Change Estimated impact Risk class
1 pr-reviewer: opus β†’ sonnet -1 to -2 min safe
2 Skip pr-reviewer on simple triage -3 min doctrine change
3 Inline bootstrap into onboarding -30s to -1min safe (UX trade-off)
4 SWE CoT trim + lazy skill load + worktree parallelize -1 to -2 min safe (quality-impact uncertain β€” needs trial)
5 Direct-mode bypass for trivial single-file changes -5+ min major doctrine change

Proposed ship plan

What's on Zax's plate before this PR ships code

  1. Approve / reject each of ✨ v0.2.0 Phase 2 β€” SQLite MCP trajectory serverΒ #1β€“βœ¨ v0.2.0 Phase 4 β€” native hook wiring + trajectory event monitorΒ #4 for Round 1.
  2. Decide on πŸ› Phase 2.5 β€” fix 3 MCP correctness bugs from PR #1 reviewΒ #2 trust model (skip pr-reviewer on simple β€” yes/no).
  3. Decide on ✨ v0.2.0 Phase 4 β€” native hook wiring + trajectory event monitorΒ #4 implementation (commit to CoT trim, or ship behind a frontmatter feature flag for per-project A/B).

Full reasoning + arguments for/against per proposal: TMB/bro/DISCUSSION.md at workspace level.

πŸ€– Generated with Claude Code

ZaxShen and others added 2 commits April 24, 2026 20:19
Branch awaits Human approval of the latency-reduction proposals.
See TMB/bro/DISCUSSION.md (workspace-local) for the full breakdown:

  - Latency analysis from the verified Layer 3 Mode B run that closed in #63
  - 5 candidate optimizations ranked by impact-to-risk ratio
  - Recommended ship plan (Rounds 1–3)
  - Three explicit decisions requested from Zax before code lands

This commit creates the branch + PR shell. Real changes ship after
review approval β€” no template/skill/CLAUDE.md edits in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…vial edits

Implements the Round 1 latency-reduction plan from #64's DISCUSSION.md.
Per Zax's review of the proposals (DISCUSSION.md in TMB workspace):

### Doctrine changes

**Bro is the task gate.** Bro auto-flips a task to `status='closed'` as
soon as SWE returns with `status='completed'` and a `commit_sha`. No
pr-reviewer at task close.

**pr-reviewer is the push gate.** It fires only at `git push` time over
the batch of unsigned commits about to ship. Triggered by the
`git-push-guard.sh` PreToolUse hook + the Human invoking the magic
phrase `@bro review before push`. Cost amortized across multiple tasks
per push instead of paid per task.

**Direct Mode for trivial single-file changes.** Bro is now allowed to
Edit a file directly when ALL of: single file, ≀3 lines diff, no public
API change, no test required, no docs/trustmybot/architecture/ touched.
Logged as `direct_mode_used` in the ledger. Anything bigger falls back
to the default chain.

### File changes

- **CLAUDE.md** β€” rewrote `## Role`, `## First-action chain`,
  `## Code-touching asks`, `## Routing`. Dropped the bootstrap-check
  step (onboarding now handles template copy inline). Added `## Direct
  Mode` and `## Push gate` sections. Catchphrase now bound to the push
  gate's pass verdict.

- **skills/tmb_architect-workflow/SKILL.md** β€” rewrote the main
  sequence: bro flips task β†’ closed immediately after SWE returns.
  No pr-reviewer step. Added explicit instruction: emit
  `task_create_batch` + `Task(swe)` + `ledger_log(planning_complete)`
  as multiple tool_use blocks in one assistant response so CC executes
  them concurrently. Same change to the simple-fast-lane handoff step.

- **skills/tmb_first-run-onboarding/SKILL.md** β€” added Step 5: silently
  copy templates/agents/{swe,pr-reviewer}.md + templates/skills/* into
  the project as part of onboarding's mandatory write sequence. No
  extra AskUserQuestion. Logs the same `tmb_bootstrap_complete` event
  for audit consistency. Frontmatter `allowed-tools` extended with
  `Read, Write, ledger_log`.

- **skills/tmb_bootstrap/SKILL.md** β€” relegated to RECOVERY skill.
  Triggers only on the rare edge case where identity is set but
  `.claude/agents/swe.md` is missing (e.g. user hand-deleted between
  sessions). Onboarding handles the normal path now.

- **templates/agents/swe.md** β€” dropped the chain-of-thought block.
  Removed `swe-checklist` from eager `skills:` (now lazy-load via Skill
  tool only when verification needs interpretation). Instructed to
  batch `task_get` + `git worktree add` in the first response so they
  run concurrently. Net 22 β†’ 21 lines.

- **templates/agents/pr-reviewer.md** β€” reframed as the push gate.
  Body explicitly notes it fires over a batch of unsigned tasks, not
  per individual task close. Skills array empty (project attaches
  review skills via tmb_skill-creator).

- **scripts/hooks/git-push-guard.sh (NEW)** β€” replaces
  `require-review-sign.sh`. PreToolUse on Bash matching `git push`.
  Uses `git log @{u}..HEAD` to determine commits being pushed; queries
  the trajectory DB for tasks with matching commit_sha that lack a
  `validation_attempts.verdict='pass'` row. Blocks with a structured
  message naming the specific unsigned tasks and instructing
  `@bro review before push`.

- **scripts/hooks/lib/query-task.sh** β€” `tmb_unsigned_tasks` updated:
  now queries `status='closed' AND commit_sha IS NOT NULL` (was
  `status='completed'`). Reflects the new doctrine where bro auto-
  closes immediately and pr-reviewer fires later.

- **scripts/hooks/require-review-sign.sh + test** β€” DELETED, replaced
  by git-push-guard.sh.

- **hooks/hooks.json** β€” swapped `require-review-sign.sh` β†’
  `git-push-guard.sh` in the PreToolUse Bash matcher.

- **docs/PERFORMANCE.md (NEW)** β€” records target, baseline (#63 12-min
  run), Tier 1/2/3 doctrine of what's safe to trim, and the changes
  shipped here. Re-evaluation triggers spelled out.

- **docs/architecture/FLOWS.md** β€” Flow 2 (simple task) sequence
  diagram redrawn for the new chain (parallel batched writes, bro
  closes inline, no pr-reviewer step). Flow 6 (PR review) reframed as
  the push gate flow with parallel pr-reviewer spawns over multiple
  unsigned tasks.

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS
Lego cap (≀30 lines templates/):    PASS (max 26)
MCP server unit + integration:      PASS
Hook tests (git-guards, task-spec): PASS
```

The deleted require-review-sign.sh test file went away cleanly; git-
push-guard.sh has no dedicated test yet (Layer 3 dogfood will exercise
it via real `git push`).

### Expected impact (estimate, to be verified in Layer 3)

- Per-task time: ~12 min β†’ ~3–5 min on simple-triage path
- Direct Mode trivial edits: ~10–20s, near pure-Claude
- Per-push pr-reviewer pass: ~2–3 min flat regardless of unsigned-task count (parallel spawns)

Layer 3 verification next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ZaxShen ZaxShen marked this pull request as ready for review April 25, 2026 03:44
@ZaxShen
Copy link
Copy Markdown
Contributor Author

ZaxShen commented Apr 25, 2026

Round 1 implementation pushed (`b2bc2ac`)

Per Zax's review of `TMB/bro/DISCUSSION.md` (workspace-local), the Round 1 changes are in. PR is now ready for review.

What changed (vs the original 5 proposals)

# Original Final design (per review)
1 pr-reviewer opus β†’ sonnet Reframed: kept opus, but pr-reviewer is now the push gate (fires only at `git push` time over batch of unsigned commits), not the per-task gate
2 Skip pr-reviewer on simple triage Subsumed by #1's reframe: pr-reviewer skipped for ALL per-task closures regardless of triage; bro auto-closes
3 Inline bootstrap (one extra question) Tightened: zero extra questions. Onboarding silently copies templates as part of its existing approval scope
4 SWE CoT trim + lazy skills + worktree parallelize All three shipped as agreed
5 Direct-mode bypass for trivial edits Shipped with narrow scope: single file, ≀3 lines, no API/test/architecture touch
Q2 Lighter parallelization Shipped: planner emits `task_create_batch` + `Task(swe)` + `ledger_log` as multiple tool_use blocks in one response

File changes

  • CLAUDE.md β€” full rewrite of Role / First-action / Code-touching / Routing sections. Added `## Direct Mode` and `## Push gate` sections.
  • skills/tmb_architect-workflow β€” main sequence drops pr-reviewer step; bro closes immediately. Added explicit batched-handoff instruction.
  • skills/tmb_first-run-onboarding β€” Step 5 (NEW): silent template copy as part of onboarding's mandatory write sequence.
  • skills/tmb_bootstrap β€” relegated to recovery skill (project missing agents AFTER onboarding succeeded).
  • templates/agents/swe.md β€” dropped CoT block, removed swe-checklist from eager skills, batched task_get + worktree.
  • templates/agents/pr-reviewer.md β€” reframed as push gate; processes batch of unsigned tasks.
  • scripts/hooks/git-push-guard.sh (NEW) β€” replaces require-review-sign.sh. Scopes to actual push range via `git log @{u}..HEAD`.
  • scripts/hooks/lib/query-task.sh β€” `tmb_unsigned_tasks` now matches `status='closed' AND commit_sha IS NOT NULL`.
  • scripts/hooks/require-review-sign.sh + test β€” DELETED.
  • hooks/hooks.json β€” swapped to git-push-guard.sh.
  • docs/PERFORMANCE.md (NEW) β€” target, baseline, doctrine of what's safe to trim, change log.
  • docs/architecture/FLOWS.md β€” Flow 2 (simple task) and Flow 6 (PR review = push gate) sequence diagrams redrawn.

Layer 1 + 2 status: GREEN

Onboarding contract lint:           PASS
Lego cap (≀30 lines templates/):    PASS (max 26)
MCP server unit + integration:      PASS
Hook tests (git-guards, task-spec): PASS

`require-review-sign.sh` test file was deleted alongside the hook. `git-push-guard.sh` has no dedicated unit test yet β€” its behavior is exercised by Layer 3 dogfood (next step).

Expected impact

Path Before (#63 baseline) After (estimate) Notes
Simple-triage chain (e.g. CLI todo) ~12 min total ~3–5 min Bootstrap folded into onboarding (-3.5 min), pr-reviewer not at task close (-3 min), SWE CoT trim (-1 min), batched MCP writes (-10s)
Trivial single-file edit (typo) ~12 min (full chain) ~10–20s (Direct Mode) Approaches pure Claude
Push gate after N tasks n/a (per-task) ~2–3 min flat Amortized across all unsigned tasks in the push, parallel pr-reviewer spawns
Recovery from rm-rf .claude/agents/ n/a tmb_bootstrap covers it Was the normal case in #63; now reserved for true recovery

Layer 3 verification plan

Re-run the same `@bro write a cli todo by python` scenario from the #63 baseline in a fresh Mode B session. Compare line-by-line to the #63 timeline (`docs/PERFORMANCE.md` records the methodology).

Specific scenarios to walk:

  1. Default chain β€” fresh project, simple-triage CLI todo. Target ≀5 min total (≀3 min on the planner+SWE phase).
  2. Direct Mode β€” separately, fire `@bro fix the typo "recieve" β†’ "receive" in README.md`. Target ≀30s, no SWE spawn, ledger event `direct_mode_used` recorded.
  3. Push gate β€” after a few tasks complete in the same project, run `git push`. Verify the hook blocks. Run `@bro review before push`. Verify pr-reviewer fires for all unsigned tasks (in parallel where possible). Verify push unblocks after all-pass.

πŸ€– Generated with Claude Code

…cascade (closes #66)

Observed in PR #64 Layer 3 dogfood on a fresh empty repo: bro fired 4–5
parallel Bash exploratory calls during tmb_project-prescan, one of them
(`git log` on an empty repo, exit 128) failed, and CC's parallel runtime
cancelled the entire batch. Bro retried serially; the second batch
contained another fragile call which also got cancelled. Net cost:
~9 wasted tool-uses + meaningful subagent context burn.

The bug is prompt design, not platform β€” CC's cancel-on-sibling-failure
behavior is a fact of life. The fix is to never batch fragile-by-design
calls with healthy ones in the same response.

### tmb_project-prescan rewrite

- Hard-rule section added at the top of the procedure: explains the
  cancel-cascade pathology and lists the two safe patterns (probe-first
  vs `|| true` defang).
- Procedure split into three explicit phases:
  - Phase 1 β€” probe state serially (git rev-parse HEAD, dir existence
    checks). Cheap, ≀2s, always single-call.
  - Phase 2 β€” batched read fan-out, gated on probe results. Fragile
    calls (git log, ls of missing dirs) only fire when the probe says
    they'll succeed.
  - Phase 3 β€” MCP queries (always safe to batch β€” they return null on
    miss instead of exiting non-zero).
- Glob calls remain in their own block (Glob never errors).

### CLAUDE.md global rule

Added a "Parallel-batching safety" section under Code-touching asks. Lists
the four most common fragile commands (`git log`, `ls <dir>`,
`find <path>`, `git diff @{u}..`) and the two safe patterns. Notes that
Glob, Grep, and MCP tool calls are safe to batch (they don't fail with
non-zero exits). This guidance applies to bro and any subagent firing
exploratory Bash batches.

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS
Lego cap (≀30 lines templates/):    PASS
MCP server unit + integration:      PASS
Hook tests:                         PASS
```

### Expected impact

Fresh-project prescan: 0 batch cancellations (down from 2). Bro completes
prescan in ~1 batch instead of 2–3 retry rounds. Bonus: less context burn
in subagent budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ZaxShen ZaxShen mentioned this pull request Apr 25, 2026
ZaxShen and others added 4 commits April 24, 2026 21:11
Observed in PR #64 Layer 3 dogfood: onboarding's Step 5 copied BOTH swe.md
AND pr-reviewer.md eagerly into .claude/agents/. That contradicts the
"pr-reviewer is the push gate, not the task gate" doctrine β€” pr-reviewer
shouldn't appear in the project until the first time it's actually needed
(at push time), not at onboarding.

User reported this as "still automatically creates pr-reviewer" after
quitting Mode B mid-test.

### Mental model alignment

| Agent | First needed at | Copy timing |
|---|---|---|
| swe.md | First task | Onboarding (eager) β€” first task could fire any moment |
| pr-reviewer.md | First `git push` | Lazy β€” first push-gate trigger only |
| consultants (architect, cto, ceo, pm) | First "@bro get the X opinion" | Lazy β€” first request |

### Changes

- **skills/tmb_first-run-onboarding/SKILL.md** β€” Step 5 rewritten:
  - Removed pr-reviewer.md from the eager copy list
  - Removed review-protocol/SKILL.md and review-findings/SKILL.md (those
    are pr-reviewer's skills, copied alongside pr-reviewer when it's
    finally copied)
  - Kept the 5 swe-side skills (swe-checklist, code-quality,
    docs-conventions, git-conventions, naming-conventions)
  - Status message now mentions "pr-reviewer will be set up automatically
    the first time you push" so the user knows what to expect

- **CLAUDE.md** β€” Push gate flow gains an explicit Step 1: lazy-copy
  pr-reviewer.md + review-protocol + review-findings from templates if
  they're missing. No AskUserQuestion (the Human invoked the push-review
  flow β€” that's the consent boundary). Logs as
  `tmb_pr_reviewer_lazy_copied` ledger event.

- **CLAUDE.md** β€” "Templates shipped" table updated: pr-reviewer.md row
  now reads "First time the push gate fires (`@bro review before push`)
  β€” lazy, not at onboarding."

- **CLAUDE.md** β€” "templates/skills/" paragraph clarifies which skills
  are swe-side vs pr-reviewer-side and when each gets copied.

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS
Lego cap (≀30 lines templates/):    PASS
MCP server unit + integration:      PASS
Hook tests:                         PASS
```

### Expected behavior after this fix

Fresh project after onboarding completes:

```
.claude/
β”œβ”€β”€ agents/
β”‚   └── swe.md                          # only swe β€” pr-reviewer NOT copied
└── skills/
    β”œβ”€β”€ code-quality/
    β”œβ”€β”€ docs-conventions/
    β”œβ”€β”€ git-conventions/
    β”œβ”€β”€ naming-conventions/
    └── swe-checklist/
```

pr-reviewer.md + review-protocol + review-findings appear later, only
after the first `@bro review before push` fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mode A retest in /tmp/tmb-scratch landed the file copies correctly but
left no ledger row for tmb_bootstrap_complete. The chain still worked
end-to-end (~2:16 wall-clock, 5Γ— faster than the #63 baseline) but the
audit trail lost the "onboarding ran here" anchor that downstream skills
+ tests rely on.

Root cause: in the prior version of Step 5, the ledger_log call was
described after the user-visible status message, framed as "Then log
the copy event:" β€” bro's LLM treated that as a courtesy step instead of
mandatory and skipped it once the human-facing line was emitted.

### Changes

- **skills/tmb_first-run-onboarding** β€” Step 5 split into 5a (file copy),
  5b (REQUIRED ledger_log + ledger_list verify, framed as "not optional"
  with a dedicated header), 5c (status message). Mandatory MCP write
  sequence at the top of the skill renumbered to include all 6 items
  (was "ALL FOUR" β€” stale after Step 5 was added). Closing message now
  explicitly waits for ledger_list to confirm the audit row landed.
  Frontmatter allowed-tools extended with ledger_list.

- **tests/lint/onboarding-skill-contract.sh** β€” added 4 new assertions
  to catch this regression class:
  - skill must reference `tmb_bootstrap_complete`
  - skill must reference `ledger_log` (the write tool)
  - skill must reference `ledger_list` (the verify tool)
  - skill must contain the literal "not optional" so the LLM can't read
    the ledger step as courtesy

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS (4 new assertions added)
Lego cap (≀30 lines templates/):    PASS
MCP server unit + integration:      PASS
Hook tests:                         PASS
```

Next Mode A run should produce a `tmb_bootstrap_complete` ledger row
within onboarding, before the closing message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…icult; bro verification non-negotiable

Per Zax review of latency proposals (DISCUSSION.md):

### #1 β€” Split tmb_architect-workflow (443 lines) into two narrower skills

Bro on the simple-fast-lane was loading 443 lines of skill prose to use ~80
lines of it (defaults table + waiver guidance + handoff). Most of the file
is difficult-path content (env probe, ambiguous-choices list, RED FLAG
examples, blueprint format) that simple-path bro never uses. Split:

- **skills/tmb_planning-simple** (114 lines): defaults table, scope-gate
  waiver, batched-handoff hard rule, bro verification protocol, escalate
  triggers, trivial template. The whole simple-triage protocol.
- **skills/tmb_planning-difficult** (210 lines): triage confirmation, env
  probe, grounded Q+A rounds, scope-ambiguity HARD RULE + worked examples,
  ADR + decision capture, standard template, bro verification protocol.
- **skills/tmb_architect-workflow** (deleted) β€” the combined file.

Bro loads exactly one of the two based on the triage decision (which it
already makes per `tmb_branch-id-proposal`). Simple-path planner reads
~25% of what it used to.

### #2 β€” Hard-rule prose for batched handoff

Both new planning skills mark the post-triage tool-call batch as a HARD
RULE: emit `task_create_batch` + `Task(swe)` + `discussion_append(routing
note)` + `ledger_log(planning_complete)` as multiple tool_use blocks in
ONE assistant response. Prior runs split these across 3+ messages,
costing ~30s of round-trip latency per task.

### #3 β€” Bro verification protocol (lean but mandatory)

Per Zax's hard constraint ("bro should never skip verification"), both
planning skills include a "Bro verification protocol β€” never skip this"
section that bro runs after SWE returns and BEFORE flipping to closed:

1. **V1 β€” Pull spec + diff**: `task_get` + `git diff <commit_sha>~1..<sha>`
2. **V2 β€” Three checks (all required)**:
   - Files match `## Files`
   - `## Verification` commands re-run and pass
   - Each `## Success Criteria` bullet visibly met in the diff
3. **V3 β€” Decide**: all-pass β†’ `task_update_status(status='closed')` (+ `issue_close` batched in same response if last task). Any-fail β†’ log to discussion, re-spawn SWE with feedback (max 3) or escalate.

This is the **task gate** β€” distinct from pr-reviewer which is the **push
gate** (deeper style/security checks at git push time).

### Other refs swept

- CLAUDE.md routing + first-action chain rewritten to load the right
  planning skill per triage; new explicit "bro verification is
  non-negotiable" section.
- README.md workflow line updated.
- skills/tmb_branch-id-proposal cross-references updated.
- docs/architecture/FILES.md tree refreshed.
- docs/architecture/FLOWS.md tables + Flow 2 sequence diagram updated.
- tests/lint/onboarding-skill-contract.sh β€” old architect-workflow
  assertions split across the two new files; added new assertions for
  the verification step ("Bro verification protocol", "never skip",
  "task gate") on both files.
- tests/mcp-integration/scope-gate.test.mjs comment refresh.

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS (24 assertions across 4 skills)
Lego cap (≀30 lines templates/):    PASS
MCP server unit + integration:      PASS
Hook tests:                         PASS
```

### Expected impact

The 1m 34s "planning gap" measured in the prior Mode A run came from bro
loading + processing 443 lines of architect-workflow before the batched
write. Cutting that to 114 lines for the simple path should drop the gap
by ~30-50s. Combined with the hard-rule batching prose forcing bro to
emit fewer total messages (target: 2 instead of 3), simple-triage
planning should land in ~30-45s of bro work plus SWE/Human time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ll templates

Per Zax's review of d5b762e: the template skills shipped from `templates/skills/`
still carried "Architect plans / Architect spawns pr-reviewer / Architect must
revert" wording from the pre-bro-as-planner era. These templates get copied into
every TMB project during onboarding (per `tmb_first-run-onboarding` Step 5), so
the stale doctrine was leaking into every install.

### Changes per template

- **git-conventions** β€” full rewrite of the gates section. Two real gates now:
  (1) Bro task gate (auto-runs after SWE returns, per planning skills'
  verification protocol), (2) PR-reviewer push gate (fires on `git push` to
  protected branches via `git-push-guard.sh`), (3) Human merge gate (final
  eyeball on GitHub diff). Old "Architect spawns pr-reviewer at gate 1/2"
  wording removed. Source-code authorship check now references `tasks` row +
  `commit_sha` correlation, with a Direct Mode exception that cross-checks the
  `direct_mode_used` ledger event.

- **docs-conventions** β€” frontmatter `agent:` updated (architect β†’ bro).
  Enforcement section: "PR Reviewer flags... The Architect escalates" β†’
  "PR-reviewer flags at push time. Bro escalates during planning."
  Editing-prompts section: "any agent touching prompt files β€” usually
  architect" β†’ "bro (planning + Direct Mode docs) or SWE (when task spec
  names a markdown file)". "route via architect β†’ SWE" β†’ "route via bro β†’
  SWE (the standard task chain)".

- **naming-conventions** β€” frontmatter `agent: swe, architect` β†’ `swe,
  pr-reviewer`. Architect doesn't load this; pr-reviewer does at push gate.

- **code-quality** β€” Description "Used by Architect (design-time)" β†’ "Used
  by bro (design-time)". Per-section "Design-time questions (Architect)"
  β†’ "(bro)" across all 4 categories.

- **review-findings** β€” "Architect β€” consult before writing task files" β†’
  "Bro β€” consult before authoring task_create_batch spec_body".

### Why this matters

Template skills are inert plugin content until onboarding copies them into
`<project>/.claude/skills/`. Every project that runs onboarding inherits
these files verbatim. The Lego rule says we never edit them post-copy, so
ANY stale doctrine in the templates becomes permanent project doctrine
unless the user notices and edits manually.

Catching this before the next Mode B run avoids re-baking the rot.

### Layer 1 + 2 status

```
Onboarding contract lint:           PASS
Lego cap (≀30 lines templates/agents/): PASS
MCP server unit + integration:      PASS
Hook tests:                         PASS
```

No new lint coverage added β€” these are template skills (no automated
contract assertions), but they're verified by Layer 3 (a fresh
onboarding now copies clean skills into the project).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Supersedes the v0.2.0 placeholder tag (to be revoked on main-merge).

### Versions aligned

- `.claude-plugin/plugin.json`: 0.3.2 β†’ **0.1.0**
- `mcp/trajectory-server/package.json`: 0.2.0 β†’ **0.1.0**

### CHANGELOG.md

Replaced the "Unreleased β€” first public release pending" placeholder with
the comprehensive v0.1.0 entry summarizing everything that shipped across
PRs #41 β†’ #64:

- bro-as-persona + planner + task gate doctrine
- pr-reviewer-as-push-gate + git-push-guard.sh hook
- Lego templates (6 agents in templates/agents/, plugin's agents/ empty)
- planning-simple / planning-difficult split (was tmb_architect-workflow)
- bundled SQLite trajectory MCP server (~30 tools)
- requireRoles middleware on every workflow-state mutation
- Direct Mode for trivial single-file edits
- Three-layer test infrastructure (lint / MCP-integration / dogfood)
- Documented latency budget in docs/PERFORMANCE.md
- Documented "known not-done" linking the 4 main backlog items

### Post-merge release plan

After this PR merges to dev + dev β†’ main:

1. `git tag -d v0.2.0`
2. `git push origin :refs/tags/v0.2.0`
3. `gh release delete v0.2.0`
4. `git tag -a v0.1.0 main -m "..."`
5. `git push origin v0.1.0`
6. `gh release create v0.1.0 --notes-file CHANGELOG.md`

Layer 1 + 2 tests still PASS. No code/skill/agent changes β€” only version
bumps + release notes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ZaxShen ZaxShen merged commit 287a773 into dev Apr 25, 2026
1 check passed
ZaxShen added a commit that referenced this pull request Apr 25, 2026
PERFORMANCE.md was 2/3 PR-cycle ephemera (PR #63 baseline timeline,
"changes shipped in #64" table, "verification pending" TODO) and
1/3 evergreen doctrine (target latency band, Tier 1/2/3 trim guidance,
re-eval triggers). The ephemera was rotting on every perf cycle.

Moved the evergreen 1/3 into CONTRIBUTING.md Β§ Performance. Deleted
the doc. Filed #75 as a standing re-baseline issue β€” opens when a
re-eval trigger fires, closes when the timing posts back to green.

Updated references:
- README.md β†’ links to CONTRIBUTING.md#performance
- CLAUDE.md β†’ same
- FILES.md β†’ drops the doc from the file map
- CHANGELOG.md β†’ notes the relocation in v0.1.2 and updates the
  v0.1.0 historical references so they're not broken links

Doctrine durable in CONTRIBUTING.md. Measurements ephemeral in issues +
git. No more standing doc that grows stale.

Closes the perf-doc-staleness side of the v0.1.2 cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant