🎉 v0.2.0 Phase 5 — legacy tree removed, README + CLAUDE.md finalized, install.sh deprecated by ZaxShen · Pull Request #5 · trustmybot/plugin

ZaxShen · 2026-04-22T00:54:08Z

Summary

Phase 5 — the v0.2.0 close-out. Removes the legacy .claude/ tree, demotes install.sh to a deprecation stub, and finalizes the docs to the corrected two-tier roster model.

Commits

c96e6d7 rewrite README + stub install.sh (per task 1540 spec — but README content was rewritten with stale roster, see next commit)
d16bef8 restore README to corrected 2+5+on-demand roster — task 1540's spec was written before #D1 roster correction landed; SWE faithfully followed stale spec; this commit reverts the README portion only, keeps the install.sh stub
f0567c3 delete legacy plugin/.claude/ tree (31 files: 10 agents + 7 rules + 11 skills + 2 skills-gallery + settings.json) and rewrite CLAUDE.md (per task 1535 spec — same stale-roster issue)
dbab7d3 correct CLAUDE.md to two-tier roster: gatekeeper + prompt-engineer global, ceo/cto/architect/swe/pr-reviewer as project placeholders, on-demand via agent-creator. Adds Persistence section.

What this PR ships

✅ Plugin in 100% native Claude Code 2026 layout (.claude-plugin/plugin.json, agents/, templates/agents/, skills/<name>/SKILL.md, hooks/hooks.json, monitors/, mcp/trajectory-server/)
✅ Legacy .claude/ tree fully removed
✅ README and CLAUDE.md aligned to the two-tier roster model
✅ install.sh deprecation stub points users to /plugin marketplace add + /plugin install

Important — symlink consequence

Deleting plugin/.claude/ breaks the TMB/.claude → plugin/.claude symlink that the TMB workspace was using to dogfood. Post-merge action required:

# In a fresh terminal, from TMB workspace:
rm /Users/Zax/Git/GitHub/TMB/.claude
cd /Users/Zax/Git/GitHub/TMB/plugin
# (start a fresh claude session)
# Then in Claude Code:
/plugin marketplace add ./plugin       # local-path install
/plugin install tmb@trustmybot
/reload-plugins

This is the live dogfood install (originally task 1545, deferred to a human-driven step since it requires interactive /plugin install commands).
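The `rm` step above only makes sense while `TMB/.claude` is still a symlink. A minimal sketch of a guard for that step follows. The helper name `remove_if_symlink` is hypothetical and not part of the plugin. It removes the link itself and refuses to touch a real directory.

```shell
# Hypothetical helper (not part of the plugin): remove a path only if it is a symlink.
remove_if_symlink() {
  target=$1
  if [ -L "$target" ]; then
    # rm on a symlink removes the link itself, never the directory it points to.
    rm -- "$target"
    echo "removed symlink: $target"
  elif [ -e "$target" ]; then
    echo "refusing to remove $target: not a symlink" >&2
    return 1
  else
    echo "nothing to remove at $target"
  fi
}
```

With this guard, `remove_if_symlink /Users/Zax/Git/GitHub/TMB/.claude` would stand in for the bare `rm` and would fail loudly if the path had already been replaced by a real directory.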

Deferred

Task 1545 (formal dogfood smoke test) — deferred to manual user action above. Spawning SWE to do it would have required interactive /plugin install, which subagents can't drive.

Test plan

ls plugin/.claude/ returns "No such file" (legacy tree gone)
cat plugin/CLAUDE.md shows two-tier roster (gatekeeper + prompt-engineer global; ceo/cto/architect/swe/pr-reviewer placeholders; on-demand)
bash plugin/install.sh exits 1 with deprecation notice
cat plugin/README.md shows corrected install commands and roster model
Manual: /plugin install tmb@trustmybot from a fresh project + spawn /gatekeeper + complete one trivial workflow loop — please do this in a fresh session post-merge
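The non-interactive items in the test plan could be scripted roughly as below. This is a sketch under assumptions: the function name `check_phase5` is hypothetical, the workspace root is passed as an argument, and the CLAUDE.md check only greps for the two global-tier agent names rather than validating the whole roster. The manual `/plugin install` item is out of scope because it needs an interactive session.

```shell
# Hypothetical checker for the scriptable test-plan items; not part of the PR.
check_phase5() {
  root=${1:-.}
  failures=0

  # 1. The legacy tree must be gone.
  if [ -e "$root/plugin/.claude" ]; then
    echo "FAIL: plugin/.claude still exists"
    failures=$((failures + 1))
  fi

  # 2. CLAUDE.md should mention both global-tier agents.
  for agent in gatekeeper prompt-engineer; do
    if ! grep -q -- "$agent" "$root/plugin/CLAUDE.md" 2>/dev/null; then
      echo "FAIL: CLAUDE.md does not mention $agent"
      failures=$((failures + 1))
    fi
  done

  # 3. install.sh must exit with status 1 (the deprecation stub).
  status=0
  bash "$root/plugin/install.sh" >/dev/null 2>&1 || status=$?
  if [ "$status" -ne 1 ]; then
    echo "FAIL: install.sh exited $status, expected 1"
    failures=$((failures + 1))
  fi

  # Succeed only when every check passed.
  [ "$failures" -eq 0 ]
}
```

Running `check_phase5 /path/to/workspace && echo ok` prints `ok` only when all three checks pass.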

🤖 Generated with Claude Code

README documents /plugin marketplace add + /plugin install as the install path. install.sh reduced to a 5-line deprecation stub that prints the new commands and exits 1. Ready for v0.3 removal.
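The actual contents of install.sh are not shown on this page. As an illustration only, a deprecation stub of the kind described (print the new commands, then fail) might look like the sketch below. The function name and message wording are assumptions, and `<marketplace-source>` is a placeholder rather than a real path.

```shell
# Hypothetical stub body; the wording is illustrative, not the real install.sh.
print_deprecation_notice() {
  {
    echo "install.sh is deprecated and no longer installs anything."
    echo "Inside Claude Code, run:"
    echo "  /plugin marketplace add <marketplace-source>"
    echo "  /plugin install tmb@trustmybot"
  } >&2
  # A non-zero status makes scripted callers notice the deprecation.
  return 1
}
```

A stub script would call `print_deprecation_notice` and then `exit 1`, which matches the "exits 1 with deprecation notice" item in the test plan.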

Task 1540's spec was written before the gatekeeper-roster correction (see bro/PLUGIN_BUGS.md #D1). SWE faithfully followed the stale spec, which advertised a 5-fixed-global roster (secretary/architect/swe/pr-reviewer/prompt-engineer). That regresses the corrected README shipped in commit eb27be4. This commit restores the README to the corrected 2-global (gatekeeper + prompt-engineer) + 5-project-placeholder (ceo/cto/architect/swe/pr-reviewer) + on-demand model, and keeps 1540's install.sh deprecation stub change as-is (that part was correct).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…v0.2 The 10-agent roster + old hook settings are fully superseded by the native plugin layout (plugin/agents/, plugin/skills/, plugin/hooks/). CLAUDE.md now describes the 5 core agents with the agent-creator capability for domain agents.

…gineer global, 5 placeholders seeded per project)

Task 1535's CLAUDE.md rewrite used a stale 5-fixed-global roster (secretary/architect/swe/pr-reviewer/prompt-engineer + secretary hybrid-named "gatekeeper"). Per the corrected model in bro/PLUGIN_BUGS.md #D1:

- Global tier (plugin ships): gatekeeper, prompt-engineer.
- Project tier (seeded per project): ceo, cto, architect, swe, pr-reviewer.
- On-demand: gatekeeper drives agent-creator flow with explicit user approval per new agent.
- pm/gtm/designer NOT in plugin (TMB team internal only).

Adds a Persistence section describing the bundled SQLite trajectory MCP.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…s: [X]'

Node `I` had `skills: [X]` inside an unquoted Mermaid label, which the parser read as nested square brackets and bailed on. The whole flow #5 (Skill Creation) failed to render on GitHub.

- Wrapped every label with literal/risky characters in double quotes (Mermaid spec for safe label content).
- Escaped the inner brackets in node I as HTML entities (for [ and ]) so the YAML frontmatter syntax still reads correctly.
- Also escaped < as an HTML entity in node B's "fires < 20% of sessions?", since an unquoted < can be parsed as opening an HTML tag.

No semantic change: same flow, now actually renderable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nts (#5)

l5_run_claude (flow-helpers.sh) and l5_run_arm (ab-helpers.sh) now invoke `claude --output-format stream-json --include-hook-events --include-partial-messages --verbose -p <prompt>` and pipe the JSONL output to <project>/trajectory.jsonl. The previous text-grep capture path is replaced by structured per-message events:

- assistant.message.usage gives token counts including main-thread (no more #135 lower-bound).
- assistant.message.content[].name (where type == 'tool_use') gives tool calls directly, with no more debug_trajectory dependency.

A slim stderr summary line replaces the per-line text echo (assistant_msgs + duration_ms shown for log triage).

l5_score_cost, l5_score_trajectory_required, l5_score_trajectory_forbidden now read from <project>/trajectory.jsonl via jq instead of querying agent_runs / debug_trajectory.

- cost: sums assistant.message.usage.{input,output}_tokens across the JSONL. Captures BOTH main-thread and subagent tokens, so the #135 caveat from TRU-76's stopgap is resolved.
- trajectory_required + trajectory_forbidden: extract tool_use names from assistant content blocks. No env-coupling on TMB_DEBUG_TRAJECTORY.
- outcome: unchanged (SQL on the DB per spec).

Closes #5 (TRU-82). Supersedes the agent_runs cost source from #4 and the main-thread capture follow-up from #135.
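To make the jq-based scoring concrete, here is a rough sketch of how the cost and trajectory checks described above could be expressed. It is not the repository's scorer code. The function names are hypothetical, and the event shape is an assumption: one JSON object per line, with assistant turns shaped like `{"type":"assistant","message":{"usage":{...},"content":[...]}}`.

```shell
# Hypothetical scorers over a trajectory.jsonl file; the event shape is assumed.

# Sum input + output tokens over all assistant events (missing fields count as 0).
sum_tokens() {
  jq -s '[.[] | select(.type == "assistant")
              | (.message.usage.input_tokens // 0) + (.message.usage.output_tokens // 0)]
         | add // 0' "$1"
}

# Print the name of every tool_use content block, one per line.
tool_names() {
  jq -r 'select(.type == "assistant")
         | .message.content[]?
         | select(.type == "tool_use")
         | .name' "$1"
}

# Succeed only if the named tool appears somewhere in the trajectory.
# Usage: tool_was_used <tool-name> <trajectory-file>
tool_was_used() {
  tool_names "$2" | grep -qx -- "$1"
}
```

A "required" check would assert `tool_was_used <name> <file>` succeeds, and a "forbidden" check would assert that it fails. Both only need the JSONL file and jq.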

🛠️ feat(test): switch L5/A/B capture to stream-json + jq scorers (#5) See merge request trustmybot/plugin!54

…+ retry-on-failed doctrine Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

📝 fix(153 #5-7): planning-difficult Step 0 + prescan greenfield arch + retry doctrine See merge request trustmybot/plugin!104

…ade)

Two new optional scorers any L5 flow can opt into. Pure shell + sqlite3 + git plumbing: no LLM, no new deps. Catches the failure modes today's L5 misses (the empty-table + base-branch-contamination cases Daisy hit on 2026-05).

## outcome-coherence.json

Table-shape assertions. Catches "empty discussions after a planning flow" without each flow author having to spell out the SQL.

```json
{
  "expected_writes": {
    "issues": ">=1",
    "tasks": ">=1",
    "discussions": ">=1",
    "audit": ">=2",
    "tasks WHERE branch_id != 'dev'": ">=1"
  }
}
```

Operators: `>=N`, `<=N`, `=N`, `!=N`, or bare `N` (= exact). Optional `WHERE <clause>` suffix on the key targets specific row shape.

## outcome-git.json

Git-state assertions. Catches "bro committed to dev directly" / "worktree on detached HEAD" / "uncommitted slop in worktree".

```json
{
  "base_branch_unchanged": true,
  "uncommitted_in_worktree": false,
  "worktrees": [
    {
      "path": ".claude/worktrees/<slug>",
      "head_branch": "<task.branch_id>",
      "head_not_branch": ["dev", "main"]
    }
  ]
}
```

`base_branch_unchanged` works by snapshot: `l5_run_claude` writes `.claude/tmb/_l5_pre_run_git.json` with the base SHA before bro fires; the scorer compares post-run. This isolates "bro committed during the run" from setup-time commits the flow's run.sh made beforehand. `<slug>` and `<task.branch_id>` placeholders auto-resolve from the most-recent tasks row.

## Integration

`l5_score_flow` now calls 7 scorers: added `l5_score_coherence` and `l5_score_git` to the existing 5. Both opt-in; missing config files = silently skipped. All 19 existing flows continue to pass without changes.

## Verification

- `tests/dogfood/lib/scorers-test.sh`: 15 unit tests covering pass/fail paths (operator parsing, WHERE clauses, snapshot comparison, worktree HEAD checks). Wired into `tests/run-all.sh` as an L3 check.
- L5 flow 13-bulk-cleanup opted in to both as proof-of-concept; passes end-to-end with all 7 scorers green.

## What's next (per tests/EVALUATION.md)

- MR #2: backfill coherence/git across the other 18 L5 flows
- MR #3: Phase 2 multi-turn driver
- MR #4: L6 layer
- MR #5: retire the headless fast-path (#2867)
- MR #6: Phase 3 LLM-as-judge

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
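The operator grammar listed for outcome-coherence.json (`>=N`, `<=N`, `=N`, `!=N`, or bare `N`) can be evaluated in plain shell. The sketch below is an assumption about how such a comparison might be written, not the scorer's actual code. The function name `count_matches` is hypothetical, and it expects the row count to have been fetched already (for example with sqlite3).

```shell
# Hypothetical evaluator for one expected_writes entry.
# Usage: count_matches <actual-count> <spec>
# Returns 0 if the count satisfies the spec, 1 if not, 2 on a malformed spec.
count_matches() {
  actual=$1
  spec=$2
  # Map the operator prefix onto a test(1) comparison; two-character operators first.
  case $spec in
    '>='*) op=-ge; n=${spec#'>='} ;;
    '<='*) op=-le; n=${spec#'<='} ;;
    '!='*) op=-ne; n=${spec#'!='} ;;
    '='*)  op=-eq; n=${spec#'='} ;;
    *)     op=-eq; n=$spec ;;   # bare N means an exact match
  esac
  # Reject anything that is not a plain non-negative integer.
  case $n in
    ''|*[!0-9]*) echo "bad spec: $spec" >&2; return 2 ;;
  esac
  [ "$actual" "$op" "$n" ]
}
```

For example, `count_matches 3 '>=1'` succeeds and `count_matches 0 '>=1'` fails, which is the "empty table after a planning flow" case the coherence scorer is meant to catch.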

ZaxShen and others added 4 commits April 21, 2026 17:51

docs(plugin): rewrite README for native install flow; stub install.sh

c96e6d7

ZaxShen merged commit b883288 into dev Apr 22, 2026

ZaxShen deleted the feature/v0.2-phase5 branch April 22, 2026 06:52

ZaxShen mentioned this pull request Apr 25, 2026

🚧 chore: latency reduction — proposals on review #64

Merged

ZaxShen added a commit that referenced this pull request May 20, 2026

Merge branch 'feat/5-stream-json-switch' into 'dev'

e9df469

ZaxShen added a commit that referenced this pull request May 20, 2026

📝 fix(153 #5-7): planning-difficult Step 0 + prescan greenfield arch …

840903c

ZaxShen added a commit that referenced this pull request May 20, 2026

Merge branch 'fix/153-batch-5-6-7' into 'dev'

be2ad6c

ZaxShen merged 4 commits into dev from feature/v0.2-phase5
