📝 dogfood setup: SCENARIOS.md + bun workspaces + doc fixes (closes #40, #42) by ZaxShen · Pull Request #41 · trustmybot/plugin

ZaxShen · 2026-04-24T03:35:46Z

Summary

One PR covering everything needed to actually run the dogfood flow from a fresh clone:

docs/architecture/SCENARIOS.md — 30+ manual test scenarios mapped to the 9 flows in FLOWS.md. Each scenario has verbatim trigger prompts, prerequisites, expected behavior, and verification SQL. All four roundtable corner cases (explicit/implicit × planners-present/absent) are covered. Closes docs: dogfood test scenarios — trigger prompts mapped to FLOWS.md #40.
Bun workspaces at plugin root — root package.json declares mcp/trajectory-server and monitors as a workspace. One bun install at plugin root now resolves both subpackages and produces a single consolidated bun.lock. Closes Use Bun workspaces at plugin root + gitignore dogfood-generated paths #42.
.gitignore dogfood paths — docs/trustmybot/ excluded so dogfood-from-inside-repo runs never pollute commits.
.github/workflows/test.yml — simplified to three root-level steps (install / build / tests) now that workspaces work at root.
docs/local-testing.md PLUGIN_PATH fix — replaced the copy-paste trap (/absolute/path/to/trustmybot-plugin) with export PLUGIN_PATH="$(pwd)" and a sanity-check echo. Hit during live dogfood: users were launching CC with the literal placeholder and getting zero plugins loaded.
Manifest fixes (plugin.json, marketplace.json, hooks.json, .mcp.json) — every manifest validated cleanly against claude plugin validate; MCP server now actually loads.
Monitor script fix + removal of monitors.json — stops the misleading "Monitor stream ended" noise; issue Reintroduce monitors/ as a proper long-running streaming daemon #44 files the proper streaming rewrite.
Rename gatekeeper → bro — closes the handle half of User entry path to gatekeeper: enforce the routing + shorten the mention handle (@bro) #36; drops the vestigial bro_name schema column since bro is fixed plugin branding, not user-configurable.
Brand voice: "Trust me bro, it works" — catchphrase wired into onboarding bookends + bro's prompt, gated behind real integration-test evidence (per-session discussion).
No-yes-man discipline — bro escalates concerns via architect spawn prompt, never contradicts the Human directly; architect has explicit independent authority to surface concerns via discussion_append.

Why combined

Everything here is "make the dogfood flow runnable end-to-end" — test plan + infra + manifest correctness + agent behavior fixes surfaced by the dogfood itself. Splitting into 10 PRs would just create serial-merge churn with no review benefit.

Test plan

bun install --frozen-lockfile at plugin root resolves both subpackages.
bun run build at plugin root compiles the MCP server successfully.
bash tests/run-all.sh: 235 MCP + 16 hook + 4 agent-budget lint — all green.
claude plugin validate <path> passes for marketplace.json, plugin.json, hooks.json, .mcp.json.
Headless smoke: claude --plugin-dir ... -p "list tmb: agents" returns all 4 agents; plugin:tmb:trajectory-server: connected; config_set returns structured role-enforcement error as designed.
CI workflow matches local (single install at root, no per-package working-directory).
docs/trustmybot/ gitignored — confirmed no dogfood pollution.
grep -ri gatekeeper → 0 hits; grep -ri bro_name → 0 hits.

Follow-up issues surfaced during this PR

User entry path to gatekeeper: enforce the routing + shorten the mention handle (@bro) #36 — handle rename done; routing-enforcement half still open (main Claude may still bypass bro)
Reintroduce monitors/ as a proper long-running streaming daemon #44 — reintroduce monitors as a proper long-running streaming daemon
Codebase memory: lazy bootstrap + per-session verify + task-closure updates #45 — codebase memory workflow (bootstrap + per-session verify + task-closure updates)
Shorten bro's @-mention handle — @"tmb:bro (agent)" is too verbose for per-message invocation #46 — shorten @tmb:bro invocation; test whether persona persists across session
bro first-response latency — 18s with 21.7k token context, 0 MCP tool calls #47 — bro first-response latency 18s/21.7k tokens; likely Opus-on-routing overkill + eager skill load
Onboarding UX: use AskUserQuestion for multi-choice; remove press-enter-for-default pattern #48 — onboarding UX: use AskUserQuestion for multi-choice; drop press-enter-for-default (broken in CC interactive input)
Local testing: auto-save per-session chat logs under .claude/tmb/logs/ #49 — auto-save per-session MCP logs to .claude/tmb/logs/
MCP role enforcement gap: task/validation/issue/ledger/audit/skill tools lack requireRoles wrappers #50 — MCP role-enforcement gap: 27 of 36 tools lack requireRoles (validation_record worst: any caller can fake a pass and bypass push hook)
Bring scenarios.md flows 2–9 up to the comprehensive template #51 — enrich scenarios.md flows 2–9 to the comprehensive template (agent chain, MCP calls, hooks, DB state); Flow 1 already done as reference

Closes #40
Closes #42

…oses #40) For each workflow in FLOWS.md, the verbatim user prompt that should trigger it + the observable expected behavior + the verification query/file-check to confirm it landed. The manual test grid for the plugin until automated agent tests exist. What's covered (~30 scenarios across 9 flows) - Flow 1 (onboarding): 4 — fresh-DB trigger, hold-and-resume of code- touching ask during onboarding, read-only ask interleaved, explicit re-onboard phrase - Flow 2 (simple task): 3 — typo, comment, internal refactor with no API change - Flow 3 (difficult task): 3 — new public API, schema/dependency change, cross-cutting concern - Flow 4 (agent-creator): 4 — request, approve, refuse, reserved-name refusal - Flow 5 (skill creation): 1 — mostly internal architect flow; noted as harder to deterministically trigger - Flow 6 (PR review): 1 — auto-fired in flows 2/3; verification query - Flow 7 (architecture regen): 4 — explicit phrase + 3 lazy variants (>25 commits / ≤25 commits / first-ever session) - Flow 8 (SWE retry/escalation): 2 — single fail + retry, 3 fails + escalation - Flow 9 (roundtable): 4 — the four corners • 9.1 explicit magic word + ≥2 planners (canonical) • 9.2 explicit magic word + only architect (skill escalates, proposes agent-creator) • 9.3 implicit cross-domain question + ≥2 planners (architect detects, invokes roundtable without user request) • 9.4 implicit cross-domain + only architect (architect proposes creating planners first via flow 4) Format Each scenario has: ID, prerequisites, trigger prompt (verbatim), expected behavior (numbered observations), verification (SQL or filesystem check), pass/fail checkbox. Two cross-indices at the top: by trigger style (magic word / implicit cross-domain / on-demand domain agent / etc.) and by flow. Cross-links - FLOWS.md header now references SCENARIOS.md as a companion doc - docs/local-testing.md replaces its 10-row inline checklist with a one-paragraph pointer + a quick smoke-test table; full grid lives in SCENARIOS.md - docs/architecture/FILES.md tree includes SCENARIOS.md - README.md "Architecture reference" lists SCENARIOS.md Closes #40 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

closes #42) Two coupled fixes surfaced during a dogfood attempt that ran bun install at the plugin repo root and failed — there was no root package.json. Workspaces - New plugin/package.json declares `workspaces: ["mcp/trajectory-server", "monitors"]`. One `bun install` at plugin root installs every subpackage; dedups `better-sqlite3` that both subpackages share. - Root `scripts.build` runs `bun --filter='*' run build` — every workspace with a build script rebuilds. Root `scripts.test` runs `bash tests/run-all.sh`. - Subpackage `bun.lock` files deleted (mcp/trajectory-server + monitors); replaced by a single root `bun.lock` (26 KB). Gitignore dogfood-generated paths - Add `docs/trustmybot/` to .gitignore. The plugin's own contributor docs live at `docs/architecture/`, not `docs/trustmybot/`. If a contributor dogfoods inside this repo (instead of a scratch project), architect/SWE would create `docs/trustmybot/snapshots/`, `docs/trustmybot/architecture/auto/`, etc. as if this were a user project — those artifacts should never land in commits. - The plugin-root .gitignore's existing `.claude/` rule already covers the runtime DB + worktrees path. Docs updated - docs/local-testing.md Prereqs section: one-command `bun install` + `bun run build` at the plugin repo root. Two other places that said `cd plugin/mcp/trajectory-server && bun run build` trimmed the same way. CI simplified - .github/workflows/test.yml: three install/build/test steps that each set `working-directory: mcp/trajectory-server` collapsed into a single root-level `bun install --frozen-lockfile`, `bun run build`, and `bash tests/run-all.sh` (picks up MCP + hook + lint suites). Verified - `bun install` at plugin root installs 260 packages, produces one consolidated lockfile. - `bun run build` at plugin root compiles MCP server successfully (monitors has no build; `bun --filter` skips it cleanly). - `bash tests/run-all.sh`: 235 MCP + 16 hook + 4 agent-budget lint, all green. Closes #42 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Users were copy-pasting `/absolute/path/to/trustmybot-plugin` literally and launching Claude Code with a bogus `--plugin-dir`, so no plugin loaded and `@gatekeeper` had nothing to bind to. Replace the placeholder with `$(pwd)` captured from the plugin repo root, with a sanity-check echo and a fallback note for users who run from elsewhere. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…h CC plugin spec `claude plugin validate` surfaced three hard manifest violations that kept --plugin-dir from loading any of our agents: 1. plugin.json - `repository` was an object; spec wants a string URL - `dependencies` had npm-style object form; spec wants an array (removed for now — no installed plugin uses deps yet, and our deps were soft refs) - `provides` was not in the spec at all (agents auto-discover from agents/ dir, same as every other plugin) - Added `homepage` and `keywords` to match the common shape 2. marketplace.json - Missing required `owner: { name }` object - `source` must end in `/` - Top-level `description` moved under `metadata.description` - Removed the plugin entry's `version` (version lives on plugin.json) 3. hooks/hooks.json - Must be wrapped in `{ "hooks": {...} }`; we had the event map at the root - Hook command paths now use `${CLAUDE_PLUGIN_ROOT}/...` for portability (matches superpowers and other first-party plugins) Both `claude plugin validate` runs now pass. Full test suite unchanged: 235 MCP + 16 hook + 4 agent-budget lint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two bugs were blocking the trajectory MCP server from ever loading: 1. Wrong top-level key — we had `"servers"`, Claude Code expects `"mcpServers"` (matches every official first-party plugin .mcp.json). The wrong key was a silent no-op: no error, no prompt, the server just never got spawned, so agents could never call issue_create / config_set / identity_set and no trajectory.db was ever created. Dogfood proved this out — `/mcp` listed zero plugin-owned servers. 2. Relative args path — `mcp/trajectory-server/dist/index.js` resolved against whatever cwd CC happened to be in, which is the scratch project, not the plugin clone. Now uses `${CLAUDE_PLUGIN_ROOT}` (the same substitution hooks.json now uses) so the launcher always finds the built entry regardless of where the user is. Closes the "why doesn't trajectory.db exist" half of the dogfood thread. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

File had a node shebang but 0644 permissions, so CC's monitor subsystem spawned it directly (via `command`) and got exit 126 / permission denied. Every other plugin script (hooks/*.sh) has the executable bit recorded in git; the monitor just slipped through. chmod +x locally and `git update-index --chmod=+x` to bake the executable bit into the tree so a fresh clone works without manual intervention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…n repo User hit this: ran in the plugin clone thinking it was the right place. Harmless (gitignored), but the instruction didn't say which dir to run it in. Now spells it out and adds a reassurance note for the plugin-clone case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…bstitution caveat Mode B referenced $PLUGIN_PATH inside a CC slash-command (/plugin marketplace add $PLUGIN_PATH) but never exported it in its own step list, AND even if exported CC's slash-commands don't expand shell vars — so users would paste the literal string "$PLUGIN_PATH" and get a broken marketplace add. Fix: 1. Add an explicit Step 1 that exports PLUGIN_PATH from the plugin repo (same pattern Mode A uses). 2. Tell the user to substitute the absolute path into the slash-command themselves — CC doesn't do shell interpolation inside /plugin commands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User wants to select-and-paste the whole setup at once instead of hopping between three "Step N" chunks. Keep the two code fences (shell vs. slash-commands) since they run in different shells, but drop the stepped framing. Caveat about $PLUGIN_PATH not expanding in slash-commands is now an inline note between the blocks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…board) User's valid gripe: pasting the whole Mode B block means claude launches and the TUI immediately scrolls the echo output off-screen before you can copy the path into the slash-command. Fix: - Stash the path at /tmp/tmb-plugin-path so it survives the claude launch. - macOS: also pipe into pbcopy so paste (Cmd-V) just works inside CC. - Other platforms: mention `!cat /tmp/tmb-plugin-path` which CC interprets as an inline shell command — surfaces the path without leaving the session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User is tired of the copy-paste trap iterations. The 3-step version is the lowest-cognitive-load option: copy the pwd output once with your eyes, paste it twice (shell cd, CC slash-command). No file stash, no clipboard, no shell-variable caveats inside CC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ritance Mode B referenced $PLUGIN_PATH inside the `/plugin marketplace add` slash command but never told the user to set the variable. Fix by adding the same cd + export steps Mode A has, and a one-liner explaining that CC inherits the shell environment so the slash command resolves $PLUGIN_PATH cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Our monitor script is a one-shot poller: reads new ledger rows, prints any failed/escalated/retry-exhausted events, exits 0. CC's monitor subsystem reports every process exit as "Monitor stream ended" in the chat, which looks like a failure to users doing a dogfood run. Verified during live testing today: the MCP server is ✔ connected and responding normally; the "stream ended" message is pure UX noise from the monitor subsystem. Since no other first-party plugin ships a monitors.json and `"when":"always"` has no confirmed meaning in the current CC monitor API, remove the manifest to stop the noise. Keep the .js script + package.json so we can revive this as a proper long-running streaming daemon later. File an issue to reintroduce with a real streaming implementation when the monitor subsystem's contract is clearer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mode B had one code block with step headers as shell comments, which reads as one monolithic copy-paste. Split into Step 1 (prose) → bash block, Step 2 (prose) → bash block, so each step is visually distinct and independently runnable.

… onboarding bro is a branding choice, not a user setting. Prompting the Human to rename their gatekeeper on every fresh project was both (a) extra ceremony on a moment they just want to get through, and (b) a branding leak — half the projects call the gatekeeper "alex", half call it "bro", and the onboarding stops being a memorable shared UI. Changes: - first-run-onboarding: drop the "what would you like to call me" question. Keep the human_name question only. MCP call is now identity_set(human_name=…); gatekeeper_name falls through to the schema default of 'bro'. - tmb-reonboard: symmetric — no rename prompt, no gatekeeper_name in the current/final summary, no gatekeeper_name in identity_set. - Explicitly forbid inventing an honorific ("master", "boss", "friend") when the Human declines to share a name. Plain second-person is cleanest. - FLOWS.md mermaid: drop the second name exchange from the onboarding sequence diagram. - local-testing.md: clarify that gatekeeper_name stays at its schema default. No MCP surface changes — identity_set still accepts gatekeeper_name if a downstream script wants it; we just don't offer it via the onboarding UX. All 235 MCP + 16 hook + 4 agent-budget lint tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

**Rationale (pre-release, no tech-debt concerns):** The agent was named `gatekeeper` (role) but ran a persona called `bro` (brand). That split caused recurring confusion: - `@gatekeeper` was 11 characters to type vs. the 3-character persona the user actually saw in conversation (issue #36 Concern 2). - Every document had to explain the two-name relationship. - The `identity.gatekeeper_name` column treated bro's name as user-configurable branding, which is wrong: bro IS the brand. It's not a knob. - Onboarding asked the Human "what would you like to call me?" — ceremony that burned a question on a non-choice. Since the plugin is pre-release and has no migration concern, a clean rename is cheaper than dragging two names forever. **Changes:** - Renamed `agents/gatekeeper.md` → `agents/bro.md` (mention handle becomes `@bro`, which closes the handle half of issue #36). - Bulk rename `gatekeeper → bro`, `Gatekeeper → Bro` across all prose, code identifiers, test names, MCP caller_role values, and KNOWN_ROLES. - Dropped the `identity.bro_name` column (was `gatekeeper_name`) from the schema. `identity_get` / `identity_set` / `identity_reset` no longer expose it. Bro is a fixed string in prompts, not a DB field. - Simplified the onboarding welcome message (`Hey, I'm bro — your <role>` had become `your bro` after naive rename, a redundancy; replaced with a clean self-introduction). **Files touched:** 42 including agent prompts, every skill, all docs (CLAUDE.md, README, CONTRIBUTING, ERD, FLOWS, FILES, SCENARIOS, local-testing, CONFIG_KEYS), MCP server source, middleware, and every test file that referenced the old name. **Verification:** - `grep -ri gatekeeper` in source: 0 hits. - `grep -ri bro_name` in source: 0 hits. - 235 MCP + 16 hook + 4 agent-budget lint tests pass. - Headless smoke test: `claude --plugin-dir … -p "list tmb: agents"` returns `tmb:bro`, `tmb:architect`, `tmb:swe`, `tmb:pr-reviewer`. Closes #36 handle concern (renaming part). Enforcement (routing) stays open as a separate concern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- README tagline pulled up as a blockquote right under the H1 so anyone landing on the repo sees it first. - Onboarding welcome opens with the catchphrase; onboarding closing signs off with it — the two natural bookends of the first conversation. - bro's agent prompt gets a short Catchphrase section: deploy at completed hand-offs and confident routing calls; never on failures or clarifying questions. Swagger, not reassurance. Catchphrase placement is deliberately thin — dilution kills a tagline, so it shows up at ~2 predictable beats per session instead of being appended to every response. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

"Trust me bro, it works" is only self-aware humor if it follows real verification. Applied to unverified code it just becomes the exact behavior the meme mocks — an AI agent claiming correctness with no evidence. That undercuts the whole brand bet. New rule: catchphrase is only earned on a code-delivery hand-off when BOTH conditions hold: 1. pr-reviewer recorded validation_record(verdict='pass') for every completed task in the issue. 2. Integration tests (not unit-only, not lint) actually executed against the delivered change and passed. If the project has no integration tests, the slogan is not earned and bro just says "done". Onboarding bookends remain as the only no-evidence use — there it's a meta-tagline referencing the plugin itself, not a claim about shipped code. Removed the earlier "confident routing call" carve-out: routing decisions are too soft to back with the slogan. Swagger only after the tests run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…shback Adds the third pillar of agent-harness design — productive dissent — alongside the two we already implement (responsibility boundaries + requirements alignment). Two failure modes this avoids: - Sycophantic agreement: bro capitulates to anything the Human says, plans proceed with known flaws. - Direct contradiction: bro pushes back personally, triggers the defensiveness people naturally feel about their own plans, conversation stalls. The structural fix is to separate intent-capture from intent-evaluation: **bro.md** — added a fifth role bullet. Bro's job is faithful capture; concerns go into the architect spawn prompt as `concern: <why>`, never argued with the Human directly. Bro is not a debate partner. **architect.md** — sharpened objective #2. Architect is the technical check on the Human's plan, with independent authority. Concerns (bro-forwarded or architect-originated) surface via discussion_append BEFORE writing tasks. Explicitly: never rubber-stamp a plan the architect believes is flawed. The Human now sees two independent reads (theirs + architect's) routed through a structural channel, not a confrontation. Triangulation, not conflict. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nt param Two compounding bugs meant every bro MCP call during dogfood was narrated, never executed — '0 tool uses · 21.7k tokens' across the session: 1. **Subagents had no MCP tools in their registry.** CC plugin agents must explicitly list `mcp__<server>__<tool>` names in `tools:` frontmatter. Wildcards (`mcp__plugin_tmb_trajectory-server__*`) don't propagate to subagents. Tested both: wildcard produces empty tool list, explicit names produce the right registry. 2. **MCP inputSchema didn't declare `agent` parameter.** The server's requireRoles middleware reads `args['agent']` to enforce role-based access. When the LLM sent `{agent: 'bro', ...}`, the schema validator stripped it (wasn't declared) → `caller_role: 'unknown'` → forbidden. Tests passed because they bypass schema by calling handlers directly. Fix: - **agents/bro.md, architect.md, swe.md, pr-reviewer.md**: list all MCP tools each agent is actually allowed to call per server-side role policy, explicitly. Added one-line MCP Caller Identity section to each prompt instructing the LLM to include `agent='<role>'` in every call. - **skills/{first-run-onboarding, tmb-reonboard, lazy-regen-check, project-prescan, branch-id-proposal, refresh-architecture}/SKILL.md**: extended `allowed-tools` with the specific MCP tools each skill needs. - **mcp/trajectory-server/src/tools/index.ts**: new `decorateWithAgent()` helper injects `agent: enum[bro|architect|swe|pr-reviewer]` into every tool's inputSchema at registration time. One place, applies to all 36 tools, no per-tool edits. Verified live: identity_set(agent='bro', human_name='Zax') → {"human_name":"Zax","created_at":"2026-04-24T...","updated_at":"..."} identity_get(agent='bro') → {"human_name":"Zax","created_at":"2026-04-24T...","updated_at":"..."} This unblocks PR #41's end-to-end dogfood: bro can now actually persist onboarding state via MCP instead of narrating the calls as prose. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-RPC protocol User feedback caught the real process failure: I had them dogfooding (Layer 3) without Layer 2 in place, so they burned hours chasing bugs Layer 2 would have found in milliseconds. Adds a proper MCP integration layer: **tests/mcp-integration/harness.mjs** — spawns `node dist/index.js` as a subprocess, speaks the real JSON-RPC stdio protocol via the MCP SDK client, exposes `listTools()` and `call(name, args)` helpers that return structured ok/error/data so tests read like production call sites. **tests/mcp-integration/schema-contract.test.mjs** — verifies every tool's inputSchema declares the `agent` parameter with the four-role enum. This is the test that would have caught the 18s / 0-tool-uses disaster in PR #41: the schema validator was stripping `agent` because no tool declared it, collapsing every call's caller_role to 'unknown'. **tests/mcp-integration/role-matrix.test.mjs** — 10 tests covering every tool that currently wraps its handler with `requireRoles`: - identity_set (bro only) - identity_reset (bro only) - config_set (bro, architect) - file_registry_upsert (architect, bro) - file_registry_delete (architect, bro) - issue_snapshot_md (architect, pr-reviewer) - discussion_append (workflow agents + scope) - architecture_regen (architect, bro, pr-reviewer) - regen_state_set (architect, bro, pr-reviewer) Each test verifies three conditions per tool: missing `agent` → forbidden, wrong `agent` → forbidden, right `agent` → success. **Wiring:** - New suite runs from tests/run-all.sh between Layer 1 (unit) and Hook tests. - `@modelcontextprotocol/sdk` added as root devDependency for the harness. - CI picks it up via the existing `bun run test` entry point. **Tool schema fix surfaced while wiring:** Changed `decorateWithAgent()` property-spread order so the decorator's `agent` definition takes precedence over tool-specific declarations. Several tools (issues.ts, tasks.ts) had their own `agent: { type: 'string' }` without the four-role enum, shadowing the canonical schema. **Gaps surfaced (filed as separate issue):** Only 9 of 36 tools currently wrap handlers with `requireRoles`. Tasks, validations, issues, ledger, audit, and skill tools are all unprotected — any caller including 'unknown' can write through them. The worst offender is `validation_record`, which controls the push-block hook's unlock condition. Scoped to a follow-up issue so this PR doesn't spiral. Full suite now: 235 Layer 1 unit + 10 Layer 2 integration + 16 hook + 4 agent-budget lint, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… press-enter defaults CC interactive input can't send empty input, so every \"press enter to keep default\" in the skills was broken (only plan-mode accepts bare Enter). Text multi-choice also forced digit-matching (\"1\" → github-flow) and made the flow feel 2010. Switched both onboarding skills to use \`AskUserQuestion\` — a proper radio form with per-option descriptions, a pre-selected default, and an auto-added \`Other\` for free text. **first-run-onboarding/SKILL.md** - One batched \`AskUserQuestion\` call with 3 questions (name, branching, PR target) — Human answers all in one form, bro persists all via MCP. - Branching labels lead with structure, brand in parens: - \"Trunk + feature branches (GitHub Flow)\" - \"Trunk + develop + releases (Git Flow)\" - \"Custom workflow\" This solves the \"github-flow vs gitflow — which is which?\" confusion. - Name question offers \"Anonymous\" + pre-populated inferred name (if the Human introduced themselves inline); Other covers everything else. - Custom branching path uses a second batched call for the protected- branches follow-up. **tmb-reonboard/SKILL.md** - Same batched flow. First option in each question is \`Keep \"<current>\"\` — click once to preserve, or pick an alternative. - \"Anonymous\" on the name question maps to \`identity_reset\`. - Dedupe logic for when the current value would duplicate a static option. **agents/bro.md** - Added \`AskUserQuestion\` to the \`tools:\` allowlist. Closes half of #48 (the skills-level rewrite). The naming-clarity half is captured in the new labels. Integration testing of the rendered UI is a human dogfood step (Layer 3) — automated tests can't render radio UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…reviewer happy paths Filled in the real Layer 2 scope. The prior Layer 2 commit only shipped a role-permission matrix (does role X work/fail on tool Y); it didn't match the user's ask of "each agent runs each MCP for their responsibilities". Four new test files, one per agent role, each exercising that role's realistic end-to-end MCP sequence against a real spawned server: **agent-bro-workflow.test.mjs** - First-run onboarding: empty state → identity_set → config_set x3 → verify final state via identity_get + config_list. - Reonboard rename: identity_reset then new identity_set; verify config is untouched. - issue_resume with a known issue_id (session-start resume path). **agent-architect-workflow.test.mjs** - Simple-task flow: issue_create → discussion_append (triage note) → task_create_batch → task_get → validation_history (empty) → issue_close. - Difficult-task flow: issue + 4-entry ADR-style discussion loop (note, question, answer, decision) with issue_get_with_discussions round-trip. - Skill lifecycle: skill_register (curated draft) → skill_promote (draft → pending_review) with explicit from_status. **agent-swe-workflow.test.mjs** - Pickup → running → atomic close: task_get → task_update_status(running) → ledger_log (full required-arg set: issue_id, branch_id, from_node, event_type, summary) → file_registry_upsert (documents current forbidden status per #50) → audit_log → task_update_status(completed) → verify. - validation_history read-access to own task. **agent-pr-reviewer-workflow.test.mjs** - Happy path: task_get (completed) → validation_record(pass) → history reflects 1 row with verdict=pass. - Fail path: validation_record(fail) → architect reads same history. - Retry loop: 3 sequential attempts (fail/fail/pass) → history returns all 3 in ascending attempt order. **Bugs surfaced and fixed while writing the tests:** - `ledger_log` requires (agent, issue_id, from_node, event_type, summary); my first draft only passed event_type+summary+branch_id and got a `Missing required arg: issue_id` — caught immediately by Layer 2. - `issue_resume` requires issue_id (not just agent). - `skill_promote` requires from_status (not just name+to_status). These are the exact shape mismatches that would have manifested as "0 tool uses" in interactive dogfood, burning more of your time. Layer 2 caught them in ms. Coverage: - Schema contract: 1 test (all 36 tools have `agent` in inputSchema). - Role matrix: 9 tests (every currently-protected tool). - Per-agent workflows: 11 tests (4 bro + 3 architect + 2 swe + 3 pr-reviewer — note: 2 for swe because file_registry_upsert is documented-as-forbidden). Full suite: 235 Layer 1 unit + 21 Layer 2 integration + 16 hook + 4 agent-budget lint, all green. Layer 1 decision: the pre-existing `remaining_tools.test.ts` covers audit / validation / skills / reports with 11 cases — enough for adequate. Splitting into dedicated files is cosmetic refactoring, not new coverage, so it's deferred. Real Layer 1 gaps (e.g. issue_snapshot_md which writes to disk) are exercised indirectly by Layer 2's role-matrix test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eam issues are stale User caught me being wrong. Provided a screenshot showing AskUserQuestion rendering live in an interactive session — the "Scope" chip + 6-option radio form + Enter-to-select hint are the AskUserQuestion UI, invoked from a plugin subagent. My mistake chain: 1. Shipped AskUserQuestion without verifying it works in subagents. 2. Hit the text-fallback in live dogfood, concluded the tool was unavailable. 3. Tested via `claude -p` (headless mode) — tool wasn't in the registry. 4. Decided this confirmed the limitation. Shipped revert + filed #52. 5. Searched upstream issues — they said "not-planned" back in CC 2.0.56. 6. Doubled down on the revert, closed #52 as BLOCKED. What I missed: `claude -p` is non-interactive. AskUserQuestion REQUIRES an interactive UI to render. `-p` stripping the tool wasn't a subagent limitation — it was a UI-mode restriction. And the upstream "not-planned" state may have been reversed in more recent CC builds (user is on a current build where it works interactively). Restoring: - agents/bro.md tools: AskUserQuestion back. - agents/architect.md tools: AskUserQuestion added (user's screenshot suggests architect is the one rendering it for architecture-scope questions, which is the right home for the radio UI on design decisions). - skills/first-run-onboarding/SKILL.md: back to batched AskUserQuestion call with the 3-question form (name, branching, PR target) + Step 3a for custom-protected-branches multiSelect follow-up. Kept the hardened MCP discipline from the revert: mandatory write sequence enumerated, no hallucinating rejections, post-write config_list verify gates the closing message. - skills/tmb-reonboard/SKILL.md: back to batched form with `Keep "<current>"` as pre-selected first option. - tests/manual/scenarios.md Flow 1: back to AskUserQuestion expectations. - tests/manual/scenarios.md 1.4: reonboard back to AskUserQuestion. The discipline fixes from the previous "revert" commit (mandatory write sequence + config_list verify) are preserved — those are UI-independent and were the RIGHT fix for the hallucinated-rejection bug we saw. Tests green: 235 unit + 21 integration + 16 hook + 4 lint. Next: reopen #52 with the corrected status — works interactively, documented `-p` limitation as separate concern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…n_append User confirmed AskUserQuestion works live in plugin subagents (screenshot of architect rendering a scope-narrowing radio form for "I want to make social media platform"). The discussion-form UX is the design intent, and every Q/A round must persist to the discussions table so the trajectory stays replayable. Implementing the pattern: **skills/architect-workflow/SKILL.md** — new "Interactive Alignment" subsection under Discussion Phase. Describes: - When to use AskUserQuestion (enumerable answers, 2-4 options, scope/ tech/priority/approach) - When to fall back to text discussion_append (open-ended "what constraints do you have" questions) - The mandatory dual-persist pattern — every AskUserQuestion round followed by TWO discussion_append calls (kind='question' + kind='answer') in chronological order, using the existing valid kinds - Guidance on batching (independent answers → one AskUserQuestion call with up to 4 questions; dependent answers → sequential) - Closing the discussion with kind='decision' once aligned This gives us the three-mode UX the user specified: - simple → bro → architect → swe (no discussion form; trivial template) - difficult → bro + architect with AskUserQuestion alignment loop → swe (persists to discussions, renders in issue_report_md / issue_snapshot_md) - out-of-scope → roundtable (tracked as #53 for AskUserQuestion extension into flow 9) **docs/architecture/FLOWS.md** — flow 3 mermaid sequence diagram updated to show the AskUserQuestion loop + dual discussion_append per round. Notes section explicitly documents the replayability guarantee: every alignment decision lands in discussions as a Q+A pair so future sessions can reconstruct the reasoning via issue_report_md. No schema changes — 'question' and 'answer' are already valid kinds in ALLOWED_KINDS (verified in tools/discussions.ts). Tests: 235 unit + 21 integration + 16 hook + 4 lint — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…text Live dogfood found two related bugs, same root cause — load-bearing Human questions were being silently skipped: **Bug 1: Onboarding auto-filled without showing the radio form.** Bro wrote identity (Zax) + branching (github-flow) + pr_target (main) + protected_branches (main) in 11 seconds with ZERO AskUserQuestion render. Root cause: CC's environment context leaked `# userEmail zax.shen@gmail.com` into bro's context, and auto-mode's "minimize interruptions" nudged bro's LLM to write the inferred defaults directly. The Human never got to confirm or change. Fix in skills/first-run-onboarding/SKILL.md: explicit "Hard rule — always render the form" section. AskUserQuestion is NOT skippable when: - Environment leaked user info (inferred ≠ confirmed) - Auto mode encourages action (onboarding isn't routine) - The LLM thinks it knows the answer (branching is policy, not guess) **Bug 2: Architect wrote `kind='decision'` without asking any question.** User asked "build a Python CLI todo". Architect triaged as simple (no docs/trustmybot/architecture/ impact), skipped the alignment loop per the current "simple → straight to tasks" rule, and unilaterally chose stdlib-only + argparse + JSON persistence. ZERO kind='question' rows in discussions. User correctly pointed out: "build a CLI todo" has massive scope ambiguity — storage backend, lib choice, command surface, feature scope — all of which the Human should pick, not the architect. Fix in skills/architect-workflow/SKILL.md: add a "Scope-ambiguity gate" inside Discussion Phase. Triggers in BOTH simple and difficult triage. Before any kind='decision' row, every plan choice must be either (a) explicitly answered by the Human via the dual-discussion_append pattern, or (b) a choice the Human plausibly wouldn't care about. Auto-mode does not waive the gate. The gate enumerates categories that ALWAYS need explicit Human input: storage backend, lib choice, command surface, feature scope, persistence location. Architect can still decide implementation details (internal variable names, stylistic choices) — just not interface-shaping ones. Net effect: simple triage no longer means "architect auto-decides the whole plan". Simple means "fewer/smaller tasks" (trivial template); the alignment loop still runs whenever scope is ambiguous. Tests: 235 unit + 21 integration + 16 hook + 4 lint — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…cal reality User: "What I expected is to check local python env first, then ask which env management tools, etc. in the discussion form." Current architect workflow said "explore the codebase before asking questions", but didn't probe the ENVIRONMENT — which tools are installed, which language versions are available, which project files already exist. Result: architect would offer generic menus like "uv vs poetry vs pip" even when only one is installed locally. Fix: new "Environment Probe" step in architect-workflow's Discussion Phase. Runs BEFORE the AskUserQuestion radio form. Uses Bash (read-only) to detect: - Language versions (python3, node, go, rustc) - Python package managers (uv, poetry, pipenv, pip) - Python project files (pyproject.toml, requirements.txt, setup.py, Pipfile) - Node ecosystem (bun, pnpm, npm) + lockfiles - Linters/formatters (ruff, black, biome, eslint) - Test runners (pytest, vitest) - Git state (remote, branch) The probe informs the radio options — every option must be grounded in what's actually detected. Explicit rules: - Never offer an option that can't execute on the local machine. - Never mark a tool "(Recommended)" unless it's detected AND fits the task. - If .python-version exists → skip Python version question. - If pyproject.toml exists → ask "reuse / new alongside / scrap & restart" instead of generic "what layout". Example mapping table in the skill shows: "uv detected + Python ≥ 3.11" → radio form with `uv (detected, v0.5.x) (Recommended)` / pip+venv / poetry. "No package manager" → "Install uv / use pip / I'll handle it". Probe findings persisted as a discussion_append(kind='note') row so future sessions replay the environment context. Also updated the Discussion Phase step list — probe is step 2, codebase exploration is step 3, radio form is step 4. Net effect: the discussion form the user envisioned — grounded, machine-specific, not a generic decision tree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…Question contract Live dogfood: bro rendered text questions ("1. Name to call you? (e.g. Zax or Anonymous)") instead of AskUserQuestion, AND leaked the user's inferred name ("Zax", from CC's environment-level userEmail) into the question text as an example. Two tightenings in first-run-onboarding/SKILL.md: **Hard rule: AskUserQuestion mandatory, no text fallback.** Previous wording said "AskUserQuestion is mandatory" but bro's LLM still fell back to text. The skill now explicitly calls the text fallback a contract violation and prescribes the exact recovery path when the tool errors: read error → retry once → if still erroring, surface verbatim to the Human and ask how to proceed. NO silent text fallback. Added an explicit reason list for why NOT to skip the tool call: - Env context leaked identity (inferred ≠ confirmed) - Auto mode says "minimize interruptions" (onboarding IS the trust-model setup) - LLM thinks it knows answers (branching is policy, not a guess) - Text seems faster (it violates the scenarios contract) **Context-leak rule: never surface inferred identity.** CC subagents inherit the user's email via # userEmail env lines. The skill now explicitly forbids: - Putting an inferred name in the question text (no more "(e.g. Zax)") - Pre-populating `Use "Zax"` based on email-derived guesses - Writing identity_set('Zax') from email inference alone The only case where pre-populating a `Use "<name>"` option is allowed: the Human typed their name IN THIS interactive session (e.g. "I'm Alice" in a prior turn). Env context alone doesn't count — it's external metadata the Human hasn't confirmed. Also updated the AskUserQuestion code comment in Step 1 to make this rule visible at the option-construction site. The radio form is where consent lives. Until the Human picks, we don't know their name — regardless of what CC told us. Tests: 235 unit + 21 integration + 16 hook + 4 lint — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User correctly called me out for asking them to test in interactive CC sessions for things that should have been automated. The concerns are: - Does the skill say AskUserQuestion is mandatory? - Does the skill forbid text fallback? - Does the skill warn against env-leak of inferred identity? - Are labels → canonical mappings complete? - Are required MCP writes enumerated? - Are architect's alignment + scope-gate + env-probe sections present? All of the above are prompt-content assertions — zero LLM required. Running them caught a real bug immediately: the skill had `(e.g. "Zax"` in an example, which is exactly the pattern the rule forbids (teaching the LLM to echo inferred identity). Fixed. Added tests/lint/onboarding-skill-contract.sh: - 13 positive assertions on first-run-onboarding (mandatory tool, text-fallback forbidden, env-leak rule, canonical mappings, MCP writes, never-narrate-rejection, in-session pre-populate gate) - 2 negative assertions (no "e.g. Zax" literal in either escaped or unescaped form — prevents regression of the bug we just found) - 3 assertions on tmb-reonboard (AskUserQuestion, Keep pattern, identity_reset path) - 9 assertions on architect-workflow (AskUserQuestion, discussion_append persistence, kind=question + kind=answer, scope-ambiguity gate, environment probe, probe citations of uv + pyproject.toml, discipline rule against ghost options) Wired into tests/run-all.sh after the agent-budget lint. Runs in ~50ms. What this CANNOT test: whether bro's LLM actually chooses to call AskUserQuestion at runtime. That's genuine Layer 3 (LLM judgment, non-deterministic). But the static contract at least prevents the category of regression where the skill prompt itself degrades. Full suite: 235 Layer 1 unit + 21 Layer 2 integration + 16 hook + 4 agent-budget + 27 onboarding contract — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…estion, scope-gate bypass Four bugs surfaced in user's interactive dogfood. Solving all. **Bug 1: issue_create returned a ghost `issue_string_id` field.** Main Claude saw `{id: 1, issue_string_id: 'iss_modfdaia_5466e041'}` and used the UUID for subsequent discussion_append calls, which silently failed (no matching row). Retried with `1` which worked. Net effect: 2 ghost MCP writes, user-visible confusion about which id is canonical. Fix in mcp/trajectory-server/src/tools/issues.ts: remove the genId('iss') line and the issue_string_id in the return. issue_create now returns the redacted row with a single `id` field. genId import dropped. Layer 2 test added: issue_create returns a single id (no issue_string_id ghost field) — prevents regression. **Bug 2: AskUserQuestion doesn't work in plugin subagents (re-confirmed).** User's dogfood: main Claude narrated "AskUserQuestion isn't available in subagents" and took over the form at top level. This contradicts my earlier un-revert based on a single screenshot (which I now believe was main Claude rendering, not a subagent). Verdict: AskUserQuestion is main-Claude-only. For bro, main Claude hoists the form (proven to work in onboarding). For architect mid-flow, main Claude does NOT hoist — architect is on its own. Fix in skills/architect-workflow/SKILL.md: Interactive Alignment section reverted from AskUserQuestion pattern to **text-based Q+A** with the same dual discussion_append persistence. Architect now emits numbered text questions, waits for reply, persists Q+A verbatim. **Bug 3: Scope-ambiguity gate silently bypassed by auto-mode.** Live DB showed: discussions had entries 1-4 (intent/note/note/note-probe) then row 5 was `kind='decision'` with body containing literal phrase "auto-mode defaults, zero-dep". ZERO `kind='question'` rows — architect unilaterally chose Python stdlib + argparse + JSON + etc. Fix in skills/architect-workflow/SKILL.md: Scope-ambiguity gate upgraded to HARD RULE. New contents: - Marked HARD RULE in the heading itself - Explicit "Auto-mode does NOT waive this gate" statement - Specific trigger phrase: "If your response body would include phrases like 'auto-mode defaults' or 'defaulting to X since you didn't specify' — STOP. Ask first." - Full worked example of correct sequence (8-row discussions table) - Full worked example of violation (what the user just saw) marked RED FLAG - Self-review instruction: before task_create_batch, re-read discussion_list; if you see a decision without a preceding question, you violated the gate. **Bug 4: No test catches Bug 3 at CI time.** Added lint coverage for the HARD-RULE discipline: - `tests/lint/onboarding-skill-contract.sh` extended with 5 new checks: HARD RULE marker, "Auto-mode does NOT waive", "auto-mode defaults" call-out, RED FLAG violation example, text-Q requirement, and negative check that no text instructs architect to CALL AskUserQuestion (a subagent-blocked call). What this DOESN'T catch: whether architect's LLM actually honors the gate at runtime. That's genuine Layer 3 — flaky by nature, document-able via scenarios.md. But the static contract regression class is closed. Also added Layer 2 test: discussion_append chronology test. Verifies schema-level chronological consistency (answer rows have preceding question rows) so at least the protocol-level invariant is enforced. All automated tests green: 235 Layer 1 unit + 23 Layer 2 integration (two new) + 16 hook + 4 agent-budget + 32 lint assertions (5 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t-blocked User dogfood again surfaced bro stuck at the onboarding form: AskUserQuestion fails for subagents (per anthropics/claude-code#12890), so my last commit saying "surface error to Human and ask how to proceed" caused main Claude to render an unhelpful "abort / answer the 3 / take defaults" menu. Reverting bro to the same text-Q+A pattern architect now uses. Consistent, reliable, no subagent-AskUserQuestion dependency: **skills/first-run-onboarding/SKILL.md** — full rewrite of the question UI: - Removed AskUserQuestion code block + Map-labels-to-canonical table. - Added explicit "MUST NOT call AskUserQuestion" prohibition with link to the upstream issue. - Step 1 now emits welcome + 3 numbered questions in one chat message: 1. What should I call you? Reply with a name or `anonymous`. 2. How does your team branch? Reply with 1/2/3 (canonical labels). 3. What's your PR target branch? Reply with `main`/`master`/`develop`/ `trunk`/other. - Step 2 parses the free-text reply against canonical map. - Step 3a (custom branching) replaced with a CSV reply for protected. **skills/tmb-reonboard/SKILL.md** — symmetric rewrite: - Removed AskUserQuestion batched-question form. - Step 2 lists current values + asks "which to change?" — reply chooses one of: name / branching / pr-target / protected / all / none. - Each sub-question accepts the new value or `keep` sentinel. **tests/lint/onboarding-skill-contract.sh** — updated to match the new text-Q reality: - first-run-onboarding: replaced "AskUserQuestion mandatory" + "no text fallback" assertions with "MUST NOT call AskUserQuestion" prohibition + "anonymous" + "comma-separated" + canonical-mapping checks. - tmb-reonboard: same — checks for NEVER list, keep sentinel, identity_reset path. - Removed assertions about Keep "<current>" pattern (was AUQ-specific) and IN-THIS-session pre-populate gate (was AUQ-specific). The two remaining intentional AskUserQuestion mentions: 1. first-run-onboarding line: "MUST NOT call AskUserQuestion" — load-bearing prohibition. 2. tmb-reonboard NEVER list — same prohibition. Tests: 235 unit + 23 integration + 16 hook + 4 lint + 30 contract — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User reported on retry: "Still no form!" after 232a90b switched bro to text-only Q+A. Re-examining the earlier working run (commit 036c81f era): bro's skill *did* call AskUserQuestion, bro subagent couldn't execute it, and main Claude transparently hoisted the form at its own level — rendering the radio UI, collecting answers, passing them to bro which then persisted via MCP. The user's "Onboarding works" report came from that flow. My 232a90b commit killed the signal: by removing AskUserQuestion from bro's skill entirely, main Claude had no reason to intercept and rendered the text as-is. Net result: worse UX. Un-reverting the skill changes from 232a90b restores the signal: - first-run-onboarding/SKILL.md back to batched AskUserQuestion call with Hard Rule + context-leak protection - tmb-reonboard/SKILL.md back to Keep-"<current>" first option pattern - tests/lint/onboarding-skill-contract.sh back to AskUserQuestion assertions Architect keeps the text-Q+A pattern (from 98e0f23) because main Claude does NOT hoist for deep-spawned subagents the same way. This is asymmetric but matches observed behavior. What I learned the hard way: "AskUserQuestion doesn't work in subagents" is technically true (per anthropics/claude-code#12890) but functionally incomplete — main Claude compensates for direct-@-mention subagents like bro. The skill should keep the aspirational AskUserQuestion call as the signal main Claude uses to know "this agent needs interactive UI". Four revert-cycles on this issue. Locking in the current state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@tmb

User clarification: "bro should be CC's persona. After first @tmb:bro then CC have the persona. Why not subagent? Because subagent has no enough permissions." Right. Subagents lack AskUserQuestion (anthropics/claude-code#12890) and hit other permission gates. bro needs those for onboarding + alignment with the Human. So bro IS main Claude; architect/swe/pr-reviewer stay as true subagents with constrained tool surfaces. Changes: - **agents/bro.md** — DELETED. bro's prompt folded into plugin/CLAUDE.md under a "You are bro" section. Main Claude loads CLAUDE.md at session start and adopts bro's persona automatically. - **plugin/CLAUDE.md** — rewritten: - Lead with the top/bottom split: "You ARE bro" for main Claude; subagent prompts override when architect/swe/pr-reviewer spawn. - Full bro persona inline: MCP caller identity, chain-of-thought, role, identity+onboarding, catchphrase, session-start chain, routing table, no-auto-action discipline, agent-creator flow, mode rules, communication style. - "Subagent roster" section after bro's persona — the three true subagents (architect/swe/pr-reviewer), their models, their override-by-project rules. - "Tool availability contract" explicitly differentiates: bro has full CC toolkit including AskUserQuestion; subagents don't. - Updated Decision flow to show "bro = main Claude". - Preserved workspace boundary + workflow files + source access control sections. - **tests/manual/scenarios.md** — dropped `@tmb:bro` trigger prompts in favor of plain `hello` / `switch to gitflow`. Notes that bro IS main Claude, loaded via plugin/CLAUDE.md. - **agent-budget lint** — automatically picks up bro.md removal (uses a find-glob, no manual list update needed). No changes to subagent MCP role enforcement (architect/swe/pr-reviewer all still declare `agent='bro'` in their tool calls when relevant — that's fine, they're reporting their caller context). bro's role name remains 'bro' in requireRoles(['bro']) checks. Net side-effects, which together close open issues in principle: - #36 (routing enforcement): bro IS main Claude → every Human message routes through bro by architecture, not by convention. - #46 (shorten @-handle): no @-mention needed; user types naturally. - #47 (bro first-response latency): no subagent spawn overhead; bro responds as fast as main Claude normally does. Tests all green: 235 Layer 1 + 23 Layer 2 + 16 hook + 3 agent-budget (bro removed) + 30 contract lint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@tmb

User concern: with bro as main Claude, users can still type @tmb:architect or @tmb:pr-reviewer and bypass bro's triage / onboarding / branch-id flow. swe already had a MANDATORY FIRST ACTION that rejects spawns without task_id=<N>. Extending the same discipline to architect and pr-reviewer so no subagent is reachable directly by the Human. Rejection heuristic per agent — detect bro-originated spawn via routing markers, reject otherwise: - architect: requires at least one of `triage: simple|difficult`, `issue_id=`, `branch_id=`, or `concern:` in the spawn prompt. bro always includes at least `triage:` and usually a `branch_id` from the branch-id-proposal skill. Direct @-mention by Human has none. - pr-reviewer: requires at least one of `task_id=<N>`, `issue_id=<N>`, or a bro-routed review-request marker. architect always spawns pr-reviewer with task_id after SWE completes. - swe: already had this via `task_id=<N>` scan. Unchanged. On rejection, each agent outputs a structured REJECTED line explaining that they're subagents, not Human entry points, and pointing the Human back to bro (just type the request — no @-mention needed). Tightened the pr-reviewer overview block to recover the line budget (200-line cap per agent) after adding the rejection section. Tests: 235 Layer 1 + 23 Layer 2 + 16 hook + 3 agent-budget + 30 contract — all green. Addresses your concern #1: users can't accidentally or deliberately bypass bro by @-mentioning subagents. All three now auto-reject direct invocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User's refined design: plugin enabled ≠ bro active. Main Claude should stay as normal Claude Code until the Human addresses "bro". Once triggered, stays in bro mode for the session. Why this is better than the auto-persona model: - "what's 2+2?" → main Claude answers directly. No onboarding overhead, no pre-scan, no MCP call pollution. - "bro, write a CLI todo" → triggers full bro machinery. Onboarding (if first time), routing, subagent spawning, MCP state. - "bro" is a natural vocative — humans address TMB by its persona name. - Plugin sits dormant until invoked; backwards-compatible with regular CC workflows in the same session. Changes: - plugin/CLAUDE.md — rewrote the opening section: * New "Persona activation rule": listen for "bro" as trigger word in any form (vocative, @-mention, subject). Until triggered → act as normal main Claude. After → adopt persona for the rest of session. * "Subagent prompt precedence" section separated so architect/swe/ pr-reviewer's own prompts still win when spawned via Task. * Kept all bro persona content below — unchanged, only the activation semantics flipped. - tests/manual/scenarios.md — updated Flow 1 trigger prompts: * `bro hello` for fresh onboarding (1.1) — first mention of "bro" triggers the persona and onboarding in one shot. * `bro, switch to gitflow` for reonboard (1.4) — continuing in bro mode, addressing bro again is optional but natural. No subagent changes (architect/swe/pr-reviewer still reject direct Human invocation per 5ed659b). Tests still green: 235 + 23 + 16 + 3 + 30. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… Other-synonyms User saw this live-rendered radio form: 1. Anonymous 2. I'll enter it ← redundant; just redirects to Other 3. Type something. Two bugs: 1. **Placeholder options redirecting to Other.** AskUserQuestion auto-adds `Other` for free-text entry. Options like "I'll enter it" or "Type something else" are UI-confusing synonyms of Other and shouldn't be added by the skill. Main Claude was inventing them because the skill didn't explicitly forbid the pattern. 2. **Missing git-config identity source.** Only "Anonymous" + Other were offered. The local `git config --get user.name` is a perfect identity source the user already set themselves — should be pre-populated as option 2 when available. Fixes in skills/first-run-onboarding/SKILL.md: - New **Step 0: Probe local identity sources** — runs `git config --get user.name` before rendering the form. Caches result as `git_user_name`. - **Step 1 AskUserQuestion options** — conditional second option: if `git_user_name` is non-empty, append `Use "<name>"` with description "Detected from git config user.name". Explicit comment forbids placeholder synonym options for Other. - **Step 2 label mapping** — updated to parse `Use "<name>"` back into the raw name for identity_set. - **Context-leak rule** — rewrote to distinguish allowed source (git config) from forbidden sources (CC `# userEmail`, $USER, whoami, filesystem paths, LDAP/SSO). Lint updates: - Require `git config --get user.name` and `Detected from git config` literals in the skill. - Negative assertions: no "I'll enter it", no "Type something else" literals in the skill prompt (removed from guard language, rewrote the DO-NOT guidance without using those exact phrases). Tests: 235 unit + 23 integration + 16 hook + 3 agent-budget + 33 contract (5 new) — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ym placeholders User saw this rendered live: 1. Yes, proceed (Recommended) 2. Use simple triage instead — Skip architect, let bro/SWE implement directly 3. Suggest different branch_id 4. Type something. Two bugs: 1. **Option 2 says "Skip architect"** — directly contradicts bro's "No bypass. Every code change routes through architect" rule. The triage label is a TEMPLATE-DEPTH knob, not an architect-bypass switch. Main Claude was inventing this option because the skill didn't have an explicit AskUserQuestion specification. 2. **Option 4 "Type something"** — same redundant Other-synonym we just fixed in first-run-onboarding. AskUserQuestion auto-adds Other for free text; placeholder synonyms confuse the UI. Fix in skills/branch-id-proposal/SKILL.md: - Added explicit `AskUserQuestion` block specifying the EXACT option set. The skill no longer leaves the radio form to LLM improvisation. - Conditional options: Downgrade-to-simple (when current is difficult) OR Upgrade-to-difficult (when current is simple), mutually exclusive. Both still go through architect — they just adjust template depth. - "Suggest different branch_id" routes to auto-Other for free text; validated against the existing regex; reject + re-ask on mismatch. - Hard-rule callout: "No Skip-architect option. Architect is mandatory for every code change per bro's routing contract." - Hard-rule callout: "Never add a placeholder option that just redirects to Other." - Added AskUserQuestion to the skill's allowed-tools frontmatter (was missing — that's why bro was improvising rather than calling the tool with a strict spec). - Handling-the-answer table makes the persist+spawn logic explicit per selection. Lint: 6 new assertions on branch-id-proposal/SKILL.md: - references AskUserQuestion (positive) - contains explicit "No Skip architect" prohibition (positive) - offers Downgrade-to-simple AND Upgrade-to-difficult (positive) - forbids "Skip architect — let" wording (negative) - forbids "Type something." placeholder literal (negative) Tests: 235 unit + 23 integration + 16 hook + 3 agent-budget + 33+6 contract = 39 lint — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…mode Live dogfood: user typed "bro, write a todo cli" — main Claude wrote todo.py directly without entering bro mode. No onboarding check, no triage, no architect spawn, no MCP writes. The trigger word was ignored. Two probable causes: 1. CLAUDE.md changes don't take effect via /reload-plugins. CLAUDE.md is loaded into the session's system prompt at session START; mid- session reload may not pick it up. User needs a fresh session. 2. The previous trigger language was descriptive ("adopt the persona") rather than prescriptive ("YOU MUST"). Main Claude treated "bro" as casual addressing instead of a hard activation token. Restructured CLAUDE.md to make the trigger rule impossible to miss: - Top-of-file H1: "TMB PLUGIN — TRIGGER RULE (READ FIRST)" + subhead "YOU MUST FOLLOW THIS RULE BEFORE RESPONDING TO ANY USER MESSAGE". No room for the LLM to drift into "this is just guidance". - Concrete examples of trigger phrases (the exact patterns we expect to see, including "bro, write a todo cli" — the live one that failed). - 3-step protocol: announce "Entering bro mode." → adopt persona → stay in mode for the session. The announcement makes activation observable to the Human; missing it surfaces the bug immediately. - Hard contract section: "If you see 'bro' in a message and respond by directly writing source code (Write/Edit on todo.py, *.ts, *.py, etc.), you have violated the contract." Names the exact failure mode the Human just witnessed. - Default-trigger rule: "If you're not sure whether 'bro' was a trigger or casual usage: assume trigger." False positives cost one extra MCP call; false negatives bypass the whole workflow. Note for the user: CLAUDE.md changes need a fresh session to take effect (not /reload-plugins). The next test should restart claude fully before retrying "bro, write a todo cli". Tests: all green (lint doesn't enforce CLAUDE.md content yet — out of scope for this commit; could add later). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User clarification: - @bro is what users should learn (documented). - Bare-word "bro, do X" still works (graceful fallback, undocumented). - @bro MUST trigger reliably. Changes: **README.md** — added a "How to use it" section right after Install that shows the exact entry point: `@bro write a todo cli`. Explains that addressing bro activates the persona for the rest of the session. First trigger runs onboarding; subsequent code-touching asks route through the full workflow; casual asks are handled by regular Claude Code without TMB taking over. Also rewrote the agent roster section: - Stale "two agents globally + five editable placeholders" claim removed (left over from the v0.2 design). - New table makes the persona/subagent split explicit: bro = main Claude persona; architect/swe/pr-reviewer = constrained subagents spawned via Task tool. - Notes that subagents auto-reject direct @-mentions — talk to @bro, bro routes. **plugin/CLAUDE.md** — restructured trigger-rule examples to lead with the canonical `@bro` form. Bare-word and addressed forms ("bro,...", "hey bro") kept as supported fallbacks but explicitly labeled "undocumented". This way: - Users who learn from README only know @bro → the simplest mental model. - Users who type "bro" naturally still get the persona — no failure mode. - LLM gets a clear mental hierarchy: canonical example first, fallbacks after, all triggering the same activation. No code or test changes — this is doc-only refinement of how the trigger is presented to humans vs. honored by the LLM. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User: "Shorten CLAUDE.md if this is in actual the prompt for bro only. It has many optional content like ceo, cto, etc. Keep bro focus on its job. Only let it handle the optional work when needed (we have skills, rules)." CLAUDE.md is loaded into main Claude's system prompt every triggered session — every byte costs context. The previous version carried runtime instructions AND extensive documentation in the same file. Splitting: **KEPT (essential bro runtime):** - Trigger rule (top, mandatory) — unchanged - Subagent precedence (1 line) — unchanged - Role statement (concise) - MCP caller identity rule - First-action chain (3 steps: identity_get + config_get → cache name → issue_resume) - Code-touching skill chain (lazy-regen-check → project-prescan → triage → branch-id-proposal → architect spawn) — names skills, defers detail to skill files - Direct ops list - Routing (3 destinations: architect, pr-reviewer, agent-creator) - Concerns-escalate rule - Catchphrase rule (compressed to one paragraph) - Communication style (compressed to one line) **REMOVED (moved to skills / docs / subagent prompts):** - Workspace boundary section (TMB-internal contributor info — belongs in CONTRIBUTING.md, not bro's runtime) - Full workflow-files table (concise version retained at the bottom; full table now references mcp/.../CONFIG_KEYS.md and FLOWS.md) - Persistence MCP details (already in mcp/trajectory-server/README.md) - Source code access control section (already enforced by hook scripts and called out in architect.md / swe.md prompts) - Full mode-rules section (covered by individual skills) - Detailed no-auto-action discipline (covered by individual skills + bro's role statement) - Full agent-creator flow protocol (lives in agent-creator skill) - Per-Human-request routing table (compressed to 3 destinations + the agent-creator fallback) - Full session-start triage heuristic block (compressed to 1 sentence) **Reference pointers retained:** - Subagent roster (concise 4-row table) - Where state lives (5-bullet summary, references CONFIG_KEYS.md and FLOWS.md for detail) - Code style (3 bullets) The skills referenced from CLAUDE.md (first-run-onboarding, tmb-reonboard, lazy-regen-check, project-prescan, branch-id-proposal, refresh-architecture, agent-creator) all already exist and carry the detail. CLAUDE.md just names them. bro's prompt now is laser-focused on what bro does at runtime, not on educating the LLM about the plugin's architecture. Tests: all green. Trigger rule + subagent precedence intact; lint unaffected (it doesn't enforce CLAUDE.md content). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User: "update swe to opus and I will manual test the workflow again". agents/swe.md: model: sonnet → opus CLAUDE.md: subagent roster table updated to reflect new model. Other models unchanged: architect (opus), pr-reviewer (opus). Only remaining sonnet reference is in agent-creator/SKILL.md as the DEFAULT suggestion for user-created domain agents (use opus only for reasoning-heavy roles) — that's not the four core subagents and stays. Tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Live dogfood (8m ago): architect skipped the alignment loop AGAIN under auto-mode, wrote kind='decision' + task_create_batch with zero kind='question' rows. Same bug as 17824e9 + 419a1fa — prompt-level discipline isn't enough. Structural fix: move enforcement from the skill prompt to the MCP handler where the LLM literally cannot bypass it. **mcp/trajectory-server/src/tools/tasks.ts**: - task_create_batch now queries discussions for the issue before inserting. If zero kind='question' rows exist AND no waiver is set, returns `{error: 'scope_gate_violation'}` with a clear explanation and `questions_found: 0`. - New optional args: `waive_scope_gate: boolean` + `waive_scope_gate_reason: string` (min 10 chars). Lets architect explicitly bypass for truly trivial scope (typo fix, one-line doc change). - Waiver reason is logged to the ledger as `scope_gate_waived` event for audit — pr-reviewer / Human reviewers can grep ledger for inappropriate waivers. - inputSchema updated to document both waiver fields with guidance on when they're acceptable. **skills/architect-workflow/SKILL.md**: updated the Scope-ambiguity gate section to describe the structural enforcement + the waiver mechanism. Notes: "Default to not waiving." **Existing tests updated**: added the waiver to every existing task_create_batch call in test fixtures (27 call sites in tasks.test.ts + 9 others across unit + integration test files). One perl pass via `-0777 -pe` regex to inject `waive_scope_gate: true, waive_scope_gate_reason: 'unit-test synthetic scope; gate not under test'` right at the start of each args object. All 236 existing unit + previous integration tests stay green. **New Layer 2 scope-gate tests (4 cases)** in tests/mcp-integration/scope-gate.test.mjs: - rejects when 0 questions + no waiver (verifies the structural check) - accepts when kind='question' row exists - accepts when waiver + reason ≥10 chars - rejects waiver with missing OR too-short reason Now the live dogfood pattern that was bypassing the gate will fail at the MCP layer: task_create_batch returns scope_gate_violation, architect cannot create tasks, SWE cannot be spawned — forcing architect back to the alignment loop or explicit waiver. Tests: 236 unit + 27 integration (+4 new) + 16 hook + 4 budget + 39 contract lint — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Minor: the previous commit (140a355) rewrote the scope-gate section to describe the MCP-level enforcement but dropped the 'HARD RULE' literal the contract lint checks for. Restoring. All tests green post-fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t eager-skills Benchmark: pure Claude solves 'write a todo cli by python' in 32 seconds. TMB's bro→architect→swe chain on the SAME task took 5+ minutes and hung uninterruptibly (#55). Architect emitted a 55k-token spec; SWE cold-start to load that spec was the tail end of the hang. Three structural fixes, all ship in this commit: **1. Hard-cap spec_body at 8000 chars (mcp/trajectory-server/src/tools/tasks.ts)** Was 64000. Reality: a spec longer than ~8k is almost always a sign the task should be split. Over-long specs force SWE to spend its context budget reading instead of coding. Architect gets a clear error citing #55 and telling it to split or cite-don't-restate. This is MCP-level, not prompt-level — architect cannot talk its way around the cap. If a spec genuinely needs more, split into N tasks via parent_branch_id dependency. **2. Revert swe to sonnet (agents/swe.md, CLAUDE.md)** Undoes commit e2ce240. Opus on swe was a quality bet that came at a 2-3x latency cost at SWE's cold-start. For the 90% of tasks that are "implement this spec" mechanics, sonnet is indistinguishable from opus in output. Architect stays on opus (planning + alignment), pr- reviewer stays on opus (review depth). SWE back to sonnet. **3. Trim architect's eager skill-load list (agents/architect.md)** Was loading 6 skills on every spawn: architect-workflow, swe-spawn-workflow, validate-swe-output, agent-creator, roundtable, refresh-architecture. Removed the last three — they don't fire per spawn: - agent-creator is bro-invoked (when Human asks for a domain agent), not architect-invoked. - roundtable fires only on explicit cross-domain decisions, rare. - refresh-architecture is bro-invoked + auto on first code-touching ask, not per-architect-spawn. Architect now loads 3 skills instead of 6, halving its context load. **Skill docs update (skills/architect-workflow/SKILL.md)** Added "Spec-body brevity rule (HARD CAP at 8000 chars, enforced by MCP)" section explaining: - Cite existing code/conventions, don't restate them. - Reference industry-standard patterns by name, don't explain them. - Over-long specs push chain latency into the minutes range (cites #55). - Split into focused tasks via parent_branch_id when scope is genuinely large. Parallelizable via worktree isolation. **Test fixes** - `tasks.test.ts`: updated "rejects oversize spec_body" test from 64000 to 8000. Added a new boundary test: spec_body at exactly 8000 chars must succeed. Full suite: 237 unit (+1 boundary test) + 27 integration + 16 hook + 4 budget + 39 contract — all green. What still can't be fixed from our side: - Ctrl+C doesn't interrupt in-flight subagent inference (CC platform limitation). Issue #55 flags this for upstream filing. Meantime, `kill -9 <pid>` is the documented recovery. Closes #55 on the parts we can fix. Filed upstream concerns remain open on the CC issue tracker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- frontmatter: tool list ~40 → 20 (drop AskUserQuestion + 15 rarely-used MCP tools: issue_get/list/get_phase/report_md, task_first_actionable, audit_log, ledger_list, file_registry_*, architecture_regen/regen_state_*, identity_get, config_*, skill_*). Each MCP schema was injected into the subagent system prompt at cold-start; dropping them shrinks the per-spawn context tax. - frontmatter: eager skills 3 → 1 (kept architect-workflow; swe-spawn-workflow + validate-swe-output now load on-demand via Skill tool — needed only at SWE handoff + SWE return, not during planning). - architect-workflow: new "Simple Fast-Lane" section. Simple triage = narrow-scope ask. Architect picks defaults from a dimension → default-choice table (argparse, stdlib unittest, ~/.<app>/<file>.json, single-user laptop scope), records them as explicit spec Assumptions, and calls task_create_batch with waive_scope_gate=true + a reason naming the defaults. No env probe, no Q+A round-trip. - architect-workflow: env probe and full discussion phase explicitly scoped to the difficult path. Scope-ambiguity gate waiver's two legitimate uses (simple fast-lane + trivial doc/typo) spelled out. - architect-workflow: escalate simple → difficult triggers listed (multiple unrelated surfaces, architecture change, strategic defaults, spec >8k). Target: simple-triage architect phase drops from 20 tool calls / 50k tokens / 2m50s to 3–5 tool calls / ≤20k tokens / ≤30s. Measured via Mode B retest. Closes the architect-side of the 7m chain slowdown from #55. SWE-side (MCP cold-start + CC platform subagent spawn latency) is separate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

All three workflow agents (architect, swe, pr-reviewer) grant access to the same MCP server (plugin_tmb_trajectory-server). The fully-qualified per-tool form made the tools: line repeat the server prefix 7–13 times per agent — visually painful and a maintenance burden every time a tool is added. Collapse to the bare server name ("mcp__plugin_tmb_trajectory-server"), which CC treats as "allow every tool this server exposes". Each tools: line is now one short line. Tradeoff: each subagent now has schemas for all ~30 server tools in its cold-start context, not just the ~13 specific ones I had picked for architect in a4fe045. In exchange we get readable frontmatter and no drift when new MCP tools ship. Net cold-start tax is small compared to the architect simple-fast-lane work — most of that cost was in the ceremonial phase, not schema registration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bro kept double-encoding the protected_branches config: passing value="[\"main\"]" (a string containing JSON) instead of value=["main"] (an actual JSON array). The MCP handler then JSON.stringify'd it a second time, storing "[\"main\"]" in value_json. git-guards.sh does `jq -e 'type == "array"'` on that value, sees a string, and blocks every Bash call with "BLOCKED: TMB plugin_config key protected_branches is malformed JSON" — freezing bro and any subagent that touches Bash. Two-sided fix: 1. skills/first-run-onboarding: rename the wire guidance from "value=<JSON array>" (which LLMs read as 'pass a string containing JSON') to "value=<array of strings>" with an inline worked example (value=["main"]) and a one-liner explaining why pre-serializing is wrong. Adds the same clarity to the code example block. 2. mcp/trajectory-server config_set handler: detect when args.value is a string whose trimmed form starts with [ or { AND parses as valid JSON to an object/array, and reject with a helpful error naming the correct shape. Plain strings that happen to start with [ but aren't valid JSON fall through and are stored verbatim as before. Repro: a session whose onboarding ran BEFORE this fix has a corrupt value_json. Repair with sqlite3 or wipe .claude/tmb/trajectory.db and re-run bro onboarding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Supersedes the v0.2.0 placeholder tag (to be revoked on main-merge). ### Versions aligned - `.claude-plugin/plugin.json`: 0.3.2 → **0.1.0** - `mcp/trajectory-server/package.json`: 0.2.0 → **0.1.0** ### CHANGELOG.md Replaced the "Unreleased — first public release pending" placeholder with the comprehensive v0.1.0 entry summarizing everything that shipped across PRs #41 → #64: - bro-as-persona + planner + task gate doctrine - pr-reviewer-as-push-gate + git-push-guard.sh hook - Lego templates (6 agents in templates/agents/, plugin's agents/ empty) - planning-simple / planning-difficult split (was tmb_architect-workflow) - bundled SQLite trajectory MCP server (~30 tools) - requireRoles middleware on every workflow-state mutation - Direct Mode for trivial single-file edits - Three-layer test infrastructure (lint / MCP-integration / dogfood) - Documented latency budget in docs/PERFORMANCE.md - Documented "known not-done" linking the 4 main backlog items ### Post-merge release plan After this PR merges to dev + dev → main: 1. `git tag -d v0.2.0` 2. `git push origin :refs/tags/v0.2.0` 3. `gh release delete v0.2.0` 4. `git tag -a v0.1.0 main -m "..."` 5. `git push origin v0.1.0` 6. `gh release create v0.1.0 --notes-file CHANGELOG.md` Layer 1 + 2 tests still PASS. No code/skill/agent changes — only version bumps + release notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ZaxShen and others added 3 commits April 23, 2026 20:35

ZaxShen mentioned this pull request Apr 24, 2026

🏗️ chore: adopt bun workspaces at plugin root (closes #42) #43

Closed

5 tasks

ZaxShen changed the title ~~📝 docs: SCENARIOS.md — dogfood test plan with trigger prompts (closes #40)~~ 📝 dogfood setup: SCENARIOS.md + bun workspaces + doc fixes (closes #40, #42) Apr 24, 2026

ZaxShen and others added 16 commits April 23, 2026 23:58

This was referenced Apr 24, 2026

Shorten bro's @-mention handle — @"tmb:bro (agent)" is too verbose for per-message invocation #46

Closed

bro first-response latency — 18s with 21.7k token context, 0 MCP tool calls #47

Closed

ZaxShen and others added 3 commits April 24, 2026 12:05

ZaxShen mentioned this pull request Apr 24, 2026

Roundtable: use AskUserQuestion to render final decision/vote UI #53

Closed

ZaxShen and others added 23 commits April 24, 2026 14:04

ZaxShen mentioned this pull request Apr 25, 2026

Mode B retest needed after config fix (da76cab) #56

Closed

ZaxShen merged commit cc3718d into dev Apr 25, 2026
1 check passed

ZaxShen mentioned this pull request Apr 25, 2026

Multi-consultant voting protocol (architect/cto/ceo as advisors, not deciders) #57

Closed

ZaxShen deleted the docs/40-dogfood-scenarios branch April 25, 2026 05:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

📝 dogfood setup: SCENARIOS.md + bun workspaces + doc fixes (closes #40, #42)#41

📝 dogfood setup: SCENARIOS.md + bun workspaces + doc fixes (closes #40, #42)#41
ZaxShen merged 55 commits into
devfrom
docs/40-dogfood-scenarios

ZaxShen commented Apr 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZaxShen commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why combined

Test plan

Follow-up issues surfaced during this PR

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ZaxShen commented Apr 24, 2026 •

edited

Loading