🐛 fix(bro): tighten activation routine + Direct Mode protocol; fix Task→Agent scorer mismatch by ZaxShen · Pull Request #139 · trustmybot/plugin

ZaxShen · 2026-04-26T23:53:30Z

Three bugs surfaced by L6 dogfood on rc.3 (run #24969960425):

Bug 1 — Bro skips first-action chain on greetings

@bro hi produced no identity_get / issue_resume MCP calls in the trajectory, contradicting the documented routine. CLAUDE.md said 'every triggered message, no shortcuts' but the imperative was in the section header — bro skimmed the body.

Fix: rewrote the section as MANDATORY on every triggered message, moved the no-shortcuts rule into the body, called out greetings explicitly (@bro hi, @bro yo, @bro thanks, @bro cool), quantified the cost (~50ms total), and made the consequence concrete ('silently breaks the audit trail and the welcome-banner contract').

Bug 2 — Direct Mode never wrote direct_mode_used ledger event

D-direct-mode trajectory showed Edit + Bash (correct) but no ledger_log call. The tmb_direct-mode skill listed all three steps but bro stopped after step 2.

Fix: rewrote 'Protocol' as ALL THREE STEPS ARE MANDATORY with the audit-trail rationale up-front and explicit 'NEVER SKIP THIS' on step 3. Added a closing reminder that the ledger_log is non-negotiable.

Bug 3 — Task vs Agent naming mismatch

CC's tool call Task(subagent_type=...) is captured by the PostToolUse hook as 'Agent', not 'Task'. Two scorer configs had the wrong expectation:

02-simple-task/tools-required.json: 'Task' → 'Agent' (was false positive)
D-direct-mode/tools-forbidden.json: 'Task' → 'Agent' (was false negative — would have let bro spawn SWE inside Direct Mode without flagging it)

Why these are doctrine-compliance bugs

Bro is the LLM. CLAUDE.md and skill prompts are the only enforcement. Bugs 1 + 2 are stronger imperative rewrites; bug 3 is fixing the test to match reality. The L6 trajectory_required scorer is the regression guard if any of these slip back.

…sk→Agent scorer mismatch Three bugs surfaced by L6 dogfood on rc.3 (run #24969960425): ## Bug 1 — Bro skips first-action chain on greetings `@bro hi` produced no `identity_get` / `issue_resume` MCP calls in the trajectory, contradicting the documented routine. CLAUDE.md said 'every triggered message, no shortcuts' but the imperative was in the section header — bro skimmed the body. Fix: rewrote the section as **MANDATORY on every triggered message**, moved the no-shortcuts rule into the body, called out greetings explicitly (`@bro hi`, `@bro yo`, `@bro thanks`, `@bro cool`), quantified the cost (~50ms total), and made the consequence concrete ('silently breaks the audit trail and the welcome-banner contract'). ## Bug 2 — Direct Mode never wrote direct_mode_used ledger event D-direct-mode trajectory showed Edit + Bash (correct) but no ledger_log call. The `tmb_direct-mode` skill listed all three steps but bro stopped after step 2. Fix: rewrote 'Protocol' as 'ALL THREE STEPS ARE MANDATORY' with the audit-trail rationale up-front and explicit 'NEVER SKIP THIS' on step 3. Added a closing reminder that the ledger_log is non-negotiable. ## Bug 3 — Task vs Agent naming mismatch in trajectory scorers The CC tool call `Task(subagent_type=...)` is captured by the PostToolUse hook as 'Agent', not 'Task'. Two scorer configs had the wrong expectation: - `02-simple-task/tools-required.json`: 'Task' → 'Agent' (false positive) - `D-direct-mode/tools-forbidden.json`: 'Task' → 'Agent' (false negative — would let bro spawn SWE inside Direct Mode without flagging it) ## Why these are doctrine-compliance bugs, not framework bugs Bro is the LLM. CLAUDE.md and skill prompts are the only enforcement. Bug 1 + 2 are stronger imperative rewrites; bug 3 is the test catching real reality. The L6 `trajectory_required` scorer is the regression guard if any of these slip back.

All three ratified items from #139's roundtable on potential users: - 'Who is TMB for?' section near top of README. Solo / small-team senior engineers running CC on real production code. No anti-persona block (per Human resolution: gate-friction filters wrong-fit users). - GitHub Sponsors plumbing: .github/FUNDING.yml + 'Support TMB' block near bottom of README linking to github.com/sponsors/trustmybot. - Defer-enterprise policy line: enterprise features (SSO, RBAC, SOC2, audit export) deferred until ≥3 unsolicited paying inquiries. Stops feature-creep toward enterprise polish before maintainer is funded for it. Closes #140.

ZaxShen merged commit 9f4f49e into dev Apr 26, 2026
5 checks passed

ZaxShen deleted the fix/three-l6-bugs branch April 26, 2026 23:55

This was referenced Apr 27, 2026

📊 Backfill A/B measurement on shipped doctrine variants (depends on #131) #153

Closed

🧪 feat(#153): 4 backfill A/B hypothesis scenarios (configs only — runs opt-in) #157

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🐛 fix(bro): tighten activation routine + Direct Mode protocol; fix Task→Agent scorer mismatch#139

🐛 fix(bro): tighten activation routine + Direct Mode protocol; fix Task→Agent scorer mismatch#139
ZaxShen merged 1 commit into
devfrom
fix/three-l6-bugs

ZaxShen commented Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZaxShen commented Apr 26, 2026

Bug 1 — Bro skips first-action chain on greetings

Bug 2 — Direct Mode never wrote direct_mode_used ledger event

Bug 3 — Task vs Agent naming mismatch

Why these are doctrine-compliance bugs

Next

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant