Skip to content

πŸ› fix(bro): tighten activation routine + Direct Mode protocol; fix Taskβ†’Agent scorer mismatch#139

Merged
ZaxShen merged 1 commit into
devfrom
fix/three-l6-bugs
Apr 26, 2026
Merged

πŸ› fix(bro): tighten activation routine + Direct Mode protocol; fix Taskβ†’Agent scorer mismatch#139
ZaxShen merged 1 commit into
devfrom
fix/three-l6-bugs

Conversation

@ZaxShen
Copy link
Copy Markdown
Contributor

@ZaxShen ZaxShen commented Apr 26, 2026

Three bugs surfaced by L6 dogfood on rc.3 (run #24969960425):

Bug 1 β€” Bro skips first-action chain on greetings

@bro hi produced no identity_get / issue_resume MCP calls in the trajectory, contradicting the documented routine. CLAUDE.md said 'every triggered message, no shortcuts' but the imperative was in the section header β€” bro skimmed the body.

Fix: rewrote the section as MANDATORY on every triggered message, moved the no-shortcuts rule into the body, called out greetings explicitly (@bro hi, @bro yo, @bro thanks, @bro cool), quantified the cost (~50ms total), and made the consequence concrete ('silently breaks the audit trail and the welcome-banner contract').

Bug 2 β€” Direct Mode never wrote direct_mode_used ledger event

D-direct-mode trajectory showed Edit + Bash (correct) but no ledger_log call. The tmb_direct-mode skill listed all three steps but bro stopped after step 2.

Fix: rewrote 'Protocol' as ALL THREE STEPS ARE MANDATORY with the audit-trail rationale up-front and explicit 'NEVER SKIP THIS' on step 3. Added a closing reminder that the ledger_log is non-negotiable.

Bug 3 β€” Task vs Agent naming mismatch

CC's tool call Task(subagent_type=...) is captured by the PostToolUse hook as 'Agent', not 'Task'. Two scorer configs had the wrong expectation:

  • 02-simple-task/tools-required.json: 'Task' β†’ 'Agent' (was false positive)
  • D-direct-mode/tools-forbidden.json: 'Task' β†’ 'Agent' (was false negative β€” would have let bro spawn SWE inside Direct Mode without flagging it)

Why these are doctrine-compliance bugs

Bro is the LLM. CLAUDE.md and skill prompts are the only enforcement. Bugs 1 + 2 are stronger imperative rewrites; bug 3 is fixing the test to match reality. The L6 trajectory_required scorer is the regression guard if any of these slip back.

Next

After merge β†’ PR B (rename L6 β†’ L5) β†’ cut rc.4 β†’ re-run Release canary + L5 dogfood.

…skβ†’Agent scorer mismatch

Three bugs surfaced by L6 dogfood on rc.3 (run #24969960425):

## Bug 1 β€” Bro skips first-action chain on greetings

`@bro hi` produced no `identity_get` / `issue_resume` MCP calls in
the trajectory, contradicting the documented routine. CLAUDE.md said
'every triggered message, no shortcuts' but the imperative was in the
section header β€” bro skimmed the body.

Fix: rewrote the section as **MANDATORY on every triggered message**,
moved the no-shortcuts rule into the body, called out greetings
explicitly (`@bro hi`, `@bro yo`, `@bro thanks`, `@bro cool`),
quantified the cost (~50ms total), and made the consequence concrete
('silently breaks the audit trail and the welcome-banner contract').

## Bug 2 β€” Direct Mode never wrote direct_mode_used ledger event

D-direct-mode trajectory showed Edit + Bash (correct) but no
ledger_log call. The `tmb_direct-mode` skill listed all three steps
but bro stopped after step 2.

Fix: rewrote 'Protocol' as 'ALL THREE STEPS ARE MANDATORY' with the
audit-trail rationale up-front and explicit 'NEVER SKIP THIS' on
step 3. Added a closing reminder that the ledger_log is non-negotiable.

## Bug 3 β€” Task vs Agent naming mismatch in trajectory scorers

The CC tool call `Task(subagent_type=...)` is captured by the
PostToolUse hook as 'Agent', not 'Task'. Two scorer configs had
the wrong expectation:

- `02-simple-task/tools-required.json`: 'Task' β†’ 'Agent' (false positive)
- `D-direct-mode/tools-forbidden.json`: 'Task' β†’ 'Agent' (false negative β€” would let bro spawn SWE inside Direct Mode without flagging it)

## Why these are doctrine-compliance bugs, not framework bugs

Bro is the LLM. CLAUDE.md and skill prompts are the only enforcement.
Bug 1 + 2 are stronger imperative rewrites; bug 3 is the test catching
real reality. The L6 `trajectory_required` scorer is the regression
guard if any of these slip back.
@ZaxShen ZaxShen merged commit 9f4f49e into dev Apr 26, 2026
5 checks passed
@ZaxShen ZaxShen deleted the fix/three-l6-bugs branch April 26, 2026 23:55
ZaxShen added a commit that referenced this pull request May 20, 2026
All three ratified items from #139's roundtable on potential users:

- 'Who is TMB for?' section near top of README. Solo / small-team
  senior engineers running CC on real production code. No anti-persona
  block (per Human resolution: gate-friction filters wrong-fit users).

- GitHub Sponsors plumbing: .github/FUNDING.yml + 'Support TMB' block
  near bottom of README linking to github.com/sponsors/trustmybot.

- Defer-enterprise policy line: enterprise features (SSO, RBAC, SOC2,
  audit export) deferred until β‰₯3 unsolicited paying inquiries.
  Stops feature-creep toward enterprise polish before maintainer is
  funded for it.

Closes #140.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant