chore(agents): trim 6 largest agent prompts — 2925 → 2525 lines (−13.7%)#178

Merged
dackclup merged 1 commit into main from claude/trim-agent-prompts on May 21, 2026

@dackclup
Owner

Why

Second-order token-economy optimization stacking on #175 (which reduced spawn FREQUENCY via gate-moment routing). This PR reduces per-spawn SIZE by trimming bloat from the 6 largest agent prompts.

The two PRs are complementary:

| Lever | PR | Effect |
|---|---|---|
| Spawn frequency | #175 (merged) | Most cues fire at gate moments instead of on every edit (~80% fewer spawns per PR cycle) |
| Spawn size | this PR | Each spawn of a trimmed agent loads 54-92 fewer prompt lines (~750-1300 fewer tokens at ~14 tokens per line) |

What got cut

| File | Before | After | Cut | % |
|---|---|---|---|---|
| incident-commander.md | 218 | 126 | −92 | −42% |
| test-engineer.md | 189 | 113 | −76 | −40% |
| stock-detail-auditor.md | 177 | 120 | −57 | −32% |
| docs-reviewer.md | 180 | 125 | −55 | −31% |
| release-captain.md | 211 | 145 | −66 | −31% |
| security-reviewer.md | 185 | 131 | −54 | −29% |
| Total | 1160 | 760 | −400 | −34% |

Whole .claude/agents/ directory: 2925 → 2525 lines (−13.7%).
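The figures in the table can be re-derived from the before/after line counts alone. A minimal sketch, with the counts copied from the table; the helper name `delta` and the round-to-nearest-integer convention are assumptions for illustration:

```python
# (before, after) line counts copied from the table above.
counts = {
    "incident-commander.md": (218, 126),
    "test-engineer.md": (189, 113),
    "stock-detail-auditor.md": (177, 120),
    "docs-reviewer.md": (180, 125),
    "release-captain.md": (211, 145),
    "security-reviewer.md": (185, 131),
}


def delta(before: int, after: int) -> tuple[int, int]:
    """Return (lines cut, percent cut rounded to the nearest integer)."""
    cut = before - after
    return cut, round(100 * cut / before)


rows = {name: delta(b, a) for name, (b, a) in counts.items()}
total_before = sum(b for b, _ in counts.values())
total_after = sum(a for _, a in counts.values())
total_cut, total_pct = delta(total_before, total_after)

# Directory-wide figure: the same cut measured against all 15 agent prompts.
dir_before = 2925
dir_pct = round(100 * total_cut / dir_before, 1)

for name, (cut, pct) in rows.items():
    print(f"{name:26s} -{cut:3d}  -{pct}%")
print(f"{'Total':26s} -{total_cut:3d}  -{total_pct}%  (directory: -{dir_pct}%)")
```

The 34% figure is relative to the six trimmed files only; the 13.7% figure divides the same 400 lines by the whole directory.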

What was cut (genuine bloat, NOT substance)

  • "Read these first" prose lists enumerating obvious files — compressed to one sentence pointing at the relevant CLAUDE.md / SKILL.md / AGENTS.md anchor
  • incident-commander's Step 6 post-mortem template (full schema duplicated from 9arm-post-mortem skill) — keep only the skeleton-emit instruction; defer the full template to the skill
  • test-engineer's "Project test conventions (memorize)" section duplicating AGENTS.md §Testing — replaced with "see AGENTS.md §Testing"
  • release-captain's "Recent releases" subsection (stale; lives in PHASE_STATUS.md) — removed
  • release-captain's Step 6 user-checklist (long prose) — one-line summary pointing at release-tag skill
  • security-reviewer's Section A-H verbose intros — compressed to per-Section header + grep command + criteria bullets
  • docs-reviewer's Step 2 6-prose-block "Substance check (per doc)" — collapsed to a single matrix table
  • stock-detail-auditor's verbose Step 1 recon prose + escalation-path narrative — kept the matrix, removed wrapper text

What's PRESERVED in every trimmed agent

  • YAML frontmatter verbatim (name, description, tools, model) — the description is the auto-routing trigger; shortening it would degrade matcher quality
  • Workflow steps with concrete commands
  • Hard constraints / "What you do NOT do" section (verified present in all 6 — see test plan)
  • Output format template (compressed but complete)
  • Escalation tables to other agents
  • Tools list (no tool removed from any agent)

Companion issues filed (from the auditor's dry-run)

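While building the stock-detail-auditor in PR #175, its deterministic prefilter surfaced real data bugs on the 2026-05-14 cron. Filed as separate issues so they can be picked up independently:

  • #176: STZ market_cap = null on the 2026-05-14 cron (XBRL fact extraction missing shares_outstanding, related to issue #10)
  • #177: 15 tickers with |mos_pct| > 500% on the 2026-05-14 cron (fair-price ensemble producing extreme estimates on growth/goodwill-heavy stocks: APP, AXON, CASY, CIEN, DD, DDOG, GE, HWM, ...)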

This PR intentionally does NOT include the compute / valuation fixes — focuses purely on the agent-prompt-size lever.

Doc lockstep (§Conventions)

  • CLAUDE.md §Phase status: "Trim agent prompts in flight (this PR)" entry with per-file deltas + cut-rationale
  • AGENTS.md §Claude-Code-specific tooling: paragraph on the keep-it-tight principle for future agent additions (trim target when prompt grows past ~150 lines)
  • No §Layout count change (still 15 agents; this PR doesn't add/remove any)
  • No §Auto-routing policy change (firing cues unchanged; PR #175 owns that surface)
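The keep-it-tight principle could be checked mechanically. A minimal sketch, assuming the "~150 lines" trim target is treated as a hard number and that `README.md` in the agents directory is not itself an agent prompt; the function names are hypothetical and not part of the repository:

```python
from pathlib import Path

TRIM_TARGET = 150  # "~150 lines" from the AGENTS.md paragraph, used as a hard limit here


def prompt_line_counts(agents_dir: str) -> dict[str, int]:
    """Count lines in every agent prompt under agents_dir (README.md excluded)."""
    counts = {}
    for path in sorted(Path(agents_dir).glob("*.md")):
        if path.name == "README.md":
            continue
        counts[path.name] = len(path.read_text(encoding="utf-8").splitlines())
    return counts


def over_target(counts: dict[str, int], target: int = TRIM_TARGET) -> list[str]:
    """Names of prompts that have grown past the trim target, largest first."""
    flagged = [name for name, n in counts.items() if n > target]
    return sorted(flagged, key=lambda name: counts[name], reverse=True)
```

For example, `over_target(prompt_line_counts(".claude/agents"))` would list the prompts that are candidates for the next trim pass.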

Test plan

  • YAML frontmatter intact on all 6 trimmed agents (name / description / tools / model lines present)
  • ## What you do NOT do section preserved in all 6 (grep returns 1 occurrence each)
  • No description: line shortened (those are auto-routing triggers — must stay rich for matcher quality)
  • Workflow steps still have concrete commands (not just headings)
  • Output format templates still emit verdict + structured fields
  • Hard constraints / escalation tables intact
  • Post-merge: spawn stock-detail-auditor on the next cron and verify it still produces the same SCHEMA/CONSISTENCY/RULE_16/KNOWN_ISSUE structured output as the pre-trim version did during the PR #175 dry-run (no behavior regression)
  • Post-merge: at the next "ready to push" gate, observe whether quantrank-reviewer still walks the same Sections A-H (it wasn't trimmed in this PR, so the gate behavior should be identical to the PR #175 baseline)
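The first two checks in the test plan lend themselves to a small script. A minimal sketch, assuming the frontmatter is the block between the first two `---` lines and that the four keys appear as top-level `key:` lines; `check_agent_prompt` is a hypothetical helper, not something this PR ships:

```python
REQUIRED_KEYS = ("name", "description", "tools", "model")
NOT_DO_HEADING = "## What you do NOT do"


def check_agent_prompt(text: str) -> list[str]:
    """Return a list of problems found in one agent prompt (empty means OK)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing frontmatter"]
    # Index of the closing '---' line, or None if the block never closes.
    end = next((i for i, line in enumerate(lines[1:], start=1)
                if line.strip() == "---"), None)
    if end is None:
        return ["unterminated frontmatter"]

    problems = []
    # Top-level keys only: indented continuation lines are skipped.
    keys = {line.split(":", 1)[0].strip()
            for line in lines[1:end]
            if ":" in line and not line[0].isspace()}
    missing = [key for key in REQUIRED_KEYS if key not in keys]
    if missing:
        problems.append("frontmatter missing: " + ", ".join(missing))

    # Mirrors the "grep returns 1 occurrence each" check on the prompt body.
    hits = sum(1 for line in lines[end + 1:] if line.strip() == NOT_DO_HEADING)
    if hits != 1:
        problems.append(f"expected 1 '{NOT_DO_HEADING}' heading, found {hits}")
    return problems
```

This only confirms that the keys and the heading are present. Verifying that no `description:` line was shortened would still need a diff against the pre-trim file.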

Estimated token savings

At a typical English line of ~14 tokens, the 400-line cut ≈ 5600 tokens of system-prompt context in total across the 6 trimmed agents. A single spawn saves only that agent's share: 54-92 lines, or roughly 750-1300 tokens.
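The estimate can be broken down per agent. A minimal sketch, using the per-file cuts from the table and the ~14 tokens-per-line figure; under these assumptions the 5600-token total corresponds to one spawn of each of the six agents. The `tokens_saved` helper is hypothetical:

```python
TOKENS_PER_LINE = 14  # the "~14 tokens" per-line estimate quoted above

# Lines cut per agent, copied from the "What got cut" table.
lines_cut = {
    "incident-commander": 92,
    "test-engineer": 76,
    "stock-detail-auditor": 57,
    "docs-reviewer": 55,
    "release-captain": 66,
    "security-reviewer": 54,
}


def tokens_saved(spawned: list[str]) -> int:
    """Estimated prompt tokens saved for one spawn of each listed agent."""
    return sum(lines_cut[name] * TOKENS_PER_LINE for name in spawned)


per_spawn = {name: tokens_saved([name]) for name in lines_cut}
all_six_once = tokens_saved(list(lines_cut))

for name, tokens in per_spawn.items():
    print(f"{name:22s} ~{tokens} tokens per spawn")
print(f"one spawn of each: ~{all_six_once} tokens")
```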

For a typical PR cycle (post #175):

Real savings depend on which agents fire, but the trimmed 6 are exactly the ones that fire most often (incident-commander on incidents, test-engineer on new code, security-reviewer pre-push, etc.).

Out of scope

Reverse-action plan

If a trimmed agent shows behavior regression in real use, the full original prompt is available via git show <pre-trim-sha>:.claude/agents/<agent>.md and can be restored selectively (just the part that mattered, not the whole pre-trim file).
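Selective restoration could look roughly like this. A minimal sketch: `show_pretrim` wraps the `git show` command quoted above, and `extract_section` pulls one heading's section out of the returned text. Both function names are hypothetical, the pre-trim SHA must be supplied by the caller, and the heading matcher does not skip `#` lines inside fenced code blocks:

```python
import re
import subprocess


def show_pretrim(sha: str, agent: str) -> str:
    """Return the agent prompt as it existed at commit `sha` (via git show)."""
    spec = f"{sha}:.claude/agents/{agent}.md"
    result = subprocess.run(["git", "show", spec],
                            capture_output=True, text=True, check=True)
    return result.stdout


def extract_section(markdown: str, heading: str) -> str:
    """Return the section under `heading`, up to the next same-or-higher heading."""
    match = re.match(r"(#{1,6})\s", heading)
    if match is None:
        raise ValueError("heading must start with 1-6 '#' characters and a space")
    level = len(match.group(1))
    section = []
    inside = False
    for line in markdown.splitlines():
        if line.strip() == heading:
            inside = True
        elif inside:
            other = re.match(r"(#{1,6})\s", line)
            # Stop at the next heading of the same or a higher level.
            if other and len(other.group(1)) <= level:
                break
        if inside:
            section.append(line)
    return "\n".join(section)
```

The extracted section can then be pasted back into the trimmed prompt, leaving the rest of the trim in place.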


Generated by Claude Code

Second-order token-economy optimization stacking on PR #175 (which
reduced spawn FREQUENCY via gate-moment routing). This PR reduces
per-spawn SIZE by trimming the genuine bloat from the 6 largest
agent prompts.

Per-file deltas:
- incident-commander 218 → 126 (−92, −42%)
- test-engineer     189 → 113 (−76, −40%)
- stock-detail-auditor 177 → 120 (−57, −32%)
- docs-reviewer     180 → 125 (−55, −31%)
- release-captain   211 → 145 (−66, −31%)
- security-reviewer 185 → 131 (−54, −29%)

Total across 15-agent set: 2925 → 2525 lines (−400, −13.7%).

What got cut (genuine bloat, NOT substance):

- "Read these first" prose lists enumerating obvious files →
  compressed to one sentence pointing at CLAUDE.md / SKILL.md /
  AGENTS.md anchors
- incident-commander's Step 6 post-mortem template (full schema)
  → defer to `9arm-post-mortem` skill, keep only the skeleton-
  emit instruction
- test-engineer's "Project test conventions (memorize)" section
  duplicating AGENTS.md §Testing → "see AGENTS.md §Testing"
- release-captain's "Recent releases" subsection (stale, lives
  in PHASE_STATUS.md) → removed
- release-captain's Step 6 user-checklist (long prose) →
  one-line summary pointing at `release-tag` skill
- security-reviewer's Section A-H verbose intros → compressed
  per-Section header + grep command + criteria
- docs-reviewer's Step 2 "Substance check (per doc)" 6 prose
  blocks → single matrix table
- stock-detail-auditor's verbose Step 1 recon prose + escalation-
  path narrative → kept the matrix, removed wrapper text

What's PRESERVED in every trimmed agent:

- YAML frontmatter (name, description, tools, model) — verbatim;
  no description shortening (the description is the auto-routing
  trigger and must stay rich for matcher quality)
- Workflow steps with concrete commands
- Hard constraints / "What you do NOT do" section
- Output format template (compressed but complete)
- Escalation tables to other agents
- Tools list (no tool removed from any agent)

Companion artifacts in same session:

- Issue #176: STZ market_cap = null on 2026-05-14 cron (XBRL fact
  extraction missing `shares_outstanding`, related to issue #10)
- Issue #177: 15 tickers |mos_pct| > 500% on 2026-05-14 cron (fair-
  price ensemble producing extreme estimates on growth/goodwill-
  heavy stocks: APP, AXON, CASY, CIEN, DD, DDOG, GE, HWM, ...)

Both were surfaced by the deterministic prefilter in
`stock-detail-auditor` during the PR #175 dry-run. They are filed
as separate issues; this PR intentionally does NOT include compute
/ valuation fixes and focuses purely on the agent-prompt-size lever.
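
For orientation, the two rule hits behind issues #176 and #177 can be sketched as a deterministic filter. This is not the auditor's actual prefilter, whose rules are not shown here. It assumes each ticker is a dict with `ticker`, `market_cap`, and `mos_pct` keys, and the rule labels are made up for illustration:

```python
MOS_LIMIT_PCT = 500.0  # threshold quoted in issue #177


def prefilter(rows: list[dict]) -> dict[str, list[str]]:
    """Map ticker -> list of rule hits; tickers with no hits are omitted."""
    flagged: dict[str, list[str]] = {}
    for row in rows:
        hits = []
        if row.get("market_cap") is None:
            hits.append("market_cap is null")
        mos = row.get("mos_pct")
        if mos is not None and abs(mos) > MOS_LIMIT_PCT:
            hits.append("|mos_pct| > 500%")
        if hits:
            flagged[row["ticker"]] = hits
    return flagged
```

Because the checks are plain comparisons, the same input always yields the same flagged set, which is what lets the LLM step run only on tickers that were flagged.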

Doc lockstep: CLAUDE.md §Phase status gains "Trim agent prompts in
flight (this PR)" entry. AGENTS.md §Claude-Code-specific tooling
gains a paragraph on the keep-it-tight principle for future agent
additions (trim target when an agent grows past ~150 lines).

No compute / schema / scoring / valuation / frontend code change.
@vercel

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| quantrank | Ready | Preview, Comment | May 21, 2026 1:05pm |

@dackclup dackclup marked this pull request as ready for review May 21, 2026 13:10
@dackclup dackclup merged commit 7ac4ac3 into main May 21, 2026
4 checks passed
@dackclup dackclup deleted the claude/trim-agent-prompts branch May 21, 2026 13:10
dackclup added a commit that referenced this pull request May 23, 2026
…caps to drain Sonnet-only pool (#219)

* chore(agents): reset sonnet sub-agent thoroughness — lift artificial caps to drain Sonnet-only pool

User observation: "Weekly · Sonnet only" pool on the Max plan sits
near 2% utilization while "Weekly · all models" moves normally,
meaning sonnet sub-agents are under-used. Root cause: artificial
work-bounding caps I added incidentally during the PR #178 trim
pass ("≤ 20 tickers", "terse", "do not exceed N file Reads")
treated sub-agent effort as a cost to minimize, when it is actually
the intended use of the separately-billed Sonnet-only pool.

Caps lifted:

- `.claude/agents/stock-detail-auditor.md` Step 3 — removed
  20-ticker hard cap. Agent now walks every prefilter-flagged
  ticker, dedups multi-rule hits, and may fetch 1-2 adjacent peers
  when a multi-ticker pattern is suspected. Added "DO NOT skip
  flagged tickers to keep the report short" hard constraint to
  codify the new principle. Frontmatter `description:` rewritten
  to remove the "≤ 20 flagged tickers" wording (kept the project's
  auto-routing cue intact otherwise).
- `.claude/agents/quantrank-reviewer.md` Output format — removed
  "Reply terse" instruction. Agent now lists every PASS / FAIL /
  WARN finding while walking Sections A-H.
- `.claude/agents/README.md` Flow 2 release ladder — release-
  captain's stock-detail-auditor lane no longer says "≤ 20 LLM
  verdicts"; says "thorough LLM verdicts for every flagged
  ticker". Roster row description updated to match.

Policy update:

- `CLAUDE.md` §Auto-routing policy §Spawn discipline — two new
  bullets:
  (a) "Don't gatekeep sub-agent effort" explaining the Max-plan
      dual-pool topology (sonnet sub-agents drain Weekly · Sonnet
      only; opus + main session drain Weekly · all models) and
      why bounding sub-agent output wastes paid budget
  (b) "Prefer delegation to sub-agents over inline main-session
      work" — when both options exist, route work through sonnet
      sub-agents so main-session tokens don't get spent doing what
      a thorough sonnet agent can do for free against a separate
      pool
- `AGENTS.md` §Claude-Code-specific tooling — rewrote the "prompts
  are kept tight" paragraph to make explicit that the trim target
  is BOILERPLATE (read-these-first lists, duplicated material from
  the canonical docs), NOT investigation depth or output length.
  Hard prompt-size constraint ≠ hard work-size cap.

Model assignments unchanged: 4 opus by design (incident-commander
+ release-captain + methodology-scientist + quantrank-reviewer) +
11 sonnet. A temporary swap of quantrank-reviewer +
methodology-scientist to sonnet was tested and reverted in this
same PR; it was a wrong interpretation of user intent. The fix is
to stop capping work, not to demote models.

Companion follow-up (not in this PR): per-session usage report
post-merge that confirms Sonnet-only pool actually moves more
after this lands. If not, re-investigate whether spawn frequency
needs to increase (separate PR).

No compute / schema / scoring / valuation / frontend code change.

* chore(agents): also lift spawn frequency for sonnet sub-agents on non-trivial edits

Stacks on the depth-only fix already in this PR. User followup:
loosening per-spawn caps alone isn't enough — if sonnet agents
don't spawn often enough, the Sonnet-only pool stays idle anyway.

Adds six new edit-trigger rows to CLAUDE.md §Auto-routing policy
that fire sonnet agents immediately on any non-trivial edit to
their domain (replacing the lean "Edits alone do NOT auto-spawn"
rule from PR #175):

| Edit | Spawns (sonnet) |
|---|---|
| Schema triple file | schema-sentinel |
| compute/scoring or compute/valuation | defense-layer-auditor |
| frontend/components or frontend/app | frontend-design-reviewer |
| .github/workflows or new dep or new env-var | security-reviewer |
| Prod code without same-PR test | test-engineer |
| Any of 7 top-level docs | docs-reviewer |
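
The path-based rows of the table can be read as a prefix lookup. A minimal sketch covering only those rows, assuming each entry names a directory prefix; the remaining rows (schema triple file, new dep / env-var, missing same-PR test, top-level docs) depend on project details the table does not spell out and are left out. `agents_for_edit` is a hypothetical name:

```python
# Path-prefix rows from the table above; trailing slashes are an assumption.
PREFIX_TO_AGENT = {
    "compute/scoring/": "defense-layer-auditor",
    "compute/valuation/": "defense-layer-auditor",
    "frontend/components/": "frontend-design-reviewer",
    "frontend/app/": "frontend-design-reviewer",
    ".github/workflows/": "security-reviewer",
}


def agents_for_edit(changed_paths: list[str]) -> list[str]:
    """Sonnet agents to spawn for a set of edited paths, deduplicated and sorted."""
    agents = {
        agent
        for path in changed_paths
        for prefix, agent in PREFIX_TO_AGENT.items()
        if path.startswith(prefix)
    }
    return sorted(agents)
```

Collecting into a set means an edit touching both compute/scoring and compute/valuation spawns defense-layer-auditor once, not twice.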

"Non-trivial" definition spelled out: > 5 added lines OR touches
non-comment code OR adds/removes a public symbol. Comment /
whitespace / single-line fixes do not trigger.
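
One possible reading of this definition as a predicate is sketched below. The precedence is an assumption: a public-symbol change always triggers, a single-line fix never does otherwise, and more than 5 added lines triggers even if they are all comments. The comment prefixes are also assumed, and public-symbol detection is left to the caller:

```python
COMMENT_PREFIXES = ("#", "//")  # assumption: the comment syntax is not given in the policy


def is_code(line: str) -> bool:
    """True for a line that is neither blank nor a comment."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(COMMENT_PREFIXES)


def is_non_trivial(added: list[str], removed: list[str],
                   public_symbol_changed: bool = False) -> bool:
    """Apply the "> 5 added lines OR code touched OR public symbol" rule."""
    if public_symbol_changed:
        return True
    # A one-line replacement shows up as one added plus one removed line.
    if max(len(added), len(removed)) <= 1:
        return False  # single-line fix
    if len(added) > 5:
        return True
    return any(is_code(line) for line in added + removed)
```

Other orderings of the three conditions and the exemptions are defensible; the policy text itself would decide which one applies.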

Opus agents (incident-commander · release-captain ·
methodology-scientist · quantrank-reviewer) keep the rare-fire
policy — they bill against the "Weekly · all models" pool, so
firing them more often does not help drain the underutilized
Sonnet-only pool.

The "ready to push" gate stays as a safety-net re-batch: opus
reviewer + phase-coordinator fire fresh; sonnet agents skip via
the existing 10-min dedup window if they already ran on the same
diff during the edit-trigger pass. So worst-case spawn count per
PR rises ~2-3× vs PR #175 baseline, but every extra spawn drains
the paid-for-but-currently-idle Sonnet-only pool.
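
The dedup behaviour described above can be sketched as a lookup keyed on agent and diff. This assumes "same diff" means byte-identical diff text and that the caller supplies the current time in seconds; `SpawnDeduper` is a hypothetical name and does not describe how the routing is actually implemented:

```python
import hashlib

DEDUP_WINDOW_SECONDS = 10 * 60  # the "10-min dedup window" from the text


class SpawnDeduper:
    """Skip re-spawning an agent on a diff it already saw within the window."""

    def __init__(self, window: float = DEDUP_WINDOW_SECONDS) -> None:
        self.window = window
        self._last_run: dict[tuple[str, str], float] = {}

    def should_spawn(self, agent: str, diff: str, now: float) -> bool:
        key = (agent, hashlib.sha256(diff.encode("utf-8")).hexdigest())
        last = self._last_run.get(key)
        if last is not None and now - last < self.window:
            return False  # already ran on this exact diff recently
        self._last_run[key] = now
        return True
```

An agent that ran on the edit-trigger pass is skipped at the "ready to push" gate only if the diff is unchanged and fewer than 10 minutes have passed; any further edit changes the hash and lets it fire again.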

AGENTS.md §Claude-Code-specific tooling — added paragraph
describing the spawn-frequency discipline so cross-tool readers
(Copilot / Cursor / Devin) understand the dual-pool topology.

§Phase status entry in CLAUDE.md updated to describe both lifts
(per-spawn cap AND spawn frequency) as one two-part change.

No compute / schema / scoring / valuation / frontend code change.

---------

Co-authored-by: Claude <noreply@anthropic.com>