feat(mcp): size-adaptive output budget for codegraph_explore (#185) by colbymchenry · Pull Request #187 · colbymchenry/codegraph

colbymchenry · 2026-05-19T21:21:49Z

Summary

- Replaces the fixed 35KB cap on codegraph_explore output with a size-adaptive budget keyed off indexed file count (closes #185).
- Adds a per-file char cap with density-scored cluster selection, fixing the pathological case where a file like Alamofire's Session.swift collapsed into one whole-file dump.
- Caps the per-file header symbol list and the per-kind relationship list (both were leaking multi-KB lists on small projects with adjacent symbols).
- Gates off the "Additional relevant files" list, the completeness signal, and the explore-budget reminder on small projects, where they're pure overhead.
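
For reviewers who want a mental model of the per-file cap, here is a minimal sketch of one way gap-based clustering and density scoring could combine. It is not the code in this PR: the `SymbolSpan` and `Cluster` types, the `selectClusters` name, the "symbols per line" density score, and the rule that a merge is refused once it would exceed the cap are all assumptions made for illustration. Only `gapThreshold` and `maxCharsPerFile` come from the tier table.

```typescript
// Hypothetical types for illustration; not taken from the codegraph source.
interface SymbolSpan {
  name: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
}

interface Cluster {
  startLine: number;
  endLine: number;
  symbols: string[];
}

// Character cost of a line range, counting one newline per line.
function rangeCost(lines: string[], startLine: number, endLine: number): number {
  return lines
    .slice(startLine - 1, endLine)
    .reduce((total, line) => total + line.length + 1, 0);
}

function selectClusters(
  lines: string[],
  spans: SymbolSpan[],
  gapThreshold: number,
  maxCharsPerFile: number
): Cluster[] {
  const sorted = [...spans].sort((a, b) => a.startLine - b.startLine);
  const clusters: Cluster[] = [];

  for (const span of sorted) {
    const last = clusters[clusters.length - 1];
    // Merge only if the symbol is close to the previous cluster AND the merged
    // range still fits the per-file cap (assumed rule; it stops a run of
    // adjacent symbols from growing into a whole-file cluster).
    if (
      last !== undefined &&
      span.startLine - last.endLine <= gapThreshold &&
      rangeCost(lines, last.startLine, Math.max(last.endLine, span.endLine)) <= maxCharsPerFile
    ) {
      last.endLine = Math.max(last.endLine, span.endLine);
      last.symbols.push(span.name);
    } else {
      clusters.push({ startLine: span.startLine, endLine: span.endLine, symbols: [span.name] });
    }
  }

  // Assumed density score: matched symbols per line of source in the cluster.
  const density = (c: Cluster): number => c.symbols.length / (c.endLine - c.startLine + 1);
  const ranked = [...clusters].sort((a, b) => density(b) - density(a));

  // Greedily keep the densest clusters that still fit the remaining budget.
  const chosen: Cluster[] = [];
  let used = 0;
  for (const cluster of ranked) {
    const cost = rangeCost(lines, cluster.startLine, cluster.endLine);
    if (used + cost > maxCharsPerFile) continue;
    chosen.push(cluster);
    used += cost;
  }

  // Emit in source order so the output reads top to bottom.
  return chosen.sort((a, b) => a.startLine - b.startLine);
}
```

In this sketch a single symbol larger than the cap is skipped outright; a real implementation would more likely truncate it or fall back to its signature.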

Why

#185 reported codegraph_explore using ~2x the tokens of native grep+Read on a ~100-file Next.js project. The README's own Alamofire benchmark (102 files) shows the same pattern. Root cause: the fixed 35KB cap is sized for thousands-of-files codebases where the agent's discovery cost (grep + find + many Reads) earns the rich output; on small projects it's a tax.

Measured impact

Measured against the same repos used in the README benchmark, averaged over 3 representative queries each:

| Repo | Files | Before | After | Δ |
|---|---|---|---|---|
| Alamofire | 104 | 11,011 tok | 4,183 tok | -62% |
| Excalidraw | 628 | 10,256 tok | 6,680 tok | -35% |
| VS Code | 10,427 | 9,250 tok | 7,916 tok | -14% |
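
The Δ column can be re-derived from the Before and After token counts. The snippet below is only a convenience check; `percentChange` is a hypothetical helper, not part of the codebase.

```typescript
// Token counts copied from the table above.
const measured = [
  { repo: "Alamofire", before: 11011, after: 4183 },
  { repo: "Excalidraw", before: 10256, after: 6680 },
  { repo: "VS Code", before: 9250, after: 7916 },
];

// Relative change in percent, rounded to the nearest whole number.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

for (const row of measured) {
  console.log(`${row.repo}: ${percentChange(row.before, row.after)}%`);
}
```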

Agent-trust floor preserved: an Explore subagent on Alamofire still answered the full Session.request() → URLSession trace from the new lean output, using 1 explore call + 3 Reads for line-level detail (normal pattern for cross-file tracing, not a fallback to native discovery).

Tier breakpoints

The breakpoints match the existing getExploreBudget tiers, so a project sits in the same tier across both knobs:

| Files | maxOutputChars | defaultMaxFiles | maxCharsPerFile | gapThreshold | Meta-text |
|---|---|---|---|---|---|
| <500 | 18,000 | 5 | 3,800 | 8 | Relationships only |
| <5,000 | 28,000 | 9 | 5,000 | 12 | All |
| <15,000 | 35,000 | 12 | 7,000 | 15 | All |
| 15,000+ | 38,000 | 14 | 7,000 | 15 | All |
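
As a rough illustration of how the table maps to code, the lookup could be sketched as below. The function name `getExploreOutputBudget`, the `ExploreOutputBudget` interface, and the `includeMetaText` flag are assumptions for this example and may not match the names in the PR. The numbers come from the table, with each upper bound treated as exclusive.

```typescript
// Hypothetical shape; field names follow the table's column headers.
interface ExploreOutputBudget {
  maxOutputChars: number;
  defaultMaxFiles: number;
  maxCharsPerFile: number;
  gapThreshold: number;
  // false corresponds to "Relationships only": the additional-files list,
  // completeness signal, and explore-budget reminder are skipped.
  includeMetaText: boolean;
}

// Upper bounds are exclusive, matching the "<500", "<5,000", "<15,000" rows.
const TIERS: ReadonlyArray<{ below: number; budget: ExploreOutputBudget }> = [
  {
    below: 500,
    budget: { maxOutputChars: 18_000, defaultMaxFiles: 5, maxCharsPerFile: 3_800, gapThreshold: 8, includeMetaText: false },
  },
  {
    below: 5_000,
    budget: { maxOutputChars: 28_000, defaultMaxFiles: 9, maxCharsPerFile: 5_000, gapThreshold: 12, includeMetaText: true },
  },
  {
    below: 15_000,
    budget: { maxOutputChars: 35_000, defaultMaxFiles: 12, maxCharsPerFile: 7_000, gapThreshold: 15, includeMetaText: true },
  },
  {
    below: Infinity,
    budget: { maxOutputChars: 38_000, defaultMaxFiles: 14, maxCharsPerFile: 7_000, gapThreshold: 15, includeMetaText: true },
  },
];

function getExploreOutputBudget(fileCount: number): ExploreOutputBudget {
  const tier = TIERS.find((t) => fileCount < t.below);
  // The last tier is unbounded, so it also serves as the fallback.
  return (tier ?? TIERS[TIERS.length - 1]!).budget;
}

// Example: the three benchmark repos land in the first three tiers.
console.log(getExploreOutputBudget(104).maxOutputChars);
console.log(getExploreOutputBudget(628).maxOutputChars);
console.log(getExploreOutputBudget(10_427).maxOutputChars);
```

Keeping the tiers in a single ordered table makes the off-by-one boundaries (499 vs 500, 4,999 vs 5,000, 14,999 vs 15,000) easy to test directly.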

Test plan

- 13 new tests in __tests__/explore-output-budget.test.ts covering tier breakpoints, off-by-one boundaries, and end-to-end budget enforcement against a synthetic small project
- Full suite: 612/612 tests pass
- Agent-trust validation: spawned Explore subagent against Alamofire (104 files); full Session→URLSession trace returned correctly
- Reviewer: confirm CHANGELOG entry text reads well before cutting the next release

Output is now scaled to indexed file count. Small projects (<500 files) cap at ~18KB and skip the "Additional relevant files" / completeness / explore-budget reminders that earn their keep on larger codebases; medium (<5,000) caps at ~28KB; large (<15,000) keeps the historical ~35KB; very large goes up to ~38KB. A per-file char cap also prevents a single file with many adjacent symbols from collapsing into one whole-file dump (the pathological Alamofire `Session.swift` case reported in #185), and a per-file symbol-list cap stops the `#### path — sym(kind), ...` header from leaking multi-KB lists when many adjacent symbols cluster together. Measured against the README's benchmark repos: Alamofire (~100 files) ~62% smaller per call, Excalidraw (~600 files) ~35%, VS Code (~10k files) ~14%. Agent-trust floor preserved: Relationships, scored cluster selection, and structured-source output are all retained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Folds all changes since 0.7.10 into 0.7.12 (0.7.11 was unpublished from npm): size-adaptive codegraph_explore output budget (#185/#187), line numbers in explore source sections (#188), explore-first tool guidance (#191), language-neutral source-omission markers, and Kotlin/Swift test-file detection (#191).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

colbymchenry mentioned this pull request May 19, 2026

I'm missing something? #185

Closed

colbymchenry merged commit 93e53e7 into main May 19, 2026

colbymchenry deleted the feat/adaptive-explore-budget branch May 19, 2026 21:23

colbymchenry mentioned this pull request May 19, 2026

feat(mcp): line numbers in explore output + per-file cluster fixes #188

Merged


colbymchenry merged 1 commit into main from feat/adaptive-explore-budget
