
perf(mcp): trim default max_results to match real task needs#476

Merged: justrach merged 1 commit into main from perf-lean-defaults on May 20, 2026

Conversation

@justrach
Owner

Summary

The MCP defaults for codedb_search and codedb_callers were 50 — generous, but bench data from the code-search-shootout eval (16 tasks × 4 corpora) showed the median answer needed fewer than 10 results. The extra 40 were paid in tokens on every call without contributing to answer quality.

  • codedb_search: default max_results 50 → 20 (description: "default: 20, raise to 50 for broad surveys")
  • codedb_callers: default max_results 50 → 30 (description: "default: 30, raise for hot symbols")

Agents that genuinely need a broad survey can still pass max_results explicitly. The lower default just stops paying the survey cost on every query.
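
As a rough illustration of the behaviour described above, the default only applies when the caller omits max_results. This is a Python sketch, not the codedb implementation (which is built with Zig); the names `DEFAULT_MAX_RESULTS`, `resolve_max_results`, and `truncate` are hypothetical and exist only for this example.

```python
# Hypothetical defaults table mirroring the values in this PR.
DEFAULT_MAX_RESULTS = {
    "codedb_search": 20,   # was 50
    "codedb_callers": 30,  # was 50
}


def resolve_max_results(tool: str, arguments: dict) -> int:
    """Return the caller's explicit max_results, else the tool's default."""
    explicit = arguments.get("max_results")
    if explicit is not None:
        return int(explicit)
    return DEFAULT_MAX_RESULTS[tool]


def truncate(hits: list, tool: str, arguments: dict) -> list:
    """Cap a result list at the resolved limit."""
    return hits[: resolve_max_results(tool, arguments)]


# 60 fake hits stand in for a query that matches more than either limit.
hits = [f"hit-{i}" for i in range(60)]
print(len(truncate(hits, "codedb_search", {})))                   # 20
print(len(truncate(hits, "codedb_search", {"max_results": 50})))  # 50
```

An agent doing a broad survey pays for the larger result set only on the calls where it asks for it.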

Measured impact (react corpus, via MCP envelope)

| query | before | after | Δ |
| --- | --- | --- | --- |
| search 'useEffect' | 932 tok | 645 tok | −31% |
| search 'scheduleCallback' | (≈1100) | 584 tok | similar trim |
| search 'CompleteWork' | (≈350) | 287 tok | small (already <20 hits) |
| callers 'useState' | (≈1500) | 980 tok | −35% |

At the bench average of 6.5 codedb tool calls per task, that's ≈1.8k tokens shaved per task. This narrows the per-task gap with codegraph (6.9k tokens avg vs codedb's 16.7k) without touching codedb's quality (4.70/5) or wall-time (16.6s) leadership.
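
The arithmetic behind these figures can be checked with a short sketch. It assumes the per-call saving is the measured 'useEffect' difference (932 → 645 tokens); the PR does not state how the ≈1.8k figure was derived, so this is one plausible reading. The ≈1500 "before" value for callers is approximate in the source.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the PR description.
use_effect_before, use_effect_after = 932, 645
calls_per_task = 6.5

# Assumption: every call saves as much as the 'useEffect' search did.
saved_per_call = use_effect_before - use_effect_after  # 287 tokens
saved_per_task = saved_per_call * calls_per_task        # 1865.5 tokens

print(f"useEffect delta: {pct_change(932, 645):.0f}%")  # -31%
print(f"callers delta:   {pct_change(1500, 980):.0f}%") # -35%
print(f"saved per task:  {saved_per_task} tokens")

# Gap between the quoted per-task averages (codedb 16.7k vs codegraph 6.9k).
gap = 16_700 - 6_900
print(f"saving as share of gap: {saved_per_task / gap:.0%}")
```

Under that assumption the saving is a bit under a fifth of the quoted gap, which fits the commit message's wording that the change closes part of it.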

Test plan

  • zig build (ReleaseFast) passes
  • zig build test — 486/487 pass (1 failing test is pre-existing on main: issue-44)
  • Searches that previously returned exactly 50 now return 20 by default; explicit max_results=50 still returns 50
  • No regressions in codedb_find, codedb_word, codedb_outline, codedb_status

…earch, 50→30 callers)

The defaults were set when the MCP envelope was newer and we didn't know
what agents actually consumed. Bench eval data (16 tasks × 4 corpora
across react/regex/flask/gin) showed the median answer needed under 10
results — the extra 40 were paid in tokens on every call without
contributing to quality.

Measured impact on react via the MCP envelope:
- search 'useEffect':       932 → 645 tokens (-31%)
- search 'scheduleCallback':       → 584 tokens
- search 'CompleteWork':           → 287 tokens
- callers 'useState':              → 980 tokens

Agents that want broader surveys can still pass max_results explicitly.
Description text updated to surface the lever ("default: 20, raise to 50
for broad surveys").

This closes part of the gap with codegraph on per-call tokens while
keeping codedb's quality + wall-time wins intact.
@github-actions

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
| --- | --- | --- | --- | --- | --- |
| codedb_bundle | 313384 | 318175 | +1.53% | +4791 | OK |
| codedb_changes | 38768 | 30802 | -20.55% | -7966 | OK |
| codedb_deps | 5690 | 4939 | -13.20% | -751 | OK |
| codedb_edit | 15590 | 6220 | -60.10% | -9370 | OK |
| codedb_find | 42082 | 40675 | -3.34% | -1407 | OK |
| codedb_hot | 54852 | 57331 | +4.52% | +2479 | OK |
| codedb_outline | 192480 | 201155 | +4.51% | +8675 | OK |
| codedb_read | 82786 | 65129 | -21.33% | -17657 | OK |
| codedb_search | 110681 | 119033 | +7.55% | +8352 | OK |
| codedb_snapshot | 219173 | 226136 | +3.18% | +6963 | OK |
| codedb_status | 20272 | 10709 | -47.17% | -9563 | OK |
| codedb_symbol | 37823 | 42231 | +11.65% | +4408 | NOISE |
| codedb_tree | 43739 | 46417 | +6.12% | +2678 | OK |
| codedb_word | 48182 | 50985 | +5.82% | +2803 | OK |
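
The status column can be reproduced with a small classifier. This is a sketch inferred from the report text and the rows above, not the CI script itself: the "FAIL" label, the treatment of speedups as always OK (suggested by codedb_edit at -60.10% being OK), and the exact boundary comparisons are assumptions.

```python
PCT_THRESHOLD = 10.0       # percent, from the report header
ABS_THRESHOLD_NS = 50_000  # nanoseconds, from the report header


def classify(base_ns: int, head_ns: int) -> str:
    """Classify one benchmark row as OK, NOISE, or FAIL (assumed label)."""
    delta_ns = head_ns - base_ns
    delta_pct = delta_ns / base_ns * 100.0
    if delta_pct <= PCT_THRESHOLD:
        # Speedups and small slowdowns pass outright.
        return "OK"
    if delta_ns <= ABS_THRESHOLD_NS:
        # Percentage threshold exceeded, but too few nanoseconds to matter.
        return "NOISE"
    return "FAIL"


print(classify(37823, 42231))      # codedb_symbol row: NOISE
print(classify(110681, 119033))    # codedb_search row: OK
print(classify(15590, 6220))       # codedb_edit row (speedup): OK
print(classify(100_000, 200_000))  # made-up regression: FAIL
```

Requiring both thresholds keeps very fast tools from failing CI on a few microseconds of jitter, which is what the codedb_symbol row shows.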

@justrach justrach merged commit 45e48fd into main May 20, 2026
1 check passed
@justrach justrach deleted the perf-lean-defaults branch May 20, 2026 19:56