feat(#82): cut MCP core-profile schema overhead by 30% (#105)
Tightens the 6-tool core profile that ships by default with the
graphify-ts MCP server, dropping the JSON payload emitted on
`tools/list` from 4,271 bytes (~1,068 tokens) to 2,982 bytes
(~746 tokens): a 30% reduction, exactly the issue's acceptance
target.
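The byte-to-token figures above follow the common ~4-bytes-per-token heuristic. A quick sketch of that conversion (an approximation, not the tokenizer's exact count):

```typescript
// Rough token estimate from payload size, using the ~4 bytes/token
// heuristic behind the figures above (an approximation, not an exact
// tokenizer count).
function estimateTokens(bytes: number): number {
  return Math.round(bytes / 4)
}

console.log(estimateTokens(4271)) // → 1068 (pre-change payload)
console.log(estimateTokens(2982)) // → 746 (post-change payload)
```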
Per-tool measurements after the change:

* `graph_stats`: 119 bytes (unchanged; already minimal)
* `community_overview`: 197 bytes (was 263)
* `call_chain`: 511 bytes (was 599)
* `pr_impact`: 519 bytes (was 828)
* `impact`: 539 bytes (was 887)
* `retrieve`: 1,080 bytes (was 1,558)
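Numbers like these can be reproduced with a measurement helper along the following lines (a sketch; the sample definition is a placeholder, not the real `graph_stats` schema):

```typescript
// Measure one tool's serialized schema size in UTF-8 bytes, mirroring
// how the per-tool numbers above were gathered. The sample tool is a
// placeholder, not the real graph_stats definition.
interface ToolDefinition {
  name: string
  description: string
  inputSchema: {
    type: 'object'
    properties: Record<string, unknown>
    required?: string[]
  }
}

function toolBytes(tool: ToolDefinition): number {
  // Buffer.byteLength counts UTF-8 bytes; String#length would count
  // UTF-16 code units and undercount non-ASCII descriptions.
  return Buffer.byteLength(JSON.stringify(tool), 'utf8')
}

const sample: ToolDefinition = {
  name: 'graph_stats',
  description: 'Summary counts for the indexed graph.',
  inputSchema: { type: 'object', properties: {} },
}

console.log(toolBytes(sample))
```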
Where the savings came from:
* Tightened every tool description: dropped trailing "Use this..."
marketing sentences, replaced "Analyze the blast radius of changing
a node" with "Blast-radius for a node:", etc. Preserves
discoverability — every tool still describes what it does and what
it returns; only the editorializing is gone.
* De-advertised the deprecated `compact` flag on impact, pr_impact,
  and retrieve. The handler still validates and accepts `compact`
  on receipt for backward compat with existing clients (see the
  runtime tolerance check in tools.ts), but it no longer counts
  against the cold-start cache_creation_input_tokens budget.
* Dropped the redundant "Optional:" prefix on parameter descriptions;
  parameters not in `required` are optional by JSON Schema definition.
* Tightened the verbose-flag descriptions: "Optional: return the
legacy verbose X payload instead of the default compact response."
becomes "Return verbose payload (default: compact)."
* Surfaced the `retrieval_level` argument the retrieve tool already
  accepted (#75) so the override is now discoverable in tools/list. Cost:
  50 bytes; benefit: closes the discoverability gap in #75-iv.
CI/test hardening:
* tests/unit/mcp-schema-budget.test.ts pins the JSON payload size
of both the core profile (ceiling 3,100 bytes) and the full
profile (ceiling 12,000 bytes). A future tool addition or
description regression that pushes either profile over the
ceiling fails CI with a clear message.
* tests/unit/stdio-pr-impact.test.ts updated to assert that
  `compact` is NOT advertised in the schema, locking in the
  intentional removal.
* tests/unit/why-graphify-doc.test.ts gains a doc-honesty pin on
the new measured numbers in README.md (~750 tokens, ~3,000 bytes,
30% reduction). Future PRs that change the README's claim must
update this regex in the same PR.
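The budget test's failure mode can be sketched as follows, using the ceilings from this PR (the tool list here is placeholder data, not the exported definitions the real test imports):

```typescript
// CI budget sketch: fail loudly when a profile's tools/list payload
// exceeds its byte ceiling. Ceilings match this PR; the tool list is
// placeholder data, not the real profile.
function payloadBytes(tools: ReadonlyArray<unknown>): number {
  return Buffer.byteLength(JSON.stringify({ tools }), 'utf8')
}

function assertUnderBudget(
  profile: string,
  tools: ReadonlyArray<unknown>,
  ceiling: number,
): void {
  const bytes = payloadBytes(tools)
  if (bytes > ceiling) {
    throw new Error(
      `${profile} profile payload is ${bytes} bytes, ${bytes - ceiling} over the ${ceiling}-byte ceiling`,
    )
  }
}

const placeholderCore = [{ name: 'graph_stats', description: 'Summary counts.' }]
assertUnderBudget('core', placeholderCore, 3_100)
assertUnderBudget('full', placeholderCore, 12_000)
```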
README updates:
* "Honest disclosure" section: rewrites the cold-start premium
paragraph with the actual measured post-#82 numbers and an
explicit pointer to the new regression test.
* "Honest summary" callout: replaces the stale ~13% premium phrase
with the measured token-overhead figure and a reference to the
pull request.
* Roadmap: flips the "🔜 Tighter cold-start MCP overhead" item to
"✅" with the measured 30% drop.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/unit/stdio-pr-impact.test.ts (1)
Lines 335-339: ⚡ Quick win: broaden the de-advertised `compact` assertions to all affected tools. Line 339 currently locks this in only for `pr_impact`. Since #82 de-advertised `compact` on multiple schemas, add matching assertions for `impact` and `retrieve` to prevent partial regressions.

Suggested test expansion:
```diff
-  const tool = (tools?.result as { tools: Array<{ name: string; inputSchema: { properties: Record<string, unknown> } }> }).tools.find(
-    (entry) => entry.name === 'pr_impact',
-  )
+  const toolList = (tools?.result as { tools: Array<{ name: string; inputSchema: { properties: Record<string, unknown> } }> }).tools
+  const byName = new Map(toolList.map((entry) => [entry.name, entry]))

-  expect(tool?.inputSchema.properties.budget).toEqual(expect.objectContaining({ type: 'number' }))
-  expect(tool?.inputSchema.properties.verbose).toEqual(expect.objectContaining({ type: 'boolean' }))
+  const tool = byName.get('pr_impact')
+  expect(tool?.inputSchema.properties.budget).toEqual(expect.objectContaining({ type: 'number' }))
+  expect(tool?.inputSchema.properties.verbose).toEqual(expect.objectContaining({ type: 'boolean' }))
   expect(tool?.inputSchema.properties.compact).toBeUndefined()
+  expect(byName.get('impact')?.inputSchema.properties.compact).toBeUndefined()
+  expect(byName.get('retrieve')?.inputSchema.properties.compact).toBeUndefined()
```
📒 Files selected for processing (5)

* README.md
* src/runtime/stdio/definitions.ts
* tests/unit/mcp-schema-budget.test.ts
* tests/unit/stdio-pr-impact.test.ts
* tests/unit/why-graphify-doc.test.ts
```typescript
function payloadBytes(tools: ReadonlyArray<unknown>): number {
  return JSON.stringify({ tools }).length
}
```
Use Buffer.byteLength() to correctly measure UTF-8 bytes instead of string length.
Line 23 uses .length, which counts UTF-16 code units, not UTF-8 bytes. While current tool descriptions are ASCII-only (so the difference is zero today), the byte ceilings have tight margins (~118 bytes for core, ~460 for full). Adding non-ASCII characters to tool descriptions in the future would silently undercount bytes and allow a regression to pass. Use Buffer.byteLength() for correct UTF-8 byte measurement:
Proposed fix:

```diff
 function payloadBytes(tools: ReadonlyArray<unknown>): number {
-  return JSON.stringify({ tools }).length
+  return Buffer.byteLength(JSON.stringify({ tools }), 'utf8')
 }
```
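The gap this fix closes is easy to demonstrate; one non-ASCII character is enough:

```typescript
// '≈' (U+2248) is one UTF-16 code unit but three UTF-8 bytes, so
// String#length undercounts any payload that contains it.
const payload = JSON.stringify({ description: '≈ 3,000 bytes' })

console.log(payload.length)                     // → 31 (UTF-16 code units)
console.log(Buffer.byteLength(payload, 'utf8')) // → 33 (UTF-8 bytes)
```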
Closes #82.
CI hardening (the issue's third deliverable)
`tests/unit/mcp-schema-budget.test.ts` pins the JSON payload size of both profiles:
A future tool addition or description regression that pushes either profile over the ceiling fails CI with a clear message identifying which profile and by how much. The test also asserts that the core profile contains exactly the 6 documented tools and that every tool has a non-empty description.
`tests/unit/stdio-pr-impact.test.ts` updated to assert `compact` is NOT advertised in the schema, locking in the intentional removal.
Doc-honesty pins (the issue's fourth deliverable)
`tests/unit/why-graphify-doc.test.ts` gains a regex pin requiring the README's "Honest disclosure" section to contain:
A future PR that changes any of these numbers must update both the README and the regex in the same commit.
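As a sketch, such a pin can be as simple as a regex over the README text (the pattern and snippet below are illustrative, not copied from the real test):

```typescript
// Doc-honesty pin sketch: the README's measured claim must keep matching
// this regex, so editing the numbers without updating the test fails CI.
// Pattern and snippet are illustrative, not the real test's contents.
const claimPattern = /~750 tokens.*~3,000 bytes.*30% reduction/s

const readmeSnippet =
  'Cold-start schema overhead is now ~750 tokens (~3,000 bytes), a 30% reduction.'

console.log(claimPattern.test(readmeSnippet)) // → true
```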
What's deferred
The shared `$ref` schema-fragment idea from the issue body is intentionally NOT implemented here: Claude's MCP client doesn't expand `$ref` in tool input schemas at this time, so the trade-off (less readable definitions, no token savings) didn't pencil out. If MCP transport-level dedup ever lands client-side, that's a follow-up.
This also refreshes the README's "Honest disclosure" section linked from the issue. The README's old "~13%" cold-start premium claim was tied to the pre-#82 ~5K-token overhead; the next benchmark refresh will re-measure the new percentage against the no-graph baseline.