Stop wasting 72% of your context window on MCP tool schemas.
MCP tools are powerful, but their schemas are expensive. Every tool your agent loads burns tokens before the user says a word:
- Perplexity dropped MCP, citing 72% of its context window consumed by tool definitions alone
- Moderate server configs commonly burn 55,000+ tokens before the first user message
- GitHub SEP-1576 documented rampant schema redundancy across MCP servers
- Each MCP server adds 200-600+ tokens per tool -- most of it filler
You're paying for "The absolute or relative path to the file you want to read. This should be a valid file path on the filesystem. Please ensure the file exists before attempting to read it" when "File path" would do.
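A back-of-the-envelope sketch of that waste, using a hypothetical parameter schema and a rough 4-characters-per-token heuristic:

```python
import json

# Hypothetical tool parameter schema with the kind of verbose
# description quoted above; the chars/4 heuristic is illustrative.
bloated = {
    "path": {
        "type": "string",
        "description": (
            "The absolute or relative path to the file you want to read. "
            "This should be a valid file path on the filesystem. "
            "Please ensure the file exists before attempting to read it."
        ),
    }
}
trimmed = {"path": {"type": "string", "description": "File path"}}

def rough_tokens(obj) -> int:
    return len(json.dumps(obj)) // 4  # ~4 chars per token in English JSON

print(rough_tokens(bloated), "vs", rough_tokens(trimmed))
```

The trimmed schema carries the same information to the model at a fraction of the cost.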
Three tools in one package, each useful alone, devastating together:
| Stage | What it does | Typical reduction |
|---|---|---|
| Audit | Find bloated schemas -- verbose descriptions, oversized enums, deep nesting | Identifies waste |
| Optimize | Compress schemas automatically -- strip filler, consolidate enums, flatten nesting | ~33% avg |
| Select | BM25-based dynamic tool selection -- only send relevant tools per query | 84-94% |
| Pipeline | Select + Optimize combined in one step | 94%+ |
```bash
pip install mcp-optimizer
```

```python
from mcp_optimizer import run_pipeline, load_tools

tools = load_tools("my_tools.json")
optimized, stats = run_pipeline(tools, "read a file from disk", top_k=5)
print(f"Reduced {stats['tokens_before']} -> {stats['tokens_after_optimize']} tokens "
      f"({stats['total_reduction_pct']}% reduction)")
```

Audit a tool file:

```bash
mcp-optimizer audit tools.json
```

```
MCP Tool Audit Report
=====================
Tools analyzed: 3
Total tokens: 1,292

ISSUES FOUND:
read_file          487 tokens  [verbose-description] [large-enum]
search_database    612 tokens  [verbose-description] [large-enum] [deep-nesting]
send_notification  193 tokens  OK
```
```bash
mcp-optimizer optimize tools.json -o optimized.json
```

```
Optimization Report
===================
read_file          487 -> 312 tokens (35.9% reduction)
search_database    612 -> 398 tokens (35.0% reduction)
send_notification  193 -> 164 tokens (15.0% reduction)

Total: 1,292 -> 874 tokens (32.4% reduction)
```
```bash
mcp-optimizer select tools.json "read a config file" --top-k 5
```

```
Query: "read a config file"
Selected 1 of 3 tools:
  #1 read_file  score=2.847
```
```bash
mcp-optimizer pipeline tools.json "send an alert to the ops channel"
```

```
Pipeline Report
===============
Original:  3 tools, 1,292 tokens
Selected:  1 tool,  193 tokens (85.1% reduction)
Optimized: 1 tool,  164 tokens (87.3% total reduction)

Selected tools:
  send_notification  score=3.412
```
Scan an MCP client config:

```bash
mcp-optimizer scan ~/.claude/mcp_config.json
```

From Python:

```python
from mcp_optimizer import (
    load_tools,
    audit_tools,
    optimize_tool,
    ToolSelector,
    run_pipeline,
)

tools = load_tools("tools.json")

# Audit for bloat
report = audit_tools(tools)

# Optimize a single tool
compressed = optimize_tool(tools[0], max_desc_tokens=40)

# Select relevant tools with BM25
selector = ToolSelector(tools)
results = selector.select("search the database for users", top_k=3)
for r in results:
    print(f"{r['rank']}. {r['tool']['name']} score={r['score']:.3f}")

# Full pipeline
optimized, stats = run_pipeline(tools, "send a push notification", top_k=5)
```

All commands also support `--json` for machine-readable output:
```bash
mcp-optimizer audit tools.json --json
mcp-optimizer pipeline tools.json "query" --json
```

**Audit.** Counts tokens per tool using tiktoken (with a built-in fallback estimator). Flags schemas that exceed configurable thresholds: description length, enum size, nesting depth, property count, and total token budget.
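A minimal sketch of what such threshold checks might look like; the thresholds and flag names below are assumptions mirroring the sample output above, not the library's actual code:

```python
# Illustrative audit pass over a single tool schema. Threshold values
# and flag names are invented for this sketch.
def audit(schema, max_desc_chars=160, max_enum=10, max_depth=3):
    flags = []
    if len(schema.get("description", "")) > max_desc_chars:
        flags.append("verbose-description")

    def walk(node, depth=0):
        # Recurse through the parameter schema, flagging oversized
        # enums and excessive nesting as we go.
        if isinstance(node, dict):
            if len(node.get("enum", [])) > max_enum:
                flags.append("large-enum")
            if depth > max_depth:
                flags.append("deep-nesting")
            for value in node.values():
                walk(value, depth + 1)
        elif isinstance(node, list):
            for item in node:
                walk(item, depth)

    walk(schema.get("inputSchema", {}))
    return sorted(set(flags))
```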
**Optimize.** Strips filler words and redundant phrasing from descriptions. Consolidates oversized enums into type hints. Flattens unnecessary nesting. Removes non-functional fields like examples and default descriptions. Preserves all required fields and structural semantics.
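As a toy illustration of the description pass (the filler list and truncation rule are invented for this sketch, not taken from the library):

```python
import re

# Invented filler phrases for demonstration; a real pass would use a
# much larger curated list.
FILLER = re.compile(
    r"\b(please |kindly |simply |the absolute or relative |a valid )",
    re.IGNORECASE,
)

def compress(desc: str, max_chars: int = 80) -> str:
    # Strip filler, collapse whitespace, keep only the first sentence.
    desc = FILLER.sub("", desc)
    desc = re.sub(r"\s+", " ", desc).strip()
    first = desc.split(". ")[0].rstrip(".")
    return first[:max_chars]
```

Applied to the example from the top of this README, this kind of pass collapses three sentences of boilerplate into a single short phrase.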
**Select.** Pure Python BM25Okapi implementation. Builds an inverted index over tool names, descriptions, and parameter names. Scores every tool against natural language queries and returns the top-k matches. Sub-millisecond on typical tool sets.
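The scoring itself fits in a few dozen lines of standard-library Python. A minimal BM25Okapi sketch, where the whitespace tokenizer and the idea of indexing each tool's text as one document are illustrative assumptions rather than mcp-optimizer's actual implementation:

```python
import math
from collections import Counter

class BM25:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [d.lower().split() for d in docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tfs = [Counter(d) for d in self.docs]          # per-doc term frequencies
        df = Counter(t for d in self.docs for t in set(d))  # document frequencies
        n = len(self.docs)
        self.idf = {t: math.log((n - c + 0.5) / (c + 0.5) + 1) for t, c in df.items()}

    def score(self, query, i):
        # Standard Okapi BM25: idf-weighted, saturating term frequency,
        # normalized by document length relative to the average.
        s, dl = 0.0, len(self.docs[i])
        for t in query.lower().split():
            f = self.tfs[i].get(t, 0)
            if not f:
                continue
            s += self.idf[t] * f * (self.k1 + 1) / (
                f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            )
        return s

    def top_k(self, query, k=3):
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]
```

Treating each tool's name, description, and parameter names as one document is what lets a query like "read a config file" surface `read_file` ahead of unrelated tools.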
**Pipeline.** Chains selection and optimization: select the top-k relevant tools via BM25, then compress their schemas. Reports token savings at each stage so you see exactly where the reduction comes from.
| Dataset | Tools | Before | After | Reduction |
|---|---|---|---|---|
| Real MCP servers | 56 | 10,456 tokens | ~630 tokens | 94.0% |
| Example tools | 3 | 1,292 tokens | 290 tokens | 77.6% |
The 94% figure comes from the full pipeline (select + optimize) on a realistic 56-tool corpus spanning filesystem, database, git, Docker, Kubernetes, and notification servers.
mcp-optimizer has zero required dependencies. It runs on the Python standard library alone.
- tiktoken is an optional extra for precise GPT-tokenizer counts. Without it, a built-in byte-level estimator is used (accurate within ~5%).
- BM25 is implemented from scratch in pure Python -- no numpy, no scipy, no sklearn.
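A byte-level estimator in that spirit might look like this; the actual built-in estimator's heuristics aren't shown in this README, so this is an assumption. GPT-style BPE averages roughly four bytes per token on English text:

```python
# Illustrative fallback token estimator: UTF-8 byte length divided by
# four, with a floor of one token for non-empty-ish inputs.
def estimate_tokens(text: str) -> int:
    return max(1, len(text.encode("utf-8")) // 4)
```

With tiktoken installed, exact tokenizer counts are used instead of a heuristic like this.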
```bash
# Minimal install
pip install mcp-optimizer

# With precise token counting
pip install "mcp-optimizer[tiktoken]"
```

License: MIT