Cut input + output tokens across Claude Code, Cursor, Windsurf, and Codex.
Every turn, AI coding tools ship tool schemas — the JSON descriptions of every tool (Read, Edit, MCP tools, etc.) — to the model. With a handful of MCP servers enabled, this can be 20–40K tokens per turn, before you type anything. Across 20 turns that's 400–800K tokens paid just to describe tools you may not use.
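As a quick sanity check, the arithmetic above (illustrative numbers only):

```python
# Back-of-envelope: schema overhead alone, before you type anything.
per_turn_low, per_turn_high = 20_000, 40_000  # tool-schema tokens shipped each turn
turns = 20

print(per_turn_low * turns, per_turn_high * turns)  # → 400000 800000
```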
agent-lean auto-detects which AI tools you have installed and manages MCP profiles, skill installation, and token measurement across all of them uniformly.
| Tool | Config path | MCP profiles | Memory measure | Skill install |
|---|---|---|---|---|
| Claude Code | `~/.claude.json` | ✅ | ✅ | ✅ |
| Cursor | `~/.cursor/mcp.json` | ✅ | ✅ | ✅ |
| Windsurf | `~/.codeium/windsurf/mcp_config.json` | ✅ | ✅ | ✅ |
| Codex | `~/.codex/config.toml` | ✅ | ✅ | ✅ |
Roadmap: Cline, GitHub Copilot, Aider, Antigravity.
agent-lean gives you five concrete levers:
- MCP profiles — swap which MCP servers are active by task. The big input-side win.
- Scoped agents — tool-limited subagents so tool-heavy work stays out of your main context.
- MCP measurement — real per-turn tool-schema cost (`--exact` spawns MCPs and measures real bytes).
- Memory measurement — `CLAUDE.md` + user memory per-turn cost (`--memory`), with special support for codebase-memory.
- Output-side skills — one-command install of curated MIT-licensed skills like caveman (~65% fewer output tokens).
Pairs naturally with codebase-memory: one writes your per-turn context (rules files so the AI skips re-scanning); the other measures what that context is costing you.
Two ways to use it:
```
# From inside Claude Code:
/plugin install RagavRida/agent-lean
```

Gives you slash commands inside the session:

- `/agent-lean:measure` — combined MCP + memory token measurement
- `/agent-lean:profile` — list/switch MCP profiles
- `/agent-lean:install` — install curated output-savings skills
- `/agent-lean:optimize` — full advisory flow (measure → recommend → apply with consent)
Plus the scoped agents (explorer, editor, researcher, git-worker) ship with the plugin.
```sh
npm install -g agent-lean
# or run without install:
npx agent-lean measure
```

```sh
# 1. See how much MCP schemas cost across ALL detected tools
agent-lean measure

# 2. See how much your CLAUDE.md + user memory cost
agent-lean measure --memory

# 3. See available lean profiles
agent-lean profile list

# 4. Apply a profile to every detected tool (auto-backs-up each)
agent-lean profile use minimal

# 5. Or apply to one specific tool
agent-lean profile use git-only --tool cursor

# 6. Add output-side savings (installs to every detected tool)
agent-lean install caveman

# 7. Restart the affected tool(s). Done.
```

| Profile | MCP servers | Est. MCP tokens |
|---|---|---|
| `minimal` | none | 0 |
| `git-only` | github | ~12K |
| `research` | fetch, brave-search | ~6K |
| `full` | github, fetch, filesystem, slack, linear | ~36K |
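To make the mechanism concrete: after `agent-lean profile use git-only`, the active server set in a tool's config might look something like this — a hypothetical sketch assuming Claude Code's `mcpServers` layout in `~/.claude.json` (the server command and args here are illustrative):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```

Other profiles would swap in their own `mcpServers` entries; `minimal` would leave the object empty.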
The `agents/` directory contains drop-in Claude Code agent definitions. Copy them into your project's `.claude/agents/` directory (or your user-level `~/.claude/agents/`) to use them.
| Agent | Tools scoped to | Use for |
|---|---|---|
| explorer | Read, Grep, Glob | Read-only codebase exploration |
| editor | Read, Edit, Write, Grep | Targeted edits (no exploration) |
| researcher | WebFetch, WebSearch, Read | External docs / research |
| git-worker | Bash, Read | Git-only workflows |
Why this helps: when Claude invokes an agent, only that agent's tool schemas load into its own context. Tool-heavy work doesn't bloat your main conversation.
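For reference, a minimal sketch of what one of these definitions might contain — assuming Claude Code's Markdown-with-YAML-frontmatter agent format; the exact fields in `agents/` may differ:

```markdown
---
name: explorer
description: Read-only codebase exploration. Summarizes findings; never edits.
tools: Read, Grep, Glob
---

You are a read-only exploration agent. Locate and summarize relevant code;
report file paths and line ranges back to the main conversation.
```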
Two modes:

Default (`agent-lean measure`) — parses your `~/.claude.json`, counts MCP servers, and looks up empirical schema sizes in `lib/mcp-sizes.js`. Unknown MCPs default to 8K tokens. Fast (no subprocess spawn), approximate.

Exact (`agent-lean measure --exact`) — spawns each configured MCP server, completes the MCP handshake, calls `tools/list`, and measures real schema JSON bytes. The token count uses a ~3.5 chars/token heuristic (not Anthropic's tokenizer). Slower (10–30s), accurate for your actual server versions.
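The byte-to-token conversion is simple enough to sketch — a re-implementation of the heuristic described above, not the tool's actual code:

```python
# --exact measures real tools/list bytes, then divides by a ~3.5 chars/token
# heuristic (NOT Anthropic's real tokenizer), roughly:
def estimate_tokens(schema_bytes: int, chars_per_token: float = 3.5) -> int:
    return round(schema_bytes / chars_per_token)

print(estimate_tokens(6556))  # → 1873
```

With the 6,556 bytes measured for server-everything, this lands within a token of the ~1,874 figure quoted below.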
Example real measurement of the reference `@modelcontextprotocol/server-everything`: 13 tools, 6,556 bytes, ~1,874 tokens — meaningfully less than the 8K default. The static estimates are conservative for unknown servers; use `--exact` when you want the real number.

See a more accurate number for your setup? Run `--exact` and PR the result into `lib/mcp-sizes.js`.
Input tokens are only half the bill — Claude's output tokens are 5× more expensive (Opus: $75/M vs $15/M). agent-lean install fetches vetted third-party skills that compress output:
```sh
agent-lean install --list
agent-lean install caveman
```

Currently curated:
| Skill | Source | Effect |
|---|---|---|
| caveman | JuliusBrussee/caveman (MIT) | Terse caveman-style output (~65% fewer output tokens) |
| caveman-compress | JuliusBrussee/caveman (MIT) | Compression skill with benchmark scripts |
Skills land in `~/.claude/skills/<name>/` and activate after a Claude Code restart. All attribution points to the original maintainers.
Run the proof script yourself:
```sh
npm run proof
```

It spawns three real MCP servers (everything, memory, sequential-thinking — no auth required), completes the MCP handshake, and measures their actual `tools/list` byte sizes. Sample output from one run:
```
MCP                  Tools    Bytes   Tokens
everything              13    6,556    1,874
memory                   9    9,808    2,803
sequential-thinking      1    4,503    1,287
TOTAL                   23   20,867    5,964

Per-turn savings if disabled: ~5,964 input tokens
Over 20 turns, Opus ($15/M): ~$1.79 per session (no cache)
With prompt cache (90% hit): ~$0.18 per Opus session
```
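The dollar figures in that sample output follow directly from the token counts — reproduced here as a sketch, using Opus's $15/M input rate and treating cache hits as re-billed at ~10% of that rate:

```python
# Cost math behind the sample proof output above.
tokens_per_turn = 5_964            # measured schema tokens per turn
turns = 20
opus_input_rate = 15 / 1_000_000   # $ per input token (Opus, $15/M)

no_cache = tokens_per_turn * turns * opus_input_rate
cached = no_cache * 0.10           # cache reads billed at ~10% of the rate

print(f"${no_cache:.2f}")  # → $1.79
print(f"${cached:.2f}")    # → $0.18
```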
The script writes proof.json with the raw measurements so you can share reproducible evidence. These are real bytes from real MCP servers — not estimates.
Want to see the savings on the wire? docs/verify-with-mitmproxy.md walks you through capturing actual Claude Code → api.anthropic.com requests, inspecting the tools array, and confirming that swapping profiles changes what you're billed for.
- Core Claude Code tools (Read, Edit, Bash, etc.) always load. You can't remove them via settings. agent-lean reduces MCP tokens, not core tool tokens.
- Prompt caching reduces the cost in practice. If your session stays active under 5 minutes between turns, schemas are cached and re-billed at ~10% of the rate. The savings compound when sessions span longer gaps or the cache is missed.
- Deferred tools (built into Claude Code) already help. If you see tools listed by name only with a `ToolSearch` hint, they're already lazy-loaded — this tool complements that, doesn't replace it.
- Hook-based auto-profile switcher (detect task intent on first prompt)
- More accurate MCP size detection (actually spawn the server and count)
- Per-project profile overrides
- Integration with `.claude/settings.json` permissions
PRs welcome. Particularly useful contributions:
- Refined MCP schema sizes in `lib/mcp-sizes.js`
- Additional profiles for common workflows
- Additional scoped agent templates
MIT