Audit, clean up, and route your Claude Code agents.
The control plane that decides which agent runs — and which never should. Locally, via CLI + MCP.
Your .claude/agents/ folder gets messy fast. People copy agents from GitHub, install giant packs, grant broad permissions, and hope Claude routes correctly. The problem is no longer "where do I find agents?" — it's:
Which agents should I trust, install, fix, merge, or delete?
AgentMD is the control plane that answers that. Not a new UI. Not a new LLM app. Not an API-key product. Not another marketplace — a cleanup and routing layer for the agents you already have. It runs locally through Claude Code (MCP) and as a CLI.
Scan every agent and get risk, routing, repo-fit, overlap & utility scores — with concrete actions.
Point it at a GitHub pack and it filters dozens of agents down to the handful that fit your repo, blocking the risky ones.
This is the part no directory does. Mid-session, Claude (or you) can ask for an agent by need — "I need a React performance reviewer" — and AgentMD searches across multiple open-source agent collections at once, deduplicates the near-identical results (every pack ships a code-reviewer), scores each for your repo, strips dangerous tools, and returns a ready-to-use, safety-improved .md you can drop straight into .claude/agents/.
agentmd find react code reviewer # search, dedupe, score, improve
agentmd find react code reviewer --show # print the ready-to-use .md
agentmd find react code reviewer --write # write top matches to .claude/agents

It searches a configurable set of registries (VoltAgent, wshobson, 0xfurai, …), downloads only the top filename matches per source (so a search is a handful of API calls even against 100+ agent repos), and honors GITHUB_TOKEN for higher rate limits.
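For intuition, the dedupe step could look roughly like the sketch below. This is not AgentMD's actual implementation: the `Candidate` shape, the `normalizeName` helper, and the "keep the highest score per normalized name" rule are all assumptions made for illustration.

```typescript
// Hypothetical shape for a search hit; AgentMD's real types may differ.
interface Candidate {
  source: string; // registry the agent came from
  name: string;   // file name, e.g. "code-reviewer.md"
  score: number;  // repo-fit style score, 0-100
}

// Normalize a file name so "Code_Reviewer.md" and "code-reviewer.md" collide.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/\.md$/, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Keep only the highest-scoring candidate for each normalized name,
// then return the survivors sorted from best to worst.
function dedupe(candidates: Candidate[]): Candidate[] {
  const best: Record<string, Candidate> = {};
  for (const c of candidates) {
    const key = normalizeName(c.name);
    const current = best[key];
    if (current === undefined || c.score > current.score) {
      best[key] = c;
    }
  }
  return Object.keys(best)
    .map((key) => best[key])
    .sort((a, b) => b.score - a.score);
}
```

Matching on normalized names is the simplest possible notion of "near-identical"; a real implementation might also compare descriptions or bodies.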
AgentMD is opinionated: it returns a conservative, read-only-first sequence, and never starts an unclear bug with the implementer.
git clone https://github.com/doramirdor/agentmd
cd agentmd
npm install
npm run build

claude mcp add agentmd -- node /absolute/path/to/agentmd/dist/mcp/server.js

Then just ask Claude:
- "Audit my current agents"
- "Find me a React performance reviewer agent"
- "Recommend the minimal agent pack for this repo"
- "Import github:affaan-m/ecc and tell me what I actually need"
- "Make backend-agent safer"
- "Route this task: fix the checkout bug and add tests"
agentmd audit # audit all agents in the repo
agentmd find react performance reviewer # search collections, dedupe + improve a fitting agent
agentmd recommend # minimal safe pack for this repo
agentmd import github:affaan-m/ecc --max 6 # import a pack + recommend
agentmd route fix the checkout bug and add tests
agentmd fix backend-agent # safer rewrite (preview diff)
agentmd fix backend-agent --apply # write it
agentmd install # install the recommended presets
agentmd policy --risk low # generate .agentmd.yml
agentmd check # enforce .agentmd.yml (exits 1 on violations; CI gate)

Add --json for machine-readable output and --repo <path> to target another directory. (Before npm link, run via node dist/cli/index.js ….)
| Score | Meaning (0–100) |
|---|---|
| risk | how dangerous — Bash/Write, permission-bypass language, hooks |
| routing | how reliably Claude will pick it (description quality) |
| repo-fit | how well it matches the current repo's stack |
| overlap | how duplicated it is by other agents |
| utility | overall worth → keep / fix / merge / delete / block |
Two things that make the scoring trustworthy:
- Risk is negation-aware. An agent that says "never deploy" or "never access secrets" is not punished for its own safety boundaries.
- Repo-fit uses word boundaries. "rust" never matches inside "trust" or "robust", so general agents aren't mislabeled.
Both behaviors are locked in by regression tests.
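A minimal sketch of both ideas, assuming a regex-based approach. The helper names, the negation cue list, and the sentence-splitting rule are hypothetical and not taken from AgentMD's source.

```typescript
// Escape regex metacharacters so a keyword such as "node.js" is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True only when `keyword` appears as a whole word in `text` (case-insensitive).
function mentionsStack(text: string, keyword: string): boolean {
  return new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i").test(text);
}

// Hypothetical negation cues. A risky term that follows one of these
// in the same sentence is treated as a safety boundary, not a risk.
const NEGATIONS = /\b(never|do not|don't|must not|cannot)\b/i;

// Return the risky terms that are mentioned without a preceding negation.
function riskyMentions(body: string, riskyTerms: string[]): string[] {
  const hits: string[] = [];
  for (const sentence of body.split(/[.!?\n]+/)) {
    for (const term of riskyTerms) {
      const match = new RegExp(`\\b${escapeRegExp(term)}\\b`, "i").exec(sentence);
      if (match === null) continue;
      const before = sentence.slice(0, match.index);
      if (!NEGATIONS.test(before)) hits.push(term);
    }
  }
  return hits;
}
```

With this sketch, `mentionsStack("A robust, trusted reviewer", "rust")` is false, and `riskyMentions("Never deploy. Never access secrets.", ["deploy", "secrets"])` returns an empty list.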
scan_agents · audit_all_agents · audit_agent · recommend_agents_for_repo · find_agent · import_agent · fix_agent · route_task · check_policy · install_agent_pack · generate_policy
Claude Code / CLI
→ scanner .claude/agents (+ global ~/.claude)
→ parser frontmatter · tools · body · mcp/hooks
→ repo detector language · framework · test · db · deploy
→ scoring risk · routing · repo-fit · overlap · utility → verdict
→ router conservative, read-only-first agent chains
→ recommender minimal safe pack per repo
→ fixer + writer safe rewrites with diff preview (never writes without --apply)
Five conservative agents ship built in: planner, code-reviewer, test-writer, debugger, implementer. The router never defaults to implementer for ambiguous tasks.
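To make "read-only-first" concrete, here is a rough sketch of a router over the five built-in agent names. The keyword cues, the ordering of steps, and the assumption that debugger and planner are the read-only entry points are all illustrative guesses, not AgentMD's actual routing logic.

```typescript
type BuiltInAgent =
  | "planner"
  | "code-reviewer"
  | "test-writer"
  | "debugger"
  | "implementer";

// Hypothetical keyword-based router. The first step is always a
// diagnosing or planning agent; implementer only appears when the
// task explicitly asks for a change, and never as the first step.
function routeTask(task: string): BuiltInAgent[] {
  const t = task.toLowerCase();
  const isBug = /\b(bug|fix|error|crash|fail(s|ing|ure)?)\b/.test(t);
  const wantsTests = /\btests?\b/.test(t);
  const wantsChange = /\b(add|implement|build|refactor|fix)\b/.test(t);

  const chain: BuiltInAgent[] = [];
  // Read-only step first: diagnose or plan before anything edits files.
  chain.push(isBug ? "debugger" : "planner");
  if (wantsChange) chain.push("implementer");
  if (wantsTests) chain.push("test-writer");
  // Close with a review whenever something was changed.
  if (wantsChange) chain.push("code-reviewer");
  return chain;
}
```

In this sketch an ambiguous request such as "look into the checkout flow" yields only the planner, so nothing that edits files runs until the task is clearer.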
agentmd policy --risk low generates an .agentmd.yml you can commit to standardize what's allowed across a team — permissions, max agents per repo, routing rules, and import gates.
agentmd check then enforces it: it audits every agent against the policy and exits non-zero on violations, so you can drop it into CI as a governance gate (--warn-only to report without failing). The route_task output also tags each step with a model tier (cheap / standard / premium) so a cost-aware model router can run read-only steps on a small model and reserve premium models for risky edits.
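The gate behavior of agentmd check (audit against the policy, exit non-zero on violations unless --warn-only is set) can be sketched as follows. The `Policy` and `Agent` fields here are hypothetical; the real .agentmd.yml schema isn't shown in this README.

```typescript
// Hypothetical, simplified policy; the real .agentmd.yml schema may differ.
interface Policy {
  maxAgents: number;     // max agents allowed per repo
  deniedTools: string[]; // tools no agent may request
}

interface Agent {
  name: string;
  tools: string[];
}

// Collect a human-readable message for every policy violation.
function findViolations(agents: Agent[], policy: Policy): string[] {
  const violations: string[] = [];
  if (agents.length > policy.maxAgents) {
    violations.push(`too many agents: ${agents.length} > ${policy.maxAgents}`);
  }
  for (const agent of agents) {
    for (const tool of agent.tools) {
      if (policy.deniedTools.indexOf(tool) !== -1) {
        violations.push(`${agent.name}: tool "${tool}" is not allowed`);
      }
    }
  }
  return violations;
}

// Exit code 1 fails the CI job; warn-only mode reports but always passes.
function exitCode(violations: string[], warnOnly: boolean): number {
  return violations.length > 0 && !warnOnly ? 1 : 0;
}
```

Keeping violation collection separate from the exit-code decision means the same report can be printed in both strict and warn-only modes.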
Every flow is exercised end-to-end against the committed demo/ fixture. Full captured output (typecheck, build, 21/21 tests, audit, recommend, find, route, check, fix, a live 63-agent ECC import, and an MCP stdio round-trip) lives in VALIDATION.md.
npm run typecheck && npm test # 21/21 passing
vhs assets/tapes/audit.tape # regenerate the GIFs

License: MIT


