A distilled and optimized coding agent in ~5,800 lines of Python — less code, same performance.
English | 中文
- What is this?
- Quick start
- Roadmap
- Key results
- Contributions
- Distillation pipeline
- Repository structure
- Setup
- API providers
- Usage
- SWE-bench evaluation
- License
Nano-Claw-Code is a lightweight Python coding agent distilled from the full Claude Code framework. The distillation follows a two-stage pipeline:
- TypeScript pruning — We analyzed tool usage on SWE-bench and removed 29 unused tools and 4 service groups from the original Claude Code (~405,500 → ~378,100 core lines).
- Python re-implementation — We then rewrote the core agent loop, tools, and CLI in pure Python, compressing ~378,100 lines of TypeScript into ~5,800 lines of Python while preserving the same tool-use interface and agentic capabilities.
We provide code for result evaluation on SWE-bench Lite.
```bash
git clone https://github.com/OpenLAIR/nano-claw-code.git   # or your fork
cd nano-claw-code
pip install -e .        # or: uv sync && source .venv/bin/activate
cp .env.example .env    # optional; then edit keys (or use exports below)
./start.sh              # same as: nano-claw-code (after install)
```

- Distill Claude Code (42 → 13 tools, TypeScript pruning)
- Python re-implementation — nano-claw-code (~5,800 lines, 12 tools)
- SWE-bench evaluation harness with full trace logging (included in repo)
- Comparative evaluation on SWE-bench Lite (50/300 instances)
- Full SWE-bench Lite run (300 instances)
- SWE-bench Verified run (500 instances)
- Third-party model evaluation via OpenRouter (Kimi, MiniMax)
- Distillation dataset from agent traces
Evaluated on the first 50 instances of SWE-bench Lite using claude-sonnet-4-20250514:
| Variant | Language | Tools | Core Lines | Submitted | Resolved | Resolve Rate |
|---|---|---|---|---|---|---|
| Claude Code (full) | TypeScript | 42 | ~405,500 | 50 | 33 | 66.0% |
| Nano-Claw-Code (this repo) | Python | 12 | ~5,800 | 50 | 31 | 62.0% |
~70x less code, comparable resolve rate. Full benchmark runs (300 instances) are in progress.
The full Claude Code agent defines 42 tools spanning shell execution, file I/O, web access, multi-agent orchestration, plan modes, cron scheduling, MCP integrations, and more. We analyzed which tools the agent actually invokes during SWE-bench tasks and removed everything non-essential:
29 tools removed (click to expand full list)
| Removed Tool | Lines | Why Removed |
|---|---|---|
| `PowerShellTool` | 8,959 | Windows-only; BashTool covers Unix |
| `LSPTool` | 2,005 | Experimental language server integration |
| `SendMessageTool` | 997 | Inter-agent messaging (team/swarm) |
| `EnterPlanModeTool` / `ExitPlanModeTool` | 934 | Plan mode UI (not used in SWE-bench) |
| `ConfigTool` | 809 | Anthropic-internal settings |
| `BriefTool` | 610 | Output formatting mode |
| `ToolSearchTool` | 593 | Dynamic tool discovery |
| `EnterWorktreeTool` / `ExitWorktreeTool` | 563 | Git worktree isolation |
| `ScheduleCronTool` / `CronDelete` / `CronList` | 543 | Cron job scheduling |
| `TeamCreateTool` / `TeamDeleteTool` | 534 | Multi-agent swarm orchestration |
| `TaskCreate` / `TaskGet` / `TaskUpdate` / `TaskList` / `TaskStop` / `TaskOutput` | 1,761 | V2 task management system |
| `ListMcpResourcesTool` / `ReadMcpResourceTool` | 381 | MCP resource access |
| `AskUserQuestionTool` | 309 | Structured question UI |
| `McpAuthTool` | 215 | MCP authentication |
| `RemoteTriggerTool` | 192 | Remote agent triggers |
| `SyntheticOutputTool` | 163 | Structured JSON output |
| `REPLTool` | 85 | REPL mode wrapper |
| `SleepTool` | 17 | Sleep utility |
| `TungstenTool` | 5 | Anthropic-internal |
| `WorkflowTool` | 2 | Workflow placeholders |
- 4 service groups removed (~7,400 lines) — team memory sync, voice STT, LSP server management, and plugin lifecycle
- ~27,400 lines cut (6.8% of core framework) with no performance degradation
We rewrote the pruned agent in pure Python — ~5,800 lines across 15 modules with 12 tools:
12 tools retained (click to expand tool mapping)
| Tool | What It Does | Original Claude Code Equivalent |
|---|---|---|
| `Read` | File reading with image/directory support | `FileReadTool` |
| `Write` | File creation/overwrite | `FileWriteTool` |
| `Edit` | String-replace editing with diff preview | `FileEditTool` |
| `Bash` | Persistent-cwd shell with sandbox patterns | `BashTool` |
| `Glob` | Pattern matching with `**/` auto-prepend | `GlobTool` |
| `Grep` | Regex search via ripgrep or Python fallback | `GrepTool` |
| `WebFetch` | URL fetch with HTML→text conversion | `WebFetchTool` |
| `WebSearch` | DuckDuckGo HTML search | `WebSearchTool` |
| `NotebookEdit` | Jupyter cell create/edit | `NotebookEditTool` |
| `TodoWrite` | In-memory task tracking with merge | `TodoWriteTool` |
| `Agent` | Sub-agent spawning with tool filtering | `AgentTool` |
| `Skill` | Skill loading from `.claude/skills/` | `SkillTool` |
Beyond tools, the agent preserves key infrastructure:
9 infrastructure capabilities preserved (click to expand)
| Capability | Module | What It Does |
|---|---|---|
| Sub-agent system | `agents.py` | 3 built-in profiles (general, explore, plan) + custom agents from `.claude/agents/*.md` |
| Skill system | `skills.py` | Discovers skills from `~/.claude/skills/` with frontmatter metadata (inline/forked execution) |
| Memory hierarchy | `memory.py` | Loads layered CLAUDE.md context from global → per-directory with `@include` support |
| Context compaction | `agent.py` | Monitors token budget (~200K), summarizes old messages when 75% threshold exceeded |
| Prompt caching | `agent.py` | Anthropic `cache_control: ephemeral` breakpoints to reduce token costs |
| Permission system | `permissions.py` | 3 modes (accept-all / manual / auto) with safe-command classification |
| Session persistence | `session.py` | Save/load/resume conversations with auto-save and search |
| API retry | `agent.py` | Exponential backoff with jitter on 429/5xx, respects Retry-After headers |
| OpenAI compat | `openai_compat.py` | Alternative backend for non-Anthropic providers (Kimi, MiniMax, etc.) |
The original Claude Code is locked to Anthropic's API. Nano-Claw-Code adds first-class support for any OpenAI-compatible endpoint, enabling evaluation and deployment with third-party models:
4 provider tiers supported (click to expand)
| Provider | Env Vars | Examples |
|---|---|---|
| Anthropic (direct) | `ANTHROPIC_API_KEY` | Claude Sonnet, Claude Opus |
| OpenRouter | `OPENROUTER_API_KEY` + `OPENROUTER_MODEL` | Any model on OpenRouter's catalog |
| OpenAI-compatible | `OPENAI_COMPAT_BASE_URL` + `OPENAI_COMPAT_API_KEY` | Azure AI, Kimi (Moonshot), MiniMax, DeepSeek, local vLLM/Ollama |
| LiteLLM Proxy | `ANTHROPIC_BASE_URL` + `ANTHROPIC_API_KEY` | Unified gateway to 100+ providers |
The openai_compat.py module (~600 lines) translates the agent's Anthropic-native tool-use protocol into standard OpenAI Chat Completions format — handling tool schemas, streaming deltas, and multi-turn tool call/result pairs. Provider detection is automatic based on environment variables, requiring zero code changes to switch models.
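As an illustration of the kind of translation involved (not the module's actual code), converting an Anthropic-style assistant message with `tool_use` blocks into the OpenAI Chat Completions shape looks roughly like this:

```python
import json

def anthropic_to_openai(msg: dict) -> dict:
    """Translate one Anthropic-style assistant message (text + tool_use
    content blocks) into an OpenAI Chat Completions assistant message."""
    text_parts, tool_calls = [], []
    for block in msg["content"]:
        if block["type"] == "text":
            text_parts.append(block["text"])
        elif block["type"] == "tool_use":
            tool_calls.append({
                "id": block["id"],
                "type": "function",
                "function": {
                    "name": block["name"],
                    # OpenAI expects arguments as a JSON-encoded string
                    "arguments": json.dumps(block["input"]),
                },
            })
    out = {"role": "assistant", "content": "".join(text_parts) or None}
    if tool_calls:
        out["tool_calls"] = tool_calls
    return out

msg = {"role": "assistant", "content": [
    {"type": "text", "text": "Reading the file."},
    {"type": "tool_use", "id": "toolu_01", "name": "Read",
     "input": {"file_path": "setup.py"}},
]}
assert anthropic_to_openai(msg)["tool_calls"][0]["function"]["name"] == "Read"
```

The real adapter also has to do the reverse mapping (`tool_calls` back into `tool_use`/`tool_result` pairs) and reassemble streaming deltas, which is where most of its ~600 lines go.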
Both variants are evaluated under identical conditions with full trace logging — every tool call, model response, and thinking block is captured for analysis.
```
┌─────────────────────┐     prune 29 tools      ┌─────────────────────┐      rewrite in         ┌─────────────────────┐
│     Claude Code     │ ──────────────────────▶ │    (intermediate)   │ ──────────────────────▶ │   Nano-Claw-Code    │
│     TypeScript      │    4 service groups     │     TypeScript      │        Python           │       Python        │
│   ~405,500 lines    │     -27,400 lines       │   ~378,100 lines    │                         │    ~5,800 lines     │
│      42 tools       │                         │      13 tools       │                         │      12 tools       │
└─────────────────────┘                         └─────────────────────┘                         └─────────────────────┘
```
Line counts below are approximate snapshots and may drift as the code evolves.
```
nano-claw-code/
├── nano_claw_code/                    # Agent source code
│   ├── cli.py                         # Interactive REPL, CLI, startup banner (1,639 lines)
│   ├── tools_impl.py                  # 12 core tool implementations (1,066 lines)
│   ├── agent.py                       # Agent loop, compaction, prompt caching, retry (659 lines)
│   ├── openai_compat.py               # OpenAI-compatible API adapter (599 lines)
│   ├── agents.py                      # Sub-agent profiles & custom agent loading (302 lines)
│   ├── skills.py                      # Skill discovery & execution (294 lines)
│   ├── config.py                      # Configuration management (279 lines)
│   ├── session.py                     # Session persistence (233 lines)
│   ├── prompts.py                     # System prompts (189 lines)
│   ├── stream_json.py                 # Stream-JSON output protocol (185 lines)
│   ├── frontmatter.py                 # CLAUDE.md frontmatter parsing (137 lines)
│   ├── permissions.py                 # Permission handling (133 lines)
│   └── memory.py                      # Memory management (111 lines)
├── swebench_harness/                  # SWE-bench evaluation harness
│   ├── run_swebench_claude_code.py    # Main evaluation script (inference + evaluation)
│   ├── run.sh                         # One-command launcher (install, predict, evaluate)
│   ├── compare_results.py             # Cross-variant result comparison
│   ├── requirements.txt               # Harness dependencies (datasets, swebench)
│   ├── instance_ids_pilot_8.txt       # 8-instance pilot subset
│   ├── instance_ids_full_50.txt       # 50-instance subset
│   └── results/                       # Predictions & evaluation reports
├── start.sh                           # Launch script (wraps the CLI)
├── pyproject.toml                     # Python package config
├── uv.lock                            # Locked deps (for uv)
├── .env.example                       # Example API / model env vars
├── nano-claw.config.toml.example      # Example TOML options ([nano_claw])
└── assets/                            # Screenshots & images
```
| Requirement | Version | Purpose |
|---|---|---|
| Python | >= 3.10 | Agent runtime |
| Docker | latest | SWE-bench test execution (optional) |
```bash
pip install -e .
```

uv installs from `pyproject.toml` and the committed `uv.lock` for reproducible environments:

```bash
uv sync              # runtime dependencies only
uv sync --extra dev  # + pytest and ruff (for development)
```

This creates `.venv/` at the repo root (gitignored). Use `uv run nano-claw-code …`, `uv run pytest`, or activate `.venv` and run `./start.sh` as usual.

Optional Rich-based terminal styling:

```bash
uv sync --extra dev --extra rich
```

If you change dependencies in `pyproject.toml`, run `uv lock` and commit the updated `uv.lock`.
Either copy .env.example to .env and edit (loaded automatically from the project tree), or export variables in your shell:
```bash
# Option A: Direct Anthropic API
export ANTHROPIC_API_KEY="sk-ant-xxx"

# Option B: OpenRouter (for Kimi, MiniMax, etc.)
export OPENROUTER_API_KEY="sk-or-xxx"
export OPENROUTER_MODEL="moonshotai/kimi-k2"

# Option C: LiteLLM Proxy
export ANTHROPIC_BASE_URL="http://127.0.0.1:4000"
export ANTHROPIC_API_KEY="sk-anything"
export MODEL="moonshotai/kimi-k2"
```

Non-secret options (`model`, `max_tokens`, `permission_mode`, `verbose`, `thinking`, …) can live in TOML:
- User: `~/.nano_claw/config.toml`
- Project: `.nano_claw/config.toml` (from git root toward cwd; inner directories win)

See `nano-claw.config.toml.example`. Keys go under `[nano_claw]`. API keys stay in `.env` only. Precedence: env model vars → `config.json` → TOML → defaults.
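That precedence can be sketched as a layered dict merge where later layers win. This is a toy illustration of the rule, not the repo's actual loader; `effective_config` is a hypothetical name:

```python
def effective_config(defaults: dict, toml_cfg: dict,
                     json_cfg: dict, env_cfg: dict) -> dict:
    """Merge config layers; later layers win:
    defaults < TOML < config.json < env model vars."""
    merged: dict = {}
    for layer in (defaults, toml_cfg, json_cfg, env_cfg):
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged

cfg = effective_config(
    {"model": "claude-sonnet-4-20250514", "max_tokens": 8192},  # defaults
    {"max_tokens": 4096},                                       # TOML
    {},                                                         # config.json
    {"model": "moonshotai/kimi-k2"},                            # env vars
)
assert cfg == {"model": "moonshotai/kimi-k2", "max_tokens": 4096}
```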
start.sh forwards to the same entry point as the nano-claw-code console script after install:
```bash
./start.sh
# equivalent:
nano-claw-code
```

Run the test suite:

```bash
pytest  # unit tests only (default; no API calls)

# Clear addopts first — otherwise the default -m filter still applies:
pytest --override-ini addopts= -m e2e
pytest --override-ini addopts= -m integration
pytest --override-ini addopts= -m "integration or e2e"

# Full suite in one go:
pytest --override-ini addopts=
```

E2e / integration tests need `ANTHROPIC_API_KEY` (`sk-ant-*`) except `test_e2e_cli_version` (`--version` only). Use `uv run pytest` if you use uv.
Put secrets in .env (or export in your shell). Files are discovered from the current working directory up to the git root (closer directories override parents). See .env.example.
The client picks a backend using the first match below (highest priority first). If you enable multiple routes at once, the winner is deterministic—avoid leaving stray OPENAI_COMPAT_* variables set when you mean to use Anthropic or OpenRouter only.
| Priority | When it applies | Typical use |
|---|---|---|
| 1 | `OPENAI_COMPAT_BASE_URL` and `OPENAI_COMPAT_API_KEY` are both non-empty | Azure OpenAI / AI Foundry, OpenAI-compatible gateways, some Kimi/MiniMax HTTP APIs |
| 2 | `ANTHROPIC_API_KEY` starts with `sk-ant-` | Official Anthropic API |
| 3 | `OPENROUTER_API_KEY` is set (often `sk-or-v1-…`) | OpenRouter (Claude, GPT, Kimi, MiniMax, …) |
| 4 | `ANTHROPIC_API_KEY` and `ANTHROPIC_BASE_URL` are both set | Self-hosted or vendor Anthropic-compatible HTTP proxies |
| 5 | `ANTHROPIC_API_KEY` or `ANTHROPIC_AUTH_TOKEN` without a custom base URL | Treated as an Anthropic API key on the default host |
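A minimal sketch of this first-match selection order (`pick_backend` is a hypothetical function mirroring the table, not the repo's actual detection code):

```python
def pick_backend(env: dict[str, str]) -> str:
    """Return a backend label using the first matching rule, in priority order."""
    key = env.get("ANTHROPIC_API_KEY", "")
    if env.get("OPENAI_COMPAT_BASE_URL") and env.get("OPENAI_COMPAT_API_KEY"):
        return "openai-compat"          # 1: explicit OpenAI-compatible endpoint
    if key.startswith("sk-ant-"):
        return "anthropic"              # 2: official Anthropic key
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"             # 3: OpenRouter key
    if key and env.get("ANTHROPIC_BASE_URL"):
        return "anthropic-proxy"        # 4: Anthropic-compatible proxy
    if key or env.get("ANTHROPIC_AUTH_TOKEN"):
        return "anthropic"              # 5: default Anthropic host
    raise RuntimeError("no API credentials configured")

# An official key outranks an OpenRouter key left in the same shell
assert pick_backend({"ANTHROPIC_API_KEY": "sk-ant-x",
                     "OPENROUTER_API_KEY": "sk-or-v1-x"}) == "anthropic"
assert pick_backend({"OPENROUTER_API_KEY": "sk-or-v1-x"}) == "openrouter"
```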
Model names (set one that matches your provider—see table):
| Variable | Used when |
|---|---|
| `OPENAI_COMPAT_MODEL` | OpenAI-compat route (fallback: `MODEL`) |
| `ANTHROPIC_MODEL` | Direct Anthropic |
| `OPENROUTER_MODEL` | OpenRouter (fallback: `MODEL`) |
| `MODEL` | Generic fallback for several paths |
TOML / `~/.nano_claw/config.json` can also set `model` if you do not set any of `MODEL`, `ANTHROPIC_MODEL`, `OPENROUTER_MODEL`, or `OPENAI_COMPAT_MODEL` in `.env`/shell (see Setup → Optional — TOML above).
```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514  # optional
```

If your key is `sk-ant-*` and `ANTHROPIC_BASE_URL` does not appear in any merged `.env` file, the CLI clears `ANTHROPIC_BASE_URL` from the process environment so a shell-wide OpenRouter URL does not send official keys to the wrong host.
```bash
OPENROUTER_API_KEY=sk-or-v1-...
OPENROUTER_MODEL=anthropic/claude-sonnet-4-20250514
# Optional custom API root:
# OPENROUTER_BASE_URL=https://openrouter.ai/api
```

Pick any OpenRouter model id (e.g. `moonshotai/kimi-k2`, `anthropic/claude-3-5-sonnet-20241022`). The app uses the Anthropic SDK against OpenRouter's Anthropic-compatible surface.
Both URL and key are required; this path wins over Anthropic/OpenRouter if both compat vars are set.
```bash
OPENAI_COMPAT_BASE_URL=https://YOUR_RESOURCE.openai.azure.com/openai/v1/
OPENAI_COMPAT_API_KEY=...
OPENAI_COMPAT_MODEL=your-deployment-or-model-name
```

Use the Chat Completions–compatible base URL your vendor documents (often ends with `/v1/`). For local servers (e.g. vLLM), point `OPENAI_COMPAT_BASE_URL` at `http://127.0.0.1:8000/v1` (exact path depends on the server).
Configure your proxy to expose an Anthropic Messages–compatible API, then:
```bash
ANTHROPIC_BASE_URL=http://127.0.0.1:4000
ANTHROPIC_API_KEY=anything-or-litellm-master-key
MODEL=claude-3-5-sonnet-20241022  # or the model id your proxy expects
```

See LiteLLM Proxy for routing many providers through one endpoint. Do not use a `sk-ant-*` key here unless your proxy is meant to receive real Anthropic keys.
The same variables apply when running nano-claw-code -p "..." or ./start.sh from the project directory (or after exporting globally).
```bash
./start.sh -p "Explain this codebase"
# or: nano-claw-code -p "Explain this codebase"
```

Third-party models and proxies are configured with the environment variables above; see API providers.
The repository includes a self-contained evaluation harness in swebench_harness/ that handles both inference (generating patches) and evaluation (running SWE-bench grading).
```bash
pip install -e .                                  # Install nano-claw-code
pip install -r swebench_harness/requirements.txt  # Harness deps (datasets, swebench)

# or, with uv:
# uv pip install -e . && uv pip install -r swebench_harness/requirements.txt
```

Docker must be running — SWE-bench uses Docker containers to execute and grade patches.
```bash
cd swebench_harness
./run.sh --max-instances 10
```

This will:

- Auto-install `nano-claw-code` if not already installed
- Generate predictions on SWE-bench Lite instances
- Run the SWE-bench evaluation harness and produce a JSON report
Step 1 — Generate predictions:
```bash
cd swebench_harness

# Run on first N instances
python run_swebench_claude_code.py --max-instances 10

# Run on a specific subset
python run_swebench_claude_code.py --instance-ids instance_ids_pilot_8.txt

# Resume from a specific instance
python run_swebench_claude_code.py --resume-from django__django-11099
```

Predictions are saved to `results/nano-claw-code/predictions.jsonl` along with full traces (tool calls, model responses, thinking) in `results/nano-claw-code/traces/`.
Step 2 — Evaluate predictions:
```bash
python run_swebench_claude_code.py --evaluate
```

This runs the official SWE-bench Docker evaluation and produces a JSON report (e.g., `claude-sonnet-4-20250514.nano-claw-code-swebench.json`).
Step 3 — View results:
```bash
# Summary is printed to stdout; detailed report in the JSON file
cat claude-sonnet-4-20250514.nano-claw-code-swebench.json | python -m json.tool
```

| Flag | Description | Default |
|---|---|---|
| `--max-instances N` | Limit number of instances to evaluate | all |
| `--instance-ids FILE` | Path to a file listing specific instance IDs | — |
| `--model MODEL` | Model to use | claude-sonnet-4-20250514 |
| `--dataset DATASET` | SWE-bench dataset | princeton-nlp/SWE-bench_Lite |
| `--split SPLIT` | Dataset split | test |
| `--max-turns N` | Max agentic turns per instance | 30 |
| `--resume-from ID` | Resume from a specific instance | — |
| `--evaluate` | Run evaluation only (skip inference) | — |
| `--predictions FILE` | Custom predictions file for evaluation | auto-detected |
| `--bare` | Skip hooks/LSP for faster inference | — |
| `-v, --verbose` | Enable debug logging | — |
```bash
export OPENROUTER_API_KEY="sk-or-xxx"
export OPENROUTER_MODEL="moonshotai/kimi-k2"
cd swebench_harness && ./run.sh --max-instances 5
```

This project is licensed under the MIT License.
This repository is original Python code and does not include Anthropic’s Claude Code source. We reference Claude Code as the baseline product we benchmarked and compared against (e.g., on SWE-bench); Anthropic’s software remains under its own license, which does not apply to this codebase.