English | δΈζ |νκ΅μ΄ | ζ₯ζ¬θͺ | Deutsch | PortuguΓͺs
- 00:41 AM, Apr 04, 2026: v3.05 β Voice input (
voice/package):sounddeviceβarecordβ SoX recording backends,faster-whisperβopenai-whisperβ OpenAI API STT backends. Smart keyterm extraction from git branch + project name + recent files passed as Whisperinitial_promptfor coding-domain accuracy./voice,/voice status,/voice lang <code>REPL commands. Works fully offline with no API key. 29 new tests (~11.6K lines of Python). - 10:29 PM, Apr 03, 2026: v3.04 β Expanded tool coverage:
NotebookEdit(edit Jupyter.ipynbcells β replace/insert/delete with full JSON round-trip) andGetDiagnostics(LSP-style diagnostics via pyright/mypy/flake8/tsc/shellcheck). Also fixed a pre-existing schema-index bug in_register_builtinsby switching to name-based lookup (~10.5K lines of Python). - 06:00 PM, Apr 03, 2026: v3.03 β Task management system (
task/package):TaskCreate/TaskUpdate/TaskGet/TaskListtools with sequential IDs, dependency edges (blocks/blocked_by), metadata, persistence to.nano_claude/tasks.json, thread-safe store,/tasksREPL command, 37 new tests (~9500 lines of Python). - 02:50 PM, Apr 03, 2026: v3.02 β Plugin system (
plugin/package): install/uninstall/enable/disable/update via/pluginCLI, recommendation engine (keyword+tag matching), multi-scope (user/project), git-based marketplace.AskUserQuestiontool: interactive mid-task user prompts with numbered options and free-text input (~8500 lines of Python). - 10:00 AM, Apr 03, 2026: v3.01 β MCP (Model Context Protocol) support:
mcp/package, stdio + SSE + HTTP transports, auto tool discovery,/mcpcommand, 34 new tests (~7000 lines of Python). - 12:20 PM, Apr 02, 2026: v3.0 β Multi-agent packages (
multi_agent/), memory package (memory/), skill package (skill/) with built-in skills, argument substitution, fork/inline execution, AI memory search, git worktree isolation, agent type definitions (~5000 lines of Python), see update. - 10:00 AM, Apr 02, 2026: v2.0 β Context compression, memory, sub-agents, skills, diff view, tool plugin system (~3400 lines of Python Code).
- 01:47 PM, Apr 01, 2026: Support VLLM inference (~2000 lines of Python Code).
- 11:30 AM, Apr 01, 2026: Support more closed-source models and open-source models: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint. (~1700 lines of Python Code).
- 09:50 AM, Apr 01, 2026: Support more closed-source models: Claude, GPT, Gemini. (~1300 lines of Python Code).
- 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (~900 lines of Python Code).
Nano Claude Code: A Lightweight and Easy-to-Use Python Reimplementation of Claude Code Supporting Any Model, such as Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint.
- Why Nano Claude Code
- Features
- Supported Models
- Installation
- Usage: Closed-Source API Models
- Usage: Open-Source Models (Local)
- Model Name Format
- CLI Reference
- Slash Commands (REPL)
- Configuring API Keys
- Permission System
- Built-in Tools
- Memory
- Skills
- Sub-Agents
- MCP (Model Context Protocol)
- Plugin System
- AskUserQuestion Tool
- Task Management
- Voice Input
- Context Compression
- Diff View
- CLAUDE.md Support
- Session Management
- Project Structure
- FAQ
Claude Code is a powerful, production-grade AI coding assistant β but its source code is a compiled, 12 MB TypeScript/Node.js bundle (~1,300 files, ~283K lines). It is tightly coupled to the Anthropic API, hard to modify, and impossible to run against a local or alternative model.
Nano Claude Code reimplements the same core loop in ~10K lines of readable Python, keeping everything you need and dropping what you don't. See here for more detailed analysis (Nano Claude code v3.03), English version and Chinese version
| Dimension | Claude Code (TypeScript) | Nano Claude Code (Python) |
|---|---|---|
| Language | TypeScript + React/Ink | Python 3.8+ |
| Source files | ~1,332 TS/TSX files | 51 Python files |
| Lines of code | ~283K | ~11.6K |
| Built-in tools | 44+ | 23 |
| Slash commands | 88 | 18 |
| Voice input | Proprietary Anthropic WebSocket (OAuth required) | Local Whisper / OpenAI API β works offline, no subscription |
| Model providers | Anthropic only | 7+ (Anthropic Β· OpenAI Β· Gemini Β· Kimi Β· Qwen Β· DeepSeek Β· Ollama Β· β¦) |
| Local models | No | Yes β Ollama, LM Studio, vLLM, any OpenAI-compatible endpoint |
| Build step required | Yes (Bun + esbuild) | No β run directly with python nano_claude.py |
| Runtime extensibility | Closed (compile-time) | Open β register_tool() at runtime, Markdown skills, git plugins |
| Task dependency graph | No | Yes β blocks / blocked_by edges in task/ package |
- UI quality β React/Ink component tree with streaming rendering, fine-grained diff visualization, and dialog systems.
- Tool breadth β 44 tools including
RemoteTrigger,EnterWorktree, and more UI-integrated tools. - Enterprise features β MDM-managed config, team permission sync, OAuth, keychain storage, GrowthBook feature flags.
- AI-driven memory extraction β
extractMemoriesservice proactively extracts knowledge from conversations without explicit tool calls. - Production reliability β single distributable
cli.js, comprehensive test coverage, version-locked releases.
- Multi-provider β switch between Claude, GPT-4o, Gemini 2.5 Pro, DeepSeek, Qwen, or a local Llama model with
--modelor/modelβ no recompile needed. - Local model support β run entirely offline with Ollama, LM Studio, or any vLLM-hosted model.
- Readable source β the full agent loop is 174 lines (
agent.py). Any Python developer can read, fork, and extend it in minutes. - Zero build β
pip install -r requirements.txtand you're running. Changes take effect immediately. - Dynamic extensibility β register new tools at runtime with
register_tool(ToolDef(...)), install skill packs from git URLs, or wire in any MCP server. - Task dependency graph β
TaskCreate/TaskUpdatesupportblocks/blocked_byedges for structured multi-step planning (not available in Claude Code). - Two-layer context compression β rule-based snip + AI summarization, configurable via
preserve_last_n_turns. - Notebook editing β
NotebookEditdirectly manipulates.ipynbJSON (replace/insert/delete cells) with no kernel required. - Diagnostics without LSP server β
GetDiagnosticschains pyright β mypy β flake8 β py_compile for Python and tsc/shellcheck for other languages, with zero configuration. - Offline voice input β
/voicerecords viasounddevice/arecord/SoX, transcribes with localfaster-whisper(no API key, no subscription), and auto-submits. Keyterms from your git branch and project files boost coding-term accuracy.
Agent loop β Nano uses a Python generator that yields typed events (TextChunk, ToolStart, ToolEnd, TurnDone). The entire loop is visible in one file, making it easy to add hooks, custom renderers, or logging.
Tool registration β every tool is a ToolDef(name, schema, func, read_only, concurrent_safe) dataclass. Any module can call register_tool() at import time; MCP servers, plugins, and skills all use the same mechanism.
Context compression
| Claude Code | Nano Claude Code | |
|---|---|---|
| Trigger | Exact token count | len / 3.5 estimate, fires at 70 % |
| Layer 1 | β | Snip: truncate old tool outputs (no API cost) |
| Layer 2 | AI summarization | AI summarization of older turns |
| Control | System-managed | preserve_last_n_turns parameter |
Memory β Claude Code's extractMemories service has the model proactively surface facts. Nano's memory/ package is tool-driven: the model calls MemorySave explicitly, which is more predictable and auditable.
- Developers who want to use a local or non-Anthropic model as their coding assistant.
- Researchers studying how agentic coding assistants work β the entire system fits in one screen.
- Teams who need a hackable baseline to add proprietary tools, custom permission policies, or specialised agent types.
- Anyone who wants Claude Code-style productivity without a Node.js build chain.
| Feature | Details |
|---|---|
| Multi-provider | Anthropic Β· OpenAI Β· Gemini Β· Kimi Β· Qwen Β· Zhipu Β· DeepSeek Β· Ollama Β· LM Studio Β· Custom endpoint |
| Interactive REPL | readline history, Tab-complete slash commands |
| Agent loop | Streaming API + automatic tool-use loop |
| 23 built-in tools | Read Β· Write Β· Edit Β· Bash Β· Glob Β· Grep Β· WebFetch Β· WebSearch Β· NotebookEdit Β· GetDiagnostics Β· MemorySave Β· MemoryDelete Β· MemorySearch Β· MemoryList Β· Agent Β· SendMessage Β· CheckAgentResult Β· ListAgentTasks Β· ListAgentTypes Β· Skill Β· SkillList Β· AskUserQuestion Β· TaskCreate/Update/Get/List Β· (MCP + plugin tools auto-added at startup) |
| MCP integration | Connect any MCP server (stdio/SSE/HTTP), tools auto-registered and callable by Claude |
| Plugin system | Install/uninstall/enable/disable/update plugins from git URLs or local paths; multi-scope (user/project); recommendation engine |
| AskUserQuestion | Claude can pause and ask the user a clarifying question mid-task, with optional numbered choices |
| Task management | TaskCreate/Update/Get/List tools; sequential IDs; dependency edges; metadata; persisted to .nano_claude/tasks.json; /tasks REPL command |
| Diff view | Git-style red/green diff display for Edit and Write |
| Context compression | Auto-compact long conversations to stay within model limits |
| Persistent memory | Dual-scope memory (user + project) with 4 types, AI search, staleness warnings |
| Multi-agent | Spawn typed sub-agents (coder/reviewer/researcher/β¦), git worktree isolation, background mode |
| Skills | Built-in /commit Β· /review + custom markdown skills with argument substitution and fork/inline execution |
| Plugin tools | Register custom tools via tool_registry.py |
| Permission system | auto / accept-all / manual modes |
| 18 slash commands | /model Β· /config Β· /save Β· /cost Β· /memory Β· /skills Β· /agents Β· /voice Β· β¦ |
| Voice input | Record β transcribe β auto-submit. Backends: sounddevice / arecord / SoX + faster-whisper / openai-whisper / OpenAI API. Works fully offline. |
| Context injection | Auto-loads CLAUDE.md, git status, cwd, persistent memory |
| Session persistence | Save / load conversations to ~/.nano_claude/sessions/ |
| Extended Thinking | Toggle on/off (Claude models only) |
| Cost tracking | Token usage + estimated USD cost |
| Non-interactive mode | --print flag for scripting / CI |
| Provider | Model | Context | Strengths | API Key Env |
|---|---|---|---|---|
| Anthropic | claude-opus-4-6 |
200k | Most capable, best for complex reasoning | ANTHROPIC_API_KEY |
| Anthropic | claude-sonnet-4-6 |
200k | Balanced speed & quality | ANTHROPIC_API_KEY |
| Anthropic | claude-haiku-4-5-20251001 |
200k | Fast, cost-efficient | ANTHROPIC_API_KEY |
| OpenAI | gpt-4o |
128k | Strong multimodal & coding | OPENAI_API_KEY |
| OpenAI | gpt-4o-mini |
128k | Fast, cheap | OPENAI_API_KEY |
| OpenAI | o3-mini |
200k | Strong reasoning | OPENAI_API_KEY |
| OpenAI | o1 |
200k | Advanced reasoning | OPENAI_API_KEY |
gemini-2.5-pro-preview-03-25 |
1M | Long context, multimodal | GEMINI_API_KEY |
|
gemini-2.0-flash |
1M | Fast, large context | GEMINI_API_KEY |
|
gemini-1.5-pro |
2M | Largest context window | GEMINI_API_KEY |
|
| Moonshot (Kimi) | moonshot-v1-8k |
8k | Chinese & English | MOONSHOT_API_KEY |
| Moonshot (Kimi) | moonshot-v1-32k |
32k | Chinese & English | MOONSHOT_API_KEY |
| Moonshot (Kimi) | moonshot-v1-128k |
128k | Long context | MOONSHOT_API_KEY |
| Alibaba (Qwen) | qwen-max |
32k | Best Qwen quality | DASHSCOPE_API_KEY |
| Alibaba (Qwen) | qwen-plus |
128k | Balanced | DASHSCOPE_API_KEY |
| Alibaba (Qwen) | qwen-turbo |
1M | Fast, cheap | DASHSCOPE_API_KEY |
| Alibaba (Qwen) | qwq-32b |
32k | Strong reasoning | DASHSCOPE_API_KEY |
| Zhipu (GLM) | glm-4-plus |
128k | Best GLM quality | ZHIPU_API_KEY |
| Zhipu (GLM) | glm-4 |
128k | General purpose | ZHIPU_API_KEY |
| Zhipu (GLM) | glm-4-flash |
128k | Free tier available | ZHIPU_API_KEY |
| DeepSeek | deepseek-chat |
64k | Strong coding | DEEPSEEK_API_KEY |
| DeepSeek | deepseek-reasoner |
64k | Chain-of-thought reasoning | DEEPSEEK_API_KEY |
| Model | Size | Strengths | Pull Command |
|---|---|---|---|
llama3.3 |
70B | General purpose, strong reasoning | ollama pull llama3.3 |
llama3.2 |
3B / 11B | Lightweight | ollama pull llama3.2 |
qwen2.5-coder |
7B / 32B | Best for coding tasks | ollama pull qwen2.5-coder |
qwen2.5 |
7B / 72B | Chinese & English | ollama pull qwen2.5 |
deepseek-r1 |
7Bβ70B | Reasoning, math | ollama pull deepseek-r1 |
deepseek-coder-v2 |
16B | Coding | ollama pull deepseek-coder-v2 |
mistral |
7B | Fast, efficient | ollama pull mistral |
mixtral |
8x7B | Strong MoE model | ollama pull mixtral |
phi4 |
14B | Microsoft, strong reasoning | ollama pull phi4 |
gemma3 |
4B / 12B / 27B | Google open model | ollama pull gemma3 |
codellama |
7B / 34B | Code generation | ollama pull codellama |
Note: Tool calling requires a model that supports function calling. Recommended local models:
qwen2.5-coder,llama3.3,mistral,phi4.
git clone <repo-url>
cd nano_claude_code
pip install -r requirements.txt
# or manually:
pip install anthropic openai httpx rich sounddeviceGet your API key at console.anthropic.com.
export ANTHROPIC_API_KEY=sk-ant-api03-...
# Default model (claude-opus-4-6)
python nano_claude.py
# Choose a specific model
python nano_claude.py --model claude-sonnet-4-6
python nano_claude.py --model claude-haiku-4-5-20251001
# Enable Extended Thinking
python nano_claude.py --model claude-opus-4-6 --thinking --verboseGet your API key at platform.openai.com.
export OPENAI_API_KEY=sk-...
python nano_claude.py --model gpt-4o
python nano_claude.py --model gpt-4o-mini
python nano_claude.py --model gpt-4.1-mini
python nano_claude.py --model o3-miniGet your API key at aistudio.google.com.
export GEMINI_API_KEY=AIza...
python nano_claude.py --model gemini/gemini-2.0-flash
python nano_claude.py --model gemini/gemini-1.5-pro
python nano_claude.py --model gemini/gemini-2.5-pro-preview-03-25Get your API key at platform.moonshot.cn.
export MOONSHOT_API_KEY=sk-...
python nano_claude.py --model kimi/moonshot-v1-32k
python nano_claude.py --model kimi/moonshot-v1-128kGet your API key at dashscope.aliyun.com.
export DASHSCOPE_API_KEY=sk-...
python nano_claude.py --model qwen/Qwen3.5-Plus
python nano_claude.py --model qwen/Qwen3-MAX
python nano_claude.py --model qwen/Qwen3.5-FlashGet your API key at open.bigmodel.cn.
export ZHIPU_API_KEY=...
python nano_claude.py --model zhipu/glm-4-plus
python nano_claude.py --model zhipu/glm-4-flash # free tierGet your API key at platform.deepseek.com.
export DEEPSEEK_API_KEY=sk-...
python nano_claude.py --model deepseek/deepseek-chat
python nano_claude.py --model deepseek/deepseek-reasonerOllama runs models locally with zero configuration. No API key required.
Step 1: Install Ollama
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# Or download from https://ollama.com/downloadStep 2: Pull a model
# Best for coding (recommended)
ollama pull qwen2.5-coder # 4.7 GB (7B)
ollama pull qwen2.5-coder:32b # 19 GB (32B)
# General purpose
ollama pull llama3.3 # 42 GB (70B)
ollama pull llama3.2 # 2.0 GB (3B)
# Reasoning
ollama pull deepseek-r1 # 4.7 GB (7B)
ollama pull deepseek-r1:32b # 19 GB (32B)
# Other
ollama pull phi4 # 9.1 GB (14B)
ollama pull mistral # 4.1 GB (7B)Step 3: Start Ollama server (runs automatically on macOS; on Linux run manually)
ollama serve # starts on http://localhost:11434Step 4: Run nano claude
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model ollama/llama3.3
python nano_claude.py --model ollama/deepseek-r1List your locally available models:
ollama listThen use any model from the list:
python nano_claude.py --model ollama/<model-name>LM Studio provides a GUI to download and run models, with a built-in OpenAI-compatible server.
Step 1: Download LM Studio and install it.
Step 2: Search and download a model inside LM Studio (GGUF format).
Step 3: Go to Local Server tab β click Start Server (default port: 1234).
Step 4:
python nano_claude.py --model lmstudio/<model-name>
# e.g.:
python nano_claude.py --model lmstudio/phi-4-GGUF
python nano_claude.py --model lmstudio/qwen2.5-coder-7bThe model name should match what LM Studio shows in the server status bar.
For self-hosted inference servers (vLLM, TGI, llama.cpp server, etc.) that expose an OpenAI-compatible API:
Quick Start for option C: Step 1: Start vllm:
CUDA_VISIBLE_DEVICES=7 python -m vllm.entrypoints.openai.api_server \
--model Qwen/Qwen2.5-Coder-7B-Instruct \
--host 0.0.0.0 \
--port 8000 \
--enable-auto-tool-choice \
--tool-call-parser hermes
Step 2: Start nano claudeοΌ
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=none
python nano_claude.py --model custom/Qwen/Qwen2.5-Coder-7B-Instruct
# Example: vLLM serving Qwen2.5-Coder-32B
python -m vllm.entrypoints.openai.api_server \
--model Qwen/Qwen2.5-Coder-32B-Instruct \
--port 8000
# Then run nano claude pointing to your server:
python nano_claude.pyInside the REPL:
/config custom_base_url=http://localhost:8000/v1
/config custom_api_key=token-abc123 # skip if no auth
/model custom/Qwen2.5-Coder-32B-Instruct
Or set via environment:
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=token-abc123
python nano_claude.py --model custom/Qwen2.5-Coder-32B-InstructFor a remote GPU server:
/config custom_base_url=http://192.168.1.100:8000/v1
/model custom/your-model-nameThree equivalent formats are supported:
# 1. Auto-detect by prefix (works for well-known models)
python nano_claude.py --model gpt-4o
python nano_claude.py --model gemini-2.0-flash
python nano_claude.py --model deepseek-chat
# 2. Explicit provider prefix with slash
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model kimi/moonshot-v1-128k
# 3. Explicit provider prefix with colon (also works)
python nano_claude.py --model kimi:moonshot-v1-32k
python nano_claude.py --model qwen:qwen-maxAuto-detection rules:
| Model prefix | Detected provider |
|---|---|
claude- |
anthropic |
gpt-, o1, o3 |
openai |
gemini- |
gemini |
moonshot-, kimi- |
kimi |
qwen, qwq- |
qwen |
glm- |
zhipu |
deepseek- |
deepseek |
llama, mistral, phi, gemma, mixtral, codellama |
ollama |
python nano_claude.py [OPTIONS] [PROMPT]
Options:
-p, --print Non-interactive: run prompt and exit
-m, --model MODEL Override model (e.g. gpt-4o, ollama/llama3.3)
--accept-all Auto-approve all operations (no permission prompts)
--verbose Show thinking blocks and per-turn token counts
--thinking Enable Extended Thinking (Claude only)
--version Print version and exit
-h, --help Show help
Examples:
# Interactive REPL with default model
python nano_claude.py
# Switch model at startup
python nano_claude.py --model gpt-4o
python nano_claude.py -m ollama/deepseek-r1:32b
# Non-interactive / scripting
python nano_claude.py --print "Write a Python fibonacci function"
python nano_claude.py -p "Explain the Rust borrow checker in 3 sentences" -m gemini/gemini-2.0-flash
# CI / automation (no permission prompts)
python nano_claude.py --accept-all --print "Initialize a Python project with pyproject.toml"
# Debug mode (see tokens + thinking)
python nano_claude.py --thinking --verboseType / and press Tab to autocomplete.
| Command | Description |
|---|---|
/help |
Show all commands |
/clear |
Clear conversation history |
/model |
Show current model + list all available models |
/model <name> |
Switch model (takes effect immediately) |
/config |
Show all current config values |
/config key=value |
Set a config value (persisted to disk) |
/save |
Save session (auto-named by timestamp) |
/save <filename> |
Save session to named file |
/load |
List all saved sessions |
/load <filename> |
Load a saved session |
/history |
Print full conversation history |
/context |
Show message count and token estimate |
/cost |
Show token usage and estimated USD cost |
/verbose |
Toggle verbose mode (tokens + thinking) |
/thinking |
Toggle Extended Thinking (Claude only) |
/permissions |
Show current permission mode |
/permissions <mode> |
Set permission mode: auto / accept-all / manual |
/cwd |
Show current working directory |
/cwd <path> |
Change working directory |
/memory |
List all persistent memories |
/memory <query> |
Search memories by keyword |
/skills |
List available skills |
/agents |
Show sub-agent task status |
/mcp |
List configured MCP servers and their tools |
/mcp reload |
Reconnect all MCP servers and refresh tools |
/mcp reload <name> |
Reconnect a single MCP server |
/mcp add <name> <cmd> [args] |
Add a stdio MCP server to user config |
/mcp remove <name> |
Remove a server from user config |
/voice |
Record voice, transcribe with Whisper, auto-submit as prompt |
/voice status |
Show recording and STT backend availability |
/voice lang <code> |
Set STT language (e.g. zh, en, ja; auto to detect) |
/exit / /quit |
Exit |
Switching models inside a session:
[myproject] β― /model
Current model: claude-opus-4-6 (provider: anthropic)
Available models by provider:
anthropic claude-opus-4-6, claude-sonnet-4-6, ...
openai gpt-4o, gpt-4o-mini, o3-mini, ...
ollama llama3.3, llama3.2, phi4, mistral, ...
...
[myproject] β― /model gpt-4o
Model set to gpt-4o (provider: openai)
[myproject] β― /model ollama/qwen2.5-coder
Model set to ollama/qwen2.5-coder (provider: ollama)
# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=AIza...
export MOONSHOT_API_KEY=sk-... # Kimi
export DASHSCOPE_API_KEY=sk-... # Qwen
export ZHIPU_API_KEY=... # Zhipu GLM
export DEEPSEEK_API_KEY=sk-... # DeepSeek/config anthropic_api_key=sk-ant-...
/config openai_api_key=sk-...
/config gemini_api_key=AIza...
/config kimi_api_key=sk-...
/config qwen_api_key=sk-...
/config zhipu_api_key=...
/config deepseek_api_key=sk-...
Keys are saved to ~/.nano_claude/config.json and loaded automatically on next launch.
// ~/.nano_claude/config.json
{
"model": "qwen/qwen-max",
"max_tokens": 8192,
"permission_mode": "auto",
"verbose": false,
"thinking": false,
"qwen_api_key": "sk-...",
"kimi_api_key": "sk-...",
"deepseek_api_key": "sk-..."
}| Mode | Behavior |
|---|---|
auto (default) |
Read-only operations always allowed. Prompts before Bash commands and file writes. |
accept-all |
Never prompts. All operations proceed automatically. |
manual |
Prompts before every single operation, including reads. |
When prompted:
Allow: Run: git commit -am "fix bug" [y/N/a(ccept-all)]
yβ approve this one actionnor Enter β denyaβ approve and switch toaccept-allfor the rest of the session
Commands always auto-approved in auto mode:
ls, cat, head, tail, wc, pwd, echo, git status, git log, git diff, git show, find, grep, rg, python, node, pip show, npm list, and other read-only shell commands.
| Tool | Description | Key Parameters |
|---|---|---|
Read |
Read file with line numbers | file_path, limit, offset |
Write |
Create or overwrite file (shows diff) | file_path, content |
Edit |
Exact string replacement (shows diff) | file_path, old_string, new_string, replace_all |
Bash |
Execute shell command | command, timeout (default 30s) |
Glob |
Find files by glob pattern | pattern (e.g. **/*.py), path |
Grep |
Regex search in files (uses ripgrep if available) | pattern, path, glob, output_mode |
WebFetch |
Fetch and extract text from URL | url, prompt |
WebSearch |
Search the web via DuckDuckGo | query |
| Tool | Description | Key Parameters |
|---|---|---|
NotebookEdit |
Edit a Jupyter notebook (.ipynb) cell |
notebook_path, new_source, cell_id, cell_type, edit_mode (replace/insert/delete) |
GetDiagnostics |
Get LSP-style diagnostics for a source file (pyright/mypy/flake8 for Python; tsc/eslint for JS/TS; shellcheck for shell) | file_path, language (optional override) |
| Tool | Description | Key Parameters |
|---|---|---|
MemorySave |
Save or update a persistent memory | name, type, description, content, scope |
MemoryDelete |
Delete a memory by name | name, scope |
MemorySearch |
Search memories by keyword (or AI ranking) | query, scope, use_ai, max_results |
MemoryList |
List all memories with age and metadata | scope |
| Tool | Description | Key Parameters |
|---|---|---|
Agent |
Spawn a sub-agent for a task | prompt, subagent_type, isolation, name, model, wait |
SendMessage |
Send a message to a named background agent | name, message |
CheckAgentResult |
Check status/result of a background agent | task_id |
ListAgentTasks |
List all active and finished agent tasks | β |
ListAgentTypes |
List available agent type definitions | β |
| Tool | Description | Key Parameters |
|---|---|---|
Skill |
Invoke a skill by name from within the conversation | name, args |
SkillList |
List all available skills with triggers and metadata | β |
MCP tools are discovered automatically from configured servers and registered under the name mcp__<server>__<tool>. Claude can use them exactly like built-in tools.
| Example tool name | Where it comes from |
|---|---|
mcp__git__git_status |
git server, git_status tool |
mcp__filesystem__read_file |
filesystem server, read_file tool |
mcp__myserver__my_action |
custom server you configured |
Adding custom tools: See Architecture Guide for how to register your own tools.
The model can remember things across conversations using the built-in memory system.
How it works: Memories are stored as markdown files. There are two scopes:
- User scope (
~/.nano_claude/memory/) β follows you across all projects - Project scope (
.nano_claude/memory/in cwd) β specific to the current repo
A MEMORY.md index (β€ 200 lines / 25 KB) is auto-rebuilt on every save or delete and injected into the system prompt so Claude always has an overview.
Memory types:
| Type | Use for |
|---|---|
user |
Your role, preferences, background |
feedback |
How you want the model to behave |
project |
Ongoing work, deadlines, decisions |
reference |
Links to external resources |
Memory file format (~/.nano_claude/memory/coding_style.md):
---
name: coding style
description: Python formatting preferences
type: feedback
created: 2026-04-02
---
Prefer 4-space indentation and full type hints in all Python code.
**Why:** user explicitly stated this preference.
**How to apply:** apply to every Python file written or edited.Example interaction:
You: Remember that I prefer 4-space indentation and type hints in all Python code.
AI: [calls MemorySave] Memory saved: coding_style [feedback/user]
You: /memory
[feedback/user] coding_style (today): Python formatting preferences
You: /memory python
[feedback/user] coding_style: Prefers 4-space indent and type hints in Python
Staleness warnings: Memories older than 1 day get a freshness note in /memory output so you know when to review or update them.
AI-ranked search: MemorySearch(query="...", use_ai=true) uses the model to rank results by relevance rather than simple keyword matching.
Skills are reusable prompt templates that give the model specialized capabilities. Two built-in skills ship out of the box β no setup required.
Built-in skills:
| Trigger | Description |
|---|---|
/commit |
Review staged changes and create a well-structured git commit |
/review [PR] |
Review code or PR diff with structured feedback |
Quick start β custom skill:
mkdir -p ~/.nano_claude/skillsCreate ~/.nano_claude/skills/deploy.md:
---
name: deploy
description: Deploy to an environment
triggers: [/deploy]
allowed-tools: [Bash, Read]
when_to_use: Use when the user wants to deploy a version to an environment.
argument-hint: [env] [version]
arguments: [env, version]
context: inline
---
Deploy $VERSION to the $ENV environment.
Full args: $ARGUMENTSNow use it:
You: /deploy staging 2.1.0
AI: [deploys version 2.1.0 to staging]
Argument substitution:
$ARGUMENTSβ the full raw argument string$ARG_NAMEβ positional substitution by named argument (first word β first name)- Missing args become empty strings
Execution modes:
context: inline(default) β runs inside current conversation historycontext: forkβ runs as an isolated sub-agent with fresh history; supportsmodeloverride
Priority (highest wins): project-level > user-level > built-in
List skills: /skills β shows triggers, argument hint, source, and when_to_use
Skill search paths:
./.nano_claude/skills/ # project-level (overrides user-level)
~/.nano_claude/skills/ # user-level
The model can spawn independent sub-agents to handle tasks in parallel.
Specialized agent types β built-in:
| Type | Optimized for |
|---|---|
general-purpose |
Research, exploration, multi-step tasks |
coder |
Writing, reading, and modifying code |
reviewer |
Security, correctness, and code quality analysis |
researcher |
Web search and documentation lookup |
tester |
Writing and running tests |
Basic usage:
You: Search this codebase for all TODO comments and summarize them.
AI: [calls Agent(prompt="...", subagent_type="researcher")]
Sub-agent reads files, greps for TODOs...
Result: Found 12 TODOs across 5 files...
Background mode β spawn without waiting, collect result later:
AI: [calls Agent(prompt="run all tests", name="test-runner", wait=false)]
AI: [continues other work...]
AI: [calls CheckAgentResult / SendMessage to follow up]
Git worktree isolation β agents work on an isolated branch with no conflicts:
Agent(prompt="refactor auth module", isolation="worktree")
The worktree is auto-cleaned up if no changes were made; otherwise the branch name is reported.
Custom agent types β create ~/.nano_claude/agents/myagent.md:
---
name: myagent
description: Specialized for X
model: claude-haiku-4-5-20251001
tools: [Read, Grep, Bash]
---
Extra system prompt for this agent type.List running agents: /agents
Sub-agents have independent conversation history, share the file system, and are limited to 3 levels of nesting.
MCP lets you connect any external tool server β local subprocess or remote HTTP β and Claude can use its tools automatically. This is the same protocol Claude Code uses to extend its capabilities.
| Transport | Config type |
Description |
|---|---|---|
| stdio | "stdio" |
Spawn a local subprocess (most common) |
| SSE | "sse" |
HTTP Server-Sent Events stream |
| HTTP | "http" |
Streamable HTTP POST (newer servers) |
Place a .mcp.json file in your project directory or edit ~/.nano_claude/mcp.json for user-wide servers.
{
"mcpServers": {
"git": {
"type": "stdio",
"command": "uvx",
"args": ["mcp-server-git"]
},
"filesystem": {
"type": "stdio",
"command": "uvx",
"args": ["mcp-server-filesystem", "/tmp"]
},
"my-remote": {
"type": "sse",
"url": "http://localhost:8080/sse",
"headers": {"Authorization": "Bearer my-token"}
}
}
}Config priority: .mcp.json (project) overrides ~/.nano_claude/mcp.json (user) by server name.
# Install a popular MCP server
pip install uv # uv includes uvx
uvx mcp-server-git --help # verify it works
# Add to user config via REPL
/mcp add git uvx mcp-server-git
# Or create .mcp.json in your project dir, then:
/mcp reload/mcp # list servers + their tools + connection status
/mcp reload # reconnect all servers, refresh tool list
/mcp reload git # reconnect a single server
/mcp add myserver uvx mcp-server-x # add stdio server
/mcp remove myserver # remove from user config
Once connected, Claude can call MCP tools directly:
You: What files changed in the last git commit?
AI: [calls mcp__git__git_diff_staged()]
β shows diff output from the git MCP server
Tool names follow the pattern mcp__<server_name>__<tool_name>. All characters
that are not alphanumeric or _ are automatically replaced with _.
| Server | Install | Provides |
|---|---|---|
mcp-server-git |
uvx mcp-server-git |
git operations (status, diff, log, commit) |
mcp-server-filesystem |
uvx mcp-server-filesystem <path> |
file read/write/list |
mcp-server-fetch |
uvx mcp-server-fetch |
HTTP fetch tool |
mcp-server-postgres |
uvx mcp-server-postgres <conn-str> |
PostgreSQL queries |
mcp-server-sqlite |
uvx mcp-server-sqlite --db-path x.db |
SQLite queries |
mcp-server-brave-search |
uvx mcp-server-brave-search |
Brave web search |
Browse the full registry at modelcontextprotocol.io/servers
The plugin/ package lets you extend nano-claude-code with additional tools, skills, and MCP servers from git repositories or local directories.
/plugin install my-plugin@https://github.com/user/my-plugin
/plugin install local-plugin@/path/to/local/plugin/plugin # list installed plugins
/plugin enable my-plugin # enable a disabled plugin
/plugin disable my-plugin # disable without uninstalling
/plugin disable-all # disable all plugins
/plugin update my-plugin # pull latest from git
/plugin uninstall my-plugin
/plugin info my-plugin # show manifest details/plugin recommend # auto-detect from project files
/plugin recommend "docker database" # recommend by keyword contextThe engine matches your context against a curated marketplace (git-tools, python-linter, docker-tools, sql-tools, test-runner, diagram-tools, aws-tools, web-scraper) using tag and keyword scoring.
{
"name": "my-plugin",
"version": "0.1.0",
"description": "Does something useful",
"author": "you",
"tags": ["git", "python"],
"tools": ["tools"], // Python module(s) that export TOOL_DEFS
"skills": ["skills/my.md"],
"mcp_servers": {},
"dependencies": ["httpx"] // pip packages
}Alternatively use YAML frontmatter in PLUGIN.md.
| Scope | Location | Config |
|---|---|---|
| user (default) | ~/.nano_claude/plugins/ |
~/.nano_claude/plugins.json |
| project | .nano_claude/plugins/ |
.nano_claude/plugins.json |
Use --project flag: /plugin install name@url --project
Claude can pause mid-task and interactively ask you a question before proceeding.
Example invocation by Claude:
{
"tool": "AskUserQuestion",
"question": "Which database should I use?",
"options": [
{"label": "SQLite", "description": "Simple, file-based"},
{"label": "PostgreSQL", "description": "Full-featured, requires server"}
],
"allow_freetext": true
}What you see in the terminal:
β Question from assistant:
Which database should I use?
[1] SQLite β Simple, file-based
[2] PostgreSQL β Full-featured, requires server
[0] Type a custom answer
Your choice (number or text):
- Select by number or type free text directly
- Claude receives your answer and continues the task
- 5-minute timeout (returns "(no answer β timeout)" if unanswered)
The task/ package gives Claude (and you) a structured task list for tracking multi-step work within a session.
| Tool | Parameters | What it does |
|---|---|---|
TaskCreate |
subject, description, active_form?, metadata? |
Create a task; returns #id created: subject |
TaskUpdate |
task_id, subject?, description?, status?, owner?, add_blocks?, add_blocked_by?, metadata? |
Update any field; status='deleted' removes the task |
TaskGet |
task_id |
Return full details of one task |
TaskList |
(none) | List all tasks with status icons and pending blockers |
Valid statuses: pending β in_progress β completed / cancelled / deleted
TaskUpdate(task_id="3", add_blocked_by=["1","2"])
# Task 3 is now blocked by tasks 1 and 2.
# Reverse edges are set automatically: tasks 1 and 2 get task 3 in their "blocks" list.
Completed tasks are treated as resolved β TaskList hides their blocking effect on dependents.
Tasks are saved to .nano_claude/tasks.json in the current working directory after every mutation and reloaded on first access.
/tasks list all tasks
/tasks create <subject> quick-create a task
/tasks start <id> mark in_progress
/tasks done <id> mark completed
/tasks cancel <id> mark cancelled
/tasks delete <id> remove a task
/tasks get <id> show full details
/tasks clear delete all tasks
User: implement the login feature
Claude:
TaskCreate(subject="Design auth schema", description="JWT vs session") β #1
TaskCreate(subject="Write login endpoint", description="POST /auth/login") β #2
TaskCreate(subject="Write tests", description="Unit + integration") β #3
TaskUpdate(task_id="2", add_blocked_by=["1"])
TaskUpdate(task_id="3", add_blocked_by=["2"])
TaskUpdate(task_id="1", status="in_progress", active_form="Designing schema")
... (does the work) ...
TaskUpdate(task_id="1", status="completed")
TaskList() β task 2 is now unblocked
...
Nano Claude Code v3.05 adds a fully offline voice-to-prompt pipeline. Speak your request β it is transcribed and submitted as if you had typed it.
# 1. Install a recording backend (choose one)
pip install sounddevice # recommended: cross-platform, no extra binary
# sudo apt install alsa-utils # Linux arecord fallback
# sudo apt install sox # SoX rec fallback
# 2. Install a local STT backend (recommended β works offline, no API key)
pip install faster-whisper numpy
# 3. Start Nano Claude Code and speak
python nano_claude.py
[myproject] β― /voice
π Listeningβ¦ (speak now, auto-stops on silence, Ctrl+C to cancel)
π ββββ
β Transcribed: "fix the authentication bug in user.py"
[auto-submittingβ¦]| Backend | Install | Notes |
|---|---|---|
faster-whisper |
pip install faster-whisper |
Recommended β local, offline, fastest, GPU optional |
openai-whisper |
pip install openai-whisper |
Local, offline, original OpenAI model |
| OpenAI Whisper API | set OPENAI_API_KEY |
Cloud, requires internet + API key |
Override the Whisper model size with NANO_CLAUDE_WHISPER_MODEL (default: base):
export NANO_CLAUDE_WHISPER_MODEL=small # better accuracy, slower
export NANO_CLAUDE_WHISPER_MODEL=tiny # fastest, lightest| Backend | Install | Notes |
|---|---|---|
sounddevice |
pip install sounddevice |
Recommended β cross-platform, Python-native |
arecord |
sudo apt install alsa-utils |
Linux ALSA, no pip needed |
sox rec |
sudo apt install sox / brew install sox |
Built-in silence detection |
Before each recording, Nano extracts coding vocabulary from:
- Git branch (e.g.
feat/voice-inputβ "feat", "voice", "input") - Project root name (e.g. "nano-claude-code")
- Recent source file stems (e.g.
authentication_handler.pyβ "authentication", "handler") - Global coding terms:
MCP,grep,TypeScript,OAuth,regex,gRPC, β¦
These are passed as Whisper's initial_prompt so the STT engine prefers correct spellings of coding terms.
| Command | Description |
|---|---|
/voice |
Record voice and auto-submit the transcript as your next prompt |
/voice status |
Show which recording and STT backends are available |
/voice lang <code> |
Set transcription language (en, zh, ja, de, fr, β¦ default: auto) |
| Claude Code | Nano Claude Code v3.05 | |
|---|---|---|
| STT service | Anthropic private WebSocket (voice_stream) |
faster-whisper / openai-whisper / OpenAI API |
| Requires Anthropic OAuth | Yes | No |
| Works offline | No | Yes (with local Whisper) |
| Keyterm hints | Deepgram keyterms param |
Whisper initial_prompt (git + files + vocab) |
| Language support | Server-allowlisted codes | Any language Whisper supports |
Long conversations are automatically compressed to stay within the model's context window.
Two layers:
- Snip β Old tool outputs (file reads, bash results) are truncated after a few turns. Fast, no API cost.
- Auto-compact β When token usage exceeds 70% of the context limit, older messages are summarized by the model into a concise recap.
This happens transparently. You don't need to do anything.
When the model edits or overwrites a file, you see a git-style diff:
Changes applied to config.py:
--- a/config.py
+++ b/config.py
@@ -12,7 +12,7 @@
"model": "claude-opus-4-6",
- "max_tokens": 8192,
+ "max_tokens": 16384,
"permission_mode": "auto",Green lines = added, red lines = removed. New file creations show a summary instead.
Place a CLAUDE.md file in your project to give the model persistent context about your codebase. Nano Claude automatically finds and injects it into the system prompt.
~/.claude/CLAUDE.md # Global β applies to all projects
/your/project/CLAUDE.md # Project-level β found by walking up from cwd
Example CLAUDE.md:
# Project: FastAPI Backend
## Stack
- Python 3.12, FastAPI, PostgreSQL, SQLAlchemy 2.0, Alembic
- Tests: pytest, coverage target 90%
## Conventions
- Format with black, lint with ruff
- Full type annotations required
- New endpoints must have corresponding tests
## Important Notes
- Never hard-code credentials β use environment variables
- Do not modify existing Alembic migration files
- The `staging` branch deploys automatically to staging on push# Inside REPL:
/save # auto-name: session_20260401_143022.json
/save debug_auth_bug # named save
/load # list all saved sessions
/load debug_auth_bug # resume a session
/load session_20260401_143022.jsonSessions are stored as JSON in ~/.nano_claude/sessions/.
nano_claude_code/
βββ nano_claude.py # Entry point: REPL + slash commands + diff rendering
βββ agent.py # Agent loop: streaming, tool dispatch, compaction
βββ providers.py # Multi-provider: Anthropic, OpenAI-compat streaming
βββ tools.py # Core tools (Read/Write/Edit/Bash/Glob/Grep/Web/NotebookEdit/GetDiagnostics) + registry wiring
βββ tool_registry.py # Tool plugin registry: register, lookup, execute
βββ compaction.py # Context compression: snip + auto-summarize
βββ context.py # System prompt builder: CLAUDE.md + git + memory
βββ config.py # Config load/save/defaults
β
βββ multi_agent/ # Multi-agent package
β βββ __init__.py # Re-exports
β βββ subagent.py # AgentDefinition, SubAgentManager, worktree helpers
β βββ tools.py # Agent, SendMessage, CheckAgentResult, ListAgentTasks, ListAgentTypes
βββ subagent.py # Backward-compat shim β multi_agent/
β
βββ memory/ # Memory package
β βββ __init__.py # Re-exports
β βββ types.py # MEMORY_TYPES and format guidance
β βββ store.py # save/load/delete/search, MEMORY.md index rebuilding
β βββ scan.py # MemoryHeader, age/freshness helpers
β βββ context.py # get_memory_context(), truncation, AI search
β βββ tools.py # MemorySave, MemoryDelete, MemorySearch, MemoryList
βββ memory.py # Backward-compat shim β memory/
β
βββ skill/ # Skill package
β βββ __init__.py # Re-exports; imports builtin to register built-ins
β βββ loader.py # SkillDef, parse, load_skills, find_skill, substitute_arguments
β βββ builtin.py # Built-in skills: /commit, /review
β βββ executor.py # execute_skill(): inline or forked sub-agent
β βββ tools.py # Skill, SkillList
βββ skills.py # Backward-compat shim β skill/
β
βββ mcp/ # MCP (Model Context Protocol) package
β βββ __init__.py # Re-exports
β βββ types.py # MCPServerConfig, MCPTool, MCPServerState, JSON-RPC helpers
β βββ client.py # StdioTransport, HttpTransport, MCPClient, MCPManager
β βββ config.py # Load .mcp.json (project) + ~/.nano_claude/mcp.json (user)
β βββ tools.py # Auto-discover + register MCP tools into tool_registry
β
βββ voice/ # Voice input package (v3.05)
β βββ __init__.py # Public API: check_voice_deps, voice_input
β βββ recorder.py # Audio capture: sounddevice β arecord β sox rec
β βββ stt.py # STT: faster-whisper β openai-whisper β OpenAI API
β βββ keyterms.py # Coding-domain vocab from git branch + project files
β
βββ tests/ # 239+ unit tests
βββ test_mcp.py
βββ test_memory.py
βββ test_skills.py
βββ test_subagent.py
βββ test_tool_registry.py
βββ test_compaction.py
βββ test_diff_view.py
βββ test_voice.py # 29 voice tests (no hardware required)
For developers: Each feature package (
multi_agent/,memory/,skill/,mcp/,voice/) is self-contained. Add custom tools by callingregister_tool(ToolDef(...))from any module imported bytools.py.
Q: How do I add an MCP server?
Option 1 β via REPL (stdio server):
/mcp add git uvx mcp-server-git
Option 2 β create .mcp.json in your project:
{
"mcpServers": {
"git": {"type": "stdio", "command": "uvx", "args": ["mcp-server-git"]}
}
}Then run /mcp reload or restart. Use /mcp to check connection status.
Q: An MCP server is showing an error. How do I debug it?
/mcp # shows error message per server
/mcp reload git # try reconnecting
If the server uses stdio, make sure the command is in your $PATH:
which uvx # should print a path
uvx mcp-server-git # run manually to see errorsQ: Can I use MCP servers that require authentication?
For HTTP/SSE servers with a Bearer token:
{
"mcpServers": {
"my-api": {
"type": "sse",
"url": "https://myserver.example.com/sse",
"headers": {"Authorization": "Bearer sk-my-token"}
}
}
}For stdio servers with env-based auth:
{
"mcpServers": {
"brave": {
"type": "stdio",
"command": "uvx",
"args": ["mcp-server-brave-search"],
"env": {"BRAVE_API_KEY": "your-key"}
}
}
}Q: Tool calls don't work with my local Ollama model.
Not all models support function calling. Use one of the recommended tool-calling models: qwen2.5-coder, llama3.3, mistral, or phi4.
ollama pull qwen2.5-coder
python nano_claude.py --model ollama/qwen2.5-coderQ: How do I connect to a remote GPU server running vLLM?
/config custom_base_url=http://your-server-ip:8000/v1
/config custom_api_key=your-token
/model custom/your-model-name
Q: How do I check my API cost?
/cost
Input tokens: 3,421
Output tokens: 892
Est. cost: $0.0648 USD
Q: Can I use multiple API keys in the same session?
Yes. Set all the keys you need upfront (via env vars or /config). Then switch models freely β each call uses the key for the active provider.
Q: How do I make a model available across all projects?
Add keys to ~/.bashrc or ~/.zshrc. Set the default model in ~/.nano_claude/config.json:
{ "model": "claude-sonnet-4-6" }Q: Qwen / Zhipu returns garbled text.
Ensure your DASHSCOPE_API_KEY / ZHIPU_API_KEY is correct and the account has sufficient quota. Both providers use UTF-8 and handle Chinese well.
Q: Can I pipe input to nano claude?
echo "Explain this file" | python nano_claude.py --print --accept-all
cat error.log | python nano_claude.py -p "What is causing this error?"Q: How do I run it as a CLI tool from anywhere?
# Add an alias to ~/.bashrc or ~/.zshrc
alias nc='python /path/to/nano_claude_code/nano_claude.py'
# Or install as a script
pip install -e . # if setup.py existsQ: How do I set up voice input?
# Minimal setup (local, offline, no API key):
pip install sounddevice faster-whisper numpy
# Then in the REPL:
/voice status # verify backends are detected
/voice # speak your promptOn first use, faster-whisper downloads the base model (~150 MB) automatically.
Use a larger model for better accuracy: export NANO_CLAUDE_WHISPER_MODEL=small
Q: Voice input transcribes my words wrong (misses coding terms).
The keyterm booster already injects coding vocabulary from your git branch and project files.
For persistent domain terms, put them in a .nano_claude/voice_keyterms.txt file (one term per line) β this is checked automatically on each recording.
Q: Can I use voice input in Chinese / Japanese / other languages?
Yes. Set the language before recording:
/voice lang zh # Mandarin Chinese
/voice lang ja # Japanese
/voice lang auto # reset to auto-detect (default)
Whisper supports 99 languages. auto detection works well but explicit codes improve accuracy for short utterances.

