SmallCode

AI coding agent optimized for small LLMs (≤20B parameters)

SmallCode is a terminal-native coding agent designed from the ground up to extract useful work from local models (7B-20B) running on consumer hardware. While tools like OpenCode assume frontier models with 128k+ context and perfect tool calling, SmallCode compensates for the limitations of small models through intelligent architecture.

Why SmallCode?

	OpenCode	SmallCode
Target	Frontier models (Claude, GPT-5)	7B-20B local models
Context	Dumps everything	Budget-managed, summarized
Tool calling	Assumes reliable JSON	Forgiving multi-format parser
Planning	Single-shot	TODO-file decomposed steps
Editing	Full file write	Search-and-replace patch
Privacy	API calls to cloud	Fully local, no network needed

Quick Start

# Install globally via npm
npm install -g smallcode

# Or run directly with npx
npx smallcode

# Start in your project directory
cd my-project
smallcode

Prebuilt Binaries (no Node.js needed)

Pre-compiled tarballs for Windows, macOS, and Linux are built on every release — they bundle Node.js plus all native addons so you never need node-gyp or C++ build tools.

Platform	One‑line install
Linux / macOS	`bash <(curl -fsSL https://raw.githubusercontent.com/Doorman11991/smallcode/master/install.sh)`
Windows	`iwr -Uri https://raw.githubusercontent.com/Doorman11991/smallcode/master/install.ps1 -UseBasicParsing | iex`

The install script downloads the correct tarball for your platform, extracts it to ~/.smallcode, and adds it to your PATH. Run smallcode --help to verify.

SmallCode includes BoneScript and budget-aware-mcp as dependencies — everything installs in one go.

Requirements

Node.js 18+ when installing via npm (LTS recommended: 20.x and 22.x have prebuilt binaries for SQLite). The prebuilt tarballs bundle their own Node.js.
A local LLM server (LM Studio, Ollama, or any OpenAI-compatible endpoint)

Optional (for code graph + FTS5 memory search):

better-sqlite3 ships prebuilt binaries for Node LTS releases (20.x, 22.x) on Linux, macOS, and Windows, so no build tools are needed there.
On other Node versions (for example 23.x or 25.x) it has to be compiled natively, which requires:
- Linux: python3, make, gcc/g++ (sudo apt install build-essential python3 or pacman -S base-devel python)
- macOS: Xcode Command Line Tools (xcode-select --install)
- Windows: Visual Studio Build Tools with the "Desktop development with C++" workload, or npm install -g windows-build-tools
If the build fails, SmallCode still works: it falls back to JSON-based memory automatically.

Configuration

Create a .env file in your project root:

# Required
SMALLCODE_MODEL=your-model-name
SMALLCODE_BASE_URL=http://localhost:1234/v1

# Optional: escalation (auto-fallback to cloud on hard fail)
# ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# DEEPSEEK_API_KEY=sk-...

See .env.example for all options. Also supports smallcode.toml for backwards compatibility.

Architecture

SmallCode is built with a modular architecture:

bin/
├── smallcode.js        Entry point, agent loop, TUI orchestration (1570 lines)
├── config.js           Config loading, endpoint detection, auth headers
├── executor.js         Tool execution (all 18 tools)
├── tools.js            Tool definitions + 2-stage routing
├── mcp_bridge.js       Built-in code graph MCP communication
├── model_client.js     LLM API calls, streaming, validation
├── governor.js         Tool scoring, verification, decompose
├── escalation.js       Cloud model fallback (Claude/OpenAI/DeepSeek)
├── commands.js         TUI slash commands
├── tui.js              Classic TUI renderer
└── bonescript_guide.js BoneScript syntax reference

src/
├── api/index.js        Programmatic API (require('smallcode'))
├── tui/fullscreen.js   Fullscreen alternate-buffer TUI
├── plugins/loader.js   Plugin system
├── plugins/skills.js   Skill system
├── tools/              Tool routing, MCP client, validators
├── governor/           Early-stop detection, verifier, tool scorer
├── model/              Multi-model profiles + routing
└── session/            Persistence, undo, sharing, references

Key Features

MarrowScript Cognition Layer

SmallCode's intelligence is declared in MarrowScript and compiled to a production runtime. One 50-line .marrow declaration generates 1400+ lines of TypeScript with caching, retry, validation, traces, and budget enforcement, none of which has to be written by hand.

prompt classify_task_type(user_message: string) {
  model: TinyClassifier
  timeout: 3s
  cache: { key: hash(user_message), ttl: 10m }
  retry: { max_attempts: 2, backoff: fixed, interval: 100ms }
  constraints: [output in ["coding", "editing", "search", ...]]
}

The compiled cognition layer provides:

Prompt caching — 0ms on cache hit, content-hash keys with TTL
Structured traces — trace_id/span_id for every LLM call (enable with SMALLCODE_COGNITION_LOG=stderr)
Tier-based routing — trivial tasks → tiny model, complex tasks → medium model
Token budgets — per-cost-class enforcement, never overspend
Validation + repair — schema checks with auto-retry on malformed output
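
To make the prompt-caching bullet concrete, here is a minimal sketch of a cache with content-hash keys and a TTL. It is not the code MarrowScript generates; `PromptCache`, its methods, and the injectable clock are hypothetical names chosen for illustration.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical in-memory cache keyed by a SHA-256 hash of the prompt text.
// Illustrates content-hash keys with a TTL; not SmallCode's actual runtime.
class PromptCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful in tests
    this.entries = new Map();
  }

  static keyFor(prompt) {
    return createHash('sha256').update(prompt).digest('hex');
  }

  get(prompt) {
    const key = PromptCache.keyFor(prompt);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(prompt, value) {
    const key = PromptCache.keyFor(prompt);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A hit skips the model call entirely, which is where the latency saving comes from; a 10-minute TTL like the one in the declaration above would be `new PromptCache(10 * 60 * 1000)`.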

BoneScript Integration

For Node.js/TypeScript backends, SmallCode uses BoneScript — write ONE .bone file and compile it to a complete project (routes, auth, DB, events, migrations, SDK, admin panel, Docker, CI). Reduces 8-15 tool calls to 1-2, dramatically improving reliability with small models.

Model Escalation

When the local model hard fails after retry + decompose, SmallCode can optionally escalate to a stronger cloud model (Claude, OpenAI, DeepSeek). Fully opt-in — requires an API key. Session-limited to prevent runaway costs.

Escalation targets (cloud, used only on hard fail):

Claude Sonnet 4.5 / 4.6, Haiku 4.5
GPT-5.4 Mini / Nano
DeepSeek V4 / V4 Pro / V4 Flash
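
The escalation rules described in this section (opt-in via API key, only after retry and decompose have failed, limited per session) can be sketched as a small gate. The function name `createEscalationGate`, the state flags, and the default cap of 3 are assumptions for illustration, not SmallCode's real values.

```javascript
// Hypothetical escalation gate. The per-session cap of 3 is an assumed value.
function createEscalationGate({ apiKey, maxPerSession = 3 }) {
  let used = 0;
  return {
    shouldEscalate({ failed, retried, decomposed }) {
      if (!apiKey) return false; // opt-in: without a key, never call the cloud
      if (!(failed && retried && decomposed)) return false; // local recovery not exhausted yet
      if (used >= maxPerSession) return false; // session limit reached
      used += 1;
      return true;
    },
    get used() {
      return used;
    },
  };
}
```

Keeping the counter inside the closure means a single session cannot exceed the cap no matter how many tasks fail.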

Context Budget Engine

Never exceeds your model's context window. Tool results capped at 4k chars, mid-turn eviction drops old results when context grows too large, and semantic compression summarizes history instead of dropping it.
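
A rough sketch of two of these mechanisms, the 4k-character cap and mid-turn eviction, is shown below. It measures size in characters rather than tokens and replaces evicted results with a stub instead of a semantic summary, so treat it as an illustration under those assumptions; the function names are hypothetical.

```javascript
const MAX_TOOL_RESULT_CHARS = 4000; // the 4k-character cap described above

// Truncate a tool result and note how much was cut.
function capToolResult(text, limit = MAX_TOOL_RESULT_CHARS) {
  if (text.length <= limit) return text;
  const omitted = text.length - limit;
  return `${text.slice(0, limit)}\n[truncated ${omitted} chars]`;
}

// Replace the oldest tool results with a short stub until the history fits the budget.
// Returns a new array; the input messages are not modified.
function evictOldToolResults(messages, budgetChars) {
  const result = messages.map((m) => ({ ...m }));
  const size = () => result.reduce((sum, m) => sum + m.content.length, 0);
  for (const m of result) {
    if (size() <= budgetChars) break;
    if (m.role === 'tool') m.content = '[evicted]';
  }
  return result;
}
```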

2-Stage Tool Routing

Halves the schema context overhead. Model picks a category (read/write/search/run/plan) first, then gets only relevant tool schemas. Critical for models with 8-16k context.
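
The two stages can be sketched as follows. The category names come from the description above; the assignment of tools to categories and the function names are assumptions made for this example.

```javascript
// Assumed mapping from category to tool names; the real grouping may differ.
const TOOL_CATEGORIES = {
  read: ['read_file', 'explain_symbol'],
  write: ['write_file', 'patch'],
  search: ['search', 'find_files', 'graph_search'],
  run: ['bash'],
  plan: ['memory_load', 'memory_remember'],
};

// Stage 1: the model only sees the category names, not any tool schema.
function stageOnePrompt() {
  return `Pick one category: ${Object.keys(TOOL_CATEGORIES).join(', ')}`;
}

// Stage 2: only the schemas for the chosen category are sent to the model.
function stageTwoSchemas(category, allSchemas) {
  const names = TOOL_CATEGORIES[category];
  if (!names) throw new Error(`Unknown category: ${category}`);
  return allSchemas.filter((schema) => names.includes(schema.name));
}
```

The saving comes from stage 2: a model that chose `write` receives two schemas instead of the full tool list.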

Early-Stop Detection

Detects repetition loops, patch spirals (stuck on corrupted file → forces rewrite), and greeting regression (model lost context → re-injects task). Saves tokens and time.
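
Two of these checks can be approximated with simple heuristics. The sketch below is an assumption about how such detectors might look, not the project's actual logic: the threshold of 3 identical calls and the greeting pattern are made up for illustration.

```javascript
// True when the last `threshold` tool calls are identical (same name and same arguments).
// Arguments are compared via JSON.stringify, so key order matters.
function isRepetitionLoop(toolCalls, threshold = 3) {
  if (toolCalls.length < threshold) return false;
  const signature = (call) => `${call.name}:${JSON.stringify(call.args)}`;
  const recent = toolCalls.slice(-threshold).map(signature);
  return recent.every((sig) => sig === recent[0]);
}

// True when a mid-task reply looks like a fresh chat greeting,
// which suggests the model has lost track of the task.
function isGreetingRegression(reply) {
  return /^(hi|hello|hey)\b.*\b(help|assist)\b/i.test(reply.trim());
}
```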

Forgiving Tool Call Parser

Small models produce messy output. SmallCode parses tool calls from JSON, YAML, XML, Hermes format, or plain text. Auto-repairs common mistakes (wrong param names, type mismatches).
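
The sketch below shows the general idea for three of these formats: raw JSON, a Hermes-style `<tool_call>` wrapper, and plain `name key=value` text. YAML and XML are left out for brevity. The alias table and function names are hypothetical, and the real parser is likely more thorough.

```javascript
// Assumed aliases for commonly misnamed parameters; purely illustrative.
const PARAM_ALIASES = { file: 'path', filename: 'path', file_path: 'path' };

function repairArgs(args) {
  const fixed = {};
  for (const [key, value] of Object.entries(args)) {
    fixed[PARAM_ALIASES[key] ?? key] = value;
  }
  return fixed;
}

function tryJson(text) {
  try {
    const obj = JSON.parse(text);
    if (!obj || typeof obj.name !== 'string') return null;
    return { name: obj.name, args: obj.arguments ?? obj.args ?? {} };
  } catch {
    return null; // not valid JSON
  }
}

function parseToolCall(output) {
  const text = output.trim();
  const tagged = text.match(/<tool_call>([\s\S]*?)<\/tool_call>/); // Hermes-style wrapper
  const braced = text.match(/\{[\s\S]*\}/); // JSON object embedded in chatter
  const candidates = [tagged && tagged[1].trim(), text, braced && braced[0]];
  for (const candidate of candidates) {
    const call = candidate && tryJson(candidate);
    if (call) return { name: call.name, args: repairArgs(call.args) };
  }
  // Plain text fallback, e.g. "read_file path=src/a.js"
  const plain = text.match(/^(\w+)\s+((?:\w+=\S+\s*)+)$/);
  if (plain) {
    const pairs = plain[2].trim().split(/\s+/).map((kv) => {
      const i = kv.indexOf('=');
      return [kv.slice(0, i), kv.slice(i + 1)];
    });
    return { name: plain[1], args: repairArgs(Object.fromEntries(pairs)) };
  }
  return null; // nothing recognizable: the caller can ask the model to retry
}
```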

Patch-First Editing

Search-and-replace as the primary edit primitive. Small models can't reliably reproduce entire files — they truncate, hallucinate, or drift. patch is safer and more context-efficient.
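
A minimal sketch of a search-and-replace primitive follows. Requiring exactly one match is an assumption made here for safety; whether SmallCode's `patch` tool enforces the same rule is not stated above, and `applyPatch` is a hypothetical name.

```javascript
// Replace `search` with `replace` in `content`, but only if `search` occurs exactly once.
function applyPatch(content, search, replace) {
  if (search === '') throw new Error('search text must not be empty');
  const first = content.indexOf(search);
  if (first === -1) throw new Error('search text not found');
  if (content.indexOf(search, first + 1) !== -1) {
    throw new Error('search text is ambiguous (matches more than once)');
  }
  // Slice and concatenate instead of String.prototype.replace so that
  // "$&" and similar sequences in `replace` are inserted literally.
  return content.slice(0, first) + replace + content.slice(first + search.length);
}
```

The model only has to reproduce the few lines it wants to change, which is why this costs far less context than rewriting the whole file.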

TODO-Driven Planning

Complex tasks get decomposed into atomic steps. The model reads a TODO file each turn to know where it is. Each step is validated (lint/compile) before moving on.
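
The loop can be sketched as below, assuming the TODO file uses markdown-style checkboxes (`- [ ]` and `- [x]`). That file format, the function names, and the `validate` callback standing in for a lint or compile check are all assumptions for illustration.

```javascript
// Parse lines such as "- [ ] add tests" or "- [x] create file".
function parseTodo(text) {
  return text
    .split('\n')
    .map((line) => line.match(/^- \[( |x)\] (.+)$/))
    .filter(Boolean)
    .map((m) => ({ done: m[1] === 'x', title: m[2] }));
}

// The first unfinished step tells the model where it is.
function nextStep(steps) {
  return steps.find((s) => !s.done) ?? null;
}

// `validate` stands in for a lint/compile check and returns true on success.
function completeStep(steps, validate) {
  const step = nextStep(steps);
  if (!step) return false;
  if (!validate(step)) return false; // stay on this step until validation passes
  step.done = true;
  return true;
}

function renderTodo(steps) {
  return steps.map((s) => `- [${s.done ? 'x' : ' '}] ${s.title}`).join('\n');
}
```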

Model Profiles

Per-model configuration: context length, tool format (native/hermes/json/xml/text), chat template, strengths/weaknesses. Auto-adapts prompting strategy.
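
As a rough illustration, profile detection could be a pattern match on the model name with a conservative default. The model names, field names, and values below are invented for the example and do not describe any real model or the contents of the profiles directory.

```javascript
// Hypothetical profile entries; names and values are made up for illustration.
const PROFILES = [
  { match: /example-coder/i, contextLength: 16384, toolFormat: 'hermes' },
  { match: /example-chat/i, contextLength: 8192, toolFormat: 'json' },
];

// Conservative fallback for unknown models: small context, plain-text tool calls.
const DEFAULT_PROFILE = { contextLength: 8192, toolFormat: 'text' };

function detectProfile(modelName) {
  const found = PROFILES.find((p) => p.match.test(modelName));
  if (!found) return { ...DEFAULT_PROFILE };
  const { match, ...profile } = found; // drop the matcher from the returned profile
  return profile;
}
```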

Working Memory

Persistent scratchpad that survives across turns. Compensates for limited reasoning depth — the model can write notes to itself.
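
One simple way to picture such a scratchpad is a list of notes stored in a JSON file and rendered into the prompt each turn. The class name, the on-disk format, and the numbered rendering are assumptions for this sketch, not a description of SmallCode's storage.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical scratchpad backed by a JSON file so notes outlive a single turn.
class WorkingMemory {
  constructor(filePath) {
    this.filePath = filePath;
    this.notes = fs.existsSync(filePath)
      ? JSON.parse(fs.readFileSync(filePath, 'utf8'))
      : [];
  }

  // Append a note and write the whole list back to disk.
  remember(note) {
    this.notes.push(note);
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
    fs.writeFileSync(this.filePath, JSON.stringify(this.notes, null, 2));
  }

  // Render the notes as a numbered list for injection into the next prompt.
  toPrompt() {
    return this.notes.map((note, i) => `${i + 1}. ${note}`).join('\n');
  }
}
```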

Commands

Command	Description
`/quit`, `/q`	Exit SmallCode
`/clear`	Reset conversation
`/stats`	Show session statistics
`/tokens`	Detailed token usage report
`/budget`	Context window budget + visual bar
`/trace`	List/show/export execution traces
`/eval`	Run prompt evaluation suites
`/memory`	Show working memory
`/plan`	Show current task plan
`/model`	Show/switch model
`/profile`	Show detected model profile + routing mode
`/cognition`	Show MarrowScript cognition layer status
`/mcp`	Show connected external MCP servers
`/skill`	Manage reusable skills
`/plugin`	Install/manage plugins
`/sessions`	List/resume saved sessions
`/help`	Show all commands

Observability

SmallCode tracks token usage and execution traces automatically:

Token Monitor — Every LLM call records prompt/completion tokens. View with /tokens.
Context Budget — Visual indicator of context window usage. View with /budget.
Execution Traces — Every agent turn is recorded to .smallcode/traces/. View with /trace list.
Trace-to-Test — Generate regression tests from traces: /trace test <id>.
Prompt Evaluations — Measure classifier accuracy and tool selection: /eval classify_accuracy.

# Run evaluations from CLI
smallcode --eval classify_accuracy
smallcode --eval tool_selection
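
To illustrate the token monitor and the context budget bar, here is a small sketch. `TokenMonitor`, `renderBudgetBar`, and the bar's exact appearance are hypothetical; the real /tokens and /budget output may look different.

```javascript
// Hypothetical per-call token recorder.
class TokenMonitor {
  constructor() {
    this.calls = [];
  }

  record({ promptTokens, completionTokens }) {
    this.calls.push({ promptTokens, completionTokens });
  }

  totals() {
    const sum = (key) => this.calls.reduce((acc, call) => acc + call[key], 0);
    const promptTokens = sum('promptTokens');
    const completionTokens = sum('completionTokens');
    return {
      calls: this.calls.length,
      promptTokens,
      completionTokens,
      totalTokens: promptTokens + completionTokens,
    };
  }
}

// Draw a fixed-width text bar showing how much of the context window is in use.
function renderBudgetBar(used, limit, width = 20) {
  const ratio = Math.min(used / limit, 1); // clamp so the bar never overflows
  const filled = Math.round(ratio * width);
  const percent = Math.round(ratio * 100);
  return `[${'#'.repeat(filled)}${'-'.repeat(width - filled)}] ${percent}%`;
}
```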

Programmatic API

Use SmallCode as a library in your own tools, CI pipelines, or TypeScript frameworks:

const { SmallCode } = require('smallcode');

async function main() {
  const agent = new SmallCode({
    model: 'gemma-4-e4b',
    baseUrl: 'http://localhost:1234/v1',
  });

  // Subscribe to events before running so none are missed
  agent.on('tool_start', ({ name, args }) => console.log(`Using: ${name}`));
  agent.on('tool_end', ({ name, ms }) => console.log(`Done: ${name} (${ms}ms)`));
  agent.on('error', (err) => console.error(err));

  // Run a task (await must be inside an async function in CommonJS)
  const result = await agent.run("create hello.py that prints hello world");
  console.log(result.filesCreated);      // e.g. ['hello.py']
  console.log(result.toolCalls.length);  // e.g. 1
  console.log(result.success);           // true on success
}

main().catch(console.error);

Returns a structured RunResult with: response text, tool call records, files created/edited, token usage, duration, and success status.

Tools

Tool	Description
`bone_compile`	Compile .bone to full backend project
`bone_check`	Validate .bone file (type errors, constraints)
`list_projects`	List all indexed projects with stats
`graph_search`	Code graph symbol search
`explain_symbol`	Full symbol explanation (callers, callees)
`read_file`	Read file contents
`write_file`	Create/overwrite files
`patch`	Search-and-replace edit
`bash`	Run shell commands
`search`	Regex search (ripgrep)
`find_files`	Glob file search
`memory_load`	Load relevant project memory
`memory_remember`	Save knowledge to memory
`web_search`	Search the web via DuckDuckGo (requires `SMALLCODE_WEB_BROWSE=true`)
`web_fetch`	Fetch and extract text from a URL (requires `SMALLCODE_WEB_BROWSE=true`)

Web Browsing

SmallCode includes Playwright with stealth mode for undetected web browsing. Disabled by default — enable for medium/large models (20B+) that can synthesize web context effectively:

# In your .env
SMALLCODE_WEB_BROWSE=true

When enabled, the model can search the web and fetch documentation during tasks. Uses headless Chromium with anti-detection to avoid CAPTCHAs and bot blocks. Falls back to simple HTTP fetch if Playwright isn't available.
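
The enable flag and the fallback can be sketched as a small strategy picker. `pickFetchStrategy` and the injected `loadPlaywright` loader are hypothetical; only the `SMALLCODE_WEB_BROWSE` variable and the fallback order come from the description above.

```javascript
// Decide how web requests would be served.
// `loadPlaywright` is injected so the fallback can be exercised without the package installed.
function pickFetchStrategy(env, loadPlaywright = () => require('playwright')) {
  if (env.SMALLCODE_WEB_BROWSE !== 'true') return 'disabled'; // off by default
  try {
    loadPlaywright();
    return 'playwright'; // headless browser available
  } catch {
    return 'http'; // Playwright missing: fall back to a plain HTTP fetch
  }
}
```

In a real process the first argument would be `process.env`.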

License

MIT
