Skip to content

Zireael/smallcode

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

111 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SmallCode

简体中文 | English


npm

AI coding agent optimized for small LLMs (≤20B parameters)

SmallCode is a terminal-native coding agent designed from the ground up to extract useful work from local models (7B-20B) running on consumer hardware. While tools like OpenCode assume frontier models with 128k+ context and perfect tool calling, SmallCode compensates for the limitations of small models through intelligent architecture.

Why SmallCode?

OpenCode SmallCode
Target Frontier models (Claude, GPT-5) 7B-20B local models
Context Dumps everything Budget-managed, summarized
Tool calling Assumes reliable JSON Forgiving multi-format parser
Planning Single-shot TODO-file decomposed steps
Editing Full file write Search-and-replace patch
Privacy API calls to cloud Fully local, no network needed

Quick Start

# Install globally via npm
npm install -g smallcode

# Or run directly with npx
npx smallcode

# Start in your project directory
cd my-project
smallcode

Prebuilt Binaries (no Node.js needed)

Pre-compiled tarballs for Windows, macOS, and Linux are built on every release — they bundle Node.js plus all native addons so you never need node-gyp or C++ build tools.

Platform One‑line install
Linux / macOS bash <(curl -fsSL https://raw.githubusercontent.com/Doorman11991/smallcode/master/install.sh)
Windows iwr -Uri https://raw.githubusercontent.com/Doorman11991/smallcode/master/install.ps1 -UseBasicParsing | iex

The install script downloads the correct tarball for your platform, extracts it to ~/.smallcode, and adds it to your PATH. Run smallcode --help to verify.

SmallCode includes BoneScript and budget-aware-mcp as dependencies — everything installs in one go.

Requirements

  • Node.js 18+ (LTS recommended — 20.x or 22.x have prebuilt binaries for SQLite)
  • A local LLM server (LM Studio, Ollama, or any OpenAI-compatible endpoint)

Optional (for code graph + FTS5 memory search):

  • better-sqlite3 needs native compilation if prebuilt binaries aren't available for your Node version
  • Prebuilt binaries exist for Node LTS (20.x, 22.x) on Linux/macOS/Windows. no build tools needed
  • If you're on a non-LTS Node (23+, 25+), you'll need:
    • Linux: python3, make, gcc/g++ (sudo apt install build-essential python3 or pacman -S base-devel python)
    • macOS: Xcode Command Line Tools (xcode-select --install)
    • Windows: Visual Studio Build Tools with "Desktop development with C++" workload, or npm install -g windows-build-tools
  • If build fails, SmallCode still works — it falls back to JSON-based memory automatically

Configuration

Create a .env file in your project root:

# Required
SMALLCODE_MODEL=your-model-name
SMALLCODE_BASE_URL=http://localhost:1234/v1

# Optional: escalation (auto-fallback to cloud on hard fail)
# ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# DEEPSEEK_API_KEY=sk-...

See .env.example for all options. Also supports smallcode.toml for backwards compatibility.

Architecture

SmallCode is built with a modular architecture:

bin/
├── smallcode.js        Entry point, agent loop, TUI orchestration (1570 lines)
├── config.js           Config loading, endpoint detection, auth headers
├── executor.js         Tool execution (all 18 tools)
├── tools.js            Tool definitions + 2-stage routing
├── mcp_bridge.js       Built-in code graph MCP communication
├── model_client.js     LLM API calls, streaming, validation
├── governor.js         Tool scoring, verification, decompose
├── escalation.js       Cloud model fallback (Claude/OpenAI/DeepSeek)
├── commands.js         TUI slash commands
├── tui.js              Classic TUI renderer
└── bonescript_guide.js BoneScript syntax reference

src/
├── api/index.js        Programmatic API (require('smallcode'))
├── tui/fullscreen.js   Fullscreen alternate-buffer TUI
├── plugins/loader.js   Plugin system
├── plugins/skills.js   Skill system
├── tools/              Tool routing, MCP client, validators
├── governor/           Early-stop detection, verifier, tool scorer
├── model/              Multi-model profiles + routing
└── session/            Persistence, undo, sharing, references

Key Features

MarrowScript Cognition Layer

SmallCode's intelligence is declared in MarrowScript and compiled to a production runtime. One 50-line .marrow declaration generates 1400+ lines of TypeScript with caching, retry, validation, traces, and budget enforcement — all for free.

prompt classify_task_type(user_message: string) {
  model: TinyClassifier
  timeout: 3s
  cache: { key: hash(user_message), ttl: 10m }
  retry: { max_attempts: 2, backoff: fixed, interval: 100ms }
  constraints: [output in ["coding", "editing", "search", ...]]
}

The compiled cognition layer provides:

  • Prompt caching — 0ms on cache hit, content-hash keys with TTL
  • Structured traces — trace_id/span_id for every LLM call (enable with SMALLCODE_COGNITION_LOG=stderr)
  • Tier-based routing — trivial tasks → tiny model, complex tasks → medium model
  • Token budgets — per-cost-class enforcement, never overspend
  • Validation + repair — schema checks with auto-retry on malformed output

BoneScript Integration

For Node.js/TypeScript backends, SmallCode uses BoneScript — write ONE .bone file and compile it to a complete project (routes, auth, DB, events, migrations, SDK, admin panel, Docker, CI). Reduces 8-15 tool calls to 1-2, dramatically improving reliability with small models.

Model Escalation

When the local model hard fails after retry + decompose, SmallCode can optionally escalate to a stronger cloud model (Claude, OpenAI, DeepSeek). Fully opt-in — requires an API key. Session-limited to prevent runaway costs.

Escalation targets (cloud, used only on hard fail):

  • Claude Sonnet 4.5 / 4.6, Haiku 4.5
  • GPT-5.4 Mini / Nano
  • DeepSeek V4 / V4 Pro / V4 Flash

Context Budget Engine

Never exceeds your model's context window. Tool results capped at 4k chars, mid-turn eviction drops old results when context grows too large, and semantic compression summarizes history instead of dropping it.

2-Stage Tool Routing

Halves the schema context overhead. Model picks a category (read/write/search/run/plan) first, then gets only relevant tool schemas. Critical for models with 8-16k context.

Early-Stop Detection

Detects repetition loops, patch spirals (stuck on corrupted file → forces rewrite), and greeting regression (model lost context → re-injects task). Saves tokens and time.

Forgiving Tool Call Parser

Small models produce messy output. SmallCode parses tool calls from JSON, YAML, XML, Hermes format, or plain text. Auto-repairs common mistakes (wrong param names, type mismatches).

Patch-First Editing

Search-and-replace as the primary edit primitive. Small models can't reliably reproduce entire files — they truncate, hallucinate, or drift. patch is safer and more context-efficient.

TODO-Driven Planning

Complex tasks get decomposed into atomic steps. The model reads a TODO file each turn to know where it is. Each step is validated (lint/compile) before moving on.

Model Profiles

Per-model configuration: context length, tool format (native/hermes/json/xml/text), chat template, strengths/weaknesses. Auto-adapts prompting strategy.

Working Memory

Persistent scratchpad that survives across turns. Compensates for limited reasoning depth — the model can write notes to itself.

Commands

Command Description
/quit, /q Exit SmallCode
/clear Reset conversation
/stats Show session statistics
/tokens Detailed token usage report
/budget Context window budget + visual bar
/trace List/show/export execution traces
/eval Run prompt evaluation suites
/memory Show working memory
/plan Show current task plan
/model Show/switch model
/profile Show detected model profile + routing mode
/cognition Show MarrowScript cognition layer status
/mcp Show connected external MCP servers
/skill Manage reusable skills
/plugin Install/manage plugins
/sessions List/resume saved sessions
/help Show all commands

Observability

SmallCode tracks token usage and execution traces automatically:

  • Token Monitor — Every LLM call records prompt/completion tokens. View with /tokens.
  • Context Budget — Visual indicator of context window usage. View with /budget.
  • Execution Traces — Every agent turn is recorded to .smallcode/traces/. View with /trace list.
  • Trace-to-Test — Generate regression tests from traces: /trace test <id>.
  • Prompt Evaluations — Measure classifier accuracy and tool selection: /eval classify_accuracy.
# Run evaluations from CLI
smallcode --eval classify_accuracy
smallcode --eval tool_selection

Programmatic API

Use SmallCode as a library in your own tools, CI pipelines, or TypeScript frameworks:

const { SmallCode } = require('smallcode');

const agent = new SmallCode({
  model: 'gemma-4-e4b',
  baseUrl: 'http://localhost:1234/v1',
});

// Run a task
const result = await agent.run("create hello.py that prints hello world");
console.log(result.filesCreated);  // ['hello.py']
console.log(result.toolCalls.length);  // 1
console.log(result.success);  // true

// Subscribe to events
agent.on('tool_start', ({ name, args }) => console.log(`Using: ${name}`));
agent.on('tool_end', ({ name, ms }) => console.log(`Done: ${name} (${ms}ms)`));
agent.on('error', (err) => console.error(err));

Returns a structured RunResult with: response text, tool call records, files created/edited, token usage, duration, and success status.

Tools

Tool Description
bone_compile Compile .bone to full backend project
bone_check Validate .bone file (type errors, constraints)
list_projects List all indexed projects with stats
graph_search Code graph symbol search
explain_symbol Full symbol explanation (callers, callees)
read_file Read file contents
write_file Create/overwrite files
patch Search-and-replace edit
bash Run shell commands
search Regex search (ripgrep)
find_files Glob file search
memory_load Load relevant project memory
memory_remember Save knowledge to memory
web_search Search the web via DuckDuckGo (requires SMALLCODE_WEB_BROWSE=true)
web_fetch Fetch and extract text from a URL (requires SMALLCODE_WEB_BROWSE=true)

Web Browsing

SmallCode includes Playwright with stealth mode for undetected web browsing. Disabled by default — enable for medium/large models (20B+) that can synthesize web context effectively:

# In your .env
SMALLCODE_WEB_BROWSE=true

When enabled, the model can search the web and fetch documentation during tasks. Uses headless Chromium with anti-detection to avoid CAPTCHAs and bot blocks. Falls back to simple HTTP fetch if Playwright isn't available.

License

MIT

About

AI coding agent optimized for small LLMs. 87% benchmark with 4B-active model.

Resources

License

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • JavaScript 56.0%
  • MAXScript 29.7%
  • TypeScript 13.6%
  • Other 0.7%