Skip to content

multiple tiny agent coders instead of one big one...

License

Notifications You must be signed in to change notification settings

usr-wwelsh/zweek-code

Repository files navigation

Warning

Prototype Stage: This project is currently in a pre-release/prototype stage. The actual coding functionality has not been fully implemented yet as research into optimization is ongoing.

Zweek Code

AI Coding Assistant That Runs on YOUR Machine
No cloud. No telemetry. Just you and your code.

Zweek Code Demo


What Is This?

A terminal-based AI coding assistant powered by specialized local models. Unlike cloud assistants, everything runs offline on consumer hardware using optimized small models (135M-1.1B parameters).

Philosophy: Use deterministic tools when possible, AI when necessary.

Features

100% Offline – No internet, no tracking, no data leaves your machine
Fast – Router stays resident (~150MB RAM), responses in seconds
Private – Your code is yours
Lightweight – Runs on older hardware (4GB RAM minimum)
Smart Routing – SmolLM classifies intent (CODE/CHAT/TOOL) instantly

Architecture

Optimized 3-Model System:

  1. Router (SmolLM-135M) - Stays loaded, classifies intent with GBNF grammar
  2. Code Drafter (StarCoder-Tiny) - Generates code when needed
  3. Chat (Qwen3-0.6B-Q8_0) - Answers questions about your code

Key Optimizations:

  • Resident models stay in memory (~350MB idle)
  • GBNF grammars eliminate hallucination
  • Compiler check (cl.exe) validates code instantly (no AI)
  • Peak RAM: ~500MB during inference

Quick Start

Requirements

  • Windows 10/11, Linux, or macOS
  • 4GB RAM minimum (8GB recommended)
  • CMake 3.20+
  • C++17 compiler

Build

git clone https://github.com/usr-wwelsh/zweek-code.git
cd zweek-code

# Download models (place in models/ directory)
# - smollm-135m-router.gguf
# - Qwen3-0.6B-Q8_0.gguf  
# - starcoder-tiny.gguf

cmake -S . -B build -G Ninja
cmake --build build
.\build\zweek.exe  # Windows
./build/zweek      # Linux/macOS

See QUICKSTART.md for detailed instructions.

Usage

> /help
Show commands and tips

> what is this function doing?
Chat mode answers questions

> add error handling here
Code mode generates fixes

Keyboard:

  • m - Switch between Plan and Auto mode
  • y/n - Accept/reject changes
  • Ctrl+C - Exit

Models

Place these in models/:

Model Size Purpose RAM
smollm-135m-router.gguf ~150MB Intent classification Resident
starcoder-tiny.gguf ~200MB Code generation On-demand
Qwen3-0.6B-Q8_0.gguf ~700MB Q&A On-demand

Download from HuggingFace (GGUF Q8 quantized versions).

Performance

Target: <15 seconds for most operations
Idle RAM: ~350MB (Router + Code Drafter resident)
Peak RAM: ~500MB during chat inference

Status

v1.0.0-alpha - Phase 2 Complete

✅ TUI with FTXUI
✅ Router with GBNF
✅ Chat mode with Qwen3
✅ Compiler-based validation
✅ Command system
✅ Persistent Chat History

Commands

  • /help - Show available commands
  • /history [n] - Show last n messages
  • /sessions - List saved sessions
  • /load <index> - Load a previous session
  • /clear-history - Clear current session history
  • /cd <path> - Change working directory
  • /ls [path] - List files in directory (current if no path given)

Keyboard Shortcuts

  • m - Switch between Plan and Auto mode
  • t - Toggle visibility of "Thinking" sections
  • y/n - Accept/reject changes
  • Ctrl+C - Exit

Tech Stack

License

MIT - See LICENSE


No cloud. No compromises. Your code, your machine.

About

multiple tiny agent coders instead of one big one...

Resources

License

Stars

Watchers

Forks

Releases

No releases published