ramigb/smolBro

smolBro

smolBro is a local-first chat assistant and background agent with a small web UI, local llama.cpp models, SQLite-backed memory, scheduled jobs, MCP tool support, and opt-in CLI / remote escalation.

This is a very early-stage, vibecoded weekend project, not a polished product. Use it at your own risk. I guarantee nothing.

What It Can Do

  • Run a local-first assistant through llama.cpp
  • Stream replies and reasoning from llama.cpp when the provider exposes them
  • Render assistant replies as markdown in the web UI
  • Store chat logs, durable memories, and scheduled jobs in SQLite
  • Let the model search, create, and update memories through constrained tools
  • Load local SKILL.md skills and use them through tool-assisted prompt workflows
  • Expose external Model Context Protocol (MCP) tools from configured stdio servers
  • Run a post-reply background memory reflection pass for likely durable user facts
  • Use MCP-backed tools for workspace inspection, browsing, and other external integrations
  • Ask for user approval before escalating to configured CLI helpers
  • Use an optional OpenAI-compatible remote API provider
  • Poll Telegram and route messages through the same orchestrator

What It Cannot Do Reliably

  • Guarantee correctness
  • Guarantee stable memory behavior or perfect memory selection
  • Guarantee safe autonomous operation without supervision
  • Guarantee compatibility across machines, GPUs, or llama.cpp revisions
  • Guarantee production-grade security, observability, or failure handling
  • Guarantee that a streamed turn can also perform tool-based memory writes in the same turn

Current Reality

  • The backend is Bun + TypeScript
  • The frontend is React + Vite
  • State is stored in data/smolbro.sqlite
  • Local models default to Qwen3 GGUF builds from Hugging Face
  • The app manages local llama-server child processes itself
  • Model-driven memory writes happen in two ways (manual writes through POST /api/memories are separate):
    • direct memory tools during tool-using turns
    • a separate post-reply memory reflection pass for likely durable facts

Quick Start

Prerequisites

  • bun
  • npm
  • git
  • cmake
  • curl

Install dependencies

npm install
npm --prefix frontend install

Run everything

npm run dev

This will:

  1. clone and build repo-local llama.cpp if needed
  2. download the default local GGUF models if missing
  3. start the Bun server on http://127.0.0.1:3000
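
Because the first run may build llama.cpp and download models, the server can take a while to come up. A minimal sketch of waiting for it, assuming GET /health returns a 2xx status once the server is ready. The `attempts` and `delayMs` parameters are hypothetical knobs for this example, not smolBro settings.

```typescript
// A probe reports whether the server looks healthy right now.
type Probe = () => Promise<boolean>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Default probe: a plain fetch against the documented health endpoint.
const probeHealth = (baseUrl: string): Probe => async () => {
  try {
    const res = await fetch(`${baseUrl}/health`);
    return res.ok;
  } catch {
    return false; // e.g. connection refused while the server is still starting
  }
};

// Polls the probe until it succeeds or the attempts run out.
async function waitForHealth(probe: Probe, attempts = 60, delayMs = 1000): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await probe()) return true;
    if (i < attempts - 1) await sleep(delayMs);
  }
  return false;
}

// Usage:
// waitForHealth(probeHealth("http://127.0.0.1:3000")).then((up) => console.log(up ? "up" : "timed out"));
```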

Useful commands

npm run llama:build
npm run models:pull
npm run dev:server
npm run ui:dev
npm run ui:build

Built-In Local Models

  • qwen3:1.7b -> Qwen/Qwen3-1.7B-GGUF / Qwen3-1.7B-Q8_0.gguf
  • qwen3:0.6b -> Qwen/Qwen3-0.6B-GGUF / Qwen3-0.6B-Q8_0.gguf
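
For illustration, the alias mapping above can be written as a lookup table. This is a sketch, not the project's code: `BUILT_IN_MODELS` and `modelDownloadUrl` are hypothetical names, and resolving against the `main` revision on Hugging Face is an assumption.

```typescript
// Built-in aliases copied from the list above.
const BUILT_IN_MODELS: Record<string, { repo: string; file: string }> = {
  "qwen3:1.7b": { repo: "Qwen/Qwen3-1.7B-GGUF", file: "Qwen3-1.7B-Q8_0.gguf" },
  "qwen3:0.6b": { repo: "Qwen/Qwen3-0.6B-GGUF", file: "Qwen3-0.6B-Q8_0.gguf" },
};

// Hugging Face serves raw files at /<repo>/resolve/<revision>/<file>.
// Using the "main" revision is an assumption of this sketch.
function modelDownloadUrl(alias: string): string | undefined {
  const model = BUILT_IN_MODELS[alias];
  if (!model) return undefined; // unknown alias
  return `https://huggingface.co/${model.repo}/resolve/main/${model.file}`;
}
```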

Default local runtime port:

  • primary: 12434

Web UI

The built-in UI can:

  • send streamed chat requests
  • show live request stats
  • show reasoning when the provider emits it
  • render assistant markdown
  • inspect provider state
  • inspect chat logs, memories, jobs, and job runs
  • create manual memories

Memory Behavior

smolBro has durable memory, but it is intentionally constrained.

Current memory paths:

  • Manual API writes through POST /api/memories
  • Model tool calls:
    • search_memories
    • create_memory
    • update_memory
  • Post-reply background memory reflection for likely durable facts

The intended memory scope is narrow:

  • stable user preferences
  • identity details
  • recurring workflow instructions
  • lasting project facts

Things it should not store:

  • one-off requests
  • transient context
  • secrets or credentials
  • speculative or weakly supported facts

This still needs supervision. Do not assume memory writes are always correct.

Tooling

Memory tools

  • search_memories
  • create_memory
  • update_memory

There is no model-facing delete tool right now.

Skills tools

  • list_skills
  • read_skill

Skills are loaded from local directories that contain SKILL.md files.

MCP tools

  • list_mcp_tools
  • one generated tool per configured MCP server tool, named like mcp_<server>_<tool>

MCP support currently uses stdio servers configured through environment variables. MCP is the path for browser-like, filesystem-like, and other external tool integrations.
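
A small sketch of the mcp_<server>_<tool> naming pattern. The function name is hypothetical, and replacing characters outside [A-Za-z0-9_] with an underscore is an assumption; the real sanitization rules may differ.

```typescript
// Builds the model-facing name for an MCP tool, following mcp_<server>_<tool>.
// Assumption: unsupported characters are replaced with "_".
function mcpToolName(serverId: string, toolName: string): string {
  const clean = (s: string) => s.replace(/[^A-Za-z0-9_]/g, "_");
  return `mcp_${clean(serverId)}_${clean(toolName)}`;
}

// Example: mcpToolName("filesystem", "read_file") gives "mcp_filesystem_read_file".
```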

Escalation

smolBro is local-first, but it can escalate when needed.

Supported escalation paths:

  • configured CLI presets
  • OpenAI-compatible remote API

CLI escalation is approval-based:

  1. smolBro decides a bigger model may help
  2. it asks the user for permission
  3. if approved, it offers the configured CLI presets
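
The three steps above can be sketched as a small state machine. All state and event names here are hypothetical and chosen for this example; the actual orchestrator may model the flow differently.

```typescript
type EscalationState = "local" | "awaiting_approval" | "choosing_preset" | "escalated";

type EscalationEvent =
  | { type: "needs_bigger_model" }
  | { type: "user_decision"; approved: boolean }
  | { type: "preset_chosen"; presetId: string };

// Pure transition function: nothing escalates without an explicit approval.
function nextState(state: EscalationState, event: EscalationEvent): EscalationState {
  switch (state) {
    case "local":
      return event.type === "needs_bigger_model" ? "awaiting_approval" : state;
    case "awaiting_approval":
      if (event.type !== "user_decision") return state;
      return event.approved ? "choosing_preset" : "local";
    case "choosing_preset":
      return event.type === "preset_chosen" ? "escalated" : state;
    case "escalated":
      return state;
  }
}
```

Keeping the transition pure makes the approval gate easy to test: a "preset_chosen" event is ignored unless the user has already approved.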

Slash Commands

Currently supported:

  • /model list
  • /model <name>
  • /model install <name>

Examples:

/model list
/model qwen3:0.6b
/model install qwen3:1.7b
/model install Qwen/Qwen3-1.7B-GGUF:Qwen3-1.7B-Q8_0.gguf
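
A minimal parser for the three command shapes above, as a sketch. The type and function names are hypothetical, and treating `/model <name>` as a "select" action is an assumption, since its exact behavior is not spelled out here.

```typescript
type ModelCommand =
  | { kind: "list" }
  | { kind: "select"; name: string }
  | { kind: "install"; name: string };

// Returns null for anything that is not a well-formed /model command.
function parseModelCommand(input: string): ModelCommand | null {
  const parts = input.trim().split(/\s+/);
  const [head, second, third] = parts;
  if (head !== "/model" || second === undefined) return null;
  if (second === "list") return parts.length === 2 ? { kind: "list" } : null;
  if (second === "install") {
    return parts.length === 3 && third !== undefined ? { kind: "install", name: third } : null;
  }
  return parts.length === 2 ? { kind: "select", name: second } : null;
}
```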

API

Health and status

  • GET /health
  • GET /api/strategy
  • GET /api/providers
  • GET /api/mcp
  • GET /api/skills
  • GET /api/operations/:id

Chat

  • POST /api/chat
  • POST /api/chat/stream
  • GET /api/chat-logs

Memories

  • GET /api/memories
  • POST /api/memories

Jobs

  • GET /api/jobs
  • POST /api/jobs
  • GET /api/job-runs
  • POST /api/jobs/:id/run
  • POST /api/jobs/:id/start
  • POST /api/jobs/:id/stop
  • DELETE /api/jobs/:id

Example chat request

curl -s \
  -H 'content-type: application/json' \
  -d '{"message":"Remember that I prefer concise answers."}' \
  http://127.0.0.1:3000/api/chat
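
The same request from TypeScript, as a sketch. `buildChatRequest` is a hypothetical helper; only the URL, header, and {"message": ...} body come from the curl example above. The response shape is not documented here, so the usage note prints it as raw text.

```typescript
interface HttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the POST /api/chat request without sending it.
function buildChatRequest(message: string, baseUrl = "http://127.0.0.1:3000"): HttpRequest {
  return {
    url: `${baseUrl}/api/chat`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ message }),
    },
  };
}

// Usage:
// const { url, init } = buildChatRequest("Remember that I prefer concise answers.");
// const res = await fetch(url, init);
// console.log(await res.text());
```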

Environment

Core

  • PORT=3000
  • SMOLBRO_DATA_DIR=./data
  • SMOLBRO_DEBUG_MODE=false
  • SMOLBRO_ENABLE_MOCK=true
  • SMOLBRO_PERSONALITY=...

Local models and runtime

  • SMOLBRO_SMALL_MODEL=qwen3:1.7b
  • SMOLBRO_LOCAL_RUNTIME_HOST=127.0.0.1
  • SMOLBRO_LOCAL_RUNTIME_PRIMARY_PORT=12434
  • SMOLBRO_LOCAL_RUNTIME_TIMEOUT_MS=120000
  • SMOLBRO_LOCAL_RUNTIME_STARTUP_TIMEOUT_MS=120000
  • SMOLBRO_LOCAL_RUNTIME_CONTEXT_SIZE=8192
  • SMOLBRO_LOCAL_RUNTIME_GPU_LAYERS=0
  • SMOLBRO_LOCAL_RUNTIME_THREADS=0
  • SMOLBRO_LLAMA_CPP_DIR=./.tools/llama.cpp
  • SMOLBRO_LOCAL_MODELS_DIR=./.models
  • SMOLBRO_HUGGING_FACE_TOKEN=
  • SMOLBRO_LLAMA_CPP_REPO=https://github.com/ggml-org/llama.cpp.git
  • SMOLBRO_LLAMA_CPP_REF=
  • SMOLBRO_LLAMA_CPP_BUILD_JOBS=
  • SMOLBRO_LLAMA_CPP_CMAKE_ARGS=
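
A rough sketch of how the runtime variables could translate into llama-server arguments. The flags (--model, --host, --port, --ctx-size, --n-gpu-layers, --threads) are real llama-server options, but the mapping itself is an assumption. In particular, treating SMOLBRO_LOCAL_RUNTIME_THREADS=0 as "let llama.cpp decide" is a guess, and `llamaServerArgs` and `modelPath` are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Maps documented env vars to llama-server flags, using the documented defaults.
function llamaServerArgs(modelPath: string, env: Env): string[] {
  const args = [
    "--model", modelPath,
    "--host", env.SMOLBRO_LOCAL_RUNTIME_HOST ?? "127.0.0.1",
    "--port", env.SMOLBRO_LOCAL_RUNTIME_PRIMARY_PORT ?? "12434",
    "--ctx-size", env.SMOLBRO_LOCAL_RUNTIME_CONTEXT_SIZE ?? "8192",
    "--n-gpu-layers", env.SMOLBRO_LOCAL_RUNTIME_GPU_LAYERS ?? "0",
  ];
  // Assumption: 0 means "do not pass --threads" so llama.cpp picks its own default.
  const threads = Number(env.SMOLBRO_LOCAL_RUNTIME_THREADS ?? "0");
  if (threads > 0) args.push("--threads", String(threads));
  return args;
}

// Usage: pass your real environment, e.g. llamaServerArgs("./.models/model.gguf", process.env)
```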

Remote API

  • SMOLBRO_REMOTE_API_BASE_URL=
  • SMOLBRO_REMOTE_API_MODEL=
  • SMOLBRO_REMOTE_API_KEY=
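
A sketch of how these three variables could become an OpenAI-compatible chat completion request. It assumes the base URL already includes the version prefix (for example a URL ending in /v1), which is a common convention but not confirmed here. `buildRemoteChatRequest` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Returns null when the remote provider is not configured.
function buildRemoteChatRequest(env: Record<string, string | undefined>, messages: ChatMessage[]) {
  const baseUrl = env.SMOLBRO_REMOTE_API_BASE_URL;
  const model = env.SMOLBRO_REMOTE_API_MODEL;
  if (!baseUrl || !model) return null;

  const headers: Record<string, string> = { "content-type": "application/json" };
  if (env.SMOLBRO_REMOTE_API_KEY) {
    headers.authorization = `Bearer ${env.SMOLBRO_REMOTE_API_KEY}`;
  }

  return {
    // Trailing slashes are trimmed so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}
```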

CLI escalation

  • SMOLBRO_CLI_PRESETS=[]
  • SMOLBRO_CLI_MODEL_COMMAND=

MCP

  • SMOLBRO_MCP_SERVERS=[]
  • SMOLBRO_MCP_REQUEST_TIMEOUT_MS=30000

Example:

export SMOLBRO_MCP_SERVERS='[
  {
    "id": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
    "cwd": ".",
    "description": "Example filesystem MCP server"
  }
]'
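
Since the whole server list is one JSON string, a typo is easy to miss. A minimal validation sketch, assuming the fields shown in the example above (id, command, args, cwd, description). Which fields are required is an assumption, and `parseMcpServers` is a hypothetical helper.

```typescript
interface McpServerConfig {
  id: string;
  command: string;
  args: string[];
  cwd?: string;
  description?: string;
}

// Parses the raw env value; a missing or empty value means "no servers".
function parseMcpServers(raw: string | undefined): McpServerConfig[] {
  if (!raw || raw.trim() === "") return [];
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("SMOLBRO_MCP_SERVERS must be a JSON array");

  return parsed.map((entry, index) => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`MCP server #${index} must be an object`);
    }
    const { id, command, args, cwd, description } = entry as Record<string, unknown>;
    // Assumption: "id" and "command" are the only required fields.
    if (typeof id !== "string" || typeof command !== "string") {
      throw new Error(`MCP server #${index} needs string "id" and "command" fields`);
    }
    return {
      id,
      command,
      args: Array.isArray(args) ? args.map(String) : [],
      ...(typeof cwd === "string" ? { cwd } : {}),
      ...(typeof description === "string" ? { description } : {}),
    };
  });
}
```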

Skills

  • SMOLBRO_ENABLE_SKILLS=true
  • SMOLBRO_SKILLS_DIRS=./skills
  • SMOLBRO_SKILLS_DIR=./skills
  • SMOLBRO_SKILLS_MAX_CATALOG_CHARS=2000

smolBro scans each configured skills directory recursively for SKILL.md files.
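
A sketch of what such a recursive scan can look like. The directory listing is injected so the walk can be tried against an in-memory tree. `findSkillFiles`, `ListDir`, and the exact-case match on SKILL.md are assumptions of this example, not the project's implementation.

```typescript
import { readdirSync } from "node:fs";

interface DirEntry {
  name: string;
  isDirectory: boolean;
}
type ListDir = (dir: string) => DirEntry[];

// Default listing backed by the real filesystem.
const listDirFromDisk: ListDir = (dir) =>
  readdirSync(dir, { withFileTypes: true }).map((e) => ({
    name: e.name,
    isDirectory: e.isDirectory(),
  }));

// Recursively collects paths of files named exactly SKILL.md under `root`.
function findSkillFiles(root: string, listDir: ListDir = listDirFromDisk): string[] {
  const found: string[] = [];
  for (const entry of listDir(root)) {
    const full = `${root}/${entry.name}`;
    if (entry.isDirectory) found.push(...findSkillFiles(full, listDir));
    else if (entry.name === "SKILL.md") found.push(full);
  }
  return found.sort();
}

// Usage: findSkillFiles("./skills")
```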

Telegram

  • SMOLBRO_TELEGRAM_BOT_TOKEN=
  • SMOLBRO_TELEGRAM_API_BASE_URL=https://api.telegram.org
  • SMOLBRO_TELEGRAM_POLL_INTERVAL_MS=3000
  • SMOLBRO_TELEGRAM_ALLOWED_CHAT_IDS=

Scheduler and context

  • SMOLBRO_SCHEDULER_POLL_MS=5000
  • SMOLBRO_CONTEXT_MAX_RECENT_TURNS=4
  • SMOLBRO_CONTEXT_MAX_MEMORIES=5
  • SMOLBRO_CONTEXT_MAX_PINNED_MEMORIES=2
  • SMOLBRO_CONTEXT_CHAR_BUDGET=2200
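
To make the context limits concrete, here is one way they could interact. This is purely illustrative: the priority order (pinned memories, then other memories, then recent turns), counting pinned memories toward the memory cap, and skipping items that overflow the budget are all assumptions, not the project's actual selection logic.

```typescript
interface Memory {
  text: string;
  pinned: boolean;
}

interface ContextLimits {
  maxRecentTurns: number;
  maxMemories: number;
  maxPinnedMemories: number;
  charBudget: number;
}

// Defaults copied from the environment variables above.
const DEFAULT_LIMITS: ContextLimits = {
  maxRecentTurns: 4,
  maxMemories: 5,
  maxPinnedMemories: 2,
  charBudget: 2200,
};

function buildContext(turns: string[], memories: Memory[], limits: ContextLimits = DEFAULT_LIMITS): string[] {
  const pinned = memories.filter((m) => m.pinned).slice(0, limits.maxPinnedMemories);
  // Assumption: pinned memories count toward the overall memory cap.
  const others = memories
    .filter((m) => !m.pinned)
    .slice(0, Math.max(0, limits.maxMemories - pinned.length));
  const recent = limits.maxRecentTurns > 0 ? turns.slice(-limits.maxRecentTurns) : [];

  const candidates = [...pinned.map((m) => m.text), ...others.map((m) => m.text), ...recent];
  const context: string[] = [];
  let used = 0;
  for (const item of candidates) {
    if (used + item.length > limits.charBudget) continue; // skip items that would overflow
    context.push(item);
    used += item.length;
  }
  return context;
}
```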

Telegram

Telegram support is optional and uses long polling from the main server process. It does not require a webhook.

At minimum:

export SMOLBRO_TELEGRAM_BOT_TOKEN=123456:telegram-bot-token

Optional hardening:

export SMOLBRO_TELEGRAM_ALLOWED_CHAT_IDS=123456789,-100987654321
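
A sketch of how a comma-separated allowlist like this can be checked. Both function names are hypothetical. Treating an empty allowlist as "no restriction" is an assumption based on the variable being optional, so verify it before relying on it.

```typescript
// Parses a comma-separated list of chat IDs; a missing or empty value yields an empty set.
// IDs are kept as strings so large or negative group IDs compare exactly.
function parseAllowedChatIds(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  );
}

// Assumption: an empty allowlist means every chat is accepted.
function isChatAllowed(chatId: number | string, allowed: Set<string>): boolean {
  return allowed.size === 0 || allowed.has(String(chatId));
}
```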

Known Risks

  • This project is unfinished
  • Tool use can still be wrong
  • Memory reflection can still save the wrong thing
  • Local model routing is heuristic-based
  • Browser access is constrained but not battle-tested
  • API and storage formats may change without warning
  • There is no migration or upgrade story I would call stable yet

If You Use This

  • expect breakage
  • expect rough edges
  • inspect memory state yourself
  • keep backups if you care about the data
  • do not trust it with anything critical

If you still want to use it, that is on you.
