smolBro

smolBro is a local-first chat assistant and background agent with a small web UI, local llama.cpp models, SQLite-backed memory, scheduled jobs, MCP tool support, and opt-in CLI / remote escalation.

This is a very, very early-stage project. Use it at your own risk. I guarantee nothing. This is a vibecoded weekend project, not a polished product.

What It Can Do

Run a local-first assistant through llama.cpp
Stream replies and reasoning from llama.cpp when the provider exposes them
Render assistant replies as markdown in the web UI
Store chat logs, durable memories, and scheduled jobs in SQLite
Let the model search, create, and update memories through constrained tools
Load local SKILL.md skills and use them through tool-assisted prompt workflows
Expose external Model Context Protocol (MCP) tools from configured stdio servers
Run a post-reply background memory reflection pass for likely durable user facts
Use MCP-backed tools for workspace inspection, browsing, and other external integrations
Ask for user approval before escalating to configured CLI helpers
Use an optional OpenAI-compatible remote API provider
Poll Telegram and route messages through the same orchestrator

What It Cannot Do Reliably

Guarantee correctness
Guarantee stable memory behavior or perfect memory selection
Guarantee safe autonomous operation without supervision
Guarantee compatibility across machines, GPUs, or llama.cpp revisions
Guarantee production-grade security, observability, or failure handling
Guarantee that a streamed turn can also perform tool-based memory writes within that same turn

Current Reality

The backend is Bun + TypeScript
The frontend is React + Vite
State is stored in data/smolbro.sqlite
Local models default to Qwen3 GGUF builds from Hugging Face
The app manages local llama-server child processes itself
Memory writes now happen in two ways:
- direct memory tools during tool-using turns
- a separate post-reply memory reflection pass for likely durable facts

Quick Start

Prerequisites

bun
npm
git
cmake
curl

Install dependencies

npm install
npm --prefix frontend install

Run everything

npm run dev

This will:

clone and build repo-local llama.cpp if needed
download the default local GGUF models if missing
start the Bun server on http://127.0.0.1:3000

Useful commands

npm run llama:build
npm run models:pull
npm run dev:server
npm run ui:dev
npm run ui:build

Built-In Local Models

qwen3:1.7b -> Qwen/Qwen3-1.7B-GGUF / Qwen3-1.7B-Q8_0.gguf
qwen3:0.6b -> Qwen/Qwen3-0.6B-GGUF / Qwen3-0.6B-Q8_0.gguf
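
Each alias above pairs a Hugging Face repository with a GGUF file name. A minimal sketch of how such a table could be resolved to a download URL, assuming Hugging Face's public `resolve/main` URL layout. `BUILT_IN_MODELS` and `resolveModelUrl` are hypothetical names for illustration, not smolBro's actual code.

```typescript
// Hypothetical alias table mirroring the two built-in entries listed above.
interface ModelSource {
  repo: string; // Hugging Face repository id
  file: string; // GGUF file name inside that repository
}

const BUILT_IN_MODELS: Record<string, ModelSource> = {
  "qwen3:1.7b": { repo: "Qwen/Qwen3-1.7B-GGUF", file: "Qwen3-1.7B-Q8_0.gguf" },
  "qwen3:0.6b": { repo: "Qwen/Qwen3-0.6B-GGUF", file: "Qwen3-0.6B-Q8_0.gguf" },
};

// Build a download URL for a known alias, or return undefined for unknown ones.
// Assumption: files are fetched from the "main" revision of the repository.
function resolveModelUrl(alias: string): string | undefined {
  const source = BUILT_IN_MODELS[alias];
  if (!source) return undefined;
  return `https://huggingface.co/${source.repo}/resolve/main/${source.file}`;
}
```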

Default local runtime port:

primary: 12434

Web UI

The built-in UI can:

send streamed chat requests
show live request stats
show reasoning when the provider emits it
render assistant markdown
inspect provider state
inspect chat logs, memories, jobs, and job runs
create manual memories

Memory Behavior

smolBro has durable memory, but it is intentionally constrained.

Current memory paths:

Manual API writes through POST /api/memories
Model tool calls:
- search_memories
- create_memory
- update_memory
Post-reply background memory reflection for likely durable facts

The intended memory scope is narrow:

stable user preferences
identity details
recurring workflow instructions
lasting project facts

Things it should not store:

one-off requests
transient context
secrets or credentials
speculative or weakly supported facts

This still needs supervision. Do not assume memory writes are always correct.

Tooling

Memory tools

search_memories
create_memory
update_memory

There is no model-facing delete tool right now.

Skills tools

list_skills
read_skill

Skills are loaded from local directories that contain SKILL.md files.

MCP tools

list_mcp_tools
one generated tool per configured MCP server tool, named like mcp_<server>_<tool>

MCP support currently uses stdio servers configured through environment variables. MCP is the path for browser-like, filesystem-like, and other external tool integrations.
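
To make the `mcp_<server>_<tool>` naming pattern concrete, here is a small sketch of how a generated tool name could be derived. The helper name `mcpToolName` and the sanitization rule are assumptions for illustration; smolBro's real rule may differ.

```typescript
// Hypothetical helper: derive a model-facing tool name from an MCP server id
// and one of its tool names, following the mcp_<server>_<tool> pattern.
// Assumption: characters outside [A-Za-z0-9_] are replaced with "_" so the
// result is safe to use as a function-call identifier.
function mcpToolName(serverId: string, toolName: string): string {
  const clean = (part: string): string => part.replace(/[^A-Za-z0-9_]/g, "_");
  return `mcp_${clean(serverId)}_${clean(toolName)}`;
}
```

With a server id of `filesystem` and a tool called `read_file`, this yields `mcp_filesystem_read_file`.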

Escalation

smolBro is local-first, but it can escalate when needed.

Supported escalation paths:

configured CLI presets
OpenAI-compatible remote API

CLI escalation is approval-based:

smolBro decides a bigger model may help
it asks the user for permission
if approved, it offers the configured CLI presets
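
The three steps above can be read as a small state machine. The sketch below is only an illustration of that flow; the state and event names are hypothetical and do not come from smolBro's source.

```typescript
// Hypothetical states and events for the approval-based escalation flow.
type EscalationState = "idle" | "awaiting_approval" | "choosing_preset" | "declined";
type EscalationEvent = "model_requests_escalation" | "user_approves" | "user_declines";

// Return the next state for a given event. Events that do not apply to the
// current state leave it unchanged, so nothing escalates without approval.
function nextEscalationState(state: EscalationState, event: EscalationEvent): EscalationState {
  if (state === "idle" && event === "model_requests_escalation") {
    return "awaiting_approval"; // smolBro asks the user for permission
  }
  if (state === "awaiting_approval" && event === "user_approves") {
    return "choosing_preset"; // the configured CLI presets are offered
  }
  if (state === "awaiting_approval" && event === "user_declines") {
    return "declined"; // stay local
  }
  return state;
}
```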

Slash Commands

Currently supported:

/model list
/model <name>
/model install <name>

Examples:

/model list
/model qwen3:0.6b
/model install qwen3:1.7b
/model install Qwen/Qwen3-1.7B-GGUF:Qwen3-1.7B-Q8_0.gguf
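
For readers who want to see the command grammar spelled out, here is a minimal parser sketch for the three `/model` forms listed above. `ModelCommand` and `parseModelCommand` are hypothetical names, and the sketch does not try to validate model names.

```typescript
// Hypothetical result type for the three supported /model forms.
type ModelCommand =
  | { kind: "list" }
  | { kind: "switch"; name: string }
  | { kind: "install"; name: string };

// Parse a chat line into a ModelCommand, or return null if it is not a
// well-formed /model command.
function parseModelCommand(input: string): ModelCommand | null {
  const [head, sub, arg, ...rest] = input.trim().split(/\s+/);
  if (head !== "/model" || sub === undefined || rest.length > 0) return null;
  if (sub === "list") return arg === undefined ? { kind: "list" } : null;
  if (sub === "install") return arg !== undefined ? { kind: "install", name: arg } : null;
  // Anything else in second position is treated as a model name to switch to.
  return arg === undefined ? { kind: "switch", name: sub } : null;
}
```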

API

Health and status

GET /health
GET /api/strategy
GET /api/providers
GET /api/mcp
GET /api/skills
GET /api/operations/:id

Chat

POST /api/chat
POST /api/chat/stream
GET /api/chat-logs

Memories

GET /api/memories
POST /api/memories

Jobs

GET /api/jobs
POST /api/jobs
GET /api/job-runs
POST /api/jobs/:id/run
POST /api/jobs/:id/start
POST /api/jobs/:id/stop
DELETE /api/jobs/:id

Example chat request

curl -s \
  -H 'content-type: application/json' \
  -d '{"message":"Remember that I prefer concise answers."}' \
  http://127.0.0.1:3000/api/chat
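
The same request can be built from TypeScript. The sketch below only constructs the request; the response shape is not documented here, so it is deliberately left unparsed. `buildChatRequest` is a hypothetical helper name.

```typescript
// Description of an HTTP request equivalent to the curl example above.
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST /api/chat request. The default base URL matches the
// default server address from Quick Start.
function buildChatRequest(message: string, baseUrl: string = "http://127.0.0.1:3000"): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/chat`, // tolerate a trailing slash
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message }),
  };
}

// Usage with the standard fetch API (available in Bun and modern Node):
//   const { url, ...init } = buildChatRequest("Remember that I prefer concise answers.");
//   const response = await fetch(url, init);
```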

Environment

Core

PORT=3000
SMOLBRO_DATA_DIR=./data
SMOLBRO_DEBUG_MODE=false
SMOLBRO_ENABLE_MOCK=true
SMOLBRO_PERSONALITY=...

Local models and runtime

SMOLBRO_SMALL_MODEL=qwen3:1.7b
SMOLBRO_LOCAL_RUNTIME_HOST=127.0.0.1
SMOLBRO_LOCAL_RUNTIME_PRIMARY_PORT=12434
SMOLBRO_LOCAL_RUNTIME_TIMEOUT_MS=120000
SMOLBRO_LOCAL_RUNTIME_STARTUP_TIMEOUT_MS=120000
SMOLBRO_LOCAL_RUNTIME_CONTEXT_SIZE=8192
SMOLBRO_LOCAL_RUNTIME_GPU_LAYERS=0
SMOLBRO_LOCAL_RUNTIME_THREADS=0
SMOLBRO_LLAMA_CPP_DIR=./.tools/llama.cpp
SMOLBRO_LOCAL_MODELS_DIR=./.models
SMOLBRO_HUGGING_FACE_TOKEN=
SMOLBRO_LLAMA_CPP_REPO=https://github.com/ggml-org/llama.cpp.git
SMOLBRO_LLAMA_CPP_REF=
SMOLBRO_LLAMA_CPP_BUILD_JOBS=
SMOLBRO_LLAMA_CPP_CMAKE_ARGS=
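
As a rough illustration of how these settings could translate into a `llama-server` command line, the sketch below maps a few of them onto llama.cpp's `--model`, `--host`, `--port`, `--ctx-size`, `--n-gpu-layers`, and `--threads` flags. The mapping itself, the helper name `llamaServerArgs`, and the choice to omit `--threads` when the value is 0 are assumptions, not a description of what smolBro actually does.

```typescript
type Env = Record<string, string | undefined>;

// Build an argument list for llama-server from the environment.
// Unset or empty variables fall back to the defaults listed above.
function llamaServerArgs(modelPath: string, env: Env): string[] {
  const host = env.SMOLBRO_LOCAL_RUNTIME_HOST || "127.0.0.1";
  const port = env.SMOLBRO_LOCAL_RUNTIME_PRIMARY_PORT || "12434";
  const contextSize = env.SMOLBRO_LOCAL_RUNTIME_CONTEXT_SIZE || "8192";
  const gpuLayers = env.SMOLBRO_LOCAL_RUNTIME_GPU_LAYERS || "0";
  const threads = Number(env.SMOLBRO_LOCAL_RUNTIME_THREADS || "0");

  const args = [
    "--model", modelPath,
    "--host", host,
    "--port", port,
    "--ctx-size", contextSize,
    "--n-gpu-layers", gpuLayers,
  ];
  // Assumption: 0 means "let llama.cpp pick", so the flag is left out.
  if (threads > 0) args.push("--threads", String(threads));
  return args;
}
```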

Remote API

SMOLBRO_REMOTE_API_BASE_URL=
SMOLBRO_REMOTE_API_MODEL=
SMOLBRO_REMOTE_API_KEY=

CLI escalation

SMOLBRO_CLI_PRESETS=[]
SMOLBRO_CLI_MODEL_COMMAND=

MCP

SMOLBRO_MCP_SERVERS=[]
SMOLBRO_MCP_REQUEST_TIMEOUT_MS=30000

Example:

export SMOLBRO_MCP_SERVERS='[
  {
    "id": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
    "cwd": ".",
    "description": "Example filesystem MCP server"
  }
]'
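
Because `SMOLBRO_MCP_SERVERS` is a JSON array packed into one environment variable, a malformed value is easy to produce. The following sketch shows one way to parse and validate it. The `McpServerConfig` shape is inferred from the example above, and `parseMcpServers` is a hypothetical helper, so treat both as assumptions.

```typescript
// Shape inferred from the example entry; optional fields are assumptions.
interface McpServerConfig {
  id: string;
  command: string;
  args: string[];
  cwd?: string;
  description?: string;
}

// Parse the raw environment value into validated server entries.
// An unset or empty value yields an empty list.
function parseMcpServers(raw: string | undefined): McpServerConfig[] {
  if (!raw || raw.trim() === "") return [];
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("SMOLBRO_MCP_SERVERS must be a JSON array");
  }
  return parsed.map((entry: unknown, index: number): McpServerConfig => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`MCP server entry ${index} must be an object`);
    }
    const fields = entry as Record<string, unknown>;
    const id = fields.id;
    const command = fields.command;
    if (typeof id !== "string" || typeof command !== "string") {
      throw new Error(`MCP server entry ${index} needs string "id" and "command"`);
    }
    const rawArgs = fields.args;
    return {
      id,
      command,
      args: Array.isArray(rawArgs) ? rawArgs.map(String) : [],
      cwd: typeof fields.cwd === "string" ? fields.cwd : undefined,
      description: typeof fields.description === "string" ? fields.description : undefined,
    };
  });
}
```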

Skills

SMOLBRO_ENABLE_SKILLS=true
SMOLBRO_SKILLS_DIRS=./skills
SMOLBRO_SKILLS_DIR=./skills
SMOLBRO_SKILLS_MAX_CATALOG_CHARS=2000

smolBro scans each configured skills directory recursively for SKILL.md files.

Telegram

SMOLBRO_TELEGRAM_BOT_TOKEN=
SMOLBRO_TELEGRAM_API_BASE_URL=https://api.telegram.org
SMOLBRO_TELEGRAM_POLL_INTERVAL_MS=3000
SMOLBRO_TELEGRAM_ALLOWED_CHAT_IDS=

Scheduler and context

SMOLBRO_SCHEDULER_POLL_MS=5000
SMOLBRO_CONTEXT_MAX_RECENT_TURNS=4
SMOLBRO_CONTEXT_MAX_MEMORIES=5
SMOLBRO_CONTEXT_MAX_PINNED_MEMORIES=2
SMOLBRO_CONTEXT_CHAR_BUDGET=2200
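
To give a feel for how a turn limit and a character budget can interact, here is a sketch that keeps the newest chat turns until either limit is reached. This is only one plausible reading of `SMOLBRO_CONTEXT_MAX_RECENT_TURNS` and `SMOLBRO_CONTEXT_CHAR_BUDGET`; `selectRecentTurns` is a hypothetical function, and smolBro's real context assembly (which also involves memories) is not shown.

```typescript
// Keep the newest turns, oldest first, without exceeding either limit.
// Defaults mirror the values listed above.
function selectRecentTurns(turns: string[], maxTurns: number = 4, charBudget: number = 2200): string[] {
  const selected: string[] = [];
  let usedChars = 0;
  // Walk backwards from the newest turn and stop at the first one that
  // would overflow the budget.
  for (let i = turns.length - 1; i >= 0 && selected.length < maxTurns; i--) {
    const turn = turns[i];
    if (usedChars + turn.length > charBudget) break;
    selected.unshift(turn);
    usedChars += turn.length;
  }
  return selected;
}
```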

Telegram

Telegram support is optional and uses long polling from the main server process. It does not require a webhook.

At minimum:

export SMOLBRO_TELEGRAM_BOT_TOKEN=123456:telegram-bot-token

Optional hardening:

export SMOLBRO_TELEGRAM_ALLOWED_CHAT_IDS=123456789,-100987654321
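
The allowlist is a comma-separated list of numeric chat IDs, including negative IDs for groups. A minimal sketch of parsing and checking it follows. The helper names are hypothetical, and treating an empty allowlist as "no restriction" is an assumption worth verifying against the actual behavior before relying on it.

```typescript
// Parse "123456789,-100987654321" into a set of integer chat IDs.
// Blank and non-integer entries are skipped.
function parseAllowedChatIds(raw: string | undefined): Set<number> {
  const ids = new Set<number>();
  for (const part of (raw ?? "").split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue;
    const id = Number(trimmed);
    if (Number.isInteger(id)) ids.add(id);
  }
  return ids;
}

// Assumption: an empty allowlist means every chat is accepted.
function isChatAllowed(chatId: number, allowed: Set<number>): boolean {
  return allowed.size === 0 || allowed.has(chatId);
}
```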

Known Risks

This project is unfinished
Tool use can still be wrong
Memory reflection can still save the wrong thing
Local model routing is heuristic-based
Browser access is constrained but not battle-tested
API and storage formats may change without warning
There is no migration or upgrade story I would call stable yet

If You Use This

expect breakage
expect rough edges
inspect memory state yourself
keep backups if you care about the data
do not trust it with anything critical

If you still want to use it, that is on you.
