an autonomous personal operator that actually knows you.
tracks goals. stores evidence. runs durable plans. reviews patterns over time. verifies outcomes. browses the web. schedules tasks. delegates to sub-agents. reads documents. texts like a real one.
a long-running personal agent that connects to telegram or a local cli, routes every message through the right model for the job, maintains structured goals and observations, creates durable multi-step plans, verifies important outcomes, remembers your preferences via semantic memory, creates its own skills, schedules reminders and reviews, browses the web, spawns sub-agents for parallel work, reads PDFs and documents, transcribes voice messages, generates images, and replies like a real person.
built on bun. runs on a $5 vps. no bloat.
you (telegram / cli)
|
v
+----------+
| router |---- fast (gpt-5.4)
| 2-tier |---- deep (gpt-5.4)
| classify |
+----------+
|
v
+----------+ +--------------------------+
| agent |---->| generateText tool loop |
| core | | prepareStep: escalate |
+----------+ | onStepFinish: track |
| | stopWhen: 30 steps max |
v +--------------------------+
+------------------------------------------+
| tools |
| memory - assessment - planning - verify |
| search - sandbox - filesystem |
| schedule - skills - soul - status |
| subagent - image - sendFile |
+------------------------------------------+
|
v
+----------+
| sqlite | messages, tasks, usage,
| wal mode | state, subagents
+----------+
the agent calls generateText or streamText with tools, the ai sdk handles the loop, and channels call runAgent() or streamAgent() directly. tool context (userId, chatId, channel) is threaded via AsyncLocalStorage so every tool knows who it's serving without passing state around.
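a minimal sketch of that AsyncLocalStorage pattern (type and function names here are illustrative, not necessarily koda's actual ones):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// hypothetical shape of the per-request tool context
interface ToolContext {
  userId: string;
  chatId: string;
  channel: "telegram" | "cli";
}

const toolContextStorage = new AsyncLocalStorage<ToolContext>();

// any tool can read the context without it being passed as an argument
function currentContext(): ToolContext {
  const ctx = toolContextStorage.getStore();
  if (!ctx) throw new Error("tool called outside of a request context");
  return ctx;
}

// the channel wraps each agent run in a context scope; everything awaited
// inside fn (including tool calls) sees the same context
async function runWithContext<T>(ctx: ToolContext, fn: () => Promise<T>): Promise<T> {
  return toolContextStorage.run(ctx, fn);
}
```

because the store is scoped to the async call tree, two concurrent requests never see each other's context.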
model escalation: if the agent is still working after 5 tool steps on fast tier, it automatically upgrades to deep. chat tiers stay separate even when they point at the same primary model.
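the escalation rule can be expressed as a pure policy function; in koda the decision is wired into the ai sdk's prepareStep hook, so the sketch below only illustrates the policy (names and the exact threshold semantics are assumptions):

```typescript
type Tier = "fast" | "deep";

// assumed threshold: after 5 completed tool steps on fast, escalate to deep
const ESCALATE_AFTER_STEPS = 5;

// given the tier the request started on and the number of tool steps
// completed so far, decide which tier the next step should run on.
// deep requests never downgrade; fast requests upgrade once the agent
// is "still working" past the threshold.
function tierForStep(startTier: Tier, completedToolSteps: number): Tier {
  if (startTier === "fast" && completedToolSteps >= ESCALATE_AFTER_STEPS) {
    return "deep";
  }
  return startTier;
}
```

returning a tier rather than a model string is what keeps the tiers separate even when both point at the same primary model.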
failover chains: each tier has a fallback model list via openrouter's models array. if the primary model is down, koda fails over to the next model automatically.
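openrouter applies the models-array fallback server-side; the behavior is roughly equivalent to this client-side sketch (illustrative only, not koda's code):

```typescript
// try each model in the chain in order until one succeeds; if all fail,
// rethrow the last error so the caller sees why the chain was exhausted.
async function withFailover<T>(
  chain: string[],
  call: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("empty failover chain");
  for (const model of chain) {
    try {
      return await call(model);
    } catch (err) {
      lastError = err; // model down or erroring: fall through to the next
    }
  }
  throw lastError;
}
```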
tool cost tracking: external API costs (Exa search, image generation) are tracked separately from LLM token costs via addToolCost(), accumulated per-request via AsyncLocalStorage, and stored in the usage table.
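a simplified sketch of the accumulator (the real version scopes one accumulator per request via AsyncLocalStorage so tools never pass it explicitly; names here are illustrative):

```typescript
// per-request usage record, later persisted to the usage table
interface RequestUsage {
  llmCostUsd: number;   // token costs from the model provider
  toolCostUsd: number;  // external API costs (search, image generation)
}

function newUsage(): RequestUsage {
  return { llmCostUsd: 0, toolCostUsd: 0 };
}

// called by tools that hit paid external APIs, keeping those costs
// separate from LLM token costs
function addToolCost(usage: RequestUsage, usd: number): void {
  usage.toolCostUsd += usd;
}
```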
sub-agent streaming: spawned child agents broadcast live progress via streamUpdate and return structured results via returnResult. the dashboard displays progress lines in real time over SSE.
# install
bun install
# interactive setup (creates config + .env)
bun run src/index.ts setup
# verify everything works
bun run src/index.ts doctor
# run (telegram + proactive)
bun start
# or cli-only mode
bun run cli
# copy the example config
cp config/config.example.json ~/.koda/config.json

create ~/.koda/.env:
KODA_OPENROUTER_API_KEY=sk-or-...
KODA_TELEGRAM_TOKEN=123456:ABC... # required unless cli-only
KODA_EXA_API_KEY=... # optional — web search + skill shop
KODA_SUPERMEMORY_API_KEY=... # optional — semantic memory (gracefully disabled without it)
| command | what it does |
|---|---|
| /help | list all commands |
| /clear | reset conversation history |
| /usage | see token usage and costs |
| /status | system health summary — uptime, memory, models, costs, next task |
| /deep | force next message to use deep tier |
| /fast | force next message to use fast tier |
| /recap | summarize recent conversation — key topics, decisions, open items |
| /model | view or change chat/image models (/model fast openai/gpt-5.4) |
- voice messages — send voice messages or circle videos. koda transcribes them via a dedicated transcription model and responds.
- document ingestion — send PDFs, text files (.txt, .md, .csv, .json, .html, .xml) directly in Telegram. koda extracts the text and responds in context.
- reply threading — reply to any message and koda sees the original text as context.
- forwarded messages — forward messages to koda and it knows who/where they came from.
- edited messages — edit a sent message and koda processes the update.
- structured assessment state — goals, observations, interventions, and reviews are stored explicitly so koda can assess patterns over time instead of relying only on freeform memory.
- durable plans — hard tasks can be stored as multi-step plans with success criteria, verification hints, approvals, and resumable continuation.
- verification — `verifyOutcome` helps confirm files, reminders, plans, and URLs before koda claims a task is done.
- image generation — `generateImage` tool creates images via OpenRouter (default: google/gemini-3-pro-image-preview).
- file sending — `sendFile` tool sends workspace files back as Telegram documents.
- tier override — `/deep` and `/fast` commands force the next message to a specific model tier.
- model switching — `/model` changes chat/image models on the fly without touching the dedicated transcription, summary, or memory models.
- database backup — automatic daily SQLite backup to `~/.koda/backups/` with 7-day retention.
- webhook mode — optional Telegram webhook support instead of polling.
- startup/shutdown notifications — admin users get notified when koda comes online or goes down.
- dashboard — real-time web UI at `/` with usage stats, skills, tasks, sub-agent activity. SSE-powered live updates.
- tool cost tracking — external API costs (Exa, image generation) tracked separately from LLM costs.
- MCP — connect external tool servers (Notion, GitHub, etc.) via `@ai-sdk/mcp`. stdio, SSE, and HTTP transports. auto-reconnect on crash.
- sub-agents — spawn focused child agents for parallel work. isolated sessions, filtered tools, config-driven limits. live progress via `streamUpdate`. structured results via `returnResult`. addressable via `@AgentName: ...`.
- docker sandbox — run untrusted code in isolated containers with hard resource limits.
- Ollama — use local LLMs for fast tier when configured. falls back to OpenRouter when unavailable.
- soul personality — editable `soul.md` + `soul.d/*.md` with filesystem watcher for hot-reload.
- workspace context — place a `CONTEXT.md` in your workspace to inject project context into every system prompt. hot-reloads on change.
- request tracing — every request gets an 8-char ID prefix in logs for easy tracing across agent and sub-agent calls.
- task failure tracking — recurring tasks auto-disable after 3 consecutive failures with user notification.
- reflective reviews — built-in recurring reviews can audit goal drift, follow-through, and patterns in the user's work over time.
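the task-failure rule above can be sketched as a pure state transition (the threshold of 3 comes from the feature description; the names are illustrative):

```typescript
interface TaskHealth {
  consecutiveFailures: number;
  disabled: boolean;
}

const MAX_CONSECUTIVE_FAILURES = 3;

// a success resets the failure count; a failure increments it, and the
// task is disabled once failures reach the threshold
function recordRun(health: TaskHealth, ok: boolean): TaskHealth {
  if (ok) return { consecutiveFailures: 0, disabled: false };
  const failures = health.consecutiveFailures + 1;
  return {
    consecutiveFailures: failures,
    disabled: failures >= MAX_CONSECUTIVE_FAILURES,
  };
}
```

counting *consecutive* failures (rather than total) means a flaky-but-mostly-working task never gets disabled.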
all fields are optional except openrouter.apiKey (via env var).
| section | field | default | description |
|---|---|---|---|
| — | mode | "private" | "private" (telegram) or "cli-only" |
| owner | id | "owner" | owner user ID |
| openrouter | fastModel | openai/gpt-5.4 | fast chat tier model |
| openrouter | deepModel | openai/gpt-5.4 | deep chat tier model |
| openrouter | imageModel | google/gemini-3-pro-image-preview | image generation model |
| openrouter | transcriptionModel | google/gemini-3-flash-preview | voice / video transcription model |
| openrouter | summaryModel | google/gemini-3-flash-preview | conversation summarization model |
| openrouter | memoryModel | google/gemini-3-flash-preview | memory extraction / consolidation model |
| agent | maxSteps | 30 | max tool loop steps |
| agent | maxTokens | 8192 | max output tokens per turn |
| agent | temperature | 0.7 | LLM temperature |
| timeouts | llm | 120000 | LLM request timeout (ms) |
| timeouts | memory | 10000 | memory/embedding timeout (ms) |
| timeouts | search | 30000 | search/external API timeout (ms) |
| scheduler | timezone | America/Los_Angeles | IANA timezone for scheduling |
| proactive | tickIntervalMs | 30000 | scheduler tick interval (ms) |
| features | scheduler | true | enable/disable proactive scheduler |
| features | debug | false | enable debug logging |
| features | autoBackup | true | daily SQLite backup |
| subagent | timeoutMs | 90000 | sub-agent timeout (ms) |
| subagent | maxSteps | 10 | sub-agent max steps |
| ollama | enabled | false | use local Ollama for fast tier |
| ollama | baseUrl | http://localhost:11434 | Ollama server URL |
| ollama | model | llama3.2 | Ollama model name |
| mcp | servers | [] | MCP server configurations (stdio, sse, http) |
| telegram | useWebhook | false | use webhook instead of polling |
| telegram | webhookUrl | — | webhook URL (e.g., https://koda.example.com/telegram) |
| telegram | webhookSecret | — | secret token for webhook verification |
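an mcp server entry might look like this. the field names are a guess at the shape, so check config.example.json for the actual schema; the npx command is just one example of a stdio server:

```json
{
  "mcp": {
    "servers": [
      {
        "name": "github",
        "transport": "stdio",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"]
      }
    ]
  }
}
```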
| tool | what it does |
|---|---|
| remember / recall | semantic memory via Supermemory — stores facts, retrieves relevant context. gracefully disabled without API key. |
| assessmentSnapshot / upsertGoal / logObservation / createIntervention / storeReview | structured personal assessment state — goals, evidence, interventions, and reviews. |
| createPlanRecord / listPlans / getPlan / updatePlanStep / approvePlan | durable execution plans with multi-step progress and approval support. |
| verifyOutcome | checks whether important outputs actually exist before claiming success. |
| webSearch / extractUrl | exa-powered web search + page content extraction. cost tracked per call. |
| readFile / writeFile / listFiles | workspace-scoped filesystem. blocked patterns for .env, secrets, node_modules. |
| runSandboxed | isolated Docker container execution with resource limits (512MB RAM, 0.5 CPU, no network). |
| createReminder / createRecurringTask / listTasks / deleteTask | timezone-aware scheduling with natural language ("every Monday at 9am") and cron format. |
| skills | list, load, create, search, preview, install SKILL.md files. unified skill management + community skill shop via Exa. |
| getSoul / updateSoul | read or rewrite personality sections. hot-reloaded without restart. |
| systemStatus | uptime, memory usage, circuit breaker state, today's cost, next scheduled task. |
| spawnAgent | delegate sub-tasks to isolated child agents with filtered toolsets. multiple spawns run concurrently. returns structured results. |
| generateImage | generate images via OpenRouter image models. |
| sendFile | send workspace files back to the user as document attachments. |
5 tables in SQLite (WAL mode):
| table | purpose |
|---|---|
| messages | conversation history per session |
| tasks | reminders + recurring scheduled tasks |
| usage | per-request cost, tool cost, and token tracking |
| state | key-value store (schema version, seeds) |
| subagents | sub-agent spawn records |
# with docker compose (recommended)
docker compose up -d
# or standalone
docker build -t koda .
docker run -d --env-file .env -p 3000:3000 koda

# or bare metal
bun install --production
bun start

runs on anything that runs bun. 128mb ram is plenty. health checks at /health.
| layer | tech |
|---|---|
| runtime | bun |
| language | typescript 5.9 (strict) |
| ai | vercel ai sdk v6 |
| llm routing | openrouter — 2-tier auto-routing |
| memory | supermemory (optional) |
| search | exa |
| telegram | grammy |
| database | sqlite via bun:sqlite (wal mode) |
| validation | zod v4 |
| mcp | @ai-sdk/mcp |
| cli ui | @clack/prompts + chalk + ora |
MIT