koda

v0.12.0 Bun TypeScript MIT

an autonomous personal operator that actually knows you.
tracks goals. stores evidence. runs durable plans. reviews patterns over time. verifies outcomes. browses the web. schedules tasks. delegates to sub-agents. reads documents. texts like a real one.


a long-running personal agent that connects to telegram or a local cli, routes every message through the right model for the job, maintains structured goals and observations, creates durable multi-step plans, verifies important outcomes, remembers your preferences via semantic memory, creates its own skills, schedules reminders and reviews, browses the web, spawns sub-agents for parallel work, reads PDFs and documents, transcribes voice messages, generates images, and replies like a real person.

built on bun. runs on a $5 vps. no bloat.

architecture

you (telegram / cli)
         |
         v
   +-----------+
   |  router   |---- fast (gpt-5.4)
   |  2-tier   |---- deep (gpt-5.4)
   |  classify |
   +-----------+
         |
         v
   +-----------+     +---------------------------+
   |  agent    |---->|  generateText tool loop   |
   |  core     |     |  prepareStep: escalate    |
   +-----------+     |  onStepFinish: track      |
         |           |  stopWhen: 30 steps max   |
         v           +---------------------------+
  +------------------------------------------+
  |  tools                                   |
  |  memory - assessment - planning - verify |
  |  search - sandbox - filesystem           |
  |  schedule - skills - soul - status       |
  |  subagent - image - sendFile             |
  +------------------------------------------+
         |
         v
   +-----------+
   |  sqlite   |  messages, tasks, usage,
   |  wal mode |  state, subagents
   +-----------+

the agent calls generateText or streamText with tools, the ai sdk handles the loop, and channels call runAgent() or streamAgent() directly. tool context (userId, chatId, channel) is threaded via AsyncLocalStorage so every tool knows who it's serving without passing state around.
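the AsyncLocalStorage pattern can be sketched in a few lines. this is an illustrative sketch based on the description above, not koda's actual source — the ToolContext shape and the helper names are assumptions:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// hypothetical context shape — koda's real one may carry more fields
interface ToolContext {
  userId: string;
  chatId: string;
  channel: "telegram" | "cli";
}

const toolContext = new AsyncLocalStorage<ToolContext>();

// a channel wraps each agent run in a context scope once, at the entry point
function runWithContext<T>(ctx: ToolContext, fn: () => T): T {
  return toolContext.run(ctx, fn);
}

// any tool (however deep in the call stack) reads the context without it
// being passed as an argument
function currentContext(): ToolContext {
  const ctx = toolContext.getStore();
  if (!ctx) throw new Error("tool called outside an agent run");
  return ctx;
}

// usage
const who = runWithContext(
  { userId: "u1", chatId: "c1", channel: "cli" },
  () => currentContext().userId,
);
// who === "u1"
```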

model escalation: if the agent is still working after 5 tool steps on fast tier, it automatically upgrades to deep. chat tiers stay separate even when they point at the same primary model.
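as a sketch, the escalation rule reduces to a small pure function that a prepareStep hook could call on every step. the ESCALATE_AFTER constant and Tier names are illustrative, not lifted from koda's source:

```typescript
// hypothetical sketch of the 5-step escalation rule
type Tier = "fast" | "deep";

const ESCALATE_AFTER = 5;

function tierForStep(startTier: Tier, stepNumber: number): Tier {
  if (startTier === "deep") return "deep"; // deep runs never downgrade
  // a fast run that is still calling tools after 5 steps upgrades to deep
  return stepNumber >= ESCALATE_AFTER ? "deep" : "fast";
}

// inside the ai sdk tool loop this would back prepareStep, roughly:
// prepareStep: ({ stepNumber }) =>
//   tierForStep("fast", stepNumber) === "deep" ? { model: deepModel } : undefined
```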

failover chains: each tier has a fallback model list via openrouter's models array. if the primary model is down, koda fails over to the next model automatically.
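with openrouter this is a request-body field rather than client-side retry logic. a hedged sketch of what a tier's request might look like — the fallback model id here is a placeholder, not koda's actual chain:

```typescript
// hypothetical request body: openrouter tries the primary first, then each
// entry in `models` in order when a model errors or is unavailable
const chatRequest = {
  model: "openai/gpt-5.4",                               // primary for this tier
  models: ["openai/gpt-5.4", "example/fallback-model"],  // failover chain (placeholder ids)
  messages: [{ role: "user", content: "hi" }],
};
```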

tool cost tracking: external API costs (Exa search, image generation) are tracked separately from LLM token costs via addToolCost(), accumulated per-request via AsyncLocalStorage, and stored in the usage table.
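the per-request accumulation described above can be sketched with the same AsyncLocalStorage trick. addToolCost is named in the README, but this store shape and the dollar amounts are assumptions for illustration:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// one mutable cost bucket per request, scoped via AsyncLocalStorage
const requestCosts = new AsyncLocalStorage<{ toolCostUsd: number }>();

function addToolCost(usd: number): void {
  const store = requestCosts.getStore();
  if (store) store.toolCostUsd += usd; // silently a no-op outside a request scope
}

// each incoming request runs inside its own bucket; tools add costs as they go,
// and the final total is what gets written to the usage table
const total = requestCosts.run({ toolCostUsd: 0 }, () => {
  addToolCost(0.003); // e.g. one Exa search call (illustrative price)
  addToolCost(0.02);  // e.g. one image generation (illustrative price)
  return requestCosts.getStore()!.toolCostUsd;
});
// total ≈ 0.023
```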

sub-agent streaming: spawned child agents broadcast live progress via streamUpdate and return structured results via returnResult. the dashboard displays progress lines in real time over SSE.

quick start

# install
bun install

# interactive setup (creates config + .env)
bun run src/index.ts setup

# verify everything works
bun run src/index.ts doctor

# run (telegram + proactive)
bun start

# or cli-only mode
bun run cli

manual config

cp config/config.example.json ~/.koda/config.json

create ~/.koda/.env:

KODA_OPENROUTER_API_KEY=sk-or-...
KODA_TELEGRAM_TOKEN=123456:ABC...     # required unless cli-only
KODA_EXA_API_KEY=...                  # optional — web search + skill shop
KODA_SUPERMEMORY_API_KEY=...          # optional — semantic memory (gracefully disabled without it)

telegram commands

| command | what it does |
|---|---|
| /help | list all commands |
| /clear | reset conversation history |
| /usage | see token usage and costs |
| /status | system health summary — uptime, memory, models, costs, next task |
| /deep | force next message to use deep tier |
| /fast | force next message to use fast tier |
| /recap | summarize recent conversation — key topics, decisions, open items |
| /model | view or change chat/image models (/model fast openai/gpt-5.4) |

features

  • voice messages — send voice messages or circle videos. koda transcribes them via a dedicated transcription model and responds.
  • document ingestion — send PDFs, text files (.txt, .md, .csv, .json, .html, .xml) directly in Telegram. koda extracts the text and responds in context.
  • reply threading — reply to any message and koda sees the original text as context.
  • forwarded messages — forward messages to koda and it knows who/where they came from.
  • edited messages — edit a sent message and koda processes the update.
  • structured assessment state — goals, observations, interventions, and reviews are stored explicitly so koda can assess patterns over time instead of relying only on freeform memory.
  • durable plans — hard tasks can be stored as multi-step plans with success criteria, verification hints, approvals, and resumable continuation.
  • verification — verifyOutcome helps confirm files, reminders, plans, and URLs before koda claims a task is done.
  • image generation — generateImage tool creates images via OpenRouter (default: google/gemini-3-pro-image-preview).
  • file sending — sendFile tool sends workspace files back as Telegram documents.
  • tier override — /deep and /fast commands force the next message to a specific model tier.
  • model switching — /model changes chat/image models on the fly without touching the dedicated transcription, summary, or memory models.
  • database backup — automatic daily SQLite backup to ~/.koda/backups/ with 7-day retention.
  • webhook mode — optional Telegram webhook support instead of polling.
  • startup/shutdown notifications — admin users get notified when koda comes online or goes down.
  • dashboard — real-time web UI at / with usage stats, skills, tasks, sub-agent activity. SSE-powered live updates.
  • tool cost tracking — external API costs (Exa, image generation) tracked separately from LLM costs.
  • MCP — connect external tool servers (Notion, GitHub, etc.) via @ai-sdk/mcp. stdio, SSE, and HTTP transports. auto-reconnect on crash.
  • sub-agents — spawn focused child agents for parallel work. isolated sessions, filtered tools, config-driven limits. live progress via streamUpdate. structured results via returnResult. addressable via @AgentName: ....
  • docker sandbox — run untrusted code in isolated containers with hard resource limits.
  • Ollama — use local LLMs for fast tier when configured. falls back to OpenRouter when unavailable.
  • soul personality — editable soul.md + soul.d/*.md with filesystem watcher for hot-reload.
  • workspace context — place a CONTEXT.md in your workspace to inject project context into every system prompt. hot-reloads on change.
  • request tracing — every request gets an 8-char ID prefix in logs for easy tracing across agent and sub-agent calls.
  • task failure tracking — recurring tasks auto-disable after 3 consecutive failures with user notification.
  • reflective reviews — built-in recurring reviews can audit goal drift, follow-through, and patterns in the user's work over time.

config reference

all fields are optional except openrouter.apiKey (via env var).

| section | field | default | description |
|---|---|---|---|
| mode | | "private" | "private" (telegram) or "cli-only" |
| owner | id | "owner" | owner user ID |
| openrouter | fastModel | openai/gpt-5.4 | fast chat tier model |
| openrouter | deepModel | openai/gpt-5.4 | deep chat tier model |
| openrouter | imageModel | google/gemini-3-pro-image-preview | image generation model |
| openrouter | transcriptionModel | google/gemini-3-flash-preview | voice / video transcription model |
| openrouter | summaryModel | google/gemini-3-flash-preview | conversation summarization model |
| openrouter | memoryModel | google/gemini-3-flash-preview | memory extraction / consolidation model |
| agent | maxSteps | 30 | max tool loop steps |
| agent | maxTokens | 8192 | max output tokens per turn |
| agent | temperature | 0.7 | LLM temperature |
| timeouts | llm | 120000 | LLM request timeout (ms) |
| timeouts | memory | 10000 | memory/embedding timeout (ms) |
| timeouts | search | 30000 | search/external API timeout (ms) |
| scheduler | timezone | America/Los_Angeles | IANA timezone for scheduling |
| proactive | tickIntervalMs | 30000 | scheduler tick interval (ms) |
| features | scheduler | true | enable/disable proactive scheduler |
| features | debug | false | enable debug logging |
| features | autoBackup | true | daily SQLite backup |
| subagent | timeoutMs | 90000 | sub-agent timeout (ms) |
| subagent | maxSteps | 10 | sub-agent max steps |
| ollama | enabled | false | use local Ollama for fast tier |
| ollama | baseUrl | http://localhost:11434 | Ollama server URL |
| ollama | model | llama3.2 | Ollama model name |
| mcp | servers | [] | MCP server configurations (stdio, sse, http) |
| telegram | useWebhook | false | use webhook instead of polling |
| telegram | webhookUrl | | webhook URL (e.g., https://koda.example.com/telegram) |
| telegram | webhookSecret | | secret token for webhook verification |
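pieced together from the table above, a minimal ~/.koda/config.json might look like this. an illustrative sketch, not a canonical config — every field is optional, so set only what you want to override:

```json
{
  "mode": "private",
  "owner": { "id": "owner" },
  "openrouter": {
    "fastModel": "openai/gpt-5.4",
    "deepModel": "openai/gpt-5.4"
  },
  "agent": { "maxSteps": 30, "temperature": 0.7 },
  "scheduler": { "timezone": "America/Los_Angeles" },
  "features": { "scheduler": true, "autoBackup": true },
  "ollama": { "enabled": false }
}
```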

tools

| tool | what it does |
|---|---|
| remember / recall | semantic memory via Supermemory — stores facts, retrieves relevant context. gracefully disabled without API key. |
| assessmentSnapshot / upsertGoal / logObservation / createIntervention / storeReview | structured personal assessment state — goals, evidence, interventions, and reviews. |
| createPlanRecord / listPlans / getPlan / updatePlanStep / approvePlan | durable execution plans with multi-step progress and approval support. |
| verifyOutcome | checks whether important outputs actually exist before claiming success. |
| webSearch / extractUrl | exa-powered web search + page content extraction. cost tracked per call. |
| readFile / writeFile / listFiles | workspace-scoped filesystem. blocked patterns for .env, secrets, node_modules. |
| runSandboxed | isolated Docker container execution with resource limits (512MB RAM, 0.5 CPU, no network). |
| createReminder / createRecurringTask / listTasks / deleteTask | timezone-aware scheduling with natural language ("every Monday at 9am") and cron format. |
| skills | list, load, create, search, preview, install SKILL.md files. unified skill management + community skill shop via Exa. |
| getSoul / updateSoul | read or rewrite personality sections. hot-reloaded without restart. |
| systemStatus | uptime, memory usage, circuit breaker state, today's cost, next scheduled task. |
| spawnAgent | delegate sub-tasks to isolated child agents with filtered toolsets. multiple spawns run concurrently. returns structured results. |
| generateImage | generate images via OpenRouter image models. |
| sendFile | send workspace files back to the user as document attachments. |

database

5 tables in SQLite (WAL mode):

| table | purpose |
|---|---|
| messages | conversation history per session |
| tasks | reminders + recurring scheduled tasks |
| usage | per-request cost, tool cost, and token tracking |
| state | key-value store (schema version, seeds) |
| subagents | sub-agent spawn records |

deploy

docker

# with docker compose (recommended)
docker compose up -d

# or standalone
docker build -t koda .
docker run -d --env-file .env -p 3000:3000 koda

vps

bun install --production
bun start

runs on anything that runs bun. 128mb ram is plenty. health checks at /health.

tech stack

| layer | tech |
|---|---|
| runtime | bun |
| language | typescript 5.9 (strict) |
| ai | vercel ai sdk v6 |
| llm routing | openrouter — 2-tier auto-routing |
| memory | supermemory (optional) |
| search | exa |
| telegram | grammy |
| database | sqlite via bun:sqlite (wal mode) |
| validation | zod v4 |
| mcp | @ai-sdk/mcp |
| cli ui | @clack/prompts + chalk + ora |

license

MIT
