Configuration
Everything blumi does is driven by one JSON file — ~/.blumi/settings.json. This page is the
complete reference: every section, its fields, its defaults, and copy-pasteable examples. All
sections are optional — blumi ships with sensible defaults, so an empty file is valid and a one-line
provider key is usually all you need to start.
The annotated JSON blocks below use // comments for clarity. Strip them if your editor is strict: settings.json is plain JSON.
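If you copy an annotated block and need plain JSON, the comments can be removed mechanically. The following is a minimal sketch, assuming comments always start with // and run to the end of the line; the function name strip_line_comments is hypothetical and not part of blumi. It tracks string literals so that URLs such as https://api.openai.com/v1 are left intact.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # this char was escaped; take it literally
            elif ch == "\\":
                escaped = True           # next char is escaped
            elif ch == '"':
                in_string = False        # closing quote
            i += 1
        elif ch == '"':
            in_string = True             # opening quote
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_annotated(text: str) -> dict:
    """Parse an annotated settings snippet as plain JSON."""
    return json.loads(strip_line_comments(text))
```

Passing the contents of an annotated file through load_annotated yields an ordinary dictionary that can be re-serialised with json.dumps.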
blumi merges these in order (later wins), via figment:
- Built-in defaults
- ~/.blumi/settings.json: global, your main file
- ./.blumi/settings.json: per-project overrides (commit-safe project tweaks; secrets stay global)
- Environment variables: prefix BLUMI_, nest with __ (e.g. BLUMI_LLM__MODEL=claude-opus-4-5, BLUMI_GRID__SECRET=…, BLUMI_PROVIDERS__OPENAI__API_KEY=…)
settings.json is written mode 0600 and holds secrets — never commit it (the repo's .blumi/
is gitignored). Per-invocation flags (--provider, --model, --persona, --yolo) beat all of the above.
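The layering described above can be pictured with a short sketch. This is illustrative Python, not blumi's code (blumi delegates the merge to figment, a Rust crate); the helpers deep_merge, env_layer, and resolve are hypothetical, and lower-casing variable names and keeping values as plain strings are simplifying assumptions.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied on top; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def env_layer(environ: dict, prefix: str = "BLUMI_") -> dict:
    """Turn BLUMI_A__B=value into {"a": {"b": "value"}} (assumed mapping)."""
    layer: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = layer
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return layer


def resolve(defaults: dict, global_file: dict, project_file: dict, environ: dict) -> dict:
    """Apply the layers in order; each later layer wins over earlier ones."""
    result = defaults
    for layer in (global_file, project_file, env_layer(environ)):
        result = deep_merge(result, layer)
    return result
```

With a global file that sets llm.model and an environment containing BLUMI_LLM__MODEL, the environment value ends up in the resolved settings while untouched keys such as llm.provider keep their earlier values. Per-invocation flags would form one more layer on top.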
blumi login
Pick a provider, paste a key (or endpoint), and choose a model; it writes the right bits into
settings.json. Re-run any time to switch. That's all most people need; the rest of this page is for
tuning.
- TUI / gateway session: ask the agent to reload_self, or use the web/phone "reload". It re-reads settings.json without losing the conversation.
- Process-level settings (bind host/port, the web password, the grid identity) are read once at startup, so they need a restart: launchctl kickstart -k gui/$(id -u)/com.blumi.serve (macOS) / systemctl --user restart blumi-serve (Linux), or just relaunch blumi tui.
blumi is provider-agnostic. Built-in presets exist for the common ones (so you usually set only a key),
keyed by name in providers:
| Preset name | kind | Notes |
|---|---|---|
| anthropic | anthropic | Claude (API-key auth only) |
| openai | openai_compat | OpenAI and any OpenAI-compatible endpoint (set base_url) |
| gemini | gemini | Google Gemini (native client) |
| azure | anthropic_foundry | Azure AI Foundry (Anthropic models) |
| local | openai_compat | a local server (llama.cpp / Ollama-compatible), no key |
| mock | - | deterministic, for tests/demos |
Each provider entry:
"providers": {
"anthropic": { "api_key_env": "ANTHROPIC_API_KEY" }, // read key from env (preferred)
"openai": { "api_key": "sk-…", "base_url": "https://api.openai.com/v1" },
"local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
}
- api_key_env (read the key from an env var) is preferred over a literal api_key so the key never sits in the file.
- base_url points OpenAI-compatible clients at any endpoint (Ollama, llama.cpp, OpenRouter, …).
- kind is only needed for a fully custom provider name; the presets above set it for you.
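To make the api_key_env / api_key distinction concrete, here is a minimal sketch of how a key could be looked up from a provider entry. The lookup order (environment variable first, literal key as fallback) is an assumption for illustration; the page only says that api_key_env is the preferred way to supply a key. The function name resolve_api_key is hypothetical.

```python
from typing import Mapping, Optional


def resolve_api_key(entry: Mapping[str, str], environ: Mapping[str, str]) -> Optional[str]:
    """Return the key for one provider entry, or None if no key is configured.

    Assumed order: the variable named by "api_key_env", then a literal "api_key".
    """
    env_name = entry.get("api_key_env")
    if env_name:
        from_env = environ.get(env_name)
        if from_env:
            return from_env
    return entry.get("api_key") or None
```

In real use, environ would be os.environ. A keyless entry such as the local preset simply resolves to None.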
"llm": {
"provider": "anthropic", // which entry in "providers"
"model": "claude-sonnet-4-5", // "" = let the provider pick/probe
"context_size": 131072,
"max_output_tokens": 16384,
"temperature": 0.7,
"top_p": 0.8,
"top_k": 20,
"max_iterations": 25, // tool steps allowed per turn
"max_auto_continue": 12, // self-continue rounds when a turn hits the step cap (0 = off)
"max_auto_continue_tokens": 400000, // token ceiling for one auto-continue run (0 = no cap)
"max_local_agents": 4 // max concurrent local sub-agents (overflow → grid or queue)
}
Override per run without editing: blumi --provider openai --model gpt-4o run "…".
"permissions": {
"yolo": false, // true = auto-approve EVERYTHING (use only sandboxed)
"tools": {
"Bash": { "deny": ["rm -rf*", "sudo*"], "ask": ["git push*"] },
"FileWrite": { "allow": ["src/**"] }
}
}
Per-tool allow / deny / ask pattern lists. Interactive by default (mutating tools surface an
approval card); toggle YOLO live with ctrl+y / /yolo / --yolo. For real guardrails, pair with
hooks.pre_tool_use.
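The pattern lists can be read as glob rules matched against the command or path a tool is about to act on. The sketch below is one plausible evaluation order, not blumi's documented behaviour: it assumes deny is checked before ask, ask before allow, and that anything unmatched falls back to asking. The function name decide is hypothetical, Python's fnmatch stands in for whatever matcher blumi uses, and the yolo switch is not modelled.

```python
from fnmatch import fnmatchcase


def decide(tool: str, subject: str, permissions: dict) -> str:
    """Return "deny", "ask", or "allow" for one tool call.

    subject is the command line (Bash) or the path (FileWrite) being checked.
    """
    rules = permissions.get("tools", {}).get(tool, {})
    # Assumed precedence: the most restrictive matching list wins.
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(subject, pattern) for pattern in rules.get(verdict, [])):
            return verdict
    return "ask"  # interactive by default


PERMISSIONS = {
    "tools": {
        "Bash": {"deny": ["rm -rf*", "sudo*"], "ask": ["git push*"]},
        "FileWrite": {"allow": ["src/**"]},
    }
}
```

Under these assumptions, decide("Bash", "git push origin main", PERMISSIONS) returns "ask" and a write to src/main.rs returns "allow".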
"persona": "default", // active persona (built-in or custom)
"personas": {
"reviewer": {
"description": "Careful code reviewer",
"instructions": "Be terse. Prioritise correctness + security. Propose diffs.",
"model": "claude-opus-4-5", // optional: switch model when active
"temperature": 0.2 // optional override
}
}
Built-ins include architect, pair, reviewer, team. Switch with --persona <name> or /persona.
"executor": {
"backend": "local", // "local" (host) | "docker" (sandbox) | "ssh" (remote)
"docker_image": "debian:stable-slim",
"ssh_host": "user@box", // for backend = "ssh"
"ssh_workdir": "/home/user/proj"
}
Use docker (or ssh to a throwaway box) when you want YOLO/gateway autonomy without risking the host.
"brain": {
"mode": "off", // "off" | "advisory" (annotate) | "auto" (decide)
"provider": "local", // "" = reuse the main agent's provider
"model": "qwen2.5:3b" // a small/cheap/local model is ideal here
}
A cheap model vets approval prompts so the flagship isn't interrupted, and escalates to you on uncertainty or danger.
"router": {
"mode": "off", // "off" | "heuristic" | "hybrid" | "judge"
"light": { "provider": "", "model": "claude-haiku-4-5" },
"heavy": { "provider": "", "model": "claude-opus-4-5" },
"judge": { "provider": "", "model": "" }, // "" = reuse brain.* then llm.*
"subagent_tier": "light", // "light" | "heavy" | "inherit"
"prefer_grid_light": false, // run the light tier on a grid peer's local model (free)
"heuristics": {
"heavy_chars": 1500, "light_chars": 280, "heavy_tool_count": 12,
"escalate_iteration": 6, "heavy_keywords": [], "light_keywords": []
}
}
Per turn, the router picks the light tier or the flagship to cut cost. Off by default. An empty tier provider/model reuses
the active llm.*. See Self-Management → Cost-aware routing.
"heal": {
"enabled": true,
"recovery_budget": 2, // recovery attempts per turn
"verify": false, // only mark a fix "verified" if a later step succeeds
"learn": true, // store failure→fix episodes in memory
"evolve": "auto", // "auto" | "propose" | "off" (mine recurring fixes → skills)
"redact_paths": true
}
Auto-recovers failed tool calls, learns fixes, and mines recurring ones into recovery skills. On by default. See Self-Management → Self-healing.
These three power recall, semantic memory, and code search. All on by default — the bundled local embedder downloads a ~90 MB model on first use, then works fully offline. See Memory & Knowledge.
"embeddings": {
"enabled": true,
"backend": "local", // "local" (bundled ONNX) | "openai" (endpoint) | "grid" (peer)
"provider": "", // for backend = "openai": a name from "providers"
"model": "bge-small-en-v1.5",
"dim": 384
},
"memory": {
"enabled": true,
"recall_k": 5, // memories injected per turn (RAG)
"dedup_threshold": 0.92, // admission gate: near-duplicates are merged
"max_per_namespace": 2000,
"diffuse": true, // share non-`user` learnings across the grid
"sweep_secs": 60 // governance cadence (eviction/consolidation)
},
"knowledge": {
"enabled": true,
"max_file_kb": 256, // skip files larger than this when indexing
"exclude": ["target", "node_modules", ".git", "dist"]
}
- Semantic memory accrues automatically (failure→fix episodes, the agent's memory tool); browse it with /memories (TUI) or the web/phone Control Center (where you can pin/edit/delete). The user namespace never leaves your node. Tell the agent "remember that …" to force-store; pin an entry to exempt it from eviction.
- Knowledge is empty until you index a repo: blumi knowledge ingest . then blumi knowledge status (or /knowledge in the TUI). Powers code_search / code_retrieve.
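The memory.dedup_threshold admission gate can be illustrated with a small sketch. It assumes the gate compares embedding vectors by cosine similarity and merges a new memory into the closest stored one when the score reaches the threshold; the page does not state the metric, so treat that as an assumption. The names cosine_similarity and find_duplicate are hypothetical.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate(
    candidate: Sequence[float],
    stored: Sequence[Sequence[float]],
    threshold: float = 0.92,
) -> Optional[int]:
    """Return the index of the closest stored vector at or above threshold.

    None means the candidate is distinct enough to be stored as a new memory.
    """
    best_index, best_score = None, threshold
    for index, vector in enumerate(stored):
        score = cosine_similarity(candidate, vector)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index
```

Raising the threshold toward 1.0 admits more near-identical entries; lowering it merges more aggressively.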
"acceleration": {
"mode": "auto", // "auto" | "cpu" | "apple" (CoreML) | "cuda"
"embeddings_accel": "auto" // override just the bundled embedder
}
blumi accel doctor shows what was detected. See
Memory & Knowledge → GPU acceleration.
"web": { "password_hash": "" } // set via `blumi serve pair --password <pw>` (argon2; never plaintext)
A password is required when binding to a non-loopback address. The same server serves the browser UI and the blugo phone app. See Gateway.
"voice": {
"enabled": false,
"api_key": "",
"stt_base_url": "https://api.openai.com/v1",
"stt_model": "whisper-1",
"tts_provider": "openai", // "openai" | "elevenlabs"
"tts_base_url": "", // blank = provider default
"tts_model": "tts-1",
"tts_voice": "alloy", // OpenAI voice name, or an ElevenLabs voice id
"tts_api_key": "" // separate TTS key (falls back to api_key)
}
See Voice.
"gateway": {
"yolo": false, // auto-approve in bot sessions (only with a sandboxed executor!)
"telegram": {
"token": "123456:AA…", // @BotFather
"allowed_chats": [], // [] = anyone who messages it; or [<your-chat-id>]
"voice": false // transcribe voice notes + speak replies (needs voice.* too)
},
"discord": { "token": "", "allowed_channels": [] },
"slack": { "bot_token": "xoxb-…", "app_token": "xapp-…" },
"whatsapp": { "token": "", "phone_number_id": "", "verify_token": "", "webhook_port": 8080 }
}
Run one transport in the foreground (blumi gateway telegram) or all configured ones as a service
(blumi gateway install). One bot per token; see Gateway → Messaging-bot gateway.
"grid": {
"enabled": false,
"secret": "", // same value on every node = same grid (or BLUMI_GRID__SECRET)
"grid_id": "", // blank = derived from the secret digest
"node_name": "", // blank = hostname
"peers": ["10.0.0.150:7777"] // static peers (in addition to mDNS auto-discovery)
}
Several blumi serve nodes sharing a secret form a grid that hands tasks to peers. The secret is
never advertised (only a non-sensitive digest). See Grid.
"remote": { "instances": [] }, // remote blumi instances the TUI can attach to as tabs
"workspaces": { "roots": ["~/code"] } // dirs scanned for sibling git repos in the TUI sidebar
"git": { "author_name": "Blumi", "author_email": "you@example.com" }
Stamped on commits the agent makes (GIT_AUTHOR_* / GIT_COMMITTER_*). Empty author_name disables
the override.
"always_on": {
"enabled": false,
"autonomy": "propose", // "off" | "propose" (add tasks) | "auto" (reserved)
"cadence_secs": 900,
"min_interval_secs": 300,
"skip_if_todos": 1, // skip while the board already has todos
"max_open_discoveries": 5,
"max_per_pass": 3
}
When idle, the gateway runs a read-only discovery pass and proposes tasks + a report. Off by default. See Self-Management → Always-on discovery.
"notify": {
"enabled": false,
"on": ["loop", "discovery"], // which completions fire (also "turn"); [] = loop+discovery
"desktop": true, // OS notification on the host
"bot": { "transport": "telegram", "target": "<chat-id>" }, // proactive bot message
"web_push": false // browser Web Push (VAPID; secure-context only)
}
Pings you when blumi loop / discovery finishes. Off by default. See
Self-Management → Completion notifications.
"hooks": {
"user_prompt_submit": [
{ "command": "git branch --show-current", "timeout_secs": 5 }
],
"pre_tool_use": [
{ "command": "jq -e '.input.command|test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
]
}
user_prompt_submit injects each command's stdout as turn context; pre_tool_use can block a tool
(non-zero exit). Off by default. See Self-Management → Lifecycle hooks.
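The blocking rule for pre_tool_use (a non-zero exit blocks the tool) can be sketched as follows. This is an illustration under assumptions, not blumi's implementation: it supposes the hook receives the pending tool call as JSON on stdin, which is what the jq filter on .input.command in the example suggests, and it treats a timeout as a block. The payload shape and the function name hook_allows are hypothetical.

```python
import json
import subprocess


def hook_allows(command: str, payload: dict, timeout_secs: float = 5.0) -> bool:
    """Run one hook command through the shell and report whether the tool may proceed.

    The payload is written to the hook's stdin as JSON (assumed contract).
    Exit code 0 allows the call; any other exit code blocks it.
    """
    try:
        result = subprocess.run(
            command,
            shell=True,
            input=json.dumps(payload),
            text=True,
            capture_output=True,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        return False  # assumption: a hook that hangs is treated as a block
    return result.returncode == 0
```

With a payload such as {"tool": "Bash", "input": {"command": "rm -rf /"}}, a hook command that exits non-zero makes hook_allows return False, and the tool call would be refused.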
"mcp_servers": {
"github": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_TOKEN": "ghp_…" }, "enabled": true }
}
External MCP servers launched over stdio; their tools appear to the agent. {workspace} in args is
substituted with the project path. See blumi mcp and CLI Usage.
"lsp_servers": {
"rust": { "command": "rust-analyzer", "args": [], "extensions": ["rs"], "language_id": "rust" }
}
These power the Lsp code-intelligence tool (definitions, references, diagnostics).
A realistic everyday settings.json — Claude for the flagship, a cheap local brain, memory on,
notifications to Telegram:
{
"llm": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
"providers": {
"anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },
"local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
},
"brain": { "mode": "advisory", "provider": "local", "model": "qwen2.5:3b" },
"router": { "mode": "hybrid", "light": { "model": "claude-haiku-4-5" },
"heavy": { "model": "claude-opus-4-5" } },
"memory": { "enabled": true },
"knowledge": { "enabled": true },
"git": { "author_name": "Blumi", "author_email": "you@example.com" },
"notify": { "enabled": true, "bot": { "transport": "telegram", "target": "123456789" } },
"gateway": { "telegram": { "token": "123456:AA…", "allowed_chats": [123456789] } }
}
When a turn stops only because it hit the per-turn step cap, blumi auto-continues the same session and
narrates each step, bounded by both llm.max_auto_continue (default 12) and
llm.max_auto_continue_tokens (default 400k), whichever hits first. Tune live with /autocontinue <n>
(0 disables).
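The two bounds can be expressed as a single check. The sketch below is a hypothetical helper, not blumi's code; it takes the documented defaults and follows the settings comments, where max_auto_continue = 0 turns the feature off and max_auto_continue_tokens = 0 means no token cap. How blumi counts tokens within a run is not specified here, so tokens_used is simply whatever running total the caller tracks.

```python
def may_auto_continue(
    rounds_done: int,
    tokens_used: int,
    max_auto_continue: int = 12,
    max_auto_continue_tokens: int = 400_000,
) -> bool:
    """Decide whether another auto-continue round is allowed.

    Stops at whichever limit is reached first: the round count or the token ceiling.
    """
    if max_auto_continue == 0:
        return False  # feature disabled
    if rounds_done >= max_auto_continue:
        return False  # round budget exhausted
    if max_auto_continue_tokens and tokens_used >= max_auto_continue_tokens:
        return False  # token ceiling reached (0 would mean no ceiling)
    return True
```

For example, after 3 rounds and 400,000 tokens the check returns False even though 9 rounds remain, because the token ceiling is hit first.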
Beyond semantic memory, two markdown files in ~/.blumi/ are read as a frozen snapshot each session:
project MEMORY.md (agent notes) and USER.md (about you). View/edit them in the TUI
(/memory), the web/phone Control Center, or on disk.