Stop giving your AI amnesia.
Brain OS is a biologically inspired central cognitive engine written in pure Rust. Instead of every script, coding assistant, and chat UI keeping its own isolated, fragmented context, Brain OS acts as your single source of truth.
It routes intents through a Thalamus, scores importance via an Amygdala, and stores everything in a unified Hippocampus (FTS5 + HNSW Vector Search). Whether you connect via HTTP, WebSocket, gRPC, or MCP, your AI tools now share one localized, ever-growing memory that runs 24/7 on your machine.
Your data never leaves your hardware. Your AI never forgets.
Every input — regardless of protocol — flows through the same pipeline:
Input → Intent Classification → Importance Scoring → Memory Store/Recall → LLM Response
The memory engine combines vector search (HNSW) with full-text search (BM25 FTS5), fuses results via Reciprocal Rank Fusion, and reranks by importance and recency. A forgetting curve runs every 24 hours to prune low-value memories and promote reinforced episodes to permanent semantic facts.
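To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked hit lists. The function name, the document labels, and the `k = 60` constant are illustrative assumptions, not Brain OS internals:

```rust
use std::collections::HashMap;

/// RRF: score(doc) = sum over lists of 1 / (k + rank), with 1-based ranks.
/// Documents ranked highly in multiple lists accumulate the largest scores.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, doc) in list.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let vector_hits = vec!["dark-mode", "editor-font", "deploy-note"]; // HNSW order
    let keyword_hits = vec!["dark-mode", "deploy-note"];               // BM25 order
    let fused = rrf_fuse(&[vector_hits, keyword_hits], 60.0);
    // "dark-mode" ranks first: it appears at the top of both lists.
    assert_eq!(fused[0].0, "dark-mode");
}
```

In the real engine, the fused list would then be reranked by importance and recency before assembly into LLM context.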
Requirements: Ollama (or any OpenAI-compatible API), Docker (optional, for web search)
# Install the brain binary (requires Rust 1.82+)
cargo install brainos
# Initialize data directory (~/.brain/)
brain init
# Pull the default LLM + embedding models
ollama pull qwen2.5-coder:7b
ollama pull nomic-embed-text
# Start external services (SearXNG web search — optional)
brain deps up

# Or build from source:
git clone https://github.com/keshavashiya/brain.git
cd brain
cargo install --path crates/cli
brain init

brain init creates ~/.brain/ with config, database, vector index, and log directories.
brain deps up starts a Docker container for SearXNG (web search, port 8888). This is optional — Brain works without it but web search intents will return "backend not configured".
If the embedding provider is unavailable, Brain uses deterministic normalized fallback vectors so writes and search continue without panics. Semantic quality is lower until the embedding provider is healthy.
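A deterministic fallback vector can be sketched as hashing (text, dimension index) into pseudo-random components and L2-normalizing the result. The function name and hashing scheme below are assumptions for illustration, not the actual implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic fallback embedding: the same text always produces the same
/// unit-length vector, so writes and ANN search keep working without a model.
fn fallback_embedding(text: &str, dims: usize) -> Vec<f32> {
    let mut v: Vec<f32> = (0..dims)
        .map(|i| {
            let mut h = DefaultHasher::new();
            text.hash(&mut h);
            i.hash(&mut h);
            // Map the u64 hash into [-1.0, 1.0].
            (h.finish() as f64 / u64::MAX as f64 * 2.0 - 1.0) as f32
        })
        .collect();
    // L2-normalize so cosine similarity stays well-defined downstream.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

fn main() {
    let a = fallback_embedding("I prefer dark mode", 768);
    let b = fallback_embedding("I prefer dark mode", 768);
    assert_eq!(a, b); // deterministic across calls
}
```

Such vectors carry no semantic meaning, which is why search quality recovers only once the real embedding provider is healthy again.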
# Start Brain as a background daemon (all adapters enabled)
brain start
# Stop the daemon
brain stop
# Check daemon + adapter status
brain status
# Interactive chat (connects to running daemon or starts inline)
brain chat
# One-shot message
brain chat "remember that I use dark mode"

Brain uses an optional Docker container for web search:
brain deps up # Start SearXNG
brain deps status # Check if running
brain deps down     # Stop

| Service | Port | Purpose |
|---|---|---|
| SearXNG | 8888 | Web search backend (metasearch engine) |
brain status automatically checks if SearXNG is reachable.
Install Brain as a system service so it starts automatically on login:
# Install (creates launchd / systemd / Task Scheduler entry)
brain service install
# Remove the service
brain service uninstall

| Platform | Mechanism | Privileges required |
|---|---|---|
| macOS | launchd (LaunchAgents) | None |
| Linux | systemd user service | None |
| Windows | Task Scheduler (ONLOGON) | None |
After installation the daemon starts immediately and will restart after crashes.
Any MCP-compatible client can connect to Brain as a stdio MCP server. MCP (Model Context Protocol) is an open standard for connecting AI assistants to tools and data sources.
Configure your MCP client to spawn Brain as a subprocess:
{
"mcpServers": {
"brain": {
"command": "brain",
"args": ["mcp"]
}
}
}

Brain also exposes MCP over HTTP (brain serve --mcp) for clients that prefer HTTP transport.
| Tool | Arguments | Description |
|---|---|---|
| memory_search | query, top_k?, namespace? | Hybrid semantic + full-text search |
| memory_store | subject, predicate, object, category, namespace? | Store a semantic fact |
| memory_facts | subject, namespace? | All facts about a subject (optional namespace filter) |
| memory_episodes | limit? | Recent conversation history |
| user_profile | — | Current user configuration |
| memory_procedures | action, trigger?, steps?, procedure_id? | Manage learned workflows (list / store / delete) |
MCP stdio passes auth in the _meta field of every request:
{
"method": "tools/call",
"params": {
"_meta": { "x-api-key": "your-key" },
"name": "memory_search",
"arguments": { "query": "dark mode" }
}
}

MCP over HTTP uses the x-api-key header.
Default port: 19789. All /v1/* routes require Authorization: Bearer <key>.
# Health check (no auth)
curl http://localhost:19789/health
# Prometheus metrics (no auth)
curl http://localhost:19789/metrics
# Web UI — diagnostic tool (no auth)
open http://localhost:19789/ui
# OpenAPI spec
curl http://localhost:19789/openapi.json
# Swagger UI
open http://localhost:19789/api
# Store a fact (only "content" is required; source/sender/namespace/agent are optional)
curl -X POST http://localhost:19789/v1/signals \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content":"I prefer dark mode"}'
# Search memory
curl -X POST http://localhost:19789/v1/memory/search \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"query":"UI preferences","top_k":5}'
# List all facts
curl http://localhost:19789/v1/memory/facts \
-H "Authorization: Bearer YOUR_API_KEY"
# Namespace statistics
curl http://localhost:19789/v1/memory/namespaces \
-H "Authorization: Bearer YOUR_API_KEY"
# SSE stream of proactive notifications (open loop reminders, habit nudges)
curl -N http://localhost:19789/v1/events \
  -H "Authorization: Bearer YOUR_API_KEY"

| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | No | Liveness check |
| GET | /metrics | No | Prometheus metrics |
| GET | /ui | No | Browser UI (diagnostic) |
| GET | /openapi.json | No | OpenAPI spec |
| GET | /api | No | Swagger UI |
| POST | /v1/signals | Yes | Submit a signal |
| GET | /v1/signals/:id | Yes | Poll cached response |
| POST | /v1/memory/search | Yes | Hybrid semantic search |
| GET | /v1/memory/facts | Yes | List all facts |
| GET | /v1/memory/namespaces | Yes | Namespace stats |
| GET | /v1/events | Yes | SSE stream of proactive notifications |
brain start launches all adapters together. They share a single processor so memory is consistent across all protocols.
| Adapter | Default Port | Notes |
|---|---|---|
| HTTP REST | 19789 | REST API + Web UI + Swagger + OpenAPI |
| WebSocket | 19790 | Bidirectional streaming, real-time |
| MCP HTTP | 19791 | MCP over HTTP transport |
| gRPC | 19792 | Protobuf RPC + server streaming |
| MCP stdio | stdin/stdout | brain mcp for subprocess MCP clients |
| Adapter | Auth | Namespace Input | Streaming | Memory Semantics |
|---|---|---|---|---|
| HTTP | Bearer API key | namespace on /v1/signals and /v1/memory/search | Request/response | Shared semantic+episodic stores |
| WebSocket | First frame api_key | namespace in each message | Bidirectional socket | Shared semantic+episodic stores |
| gRPC | Interceptor (x-api-key or Bearer metadata) | namespace on signal/search/store requests | Server streaming (ReceiveSignals, StreamSignals) | Shared semantic+episodic stores |
| MCP (stdio/http) | _meta.x-api-key / x-api-key header | Tool args (memory_store, memory_search, memory_facts) | JSON-RPC request/response | Shared semantic+episodic stores |
For development, brain serve runs everything in the foreground with optional flags:
brain serve # all adapters (foreground)
brain serve --http # HTTP only
brain serve --http --ws # HTTP + WebSocket
brain serve --mcp            # MCP HTTP only

Scope facts and episodes to a context. The default namespace is "personal".
# Store a project-specific fact
curl -X POST http://localhost:19789/v1/signals \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content":"use bun not npm","namespace":"my-project"}'
# Search only within that namespace
curl -X POST http://localhost:19789/v1/memory/search \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
  -d '{"query":"package manager","namespace":"my-project"}'

Brain is a local service — it does not reach outward to external messaging platforms. Instead, a thin external bridge connects a platform-specific bot or gateway to Brain's WebSocket API and translates messages in both directions.
External Platform Bridge (your code / external repo) Brain OS
──────────────────── ────────────────────────────────── ────────────────
Slack / Telegram ────► BridgeClient (crates/bridge library) ──► ws://localhost:19790
Custom chat agent exponential-backoff reconnection SignalProcessor
Any WebSocket bot thin message translation memory + LLM
The crates/bridge/ library provides a BridgeClient for building these relays. It handles reconnection with exponential backoff, ping/pong keep-alive, and JSON message serialization automatically. No platform-specific code lives inside Brain itself.
A minimal bridge connecting an external gateway to Brain:
// In your own external relay project — not inside the Brain OS repo
use bridge::{BridgeClient, BridgeConfig, BridgeMessage};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Connect to YOUR gateway (e.g. a Slack bot WebSocket endpoint)
let client = BridgeClient::new(
"ws://your-gateway.example.com/brain-relay",
BridgeConfig::default(), // exponential backoff: 1s → 2s → 4s → … → 60s
);
// For each inbound message from the gateway, forward to Brain and relay the response
    // `call_brain_ws` is a placeholder for a helper you implement: send the
    // content to Brain's WebSocket synapse (ws://localhost:19790) and return the reply.
    client.connect_and_relay(|msg| async move {
        BridgeMessage::reply(&msg, call_brain_ws(&msg.content).await)
    }).await?;
Ok(())
}

Brain's WebSocket API (ws://localhost:19790) is the entry point — the bridge is external and lives in its own repository. This keeps Brain small, stable, and protocol-agnostic.
Brain provides a built-in brain bridge command that simplifies connecting external gateways:
# Connect to an external WebSocket gateway
brain bridge ws://localhost:8080/gateway
# With custom API key
brain bridge ws://localhost:8080/gateway --api-key YOUR_KEY

The bridge command:
- Connects to your external WebSocket gateway
- Connects to Brain's WebSocket synapse internally
- Relays messages bidirectionally between the gateway and Brain
- Automatically handles reconnection with exponential backoff
This is useful for quickly testing bridge connections or for simple relay setups without writing custom code.
brain serve and brain start spawn background tasks alongside the protocol adapters, sharing the same SignalProcessor:
Runs every 24 hours. Uses an Ebbinghaus forgetting curve to prune low-retention episodes and promote frequently-reinforced episodes to permanent semantic facts with an idempotency guard.
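The retention math can be sketched with simple exponential decay. The stability model below (24-hour base scaled by reinforcement count) is a hypothetical illustration, not Brain's exact parameters:

```rust
/// Ebbinghaus-style retention sketch: R = exp(-elapsed / stability).
/// Reinforced episodes decay more slowly (assumed stability model).
fn retention(elapsed_hours: f64, reinforcements: u32) -> f64 {
    let stability = 24.0 * (1.0 + reinforcements as f64);
    (-elapsed_hours / stability).exp()
}

fn main() {
    // Never reinforced: after 72h, retention = exp(-3) ≈ 0.0498,
    // which falls below a 0.05 pruning threshold.
    let stale = retention(72.0, 0);
    // Reinforced twice: stability triples, exp(-1) ≈ 0.368 — it survives
    // and is a candidate for promotion to a permanent semantic fact.
    let reinforced = retention(72.0, 2);
    assert!(stale < 0.05 && reinforced > 0.05);
}
```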
memory:
consolidation:
enabled: true # on by default
interval_hours: 24
    forgetting_threshold: 0.05 # episodes with retention < 5% are pruned

When enabled, Brain becomes bidirectional — it proactively reminds you of things instead of only responding when asked.
Habit Detection — scans episodic memory for recurring patterns (keyword × day-of-week × hour histograms) and nudges you when a pattern matches the current time slot.
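The histogram idea can be sketched as counting (keyword, weekday, hour) tuples and flagging slots that recur. Function names and the threshold are illustrative assumptions:

```rust
use std::collections::HashMap;

/// Count episodes per (keyword, weekday, hour) slot; slots whose count
/// reaches the threshold are treated as habits (sketch).
fn habit_slots(episodes: &[(&str, u8, u8)], threshold: usize) -> Vec<(String, u8, u8)> {
    let mut hist: HashMap<(String, u8, u8), usize> = HashMap::new();
    for &(kw, weekday, hour) in episodes {
        *hist.entry((kw.to_string(), weekday, hour)).or_insert(0) += 1;
    }
    hist.into_iter()
        .filter(|&(_, count)| count >= threshold)
        .map(|(slot, _)| slot)
        .collect()
}

fn main() {
    // "standup" mentioned three Mondays in a row at 09:00 -> a habit;
    // a one-off "dentist" mention is not.
    let episodes = [
        ("standup", 1, 9), ("standup", 1, 9), ("standup", 1, 9),
        ("dentist", 3, 14),
    ];
    let habits = habit_slots(&episodes, 3);
    assert_eq!(habits, vec![("standup".to_string(), 1, 9)]);
}
```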
Open-Loop Detection — scans for unresolved commitments ("I need to...", "remind me to...", "I should...") and generates reminders when no resolution is found within the configured window.
Delivery — proactive messages are delivered through three tiers:
- Outbox — written to SQLite, drained on the next brain chat session
- Broadcast — pushed to live WebSocket and SSE (GET /v1/events) sessions
- Webhooks — pushed to configured messaging channels (Slack, Discord, Telegram, etc.)
proactivity:
enabled: false # opt-in; set to true to activate
max_per_day: 5
min_interval_minutes: 60
quiet_hours:
start: "22:00"
end: "08:00"
delivery:
outbox: true # always write to outbox; drain on next interaction
broadcast: true # push to live WS/SSE sessions
webhook_channels: [] # channel keys from actions.messaging.channels
max_outbox_age_days: 7
open_loop:
enabled: true # detect unresolved commitments (requires proactivity.enabled)
scan_window_hours: 72
resolution_window_hours: 24
    check_interval_minutes: 120

Every signal can carry an agent field identifying the originating AI tool (e.g. "claude-code", "cursor"). Agent identity flows through the entire pipeline — recall, habit detection, and proactive messages reference the originating agent when known.
curl -X POST http://localhost:19789/v1/signals \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
  -d '{"content":"deploy staging server","agent":"devops-agent"}'

Action intents routed by the Thalamus (web_search, schedule_task, send_message) are handled by internal ActionDispatcher backends. These are internal-only — no public HTTP or gRPC endpoints expose them directly.
Behavior contract:
- Disabled in config → explicit "disabled by config" error
- Enabled but backend missing → explicit "backend not configured" error
- Backend configured → real execution with structured success output
actions:
web_search:
enabled: true
provider: "searxng" # searxng | tavily | custom
endpoint: "http://localhost:8888" # SearXNG instance URL
api_key: "" # required for tavily
timeout_ms: 3000
    default_top_k: 5

| Provider | Auth | Self-hosted | Setup |
|---|---|---|---|
| searxng | None | Yes | brain deps up (or docker run -d -p 8888:8080 searxng/searxng) |
| tavily | API key (free, no credit card) | No | Sign up at tavily.com, set api_key |
| custom | None | — | Set endpoint to any OpenAI-compatible JSON search API |
Brain sends messages via configurable webhook URLs. Any service that accepts HTTP POST works — Slack, Discord, Telegram, ntfy.sh, or a custom endpoint. No platform SDK is bundled.
actions:
messaging:
enabled: true
timeout_ms: 3000
channels:
slack:
url: "https://hooks.slack.com/services/T/B/x"
body: '{"text": "{{content}}"}'
discord:
url: "https://discord.com/api/webhooks/123/abc"
body: '{"content": "[{{channel}}] {{content}}"}'
telegram:
url: "https://api.telegram.org/bot<TOKEN>/sendMessage"
body: '{"chat_id": "<ID>", "text": "{{content}}"}'
headers:
Content-Type: "application/json"
      simple: "https://example.com/hook" # shorthand: URL only, default JSON body

Template placeholders: {{content}}, {{channel}}, {{recipient}}, {{namespace}}, {{timestamp}}. Values are JSON-escaped automatically. Custom headers are optional (useful for auth-requiring APIs like Telegram).
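The substitution and escaping behavior can be sketched as follows. The function names and the minimal escape set are illustrative assumptions, not Brain's internals:

```rust
/// Minimal JSON string escaping (sketch: quotes, backslashes, newlines).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            _ => out.push(c),
        }
    }
    out
}

/// Replace {{key}} tokens in a webhook body template with escaped values.
fn render_body(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, val) in vars {
        out = out.replace(&format!("{{{{{}}}}}", key), &json_escape(val));
    }
    out
}

fn main() {
    let body = render_body(
        r#"{"text": "{{content}}"}"#,
        &[("content", "ship \"v2\" today")],
    );
    // Quotes inside the value are escaped, so the body stays valid JSON.
    assert_eq!(body, r#"{"text": "ship \"v2\" today"}"#);
}
```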
All HTTP backends (web search + messaging) share a retry and circuit breaker configuration:
actions:
resilience:
max_retries: 2 # retries on 5xx / timeout / connection refused
retry_base_ms: 500 # exponential backoff: 500 → 1000 → 2000ms
circuit_breaker_threshold: 5 # consecutive failures before circuit opens
    circuit_breaker_cooldown_secs: 60 # seconds before retrying after circuit opens

4xx errors (auth, bad request) fail immediately without retries. When a circuit opens, all requests to that backend return an instant error until the cooldown elapses.
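The retry schedule and breaker state machine can be sketched like this. Struct and function names are illustrative, not the actual implementation:

```rust
/// Exponential backoff delays: base * 2^attempt, matching the documented
/// 500 -> 1000 -> 2000 ms progression for retry_base_ms: 500.
fn backoff_delays_ms(max_retries: u32, base_ms: u64) -> Vec<u64> {
    (0..=max_retries).map(|attempt| base_ms << attempt).collect()
}

/// Opens after N consecutive failures; any success resets the count.
struct CircuitBreaker {
    consecutive_failures: u32,
    threshold: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { consecutive_failures: 0, threshold }
    }
    fn record(&mut self, success: bool) {
        if success {
            self.consecutive_failures = 0;
        } else {
            self.consecutive_failures += 1;
        }
    }
    fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }
}

fn main() {
    assert_eq!(backoff_delays_ms(2, 500), vec![500, 1000, 2000]);
    let mut breaker = CircuitBreaker::new(5);
    for _ in 0..5 {
        breaker.record(false);
    }
    assert!(breaker.is_open()); // 5 consecutive failures open the circuit
}
```

A real implementation would also track the cooldown timestamp so the circuit half-opens after circuit_breaker_cooldown_secs; that is omitted here for brevity.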
scheduling:
enabled: false
  mode: "persist_only" # intents stored in SQLite; background poller fires due intents

When scheduling.enabled: true, a background task in brain serve polls every 60 seconds for pending intents and delivers them as proactive notifications via the NotificationRouter.
Back up and restore all memory:
# Export to stdout (pipe to file)
brain export > backup.json
# Export directly to file
brain export --output backup.json
# Preview what an import would do (dry-run)
brain import backup.json --dry-run
# Import from backup
brain import backup.json

The export format is a self-contained JSON file containing all facts and episodes with timestamps, importance scores, and namespace labels. Import is idempotent — re-importing the same backup is safe.
| Adapter | Method |
|---|---|
| HTTP REST | Authorization: Bearer <key> |
| WebSocket | First frame: {"api_key":"<key>"} |
| MCP HTTP | x-api-key: <key> header |
| MCP stdio | params._meta["x-api-key"] |
| gRPC | Interceptor checks x-api-key or authorization metadata |
Configure keys in ~/.brain/config.yaml:
access:
api_keys:
- key: "your-secret-key"
name: "Production Key"
permissions: [read, write]
- key: "readonly-key"
name: "Read Only"
      permissions: [read]

brain init generates a unique API key (prefixed brk_) and prints it to the terminal. Find your key in ~/.brain/config.yaml under access.api_keys.
Config is loaded from three sources (highest priority wins):
- Environment variables — BRAIN_LLM__MODEL=gpt-4o brain serve
- User config — ~/.brain/config.yaml
- Defaults — crates/core/default.yaml
Double-underscore (__) is the nesting separator in env var names.
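The mapping from an environment variable to a nested config key can be sketched as stripping the BRAIN_ prefix and splitting on the separator. The function name is a hypothetical illustration:

```rust
/// Map an env var name to a nested config path:
/// BRAIN_LLM__MODEL -> ["llm", "model"]. Non-BRAIN_ vars are ignored.
fn env_to_path(var: &str) -> Option<Vec<String>> {
    var.strip_prefix("BRAIN_")
        .map(|rest| rest.split("__").map(|seg| seg.to_lowercase()).collect())
}

fn main() {
    assert_eq!(
        env_to_path("BRAIN_LLM__MODEL"),
        Some(vec!["llm".to_string(), "model".to_string()])
    );
    assert_eq!(env_to_path("PATH"), None);
}
```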
llm:
provider: "ollama" # ollama | openai
model: "qwen2.5-coder:7b"
base_url: "http://localhost:11434"
api_key: "" # required for openai provider
temperature: 0.7
max_tokens: 4096
  intent_llm_fallback: false # enable LLM fallback when regex intent classification is uncertain

To use OpenAI or OpenRouter:
llm:
provider: "openai"
base_url: "https://api.openai.com/v1"
api_key: "sk-..."
  model: "gpt-4o"

The api_key can also be set via the BRAIN_LLM__API_KEY environment variable (takes precedence over the config file). Both the LLM and embedding providers use this key.
embedding:
model: "nomic-embed-text" # must be pulled: `ollama pull nomic-embed-text`
  dimensions: 768 # must match the model output size

For OpenAI-compatible embeddings:
embedding:
model: "text-embedding-3-small"
  dimensions: 1536

# Generate a salt and enable encryption
brain init --encrypt

Then set encryption.enabled: true in ~/.brain/config.yaml and provide a passphrase:
# Via environment variable (for daemon/CI)
BRAIN_PASSPHRASE="your-passphrase" brain serve
# Or Brain will prompt interactively on startup
brain serve

Note: When encryption is enabled, the FTS5 full-text search index cannot operate on encrypted content. Keyword search (BM25) returns no results — hybrid search relies entirely on vector similarity (HNSW ANN). Search still works, but recall quality may be lower for keyword-heavy queries.
~/.brain/
├── config.yaml # User configuration (overrides defaults)
├── db/
│ ├── brain.db # SQLite — facts, episodes, procedures, FTS5 index
│ └── salt # Encryption salt (only if --encrypt was used)
├── ruvector/ # HNSW vector index files (ruvector-core)
├── logs/
│ └── brain.log # Daemon logs
└── exports/ # Export output directory
# Regenerate config with a new API key (data directories are preserved)
brain init --force
# Also enable encryption
brain init --force --encrypt

--force overwrites ~/.brain/config.yaml with defaults and a fresh API key. Your database, vector index, and exports remain untouched.
git clone https://github.com/keshavashiya/brain.git
cd brain
# Build the workspace
cargo build
# Run tests
cargo test
# Run the CLI in development
cargo run -p brainos -- chat "hello"
# Run the server in foreground (all adapters)
cargo run -p brainos -- serve
# Run specific adapters only
cargo run -p brainos -- serve --http --mcp

The project is a Cargo workspace with 15 crates. All internal dependencies use both path (for local development) and version (for crates.io), so no Cargo.toml changes are needed to switch between local and published builds.
crates/
├── core/ # brainos-core — Config and bootstrapping
├── storage/ # brainos-storage — SQLite + HNSW vector index
├── hippocampus/ # brainos-hippocampus — Episodic + semantic memory
├── cortex/ # brainos-cortex — LLM providers + context assembly
├── thalamus/ # brainos-thalamus — Intent classification
├── amygdala/ # brainos-amygdala — Importance scoring
├── signal/ # brainos-signal — Central signal processor
├── cerebellum/ # brainos-cerebellum — Procedural memory
├── ganglia/ # brainos-ganglia — Proactivity engine
├── bridge/ # brainos-bridge — WebSocket relay client
├── adapters/
│ ├── http/ # brainos-httpadapter — Axum REST API
│ ├── ws/ # brainos-wsadapter — WebSocket adapter
│ ├── grpc/ # brainos-grpcadapter — gRPC adapter
│ └── mcp/ # brainos-mcp — MCP adapter
└── cli/ # brainos (binary: brain) — CLI entry point
Crates must be published in dependency order (leaf crates first). All crates are on crates.io under the brainos-* namespace.
MIT