Self-hosted semantic memory for AI assistants via MCP (Model Context Protocol).
Store, search, and manage persistent thoughts with vector embeddings — fully local, zero cloud dependencies.
- Semantic search across stored thoughts using cosine similarity
- Automatic embeddings via local Ollama (nomic-embed-text)
- LLM metadata extraction — auto-tags, categorizes, and summarizes thoughts
- Encryption at rest — optional AES-256-GCM for all stored content
- OAuth 2.1 + PKCE — browser-based auth for Claude Desktop and similar clients, persisted in PostgreSQL (survives restarts)
- API key auth — simple header-based auth for Claude Code and CLI tools
- Dual transport — stdio (local pipe) and Streamable HTTP (network access)
- SSE keepalive — stable long-lived connections through proxies and firewalls
- Auto re-embed — recovers embeddings after Ollama downtime (on startup + every 10 minutes)
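To illustrate what "semantic search using cosine similarity" means in practice, here is a minimal, self-contained sketch of ranking by cosine similarity. This is illustrative only — Braimory performs this ranking inside PostgreSQL via pgvector, and the tiny 3-dimensional vectors below stand in for real 768-dimensional embeddings.

```python
# Sketch: ranking stored thoughts by cosine similarity (illustrative only;
# Braimory does this in PostgreSQL via pgvector, not in application code).
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional "embeddings" (real ones are 768-dimensional).
thoughts = {
    "buy groceries": [0.9, 0.1, 0.0],
    "refactor the auth module": [0.1, 0.9, 0.2],
}
query = [0.2, 0.8, 0.1]  # pretend embedding of "fix the login code"

ranked = sorted(thoughts, key=lambda t: cosine_similarity(query, thoughts[t]), reverse=True)
print(ranked[0])  # → refactor the auth module (closest in meaning, not wording)
```

The point: the query shares no words with the winning thought — similarity is computed between embedding vectors, not text.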
Any MCP-compatible client works:
- Claude Code
- Claude Desktop
- Cursor
- Windsurf
- Continue
- Any app speaking MCP over stdio or HTTP
- Docker and Docker Compose
- Ollama running locally with an embedding model pulled
# 1. Clone and enter the directory
git clone https://github.com/biofers/braimory.git
cd braimory
# 2. Pull the required Ollama models
ollama pull nomic-embed-text # embeddings (vector search)
ollama pull llama3.2 # metadata extraction (auto-tags, summaries)
# 3. Create your .env
cp .env.example .env
# Edit .env — at minimum set BRAIMORY_DB_PASSWORD and MCP_ACCESS_KEY
# 4. Build and start
docker compose up -d --build
# 5. Verify
curl http://localhost:3100/health
# Should return: {"status":"ok","db":"connected","ollama":"connected"}

Braimory supports two auth methods depending on the client:
| Client | Auth method | Needs public URL? | Needs HTTPS? |
|---|---|---|---|
| Claude Code / Cursor / CLI tools | API key (x-brain-key header) | No | No |
| Claude Desktop / browser-based | OAuth 2.1 + PKCE (login page) | Yes | Yes |
The simplest setup. Braimory runs on the same machine as Claude Code.
.env requires: BRAIMORY_DB_PASSWORD, MCP_ACCESS_KEY
claude mcp add -t http -s user \
-H "x-brain-key: YOUR_MCP_ACCESS_KEY" \
braimory http://localhost:3100/mcp

Braimory runs on a server (e.g. 192.168.1.50), Claude Code runs on your workstation. Both on the same network.
.env requires: same as Scenario A
claude mcp add -t http -s user \
-H "x-brain-key: YOUR_MCP_ACCESS_KEY" \
braimory http://192.168.1.50:3100/mcp

No HTTPS needed — Claude Code connects directly from your machine.
This is the most powerful setup. Once configured, Braimory appears under your claude.ai account and is available across the entire Claude ecosystem — Claude Desktop, claude.ai web, Claude mobile, and Claude Code CLI. You'll see it listed as a claude.ai MCP server, accessible from any device where you're logged into Claude.
Claude Desktop does NOT connect directly from your machine — Anthropic's servers connect to your MCP endpoint. This means it must be publicly reachable over HTTPS. Plain HTTP or localhost will not work.
Step 1: Expose your server with HTTPS using one of:
- Reverse proxy with SSL (Nginx Proxy Manager, Caddy, Traefik) — point a domain to your server and terminate TLS
- Tailscale Funnel — `tailscale funnel 3100` (provides HTTPS automatically)
- Cloudflare Tunnel — `cloudflared tunnel` (HTTPS via Cloudflare)
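As a sketch of the reverse-proxy option, a minimal Caddyfile like the following would terminate TLS and forward to Braimory (the domain is a placeholder for your own; Caddy obtains certificates automatically):

```
# Caddyfile — illustrative reverse-proxy config; replace the domain with yours
brain.example.com {
    reverse_proxy localhost:3100
}
```

Remember to also set TRUST_PROXY=true in .env when fronting Braimory with any reverse proxy.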
Step 2: Configure .env:
OAUTH_USERNAME=your_username
OAUTH_PASSWORD=a_strong_password
OAUTH_ISSUER_URL=https://brain.example.com # your public HTTPS URL
TRUST_PROXY=true # if behind a reverse proxy

Step 3: Add the MCP server in Claude Desktop:
Open Claude Desktop → Settings → MCP Servers → Add → paste your URL:
https://brain.example.com/mcp
Alternatively, you can edit claude_desktop_config.json manually, but the file location varies by OS and Claude Desktop may not pick up manual changes reliably. Using the UI is recommended.

{
  "mcpServers": {
    "braimory": { "url": "https://brain.example.com/mcp" }
  }
}
On first use, Claude Desktop will open your browser to the Braimory login page. This page IS the OAuth authorization flow — enter the OAUTH_USERNAME and OAUTH_PASSWORD from your .env. After login, a JWT token is issued and Claude uses it automatically for all future requests.
Do I still need MCP_ACCESS_KEY? Yes — even with OAuth-only usage. It serves as the JWT signing secret. Without it (or with one shorter than 64 hex chars), the secret is randomly generated on each startup — existing access tokens become invalid and clients must re-authenticate via refresh token. OAuth clients and refresh tokens are persisted in PostgreSQL and survive restarts, but a stable MCP_ACCESS_KEY avoids unnecessary token churn. It also protects your public endpoint from unauthorized access outside OAuth.
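Why a regenerated signing secret invalidates old tokens can be seen in a minimal HS256 sketch. This is not Braimory's actual implementation — just the mechanics: a token signed under one secret fails verification under any other.

```python
# Minimal HS256 JWT sketch (illustrative; not Braimory's actual code) showing
# why a freshly generated signing secret rejects previously issued tokens.
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify(token: str, secret: bytes) -> bool:
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)

stable = b"example-stable-secret"      # what a fixed MCP_ACCESS_KEY provides
token = sign({"sub": "user"}, stable)
print(verify(token, stable))           # True  — token survives a restart
print(verify(token, b"fresh-random"))  # False — regenerated secret rejects it
```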
Can I use A/B and C together? Yes — both auth methods work simultaneously on the same server. Claude Code sends the API key header, Claude Desktop sends OAuth tokens. A common setup is both: Claude Code for CLI work + Claude Desktop for the full ecosystem.
Security trade-off: Scenario C requires exposing a port to the public internet. Braimory includes OAuth 2.1 + PKCE, brute-force protection, constant-time credential checks, and optional encryption at rest — but no publicly exposed service is ever 100% secure. You are opening a door to your network. Evaluate whether the convenience of the full Claude ecosystem is worth the added attack surface for your setup. Scenarios A/B avoid this entirely by staying local.
Runs Braimory as a child process with no HTTP server — the MCP client communicates directly via stdin/stdout. Most users won't need this — Scenarios A/B/C cover typical setups. Stdio is mainly useful for:
- Development and debugging — test code changes without rebuilding the Docker image
- Environments without Docker — run the Node.js process directly when Docker isn't available or desired
Requires building the TypeScript manually and managing environment variables yourself.
cd mcp-server && npm install && npm run build && cd ..
claude mcp add -t stdio braimory -- node /path/to/mcp-server/dist/index.js --transport stdio

Set DATABASE_URL and optionally OLLAMA_URL as environment variables. No auth needed — stdio trusts the local pipe.
| Tool | Description |
|---|---|
| search_thoughts | Semantic search by meaning (cosine similarity) |
| browse_recent | Chronological browsing with time/source filters |
| capture_thought | Store a new thought with auto-embedding and metadata |
| update_thought | Update content or tags (re-embeds automatically) |
| delete_thought | Permanently delete by UUID |
| stats_overview | Database statistics: counts, tags, sources, coverage |
| import_memory_graph | Import from MCP memory plugin knowledge graph |
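For reference, an MCP client invokes these tools with a JSON-RPC tools/call request. The envelope below follows the MCP specification; the argument name ("query") is an assumption for illustration, not Braimory's documented schema.

```python
# Shape of an MCP tools/call request a client would send (envelope per the
# MCP spec; the "query" argument name is a hypothetical example).
import json

call = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_thoughts",
        "arguments": {"query": "notes about docker networking"},
    },
}
wire = json.dumps(call)
print(wire)
```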
All configuration is via environment variables in .env:
| Variable | Required | Description |
|---|---|---|
| BRAIMORY_DB_PASSWORD | Yes | PostgreSQL password |
| MCP_ACCESS_KEY | Recommended | API key for MCP access (exactly 64 hex chars). If empty, all endpoints are open. Also used as the OAuth JWT signing secret — without it, access tokens invalidate on restart (refresh tokens persist in PostgreSQL) |
| ENCRYPTION_KEY | No | AES-256 key for encryption at rest (exactly 64 hex chars). Any other length silently disables encryption |
| OAUTH_USERNAME | No | Username for OAuth login (both username and password must be set together) |
| OAUTH_PASSWORD | No | Password for OAuth login (both username and password must be set together) |
| OLLAMA_URL | No | Ollama API URL (default: http://host.docker.internal:11434) |
| EMBEDDING_MODEL | No | Ollama embedding model (default: nomic-embed-text) |
| EMBEDDING_DIMENSIONS | No | Embedding vector size (default: 768) |
| LLM_MODEL | No | Ollama model for metadata extraction (default: llama3.2) |
| MCP_HTTP_PORT | No | HTTP server port (default: 3100) |
| DB_PORT | No | Exposed PostgreSQL port (default: 5433, commented out — uncomment in docker-compose.yml to enable) |
| DB_POOL_SIZE | No | PostgreSQL connection pool size (default: 10) |
| TRUST_PROXY | No | Trust X-Forwarded-For for IP detection (default: false) |
| OAUTH_ISSUER_URL | No | Public URL for OAuth metadata discovery |
┌─────────────────┐ MCP (HTTP/stdio) ┌──────────────────┐
│ Claude Code / │ ◄─────────────────────► │ braimory-mcp │
│ Claude Desktop │ │ (Node.js) │
│ Cursor / etc. │ └───────┬──────────┘
└─────────────────┘ │
├──► PostgreSQL + pgvector
└──► Ollama (embeddings + LLM)
- API key auth (Scenarios A/B): Claude Code sends the x-brain-key header with every request — checked against MCP_ACCESS_KEY
- OAuth 2.1 + PKCE (Scenario C): Claude Desktop can't send custom headers, so it uses OAuth. The login page you see IS the OAuth authorization step — after login, a JWT token is issued and used automatically. OAuth clients and refresh tokens are persisted in PostgreSQL (survive container restarts). Includes brute-force protection (per-IP rate limiting, global lockout after 50 failed attempts)
- Encryption at rest: Optional AES-256-GCM with per-field IV
- Constant-time comparisons: All credential checks use timing-safe operations
- No cloud dependencies: Everything runs locally
- Redirect URI validation: OAuth auto-registration only allows localhost and claude.ai redirects
- Input size limits: 50KB max per thought, 1MB max request body
- Session limits: Max 100 concurrent sessions, idle sessions purged after 30 minutes
- Reverse proxy: Set TRUST_PROXY=true only if behind a reverse proxy — this affects IP detection for rate limiting, never bypasses authentication
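To make "constant-time comparisons" concrete, here is a small sketch (not Braimory's actual code) of a timing-safe key check. A naive string equality can short-circuit on the first mismatching character, leaking how much of the key an attacker has guessed through response timing; hmac.compare_digest always examines the full input.

```python
# Illustrative timing-safe API-key check (not Braimory's actual implementation).
import hmac

def check_key(provided: str, expected: str) -> bool:
    # A naive `provided == expected` can return early on the first mismatch,
    # leaking match length via timing. compare_digest takes constant time.
    return hmac.compare_digest(provided.encode(), expected.encode())

print(check_key("a" * 64, "a" * 64))        # True
print(check_key("a" * 63 + "b", "a" * 64))  # False
```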
The default models work well for most use cases, but you can swap them.
Every embedding model produces vectors of a fixed size. You need this number for EMBEDDING_DIMENSIONS. Three ways to find it:
1. Check the model page on ollama.com/library — search for the model, the description or tags will mention the dimension count.
2. Ask Ollama directly — after pulling the model, run:
ollama show nomic-embed-text
# Look for "embedding_length" in the parameters output

3. Generate a test embedding and count — quick and definitive:
curl -s http://localhost:11434/api/embed \
-d '{"model":"nomic-embed-text","input":"test"}' | python3 -c "import sys,json; print(len(json.load(sys.stdin)['embeddings'][0]))"
# Output: 768

| Model | Dimensions | Size | Notes |
|---|---|---|---|
| nomic-embed-text (default) | 768 | ~274MB | Good balance of quality and speed |
| all-minilm | 384 | ~45MB | Smallest and fastest, lower quality |
| mxbai-embed-large | 1024 | ~670MB | Higher quality, slower |
| snowflake-arctic-embed | 1024 | ~670MB | Strong multilingual support |
The LLM_MODEL is used to auto-tag and summarize thoughts. Any Ollama chat model works:
| Model | Size | Notes |
|---|---|---|
| llama3.2 (default) | ~2GB | Fast, good quality |
| mistral | ~4GB | Strong reasoning |
| phi3 | ~2.3GB | Compact, good for low-RAM systems |
| gemma2 | ~5GB | Google's model, strong general quality |
Changing the LLM is simpler than the embedding model — no database changes needed:
# 1. Pull the new model
ollama pull mistral
# 2. Update .env
LLM_MODEL=mistral
# 3. Rebuild
docker compose down && docker compose up -d --build

New thoughts will use the new model for tagging and summarization. Existing thoughts keep their original metadata.
# 1. Pull the new model
ollama pull mxbai-embed-large
# 2. Check its dimensions
ollama show mxbai-embed-large # look for embedding_length → 1024
# 3. Update .env
EMBEDDING_MODEL=mxbai-embed-large
EMBEDDING_DIMENSIONS=1024
# 4. Update init-db.sql — change VECTOR(768) to VECTOR(1024)
# This only affects new databases. See step 6 for existing data.
# 5. Rebuild
docker compose down && docker compose up -d --build
# 6. Handle existing data (pick one):
# Option A: Start fresh (easiest)
docker compose down -v # WARNING: deletes all data
docker compose up -d --build
# Option B: Keep data, re-embed (manual SQL)
docker exec -it braimory-db psql -U braimory -d braimory -c "
ALTER TABLE thoughts DROP COLUMN embedding;
ALTER TABLE thoughts ADD COLUMN embedding vector(1024);
"
# Then restart — the re-embed job will regenerate all embeddings automatically

Important: EMBEDDING_DIMENSIONS in .env, VECTOR(...) in init-db.sql, and the actual model output must all match. A mismatch will cause insert errors.
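A quick sanity check before restarting can catch a mismatch early. This sketch uses inline strings in place of the real files (substitute the contents of your own .env and init-db.sql):

```python
# Sketch: verify EMBEDDING_DIMENSIONS and the VECTOR(...) column agree.
# The strings below stand in for the contents of .env and init-db.sql.
import re

env_text = "EMBEDDING_DIMENSIONS=1024\n"  # e.g. open(".env").read()
sql_text = "embedding VECTOR(1024)"       # e.g. open("init-db.sql").read()

env_dims = int(re.search(r"EMBEDDING_DIMENSIONS=(\d+)", env_text).group(1))
sql_dims = int(re.search(r"VECTOR\((\d+)\)", sql_text, re.IGNORECASE).group(1))

assert env_dims == sql_dims, f"mismatch: .env says {env_dims}, schema says {sql_dims}"
print("dimensions consistent:", env_dims)
```

The third value — the model's actual output size — still has to be checked against Ollama itself, as shown in the dimension-finding steps above.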
When ENCRYPTION_KEY is set:
- All thought content is encrypted with AES-256-GCM before storage
- Metadata summaries are also encrypted
- Embeddings remain unencrypted (required for vector search)
- Existing unencrypted thoughts are auto-migrated on startup
Generate a key: openssl rand -hex 32
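If openssl isn't available, an equivalent 64-hex-char key (for ENCRYPTION_KEY or MCP_ACCESS_KEY) can be generated with Python's standard library:

```python
# Equivalent of `openssl rand -hex 32`: 32 random bytes -> 64 hex characters.
import secrets

key = secrets.token_hex(32)
print(key)
print(len(key))  # 64
```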
MIT