A personal knowledge base with built-in usage tracking that connects to Claude.ai via MCP.
Second Brain ingests your emails, calendar events, documents, messages, notes, and browser history into a unified vector database. It exposes an MCP (Model Context Protocol) server so Claude can search your personal data during conversations. Built-in query logging and an auto-curator track what you search for, how results perform, and which sources are pulling their weight.
┌─────────────────────────────────────────────────────────────────┐
│                          Data Sources                           │
│  Gmail · Calendar · OneDrive · iMessage · Notes · Chrome · PDF  │
└──────────────────────────┬──────────────────────────────────────┘
                           │ Ingestion (scheduled)
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Ingestion Pipeline                        │
│  Chunking → Embedding (all-MiniLM-L6-v2) → ChromaDB (cosine)    │
│  Auto-tagging · Deduplication · Checkpoint tracking             │
└──────────────────────────┬──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                       ChromaDB Vector DB                        │
│  15 collections · ~384-dim embeddings · persistent on disk      │
│  emails · documents · calendar · messages · notes · manual · …  │
└──────────────┬───────────────────────────────┬──────────────────┘
               │                               │
               ▼                               ▼
┌──────────────────────────┐   ┌──────────────────────────────────┐
│  MCP Server (:8420)      │   │ Observability & Quality Signals  │
│  FastAPI + JSON-RPC      │   │                                  │
│  OAuth 2.0 for Claude.ai │   │ Feedback DB (SQLite)             │
│  8 tools:                │   │   ↳ logs every query + results   │
│    · query_second_brain  │   │   ↳ user ratings (thumbs up/down)│
│    · get_recent_entries  │   │                                  │
│    · get_entries_about_  │   │ Auto-Curator (daily)             │
│      person              │   │   ↳ source quality rankings      │
│    · get_project_status  │   │   ↳ stale chunk detection        │
│    · add_entry           │   │   ↳ query pattern analysis       │
│    · get_brain_status    │   │   ↳ health reports               │
│    · rate_response       │   │                                  │
│    · brain_insights      │   │ brain_insights tool              │
│                          │   │   ↳ recommendations for user     │
│  Cloudflare Tunnel       │   │   ↳ "what should I add?"         │
│    ↳ remote access       │   │   ↳ "what's working well?"       │
└──────────────────────────┘   └──────────────────────────────────┘
             │
             ▼
┌──────────────────────────┐
│        Claude.ai         │
│      MCP Connector       │
│  "Search my brain for…"  │
│  "What did Sarah say…"   │
│  "Rate that as helpful"  │
└──────────────────────────┘
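The ingestion and search path in the diagram can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: `chunk_text`, `dedupe`, `toy_embed`, `cosine`, and `search` are hypothetical names, the chunk sizes are arbitrary, and the hashed bag-of-words vector only stands in for all-MiniLM-L6-v2. The real project stores its vectors in ChromaDB rather than in a Python list.

```python
import hashlib
import math


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def dedupe(chunks: list[str]) -> list[str]:
    """Drop exact duplicates by content hash, keeping first occurrences."""
    seen, unique = set(), []
    for chunk in chunks:
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hashed bag-of-words vector: a stand-in for a real embedding model."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Rank chunks by cosine similarity to the query, best first."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

Calling `search("budget meeting", dedupe(chunk_text(document)))` returns `(score, chunk)` pairs; swapping `toy_embed` for a real model is the only change needed to make the ranking semantic rather than lexical.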
Built-in observability. Most personal knowledge bases are black boxes: you add data, you search it, and you have no idea what's working. Second Brain tracks usage and surfaces quality signals:
- Every query is logged: what you searched for, what came back, relevance scores, response time
- You rate results: thumbs up/down via the `rate_response` MCP tool, right from your Claude conversation
- The auto-curator analyzes query patterns: which sources appear in useful results? Which chunks haven't matched a query in 30+ days? What topics do you search most?
- Daily health reports: the curator generates actionable recommendations, including source quality rankings, stale data flags, and coverage gaps
- Freshness-weighted scoring: newer entries get a relevance boost, query expansion fills gaps, and deduplication keeps the index clean
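One way to picture freshness-weighted scoring is an exponential decay blended with the cosine similarity. This is a minimal sketch: the function name `freshness_score`, the 30-day half-life, and the 0.2 weight are assumptions chosen for illustration, not values taken from `brain/search.py`.

```python
from datetime import datetime, timedelta, timezone


def freshness_score(similarity: float, created_at: datetime, now: datetime,
                    half_life_days: float = 30.0, weight: float = 0.2) -> float:
    """Blend semantic similarity with a recency term that halves every half_life_days."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 when new, 0.5 after one half-life
    return (1.0 - weight) * similarity + weight * recency


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Same similarity, different ages: the newer entry ranks higher.
    print(freshness_score(0.8, now, now))
    print(freshness_score(0.8, now - timedelta(days=90), now))
```

Keeping the weight small means recency only breaks near-ties; a strongly relevant old entry still outranks a weakly relevant new one.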
Right now it surfaces insights and recommendations. Automated re-chunking and re-weighting based on feedback is on the roadmap.
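The feedback loop (log each query, attach a rating, rank sources by how often they appear in helpful results) can be sketched with the standard-library `sqlite3` module. The table layout and the helper names `open_db`, `log_query`, `rate`, and `source_quality` are hypothetical; the actual schema in `brain/feedback.py` may differ.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    id INTEGER PRIMARY KEY,
    text TEXT NOT NULL,
    source TEXT NOT NULL,       -- collection the top result came from
    score REAL NOT NULL,        -- relevance score of the top result
    latency_ms REAL NOT NULL,
    rating INTEGER,             -- 1 = thumbs up, 0 = thumbs down, NULL = unrated
    created_at REAL NOT NULL
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the feedback database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_query(conn: sqlite3.Connection, text: str, source: str,
              score: float, latency_ms: float) -> int:
    """Record one query and return its row id so it can be rated later."""
    cur = conn.execute(
        "INSERT INTO queries (text, source, score, latency_ms, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (text, source, score, latency_ms, time.time()),
    )
    conn.commit()
    return cur.lastrowid


def rate(conn: sqlite3.Connection, query_id: int, helpful: bool) -> None:
    """Attach a thumbs up/down rating to a logged query."""
    conn.execute("UPDATE queries SET rating = ? WHERE id = ?", (int(helpful), query_id))
    conn.commit()


def source_quality(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Return (source, share of rated queries marked helpful), best first."""
    rows = conn.execute(
        "SELECT source, AVG(rating) FROM queries WHERE rating IS NOT NULL "
        "GROUP BY source ORDER BY AVG(rating) DESC"
    )
    return rows.fetchall()
```

Unrated queries are excluded from the ranking, so a source is never penalized just because nobody has rated its results yet.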
| Source | Type | Status |
|---|---|---|
| Gmail | Cloud (OAuth) | Stable |
| Google Calendar | Cloud (OAuth) | Stable |
| OneDrive | Cloud (OAuth) | Stable |
| PDF / Word / Excel / PPT | Local files | Stable |
| Claude conversations | Export | Stable |
| Manual entries | CLI / MCP | Stable |
| iMessage | Local (macOS) | Stable |
| Apple Notes | Local (macOS) | Stable |
| Chrome history | Local | Stable |
| Plaid (financial) | API | Beta |
- macOS (tested on Sequoia 26.x, Apple Silicon recommended)
- Linux support is possible but untested — the macOS-specific sources (iMessage, Apple Notes, launchd) would need alternatives
- Python 3.11+
- Homebrew (macOS package manager)
- Cloudflare account (free tier, for remote access via Claude.ai)
- ~2GB disk space (embedding model + ChromaDB)
- FileVault enabled (strongly recommended — your data includes emails and messages)
# 1. Clone the repo
git clone https://github.com/yourusername/SecondBrain.git
cd SecondBrain
# 2. Copy and edit config
nano config.yaml # Already included, edit values in place
cp .env.example .env
nano .env # Add your API keys/secrets
# 3. Run setup (creates venv, installs deps, downloads model, starts services)
bash setup.sh
# 4. Test it
source ~/.zshrc
brain add --title "First entry" --text "Hello, Second Brain"
brain search "first entry"
- Create a Google Cloud project and enable the Gmail and Calendar APIs
- Download OAuth credentials to `~/SecondBrain/data/oauth/gmail_credentials.json`
- Run the OAuth flows:
# Gmail (opens browser for consent)
python3 -c "from ingestion.gmail import _get_credentials; _get_credentials()"
# Calendar
python3 -c "from ingestion.calendar_sync import _get_credentials; _get_credentials()"
- Enable in `config.yaml`: set `gmail.enabled: true` and `calendar.enabled: true`
- Restart the scheduler:
launchctl unload ~/Library/LaunchAgents/com.secondbrain.scheduler.plist && launchctl load ~/Library/LaunchAgents/com.secondbrain.scheduler.plist
- Register an app at Azure Portal → App registrations
- Set "Personal Microsoft accounts only", redirect URI: `http://localhost:8424/callback`
- Enable "Allow public client flows" under Authentication
- Add your `client_id` to `config.yaml` under `sources.onedrive.client_id`
- Run the OAuth flow (script provided in `scripts/`)
- Set up a Cloudflare Tunnel to expose port 8420:
brew install cloudflared
cloudflared tunnel --url http://127.0.0.1:8420
- In Claude.ai → Settings → MCP Servers → Add:
  - URL: `https://your-tunnel-url/mcp`
  - The server handles OAuth discovery automatically
| Tool | Description |
|---|---|
| `query_second_brain` | Semantic search across all sources with date/tag/source filters |
| `get_recent_entries` | Entries from the last N hours |
| `get_entries_about_person` | Everything mentioning a specific person |
| `get_project_status` | Project timeline in chronological order |
| `add_entry` | Save a note, decision, or idea from the conversation |
| `get_brain_status` | System health, ingestion status, alerts |
| `rate_response` | Rate a query result as useful/not useful |
| `brain_insights` | Self-improvement analytics and recommendations |
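MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a payload so the wire format is visible. The helper name `tool_call` and the argument name `query` are assumptions; the real input schemas live in `mcp_server/tools.py`.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need an id so responses can be matched


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # "query" is a hypothetical argument name used only for illustration.
    print(tool_call("query_second_brain", {"query": "Project X status"}))
```

In normal use Claude.ai generates these requests itself; building one by hand is mainly useful for testing the server with a plain HTTP client.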
SecondBrain/
├── brain/                # Core: DB, search, chunking, embeddings, self-improvement
│   ├── db.py             # ChromaDB wrapper (15 collections)
│   ├── search.py         # Semantic search with freshness weighting
│   ├── chunker.py        # Text → chunks with auto-tagging
│   ├── embeddings.py     # all-MiniLM-L6-v2 embedding function
│   ├── feedback.py       # Query logging + ratings (SQLite)
│   ├── curator.py        # Auto-curator: analysis, reports, stale detection
│   ├── self_improve.py   # Freshness scoring, query expansion, dedup
│   ├── alerting.py       # Proactive health alerts (email/log)
│   ├── maintenance.py    # Log rotation, health snapshots
│   └── secure_fs.py      # Atomic writes with restricted permissions
├── ingestion/            # Data source connectors
│   ├── scheduler.py      # APScheduler orchestrator (parallel I/O)
│   ├── gmail.py          # Gmail via Google API (OAuth)
│   ├── calendar_sync.py  # Google Calendar (OAuth)
│   ├── onedrive.py       # OneDrive via Microsoft Graph (OAuth)
│   ├── imessage.py       # iMessage (local SQLite)
│   ├── apple_notes.py    # Apple Notes (local SQLite)
│   ├── chrome.py         # Chrome history (local SQLite)
│   ├── documents.py      # PDF, Word, Excel, PPT (inbox folder)
│   ├── claude_export.py  # Claude conversation exports
│   ├── plaid_finance.py  # Plaid transactions, balances, holdings
│   └── manual.py         # CLI / MCP manual entries
├── mcp_server/           # MCP + REST API server
│   ├── server.py         # FastAPI: MCP JSON-RPC, OAuth 2.0, REST
│   ├── tools.py          # 8 tool implementations + schemas
│   └── auth.py           # API key validation
├── web/                  # Status dashboard (Flask)
├── scripts/              # CLI, backup, utilities
├── tunnel/               # Cloudflare Tunnel setup
├── config.yaml           # All configuration
├── setup.sh              # One-command setup
└── requirements.txt      # Python dependencies
Q: How is this different from just connecting Gmail/Google Calendar to Claude?
Native connectors give Claude live access to one service at a time. Second Brain is fundamentally different in three ways:
- Unified cross-source search — It combines Gmail, Calendar, OneDrive, and more into a single semantic index. Ask "what's happening with Project X" and it searches emails, calendar invites, and documents simultaneously. Native connectors can't cross-reference across sources.
- Persistent memory with semantic understanding — Native connectors are stateless keyword searches. Second Brain pre-processes your data into vector embeddings, so it understands meaning, not just keywords. Ask "what financial decisions have I made recently" and it pulls budget spreadsheets, investment emails, and financial advisor meetings — by meaning, across every source.
- Observability and feedback — Query patterns are tracked, source quality is ranked, and stale data is flagged automatically. The auto-curator surfaces recommendations so you know what's working and what isn't. Native connectors give you zero visibility into result quality.
Q: What data sources are supported?
Currently stable: Gmail, Google Calendar, OneDrive, local documents (PDF, Word, Excel, PPT), Claude conversation exports, manual entries, iMessage, Apple Notes, and Chrome history. Financial data via Plaid is in beta. The architecture supports any source: adding a new one means writing one Python module that follows the existing pattern.
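As a rough idea of what "one Python module" might involve, the sketch below yields only records newer than a stored checkpoint and then advances it. The `Entry` dataclass, the `fetch_since` function, and the JSON checkpoint file are all hypothetical; the real connectors in `ingestion/` may expose a different interface.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class Entry:
    source: str
    title: str
    text: str
    timestamp: float  # Unix seconds


def load_checkpoint(path: Path) -> float:
    """Return the timestamp of the last ingested item, or 0.0 on first run."""
    if path.exists():
        return json.loads(path.read_text())["last_timestamp"]
    return 0.0


def save_checkpoint(path: Path, timestamp: float) -> None:
    """Persist the newest ingested timestamp for the next run."""
    path.write_text(json.dumps({"last_timestamp": timestamp}))


def fetch_since(records: list[dict], checkpoint: Path) -> Iterator[Entry]:
    """Yield records newer than the checkpoint, then advance the checkpoint."""
    last = load_checkpoint(checkpoint)
    newest = last
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        if rec["timestamp"] > last:
            newest = max(newest, rec["timestamp"])
            yield Entry("example", rec["title"], rec["text"], rec["timestamp"])
    # Runs only once the caller has consumed every entry.
    save_checkpoint(checkpoint, newest)
```

Because the checkpoint is saved only after the generator is exhausted, an interrupted run re-ingests the same records next time instead of silently skipping them; deduplication downstream would absorb the repeats.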
Q: Where does my data live?
Everything stays on your machine. Second Brain runs entirely self-hosted — your data is stored in a local ChromaDB vector database and SQLite. Nothing is sent to external servers except through the Cloudflare tunnel to YOUR Claude session. No third-party analytics, no telemetry, no cloud storage.
Q: Do I need a Mac Mini?
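No. Any always-on machine can work in principle: a Linux server, a Raspberry Pi, an old laptop, a cloud VM. The Mac Mini is just what we used. Linux is untested, though, and the macOS-specific sources (iMessage, Apple Notes, launchd scheduling) would need alternatives. You need Python 3.11+, about 2GB of disk space (embedding model plus ChromaDB), and a network connection.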
Q: Can I use this with ChatGPT or other LLMs?
The MCP server exposes a standard HTTP API. While the current setup is optimized for Claude's MCP connector protocol, the underlying REST endpoints can be adapted for any LLM that supports tool use or function calling.
Q: Is this secure?
The MCP server requires API key authentication for all data-access operations. OAuth tokens are stored with 0600 permissions. The Cloudflare tunnel provides HTTPS encryption. No credentials are stored in the codebase. That said — this is a personal project, not enterprise software. Review the security model before deploying with sensitive data.
- Phase 1: Core (manual, documents, Claude export)
- Phase 2: Cloud sources (Gmail, Calendar)
- Phase 3: More sources (OneDrive)
- Phase 4: MCP server + Claude.ai integration
- Phase 5: Observability (query logging, feedback, auto-curator)
- Phase 6: Local macOS sources (iMessage, Apple Notes, Chrome)
- Phase 7: Financial data (Plaid integration)
- Phase 8: Automated quality actions (re-chunking, re-weighting based on feedback)
- Phase 9: Multi-device sync and mobile access
- All data is stored locally on your machine — nothing leaves unless you set up a tunnel
- FileVault (full-disk encryption) is strongly recommended
- OAuth tokens are stored with `0600` permissions (owner-only)
- Atomic file writes prevent token corruption during concurrent access
- API key authentication on all data-access endpoints
- Rate limiting (60 req/min) on the MCP server
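The combination of owner-only permissions and atomic writes can be sketched with the standard library: write to a temporary file in the same directory, then rename it over the target. The helper name `write_secret` is hypothetical, and `brain/secure_fs.py` may be implemented differently.

```python
import os
import tempfile
from pathlib import Path


def write_secret(path: Path, data: str) -> None:
    """Write data to path atomically with owner-only (0600) permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the file readable and writable only by the owner, and
    # using the target directory keeps the final rename on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # make sure bytes hit disk before the rename
        os.chmod(tmp, 0o600)
        os.replace(tmp, path)  # atomic on POSIX: readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A reader that opens the token file while a refresh is in progress gets either the previous complete token or the new one, which is the property the bullet above describes.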
MIT — see LICENSE.