Multi-agent orchestration for Hermes Agent
Spawn parallel Hermes agents. Give them a shared brain. Ship in one command.
Backed by SQLite, coordinated by Python, zero tokens spent on coordination.
Install · Quick Start · How It Works · CLI · Tools · Advanced
Option 1: npm (fastest)
npm install -g @hermes/brain-mcp
Then register with Hermes:
hermes mcp add brain -s user -- brain-mcp
Option 2: from source
curl -fsSL https://raw.githubusercontent.com/DevvGwardo/brain-mcp/main/install.sh | bash
The installer:
- Builds the Node.js MCP server (`brain-mcp`)
- Installs the Python orchestration package (`hermes-brain`)
- Registers the brain as an MCP server in Hermes and Claude Code
Verify:
hermes mcp list | grep brain
hermes-brain --help
Prerequisites: Python 3.10+, Node.js 18+, Hermes Agent, Claude Code
Sharing with friends? Each person's brain is its own isolated SQLite DB — no network config needed. Same one-liner works anywhere.
Docker users: spawn agents with layout: "headless", since tmux panes can't render in a headless container:
brain_wake({ task: "...", layout: "headless" })
One command to orchestrate a fleet of Hermes agents:
hermes-brain "Build a REST API with auth, users, and posts" \
  --agents api-routes auth-layer db-models tests
What happens:
- Python conductor spawns 4 background Hermes agents (`hermes -q`)
- Each agent claims its files, publishes contracts, writes code, and pulses heartbeats
- Conductor runs an integration gate: it compiles the project and routes errors back to the responsible agents via DM
- Agents self-correct. The gate retries until clean, up to --retries attempts.
- Summary printed: agents, contracts, memories, metrics, done.
More ways to run it:
# Auto-named agents
hermes-brain "Add error handling to the whole codebase"
# Mix models per task
hermes-brain "Build a game" --agents engine ui store --model claude-sonnet-4-5
# Cheap model for boilerplate
hermes-brain "Generate 10 test files" --model claude-haiku-4-5
# JSON pipeline with multiple phases
hermes-brain --config pipeline.json
Or from inside Hermes (interactive):
hermes> Use brain:brain_register, then brain:brain_wake to spawn 3 agents
that each refactor a different module.
graph TB
subgraph "Python Conductor"
CLI["hermes-brain CLI"]
ORCH["Orchestrator<br/><small>spawn · wait · gate · retry</small>"]
end
subgraph "Hermes Agents"
direction LR
H1["Agent 1<br/><small>hermes -q</small>"]
H2["Agent 2<br/><small>hermes -q</small>"]
H3["Agent 3<br/><small>hermes -q</small>"]
end
CLI --> ORCH
ORCH -->|spawn| H1
ORCH -->|spawn| H2
ORCH -->|spawn| H3
subgraph "Brain (shared SQLite)"
DB[("brain.db")]
PULSE["Heartbeats"]
MX["Mutex Locks"]
KV["Shared State"]
CON["Contracts"]
MEM["Memory"]
PLAN["Task DAG"]
end
ORCH <--> DB
H1 <--> DB
H2 <--> DB
H3 <--> DB
subgraph "Integration Gate"
GATE["tsc · mypy · cargo · go vet"]
ROUTE["DM errors → agents"]
end
ORCH --> GATE
GATE --> ROUTE
ROUTE -.->|DM| H1
ROUTE -.->|DM| H2
style CLI fill:#9333EA,stroke:#7C3AED,color:#fff
style ORCH fill:#9333EA,stroke:#7C3AED,color:#fff
style H1 fill:#3B82F6,stroke:#2563EB,color:#fff
style H2 fill:#10B981,stroke:#059669,color:#fff
style H3 fill:#F59E0B,stroke:#D97706,color:#000
style DB fill:#1E293B,stroke:#334155,color:#fff
style GATE fill:#EF4444,stroke:#DC2626,color:#fff
Zero-token coordination. The conductor is pure Python — LLM tokens are only spent on the actual work. Heartbeats, claims, contracts, gates, retries all run locally.
No server to manage. Each agent opens its own stdio connection to the brain. SQLite WAL mode handles concurrent access safely.
Same brain, any CLI. Hermes, Claude Code, MiniMax — all clients hit the same SQLite DB. A mixed fleet of Hermes + Claude agents can coordinate on the same task.
hermes-brain <task> [options]
| Flag | Default | What it does |
|---|---|---|
| `--agents <names...>` | `agent-1 agent-2` | Agent names to spawn in parallel |
| `--model <id>` | `claude-sonnet-4-5` | Model passed to each agent |
| `--no-gate` | off | Skip integration gate |
| `--retries <n>` | `3` | Max gate retry attempts |
| `--timeout <seconds>` | `600` | Per-agent timeout |
| `--config <file.json>` | | Load a multi-phase pipeline |
| `--db-path <path>` | `~/.claude/brain/brain.db` | Custom brain DB |
{
"task": "Build a todo app",
"model": "claude-sonnet-4-5",
"gate": true,
"max_gate_retries": 3,
"phases": [
{
"name": "foundation",
"parallel": true,
"agents": [
{ "name": "types", "files": ["src/types/"], "task": "Define all TS types" },
{ "name": "db", "files": ["src/db/"], "task": "Set up Prisma schema" }
]
},
{
"name": "feature",
"parallel": true,
"agents": [
{ "name": "api", "files": ["src/api/"], "task": "REST endpoints" },
{ "name": "ui", "files": ["src/ui/"], "task": "React components" }
]
},
{
"name": "quality",
"parallel": true,
"agents": [
{ "name": "tests", "task": "Write unit + integration tests" }
]
}
]
}
Phases run sequentially. Agents within a phase run in parallel. The integration gate runs between phases.
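The phase semantics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the conductor's actual code: `run_agent` and `run_gate` are hypothetical stand-ins (a real conductor would launch agent processes and run a compiler), and the sketch runs the gate after every phase, including the last.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def run_agent(agent: dict) -> str:
    """Hypothetical stand-in: a real conductor would launch an agent process here."""
    return f"{agent['name']}: done"


def run_gate(phase_name: str) -> bool:
    """Hypothetical stand-in for the integration gate; always passes in this sketch."""
    return True


def run_pipeline(config: dict) -> list:
    log = []
    for phase in config["phases"]:  # phases run one after another
        agents = phase["agents"]
        workers = len(agents) if phase.get("parallel", True) else 1
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() yields results in submission order, even if agents finish out of order
            log.extend(pool.map(run_agent, agents))
        if config.get("gate", True) and not run_gate(phase["name"]):
            raise RuntimeError(f"gate failed after phase {phase['name']!r}")
    return log


config = json.loads("""
{
  "task": "Build a todo app",
  "gate": true,
  "phases": [
    {"name": "foundation", "parallel": true,
     "agents": [{"name": "types", "task": "Define all TS types"},
                {"name": "db", "task": "Set up Prisma schema"}]},
    {"name": "feature", "parallel": true,
     "agents": [{"name": "api", "task": "REST endpoints"}]}
  ]
}
""")

print(run_pipeline(config))  # ['types: done', 'db: done', 'api: done']
```

A thread pool is used here only because the stand-in agents are cheap function calls; background processes, as described earlier, would follow the same sequential-phases, parallel-agents shape.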
30+ tools across 9 categories. All available to Hermes, Claude Code, and any MCP-compatible agent.
| Tool | What it does |
|---|---|
| `brain_register` | Name this session |
| `brain_sessions` | List active sessions |
| `brain_status` | Show session info + room |
| `brain_pulse` | Heartbeat with status + progress (returns pending DMs) |
| `brain_agents` | Live health of all agents (status, heartbeat age, claims) |
| Tool | What it does |
|---|---|
| `brain_post` | Post to a channel |
| `brain_read` | Read from a channel |
| `brain_dm` | Direct message another agent |
| `brain_inbox` | Read your DMs |
| Tool | What it does |
|---|---|
| `brain_set` / `brain_get` | Ephemeral key-value store |
| `brain_keys` / `brain_delete` | List / remove keys |
| `brain_remember` | Store persistent knowledge (survives `brain_clear`) |
| `brain_recall` | Search memories from previous sessions |
| `brain_forget` | Remove outdated memories |
| Tool | What it does |
|---|---|
| `brain_claim` | Lock a file/resource (TTL-based mutex) |
| `brain_release` | Unlock |
| `brain_claims` | List active locks |
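A TTL-based mutex of this kind can be sketched directly on SQLite. The following is an illustration of the idea, not the brain's actual implementation: the column names follow the `claims` table in the schema, but storing `expires_at` as epoch seconds, the 300-second default TTL, and the `claim` function itself are assumptions made for the example.

```python
import sqlite3
import time

# Assumed schema for the sketch; expires_at is stored as epoch seconds here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS claims (
    resource   TEXT PRIMARY KEY,
    owner_id   TEXT NOT NULL,
    expires_at REAL NOT NULL
)
"""


def claim(conn, resource, owner_id, ttl_s=300.0, now=None):
    """Try to lock `resource` for `owner_id`; return True on success."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commits on success, rolls back on error
        # Drop the existing lock only if it has expired.
        conn.execute(
            "DELETE FROM claims WHERE resource = ? AND expires_at <= ?",
            (resource, now),
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO claims (resource, owner_id, expires_at) "
            "VALUES (?, ?, ?)",
            (resource, owner_id, now + ttl_s),
        )
        if cur.rowcount == 1:
            return True
        # Someone holds the lock; the current holder may renew it.
        cur = conn.execute(
            "UPDATE claims SET expires_at = ? WHERE resource = ? AND owner_id = ?",
            (now + ttl_s, resource, owner_id),
        )
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = claim(conn, "src/api/routes.ts", "agent-1", ttl_s=60, now=1000.0)
blocked = claim(conn, "src/api/routes.ts", "agent-2", ttl_s=60, now=1010.0)
after_expiry = claim(conn, "src/api/routes.ts", "agent-2", ttl_s=60, now=1061.0)
print(first, blocked, after_expiry)  # True False True
```

The primary key on `resource` is what makes the lock exclusive; the TTL means a crashed agent cannot hold a file forever.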
| Tool | What it does |
|---|---|
| `brain_contract_set` | Publish what your module provides / expects |
| `brain_contract_get` | Read other agents' contracts before coding |
| `brain_contract_check` | Validate all contracts; catches parameter mismatches and missing functions |
| Tool | What it does |
|---|---|
| `brain_gate` | Run compile + contract check, DM errors to responsible agents |
| `brain_auto_gate` | Run gate in a loop, wait for fixes, retry until clean |
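The retry behaviour of a gate loop can be illustrated with a small, self-contained function. This is a sketch under assumptions, not the tool's real logic: `auto_gate`, `run_check`, and `notify` are hypothetical names, the loop does not actually wait for agents to finish their fixes, and "max retries" is interpreted as one initial check plus up to `max_retries` re-runs.

```python
from typing import Callable, List


def auto_gate(
    run_check: Callable[[], List[str]],
    notify: Callable[[List[str]], None],
    max_retries: int = 3,
) -> bool:
    """Run the check, hand errors to `notify`, and re-run until clean or out of retries."""
    for attempt in range(max_retries + 1):
        errors = run_check()
        if not errors:
            return True  # gate is clean
        if attempt < max_retries:
            notify(errors)  # in a real system: DM the responsible agents, wait for fixes
    return False


# Simulated checker: two errors, then one, then clean.
outcomes = iter([["e1", "e2"], ["e1"], []])
sent = []
passed = auto_gate(lambda: next(outcomes), sent.append, max_retries=3)
print(passed, sent)  # True [['e1', 'e2'], ['e1']]
```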
| Tool | What it does |
|---|---|
| `brain_plan` | Create a task DAG with dependencies |
| `brain_plan_next` | Get tasks whose dependencies are satisfied |
| `brain_plan_update` | Mark task done/failed (auto-promotes dependents) |
| `brain_plan_status` | Overall progress |
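Selecting ready tasks from a DAG comes down to one rule: a pending task is ready when every dependency is done. A minimal sketch follows; the `next_tasks` function, the status strings, and the example task names are illustrative assumptions, not the tool's actual data model.

```python
from typing import Dict, List


def next_tasks(dependencies: Dict[str, List[str]], status: Dict[str, str]) -> List[str]:
    """Return pending tasks whose dependencies are all marked 'done'."""
    ready = []
    for task, deps in dependencies.items():
        if status.get(task, "pending") != "pending":
            continue  # already running, done, or failed
        if all(status.get(dep) == "done" for dep in deps):
            ready.append(task)
    return sorted(ready)


deps = {
    "types": [],
    "db": ["types"],
    "api": ["types", "db"],
    "tests": ["api"],
}
status = {"types": "done"}
print(next_tasks(deps, status))  # ['db']

status["db"] = "done"
print(next_tasks(deps, status))  # ['api']
```

Marking a task done and calling the function again is what "auto-promotes dependents" amounts to in this simplified model.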
| Tool | What it does |
|---|---|
| `brain_wake` | Spawn a new agent (hermes, claude, or headless) |
| `brain_swarm` | Spawn multiple agents in one call |
| `brain_respawn` | Replace a failed agent with recovery context |
| `brain_metrics` | Success rates, duration, error counts per agent |
| Tool | What it does |
|---|---|
| `brain_context_push` | Log action/discovery/decision/error |
| `brain_context_get` | Read the ledger |
| `brain_context_summary` | Condensed view for context recovery |
| `brain_checkpoint` | Save full working state |
| `brain_checkpoint_restore` | Recover after context compression |
Every spawned agent follows two protocols that the orchestrator enforces:
Heartbeat: agents call `brain_pulse` every 2-3 tool calls with their status and a short progress note. The conductor uses this to:
- Show live status in the terminal (`● working — editing src/api/routes.ts`)
- Detect stalled agents (no pulse in 60s → `stale`)
- Deliver pending DMs as pulse return values (no extra round-trip)
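Stall detection reduces to comparing the age of the last pulse against thresholds. The sketch below uses the 60-second stale threshold mentioned here and the 5-minute cleanup threshold from the design notes; the function name, the state labels `alive` and `cleanup`, and the use of inclusive comparisons are assumptions for illustration.

```python
STALE_AFTER_S = 60      # no pulse for 60s: agent is considered stale
CLEANUP_AFTER_S = 300   # no pulse for 5 minutes: agent is cleaned up


def classify(last_pulse: float, now: float) -> str:
    """Classify an agent from the timestamp (in seconds) of its last heartbeat."""
    age = now - last_pulse
    if age >= CLEANUP_AFTER_S:
        return "cleanup"
    if age >= STALE_AFTER_S:
        return "stale"
    return "alive"


print(classify(last_pulse=100.0, now=110.0))  # alive
print(classify(last_pulse=100.0, now=170.0))  # stale
print(classify(last_pulse=100.0, now=400.0))  # cleanup
```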
Contracts: before agents write code, they call `brain_contract_get` to see what other agents export. After writing, they publish their own contract with `brain_contract_set`. Before marking done, `brain_contract_check` validates the whole fleet and catches:
- Function signature mismatches (expected 2 args, got 3)
- Missing exports (agent A imports `getUser` but agent B never exported it)
- Type drift (expected `User`, got `{name, email}`)
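To make the first two checks concrete, here is a small sketch of a fleet-wide contract check. The contract shape used here (a `provides` and an `expects` mapping from function name to argument count) is an assumption chosen for the example; the real contract format may carry richer type information, and type drift is not covered.

```python
from typing import Dict, List


def check_contracts(contracts: Dict[str, dict]) -> List[str]:
    """Compare every agent's `expects` against what the fleet `provides`."""
    provided = {}
    for agent, contract in contracts.items():
        for fn, arity in contract.get("provides", {}).items():
            provided[fn] = (agent, arity)

    problems = []
    for agent, contract in sorted(contracts.items()):
        for fn, expected in sorted(contract.get("expects", {}).items()):
            if fn not in provided:
                problems.append(f"{agent}: missing export {fn}")
                continue
            owner, actual = provided[fn]
            if actual != expected:
                problems.append(
                    f"{agent}: {fn} expected {expected} args, {owner} provides {actual}"
                )
    return problems


contracts = {
    "api": {"provides": {"listPosts": 1}, "expects": {"getUser": 1, "hashPassword": 2}},
    "auth": {"provides": {"hashPassword": 3}, "expects": {}},
}
for problem in check_contracts(contracts):
    print(problem)
# api: missing export getUser
# api: hashPassword expected 2 args, auth provides 3
```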
This is the key to matching single-agent integration quality with a parallel fleet.
sequenceDiagram
participant O as Orchestrator
participant C as Compiler
participant DB as Brain DB
participant A as Agent
O->>C: Run tsc / mypy / cargo / go vet
C-->>O: Errors with file:line:message
O->>DB: Query: who claimed this file?
DB-->>O: Agent X owned src/api/routes.ts
O->>A: DM: "Fix these errors in your files"
Note over A: Agent reads DM on next pulse
Note over A: Fixes code, pulses done
O->>C: Re-run compiler
C-->>O: Clean
O->>DB: Record metrics
The gate auto-detects the project language and runs the appropriate checker:
| Language | Checker |
|---|---|
| TypeScript | npx tsc --noEmit |
| Python | mypy |
| Rust | cargo check |
| Go | go vet |
Errors are parsed, matched to the agent that claimed the failing file, and routed as a DM. Agents pick up their errors on the next pulse and self-correct. The loop retries up to --retries times before giving up.
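The parse-and-route step can be sketched as follows. This is an illustration only: it assumes mypy-style `file:line: error: message` lines and exact file-path claims, whereas the real gate handles several checkers and may match claimed directories too. The `route_errors` function and the `unassigned` bucket are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches lines such as "src/api/routes.py:12: error: Incompatible return value type"
ERROR_RE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+): error: (?P<msg>.+)$")


def route_errors(output: str, claims: Dict[str, str]) -> Dict[str, List[str]]:
    """Group checker errors by the agent that claimed the failing file."""
    routed = defaultdict(list)
    for raw in output.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match is None:
            continue  # summary lines and notes are skipped
        owner = claims.get(match["file"], "unassigned")
        routed[owner].append(f"{match['file']}:{match['line']}: {match['msg']}")
    return dict(routed)


checker_output = """\
src/api/routes.py:12: error: Incompatible return value type
src/db/models.py:40: error: Name "User" is not defined
src/util/misc.py:3: error: Missing return statement
Found 3 errors in 3 files (checked 5 source files)
"""
claims = {"src/api/routes.py": "api-routes", "src/db/models.py": "db-models"}
routed = route_errors(checker_output, claims)
for owner, errors in routed.items():
    print(owner, errors)
```

Each list in the result corresponds to the body of one DM; errors in unclaimed files land in a separate bucket so they are not silently dropped.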
The brain DB is shared across all MCP clients. A single project can have:
graph LR
subgraph "Fleet"
direction TB
HA["Hermes Agent<br/><small>fast local inference</small>"]
CC["Claude Code<br/><small>deep reasoning</small>"]
MM["MiniMax<br/><small>cheap boilerplate</small>"]
end
subgraph "Brain"
DB[("brain.db")]
end
HA <--> DB
CC <--> DB
MM <--> DB
style HA fill:#F59E0B,stroke:#D97706,color:#000
style CC fill:#9333EA,stroke:#7C3AED,color:#fff
style MM fill:#3B82F6,stroke:#2563EB,color:#fff
style DB fill:#1E293B,stroke:#334155,color:#fff
Route by task type. Use Hermes for routine work, Claude for architectural decisions, cheaper models for boilerplate — all coordinating through the same brain, sharing contracts, gates, memory.
From Claude Code:
brain_wake({ task: "...", cli: "hermes", layout: "headless" })
brain_wake({ task: "...", cli: "claude", layout: "horizontal" })
Everything below covers the full technical depth.
Run the benchmarks yourself:
node benchmark.mjs # SQLite direct layer (1000 iterations)
node benchmark-mcp.mjs # MCP tool layer (30 iterations per tool)
| Operation | avg | p50 | p95 | p99 | throughput |
|---|---|---|---|---|---|
| session_register | 0.021ms | 0.011ms | 0.027ms | 0.039ms | ~47K/s |
| message_post (1 msg) | 0.014ms | 0.011ms | 0.019ms | 0.031ms | ~70K/s |
| message_read (50 msgs) | 0.042ms | 0.042ms | 0.045ms | 0.066ms | ~24K/s |
| state_get | 0.002ms | 0.002ms | 0.002ms | 0.003ms | ~570K/s |
| claim_query (all) | 0.001ms | 0.001ms | 0.002ms | 0.002ms | ~670K/s |
| heartbeat_pulse (update) | 0.002ms | 0.002ms | 0.002ms | 0.003ms | ~464K/s |
| session_query (by id) | 0.002ms | 0.002ms | 0.002ms | 0.003ms | ~455K/s |
Direct SQLite: every core coordination operation is sub-millisecond. The KV store (state_get) sustains ~570K reads/s. High-frequency coordination (heartbeats, claims, state) stays well under 1ms.
| Tool | avg | p50 | p95 | min | max |
|---|---|---|---|---|---|
| brain_status | 12.2ms | 12.0ms | 15.6ms | 8.8ms | 21.2ms |
| brain_sessions | 1.9ms | 1.7ms | 3.6ms | 0.9ms | 4.7ms |
| brain_keys | 1.6ms | 1.6ms | 2.6ms | 0.8ms | 4.5ms |
| brain_claims | 2.0ms | 1.8ms | 3.4ms | 1.2ms | 4.9ms |
| brain_metrics | 2.0ms | 1.9ms | 4.0ms | 1.1ms | 4.4ms |
MCP tool calls include JSON-RPC framing, stdio IPC, TypeScript tool dispatch, and SQLite query. Most tools respond in 1-2ms once the server is warm. brain_status is slower (12ms) because it aggregates session data from all rooms — 3000+ sessions were present during the benchmark.
- High-frequency coordination (heartbeats every 2-3 agent turns, claim/release, state get/set): always goes through Python `hermes.db.BrainDB` directly, not the MCP layer. Sub-millisecond, no stdio overhead.
- Agent-level operations (spawn, gate, contract check, swarm): use MCP tools. 1-5ms per call is fine because these happen once per agent, not per turn.
- Zero-token coordination overhead: the entire coordination layer (messaging, locking, state, heartbeats) adds no LLM token cost. Tokens are only spent on actual work.
graph TB
subgraph "MCP Clients"
HA["hermes sessions"]
CC["claude sessions"]
PY["Python orchestrator"]
end
subgraph "MCP Layer"
M1["brain-mcp<br/><small>stdio server</small>"]
end
subgraph "Python API"
PYDB["hermes.db.BrainDB<br/><small>direct SQLite access</small>"]
end
subgraph "Storage"
DB[("~/.claude/brain/brain.db<br/><small>SQLite WAL</small>")]
end
HA --> M1
CC --> M1
PY --> PYDB
M1 --> DB
PYDB --> DB
subgraph "Tables"
T1["sessions · messages · dms"]
T2["state · claims · contracts"]
T3["memory · plans · metrics"]
T4["context_ledger · checkpoints"]
end
DB --- T1
DB --- T2
DB --- T3
DB --- T4
style HA fill:#F59E0B,stroke:#D97706,color:#000
style CC fill:#9333EA,stroke:#7C3AED,color:#fff
style PY fill:#3776AB,stroke:#2C5F8D,color:#fff
style DB fill:#10B981,stroke:#059669,color:#fff
Design decisions:
- Dual access paths: Agents use MCP (stdio) via `brain-mcp`. The Python orchestrator uses `hermes.db.BrainDB` for direct, fast access to the same SQLite file.
- One process per session: No long-running daemon. Each agent opens its own stdio connection.
- WAL mode + 5s busy timeout: Multiple writers serialize safely.
- Heartbeat-based liveness: An agent with no pulse for 60s is marked stale; after 5m without a pulse it is cleaned up.
- Room scoping: Working directory is the default room. Override with `BRAIN_ROOM`.
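For reference, this is roughly what opening a SQLite file with WAL mode and a 5-second busy timeout looks like using Python's standard `sqlite3` module. It is a generic sketch, not the project's `BrainDB` class; the `open_brain_db` helper and the temporary path are assumptions for the example.

```python
import os
import sqlite3
import tempfile


def open_brain_db(path: str) -> sqlite3.Connection:
    """Open a SQLite file with WAL journaling and a 5-second busy timeout."""
    conn = sqlite3.connect(path, timeout=5.0)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # A writer that finds the database locked waits up to 5000 ms before failing.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn


db_path = os.path.join(tempfile.mkdtemp(), "brain.db")
conn = open_brain_db(db_path)
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
print(journal_mode, busy_timeout)  # wal 5000
```

WAL mode is persistent per database file, so once one process sets it, later connections see it as well; the busy timeout is per connection and has to be set by each process.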
stateDiagram-v2
[*] --> Spawned: hermes -q &
Spawned --> Initializing: MCP connected
Initializing --> Registered: brain_register
Registered --> ReadingContext: brain_get / brain_recall
ReadingContext --> CheckingContracts: brain_contract_get
state "Working Loop" as Loop {
CheckingContracts --> Claiming: brain_claim files
Claiming --> Editing: make changes
Editing --> Pulsing: brain_pulse (every 2-3 calls)
Pulsing --> ReadingDMs: DMs returned in pulse
ReadingDMs --> Editing: fix errors if any
Editing --> Publishing: brain_contract_set
}
Publishing --> FinalCheck: brain_contract_check
FinalCheck --> Publishing: mismatches found
FinalCheck --> Done: clean
Done --> Releasing: brain_release all files
Releasing --> Reporting: brain_pulse status=done
Reporting --> Exited: process ends
Exited --> [*]
If an agent crashes or goes stale, the orchestrator spawns a replacement with full context:
sequenceDiagram
participant O as Orchestrator
participant DB as Brain DB
participant R as Replacement
Note over O,DB: Agent X went stale (no pulse 60s+)
O->>DB: Get X's progress, claims, messages
DB-->>O: "was editing src/api, claimed 3 files"
O->>DB: Release X's claims
O->>DB: Record failure metric
O->>R: Spawn "X-r4521" with recovery prompt:
Note over R: "You're replacing X.<br/>Last progress: 'editing routes.ts'.<br/>Pick up where they left off."
R->>DB: brain_register, brain_claim, continue
The replacement inherits the original task, knows what files the failed agent touched, and has context about their last known progress.
erDiagram
sessions ||--o{ messages : sends
sessions ||--o{ direct_messages : sends
sessions ||--o{ claims : owns
sessions ||--o{ contracts : publishes
sessions ||--o{ pulses : heartbeats
sessions ||--o{ context_ledger : logs
sessions ||--o{ checkpoints : saves
sessions ||--o{ metrics : records
sessions {
  text id PK
  text name
  text room
  text status
  text progress
  text last_heartbeat
}
messages {
  int id PK
  text channel
  text room
  text sender
  text content
  text created_at
}
direct_messages {
  int id PK
  text from_id
  text to_id
  text content
  bool read
}
state {
  text key PK
  text scope
  text value
  text updated_by
}
claims {
  text resource PK
  text owner_id
  text expires_at
}
contracts {
  text module PK
  text agent_id
  json provides
  json expects
}
memory {
  text id PK
  text room
  text topic
  text content
  text tags
}
plans {
  text id PK
  text room
  json tasks
  json dependencies
}
metrics {
  int id PK
  text agent_name
  text outcome
  int duration_ms
}
context_ledger {
  int id PK
  text agent_id
  text entry_type
  text content
  text file_path
}
checkpoints {
  text id PK
  text agent_id
  json working_state
  text summary
}
Database location: ~/.claude/brain/brain.db
| Variable | Default | Description |
|---|---|---|
| `BRAIN_SESSION_NAME` | `session-{pid}` | Pre-set session name |
| `BRAIN_SESSION_ID` | uuid | Pre-set session id (used by orchestrator) |
| `BRAIN_ROOM` | Working directory | Override room grouping |
| `BRAIN_DB_PATH` | `~/.claude/brain/brain.db` | Custom database path |
| `BRAIN_DEFAULT_CLI` | `claude` | Default CLI for brain_wake (hermes/claude) |
| `HERMES_MODEL` | | Model passed to spawned hermes agents |
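A client could resolve these variables with their documented defaults along the following lines. The `resolve_settings` helper and the keys of the returned dictionary are hypothetical; only the variable names and defaults come from the table.

```python
import os
import uuid
from pathlib import Path
from typing import Dict, Mapping, Optional


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read brain settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env
    default_db = Path.home() / ".claude" / "brain" / "brain.db"
    return {
        "session_name": env.get("BRAIN_SESSION_NAME", f"session-{os.getpid()}"),
        "session_id": env.get("BRAIN_SESSION_ID") or str(uuid.uuid4()),
        "room": env.get("BRAIN_ROOM", os.getcwd()),  # working directory by default
        "db_path": env.get("BRAIN_DB_PATH", str(default_db)),
        "default_cli": env.get("BRAIN_DEFAULT_CLI", "claude"),
    }


settings = resolve_settings({"BRAIN_ROOM": "/tmp/my-project"})
print(settings["room"])         # /tmp/my-project
print(settings["default_cli"])  # claude
```

Passing an explicit mapping instead of reading `os.environ` keeps the function easy to test.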
If you don't want the Python CLI, you can orchestrate directly from inside a Hermes session:
hermes> brain:brain_register with name "lead"
hermes> brain:brain_set key="task" value="refactor auth" scope="room"
hermes> brain:brain_wake name="worker-1" task="..." cli="hermes" layout="headless"
hermes> brain:brain_wake name="worker-2" task="..." cli="hermes" layout="headless"
hermes> brain:brain_agents # monitor health
hermes> brain:brain_auto_gate # run gate loop until clean
The tools work identically in interactive mode, headless mode, and across mixed fleets.
Brain also supports spawning Claude Code sessions in tmux split panes for visual orchestration:
graph TB
subgraph "Your terminal"
direction LR
L["LEAD<br/><small>purple border</small>"]
W1["worker 1<br/><small>blue</small>"]
W2["worker 2<br/><small>emerald</small>"]
W3["worker 3<br/><small>amber</small>"]
end
L -->|brain_wake| W1
L -->|brain_wake| W2
L -->|brain_wake| W3
style L fill:#0d0a1a,stroke:#9333EA,color:#fff,stroke-width:3px
style W1 fill:#0F172A,stroke:#3B82F6,color:#fff
style W2 fill:#0F172A,stroke:#10B981,color:#fff
style W3 fill:#0F172A,stroke:#F59E0B,color:#fff
From Claude Code, say "Refactor the API with 3 agents" — the lead splits the work, spawns 3 Claude sessions in tmux panes, each with a unique colored border, and coordinates through the brain.
Layouts: headless (Hermes default), horizontal, vertical, tiled, window
# Node.js MCP server
npm run dev # watch mode
npm run build # compile TypeScript
npm start # run server
# Python orchestrator
pip install -e . # install hermes-brain
python -m hermes.cli "task" --agents a b c
Repo layout:
brain-mcp/
├── src/ # TypeScript MCP server (brain-mcp)
│ ├── index.ts # Tool definitions (30+ tools)
│ ├── db.ts # SQLite layer
│ ├── conductor.ts # brain_wake / brain_swarm logic
│ └── gate.ts # Integration gate
├── hermes/ # Python orchestration (hermes-brain)
│ ├── cli.py # hermes-brain CLI entry point
│ ├── orchestrator.py # Conductor — spawn, wait, gate, retry
│ ├── db.py # Direct SQLite access (shares brain.db)
│ ├── gate.py # Compiler + contract checks
│ └── prompt.py # Agent prompt templates
├── benchmark.mjs # SQLite layer benchmark (1000 iterations)
├── benchmark-mcp.mjs # MCP tool layer benchmark (30 calls per tool)
├── setup-hermes.sh # Full installer
└── pyproject.toml # Python package config