Self-evolving vector memory + agent fleet management for Claude Code
The first system where Claude Code agents write, evaluate, and reconcile their own agents.
Installation • How It Works • Skill Discovery • Hook Reference • Fleet Management • Memory Hygiene • MCP Resources • Architecture • Safety
cortex gives Claude Code persistent memory across sessions and self-managing agents that improve over time.
- Memories are stored as vector embeddings in ChromaDB and silently injected into Claude's context
- Agents are automatically created from accumulated knowledge, evaluated on usage, and retired when obsolete
- Skills are auto-discovered from your project's tech stack and generated as slash commands with best practices baked in
- 10 hook events cover the entire session lifecycle — from startup to shutdown
- Multi-project aware — memories are scoped per project, agents and skills exist at project and global levels
🧠 85 memories (39 project, 17 feedback, 18 reference, 1 user) across 4 projects
🤖 10 agents (5 project + 5 global) | 5 spawns today | health 4.2/5
📚 3 skills (2 project + 1 global)
⚡ loaded 31 memories (session start)
| Requirement | Version | Check |
|---|---|---|
| Claude Code | v2.1.9+ | claude --version |
| Python | 3.8+ | python3 --version |
| pip | any | pip --version |
git clone https://github.com/digin1/cortex.git ~/.claude/skills/cortexpip install chromadbChromaDB will also install onnxruntime (for embeddings) and numpy. No GPU required — CPU embeddings are fast enough.
python3 -c "
import chromadb
client = chromadb.PersistentClient(path='$HOME/.claude/cortex-db')
col = client.get_or_create_collection('claude_memories')
print(f'Database initialized: {col.count()} memories')
"Add to your ~/.claude/.mcp.json (create if it doesn't exist):
{
"mcpServers": {
"cortex": {
"type": "stdio",
"command": "python3",
"args": ["-W", "ignore", "/home/YOUR_USERNAME/.claude/skills/cortex/mcp_server.py"]
}
}
}Replace YOUR_USERNAME with your actual username.
Add the following to your ~/.claude/settings.json (merge with any existing settings):
Click to expand full settings.json configuration
{
"permissions": {
"allow": [
"mcp__cortex__memory_store",
"mcp__cortex__memory_search",
"mcp__cortex__memory_list",
"mcp__cortex__memory_delete",
"mcp__cortex__memory_update",
"mcp__cortex__memory_stats"
]
},
"statusLine": {
"type": "command",
"command": "bash ~/.claude/skills/cortex/statusline.sh 2>/dev/null",
"padding": 0
},
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/recall.sh 2>/dev/null",
"statusMessage": "Recalling relevant memories..."
}
]
}
],
"PreToolUse": [
{
"matcher": "mcp__cortex",
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/cortex_pretool_enrich.sh 2>/dev/null",
"statusMessage": "Enriching cortex operation..."
}
]
}
],
"PostToolUse": [
{
"matcher": "Agent",
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/agent_track.sh 2>/dev/null",
"statusMessage": "Tracking agent usage..."
}
]
}
],
"PreCompact": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/compact_save.sh 2>/dev/null",
"statusMessage": "Extracting learnings + managing agent fleet..."
}
]
}
],
"PostCompact": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/post_compact_save.sh 2>/dev/null",
"statusMessage": "Extracting knowledge from compressed context..."
}
]
}
],
"SubagentStart": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/agent_context_inject.sh 2>/dev/null",
"statusMessage": "Injecting cortex context into agent..."
}
]
}
],
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/cleanup.sh 2>/dev/null",
"statusMessage": "Cleaning stale memory snapshots...",
"async": true
}
]
},
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/agent_bootstrap.sh 2>/dev/null",
"statusMessage": "Bootstrapping agents from cortex...",
"async": true
}
]
},
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/memory_hygiene.sh 2>/dev/null",
"statusMessage": "Memory hygiene check...",
"async": true
}
]
},
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/skill_discover.sh 2>/dev/null",
"statusMessage": "Discovering project skills...",
"async": true
}
]
}
],
"SessionEnd": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/session_end_cleanup.sh 2>/dev/null",
"statusMessage": "Saving session summary..."
}
]
}
],
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/learn.sh",
"statusMessage": "Saving session learnings..."
}
]
},
{
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/skills/cortex/fleet_eval_stop.sh",
"statusMessage": "Evaluating agent fleet health..."
}
]
}
]
}
}These files teach Claude to always search cortex before saying "I don't remember" and to proactively store useful memories:
# Global CLAUDE.md — loaded into every session, every project
cp ~/.claude/skills/cortex/config/CLAUDE.md ~/.claude/CLAUDE.md
# Global rules — loaded on-demand for cortex-related behavior
mkdir -p ~/.claude/rules
cp ~/.claude/skills/cortex/config/cortex-memory.md ~/.claude/rules/cortex-memory.md
cp ~/.claude/skills/cortex/config/skill-discovery.md ~/.claude/rules/skill-discovery.mdWhy? Without these, Claude may not search cortex when you ask "do you remember X" — the recall hook injects context automatically, but if it misses something, Claude needs to know it should search manually. These files close that gap.
# Run the test suite
bash ~/.claude/skills/cortex/test.sh
# Check status line
bash ~/.claude/skills/cortex/statusline.shThen restart Claude Code. You should see the status line at the bottom and memories will start accumulating automatically.
bash ~/.claude/skills/cortex/install.shEvery session follows this automatic flow:
Session Start
├── cleanup.sh → prune stale data (async)
├── agent_bootstrap.sh → create agents from cortex knowledge (async, daily)
├── memory_hygiene.sh → dedup, validate paths, consolidate (async, daily)
└── skill_discover.sh → detect tech stack + generate skill commands (async, weekly)
Every Message
└── recall.sh → inject relevant memories into Claude's context
First message: comprehensive project load (all memories)
Subsequent: targeted semantic search
Agent Spawned
├── agent_context_inject.sh → inject domain memories into the agent
└── agent_track.sh → log spawn to usage ledger
Context Compressed
├── compact_save.sh → extract memories from transcript + create/evaluate agents
└── post_compact_save.sh → extract key insights from the compressed summary
Session Paused
├── learn.sh → prompt Claude to save learnings
└── fleet_eval_stop.sh → check agent fleet health (if 5+ spawns today)
Session Ended
└── session_end_cleanup.sh → save session summary, clean temp files
Every cortex Tool Call
└── cortex_pretool_enrich.sh → auto-tag project from cwd, log to audit trail
On the first message of every session, cortex detects your project from cwd and loads all relevant memories — not just semantic matches:
| Category | What's loaded | Why |
|---|---|---|
| User profile | Always | Who you are, how you work |
| Feedback | Always (all projects) | Cross-project rules and preferences |
| Project context | Matching project + global | Architecture, decisions, gotchas |
| References | All (always included) | File paths, commands, endpoints |
| Semantic matches | From your first prompt | Cross-project hits |
Subsequent messages use targeted semantic search with relaxed thresholds. "Remember" queries (containing keywords like "recall", "do you know", "last time") get boosted thresholds (0.75-0.85 vs 0.6-0.7) and more results (12 vs 8).
First-message content uses progressive disclosure — summaries are truncated to 250 chars to save tokens. Claude can fetch full content via mcp__cortex__memory_search when needed.
Memories are injected via Claude Code's additionalContext API — Claude sees them, you don't:
{
"suppressOutput": true,
"hookSpecificOutput": {
"hookEventName": "UserPromptSubmit",
"additionalContext": "[cortex] Session context loaded for project: my-project\n..."
}
}When any subagent spawns, the SubagentStart hook queries cortex for memories relevant to that agent type and injects them. A dask-optimizer agent automatically gets Dask-related memories; a swarm-deployer gets deployment knowledge.
cortex automatically detects your project's tech stack and generates slash-command skill files (.md) with framework-specific best practices. Two paths:
On every session start, skill_discover.sh runs asynchronously:
- Detect —
skill_detect.pyscans for project markers (package.json,pyproject.toml,go.mod,Cargo.toml, etc.) and identifies 50+ frameworks across Node.js, Python, Go, Rust, Ruby, Java, and infrastructure tools - Check — skips if skills already exist for detected frameworks, or if cooldown hasn't expired (weekly per project)
- Generate — calls
claude -p --model sonnetto produce skill definitions with framework best practices - Write —
skill_create.pywrites.mdcommand files to.claude/commands/(project) or~/.claude/commands/(global) - Track — stores discovery record in cortex to avoid regeneration
Session starts in a FastAPI + SQLAlchemy project
→ skill_detect.py: "fastapi, sqlalchemy detected"
→ cooldown check: first time this week ✓
→ existing skills check: no fastapi-* commands found ✓
→ claude -p generates: fastapi-endpoint.md, fastapi-test.md, fastapi-model.md
→ skill_create.py writes to .claude/commands/
→ /fastapi-endpoint, /fastapi-test, /fastapi-model now available
For deeper, web-researched skills:
/cortex discover
Unlike the automatic path (which uses LLM knowledge only), the manual command uses WebSearch to research current best practices online and reads your actual project code for project-specific conventions. Use this for:
- Newer or less common frameworks
- More detailed, up-to-date skills
- Bypassing the weekly cooldown
| Ecosystem | Frameworks |
|---|---|
| Node.js | Next.js, React, Vue, Angular, Express, Svelte, Nuxt, Astro, Remix, NestJS, Gatsby, Prisma, Drizzle, tRPC, Playwright, Jest, Vitest, Cypress |
| Python | FastAPI, Django, Flask, Celery, SQLAlchemy, Pydantic, Pandas, PyTorch, TensorFlow, Streamlit, Gradio, LangChain, Dask, Scrapy, Pytest |
| Go | Gin, Fiber, Echo, Gorilla Mux, GORM, pgx, Cobra, gRPC |
| Rust | Actix, Axum, Rocket, Tokio, Serde, Diesel, SQLx, Clap, Tonic |
| Ruby | Rails, Sinatra, RSpec, Sidekiq |
| Java | Spring Boot, Quarkus, Micronaut |
| Infra | Docker, Docker Compose, Kubernetes, Helm, Terraform, GitHub Actions, GitLab CI, Ansible, Serverless, Vercel, Netlify, Fly.io |
| Limit | Value |
|---|---|
| Max project skills | 10 |
| Max global skills | 10 |
| Skills per discovery run | 5 |
| Cooldown | Weekly per project (automatic), none (manual) |
| Overwrite protection | Never overwrites existing skill files |
| Hook Event | Script | Sync/Async | What It Does |
|---|---|---|---|
UserPromptSubmit |
recall.sh |
Sync | Inject memories into Claude's context |
PreToolUse |
cortex_pretool_enrich.sh |
Sync | Auto-tag project + audit log for cortex ops |
PostToolUse(Agent) |
agent_track.sh |
Sync | Track agent spawns in usage ledger |
SubagentStart |
agent_context_inject.sh |
Sync | Inject domain memories into spawned agents |
PreCompact |
compact_save.sh |
Sync | Extract memories + create/evaluate agents |
PostCompact |
post_compact_save.sh |
Sync | Extract insights from compressed summary |
SessionStart |
cleanup.sh |
Async | Prune stale data |
SessionStart |
agent_bootstrap.sh |
Async | Bootstrap agents from cortex (daily) |
SessionStart |
memory_hygiene.sh |
Async | Dedup, path validation, consolidation (daily) |
SessionStart |
skill_discover.sh |
Async | Auto-detect tech stack + generate skill commands (weekly) |
SessionEnd |
session_end_cleanup.sh |
Sync | Save session summary + cleanup |
Stop |
learn.sh |
Sync | Prompt Claude to save learnings |
Stop |
fleet_eval_stop.sh |
Sync | Lightweight fleet health check |
Agents are created from two sources:
-
Bootstrap (
SessionStart): Queries cortex for accumulated knowledge. If a project has 3+ memories but fewer than 2 agents, it usesclaude -p --model haikuto propose new agents. Runs once per project per day. -
Compact (
PreCompact): Analyzes the conversation transcript for recurring patterns. Creates 0-5 agents per compaction event with semantic dedup (cosine < 0.55).
On every PreCompact, the system:
- Scores each agent 1-5 based on relevance, quality, and usage data
- Updates agents with stale instructions (creates
.bakbackup first) - Retires low-scoring agents (extracts knowledge to cortex, moves to
.retired/)
All agents have the memory: field in their frontmatter:
---
name: dask-optimizer
memory: project # project agents use project scope
---
---
name: data-safety-ops
memory: user # global agents use user scope
---This means agents accumulate knowledge across sessions in their own MEMORY.md files.
| Metric | How it's tracked |
|---|---|
| Usage count | JSONL ledger (~/.claude/agent-usage.jsonl) |
| Eval scores | Stored in cortex as agent_eval type memories |
| Health status | Based on score + 7-day usage |
| Fleet dashboard | Run /cortex agents for full report |
| Limit | Value | Purpose |
|---|---|---|
| Max project agents | 5 | Prevent scope creep |
| Max global agents | 5 | Prevent scope creep |
| Semantic dedup threshold | 0.55 | Prevent duplicate agents |
| Bootstrap per day | 1 per project | Prevent repeated creation |
| Agents per batch | 3 (bootstrap), 5 (compact) | Conservative creation |
The hygiene system runs daily on SessionStart (async) with 4 phases:
Finds memory pairs with cosine distance < 0.35 within the same type. Keeps the longer/more detailed version, merges tags, records merged_from metadata.
Extracts file paths from memory content and checks if they still exist on the filesystem. Flags broken paths with stale_path tag. Skips NAS mounts (/remote/) and container paths (/app/). Never deletes — only flags.
Reads the recall log (~/.claude/.cortex_recall_log) and updates each memory's last_recalled timestamp and recall_count. This data informs future hygiene decisions.
For groups of 3+ related memories in the same project, uses claude -p --model haiku to merge them into 1 comprehensive memory. Only runs when 15+ total memories exist. Originals are deleted only after successful consolidation.
The MCP server exposes resources that can be referenced with @ in the Claude Code prompt:
| Resource URI | What It Returns |
|---|---|
memory://all |
All memories with type, project, and content preview |
memory://{memory_id} |
Full content and metadata for a specific memory |
memory://project/{name} |
All memories scoped to a project |
memory://type/{type} |
All memories of a specific type (feedback, project, etc.) |
| Command | Description |
|---|---|
/cortex store <content> |
Store a new memory |
/cortex search <query> |
Semantic search across all memories |
/cortex list |
List all memories |
/cortex stats |
Database statistics |
/cortex delete <id> |
Delete a memory (archived to audit log) |
/cortex update <id> |
Update a memory |
/cortex agents |
Fleet health dashboard |
/cortex discover |
Auto-detect tech stack + generate skill commands with web research |
/cortex learn |
Review session and extract learnings to memory |
| Type | What | Example |
|---|---|---|
user |
User profile and role | "Senior engineer, prefers terse responses" |
feedback |
User corrections and preferences | "Always use step=any on auto-calculated number inputs" |
project |
Technical decisions and architecture | "Docker Swarm --force reuses cached image, must use --image" |
reference |
File paths, URLs, commands | "GitLab API at git.example.com, token in config" |
~/.claude/
├── skills/cortex/
│ ├── recall.sh # UserPromptSubmit: project-aware context injection
│ ├── cortex_pretool_enrich.sh # PreToolUse: auto-tag project + audit
│ ├── agent_context_inject.sh # SubagentStart: inject domain memories into agents
│ ├── compact_save.sh # PreCompact: extract memories + fleet management
│ ├── post_compact_save.sh # PostCompact: extract from compressed summary
│ ├── agent_bootstrap.sh # SessionStart: create agents from cortex knowledge
│ ├── memory_hygiene.sh # SessionStart: dedup, validate, consolidate
│ ├── cleanup.sh # SessionStart: prune stale data
│ ├── skill_discover.sh # SessionStart: auto-detect tech stack + generate skills
│ ├── learn.sh # Stop: prompt to save learnings
│ ├── fleet_eval_stop.sh # Stop: lightweight fleet health check
│ ├── session_end_cleanup.sh # SessionEnd: save summary + cleanup
│ ├── agent_track.sh # PostToolUse(Agent): log spawns
│ ├── agent_dashboard.py # /cortex agents command
│ ├── statusline.sh # Multi-line status bar
│ ├── mcp_server.py # MCP server: 6 tools + 4 resources
│ ├── memory_db.py # ChromaDB CLI wrapper
│ ├── SKILL.md # Skill definition for /cortex
│ ├── test.sh # Test suite
│ ├── config/
│ │ ├── CLAUDE.md # Global behavioral rules → ~/.claude/CLAUDE.md
│ │ ├── cortex-memory.md # Memory rules → ~/.claude/rules/cortex-memory.md
│ │ └── skill-discovery.md # Skill discovery rules → ~/.claude/rules/skill-discovery.md
│ ├── .retired/ # Retired agents (outside agents/ to avoid Claude discovery)
│ └── lib/
│ ├── parse_transcript.py # Transcript JSONL parser
│ ├── store_memories.py # Memory storage with dedup
│ ├── memory_hygiene.py # Dedup, path validation, consolidation
│ ├── collect_memories_full.py # Full memory collector (for bootstrap)
│ ├── fleet_create.py # Agent creation with semantic dedup + hard caps
│ ├── fleet_eval.py # Agent evaluation, update, retire
│ ├── collect_agents.py # Agent inventory collector
│ ├── collect_usage.py # Usage ledger reader
│ ├── collect_memories.py # ChromaDB memory reader
│ ├── skill_detect.py # Tech stack detector (50+ frameworks)
│ └── skill_create.py # Skill .md file writer with safety caps
├── cortex-db/ # ChromaDB persistent storage
├── agent-usage.jsonl # Agent spawn ledger
├── .cortex_activity # Live activity indicator
├── .cortex_audit.jsonl # Audit trail for all memory operations
├── .cortex_recall_log # Recall tracking for hygiene
├── .cortex_ops_log.jsonl # PreToolUse operation log
├── .cortex_sessions.jsonl # Session start/end markers
└── agents/
└── *.md # Active global agents (memory: user)
Project-level agents live at <project>/.claude/agents/*.md with memory: project.
| Protection | Implementation |
|---|---|
| Content size limit | Max 5000 chars per memory |
| Database cap | Max 200 total memories |
| Audit trail | All store/update/delete operations logged to .cortex_audit.jsonl |
| Soft delete | Deleted memory content archived to audit log before removal |
| Merge tracking | Merged memories tagged with merged_from ID |
| Consolidation tracking | Consolidated memories tagged with consolidated_from IDs |
| No age-based deletion | Memories never removed for being old (designed for long-running projects) |
| Path validation | Broken file paths flagged, never auto-deleted |
| Agent path traversal | os.path.realpath() + directory whitelisting |
| Agent hard caps | Max 5 project + 5 global agents |
| Agent filename sanitization | Only [a-z0-9\-_.] allowed |
| Semantic dedup | 0.55 threshold for agents, 0.15 for memories, 0.35 for hygiene merges |
| Backup before update | Timestamped .bak files before agent overwrites |
| Soft retire | Agents moved to cortex .retired/ (outside agents/ to avoid Claude Code discovery) |
| Skill hard caps | Max 10 project + 10 global skills, 5 per discovery run |
| Skill overwrite protection | Never overwrites existing skill files |
| Daily cooldowns | Bootstrap + hygiene run max once per project per day |
| Weekly cooldowns | Skill discovery runs max once per project per week |
| Operation logging | Every cortex tool call logged via PreToolUse hook |
| Auto project tagging | PreToolUse enriches memory_store with project from cwd |
| Requirement | Version | Notes |
|---|---|---|
| Claude Code | v2.1.9+ | Needs additionalContext + SubagentStart hook support |
| Python | 3.8+ | For ChromaDB and hook scripts |
| chromadb | Latest | pip install chromadb |
| claude CLI | Latest | For claude -p in compact/bootstrap/hygiene |
| Platform | Status | Notes |
|---|---|---|
| Linux | Fully supported | Primary development platform |
| macOS | Fully supported | stat differences handled |
| Windows | Via Git Bash / WSL | Requires bash environment |
Contributions welcome! Areas of interest:
- Recall caching — daemon process to avoid ChromaDB cold-start on every prompt
- Cross-project federation — share agents between projects intelligently
- Fleet analytics — usage trends, score degradation alerts, visualizations
- Hybrid search — combine vector + keyword (BM25/FTS5) for better recall
- Plugin packaging — convert to official Claude Code plugin format
Built with care by @digin1
If this saves you context, give it a star