cortex-rs

Persistent semantic memory, task routing, cost tracking, and wikilinked notes - all as a native MCP server for Claude Code.

cortex-rs

LLM CLI Companion: persistent semantic memory, task routing, cost stats with routing savings, and wikilinked notes as a native MCP server for Claude Code.

Works in Claude Code Desktop, CLI terminal, claude.ai/code web app, and IDE extensions. One installation, one SQLite file, everywhere.

What it does

Pillar	Feature
Memory	Semantic (all-MiniLM-L6-v2 embeddings) + FTS5 hybrid recall, hot/warm/cold tiers, Weibull decay
Router	Rule-based task-to-model routing from `cortex.yaml` - typically 40-55% cost savings, tracked
Stats	Anthropic + OpenAI org API or `~/.claude/` log scraper, with routing-savings calculator
Notes	Wikilinked markdown notes with backlink graph (`[[slug]]` syntax)
Pattern Miner	N-gram tool call mining for speculative pre-fetch (PASTE paper)

Note

The semantic memory layer uses fastembed-rs (ONNX all-MiniLM-L6-v2 model). On first use, the model downloads automatically (~90 MB) into ~/.cache/fastembed/. You can disable embeddings with --no-default-features to use FTS5-only recall.

Important

Cost stats require either (1) Anthropic Org plan admin key for the provider API path, or (2) local log scraper reading ~/.claude/ session files. The log scraper is a fallback: it reads your stored session JSONL but may miss data if sessions are archived. For complete stats, configure an admin key.

Install

Build from source

git clone https://github.com/orellius/cortex-rs.git
cd cortex-rs
cargo install --path crates/relay-cli

Then copy the config template and edit your role-to-model mapping:

mkdir -p ~/.cortex
cp cortex.yaml ~/.cortex/cortex.yaml
# Edit ~/.cortex/cortex.yaml with your roles and pricing

As an MCP server in Claude Code

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "cortex": {
      "command": "cortex",
      "args": ["mcp"]
    }
  }
}

Tip

The MCP wiring assumes cortex is in your PATH (from cargo install). If you prefer a local binary or custom path, use an absolute path: "/Users/you/Desktop/Projects/_OSS/cortex-rs/target/release/cortex".

Quickstart

Memory: store and recall facts

cortex remember "prefer named exports in TypeScript"
cortex remember "Redux Thunk is middleware, not reducer logic" --pin
cortex recall "exports"                    # hybrid FTS5 + semantic match
cortex recall "Redux" --limit 5

Routing: check which model to use for a task

cortex route "write readme for authentication module"
cortex route "audit security in the login flow"
cortex route "refactor this reducer for perf"

Output includes the recommended role, model, and reasoning.

Stats: track spending and routing savings

cortex stats                               # last 7 days
cortex stats --days 30                     # month view

Shows cost breakdown by model, token counts, and estimated savings from routing (vs the baseline model).

Notes: wikilinked markdown

cortex notes new auth-design
cortex notes write auth-design "# Auth System\n\nUses [[JWT]] tokens..."
cortex notes read auth-design
cortex notes search "JWT"
cortex notes graph auth-design              # ASCII backlink tree

Background jobs

cortex daemon jobs                         # decay hot memory, mine tool patterns
cortex daemon embed                        # backfill embeddings (v0.1 -> v0.2 upgrade)
cortex mcp                                 # start MCP server for Claude Code

MCP tools (11 total)

All five pillars are exposed as MCP tools that Claude Code calls natively:

memory_hot_context - returns up to 20 hot-tier memories for session context
memory_recall - search memories by keyword + semantic similarity
memory_write - store a new memory
route_task - classify a task and recommend a model/role
trace_record - log a tool call for pattern mining
pattern_stats - fetch mined N-gram statistics
stats_summary - fetch rolling 7/30-day cost and savings
note_read - fetch a note and its backlinks
note_write - create or update a note
note_search - full-text search notes
note_backlinks - fetch notes that link to a given slug

Crates

Each pillar is published independently on crates.io so you can embed just the pieces you need.

Crate	License	Description
	MIT	DB schema, migrations, shared types
	MIT	Tiered memory engine: FTS5 + semantic embeddings + Weibull decay
	MIT	Wikilinked markdown notes with backlink graph
	MIT	Task classifier and model role router
	MIT	Usage + cost dashboard (provider API or local log scraper)
	MIT	Tool call trace recorder and N-gram pattern miner
	AGPL-3.0	MCP stdio server wiring all pillars
	AGPL-3.0	Background jobs: decay, pattern mining
	AGPL-3.0	`cortex` CLI binary

MIT crates have no usage restrictions — embed them in your own projects freely. AGPL crates cover the assembled product; commercial forks must stay open.

Architecture

cortex-rs/
  crates/
    relay-core/       DB schema, migrations, shared types        (cortex-rs-core)
    relay-memory/     FTS5 + decay tiered memory engine          (cortex-rs-memory)
    relay-notes/      Wikilink graph + markdown CRUD             (cortex-rs-notes)
    relay-stats/      Provider API + local log scraper           (cortex-rs-stats)
    relay-router/     YAML role config + rule-based classifier   (cortex-rs-router)
    relay-miner/      Tool trace N-gram pattern mining           (cortex-rs-miner)
    relay-mcp/        JSON-RPC 2.0 MCP stdio server             (cortex-rs-mcp)
    relay-daemon/     Background jobs (decay, mining)            (cortex-rs-daemon)
    relay-cli/        clap CLI binary → `cortex`                (cortex-rs-cli)

Single SQLite file at ~/.cortex/memory.db. Zero external services.

See ARCHITECTURE.md for crate-by-crate details and data flow.

Configuration

cortex reads from ~/.cortex/cortex.yaml (or via --config flag). Configure:

Roles + models: Map task types to LLM models with costs
Provider API keys: Optional Anthropic/OpenAI org-plan keys for stats
Memory tiers: Hot/warm/cold cutoffs, decay rate, confidence thresholds
Notes directory: Where markdown note files live

See CONFIG.md for annotated example and all fields.

How it works

Memory: hybrid recall + decay

When you store a fact, cortex:

Hashes it for conflict detection
Generates an embedding via fastembed (all-MiniLM-L6-v2)
Stores it as a hot-tier memory with recency timestamp
Runs decay nightly: promotes/demotes between hot/warm/cold tiers based on access patterns (Weibull law) and confidence scores

When you recall, cortex queries both full-text (FTS5) and semantic similarity (cosine distance), merges results, and reranks by recency. Cold-tier memories are filtered by confidence threshold (paranoia setting).

Routing: cost-aware task classification

cortex reads your cortex.yaml role definitions and their keyword descriptions. When you ask it to route a task:

Tokenizes the task description
Scores it against each role's keywords
Picks the role with the highest match
Logs the decision (task excerpt, chosen model, baseline model) for savings analysis

The stats engine compares actual model usage against the baseline cost - showing you how much routing saved.

Stats: provider API or log scraper

If you have an Anthropic Org plan admin key:

cortex queries the provider usage API for token counts and costs

If not:

cortex scrapes ~/.claude/ session JSONL files (your local log of all Claude Code sessions)
Parses request + response token counts and infers costs from cortex.yaml pricing

Both paths emit a rolling dashboard of cost by model, cache stats, and routing savings.

Roadmap (v0.3)

BERT-ONNX model router (RouteLLM classifier) replacing rule-based keyword matching
Gemini provider stats (expand beyond Anthropic/OpenAI)
HNSW/ANN index for scaling beyond 100K memories
Speculative pre-fetch wiring (cortex-rs-miner → MCP tool integration)

Contributing

cortex-rs accepts contributions. Open an issue or PR describing your use case. By submitting a PR you agree your contribution is licensed under the affected crate's license (see LICENSING.md).

License

Dual-licensed. Pillar crates (cortex-rs-core, cortex-rs-memory, cortex-rs-notes, cortex-rs-router, cortex-rs-stats, cortex-rs-miner) are MIT so you can reuse them in your own projects. Architecture crates (cortex-rs-mcp, cortex-rs-daemon, cortex-rs-cli) are AGPL-3.0 so commercial forks of the assembled product must stay open.

See LICENSING.md, LICENSE (MIT), and LICENSE-AGPL (AGPL-3.0).

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
crates		crates
docs		docs
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
LICENSE-AGPL		LICENSE-AGPL
LICENSING.md		LICENSING.md
README.md		README.md
cortex.yaml		cortex.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cortex-rs

What it does

Install

Build from source

As an MCP server in Claude Code

Quickstart

Memory: store and recall facts

Routing: check which model to use for a task

Stats: track spending and routing savings

Notes: wikilinked markdown

Background jobs

MCP tools (11 total)

Crates

Architecture

Configuration

How it works

Memory: hybrid recall + decay

Routing: cost-aware task classification

Stats: provider API or log scraper

Roadmap (v0.3)

Contributing

License

About

Licenses found

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

cortex-rs

What it does

Install

Build from source

As an MCP server in Claude Code

Quickstart

Memory: store and recall facts

Routing: check which model to use for a task

Stats: track spending and routing savings

Notes: wikilinked markdown

Background jobs

MCP tools (11 total)

Crates

Architecture

Configuration

How it works

Memory: hybrid recall + decay

Routing: cost-aware task classification

Stats: provider API or log scraper

Roadmap (v0.3)

Contributing

License

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages