Skip to content

Orellius/cortex-rs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Persistent semantic memory, task routing, cost tracking, and wikilinked notes - all as a native MCP server for Claude Code.

cortex-rs

version license mcp rust

LLM CLI Companion: persistent semantic memory, task routing, cost stats with routing savings, and wikilinked notes as a native MCP server for Claude Code.

Works in Claude Code Desktop, CLI terminal, claude.ai/code web app, and IDE extensions. One installation, one SQLite file, everywhere.

What it does

Pillar Feature
Memory Semantic (all-MiniLM-L6-v2 embeddings) + FTS5 hybrid recall, hot/warm/cold tiers, Weibull decay
Router Rule-based task-to-model routing from cortex.yaml - typically 40-55% cost savings, tracked
Stats Anthropic + OpenAI org API or ~/.claude/ log scraper, with routing-savings calculator
Notes Wikilinked markdown notes with backlink graph ([[slug]] syntax)
Pattern Miner N-gram tool call mining for speculative pre-fetch (PASTE paper)

Note

The semantic memory layer uses fastembed-rs (ONNX all-MiniLM-L6-v2 model). On first use, the model downloads automatically (~90 MB) into ~/.cache/fastembed/. You can disable embeddings with --no-default-features to use FTS5-only recall.

Important

Cost stats require either (1) Anthropic Org plan admin key for the provider API path, or (2) local log scraper reading ~/.claude/ session files. The log scraper is a fallback: it reads your stored session JSONL but may miss data if sessions are archived. For complete stats, configure an admin key.

Install

Build from source

git clone https://github.com/orellius/cortex-rs.git
cd cortex-rs
cargo install --path crates/relay-cli

Then copy the config template and edit your role-to-model mapping:

mkdir -p ~/.cortex
cp cortex.yaml ~/.cortex/cortex.yaml
# Edit ~/.cortex/cortex.yaml with your roles and pricing

As an MCP server in Claude Code

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "cortex": {
      "command": "cortex",
      "args": ["mcp"]
    }
  }
}

Tip

The MCP wiring assumes cortex is in your PATH (from cargo install). If you prefer a local binary or custom path, use an absolute path: "/Users/you/Desktop/Projects/_OSS/cortex-rs/target/release/cortex".

Quickstart

Memory: store and recall facts

cortex remember "prefer named exports in TypeScript"
cortex remember "Redux Thunk is middleware, not reducer logic" --pin
cortex recall "exports"                    # hybrid FTS5 + semantic match
cortex recall "Redux" --limit 5

Routing: check which model to use for a task

cortex route "write readme for authentication module"
cortex route "audit security in the login flow"
cortex route "refactor this reducer for perf"

Output includes the recommended role, model, and reasoning.

Stats: track spending and routing savings

cortex stats                               # last 7 days
cortex stats --days 30                     # month view

Shows cost breakdown by model, token counts, and estimated savings from routing (vs the baseline model).

Notes: wikilinked markdown

cortex notes new auth-design
cortex notes write auth-design "# Auth System\n\nUses [[JWT]] tokens..."
cortex notes read auth-design
cortex notes search "JWT"
cortex notes graph auth-design              # ASCII backlink tree

Background jobs

cortex daemon jobs                         # decay hot memory, mine tool patterns
cortex daemon embed                        # backfill embeddings (v0.1 -> v0.2 upgrade)
cortex mcp                                 # start MCP server for Claude Code

MCP tools (11 total)

All five pillars are exposed as MCP tools that Claude Code calls natively:

  • memory_hot_context - returns up to 20 hot-tier memories for session context
  • memory_recall - search memories by keyword + semantic similarity
  • memory_write - store a new memory
  • route_task - classify a task and recommend a model/role
  • trace_record - log a tool call for pattern mining
  • pattern_stats - fetch mined N-gram statistics
  • stats_summary - fetch rolling 7/30-day cost and savings
  • note_read - fetch a note and its backlinks
  • note_write - create or update a note
  • note_search - full-text search notes
  • note_backlinks - fetch notes that link to a given slug

Crates

Each pillar is published independently on crates.io so you can embed just the pieces you need.

Crate License Description
cortex-rs-core MIT DB schema, migrations, shared types
cortex-rs-memory MIT Tiered memory engine: FTS5 + semantic embeddings + Weibull decay
cortex-rs-notes MIT Wikilinked markdown notes with backlink graph
cortex-rs-router MIT Task classifier and model role router
cortex-rs-stats MIT Usage + cost dashboard (provider API or local log scraper)
cortex-rs-miner MIT Tool call trace recorder and N-gram pattern miner
cortex-rs-mcp AGPL-3.0 MCP stdio server wiring all pillars
cortex-rs-daemon AGPL-3.0 Background jobs: decay, pattern mining
cortex-rs-cli AGPL-3.0 cortex CLI binary

MIT crates have no usage restrictions — embed them in your own projects freely. AGPL crates cover the assembled product; commercial forks must stay open.

Architecture

cortex-rs/
  crates/
    relay-core/       DB schema, migrations, shared types        (cortex-rs-core)
    relay-memory/     FTS5 + decay tiered memory engine          (cortex-rs-memory)
    relay-notes/      Wikilink graph + markdown CRUD             (cortex-rs-notes)
    relay-stats/      Provider API + local log scraper           (cortex-rs-stats)
    relay-router/     YAML role config + rule-based classifier   (cortex-rs-router)
    relay-miner/      Tool trace N-gram pattern mining           (cortex-rs-miner)
    relay-mcp/        JSON-RPC 2.0 MCP stdio server             (cortex-rs-mcp)
    relay-daemon/     Background jobs (decay, mining)            (cortex-rs-daemon)
    relay-cli/        clap CLI binary → `cortex`                (cortex-rs-cli)

Single SQLite file at ~/.cortex/memory.db. Zero external services.

See ARCHITECTURE.md for crate-by-crate details and data flow.

Configuration

cortex reads from ~/.cortex/cortex.yaml (or via --config flag). Configure:

  • Roles + models: Map task types to LLM models with costs
  • Provider API keys: Optional Anthropic/OpenAI org-plan keys for stats
  • Memory tiers: Hot/warm/cold cutoffs, decay rate, confidence thresholds
  • Notes directory: Where markdown note files live

See CONFIG.md for annotated example and all fields.

How it works

Memory: hybrid recall + decay

When you store a fact, cortex:

  1. Hashes it for conflict detection
  2. Generates an embedding via fastembed (all-MiniLM-L6-v2)
  3. Stores it as a hot-tier memory with recency timestamp
  4. Runs decay nightly: promotes/demotes between hot/warm/cold tiers based on access patterns (Weibull law) and confidence scores

When you recall, cortex queries both full-text (FTS5) and semantic similarity (cosine distance), merges results, and reranks by recency. Cold-tier memories are filtered by confidence threshold (paranoia setting).

Routing: cost-aware task classification

cortex reads your cortex.yaml role definitions and their keyword descriptions. When you ask it to route a task:

  1. Tokenizes the task description
  2. Scores it against each role's keywords
  3. Picks the role with the highest match
  4. Logs the decision (task excerpt, chosen model, baseline model) for savings analysis

The stats engine compares actual model usage against the baseline cost - showing you how much routing saved.

Stats: provider API or log scraper

If you have an Anthropic Org plan admin key:

  • cortex queries the provider usage API for token counts and costs

If not:

  • cortex scrapes ~/.claude/ session JSONL files (your local log of all Claude Code sessions)
  • Parses request + response token counts and infers costs from cortex.yaml pricing

Both paths emit a rolling dashboard of cost by model, cache stats, and routing savings.

Roadmap (v0.3)

  • BERT-ONNX model router (RouteLLM classifier) replacing rule-based keyword matching
  • Gemini provider stats (expand beyond Anthropic/OpenAI)
  • HNSW/ANN index for scaling beyond 100K memories
  • Speculative pre-fetch wiring (cortex-rs-miner → MCP tool integration)

Contributing

cortex-rs accepts contributions. Open an issue or PR describing your use case. By submitting a PR you agree your contribution is licensed under the affected crate's license (see LICENSING.md).

License

Dual-licensed. Pillar crates (cortex-rs-core, cortex-rs-memory, cortex-rs-notes, cortex-rs-router, cortex-rs-stats, cortex-rs-miner) are MIT so you can reuse them in your own projects. Architecture crates (cortex-rs-mcp, cortex-rs-daemon, cortex-rs-cli) are AGPL-3.0 so commercial forks of the assembled product must stay open.

See LICENSING.md, LICENSE (MIT), and LICENSE-AGPL (AGPL-3.0).

About

LLM CLI Companion — persistent semantic memory, task routing, and cost dashboards as a native MCP server for Claude Code (Desktop, CLI, web, IDE).

Topics

Resources

License

MIT, AGPL-3.0 licenses found

Licenses found

MIT
LICENSE
AGPL-3.0
LICENSE-AGPL

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages