cmyui/recall

recall

Semantic memory system for Claude Code. Stores, retrieves, and consolidates personal knowledge using hybrid search (BM25 + dense embeddings) and LLM-powered fact extraction.

How it works

Files → Chunker → LLM Extraction → Embeddings → SQLite
                                                    ↓
                              Query → Hybrid Search (BM25 + Dense → RRF)
                                                    ↓
                                              Ranked Results

Dream Cycle:
All Memories → Entity Extraction → Peer Card Synthesis → Store

  1. Ingest — reads files (markdown, JSON, PDF, mbox, docx, csv), sends content to an LLM to extract self-contained facts, embeds them, and stores them in SQLite.
  2. Query — embeds the query, runs BM25 keyword search and dense cosine similarity, then fuses the rankings via Reciprocal Rank Fusion (RRF).
  3. Dream — scans all memories to identify people, builds a biographical peer card per entity, and stores the cards as searchable memories. Its own prior output is excluded from the scan to prevent feedback loops.
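The fusion step in Query can be illustrated with a minimal RRF sketch. This is not the project's actual code; the k=60 smoothing constant is the conventional default from the RRF literature and an assumption here:

```python
def rrf_fuse(bm25_ranked, dense_ranked, k=60):
    """Fuse two ranked lists of doc ids via Reciprocal Rank Fusion.

    Each document scores 1/(k + rank) per list it appears in;
    documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in (bm25_ranked, dense_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, a document ranked second by BM25 and first by dense search outscores one ranked first by BM25 alone, which is why RRF needs no score normalization across the two retrievers.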

Setup

pip install -e .

Requires either:

  • An OpenAI API key (OPENAI_API_KEY) for GPT-based extraction (default)
  • The Claude CLI, installed and authenticated, with the --use-claude-cli flag for Opus-based extraction

Usage

Ingest files

# Basic ingestion (uses GPT-4.1-nano by default)
recall ingest ~/documents/

# With Claude Opus (higher quality, uses your CLI subscription)
recall --use-claude-cli ingest ~/documents/

# Parallel batch extraction within each file (4 concurrent LLM calls)
recall --use-claude-cli ingest ~/documents/ -j 4

# Process smallest files first (checkpoint quickly)
recall --use-claude-cli ingest ~/documents/ -j 4 --smallest-first

# Skip certain directories
recall --use-claude-cli ingest ~/documents/ --exclude standups --exclude drafts

# With context hints for the LLM
recall ingest ~/discord-exports/ --hint "Discord DMs involving Alice Johnson (ally_j)"

# Override extraction model
recall ingest ~/notes/ --model gpt-4.1-mini

Remember facts directly

# AI-authored (default, e.g. called by Claude Code)
recall remember "Alice prefers sushi over pizza"

# Human-authored
recall remember --human "My bank account number is 12345"

Auto-prepends today's date if no [YYYY-MM-DD] prefix is present.

Search

recall query "who is Bob?"
recall query "production migration practices" --top-k 10
recall query "Alice's birthday" --min-confidence 0.5

Forget

recall forget --id 42
recall forget --source "old-file.md"

Dream cycle

recall dream              # Build/rebuild peer cards
recall dream --dry-run    # Preview without writing

Other

recall stats              # Show memory count, DB size, sources
recall serve              # Start web UI on localhost:8765

Architecture

| Module | Purpose |
| --- | --- |
| cli.py | CLI entry point, argument parsing, ingestion orchestration |
| llm.py | LLM backend abstraction (OpenAI API or Claude CLI subprocess) |
| extractor.py | Fact extraction prompts and batching |
| embeddings.py | Sentence-transformer embedding (BAAI/bge-small-en-v1.5) |
| retrieval.py | Hybrid BM25 + dense search with RRF fusion |
| store.py | SQLite storage, schema, migrations, CRUD |
| chunker.py | File discovery, format parsing, fact date extraction |
| dreamer.py | Dream cycle — entity extraction and peer card generation |
| parsers.py | Email body extraction helper |
| web/app.py | FastAPI web UI for browsing/editing memories |

Design details

LLM backends

| Backend | Flag | Model | Batch size | Notes |
| --- | --- | --- | --- | --- |
| OpenAI API | (default) | GPT-4.1-nano | 100K chars | Requires OPENAI_API_KEY |
| Claude CLI | --use-claude-cli | Opus 4.6 | 500K chars | Uses your CLI subscription, --effort max |

The Claude CLI backend produces significantly higher-quality extractions — better date accuracy, fewer duplicates, atomic self-contained facts, and preserved perspectives from all conversation participants.

Source types

Every memory is tagged with a source_type indicating its provenance:

| Type | Origin | Trust level |
| --- | --- | --- |
| ingest | File ingestion via recall ingest | High — primary source data |
| remember:human | recall remember --human "..." | High — deliberate human input |
| remember:ai | recall remember "..." (default) | Medium — AI interpretation |
| dream | Dream cycle peer card generation | Low — derived/synthesized |

The dream cycle excludes source_type='dream' from its input to prevent hallucination feedback loops.

Fact dates

Facts are prefixed with dates during extraction; these prefixes are parsed into the fact_date column as structured metadata for temporal queries. Supported formats:

  • [2024-01-15] — specific day
  • [2024-01] — month
  • [2024] — year
  • [2024-Q2] — quarter
  • [2025-H2] — half-year
  • [2024-01 to 2024-03] — date range
  • [2023-11-11 to 2023-11-13] — day range

Facts without parseable dates (e.g. [Undated], category-only tags) get fact_date=NULL.

Ingestion behavior

  • Files are hashed and tracked in ingest_log — unchanged files are skipped on re-runs
  • Files that produce zero facts are also recorded, avoiding redundant LLM calls on retry
  • Large files are split into pages at line boundaries, respecting the backend's batch size limit
  • With -j N, batches within each file are extracted in parallel (N concurrent LLM calls), while files are processed sequentially for reliable checkpointing
  • --smallest-first processes files in ascending size order, maximizing early checkpoints when token budgets are limited
  • SQLite WAL mode is enabled for safe concurrent access across processes
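The hash-and-skip check from the first two bullets can be sketched against the ingest_log table (a minimal sketch; the function shape is an assumption, only the table columns come from the schema):

```python
import hashlib
import sqlite3
from pathlib import Path

def needs_ingest(conn: sqlite3.Connection, path: Path, rel: str) -> bool:
    """True if a file is new or its content hash differs from ingest_log.

    Because zero-fact files are also logged (with chunk_count=0), they
    return False here too, avoiding redundant LLM calls on retry.
    """
    file_hash = hashlib.sha256(path.read_bytes()).hexdigest()
    row = conn.execute(
        "SELECT file_hash FROM ingest_log WHERE source_file = ?", (rel,)
    ).fetchone()
    return row is None or row[0] != file_hash
```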

Schema

chunks (
    id INTEGER PRIMARY KEY,
    content TEXT,           -- the extracted fact
    source_file TEXT,       -- e.g. "identity.md" or "peer-card/Bob"
    section_path TEXT,      -- hierarchical context
    chunk_hash TEXT UNIQUE, -- SHA256(source_file + content)
    embedding BLOB,         -- float32 vector (384 dims)
    created_at REAL,        -- ingestion timestamp
    source_type TEXT,       -- 'ingest', 'remember:human', 'remember:ai', 'dream'
    fact_date TEXT          -- structured date, nullable (see formats above)
)

ingest_log (
    source_file TEXT PRIMARY KEY,  -- relative path from ingest directory
    file_hash TEXT,                -- SHA256 of file content (skip if unchanged)
    ingested_at REAL,              -- timestamp
    chunk_count INTEGER            -- 0 for files with no extractable facts
)
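The embedding column holds 384 float32 values in a BLOB. One plausible stdlib serialization looks like this (the project's actual encoding isn't specified here, so treat this as an assumption):

```python
import struct

DIMS = 384  # must match the embedding model (bge-small-en-v1.5)

def pack_embedding(vec: list[float]) -> bytes:
    """Serialize a 384-dim float32 vector for the chunks.embedding BLOB."""
    assert len(vec) == DIMS
    return struct.pack(f"<{DIMS}f", *vec)  # little-endian float32

def unpack_embedding(blob: bytes) -> list[float]:
    """Inverse of pack_embedding."""
    return list(struct.unpack(f"<{DIMS}f", blob))
```

Each vector occupies 384 × 4 = 1536 bytes, which is why the schema can keep embeddings inline in SQLite rather than in a separate vector store.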

Supported formats

  • Markdown (.md) — raw text
  • JSON (.json) — auto-detects Discord and Slack export formats
  • PDF (.pdf) — text extraction via PyMuPDF
  • Email (.mbox) — filters automated/calendar emails, sorts by substance
  • Word (.docx) — XML paragraph extraction
  • CSV (.csv) — raw text

Configuration

Settings via pydantic-settings in recall/settings.py:

| Setting | Default | Description |
| --- | --- | --- |
| db_path | ~/.local/share/recall/memory.db | SQLite database location |
| embedding_model | BAAI/bge-small-en-v1.5 | HuggingFace embedding model |
| embedding_dimensions | 384 | Must match the embedding model |
| debug | True | Verbose output |
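Given pydantic-settings, the table plausibly corresponds to a class along these lines (field names and defaults from the table; the class name and everything else are assumptions, not the actual contents of recall/settings.py):

```python
from pathlib import Path
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    """Sketch of recall/settings.py reconstructed from the table above."""
    db_path: Path = Path.home() / ".local/share/recall/memory.db"
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    embedding_dimensions: int = 384  # must match the embedding model
    debug: bool = True
```

With pydantic-settings, each field can also be overridden by a matching environment variable, which fits the key-value layout of the table.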

Tests

pip install -e ".[dev]"
pytest tests/ -v
