Local-first, cross-session memory for Claude Code, with redaction before embedding.
A knowledge layer that turns every past Claude Code session into searchable memory, so an answer, a procedure or a decision established in one session becomes retrievable from any other, without a single secret leaving your machine.
Every Claude Code session is a sealed bubble on disk:
~/.claude/projects/<project>/*.jsonl. Knowledge accumulates but stays trapped
per session and per project. A procedure you established once (for example: "how
to SSH into the production server") is invisible from every other session. You
end up hunting through old conversations to re-find an answer you already got.
cc-history adds the retrieval layer that was missing, and changes nothing about Claude Code itself.
Session A (weeks ago): "SSH to the server is: ssh deploy@host -i ~/.ssh/prod ..."
|
v indexed, secrets masked, stored locally
Session B (today): you: "connect to the server"
Claude calls recall() -> gets the procedure from Session A
The model gains two tools in every session and calls them on its own when you reference something established elsewhere:
recall(query, project?, top_k?): semantic search across your whole historyrecall_list_projects(): list indexed projects
No copy-paste between sessions. No "which conversation was that in". The memory follows you.
Most "AI memory" tools are SaaS: your transcripts, prompts and code get uploaded to a third-party server. cc-history is the opposite by design.
- Local-first. Parsing, redaction, embeddings and the vector store all run
on your machine. Nothing is uploaded. The only text that reaches an API is what
a
recallresult re-injects into your Claude context, which is inherent to using Claude Code at all, and even that text is already redacted. - Security by design. Secrets are masked before they are embedded, so a secret never exists in the index (neither as document text nor as a vector).
- Privacy by design. Default-deny allowlist, third-party data excluded, right-to-erasure with physical deletion (GDPR Article 17).
- Zero API cost to run. Embeddings are computed locally (BGE-M3). Indexing and recall cost nothing but local compute.
~/.claude/projects/**/*.jsonl (source: main sessions only, sidechains excluded)
|
| parser.py JSONL transcript -> Exchange (user msg + assistant text + tool actions)
| redact.py mask secrets and third-party PII BEFORE embedding
| embeddings.py BGE-M3, 1024-dim, normalized, local (MPS or CPU)
v
Chroma collection `claude_code_history` (dedicated store under ~/.cc-history)
|
| store.py read-only recall core (re-scans every returned doc, fail-closed)
v
cc-history-recall (CLI) + cc-history-mcp (MCP server, exposed to every session)
Requires Python 3.10+. Install straight from GitHub:
# full tool (embeddings + MCP server); pulls in torch via sentence-transformers
pip install "cc-history[all] @ git+https://github.com/Straska7/cc-history.git"
# or just the redaction library (pure Python, no torch)
pip install "git+https://github.com/Straska7/cc-history.git"Or from a clone (for development):
git clone https://github.com/Straska7/cc-history.git && cd cc-history
uv venv && uv pip install -e ".[all,dev]"Not on PyPI yet. Once published,
pip install "cc-history[all]"will work directly.
# 1. Choose what to index (default-deny). List your Claude Code projects:
ls ~/.claude/projects
# then copy the example config and set `policy: allow` on the ones you want:
mkdir -p ~/.cc-history
cp config/cc_history.example.yaml ~/.cc-history/config.yaml
$EDITOR ~/.cc-history/config.yaml
# 2. Index (incremental). CPU device keeps the GPU free for other workloads.
CC_HISTORY_EMBEDDING_DEVICE=cpu cc-history-ingest
cc-history-ingest --dry-run # show the plan, write nothing
cc-history-ingest --rebuild # from scratch
# 3. Recall from the CLI
cc-history-recall "how do I connect to the server over ssh"
cc-history-recall --list-projects
cc-history-recall --audit # zero-leak self-checkRegister the MCP server once, at user scope:
claude mcp add cc-history -s user -e CC_HISTORY_EMBEDDING_DEVICE=cpu -- \
cc-history-mcp
claude mcp get cc-history # verify
claude mcp remove cc-history -s userFrom then on, every new session has recall and recall_list_projects. The
server is read-only, lazy-loads the model (no RAM cost until the first recall),
and wraps every result in an anti-injection envelope with provenance.
This is the part that separates cc-history from a weekend script. The threat
model is explicit: because recall re-injects stored text into the model
context, which transits to the API, anything stored is potentially
exfiltrated. Therefore nothing sensitive is ever stored.
Detection and masking happen in redact.py before a single character is
embedded. Placeholders are typed ([REDACTED:<type>]) so audits can tell what
was caught.
Secrets and credentials masked (16 classes):
| Category | Types |
|---|---|
| Private keys | PEM private keys, SSH key material |
| Cloud / provider keys | AWS access key IDs, GCP API keys, Anthropic API keys, OpenAI API keys |
| VCS tokens | GitHub tokens, GitLab tokens |
| Chat / bot tokens | Slack tokens, Telegram bot tokens |
| Auth material | JWTs, Bearer tokens, Authorization headers |
| Inline secrets | export FOO=... env secrets, passwords, passwords in URLs |
Third-party PII masked (4 classes):
| Type | Approach (tuned for a code-heavy corpus) |
|---|---|
Regex + allowlist for the operator's own addresses (CC_HISTORY_OWN_EMAILS) |
|
| Phone | Two formats (international, national) plus a digit-count validator, so CVE IDs, dates, versions and ports never qualify |
@handle |
Prose-context only, with a blocklist of Python decorators and framework names, an allowlist of your own bots, and exclusion of function calls, npm scopes and attribute access |
| Name | Only in a labeled header position (Cc: / Contact:), with an allowlist for the operator |
Defense in depth: the recall path re-scans every returned document through the redactor before it leaves the store, and fails closed on any error, so even a stale or malformed index entry cannot leak.
- Default-deny allowlist: a project is invisible until explicitly allowed.
- Third-party exclusion: projects containing other people's regulated data
are
denyand never indexed. - Right to erasure (GDPR Article 17):
--purge-sessionand--purge-projectperform physical deletion with a store VACUUM and removal of backups, not a soft delete. - Auditability:
--auditre-scans the entire index for any residual leak.
See SECURITY.md for the full threat model.
Two complementary refresh paths, both local and incremental:
- On session end: a Claude Code
SessionEndhook that runscc-history-ingest --quiet, so a finished session enters memory within seconds. - Nightly safety net: a cron job (or launchd/systemd timer) running the same command once a day.
Example SessionEnd hook in ~/.claude/settings.json:
{
"hooks": {
"SessionEnd": [
{ "type": "command",
"command": "CC_HISTORY_EMBEDDING_DEVICE=cpu cc-history-ingest --quiet >/dev/null 2>&1 || true",
"async": true }
]
}
}cc-history-eval measures retrieval quality objectively against a fixed query
set (paraphrases plus off-topic negatives): recall@K, MRR, and negative
precision. The bundled cases are placeholders; point them at your own corpus.
On the author's own history (496 chunks) the same harness scored recall@5 = 95%, MRR = 0.887, with all off-topic negatives below the noise threshold. That evaluation is also the decision gate for adding a lexical (BM25) hybrid: it is deliberately not added, because dense retrieval already clears the bar.
Redaction is covered by unit tests including adversarial and anti-false-positive cases (Python decorators, npm scopes, CVE IDs, dates, versions, ports, UUIDs).
- Single machine, single user. There is no team sync, no multi-user auth, no RBAC. This indexes one developer's local history.
- Names in free-running prose are not redacted. Masking arbitrary names in
sentences requires NER, not regex. Labeled names (
Cc:,Contact:) and prose@handlesare covered; a bare "I spoke with Marie" is not. Precision is favored over recall here on purpose. - Tuned for a code-heavy corpus. The PII heuristics trade recall for precision to avoid masking code. A prose-heavy corpus would want different tuning.
- Assumes the Claude Code on-disk transcript format
(
~/.claude/projects/**/*.jsonl).
Everything lives under ~/.cc-history, each piece overridable by an env var:
| Variable | Default | Purpose |
|---|---|---|
CC_HISTORY_HOME |
~/.cc-history |
base directory |
CC_HISTORY_CHROMA_PATH |
<home>/chroma |
vector store |
CC_HISTORY_STATE |
<home>/state/ingest_state.json |
incremental-ingest state |
CC_HISTORY_CONFIG |
<home>/config.yaml |
allowlist config |
CLAUDE_PROJECTS_DIR |
~/.claude/projects |
source transcripts |
CC_HISTORY_EMBEDDING_DEVICE |
auto |
cpu, mps, cuda, or auto |
CC_HISTORY_EMBEDDING_MODEL |
BAAI/bge-m3 |
embedding model id |
CC_HISTORY_OWN_EMAILS |
(empty) | comma-separated own addresses to keep readable |
CC_HISTORY_OWN_NAMES |
(empty) | comma-separated own names to keep readable |
CC_HISTORY_OWN_HANDLES |
(empty) | comma-separated own @handles to keep readable |
cc_history/
parser.py JSONL -> Exchange -> prepared chunks
redact.py secret and PII detection and masking (the security barrier)
store.py read-only recall core (re-scan, fail-closed)
embeddings.py local BGE-M3 wrapper
mcp_server.py MCP server exposing recall to every session
config.py allowlist loader
paths.py central paths and constants
ingest.py incremental indexer, purge, backup (cc-history-ingest)
recall.py CLI recall, list, audit (cc-history-recall)
eval.py retrieval evaluation (cc-history-eval)
config/cc_history.example.yaml the allowlist template
tests/ redaction and parser tests
Apache-2.0. See LICENSE.