
persistent-memory

Persistent long-term memory for Claude Code. Free forever.

License: MIT · Python 3.10+ · MCP · $0/month · PRs welcome

Walks every Claude Code transcript in ~/.claude/projects/, embeds the text locally with sentence-transformers, stores the vectors in Pinecone's free tier, and exposes two tools to Claude over MCP:

  • search_memory(query, top_k=5) — semantic search across all past sessions
  • add_memory(content, tag) — manually save a memory

Claude can now remember every conversation you've ever had with it.

Demo

```
You:    "Do you remember what I told you about being in the army?"

Claude: [calls search_memory("army military service")]
        Yeah — you served in the 101st Airborne Division. Screaming Eagle
        patch, "Currahee" motto. We talked about that weeks ago when we
        were building the CTF capture-flag feature...
```

The recall is pulled from a conversation 27 days earlier, across a different project, with zero context loaded into the current session.

Why it's free

| Layer | Choice | Cost |
|---|---|---|
| Vector store | Pinecone serverless (free tier: 2 GB storage, 1M reads/mo) | $0 |
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` (local, on your machine) | $0 |
| Raw transcripts | Already on disk in `~/.claude/projects/` | $0 |
| MCP server | Local Python via stdio | $0 |
| Per-search context | ~1k–1.5k Anthropic tokens (5 chunks × ~250 tok) | tiny |

A typical "remember last conversation" search costs ~95% fewer tokens than loading a full memory file into context up front.
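As a rough sanity check of that claim (the full-file size below is an illustrative assumption, not a number measured from this repo):

```python
# Back-of-the-envelope math behind the ~95% figure.
chunks_returned = 5        # search_memory's default top_k
tokens_per_chunk = 250     # approximate, per the cost table above
search_cost = chunks_returned * tokens_per_chunk  # 1250 tokens per search

# Hypothetical alternative: loading a full memory file up front.
full_file_cost = 25_000    # assumed size, for illustration only

savings = 1 - search_cost / full_file_cost
print(f"{search_cost} tokens vs {full_file_cost}: {savings:.0%} fewer")
```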

Install

1. Clone the repo

```bash
git clone https://github.com/timastras9/persistent-memory.git ~/.claude/persistent-memory
cd ~/.claude/persistent-memory
```

2. Get a free Pinecone API key

Sign up at app.pinecone.io (no credit card required). Create a project and copy your API key from the dashboard.

3. Configure

```bash
cp .env.example .env
```

Edit .env and paste in your PINECONE_API_KEY. Defaults are fine for everything else.
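For reference, a minimal `.env` might look like this (only `PINECONE_API_KEY` is documented in this README; check `.env.example` for the full list of defaults):

```
# .env (never commit this file)
PINECONE_API_KEY=your-key-from-app.pinecone.io
```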

4. Install dependencies

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
chmod +x run_server.sh
```

5. Index your existing conversations

```bash
python3 ingest.py
```

This walks every .jsonl transcript in ~/.claude/projects/, extracts the text, embeds it locally on your CPU, and upserts it to Pinecone. The script is idempotent — safe to re-run any time you want recent sessions indexed.

For a fresh install with thousands of past conversations, expect 1–5 minutes.
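Under the hood, indexing amounts to fixed-size chunking plus embed-and-upsert. A minimal sketch of the chunking half (the function name and overlap-free splitting are assumptions, not the repo's exact code; the 1500-char size comes from the How-it-works section):

```python
def chunk_text(text: str, size: int = 1500) -> list[str]:
    """Split transcript text into fixed-size character chunks
    (1500 chars each, matching the chunking described in this README)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Each chunk would then be embedded locally and upserted to Pinecone
# with role/project metadata (see "How it works").
chunks = chunk_text("x" * 4000)
print(len(chunks), [len(c) for c in chunks])  # 3 chunks: 1500, 1500, 1000
```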

6. Register the MCP server with Claude Code

```bash
claude mcp add persistent-memory ~/.claude/persistent-memory/run_server.sh --scope user
```

Verify:

```bash
claude mcp list
# persistent-memory: ... - ✓ Connected
```

7. Restart Claude Code

In an active session, run /mcp — you should see persistent-memory listed with two tools.

Usage

Once registered, Claude calls the tools automatically when relevant:

"Do you remember what we decided about the auth refactor last month?"

Claude will run search_memory and pull the relevant exchange from past sessions.

To save a memory by hand:

"Save this to memory: I prefer pytest over unittest, with descriptive test names."

Claude will call add_memory and tag it.

Refreshing the index

Re-run ingest.py whenever you want recent conversations indexed:

```bash
cd ~/.claude/persistent-memory
./.venv/bin/python ingest.py
```

Optional: hourly auto-refresh via cron

```
0 * * * * cd $HOME/.claude/persistent-memory && ./.venv/bin/python ingest.py >/dev/null 2>&1
```

Cost guardrails

Pinecone free tier covers:

  • 2 GB serverless storage (≈ 2M chunks at 384-dim)
  • 5M write units / 1M read units per month
  • 1 project, 5 indexes

Typical personal use stays far below these limits. If you ever hit them, the next Pinecone tier is $25/month flat.

How it works

```
~/.claude/projects/*.jsonl  ──►  ingest.py  ──►  local embedder
                                                      │
                                                      ▼
                                              Pinecone (vectors)
                                                      ▲
Claude Code  ◄─►  MCP stdio  ◄─►  server.py  ─────────┘
                                  search_memory(query)
                                  add_memory(content)
```
  1. Ingest: transcripts are chunked (1500 chars each), embedded locally with a 384-dim sentence-transformer, and upserted to Pinecone with role and project metadata.
  2. Query: Claude calls search_memory("..."), the server embeds the query locally, runs cosine similarity in Pinecone, and returns the top-k chunks.
  3. Cost: zero recurring charges. Only Anthropic-side overhead is the ~1–1.5k tokens of retrieved context per search.
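The query path in step 2 boils down to cosine similarity between the query embedding and the stored chunk vectors. A dependency-free sketch of that ranking (Pinecone does this server-side; the 3-dim vectors here are toy stand-ins for the real 384-dim embeddings, and the function names are illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: list[tuple], k: int = 5) -> list[tuple]:
    """Rank stored (id, vector) pairs by cosine similarity, best first."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in index]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy index: the query vector points the same way as "army-chunk".
index = [("army-chunk", [1.0, 0.0, 0.0]), ("auth-chunk", [0.0, 1.0, 0.0])]
print(top_k([0.9, 0.1, 0.0], index, k=1))  # "army-chunk" ranks first
```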

Layout

```
persistent-memory/
├── .env.example          # template — copy to .env and add your Pinecone key
├── requirements.txt
├── embedder.py           # local sentence-transformers wrapper
├── pinecone_client.py    # creates/returns the Pinecone index
├── ingest.py             # one-shot: scan transcripts → embed → upsert
├── server.py             # MCP stdio server (search_memory + add_memory)
├── run_server.sh         # wrapper that activates venv + runs server
├── LICENSE
└── README.md
```

Troubleshooting

| Symptom | Fix |
|---|---|
| `ModuleNotFoundError: No module named 'pinecone'` | Activate the venv (`source .venv/bin/activate`), then `pip install -r requirements.txt` |
| MCP server not appearing in `/mcp` | Run `claude mcp list`; if missing, re-run step 6, then restart Claude Code |
| `✗ Failed to connect` | Most often a venv path issue after renaming/moving the directory; recreate the venv |
| `(no memories found)` on every search | Index is empty; run `python3 ingest.py` |
| Pinecone auth error | Check that `.env` has the right `PINECONE_API_KEY` (no quotes, no trailing spaces) |

Contributing

PRs welcome. Bug reports too.

License

MIT — see LICENSE.
