
persistent-memory

Persistent long-term memory for Claude Code. Free forever.

License: MIT · Python 3.10+ · MCP · $0/month · PRs welcome

Walks every Claude Code transcript in ~/.claude/projects/, embeds the text locally with sentence-transformers, stores the vectors in Pinecone's free tier, and exposes two tools to Claude over MCP:

  • search_memory(query, top_k=5) — semantic search across all past sessions
  • add_memory(content, tag) — manually save a memory

Claude can now remember every conversation you've ever had with it.

Demo

```
You:    "Do you remember what I told you about being in the army?"

Claude: [calls search_memory("army military service")]
        Yeah — you served in the 101st Airborne Division. Screaming Eagle
        patch, "Currahee" motto. We talked about that weeks ago when we
        were building the CTF capture-flag feature...
```

The recall is pulled from a conversation 27 days earlier, across a different project, with zero context loaded into the current session.

Why it's free

| Layer | Choice | Cost |
|---|---|---|
| Vector store | Pinecone serverless (free tier: 2 GB storage, 1M reads/mo) | $0 |
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` (local, on your machine) | $0 |
| Raw transcripts | Already on disk in `~/.claude/projects/` | $0 |
| MCP server | Local Python via stdio | $0 |
| Per-search context | ~1k–1.5k Anthropic tokens (5 chunks × ~250 tok) | tiny |

A typical "remember last conversation" search costs ~95% fewer tokens than loading a full memory file into context up front.
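As a rough sanity check of that claim (the full-file size below is an illustrative assumption, not a number measured from this repo):

```python
# Back-of-the-envelope math behind the ~95% figure.
chunks_returned = 5        # search_memory's default top_k
tokens_per_chunk = 250     # approximate, per the cost table above
search_cost = chunks_returned * tokens_per_chunk  # 1250 tokens per search

# Hypothetical alternative: loading a full memory file up front.
full_file_cost = 25_000    # assumed size, for illustration only

savings = 1 - search_cost / full_file_cost
print(f"{search_cost} tokens vs {full_file_cost}: {savings:.0%} fewer")
```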

Install

1. Clone the repo

```bash
git clone https://github.com/timastras9/persistent-memory.git ~/.claude/persistent-memory
cd ~/.claude/persistent-memory
```

2. Get a free Pinecone API key

Sign up at app.pinecone.io (no credit card required). Create a project and copy your API key from the dashboard.

3. Configure

```bash
cp .env.example .env
```

Edit .env and paste in your PINECONE_API_KEY. Defaults are fine for everything else.
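For reference, a minimal `.env` might look like this (only `PINECONE_API_KEY` is documented in this README; check `.env.example` for the full list of defaults):

```
# .env (never commit this file)
PINECONE_API_KEY=your-key-from-app.pinecone.io
```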

4. Install dependencies

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
chmod +x run_server.sh
```

5. Index your existing conversations

```bash
python3 ingest.py
```

This walks every .jsonl transcript in ~/.claude/projects/, extracts the text, embeds it locally on your CPU, and upserts it to Pinecone. The script is idempotent — safe to re-run any time you want recent sessions indexed.

For a fresh install with thousands of past conversations, expect 1–5 minutes.
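Under the hood, indexing amounts to fixed-size chunking plus embed-and-upsert. A minimal sketch of the chunking half (the function name and overlap-free splitting are assumptions, not the repo's exact code; the 1500-char size comes from the How-it-works section):

```python
def chunk_text(text: str, size: int = 1500) -> list[str]:
    """Split transcript text into fixed-size character chunks
    (1500 chars each, matching the chunking described in this README)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Each chunk would then be embedded locally and upserted to Pinecone
# with role/project metadata (see "How it works").
chunks = chunk_text("x" * 4000)
print(len(chunks), [len(c) for c in chunks])  # 3 chunks: 1500, 1500, 1000
```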

6. Register the MCP server with Claude Code

```bash
claude mcp add persistent-memory ~/.claude/persistent-memory/run_server.sh --scope user
```

Verify:

```bash
claude mcp list
# persistent-memory: ... - ✓ Connected
```

7. Restart Claude Code

In an active session, run /mcp — you should see persistent-memory listed with two tools.

Usage

Once registered, Claude calls the tools automatically when relevant:

"Do you remember what we decided about the auth refactor last month?"

Claude will run search_memory and pull the relevant exchange from past sessions.

To save a memory by hand:

"Save this to memory: I prefer pytest over unittest, with descriptive test names."

Claude will call add_memory and tag it.

Refreshing the index

Re-run ingest.py whenever you want recent conversations indexed:

```bash
cd ~/.claude/persistent-memory
./.venv/bin/python ingest.py
```

Optional: hourly auto-refresh via cron

```
0 * * * * cd $HOME/.claude/persistent-memory && ./.venv/bin/python ingest.py >/dev/null 2>&1
```

Cost guardrails

Pinecone free tier covers:

  • 2 GB serverless storage (≈ 2M chunks at 384-dim)
  • 5M write units / 1M read units per month
  • 1 project, 5 indexes

Typical personal use stays far below these limits. If you ever hit them, the next Pinecone tier is $25/month flat.

How it works

```
~/.claude/projects/*.jsonl  ──►  ingest.py  ──►  local embedder
                                                      │
                                                      ▼
                                              Pinecone (vectors)
                                                      ▲
Claude Code  ◄─►  MCP stdio  ◄─►  server.py  ─────────┘
                                  search_memory(query)
                                  add_memory(content)
```
  1. Ingest: transcripts are chunked (1500 chars each), embedded locally with a 384-dim sentence-transformer, and upserted to Pinecone with role and project metadata.
  2. Query: Claude calls search_memory("..."), the server embeds the query locally, runs cosine similarity in Pinecone, and returns the top-k chunks.
  3. Cost: zero recurring charges. Only Anthropic-side overhead is the ~1–1.5k tokens of retrieved context per search.
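The query path in step 2 boils down to cosine similarity between the query embedding and the stored chunk vectors. A dependency-free sketch of that ranking (Pinecone does this server-side; the 3-dim vectors here are toy stand-ins for the real 384-dim embeddings, and the function names are illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: list[tuple], k: int = 5) -> list[tuple]:
    """Rank stored (id, vector) pairs by cosine similarity, best first."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in index]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy index: the query vector points the same way as "army-chunk".
index = [("army-chunk", [1.0, 0.0, 0.0]), ("auth-chunk", [0.0, 1.0, 0.0])]
print(top_k([0.9, 0.1, 0.0], index, k=1))  # "army-chunk" ranks first
```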

Layout

```
persistent-memory/
├── .env.example          # template — copy to .env and add your Pinecone key
├── requirements.txt
├── embedder.py           # local sentence-transformers wrapper
├── pinecone_client.py    # creates/returns the Pinecone index
├── ingest.py             # one-shot: scan transcripts → embed → upsert
├── server.py             # MCP stdio server (search_memory + add_memory)
├── run_server.sh         # wrapper that activates venv + runs server
├── LICENSE
└── README.md
```

Troubleshooting

| Symptom | Fix |
|---|---|
| `ModuleNotFoundError: No module named 'pinecone'` | Activate the venv (`source .venv/bin/activate`), then `pip install -r requirements.txt` |
| MCP server not appearing in `/mcp` | Run `claude mcp list`; if missing, re-run step 6, then restart Claude Code |
| `✗ Failed to connect` | Most often a venv path issue after renaming/moving the directory; recreate the venv |
| `(no memories found)` on every search | Index is empty; run `python3 ingest.py` |
| Pinecone auth error | Check that `.env` has the right `PINECONE_API_KEY` (no quotes, no trailing spaces) |

Contributing

PRs welcome. Bug reports too.

License

MIT — see LICENSE.
