```{contents}
```
## Long-Term Memory

Long-term memory (LTM) enables generative AI systems to **retain, retrieve, and utilize information across interactions, documents, and time**, beyond a single model context window. It is a core architectural component for building **stateful, personalized, and knowledge-grounded AI systems**.

---

### 1. Motivation & Intuition

Large language models are **stateless** at inference time: each request sees only what is in its prompt. This leads to several limitations:

| Limitation      | Explanation                              |
| --------------- | ---------------------------------------- |
| Context window  | Fixed size; older information is lost    |
| No persistence  | Model forgets user history after session |
| Hallucination   | Model lacks grounding in real data       |
| Personalization | Cannot remember user preferences         |

**Long-term memory addresses these limitations** by storing external knowledge and interaction history outside the model, where it can be retrieved and added to the prompt at generation time.

---

### 2. Conceptual Architecture

```
User Query
   │
   ▼
Retriever ──────► Long-Term Memory Store
   │                  (Vector DB, SQL, Files, APIs)
   ▼
Prompt Constructor (Query + Retrieved Memory)
   │
   ▼
LLM → Response
```

---

### 3. Types of Long-Term Memory

| Type                  | Purpose                  | Examples               |
| --------------------- | ------------------------ | ---------------------- |
| **Episodic Memory**   | User interaction history | Chat logs, preferences |
| **Semantic Memory**   | World / domain knowledge | Documents, manuals     |
| **Procedural Memory** | Skills / workflows       | Tool usage patterns    |
| **Reflective Memory** | Model self-feedback      | Evaluations, summaries |
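
One way to make these categories concrete is to tag every stored item with its memory type, so retrieval can filter by type. The sketch below is illustrative only: `MemoryType`, `MemoryRecord`, and `by_type` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class MemoryType(Enum):
    EPISODIC = "episodic"      # user interaction history
    SEMANTIC = "semantic"      # world / domain knowledge
    PROCEDURAL = "procedural"  # skills / workflows
    REFLECTIVE = "reflective"  # model self-feedback


@dataclass
class MemoryRecord:
    """A single stored memory (hypothetical schema)."""
    text: str
    kind: MemoryType
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def by_type(records, kind):
    """Return only the records of the given memory type."""
    return [r for r in records if r.kind is kind]


records = [
    MemoryRecord("User prefers concise answers", MemoryType.EPISODIC),
    MemoryRecord("Project uses PyTorch", MemoryType.SEMANTIC),
    MemoryRecord("Run the linter before committing", MemoryType.PROCEDURAL),
    MemoryRecord("Last answer was too verbose", MemoryType.REFLECTIVE),
]
episodic = by_type(records, MemoryType.EPISODIC)
print([r.text for r in episodic])
```

Keeping the type on the record lets a system apply different retention or retrieval rules per type, for example pruning episodic memories more aggressively than semantic ones.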

---

### 4. Memory Storage Techniques

| Method          | Description                                         |
| --------------- | --------------------------------------------------- |
| Vector Store    | Embedding-based retrieval (FAISS, Pinecone, Chroma) |
| Relational DB   | Structured memory (Postgres, SQLite)                |
| Document Store  | Raw text (S3, filesystem)                           |
| Knowledge Graph | Entity relationships                                |
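
As a rough illustration of the relational option, the following sketch stores per-user facts in SQLite using only the standard library. The table name, column names, and the `remember` / `recall` helpers are assumptions made for this example.

```python
import sqlite3

# In-memory database for the example; a real system would use a file or server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_memory (
        user_id    TEXT NOT NULL,
        pref_name  TEXT NOT NULL,
        pref_value TEXT NOT NULL,
        PRIMARY KEY (user_id, pref_name)
    )
    """
)


def remember(user_id, name, value):
    """Insert a fact about a user, overwriting any earlier value."""
    conn.execute(
        "INSERT OR REPLACE INTO user_memory (user_id, pref_name, pref_value) "
        "VALUES (?, ?, ?)",
        (user_id, name, value),
    )
    conn.commit()


def recall(user_id):
    """Return all stored facts for a user as a dict."""
    rows = conn.execute(
        "SELECT pref_name, pref_value FROM user_memory "
        "WHERE user_id = ? ORDER BY pref_name",
        (user_id,),
    ).fetchall()
    return dict(rows)


remember("u1", "answer_style", "verbose")
remember("u1", "answer_style", "concise")  # newer value replaces the old one
remember("u1", "framework", "PyTorch")
prefs = recall("u1")
print(prefs)
```

Structured storage like this suits exact lookups (a known user, a known key), while a vector store suits fuzzy, similarity-based lookups.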

---

### 5. Retrieval Workflow (RAG + Memory)

1. **Encode query → embedding**
2. **Search memory store**
3. **Select top-k relevant memories**
4. **Inject into prompt**
5. **Generate response**
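
The first four steps can be sketched without any external service. The example below uses a toy bag-of-words "embedding" as a stand-in for a real embedding model, and the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. Step 5 would pass the resulting prompt to an LLM and is left out.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, memories, k=2):
    q = embed(query)                                        # 1. encode query
    scored = [(cosine(q, embed(m)), m) for m in memories]   # 2. search store
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for score, m in scored[:k] if score > 0]      # 3. select top-k


def build_prompt(query, memories, k=2):
    context = "\n".join(retrieve(query, memories, k))       # 4. inject
    return f"Context:\n{context}\n\nQuestion: {query}"


memories = [
    "the user prefers concise answers",
    "the project uses pytorch",
    "the deployment target is kubernetes",
]
prompt = build_prompt("which framework does the project use", memories, k=1)
print(prompt)
```

A production system would replace `embed` with a learned embedding model and the linear scan with an approximate nearest-neighbor index, but the control flow stays the same.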

---

### 6. Minimal Working Example

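```python
# Assumes a recent LangChain release, where integrations live in the
# langchain-community and langchain-openai packages (plus faiss-cpu),
# and an OPENAI_API_KEY environment variable.
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAI, OpenAIEmbeddings

# 1. Create memory: embed each text and index it in an in-memory FAISS store
docs = ["User likes concise answers", "Project uses PyTorch"]
emb = OpenAIEmbeddings()
store = FAISS.from_texts(docs, emb)

# 2. Query with memory: fetch the k most similar stored texts
query = "How should I explain this model?"
memory = store.similarity_search(query, k=2)

# 3. Build prompt: prepend the retrieved memories as context
context = "\n".join(m.page_content for m in memory)
prompt = f"Context:\n{context}\n\nAnswer the question: {query}"

# 4. Generate: `invoke` replaces the deprecated `llm(prompt)` call style
llm = OpenAI()
response = llm.invoke(prompt)
print(response)
```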

---

### 7. Memory Lifecycle

| Stage     | Description                   |
| --------- | ----------------------------- |
| Ingestion | Store interactions, documents |
| Encoding  | Convert to embeddings         |
| Indexing  | Build searchable structure    |
| Retrieval | Query-time fetch              |
| Update    | Summarize, prune, refresh     |
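
The ingestion and update stages can be illustrated with a small buffer that keeps recent items verbatim and folds older ones into a running summary. This is a sketch under stated assumptions: `MemoryBuffer` and `naive_summarize` are hypothetical, and the summarizer simply joins strings where a real system would likely call an LLM. Encoding, indexing, and retrieval are omitted.

```python
from collections import deque


class MemoryBuffer:
    """Keeps the newest `max_items` entries; older ones go into a summary."""

    def __init__(self, max_items, summarize):
        self.max_items = max_items
        self.summarize = summarize  # callable: list[str] -> str
        self.summary = ""
        self.items = deque()

    def ingest(self, text):
        """Ingestion: store a new item, then run the update step."""
        self.items.append(text)
        self._update()

    def _update(self):
        """Update: move overflow items out of the buffer and into the summary."""
        overflow = []
        while len(self.items) > self.max_items:
            overflow.append(self.items.popleft())
        if overflow:
            parts = ([self.summary] if self.summary else []) + overflow
            self.summary = self.summarize(parts)

    def context(self):
        """Summary (if any) followed by the verbatim recent items."""
        head = [f"Summary: {self.summary}"] if self.summary else []
        return head + list(self.items)


def naive_summarize(parts):
    """Placeholder summarizer: joins texts instead of condensing them."""
    return " | ".join(parts)


buf = MemoryBuffer(max_items=2, summarize=naive_summarize)
for turn in ["asked about GPUs", "prefers PyTorch",
             "wants short answers", "works on NLP"]:
    buf.ingest(turn)
print(buf.context())
```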

---

### 8. Design Challenges

| Issue         | Solution                     |
| ------------- | ---------------------------- |
| Memory bloat  | Summarization & pruning      |
| Stale info    | Time decay, versioning       |
| Privacy       | Encryption, scoped access    |
| Hallucination | Strict grounding from memory |
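
Time decay, listed above as a remedy for stale information, can be sketched as an exponential penalty on the similarity score. The half-life of 30 days, the `decayed_score` function, and the sample records are all assumptions chosen for illustration.

```python
def decayed_score(similarity, age_days, half_life_days=30.0):
    """Scale a similarity score so it halves every `half_life_days`."""
    return similarity * 0.5 ** (age_days / half_life_days)


# Hypothetical candidates: an older, slightly more similar memory
# and a fresh, slightly less similar one.
candidates = [
    {"text": "User works at Acme", "similarity": 0.90, "age_days": 90},
    {"text": "User works at Globex", "similarity": 0.80, "age_days": 0},
]

ranked = sorted(
    candidates,
    key=lambda c: decayed_score(c["similarity"], c["age_days"]),
    reverse=True,
)
print([c["text"] for c in ranked])
```

With these numbers the fresh memory ranks first even though its raw similarity is lower, which is the intended effect when facts about a user change over time.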

---

### 9. Evaluation Metrics

| Metric       | What it measures            |
| ------------ | --------------------------- |
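| Recall@k     | Fraction of relevant memories found in the top k results |
| Faithfulness | Whether the response is grounded in retrieved memory |
| Consistency  | Whether personalization stays stable across sessions |
| Latency      | Time added by the retrieval step |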
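
Of these metrics, Recall@k is the most mechanical to compute. A minimal sketch, assuming memories are identified by string IDs and that a labeled set of relevant IDs is available (`recall_at_k` is a hypothetical helper):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


retrieved = ["m3", "m1", "m7", "m2"]  # ranked retriever output
relevant = {"m1", "m2"}               # ground-truth relevant memories

r2 = recall_at_k(retrieved, relevant, k=2)
r4 = recall_at_k(retrieved, relevant, k=4)
print(r2, r4)
```

Here only one of the two relevant memories is in the top 2, so Recall@2 is 0.5, while both appear in the top 4, giving Recall@4 of 1.0.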

---

### 10. Applications

* Personalized assistants
* Autonomous agents
* Enterprise knowledge bots
* Long-term planning systems
* Scientific research copilots

---

### Key Takeaway

**Long-term memory turns LLMs from stateless text generators into persistent, personalized, knowledge-grounded systems.**
