
# Hierarchical Memory Agent Demo

This notebook illustrates a **three-layer memory** design for agents:

- **Short-term memory** — current conversation  
- **Episodic memory** — per-task summaries  
- **Long-term memory** — vector store across many sessions  

We focus on the *conceptual wiring*: short-term and episodic memory are plain Python lists, and long-term memory is a Chroma vector store backed by OpenAI embeddings.



## 1. Setup


In [None]:

%pip install -q langgraph langchain-openai langchain langchain-community chromadb

import os
# Placeholder only: export a real OPENAI_API_KEY before running, or the API calls below will fail.
os.environ.setdefault("OPENAI_API_KEY", "sk-REPLACE_ME")



## 2. Simple Memory Manager

We simulate:

- `short_term`: list of recent messages  
- `episodic`: list of per-session summaries  
- `long_term`: vector store (Chroma) for semantic recall  


In [None]:

from dataclasses import dataclass, field
from typing import List, Dict, Any

from langchain_openai import OpenAIEmbeddings
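# Chroma moved out of `langchain.vectorstores`; prefer the community package, fall back on older installs.
try:
    from langchain_community.vectorstores import Chroma
except ImportError:
    from langchain.vectorstores import Chroma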

@dataclass
class MemoryManager:
    short_term: List[Dict[str, Any]] = field(default_factory=list)
    episodic: List[str] = field(default_factory=list)
    long_term_store: Any = None

    def setup_long_term(self, persist_dir="memory_store"):
        embeddings = OpenAIEmbeddings()
        self.long_term_store = Chroma(collection_name="agent_memory", embedding_function=embeddings, persist_directory=persist_dir)

    def add_short(self, msg: Dict[str, Any]):
        self.short_term.append(msg)

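    def summarize_to_episode(self, llm):
        # Compress the user's turns from the current session into one episodic summary.
        text = "\n".join(m["content"] for m in self.short_term if m["role"] == "user")
        prompt = f"Summarize the key facts the user mentioned in 3 bullet points:\n\n{text}"
        summary = llm.invoke([{"role": "user", "content": prompt}]).content
        self.episodic.append(summary)
        # Index the summary for semantic recall only if the vector store was set up.
        if self.long_term_store is not None:
            self.long_term_store.add_texts([summary])
        # The raw messages are no longer needed once they have been summarized.
        self.short_term.clear()
        return summary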

    def recall(self, query: str, k: int = 3):
        if self.long_term_store is None:
            return []
        docs = self.long_term_store.similarity_search(query, k=k)
        return [d.page_content for d in docs]
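
To experiment with the same three-layer flow without an API key or a vector database, the sketch below swaps in stand-ins. Everything in it is an assumption made for illustration: `OfflineMemory`, `bag_of_words`, `cosine`, and `bullet_summarizer` are hypothetical names, the "embedding" is a plain word-count vector, and the "summarizer" is a stub rather than an LLM. It mirrors the shape of `MemoryManager`, not its retrieval quality.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class OfflineMemory:
    """Hypothetical offline counterpart of MemoryManager, for experimentation only."""
    short_term: List[Dict[str, str]] = field(default_factory=list)
    episodic: List[str] = field(default_factory=list)
    long_term: List[Tuple[str, Counter]] = field(default_factory=list)

    def add_short(self, msg: Dict[str, str]) -> None:
        self.short_term.append(msg)

    def summarize_to_episode(self, summarize: Callable[[str], str]) -> str:
        # Same flow as above: summarize user turns, store, then clear short-term memory.
        text = "\n".join(m["content"] for m in self.short_term if m["role"] == "user")
        summary = summarize(text)
        self.episodic.append(summary)
        self.long_term.append((summary, bag_of_words(summary)))
        self.short_term.clear()
        return summary

    def recall(self, query: str, k: int = 3) -> List[str]:
        # Rank stored summaries by similarity to the query and keep the top k.
        q = bag_of_words(query)
        ranked = sorted(self.long_term, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def bullet_summarizer(text: str) -> str:
    """Stub summarizer: turns each line into a bullet instead of calling an LLM."""
    return "\n".join(f"- {line}" for line in text.splitlines())


mem = OfflineMemory()
mem.add_short({"role": "user", "content": "I build RAG pipelines for enterprise data."})
mem.summarize_to_episode(bullet_summarizer)
mem.add_short({"role": "user", "content": "My favorite editor is Vim."})
mem.summarize_to_episode(bullet_summarizer)

top = mem.recall("Which RAG pipelines do I build?", k=1)
print(top)
```

Word-overlap similarity only matches literal terms, so it will miss paraphrases that a real embedding model would catch; it is useful here only to make the data flow visible.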



## 3. Simulate Multi-Session Interaction

We walk through three steps:

- In session 1, the user tells the agent about their work and preferences  
- The agent summarizes the session into episodic and long-term memory  
- In session 2, the user asks the agent what it remembers  


In [None]:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", temperature=0)
memory = MemoryManager()
memory.setup_long_term()

# Session 1: user shares details
session1_msgs = [
    {"role": "user", "content": "I work on agentic AI systems using LangGraph and MCP."},
    {"role": "user", "content": "I care about reproducible research and good engineering practices."},
    {"role": "user", "content": "I often build RAG pipelines for enterprise data."},
]

for m in session1_msgs:
    memory.add_short(m)

summary1 = memory.summarize_to_episode(llm)
print("Episodic summary from session 1:\n", summary1)



### Session 2: Ask the Agent What It Remembers


In [None]:

query = "What do you remember about my work and interests?"
# Use the user's actual question as the retrieval query.
recalled = memory.recall(query, k=3)

print("Recalled from long-term memory:")
for r in recalled:
    print("-", r)

prompt = (
    "Based on the following long-term memory snippets, answer the user's question: "
    f"'{query}'\n\n"
    + "\n".join(recalled)
)
answer = llm.invoke([{"role": "user", "content": prompt}])
print("\nFinal answer:\n", answer.content)
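
One gap in the prompt construction above is the empty case: if `recall` returns no snippets, the model receives a question with a blank memory section and may invent an answer. The helper below is a minimal sketch of one way to handle that; `build_recall_prompt` and its wording are hypothetical, not part of LangChain or of the notebook's `MemoryManager`.

```python
from typing import List


def build_recall_prompt(question: str, snippets: List[str]) -> str:
    """Format recalled snippets into a prompt, stating explicitly when memory is empty."""
    if snippets:
        memory_block = "\n".join(f"- {s.strip()}" for s in snippets)
    else:
        memory_block = "(no relevant memories found)"
    return (
        "Answer the user's question using only the long-term memory snippets below. "
        "If they do not contain the answer, say that you do not remember.\n\n"
        f"Question: {question}\n\n"
        f"Memory snippets:\n{memory_block}"
    )


example_prompt = build_recall_prompt(
    "What do you remember about my work and interests?",
    ["Works on agentic AI systems using LangGraph and MCP."],
)
print(example_prompt)
```

The returned string can be passed to `llm.invoke` in the same way as the hand-built prompt in the cell above.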



## 4. Takeaways

- Hierarchical memory lets agents **remember across sessions** without keeping raw logs forever.  
- Episodic summaries compress events; vector stores support semantic recall.  
- You can plug this `MemoryManager` into a LangGraph state object and let nodes read/write memory.
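
The last point can be sketched without committing to a particular graph definition. LangGraph nodes are functions that take the current state and return a partial update, so the example below writes two such functions and runs them with a tiny sequential runner. The names `AgentState`, `ListMemory`, `make_recall_node`, `prompt_node`, and `run_nodes` are all hypothetical; `ListMemory` stands in for `MemoryManager` and uses word overlap, not semantic search. In a real graph the node functions would be registered with `StateGraph.add_node` instead of being called by `run_nodes`.

```python
from typing import Callable, Dict, List, TypedDict


class AgentState(TypedDict, total=False):
    """Hypothetical graph state: the question plus whatever the memory nodes attach."""
    question: str
    recalled: List[str]
    prompt: str


class ListMemory:
    """Stand-in for MemoryManager exposing only recall(); word overlap, not embeddings."""

    def __init__(self, notes: List[str]):
        self.notes = list(notes)

    def recall(self, query: str, k: int = 3) -> List[str]:
        words = set(query.lower().split())
        hits = [n for n in self.notes if words & set(n.lower().split())]
        return hits[:k]


def make_recall_node(memory) -> Callable[[AgentState], Dict]:
    """Build a node that reads from memory and returns a partial state update."""
    def recall_node(state: AgentState) -> Dict:
        return {"recalled": memory.recall(state["question"], k=3)}
    return recall_node


def prompt_node(state: AgentState) -> Dict:
    """Node that turns the recalled snippets into a prompt for a downstream LLM node."""
    snippets = "\n".join(state.get("recalled", []))
    return {"prompt": f"Question: {state['question']}\n\nMemory:\n{snippets}"}


def run_nodes(state: Dict, nodes: List[Callable[[Dict], Dict]]) -> Dict:
    """Tiny sequential runner: merges each node's partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


memory = ListMemory(["User builds RAG pipelines.", "User likes Vim."])
final_state = run_nodes(
    {"question": "tell me about rag"},
    [make_recall_node(memory), prompt_node],
)
print(final_state["prompt"])
```

Passing the memory object through a closure keeps it out of the state itself, which is one reasonable choice when the state needs to stay serializable; storing a handle in the state is another option the notebook leaves open.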

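This pattern is a starting point for **long-lived, personality-consistent agents**: the agent carries a compact record of what it learned about the user from one session to the next.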
