```{contents}
```
## Agent Memory


**Agent memory** is the mechanism that lets an agent **persist and reuse state across turns or steps**, enabling coherent, stateful behavior instead of stateless, one-off responses.

> Memory answers: *“What should the agent remember from the past to act better now?”*

---

### What Agent Memory Is NOT

* ❌ Not the model’s training data
* ❌ Not `agent_scratchpad` (which is per-run reasoning)
* ❌ Not permanent world knowledge by default

Memory is **runtime state**, scoped to a **session, conversation, or workflow**.

---

### Where Agent Memory Fits

```
User Input
   ↓
Memory (read)
   ↓
Agent Reasoning / Tool Use / Retrieval
   ↓
Response
   ↓
Memory (write/update)
```

Memory is **read before reasoning** and **updated after each turn**.

---

### Agent Memory vs agent_scratchpad

| Aspect       | Agent Memory             | agent_scratchpad              |
| ------------ | ------------------------ | ----------------------------- |
| Purpose      | Remembering across turns | Reasoning within a single run |
| Scope        | Multi-turn/session       | Single execution              |
| Persistence  | Yes (session)            | No                            |
| User-visible | Indirect                 | No                            |

They solve **different problems** and are often used together.

---

### Core Types of Agent Memory

#### 1) Conversation Buffer Memory

Stores the **entire conversation history**.

* **Pros:** Simple, faithful recall
* **Cons:** Grows unbounded; token pressure

**Use when:** Short conversations, low token risk.
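
As a rough illustration of the idea above, here is a minimal, framework-agnostic sketch of a buffer memory. The class name `BufferMemory` and its methods are hypothetical and are not part of any library; the point is only that every turn is kept and replayed in full.

```python
from dataclasses import dataclass, field


@dataclass
class BufferMemory:
    """Hypothetical full-history memory: every message is kept verbatim."""
    turns: list = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def render(self) -> str:
        # The whole history is replayed into the prompt on every call,
        # so the prompt grows with the length of the conversation.
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


memory = BufferMemory()
memory.add("human", "My name is Sanjeev")
memory.add("ai", "Hello, Sanjeev!")
memory.add("human", "What is my name?")
print(memory.render())
```

Because nothing is ever evicted, the rendered history grows linearly with the number of turns, which is where the token pressure comes from.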

---

#### 2) Conversation Window Memory

Keeps **only the last N turns**.

* **Pros:** Predictable size
* **Cons:** Older context lost

**Use when:** Only recent context matters.
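
A sliding window can be sketched with a bounded `deque` from the standard library. `WindowMemory` and `max_turns` are illustrative names assumed for this example, not an existing API.

```python
from collections import deque


class WindowMemory:
    """Hypothetical sliding-window memory: keeps only the last `max_turns` turns."""

    def __init__(self, max_turns: int) -> None:
        # One turn = one human message plus the AI reply to it.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, human: str, ai: str) -> None:
        # When the deque is full, appending evicts the oldest turn.
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"human: {h}\nai: {a}" for h, a in self.turns)


window = WindowMemory(max_turns=2)
window.add_turn("My name is Sanjeev", "Hello, Sanjeev!")
window.add_turn("I work on search systems", "Noted.")
window.add_turn("I prefer short answers", "Understood.")
print(window.render())
```

With `max_turns=2`, the first turn (the one that carried the name) has already been evicted by the time the third turn is stored, which is exactly the "older context lost" trade-off.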

---

#### 3) Conversation Summary Memory

Summarizes older turns into a **compact abstract**.

* **Pros:** Scales to long chats
* **Cons:** Possible summary drift

**Use when:** Long-running conversations.

---

#### 4) Hybrid (Summary + Window) Memory

Recent turns + summarized past.

* **Pros:** Balances recall against token cost; a common choice in production
* **Cons:** More moving parts: both a window and a summary must be maintained

**Use when:** Enterprise chatbots and assistants.
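
One way to picture the combination is a window whose evicted turns are folded into a running summary instead of being dropped. The sketch below assumes a hypothetical `HybridMemory` class and uses `stub_summarizer` as a stand-in for what would normally be an LLM summarization call.

```python
from collections import deque
from typing import Callable


def stub_summarizer(summary: str, human: str, ai: str) -> str:
    """Stand-in for an LLM call (assumption): appends the user message to the summary."""
    return f"{summary} | user: {human}".strip(" |")


class HybridMemory:
    """Hypothetical summary + window memory."""

    def __init__(self, window_size: int, summarize: Callable[[str, str, str], str]) -> None:
        self.window_size = window_size
        self.recent = deque()
        self.summary = ""
        self.summarize = summarize

    def add_turn(self, human: str, ai: str) -> None:
        self.recent.append((human, ai))
        # Turns that fall out of the window are folded into the summary, not dropped.
        while len(self.recent) > self.window_size:
            old_human, old_ai = self.recent.popleft()
            self.summary = self.summarize(self.summary, old_human, old_ai)

    def render(self) -> str:
        recent = "\n".join(f"human: {h}\nai: {a}" for h, a in self.recent)
        return f"Summary so far: {self.summary}\n{recent}"


hybrid = HybridMemory(window_size=1, summarize=stub_summarizer)
hybrid.add_turn("My name is Sanjeev", "Hello, Sanjeev!")
hybrid.add_turn("What can you do?", "I can answer questions.")
print(hybrid.render())
```

The summarizer is injected as a function so that the stub can be swapped for a real model call without touching the windowing logic.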

---

#### 5) Entity Memory

Tracks **entities and attributes** (names, IDs, preferences).

* **Pros:** Structured recall of facts
* **Cons:** Needs clean extraction

**Use when:** CRM/support agents with identifiers.
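
The following sketch shows the shape of an entity store using two regular expressions. The patterns, the `update_entities` helper, and the sample order number are all made up for illustration; production systems typically rely on an LLM or a named-entity model for extraction, which is where the "clean extraction" concern comes from.

```python
import re

# Hypothetical extraction rules, for illustration only.
PATTERNS = {
    "name": re.compile(r"my name is (\w+)", re.IGNORECASE),
    "order_id": re.compile(r"order (?:id )?#?(\d+)", re.IGNORECASE),
}


def update_entities(entities: dict, message: str) -> dict:
    """Return a copy of `entities` updated with any attributes found in `message`."""
    updated = dict(entities)
    for key, pattern in PATTERNS.items():
        match = pattern.search(message)
        if match:
            updated[key] = match.group(1)  # later mentions overwrite earlier ones
    return updated


entities = {}
entities = update_entities(entities, "Hi, my name is Sanjeev")
entities = update_entities(entities, "I need help with order #48213")
print(entities)
```

Only the extracted key-value pairs are stored, so the memory stays small regardless of how long the conversation runs.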

---

#### 6) Vector (Long-Term) Memory

Stores past interactions as **embeddings** for semantic recall.

* **Pros:** Long-term personalization
* **Cons:** Retrieval tuning required

**Use when:** Personalized assistants over time.
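
To make the retrieval step concrete, here is a toy sketch that replaces a real embedding model with word counts and a real vector database with a Python list. `embed`, `cosine`, and `VectorMemory` are assumed names for this example; only the ranking-by-similarity idea carries over to real systems.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorMemory:
    """Hypothetical long-term store: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self.items = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = VectorMemory()
store.add("prefers vegetarian recipes")
store.add("favourite editor is vim")
store.add("planning a trip to japan")
print(store.search("which editor is the favourite"))
```

Unlike a window, nothing is evicted by age here: an old memory is returned whenever it is the closest match to the current query, which is why retrieval quality needs tuning.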

---

### Agent Memory vs RAG

| Dimension | Memory             | RAG                |
| --------- | ------------------ | ------------------ |
| Purpose   | Conversation state | External knowledge |
| Source    | User interactions  | Documents/DBs      |
| Update    | Every turn         | Ingestion-time     |
| Scope     | Session/user       | Global/shared      |

They are **complementary**: memory for *context*, RAG for *facts*.

---

### How Memory Is Injected

Agents typically build prompts like:

```
System Instructions
+ Memory (formatted)
+ User Input
+ agent_scratchpad
```

Memory influences **intent resolution**, **follow-ups**, and **tool decisions**.
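
The ordering in the layout above can be sketched as a plain function that assembles a message list. `build_messages` is a hypothetical helper, and the `(role, text)` tuples are only a convenient stand-in for whatever message type a framework uses.

```python
def build_messages(system, memory, user_input, scratchpad=()):
    """Assemble the prompt in the order: system, memory, user input, scratchpad."""
    messages = [("system", system)]
    messages.extend(memory)                  # prior turns, already formatted
    messages.append(("human", user_input))   # the current request
    messages.extend(scratchpad)              # per-run reasoning and tool results, if any
    return messages


messages = build_messages(
    system="You are a helpful assistant.",
    memory=[("human", "My name is Sanjeev"), ("ai", "Hello, Sanjeev!")],
    user_input="What is my name?",
)
for role, text in messages:
    print(f"{role}: {text}")
```

The scratchpad is empty on the first model call of a run and only fills up as the agent takes intermediate steps, whereas the memory block is present from the start.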

---

### Memory in Agents (Execution)

When running an agent:

* **Read:** Memory is loaded before reasoning
* **Write:** New turns (or summaries) are appended after the response

Both steps happen automatically on every call once a memory object is attached to the executor (for example, by passing `memory=...` to `AgentExecutor`).
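
The read and write steps can be sketched without any framework. In the example below, `fake_llm` is a stand-in for a real model call and `run_turn` is a hypothetical wrapper playing the role of the executor; neither reflects how a particular library implements this internally.

```python
def fake_llm(messages):
    """Stand-in for a model call: answers the name question from earlier messages."""
    question = messages[-1][1]
    if "what is my name" in question.lower():
        for role, text in messages[:-1]:
            if role == "human" and text.lower().startswith("my name is "):
                return f"Your name is {text[len('my name is '):]}."
        return "I don't know your name yet."
    return "Noted."


def run_turn(history, user_input):
    # Read: prior turns are loaded before the model is called.
    messages = list(history) + [("human", user_input)]
    reply = fake_llm(messages)
    # Write: the new turn is appended only after the response exists.
    history.append(("human", user_input))
    history.append(("ai", reply))
    return reply


history = []
run_turn(history, "My name is Sanjeev")
print(run_turn(history, "What is my name?"))
```

Calling `run_turn` with a fresh, empty history for the second question would produce the "I don't know" branch, which mirrors the behavior of a stateless chain.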

---

### Production Concerns

#### Context Window Management

* Memory consumes tokens
* Use windows, summaries, or compression
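
A simple budget check might look like the sketch below. The word-count estimate in `approx_tokens` is a deliberate simplification assumed for this example; a real system would use the tokenizer that matches its model.

```python
def approx_tokens(text: str) -> int:
    """Crude estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def trim_to_budget(turns: list, max_tokens: int) -> list:
    """Keep the most recent turns whose combined estimate fits within `max_tokens`."""
    kept = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = approx_tokens(turn)
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))  # restore chronological order


turns = ["my name is Sanjeev", "hello Sanjeev", "what is my name"]
print(trim_to_budget(turns, max_tokens=6))
```

Trimming by budget rather than by turn count keeps the prompt size predictable even when individual messages vary a lot in length.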

#### Safety & Privacy

* Memory may include PII
* Enforce retention and deletion policies
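
As one small piece of such a policy, text can be scrubbed before it is written to memory. The sketch below masks email addresses only; the regular expression and the `redact` helper are illustrative assumptions and are far from a complete PII filter.

```python
import re

# Simplified email pattern, for illustration only; it does not cover every valid address.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(text: str) -> str:
    """Mask email addresses before the text is stored in memory."""
    return EMAIL.sub("[EMAIL]", text)


print(redact("Contact me at sanjeev@example.com please"))
```

Applying redaction at write time means the raw value never reaches the store, so later retention or deletion steps have less sensitive data to deal with.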

#### Accuracy Drift

* Summaries can accumulate errors
* Validate what gets written to memory

---

### Common Mistakes

* Using full buffer memory indefinitely → token overflow
* Confusing memory with scratchpad → logic errors
* Storing hallucinations → compounding mistakes
* Sharing memory across users → data leakage
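
The last mistake is usually avoided by keying memory on a session or user identifier. The sketch below assumes a hypothetical in-process `SessionStore`; a deployed system would more likely back this with a database or cache, but the isolation principle is the same.

```python
from collections import defaultdict


class SessionStore:
    """Hypothetical per-session memory registry: one history list per session ID."""

    def __init__(self) -> None:
        self._histories = defaultdict(list)

    def history(self, session_id: str) -> list:
        # Each session ID gets its own list; histories are never shared.
        return self._histories[session_id]

    def clear(self, session_id: str) -> None:
        # Drop everything stored for this session, e.g. when it ends.
        self._histories.pop(session_id, None)


sessions = SessionStore()
sessions.history("user-a").append(("human", "My name is Sanjeev"))
print(sessions.history("user-a"))
print(sessions.history("user-b"))
```

A single module-level memory object reused for every request is the typical way this goes wrong, because all users then read and write the same history.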

---

### Best Practices

* Prefer **hybrid memory** (summary + window)
* Scope memory to a single session or user, and clear it when the session ends
* Don’t store tool internals unless needed
* Validate memory updates
* Monitor token usage

---

### When to Use Agent Memory

* Conversational agents
* Multi-step workflows
* Personalization
* Long-running sessions

---

### When NOT to Use It

* Single-turn APIs
* Stateless services
* High-throughput endpoints where loading and saving state on every request adds latency and cost

---

### Interview-Ready Summary

> “Agent memory enables stateful behavior by persisting conversational context across turns. It is distinct from the agent scratchpad and from RAG, and must be managed carefully to balance recall, cost, and safety.”

---

### Rule of Thumb

* **Scratchpad → thinking**
* **Memory → remembering**
* **RAG → knowing**
* **LLM → reasoning**

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0
)

# No history placeholder: the prompt only ever contains the current input
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}")
])

chain = prompt | llm
chain.invoke({"input": "My name is Sanjeev"}).content
```

```
'Hello, Sanjeev! How can I assist you today?'
```

```python
chain.invoke({"input": "What is my name?"}).content
```

```
"I'm sorry, but I don't have access to personal information about you unless you share it with me. How can I assist you today?"
```

### Why This Happens

* No memory
* Each call is independent
* The model has no past context

```python
# Agent with memory
from langchain_classic.memory import ConversationBufferMemory
from langchain_classic.agents import AgentExecutor, create_openai_tools_agent

memory = ConversationBufferMemory(
    return_messages=True,  # return message objects instead of one formatted string
    memory_key="chat_history"  # Explicitly set the key; it must match the prompt placeholder
)
```


```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("placeholder", "{chat_history}"),  # This injects the memory!
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}")  # Separate placeholder for scratchpad
])

agent = create_openai_tools_agent(
    llm=llm,
    tools=[],  # no tools needed for this demo
    prompt=prompt
)

executor = AgentExecutor(
    agent=agent,
    tools=[],
    memory=memory,
    # verbose=True
)
```



```python
# Stateful conversation with an agent
answer = executor.invoke({"input": "My name is Sanjeev"})
answer['output']
```

```
'Hello again, Sanjeev! How can I help you today?'
```

**Memory now contains the conversation history:**

* Human: My name is Sanjeev
* AI: Hello again, Sanjeev! How can I help you today?

```python
answer = executor.invoke({"input": "What is my name?"})
answer['output']
```

```
'Your name is Sanjeev.'
```