## ConversationSummaryMemory


**ConversationSummaryMemory** is a **compressed conversation memory** mechanism that **summarizes past interactions** instead of storing the full chat transcript.
The summary is continuously updated using an LLM and injected into future prompts, enabling **long-running conversations with bounded token usage**.

It is provided by LangChain; the examples below import it from the `langchain_classic` package.

```
User ↔ Assistant
   ↓
Conversation History
   ↓ (LLM summarization)
Running Summary
   ↓
Prompt (with compact context)
```

---

### Why ConversationSummaryMemory Exists

ConversationBufferMemory replays the entire chat, which:

* Increases token usage
* Raises cost
* Hits context window limits

ConversationSummaryMemory solves this by:

* Keeping **only the essence**
* Maintaining **conversation continuity**
* Scaling to long sessions
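
To make the token-growth argument concrete, here is a minimal sketch that compares how much context each approach would carry after every turn. It does not use LangChain. The word-count token proxy, the `SUMMARY_BUDGET` cap, the sample assistant replies, and the truncation-based "summary" are all assumptions made for illustration; a real summary would be written by an LLM.

```python
def approx_tokens(text: str) -> int:
    """Crude token proxy (assumption): count whitespace-separated words."""
    return len(text.split())


# Hypothetical exchanges, loosely following the VPN example in this section.
turns = [
    ("My VPN is not working on Windows 11", "Let's check your network adapter first."),
    ("It disconnects every 5 minutes", "That may point to a session timeout."),
    ("I already tried restarting", "Understood, let's look at the VPN client logs."),
]

SUMMARY_BUDGET = 12  # hypothetical cap on summary length, in words


def buffer_context(history) -> str:
    """Buffer-style memory: replay every exchange verbatim."""
    return " ".join(f"Human: {human} AI: {ai}" for human, ai in history)


def summary_context(history) -> str:
    """Stand-in for an LLM summary: a digest capped at SUMMARY_BUDGET words."""
    words = buffer_context(history).split()
    return " ".join(words[-SUMMARY_BUDGET:])


# Context size after each turn for both strategies.
buffer_sizes = [approx_tokens(buffer_context(turns[:n])) for n in range(1, len(turns) + 1)]
summary_sizes = [approx_tokens(summary_context(turns[:n])) for n in range(1, len(turns) + 1)]

print("buffer :", buffer_sizes)   # keeps growing with every turn
print("summary:", summary_sizes)  # stays at or below SUMMARY_BUDGET
```

The buffer column grows with every exchange, while the capped digest stays flat, which is the property the summary approach relies on.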

---

### How ConversationSummaryMemory Works Internally

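1. The conversation progresses turn by turn.
2. Each time the memory is updated (after an exchange), an LLM:

   * Reads the existing summary and the most recent messages
   * Produces an updated running summary
3. Raw messages from earlier turns are no longer included in the prompt.
4. Only the summary is passed forward as context.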

Example evolving summary:

```
User reported VPN issue on Windows.
Assistant suggested basic troubleshooting.
```
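
The update loop above can be sketched in a few lines of plain Python. This is a toy illustration, not LangChain's implementation: `RunningSummaryMemory`, `save_turn`, `load`, and `stub_summarizer` are hypothetical names, and the stub stands in for the LLM call that would normally condense the previous summary and the new exchange.

```python
from typing import Callable

# (previous_summary, human_message, ai_message) -> new_summary
Summarizer = Callable[[str, str, str], str]


class RunningSummaryMemory:
    """Toy running-summary memory: keeps one summary string, no transcript."""

    def __init__(self, summarize: Summarizer) -> None:
        self.summarize = summarize
        self.summary = ""

    def save_turn(self, human: str, ai: str) -> None:
        # Fold the newest exchange into the existing summary and keep
        # only the result; the raw messages are not stored here.
        self.summary = self.summarize(self.summary, human, ai)

    def load(self) -> str:
        # This string is what would be placed into the next prompt.
        return self.summary


def stub_summarizer(previous: str, human: str, ai: str) -> str:
    """Stand-in for the LLM call: records one short note per turn.

    A real summarizer would ask an LLM to condense `previous` plus the
    new exchange instead of appending a note.
    """
    note = f"User said: {human}"
    return f"{previous}\n{note}" if previous else note


memory = RunningSummaryMemory(stub_summarizer)
memory.save_turn("My VPN is not working on Windows 11", "Let's check your network adapter.")
memory.save_turn("It disconnects every 5 minutes", "That may point to a session timeout.")
print(memory.load())
```

Passing the summarizer in as a function keeps the loop independent of any particular model, so the same structure works with a real LLM call in place of the stub.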

---

### Architecture View

![Image](https://dezyre.gumlet.io/images/blog/langchain-memory/Types_of_Langchain_Memory.webp?dpr=2.6\&w=376\&utm_source=chatgpt.com)

![Image](https://towardsdatascience.com/wp-content/uploads/2024/11/1ncfjCpCN8XqYBj7wyZziow.png?utm_source=chatgpt.com)

![Image](https://www.vasinov.com/images/adding-memory-to-gpt-models/gpt-memory-2.png?utm_source=chatgpt.com)

---

### Basic Demonstration (LangChain)

#### Initialize Summary Memory



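```python
from langchain_classic.memory import ConversationSummaryMemory
from langchain_openai import OpenAI

# The LLM that answers the user is also used here to write the running summary.
# OpenAI() expects an API key, typically via the OPENAI_API_KEY environment variable.
llm = OpenAI()

memory = ConversationSummaryMemory(
    llm=llm
)
```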




---

#### Attach Memory to a Conversation Chain



```python
from langchain_classic.chains import ConversationChain

# The chain reads the summary from `memory` before each call
# and updates it after each call.
conversation = ConversationChain(
    llm=llm,
    memory=memory
)
```



---

#### Run a Long Conversation



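```python
conversation.predict(input="My VPN is not working on Windows 11")
conversation.predict(input="It disconnects every 5 minutes")
conversation.predict(input="I already tried restarting")

# Only the final reply is printed; the earlier calls still update the summary.
print(conversation.predict(input="What issue am I facing?"))
```

**Output** (captured from one run; the exact wording varies by model and run)

```
It seems like you are experiencing issues with your VPN on Windows 11. Specifically, it keeps disconnecting every 5 minutes. Is that correct?
```

Depending on what the summary retained, a reply may also mention the failed restart, for example:

```
You are facing a VPN connectivity issue on Windows 11 that disconnects frequently, despite restarting.
```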

The model recalls this **from the summary**, not raw history.

---

### Inspecting the Stored Summary

```python
print(memory.buffer)
```

Example summary:

```
User has a VPN connectivity issue on Windows 11 with frequent disconnections.
Restarting did not resolve the issue.
```

In a longer session, a summary like this stands in for dozens of prior messages.

---

### How the Summary Is Injected into the Prompt

Instead of replaying the full chat history, the prompt carries the running summary. An illustrative layout (the exact template depends on the chain's prompt):

```
System: You are a helpful assistant.

Conversation Summary:
User has a VPN connectivity issue on Windows 11 with frequent disconnections.

Human: What issue am I facing?
AI:
```
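
As a rough sketch of how such a prompt could be assembled, the snippet below fills a template with the current summary and the new user message. `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names that mirror the layout shown above; they are not LangChain's default template.

```python
# Assumed template, copied from the illustrative prompt layout in this section.
PROMPT_TEMPLATE = (
    "System: You are a helpful assistant.\n\n"
    "Conversation Summary:\n{summary}\n\n"
    "Human: {user_input}\n"
    "AI:"
)


def build_prompt(summary: str, user_input: str) -> str:
    """Combine the running summary with the newest user message."""
    return PROMPT_TEMPLATE.format(summary=summary, user_input=user_input)


prompt = build_prompt(
    summary="User has a VPN connectivity issue on Windows 11 with frequent disconnections.",
    user_input="What issue am I facing?",
)
print(prompt)
```

Only one `Human:` line appears in the result, no matter how many turns preceded it; everything earlier is represented by the summary text.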

---

### Strengths of ConversationSummaryMemory

* Bounded token usage
* Suitable for long sessions
* Lower cost than buffer memory
* Maintains conversational continuity

---

### Limitations

| Limitation     | Explanation                          |
| -------------- | ------------------------------------ |
| Loss of detail | Fine-grained info may be dropped     |
| Summary drift  | Errors propagate if summary is wrong |
| LLM dependency | Needs model calls to summarize       |
| Not queryable  | No semantic recall of specifics      |
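
The first and last rows can be illustrated with a small, self-contained sketch. It assumes a deliberately lossy stand-in for the summarizer (keep only the most recent words) and a hypothetical error code in the first message; `SummaryWithLog` and its methods are invented for this example. Keeping a verbatim log next to the summary is one possible way to recover specifics, shown here with a plain keyword search.

```python
from typing import List


class SummaryWithLog:
    """Toy memory: a lossy summary plus a verbatim log kept outside the prompt."""

    def __init__(self, max_summary_words: int = 8) -> None:
        self.max_summary_words = max_summary_words
        self.summary = ""
        self.log: List[str] = []

    def save(self, message: str) -> None:
        self.log.append(message)
        # Stand-in for LLM summarization: keep only the most recent words.
        # Like a real summary it is short, and older details fall out.
        words = f"{self.summary} {message}".split()
        self.summary = " ".join(words[-self.max_summary_words:])

    def search_log(self, keyword: str) -> List[str]:
        """Case-insensitive keyword lookup over the verbatim messages."""
        return [m for m in self.log if keyword.lower() in m.lower()]


memory = SummaryWithLog()
memory.save("VPN client shows error code 809 on Windows 11")  # hypothetical detail
memory.save("It disconnects every 5 minutes")
memory.save("Restarting did not help")

print(memory.summary)            # the error code is no longer in the summary
print(memory.search_log("809"))  # but it can still be found in the log
```

The summary alone cannot answer "which error code did I mention?", while the log can, which is why summary memory is often paired with a store that supports lookup.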

---

### When to Use ConversationSummaryMemory

Use it when:

* Sessions are long-running
* Cost control matters
* Exact phrasing is not critical
* You need conversational continuity

Avoid it when:

* Exact wording matters
* You need factual traceability
* Auditing or compliance is required

---

### Comparison with Other Memory Types

| Memory Type                | Stored Data | Token Growth | Best Use         |
| -------------------------- | ----------- | ------------ | ---------------- |
| ConversationBufferMemory   | Full chat   | High         | Short chats      |
| ConversationSummaryMemory  | Summary     | Low          | Long chats       |
| VectorStoreRetrieverMemory | Embeddings  | Low          | Semantic recall  |
| Redis / DB Memory          | Structured  | Low          | Production scale |

---

### Real-World Use Case

**Production IT Support Assistant**

* Long user sessions
* Multi-step troubleshooting
* Needs continuity without token explosion
* Summary memory maintains context efficiently

---

### Key Takeaways

* ConversationSummaryMemory stores **compressed context**
* It scales better than buffer memory
* Ideal for long-running conversational agents
* Often combined with vector or database memory for precision
