```{contents}
```
## Long-Term vs Short-Term Memory

### What Is Short-Term Memory

**Short-term memory** stores the **immediate conversational context** needed to answer follow-up questions within the same interaction.
It is **temporary, fast, and limited in size**.

Typical implementations (provided by LangChain):

* `ConversationBufferMemory`
* `ConversationBufferWindowMemory`
* `ConversationSummaryMemory`

```
Recent Conversation
   ↓
Short-Term Memory
   ↓
Prompt Context
```
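To make the "limited in size" property concrete, here is a minimal pure-Python sketch of a sliding-window buffer. It is not LangChain's implementation; the `WindowBuffer` class and its method names are hypothetical and only illustrate how older exchanges fall out of the prompt context.

```python
from collections import deque


class WindowBuffer:
    """Toy short-term memory (hypothetical): keeps only the last k exchanges."""

    def __init__(self, k: int) -> None:
        # deque(maxlen=k) silently drops the oldest exchange once it is full.
        self.turns = deque(maxlen=k)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def as_prompt_context(self) -> str:
        # Render the surviving exchanges in chat-transcript form.
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


buffer = WindowBuffer(k=2)
buffer.add("My VPN is not working", "Let's check it.")
buffer.add("It disconnects every 5 minutes", "Is it on Wi-Fi?")
buffer.add("Yes, on Wi-Fi", "Try a wired connection.")

context = buffer.as_prompt_context()
print(context)
```

After the third exchange the first one is no longer part of `context`, which is the trade-off a window-style memory makes to keep the prompt small.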

---

### What Is Long-Term Memory

**Long-term memory** stores **persistent knowledge** across conversations, sessions, or even days/weeks.
It is **retrievable by relevance**, not by time.

Typical implementations:

* VectorStore-backed memory
* Entity memory
* Database / Redis memory

```
Facts / Events / Preferences
   ↓
Long-Term Storage
   ↓ (semantic or key lookup)
Relevant Context
```
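The "retrievable by relevance, not by time" idea can be sketched without any external service. The example below uses word counts as a stand-in for real embeddings, so it is only an illustration under that assumption; `embed`, `cosine`, and `RelevanceStore` are hypothetical names, not LangChain APIs.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class RelevanceStore:
    """Toy long-term memory: ranks facts by similarity, ignoring their age."""

    def __init__(self) -> None:
        self.facts = []

    def add(self, fact: str) -> None:
        self.facts.append(fact)

    def search(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self.facts, key=lambda f: cosine(q, embed(f)), reverse=True)
        return ranked[:k]


store = RelevanceStore()
store.add("The user works on Windows 11")  # stored first, so it is the oldest
store.add("The user prefers email over phone")
store.add("The office printer is on floor 3")

top = store.search("Which OS does the user work on?", k=1)
print(top)
```

The oldest fact is returned because it overlaps most with the query. A real vector store applies the same ranking idea with learned embeddings, which also match paraphrases that share no words.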

---

### Key Differences at a Glance

| Aspect   | Short-Term Memory   | Long-Term Memory      |
| -------- | ------------------- | --------------------- |
| Lifetime | Single conversation | Across sessions       |
| Size     | Small               | Large                 |
| Recall   | Sequential / recent | Semantic / factual    |
| Cost     | Low                 | Medium                |
| Purpose  | Dialogue continuity | Knowledge persistence |

---

### Architecture View (Together)

```
User Input
   ↓
Short-Term Memory (recent chat)
   ↓
Long-Term Memory (relevant facts)
   ↓
LLM Response
```
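One way to read this flow is as a prompt-assembly step that merges both memory layers before the model is called. The sketch below assumes a simple text layout; `build_prompt` and its section labels are illustrative choices, not a format any library prescribes.

```python
def build_prompt(user_input, recent_turns, relevant_facts):
    """Assemble one prompt from both memory layers (layout is an assumption)."""
    facts_block = "\n".join(f"- {fact}" for fact in relevant_facts) or "- (none)"
    chat_block = "\n".join(recent_turns) or "(no earlier turns)"
    return (
        "Known facts about the user:\n"
        f"{facts_block}\n\n"
        "Recent conversation:\n"
        f"{chat_block}\n\n"
        f"Human: {user_input}\nAI:"
    )


prompt = build_prompt(
    user_input="It's still disconnecting",
    recent_turns=[
        "Human: My VPN drops every 5 minutes",
        "AI: Try a wired connection.",
    ],
    relevant_facts=["User runs Windows 11"],
)
print(prompt)
```

Keeping the two sections separate in the prompt lets the model distinguish durable facts from the current thread of conversation.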

---

## Short-Term Memory — Demonstration

### Short-Term Memory Using ConversationBufferWindowMemory

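```python
# Legacy LangChain import paths; newer releases move the OpenAI wrapper
# to the separate langchain_openai package.
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain
from langchain.llms import OpenAI

llm = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# k=2 keeps only the last two exchanges (human + AI pairs) in the prompt
short_term_memory = ConversationBufferWindowMemory(k=2)

conversation = ConversationChain(
    llm=llm,
    memory=short_term_memory
)
```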

---

### Short-Term Memory in Action

```python
conversation.predict(input="My VPN is not working")
conversation.predict(input="It disconnects every 5 minutes")
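# predict() returns the reply as a string; print the final answer
print(conversation.predict(input="What issue did I mention?"))
```

**Output** (illustrative; exact wording depends on the model)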

```
You mentioned that your VPN disconnects every 5 minutes.
```

With `k=2`, only the **two most recent exchanges** are placed in the prompt; anything older is dropped.

---

## Long-Term Memory — Demonstration

### Long-Term Memory Using Vector Store

Using Chroma:

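```python
# Legacy import paths; newer releases expose these classes through
# langchain_community (or langchain_chroma / langchain_openai).
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.memory import VectorStoreRetrieverMemory

embeddings = OpenAIEmbeddings()

# Without persist_directory the collection lives in memory only; set it to a
# folder path to keep memories across process restarts.
vectorstore = Chroma(
    collection_name="long_term_memory",
    embedding_function=embeddings
)

# Retrieve the 2 most similar stored exchanges for each new input
long_term_memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2})
)
```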

---

### Long-Term Memory in Action

```python
conversation = ConversationChain(
    llm=llm,
    memory=long_term_memory
)

conversation.predict(input="I use Windows 11 and my VPN drops often")
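# predict() returns the reply as a string; print the final answer
print(conversation.predict(input="What OS do I usually work on?"))
```

**Output** (illustrative; exact wording depends on the model)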

```
You usually work on Windows 11.
```

This works **even after many intervening turns**, because stored exchanges are recalled by **semantic similarity** to the current input, not by their position in the conversation.

---

## Combining Short-Term + Long-Term Memory

### Hybrid Memory Pattern (Recommended)

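```python
from langchain.memory import CombinedMemory
from langchain.prompts import PromptTemplate

# CombinedMemory rejects duplicate memory keys (both default to "history"),
# so each memory gets its own key and an explicit input key.
short_term_memory = ConversationBufferWindowMemory(
    k=2, memory_key="recent_chat", input_key="input"
)
long_term_memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
    memory_key="known_facts", input_key="input",
    exclude_input_keys=["recent_chat"],  # do not store the chat window as a fact
)
memory = CombinedMemory(memories=[short_term_memory, long_term_memory])

# The default prompt only has {history}; reference both memory variables.
prompt = PromptTemplate.from_template(
    "Relevant facts:\n{known_facts}\n\nRecent conversation:\n{recent_chat}\n\n"
    "Human: {input}\nAI:"
)
conversation = ConversationChain(llm=llm, memory=memory, prompt=prompt)
```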

---

### Hybrid Memory Behavior

* Short-term memory → recent troubleshooting steps
* Long-term memory → OS, device, recurring issues

```
User: It’s still disconnecting
LLM:
• Uses short-term memory → “still”
• Uses long-term memory → “VPN on Windows 11”
```

---

### Real-World Use Case

**Production IT Support Assistant**

* Short-term memory:

  * Current troubleshooting flow
* Long-term memory:

  * User OS, device, recurring incidents
  * Historical preferences

This loosely mirrors **human cognition**: working memory holds the task at hand, while long-term memory holds durable facts.
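The long-term side of such an assistant could be backed by a plain key lookup rather than semantic search. The sketch below uses SQLite from the Python standard library as a stand-in for a production database or Redis; the `UserFactStore` class, the table layout, and the user ID `alice` are all hypothetical.

```python
import sqlite3


class UserFactStore:
    """Toy persistent memory (hypothetical): one row per (user_id, key)."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to survive process restarts.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS facts ("
            "user_id TEXT, key TEXT, value TEXT, "
            "PRIMARY KEY (user_id, key))"
        )

    def remember(self, user_id: str, key: str, value: str) -> None:
        # INSERT OR REPLACE overwrites an older value stored under the same key.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO facts VALUES (?, ?, ?)",
                (user_id, key, value),
            )

    def recall(self, user_id: str) -> dict:
        rows = self.conn.execute(
            "SELECT key, value FROM facts WHERE user_id = ?", (user_id,)
        )
        return dict(rows.fetchall())


fact_store = UserFactStore()
fact_store.remember("alice", "os", "Windows 10")
fact_store.remember("alice", "os", "Windows 11")  # the user upgraded later
fact_store.remember("alice", "recurring_issue", "VPN drops")

profile = fact_store.recall("alice")
print(profile)
```

Key lookup suits structured facts such as OS or device, where an exact and current value matters more than fuzzy matching.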

---

### When to Use What

**Use short-term memory when:**

* Context is immediate
* Tasks are linear
* Cost must be minimal

**Use long-term memory when:**

* Personalization matters
* Sessions span time
* Facts must persist

---

### Best-Practice Memory Strategy

| Layer          | Memory Type     |
| -------------- | --------------- |
| Immediate      | Buffer / Window |
| Conversational | Summary         |
| Knowledge      | Vector / Entity |
| Persistence    | DB / Redis      |
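The first two layers of this table can be sketched together: exchanges that fall out of the window are folded into a running summary instead of being lost. This is a toy illustration, not LangChain's summary memory; `summarize` is a hypothetical placeholder where a real system would call an LLM.

```python
from collections import deque


def summarize(previous_summary: str, user: str, assistant: str) -> str:
    """Hypothetical summarizer; a real system would call an LLM here."""
    note = f"User said: {user!r}."
    return f"{previous_summary} {note}".strip()


class LayeredMemory:
    """Window for the last k exchanges, running summary for older ones."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.window = deque()
        self.summary = ""

    def add(self, user: str, assistant: str) -> None:
        self.window.append((user, assistant))
        # Fold exchanges that fall out of the window into the summary.
        while len(self.window) > self.k:
            old_user, old_assistant = self.window.popleft()
            self.summary = summarize(self.summary, old_user, old_assistant)


memory = LayeredMemory(k=1)
memory.add("My VPN is not working", "Let's check it.")
memory.add("It disconnects every 5 minutes", "Is it on Wi-Fi?")

print(memory.summary)
print(list(memory.window))
```

The window keeps exact wording for the latest exchange, while the summary retains a compressed trace of everything older.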

---

### Key Takeaways

* Short-term memory handles **conversation flow**
* Long-term memory handles **knowledge retention**
* They solve different problems
* Production systems **typically combine both**