# 15 — Memory with LangGraph (Short-term & Persistent)

This notebook shows how to add **chat memory** using **LangGraph** checkpointers (recommended in LangChain v0.3+). We cover:

1) **Short-term memory** with `MemorySaver` (in RAM).
2) **Persistent memory** with `SqliteSaver` (on disk).
3) **Multi-input state** (prompt parameters alongside messages).

We use `MessagesState` and `StateGraph`, with a `thread_id` in the config to scope history per conversation. The official guides on message history and LangGraph persistence are linked below.

> Docs:
> - How to add message history (LangChain v0.3+):
>   - https://python.langchain.com/docs/how_to/message_history/
> - LangGraph checkpointing (Memory/SQLite/Postgres):
>   - https://langchain-ai.github.io/langgraph/reference/checkpoints/
> - Persistence overview & sqlite package note:
>   - https://langchain-ai.github.io/langgraph/concepts/persistence/

In [1]:
# ╔══════════════════════════════════════════════════════╗
# ║ Setup: env & model                                  ║
# ╚══════════════════════════════════════════════════════╝

import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

from langchain_openai import ChatOpenAI
# from langchain_groq import ChatGroq

llm = ChatOpenAI(model="gpt-4o-mini")
# llm = ChatGroq(model="llama-3.1-70b-versatile")

print("✅ Environment loaded and model ready.")

✅ Environment loaded and model ready.


## 1) Short-term memory with `MemorySaver`

We define a **single-node** graph that sends the current message history to the model and appends the AI reply back into state.

We compile with **`MemorySaver`** (in-memory checkpointer). History is scoped by `thread_id` in `config`.

In [2]:
from langchain_core.messages import HumanMessage
from langgraph.graph import START, MessagesState, StateGraph
from langgraph.checkpoint.memory import MemorySaver

# Define a chat workflow over MessagesState (list of messages)
workflow = StateGraph(state_schema=MessagesState)

def call_model(state: MessagesState):
    # state["messages"] is a list[BaseMessage]; pass the full history to the model
    ai_msg = llm.invoke(state["messages"])  # returns an AIMessage
    # MessagesState's add_messages reducer appends this reply to the stored history
    return {"messages": [ai_msg]}

workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

# In-memory checkpointer (short-term: history is lost when the process exits)
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

# Each distinct thread gets its own conversation history
cfg = {"configurable": {"thread_id": "demo-thread-1"}}

# Turn 1
out = app.invoke({"messages": [HumanMessage("Hi! My name is Alex.")]}, cfg)
print(out["messages"][-1].content)

# Turn 2 (same thread → history is preserved)
out = app.invoke({"messages": [HumanMessage("What is my name?")]}, cfg)
print(out["messages"][-1].content)

# New thread (no prior history)
cfg2 = {"configurable": {"thread_id": "demo-thread-2"}}
out2 = app.invoke({"messages": [HumanMessage("Do you remember me?")]}, cfg2)
print(out2["messages"][-1].content)

Hi Alex! How can I assist you today?
Your name is Alex. How can I help you today, Alex?
I don’t have the ability to remember past interactions, but I'm here to help you with any questions or topics you want to discuss! How can I assist you today?


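**What happened?**
- The checkpointer saves the message history **per `thread_id`** after each invocation.
- Re-invoking on the same thread loads the prior messages into `MessagesState` automatically, which is why the model could answer "Alex" in turn 2.
- The new thread (`demo-thread-2`) started with an empty history, so the model had nothing to recall.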
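To make the per-thread scoping concrete, here is a minimal pure-Python sketch of the idea. It is not LangGraph's implementation: `ToyCheckpointer`, `toy_invoke`, and `echo_model` are hypothetical names invented for illustration, and messages are plain `(role, text)` tuples rather than `BaseMessage` objects.

```python
from collections import defaultdict


class ToyCheckpointer:
    """Illustrative stand-in for a checkpointer: one message list per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def load(self, thread_id):
        # Return a copy so callers cannot mutate saved state by accident
        return list(self._threads[thread_id])

    def save(self, thread_id, messages):
        self._threads[thread_id] = list(messages)


def toy_invoke(checkpointer, thread_id, user_text, reply_fn):
    """Load the thread's history, append the user turn and the reply, then save."""
    history = checkpointer.load(thread_id)
    history.append(("human", user_text))
    history.append(("ai", reply_fn(history)))
    checkpointer.save(thread_id, history)
    return history


def echo_model(history):
    # Hypothetical stand-in for an LLM: reports how many human turns it can see
    human_turns = sum(1 for role, _ in history if role == "human")
    return f"I can see {human_turns} human message(s)."


cp = ToyCheckpointer()
toy_invoke(cp, "demo-thread-1", "Hi! My name is Alex.", echo_model)
same_thread = toy_invoke(cp, "demo-thread-1", "What is my name?", echo_model)
new_thread = toy_invoke(cp, "demo-thread-2", "Do you remember me?", echo_model)

print(same_thread[-1][1])  # the second call on thread 1 sees both human turns
print(new_thread[-1][1])   # thread 2 sees only its own single turn
```

The real checkpointers store full graph state rather than a bare list, but the lookup key plays the same role as `thread_id` here.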

## 2) Persistent memory with `SqliteSaver`

To persist across kernel restarts, use **SQLite**. Install the separate package:

```bash
pip install -U langgraph-checkpoint-sqlite
```

Then compile the same graph with `SqliteSaver`. (You can also use Postgres via `langgraph-checkpoint-postgres`.)

In [3]:
# If not installed, run: pip install -U langgraph-checkpoint-sqlite
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# Note: in recent releases SqliteSaver.from_conn_string(...) is a context manager.
# Passing its return value straight to compile() raises:
#   AttributeError: '_GeneratorContextManager' object has no attribute 'get_next_version'
# Building the saver from an explicit connection avoids that.
conn = sqlite3.connect("chat_memory.db", check_same_thread=False)  # persisted on disk
sqlite_memory = SqliteSaver(conn)
app_sqlite = workflow.compile(checkpointer=sqlite_memory)

cfg_sql = {"configurable": {"thread_id": "customer-42"}}
out = app_sqlite.invoke({"messages": [HumanMessage("Hi, I'm Jamie. Please remember it.")]}, cfg_sql)
print(out["messages"][-1].content)

# On a later run (even after restarting the kernel), the thread history is loaded from SQLite:
out = app_sqlite.invoke({"messages": [HumanMessage("What is my name?")]}, cfg_sql)
print(out["messages"][-1].content)

> **Tip:** You can inspect the saved state for a thread with `get_state` (and modify it with `update_state`):
```python
state_view = app_sqlite.get_state(cfg_sql).values
for m in state_view["messages"]:
    m.pretty_print()
```
If the history grows too large, **trim** it before calling the model, for example with `trim_messages` from `langchain_core.messages` or with a custom policy. See the message-history guide linked at the top for examples.
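As a rough illustration of a custom trimming policy, the sketch below keeps an optional leading system message plus the last *k* other messages. `keep_last_k` is a hypothetical helper, not a LangChain function, and messages are simplified to `(role, text)` tuples.

```python
def keep_last_k(messages, k):
    """Keep a leading system message (if present) plus the last k other messages."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Only the very first message is treated as a system prompt in this sketch
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    return system + rest[-k:]


history = [
    ("system", "Be concise."),
    ("human", "Hi, I'm Jamie."),
    ("ai", "Hello Jamie!"),
    ("human", "What is 2+2?"),
    ("ai", "4"),
    ("human", "What is my name?"),
]

trimmed = keep_last_k(history, 3)
for role, text in trimmed:
    print(f"{role}: {text}")
```

With `k=3` the turn that introduced the name is dropped, so the model could no longer answer the final question. That is the trade-off of a last-*k* policy, and the reason summarizing older turns is sometimes preferred.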

## 3) Multi-input state (prompt params + messages)

When your runnable accepts multiple inputs (e.g., a `language` parameter plus `messages`), define a **TypedDict** state and mark the messages channel with `add_messages` so new messages **append**, while scalar fields **overwrite**.

In [None]:
from typing import Sequence
from typing_extensions import Annotated, TypedDict
from langchain_core.messages import BaseMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.graph.message import add_messages

class ChatState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer in {language}. Be concise."),
    MessagesPlaceholder("messages"),
])
runnable = prompt | llm

wf2 = StateGraph(state_schema=ChatState)

def call_model_dict(state: ChatState):
    # Pass the whole dict (messages + language) to the runnable
    ai_msg = runnable.invoke(state)
    return {"messages": [ai_msg]}  # append the new AI message

wf2.add_node("model", call_model_dict)
wf2.add_edge(START, "model")

app2 = wf2.compile(checkpointer=MemorySaver())
cfg_es = {"configurable": {"thread_id": "lang-es-1"}}  # separate name so the earlier `cfg` is not overwritten
out = app2.invoke({"messages": [HumanMessage("Hi, I'm Bob.")], "language": "Spanish"}, cfg_es)
print(out["messages"][-1].content)  # Expect Spanish reply
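The append-versus-overwrite behaviour described in this section can be mimicked in a few lines of plain Python. This is only a sketch of the idea: `append_reducer`, `REDUCERS`, and `apply_update` are hypothetical names, and the real `add_messages` reducer does more (for instance, it replaces an existing message when an update carries the same ID).

```python
def append_reducer(existing, new):
    """Toy reducer: concatenate the old and new lists (message IDs are ignored)."""
    return list(existing) + list(new)


# Channels listed here are merged with a reducer; all others are overwritten
REDUCERS = {"messages": append_reducer}


def apply_update(state, update):
    """Return a new state dict with the node's update merged in."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["Hi, I'm Bob."], "language": "Spanish"}
state = apply_update(state, {"messages": ["Hola, Bob."]})
state = apply_update(state, {"messages": ["Switch to French."], "language": "French"})

print(state["messages"])  # three entries: every update was appended
print(state["language"])  # the scalar field holds only the latest value
```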

## Best practices
- Use **`thread_id`** to isolate conversations by user/session.
- **Short-term (MemorySaver)** for demos/tests; **SQLite/Postgres** for apps that must survive restarts.
- Limit history (e.g., last *k* messages) or **summarize** when token budgets are tight.
- Store useful metadata (e.g., user ID, creation time) in your DB if you'll analyze or filter threads later.
- For agents/tools/multi-step workflows, LangGraph checkpointing also enables **human-in-the-loop**, **time-travel/replay**, and **fault recovery**.
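One possible way to apply the first point is to derive the `thread_id` from a user and session identifier. The helper below is a hypothetical sketch: `thread_config` and the `user:session` naming scheme are assumptions, not a LangGraph convention. Only the `{"configurable": {"thread_id": ...}}` shape comes from the examples above.

```python
def thread_config(user_id: str, session_id: str) -> dict:
    """Build an invoke config whose thread_id is unique per user and session."""
    if not user_id or not session_id:
        raise ValueError("user_id and session_id must be non-empty")
    return {"configurable": {"thread_id": f"{user_id}:{session_id}"}}


cfg_a = thread_config("customer-42", "2024-06-01")
cfg_b = thread_config("customer-42", "2024-06-02")

print(cfg_a["configurable"]["thread_id"])
print(cfg_b["configurable"]["thread_id"])
```

Passing `cfg_a` and `cfg_b` to the same compiled app would keep the two sessions' histories separate, even though they belong to the same user.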