# Memory (Conversational / State Memory)

Memory helps your app carry state across turns (conversation history, summaries, slots), so the model can stay on topic, avoid repetition, and reference prior facts.

In LangChain v0.3 you can implement memory in a few straightforward ways:
- Manual history (a list of messages) with `MessagesPlaceholder`
- `RunnableWithMessageHistory` to auto-log and replay history per session
- Windowed memory (only keep the last N messages)
- Summary memory (keep a running summary + recent messages)
- Slot/state memory (keep stable facts outside the chat history)

## Bootstrap

⚓--- Before proceeding further, it is very important that you do the following: --- 👾

Select the 🗝 (key) icon in the left pane and add your OpenAI API key as a secret, with the name "OPENAI_API_KEY" and the key as its value. Then grant it notebook access so that this notebook can run.

Run the two cells below in order before running any further cells. Wait until a number appears in place of '*' or '[ ]'. Below the second cell you should see "✅ Ready: Chat model initialized".

In [None]:
!pip install -q langchain langchain-openai langchain-community

In [None]:
# Environment & imports
from google.colab import userdata

key = userdata.get('OPENAI_API_KEY')  # Colab secret; may raise if the secret is missing or notebook access is not granted
if not key:
    raise RuntimeError("Add OPENAI_API_KEY as a Colab secret (🗝 icon) and grant this notebook access.")

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, BaseMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda

# For automatic chat histories with LCEL
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2, api_key=key)
print("✅ Ready: Chat model initialized")

## Manual Memory

LLMs don't retain memory the way humans remember previous conversations. An LLM is a sophisticated mathematical function that predicts what comes next for a given sequence of words, and each call starts from scratch.

To make an LLM behave like a conversational chatbot, you have to include the previous messages in every request. There are several ways to do that.

The simplest is to keep a list of messages and pass it in on every turn.

In [None]:
# We’ll define a prompt that accepts a history placeholder + new user input
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant. If context is insufficient, ask a clarifying question."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{user_input}")
])

chain = prompt | llm | StrOutputParser()

# Start empty conversation history
history: list[BaseMessage] = []

# Turn 1
user_input_1 = "I'm planning a weekend trip. Suggest 3 destinations near Bangalore."
answer1 = chain.invoke({"history": history, "user_input": user_input_1})
history.append(HumanMessage(content=user_input_1))
history.append(AIMessage(content=answer1))

# Turn 2
user_input_2 = "Pick the best for hiking and tell me why."
answer2 = chain.invoke({"history": history, "user_input": user_input_2})
history.append(HumanMessage(content=user_input_2))
history.append(AIMessage(content=answer2))

print("\n--- Turn 1 ---\n", answer1)
print("\n--- Turn 2 ---\n", answer2)

Refer to the `01_prompt_templates` file to understand what messages and `MessagesPlaceholder` are used for here.

### How it works

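1. You keep a list of messages (`history`) that grows each turn.
2. The prompt includes a `MessagesPlaceholder("history")`.
3. Each turn: call the chain with the current history and the new input, then append the `HumanMessage` and the `AIMessage`. The new input reaches the prompt through `{user_input}`, so appending it before the call would send it twice.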
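
The order of operations in the cell above can be sketched without LangChain or an API key. This is a minimal illustration only: `fake_model` and `chat_turn` are hypothetical names, and messages are plain `(role, content)` tuples rather than LangChain message objects.

```python
# Plain-Python sketch of the manual-history loop (no LangChain, no API key).
Message = tuple[str, str]  # (role, content)

def fake_model(messages: list[Message]) -> str:
    """Hypothetical stand-in for the chain: reports how many messages it saw."""
    return f"reply after seeing {len(messages)} messages"

def chat_turn(history: list[Message], user_input: str) -> str:
    # 1. Build the prompt from the system message, prior history, and the new input.
    prompt_messages = [("system", "You are concise.")] + history + [("human", user_input)]
    # 2. Call the model.
    answer = fake_model(prompt_messages)
    # 3. Only now record both sides of the turn.
    history.append(("human", user_input))
    history.append(("ai", answer))
    return answer

history: list[Message] = []
first = chat_turn(history, "Suggest 3 destinations.")
second = chat_turn(history, "Pick the best for hiking.")
print(first)         # reply after seeing 2 messages
print(second)        # reply after seeing 4 messages
print(len(history))  # 4
```

The second call sees four messages (system, the first turn's two messages, and the new input), which is exactly what gives the model its apparent memory.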

## Auto Memory

Manually passing history is fine, but in a real app you’ll want the chain to store & fetch history automatically per user or session.

Here's an example of a session-scoped chat using an in-memory store:

In [None]:
# A base chat prompt that expects a {question}; history will be injected automatically
base_prompt = ChatPromptTemplate.from_messages([
    ("system", "Be brief and helpful."),
    MessagesPlaceholder("history"),
    ("user", "{question}")
])

base_chain = base_prompt | llm | StrOutputParser()

# A simple per-session store for histories
store: dict[str, InMemoryChatMessageHistory] = {}

def get_history(sess_id: str) -> BaseChatMessageHistory:
    if sess_id not in store:
        store[sess_id] = InMemoryChatMessageHistory()
    return store[sess_id]

# Wrap base_chain with RunnableWithMessageHistory
chat_with_memory = RunnableWithMessageHistory(
    base_chain,
    get_history,
    input_messages_key="question",   # which input field is the new user message
    history_messages_key="history",  # which placeholder the chain expects
)

# Simulate two turns in the same session
session_config = {"configurable": {"session_id": "user-123"}}
print(type(session_config))

resp1 = chat_with_memory.invoke({"question": "Remind me what LCEL is in one sentence."}, config=session_config)
resp2 = chat_with_memory.invoke({"question": "Great. Now give me two practical uses."},  config=session_config)

print("\n--- Session user-123 ---")
print(resp1)
print(resp2)

# A different session starts with a clean slate
resp_other = chat_with_memory.invoke({"question": "What is LCEL?"}, config={"configurable": {"session_id": "guest"}})
print("\n--- Session guest ---")
print(resp_other)


### How it works

`RunnableWithMessageHistory` requires the following inputs:
1. An LCEL runnable whose prompt template contains a `MessagesPlaceholder`.
2. A callable that accepts a session id (a string) and returns a `BaseChatMessageHistory` implementation.
3. `input_messages_key`: the key in the `.invoke` input that holds the new user message.
4. `history_messages_key`: the variable name used by the `MessagesPlaceholder`.
5. A config passed to `.invoke` in the form `{"configurable": {"session_id": "<id>"}}`. The `configurable` key is fixed, and `session_id` is the key name expected by default.

On each call, the `RunnableWithMessageHistory`:
- Derives a session identifier from the config dictionary.
- Retrieves a `BaseChatMessageHistory` (here, `InMemoryChatMessageHistory`) which exposes `messages`, `add_user_message`, `add_ai_message`.
- Creates the chain inputs by merging the caller’s inputs with `{history_messages_key: history.messages}`.
- Delegates to `runnable.invoke(...)`.
- Persists the turn by writing user/AI messages back to the history.
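
The per-call steps listed above can be pictured with a small plain-Python wrapper. This is a conceptual sketch only, not LangChain's implementation: `WithHistory`, `SessionHistory`, and `echo_runnable` are hypothetical names, and the wrapped "runnable" is an ordinary function.

```python
from typing import Callable

Message = tuple[str, str]  # (role, content)

class SessionHistory:
    """Hypothetical minimal history object: just a list of messages."""
    def __init__(self) -> None:
        self.messages: list[Message] = []

class WithHistory:
    """Conceptual stand-in for a history-aware wrapper (not the library class)."""
    def __init__(self, runnable: Callable[[dict], str],
                 get_history: Callable[[str], SessionHistory],
                 input_key: str, history_key: str) -> None:
        self.runnable = runnable
        self.get_history = get_history
        self.input_key = input_key
        self.history_key = history_key

    def invoke(self, inputs: dict, config: dict) -> str:
        session_id = config["configurable"]["session_id"]   # derive the session id
        history = self.get_history(session_id)               # fetch that session's history
        merged = {**inputs, self.history_key: list(history.messages)}  # inject history
        answer = self.runnable(merged)                        # delegate to the wrapped runnable
        history.messages.append(("human", inputs[self.input_key]))    # persist the turn
        history.messages.append(("ai", answer))
        return answer

store: dict[str, SessionHistory] = {}

def get_history(session_id: str) -> SessionHistory:
    return store.setdefault(session_id, SessionHistory())

def echo_runnable(inputs: dict) -> str:
    return f"{len(inputs['history'])} prior messages; question={inputs['question']}"

chat = WithHistory(echo_runnable, get_history, input_key="question", history_key="history")
cfg = {"configurable": {"session_id": "user-123"}}
r1 = chat.invoke({"question": "a"}, cfg)
r2 = chat.invoke({"question": "b"}, cfg)
r3 = chat.invoke({"question": "c"}, {"configurable": {"session_id": "guest"}})
print(r1)  # 0 prior messages; question=a
print(r2)  # 2 prior messages; question=b
print(r3)  # 0 prior messages; question=c
```

The `guest` session starts from zero prior messages because the store is keyed by session id, which mirrors the clean-slate behavior shown in the cell above.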

## Windowed Memory (keep only the last K messages)

Long histories get expensive. A common trick is to window: keep only the last K messages (or last K turns).

In [None]:
def last_n_messages(history: list[BaseMessage], n: int = 6) -> list[BaseMessage]:
    # keep only the tail of the conversation (slicing already handles len(history) <= n)
    return history[-n:] if n > 0 else []

# Build a windowed chain: compute a trimmed history before calling the model
windowed_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are helpful and concise."),
    MessagesPlaceholder("history"),
    ("user", "{user_input}")
])

def with_window(inputs):
    full_history = inputs["history"]
    trimmed = last_n_messages(full_history, n=4)
    return {"history": trimmed, "user_input": inputs["user_input"]}

windowed_chain = RunnableLambda(with_window) | windowed_prompt | llm | StrOutputParser()

# Demo
hist = []
hist.append(HumanMessage(content="Remember my favorite city is Mysuru.")); hist.append(AIMessage(content="Got it!"))
hist.append(HumanMessage(content="I like hiking."));                     hist.append(AIMessage(content="Noted."))
hist.append(HumanMessage(content="I enjoy filter coffee."));             hist.append(AIMessage(content="Nice!"))

# Now ask something: only the last 4 messages are sent to the model, so the
# Mysuru fact from the first turn falls outside the window and is not visible to it
ans = windowed_chain.invoke({"history": hist, "user_input": "Plan a morning in my favorite city."})
print("\n--- Windowed Answer ---\n", ans)
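
Windowing by turns instead of by raw messages keeps each human message together with its reply, so the window never opens on an orphaned AI message. The sketch below shows one possible way to do that in plain Python; `last_n_turns` is a hypothetical helper, and messages are `(role, content)` tuples rather than LangChain objects.

```python
Message = tuple[str, str]  # (role, content)

def last_n_turns(history: list[Message], n: int = 2) -> list[Message]:
    """Keep the last n turns, where a turn starts at a human message."""
    if n <= 0:
        return []
    human_positions = [i for i, (role, _) in enumerate(history) if role == "human"]
    if len(human_positions) <= n:
        return list(history)  # fewer than n turns so far: keep everything
    # Slice from the n-th most recent human message to the end.
    return history[human_positions[-n]:]

hist: list[Message] = [
    ("human", "Remember my favorite city is Mysuru."), ("ai", "Got it!"),
    ("human", "I like hiking."), ("ai", "Noted."),
    ("human", "I enjoy filter coffee."), ("ai", "Nice!"),
]

window = last_n_turns(hist, n=2)
print(len(window))  # 4
print(window[0])    # ('human', 'I like hiking.')
```

As with message-based windowing, anything older than the window (here, the favorite city) is simply gone.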

## Summary Memory (rolling summary + recent messages)

Windowed memory forgets everything that falls outside the window. That is a problem for a chatbot that needs to remember facts from early in the conversation.

Another strategy is to keep a short summary of prior context (the “old” part), plus fresh recent messages verbatim.

In [None]:
# A small summarizer chain that condenses old history + the latest turn
summarizer_prompt = ChatPromptTemplate.from_messages([
    ("system", "Summarize the conversation so far in 3-4 crisp bullet points. Keep key facts."),
    ("user", "Existing summary:\n{summary}\n\nNew messages:\n{new_text}\n\nReturn only the updated summary.")
])
summarizer_chain = summarizer_prompt | llm | StrOutputParser()

def update_summary(summary: str, new_messages: list[BaseMessage]) -> str:
    # Produces text like "HUMAN: Hi\nAI: Hello back" (m.type is "human" / "ai")
    joined = "\n".join(f"{m.type.upper()}: {m.content}" for m in new_messages)
    return summarizer_chain.invoke({"summary": summary or "(none yet)", "new_text": joined})

# Conversation driver that keeps:
#   - 'summary' for distant context
#   - last 4 messages verbatim
summary = ""
history: list[BaseMessage] = []

def ask(user_text: str):
    global summary, history

    # Take the recent window BEFORE recording the new user message:
    # the new message reaches the prompt through {user_input}, so including
    # it in `recent` as well would send it to the model twice
    K = 4
    recent = history[-K:]

    # Build the prompt: include summary as a system hint + last K messages
    composed = ChatPromptTemplate.from_messages([
        ("system", "You are helpful and concise."),
        ("system", "Conversation summary so far (for context): {summary}"),
        MessagesPlaceholder("recent"),
        ("user", "{user_input}")
    ]) | llm | StrOutputParser()

    answer = composed.invoke({"summary": summary or "(none yet)", "recent": recent, "user_input": user_text})

    # Record both sides of the turn after the call
    history.append(HumanMessage(content=user_text))
    history.append(AIMessage(content=answer))

    # Periodically fold older messages into the summary (here: when history grows)
    if len(history) > 8:
        # summarize everything except the most recent K messages
        old = history[:-K]
        summary = update_summary(summary, old)
        # keep only the recent window in history (we already folded the old part)
        history[:] = history[-K:]

    return answer

# Demo turns
print(ask("I’m planning a 2-day trip; I love nature walks and historical places."))
print(ask("Budget is moderate; prefer public transport."))
print(ask("Suggest 2 itineraries near Bangalore."))
print(ask("Pick one and list packing essentials."))

### How it works

- You maintain a summary string for distant context (cheap to include).
- You keep only recent messages verbatim (last K).
- As the conversation grows, fold older content into the summary via a small summarizer chain.
- The main prompt includes both: summary and recent.
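
The fold step can be isolated and checked without calling a model. The following sketch assumes a stub summarizer that just concatenates message contents; `stub_summarize` and `fold` are hypothetical names, and the `keep` and `threshold` defaults mirror the `K = 4` and `len(history) > 8` values used in the cell above.

```python
Message = tuple[str, str]  # (role, content)

def stub_summarize(summary: str, old: list[Message]) -> str:
    """Hypothetical stand-in for the summarizer chain: concatenates contents."""
    facts = "; ".join(content for _, content in old)
    return f"{summary}; {facts}" if summary else facts

def fold(summary: str, history: list[Message],
         keep: int = 4, threshold: int = 8) -> tuple[str, list[Message]]:
    """Fold everything except the last `keep` messages into the summary."""
    if len(history) <= threshold:
        return summary, history            # not long enough yet: nothing to fold
    old, recent = history[:-keep], history[-keep:]
    return stub_summarize(summary, old), recent

# Ten messages (five turns) exceed the threshold of eight.
history = [("human" if i % 2 == 0 else "ai", f"m{i}") for i in range(10)]
summary, history = fold("", history)
print(summary)                     # m0; m1; m2; m3; m4; m5
print([c for _, c in history])     # ['m6', 'm7', 'm8', 'm9']
```

With these values, four turns produce exactly eight messages, so no fold happens until a fifth turn is added.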

## Slot/State Memory

Chat history alone is often not enough. You also need to persist user-specific information and key discoveries such as the user's name, email, preferences, and home city. These details may not survive summarization.

In [None]:
# A dict acts as simple slot memory
profile = {"name": "Raghu", "home_city": "Bangalore", "likes": ["hiking", "filter coffee"]}

profile_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a personalized assistant."),
    ("system", "User profile: name={name}, home_city={home_city}, likes={likes}"),
    MessagesPlaceholder("history"),
    ("user", "{question}")
])

profile_chain = profile_prompt | llm | StrOutputParser()

hist = []
hist.append(HumanMessage(content="Remember I prefer early mornings.")); hist.append(AIMessage(content="Noted."))

print(profile_chain.invoke({
    "name": profile["name"],
    "home_city": profile["home_city"],
    "likes": ", ".join(profile["likes"]),
    "history": hist,
    "question": "Plan a Saturday morning activity I’d enjoy."
}))

### How it works

- Store stable facts outside chat as a dict/DB.
- Inject them into the prompt as system context.
- Use history for ephemeral, conversational detail.
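
Storing the profile outside the chat could look like the sketch below, which persists the dict as JSON and renders it into a single system-context line. It is a minimal illustration under assumptions: the helper names and the `profile.json` location are hypothetical, and a real app would more likely use a database keyed by user id.

```python
import json
import tempfile
from pathlib import Path

def load_profile(path: Path) -> dict:
    """Return the stored profile, or an empty dict if nothing was saved yet."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}

def save_profile(path: Path, profile: dict) -> None:
    path.write_text(json.dumps(profile, ensure_ascii=False), encoding="utf-8")

def render_profile(profile: dict) -> str:
    """Format the slots as one line of system context."""
    parts = []
    for key, value in profile.items():
        if isinstance(value, list):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    return "User profile: " + ", ".join(parts)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "profile.json"   # hypothetical storage location
    profile = load_profile(path)        # {} on the first run
    profile.update({"name": "Raghu", "home_city": "Bangalore",
                    "likes": ["hiking", "filter coffee"]})
    save_profile(path, profile)
    restored = load_profile(path)       # what a later session would read

line = render_profile(restored)
print(line)  # User profile: name=Raghu, home_city=Bangalore, likes=hiking, filter coffee
```

Passing the rendered line to the prompt as a template variable, rather than pasting it into the template text, avoids problems if a stored value ever contains curly braces.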