# Session 1.4: Memory and Conversation Management


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1TCfyUKwq_cjW-I1EAgDDK74-A6-pIAQB?usp=sharing)


## Overview

Memory allows LLMs to maintain context across multiple interactions. In this notebook, you'll learn:

- **What is memory** in LLM applications
- **Types of memory** in LangChain
- **Conversation buffer memory**
- **Conversation summary memory**
- **Window-based memory**
- **Memory management strategies**

### Learning Objectives

✅ Understand memory concepts  
✅ Implement different memory types  
✅ Build stateful conversations  
✅ Manage conversation history  
✅ Optimize memory usage  

In [1]:
!pip install -q langchain langchain-openai langchain-community


In [2]:
import os

from google.colab import userdata
from langchain_openai import ChatOpenAI

# Set the OpenAI API key from a Colab secret, falling back to a default value
def set_openai_api_key(default_key: str = "YOUR_API_KEY") -> None:
    """Set OPENAI_API_KEY from the Colab secret MDX_OPENAI_API_KEY, or use default_key."""
    try:
        key = userdata.get("MDX_OPENAI_API_KEY")
    except Exception:  # e.g. the secret is missing or notebook access was not granted
        key = None
    os.environ["OPENAI_API_KEY"] = key or default_key


set_openai_api_key()
#set_openai_api_key("sk-...")

llm = ChatOpenAI(model="gpt-5-nano")

## 1. Why Memory Matters

LLMs are **stateless**: each call is independent, and the model sees only what is in the current request. Anything said in an earlier call is gone unless you send it again.

### Without Memory:

In [3]:
# Demonstrating stateless behavior
response1 = llm.invoke("My name is Alice")
print("User: My name is Alice")
print(f"AI: {response1.content}\n")

response2 = llm.invoke("What's my name?")
print("User: What's my name?")
print(f"AI: {response2.content}")
print("\n❌ The model doesn't remember!")

User: My name is Alice
AI: Nice to meet you, Alice! How can I help you today? I can answer questions, help plan something, draft a message, explain a concept, or just chat. Would you like me to remember your name for this conversation?

User: What's my name?
AI: I don’t know your name. I don’t have access to personal data unless you share it. If you’d like, tell me your name or a preferred nickname and I’ll use it. I can also just call you “friend,” “you,” or anything you choose. What would you like me to call you?

❌ The model doesn't remember!


## 2. Message History Basics

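The simplest fix is to manage the conversation history yourself: keep a list of messages, append every user message and AI reply to it, and send the whole list with each call.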

In [4]:
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

# Initialize conversation history
conversation_history = [
    SystemMessage(content="You are a helpful assistant.")
]

# First exchange
user_msg_1 = "My name is Alice"
conversation_history.append(HumanMessage(content=user_msg_1))

response1 = llm.invoke(conversation_history)
conversation_history.append(AIMessage(content=response1.content))

print(f"User: {user_msg_1}")
print(f"AI: {response1.content}\n")

# Second exchange
user_msg_2 = "What's my name?"
conversation_history.append(HumanMessage(content=user_msg_2))

response2 = llm.invoke(conversation_history)
conversation_history.append(AIMessage(content=response2.content))

print(f"User: {user_msg_2}")
print(f"AI: {response2.content}")
print("\n✅ Now it remembers!")

User: My name is Alice
AI: Hi Alice! Nice to meet you. How can I help today? If you’d like, I can remember your name for this chat and tailor things accordingly. What would you like to work on or discuss?

User: What's my name?
AI: Your name is Alice. I can use it to personalize our chat. If you’d like me to stop using it after this session or switch to a nickname, just say the word.

✅ Now it remembers!
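
The append, invoke, append pattern above can be wrapped in a small helper. The sketch below is illustrative only: `chat_turn` and `fake_model` are hypothetical names, plain dicts stand in for LangChain message objects, and `fake_model` is a stub so the block runs without an API key.

```python
from typing import Callable

# Plain-dict stand-in for LangChain message objects (assumption for this sketch)
Message = dict[str, str]  # {"role": ..., "content": ...}


def chat_turn(history: list[Message], user_text: str,
              model: Callable[[list[Message]], str]) -> str:
    """Append the user message, call the model with the full history, store the reply."""
    history.append({"role": "user", "content": user_text})
    reply = model(history)  # the model is shown every earlier message
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_model(messages: list[Message]) -> str:
    """Hypothetical stand-in for an LLM: reports which user messages it was shown."""
    seen = [m["content"] for m in messages if m["role"] == "user"]
    return f"I can see {len(seen)} user message(s); the first was: {seen[0]!r}"


history: list[Message] = [{"role": "system", "content": "You are a helpful assistant."}]
chat_turn(history, "My name is Alice", fake_model)
second_reply = chat_turn(history, "What's my name?", fake_model)

print(second_reply)
print(f"Messages stored: {len(history)}")
```

Because the second call receives the whole list, the first user message is still visible to the model. With a real LLM you would likely pass `llm.invoke` over `HumanMessage`/`AIMessage` objects instead.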


## 3. ConversationBufferMemory

Stores the complete conversation history verbatim and injects all of it into every prompt, so token usage grows with each turn.

In [5]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Initialize memory
memory = ConversationBufferMemory()

# Create conversation chain
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True  # Shows the prompt being sent
)

# Have a conversation
print("=" * 50)
response1 = conversation.predict(input="Hi! I'm Bob and I love Python programming.")
print(f"\nAI: {response1}\n")

print("=" * 50)
response2 = conversation.predict(input="What's my name?")
print(f"\nAI: {response2}\n")

print("=" * 50)
response3 = conversation.predict(input="What programming language do I like?")
print(f"\nAI: {response3}")


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi! I'm Bob and I love Python programming.
AI:

> Finished chain.

AI: Hi Bob! Nice to meet you. I love Python too—great choice for a language.

What would you like to dive into today? Here are a few options, or tell me your own idea:
- Quick refresher on a Python topic (lists, dicts, comprehensions, decorators, etc.)
- Build a small project together (CLI tool, web scraper, data analysis starter)
- Review and debug some code you’ve written
- Best practices (virtual environments, packaging, testing, debugging tips)

If you want a fast starter, we can whip up a tiny example right now. For instance:
- A simple CLI that greets you a

In [6]:
# Inspect memory contents
print("Memory Variables:")
print(memory.load_memory_variables({}))

Memory Variables:
{'history': "Human: Hi! I'm Bob and I love Python programming.\nAI: Hi Bob! Nice to meet you. I love Python too—great choice for a language.\n\nWhat would you like to dive into today? Here are a few options, or tell me your own idea:\n- Quick refresher on a Python topic (lists, dicts, comprehensions, decorators, etc.)\n- Build a small project together (CLI tool, web scraper, data analysis starter)\n- Review and debug some code you’ve written\n- Best practices (virtual environments, packaging, testing, debugging tips)\n\nIf you want a fast starter, we can whip up a tiny example right now. For instance:\n- A simple CLI that greets you and prints the current time\n- A function to count vowels in a string\n- A small decorator that logs when a function runs\n\nWhat would you like to start with, or tell me about a project you’re working on?\nHuman: What's my name?\nAI: You're Bob—the name you introduced earlier. Nice to meet you again!\n\nWhat would you like to dive into to
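
The verbose output and the memory dump show what buffer memory does: every stored turn is rendered as `Human:` / `AI:` lines and pasted into the prompt template ahead of the new input. A rough pure-Python sketch of that assembly follows. `render_history` and `build_prompt` are hypothetical helpers written for illustration, not LangChain functions; the template text is copied from the verbose output above, and the AI reply is abbreviated.

```python
TEMPLATE = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it does not know.\n\n"
    "Current conversation:\n{history}\nHuman: {input}\nAI:"
)


def render_history(turns: list[tuple[str, str]]) -> str:
    """Render (human, ai) pairs in the 'Human: ...' / 'AI: ...' layout seen in the memory dump."""
    lines = []
    for human, ai in turns:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    return "\n".join(lines)


def build_prompt(turns: list[tuple[str, str]], new_input: str) -> str:
    """Insert the full rendered history and the new user input into the template."""
    return TEMPLATE.format(history=render_history(turns), input=new_input)


# One stored exchange (reply shortened for the example)
turns = [("Hi! I'm Bob and I love Python programming.", "Hi Bob! Nice to meet you.")]
prompt = build_prompt(turns, "What's my name?")

print(prompt)
print(f"\nPrompt length: {len(prompt)} characters")  # grows with every stored turn
```

Nothing is ever dropped from `turns`, which is why this memory type is simple and complete but expensive for long conversations.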

## 4. ConversationBufferWindowMemory

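Keeps only the last `k` interactions (one interaction is a human message plus the AI reply), so the prompt size stays bounded no matter how long the conversation runs. Anything older than the window is dropped.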

In [7]:
from langchain.memory import ConversationBufferWindowMemory

# Keep only last 2 interactions (4 messages total)
window_memory = ConversationBufferWindowMemory(k=2)

window_conversation = ConversationChain(
    llm=llm,
    memory=window_memory,
    verbose=False
)

# Have multiple exchanges
exchanges = [
    "My favorite color is blue",
    "I work as a data scientist",
    "I have a dog named Max",
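    "What's my favorite color?",  # Outside the window (exchange 1 dropped); recoverable only if recent replies repeat it
    "What's my dog's name?",      # Should remember (exchange 3 is within the last 2)
    "What do I do for work?"      # Outside the window (exchange 2 dropped)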
]

for i, user_input in enumerate(exchanges, 1):
    response = window_conversation.predict(input=user_input)
    print(f"{i}. User: {user_input}")
    print(f"   AI: {response}\n")


1. User: My favorite color is blue
   AI: Nice—blue is a fantastic favorite. It’s often tied to calm, trust, and the vast sky and ocean. Here are a few vibe ideas and shade suggestions:

- Airy/light blues: powder blue, sky blue, alice blue — great for bedrooms, summer outfits, or a fresh feel.
- Classic blues: royal blue, dodger blue, cornflower blue — bold but friendly; nice for jeans, accents, or logos.
- Deep blues: navy, midnight blue, Prussian blue — sophisticated and versatile; ideal for formalwear or elegant decor.

Pairing tips:
- White + blue for a crisp, nautical vibe.
- Gray + blue for a modern, subdued look.
- Orange or coral accents for a lively pop.
- Gold accents for a touch of elegance.

Want me to tailor a palette for a specific use (clothes, a room, a website)? I can also give hex codes if you’d like.

2. User: I work as a data scientist
   AI: Nice—data scientist. I can tailor color palettes for your specific use cases (visualizations, dashboards, slides, or even co
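
To see which exchanges are still visible at each turn with `k=2`, the sketch below mimics the windowing with a `collections.deque`. It is a simplified model of the behavior, not LangChain's implementation; the replies are placeholders, and `window` and `visible_at_turn` are hypothetical names.

```python
from collections import deque

k = 2
window = deque(maxlen=k)  # holds at most k (human, ai) exchanges

exchanges = [
    "My favorite color is blue",
    "I work as a data scientist",
    "I have a dog named Max",
    "What's my favorite color?",
    "What's my dog's name?",
    "What do I do for work?",
]

visible_at_turn = []
for i, user_input in enumerate(exchanges, 1):
    # History the model would be shown when this input arrives
    visible = [human for human, _ in window]
    visible_at_turn.append(visible)
    print(f"{i}. {user_input!r} -> history: {visible}")
    # Appending to a full deque evicts the oldest exchange automatically
    window.append((user_input, f"<reply {i}>"))
```

By turn 4 the favorite-color statement has already left the window, so the model can answer only if a retained reply happens to repeat it. By turn 6 the job statement is gone as well.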

## 5. ConversationSummaryMemory

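Uses the LLM to maintain a running summary of the conversation and sends that summary instead of the raw transcript. This saves tokens in long conversations, at the cost of an extra LLM call to update the summary and some loss of detail.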

In [8]:
from langchain.memory import ConversationSummaryMemory

# Initialize summary memory
summary_memory = ConversationSummaryMemory(llm=llm)

summary_conversation = ConversationChain(
    llm=llm,
    memory=summary_memory,
    verbose=True
)

# Have a conversation
response1 = summary_conversation.predict(
    input="I'm planning a trip to Japan next month. I'm interested in visiting Tokyo, Kyoto, and Osaka."
)
print(f"AI: {response1}\n")

response2 = summary_conversation.predict(
    input="I love Japanese food, especially sushi and ramen."
)
print(f"AI: {response2}\n")

# Check the summary
print("\n" + "="*50)
print("Conversation Summary:")
print(summary_memory.load_memory_variables({}))



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: I'm planning a trip to Japan next month. I'm interested in visiting Tokyo, Kyoto, and Osaka.
AI:

> Finished chain.
AI: Great choice—Tokyo, Kyoto, and Osaka make a fantastic loop. Here’s a practical starter plan with flexible options, plus some tips to tailor it to your dates and pace.

Sample itineraries

Option A: 9–10 days (balanced pace)
- Tokyo: 4 days
  - Highlights: Shibuya Crossing and Hachiko, Shinjuku Gyoen, Meiji Shrine, Harajuku, Asakusa and Senso-ji, Akihabara or teamLab (if you like digital art), Tsukiji Outer Market or Toyosu for seafood, maybe a day trip to Nikko or Hakone if you’re keen.
- Kyoto: 3 days
  - Highlights: Fushimi Inari Taisha (early morning is best), Kiyomizu-dera, Gion and Higashiyama, Arashiyama Bamboo Grove and Tenryu-ji, Nishiki Market.
- Osaka: 2 days
  - Highlights: Dotonbori and Kuromon Market, Osaka Castle, Shinsekai, Umeda Sky Building or Floating Garden (great city views). If you’re a theme-park fan, consider a half-day at Universal Studios Japan.
- Travel between cities: Shinkansen from Tokyo to Kyoto (about 2h 15m on a Hikari) and

## 6. ConversationSummaryBufferMemory

Hybrid approach: keeps the most recent messages verbatim and, once the buffer exceeds `max_token_limit`, folds the oldest messages into a running summary.

In [9]:
from langchain.memory import ConversationSummaryBufferMemory

# Summarize when history exceeds token limit
hybrid_memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=100  # Summarize when history > 100 tokens
)

hybrid_conversation = ConversationChain(
    llm=llm,
    memory=hybrid_memory,
    verbose=False
)

# Long conversation
topics = [
    "Tell me about machine learning",
    "What are neural networks?",
    "Explain deep learning",
    "What did we start discussing?"
]

for topic in topics:
    response = hybrid_conversation.predict(input=topic)
    print(f"User: {topic}")
    print(f"AI: {response[:100]}...\n")

print("\nMemory Content:")
print(hybrid_memory.load_memory_variables({}))


User: Tell me about machine learning
AI: Sure—here’s a solid overview of machine learning (ML) and how it fits into the broader field of arti...

User: What are neural networks?
AI: Neural networks are a family of machine learning models inspired by how brains process information. ...

User: Explain deep learning
AI: Deep learning is a subset of machine learning that uses neural networks with many layers to learn re...

User: What did we start discussing?
AI: We started discussing deep learning—the subset of machine learning that uses deep neural networks to...


Memory Content:
{'history': 'System: New summary:\nIn addition to the earlier overview of machine learning, the conversation now explains deep learning as a subset of ML that uses neural networks with many layers to learn representations directly from raw data, enabling end-to-end learning. It clarifies that depth matters because multiple layers build progressively abstract features, and training is end-to-end, adjusting all l
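
The hybrid behavior can be sketched in a few lines of plain Python. This is a simplified model under stated assumptions, not LangChain's code: `SummaryBufferSketch` is a hypothetical class, `stub_summarize` stands in for the LLM summarization call, and tokens are estimated at roughly 4 characters each.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token."""
    return len(text) // 4


def stub_summarize(summary: str, evicted: list[str]) -> str:
    """Hypothetical stand-in for the LLM summarizer: keeps the first 5 words of each message."""
    clipped = [" ".join(msg.split()[:5]) for msg in evicted]
    return "; ".join(filter(None, [summary, *clipped]))


class SummaryBufferSketch:
    """Keeps recent messages verbatim; folds the oldest into a summary once over budget."""

    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.buffer: list[str] = []

    def add(self, message: str) -> None:
        self.buffer.append(message)
        evicted = []
        # Evict from the front until the verbatim buffer fits the budget again
        while self.buffer and sum(estimate_tokens(m) for m in self.buffer) > self.max_token_limit:
            evicted.append(self.buffer.pop(0))
        if evicted:
            self.summary = stub_summarize(self.summary, evicted)


sketch_memory = SummaryBufferSketch(max_token_limit=20)
for msg in [
    "Human: Tell me about machine learning",
    "AI: Machine learning lets programs learn patterns from data.",
    "Human: What are neural networks?",
    "AI: Neural networks are layered models inspired by the brain.",
]:
    sketch_memory.add(msg)

print("Summary:", sketch_memory.summary)
print("Buffer: ", sketch_memory.buffer)
```

The quality of what survives depends entirely on the summarizer. A detail the summary omits is lost for the rest of the conversation, so it is worth inspecting the memory content whenever the model seems to forget an early topic.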

## 7. Using Memory with LCEL

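The modern approach wraps an LCEL chain in `RunnableWithMessageHistory`, which loads the stored messages for a session before each call and saves the new messages afterward. The memory classes used in the earlier sections are legacy APIs that recent LangChain releases mark as deprecated.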

In [10]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

# Create prompt with message history placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

# Create chain
chain = prompt | llm

# Store for message histories (in production, use database)
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Wrap chain with message history
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# Have a conversation with session ID
config = {"configurable": {"session_id": "user_123"}}

response1 = chain_with_history.invoke(
    {"input": "Hi! My favorite programming language is Python."},
    config=config
)
print(f"AI: {response1.content}\n")

response2 = chain_with_history.invoke(
    {"input": "What's my favorite language?"},
    config=config
)
print(f"AI: {response2.content}")

AI: Nice! Python is a great choice—versatile and beginner-friendly. What do you enjoy most about Python? Are you into web development, data science, automation, or something else?

If you’d like, I can tailor suggestions to your goal. A few quick ideas:
- Beginner projects: a simple CLI to-do list, a calculator, or a file renamer.
- Web apps: a tiny Flask or FastAPI app.
- Data: analyze a CSV with pandas and visualize results.
- Automation: small scripts to automate repetitive tasks.

Tell me your current level and what you want to build or learn, and I’ll help with project ideas, explanations of concepts, or code examples.

AI: Python. Want some Python-specific project ideas or help with a topic you’re learning?


## 8. Multi-User Conversations

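Handle multiple users with separate conversation histories. Each `session_id` maps to its own history in `store`, so one user's messages never appear in another user's prompt.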

In [11]:
# User 1 conversation
print("=== User 1 ===")
config_user1 = {"configurable": {"session_id": "user_1"}}

r1 = chain_with_history.invoke(
    {"input": "My name is Alice"},
    config=config_user1
)
print(f"AI: {r1.content}\n")

# User 2 conversation
print("=== User 2 ===")
config_user2 = {"configurable": {"session_id": "user_2"}}

r2 = chain_with_history.invoke(
    {"input": "My name is Bob"},
    config=config_user2
)
print(f"AI: {r2.content}\n")

# Check User 1's name
print("=== Back to User 1 ===")
r3 = chain_with_history.invoke(
    {"input": "What's my name?"},
    config=config_user1
)
print(f"AI: {r3.content}\n")

# Check User 2's name
print("=== Back to User 2 ===")
r4 = chain_with_history.invoke(
    {"input": "What's my name?"},
    config=config_user2
)
print(f"AI: {r4.content}")

=== User 1 ===
AI: Hi Alice! Nice to meet you. How can I help you today? I can assist with writing, planning, learning, coding, or just chat—whatever you need.

=== User 2 ===
AI: Nice to meet you, Bob! How can I help you today? I can answer questions, brainstorm ideas, draft messages or reports, explain topics, help with math or coding, plan tasks, or do research—just tell me what you’d like to work on.

=== Back to User 1 ===
AI: Your name is Alice. What would you like to do today, Alice?

=== Back to User 2 ===
AI: Your name is Bob. Would you like me to keep using that name in this chat or remember it for future chats?
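
The in-memory `store` dict loses every conversation when the process exits. As a rough illustration of the "in production, use database" note, the sketch below persists per-session messages with the standard-library `sqlite3` module. `SQLiteHistoryStore`, its table layout, and its method names are assumptions made for this example. It is not a LangChain class, and using it with `RunnableWithMessageHistory` would require adapting it to LangChain's chat history interface.

```python
import sqlite3


class SQLiteHistoryStore:
    """Hypothetical per-session message store backed by SQLite."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def add_message(self, session_id: str, role: str, content: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
                (session_id, role, content),
            )

    def get_messages(self, session_id: str) -> list[tuple[str, str]]:
        """Return (role, content) pairs for one session in insertion order."""
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return rows.fetchall()

    def clear(self, session_id: str) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))


store_db = SQLiteHistoryStore()  # ":memory:" for the demo; pass a file path to persist
store_db.add_message("user_1", "human", "My name is Alice")
store_db.add_message("user_2", "human", "My name is Bob")
store_db.add_message("user_1", "ai", "Hi Alice!")

print(store_db.get_messages("user_1"))
print(store_db.get_messages("user_2"))
```

Filtering every query by `session_id` is what keeps the two users' histories separate, mirroring the dictionary lookup in `get_session_history`.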


## 9. Memory Management Best Practices

In [12]:
# Counting tokens in conversation history
from langchain.memory import ConversationBufferMemory

def estimate_tokens(text):
    """Rough estimate: 1 token ≈ 4 characters"""
    return len(text) // 4

memory = ConversationBufferMemory()
memory.save_context(
    {"input": "Tell me about artificial intelligence"},
    {"output": "Artificial intelligence (AI) is a branch of computer science..."}
)

history = memory.load_memory_variables({})["history"]
token_estimate = estimate_tokens(history)

print(f"Conversation History:\n{history}\n")
print(f"Estimated tokens: {token_estimate}")
print(f"\nToken Limits:")
print(f"  GPT-3.5-turbo: 4,096 tokens")
print(f"  GPT-4: 8,192 tokens")
print(f"  GPT-4-turbo: 128,000 tokens")

Conversation History:
Human: Tell me about artificial intelligence
AI: Artificial intelligence (AI) is a branch of computer science...

Estimated tokens: 28

Token Limits:
  GPT-3.5-turbo: 4,096 tokens
  GPT-4: 8,192 tokens
  GPT-4-turbo: 128,000 tokens
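
Once you can estimate tokens, one simple management strategy is to drop the oldest turns until the history fits a budget, while always keeping the system message. The sketch below reuses the 4-characters-per-token heuristic. `trim_to_budget` is a hypothetical helper and the message dicts are plain-Python stand-ins; for accurate counts you would swap in the tokenizer for your model.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: 1 token is about 4 characters."""
    return len(text) // 4


def trim_to_budget(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep system messages plus the most recent messages that fit within max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept: list[dict] = []
    # Walk backwards so the newest messages are considered first
    for msg in reversed(rest):
        cost = estimate_tokens(msg["content"])
        if cost > budget:
            break  # this message and everything older is dropped
        kept.append(msg)
        budget -= cost
    return system + kept[::-1]  # restore chronological order


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about artificial intelligence"},
    {"role": "assistant", "content": "Artificial intelligence (AI) is a branch of computer science..."},
    {"role": "user", "content": "How is it different from machine learning?"},
]

trimmed = trim_to_budget(messages, max_tokens=35)
for m in trimmed:
    print(f"{m['role']}: {m['content']}")
```

Trimming by recency is cheap and predictable, but it discards old facts outright. Summarizing the dropped turns instead trades an extra model call for better recall.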


## 10. Memory Type Comparison

| Memory Type | When to Use | Pros | Cons |
|------------|-------------|------|------|
| **ConversationBufferMemory** | Short conversations | Simple, complete history | High token usage |
| **ConversationBufferWindowMemory** | Fixed context window | Predictable tokens | Loses old context |
| **ConversationSummaryMemory** | Long conversations | Efficient tokens | Loses detail, costs to summarize |
| **ConversationSummaryBufferMemory** | Long + detail needed | Best of both | More complex |
| **Message History (LCEL)** | Production apps | Flexible, scalable | Requires setup |

## 🎯 Exercise 6: Build a Personal Assistant

**Task**: Create a personal assistant that:
1. Remembers user preferences (name, interests, etc.)
2. Uses appropriate memory type
3. Handles multiple conversation threads
4. Can reset or export conversation history

In [13]:
class PersonalAssistant:
    def __init__(self, user_id):
        """
        Initialize a personal assistant for a user

        Args:
            user_id: Unique identifier for the user
        """
        # TODO: Implement initialization
        pass

    def chat(self, message):
        """
        Send a message to the assistant

        Args:
            message: User's message

        Returns:
            Assistant's response
        """
        # TODO: Implement chat
        pass

    def get_history(self):
        """
        Export conversation history
        """
        # TODO: Implement history export
        pass

    def reset(self):
        """
        Clear conversation history
        """
        # TODO: Implement reset
        pass

# Test your assistant
# assistant = PersonalAssistant(user_id="alice")
# assistant.chat("Hi, my name is Alice")
# assistant.chat("What's my name?")

## 🎯 Exercise 7: Smart Memory Selection

**Task**: Create a function that:
1. Analyzes conversation length
2. Automatically selects the best memory type
3. Switches memory types dynamically if needed

In [14]:
def select_memory_type(conversation_length, avg_message_length):
    """
    Select optimal memory type based on conversation characteristics

    Args:
        conversation_length: Number of messages
        avg_message_length: Average message length in characters

    Returns:
        Appropriate memory instance
    """
    # TODO: Implement smart memory selection
    pass

# Test
# memory = select_memory_type(conversation_length=100, avg_message_length=50)
# print(f"Selected memory type: {type(memory).__name__}")

## Summary

In this notebook, you learned:

✅ Why memory is essential for LLM applications  
✅ Different memory types and their use cases  
✅ ConversationBufferMemory for simple conversations  
✅ ConversationBufferWindowMemory for token management  
✅ ConversationSummaryMemory for long conversations  
✅ Modern LCEL approach with RunnableWithMessageHistory  
✅ Multi-user conversation handling  
✅ Memory management best practices  

**Congratulations!** You've completed Session 1 of the LangChain course!

**Next**: Tomorrow we'll dive into advanced topics including Agents, Tools, and RAG systems!