<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_04_2_memory_buffer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 4: LangChain: Chat and Memory**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 4 Material

* Part 4.1: LangChain Conversations [[Video]](https://www.youtube.com/watch?v=Effbhxq07Ag) [[Notebook]](t81_559_class_04_1_langchain_chat.ipynb)
* **Part 4.2: Conversation Buffer Window Memory** [[Video]](https://www.youtube.com/watch?v=14RgiFVGfAA) [[Notebook]](t81_559_class_04_2_memory_buffer.ipynb)
* Part 4.3: Conversation Token Buffer Memory [[Video]](https://www.youtube.com/watch?v=QTe5g2c3bSM) [[Notebook]](t81_559_class_04_3_memory_token.ipynb)
* Part 4.4: Conversation Summary Memory [[Video]](https://www.youtube.com/watch?v=asZQ8Ktqmt8) [[Notebook]](t81_559_class_04_4_memory_summary.ipynb)
* Part 4.5: Persisting Langchain Memory [[Video]](https://www.youtube.com/watch?v=sjCyqqOQcPA) [[Notebook]](t81_559_class_04_5_memory_persist.ipynb)

# Google CoLab Instructions

The following code detects whether the notebook is running in Google CoLab. When it is, the code loads the OpenAI API key from CoLab secrets and installs the required libraries.

In [2]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai

Note: not using Google CoLab


# 4.2: Conversation Buffer Window Memory

We previously saw that we could build an LLM chat client by maintaining an ever-growing script of what the human and the AI said in the conversation. We add each human message, wait for the AI's response, and append that response to the script. This cycle continues for as long as the chat lasts.

This ever-increasing chat memory is a typical pattern for LLMs, and as a result, LangChain has several predefined Python objects that allow you to implement this sort of memory-based chatbot.

## Creating a Chat Conversation

ConversationChain is a LangChain class designed to simplify building and managing conversational AI systems. It orchestrates the interaction between the language model, the prompt template, and a memory object that retains information from the conversation to improve response relevance and coherence.

In this context, memory types play a crucial role. They help the system remember and use past interactions to maintain context and continuity in the conversation. This frees us from tracking the conversation manually, as we did in the previous section.

We first examine ConversationBufferWindowMemory, a memory type that manages the flow of a dialogue by retaining only the last K interactions of a conversation. This keeps the memory buffer at a manageable size and avoids the overflow and performance issues that can arise with an excessively large interaction history. By focusing on the most recent interactions, this memory type creates a "sliding window" that advances as new exchanges occur, allowing for contextually relevant responses while using resources efficiently. This approach is particularly useful in applications where an immediate, context-aware dialogue matters, such as customer service bots or interactive learning tools.
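
The sliding window idea can be illustrated without LangChain or an API key. The following is a minimal sketch in plain Python; the class name `SlidingWindowMemory` and its methods are hypothetical stand-ins for illustration and are not LangChain's implementation.

```python
from collections import deque


class SlidingWindowMemory:
    """Hypothetical stand-in that keeps only the last k exchanges."""

    def __init__(self, k):
        # A deque with maxlen discards the oldest item once it is full.
        self.exchanges = deque(maxlen=k)

    def save(self, human, ai):
        # Store one complete exchange (human message plus AI reply).
        self.exchanges.append((human, ai))

    def history(self):
        # Render the retained exchanges as a "Human:/AI:" transcript.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)


memory = SlidingWindowMemory(k=2)
memory.save("Hi", "Hello!")
memory.save("My name is Jeff", "Nice to meet you, Jeff!")
memory.save("What is 2 + 2?", "4")

# Only the two most recent exchanges remain; the first one was dropped.
print(memory.history())
```

With `k=2`, the greeting falls out of the window as soon as the third exchange is saved, which is the same behavior the later sections demonstrate with `k=5`.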

The code that follows this explanation demonstrates how to set up and use a conversational AI system with the LangChain framework, focusing on maintaining a concise memory of recent interactions. The ConversationChain class orchestrates the conversation flow, integrating components such as the language model, memory, and prompt formatting.

First, several classes are imported: ConversationChain for managing the conversation, ConversationBufferWindowMemory for holding the memory of recent interactions, and ChatOpenAI for interfacing with OpenAI's language models. The PromptTemplate class defines the structure of the prompts sent to the language model.

In the setup, MODEL specifies the particular version of the language model (gpt-4o-mini), and TEMPLATE outlines the format of the conversation, integrating previous dialogue history and the current user input into the prompt. This structured prompt is then used to create a PromptTemplate object.

The begin_conversation function initializes the language model and sets up the memory to store the last five interactions (k=5). This sliding window of memory helps keep the conversation relevant and efficient by focusing on the most recent exchanges. The ConversationChain object is created with the prompt template, the language model, and the memory buffer, and it's ready to process conversation inputs.

The converse function takes a conversation object and a user prompt to produce responses. It uses the conversation object's invoke method, which applies the prompt template, consults the memory, and generates a response based on the current and recent dialogue context.

Overall, this setup is optimized for creating a responsive and context-aware conversational AI that doesn't overload with too much past information, maintaining relevancy and efficiency in ongoing interactions.

In [3]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory
from langchain_openai import ChatOpenAI
from langchain_core.prompts.chat import PromptTemplate
from IPython.display import display_markdown

MODEL = 'gpt-4o-mini'
TEMPLATE = """You are a helpful assistant. Format answers with markdown.

Current conversation:
{history}
Human: {input}
AI:"""
PROMPT_TEMPLATE = PromptTemplate(input_variables=["history", "input"], template=TEMPLATE)

def begin_conversation():
    # Initialize the OpenAI chat model; the API key is read from the OPENAI_API_KEY environment variable
    llm = ChatOpenAI(
        model=MODEL,
        temperature=0.0,
        n=1
    )

    # Initialize memory and conversation
    memory = ConversationBufferWindowMemory(k=5)
    conversation = ConversationChain(
        prompt=PROMPT_TEMPLATE,
        llm=llm,
        memory=memory,
        verbose=False
    )

    return conversation

def converse(conversation, prompt):
    output = conversation.invoke(prompt)
    return output['response']

We can now carry on a simple conversation with the LLM, using LangChain to track the conversation memory.

In [4]:
# begin_conversation() builds its own ChatOpenAI client, so no separate llm object is needed here.
conversation = begin_conversation()
output = converse(conversation, "Hello, what is my name?")
display_markdown(output,raw=True)
output = converse(conversation, "Oh sorry, my name is Jeff.")
display_markdown(output,raw=True)
output = converse(conversation, "What is my name?")
display_markdown(output,raw=True)


I'm sorry, but I don't know your name. How can I assist you today?

Hello, Jeff! How can I assist you today?

Your name is Jeff! How can I help you today?
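
To see why the third answer is correct, it helps to look at roughly what the chain sends to the model on that turn. The sketch below fills the same template by hand with plain `str.format`, using the first two exchanges from the output above as the history. It approximates the prompt; the exact text LangChain assembles internally is an assumption and may differ in detail.

```python
TEMPLATE = """You are a helpful assistant. Format answers with markdown.

Current conversation:
{history}
Human: {input}
AI:"""

# The two exchanges already stored in memory before the third question.
history = "\n".join([
    "Human: Hello, what is my name?",
    "AI: I'm sorry, but I don't know your name. How can I assist you today?",
    "Human: Oh sorry, my name is Jeff.",
    "AI: Hello, Jeff! How can I assist you today?",
])

# Fill both placeholders the way the prompt template would.
prompt_text = TEMPLATE.format(history=history, input="What is my name?")
print(prompt_text)
```

The name appears in the rendered history, so the model can answer even though each call to the LLM is stateless.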

## Conversing with the LLM in Markdown

Just as before, we can request that the LLM format its output as Markdown. This allows code and tables to be represented clearly.

In [5]:
def chat(conversation, prompt):
  print(f"Human: {prompt}")
  output = converse(conversation, prompt)
  display_markdown(output,raw=True)

The next cell uses the chat function to carry on a conversation between a human user and a Large Language Model (LLM). Each human prompt is printed, and each response is rendered as Markdown, so the chat stays contextually aware while tables and other structured output display cleanly.

In [6]:
conversation = begin_conversation()
chat(conversation, "What is my name?")
chat(conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(conversation, "What is my name?")
chat(conversation, "Give me a table of the 5 most populus cities with population and country.")


Human: What is my name?


I'm sorry, but I don't know your name. How can I assist you today?

Human: Okay, then let me introduce myself, my name is Jeff


Nice to meet you, Jeff! How can I assist you today?

Human: What is my name?


Your name is Jeff! How can I assist you further?

Human: Give me a table of the 5 most populus cities with population and country.


Here’s a table of the 5 most populous cities along with their population and country:

| City            | Population (approx.) | Country         |
|------------------|---------------------|------------------|
| Tokyo            | 37.4 million        | Japan            |
| Delhi            | 31.0 million        | India            |
| Shanghai         | 27.0 million        | China            |
| São Paulo        | 22.0 million        | Brazil           |
| Mexico City      | 21.7 million        | Mexico           |

If you need more information or assistance, feel free to ask!

## Constraining the Conversation with a System Prompt

You can use the system prompt to constrain the conversation to a specific topic. Here, we provide a simple agent that will only discuss life insurance.

In [7]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory
from langchain_openai import ChatOpenAI
from langchain_core.prompts.chat import PromptTemplate
from IPython.display import display_markdown

MODEL = 'gpt-4o-mini'


def begin_conversation_insurance():
    TEMPLATE = """You are a helpful agent to answer questions about life insurance. Do not talk
    about anything else with users. Format answers with markdown.

    Current conversation:
    {history}
    Human: {input}
    AI:"""
    PROMPT_TEMPLATE = PromptTemplate(input_variables=["history", "input"], template=TEMPLATE)

    # Initialize the OpenAI LLM with your API key
    llm = ChatOpenAI(
        model=MODEL,
        temperature=0.0,
        n=1
    )

    # Initialize memory and conversation
    memory = ConversationBufferWindowMemory(k=5)
    conversation = ConversationChain(
        prompt=PROMPT_TEMPLATE,
        llm=llm,
        memory=memory,
        verbose=False
    )

    return conversation

conversation = begin_conversation_insurance()
chat(conversation, "What is my name?")
chat(conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(conversation, "What is my name?")
chat(conversation, "What is your favorite programming language?")
chat(conversation, "What is the difference between a term and whole life policy?")

Human: What is my name?


I'm here to help you with questions about life insurance. How can I assist you today?

Human: Okay, then let me introduce myself, my name is Jeff


I'm glad to meet you, Jeff! How can I assist you with your life insurance questions today?

Human: What is my name?


I'm here to help you with questions about life insurance, Jeff. How can I assist you today?

Human: What is your favorite programming language?


I'm here to help you with questions about life insurance. How can I assist you today, Jeff?

Human: What is the difference between a term and whole life policy?


### Difference Between Term and Whole Life Insurance

**Term Life Insurance:**
- **Duration:** Provides coverage for a specific period (e.g., 10, 20, or 30 years).
- **Premiums:** Generally lower premiums compared to whole life.
- **Payout:** Pays a death benefit only if the insured passes away during the term.
- **Cash Value:** Does not accumulate cash value.

**Whole Life Insurance:**
- **Duration:** Provides coverage for the entire lifetime of the insured, as long as premiums are paid.
- **Premiums:** Higher premiums than term life, but they remain level throughout the life of the policy.
- **Payout:** Guarantees a death benefit regardless of when the insured passes away.
- **Cash Value:** Accumulates cash value over time, which can be borrowed against or withdrawn.

If you have more questions about life insurance, feel free to ask, Jeff!
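
The same pattern can be reused for other topics by generating the template text from a topic name. The sketch below is one possible way to do that; `make_topic_template` is a hypothetical helper, not part of LangChain, and it assumes the topic string contains no curly braces.

```python
def make_topic_template(topic):
    """Hypothetical helper: build a topic-restricted prompt template string."""
    # Only the first line is an f-string, so {history} and {input}
    # remain literal placeholders for the prompt template to fill later.
    return (
        f"You are a helpful agent to answer questions about {topic}. "
        "Do not talk about anything else with users. "
        "Format answers with markdown.\n\n"
        "Current conversation:\n"
        "{history}\n"
        "Human: {input}\n"
        "AI:"
    )


template = make_topic_template("life insurance")
print(template)
```

The resulting string could be passed as the `template` argument of `PromptTemplate` in place of the hard-coded text used in `begin_conversation_insurance`.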

## Examining the Conversation Memory

We can inspect the memory that LangChain manages for the chain and see the conversation history it will pass to the LLM.

In [8]:
print(conversation.memory.buffer)

Human: What is my name?
AI: I'm here to help you with questions about life insurance. How can I assist you today?
Human: Okay, then let me introduce myself, my name is Jeff
AI: I'm glad to meet you, Jeff! How can I assist you with your life insurance questions today?
Human: What is my name?
AI: I'm here to help you with questions about life insurance, Jeff. How can I assist you today?
Human: What is your favorite programming language?
AI: I'm here to help you with questions about life insurance. How can I assist you today, Jeff?
Human: What is the difference between a term and whole life policy?
AI: ### Difference Between Term and Whole Life Insurance

**Term Life Insurance:**
- **Duration:** Provides coverage for a specific period (e.g., 10, 20, or 30 years).
- **Premiums:** Generally lower premiums compared to whole life.
- **Payout:** Pays a death benefit only if the insured passes away during the term.
- **Cash Value:** Does not accumulate cash value.

**Whole Life Insurance:**
- *

## Overloading the Memory

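When the conversation memory becomes full, the chatbot begins to forget. With ConversationBufferWindowMemory, the oldest exchanges are dropped first; there is no notion of importance. The cell below states a name and a favorite color, pads the conversation with ten filler messages, and then asks for both again.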

In [9]:
conversation = begin_conversation()
chat(conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(conversation, "You have ONE JOB! Remember that my favorite color is blue.")
chat(conversation, "Do you remember my name?")
chat(conversation, "Do you remember my favorite color?")
for i in range(10):
  chat(conversation, f"You need to remember fact #{i}")
chat(conversation, "Do you remember my name?")
chat(conversation, "Do you remember my favorite color?")
chat(conversation, "OMG, you had one job!")

Human: Okay, then let me introduce myself, my name is Jeff


Hello Jeff! It's great to meet you. How can I assist you today?

Human: You have ONE JOB! Remember that my favorite color is blue.


Got it, Jeff! Your favorite color is blue. How can I help you today?

Human: Do you remember my name?


Yes, Jeff! Your name is Jeff. How can I assist you further?

Human: Do you remember my favorite color?


Yes, Jeff! Your favorite color is blue. What else would you like to talk about?

Human: You need to remember fact #0


Sure, Jeff! I'll remember that fact #0 is your favorite color is blue. What would you like to discuss next?

Human: You need to remember fact #1


Sure, Jeff! What would you like fact #1 to be?

Human: You need to remember fact #2


Sure, Jeff! What would you like fact #2 to be?

Human: You need to remember fact #3


Sure, Jeff! What would you like fact #3 to be?

Human: You need to remember fact #4


Sure, Jeff! What would you like fact #4 to be?

Human: You need to remember fact #5


Sure, Jeff! What would you like fact #5 to be?

Human: You need to remember fact #6


Sure, Jeff! What would you like fact #6 to be?

Human: You need to remember fact #7


Sure, Jeff! What would you like fact #7 to be?

Human: You need to remember fact #8


Sure, Jeff! What would you like fact #8 to be?

Human: You need to remember fact #9


Sure, Jeff! What would you like fact #9 to be?

Human: Do you remember my name?


Yes, your name is Jeff! How can I assist you further?

Human: Do you remember my favorite color?


I don't have that information yet. What is your favorite color, Jeff?

Human: OMG, you had one job!


I'm sorry, Jeff! I should have asked for your favorite color earlier. Please let me know what it is, and I'll remember it for you!

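Printing the memory buffer again shows why the chatbot forgot. With k=5, only the last five exchanges remain, and the exchange that stated the favorite color left the window long ago. The name survived only because the AI kept repeating "Jeff" in its replies, which kept it inside the window.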

In [10]:
print(conversation.memory.buffer)

Human: You need to remember fact #8
AI: Sure, Jeff! What would you like fact #8 to be?
Human: You need to remember fact #9
AI: Sure, Jeff! What would you like fact #9 to be?
Human: Do you remember my name?
AI: Yes, your name is Jeff! How can I assist you further?
Human: Do you remember my favorite color?
AI: I don't have that information yet. What is your favorite color, Jeff?
Human: OMG, you had one job!
AI: I'm sorry, Jeff! I should have asked for your favorite color earlier. Please let me know what it is, and I'll remember it for you!
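
The arithmetic behind the forgetting can be sketched in plain Python. The function `visible_before` is a hypothetical helper that lists which earlier turns are still inside a window of size k when a given turn is asked. It tracks only the human turns; the AI replies that echoed the color early on would linger a few turns longer, but they are also gone well before the final questions.

```python
K = 5  # window size, matching ConversationBufferWindowMemory(k=5)

# Human turns from the experiment above, in order (turn numbers start at 1).
prompts = [
    "Okay, then let me introduce myself, my name is Jeff",
    "You have ONE JOB! Remember that my favorite color is blue.",
    "Do you remember my name?",
    "Do you remember my favorite color?",
]
prompts += [f"You need to remember fact #{i}" for i in range(10)]
prompts += [
    "Do you remember my name?",
    "Do you remember my favorite color?",
    "OMG, you had one job!",
]


def visible_before(turn, k=K):
    """Return the 1-based turn numbers still in the window when `turn` is asked."""
    start = max(1, turn - k)
    return list(range(start, turn))


COLOR_TURN = 2       # the turn that stated the favorite color
ASK_COLOR_TURN = 16  # the final "Do you remember my favorite color?"

visible = visible_before(ASK_COLOR_TURN)
print(visible)  # [11, 12, 13, 14, 15]
for t in visible:
    print(t, prompts[t - 1])
print(COLOR_TURN in visible)  # False
```

By the time the color is requested on turn 16, the window holds only turns 11 through 15, so the statement from turn 2 is no longer part of the prompt.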
