# LangChain: Memory buffers

## Why
* Reduce context size
    * Save money
    * Reduce latency  
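
To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes every exchange adds the same number of tokens and counts only the history that is resent with each call; the function name and the numbers are illustrative assumptions, not measurements.

```python
def cumulative_history_tokens(turns, tokens_per_exchange, window=None):
    """Total history tokens resent to the LLM over a whole conversation."""
    total = 0
    for turn in range(turns):
        # Before turn `turn` (0-indexed) there are `turn` earlier exchanges.
        remembered = turn if window is None else min(turn, window)
        total += remembered * tokens_per_exchange
    return total


# Illustrative numbers only: 10 turns, 100 tokens per exchange.
full = cumulative_history_tokens(turns=10, tokens_per_exchange=100)
windowed = cumulative_history_tokens(turns=10, tokens_per_exchange=100, window=2)
print(full, windowed)
```

With a full buffer the resent history grows with every turn, so the total grows roughly quadratically with conversation length; with a fixed window it grows linearly.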

## Outline
* `ConversationBufferMemory`: remember everything
* `ConversationBufferWindowMemory`: remember only the last `k` interactions (a sliding conversation window)
* `ConversationTokenBufferMemory`: remember the most recent messages that fit within a token limit
* `ConversationSummaryMemory`: creates a summary of the conversation over time, and makes the summary the memory

#### Not covered in this notebook but provided by LangChain:
* `Vector data memory`: stores text in a vector DB and retrieves the most relevant blocks of text
* `Entity memories`: uses an LLM to remember details about specific entities

#### Other notes
* You can also use multiple memories at the same time, e.g. `Conversation memory` + `Entity memory` to recall individuals
* You can also store the conversation in a conventional database (e.g. key-value store, or SQL)
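
Storing the conversation in a SQL database could look roughly like the sketch below. It uses Python's built-in `sqlite3` rather than any LangChain integration, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3

# In-memory database for the example; a real app would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "session_id TEXT NOT NULL, "
    "role TEXT NOT NULL, "
    "content TEXT NOT NULL)"
)


def save_message(session_id, role, content):
    """Append one message to the stored conversation."""
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )


def load_history(session_id):
    """Return the conversation as 'Role: text' lines, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return "\n".join(f"{role}: {content}" for role, content in rows)


save_message("demo", "Human", "Hi")
save_message("demo", "AI", "What's up")
history = load_history("demo")
print(history)
```

Keying rows by a session id lets several conversations share one table, and the loaded string has the same `Human:` / `AI:` shape that the memory classes in this notebook produce.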

In [1]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

In [2]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [7]:
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory


Use LangChain to manage a chatbot conversation

In [24]:
llm = ChatOpenAI(temperature=0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

In [25]:
conversation.predict(input="hi, my name is Annie.")

"Hello Annie! It's nice to meet you. How can I assist you today?"

In [26]:
conversation.predict(input="what is the capital of japan")

'The capital of Japan is Tokyo. It is a bustling metropolis known for its modern technology, vibrant culture, and delicious cuisine. Would you like to know more about Tokyo or Japan in general?'

In [27]:
conversation.predict(input="what's my name'")

'Your name is Annie.'

In [28]:
conversation.predict(input="where was i born'")

"I'm sorry, I do not have that information."

In [29]:
print(memory.buffer) # earlier turns of the conversation are stored in the memory object

Human: hi, my name is Annie.
AI: Hello Annie! It's nice to meet you. How can I assist you today?
Human: what is the capital of japan
AI: The capital of Japan is Tokyo. It is a bustling metropolis known for its modern technology, vibrant culture, and delicious cuisine. Would you like to know more about Tokyo or Japan in general?
Human: what's my name'
AI: Your name is Annie.
Human: where was i born'
AI: I'm sorry, I do not have that information.


In [31]:
# shows what the memory has stored so far
memory.load_memory_variables({})

{'history': "Human: hi, my name is Annie.\nAI: Hello Annie! It's nice to meet you. How can I assist you today?\nHuman: what is the capital of japan\nAI: The capital of Japan is Tokyo. It is a bustling metropolis known for its modern technology, vibrant culture, and delicious cuisine. Would you like to know more about Tokyo or Japan in general?\nHuman: what's my name'\nAI: Your name is Annie.\nHuman: where was i born'\nAI: I'm sorry, I do not have that information."}

In [32]:
memory = ConversationBufferMemory()

How to add new things to the memory explicitly

In [37]:
memory.save_context(
    {"input": "Hi"},
    {"output": "what's up"}
)

In [38]:
print(memory.buffer)

Human: Hi
AI: what's up


In [39]:
memory.load_memory_variables({})

{'history': "Human: Hi\nAI: what's up"}

In [40]:
memory.save_context(
    {"input": "Not much just hanging"},
    {"output": "Cool"}
)

In [42]:
print(memory.buffer)

Human: Hi
AI: what's up
Human: Not much just hanging
AI: Cool
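
The behaviour seen so far (each `save_context` call appends a `Human:` line and an `AI:` line, and `load_memory_variables` returns them under `history`) can be mimicked in a few lines of plain Python. This is a hypothetical stand-in for illustration, not LangChain's implementation.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in that mimics the outputs shown in this section."""

    def __init__(self):
        self.messages = []  # (role, text) pairs, oldest first

    def save_context(self, inputs, outputs):
        # Each dict is expected to hold exactly one value.
        (human_text,) = inputs.values()
        (ai_text,) = outputs.values()
        self.messages.append(("Human", human_text))
        self.messages.append(("AI", ai_text))

    @property
    def buffer(self):
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

    def load_memory_variables(self, inputs):
        # The argument is unused here; it only mirrors the call shape above.
        return {"history": self.buffer}


sketch = SimpleBufferMemory()
sketch.save_context({"input": "Hi"}, {"output": "what's up"})
sketch.save_context({"input": "Not much just hanging"}, {"output": "Cool"})
print(sketch.load_memory_variables({}))
```

Nothing is ever dropped in this sketch, which is why a full buffer keeps growing for as long as the conversation runs.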


## ConversationBufferWindowMemory

Apply a window to the memory buffer to reduce context size & number of tokens passed to the LLM

In [57]:
from langchain.memory import ConversationBufferWindowMemory

In [58]:
memory = ConversationBufferWindowMemory(k=1) # only remembers 1 conversational exchange.
                                             # Usually you'd want to set this to a larger number

In [47]:
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})


In [48]:
print(memory.buffer)

Human: Not much, just hanging
AI: Cool


In [49]:
memory.load_memory_variables({})

{'history': 'Human: Not much, just hanging\nAI: Cool'}
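
The windowing idea itself can be sketched with a bounded queue. The class below is a hypothetical illustration that discards old exchanges as soon as they fall out of the window; the real LangChain class may store and trim its messages differently.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sketch: keep only the last k exchanges."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest exchange automatically.
        self.exchanges = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        self.exchanges.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, inputs):
        lines = []
        for human_text, ai_text in self.exchanges:
            lines.append(f"Human: {human_text}")
            lines.append(f"AI: {ai_text}")
        return {"history": "\n".join(lines)}


window = SimpleWindowMemory(k=1)
window.save_context({"input": "Hi"}, {"output": "What's up"})
window.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
print(window.load_memory_variables({}))
```

With `k=1` the first exchange is gone after the second `save_context` call, matching the output above.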

In [67]:
llm = ChatOpenAI(temperature=0.0, model=llm_model)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)

In [68]:
conversation.predict(input="Hi, my name is Ann")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is Ann
AI:

> Finished chain.


"Hello Ann, it's nice to meet you. My name is AI. How can I assist you today?"

In [69]:
conversation.predict(input="What is 1+1?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Ann
AI: Hello Ann, it's nice to meet you. My name is AI. How can I assist you today?
Human: What is 1+1?
AI:

> Finished chain.


'The answer to 1+1 is 2.'

In [70]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI:

> Finished chain.


"I'm sorry, I don't have access to that information. Could you please tell me your name?"

## ConversationTokenBufferMemory

In [71]:
from langchain.memory import ConversationTokenBufferMemory
llm = ChatOpenAI(temperature=0.0, model=llm_model)

In [72]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

In [73]:
memory.load_memory_variables({})

{'history': 'AI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}

In [74]:
print(memory.buffer)

AI: Amazing!
Human: Backpropagation is what?
AI: Beautiful!
Human: Chatbots are what?
AI: Charming!
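
The output above starts with `AI: Amazing!`, which is consistent with trimming the buffer one message at a time, oldest first, until it fits the limit. A minimal sketch of that idea follows. The word-based `count_tokens` is a crude stand-in for the model's real tokenizer, so the limit used here (14) is chosen for this toy counter and is not comparable to `max_token_limit=50` above.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def prune_to_token_limit(messages, max_token_limit):
    """Drop the oldest messages until the remaining ones fit within the limit."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_token_limit:
        kept.pop(0)
    return kept


messages = [
    "Human: AI is what?!",
    "AI: Amazing!",
    "Human: Backpropagation is what?",
    "AI: Beautiful!",
    "Human: Chatbots are what?",
    "AI: Charming!",
]
kept = prune_to_token_limit(messages, max_token_limit=14)
print("\n".join(kept))
```

Because pruning works per message rather than per exchange, the buffer can begin with an `AI:` line whose matching `Human:` line has already been dropped.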


## ConversationSummaryMemory

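* Use an LLM to write a summary of the conversation so far, and let that be the memory
* The cells below use `ConversationSummaryBufferMemory`, which keeps the most recent messages verbatim and summarizes older ones once the buffer exceeds `max_token_limit` tokens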

In [75]:
from langchain.memory import ConversationSummaryBufferMemory

In [76]:
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your PowerPoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": schedule})

In [77]:
memory.load_memory_variables({})

{'history': 'System: The human and AI engage in small talk before the human asks about their schedule for the day. The AI informs the human of a meeting with their product team at 8am, time to work on their LangChain project from 9am-12pm, and a lunch meeting with a customer interested in the latest in AI.'}
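
The combination of a token-limited buffer and a running summary can be sketched as follows. Everything here is a hypothetical illustration: `stub_summarize` stands in for the LLM call, `count_tokens` is a word count rather than a real tokenizer, and the class is not LangChain's implementation.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per word.
    return len(text.split())


def stub_summarize(old_summary, messages):
    # Stand-in for the LLM: just counts how many messages were folded in.
    already = int(old_summary.split()[0]) if old_summary else 0
    return f"{already + len(messages)} earlier messages summarized."


class SimpleSummaryBufferMemory:
    """Hypothetical sketch: recent messages verbatim, older ones summarized."""

    def __init__(self, summarize, max_token_limit):
        self.summarize = summarize  # callable(old_summary, messages) -> str
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages = []

    def _buffer_tokens(self):
        return sum(count_tokens(m) for m in self.messages)

    def save_context(self, inputs, outputs):
        self.messages.append(f"Human: {inputs['input']}")
        self.messages.append(f"AI: {outputs['output']}")
        # Move the oldest messages out of the buffer until it fits the limit.
        overflow = []
        while self.messages and self._buffer_tokens() > self.max_token_limit:
            overflow.append(self.messages.pop(0))
        if overflow:
            self.summary = self.summarize(self.summary, overflow)

    def load_memory_variables(self, inputs):
        parts = [f"System: {self.summary}"] if self.summary else []
        return {"history": "\n".join(parts + self.messages)}


summary_memory = SimpleSummaryBufferMemory(stub_summarize, max_token_limit=8)
summary_memory.save_context({"input": "Hello"}, {"output": "What's up"})
summary_memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
print(summary_memory.load_memory_variables({}))
```

A single message longer than the limit empties the buffer entirely, which matches the output above: the long schedule reply pushes the whole conversation into the `System:` summary.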

In [78]:
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)

In [79]:
conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human and AI engage in small talk before the human asks about their schedule for the day. The AI informs the human of a meeting with their product team at 8am, time to work on their LangChain project from 9am-12pm, and a lunch meeting with a customer interested in the latest in AI.
Human: What would be a good demo to show?
AI:

> Finished chain.


"Based on the customer's interests, I would recommend showcasing our natural language processing capabilities. We could demonstrate how our AI can understand and respond to complex questions and commands in multiple languages. Additionally, we could highlight our machine learning algorithms and how they can improve accuracy and efficiency in various industries. Would you like me to prepare a presentation for the demo?"

In [80]:
memory.load_memory_variables({})

{'history': "System: The human and AI engage in small talk before the human asks about their schedule for the day. The AI informs the human of a meeting with their product team at 8am, time to work on their LangChain project from 9am-12pm, and a lunch meeting with a customer interested in the latest in AI.\nHuman: What would be a good demo to show?\nAI: Based on the customer's interests, I would recommend showcasing our natural language processing capabilities. We could demonstrate how our AI can understand and respond to complex questions and commands in multiple languages. Additionally, we could highlight our machine learning algorithms and how they can improve accuracy and efficiency in various industries. Would you like me to prepare a presentation for the demo?"}