# LangChain: Memory

## Outline
* ConversationBufferMemory
* ConversationBufferWindowMemory
* ConversationTokenBufferMemory
* ConversationSummaryMemory

## ConversationBufferMemory

In [2]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv("config.env")) # read the local config.env file

import warnings
warnings.filterwarnings('ignore')

In [3]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [4]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)

In [5]:
conversation.predict(input="Hi, my name is kloping")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is kloping
AI:

> Finished chain.


"Hello kloping, it's nice to meet you! My name is AI. How can I assist you today?"

In [6]:
conversation.predict(input="What is 1+1?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is kloping
AI: Hello kloping, it's nice to meet you! My name is AI. How can I assist you today?
Human: What is 1+1?
AI:

> Finished chain.


'The answer to 1+1 is 2.'

In [7]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is kloping
AI: Hello kloping, it's nice to meet you! My name is AI. How can I assist you today?
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI:

> Finished chain.


'Your name is kloping, as you mentioned earlier.'

In [8]:
print(memory.buffer)

Human: Hi, my name is kloping
AI: Hello kloping, it's nice to meet you! My name is AI. How can I assist you today?
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI: Your name is kloping, as you mentioned earlier.


In [9]:
memory.load_memory_variables({})

{'history': "Human: Hi, my name is kloping\nAI: Hello kloping, it's nice to meet you! My name is AI. How can I assist you today?\nHuman: What is 1+1?\nAI: The answer to 1+1 is 2.\nHuman: What is my name?\nAI: Your name is kloping, as you mentioned earlier."}

In [10]:
# You can also add chat history manually with save_context:
memory = ConversationBufferMemory()
memory.save_context({"input": "Hi"}, 
                    {"output": "What's up"})
print(memory.buffer)

Human: Hi
AI: What's up


In [11]:
memory.save_context({"input": "Not much, just hanging"}, 
                    {"output": "Cool"})
print(memory.buffer)

Human: Hi
AI: What's up
Human: Not much, just hanging
AI: Cool


## Memory of LLM
Large language models are "stateless": each transaction is independent.

Chatbots appear to have memory because the **full conversation is provided as `context`** with every request.

As the conversation grows, you send the accumulated tokens to the LLM on each turn.
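
To make the statelessness concrete, here is a minimal pure-Python sketch, not LangChain code. `fake_llm` and `build_prompt` are hypothetical helpers invented for illustration; the point is only that the application, not the model, carries the transcript forward and that each prompt grows with the conversation.

```python
# Sketch: the "model" only sees what is in the prompt, so the application
# re-sends the whole transcript on every turn.

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; it only reports the prompt size."""
    return f"(reply based on {len(prompt.split())} words of context)"


def build_prompt(history: list[tuple[str, str]], user_input: str) -> str:
    """Concatenate all previous turns plus the new input into one prompt."""
    lines = []
    for human, ai in history:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    lines.append(f"Human: {user_input}")
    lines.append("AI:")
    return "\n".join(lines)


history: list[tuple[str, str]] = []
prompt_sizes = []
for user_input in ["Hi, my name is kloping", "What is 1+1?", "What is my name?"]:
    prompt = build_prompt(history, user_input)
    prompt_sizes.append(len(prompt.split()))
    reply = fake_llm(prompt)
    history.append((user_input, reply))  # the app, not the model, keeps state

print(prompt_sizes)  # each prompt is longer than the previous one
```

Because the whole transcript is re-sent, the cost per turn grows with the length of the conversation, which is what motivates memory types that cap how much is kept.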

---
LangChain provides several kinds of "memory" to store and accumulate the conversation.

## ConversationBufferWindowMemory
Limits the number of conversational exchanges kept in memory.

In [12]:
from langchain.memory import ConversationBufferWindowMemory

In [13]:
memory = ConversationBufferWindowMemory(k=1)  # `k=1` means: remember only one conversational exchange

In [14]:
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

In [15]:
memory.load_memory_variables({})  # only remembers the most recent conversation

{'history': 'Human: Not much, just hanging\nAI: Cool'}

In [16]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=False
)

In [17]:
conversation.predict(input="Hi, my name is Andrew")

"Hello Andrew, it's nice to meet you. My name is AI. How can I assist you today?"

In [18]:
conversation.predict(input="What is 1+1?")

'The answer to 1+1 is 2.'

In [19]:
conversation.predict(input="What is my name?")

"I'm sorry, I don't have access to that information. Could you please tell me your name?"
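
The windowing behaviour can be imitated in a few lines of plain Python. This is a sketch of the idea only, assuming a hypothetical `WindowMemory` class built on `collections.deque`; it is not how LangChain implements `ConversationBufferWindowMemory`.

```python
from collections import deque


class WindowMemory:
    """Toy window memory: keeps only the last k (human, ai) exchanges."""

    def __init__(self, k: int) -> None:
        # A bounded deque discards the oldest exchange automatically.
        self.turns = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = WindowMemory(k=1)
window.save_context("Hi, my name is Andrew", "Hello Andrew!")
window.save_context("What is 1+1?", "The answer to 1+1 is 2.")
print(window.history())
```

With `k=1` the introduction has already been dropped by the time the name question is asked, which is consistent with the model's "I don't have access to that information" reply above.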

## ConversationTokenBufferMemory
Limits the number of tokens kept in memory.

In [20]:
#!pip install tiktoken

In [21]:
from langchain.memory import ConversationTokenBufferMemory
llm = ChatOpenAI(temperature=0.0)  # ChatOpenAI was imported earlier; its tokenizer is used to count tokens

In [22]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)

In [23]:
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

memory.load_memory_variables({})

{'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
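
The output above keeps an `AI:` message without the `Human:` message that preceded it, which suggests that trimming happens message by message rather than exchange by exchange. The sketch below reproduces that behaviour in plain Python. `count_tokens` is a hypothetical word-count stand-in for a real tokenizer such as tiktoken, and the limit of 8 is chosen only to fit this toy tokenizer.

```python
def count_tokens(text: str) -> int:
    """Hypothetical tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class TokenBufferMemory:
    """Toy token buffer: drops the oldest messages until the buffer fits."""

    def __init__(self, max_token_limit: int) -> None:
        self.max_token_limit = max_token_limit
        self.messages: list[str] = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages.append(f"Human: {human}")
        self.messages.append(f"AI: {ai}")
        # Remove one message at a time from the front until under the limit.
        while sum(count_tokens(m) for m in self.messages) > self.max_token_limit:
            self.messages.pop(0)

    def history(self) -> str:
        return "\n".join(self.messages)


token_memory = TokenBufferMemory(max_token_limit=8)
token_memory.save_context("AI is what?!", "Amazing!")
token_memory.save_context("Backpropagation is what?", "Beautiful!")
token_memory.save_context("Chatbots are what?", "Charming!")
print(token_memory.history())
```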

## ConversationSummaryMemory

In [24]:
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

In [25]:
# Uses the LLM to summarize older messages once the buffer exceeds max_token_limit; recent messages are kept verbatim
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)

In [27]:
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": schedule})

In [28]:
memory.load_memory_variables({})

{'history': "System: The human and AI engage in small talk before discussing the day's schedule. The AI informs the human of a morning meeting with the product team, time to work on the LangChain project, and a lunch meeting with a customer interested in the latest AI developments. The schedule includes a PowerPoint presentation for the morning meeting, time to work on the LangChain project, and a lunch meeting with a customer who is driving over an hour to learn about the latest AI developments."}

In [29]:
conversation = ConversationChain(
    llm=llm, 
    memory=memory,
    verbose=True
)

In [30]:
conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human and AI engage in small talk before discussing the day's schedule. The AI informs the human of a morning meeting with the product team, time to work on the LangChain project, and a lunch meeting with a customer interested in the latest AI developments. The schedule includes a PowerPoint presentation for the morning meeting, time to work on the LangChain project, and a lunch meeting with a customer who is driving over an hour to learn about the latest AI developments.
Human: What would be a good demo to show?
AI:

> Finished chain.


"Based on the customer's interest in AI developments, I would suggest showcasing our latest natural language processing technology. We could demonstrate how it can accurately analyze and interpret complex language patterns, and how it can be integrated into various applications. Additionally, we could highlight its potential benefits for the customer's specific industry or use case."

In [31]:
memory.load_memory_variables({})

{'history': "System: The human and AI engage in small talk before discussing the day's schedule. The AI informs the human of a morning meeting with the product team, time to work on the LangChain project, and a lunch meeting with a customer interested in the latest AI developments. The schedule includes a PowerPoint presentation for the morning meeting, time to work on the LangChain project, and a lunch meeting with a customer who is driving over an hour to learn about the latest AI developments.\nHuman: What would be a good demo to show?\nAI: Based on the customer's interest in AI developments, I would suggest showcasing our latest natural language processing technology. We could demonstrate how it can accurately analyze and interpret complex language patterns, and how it can be integrated into various applications. Additionally, we could highlight its potential benefits for the customer's specific industry or use case."}

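***Notice that*** the summary of the earlier conversation is kept as a `System` message, and the new exchange is appended verbatim after it. Messages are folded into the summary only when the buffer exceeds the 100-token limit specified earlier.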
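
The combination of a running summary and a verbatim tail can be sketched in plain Python. Everything here is illustrative: `count_tokens` is a hypothetical word-count tokenizer, and `fake_summarize` merely concatenates text where a real implementation would ask an LLM to condense it.

```python
def count_tokens(text: str) -> int:
    """Hypothetical tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fake_summarize(previous_summary: str, pruned: list[str]) -> str:
    """Hypothetical summarizer; a real one would ask an LLM to condense this."""
    return " | ".join(part for part in [previous_summary, *pruned] if part)


class SummaryBufferMemory:
    """Toy summary buffer: recent messages verbatim, older ones summarized."""

    def __init__(self, max_token_limit: int) -> None:
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages: list[str] = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages += [f"Human: {human}", f"AI: {ai}"]
        pruned = []
        # Move the oldest messages out of the buffer until it fits the limit.
        while sum(count_tokens(m) for m in self.messages) > self.max_token_limit:
            pruned.append(self.messages.pop(0))
        if pruned:
            # Fold the removed messages into the running summary.
            self.summary = fake_summarize(self.summary, pruned)

    def history(self) -> str:
        prefix = [f"System: {self.summary}"] if self.summary else []
        return "\n".join(prefix + self.messages)


summary_memory = SummaryBufferMemory(max_token_limit=8)
summary_memory.save_context("AI is what?!", "Amazing!")
summary_memory.save_context("Backpropagation is what?", "Beautiful!")
summary_memory.save_context("Chatbots are what?", "Charming!")
print(summary_memory.history())
```

The history starts with a `System:` line carrying the condensed older messages, followed by the most recent messages unchanged, which has the same shape as the `load_memory_variables` output above.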

## Memory Types

### ConversationBufferMemory
- This memory stores every message and exposes the full history through a variable (`history` in the examples above).

### ConversationBufferWindowMemory
- This memory keeps a list of the interactions of the conversation over time, but only uses the last `k` interactions.

### ConversationTokenBufferMemory
- This memory keeps a buffer of recent interactions in memory, and uses token length rather than number of interactions to determine when to flush interactions.

### ConversationSummaryMemory
- This memory creates a summary of the conversation over time.

## Additional Memory Types

### Vector data memory
- Stores text embeddings (from the conversation or elsewhere) in a vector database and retrieves the most relevant blocks of text.
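
The retrieval idea can be sketched without any vector database. In this toy version, `embed` is a hypothetical bag-of-words "embedding" and `VectorMemory` is an invented in-memory store; a real setup would use learned embeddings and a proper vector index.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorMemory:
    """Toy vector memory: stores texts with embeddings, retrieves by similarity."""

    def __init__(self) -> None:
        self.store: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.store.append((text, embed(text)))

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.store, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


vector_memory = VectorMemory()
vector_memory.add("There is a meeting at 8am with your product team")
vector_memory.add("Lunch at noon with a customer at the Italian restaurant")
vector_memory.add("Bring your laptop to show the latest LLM demo")
print(vector_memory.retrieve("when is the meeting with the product team"))
```

Only the retrieved blocks are placed in the prompt, so the context stays small even when the stored history is large.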

### Entity memories
- Using an LLM, it remembers details about specific entities.

You can also use multiple memories at one time. E.g., Conversation memory + Entity memory to recall individuals.

You can also store the conversation in a conventional database (such as a key-value store or SQL).
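
As one possible shape for database-backed storage, the sketch below persists messages with Python's built-in `sqlite3` module. The table layout and the helper names `save_message` and `load_history` are assumptions made for illustration, not part of LangChain.

```python
import sqlite3

# In-memory database for the demo; pass a file path to persist across runs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        session_id TEXT NOT NULL,
        role TEXT NOT NULL,
        content TEXT NOT NULL
    )
    """
)


def save_message(session_id: str, role: str, content: str) -> None:
    """Append one message to the stored conversation."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def load_history(session_id: str) -> str:
    """Rebuild the transcript for one session in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return "\n".join(f"{role}: {content}" for role, content in rows)


save_message("demo", "Human", "Hi")
save_message("demo", "AI", "What's up")
print(load_history("demo"))
```

Keying rows by a session identifier lets several conversations share one table, and the loaded transcript can then be fed into whichever memory type the application uses.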