# Memory in LangChain

A chatbot needs memory so that it can remember earlier parts of the conversation. The OpenAI API on its own, without LangChain, does not keep track of previous messages. Hence, it is our job to add the conversation history to the prompt that the LLM receives.
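
As a rough illustration of what "adding the history to the prompt" means, the sketch below keeps past turns in a plain list and prepends them to each new question. The names `build_prompt` and `record_turn` and the `Human:`/`AI:` labels are hypothetical, chosen only for this example; they are not part of LangChain or the OpenAI API.

```python
# Minimal sketch: a model without memory only sees what we put in the prompt,
# so we carry the history ourselves. All names here are illustrative.
history = []  # list of (speaker, text) tuples


def build_prompt(question):
    """Prepend every stored turn to the new question."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"Human: {question}")
    return "\n".join(lines)


def record_turn(question, answer):
    """Store one human/AI exchange for later prompts."""
    history.append(("Human", question))
    history.append(("AI", answer))


record_turn("My name is Joong", "Nice to meet you, Joong!")
prompt = build_prompt("What is my name?")
print(prompt)
```

The memory classes below automate this bookkeeping and differ mainly in how much of the history they keep.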

Below is a list of memory types, how each of them differs, and how to plug each type of memory into a chain.


## Conversation Buffer
- Very simple: it saves the whole conversation.
- This is useful for text completion and prediction.
- However, it is not efficient when working with a chat model. The longer the conversation, the larger the memory grows, so every request sends more tokens and costs more.

In [4]:
from langchain.memory import ConversationBufferMemory

# initialize memory
memory = ConversationBufferMemory(return_messages=True)

memory.save_context({"input": "Hi"}, {"output": "How are you?"})

memory.load_memory_variables({})

{'history': [HumanMessage(content='Hi'), AIMessage(content='How are you?')]}
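
To make the cost concern concrete, here is a small plain-Python sketch (not LangChain code) of how the history that must be resent grows with each turn. Word counts stand in for tokens, which is an assumption made for simplicity; real tokenizers count differently.

```python
# Sketch: a full buffer resends every past turn, so the prompt grows each turn.
# Word counts stand in for tokens here; real tokenizers count differently.
buffer = []
sizes = []

for turn in range(1, 5):
    buffer.append(f"Human: message number {turn}")
    buffer.append(f"AI: reply number {turn}")
    # Size of the history that would be sent along with the next question.
    sizes.append(sum(len(line.split()) for line in buffer))

print(sizes)
```

The size increases by the same amount every turn and never shrinks, which is the problem the other memory types try to address.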

## Conversation Buffer Window
- It remembers only the most recent interactions; the number to keep is set with `k`.
- It keeps the memory from growing beyond a certain size.
- However, the chatbot will only remember the most recent part of the conversation, not the entire chat history.

In [11]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(
    return_messages=True,
    # how many interactions you want
    k=4
)

def add_message(input, output):
    memory.save_context({"input": input}, {"output":output})
    
add_message(1, 1)

In [12]:
add_message(2, 2)
add_message(3, 3)
add_message(4, 4)

In [13]:
 memory.load_memory_variables({})

{'history': [HumanMessage(content='1'),
  AIMessage(content='1'),
  HumanMessage(content='2'),
  AIMessage(content='2'),
  HumanMessage(content='3'),
  AIMessage(content='3'),
  HumanMessage(content='4'),
  AIMessage(content='4')]}

In [14]:
add_message(5, 5)

In [15]:
# With k=4, saving a fifth interaction pushes the oldest one (message 1) out of the window.
memory.load_memory_variables({})

{'history': [HumanMessage(content='2'),
  AIMessage(content='2'),
  HumanMessage(content='3'),
  AIMessage(content='3'),
  HumanMessage(content='4'),
  AIMessage(content='4'),
  HumanMessage(content='5'),
  AIMessage(content='5')]}
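
The eviction seen above behaves like a fixed-size queue. A rough plain-Python analogue, assuming each interaction is stored as a (human, AI) pair, is shown below; it illustrates the behaviour only and is not necessarily how LangChain implements it internally.

```python
from collections import deque

k = 4  # number of interactions (human/AI pairs) to keep
window = deque(maxlen=k)  # a full deque drops its oldest item on append

for i in range(1, 6):
    window.append((str(i), str(i)))  # (human message, AI message)

print(list(window))
```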

## Conversation Summary Memory
- It summarizes the conversation instead of storing every message.
- ConversationSummaryMemory requires more tokens at the beginning, because each update needs an LLM call to write the summary.
- However, as the conversation becomes longer, the summary is shorter than the full history, which reduces the token count.


In [16]:
from langchain.memory import ConversationSummaryMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0.1)
memory = ConversationSummaryMemory(llm=llm)

def add_message(input, output):
    memory.save_context({"input":input}, {"output":output})

def get_history():
    return memory.load_memory_variables({})

add_message("Hi, I;m Joong, I live in California", "Wow that is cool!")

In [17]:
add_message("California is so pretty", "I wish I could visit there.")

In [18]:
get_history()

{'history': 'The human introduces themselves as Joong and mentions that they live in California. The AI responds by expressing admiration for this information and expresses a desire to visit California because it is so pretty.'}

## Conversation Summary Buffer Memory
- A mixture of Conversation Summary Memory and Conversation Buffer Memory.
- It keeps the most recent interactions as full messages.
- The moment it reaches the limit (`max_token_limit`), instead of forgetting what happened previously, it summarizes the older messages.

In [36]:
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0.1)

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=70,
    return_messages=True,
)

def add_message(input, output):
    memory.save_context({"input":input}, {"output":output})

def get_history():
    return memory.load_memory_variables({})

add_message("Hi, I;m Joong, I live in California", "Wow that is cool!")


In [37]:
get_history()

{'history': [HumanMessage(content='Hi, I;m Joong, I live in California'),
  AIMessage(content='Wow that is cool!')]}

In [38]:
add_message("California is so pretty", "I wish I could visit there.")

In [39]:
get_history()

{'history': [HumanMessage(content='Hi, I;m Joong, I live in California'),
  AIMessage(content='Wow that is cool!'),
  HumanMessage(content='California is so pretty'),
  AIMessage(content='I wish I could visit there.')]}

In [40]:
add_message("How far is California from Wisconsin?", "I don't know!.")

In [41]:
get_history()

{'history': [SystemMessage(content='The human introduces themselves as Joong and mentions that they live in California.'),
  AIMessage(content='Wow that is cool!'),
  HumanMessage(content='California is so pretty'),
  AIMessage(content='I wish I could visit there.'),
  HumanMessage(content='How far is California from Wisconsin?'),
  AIMessage(content="I don't know!.")]}

In [42]:
add_message("Then how far is Wisconsin from Arizona?", "That's pretty far!")

Once the buffer exceeds the limit, the oldest messages are folded into a summary, which appears as a SystemMessage at the start of the history.

In [43]:
get_history()

{'history': [SystemMessage(content='The human introduces themselves as Joong and mentions that they live in California. The AI responds with enthusiasm, saying "Wow, that is cool!"'),
  HumanMessage(content='California is so pretty'),
  AIMessage(content='I wish I could visit there.'),
  HumanMessage(content='How far is California from Wisconsin?'),
  AIMessage(content="I don't know!."),
  HumanMessage(content='Then how far is Wisconsin from Arizona?'),
  AIMessage(content="That's pretty far!")]}
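
The pruning logic can be sketched in plain Python. This is a simplified model under stated assumptions: whitespace-separated words stand in for tokens, and `fake_summarize` is a hypothetical placeholder for the LLM call that writes the real summary. It is not the LangChain implementation.

```python
# Sketch of the summary-buffer idea with a stand-in summarizer.
# Assumptions: words stand in for tokens, and fake_summarize is a
# placeholder for the LLM call that would write the summary.
MAX_TOKENS = 12


def count_tokens(messages):
    return sum(len(m.split()) for m in messages)


def fake_summarize(summary, pruned):
    """Placeholder: a real implementation would ask an LLM to merge these."""
    return (summary + " " + " | ".join(pruned)).strip()


summary = ""
buffer = []


def save(message):
    global summary
    buffer.append(message)
    pruned = []
    # Move the oldest messages out of the buffer until it fits the limit.
    while count_tokens(buffer) > MAX_TOKENS:
        pruned.append(buffer.pop(0))
    if pruned:
        summary = fake_summarize(summary, pruned)


for text in ["Hi I am Joong", "Wow that is cool",
             "California is so pretty", "I wish I could visit"]:
    save(text)

print("summary:", summary)
print("buffer:", buffer)
```

Recent messages stay verbatim while older ones survive only through the summary, which mirrors the output shown above.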

## Conversation Knowledge Graph Memory


In [45]:
from langchain.memory import ConversationKGMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0.1)

memory = ConversationKGMemory(
    llm=llm,
    return_messages=True,
)

def add_message(input, output):
    memory.save_context({"input":input}, {"output":output})

add_message("Hi, I;m Joong, I live in California", "Wow that is cool!")

In [46]:
memory.load_memory_variables({"input": "who is Joong?"})

{'history': [SystemMessage(content='On Joong: Joong lives in California.')]}

In [47]:
add_message("Joong likes python.", "Wow that is cool!")

In [49]:
memory.load_memory_variables({"input": "What does Joong like"})

{'history': [SystemMessage(content='On Joong: Joong lives in California. Joong likes python.')]}
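
Conceptually, this memory stores facts about entities and returns the ones relevant to the current input. The sketch below imitates that with hand-written triples; `add_fact` and `describe` are hypothetical helpers, and in ConversationKGMemory it is the LLM, not the user, that extracts the facts from the conversation.

```python
from collections import defaultdict

# Sketch of a knowledge-graph style memory: facts are stored as
# (subject, predicate, object) triples and looked up by entity.
# Assumption: triples are supplied by hand here; the real memory
# uses an LLM to extract them from the conversation.
triples = defaultdict(list)


def add_fact(subject, predicate, obj):
    triples[subject].append((predicate, obj))


def describe(entity):
    """Return everything known about one entity as a single string."""
    facts = [f"{entity} {predicate} {obj}."
             for predicate, obj in triples.get(entity, [])]
    return f"On {entity}: " + " ".join(facts) if facts else ""


add_fact("Joong", "lives in", "California")
add_fact("Joong", "likes", "python")
print(describe("Joong"))
```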

However, it is hard to imagine how a chatbot would use this if we had to update the memory by hand. Calling `.save_context` and `.load_memory_variables` ourselves on every turn is tedious.
Fortunately, a chain can work with memory automatically, so we do not have to do everything manually.

There are many memory integrations with various databases. https://python.langchain.com/docs/integrations/memory


## How to plug memory into a chain
- Off-the-shelf LLMChain: it automatically receives the response from the LLM and updates the memory, but it is still our job to put the memory's history into the prompt.

First, we will look at a string-based prompt that the model simply completes.

In [122]:
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(temperature=0.1)

memory1 = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=80,
)

chain = LLMChain(
    llm=llm,
    memory=memory1,
    prompt=PromptTemplate.from_template("{question}")
)


In [64]:
chain.predict(question="My name is Joong")

'Nice to meet you, Joong! How can I assist you today?'

In [54]:
chain.predict(question="I live in California.")

"That's great! California is a beautiful state known for its diverse landscapes, including stunning beaches, towering mountains, and vast deserts. It is also home to vibrant cities like Los Angeles, San Francisco, and San Diego. California offers a wide range of activities and attractions, from exploring national parks like Yosemite and Joshua Tree to enjoying the entertainment industry in Hollywood. The state is also known for its progressive culture, technological advancements, and thriving economy."

In [55]:
chain.predict(question="What is my name?")

"I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."

As you can see, it does not remember the previous messages. We will start debugging by passing `verbose=True` to the chain.

In [58]:
memory2 = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=80,
)

chain2 = LLMChain(
    llm=llm,
    memory=memory2,
    prompt=PromptTemplate.from_template("{question}"),
    verbose=True,
)

chain2.predict(question="My name is Joong")



> Entering new LLMChain chain...
Prompt after formatting:
My name is Joong

> Finished chain.


'Nice to meet you, Joong! How can I assist you today?'

By passing `verbose=True`, we can see the prompt that the chain sends to the LLM.

In [60]:
chain2.predict(question="I live in California.")



> Entering new LLMChain chain...
Prompt after formatting:
I live in California.

> Finished chain.


"That's great! California is a beautiful state known for its diverse landscapes, including stunning beaches, majestic mountains, and vast deserts. It is also home to vibrant cities like Los Angeles, San Francisco, and San Diego. California offers a wide range of activities and attractions, from exploring national parks like Yosemite and Joshua Tree to enjoying the entertainment industry in Hollywood. The state is also known for its progressive culture, technological advancements, and thriving economy."

In [61]:
chain2.predict(question="What is my name?")



> Entering new LLMChain chain...
Prompt after formatting:
What is my name?

> Finished chain.


"I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."

As you can see, the conversation history is not being added to the prompt. Hence, we need to put the history stored in the memory into the prompt ourselves.

In [62]:
memory2.load_memory_variables({})

{'history': "System: The human introduces themselves as Joong. The AI greets Joong and asks how it can assist. Joong mentions that they live in California. The AI responds by describing California's diverse landscapes, vibrant cities, and popular tourist attractions. The AI also offers to provide information or discuss any specific topics related to California.\nHuman: What is my name?\nAI: I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."}

The memory is being updated with the inputs and outputs of the human and the LLM. However, the LLM never sees that history, because the prompt template contains only the question.
All we have to do is make a space for the memory in our prompt template, and then tell ConversationSummaryBufferMemory (or whichever memory class you are using) where to put its history.
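
The steps involved can be sketched without LangChain. In the example below, `TinyMemory`, `echo_llm`, and `run_chain` are hypothetical stand-ins written for illustration; the point is only the order of operations: load the history under a named key, fill the template, call the model, save the turn.

```python
# Sketch of the steps a chain with memory goes through on every call.
# TinyMemory and echo_llm are hypothetical stand-ins, not LangChain classes;
# echo_llm just reports how many prompt lines it received.
class TinyMemory:
    def __init__(self, memory_key="chat_history"):
        self.memory_key = memory_key
        self.turns = []

    def load_memory_variables(self, _inputs):
        return {self.memory_key: "\n".join(self.turns)}

    def save_context(self, inputs, outputs):
        self.turns.append(f"Human: {inputs['question']}")
        self.turns.append(f"AI: {outputs['text']}")


TEMPLATE = "{chat_history}\nHuman: {question}\nAI:"


def echo_llm(prompt):
    return f"I received {len(prompt.splitlines())} prompt lines."


def run_chain(memory, question):
    variables = {"question": question}
    # 1. Load the history under the key the template expects.
    variables.update(memory.load_memory_variables(variables))
    # 2. Fill the template, 3. call the model, 4. save the new turn.
    prompt = TEMPLATE.format(**variables)
    answer = echo_llm(prompt)
    memory.save_context({"question": question}, {"text": answer})
    return answer


mem = TinyMemory()
first = run_chain(mem, "My name is Joong")
second = run_chain(mem, "What is my name?")
print(first)
print(second)
```

The second prompt is longer than the first because the template slot named by the memory key now contains the earlier turn.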

In [67]:
# Instead of calling memory.load_memory_variables() ourselves, set memory_key so the chain fills the prompt variable of that name with the history.
memory3 = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=120,
    memory_key="chat_history"
)

# This is a string-based Template.
template = """ 

    You are a helpful AI talking to a human.
    
    {chat_history}
    Human:{question}
    You:
"""

chain3 = LLMChain(
    llm=llm,
    memory=memory3,
    prompt=PromptTemplate.from_template(template),
    verbose=True,
)

In [68]:
chain3.predict(question="My name is Joong")



> Entering new LLMChain chain...
Prompt after formatting:


    You are a helpful AI talking to a human.
    
    
    Human:My name is Joong
    You:


> Finished chain.


'Hello Joong! How can I assist you today?'

In [69]:
chain3.predict(question="I live in California.")



> Entering new LLMChain chain...
Prompt after formatting:


    You are a helpful AI talking to a human.
    
    Human: My name is Joong
AI: Hello Joong! How can I assist you today?
    Human:I live in California.
    You:


> Finished chain.


"That's great! California is a beautiful state. How can I assist you today, Joong?"

In [70]:
chain3.predict(question="What is my name?")



> Entering new LLMChain chain...
Prompt after formatting:


    You are a helpful AI talking to a human.
    
    Human: My name is Joong
AI: Hello Joong! How can I assist you today?
Human: I live in California.
AI: That's great! California is a beautiful state. How can I assist you today, Joong?
    Human:What is my name?
    You:


> Finished chain.


'Your name is Joong.'

In [72]:
chain3.predict(question="My favorite language is python.")



> Entering new LLMChain chain...
Prompt after formatting:


    You are a helpful AI talking to a human.
    
    Human: My name is Joong
AI: Hello Joong! How can I assist you today?
Human: I live in California.
AI: That's great! California is a beautiful state. How can I assist you today, Joong?
Human: What is my name?
AI: Your name is Joong.
Human: What is my name?
AI: Your name is Joong.
    Human:My favorite language is python.
    You:


> Finished chain.


"That's great to hear! Python is a popular programming language known for its simplicity and readability. How can I assist you with Python today, Joong?"

Adding memory to a message-based (chat) prompt is also easy. First, note that memory classes can output their history in two ways:
- string
- message format
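
The difference between the two formats can be shown with a small plain-Python sketch. The `Message` dataclass here is a hypothetical stand-in for LangChain's message classes; in the memory classes above, the `return_messages` flag selects between the two forms.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "Human" or "AI"
    content: str


stored = [Message("Human", "Hi"), Message("AI", "How are you?")]

# Message format: keep the structured objects (suits chat prompts).
as_messages = list(stored)

# String format: flatten everything into one block of text (suits string prompts).
as_string = "\n".join(f"{m.role}: {m.content}" for m in stored)

print(as_messages)
print(as_string)
```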

In [92]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
memory4 = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=120,
    memory_key="chat_history",
    return_messages=True,  # Return the history as chat messages, not as a single string.
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "you are a helpful AI talking to a human"),
    # We don't know how long the conversation will be, so we can't reserve a fixed number of slots for the memory.
    # MessagesPlaceholder solves this: ConversationSummaryBufferMemory fills it with however many messages its history holds.
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{question}"),
])

chain4 = LLMChain(
    llm=llm,
    memory=memory4,
    prompt=prompt,
    verbose=True,
)

In [77]:
chain4.predict(question="My name is Joong")



> Entering new LLMChain chain...
Prompt after formatting:
System: you are a helpful AI talking to a human
Human: My name is Joong

> Finished chain.


'Hello Joong! How can I assist you today?'

In [78]:
chain4.predict(question="I live in California.")



> Entering new LLMChain chain...
Prompt after formatting:
System: you are a helpful AI talking to a human
Human: My name is Joong
AI: Hello Joong! How can I assist you today?
Human: I live in California.

> Finished chain.


"That's great, Joong! California is a beautiful state with a lot to offer. Is there anything specific you would like to know or discuss about living in California?"

In [79]:
chain4.predict(question="What is my name?")



> Entering new LLMChain chain...
Prompt after formatting:
System: you are a helpful AI talking to a human
Human: My name is Joong
AI: Hello Joong! How can I assist you today?
Human: I live in California.
AI: That's great, Joong! California is a beautiful state with a lot to offer. Is there anything specific you would like to know or discuss about living in California?
Human: What is my name?

> Finished chain.


'Your name is Joong, as you mentioned earlier. Is there anything else you would like to know or discuss?'

### LangChain Expression Language
- We can also do this job ourselves by using LangChain Expression Language (LCEL).
- We have to fill in the settings manually, but this is quite useful for customization since the configuration is exposed.
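
For readers new to the `|` syntax, the following plain-Python sketch shows the general idea of composing steps with the pipe operator. The `Step` class and `fake_llm` are hypothetical and heavily simplified; LangChain's runnables offer far more than this.

```python
class Step:
    """Hypothetical mini-runnable that supports `a | b` composition."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # The output of self becomes the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


format_prompt = Step(lambda d: f"Human: {d['question']}")
fake_llm = Step(lambda text: text.upper())  # stand-in for a model call

pipeline = format_prompt | fake_llm
output = pipeline.invoke({"question": "hi"})
print(output)
```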

In [None]:
# The problem with this approach is that every time we invoke the chain, we also have to pass in the chat history.
# chain = prompt | llm

# chain.invoke({
#     "chat_history": memory.load_memory_variables({})["chat_history"],
#     "question": "My name is Joong."
# })

`RunnablePassthrough.assign` allows you to run a function on the chain's input before the prompt is formatted and to add its result to that input under a new key.
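
The idea can be sketched in plain Python: pass the input dictionary through unchanged and add extra keys computed from it. The `assign` helper and `stored_history` list below are hypothetical and written for illustration only; they are not the LangChain implementation.

```python
# Plain-Python sketch of the idea: keep the input dict as it is and
# add extra keys computed from it. `assign` is a hypothetical helper.
def assign(**extras):
    def step(inputs):
        outputs = dict(inputs)  # keep the original keys
        for key, func in extras.items():
            outputs[key] = func(inputs)  # each function receives the input dict
        return outputs
    return step


stored_history = ["Human: My name is Joong.", "AI: Hello Joong!"]


def load_memory(_):
    # The input dict is ignored; the history comes from the stored list.
    return list(stored_history)


add_history = assign(history=load_memory)
result = add_history({"question": "What is my name?"})
print(result)
```

The caller only supplies the question; the history key is filled in automatically before the next step runs.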

In [130]:
from langchain.schema.runnable import RunnablePassthrough

llm = ChatOpenAI(temperature=0.1)

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=120,
    # memory_key="chat_history", (don't need this anymore since the memory key is history by default.)
    return_messages=True,  # Return the history as chat messages, not as a single string.
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "you are a helpful AI talking to a human"),
    # We don't know how long the conversation will be, so we can't reserve a fixed number of slots for the memory.
    # MessagesPlaceholder solves this: ConversationSummaryBufferMemory fills it with however many messages its history holds.
    MessagesPlaceholder(variable_name="history"),
    ("human", "{question}"),
])

# LangChain first calls load_memory with the chain's input (ignored here, hence the "_").
def load_memory(_):
    return memory.load_memory_variables({})["history"]

# The return value of load_memory is assigned to the "history" key that the prompt needs.
# That key, together with the user's question, is then passed to the prompt.
chain = RunnablePassthrough.assign(history=load_memory) | prompt | llm

# chain = prompt | llm

# Since we are managing the memory manually, we have to save each interaction between the user and the model ourselves.
# invoke_chain invokes the chain and then saves that interaction in the memory.
def invoke_chain(question):
    # The dictionary passed to invoke is the input for the first item in the chain.
    result = chain.invoke({"question": question})
    # save_context adds the input and output to the memory.
    memory.save_context(
        {"input": question},
        {"output": result.content},
    )
    print(result)

In [131]:
invoke_chain("My name is Joong.")

content='Hello Joong! How can I assist you today?'


In [133]:
invoke_chain("What is my name?")

content='Your name is Joong, as you mentioned earlier. Is there anything specific you would like assistance with, Joong?'
