# Adding Memory to Conversations

Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation. At a bare minimum, a conversational system should be able to access some window of past messages directly. A more complex system needs a world model that it updates constantly, which lets it maintain information about entities and their relationships.

We call this ability to store information about past interactions "memory". LangChain provides a range of utilities for adding memory to a system. These utilities can be used on their own or incorporated into a chain.

A memory system needs to support two basic actions: reading and writing. Recall that every chain defines some core execution logic that expects certain inputs. Some of these inputs come directly from the user, but some of these inputs can come from memory. A chain will interact with its memory system twice in a given run.

1. AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
2. AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
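
The two steps above can be sketched in plain Python, without LangChain. This is only an illustration of the read-then-write cycle: `SimpleMemory`, `run_chain`, and `echo_logic` are hypothetical names invented for this sketch, and `echo_logic` stands in for the prompt and model.

```python
from typing import Callable, Dict, List


class SimpleMemory:
    """Hypothetical in-memory store for illustration; not a LangChain class."""

    def __init__(self) -> None:
        self.turns: List[str] = []

    def load(self) -> Dict[str, str]:
        # READ: expose everything stored so far under a single key.
        return {"history": "\n".join(self.turns)}

    def save(self, user_input: str, output: str) -> None:
        # WRITE: record the input and output of the current run.
        self.turns.append(f"Human: {user_input}")
        self.turns.append(f"AI: {output}")


def run_chain(
    memory: SimpleMemory,
    user_input: str,
    core_logic: Callable[[Dict[str, str]], str],
) -> str:
    # Step 1: after receiving the user input, read memory and augment the inputs.
    inputs = {"input": user_input, **memory.load()}
    # Core logic runs on the augmented inputs.
    output = core_logic(inputs)
    # Step 2: before returning, write this run's input and output to memory.
    memory.save(user_input, output)
    return output


def echo_logic(inputs: Dict[str, str]) -> str:
    # Stand-in for "prompt | model": reports how much history it received.
    seen = len(inputs["history"].splitlines())
    return f"seen {seen} earlier lines; you said {inputs['input']!r}"
```

Calling `run_chain` twice with the same `SimpleMemory` instance shows the effect: the first call sees an empty history, while the second call sees the two lines written by the first.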

Reference: <https://python.langchain.com/docs/modules/memory/>

In [15]:
import boto3
from utils import get_inference_parameters
from rich import print
from operator import itemgetter
from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory

from langchain.prompts import PromptTemplate
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough
from langchain.llms.bedrock import Bedrock
import warnings

warnings.filterwarnings("ignore")

In [2]:
bedrock_runtime = boto3.client(
    service_name="bedrock-runtime", region_name="us-west-2"
)  # bedrock runtime boto3 client
bedrock_model_id = "anthropic.claude-v2"  # Amazon Bedrock Model ID for claude-v2

model_kwargs = get_inference_parameters("anthropic")  # anthropic claude model parameters

# Define the llm with langchain.llms.bedrock
model = Bedrock(
    model_id=bedrock_model_id, client=bedrock_runtime, model_kwargs=model_kwargs
)

### Conversation Buffer Memory

`ConversationBufferMemory` is the simplest memory type: it stores every message of the conversation and returns the full history each time it is read. The cells below wire it into a chain by hand, reading from memory before the prompt is formatted and writing to memory after the model responds.

In [4]:
# prompt = ChatPromptTemplate.from_messages(
#     [
#         ("system", "You are a helpful chatbot"),
#         MessagesPlaceholder(variable_name="history"),
#         ("human", "{input}"),
#     ]
# )

# memory = ConversationBufferMemory(return_messages=True, ai_prefix="assistant")
# memory.chat_memory.add_user_message("howdy claude!")
# memory.chat_memory.add_ai_message("Hi there! How are you?")
# memory.load_memory_variables({})


memory = ConversationBufferMemory(return_messages=True)
# Notice that "chat_history" is present in the prompt template
template = """You are a nice chatbot having a conversation with a human.

Previous conversation:
{chat_history}

New human question: {question}
Response:"""
prompt = PromptTemplate.from_template(template)

In [5]:
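chain = (
    # READ: load the stored messages (kept under the default "history" key)
    # and expose them to the prompt as "chat_history"
    RunnablePassthrough.assign(
        chat_history=RunnableLambda(memory.load_memory_variables)
        | itemgetter("history")
    )
    | prompt
    | model
)
inputs = {"question": "hi im bob"}
response = chain.invoke(inputs)
response  # str
# WRITE: store this turn so that later runs can refer to it
memory.save_context(inputs, {"output": response})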

In [8]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='hi im bob'),
  AIMessage(content=" Hello Bob, nice to meet you! I'm Claude, an AI assistant created by Anthropic.")]}

In [13]:
inputs = {"question": "how did I introduce myself?"}
response = chain.invoke(inputs)
response
memory.save_context(inputs, {"output": response})

In [14]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='hi im bob'),
  AIMessage(content=" Hello Bob, nice to meet you! I'm Claude, an AI assistant created by Anthropic."),
  HumanMessage(content='how did I introduce myself?'),
  AIMessage(content=' Based on the previous conversation, it seems you introduced yourself by saying "hi im bob".')]}

### Conversation Buffer Window

`ConversationBufferWindowMemory` keeps a list of the conversation's interactions over time, but only uses the last `k` of them.

This can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large.
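
The sliding-window idea can be sketched with a bounded `collections.deque`. This is a minimal illustration of the technique, not the LangChain implementation; `WindowBuffer` is a hypothetical name, and `k=2` is chosen only to keep the example short.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowBuffer:
    """Hypothetical sliding-window buffer; not the LangChain implementation."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen=k discards its oldest item once it is full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user_input: str, output: str) -> None:
        # One interaction is one (input, output) pair.
        self.turns.append((user_input, output))

    def load(self) -> List[Tuple[str, str]]:
        # Only the most recent k interactions are still present.
        return list(self.turns)


buf = WindowBuffer(k=2)
buf.save("hi", "whats up")
buf.save("not much you", "not much")
buf.save("is it going well", "yes")
print(buf.load())
```

With `k=2`, the third `save` pushes the first interaction out of the window, so `load` returns only the last two pairs.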

In [25]:
bw_memory = ConversationBufferWindowMemory(k=3, return_messages=True)
bw_memory.save_context({"input": "hi"}, {"output": "whats up"})
bw_memory.save_context({"input": "not much you"}, {"output": "not much"})
bw_memory.save_context(
    {"input": "is it going well"},
    {"output": "yes, it is going well for me. Thanks for asking"},
)

In [26]:
bw_memory.load_memory_variables({})

{'history': [HumanMessage(content='hi'),
  AIMessage(content='whats up'),
  HumanMessage(content='not much you'),
  AIMessage(content='not much'),
  HumanMessage(content='is it going well'),
  AIMessage(content='yes, it is going well for me. Thanks for asking')]}

In [27]:
from langchain.chains import ConversationChain

conversation_with_summary = ConversationChain(
    llm=model,
    # bw_memory was created with k=3, so only the last 3 interactions are kept
    memory=bw_memory,
    verbose=True,
)
conversation_with_summary.predict(input="Hi, what's up?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[HumanMessage(content='hi'), AIMessage(content='whats up'), HumanMessage(content='not much you'), AIMessage(content='not much'), HumanMessage(content='is it going well'), AIMessage(content='yes, it is going well for me. Thanks for asking')]
Human: Hi, what's up?
AI:

> Finished chain.


" Not much, just chatting with you! How's your day going?"

In [28]:
conversation_with_summary.predict(input="What's their issues?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[HumanMessage(content='not much you'), AIMessage(content='not much'), HumanMessage(content='is it going well'), AIMessage(content='yes, it is going well for me. Thanks for asking'), HumanMessage(content="Hi, what's up?"), AIMessage(content=" Not much, just chatting with you! How's your day going?")]
Human: What's their issues?
AI:

> Finished chain.


' I\'m afraid I don\'t have enough context to know specifically who you are referring to with "their" or what "issues" you are asking about. As an AI assistant created by Anthropic to be helpful, harmless, and honest, I don\'t actually have personal experiences or private knowledge about other people or their issues. Could you please rephrase your question or provide more details so I can better understand what you are asking? I\'m happy to continue our conversation if you can clarify the context further.'

In [29]:
conversation_with_summary.predict(input="Is it going well?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[HumanMessage(content='is it going well'), AIMessage(content='yes, it is going well for me. Thanks for asking'), HumanMessage(content="Hi, what's up?"), AIMessage(content=" Not much, just chatting with you! How's your day going?"), HumanMessage(content="What's their issues?"), AIMessage(content=' I\'m afraid I don\'t have enough context to know specifically who you are referring to with "their" or what "issues" you are asking about. As an AI assistant created by Anthropic to be helpful, harmless, and honest, I don\'t actually have personal experiences or private knowledge about other people or their issues. Could you please rephrase your question o

" I don't have enough context to determine if something specific is going well. As an AI system, I don't have personal experiences or a notion of how things are going in general. I'm happy to continue our conversation if you can provide more details about what you are asking!"