# LangChain Agent Conversation and Memory

Inspiration taken from https://www.youtube.com/watch?v=X550Zbz_ROE

This is important to consider because the number of tokens is limited, so we need to be deliberate about how we store context during a conversation with the LLM.

In [5]:
import os
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain, ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

In [None]:
# Make sure you have set up your environment variables beforehand;
# each lookup below raises KeyError if the variable is missing
os.environ["OPENAI_API_KEY"]
os.environ["OPENAI_ORGANIZATION"]
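The bare lookups above fail with a plain KeyError when a variable is missing. A small check along the following lines could report every missing variable at once. This is only a sketch: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, not part of LangChain or the OpenAI client, and the demo uses a plain dict so it runs without real credentials.

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_ORGANIZATION")


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Demonstrated on a plain dict (placeholder value, not a real key).
demo_env = {"OPENAI_API_KEY": "sk-placeholder"}
print(missing_env_vars(REQUIRED_VARS, demo_env))  # ['OPENAI_ORGANIZATION']
```

In the notebook itself, calling `missing_env_vars(REQUIRED_VARS)` with no second argument would check the real environment.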

Instantiate the model as usual

In [6]:
llm_openai = OpenAI(model_name="gpt-4-0613", temperature=0, max_tokens=512)



### First, let's look at ConversationBufferMemory for simpler tasks

Instantiate the memory where the log will be stored. It tracks what the user and the AI exchange, and stacks that history into the prompt

In [7]:
ai_memory = ConversationBufferMemory()
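To make the idea of "stacking the log into the prompt" concrete, here is a plain-Python sketch of the concept. It is not LangChain's implementation: `SimpleBufferMemory` and `build_prompt` are hypothetical names, and the prompt layout only loosely mirrors the one the chain prints in verbose mode.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in that stores every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs, in order

    def save_turn(self, human_text, ai_text):
        """Record one full exchange: the user input and the model reply."""
        self.turns.append(("Human", human_text))
        self.turns.append(("AI", ai_text))

    @property
    def buffer(self):
        """Render the whole log as one string, one line per message."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory, new_input):
    """Prepend the entire stored transcript to the new user input."""
    return f"Current conversation:\n{memory.buffer}\nHuman: {new_input}\nAI:"


memory = SimpleBufferMemory()
memory.save_turn("Hi, I'm El Brian", "Hello El Brian!")
print(build_prompt(memory, "how is your afternoon so far?"))
```

Every new prompt carries the full history, which is why this approach suits short conversations.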

Start the conversation chain. Like LLMChain, it can also take a custom prompt (let's test that in the Playground notebooks)

In [8]:
conversation_chain = ConversationChain(
    llm=llm_openai,
    memory=ai_memory,
    verbose=True
)

In [10]:
conversation_chain.predict(input="Hi, I'm El Brian")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, I'm El Brian
AI:

> Finished chain.


"Hello El Brian! It's a pleasure to meet you. I'm an AI developed to assist with information and tasks. How can I help you today?"

Notice how the chain keeps track of the conversation as new queries come in: the earlier exchange is replayed in the next prompt

In [20]:
conversation_chain.predict(input="how is your afternoon so far?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, I'm El Brian
AI: Hello El Brian! It's a pleasure to meet you. I'm an AI developed to assist with information and tasks. How can I help you today?
Human: how is your afternoon so far?
AI:

> Finished chain.


"As an artificial intelligence, I don't experience time or have personal feelings. But thank you for asking! How can I assist you further?"

Notice below that the memory object holds the log of the user/AI exchange

In [22]:
print(ai_memory.chat_memory.messages)

[HumanMessage(content="Hi, I'm El Brian", additional_kwargs={}, example=False), AIMessage(content="Hello El Brian! It's a pleasure to meet you. I'm an AI developed to assist with information and tasks. How can I help you today?", additional_kwargs={}, example=False), HumanMessage(content='how is your afternoon so far?', additional_kwargs={}, example=False), AIMessage(content="As an artificial intelligence, I don't experience time or have personal feelings. But thank you for asking! How can I assist you further?", additional_kwargs={}, example=False)]


The same log is also accessible through the ConversationChain object itself

In [23]:
print(conversation_chain.memory.buffer)

Human: Hi, I'm El Brian
AI: Hello El Brian! It's a pleasure to meet you. I'm an AI developed to assist with information and tasks. How can I help you today?
Human: how is your afternoon so far?
AI: As an artificial intelligence, I don't experience time or have personal feelings. But thank you for asking! How can I assist you further?


The above type of memory (ConversationBufferMemory) works well where only a handful of exchanges (~5) are expected, since the full transcript is resent with every prompt. Let's look at memory types suited to longer conversations below
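The cost of resending the full transcript can be illustrated with a rough estimate. The sketch below counts whitespace-separated words as a crude proxy for tokens (a real tokenizer would give different numbers), and the exchanges are made-up placeholders; `rough_token_count` and `prompt_sizes` are hypothetical helper names.

```python
def rough_token_count(text):
    """Crude proxy for tokens: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def prompt_sizes(exchanges):
    """Return the approximate size of the history carried into each successive prompt."""
    sizes = []
    history = []
    for human_text, ai_text in exchanges:
        # Size of the history that would be sent along with this new input.
        sizes.append(rough_token_count("\n".join(history)))
        history.append(f"Human: {human_text}")
        history.append(f"AI: {ai_text}")
    return sizes


# Five made-up exchanges of identical length, purely for illustration.
demo = [("one two three", "four five six seven")] * 5
print(prompt_sizes(demo))  # [0, 9, 18, 27, 36]
```

The carried history grows by a fixed amount per exchange here, so the prompt size grows linearly with the length of the conversation.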

### Let's explore ConversationSummaryMemory for more exchange-intensive scenarios

In [24]:
from langchain.memory import ConversationSummaryMemory

Notice that instantiating this memory does require the llm input (unlike ConversationBufferMemory above)

<u>Why the difference?</u> Because it makes an additional LLM call to summarise the log, on top of the LLM call used for the conversation itself (Test multi-LLM usage in Playground)

In [26]:
ai_summary_memory = ConversationSummaryMemory(llm=llm_openai)
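The role of the extra LLM can be sketched in plain Python. This is a conceptual illustration, not LangChain's implementation: `SimpleSummaryMemory` is a hypothetical class, and `toy_summarizer` is a deliberately naive placeholder that stands in for the summarization LLM call by truncating each new line.

```python
class SimpleSummaryMemory:
    """Hypothetical stand-in: keeps one running summary instead of a transcript."""

    def __init__(self, summarize):
        # summarize(previous_summary, new_lines) plays the role of the extra LLM call.
        self.summarize = summarize
        self.buffer = ""

    def save_turn(self, human_text, ai_text):
        """Fold the latest exchange into the running summary."""
        new_lines = f"Human: {human_text}\nAI: {ai_text}"
        self.buffer = self.summarize(self.buffer, new_lines)


def toy_summarizer(previous_summary, new_lines):
    """Placeholder for an LLM: keeps only the first five words of each new line."""
    shortened = [" ".join(line.split()[:5]) for line in new_lines.splitlines()]
    return " | ".join(part for part in [previous_summary, *shortened] if part)


memory = SimpleSummaryMemory(toy_summarizer)
memory.save_turn("Hi, I'm El Brian", "Hello El Brian! It's a pleasure to meet you.")
memory.save_turn("How are you doing this evening?", "I don't have feelings, but I'm ready to help.")
print(memory.buffer)
```

The memory needs a summarizer at construction time because every saved turn triggers a call to it, which mirrors why ConversationSummaryMemory takes an llm argument.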

In [28]:
conversation_chain = ConversationChain(
    llm=llm_openai,
    memory=ai_summary_memory,
    verbose=True
)

In [29]:
conversation_chain.predict(input="Hi, I'm El Brian")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, I'm El Brian
AI:

> Finished chain.


"Hello El Brian! It's a pleasure to meet you. I'm an AI developed to assist with information and answer any questions you might have. How can I assist you today?"

Notice the key difference now: the prompt carries a summary of the conversation instead of the raw transcript

In [30]:
conversation_chain.predict(input="How are you doing this evening?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
The human, El Brian, introduces himself to the AI. The AI responds by introducing itself as an assistant developed to provide information and answer questions. It then asks El Brian how it can assist him.
Human: How are you doing this evening?
AI:

> Finished chain.


"As an artificial intelligence, I don't have feelings or personal experiences, but I'm here and ready to assist you. How can I help you this evening?"

The AI summarizes the prior conversation, rather than logging the entire conversation

In [33]:
print(conversation_chain.memory.buffer)

El Brian introduces himself to the AI, which identifies itself as an assistant designed to provide information and answer questions. When El Brian asks how the AI is doing, it explains that as an AI, it doesn't have feelings or personal experiences, but is ready to assist. It then reiterates its readiness to help El Brian.


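Notice, however, that early in the conversation this approach does not save tokens: the summary is roughly as long as the exchange it replaces, and producing it costs an extra LLM call. Let's add more to the log to keep seeing how summarization works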

In [34]:
conversation_chain.predict(input="Can you help me with identifying the best country to visit in the summer?")

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', ConnectionResetError(54, 'Connection reset by peer')).



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
El Brian introduces himself to the AI, which identifies itself as an assistant designed to provide information and answer questions. When El Brian asks how the AI is doing, it explains that as an AI, it doesn't have feelings or personal experiences, but is ready to assist. It then reiterates its readiness to help El Brian.
Human: Can you help me with identifying the best country to visit in the summer?
AI:

> Finished chain.


'Of course, I\'d be happy to help with that! However, the "best" country to visit in the summer can depend on a variety of factors, such as your personal interests, budget, and tolerance for heat. \n\nIf you\'re a fan of mild weather, you might consider visiting Iceland. The average temperature in the summer is around 10-13 degrees Celsius (50-55 degrees Fahrenheit), and the country is known for its stunning natural beauty, including waterfalls, geysers, and hot springs.\n\nIf you\'re interested in history and culture, Italy could be a great choice. Summer is a popular time to visit cities like Rome, Florence, and Venice, where you can see famous landmarks like the Colosseum, the Uffizi Gallery, and the Grand Canal.\n\nFor beach lovers, the Maldives offers beautiful white sand beaches and clear, warm waters. The country is also known for its luxury resorts, many of which are located on their own private islands.\n\nIf you\'re a foodie, you might consider visiting Japan. The country is 

In [35]:
print(conversation_chain.memory.buffer)

El Brian introduces himself to the AI, which identifies itself as an assistant designed to provide information and answer questions. When El Brian asks how the AI is doing, it explains that as an AI, it doesn't have feelings or personal experiences, but is ready to assist. El Brian then asks the AI to help identify the best country to visit in the summer. The AI responds that the "best" country can depend on various factors like personal interests, budget, and heat tolerance. It suggests Iceland for mild weather and natural beauty, Italy for history and culture, the Maldives for beaches and luxury resorts, and Japan for food lovers. The AI also offers to provide a more tailored recommendation if given more information about El Brian's preferences for a summer vacation.


The above is a nice option for storing memory as a summary, so that the 'important' context is kept within the available tokens. Let's look at an alternative where we keep only a pre-defined window of the most recent exchanges (the window is counted in exchanges, not tokens)

### Let's explore ConversationBufferWindowMemory to test this fixed-window approach

In [36]:
from langchain.memory import ConversationBufferWindowMemory
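Before using the class, the windowing idea itself can be sketched in plain Python. This is a conceptual illustration under the assumption that the window is measured in exchanges (one user input plus one AI reply); `SimpleWindowMemory` is a hypothetical name and not LangChain's implementation.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical stand-in that keeps only the last k exchanges."""

    def __init__(self, k):
        # A bounded deque drops the oldest exchange once k is exceeded.
        self.exchanges = deque(maxlen=k)

    def save_turn(self, human_text, ai_text):
        """Record one exchange; the oldest one falls off when the window is full."""
        self.exchanges.append((human_text, ai_text))

    @property
    def buffer(self):
        """Render the retained exchanges as one string, one line per message."""
        lines = []
        for human_text, ai_text in self.exchanges:
            lines.append(f"Human: {human_text}")
            lines.append(f"AI: {ai_text}")
        return "\n".join(lines)


memory = SimpleWindowMemory(k=2)
memory.save_turn("first question", "first answer")
memory.save_turn("second question", "second answer")
memory.save_turn("third question", "third answer")
print(memory.buffer)  # the first exchange is no longer present
```

The trade-off is that the prompt size stays bounded, but anything said before the window is forgotten entirely rather than summarized.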