**Conversational memory** is how chatbots can respond to our queries in a chat-like manner. It enables a coherent conversation, and without it, every query would be treated as an entirely independent input without considering past interactions.

The memory allows an "agent" to remember previous interactions with the user. By default, agents are stateless, meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input.

There are many applications where remembering previous interactions is very important, such as chatbots. Conversational memory allows us to do that.

https://langchain.readthedocs.io/en/latest/modules/memory/how_to_guides.html

In [3]:
!pip install -qU langchain openai tiktoken


In [28]:
import inspect

from getpass import getpass
from langchain import OpenAI
from langchain.chains import LLMChain, ConversationChain
from langchain.chains.conversation.memory import (ConversationBufferMemory, ConversationSummaryMemory, ConversationBufferWindowMemory, ConversationKGMemory)
from langchain.callbacks import get_openai_callback
import tiktoken

To run this notebook, we need an OpenAI LLM. Here we set up the LLM we will use for the whole notebook; just enter your OpenAI API key when prompted.
https://platform.openai.com/account/api-keys

In [29]:
OPENAI_API_KEY = getpass()

··········


In [30]:
llm = OpenAI(
    temperature=0, 
    openai_api_key=OPENAI_API_KEY,
    model_name='text-davinci-003'  # can be used with llms like 'gpt-3.5-turbo'
)

Later we will make use of a count_tokens utility function, which reports the number of tokens used by each call. We define it as follows:

In [31]:
def count_tokens(chain, query):
    with get_openai_callback() as cb:
        result = chain.run(query)
        print(f'Spent a total of {cb.total_tokens} tokens')

    return result

In [32]:
conversation = ConversationChain(
    llm=llm, 
)

In [33]:
print(conversation.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


Interesting! So this chain's prompt is telling it to chat with the user and try to give truthful answers. If we look closely, there is a new component in the prompt that we didn't see when we were tinkering with the LLMMathChain: history. This is where our memory will come into play.

In [34]:
print(inspect.getsource(conversation._call), inspect.getsource(conversation.apply))

    def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        return self.apply([inputs])[0]
     def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:
        """Utilize the LLM generate method for speed gains."""
        response = self.generate(input_list)
        return self.create_outputs(response)



Nothing really magical going on here, just a straightforward pass through an LLM. In fact, this chain inherits these methods directly from the LLMChain without any modification:

In [35]:
print(inspect.getsource(LLMChain._call), inspect.getsource(LLMChain.apply))

    def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        return self.apply([inputs])[0]
     def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:
        """Utilize the LLM generate method for speed gains."""
        response = self.generate(input_list)
        return self.create_outputs(response)



So basically this chain combines an input from the user with the conversation history to generate a meaningful (and hopefully truthful) response.
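To make the role of the history placeholder concrete, here is a minimal sketch in plain Python. It does not use LangChain: render_prompt is a hypothetical helper, and the template text is copied from the prompt printed earlier. It only illustrates the idea that the model sees the past turns because they are pasted into the prompt.

```python
# Template text copied from the prompt printed above.
TEMPLATE = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it "
    "does not know.\n\n"
    "Current conversation:\n"
    "{history}\n"
    "Human: {input}\n"
    "AI:"
)


def render_prompt(history: str, user_input: str) -> str:
    """Hypothetical helper: fill the two placeholders of the template."""
    return TEMPLATE.format(history=history, input=user_input)


# First turn: there is no history yet, so the slot is empty.
first = render_prompt("", "Good morning AI!")

# Later turn: earlier exchanges are pasted in as plain text.
second = render_prompt(
    "Human: Good morning AI!\nAI: Good morning! How can I help you?",
    "What did I just say?",
)
print(second)
```

The second prompt is longer than the first only because of the history text, which is why the choice of memory type directly affects token usage.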

**Memory type #1: ConversationBufferMemory**

The ConversationBufferMemory does just what its name suggests: it keeps a buffer of the previous conversation excerpts as part of the context in the prompt.

Key feature: the conversation buffer memory keeps the previous pieces of conversation completely unmodified, in their raw form
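As a rough mental model (not LangChain's actual implementation), a buffer memory can be sketched as a list of turns that is joined into a transcript on demand. The class name SimpleBufferMemory and its methods are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Hypothetical stand-in: stores every turn verbatim, never drops anything."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        """Record one exchange exactly as it happened."""
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        """The raw transcript that would fill the history slot of the prompt."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = SimpleBufferMemory()
memory.save("Good morning AI!", "Good morning! How can I help you?")
memory.save("My interest here is to learn about recursion", "Ah, recursion!")
print(memory.buffer)
```

Because nothing is ever removed, the transcript grows with every exchange.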

In [36]:
conversation_buf = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)

We pass a user prompt to the chain that uses ConversationBufferMemory like so:

In [37]:
conversation_buf("Good morning AI!")

{'input': 'Good morning AI!',
 'history': '',
 'response': " Good morning! It's a beautiful day today, isn't it? How can I help you?"}

This one call used a total of 85 tokens, but we can't see that from the above. If we'd like to count the number of tokens being used we just pass our conversation chain object and the message we'd like to input via the count_tokens function we defined earlier:

In [38]:
count_tokens(
    conversation_buf, 
    "My interest here is to learn about recursion"
)

Spent a total of 162 tokens


" Ah, recursion! That's a fascinating concept. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve. Do you have any specific questions about recursion?"

In [39]:
count_tokens(
    conversation_buf,
    "can you show me an example with python code"
)

Spent a total of 263 tokens


" Sure! Here's an example of a recursive function written in Python:\n\ndef factorial(n):\n    if n == 0:\n        return 1\n    else:\n        return n * factorial(n-1)\n\nThis function calculates the factorial of a given number. It calls itself until the base case (n == 0) is reached, at which point it returns the result."

In [40]:
count_tokens(
    conversation_buf, 
    "show me more example"
)

Spent a total of 391 tokens


" Sure! Here's another example of a recursive function written in Python:\n\ndef fibonacci(n):\n    if n == 0:\n        return 0\n    elif n == 1:\n        return 1\n    else:\n        return fibonacci(n-1) + fibonacci(n-2)\n\nThis function calculates the nth number in the Fibonacci sequence. It calls itself twice, once for the n-1th number and once for the n-2th number, and then adds the two results together."

In [41]:
count_tokens(
    conversation_buf, 
    "What is my aim again? "
)

Spent a total of 455 tokens


" Your aim is to learn about recursion. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve."

Our LLM with ConversationBufferMemory can clearly remember earlier interactions in the conversation. Let's take a closer look at how the chain stores our previous conversation. We can do this by accessing the .buffer attribute of the .memory in our chain.

In [42]:
print(conversation_buf.memory.buffer)

Human: Good morning AI!
AI:  Good morning! It's a beautiful day today, isn't it? How can I help you?
Human: My interest here is to learn about recursion
AI:  Ah, recursion! That's a fascinating concept. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve. Do you have any specific questions about recursion?
Human: can you show me an example with python code
AI:  Sure! Here's an example of a recursive function written in Python:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

This function calculates the factorial of a given number. It calls itself until the base case (n == 0) is reached, at which point it returns the result.
Human: show me more example
AI:  Sure! Here's another example of a recursive function written in Python:

def fibonacci(n):
    if n == 0:
        return 0


**Memory type #2: ConversationSummaryMemory**

The problem with the ConversationBufferMemory is that as the conversation progresses, the token count of our context history adds up. This is problematic because we might max out our LLM with a prompt that is too large to be processed.

Enter ConversationSummaryMemory.

Again, we can infer from the name what is going on: we keep a summary of our previous conversation snippets as our history. How will we summarize these? LLM to the rescue.

Key feature: the conversation summary memory keeps the previous pieces of conversation in a summarized form, where the summarization is performed by an LLM.

In this case we need to pass the LLM to the memory constructor to power its summarization ability.

In [43]:
conversation_sum = ConversationChain(
    llm=llm, 
    memory=ConversationSummaryMemory(llm=llm)
)

When we have an llm, we always have a prompt ;) Let's see what's going on inside our conversation summary memory:

In [44]:
print(conversation_sum.memory.prompt.template)

Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:


Cool! So each new interaction is summarized and appended to a running summary as the memory of our chain. Let's see how this works in practice!
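The update loop implied by that prompt can be sketched as follows. This is an illustration under assumptions, not LangChain's code: SimpleSummaryMemory is a hypothetical class, and toy_summarizer is a deliberately crude placeholder standing in for the LLM call that would receive the current summary and the new lines.

```python
from typing import Callable


class SimpleSummaryMemory:
    """Hypothetical stand-in: keeps one running summary instead of raw turns."""

    def __init__(self, summarize: Callable[[str, str], str]) -> None:
        self.summarize = summarize
        self.buffer = ""  # the running summary

    def save(self, human: str, ai: str) -> None:
        """Fold the newest exchange into the running summary."""
        new_lines = f"Human: {human}\nAI: {ai}"
        self.buffer = self.summarize(self.buffer, new_lines)


def toy_summarizer(summary: str, new_lines: str) -> str:
    """Placeholder for the LLM: keeps the first five words of the human's line."""
    human_line = new_lines.splitlines()[0].replace("Human: ", "", 1)
    short = " ".join(human_line.split()[:5])
    note = f'The human said "{short}".'
    return (summary + " " + note).strip()


memory = SimpleSummaryMemory(toy_summarizer)
memory.save("Good morning AI!", "Good morning! How can I help you?")
memory.save("My interest here is to learn about recursion", "Ah, recursion!")
print(memory.buffer)
```

Note that every save triggers one extra summarization call, which is where the additional per-call tokens of this memory type come from.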

In [45]:
# without count_tokens we'd call `conversation_sum("Good morning AI!")`
# but let's keep track of our tokens:
count_tokens(
    conversation_sum, 
    "Good morning AI!"
)

Spent a total of 283 tokens


" Good morning! It's a beautiful day today, isn't it? How can I help you?"

In [46]:
count_tokens(
    conversation_sum, 
    "My interest here is to learn about recursion"
)

Spent a total of 484 tokens


" Hi there! I'm glad you're interested in learning about recursion. Recursion is a programming technique that involves a function calling itself repeatedly until a certain condition is met. It's a powerful tool for solving complex problems, but it can be difficult to understand and implement. Can I answer any specific questions you have about recursion?"

In [47]:
count_tokens(
    conversation_sum, 
    "can you show me an example with python code"
)

Spent a total of 702 tokens


' Sure! Here is an example of a recursive function written in Python:\n\ndef recursive_function(n):\n    if n == 0:\n        return 0\n    else:\n        return n + recursive_function(n-1)\n\nThis function takes an integer as an argument and returns the sum of all the integers from 0 to the argument. For example, if you call recursive_function(5), it will return 15 (0 + 1 + 2 + 3 + 4 + 5).'

In [48]:
count_tokens(
    conversation_sum, 
    "show me more example"
)

Spent a total of 759 tokens


" Sure! Here's another example of a recursive function written in Python. This one takes a string as an argument and returns the reverse of the string.\n\ndef reverse_string(string):\n    if len(string) == 0:\n        return string\n    else:\n        return reverse_string(string[1:]) + string[0]\n\nTry it out by passing in a string of your choice!"

In [49]:
count_tokens(
    conversation_sum, 
    "What is my aim again?"
)

Spent a total of 747 tokens


" Your aim is to learn about recursion and understand how to implement it. I can provide more examples of recursive functions written in Python if you'd like."

In [50]:
print(conversation_sum.memory.buffer)


The human greeted the AI and the AI responded with a greeting and asked how it could help. The human expressed interest in learning about recursion, to which the AI responded by explaining that recursion is a powerful tool for solving complex problems, but can be difficult to understand and implement. The AI then asked if it could answer any specific questions the human had about recursion, and provided an example of a recursive function written in Python which takes an integer as an argument and returns the sum of all the integers from 0 to the argument. When the human asked for more examples, the AI provided another example of a recursive function written in Python which takes a string as an argument and returns the reverse of the string, and reminded the human that their aim was to learn about recursion and understand how to implement it, offering to provide more examples if needed.


You might be wondering: if the aggregate token count is greater in each call here than in the buffer example, why should we use this type of memory? If we inspect the buffer, we see that although each call uses more tokens (the summarization itself costs tokens), the final history is shorter. This lets us have many more interactions before we reach the prompt's maximum length, making our chatbot more robust to longer conversations.

We can count the number of tokens being used (without making a call to OpenAI) using the tiktoken tokenizer like so:

In [51]:
# initialize tokenizer
tokenizer = tiktoken.encoding_for_model('text-davinci-003')

# show number of tokens for the memory used by each memory type
print(
    f'Buffer memory conversation length: {len(tokenizer.encode(conversation_buf.memory.buffer))}\n'
    f'Summary memory conversation length: {len(tokenizer.encode(conversation_sum.memory.buffer))}'
)

Buffer memory conversation length: 401
Summary memory conversation length: 171


**Note: the text-davinci-003 and gpt-3.5-turbo models have a limit of 4096 tokens, shared between the prompt and the answer.**
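As a quick check of the trade-off, the short calculation below adds up the per-call token counts reported in the runs above and compares the final history sizes against the 4096-token limit. All numbers are copied from this notebook's outputs; only the variable names are introduced here.

```python
# Per-call totals reported above for the same five messages.
buffer_spent = [85, 162, 263, 391, 455]
summary_spent = [283, 484, 702, 759, 747]

# Final history sizes measured with tiktoken above.
buffer_history = 401
summary_history = 171

MAX_TOKENS = 4096  # limit quoted in the note above

total_buffer = sum(buffer_spent)
total_summary = sum(summary_spent)
headroom_buffer = MAX_TOKENS - buffer_history
headroom_summary = MAX_TOKENS - summary_history

print(f"Tokens spent overall: buffer={total_buffer}, summary={total_summary}")
print(f"Tokens left for the next prompt and answer: "
      f"buffer={headroom_buffer}, summary={headroom_summary}")
```

Over these five turns the summary memory spent more tokens in total (2975 versus 1356), but it leaves more room for future turns (3925 versus 3695 tokens).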

**Memory type #3: ConversationBufferWindowMemory**

Another great option for these cases is the ConversationBufferWindowMemory, which keeps only the last few interactions in memory and intentionally drops the oldest ones: short-term memory, if you like. Both the aggregate token count and the per-call token count drop noticeably. The k parameter controls the size of this window.

Key feature: the conversation buffer window memory keeps the latest pieces of the conversation in raw form
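The windowing idea can be sketched with a bounded deque from the standard library. This is only an illustration of the behaviour described here, not LangChain's implementation; SimpleWindowMemory and its methods are hypothetical names.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical stand-in: keeps only the k most recent turns, verbatim."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen=k discards its oldest item when a new one
        # is appended to a full deque.
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        """Transcript of the turns still inside the window."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = SimpleWindowMemory(k=1)
window.save("My interest here is to learn about recursion", "Ah, recursion!")
window.save("show me more example", "Here's an example of a for loop.")
print(window.history())
```

With k=1 the first exchange is gone as soon as the second is saved, so a later question about the original topic can no longer be answered from memory.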

In [52]:
conversation_bufw = ConversationChain(
    llm=llm, 
    memory=ConversationBufferWindowMemory(k=1)
)

In [53]:
count_tokens(
    conversation_bufw, 
    "Good morning AI!"
)

Spent a total of 85 tokens


" Good morning! It's a beautiful day today, isn't it? How can I help you?"

In [54]:
count_tokens(
    conversation_bufw, 
    "My interest here is to learn about recursion"
)

Spent a total of 162 tokens


" Ah, recursion! That's a fascinating concept. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve. Do you have any specific questions about recursion?"

In [55]:
count_tokens(
    conversation_bufw, 
    "can you show me an example with python code"
)

Spent a total of 232 tokens


" Sure! Here's an example of a recursive function written in Python:\n\ndef factorial(n):\n    if n == 0:\n        return 1\n    else:\n        return n * factorial(n-1)\n\nThis function calculates the factorial of a given number. It calls itself until the base case (n == 0) is reached, at which point it returns the result."

In [56]:
count_tokens(
    conversation_bufw, 
    "show me more example"
)

Spent a total of 206 tokens


" Sure! Here's an example of a for loop written in Python:\n\nfor i in range(10):\n    print(i)\n\nThis loop prints out the numbers 0 through 9."

In [57]:
count_tokens(
    conversation_bufw, 
    "What is my aim again? "
)

Spent a total of 131 tokens


' Your aim is to understand how for loops work in Python.'

As we can see, it effectively 'forgot' what we talked about in the first interaction. Let's see what it 'remembers'. Given that we set k to 1, we would expect it to remember only the last interaction.

To inspect the history, we call the load_memory_variables method, since this memory type passes the buffer through that method before sending it to the LLM.

In [58]:
bufw_history = conversation_bufw.memory.load_memory_variables(
    inputs=[]
)['history']

In [59]:
print(bufw_history)

Human: What is my aim again? 
AI:  Your aim is to understand how for loops work in Python.


Makes sense.

On the plus side, we are shortening our conversation length when compared to buffer memory without a window:

In [60]:
print(
    f'Buffer memory conversation length: {len(tokenizer.encode(conversation_buf.memory.buffer))}\n'
    f'Summary memory conversation length: {len(tokenizer.encode(conversation_sum.memory.buffer))}\n'
    f'Buffer window memory conversation length: {len(tokenizer.encode(bufw_history))}'
)

Buffer memory conversation length: 401
Summary memory conversation length: 171
Buffer window memory conversation length: 25


*Note: We are using k=1 here for illustrative purposes; in most real-world applications you would need a higher value for k.*

**Memory type #4: ConversationSummaryBufferMemory**


Key feature: the conversation summary buffer memory keeps a summary of the earliest pieces of conversation while retaining a raw recollection of the latest interactions.
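One way to picture this hybrid is sketched below: recent turns stay raw until they exceed a token budget, at which point the oldest turns are folded into a summary. This is a simplified sketch under assumptions, not LangChain's implementation. SimpleSummaryBufferMemory, rough_token_count and toy_summarizer are hypothetical, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Assumption: whitespace-separated words approximate tokens."""
    return len(text.split())


def toy_summarizer(summary: str, old_turn: str) -> str:
    """Placeholder for the LLM call that would write a real summary."""
    return (summary + " [1 turn summarized]").strip()


class SimpleSummaryBufferMemory:
    """Hypothetical stand-in: summary of old turns plus raw recent turns."""

    def __init__(self, summarize: Callable[[str, str], str],
                 max_token_limit: int) -> None:
        self.summarize = summarize
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.turns: List[str] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append(f"Human: {human}\nAI: {ai}")
        # Fold the oldest raw turns into the summary until the raw part
        # fits within the budget again.
        while self.turns and rough_token_count("\n".join(self.turns)) > self.max_token_limit:
            oldest = self.turns.pop(0)
            self.summary = self.summarize(self.summary, oldest)

    def history(self) -> str:
        parts = ([self.summary] if self.summary else []) + self.turns
        return "\n".join(parts)


memory = SimpleSummaryBufferMemory(toy_summarizer, max_token_limit=12)
memory.save("Good morning AI!", "Good morning!")
memory.save("My interest here is to learn about recursion", "Ah, recursion!")
print(memory.history())
```

If the budget is never exceeded, nothing is summarized and the history is identical to a plain buffer. That would be consistent with the run below, where the conversation measured at roughly 400 tokens stays under the 650-token limit and the printed history is still the raw transcript.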

In [62]:
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory


In [65]:
conversation_sum_bufw = ConversationChain(
    llm=llm, 
    memory=ConversationSummaryBufferMemory(llm=llm, max_token_limit=650)  # older turns are summarized once the raw buffer exceeds 650 tokens
)

In [66]:
count_tokens(
    conversation_sum_bufw, 
    "Good morning AI!"
)

Spent a total of 85 tokens


" Good morning! It's a beautiful day today, isn't it? How can I help you?"

In [67]:
count_tokens(
    conversation_sum_bufw, 
    "My interest here is to learn about recursion"
)

Spent a total of 162 tokens


" Ah, recursion! That's a fascinating concept. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve. Do you have any specific questions about recursion?"

In [68]:
count_tokens(
    conversation_sum_bufw, 
    "can you show me an example with python code"
)

Spent a total of 263 tokens


" Sure! Here's an example of a recursive function written in Python:\n\ndef factorial(n):\n    if n == 0:\n        return 1\n    else:\n        return n * factorial(n-1)\n\nThis function calculates the factorial of a given number. It calls itself until the base case (n == 0) is reached, at which point it returns the result."

In [69]:
count_tokens(
    conversation_sum_bufw, 
    "show me more example"
)

Spent a total of 391 tokens


" Sure! Here's another example of a recursive function written in Python:\n\ndef fibonacci(n):\n    if n == 0:\n        return 0\n    elif n == 1:\n        return 1\n    else:\n        return fibonacci(n-1) + fibonacci(n-2)\n\nThis function calculates the nth number in the Fibonacci sequence. It calls itself twice, once for the n-1th number and once for the n-2th number, and then adds the two results together."

In [70]:
count_tokens(
    conversation_sum_bufw, 
    "What is my aim again? "
)

Spent a total of 455 tokens


" Your aim is to learn about recursion. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve."

In [71]:
bufw_history = conversation_sum_bufw.memory.load_memory_variables(
    inputs=[]
)['history']

In [72]:
print(bufw_history)

Human: Good morning AI!
AI:  Good morning! It's a beautiful day today, isn't it? How can I help you?
Human: My interest here is to learn about recursion
AI:  Ah, recursion! That's a fascinating concept. Recursion is a programming technique where a function calls itself, allowing it to repeat a task until a certain condition is met. It's often used to solve complex problems that would otherwise be difficult to solve. Do you have any specific questions about recursion?
Human: can you show me an example with python code
AI:  Sure! Here's an example of a recursive function written in Python:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

This function calculates the factorial of a given number. It calls itself until the base case (n == 0) is reached, at which point it returns the result.
Human: show me more example
AI:  Sure! Here's another example of a recursive function written in Python:

def fibonacci(n):
    if n == 0:
        return 0


**Memory type #5: ConversationKnowledgeGraphMemory**

This is a super cool memory type that was introduced just recently. It is based on the concept of a knowledge graph, which recognizes different entities and connects them in pairs with a predicate, resulting in (subject, predicate, object) triplets. This enables us to compress a lot of information into highly significant snippets that can be fed into the model as context.

Key feature: the conversation knowledge graph memory keeps a knowledge graph of all the entities that have been mentioned in the interactions together with their semantic relationships.
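To illustrate the triplet idea without any LLM or graph library, the sketch below stores (subject, predicate, object) facts by hand and looks them up by entity. SimpleTripleStore is a hypothetical class for this illustration only. In the real memory the triples are extracted from the conversation by the LLM, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import List


class SimpleTripleStore:
    """Hypothetical stand-in: facts stored as (subject, predicate, object)."""

    def __init__(self) -> None:
        # subject -> list of (predicate, object) pairs
        self._by_subject = defaultdict(list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._by_subject[subject].append((predicate, obj))

    def about(self, subject: str) -> List[str]:
        """Return the known facts about one entity as short sentences."""
        return [
            f"{subject} {predicate} {obj}"
            for predicate, obj in self._by_subject.get(subject, [])
        ]


kg = SimpleTripleStore()
kg.add("Human", "likes", "mangoes")
kg.add("Human", "is named", "human")
print(kg.about("Human"))
```

Only the facts about entities mentioned in the current input need to be placed in the prompt, which is how this approach keeps the context compact.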

In [80]:
!pip install -qU networkx

In [81]:
conversation_kg = ConversationChain(
    llm=llm, 
    memory=ConversationKGMemory(llm=llm)
)

In [82]:
count_tokens(
    conversation_kg, 
    "My name is human and I like mangoes!"
)

Spent a total of 1167 tokens


" Hi Human! My name is AI. It's nice to meet you. I like mangoes too! Did you know that mangoes are a great source of vitamins A and C?"

The memory keeps a knowledge graph of everything it has learned so far. Note that in the output below each tuple is printed in (subject, object, predicate) order.

In [83]:
conversation_kg.memory.kg.get_triples()

[('Human', 'human', 'name'), ('Human', 'mangoes', 'likes')]