# Conversation Memory

Conversational memory is a way for chatbots to remember previous interactions with users. This allows them to respond to queries in a more coherent way and to provide a more natural conversation experience.

In this notebook, we will explore how to use conversational memory with the LangChain library.

In [1]:
!pip install -qU langchain openai tiktoken


In [2]:
import inspect

from getpass import getpass
from langchain import OpenAI
from langchain.chains import LLMChain, ConversationChain
from langchain.chains.conversation.memory import (ConversationBufferMemory,
                                                  ConversationSummaryMemory,
                                                  ConversationBufferWindowMemory,
                                                  ConversationKGMemory)

from langchain.callbacks import get_openai_callback
import tiktoken

In [14]:
import os

os.environ['OPENAI_API_KEY'] = getpass()

··········


In [15]:
from langchain.chat_models import ChatOpenAI

# temperature=0 keeps the responses as deterministic as possible
llm = ChatOpenAI(
    temperature=0,
    model_name="gpt-4",  # other OpenAI chat models can be substituted here
)

In [5]:
# We will use this helper later to count the tokens spent on each call.

def count_tokens(chain, query):
    # get_openai_callback tracks token usage for the OpenAI calls made inside the block
    with get_openai_callback() as cb:
        result = chain.run(query)
        print(f'Spent a total of {cb.total_tokens} tokens.')

    return result

### What is memory?

Memory refers to an agent's ability to recall past interactions with the user, particularly in the context of chatbots.

The official definition of memory is as follows:

By default, Chains and Agents operate in a stateless manner, treating each incoming query independently. In certain applications, such as chatbots, it becomes crucial to retain information from previous interactions, both in the short and long term. The concept of "Memory" is designed to fulfill this need.

As we explore further, it becomes evident that, despite its apparent simplicity, there are various methods to implement this memory capability.

Before delving into the diverse memory modules provided by the library, we will introduce the chain used in the upcoming examples: the ConversationChain.

As is customary, gaining an understanding of a chain involves examining its prompt first and then delving into its ._call method. As highlighted in the chain chapter, the prompt can be accessed by exploring the template within the prompt attribute.

In [6]:
# By default, ConversationChain uses ConversationBufferMemory(), which stores the raw conversation.

conversation_chain = ConversationChain(
    # memory: BaseMemory = ConversationBufferMemory,
    llm=llm,
)

In [7]:
print(conversation_chain.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


In this chain's prompt, the instruction is to converse with the user while trying to give truthful responses. On closer examination, the prompt contains a new element: "**history**". This is where our memory comes into play.

This chain combines an input from the user with the conversation history to generate a meaningful (and hopefully truthful) response.
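To make the role of the two placeholders concrete, here is a minimal sketch that fills the prompt by hand. It uses a shortened copy of the template printed above, and `build_prompt` is a hypothetical helper written for this illustration, not a LangChain function.

```python
# Shortened copy of the prompt template printed above (illustrative only).
TEMPLATE = "Current conversation:\n{history}\nHuman: {input}\nAI:"


def build_prompt(history: str, user_input: str) -> str:
    """Fill the {history} and {input} slots of the template."""
    return TEMPLATE.format(history=history, input=user_input)


# First turn: nothing has been said yet, so the history slot is empty.
first_prompt = build_prompt("", "Hello! How are you doing?")

# Later turn: earlier exchanges are pasted into the {history} slot.
history = "Human: Hello! How are you doing?\nAI: I'm functioning as expected."
second_prompt = build_prompt(history, "Do you remember my greeting?")
print(second_prompt)
```

The model itself stays stateless in this sketch. Any "memory" comes from what we choose to place in the history slot, which is the part the memory types below handle differently.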

## Memory types:


### 1. ConversationBufferMemory

This memory keeps a buffer of the entire previous conversation without modifying it at all.

In [16]:
conversation_buffer = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)

In [17]:
conversation_buffer("Hello! How are you doing?")

{'input': 'Hello! How are you doing?',
 'history': '',
 'response': "Hello! As an artificial intelligence, I don't have feelings, but I'm functioning as expected and ready to assist you. How can I help you today?"}

In [18]:
# Let's call the chain through the count_tokens function we created earlier so we can keep track of the token count.

count_tokens(
    conversation_buffer,
    "I am great. BTW can you tell me somthing about Generative AI?"
)

Spent a total of 336 tokens.


"Absolutely, I'd be happy to explain. Generative AI is a subset of artificial intelligence that focuses on creating new content. It's a type of machine learning, specifically a type of unsupervised learning, where the AI system learns from the data without explicit programming.\n\nThe most common types of Generative AI models are Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer models. \n\nGANs, for example, consist of two parts: a generator and a discriminator. The generator creates new data instances, while the discriminator evaluates them for authenticity; i.e., whether they come from the actual training dataset or were created by the generator. The goal of the generator is to fool the discriminator into thinking the instances it creates are from the actual dataset.\n\nGenerative AI has been used in a variety of applications, including creating realistic images, synthesizing human-like text, composing music, and even designing drug molecules f

In [19]:

count_tokens(
    conversation_buffer,
    "Awesome! Well I am Yashasvi Shukla and you tell me how AI is different from GenAI?"
)

Spent a total of 535 tokens.


'Nice to meet you, Yashasvi Shukla! Artificial Intelligence (AI) is a broad term that refers to machines or software that exhibit capabilities that mimic or replicate human intelligence. This can include tasks like learning, reasoning, problem-solving, perception, and language understanding.\n\nGenerative AI, on the other hand, is a specific subset of AI. While AI in general can include anything from a simple decision tree to a complex deep learning model, Generative AI specifically refers to models that can generate new, previously unseen content. This content can be anything from images to text to music, and is created based on patterns the model has learned from its training data.\n\nSo, in essence, all Generative AI is AI, but not all AI is Generative AI. Generative AI is a specialized tool with unique capabilities within the larger AI toolbox.'

In [20]:

count_tokens(
    conversation_buffer,
    "Great! I got it. I am willing to master Generative AI and I believe your knowedge will help me"
)

Spent a total of 739 tokens.


"That's fantastic, Yashasvi Shukla! I'm glad to hear of your interest in Generative AI. There are many resources available to help you learn more about this fascinating field. Online courses, textbooks, research papers, and tutorials can all be valuable resources. \n\nYou might want to start with foundational knowledge in machine learning and deep learning, as understanding these concepts is crucial for mastering Generative AI. Then, you can delve into specific topics like Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer models.\n\nRemember, practical experience is also important. Try to implement what you learn through projects. This will not only help you understand the concepts better but also give you hands-on experience.\n\nAnd of course, I'm here to assist you with any questions or concepts you might find difficult to understand. Happy learning!"

In [21]:

count_tokens(
    conversation_buffer,
    "Sure. Wanted to check back if you remember my name and aim?"
)

Spent a total of 788 tokens.


"Of course, Yashasvi Shukla! Your aim is to master Generative AI. I'm here to assist you on your learning journey."

We can clearly see that the ConversationBufferMemory remembers everything we discussed during the conversation. Let's look at the buffer:

In [22]:
print(conversation_buffer.memory.buffer)

Human: Hello! How are you doing?
AI: Hello! As an artificial intelligence, I don't have feelings, but I'm functioning as expected and ready to assist you. How can I help you today?
Human: I am great. BTW can you tell me somthing about Generative AI?
AI: Absolutely, I'd be happy to explain. Generative AI is a subset of artificial intelligence that focuses on creating new content. It's a type of machine learning, specifically a type of unsupervised learning, where the AI system learns from the data without explicit programming.

The most common types of Generative AI models are Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer models. 

GANs, for example, consist of two parts: a generator and a discriminator. The generator creates new data instances, while the discriminator evaluates them for authenticity; i.e., whether they come from the actual training dataset or were created by the generator. The goal of the generator is to fool the discriminator

In the buffer we can see that each exchange is stored in memory as part of the history.
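The behaviour seen in the buffer can be approximated in a few lines of plain Python. This is only a sketch of the idea: `SimpleBufferMemory` and its `save_turn` method are hypothetical names, not LangChain's implementation.

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Hypothetical stand-in for a buffer memory: stores every turn verbatim."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save_turn(self, human: str, ai: str) -> None:
        # Nothing is summarized or dropped; the list only ever grows.
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        # Render the turns in the same "Human: ... / AI: ..." layout as above.
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)


memory = SimpleBufferMemory()
memory.save_turn("Hello! How are you doing?", "I'm functioning as expected.")
memory.save_turn("What is Generative AI?", "A subset of AI that creates new content.")
print(memory.buffer)
```

Because the rendered buffer grows with every turn, so does the prompt, which matches the rising token counts reported by the calls above.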

### 2. ConversationSummaryMemory

ConversationBufferMemory() works well, but because it stores every exchange in raw form and appends every new input to the history, a long conversation can eventually hit the maximum token limit of the LLM.

This problem can be solved with ConversationSummaryMemory(). As the name suggests, instead of the raw text history it keeps a summarized version of the conversation history, where the summarization is performed by an LLM.

In [25]:
conversation_summary = ConversationChain(
    llm = llm,
    memory = ConversationSummaryMemory(llm = llm)
)

In [28]:
print(conversation_summary.memory.prompt.template)

Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:


Interesting! The template above shows that each new interaction is summarized and folded into the running summary.
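The update loop implied by the `{summary}` and `{new_lines}` placeholders can be sketched as follows. All names here are hypothetical, and `stub_summarize` is a deterministic placeholder for the LLM call, so it does not produce a real summary.

```python
from typing import Callable


def stub_summarize(summary: str, new_lines: str) -> str:
    """Placeholder for the LLM: a real summarizer would condense the text."""
    human_line = new_lines.splitlines()[0]        # "Human: ..."
    topic = " ".join(human_line.split()[1:5])     # first four words the human said
    note = f"The human said '{topic}...' and the AI replied."
    return f"{summary} {note}".strip()


class SimpleSummaryMemory:
    """Hypothetical sketch: keeps one running summary instead of the raw turns."""

    def __init__(self, summarize: Callable[[str, str], str]) -> None:
        self.summarize = summarize
        self.summary = ""

    def save_turn(self, human: str, ai: str) -> None:
        new_lines = f"Human: {human}\nAI: {ai}"
        # The previous summary and the new lines go in; a new summary comes out.
        self.summary = self.summarize(self.summary, new_lines)


memory = SimpleSummaryMemory(stub_summarize)
memory.save_turn("Hello! How are you doing?", "I'm functioning as expected.")
memory.save_turn("Tell me about Generative AI please", "It creates new content.")
print(memory.summary)
```

The raw turns are discarded after each update in this sketch. Only the summary string is carried forward into the next prompt.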


Let's pass the same inputs that we passed to the ConversationBufferMemory() chain.

In [32]:
conversation_summary("Hello! How are you doing?")


{'input': 'Hello! How are you doing?',
 'history': '',
 'response': "Hello! As an artificial intelligence, I don't have feelings, but I'm functioning as expected. How can I assist you today?"}

In [33]:
conversation_summary(
    "I am great. BTW can you tell me somthing about Generative AI?"
)


{'input': 'I am great. BTW can you tell me somthing about Generative AI?',
 'history': "The human greets the AI and asks how it's doing. The AI responds that, as an artificial intelligence, it doesn't have feelings but is functioning as expected and asks how it can assist the human.",
 'response': "Absolutely, I'd be happy to explain. Generative AI is a subset of artificial intelligence that focuses on creating new content. It's a type of machine learning, specifically a type of unsupervised learning, where the AI is given a set of data and then uses that data to generate new, similar content.\n\nOne of the most common examples of Generative AI is a Generative Adversarial Network (GAN). GANs consist of two parts: a generator, which creates new data, and a discriminator, which tries to distinguish the generated data from the real, original data. The two parts work together, with the generator constantly improving its data creation and the discriminator constantly improving its ability t

In [34]:
count_tokens(
    conversation_summary,
    "Awesome! Well I am Yashasvi Shukla and you tell me how AI is different from GenAI?"
)


Spent a total of 1130 tokens.


"Hello Yashasvi Shukla, nice to meet you! Artificial Intelligence (AI) is a broad term that refers to machines or software exhibiting intelligence. It's about creating systems that can perform tasks that would normally require human intelligence, such as understanding natural language, recognizing patterns, solving problems, and making decisions.\n\nGenerative AI, on the other hand, is a specific subset of AI. It's focused on creating new content or data that is similar to the input data it's been trained on. This could be anything from creating a new piece of music that sounds like it was composed by Beethoven, to generating realistic images of people who don't exist, or even writing a new chapter of a book in the style of a particular author.\n\nSo, in essence, while AI is the broad concept of machines mimicking human intelligence, Generative AI is a specific application of AI that focuses on creation of new, original content."

In [35]:
count_tokens(
    conversation_summary,
    "Great! I got it. I am willing to master Generative AI and I believe your knowedge will help me"
)


Spent a total of 1224 tokens.


"That's fantastic, Yashasvi! I'd be more than happy to assist you in your journey to mastering Generative AI. We can start with the basics and gradually move to more complex concepts. We can cover topics like machine learning, deep learning, neural networks, and of course, Generative Adversarial Networks (GANs). We can also explore various applications and case studies of Generative AI. Remember, understanding AI and machine learning is a process, and it's okay to take your time to fully grasp each concept. Let's start whenever you're ready!"

In [36]:
count_tokens(
    conversation_summary,
    "Sure. Wanted to check back if you remember my name and aim?"
)

Spent a total of 1051 tokens.


'Yes, your name is Yashasvi Shukla and your aim is to master Generative AI.'

In [38]:
conversation_summary.memory.buffer

"The human, Yashasvi Shukla, greets the AI and asks about its functionality and Generative AI. The AI explains that it's functioning as expected and that Generative AI is a subset of artificial intelligence that creates new content, using machine learning and Generative Adversarial Networks (GANs). It also differentiates between AI and Generative AI, stating that AI is a broad term for machines or software exhibiting intelligence, while Generative AI is a specific subset focused on creating new content. Yashasvi expresses a desire to master Generative AI, and the AI offers to assist, suggesting a gradual learning process covering various topics and applications of Generative AI. When Yashasvi checks if the AI remembers their name and aim, the AI confirms that it does."

You may wonder why one would choose this type of memory when each call spends more tokens than in the buffer example (1051 versus 788 tokens on the final query, for instance). The reason becomes apparent when we examine the buffer: although each call uses more tokens, because an extra LLM call is needed to update the summary, the stored history is much shorter. This lets us fit more interactions before hitting the maximum prompt length, making the chatbot more robust in long conversations.

### There are a few more memory types

### 3. ConversationBufferWindowMemory

A compelling alternative in such scenarios is the ConversationBufferWindowMemory. In this approach, we retain only the most recent interactions in memory and deliberately discard the oldest ones: a form of short-term memory, if you will. This reduces both the aggregate and the per-call token count. The window size is controlled by the parameter k. The trade-off is that anything older than the last k interactions is forgotten entirely.
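A sliding window of this kind can be sketched with `collections.deque`, which drops the oldest item automatically once `maxlen` is reached. `SimpleWindowMemory` is a hypothetical illustration of the idea, not the library class.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sketch: remembers only the last k turns."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen=k silently discards the oldest turn when full.
        self.turns = deque(maxlen=k)

    def save_turn(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = SimpleWindowMemory(k=2)
memory.save_turn("My name is Yashasvi.", "Nice to meet you!")
memory.save_turn("What is Generative AI?", "A subset of AI.")
memory.save_turn("How is it different from AI?", "It generates new content.")
print(memory.buffer)  # the turn containing the name has already been dropped
```

With k=2, the name given in the first turn is no longer in the prompt by the fourth turn, which is exactly the forgetting described above.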

### 4. ConversationSummaryBufferMemory

The ConversationSummaryBufferMemory maintains a summary of initial conversation elements while preserving an unprocessed recollection of the most recent interactions. This dual functionality ensures a comprehensive memory representation that encompasses both summarized insights from the early stages of the conversation and the raw details of recent exchanges.
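One way to picture this hybrid is the sketch below. As a simplifying assumption it folds old turns into the summary by turn count (`max_recent_turns`), and the library class may use a different criterion, such as a token budget. All names are hypothetical, and `stub_summarize` is a placeholder for the LLM call.

```python
from typing import Callable, List


def stub_summarize(summary: str, new_lines: str) -> str:
    """Placeholder for the LLM: keeps only the first line of the old turn."""
    first_line = new_lines.splitlines()[0]
    return f"{summary}; {first_line}" if summary else f"Earlier: {first_line}"


class SimpleSummaryBufferMemory:
    """Hypothetical sketch: raw recent turns plus a summary of older ones."""

    def __init__(self, summarize: Callable[[str, str], str],
                 max_recent_turns: int = 2) -> None:
        self.summarize = summarize
        self.max_recent_turns = max_recent_turns
        self.summary = ""
        self.recent: List[str] = []

    def save_turn(self, human: str, ai: str) -> None:
        self.recent.append(f"Human: {human}\nAI: {ai}")
        # Turns that overflow the raw window are folded into the summary.
        while len(self.recent) > self.max_recent_turns:
            oldest = self.recent.pop(0)
            self.summary = self.summarize(self.summary, oldest)

    @property
    def history(self) -> str:
        parts = [self.summary] if self.summary else []
        return "\n".join(parts + self.recent)


memory = SimpleSummaryBufferMemory(stub_summarize, max_recent_turns=2)
for human, ai in [("a", "1"), ("b", "2"), ("c", "3")]:
    memory.save_turn(human, ai)
print(memory.history)
```

Compared with a pure window, the oldest turn is compressed rather than lost, at the cost of a summarization step whenever the window overflows.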

### 5. ConversationKGMemory (knowledge graph memory)

Recently introduced, this memory type is exceptionally fascinating. It operates on the principle of a knowledge graph, identifying various entities and linking them in pairs through a predicate, forming (subject, predicate, object) triplets. This allows for the compression of substantial information into meaningful snippets that can serve as context for the model. For a more in-depth understanding of this memory type, you can refer to [this](https://apex974.com/articles/explore-langchain-support-for-knowledge-graph) blogpost.

In [39]:
conversation_kg = ConversationChain(
    llm=llm,
    memory=ConversationKGMemory(llm=llm)
)

In [41]:
conversation_kg("I am Yashasvi and I love exploring new places.")

{'input': 'I am Yashasvi and I love exploring new places.',
 'history': '',
 'response': "Hello Yashasvi! It's great to meet someone with a passion for exploration. There's so much to see and learn in the world. Are there any specific regions or types of places you're particularly interested in? For example, are you more drawn to natural landscapes like mountains and forests, or do you prefer exploring cities and their cultural landmarks?"}

In [42]:
count_tokens(
    conversation_kg,
    "Can you suggest some places which I can explore in Varanasi?"
)

Spent a total of 1911 tokens.


"Absolutely! Varanasi, also known as Benares, is one of the oldest living cities in the world and is rich in cultural and spiritual heritage. Here are some places you might want to explore:\n\n1. Kashi Vishwanath Temple: This is one of the most famous Hindu temples dedicated to Lord Shiva. It's located on the western bank of the holy river Ganges.\n\n2. Dashashwamedh Ghat: This is the main and probably the oldest ghat of Varanasi located on the Ganges. It's close to the Kashi Vishwanath Temple and is known for its evening aarti.\n\n3. Manikarnika Ghat: This is the primary cremation ghat in Varanasi. It's a powerful place that provides a stark reminder of the cycle of life and death.\n\n4. Assi Ghat: This is where pilgrims pay homage to Lord Shiva before bathing in the river. It's a peaceful place, perfect for watching the sunrise.\n\n5. Sarnath: Located just outside Varanasi, this is where Buddha gave his first sermon after attaining enlightenment.\n\n6. Ramnagar Fort: This 18th-centur

In [43]:
conversation_kg.memory.kg.get_triples()

[('Yashasvi', 'I', 'is'), ('Yashasvi', 'exploring new places', 'loves')]
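The triples above capture what the model extracted from the first message. Note that the tuples in this output appear to be ordered (subject, object, predicate) rather than (subject, predicate, object). To show how such triples can serve as compact context, here is a minimal sketch of a triple store in plain Python. `SimpleTripleStore` and its methods are hypothetical, the triple is entered by hand rather than extracted by an LLM, and the sketch uses the (subject, predicate, object) order from the description above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class SimpleTripleStore:
    """Hypothetical sketch: indexes triples by their subject entity."""

    def __init__(self) -> None:
        self._by_subject: Dict[str, List[Triple]] = defaultdict(list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._by_subject[subject].append((subject, predicate, obj))

    def facts_about(self, entity: str) -> List[str]:
        # Render each stored triple as a short sentence usable as prompt context.
        return [f"{s} {p} {o}" for s, p, o in self._by_subject.get(entity, [])]


store = SimpleTripleStore()
store.add("Yashasvi", "loves", "exploring new places")
print(store.facts_about("Yashasvi"))
```

When a later message mentions a known entity, only the facts about that entity need to be added to the prompt, instead of the whole conversation.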