# LangChain: Memory

Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation. At bare minimum, a conversational system should be able to access some window of past messages directly. A more complex system will need to have a world model that it is constantly updating, which allows it to do things like maintain information about entities and their relationships.

### Outline
* ConversationBufferMemory
* ConversationBufferWindowMemory
* ConversationTokenBufferMemory
* ConversationSummaryBufferMemory

![](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQcYZAK7bD8PPvjbAIe5tLk19SX9zKaSGHlVrE_vKrDk09tCgj7ujp_re7SpYHI8I7yUA&usqp=CAU)

## Setting Up

In [0]:
!pip install -qU openai
!pip install -qU langchain 
!pip install -qU langchain-openai
!pip install -qU tiktoken
dbutils.library.restartPython()

In [0]:
import os
import warnings
warnings.filterwarnings('ignore')
import tiktoken

import openai
from langchain_openai import AzureOpenAI
from langchain_openai import AzureChatOpenAI

from langchain.prompts import PromptTemplate
from langchain.prompts import ChatPromptTemplate

from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory
from langchain.memory import ConversationTokenBufferMemory
from langchain.memory import ConversationSummaryBufferMemory

from langchain.chains import ConversationChain

openai.api_type = "azure"
azure_endpoint = "https://rg-rbi-aa-aitest-dsacademy.openai.azure.com/"
#azure_endpoint = "https://chatgpt-summarization.openai.azure.com/"

openai.api_version = "2023-07-01-preview"
openai.api_key = os.environ["OPENAI_API_KEY"]

deployment_name = "model-gpt-35-turbo"
openai_model_name = "gpt-35-turbo"

client = AzureOpenAI(api_key=openai.api_key,
                     api_version=openai.api_version,
                     azure_endpoint=azure_endpoint,
                     )

chat = AzureChatOpenAI(azure_endpoint=azure_endpoint,
                       openai_api_version=openai.api_version,
                       deployment_name=deployment_name,
                       openai_api_key=os.environ["OPENAI_API_KEY"],
                       openai_api_type=openai.api_type,
                       )

## ConversationBufferMemory  

We start with the simplest memory type, which stores the whole conversation verbatim.  
Setting ```verbose=True``` on the chain prints the full prompt sent to the LLM, so we can see what happens behind the scenes:  

In [0]:
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=chat, memory = memory, verbose=True)
conversation.predict(input="Hi, my name is Renato")

In [0]:
conversation.predict(input="What is 1+1?")

In [0]:
conversation.predict(input="What is my name?")

##### Because the chain uses a memory module, we can inspect what it has stored:  

In [0]:
print(memory.buffer)

In [0]:
memory.load_memory_variables({})
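
To make the idea concrete, here is a minimal sketch in plain Python of what a buffer memory does conceptually: it stores every exchange verbatim and prepends the full history to the next prompt. This is not LangChain's implementation. The class `SimpleBufferMemory`, the helper `build_prompt`, and the sample AI reply are hypothetical names and data chosen for illustration; only the method names and the `history` key mirror the calls used in this notebook.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in for a buffer memory: keeps every exchange verbatim."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save_context(self, inputs, outputs):
        # Record one complete exchange.
        self.turns.append((inputs["input"], outputs["output"]))

    @property
    def buffer(self):
        # Render the stored exchanges as a plain-text transcript.
        lines = []
        for human_text, ai_text in self.turns:
            lines.append(f"Human: {human_text}")
            lines.append(f"AI: {ai_text}")
        return "\n".join(lines)

    def load_memory_variables(self, _inputs):
        return {"history": self.buffer}


def build_prompt(memory, new_input):
    # The model itself is stateless: it "remembers" only what is resent here.
    history = memory.load_memory_variables({})["history"]
    return f"{history}\nHuman: {new_input}\nAI:"


sketch_memory = SimpleBufferMemory()
sketch_memory.save_context({"input": "Hi, my name is Renato"},
                           {"output": "Hello Renato!"})
sketch_prompt = build_prompt(sketch_memory, "What is my name?")
print(sketch_prompt)
```

Because the whole transcript is resent on every call, the prompt grows with each exchange, which is the motivation for the bounded memory types that follow.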

##### Now let's start over with a fresh, empty memory object and add an exchange to it manually:  

In [0]:
memory = ConversationBufferMemory()

memory.save_context({"input": "Hi"}, 
                    {"output": "What's up"})

print(memory.buffer)

In [0]:
memory.load_memory_variables({})

##### Let's add even more context:  

In [0]:
memory.save_context({"input": "Not much, just hanging"}, 
                    {"output": "Cool"})

memory.load_memory_variables({})

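##### Now let's check whether the model still knows information from the earlier conversation. Because the memory object was recreated, it holds only the two exchanges added manually, so the name is no longer available:  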

In [0]:
conversation = ConversationChain(llm=chat,
                                 memory = memory,
                                 verbose=True
                                 )
conversation.predict(input="What is my name?")

## ConversationBufferWindowMemory  

With a window memory, the parameter ```k``` sets how many of the most recent exchanges (one human message plus one AI reply each) are passed along as context:  

In [0]:
memory = ConversationBufferWindowMemory(k=1)

memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

memory.load_memory_variables({})

##### With ```k=1```, only the most recent interaction is returned. Now let's see how this behaves with the LLM:  

In [0]:
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(llm=chat, memory = memory, verbose=False)
conversation.predict(input="Hi, my name is Renato")

In [0]:
conversation.predict(input="What is 1+1?")

In [0]:
conversation.predict(input="What is my name?")
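
The windowing behaviour can be sketched in plain Python with a bounded `deque`. This is an illustration under simplifying assumptions, not LangChain's code: the hypothetical `SimpleWindowMemory` discards old exchanges outright, and the AI replies are made-up sample data.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sketch: remembers only the k most recent exchanges."""

    def __init__(self, k=1):
        # A deque with maxlen=k drops its oldest item when a new one is appended.
        self.turns = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs):
        lines = []
        for human_text, ai_text in self.turns:
            lines.append(f"Human: {human_text}")
            lines.append(f"AI: {ai_text}")
        return {"history": "\n".join(lines)}


window_sketch = SimpleWindowMemory(k=1)
window_sketch.save_context({"input": "Hi, my name is Renato"},
                           {"output": "Hello Renato!"})
window_sketch.save_context({"input": "What is 1+1?"},
                           {"output": "1+1 is 2."})
window_history = window_sketch.load_memory_variables({})["history"]
print(window_history)
```

After the second exchange, the introduction containing the name has fallen out of the window, which is why a model given only this history cannot answer "What is my name?".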

## ConversationTokenBufferMemory

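Similarly, we can cap the buffer by number of tokens instead of number of exchanges. The ```llm``` argument is required because token counts depend on the model's tokenizer:  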

In [0]:
memory = ConversationTokenBufferMemory(llm=chat, max_token_limit=30)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})

memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Something Beautiful!"})

memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

memory.load_memory_variables({})
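
The pruning logic can be sketched as follows. This is a simplified illustration, not LangChain's implementation: the hypothetical `count_tokens` helper treats each whitespace-separated word as one token, whereas a real tokenizer would give different counts, and `SimpleTokenBufferMemory` is an invented name. The sketch drops the oldest messages one at a time until the total fits the limit.

```python
def count_tokens(text):
    # Crude assumption for illustration: one token per whitespace-separated word.
    return len(text.split())


class SimpleTokenBufferMemory:
    """Hypothetical sketch: keeps the newest messages that fit a token budget."""

    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.messages = []  # list of (role, text) pairs, oldest first

    def _total_tokens(self):
        return sum(count_tokens(text) for _, text in self.messages)

    def save_context(self, inputs, outputs):
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        # Drop the oldest messages until the buffer fits the budget again.
        while self.messages and self._total_tokens() > self.max_token_limit:
            self.messages.pop(0)

    def load_memory_variables(self, _inputs):
        lines = [f"{role}: {text}" for role, text in self.messages]
        return {"history": "\n".join(lines)}


token_sketch = SimpleTokenBufferMemory(max_token_limit=6)
token_sketch.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
token_sketch.save_context({"input": "Backpropagation is what?"},
                          {"output": "Something Beautiful!"})
token_sketch.save_context({"input": "Chatbots are what?"},
                          {"output": "Charming!"})
token_history = token_sketch.load_memory_variables({})["history"]
print(token_history)
```

With a budget of six word-tokens, only the last three messages survive. Note that the cut can fall in the middle of an exchange, leaving an AI reply without the question that prompted it.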

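## ConversationSummaryBufferMemory  

The class used here, ```ConversationSummaryBufferMemory```, keeps the most recent messages verbatim up to ```max_token_limit``` and uses the LLM to fold older messages into a running summary:  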

In [0]:
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your PowerPoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=chat, max_token_limit=100)

memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

memory.load_memory_variables({})

In [0]:
conversation = ConversationChain(llm=chat, memory = memory, verbose=True)
conversation.predict(input="What would be a good demo to show?")

In [0]:
memory.load_memory_variables({})
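
The combination of a verbatim buffer and a running summary can be sketched in plain Python. This is an illustration under stated assumptions, not LangChain's implementation: `SimpleSummaryBufferMemory` is a hypothetical class, words stand in for tokens, the shortened schedule text is made-up sample data, and `fake_summarize` replaces the LLM call with a trivial rule (keep the first five words of each pruned message) so the example runs offline.

```python
def count_words(text):
    # Assumption: words stand in for tokens.
    return len(text.split())


def fake_summarize(previous_summary, pruned_messages):
    # Stand-in for an LLM call: keep the first five words of each pruned message.
    pieces = [previous_summary] if previous_summary else []
    for role, text in pruned_messages:
        pieces.append(f"{role} said: {' '.join(text.split()[:5])}")
    return " | ".join(pieces)


class SimpleSummaryBufferMemory:
    """Hypothetical sketch: recent messages verbatim, older ones summarized."""

    def __init__(self, summarize, max_token_limit):
        self.summarize = summarize
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages = []  # list of (role, text) pairs, oldest first

    def _total(self):
        return sum(count_words(text) for _, text in self.messages)

    def save_context(self, inputs, outputs):
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        pruned = []
        while self.messages and self._total() > self.max_token_limit:
            pruned.append(self.messages.pop(0))
        if pruned:
            # Fold the pruned messages into the running summary.
            self.summary = self.summarize(self.summary, pruned)

    def load_memory_variables(self, _inputs):
        lines = [f"System: {self.summary}"] if self.summary else []
        lines.extend(f"{role}: {text}" for role, text in self.messages)
        return {"history": "\n".join(lines)}


summary_sketch = SimpleSummaryBufferMemory(fake_summarize, max_token_limit=12)
summary_sketch.save_context({"input": "Hello"}, {"output": "What's up"})
summary_sketch.save_context({"input": "Not much, just hanging"},
                            {"output": "Cool"})
summary_sketch.save_context(
    {"input": "What is on the schedule today?"},
    {"output": "Meeting at 8am, then lunch at noon with a customer."})
summary_history = summary_sketch.load_memory_variables({})["history"]
print(summary_history)
```

In this run the first five messages exceed the budget and are condensed into the `System:` line, while the final reply stays verbatim. With a real LLM as the summarizer, the condensed line would be a natural-language summary rather than truncated snippets.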