# Memory

LangChain lets us use previous interactions to inform future ones by storing them in memory. There are many types of memory, but we are interested in the following:

- `ConversationBufferMemory`: Stores all previous interactions.
- `ConversationBufferWindowMemory`: Stores a fixed number of the most recent interactions.
- `ConversationTokenBufferMemory`: Stores the most recent interactions that fit within a token limit.
- `ConversationSummaryMemory`: Stores previous interactions as a running summary.


## Conversation Buffer

`ConversationBufferMemory` stores messages as they are saved and then extracts them into a variable that can be passed to a prompt.

In [None]:
from langchain.memory import ConversationBufferMemory

In [None]:
memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.load_memory_variables({})
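
To make the `save_context` / `load_memory_variables` round trip concrete, here is a minimal plain-Python sketch of a buffer memory. `SimpleBufferMemory` is a hypothetical stand-in written for illustration, not LangChain's implementation; the `history` key and the `Human:`/`AI:` prefixes are assumptions modelled on the library's defaults.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in for a buffer memory; not LangChain's actual code."""

    def __init__(self):
        # Every (role, text) pair is kept; nothing is ever dropped.
        self.messages = []

    def save_context(self, inputs, outputs):
        # Record one interaction: the human input followed by the AI output.
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))

    def load_memory_variables(self, _inputs):
        # Render the full transcript as a single string under one key.
        transcript = "\n".join(f"{role}: {text}" for role, text in self.messages)
        return {"history": transcript}


sketch = SimpleBufferMemory()
sketch.save_context({"input": "hi"}, {"output": "whats up"})
print(sketch.load_memory_variables({}))
```

Because nothing is ever removed, the transcript in this sketch grows with every interaction, which is the limitation the other memory types address.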

Using `return_messages=True` returns the chat message history as a list of messages rather than a single string. See [different types of messages](https://python.langchain.com/docs/modules/model_io/models/chat/#messages).

In [None]:
memory = ConversationBufferMemory(return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.load_memory_variables({})

In [None]:
memory.chat_memory

In [None]:
memory.chat_memory.messages

Let's use the conversation buffer in a chain.

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain


llm = ChatOpenAI(temperature=0)
memory = ConversationBufferMemory(return_messages=True)
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=memory,
)

In [None]:
conversation.predict(input="Hi there!")

In [None]:
conversation.predict(input="I'm doing well. How are you?")

In [None]:
memory.load_memory_variables({})

## Conversation Buffer Window

`ConversationBufferWindowMemory` keeps a list of the interactions of the conversation over time but only uses the last `k` of them. This gives a sliding window over the most recent interactions, so the buffer does not grow too large.

In [None]:
from langchain.memory import ConversationBufferWindowMemory

# Save the last 1 interaction
memory = ConversationBufferWindowMemory(k=1, return_messages=True)

memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

memory.load_memory_variables({})
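
The sliding-window idea can be sketched in plain Python with a bounded `deque`. `SimpleWindowMemory` is a hypothetical class for illustration only; it discards old interactions as soon as they fall out of the window, which may differ from how LangChain handles the underlying history.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sliding-window memory; a sketch, not LangChain's code."""

    def __init__(self, k):
        # A deque with maxlen=k silently discards the oldest interaction
        # once more than k have been saved.
        self.interactions = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        # One interaction is a (human input, AI output) pair.
        self.interactions.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs):
        # Render only the interactions still inside the window.
        lines = []
        for human, ai in self.interactions:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return {"history": "\n".join(lines)}


window = SimpleWindowMemory(k=1)
window.save_context({"input": "hi"}, {"output": "whats up"})
window.save_context({"input": "not much you"}, {"output": "not much"})
print(window.load_memory_variables({}))
```

With `k=1`, only the second interaction survives in this sketch, mirroring the behaviour described above.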

Let's use the buffer in a chain.

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain

llm = ChatOpenAI(temperature=0)
memory = ConversationBufferWindowMemory(k=1)
conversation_with_summary = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True,
)

conversation_with_summary.predict(input="Hi, what's up?")

In [None]:
conversation_with_summary.predict(input="I'm doing well.")

At this point, notice that the first interaction is no longer in the buffer. With `k=1`, the buffer keeps only the most recent interaction.

In [None]:
conversation_with_summary.predict(input="My name is Chris.")

## Conversation Token Buffer

`ConversationTokenBufferMemory` keeps a buffer of recent interactions in memory, and uses token length rather than number of interactions to determine when to flush interactions.

In [None]:
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(
    llm=llm,
    max_token_limit=10,
    return_messages=True,
)

memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
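
Token-based pruning can be sketched as follows. This is a simplified illustration, not LangChain's implementation: `SimpleTokenBufferMemory` and `count_tokens` are hypothetical names, and counting whitespace-separated words is an assumption standing in for the model's real tokenizer.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words stand in for model tokens.
    return len(text.split())


class SimpleTokenBufferMemory:
    """Hypothetical token-limited buffer; a sketch, not LangChain's code."""

    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.messages = []

    def _total_tokens(self):
        return sum(count_tokens(text) for _, text in self.messages)

    def save_context(self, inputs, outputs):
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        # Drop the oldest messages until the buffer fits the token budget.
        while self.messages and self._total_tokens() > self.max_token_limit:
            self.messages.pop(0)


token_memory = SimpleTokenBufferMemory(max_token_limit=5)
token_memory.save_context({"input": "hi"}, {"output": "whats up"})
token_memory.save_context({"input": "not much you"}, {"output": "not much"})
print(token_memory.messages)
```

In this sketch the first interaction fits the budget on its own (3 words), but saving the second pushes the total to 8, so the two oldest messages are dropped to get back under the limit of 5.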

Let's use the buffer in a chain.

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain

llm = ChatOpenAI(temperature=0)
memory = ConversationTokenBufferMemory(
    llm=llm,
    max_token_limit=60,
    return_messages=True,
)

conversation_with_summary = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True,
)

conversation_with_summary.predict(input="Hi, what's up?")

In the next call, notice that the first interaction is no longer in the buffer. Keeping it would push the buffer past the token limit, so the oldest messages are dropped.

In [None]:
conversation_with_summary.predict(input="I'm doing well.")

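## Conversation Summary

`ConversationSummaryMemory` condenses the conversation into a running summary as it progresses, instead of storing every message verbatim.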

In [None]:
from langchain.memory import ConversationSummaryMemory, ChatMessageHistory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
memory = ConversationSummaryMemory(llm=llm, return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.load_memory_variables({})

Access the summary using the `buffer` attribute.

In [None]:
memory.buffer

Access the interactions using the `chat_memory` attribute.

In [None]:
memory.chat_memory.messages

We can also call the `predict_new_summary` method directly, passing the messages to summarize and the previous summary.

In [None]:
messages = memory.chat_memory.messages
previous_summary = ""
memory.predict_new_summary(messages, previous_summary)
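
The idea of folding new messages into a previous summary can be sketched without an LLM. Everything here is hypothetical and for illustration only: `SimpleSummaryMemory` is not LangChain's class, and `stub_summarizer` simply concatenates text where a real system would ask a model to write a condensed summary.

```python
def stub_summarizer(previous_summary, new_lines):
    # Assumption: a real system would ask an LLM to merge these;
    # here we simply concatenate so the example runs offline.
    parts = [previous_summary] if previous_summary else []
    parts.extend(new_lines)
    return " ".join(parts)


class SimpleSummaryMemory:
    """Hypothetical running-summary memory; a sketch, not LangChain's code."""

    def __init__(self, summarize):
        self.summarize = summarize
        self.buffer = ""  # the running summary

    def save_context(self, inputs, outputs):
        new_lines = [
            f"The human says: {inputs['input']}.",
            f"The AI replies: {outputs['output']}.",
        ]
        # Each update depends only on the previous summary and the new lines.
        self.buffer = self.summarize(self.buffer, new_lines)


summary_memory = SimpleSummaryMemory(stub_summarizer)
summary_memory.save_context({"input": "hi"}, {"output": "whats up"})
summary_memory.save_context({"input": "not much you"}, {"output": "not much"})
print(summary_memory.buffer)
```

The point of the design is that each update needs only the previous summary and the newest messages, so the full transcript never has to be sent back to the summarizer.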

In [None]:
history = ChatMessageHistory()
history.add_user_message("hi")
history.add_ai_message("hi there!")

memory = ConversationSummaryMemory.from_messages(
    llm=ChatOpenAI(temperature=0),
    chat_memory=history,
    return_messages=True
)

memory.buffer

Optionally, you can speed up initialization by passing a previously generated summary as `buffer`, which avoids regenerating the summary.

In [None]:
memory = ConversationSummaryMemory(
    llm=ChatOpenAI(temperature=0),
    buffer="The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.",
    chat_memory=history,
    return_messages=True
)

Let's use the summary memory in a chain.

In [None]:
from langchain.llms import OpenAI
from langchain.chains import ConversationChain

llm = OpenAI(temperature=0)
conversation_with_summary = ConversationChain(
    llm=llm,
    # The summary memory gets its own LLM for writing the running summary.
    memory=ConversationSummaryMemory(llm=OpenAI()),
    verbose=True,
)

conversation_with_summary.predict(input="Hi, what's up?")

In [None]:
conversation_with_summary.predict(input="I'm doing well.")

In [None]:
conversation_with_summary.predict(input="How are you?")

In [None]:
conversation_with_summary.predict(input="What are your interests?")

In [None]:
conversation_with_summary.predict(input="What is buffer memory?")