# LangChain: Memory

## Outline
* ConversationBufferMemory: a chat log that records every message in the conversation so far.
* ConversationBufferWindowMemory: remembers only the most recent `k` exchanges, not the entire conversation.
* ConversationTokenBufferMemory: instead of keeping a fixed number of exchanges, it keeps as many recent messages as fit within a budget of “tokens.”
* ConversationSummaryBufferMemory: instead of keeping every word, it keeps recent messages verbatim and condenses older ones into a short running summary.

## Install Modules

In [None]:
!pip install litellm

In [None]:
!pip install langchain

In [None]:
!pip install langchain_community

In [None]:
!pip install langchain_openai

## ConversationBufferMemory

In [None]:
import os

import warnings
warnings.filterwarnings('ignore')

In [None]:
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [None]:
# Set up a conversational model with a buffer memory to track full conversation history.
# api_key and base_url are left blank here; supply the values for your own endpoint.
llm = ChatOpenAI(model="qwen2.5",
                 api_key="",
                 base_url="",
                 temperature=0.0)

# Initialize memory and conversation chain
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

In [None]:
# Simulate conversation inputs
conversation.predict(input="Hi, my name is Andrew")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")
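
The model can answer the last question only because the chain pastes the stored history into every prompt it sends. The sketch below illustrates that step in plain Python. `TEMPLATE` and `build_prompt` are hypothetical names, the template text is only modeled on what `verbose=True` prints (the exact wording may differ between LangChain versions), and the AI reply in `history` is made up for illustration.

```python
# Hypothetical template, loosely modeled on the prompt printed by verbose=True.
TEMPLATE = (
    "The following is a friendly conversation between a human and an AI.\n\n"
    "Current conversation:\n{history}\nHuman: {input}\nAI:"
)


def build_prompt(history, user_input):
    """Fill the template with the remembered history and the new user turn."""
    return TEMPLATE.format(history=history, input=user_input)


# Illustrative history; the AI line is invented, not real model output.
history = "Human: Hi, my name is Andrew\nAI: Hello Andrew!"
prompt = build_prompt(history, "What is my name?")
print(prompt)
```

Because the earlier turn is part of the prompt text, the model sees the name even though each call to the LLM is stateless.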

In [None]:
# Display entire conversation history stored in the buffer
print(memory.buffer)

In [None]:
# Demonstrate saving and loading conversation states
memory.load_memory_variables({})

In [None]:
memory = ConversationBufferMemory()

In [None]:
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})

In [None]:
print(memory.buffer)

In [None]:
memory.load_memory_variables({})

In [None]:
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

In [None]:
memory.load_memory_variables({})
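
To make the save/load cycle above concrete, here is a minimal plain-Python sketch of a full-history buffer. `SimpleBufferMemory` is a hypothetical stand-in, not LangChain's implementation; it only mirrors the `save_context` and `load_memory_variables` method names and assumes the `Human:` / `AI:` prefixes.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in for a full-history buffer (not LangChain code)."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs, oldest first

    def save_context(self, inputs, outputs):
        # Record one exchange: what the human said and what the AI replied.
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs):
        # Render every stored exchange as prefixed lines under a "history" key.
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return {"history": "\n".join(lines)}


sketch_memory = SimpleBufferMemory()
sketch_memory.save_context({"input": "Hi"}, {"output": "What's up"})
sketch_memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
print(sketch_memory.load_memory_variables({})["history"])
```

Nothing is ever discarded here, so the history (and the prompt built from it) grows with every turn.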

## ConversationBufferWindowMemory

In [None]:
from langchain.memory import ConversationBufferWindowMemory

In [None]:
# Initialize memory with a fixed window (k=1) to retain only the latest interaction
memory = ConversationBufferWindowMemory(k=1)

In [None]:
# Store recent conversation interactions within the set window size
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

In [None]:
memory.load_memory_variables({})  # Load only the latest context due to window size
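
The windowing behavior can be sketched in plain Python with a bounded queue. `SimpleWindowMemory` is a hypothetical illustration of the idea, not the library's code; it assumes that one "interaction" is a human/AI pair and that `k` counts such pairs.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sketch: keep only the last k (human, ai) exchanges."""

    def __init__(self, k=1):
        # A deque with maxlen=k drops its oldest item when a new one is appended.
        self.turns = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs):
        lines = [f"Human: {human}\nAI: {ai}" for human, ai in self.turns]
        return {"history": "\n".join(lines)}


window_sketch = SimpleWindowMemory(k=1)
window_sketch.save_context({"input": "Hi"}, {"output": "What's up"})
window_sketch.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
print(window_sketch.load_memory_variables({})["history"])
```

With `k=1` the first exchange is gone after the second save, which is why a chain using this kind of memory can lose facts stated earlier in the conversation.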

In [None]:
# Verify the memory’s windowed effect in a conversation
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

In [None]:
conversation.predict(input="Hi, my name is Andrew")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")

## ConversationTokenBufferMemory

*Skipped during the run because the tokenizer library tiktoken does not support the Qwen2.5 model.*

In [None]:
from langchain.memory import ConversationTokenBufferMemory

In [None]:
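import os

# Assumption: the model is served through an OpenAI-compatible endpoint,
# so ChatOpenAI is used as the client, as in the first section.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="qwen2.5",
                 # Read the key from the environment rather than hardcoding it;
                 # LITELLM_API_KEY is a placeholder variable name.
                 api_key=os.environ["LITELLM_API_KEY"],
                 base_url="https://litellm.level-up.co.id/",
                 temperature=0.0)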

In [None]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})

In [None]:
memory.load_memory_variables({})
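
Since the cells above depend on a tokenizer for the model, the pruning logic itself can be illustrated without one. The sketch below is a hypothetical plain-Python version: `SimpleTokenBufferMemory` and `count_tokens` are made-up names, and counting whitespace-separated words is only a rough stand-in for a real tokenizer, so the numbers will not match what a model-specific tokenizer reports.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


class SimpleTokenBufferMemory:
    """Hypothetical sketch: keep the newest messages that fit a token budget."""

    def __init__(self, max_token_limit, token_counter=count_tokens):
        self.max_token_limit = max_token_limit
        self.token_counter = token_counter
        self.messages = []  # (role, text) tuples, oldest first

    def _total_tokens(self):
        return sum(self.token_counter(text) for _, text in self.messages)

    def save_context(self, inputs, outputs):
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        # Drop the oldest messages one at a time until the buffer fits.
        while self.messages and self._total_tokens() > self.max_token_limit:
            self.messages.pop(0)

    def load_memory_variables(self, _inputs):
        lines = [f"{role}: {text}" for role, text in self.messages]
        return {"history": "\n".join(lines)}


token_sketch = SimpleTokenBufferMemory(max_token_limit=10)
token_sketch.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
token_sketch.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
token_sketch.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
print(token_sketch.load_memory_variables({})["history"])
```

Passing a different `token_counter` is the hook for plugging in a tokenizer that does understand the model in use.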

## ConversationSummaryBufferMemory

In [None]:
from langchain.memory import ConversationSummaryBufferMemory

In [None]:
# Create sample context with a long message
schedule = "There is a meeting at 8am with your product team. \
You will need your PowerPoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

# Initialize memory with summary-based retention (condenses prior exchanges into summaries)
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)

# Save initial context and more complex interactions
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": schedule})

In [None]:
memory.load_memory_variables({})
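
The combination of a token budget and a running summary can be sketched without an LLM. In the hypothetical code below, `SimpleSummaryBufferMemory`, `count_tokens`, and `stub_summarize` are made-up names; the stub merely keeps the first three words of each pruned message where a real setup would ask the model for a summary, and word counting stands in for real tokenization.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def stub_summarize(previous_summary, pruned):
    # Stand-in for an LLM call: keep the first three words of each pruned message.
    snippets = [f"{role} said: {' '.join(text.split()[:3])}..." for role, text in pruned]
    return " ".join(filter(None, [previous_summary] + snippets))


class SimpleSummaryBufferMemory:
    """Hypothetical sketch: recent messages verbatim, older ones summarized."""

    def __init__(self, max_token_limit, summarize=stub_summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.messages = []  # (role, text) tuples, oldest first

    def save_context(self, inputs, outputs):
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        pruned = []
        # Move the oldest messages out of the buffer until it fits the budget.
        while self.messages and sum(count_tokens(t) for _, t in self.messages) > self.max_token_limit:
            pruned.append(self.messages.pop(0))
        if pruned:
            # Fold everything that was pruned into the running summary.
            self.summary = self.summarize(self.summary, pruned)

    def load_memory_variables(self, _inputs):
        lines = [f"System: {self.summary}"] if self.summary else []
        lines += [f"{role}: {text}" for role, text in self.messages]
        return {"history": "\n".join(lines)}


summary_sketch = SimpleSummaryBufferMemory(max_token_limit=8)
summary_sketch.save_context({"input": "Hello"}, {"output": "What's up"})
summary_sketch.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
summary_sketch.save_context({"input": "What is on the schedule today?"},
                            {"output": "Meeting at 8am, lunch at noon."})
print(summary_sketch.load_memory_variables({})["history"])
```

The first two exchanges fit the budget and stay verbatim; the long third exchange pushes the older messages out of the buffer and into the summary line.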

In [None]:
# Engage in conversation using summarized memory
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

In [None]:
conversation.predict(input="What would be a good demo to show?")

In [None]:
memory.load_memory_variables({})