## Conversation Buffer Memory
> Simply stores all previous messages in the conversation.

**Cons**: The memory keeps growing because it stores every exchange between the user and the API. This is neither sustainable nor cost-effective: each call to `save_context` appends to the history, so it grows without bound.

In [3]:
# Conversation Buffer Memory
from langchain.memory import ConversationBufferMemory

# Regardless of memory type, they use the same API. 
memory = ConversationBufferMemory(return_messages=True)  # set return_messages=False if it is not for a chat model
memory.save_context({"input": "Hi"},{"output": "How are you?"})
memory.load_memory_variables({})


{'history': [HumanMessage(content='Hi'), AIMessage(content='How are you?')]}
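
To make the growth concern concrete, here is a minimal pure-Python sketch of an unbounded buffer. `SimpleBuffer` is a hypothetical stand-in, not LangChain's implementation; it only mimics the `save_context` / `load_memory_variables` call shape used above.

```python
class SimpleBuffer:
    """Hypothetical stand-in for an unbounded conversation buffer."""

    def __init__(self):
        self.messages = []  # list of (role, content) tuples, never pruned

    def save_context(self, inputs, outputs):
        # Each call appends one human message and one AI message.
        self.messages.append(("human", str(inputs["input"])))
        self.messages.append(("ai", str(outputs["output"])))

    def load_memory_variables(self, _):
        # Return a copy so callers cannot mutate the stored history.
        return {"history": list(self.messages)}


buffer = SimpleBuffer()
for turn in range(100):
    buffer.save_context({"input": f"question {turn}"}, {"output": f"answer {turn}"})

history = buffer.load_memory_variables({})["history"]
print(len(history))  # 200: two messages per exchange, nothing is dropped
```

Every stored message is sent back to the model on the next call, so the prompt size (and cost) rises with the length of the conversation.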

## Conversation Buffer Window Memory
> It holds only the most recent messages. At a high level, this memory keeps the latest exchanges up to the window size you specify.

**Cons**: The chatbot only remembers the most recent exchanges; anything older than the window is forgotten.

In [4]:
# Import ConversationBufferWindowMemory
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(return_messages=True, k=4)  # k = number of most recent exchanges to keep

for i in range(10):
    # Message content is text, so convert the integer to a string explicitly.
    memory.save_context({"input": str(i)}, {"output": str(i)})

memory.load_memory_variables({})


{'history': [HumanMessage(content='6'),
  AIMessage(content='6'),
  HumanMessage(content='7'),
  AIMessage(content='7'),
  HumanMessage(content='8'),
  AIMessage(content='8'),
  HumanMessage(content='9'),
  AIMessage(content='9')]}
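
The windowing behavior can be pictured with a fixed-size queue. The sketch below uses `collections.deque` from the standard library; it is an illustration of the idea, not how LangChain implements the class, and the `(role, content)` tuples are an assumed simplification of the message objects.

```python
from collections import deque

k = 4  # number of exchanges to keep, matching k=4 above

# Each exchange contributes two messages, so the window holds 2 * k entries.
window = deque(maxlen=2 * k)

for i in range(10):
    # Appending to a full deque silently discards the oldest entry.
    window.append(("human", str(i)))
    window.append(("ai", str(i)))

print(list(window))
```

Only exchanges 6 through 9 survive, which mirrors the output above.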

## ConversationSummaryMemory
> Creates a summary of the conversation over time. This type of memory pays off especially when keeping a long conversation verbatim would take up too many tokens.

In [11]:
from langchain.memory import ConversationSummaryMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(
    temperature = 0.1,
    model = 'gpt-4o-mini',
)

memory = ConversationSummaryMemory(llm=llm, return_messages=True)

def add_message(input, output):
    memory.save_context({"input": input}, {"output": output})

add_message("I am craving something sweet!", "Same here. What do you want?")
add_message("I will probably go get ice cream. Do you want to tag along?", "That's a good idea.")
add_message("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")

memory.load_memory_variables({})

{'history': [SystemMessage(content="The human expresses a craving for something sweet, and the AI shares that it feels the same way and asks what the human wants. The human mentions they will probably go get ice cream and invites the AI to tag along, to which the AI responds that it's a good idea. The human reveals that strawberry cheesecake is their favorite flavor, while the AI states that it prefers coconut mango.")]}
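
For a conversation this short, the summary is not actually smaller than the raw messages. The sketch below compares the two using whitespace-separated word counts, which is only a crude proxy for tokens (an assumption made for illustration; a real tokenizer will give different numbers).

```python
# The six raw messages saved above.
raw_messages = [
    "I am craving something sweet!",
    "Same here. What do you want?",
    "I will probably go get ice cream. Do you want to tag along?",
    "That's a good idea.",
    "Strawberry cheesecake is my favorite flavor",
    "Well, I like coconut mango more.",
]

# The summary text returned in the output above.
summary = (
    "The human expresses a craving for something sweet, and the AI shares "
    "that it feels the same way and asks what the human wants. The human "
    "mentions they will probably go get ice cream and invites the AI to tag "
    "along, to which the AI responds that it's a good idea. The human reveals "
    "that strawberry cheesecake is their favorite flavor, while the AI states "
    "that it prefers coconut mango."
)

# Word counts as a rough stand-in for token counts.
raw_words = sum(len(message.split()) for message in raw_messages)
summary_words = len(summary.split())

print(f"raw: {raw_words} words, summary: {summary_words} words")
```

The benefit shows up only once the conversation is long enough that the summary stays compact while the verbatim history keeps growing.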

## ConversationSummaryBufferMemory
> It keeps a buffer of recent interactions in memory, but rather than just completely flushing old interactions it compiles them into a summary and uses both. It uses token length rather than number of interactions to determine when to flush interactions.

### Notes:
`model='gpt-4o-mini'` is not supported with this memory type in this setup, so the cell below relies on the default `ChatOpenAI` model.

In [15]:
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(
    temperature = 0.1,
)

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=150, return_messages=True)

def add_message(input, output):
    memory.save_context({"input": input}, {"output": output})

add_message("I am craving something sweet!", "Same here. What do you want?")
add_message("I will probably go get ice cream. Do you want to tag along?", "That's a good idea.")
add_message("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")

memory.load_memory_variables({})

{'history': [HumanMessage(content='I am craving something sweet!'),
  AIMessage(content='Same here. What do you want?'),
  HumanMessage(content='I will probably go get ice cream. Do you want to tag along?'),
  AIMessage(content="That's a good idea."),
  HumanMessage(content='Strawberry cheesecake is my favorite flavor'),
  AIMessage(content='Well, I like coconut mango more.')]}

In [16]:
# The history is still under the token limit, so add more messages until it exceeds the limit.
add_message("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")
add_message("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")
add_message("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")
memory.load_memory_variables({})

# Observe that once the max token limit is exceeded, the memory generates a SystemMessage summarizing the oldest part of the conversation.

{'history': [SystemMessage(content='The human expresses a craving for something sweet, to which the AI responds that it also wants something sweet and asks the human what they desire.'),
  HumanMessage(content='I will probably go get ice cream. Do you want to tag along?'),
  AIMessage(content="That's a good idea."),
  HumanMessage(content='Strawberry cheesecake is my favorite flavor'),
  AIMessage(content='Well, I like coconut mango more.'),
  HumanMessage(content='Strawberry cheesecake is my favorite flavor'),
  AIMessage(content='Well, I like coconut mango more.'),
  HumanMessage(content='Strawberry cheesecake is my favorite flavor'),
  AIMessage(content='Well, I like coconut mango more.'),
  HumanMessage(content='Strawberry cheesecake is my favorite flavor'),
  AIMessage(content='Well, I like coconut mango more.')]}
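
The prune-then-summarize behavior seen above can be sketched without an LLM. Everything in this example is hypothetical: `SummaryBufferSketch` is not the LangChain class, `count_tokens` counts whitespace-separated words instead of real tokens, and the summary is a placeholder string where a real implementation would call the model.

```python
def count_tokens(text):
    # Crude proxy: one token per whitespace-separated word.
    return len(text.split())


class SummaryBufferSketch:
    """Hypothetical sketch: keep recent messages verbatim, summarize the rest."""

    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.summary = ""    # running summary of pruned messages
        self.summarized = 0  # how many messages have been folded into it
        self.buffer = []     # recent (role, content) messages kept verbatim

    def save_context(self, user_text, ai_text):
        self.buffer += [("human", user_text), ("ai", ai_text)]
        pruned = []
        # Drop the oldest messages until the verbatim buffer fits the limit.
        while self.buffer and sum(count_tokens(t) for _, t in self.buffer) > self.max_token_limit:
            pruned.append(self.buffer.pop(0))
        if pruned:
            self.summarized += len(pruned)
            # Placeholder for an LLM call that would merge `pruned` into the summary.
            self.summary = f"(summary of {self.summarized} earlier messages)"

    def load(self):
        head = [("system", self.summary)] if self.summary else []
        return head + self.buffer


memory = SummaryBufferSketch(max_token_limit=30)
exchange = ("Strawberry cheesecake is my favorite flavor", "Well, I like coconut mango more.")

memory.save_context(*exchange)
memory.save_context(*exchange)
before = memory.load()  # 24 words: under the limit, everything is verbatim

memory.save_context(*exchange)
after = memory.load()   # 36 words: the oldest message is pruned and summarized

print(before[0][0], after[0][0])  # human system
```

Pruning by token count rather than by number of exchanges means a few long messages can trigger summarization just as many short ones can.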