# Working with memory 
When you implement a long-running chat, or a chat that spans multiple sessions, you may want to persist the messages of a conversation in an external history store. This lets you load previous conversations into the model's chat prompt to provide context, trim old messages to reduce distracting information, and use summaries to keep very long conversations on point.

Overview: <br>
We will use Azure Cosmos DB for MongoDB (vCore) to persist the chat history, and LangChain to store, load, and trim conversation flows.

## Setup

In [None]:
import os
from dotenv import load_dotenv
session_id = "session1" # your session id

# Load environment variables
if load_dotenv():
    # The f-string avoids a TypeError if the variable is missing from .env
    print(f"Found Azure OpenAI base endpoint: {os.getenv('AZURE_OPENAI_ENDPOINT')}")
else: 
    print("No file .env found")

Let's create a simple conversation and use the SDK to persist the messages in Cosmos DB.

In [None]:
%pip install -U langchain-mongodb

In [None]:
connection_string = os.getenv("MONGO_DB_CONNECTION_STRING")
database_name = os.getenv("MONGO_DB_database_name")
collection_name = os.getenv("MONGO_DB_chathistory_collection_name")


In [None]:
from langchain_mongodb import MongoDBChatMessageHistory
from langchain.memory import ConversationBufferMemory

chat_message_history = MongoDBChatMessageHistory(
    session_id=session_id,
    connection_string=connection_string,
    database_name=database_name,
    collection_name=collection_name,
)

conversational_memory = ConversationBufferMemory(
    chat_memory=chat_message_history,
    memory_key='chat_history',
    return_messages=True)

# Prompt: Create a conversation flow of 10 questions and answers between Luke and Yoda about the power of the force for creating cloud born applications on Azure.
chat_message_history.add_user_message("Master, how can Azure help me in creating cloud-born applications?")
chat_message_history.add_ai_message("Azure, a powerful platform it is. Build, deploy, and manage applications across a global network of Microsoft-managed data centers, you can. With your preferred tools and frameworks, it works.")

chat_message_history.add_user_message("What kind of applications can I create using Azure?")
chat_message_history.add_ai_message("Limitless, the possibilities are. Web apps, mobile apps, AI apps, IoT apps, all can be built. Even microservices or docker-based apps, Azure can handle.")

chat_message_history.add_user_message("Can Azure help me in scaling these applications?")
chat_message_history.add_ai_message("Indeed, young Padawan. Automatically, Azure scales. Adapts to demand, it does. Worry about infrastructure, you need not. Focus on code, you should.")

chat_message_history.add_user_message("How secure is Azure for building applications?")
chat_message_history.add_ai_message("Strong, Azure's security is. Advanced security, compliance, and privacy features built-in, it has. Trust Azure, you can.")

chat_message_history.add_user_message("What if my application needs to be available all the time?")
chat_message_history.add_ai_message("Worry not. Azure promises 99.99% availability. Disaster recovery, backup, and migration options, it provides. Always on, your applications will be.")

chat_message_history.add_user_message("Can Azure help me analyze data from my applications?")
chat_message_history.add_ai_message("Yes, young one. Powerful analytics tools Azure has. Insight into your data, it will give. Make informed decisions, you can.")

chat_message_history.add_user_message("What about the costs? Is Azure affordable?")
chat_message_history.add_ai_message("Flexible, Azure's pricing is. Pay for what you use, you do. Even offer free services, they do.")

chat_message_history.add_user_message("Can Azure support open-source technologies?")
chat_message_history.add_ai_message("Indeed, Luke. Strong supporter of open source, Azure is. Many languages, tools, and frameworks, it supports.")

chat_message_history.add_user_message("Is it possible to automate tasks in Azure?")
chat_message_history.add_ai_message("Mmm, automate tasks, you can. Azure DevOps and Azure Automation, use you can. Increase productivity, you will.")

chat_message_history.add_user_message("Finally, what if I need help? Does Azure provide support?")
chat_message_history.add_ai_message("Fear not, Luke. Strong support, Azure provides. Community support, documentation, tutorials, all available they are. Even professional support options, they have.")

print("This is what has been persisted:")
chat_message_history.messages

Now open the Data Explorer of Cosmos DB or MongoDB Compass to check the documents that have been created.
There should be one document per message, all linked by the same `SessionId`:

```json
{
	"_id" : ObjectId("65d620e61875ca4a55e09fae"),
	"SessionId" : "session1",
	"History" : "{\"type\": \"human\", \"data\": {\"content\": \"Can Azure support open-source technologies?\", \"additional_kwargs\": {}, \"type\": \"human\", \"example\": false}}"
}
```
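
Note that the `History` field holds a JSON string rather than a nested document, so it has to be decoded before you can inspect it. A minimal sketch, assuming a document shaped like the one above; `decode_history` is a hypothetical helper for illustration and not part of `langchain-mongodb`:

```python
import json

# A stored document as shown above; the ObjectId is reduced to a plain string here.
doc = {
    "_id": "65d620e61875ca4a55e09fae",
    "SessionId": "session1",
    "History": "{\"type\": \"human\", \"data\": {\"content\": \"Can Azure support open-source technologies?\", \"additional_kwargs\": {}, \"type\": \"human\", \"example\": false}}",
}


def decode_history(document):
    """Return (role, content) decoded from the JSON string in the History field."""
    message = json.loads(document["History"])
    return message["type"], message["data"]["content"]


role, content = decode_history(doc)
print(role, "->", content)
```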

If you want to delete all items in the collection, use the MongoDB shell and execute the following commands (substitute your own database and collection names):
```
use my_db
db.chat_histories.deleteMany({})
```

Now let's create a conversation flow that includes the history, so that we can also ask questions about the chat history itself.

In [None]:
from langchain_openai import AzureChatOpenAI
import os
chat = AzureChatOpenAI(
    azure_deployment = os.getenv("AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME")
)

from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

follow_up_question = "Can you summarize the last two answers I got from you?"

chat_message_history.add_user_message(follow_up_question)

response = chain.invoke(
    {
        "messages": chat_message_history.messages,
    }
)

print(response)


## Trim the message history
The downside of the approach above is that we always have to pass the messages to the chain explicitly. That is valid, but it requires us to keep the message history in sync manually. To avoid this, we can use `RunnableWithMessageHistory`, which loads the history before each call and appends the new messages afterwards.
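
Conceptually, such a wrapper does three things per call: look up the history for the session, pass history plus the new input to the model, and write both sides of the turn back. The following is a plain-Python sketch of that idea only, not LangChain's implementation; `with_history`, `fake_model`, and `sketch_store` are hypothetical names, and the model is a stub so the example runs without credentials:

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def with_history(
    model: Callable[[List[Message]], str],
    get_history: Callable[[str], List[Message]],
) -> Callable[[str, str], str]:
    """Wrap a model so each call reads and updates the session's history."""

    def invoke(user_input: str, session_id: str) -> str:
        history = get_history(session_id)
        # The model sees the stored history plus the new input.
        answer = model(history + [("human", user_input)])
        # Both sides of the turn are written back, so nothing is synced by hand.
        history.append(("human", user_input))
        history.append(("ai", answer))
        return answer

    return invoke


def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for an LLM: reports how many messages it saw."""
    return f"I received {len(messages)} message(s)."


sketch_store: Dict[str, List[Message]] = {}
ask = with_history(fake_model, lambda sid: sketch_store.setdefault(sid, []))

print(ask("Hey there! I'm Nemo.", "demo"))
print(ask("What was my name again?", "demo"))
print(len(sketch_store["demo"]))
```

The second call receives the first turn automatically, which is the behaviour the LangChain cell that follows relies on.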

In [None]:
from langchain.memory import ChatMessageHistory
demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages

from langchain_core.runnables.history import RunnableWithMessageHistory

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

chain_with_message_history.invoke(
    {"input": "What was my name again?"},
    {"configurable": {"session_id": "unused"}},
)

More interesting is the idea of regularly trimming the history to a fixed size, which keeps the context window of messages (and the number of prompt tokens) small and relevant. In this case we will keep only the last two messages in the history.

In [None]:
from langchain_core.runnables import RunnablePassthrough


def trim_messages(chain_input):
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) <= 2:
        return False

    demo_ephemeral_chat_history.clear()

    for message in stored_messages[-2:]:
        demo_ephemeral_chat_history.add_message(message)

    return True


chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_message_history
)

If we now ask the same question as before, we will see that the history passed to the model starts with a different message, because everything except the last two stored messages has been dropped.

In [None]:
print("These are all the messages before the question:")
# A bare expression in the middle of a cell is not displayed, so print it
print(demo_ephemeral_chat_history.messages)

response = chain_with_trimming.invoke(
    {"input": "What is my name again?"},
    {"configurable": {"session_id": "unused"}},
)
print(response.content)

print("These are all the messages after the question:")
demo_ephemeral_chat_history.messages
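
The trimming step above counts messages, while the stated goal is to keep the number of prompt tokens low. One possible variant trims against a size budget instead. This is a sketch under loud assumptions: `trim_to_budget` is a hypothetical helper, and the whitespace word count is only a crude stand-in for a real tokenizer:

```python
def trim_to_budget(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits within max_tokens."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are considered first.
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if used + cost > max_tokens:
            break
        kept.append((role, content))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


sample_history = [
    ("human", "Hey there! I'm Nemo."),
    ("ai", "Hello!"),
    ("human", "How are you today?"),
    ("ai", "Fine thanks!"),
]

for role, content in trim_to_budget(sample_history, max_tokens=6):
    print(f"{role}: {content}")
```

With a budget of six words only the last two messages survive, so the introduction that contained the name is no longer part of the context.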

## Automatically create summaries

This approach keeps the chat history small, but the history can lose relevance very quickly. One way to address this is to ask the LLM to create a fixed-size summary of the previous conversation, keep only that summary in the context, and continuously update the summary in Cosmos DB.
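
Before wiring this to a model and a database, the pattern can be sketched on plain lists. Everything here is illustrative: `summarize_history` and `fake_summarize` are hypothetical names, the summarizer is a stub standing in for the LLM call, and the character limit is just one assumed way to bound the summary size:

```python
def summarize_history(messages, summarize, max_chars=200):
    """Collapse all messages into a single summary message of bounded size."""
    if not messages:
        return []
    transcript = "\n".join(f"{role}: {content}" for role, content in messages)
    summary = summarize(transcript)[:max_chars]
    # The summary replaces the full history as the only stored message.
    return [("ai", summary)]


def fake_summarize(transcript):
    """Hypothetical stand-in for the LLM: describes the transcript briefly."""
    lines = transcript.splitlines()
    return f"Summary of {len(lines)} messages, starting with: {lines[0]}"


sketch_history = [
    ("human", "Can Azure help me in scaling these applications?"),
    ("ai", "Indeed, young Padawan. Automatically, Azure scales."),
]

sketch_history = summarize_history(sketch_history, fake_summarize)
print(sketch_history)
```

The LangChain version that follows applies the same pattern with `MongoDBChatMessageHistory` and a real model call.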

In [None]:
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_mongodb import MongoDBChatMessageHistory

chat_message_history = MongoDBChatMessageHistory(
    session_id=session_id,
    connection_string=connection_string,
    database_name=database_name,
    collection_name=collection_name,
)

# Prompt: Create a conversation flow of 10 questions and answers between Luke and Yoda about the power of the force for creating cloud born applications on Azure.
chat_message_history.add_user_message("Master, how can Azure help me in creating cloud-born applications?")
chat_message_history.add_ai_message("Azure, a powerful platform it is. Build, deploy, and manage applications across a global network of Microsoft-managed data centers, you can. With your preferred tools and frameworks, it works.")

chat_message_history.add_user_message("What kind of applications can I create using Azure?")
chat_message_history.add_ai_message("Limitless, the possibilities are. Web apps, mobile apps, AI apps, IoT apps, all can be built. Even microservices or docker-based apps, Azure can handle.")

chat_message_history.add_user_message("Can Azure help me in scaling these applications?")
chat_message_history.add_ai_message("Indeed, young Padawan. Automatically, Azure scales. Adapts to demand, it does. Worry about infrastructure, you need not. Focus on code, you should.")

print("These messages have been persisted:")
# A bare expression in the middle of a cell is not displayed, so print it
print(chat_message_history.messages)

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    return chat_message_history

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability. The provided chat history includes facts about the user you are speaking with.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

chain = prompt | chat

chain_with_saved_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

def summarize_messages(chain_input):
    stored_messages = chat_message_history.messages
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages(
        [
            MessagesPlaceholder(variable_name="chat_history"),
            (
                "user",
                "Distill the above chat messages into a single summary message. Include as many specific details as you can.",
            ),
        ]
    )
    summarization_chain = summarization_prompt | chat

    summary_message = summarization_chain.invoke({"chat_history": stored_messages})

    chat_message_history.clear()

    chat_message_history.add_message(summary_message)

    return True


chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_saved_message_history
)

Now, if we ask a question, the items in Cosmos DB will be updated: you will no longer see every message, only a summary of the details.

In [None]:
chain_with_summarization.invoke(
    {"input": "What did we talk about regarding Azure?"},
    {"configurable": {"session_id": session_id}},
)

print("These messages have been persisted after the summary:")
chat_message_history.messages

This demonstrates the basic principles of persisting a chat history, trimming its contents, and using automatic summaries. We have not implemented session handling or multiple users yet, but maybe you can extend the scenario?
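
As one possible starting point for that extension, the history lookup could return a separate history per session id instead of a single shared object. A minimal sketch, assuming a hypothetical `SessionHistories` helper; a plain list stands in for the history so the example runs without a database:

```python
from typing import Callable, Dict, Generic, TypeVar

H = TypeVar("H")


class SessionHistories(Generic[H]):
    """Create one history object per session id and reuse it afterwards."""

    def __init__(self, factory: Callable[[str], H]) -> None:
        self._factory = factory
        self._cache: Dict[str, H] = {}

    def get(self, session_id: str) -> H:
        if session_id not in self._cache:
            self._cache[session_id] = self._factory(session_id)
        return self._cache[session_id]


# With MongoDB, the factory could build a MongoDBChatMessageHistory for the
# given session_id; an empty list is used here purely as a stand-in.
histories = SessionHistories(lambda session_id: [])

histories.get("luke").append(("human", "Master, how can Azure help me?"))
histories.get("leia").append(("human", "Is Azure affordable?"))

print(len(histories.get("luke")), len(histories.get("leia")))
```

Because `histories.get` takes a `session_id` and returns a history, it has the same shape as `get_session_history` above, which currently ignores its argument, and should be usable in its place.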