# Memory

## Setup
#### Follow the [README](https://github.com/tirtho/open-ai/blob/main/README.md) and complete the setup before running the notebooks

References:
- [Azure OpenAI](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/overview)
- [LangChain home page](https://python.langchain.com/docs/get_started/introduction.html)

#### Load the API key and relevant Python libraries.

In [1]:
import openai
import sys

from azure_openai_setup import set_openai_config, get_openai_global_config_parameters 

set_openai_config()

theOpenAIParams, modelName, modelDeploymentName = get_openai_global_config_parameters()

from langchain.chat_models import AzureChatOpenAI

# The openai.<variables> have already been populated by the
# set_openai_config() helper function called above.
# See that function for more details.

azureChatClient = AzureChatOpenAI(
            openai_api_key = theOpenAIParams.api_key,
            openai_api_base = theOpenAIParams.api_base,
            openai_api_version = theOpenAIParams.api_version,
            deployment_name = modelDeploymentName,
            temperature=0
)

Got Azure OpenAI Credentials from Azure Key Vault with Azure CLI Auth


## Memory in LangChain

#### Add memory to an LLMChain

In [2]:
from langchain.memory import ConversationBufferMemory
from langchain import LLMChain, PromptTemplate

template = """You are a chatbot having a conversation with a human.

{chat_history_from_memory}

Human: {human_input}

Chatbot:"""

prompt = PromptTemplate(
                input_variables=["chat_history_from_memory", "human_input"], 
                template=template
            )
memory = ConversationBufferMemory(memory_key="chat_history_from_memory")

langChainConversationWithMemory = LLMChain(
                                    llm=azureChatClient,
                                    prompt=prompt,
                                    verbose=True,
                                    memory=memory
                                  )
langChainConversationWithMemory.predict(human_input="Hi there! My name is Joe Smith.")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.



Human: Hi there! My name is Joe Smith.

Chatbot:

> Finished chain.


"Hello Joe Smith! It's nice to meet you. How can I assist you today?"

#### View what is in the memory

In [18]:
print(memory.buffer)

Human: Hi there! My name is Joe Smith.
AI: Hello Joe Smith! It's nice to meet you. How can I assist you today?


#### View what is in the memory (as a dictionary)

In [19]:
memory.load_memory_variables({})

{'chat_history_from_memory': "Human: Hi there! My name is Joe Smith.\nAI: Hello Joe Smith! It's nice to meet you. How can I assist you today?"}

#### Add more data to the memory

In [20]:
memory.save_context({"Human": "How are you doing my friend?"}, {"AI": "I am well. Thank you for asking."})
print(memory.buffer)

Human: Hi there! My name is Joe Smith.
AI: Hello Joe Smith! It's nice to meet you. How can I assist you today?
Human: How are you doing my friend?
AI: I am well. Thank you for asking.


In [21]:
langChainConversationWithMemory.predict(human_input="Can you tell me my last name?")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.

Human: Hi there! My name is Joe Smith.
AI: Hello Joe Smith! It's nice to meet you. How can I assist you today?
Human: How are you doing my friend?
AI: I am well. Thank you for asking.

Human: Can you tell me my last name?

Chatbot:

> Finished chain.


'Yes, your last name is Smith.'

## Memory Window

#### Keep the last k interactions in memory

In [17]:
# Using the same prompt template from above
# just with a different memory buffer

from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(memory_key="chat_history_from_memory", k=2)

langChainConversationWithMemory = LLMChain(
                                    llm=azureChatClient,
                                    prompt=prompt,
                                    verbose=True,
                                    memory=memory
                                  )

langChainConversationWithMemory.predict(human_input="Hi there! My name is Joan Johnson.")


In [12]:
# Your name is still in memory (the last 2 interactions are stored),
# so the model should be able to respond with your name correctly.
langChainConversationWithMemory.predict(human_input="Can you tell me my name?")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.

Human: Hi there! My name is Joan Johnson.
AI: Hello Joan Johnson! It's nice to meet you. How can I assist you today?

Human: Can you tell me my name?

Chatbot:

> Finished chain.


'Yes, your name is Joan Johnson.'

In [13]:
# Now let's add some more interactions
langChainConversationWithMemory.predict(human_input="How are you doing?")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.

Human: Hi there! My name is Joan Johnson.
AI: Hello Joan Johnson! It's nice to meet you. How can I assist you today?
Human: Can you tell me my name?
AI: Yes, your name is Joan Johnson.

Human: How are you doing?

Chatbot:

> Finished chain.


"As an AI, I don't have emotions like humans do, but I'm functioning properly and ready to assist you with any questions or tasks you may have. How can I help you today?"

In [14]:
# Adding one more interaction
langChainConversationWithMemory.predict(human_input="What is the capital of USA?")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.

Human: Can you tell me my name?
AI: Yes, your name is Joan Johnson.
Human: How are you doing?
AI: As an AI, I don't have emotions like humans do, but I'm functioning properly and ready to assist you with any questions or tasks you may have. How can I help you today?

Human: What is the capital of USA?

Chatbot:

> Finished chain.


'The capital of the USA is Washington D.C.'

In [15]:
# Now the interactions that mentioned your name have dropped out of the k=2 window,
# so the model can't tell your name anymore
langChainConversationWithMemory.predict(human_input="Can you tell me my name?")



> Entering new  chain...
Prompt after formatting:
You are a chatbot having a conversation with a human.

Human: How are you doing?
AI: As an AI, I don't have emotions like humans do, but I'm functioning properly and ready to assist you with any questions or tasks you may have. How can I help you today?
Human: What is the capital of USA?
AI: The capital of the USA is Washington D.C.

Human: Can you tell me my name?

Chatbot:

> Finished chain.


"I'm sorry, but I don't have access to that information. Could you please tell me your name?"

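The windowing behaviour seen above can be reproduced with a fixed-length queue. This is a minimal sketch, assuming a hypothetical `SimpleWindowMemory` class built on `collections.deque`; it is not how LangChain implements `ConversationBufferWindowMemory`, only an illustration of keeping the last k interactions.

```python
from collections import deque


# Illustrative stand-in for a windowed memory; NOT LangChain's implementation.
class SimpleWindowMemory:
    def __init__(self, k=2):
        # A deque with maxlen=k silently discards the oldest entry when full.
        self.turns = deque(maxlen=k)

    def save_context(self, human_text, ai_text):
        self.turns.append((human_text, ai_text))

    def buffer(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = SimpleWindowMemory(k=2)
window.save_context("Hi there! My name is Joan Johnson.", "Hello Joan Johnson!")
window.save_context("Can you tell me my name?", "Yes, your name is Joan Johnson.")
window.save_context("How are you doing?", "I'm functioning properly.")
window.save_context("What is the capital of USA?", "Washington D.C.")

# Only the last two interactions remain, so the name is no longer available.
print(window.buffer())
```

After the fourth interaction, neither remaining turn mentions the name, which matches the final response in the cells above.
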
## Other Memory types

#### 1. Keep max token limit in memory
Use `ConversationTokenBufferMemory(llm, max_token_limit)` to keep only as many recent messages as fit within `max_token_limit` tokens.
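
A rough sketch of token-limited pruning follows. The `SimpleTokenBufferMemory` class and the `count_tokens` helper are hypothetical, and counting whitespace-separated words is an assumption made to keep the example self-contained; a real setup would count tokens with the model's tokenizer.

```python
def count_tokens(text):
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


# Illustrative stand-in for a token-limited buffer; NOT LangChain's implementation.
class SimpleTokenBufferMemory:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.messages = []

    def total_tokens(self):
        return sum(count_tokens(m) for m in self.messages)

    def save_context(self, human_text, ai_text):
        self.messages.append(f"Human: {human_text}")
        self.messages.append(f"AI: {ai_text}")
        # Drop the oldest messages until the buffer fits the token budget.
        while self.messages and self.total_tokens() > self.max_token_limit:
            self.messages.pop(0)


token_memory = SimpleTokenBufferMemory(max_token_limit=12)
token_memory.save_context("Hi there! My name is Joe Smith.", "Hello Joe Smith!")
token_memory.save_context("How are you?", "I am well.")
print(token_memory.messages)
```

Unlike the k-interaction window, the cut-off here depends on message length, so a single long message can push out several short ones.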
#### 2. Create a custom memory class
[Custom Memory Class Example](https://python.langchain.com/docs/modules/memory/how_to/custom_memory)
#### 3. Keep memory of given facts about specific entities in a conversation with Entity Memory
[Entity memory example](https://python.langchain.com/docs/modules/memory/how_to/entity_summary_memory)
#### 4. Conversation Summary Memory

Conversation summary memory summarizes the conversation as it happens and stores the current summary in memory. This memory can then be used to inject the summary of the conversation so far into a prompt/chain. This memory is most useful for longer conversations, where keeping the past message history in the prompt verbatim would take up too many tokens.

[Detailed examples](https://python.langchain.com/docs/modules/memory/how_to/summary)

## Conversation Summary Buffer Memory

`ConversationSummaryBufferMemory` keeps a buffer of recent interactions in memory, but rather than completely flushing old interactions, it compiles them into a summary and uses both. Unlike `ConversationBufferWindowMemory`, it uses token length rather than the number of interactions to decide when to flush interactions. [Details](https://python.langchain.com/docs/modules/memory/how_to/summary_buffer)

In [30]:
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chains import ConversationChain

memory = ConversationSummaryBufferMemory(
                llm=azureChatClient,
                max_token_limit=50
            )

conversationWithSummaryChain = ConversationChain(
                                    llm = azureChatClient,
                                    memory = memory,
                                    verbose = True
                                )

conversationWithSummaryChain.predict(input="Hi, this is Joe from Massachusetts.")
# The message so far is within the token limit, so no summarization yet.



> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, this is Joe from Massachusetts.
AI:

> Finished chain.


"Hello Joe from Massachusetts! It's great to chat with you. How can I assist you today?"

In [31]:
conversationWithSummaryChain.predict(input="Will you help me get a summary on a topic?")
# The message so far is within the token limit, so no summarization yet.



> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, this is Joe from Massachusetts.
AI: Hello Joe from Massachusetts! It's great to chat with you. How can I assist you today?
Human: Will you help me get a summary on a topic?
AI:

> Finished chain.


"Of course, I'd be happy to help! What topic are you interested in?"

#### Now it should summarize the previous interactions and continue

In [32]:
conversationWithSummaryChain.predict(input="Can you write a summary of Large Language Models?")



> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human introduces themselves as Joe from Massachusetts and the AI greets them warmly, asking how they can assist.
Human: Will you help me get a summary on a topic?
AI: Of course, I'd be happy to help! What topic are you interested in?
Human: Can you write a summary of Large Language Models?
AI:

> Finished chain.


'Absolutely! Large Language Models are a type of artificial intelligence that use deep learning algorithms to analyze and understand human language. These models are trained on massive amounts of text data, allowing them to generate human-like responses to questions and even create original content. Some of the most well-known examples of Large Language Models include GPT-3 and BERT. These models have a wide range of applications, from chatbots and virtual assistants to language translation and content creation. However, there are also concerns about the potential misuse of these models, such as the spread of misinformation or the creation of deepfakes. Overall, Large Language Models are a powerful tool in the field of AI, but their development and use must be carefully monitored and regulated.'
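
The flush-and-summarize behaviour in this section can be sketched as follows. Everything here is hypothetical: `SimpleSummaryBufferMemory` is an illustration rather than LangChain's implementation, `stub_summarize` stands in for the LLM call that would write the real summary, and tokens are approximated by whitespace-separated words.

```python
def count_tokens(text):
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def stub_summarize(previous_summary, dropped_messages):
    # Hypothetical placeholder for the LLM call that would produce a real summary.
    note = f"[{len(dropped_messages)} earlier message(s) condensed]"
    return f"{previous_summary} {note}".strip()


# Illustrative stand-in for a summary buffer; NOT LangChain's implementation.
class SimpleSummaryBufferMemory:
    def __init__(self, max_token_limit, summarize=stub_summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.messages = []

    def save_context(self, human_text, ai_text):
        self.messages += [f"Human: {human_text}", f"AI: {ai_text}"]
        dropped = []
        # Move the oldest messages out of the buffer until it fits the budget.
        while self.messages and sum(count_tokens(m) for m in self.messages) > self.max_token_limit:
            dropped.append(self.messages.pop(0))
        # Fold whatever was dropped into the running summary instead of losing it.
        if dropped:
            self.summary = self.summarize(self.summary, dropped)

    def render(self):
        lines = [f"System: {self.summary}"] if self.summary else []
        return "\n".join(lines + self.messages)


summary_memory = SimpleSummaryBufferMemory(max_token_limit=20)
summary_memory.save_context("Hi, this is Joe from Massachusetts.", "Hello Joe! How can I assist you today?")
summary_memory.save_context("Will you help me get a summary on a topic?", "Of course! What topic?")
print(summary_memory.render())
```

As in the notebook output, the rendered history begins with a `System:` summary line once older messages no longer fit, followed by the most recent messages verbatim.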

## Vector store-backed memory

This memory type stores memories in a VectorDB and queries the top-K most "salient" docs every time it is called.

This differs from most of the other Memory classes in that it doesn't explicitly track the order of interactions.

[Examples](https://python.langchain.com/docs/modules/memory/how_to/vectorstore_retriever_memory)
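
The retrieval idea can be sketched without a vector database. This is a toy illustration under stated assumptions: `SimpleVectorMemory` is a hypothetical class, and the bag-of-words `embed` function with cosine similarity stands in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def embed(text):
    # Assumption: a bag-of-words count vector stands in for a real embedding.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative stand-in for vector store-backed memory; NOT LangChain's implementation.
class SimpleVectorMemory:
    def __init__(self, k=1):
        self.k = k
        self.docs = []  # list of (text, vector) pairs

    def save_context(self, human_text, ai_text):
        text = f"Human: {human_text}\nAI: {ai_text}"
        self.docs.append((text, embed(text)))

    def load_memory_variables(self, query):
        # Rank stored interactions by similarity to the query, not by recency.
        query_vector = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(query_vector, d[1]), reverse=True)
        return [text for text, _ in ranked[: self.k]]


vector_memory = SimpleVectorMemory(k=1)
vector_memory.save_context("I love pizza.", "Pizza is a popular choice.")
vector_memory.save_context("I play soccer on weekends.", "Soccer is great exercise.")
vector_memory.save_context("I live in Massachusetts.", "That is a beautiful state.")

relevant = vector_memory.load_memory_variables("What do I do on weekends?")
print(relevant)
```

The second interaction is returned even though it is neither the first nor the last one saved, which illustrates why this memory type does not depend on the order of interactions.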