# Understanding Memory in LLMs

In Notebook 03, we explored how OpenAI models can enhance the results from Azure Cognitive Search. [Bing Chat](http://chat.bing.com/) works in a similar way: it is a search engine with a GPT-4 model that uses the content of search results as context when answering queries.

However, we have yet to discover how to engage in a conversation with the LLM. With Bing Chat, this is possible, as the LLM can understand and reference the previous responses.

There is a common misconception that GPT models have memory. This is not true. While they possess knowledge, they do not retain information from previous questions asked to them.

The aim of this Notebook is to demonstrate how we can "provide memory" to the LLM by utilizing prompts and context.

In [1]:
import os
import random
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.memory import ConversationBufferMemory, ConversationTokenBufferMemory
from openai.error import OpenAIError
from langchain.docstore.document import Document
from langchain.memory import CosmosDBChatMessageHistory

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string))

#custom libraries that we will use later in the app
from common.utils import (
    get_search_results,
    order_search_results,
    model_tokens_limit,
    num_tokens_from_docs,
    embed_docs,
    search_docs,
    get_answer,
)

from common.prompts import COMBINE_QUESTION_PROMPT, COMBINE_PROMPT, COMBINE_CHAT_PROMPT

from dotenv import load_dotenv
load_dotenv("credentials.env")

import logging

# Get the root logger
logger = logging.getLogger()
# Set the logging level to a higher level to ignore INFO messages
logger.setLevel(logging.WARNING)

In [2]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_BASE"] = os.environ["AZURE_OPENAI_ENDPOINT"]
os.environ["OPENAI_API_KEY"] = os.environ["AZURE_OPENAI_API_KEY"]
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]
os.environ["OPENAI_API_TYPE"] = "azure"

### Let's start with the basics
Let's use a very simple example to see if the Azure OpenAI GPT model has memory. We will again use LangChain to simplify our code.

In [3]:
QUESTION = "Tell me some use cases for reinforcement learning?"
FOLLOW_UP_QUESTION = "Can you summarize your last response?"

In [4]:
# Define model
MODEL = "gpt-35-turbo"
# Create an OpenAI instance
llm = AzureChatOpenAI(deployment_name=MODEL, temperature=0.5, max_tokens=500)

In [5]:
# We create a very simple prompt template, just the question as is:
prompt = PromptTemplate(
    input_variables=["question"],
    template="{question}",
)

chain = LLMChain(llm=llm, prompt=prompt)

In [6]:
# Let's see what the GPT model responds
response = chain.run(QUESTION)
printmd(response)

Reinforcement learning can be applied to various domains and use cases. Here are some examples:

1. Game playing: Reinforcement learning has been successfully used to train agents to play complex games such as chess, Go, and poker. AlphaGo, developed by DeepMind, is a famous example of reinforcement learning applied to game playing.

2. Robotics: Reinforcement learning can be used to train robotic agents to perform tasks like grasping objects, navigating through environments, or manipulating objects. It enables robots to learn from their own experiences and adapt to different scenarios.

3. Autonomous vehicles: Reinforcement learning can aid in developing self-driving cars by training agents to make decisions based on real-time sensory input. It helps to optimize driving behavior, handle complex traffic situations, and improve safety.

4. Recommendation systems: Reinforcement learning can be employed to enhance recommendation algorithms by learning from user feedback. It enables personalized recommendations and improves user experience on platforms like e-commerce, streaming services, or social media.

5. Healthcare: Reinforcement learning can assist in optimizing treatment plans, dosage determination, or resource allocation in healthcare settings. It can learn from patient data and provide personalized treatment strategies.

6. Finance: Reinforcement learning can be used for portfolio management, algorithmic trading, and risk management. Agents can learn optimal trading strategies and adapt to market conditions.

7. Energy management: Reinforcement learning algorithms can optimize energy consumption in buildings, factories, or power grids. It can learn to control energy systems efficiently and reduce costs.

8. Natural language processing: Reinforcement learning can improve dialogue systems and chatbots by learning to generate appropriate responses based on user feedback. It enhances the conversational abilities of virtual assistants.

9. Advertising and marketing: Reinforcement learning can be utilized to optimize ad placement, bidding strategies, and personalized marketing campaigns. It helps to maximize user engagement and conversion rates.

10. Supply chain management: Reinforcement learning can optimize inventory management, logistics, and demand forecasting. It aids in reducing costs, improving delivery times, and managing supply chain disruptions.

These are just a few examples, and reinforcement learning has a wide range of applications across various fields.

In [7]:
#Now let's ask a follow up question
chain.run(FOLLOW_UP_QUESTION)

"In my last response, I provided a detailed explanation of how the GPT-3 language model works. I mentioned that GPT-3 is a deep learning model that has been trained on a vast amount of data to generate human-like text. It uses a transformer architecture and utilizes a process called unsupervised learning to understand and generate text. The model's training data consists of a wide range of sources, including books, articles, websites, and more. GPT-3's ability to generate coherent and contextually relevant responses is due to its understanding of language patterns and its ability to predict the most likely next word or phrase based on the given context. However, it's important to note that GPT-3's responses are generated based on statistical patterns and may not always be accurate or reliable."

As you can see, the model does not remember what it just answered; sometimes it responds based only on the system prompt. This shows that the LLM does NOT have memory, and that we need to supply the memory as a conversation history inside the prompt, like this:

In [8]:
hist_prompt = PromptTemplate(
    input_variables=["history", "question"],
    template="""
                {history}
                Human: {question}
                AI:
            """
    )
chain = LLMChain(llm=llm, prompt=hist_prompt)

In [9]:
Conversation_history = """
Human: {question}
AI: {response}
""".format(question=QUESTION, response=response)

In [10]:
chain.run({"history":Conversation_history, "question": FOLLOW_UP_QUESTION})

'Reinforcement learning can be applied to various domains such as game playing, robotics, autonomous vehicles, recommendation systems, healthcare, finance, energy management, natural language processing, advertising and marketing, and supply chain management. It enables agents to learn from their own experiences and make optimal decisions in complex scenarios.'

**Bingo!** We now know how to create a chatbot using LLMs: we just need to keep the state/history of the conversation and pass it as context every time.
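To make the "keep the history and pass it every time" idea concrete, here is a minimal sketch in plain Python. The `ConversationHistory` class and its method names are hypothetical helpers written for illustration (they are not part of LangChain or of this repo), and the stored answer is a shortened placeholder rather than real model output.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationHistory:
    """Hypothetical helper: keeps (question, answer) turns for the prompt."""

    turns: list = field(default_factory=list)

    def add_turn(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def render(self) -> str:
        # Same "Human: ... / AI: ..." layout used in the cells above
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)

    def build_prompt(self, question: str) -> str:
        # Prior turns first, then the new question and an open "AI:" slot
        rendered = self.render()
        prefix = rendered + "\n" if rendered else ""
        return f"{prefix}Human: {question}\nAI:"


history = ConversationHistory()
history.add_turn(
    "Tell me some use cases for reinforcement learning?",
    "Game playing, robotics, and autonomous vehicles.",  # placeholder answer
)
prompt_text = history.build_prompt("Can you summarize your last response?")
print(prompt_text)
```

The model itself stays stateless in this sketch: all of the "memory" lives in the string that is rebuilt and sent on every call.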

## Now that we understand the concept of memory via adding history as a context, let's go back to our GPT Smart Search engine

To avoid duplicating code, we have moved much of the code used in Notebook 03 into functions. These functions live in the `common/utils.py` and `common/prompts.py` files, so we can reuse them in the app that we will build later.

In [36]:
# Since memory adds tokens to the prompt, we need a model with a larger context window
MODEL = "gpt-35-turbo-16k"
llm = AzureChatOpenAI(deployment_name=MODEL, temperature=0.5, max_tokens=1000)
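Because every stored turn is sent again on each call, the history can eventually outgrow even a larger context window. One common mitigation is to drop the oldest turns once a token budget is exceeded. The sketch below illustrates that idea in plain Python; `count_tokens` and `trim_history` are hypothetical names, and the whitespace-based token count is a crude assumption, not the tokenizer the model actually uses.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in: whitespace-separated words, NOT a real tokenizer
    return len(text.split())


def trim_history(turns, max_tokens):
    """Keep the most recent turns whose combined size fits in max_tokens."""
    kept = []
    used = 0
    # Walk from newest to oldest so the most recent context survives
    for question, answer in reversed(turns):
        cost = count_tokens(question) + count_tokens(answer)
        if used + cost > max_tokens:
            break
        kept.append((question, answer))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


turns = [
    ("first question", "a fairly long first answer here"),
    ("second question", "short answer"),
    ("third question", "another short answer"),
]
trimmed = trim_history(turns, max_tokens=10)
print(trimmed)
```

The `ConversationTokenBufferMemory` class imported in the first cell is LangChain's option for a token-limited buffer and would likely be the better choice in the real app.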

In [53]:
index1_name = "cogsrch-index-files"
index2_name = "cogsrch-index-csv"
index3_name = "cogsrch-index-books-vector"
indexes = [index1_name, index2_name]

agg_search_results = get_search_results(QUESTION, indexes)
ordered_results = order_search_results(agg_search_results, reranker_threshold=1)

In [54]:
docs = []
for key,value in ordered_results.items():
    for page in value["chunks"]:
        location = value["location"] if value["location"] is not None else ""
        docs.append(Document(page_content=page, metadata={"source": location+os.environ['BLOB_SAS_TOKEN']}))

# Calculate number of tokens of our docs
tokens_limit = model_tokens_limit(MODEL)

if docs:
    num_tokens = num_tokens_from_docs(docs)
    print("Custom token limit for", MODEL, ":", tokens_limit)
    print("Combined docs tokens count:", num_tokens)
else:
    # Define num_tokens so the next cell does not fail with a NameError
    num_tokens = 0
    print("NO RESULTS FROM AZURE SEARCH")


Custom token limit for gpt-35-turbo-16k : 14500
Combined docs tokens count: 1299577


In [55]:
%%time
if num_tokens > tokens_limit:
    index = embed_docs(docs)
    top_docs = search_docs(index,QUESTION,k=4)
    
    # Now we need to recalculate the tokens count of the top results from similarity vector search
    # in order to select the chain type: stuff or map_reduce
    
    num_tokens = num_tokens_from_docs(top_docs)   
    print("Token count after similarity search:", num_tokens)
    chain_type = "map_reduce" if num_tokens > tokens_limit else "stuff"
    
else:
    # if total tokens is less than our limit, we don't need to vectorize and do similarity search
    top_docs = docs
    chain_type = "stuff"
    
print("Chain Type selected:", chain_type)

Token count after similarity search: 1664
Chain Type selected: stuff
CPU times: user 568 ms, sys: 24.1 ms, total: 592 ms
Wall time: 9.71 s
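The decision made in the cell above can be summarized as two comparisons against the token limit: first whether the full result set must be narrowed with a similarity search, then whether the narrowed set fits in a single prompt (`stuff`) or has to be processed document by document and combined (`map_reduce`). The sketch below restates that logic with hypothetical helper names, using the numbers printed in this notebook run.

```python
def needs_similarity_search(num_tokens: int, tokens_limit: int) -> bool:
    """True when the full result set is too large to send as-is."""
    return num_tokens > tokens_limit


def choose_chain_type(num_tokens: int, tokens_limit: int) -> str:
    """Return "stuff" when the docs fit in one prompt, else "map_reduce"."""
    return "map_reduce" if num_tokens > tokens_limit else "stuff"


TOKENS_LIMIT = 14500        # custom limit printed above for gpt-35-turbo-16k
ALL_DOCS_TOKENS = 1299577   # combined docs token count printed above
TOP_DOCS_TOKENS = 1664      # token count after similarity search

narrow_first = needs_similarity_search(ALL_DOCS_TOKENS, TOKENS_LIMIT)
selected = choose_chain_type(TOP_DOCS_TOKENS, TOKENS_LIMIT)
print(narrow_first, selected)
```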


In [56]:
# Get the answer
response = get_answer(llm=llm, docs=top_docs, query=QUESTION, language="English", chain_type=chain_type)
printmd(response['output_text'])

Reinforcement learning has several use cases, including:
1. Learning prevention strategies in the context of pandemic influenza<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[1]</a></sup>.
2. Personalized music recommendation based on capturing changes in listeners' preferences<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[2]</a></sup>.
3. Decision-making based on identity and collaboration between districts in the context of prevention strategies<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[3]</a></sup>.
4. Modeling epidemics and optimizing individual decisions to control the spread of disease<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[4]</a></sup>.

Anything else I can help you with?

And if we ask the follow up question:

In [41]:
response = get_answer(llm=llm, docs=top_docs,  query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type)
printmd(response['output_text'])

I apologize, but I cannot summarize my last response as it was not provided.

You might get a different response from the one above, but whatever response you get will be based on the context given, not on previous answers.

So far we have the same thing as in Notebook 03: results from Azure Search enhanced by an OpenAI model, with no memory.

**Now let's add memory to it:**

Reference: https://python.langchain.com/docs/modules/memory/how_to/adding_memory_chain_multiple_inputs
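With a search-backed chain, the prompt now has several inputs: the retrieved document chunks, the conversation history, and the new question. The sketch below shows one way these could be merged into a single prompt string. The template wording and the `build_chat_prompt` helper are assumptions made for illustration; the actual `COMBINE_CHAT_PROMPT` in `common/prompts.py` may be worded and structured differently.

```python
from string import Template

# Hypothetical template; the real COMBINE_CHAT_PROMPT may differ
CHAT_TEMPLATE = Template(
    "Answer using only the context below.\n\n"
    "Context:\n$context\n\n"
    "Conversation so far:\n$chat_history\n\n"
    "Human: $question\nAI:"
)


def build_chat_prompt(doc_chunks, chat_history, question):
    """Merge retrieved chunks, prior turns, and the new question."""
    context = "\n---\n".join(doc_chunks)
    return CHAT_TEMPLATE.substitute(
        context=context, chat_history=chat_history, question=question
    )


example_prompt = build_chat_prompt(
    ["chunk one", "chunk two"],
    "Human: Hi\nAI: Hello!",
    "Can you summarize your last response?",
)
print(example_prompt)
```

In the cells that follow, the memory object fills the `chat_history` slot automatically, which is why it is created with `memory_key="chat_history"` and `input_key="question"`.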

In [42]:
# Memory object, which is necessary to track the inputs/outputs and hold a conversation.
memory = ConversationBufferMemory(memory_key="chat_history",input_key="question")

response = get_answer(llm=llm, docs=top_docs, query=QUESTION, language="English", chain_type=chain_type, 
                        memory=memory)
printmd(response['output_text'])

Reinforcement learning can be applied to various use cases. Here are a few examples:

1. Epidemiological Modeling: Reinforcement learning can be used to automatically learn prevention strategies for infectious diseases like pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, researchers can develop mitigation policies and prevention strategies<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[1]</a></sup>.

2. Personalized Recommendation Systems: Reinforcement learning can enhance personalized recommendation systems, especially in the domain of music. By simulating the interaction process and continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend song sequences that better match listeners' preferences<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[2]</a></sup>.

3. Decision-Making and Identity: Reinforcement learning can shed light on decision-making processes influenced by identity. People often make decisions based on who they are and what people like them would do in a particular situation. This model of decision-making can explain why individuals may reject offers or incentives that seem rational from a self-interest perspective<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[3]</a></sup>.

4. Micropopulation Modeling for Epidemics: Reinforcement learning can be used to model the spread of epidemics at a microscopic level. By considering the consequences of individual decisions on disease transmission, researchers can predict the spread of diseases and identify the need for external interventions to regulate behaviors<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[4]</a></sup>.

These are just a few examples of how reinforcement learning can be applied in different domains. Let me know if you have any other questions!

In [43]:
# Now we add a follow up question:
response = get_answer(llm=llm, docs=top_docs, query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type, 
                      memory=memory)
printmd(response['output_text'])

Reinforcement learning has various use cases in different domains. Here is a summary of the examples I provided earlier:

1. Epidemiological Modeling: Reinforcement learning can be used to automatically learn prevention strategies for infectious diseases like pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, researchers can develop mitigation policies and prevention strategies<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[1]</a></sup>.

2. Personalized Recommendation Systems: Reinforcement learning can enhance personalized recommendation systems, especially in the domain of music. By simulating the interaction process and continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend song sequences that better match listeners' preferences<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[2]</a></sup>.

3. Decision-Making and Identity: Reinforcement learning can shed light on decision-making processes influenced by identity. People often make decisions based on who they are and what people like them would do in a particular situation. This model of decision-making can explain why individuals may reject offers or incentives that seem rational from a self-interest perspective<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[3]</a></sup>.

4. Micropopulation Modeling for Epidemics: Reinforcement learning can be used to model the spread of epidemics at a microscopic level. By considering the consequences of individual decisions on disease transmission, researchers can predict the spread of diseases and identify the need for external interventions to regulate behaviors<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[4]</a></sup>.

These examples demonstrate how reinforcement learning can be applied in various fields to optimize decision-making, personalize recommendations, and model complex systems. Let me know if you have any further questions!



In [44]:
# Another follow up query
response = get_answer(llm=llm, docs=top_docs, query="Thank you", language="English", chain_type=chain_type,  
                      memory=memory)
printmd(response['output_text'])

Reinforcement learning has several use cases in different domains. Here is a summary of the examples I provided earlier:

1. Epidemiological Modeling: Reinforcement learning can be used to automatically learn prevention strategies for infectious diseases like pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, researchers can develop mitigation policies and prevention strategies<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[1]</a></sup>.

2. Personalized Recommendation Systems: Reinforcement learning can enhance personalized recommendation systems, especially in the domain of music. By simulating the interaction process and continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend song sequences that better match listeners' preferences<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[2]</a></sup>.

3. Decision-Making and Identity: Reinforcement learning can shed light on decision-making processes influenced by identity. People often make decisions based on who they are and what people like them would do in a particular situation. This model of decision-making can explain why individuals may reject offers or incentives that seem rational from a self-interest perspective<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[3]</a></sup>.

4. Micropopulation Modeling for Epidemics: Reinforcement learning can be used to model the spread of epidemics at a microscopic level. By considering the consequences of individual decisions on disease transmission, researchers can predict the spread of diseases and identify the need for external interventions to regulate behaviors<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[4]</a></sup>.

These examples demonstrate how reinforcement learning can be applied in various fields to optimize decision-making, personalize recommendations, and model complex systems. Let me know if you have any further questions!

You might get a different answer in the cell above, and that is OK: this bot is not yet configured to handle inputs that are unrelated to its knowledge base, including greetings and thanks.

Let's check our memory to confirm that it is keeping the conversation.

In [45]:
memory.buffer

'Human: Tell me some use cases for reinforcement learning?\nAI: Reinforcement learning can be applied to various use cases. Here are a few examples:\n\n1. Epidemiological Modeling: Reinforcement learning can be used to automatically learn prevention strategies for infectious diseases like pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, researchers can develop mitigation policies and prevention strategies<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D">[1]</a></sup>.\n\n2. Personalized Recommendation Systems: Reinforcement learning can enhance personalized recommendation systems, especially in the domain of music. By simulating the interaction process and continuously updating the model based on users\' preferences, reinforcement learning algorithms can recommend song sequences tha

## Using CosmosDB as persistent memory

In the previous cells we added local, in-RAM memory to our chatbot. However, it is not persistent: it is deleted once the app user's session ends. We therefore need a database to persistently store each bot user's conversations, not only for analytics and auditing, but also in case we wish to provide recommendations.

Here we will store the conversation history in CosmosDB for future auditing purposes.
We will use the LangChain class `CosmosDBChatMessageHistory`; see [HERE](https://python.langchain.com/en/latest/_modules/langchain/memory/chat_message_histories/cosmos_db.html).

In [46]:
# Create CosmosDB instance from langchain cosmos class.
cosmos = CosmosDBChatMessageHistory(
    cosmos_endpoint=os.environ['AZURE_COSMOSDB_ENDPOINT'],
    cosmos_database=os.environ['AZURE_COSMOSDB_NAME'],
    cosmos_container=os.environ['AZURE_COSMOSDB_CONTAINER_NAME'],
    connection_string=os.environ['AZURE_COMOSDB_CONNECTION_STRING'],
    session_id="Agent-Test-Session" + str(random.randint(1, 1000)),
    user_id="Agent-Test-User" + str(random.randint(1, 1000))
    )

# prepare the cosmosdb instance
cosmos.prepare_cosmos()

In [47]:
# Create our Memory Object
memory = ConversationBufferMemory(memory_key="chat_history",input_key="question",chat_memory=cosmos)
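The memory object above delegates storage to the `cosmos` history, which is keyed by session and user. To illustrate what a persistent, session-keyed history buys us without any Azure dependency, here is a small sketch that uses SQLite from the standard library as a stand-in. `SQLiteChatHistory` is a hypothetical class written for this example; it is not how `CosmosDBChatMessageHistory` is implemented, and the stored answer is a placeholder.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


class SQLiteChatHistory:
    """Illustrative stand-in for a persistent, session-keyed message store."""

    def __init__(self, db_path, session_id, user_id):
        self.db_path = db_path
        self.session_id = session_id
        self.user_id = user_id
        with closing(sqlite3.connect(self.db_path)) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS messages ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                "session_id TEXT, user_id TEXT, role TEXT, content TEXT)"
            )
            conn.commit()

    def add_message(self, role, content):
        with closing(sqlite3.connect(self.db_path)) as conn:
            conn.execute(
                "INSERT INTO messages (session_id, user_id, role, content) "
                "VALUES (?, ?, ?, ?)",
                (self.session_id, self.user_id, role, content),
            )
            conn.commit()

    def load_messages(self):
        # Return this session's messages in insertion order
        with closing(sqlite3.connect(self.db_path)) as conn:
            return conn.execute(
                "SELECT role, content FROM messages "
                "WHERE session_id = ? AND user_id = ? ORDER BY id",
                (self.session_id, self.user_id),
            ).fetchall()


db_file = os.path.join(tempfile.mkdtemp(), "chat_history.db")

first = SQLiteChatHistory(db_file, "session-1", "user-1")
first.add_message("human", "Tell me some use cases for reinforcement learning?")
first.add_message("ai", "Game playing, robotics, and more.")  # placeholder

# A brand-new object (for example, after an app restart) sees the same turns
restored_messages = SQLiteChatHistory(db_file, "session-1", "user-1").load_messages()

# A different session starts with an empty history
other_messages = SQLiteChatHistory(db_file, "session-2", "user-1").load_messages()
print(restored_messages, other_messages)
```

The key design point is that the history outlives the Python object that wrote it, and that each session and user pair gets its own conversation.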

In [48]:
# Testing using our Question
response = get_answer(llm=llm, docs=top_docs, query=QUESTION, language="English", chain_type=chain_type, 
                        memory=memory)
printmd(response['output_text'])

Reinforcement learning has various use cases in different domains. Here are a few examples:

1. **Epidemic Prevention**: Reinforcement learning can be used to automatically learn prevention strategies for epidemics, such as pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, prevention strategies can be learned and optimized to control the spread of diseases<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[1]</a></sup>.

2. **Personalized Recommendation Systems**: Reinforcement learning can enhance recommendation systems by considering the simulation of the interaction process. By continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend personalized song sequences that match listeners' preferences better<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[2]</a></sup>.

3. **Decision-Making based on Identity**: Reinforcement learning can be applied to decision-making processes that are influenced by identity. Instead of solely focusing on self-interest and consequences, reinforcement learning algorithms can consider norms, principles, and group affiliations to make decisions<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[3]</a></sup>.

4. **Microscopic Modeling of Epidemics**: Reinforcement learning can be used to model epidemics at a microscopic level, where individual agents make decisions that affect the spread of the disease. By optimizing agents' decisions through game theory and multi-agent reinforcement learning, predictions about the spread of the disease can be made<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[4]</a></sup>.

These are just a few examples, and reinforcement learning can be applied to many other use cases as well. Let me know if there's anything else I can help you with!

In [49]:
# Now we add a follow up question:
response = get_answer(llm=llm, docs=top_docs, query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type, 
                      memory=memory)
printmd(response['output_text'])

Reinforcement learning has various use cases in different domains. Here is a summary of the use cases mentioned in my previous response:

1. **Epidemic Prevention**: Reinforcement learning can be used to automatically learn prevention strategies for epidemics, such as pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, prevention strategies can be learned and optimized to control the spread of diseases<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[1]</a></sup>.

2. **Personalized Recommendation Systems**: Reinforcement learning can enhance recommendation systems by considering the simulation of the interaction process. By continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend personalized song sequences that match listeners' preferences better<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[2]</a></sup>.

3. **Decision-Making based on Identity**: Reinforcement learning can be applied to decision-making processes that are influenced by identity. Instead of solely focusing on self-interest and consequences, reinforcement learning algorithms can consider norms, principles, and group affiliations to make decisions<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[3]</a></sup>.

4. **Microscopic Modeling of Epidemics**: Reinforcement learning can be used to model epidemics at a microscopic level, where individual agents make decisions that affect the spread of the disease. By optimizing agents' decisions through game theory and multi-agent reinforcement learning, predictions about the spread of the disease can be made<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[4]</a></sup>.

You can refer to the provided sources for more detailed information on each use case. Let me know if there's anything else I can assist you with!

In [50]:
# Another follow up query
response = get_answer(llm=llm, docs=top_docs, query="Thank you", language="English", chain_type=chain_type,  
                      memory=memory)
printmd(response['output_text'])

Reinforcement learning has various use cases in different domains. Here is a summary of the use cases mentioned in my previous response:

1. **Epidemic Prevention**: Reinforcement learning can be used to automatically learn prevention strategies for epidemics, such as pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, prevention strategies can be learned and optimized to control the spread of diseases<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[1]</a></sup>.

2. **Personalized Recommendation Systems**: Reinforcement learning can enhance recommendation systems by considering the simulation of the interaction process. By continuously updating the model based on users' preferences, reinforcement learning algorithms can recommend personalized song sequences that match listeners' preferences better<sup><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206183/?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[2]</a></sup>.

3. **Decision-Making based on Identity**: Reinforcement learning can be applied to decision-making processes that are influenced by identity. Instead of solely focusing on self-interest and consequences, reinforcement learning algorithms can consider norms, principles, and group affiliations to make decisions<sup><a href="https://blobstoragearbnfv3wffx5o.blob.core.windows.net/books/Made_To_Stick.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[3]</a></sup>.

4. **Microscopic Modeling of Epidemics**: Reinforcement learning can be used to model epidemics at a microscopic level, where individual agents make decisions that affect the spread of the disease. By optimizing agents' decisions through game theory and multi-agent reinforcement learning, predictions about the spread of the disease can be made<sup><a href="https://arxiv.org/pdf/2004.12959v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[4]</a></sup>.

You can refer to the provided sources for more detailed information on each use case. Let me know if there's anything else I can assist you with!

Let's check our Azure CosmosDB to see the whole conversation


In [51]:
#load message from cosmosdb
cosmos.load_messages()
cosmos.messages

[HumanMessage(content='Tell me some use cases for reinforcement learning?', additional_kwargs={}, example=False),
 AIMessage(content='Reinforcement learning has various use cases in different domains. Here are a few examples:\n\n1. **Epidemic Prevention**: Reinforcement learning can be used to automatically learn prevention strategies for epidemics, such as pandemic influenza. By constructing epidemiological models and using reinforcement learning algorithms, prevention strategies can be learned and optimized to control the spread of diseases<sup><a href="https://arxiv.org/pdf/2003.13676v1.pdf?sv=2022-11-02&ss=bf&srt=sco&sp=rltfx&se=2024-10-02T01:02:07Z&st=2023-08-03T17:02:07Z&spr=https&sig=gLxStXFSY6X29OPpPDpBEhoQDdtJNDrMVExNYJ%2BhmBQ%3D" target="_blank">[1]</a></sup>.\n\n2. **Personalized Recommendation Systems**: Reinforcement learning can enhance recommendation systems by considering the simulation of the interaction process. By continuously updating the model based on users\' pref

![CosmosDB Memory](./images/cosmos-chathistory.png)

# Summary
##### Adding memory to our application lets the user hold a conversation. However, memory does not come with the LLM; we must provide it to the LLM as context alongside the question.

We added persistent memory using CosmosDB.

We can also notice that the current chain is smart, but only up to a point. Although we have given it memory, it searches for similar docs every time, and it struggles to respond to prompts like: Hello, Thank you, Bye, What's your name, What's the weather, and any other task that is not a search in the knowledge base.



# NEXT
We now know how to build a Smart Search Engine that can power a chatbot. Great!

But does this solve all the possible scenarios that a virtual assistant will require? **What if the answer to a Smart Search Engine query is not found in text, but instead requires looking into tabular data?** The next Notebook 05 explains and solves the tabular problem and introduces the concept of Agents.