# Understanding Memory in LLMs

In the previous Notebook, we successfully explored how OpenAI models can enhance the results from Azure Cognitive Search. 

However, we have yet to discover how to engage in a conversation with the LLM. With [Bing Chat](http://chat.bing.com/), for example, this is possible, as it can understand and reference the previous responses.

There is a common misconception that GPT models have memory. This is not true. While they possess knowledge, they do not retain information from previous questions asked to them.

In this Notebook, our goal is to illustrate how we can effectively "endow the LLM with memory" by employing prompts and context.

In [33]:
import os
import random
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from openai.error import OpenAIError
from langchain.embeddings import OpenAIEmbeddings
from langchain.docstore.document import Document
from langchain.memory import CosmosDBChatMessageHistory

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string))

#custom libraries that we will use later in the app
from common.utils import (
    get_search_results,
    update_vector_indexes,
    model_tokens_limit,
    num_tokens_from_docs,
    num_tokens_from_string,
    get_answer,
)

from common.prompts import COMBINE_CHAT_PROMPT_TEMPLATE
COMBINE_CHAT_PROMPT_TEMPLATE = """
Given the following: 
- a chat history, and a question from the Human
- extracted parts from several documents 

Instructions:
- Create a final answer with references. 
- You can only provide numerical references to documents, using this html format: `<sup><a href="url?query_parameters" target="_blank">[number]</a></sup>`.
- The reference must be from the `Source:` section of the extracted parts. You are not to make a reference from the content, only from the `Source:` of the extract parts.
- Reference (source) document's url can include query parameters, for example: "https://example.com/search?query=apple&category=fruits&sort=asc&page=1". On these cases, **you must** include que query references on the document url, using this html format: <sup><a href="url?query_parameters" target="_blank">[number]</a></sup>.
- **You can only answer the question from information contained in the extracted parts below**, DO NOT use your prior knowledge.
- Never provide an answer without references.
- If you don't know the answer, just say that you don't know. Don't try to make up an answer.
- Respond in {language}.

Chat History:

{chat_history}

HUMAN: {question}
=========
{summaries}
=========
AI:"""

from dotenv import load_dotenv
load_dotenv("credentials.env")

import logging

# Get the root logger
logger = logging.getLogger()
# Set the logging level to a higher level to ignore INFO messages
logger.setLevel(logging.WARNING)
print(COMBINE_CHAT_PROMPT_TEMPLATE)


Given the following: 
- a chat history, and a question from the Human
- extracted parts from several documents 

Instructions:
- Create a final answer with references. 
- You can only provide numerical references to documents, using this html format: `<sup><a href="url?query_parameters" target="_blank">[number]</a></sup>`.
- The reference must be from the `Source:` section of the extracted parts. You are not to make a reference from the content, only from the `Source:` of the extract parts.
- Reference (source) document's url can include query parameters, for example: "https://example.com/search?query=apple&category=fruits&sort=asc&page=1". On these cases, **you must** include que query references on the document url, using this html format: <sup><a href="url?query_parameters" target="_blank">[number]</a></sup>.
- **You can only answer the question from information contained in the extracted parts below**, DO NOT use your prior knowledge.
- Never provide an answer without references.


In [34]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_BASE"] = os.environ["AZURE_OPENAI_ENDPOINT"]
os.environ["OPENAI_API_KEY"] = os.environ["AZURE_OPENAI_API_KEY"]
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]
os.environ["OPENAI_API_TYPE"] = "azure"

### Let's start with the basics
Let's use a very simple example to see whether the Azure OpenAI GPT models have memory. We will again use LangChain to simplify our code.

In [35]:
QUESTION = "Tell me some use cases for reinforcement learning"
FOLLOW_UP_QUESTION = "Give me the main points of our conversation"

In [36]:
# Define model
MODEL = "gpt-35-turbo-16k"
COMPLETION_TOKENS = 1000
# Create an OpenAI instance
llm = AzureChatOpenAI(deployment_name=MODEL, temperature=0.5, max_tokens=COMPLETION_TOKENS)

In [37]:
# We create a very simple prompt template, just the question as is:
prompt = PromptTemplate(
    input_variables=["question"],
    template="{question}",
)

chain = LLMChain(llm=llm, prompt=prompt)

In [38]:
# Let's see what the GPT model responds
response = chain.run(QUESTION)
printmd(response)

Sure! Here are some use cases for reinforcement learning:

1. Game playing: Reinforcement learning has been successful in training agents to play complex games like chess, Go, and Dota 2, achieving superhuman performance.

2. Robotics: Reinforcement learning can be used to train robots to perform tasks like object manipulation, grasping, and locomotion, enabling them to learn from trial and error in real-world environments.

3. Autonomous vehicles: Reinforcement learning can be applied to train self-driving cars to make decisions on the road, such as lane changing, merging, and navigating complex traffic scenarios.

4. Recommendation systems: Reinforcement learning can be used to personalize recommendations for users by learning their preferences and optimizing the recommendations based on user feedback.

5. Healthcare: Reinforcement learning can help optimize treatment plans for patients by learning from patient data and medical records, leading to personalized and more effective treatment strategies.

6. Finance: Reinforcement learning can be utilized in algorithmic trading to make trading decisions based on market conditions and historical data, optimizing portfolio management and risk management.

7. Energy management: Reinforcement learning can be used to optimize energy consumption in buildings or power grids by learning from real-time data, resulting in more efficient energy usage and cost savings.

8. Supply chain management: Reinforcement learning can be applied to optimize inventory management, pricing, and distribution decisions in supply chain networks, leading to improved efficiency and reduced costs.

9. Advertising: Reinforcement learning can be used to optimize online advertising strategies by learning from user interactions and feedback, improving ad targeting and maximizing advertising revenue.

10. Drug discovery: Reinforcement learning can assist in the discovery and optimization of new drugs by exploring vast chemical space and predicting the effectiveness of potential drug candidates.

These are just a few examples, and reinforcement learning has potential applications in various domains where decision-making and optimization problems exist.

In [39]:
#Now let's ask a follow up question
chain.run(FOLLOW_UP_QUESTION)

'1. The conversation was about the importance of effective communication in professional and personal relationships.\n\n2. We discussed how communication skills can impact various aspects of life, such as career success, relationship building, and problem-solving.\n\n3. We explored different strategies for improving communication skills, including active listening, empathy, and clarity in expressing thoughts and emotions.\n\n4. We also talked about the challenges that can hinder effective communication, such as misunderstandings, cultural differences, and lack of self-awareness.\n\n5. The significance of non-verbal communication, such as body language and tone of voice, was highlighted in our conversation.\n\n6. We touched upon the role of technology in communication and how it can both facilitate and hinder effective communication.\n\n7. Overall, the conversation emphasized the importance of continuously developing and honing communication skills to foster positive relationships and a

As you can see, the model doesn't remember what it just answered. Sometimes it responds based only on the system prompt, and sometimes it answers with something seemingly random. This shows that the LLM does NOT have memory, and that we need to provide the memory ourselves by passing the conversation history as part of the prompt, like this:

In [40]:
hist_prompt = PromptTemplate(
    input_variables=["history", "question"],
    template="""
                {history}
                Human: {question}
                AI:
            """
    )
chain = LLMChain(llm=llm, prompt=hist_prompt)

In [41]:
Conversation_history = """
Human: {question}
AI: {response}
""".format(question=QUESTION, response=response)

In [42]:
printmd(chain.run({"history":Conversation_history, "question": FOLLOW_UP_QUESTION}))

- Reinforcement learning can be used in various domains such as game playing, robotics, autonomous vehicles, recommendation systems, healthcare, finance, energy management, supply chain management, advertising, and drug discovery.
- Reinforcement learning has been successful in training agents to play complex games and achieve superhuman performance.
- It can be used to train robots to perform tasks and learn from trial and error in real-world environments.
- Reinforcement learning can be applied to self-driving cars to make decisions on the road.
- It can personalize recommendations for users and optimize them based on user feedback.
- Reinforcement learning can optimize treatment plans for patients and lead to more effective treatment strategies.
- It can be utilized in algorithmic trading to make trading decisions and optimize portfolio management and risk management.
- Reinforcement learning can optimize energy consumption in buildings or power grids and result in more efficient energy usage and cost savings.
- It can optimize inventory management, pricing, and distribution decisions in supply chain networks.
- Reinforcement learning can optimize online advertising strategies and improve ad targeting and advertising revenue.
- It can assist in the discovery and optimization of new drugs by exploring chemical space and predicting the effectiveness of potential drug candidates.

**Bingo!** We now know how to create a chatbot using LLMs: we just need to keep the state/history of the conversation and pass it as context every time.
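The same idea can be sketched without any LLM library. The block below is a minimal illustration, not the notebook's implementation: `ChatSession` and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in for a real model call that simply reports how many earlier questions it can see in the prompt.

```python
from typing import Callable, List, Tuple


class ChatSession:
    """Keeps the running transcript and resends it with every new question."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm  # any function that maps a prompt string to an answer string
        self.turns: List[Tuple[str, str]] = []  # (question, answer) pairs so far

    def build_prompt(self, question: str) -> str:
        # Same layout as the history prompt above: earlier turns first, new question last.
        history = "".join(f"Human: {q}\nAI: {a}\n" for q, a in self.turns)
        return f"{history}Human: {question}\nAI:"

    def ask(self, question: str) -> str:
        answer = self.llm(self.build_prompt(question))
        self.turns.append((question, answer))  # the state lives here, not in the model
        return answer


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in model: counts the earlier Human turns visible in the prompt."""
    return f"I can see {prompt.count('Human:') - 1} earlier question(s)."


session = ChatSession(echo_llm)
first = session.ask("Tell me some use cases for reinforcement learning")
second = session.ask("Give me the main points of our conversation")
print(first)   # I can see 0 earlier question(s).
print(second)  # I can see 1 earlier question(s).
```

The stand-in model is stateless, yet the second answer reflects the first turn, because the session object resends the transcript on every call.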

## Now that we understand the concept of memory via adding history as a context, let's go back to our GPT Smart Search engine

In [43]:
# Since memory adds tokens to the prompt, we need a model with a larger context window
MODEL = "gpt-35-turbo-16k"
COMPLETION_TOKENS = 1000
llm = AzureChatOpenAI(deployment_name=MODEL, temperature=0.5, max_tokens=COMPLETION_TOKENS)
embedder = OpenAIEmbeddings(deployment="text-embedding-ada-002", chunk_size=1) 

In [44]:
index1_name = "cogsrch-index-files"
#index2_name = "cogsrch-index-csv"
index3_name = "cogsrch-index-books-vector"
text_indexes = [index1_name]
vector_indexes = [index+"-vector" for index in text_indexes] + [index3_name]
QUESTION = "What is a bonus?"

In [45]:
%%time

# Search in text-based indexes first and update vector indexes
k=4 # Top k results per each text-based index
ordered_results = get_search_results(QUESTION, text_indexes, k=k, reranker_threshold=1, vector_search=False)
update_vector_indexes(ordered_search_results=ordered_results, embedder=embedder)

# Search in all vector-based indexes available
similarity_k = 4 # top results from multi-vector-index similarity search
ordered_results = get_search_results(QUESTION, vector_indexes, k=k, vector_search=True,
                                        similarity_k=similarity_k,
                                        query_vector = embedder.embed_query(QUESTION))
print("Number of results:",len(ordered_results))

Number of results: 4
CPU times: user 187 ms, sys: 9.11 ms, total: 196 ms
Wall time: 1.27 s


In [46]:
# Inspect the ordered results
ordered_results

OrderedDict([('aUNvcmUtQm9udXMtTWFuYWdlbWVudC1Vc2VyLU1hbnVhbC5wZGY4',
              {'title': 'iCore-Bonus-Management-User-Manual.pdf_page_8',
               'name': 'iCore-Bonus-Management-User-Manual.pdf',
               'location': 'https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf',
               'caption': 'Bonusing overview The Bonusing module is a part of the iCore platform that is designed for management of bonus configurations.',
               'index': 'cogsrch-index-books-vector',
               'chunk': ':unselected:\niCore Bonus Management User Manual\nCOMTRADE GAMING\n1. Bonusing overview\nThe Bonusing module is a part of the iCore platform that is designed for management of bonus configurations. This module enables bonusing across multiple content providers and brands, regardless of the used game channel ("Desktop web", "Desktop app", "Mobile web", "Mobile app", "Terminal"), which is key feature for cross-selling 

In [47]:
top_docs = []
for key,value in ordered_results.items():
    location = value["location"] if value["location"] is not None else ""
    top_docs.append(Document(page_content=value["chunk"], metadata={"source": location+os.environ['BLOB_SAS_TOKEN']}))
        
print("Number of chunks:",len(top_docs))

Number of chunks: 4


In [48]:
# Calculate number of tokens of our docs
if(len(top_docs)>0):
    tokens_limit = model_tokens_limit(MODEL) # this is a custom function we created in common/utils.py
    prompt_tokens = num_tokens_from_string(COMBINE_CHAT_PROMPT_TEMPLATE) # this is a custom function we created in common/utils.py
    context_tokens = num_tokens_from_docs(top_docs) # this is a custom function we created in common/utils.py
    
    requested_tokens = prompt_tokens + context_tokens + COMPLETION_TOKENS
    
    chain_type = "map_reduce" if requested_tokens > 0.9 * tokens_limit else "stuff"  
    
    print(COMBINE_CHAT_PROMPT_TEMPLATE)
    print("System prompt token count:",prompt_tokens)
    print("Max Completion Token count:", COMPLETION_TOKENS)
    print("Combined docs (context) token count:",context_tokens)
    print("--------")
    print("Requested token count:",requested_tokens)
    print("Token limit for", MODEL, ":", tokens_limit)
    print("Chain Type selected:", chain_type)
        
else:
    print("NO RESULTS FROM AZURE SEARCH")


Given the following: 
- a chat history, and a question from the Human
- extracted parts from several documents 

Instructions:
- Create a final answer with references. 
- You can only provide numerical references to documents, using this html format: `<sup><a href="url?query_parameters" target="_blank">[number]</a></sup>`.
- The reference must be from the `Source:` section of the extracted parts. You are not to make a reference from the content, only from the `Source:` of the extract parts.
- Reference (source) document's url can include query parameters, for example: "https://example.com/search?query=apple&category=fruits&sort=asc&page=1". On these cases, **you must** include que query references on the document url, using this html format: <sup><a href="url?query_parameters" target="_blank">[number]</a></sup>.
- **You can only answer the question from information contained in the extracted parts below**, DO NOT use your prior knowledge.
- Never provide an answer without references.
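Once a chat history is part of the prompt, its tokens also count against the model limit. The sketch below extends the chain-type rule from the cell above with a history term. `select_chain_type` and `history_tokens` are hypothetical names introduced for illustration, and the numbers in the usage lines are made up rather than taken from this notebook's run; only the 0.9 threshold comes from the cell above.

```python
def select_chain_type(prompt_tokens: int, context_tokens: int, history_tokens: int,
                      completion_tokens: int, tokens_limit: int,
                      safety_margin: float = 0.9) -> str:
    """Return "stuff" when everything fits in one call, otherwise "map_reduce"."""
    requested = prompt_tokens + context_tokens + history_tokens + completion_tokens
    return "map_reduce" if requested > safety_margin * tokens_limit else "stuff"


# Illustrative numbers only (assumed, not measured):
no_history = select_chain_type(300, 6000, 0, 1000, tokens_limit=16384)
long_history = select_chain_type(300, 6000, 8000, 1000, tokens_limit=16384)
print(no_history)    # stuff
print(long_history)  # map_reduce
```

With these assumed values, the same documents fit in a single call when the conversation is new, but a long history pushes the request over the threshold.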


In [49]:
%%time
# Get the answer
response = get_answer(llm=llm, docs=top_docs, query=QUESTION, language="English", chain_type=chain_type)
printmd(response['output_text'])

A bonus is a feature in the iCore platform that allows for the management of bonus configurations. It enables the offering of bonuses to eligible players based on specified criteria such as awarding, expiry, and wagering requirements. Bonuses can be awarded automatically or manually, and there is an option to allow multiple instances of a bonus. The bonus amount can be set as a percentage of the deposit amount or a fixed or dynamic amount. Recurring bonus funds can be offered to players based on specific time constraints, such as daily recurrence or no recurrence. The awarding of bonuses can be based on various conditions, including deposit amount, change of VIP level, and wager, net loss, or net win. The iCore Bonus Management User Manual provides detailed information on the configuration and management of bonuses<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D">[1]</a></sup><sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D">[2]</a></sup><sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D">[3]</a></sup>.

CPU times: user 6.91 ms, sys: 335 µs, total: 7.25 ms
Wall time: 7.26 s


And if we ask the follow up question:

In [50]:
FOLLOW_UP_QUESTION = "What kind of bonuses are possible?"
response = get_answer(llm=llm, docs=top_docs,  query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type)
printmd(response['output_text'])

The possible types of bonuses are: 
- Automatic: awarded to eligible players who meet the eligibility and/or awarding criteria.
- Manual: awarded manually through the Give bonus action or via the Award Bonus via CSV page.
- Allow Multiple Instances of Bonus: allows for recurring bonus funds offers to be credited to eligible players multiple times.
- No recurrence: no time constraints on recurrent bonus funds offers.
- Daily: recurrent bonus funds offers can take place on selected days of the week.
- All day: bonus awarding conditions are checked throughout the calendar day.

For more detailed information, please refer to the iCore Bonus Management User Manual, specifically the sections on "Bonusing Overview" and "Awarding" <sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D">[1]</a></sup>.

You might get a different response from the one above. Whatever response you get, it will be based on the context given, not on previous answers.

So far we have the same as in the prior Notebook 03: results from Azure Search enhanced by an OpenAI model, with no memory.

**Now let's add memory to it:**

Reference: https://python.langchain.com/docs/modules/memory/how_to/adding_memory_chain_multiple_inputs

In [51]:
# Memory object, which is necessary to track the inputs/outputs and hold a conversation.
memory = ConversationBufferMemory(memory_key="chat_history",input_key="question")

response = get_answer(llm=llm, docs=top_docs, query=QUESTION, language="English", chain_type=chain_type, 
                        memory=memory)
printmd(response['output_text'])

A bonus is a feature in the iCore platform that allows for the management of bonus configurations. It enables the offering of bonuses to players based on certain criteria, such as eligibility, awarding conditions, expiry, and wagering requirements. Bonuses can be offered across multiple content providers and brands, regardless of the game channel used.

The iCore Bonus Management User Manual provides detailed information on how to configure and manage bonuses. Here are some key points from the manual:

- The Bonusing module in iCore offers various selections in the navigation pane, including finding bonus configurations, creating new bonuses, managing granted bonuses, and more.
- Bonus configurations can have incremental naming enabled or disabled. When enabled, a prefix is added to the bonus configuration name, consisting of a product type identifier and a consecutive number.
- Bonus configurations can have friendly names and descriptions, which are alternative names and additional descriptions used for indicating the bonus award on the player portal.
- The start and end dates of a bonus configuration determine when the bonus starts offering and awarding bonuses to players and when it stops awarding bonuses.
- Bonuses can be linked to campaigns in the Dynamics CRM system for reporting purposes.
- Bonus configurations can have categories to make searching for bonus configurations easier, such as "First deposit," "Reload," "Rebate," "Cashback," "Free spins," "Award games," and "Freebet."
- The Player enrollment property determines whether players need to accept the bonus terms and conditions before being awarded the bonus.
- The Display T&C when awarding criteria met property determines whether the terms and conditions pop-up message is displayed to players after they meet the awarding requirements.
- The Bonus was awarded property determines whether the "Bonus was awarded" template is displayed to players.

These are just a few highlights from the iCore Bonus Management User Manual. For more detailed information, you can refer to the manual itself, which can be found at [1].

I hope this helps! Let me know if you have any more questions.

[1]<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>

In [52]:
# Now we add a follow up question:
response = get_answer(llm=llm, docs=top_docs, query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type, 
                      memory=memory)
printmd(response['output_text'])

Based on the information from the iCore Bonus Management User Manual, there are several types of bonuses that are possible. Here are some examples:

1. Deposit Bonus: This type of bonus is awarded to players when they make a deposit. The bonus amount can be a percentage of the deposit or a fixed amount. The bonus can be awarded automatically or manually, depending on the configuration.

2. Wager Bonus: This type of bonus is awarded to players based on their wagering activity. The bonus amount is determined by the total amount wagered by the player within a specific time period.

3. Net Loss Bonus: This type of bonus is awarded to players based on their net losses. The bonus amount is determined by the total net losses incurred by the player within a specific time period.

4. Net Win Bonus: This type of bonus is awarded to players based on their net winnings. The bonus amount is determined by the total net winnings achieved by the player within a specific time period.

5. VIP Level Change Bonus: This type of bonus is awarded to players when they change their VIP level. The bonus amount and eligibility criteria can be customized based on the specific VIP levels.

These are just a few examples of the types of bonuses that can be configured in the iCore platform. For more detailed information, you can refer to the iCore Bonus Management User Manual, which can be found at [1].

I hope this answers your question. Let me know if you have any more questions.

[1]<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>

In [53]:
# Another follow up query
response = get_answer(llm=llm, docs=top_docs, query="Thank you", language="English", chain_type=chain_type,  
                      memory=memory)
printmd(response['output_text'])

Based on the information from the iCore Bonus Management User Manual, there are several types of bonuses that are possible. Here are some examples:

1. Deposit Bonus: This type of bonus is awarded to players when they make a deposit. The bonus amount can be a percentage of the deposit or a fixed amount. The bonus can be awarded automatically or manually, depending on the configuration.

2. Wager Bonus: This type of bonus is awarded to players based on their wagering activity. The bonus amount is determined by the total amount wagered by the player within a specific time period.

3. Net Loss Bonus: This type of bonus is awarded to players based on their net losses. The bonus amount is determined by the total net losses incurred by the player within a specific time period.

4. Net Win Bonus: This type of bonus is awarded to players based on their net winnings. The bonus amount is determined by the total net winnings achieved by the player within a specific time period.

5. VIP Level Change Bonus: This type of bonus is awarded to players when they change their VIP level. The bonus amount and eligibility criteria can be customized based on the specific VIP levels.

These are just a few examples of the types of bonuses that can be configured in the iCore platform. For more detailed information, you can refer to the iCore Bonus Management User Manual, which can be found at [1].

I hope this answers your question. Let me know if you have any more questions.

[1]<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>

You might get a different answer in the cell above, and that is OK: this bot is not yet well configured to answer questions that are not related to its knowledge base, including salutations.

Let's check our memory to see that it is keeping the conversation:

In [54]:
memory.buffer

'Human: What is a bonus?\nAI: A bonus is a feature in the iCore platform that allows for the management of bonus configurations. It enables the offering of bonuses to players based on certain criteria, such as eligibility, awarding conditions, expiry, and wagering requirements. Bonuses can be offered across multiple content providers and brands, regardless of the game channel used.\n\nThe iCore Bonus Management User Manual provides detailed information on how to configure and manage bonuses. Here are some key points from the manual:\n\n- The Bonusing module in iCore offers various selections in the navigation pane, including finding bonus configurations, creating new bonuses, managing granted bonuses, and more.\n- Bonus configurations can have incremental naming enabled or disabled. When enabled, a prefix is added to the bonus configuration name, consisting of a product type identifier and a consecutive number.\n- Bonus configurations can have friendly names and descriptions, which are
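Because a buffer like this grows with every turn, a long conversation can eventually exceed the prompt budget. One common mitigation, sketched here as an assumption rather than something this notebook does, is to keep only the most recent turns that fit within a token allowance. `trim_history` and `word_count` are hypothetical helpers; `word_count` is a crude stand-in for a real tokenizer-based counter, and the sample turns are illustrative text, not model output.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (question, answer)


def trim_history(turns: List[Turn], max_tokens: int,
                 count_tokens: Callable[[str], int]) -> List[Turn]:
    """Keep the newest turns whose combined size stays within max_tokens."""
    kept: List[Turn] = []
    used = 0
    for question, answer in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(f"Human: {question}\nAI: {answer}\n")
        if used + cost > max_tokens:
            break  # this turn and everything older is dropped
        kept.append((question, answer))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def word_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


turns = [
    ("What is a bonus?", "A bonus is a configurable reward."),
    ("What kind of bonuses are possible?", "Deposit and wager bonuses."),
    ("Thank you", "You are welcome."),
]
recent = trim_history(turns, max_tokens=20, count_tokens=word_count)
print(recent)
```

Dropping the oldest turns first is only one possible policy; summarizing older turns is another, at the cost of an extra model call.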

## Using CosmosDB as persistent memory

In the previous cells we added local in-RAM memory to our chatbot. However, it is not persistent: it is deleted once the app user's session is terminated. We therefore need a database for persistent storage of each bot user's conversations, not only for analytics and auditing, but also if we wish to provide recommendations.

Here we will store the conversation history in CosmosDB for future auditing purposes.
We will use a LangChain class called CosmosDBChatMessageHistory, see [HERE](https://python.langchain.com/en/latest/_modules/langchain/memory/chat_message_histories/cosmos_db.html)
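To make the idea of a persistent, session-keyed history concrete, here is a minimal sketch that uses SQLite from the Python standard library as a stand-in for CosmosDB. It is not how CosmosDBChatMessageHistory is implemented; `SQLiteChatHistory` and its methods are hypothetical names, and the stored messages are illustrative. The point is only that messages written under one session and user can be read back by a new object after the first one is gone.

```python
import os
import sqlite3
import tempfile
from typing import List, Tuple


class SQLiteChatHistory:
    """Stores chat messages on disk, keyed by session_id and user_id."""

    def __init__(self, path: str, session_id: str, user_id: str) -> None:
        self.conn = sqlite3.connect(path)
        self.session_id = session_id
        self.user_id = user_id
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT, user_id TEXT, role TEXT, content TEXT)"
        )
        self.conn.commit()

    def add_message(self, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (session_id, user_id, role, content) VALUES (?, ?, ?, ?)",
            (self.session_id, self.user_id, role, content),
        )
        self.conn.commit()

    def load_messages(self) -> List[Tuple[str, str]]:
        # Return this session's messages in the order they were written.
        rows = self.conn.execute(
            "SELECT role, content FROM messages "
            "WHERE session_id = ? AND user_id = ? ORDER BY id",
            (self.session_id, self.user_id),
        ).fetchall()
        return [(role, content) for role, content in rows]

    def close(self) -> None:
        self.conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "chat.db")

    first = SQLiteChatHistory(db_path, "session-1", "user-1")
    first.add_message("human", "What is a bonus?")
    first.add_message("ai", "An illustrative answer.")
    first.close()  # simulates the end of the app session

    second = SQLiteChatHistory(db_path, "session-1", "user-1")
    restored = second.load_messages()  # the conversation survives
    other = SQLiteChatHistory(db_path, "session-2", "user-1")
    other_msgs = other.load_messages()  # a different session sees nothing
    second.close()
    other.close()

print(restored)
print(other_msgs)
```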

In [55]:
# Create CosmosDB instance from langchain cosmos class.
cosmos = CosmosDBChatMessageHistory(
    cosmos_endpoint=os.environ['AZURE_COSMOSDB_ENDPOINT'],
    cosmos_database=os.environ['AZURE_COSMOSDB_NAME'],
    cosmos_container=os.environ['AZURE_COSMOSDB_CONTAINER_NAME'],
    connection_string=os.environ['AZURE_COMOSDB_CONNECTION_STRING'],
    session_id="Agent-Test-Session" + str(random.randint(1, 1000)),
    user_id="Agent-Test-User" + str(random.randint(1, 1000))
    )

# prepare the cosmosdb instance
cosmos.prepare_cosmos()

In [56]:
# Create our Memory Object
memory = ConversationBufferMemory(memory_key="chat_history",input_key="question",chat_memory=cosmos)

In [57]:
# Testing using our Question
response = get_answer(llm=llm, docs=top_docs, query="What is a procedure for a single game?", language="English", chain_type=chain_type, 
                        memory=memory)
printmd(response['output_text'])

The procedure for creating a single game bonus configuration in the iCore platform is as follows:

1. Open the Bonusing module in the iCore platform.
2. Navigate to the "Create New Bonus" page.
3. Specify the bonus configuration details, such as the brand and product type.
4. Set the start and end dates for the bonus configuration.
5. Choose the applicable campaign from the Dynamics CRM system (if integrated).
6. Select the bonus category, such as "First deposit," "Reload," "Rebate," etc.
7. Provide a general description of the bonus configuration.
8. Define the player enrollment settings, such as whether the player needs to accept or decline the bonus.
9. Configure the display of terms and conditions when the awarding criteria are met.
10. Set the option for the "Bonus was awarded" template display.
11. Configure the award conditions for the bonus, such as wager amount, net loss, net win, or change of VIP level.
12. Specify the award type as either "Automatic" (system rules) or "Manual" (given by administrators).
13. Determine if multiple instances of the bonus are allowed.
14. Set the bonus recurrence options if multiple instances are allowed.
15. Save the bonus configuration as a draft.

Please note that these steps are based on the information extracted from the iCore Bonus Management User Manual<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>. For more detailed instructions and information, please refer to the user manual.

Is there anything else I can assist you with?

In [58]:
# Now we add a follow up question:
response = get_answer(llm=llm, docs=top_docs, query=FOLLOW_UP_QUESTION, language="English", chain_type=chain_type, 
                      memory=memory)
printmd(response['output_text'])

Based on the information extracted from the iCore Bonus Management User Manual<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>, the following types of bonuses are possible:

1. First deposit
2. Reload
3. Rebate
4. Cashback
5. Free spins
6. Award games
7. Freebet

These bonus configurations allow you to offer bonuses to players when they are eligible. You can set configuration parameters for awarding, expiry, and wagering requirements, and manage bonuses after they have been awarded to individual players.

Please note that the information provided is based on the extracted parts from the iCore Bonus Management User Manual. For more detailed instructions and information, please refer to the user manual<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>.

Let me know if there's anything else I can help you with!

In [59]:
# Another follow up query
response = get_answer(llm=llm, docs=top_docs, query="Thank you", language="English", chain_type=chain_type,  
                      memory=memory)
printmd(response['output_text'])

Based on the information extracted from the iCore Bonus Management User Manual<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>, the following types of bonuses are possible:

1. First deposit
2. Reload
3. Rebate
4. Cashback
5. Free spins
6. Award games
7. Freebet

These bonus configurations allow you to offer bonuses to players when they are eligible. You can set configuration parameters for awarding, expiry, and wagering requirements, and manage bonuses after they have been awarded to individual players.

Please note that the information provided is based on the extracted parts from the iCore Bonus Management User Manual. For more detailed instructions and information, please refer to the user manual<sup><a href="https://kajetanstorage.blob.core.windows.net/fileupload-ix-kajetan2/iCore-Bonus-Management-User-Manual.pdf?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2024-11-01T18:16:38Z&st=2023-11-30T10:16:38Z&spr=https&sig=Kkx8BJ9j5pTgD4EeEeWI7b4FR%2Bk54%2FS19hW5gIllaBQ%3D" target="_blank">[1]</a></sup>.

Let me know if there's anything else I can help you with!

Let's check our Azure Cosmos DB container to see the whole conversation:


In [60]:
# Load the stored messages from Cosmos DB
cosmos.load_messages()
cosmos.messages

[HumanMessage(content='What is a procedure for a single game?'),
 AIMessage(content='The procedure for creating a single game bonus configuration in the iCore platform is as follows:\n\n1. Open the Bonusing module in the iCore platform.\n2. Navigate to the "Create New Bonus" page.\n3. Specify the bonus configuration details, such as the brand and product type.\n4. Set the start and end dates for the bonus configuration.\n5. Choose the applicable campaign from the Dynamics CRM system (if integrated).\n6. Select the bonus category, such as "First deposit," "Reload," "Rebate," etc.\n7. Provide a general description of the bonus configuration.\n8. Define the player enrollment settings, such as whether the player needs to accept or decline the bonus.\n9. Configure the display of terms and conditions when the awarding criteria are met.\n10. Set the option for the "Bonus was awarded" template display.\n11. Configure the award conditions for the bonus, such as wager amount, net loss, net win, 

![CosmosDB Memory](./images/cosmos-chathistory.png)

# Summary
##### Adding memory to our application lets the user hold a conversation. However, memory is not something that comes with the LLM; we must supply it ourselves, as part of the context sent along with each question.

We added persistent memory using CosmosDB.
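
To make the idea of "memory as context" concrete, here is a minimal sketch in plain Python. It is not the code inside `get_answer`. The helper `build_prompt`, its `max_turns` parameter, and the follow-up question text are hypothetical and chosen only for illustration. It shows that the "memory" is simply the earlier turns prepended to the new question before the prompt is sent to the model.

```python
from typing import List, Tuple


def build_prompt(history: List[Tuple[str, str]], question: str, max_turns: int = 3) -> str:
    """Hypothetical helper: prepend the most recent turns to the new question."""
    # Keep only the last `max_turns` entries so the prompt stays within the token limit.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [f"{role}: {text}" for role, text in recent]
    lines.append(f"Human: {question}")
    lines.append("AI:")  # the model continues the text from here
    return "\n".join(lines)


# The first turn comes from the conversation above; the answer is abbreviated.
history = [
    ("Human", "What is a procedure for a single game?"),
    ("AI", "The procedure for creating a single game bonus configuration ..."),
]

# Illustrative follow-up question (an assumption, not taken from the notebook).
prompt = build_prompt(history, "Which bonus categories can I choose?")
print(prompt)
```

Because the model only ever sees this one block of text, dropping the history lines is the same as "forgetting" the conversation. Storing the turns in CosmosDB just means the history survives across sessions.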

We can also see that the chain we are using is smart, but only up to a point. Even with memory, it searches for similar documents every time, regardless of the input (note how the reply to "Thank you" above simply repeated the previous answer). It struggles with prompts such as Hello, Thank you, Bye, What's your name, What's the weather, and any other request that does not call for a search in the knowledge base.
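
The limitation can be illustrated with a deliberately naive sketch. This is not how the notebook's chain works and not a recommended solution. It is a simple keyword heuristic, with the hypothetical names `SMALL_TALK` and `needs_search`, that shows what "deciding whether to search" would mean before calling the retrieval step.

```python
import re

# Hypothetical list of inputs that should not trigger a knowledge-base search.
SMALL_TALK = {"hello", "hi", "thank you", "thanks", "bye", "goodbye"}


def needs_search(user_input: str) -> bool:
    """Return False for simple greetings or thanks, True for everything else."""
    # Lowercase and drop punctuation so "Thank you!" matches "thank you".
    normalized = re.sub(r"[^a-z\s]", "", user_input.lower()).strip()
    return normalized not in SMALL_TALK


for text in ["Thank you", "What is a procedure for a single game?"]:
    action = "search the knowledge base" if needs_search(text) else "reply without searching"
    print(f"{text!r} -> {action}")
```

An exact-match list like this breaks down quickly: "What's the weather" would still be sent to the search step. A fixed chain cannot make this decision reliably on its own.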


## <u>Important Note</u>:<br>
As we proceed, while all the code will remain compatible with GPT-3.5 models, we highly recommend transitioning to GPT-4. Here's why:

**GPT-3.5-Turbo** can be likened to a 7-year-old child. You can provide it with concise instructions, but it frequently struggles to follow them accurately. Additionally, its limited memory can make sustained conversations challenging.

**GPT-3.5-Turbo-16k** resembles the same 7-year-old, but with an increased attention span for longer instructions. However, it still faces difficulties accurately executing them about half the time.

**GPT-4** exhibits the capabilities of a 10-12-year-old child. It possesses enhanced reasoning skills and more consistently adheres to instructions. While its memory retention for instructions is moderate, it excels at following them.

**GPT-4-32k** is akin to the 10-12-year-old child with an extended memory. It comprehends lengthy sets of instructions and engages in meaningful conversations. Thanks to its robust memory, it offers detailed responses.

This analogy will become clearer as you complete the final notebook.


# NEXT
We now know how to build a Smart Search Engine that can power a chatbot. Great!

But does this cover every scenario that a virtual assistant will face? **What if the answer is not found in text, but instead requires looking into tabular data?** The next notebook explains and solves the tabular problem and introduces the concept of Agents.