# Understanding Memory in LLMs

In the previous Notebooks, we successfully explored how OpenAI models can enhance the results from Azure AI Search queries. 

However, we have yet to discover how to engage in a conversation with the LLM. With [Microsoft Copilot](http://chat.bing.com/), for example, this is possible, as it can understand and reference the previous responses.

There is a common misconception that LLMs (Large Language Models) have memory. This is not true. While they possess knowledge, they do not retain information from previous questions asked to them.

In this Notebook, our goal is to illustrate how we can effectively "endow the LLM with memory" by employing prompts and context.

In [1]:
import os
import random
from langchain_community.chat_message_histories import ChatMessageHistory, CosmosDBChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables import ConfigurableFieldSpec
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from operator import itemgetter
from typing import List

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string))

#custom libraries that we will use later in the app
from common.utils import CustomAzureSearchRetriever, get_answer
from common.prompts import DOCSEARCH_PROMPT_TEXT

from dotenv import load_dotenv
load_dotenv("credentials.env")

import logging

# Get the root logger
logger = logging.getLogger()
# Set the logging level to a higher level to ignore INFO messages
logger.setLevel(logging.WARNING)

In [2]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

### Let's start with the basics
Let's use a very simple example to see whether the Azure OpenAI GPT model has memory. We will again use LangChain to simplify our code.

In [30]:
QUESTION = "tell me chinese medicines that help fight covid-19"
FOLLOW_UP_QUESTION = "What was my prior question?"

In [46]:
COMPLETION_TOKENS = 1000
# Create an OpenAI instance
llm = AzureChatOpenAI(deployment_name=os.environ["GPT4o_DEPLOYMENT_NAME"], 
                      temperature=0.5, max_tokens=COMPLETION_TOKENS)

In [47]:
# We create a very simple prompt template, just the question as is:
output_parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant that gives thorough responses to users."),
    ("user", "{input}")
])

In [48]:
# Let's see what the GPT model responds
chain = prompt | llm | output_parser
response_to_initial_question = chain.invoke({"input": QUESTION})
display(Markdown(response_to_initial_question))

Chinese medicine has a long history and a rich tradition of using herbs and other natural substances to treat various ailments. During the COVID-19 pandemic, some Traditional Chinese Medicine (TCM) practices and formulations have been explored for their potential to support the immune system and alleviate symptoms. It's important to note that while TCM may provide supportive care, it should not replace conventional medical treatments for COVID-19. Always consult with a healthcare professional before starting any new treatment.

Here are some TCM herbs and formulations that have been studied or used in the context of COVID-19:

### 1. **Lianhua Qingwen Capsule (连花清瘟胶囊)**
- **Ingredients:** Forsythia fruit, Honeysuckle flower, Ephedra, Bitter Apricot Seed, Gypsum, Isatis root, Dryopteris root, Rhubarb, Houttuynia, Patchouli, Rhodiola, and Menthol.
- **Uses:** This formulation is often used for treating flu-like symptoms, including fever, cough, and fatigue. Some studies suggest it may help alleviate symptoms of COVID-19.

### 2. **Shuanghuanglian Oral Liquid (双黄连口服液)**
- **Ingredients:** Honeysuckle, Scutellaria, and Forsythia.
- **Uses:** Traditionally used to clear heat and detoxify, it has been explored for its potential antiviral properties.

### 3. **Jinhua Qinggan Granule (金花清感颗粒)**
- **Ingredients:** Honeysuckle, Forsythia, Ephedra, Bitter Apricot Seed, Gypsum, Isatis root, Dryopteris root, Rhubarb, Houttuynia, Patchouli, and Rhodiola.
- **Uses:** Used for treating respiratory infections and reducing fever and cough.

### 4. **Qingfei Paidu Decoction (清肺排毒汤)**
- **Ingredients:** A combination of 21 herbs including Ephedra, Licorice, Gypsum, Cinnamon Twig, and others.
- **Uses:** This formulation has been recommended by the Chinese government for treating COVID-19 symptoms, particularly in mild to moderate cases.

### 5. **Xuebijing Injection (血必净注射液)**
- **Ingredients:** A combination of herbs including Carthamus, Salvia, Angelica, Paeonia, and Ligusticum.
- **Uses:** Used in severe cases to reduce inflammation and improve microcirculation.

### 6. **Pneumonia No. 1 Formula (肺炎1号方)**
- **Ingredients:** A specific formulation used in Wuhan during the early stages of the pandemic, containing various herbs aimed at treating lung infections.
- **Uses:** Targeted at treating symptoms like fever, cough, and fatigue.

### 7. **Maxingshigan-Yinqiaosan (麻杏石甘-银翘散)**
- **Ingredients:** Ephedra, Apricot Seed, Gypsum, Licorice, Honeysuckle, and Forsythia.
- **Uses:** Used to clear heat and detoxify, addressing symptoms like fever and cough.

### General Herbs Used in TCM for Respiratory Health:
- **Astragalus (Huangqi, 黄芪):** Known for its immune-boosting properties.
- **Licorice Root (Gancao, 甘草):** Used for its anti-inflammatory and antiviral effects.
- **Isatis Root (Banlangen, 板蓝根):** Commonly used for its antiviral properties.
- **Honeysuckle (Jinyinhua, 金银花):** Used for its heat-clearing and detoxifying effects.

### Important Considerations:
1. **Consult Healthcare Providers:** Always consult with a healthcare professional, particularly one knowledgeable in TCM, before starting any new treatment.
2. **Combination with Conventional Medicine:** TCM should complement, not replace, conventional medical treatments.
3. **Quality and Source of Herbs:** Ensure that the herbs and formulations are sourced from reputable suppliers to avoid contamination or adulteration.

### Research and Efficacy:
While there is some preliminary research and anecdotal evidence supporting the use of these TCM formulations for COVID-19, more rigorous clinical trials are needed to confirm their efficacy and safety. The integration of TCM with Western medicine has shown promise in some cases, but it should be approached with caution and professional guidance.

In [49]:
#Now let's ask a follow up question
printmd(chain.invoke({"input": FOLLOW_UP_QUESTION}))

I'm sorry, but I don't have access to previous interactions or questions you've asked. Each session is independent and doesn't retain any information from past interactions. How can I assist you today?

As you can see, it doesn't remember what it just answered. Sometimes it responds based only on the system prompt, or seemingly at random. This shows that the LLM does NOT have memory and that we need to supply the memory ourselves, as a conversation history included in the prompt, like this:

In [50]:
hist_prompt = ChatPromptTemplate.from_template(
"""
    {history}
    Human: {question}
    AI:
"""
)
chain = hist_prompt | llm | output_parser

In [51]:
Conversation_history = """
Human: {question}
AI: {response}
""".format(question=QUESTION, response=response_to_initial_question)

In [52]:
printmd(chain.invoke({"history":Conversation_history, "question": FOLLOW_UP_QUESTION}))

Your prior question was: "tell me chinese medicines that help fight covid-19"

**Bingo!** We now know how to create a chatbot using LLMs: we just need to keep the state/history of the conversation and pass it as context on every call.
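The idea of keeping the history and resending it on every turn can be sketched in plain Python. This is only an illustration under stated assumptions: `build_prompt`, `chat_turn`, and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in for a real model call that simply reports how many earlier questions are visible in the prompt it receives.

```python
from typing import Callable, List, Tuple

def build_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Render all previous turns plus the new question into one prompt string."""
    lines = []
    for human, ai in history:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    lines.append(f"Human: {question}")
    lines.append("AI:")
    return "\n".join(lines)

def chat_turn(llm_call: Callable[[str], str],
              history: List[Tuple[str, str]],
              question: str) -> str:
    """Send history + question to the model, then record the new turn."""
    answer = llm_call(build_prompt(history, question))
    history.append((question, answer))  # the application, not the LLM, keeps the state
    return answer

def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: counts the earlier questions it can see."""
    earlier = prompt.count("Human:") - 1
    return f"I can see {earlier} earlier question(s)."

history: List[Tuple[str, str]] = []
first = chat_turn(echo_llm, history, "tell me chinese medicines that help fight covid-19")
second = chat_turn(echo_llm, history, "What was my prior question?")
print(first)
print(second)
```

The stand-in model only "knows" about the first question on the second call because the application put it back into the prompt.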

## Now that we understand how to provide memory by passing the history as context, let's go back to our GPT Smart Search engine

From the LangChain website:
    
A memory system needs to support two basic actions: reading and writing. Recall that every chain defines some core execution logic that expects certain inputs. Some of these inputs come directly from the user, but some of these inputs can come from memory. A chain will interact with its memory system twice in a given run.

    AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
    AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
    
So this process adds delays to the response, but it is a necessary delay :)
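The read-then-write cycle described above can be sketched with a small wrapper class. This is a minimal sketch, not how LangChain implements it: `MemoryWrappedChain` and `fake_core` are hypothetical names, and the "core logic" is a plain function standing in for the prompt and model.

```python
from typing import Callable, Dict, List

class MemoryWrappedChain:
    """Hypothetical wrapper that reads memory before, and writes after, the core logic."""

    def __init__(self, core: Callable[[Dict], str]):
        self.core = core
        self.memory: List[Dict[str, str]] = []

    def invoke(self, question: str) -> str:
        # READ: augment the user input with what is already in memory
        inputs = {"question": question, "history": list(self.memory)}
        answer = self.core(inputs)
        # WRITE: store this run's input and output for future runs
        self.memory.append({"role": "human", "content": question})
        self.memory.append({"role": "ai", "content": answer})
        return answer

def fake_core(inputs: Dict) -> str:
    """Stand-in for prompt + LLM: reports how much history it received."""
    return f"history has {len(inputs['history'])} message(s)"

wrapped = MemoryWrappedChain(fake_core)
r1 = wrapped.invoke("first question")
r2 = wrapped.invoke("second question")
print(r1)
print(r2)
```

Each call touches memory twice, which is where the extra latency comes from once memory lives in an external store.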

![image](./images/memory_diagram.png)

In [53]:
index1_name = "srch-index-files"
index2_name = "srch-index-csv"
index3_name = "srch-index-books"
indexes = [index1_name, index2_name, index3_name]

In [54]:
# Initialize our custom retriever 
retriever = CustomAzureSearchRetriever(indexes=indexes, topK=10, reranker_threshold=1)

**Prompt Template Definition**

If you look closely below, there is an optional variable in `DOCSEARCH_PROMPT` called `history`. It is a placeholder where we inject the conversation into the prompt so that the LLM is aware of it before it answers.


In [55]:

DOCSEARCH_PROMPT = ChatPromptTemplate.from_messages(
    [
        ("system", DOCSEARCH_PROMPT_TEXT + "\n\nCONTEXT:\n{context}\n\n"),
        MessagesPlaceholder(variable_name="history", optional=True),
        ("human", "{question}"),
    ]
)


**Now let's add memory to it:**

In [56]:
store = {} # Our first memory will be a dictionary in memory

# We have to define a custom function that takes a session_id and looks somewhere
# (in this case in a dictionary in memory) for the conversation
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


In [57]:
# We use our original chain with the retriever but removing the StrOutputParser
chain = (
    {
        "context": itemgetter("question") | retriever, 
        "question": itemgetter("question"),
        "history": itemgetter("history")
    }
    | DOCSEARCH_PROMPT
    | llm
)

## Then we pass the above chain to another chain that adds memory to it

output_parser = StrOutputParser()

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
) | output_parser

In [58]:
# This is where we configure the session id
config={"configurable": {"session_id": "abc123"}}

In [59]:
printmd(chain_with_history.invoke({"question": QUESTION}, config=config))

Traditional Chinese Medicine (TCM) has been used in various ways to help fight COVID-19. Several specific medicines and formulations have been recommended and utilized for their potential benefits. Here are some notable mentions:

1. **Qingfei Paidu Decoction**: This formulation has been recommended by the National Health Commission of the People’s Republic of China and the National Administration of Traditional Chinese Medicine for the treatment of COVID-19. It has shown good clinical efficacy and potential in treating the disease [[6]](https://doi.org/10.19540/j.cnki.cjcmm.20200219.501; https://www.ncbi.nlm.nih.gov/pubmed/32281335/).

2. **Shuang Huang Lian Kou Fu Ye**: This is one of the traditional Chinese medicines that have been used to attenuate COVID-19. It works by triggering the inflammation pathway, such as the neuraminidase blocker, to fight the SARS-CoV-2 virus [[1]](https://doi.org/10.1101/2020.04.10.20060376).

3. **Bu Huan Jin Zheng Qi San and Da Yuan Yin**: This combination is another traditional Chinese medicine used for treating COVID-19, focusing on the inflammation pathway [[1]](https://doi.org/10.1101/2020.04.10.20060376).

4. **Xue Bi Jing Injection**: This traditional Chinese medicine has also been used to treat COVID-19 by targeting similar pathways [[1]](https://doi.org/10.1101/2020.04.10.20060376).

5. **Qing Fei Pai Du Tang**: Another formulation used to combat COVID-19 by triggering the inflammation pathway [[1]](https://doi.org/10.1101/2020.04.10.20060376).

6. **Astragalus membranaceus and Yupingfeng Powder**: These are often used in various prevention programs for reinforcing vital qi and preventing COVID-19 [[2]](https://doi.org/10.7501/j.issn.0253-2670.2020.04.006).

These TCM formulations and medicines have been integrated into COVID-19 treatment protocols and have shown potential benefits in managing the disease, although their efficacy and mechanisms are still subject to further research and validation.

In [60]:
# Remembers
printmd(chain_with_history.invoke({"question": FOLLOW_UP_QUESTION},config=config))

Your prior question was: "tell me chinese medicines that help fight covid-19"

In [61]:
# Remembers
printmd(chain_with_history.invoke({"question": "Thank you! Good bye"},config=config))

You're welcome! If you have any more questions in the future, feel free to ask. Goodbye!

## Using CosmosDB as persistent memory

Previously, we added local RAM memory to our chatbot. However, it is not persistent: it is deleted once the app user's session is terminated. A database is therefore necessary for persistent storage of each bot user's conversations, not only for analytics and auditing, but also if we wish to provide recommendations in the future.

In the next notebook we are going to explain how to use an external Database (CosmosDB) to keep the state of the conversation.
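To make the difference between in-RAM and persistent memory concrete, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in for an external database. It is not the CosmosDB API: the `SQLiteChatHistory` class, its table layout, and its method names are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile
from typing import List, Tuple

class SQLiteChatHistory:
    """Hypothetical persistent chat history, keyed by session_id."""

    def __init__(self, path: str):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )
        self.conn.commit()

    def add_message(self, session_id: str, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        self.conn.commit()

    def get_messages(self, session_id: str) -> List[Tuple[str, str]]:
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
        return [(role, content) for role, content in rows]

    def close(self) -> None:
        self.conn.close()

# Write a conversation, close the connection, then reopen it as a "new app session"
db_path = os.path.join(tempfile.mkdtemp(), "chat_history.db")
writer = SQLiteChatHistory(db_path)
writer.add_message("abc123", "human", "tell me chinese medicines that help fight covid-19")
writer.add_message("abc123", "ai", "Here are some formulations...")
writer.close()

reader = SQLiteChatHistory(db_path)
restored = reader.get_messages("abc123")
other = reader.get_messages("another-session")
reader.close()
print(restored)
```

Unlike the dictionary-based `store`, the history survives the first object being closed, and each `session_id` only sees its own messages.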

# Summary
##### Adding memory to our application allows the user to have a conversation. However, this feature does not come with the LLM; memory is something that we must provide to the LLM in the form of context for the question.

We added non-persistent, in-session memory using local RAM.

We can also notice that the chain we are currently using is smart, but only up to a point. Although we have given it memory, it searches for similar documents on every turn, regardless of the input. This doesn't seem efficient, but regardless, we are very close to finishing our first RAG talk-to-your-data bot.
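The inefficiency mentioned above can be illustrated with a small sketch in which a counting stand-in replaces the real retriever. `CountingRetriever` and `rag_turn` are hypothetical names; the point is only that a fixed pipeline runs the search step for every input, including a simple goodbye.

```python
from typing import Dict, List

class CountingRetriever:
    """Hypothetical stand-in for a search retriever that counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def search(self, query: str) -> List[str]:
        self.calls += 1
        return [f"document matching '{query}'"]

def rag_turn(retriever: CountingRetriever, history: List[str], question: str) -> Dict:
    """A fixed pipeline: retrieval always runs, whatever the question is."""
    context = retriever.search(question)
    history.append(question)
    return {"context": context, "question": question, "turns_seen": len(history)}

retriever = CountingRetriever()
history: List[str] = []
questions = [
    "tell me chinese medicines that help fight covid-19",
    "What was my prior question?",
    "Thank you! Good bye",
]
for q in questions:
    rag_turn(retriever, history, q)

print(retriever.calls)
```

Only the first question actually needs documents, yet the search step runs for all three.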

Note: The use of `RunnableWithMessageHistory` in this notebook is for example purposes. We will see later (in the next notebooks) that we recommend using memory state and graphs to inject memory into a bot.

# NEXT
We now know how to build a Smart Search Engine that can power a chatbot! Great!

In the next notebook (Notebook 6), we are going to build our first RAG bot. To do this, we will introduce the concept of Agents.