# Understanding Memory in LLMs

In the previous Notebooks, we successfully explored how OpenAI models can enhance the results from Azure AI Search queries. 

However, we have not yet seen how to hold a conversation with the LLM. [Bing Chat](http://chat.bing.com/), for example, makes this possible because it can understand and refer back to its previous responses.

There is a common misconception that LLMs (Large Language Models) have memory. This is not true. While they possess knowledge from training, they do not retain any information from the questions previously asked of them.

In this Notebook, our goal is to illustrate how we can effectively "endow the LLM with memory" by employing prompts and context.

In [1]:
import os
import random
from langchain_community.chat_message_histories import ChatMessageHistory, CosmosDBChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables import ConfigurableFieldSpec
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from operator import itemgetter
from typing import List

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string))

#custom libraries that we will use later in the app
from common.utils import CustomAzureSearchRetriever, get_answer
from common.prompts import DOCSEARCH_PROMPT

from dotenv import load_dotenv
load_dotenv("credentials.env")

import logging

# Get the root logger
logger = logging.getLogger()
# Set the logging level to a higher level to ignore INFO messages
logger.setLevel(logging.WARNING)

In [2]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

### Let's start with the basics
Let's use a very simple example to see whether the Azure OpenAI GPT model has memory. We will again use LangChain to simplify our code.

In [3]:
QUESTION = "Tell me some use cases for reinforcement learning"
FOLLOW_UP_QUESTION = "What was my prior question?"

In [4]:
COMPLETION_TOKENS = 1000
# Create an OpenAI instance
llm = AzureChatOpenAI(deployment_name=os.environ["GPT4_DEPLOYMENT_NAME"], 
                      temperature=0.5, max_tokens=COMPLETION_TOKENS)

In [5]:
# We create a very simple prompt template, just the question as is:
output_parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant that give thorough responses to users."),
    ("user", "{input}")
])

In [6]:
# Let's see what the GPT model responds
chain = prompt | llm | output_parser
response_to_initial_question = chain.invoke({"input": QUESTION})
display(Markdown(response_to_initial_question))

Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by performing certain actions and receiving rewards or penalties. This trial-and-error approach helps the agent to maximize cumulative reward over time. Here are some notable use cases across various domains:

### 1. **Gaming and Entertainment**
   - **Game Playing**: RL has been used to develop AI that can play games at superhuman levels. Notable examples include DeepMind's AlphaGo, which defeated world champions in the game of Go, and OpenAI's Dota 2 bots.
   - **Game Development**: RL can be used to create non-player characters (NPCs) that learn and adapt to player behavior, making games more challenging and engaging.

### 2. **Robotics**
   - **Autonomous Navigation**: Robots can learn to navigate complex environments, such as warehouses or factories, avoiding obstacles and optimizing paths.
   - **Manipulation Tasks**: RL can be used to teach robots to perform tasks like picking and placing objects, assembling products, or even cooking.

### 3. **Healthcare**
   - **Personalized Treatment Plans**: RL can help in developing personalized treatment strategies for patients by continuously learning from their responses to different treatments.
   - **Medical Imaging**: RL can assist in optimizing the parameters for imaging devices to produce clearer images while minimizing exposure to harmful radiation.

### 4. **Finance**
   - **Algorithmic Trading**: RL algorithms can be used to develop trading strategies that learn to maximize returns by adapting to market conditions.
   - **Portfolio Management**: RL can help in dynamically adjusting the composition of investment portfolios to maximize returns while managing risk.

### 5. **Autonomous Vehicles**
   - **Self-Driving Cars**: RL is used to train autonomous vehicles to make decisions in real-time, such as lane changing, obstacle avoidance, and route planning.
   - **Traffic Management**: RL can be applied to optimize traffic light control systems to reduce congestion and improve traffic flow.

### 6. **Natural Language Processing (NLP)**
   - **Dialogue Systems**: RL can be used to train conversational agents to interact more naturally and effectively with users.
   - **Machine Translation**: RL can improve translation systems by optimizing for fluency and accuracy over time.

### 7. **Energy Management**
   - **Smart Grids**: RL can optimize the distribution of electricity in smart grids, balancing supply and demand while minimizing costs.
   - **Building Energy Management**: RL can be used to control heating, ventilation, and air conditioning (HVAC) systems to reduce energy consumption while maintaining comfort.

### 8. **Industrial Automation**
   - **Predictive Maintenance**: RL can predict equipment failures and optimize maintenance schedules to minimize downtime and costs.
   - **Process Optimization**: RL can be used to optimize manufacturing processes, improving efficiency and reducing waste.

### 9. **Marketing and Advertising**
   - **Personalized Recommendations**: RL can be used to develop recommendation systems that adapt to user preferences over time, improving user engagement.
   - **Ad Placement**: RL can optimize the placement and timing of advertisements to maximize click-through rates and conversions.

### 10. **Resource Management**
   - **Supply Chain Optimization**: RL can help in optimizing supply chain operations, including inventory management, logistics, and distribution.
   - **Water Management**: RL can be used to optimize the distribution and usage of water resources in agriculture and urban settings.

### 11. **Telecommunications**
   - **Network Optimization**: RL can be used to optimize the performance of communication networks, improving data throughput and reducing latency.
   - **Dynamic Spectrum Allocation**: RL can help in dynamically allocating radio frequencies to improve the efficiency of wireless communication systems.

### 12. **Education**
   - **Personalized Learning**: RL can be used to develop adaptive learning systems that tailor educational content to individual student needs, improving learning outcomes.
   - **Tutoring Systems**: RL can be applied to create intelligent tutoring systems that provide personalized feedback and guidance to students.

Reinforcement learning is a powerful tool with a wide range of applications, and its potential continues to grow as the technology advances.

In [7]:
#Now let's ask a follow up question
printmd(chain.invoke({"input": FOLLOW_UP_QUESTION}))

I'm sorry, but I don't have access to prior interactions or any previous questions you may have asked. How can I assist you today?

As you can see, it doesn't remember what it just answered. Sometimes it responds based only on the system prompt, and sometimes it answers more or less at random. This shows that the LLM does NOT have memory, and that we need to supply that memory ourselves by including the conversation history in the prompt, like this:

In [8]:
hist_prompt = ChatPromptTemplate.from_template(
"""
    {history}
    Human: {question}
    AI:
"""
)
chain = hist_prompt | llm | output_parser

In [9]:
Conversation_history = """
Human: {question}
AI: {response}
""".format(question=QUESTION, response=response_to_initial_question)

In [10]:
printmd(chain.invoke({"history":Conversation_history, "question": FOLLOW_UP_QUESTION}))

Your prior question was: "Tell me some use cases for reinforcement learning."

**Bingo!** We now know how to create a chatbot using LLMs: we just need to keep the state/history of the conversation and pass it as context on every call.
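
The idea of keeping the history and passing it back on every turn can be sketched without any model at all. The snippet below is a minimal illustration, not the notebook's implementation: `format_history`, `chat_turn`, and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in that only reports how many earlier questions appear in the prompt it receives.

```python
from typing import Callable, List, Tuple

def format_history(turns: List[Tuple[str, str]]) -> str:
    """Render past (question, answer) pairs in the Human/AI layout used above."""
    return "\n".join(f"Human: {q}\nAI: {a}" for q, a in turns)

def chat_turn(llm_call: Callable[[str], str],
              turns: List[Tuple[str, str]],
              question: str) -> str:
    """Build the prompt from the stored turns, call the model, then record the new turn."""
    prompt = f"{format_history(turns)}\nHuman: {question}\nAI:"
    answer = llm_call(prompt)
    turns.append((question, answer))  # the caller owns the state, not the model
    return answer

def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for the model: counts the earlier Human lines it can see."""
    earlier = prompt.count("Human:") - 1  # subtract the current question
    return f"I can see {earlier} earlier question(s)."

turns: List[Tuple[str, str]] = []
print(chat_turn(echo_llm, turns, "Tell me some use cases for reinforcement learning"))
print(chat_turn(echo_llm, turns, "What was my prior question?"))
```

The second call sees one earlier question only because `turns` was passed back in; dropping that list would put us back in the memoryless situation shown earlier.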

## Now that we understand the concept of memory via adding history as a context, let's go back to our GPT Smart Search engine

From the LangChain website:

> A memory system needs to support two basic actions: reading and writing. Recall that every chain defines some core execution logic that expects certain inputs. Some of these inputs come directly from the user, but some of these inputs can come from memory. A chain will interact with its memory system twice in a given run.
>
> 1. AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
> 2. AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
    
So this process adds delays to the response, but it is a necessary delay :)

![image](https://python.langchain.com/assets/images/memory_diagram-0627c68230aa438f9b5419064d63cbbc.png)
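
The read-then-write cycle described in the quote can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not LangChain's code: `ListMemory`, `run_with_memory`, and `count_core` are hypothetical names, and `count_core` stands in for the chain's core logic.

```python
from typing import Callable, Dict, List

class ListMemory:
    """Hypothetical in-process memory: a list of {'role', 'content'} dicts."""

    def __init__(self) -> None:
        self.messages: List[Dict[str, str]] = []

    def read(self) -> List[Dict[str, str]]:
        return list(self.messages)  # return a copy so callers cannot mutate the store

    def write(self, user_input: str, output: str) -> None:
        self.messages.append({"role": "human", "content": user_input})
        self.messages.append({"role": "ai", "content": output})

def run_with_memory(core: Callable[[List[Dict[str, str]], str], str],
                    memory: ListMemory,
                    user_input: str) -> str:
    history = memory.read()             # READ: after the input arrives, before the core logic
    output = core(history, user_input)  # the core logic sees the history plus the new input
    memory.write(user_input, output)    # WRITE: after the core logic, before returning
    return output

def count_core(history: List[Dict[str, str]], user_input: str) -> str:
    """Stand-in for the chain's core logic (not a real model)."""
    return f"{len(history)} stored message(s) before '{user_input}'"

memory = ListMemory()
print(run_with_memory(count_core, memory, "first"))
print(run_with_memory(count_core, memory, "second"))
```

Each run adds two messages (the input and the output), so the second call finds two stored messages where the first found none.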

In [11]:
index1_name = "cogsrch-index-files"
index2_name = "cogsrch-index-csv"
index3_name = "cogsrch-index-books"
indexes = [index1_name, index2_name, index3_name]

In [12]:
# Create the function to retrieve the conversation

def get_session_history(session_id: str, user_id: str) -> CosmosDBChatMessageHistory:
    cosmos = CosmosDBChatMessageHistory(
        cosmos_endpoint=os.environ['AZURE_COSMOSDB_ENDPOINT'],
        cosmos_database=os.environ['AZURE_COSMOSDB_NAME'],
        cosmos_container=os.environ['AZURE_COSMOSDB_CONTAINER_NAME'],
        connection_string=os.environ['AZURE_COMOSDB_CONNECTION_STRING'],
        session_id=session_id,
        user_id=user_id
        )

    # prepare the cosmosdb instance
    cosmos.prepare_cosmos()
    return cosmos
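
To make the role of such a factory concrete, here is a rough in-process analogue that can be run without Cosmos DB. It is only a sketch: `_store` and `get_local_history` are hypothetical names, the history is a plain list of strings rather than LangChain message objects, and everything is lost when the kernel restarts.

```python
from typing import Dict, List, Tuple

# Hypothetical in-process store, keyed by (user_id, session_id)
_store: Dict[Tuple[str, str], List[str]] = {}

def get_local_history(session_id: str, user_id: str) -> List[str]:
    """Return the message list for this (user, session) pair, creating it on first use."""
    return _store.setdefault((user_id, session_id), [])

get_local_history("s1", "alice").append("Human: hello")
print(get_local_history("s1", "alice"))  # same pair, so the same list comes back
print(get_local_history("s2", "alice"))  # a new session starts with an empty list
```

The point of the keying is isolation: each user and session pair gets its own history, so one conversation never leaks into another.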


In [13]:
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="session_id",
            annotation=str,
            name="Session ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
) | output_parser

In [14]:
# This is where we configure the session id and user id
random_session_id = "session" + str(random.randint(1, 1000))
random_user_id = "user" + str(random.randint(1, 1000))

config = {"configurable": {"session_id": random_session_id, "user_id": random_user_id}}

In [15]:
config

{'configurable': {'session_id': 'session834', 'user_id': 'user733'}}
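
Because `random.randint(1, 1000)` can only produce a thousand distinct values, two runs may occasionally land on the same session and user ids, in which case the new run would presumably continue an older stored conversation. For a demo that is fine. A sketch of a collision-resistant alternative using the standard `uuid` module follows; `new_config` is a hypothetical helper, not part of the notebook.

```python
import uuid

def new_config(user_id: str) -> dict:
    """Build a config dict with a fresh, collision-resistant session id."""
    session_id = f"session-{uuid.uuid4().hex}"
    return {"configurable": {"session_id": session_id, "user_id": user_id}}

config_a = new_config("user-demo")
config_b = new_config("user-demo")

# Two configs for the same user still get different session ids
print(config_a["configurable"]["session_id"] != config_b["configurable"]["session_id"])
```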

In [16]:
printmd(chain_with_history.invoke({"question": QUESTION}, config=config))

Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize some notion of cumulative reward. Here are several use cases for reinforcement learning:

1. **Robotics**:
   - **Autonomous Navigation**: RL can be used to train robots to navigate through complex environments without human intervention.
   - **Manipulation Tasks**: Robots can learn to perform tasks like picking and placing objects, assembling parts, or even cooking.

2. **Gaming**:
   - **Game Playing**: RL has been used to develop agents that can play games at superhuman levels, such as AlphaGo for the game of Go, and OpenAI Five for Dota 2.
   - **Game Testing**: RL can automate the process of testing video games by simulating human-like play.

3. **Finance**:
   - **Algorithmic Trading**: RL can optimize trading strategies to maximize returns while managing risk.
   - **Portfolio Management**: It can be used to dynamically adjust the composition of a portfolio to achieve the best performance.

4. **Healthcare**:
   - **Treatment Planning**: RL can help in creating personalized treatment plans for patients by considering the long-term effects of various treatments.
   - **Medical Imaging**: It can improve the accuracy of diagnostics by optimizing the parameters used in imaging techniques.

5. **Autonomous Vehicles**:
   - **Self-Driving Cars**: RL can be used to train autonomous vehicles to navigate safely and efficiently in various driving conditions.
   - **Traffic Management**: RL can optimize traffic light control systems to reduce congestion and improve traffic flow.

6. **Energy Management**:
   - **Smart Grid Optimization**: RL can be used to balance supply and demand in smart grids, optimizing energy distribution.
   - **Energy Consumption**: It can help in reducing energy consumption in buildings by optimizing heating, ventilation, and air conditioning (HVAC) systems.

7. **Natural Language Processing**:
   - **Dialogue Systems**: RL can improve the performance of chatbots and virtual assistants by optimizing their responses to user queries.
   - **Machine Translation**: It can enhance the accuracy and fluency of machine translation systems.

8. **Manufacturing**:
   - **Process Optimization**: RL can optimize manufacturing processes to improve efficiency and reduce waste.
   - **Predictive Maintenance**: It can predict equipment failures and schedule maintenance to minimize downtime.

9. **Marketing**:
   - **Personalized Recommendations**: RL can enhance recommendation systems by tailoring suggestions to individual user preferences.
   - **Ad Placement**: It can optimize the placement of advertisements to maximize click-through rates and conversions.

10. **Telecommunications**:
    - **Network Optimization**: RL can optimize the allocation of resources in communication networks to improve performance and reduce latency.
    - **Dynamic Spectrum Management**: It can be used to dynamically allocate spectrum resources to improve the efficiency of wireless communication.

These are just a few examples, and the potential applications of reinforcement learning are vast and continually expanding as the technology advances.

In [17]:
# Remembers
printmd(chain_with_history.invoke({"question": FOLLOW_UP_QUESTION},config=config))

Your prior question was: "Tell me some use cases for reinforcement learning."

In [18]:
# Remembers
printmd(chain_with_history.invoke(
    {"question": "Can you tell me a one line summary of our conversation?"},
    config=config))

You asked for use cases of reinforcement learning, and I provided a detailed list of various applications across different domains.

In [19]:
try:
    printmd(chain_with_history.invoke(
    {"question": "Thank you very much!"},
    config=config))
except Exception as e:
    print(e)

You're welcome! If you have any more questions or need further assistance, feel free to ask. Have a great day!

#### Let's check our Azure CosmosDB to see the whole conversation


![CosmosDB Memory](./images/cosmos-chathistory.png)

# Summary
##### Adding memory to our application allows the user to have a conversation. However, memory is not something that comes with the LLM; we must provide it ourselves, as context passed along with each question.

We added persistent memory using CosmosDB.

We can also see that the chain we are using is smart, but only up to a point. Although we have given it memory, it searches for similar documents every time, regardless of the input. This is not efficient, but even so, we are very close to finishing our first RAG "talk to your data" bot.




# NEXT
We now know how to build a Smart Search Engine that can power a chatbot. Great!

In the next notebook (Notebook 6), we are going to build our first RAG bot. To do this, we will introduce the concept of Agents.