### Test: explain data with message history
* In this notebook we explore how to use **the entire message history** to help analyze the retrieved data.
* Routing is not shown here; see the other notebooks.

In [1]:
import os
import sys

# Set the PYTHONPATH environment variable
os.environ['PYTHONPATH'] = '..'

# Add it to sys.path so that it's included in the Python import path
sys.path.append(os.environ['PYTHONPATH'])

In [2]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage
from database_manager import DatabaseManager
from llm_manager import LLMManager
from ai_sql_agent import AISQLAgent

from prompt_template import PROMPT_TEMPLATE
from utils import get_console_logger

from config import (
    AUTH_TYPE,
    CONNECT_ARGS,
    MODEL_LIST,
    MODEL_ENDPOINTS,
    TEMPERATURE,
    EMBED_MODEL_NAME,
    EMBED_ENDPOINT,
    INDEX_MODEL_FOR_EXPLANATION
)
from config_private import COMPARTMENT_OCID



#### Setup

In [3]:
logger = get_console_logger()

llm_manager = LLMManager(
    MODEL_LIST, MODEL_ENDPOINTS, COMPARTMENT_OCID, TEMPERATURE, logger
)

db_manager = DatabaseManager(CONNECT_ARGS, logger)

ai_sql_agent = AISQLAgent(
    CONNECT_ARGS,
    MODEL_LIST,
    MODEL_ENDPOINTS,
    COMPARTMENT_OCID,
    EMBED_MODEL_NAME,
    EMBED_ENDPOINT,
    TEMPERATURE,
    PROMPT_TEMPLATE,
)

# the model used for the explanation
llm_c = llm_manager.llm_models[INDEX_MODEL_FOR_EXPLANATION]

PREAMBLE = """You are an AI assistant.
Your task is to explain the provided data and respond to requests by referencing both the given data and the conversation history.
Base your answers strictly on the provided information and prior messages in the conversation.
"""

analyze_template = ChatPromptTemplate.from_messages([
    ("system", PREAMBLE),
    MessagesPlaceholder("msgs"),
    # ("user", "data: {data}")
])

2024-10-09 09:12:09,470 - LLMManager: Initialising the list of models...
2024-10-09 09:12:09,472 - Model: meta.llama-3.1-70b-instruct
2024-10-09 09:12:09,671 - Model: cohere.command-r-plus
2024-10-09 09:12:09,737 - Model: meta.llama-3.1-405b-instruct
2024-10-09 09:12:09,803 - Connecting to the Database...
2024-10-09 09:12:09,816 - DB engine created...
2024-10-09 09:12:09,817 - Connecting to the Database...
2024-10-09 09:12:09,817 - DB engine created...
2024-10-09 09:12:09,817 - LLMManager: Initialising the list of models...
2024-10-09 09:12:09,817 - Model: meta.llama-3.1-70b-instruct
2024-10-09 09:12:09,884 - Model: cohere.command-r-plus
2024-10-09 09:12:09,950 - Model: meta.llama-3.1-405b-instruct
2024-10-09 09:12:10,081 - Loading Schema Manager...
2024-10-09 09:12:10,081 - AI SQL Agent initialized successfully.


#### Let's retrieve some data

In [4]:
# let's retrieve some data
user_request = "List the top 20 sales by total amount, with product name, customer name, country name for sales in Europe"

sql_query = ai_sql_agent.generate_sql_query(user_request, user_group_id=None)

# rows is a list of dict
rows = db_manager.execute_sql(sql_query)

2024-10-09 09:12:11,423 - Generating restricted schema for user request...
2024-10-09 09:12:12,896 - Identifying relevant tables for query...
2024-10-09 09:12:12,899 - - COUNTRIES
2024-10-09 09:12:12,900 - - SALES
2024-10-09 09:12:12,904 - - PRODUCTS
2024-10-09 09:12:12,907 - - CHANNELS
2024-10-09 09:12:12,908 - - CUSTOMERS
2024-10-09 09:12:12,909 - - HR_COUNTRIES
2024-10-09 09:12:15,792 - Reranker result:
2024-10-09 09:12:15,794 - ['SALES', 'PRODUCTS', 'CUSTOMERS', 'COUNTRIES']
2024-10-09 09:12:15,797 - 
2024-10-09 09:12:15,835 - Restricted schema generated.
2024-10-09 09:12:15,836 - Generating SQL query...
2024-10-09 09:12:20,696 - SQL query generated.
2024-10-09 09:12:32,057 - Found 20 rows..


In [5]:
# put the data in the chat history
msgs = [
    HumanMessage(content="These data are related to sales"),
    # message content must be a string, so the list of rows is converted with str()
    HumanMessage(content="data: \n" + str(rows)),
]

# inspect the prompt that the template produces from these messages
analyze_template.invoke({"msgs": msgs})

ChatPromptValue(messages=[SystemMessage(content='You are an AI assistant.\nYour task is to explain the provided data and respond to requests by referencing both the given data and the conversation history.\nBase your answers strictly on the provided information and prior messages in the conversation.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='These data are related to sales', additional_kwargs={}, response_metadata={}), HumanMessage(content="data: \n[{'prod_name': 'Envoy Ambassador', 'customer_name': 'Zoe Ballanger', 'country_name': 'United Kingdom', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Trevor Manson', 'country_name': 'Spain', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Chadwick Klemm', 'country_name': 'Denmark', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Zillah Driscoll', 'country_name': 'United Kingdom', 'amount_sold':
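
The `str(rows)` conversion above places Python reprs such as `Decimal('1782.72')` in the prompt. As a possible alternative, here is a minimal sketch that serialises the rows to JSON instead. `rows_to_json` is a hypothetical helper, not part of the notebook's code, and the single sample row only mirrors the shape visible in the output above.

```python
import json
from decimal import Decimal


def rows_to_json(rows):
    """Serialise a list of row dicts to a JSON string.

    Decimal values are not JSON-serialisable by default, so they are
    written as strings to keep the exact value returned by the database.
    """
    def _default(value):
        if isinstance(value, Decimal):
            return str(value)
        raise TypeError(f"Cannot serialise {type(value).__name__}")

    return json.dumps(rows, default=_default)


# sample row with the same keys as the query result shown above
sample_rows = [
    {
        "prod_name": "Envoy Ambassador",
        "customer_name": "Zoe Ballanger",
        "country_name": "United Kingdom",
        "amount_sold": Decimal("1782.72"),
    }
]

data_text = "data: \n" + rows_to_json(sample_rows)
print(data_text)
```

The resulting string could then be passed as `HumanMessage(content=data_text)`. Whether the model handles the JSON form better than the `str(rows)` form is not tested in this notebook.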

In [6]:
# the chain
analyze_chain = analyze_template | llm_c
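
Every turn of the conversation simulated in the following cells has the same shape: append the request to the history, invoke the chain on the whole history, then append the reply. A minimal sketch of that pattern as a reusable helper follows. `run_turns`, `to_message` and the `EchoChain` stub are hypothetical names, introduced only so that the example runs without an LLM.

```python
def run_turns(chain, msgs, questions, to_message):
    """Ask each question in order, passing the whole history on every call.

    chain      -- any object with an invoke({"msgs": [...]}) method
    msgs       -- the history list; it is extended in place
    to_message -- callable that wraps a question string in a message object
    """
    replies = []
    for question in questions:
        msgs.append(to_message(question))
        reply = chain.invoke({"msgs": msgs})
        msgs.append(reply)  # save the answer so later turns can refer to it
        replies.append(reply)
    return replies


class EchoChain:
    """Stand-in for the real chain: reports how many messages it received."""

    def invoke(self, inputs):
        return f"seen {len(inputs['msgs'])} messages"


history = ["data: ..."]  # placeholder for the initial data message
answers = run_turns(
    EchoChain(), history, ["Create a report.", "Format it as a table."], str
)
print(answers)
print(len(history))
```

With the notebook's own objects the call would presumably look like `run_turns(analyze_chain, msgs, additional_questions, lambda q: HumanMessage(content=q))`. The stub makes visible that the history grows by two messages per turn and that each call sees everything said so far.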

#### Simulate a conversation with a set of requests

In [7]:
%%time
# the first request
msgs.append(HumanMessage(content="Create a report."))

ai_message_out = analyze_chain.invoke({"msgs": msgs})
print(ai_message_out.content)

# save the result in history
msgs.append(ai_message_out)

**Sales Report for Envoy Ambassador**

**Summary:**

The provided data shows sales information for the product "Envoy Ambassador" across various countries. The report highlights the total sales, country-wise sales, and customer-wise sales.

**Total Sales:**

The total sales for Envoy Ambassador amount to **$35,654.40** (calculated by summing up the 'amount_sold' values).

**Country-wise Sales:**

The sales are distributed across the following countries:

1. **United Kingdom**: 6 sales, totaling **$10,696.32**
2. **Germany**: 9 sales, totaling **$16,044.48**
3. **Spain**: 2 sales, totaling **$3,565.44**
4. **Denmark**: 1 sale, totaling **$1,782.72**
5. **Italy**: 1 sale, totaling **$1,782.72**
6. **France**: 1 sale, totaling **$1,782.72**

**Customer-wise Sales:**

There are 20 unique customers who purchased Envoy Ambassador. Each customer made a single purchase, with the amount sold being **$1,782.72**.

**Top-Selling Countries:**

1. Germany (9 sales)
2. United Kingdom (6 sales)

**In
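
Because the figures in the report are produced by the LLM, it can be worth recomputing them outside the model. A minimal sketch, assuming rows shaped like the query result (`country_name`, `amount_sold`): `summarise_by_country` is a hypothetical helper, and the sample rows are rebuilt from the per-country counts stated in the report rather than read from the database.

```python
from collections import defaultdict
from decimal import Decimal


def summarise_by_country(rows):
    """Return {country: (number_of_sales, total_amount)} for a list of row dicts."""
    counts = defaultdict(int)
    totals = defaultdict(Decimal)  # Decimal() is Decimal('0')
    for row in rows:
        country = row["country_name"]
        counts[country] += 1
        totals[country] += row["amount_sold"]
    return {country: (counts[country], totals[country]) for country in counts}


# counts as stated in the LLM report; every sale has the same amount
reported_counts = {
    "United Kingdom": 6,
    "Germany": 9,
    "Spain": 2,
    "Denmark": 1,
    "Italy": 1,
    "France": 1,
}
sample_rows = [
    {"country_name": country, "amount_sold": Decimal("1782.72")}
    for country, n in reported_counts.items()
    for _ in range(n)
]

summary = summarise_by_country(sample_rows)
grand_total = sum(total for _, total in summary.values())

for country, (n, total) in summary.items():
    print(f"{country}: {n} sales, total {total}")
print("Grand total:", grand_total)
```

On the real query result the call would be `summarise_by_country(rows)`. For the reconstructed sample, the per-country totals and the grand total of 35654.40 agree with the figures in the report.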

##### A loop over a set of additional questions (and answers)

In [8]:
additional_questions = ["Ok, show me only Country Sales for Spain and Italy",
                       "Ok, format all the result in a table.",
                       "No I wanted all the sales in a table, not only Spain and Italy",
                       "Organize the table putting closely rows with the same country",
                       """Oh, but I wanted in the table the rows with sales, 
                        organize the table by putting closely rows with the same country"""]

#
# loop to process the additional questions
#
for i, new_question in enumerate(additional_questions):
    print("")
    print(f"{i+1} Question:", new_question)
    print("")
    
    # add the new question to the history
    msgs.append(HumanMessage(content=new_question))

    ai_message_out = analyze_chain.invoke({"msgs": msgs})
    
    print(ai_message_out.content)
    print("")

    # save the result in history
    msgs.append(ai_message_out)


1 Question: Ok, show me only Country Sales for Spain and Italy

Here are the country-wise sales for Spain and Italy:

**Country Sales:**

1. **Spain**: 2 sales, totaling **$3,565.44**
2. **Italy**: 1 sale, totaling **$1,782.72**


2 Question: Ok, format all the result in a table.

Here are the country-wise sales for Spain and Italy formatted in a table:

**Country Sales:**

| **Country** | **Number of Sales** | **Total Sales** |
| --- | --- | --- |
| Spain | 2 | $3,565.44 |
| Italy | 1 | $1,782.72 |


3 Question: No I wanted all the sales in a table, not only Spain and Italy

Here are all the country-wise sales in a table format:

**Country Sales:**

| **Country** | **Number of Sales** | **Total Sales** |
| --- | --- | --- |
| United Kingdom | 6 | $10,696.32 |
| Germany | 9 | $16,044.48 |
| Spain | 2 | $3,565.44 |
| Denmark | 1 | $1,782.72 |
| Italy | 1 | $1,782.72 |
| France | 1 | $1,782.72 |


4 Question: Organize the table putting closely rows with the same country

Since there are