### Test explain data with Message History
* In this notebook we explore how to use the **entire message history** to help analyze the retrieved data.
* Routing is not shown here; see the other notebooks.

In [1]:
import os
import sys

# Set the PYTHONPATH environment variable
os.environ['PYTHONPATH'] = '..'

# Add it to sys.path so that it's included in the Python import path
sys.path.append(os.environ['PYTHONPATH'])

In [2]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage
from database_manager import DatabaseManager
from llm_manager import LLMManager
from ai_sql_agent import AISQLAgent

from prompt_template import PROMPT_TEMPLATE
from utils import get_console_logger

from config import (
    AUTH_TYPE,
    CONNECT_ARGS,
    MODEL_LIST,
    MODEL_ENDPOINTS,
    TEMPERATURE,
    EMBED_MODEL_NAME,
    EMBED_ENDPOINT,
    INDEX_MODEL_FOR_EXPLANATION
)
from config_private import COMPARTMENT_OCID


#### Setup

In [3]:
logger = get_console_logger()

llm_manager = LLMManager(
    MODEL_LIST, MODEL_ENDPOINTS, COMPARTMENT_OCID, TEMPERATURE, logger
)

db_manager = DatabaseManager(CONNECT_ARGS, logger)

ai_sql_agent = AISQLAgent(
    CONNECT_ARGS,
    MODEL_LIST,
    MODEL_ENDPOINTS,
    COMPARTMENT_OCID,
    EMBED_MODEL_NAME,
    EMBED_ENDPOINT,
    TEMPERATURE,
    PROMPT_TEMPLATE,
)

# the model used for the explanation
llm_c = llm_manager.llm_models[INDEX_MODEL_FOR_EXPLANATION]

PREAMBLE = """You are an AI assistant.
Your task is to explain the provided data and respond to requests by referencing both the given data and the conversation history.
Base your answers strictly on the provided information and prior messages in the conversation.
"""

analyze_template = ChatPromptTemplate.from_messages([
    ("system", PREAMBLE),
    MessagesPlaceholder("msgs")
])

2024-10-09 09:32:53,090 - LLMManager: Initialising the list of models...
2024-10-09 09:32:53,091 - Model: meta.llama-3.1-70b-instruct
2024-10-09 09:32:53,293 - Model: cohere.command-r-plus
2024-10-09 09:32:53,359 - Model: meta.llama-3.1-405b-instruct
2024-10-09 09:32:53,424 - Connecting to the Database...
2024-10-09 09:32:53,437 - DB engine created...
2024-10-09 09:32:53,438 - Connecting to the Database...
2024-10-09 09:32:53,438 - DB engine created...
2024-10-09 09:32:53,438 - LLMManager: Initialising the list of models...
2024-10-09 09:32:53,438 - Model: meta.llama-3.1-70b-instruct
2024-10-09 09:32:53,505 - Model: cohere.command-r-plus
2024-10-09 09:32:53,571 - Model: meta.llama-3.1-405b-instruct
2024-10-09 09:32:53,702 - Loading Schema Manager...
2024-10-09 09:32:53,702 - AI SQL Agent initialized successfully.


#### Let's retrieve some data

In [4]:
# let's retrieve some data
user_request = "List the top 20 sales by total amount, with product name, customer name, country name for sales in Europe"

sql_query = ai_sql_agent.generate_sql_query(user_request, user_group_id=None)

# rows is a list of dicts (one dict per result row)
rows = db_manager.execute_sql(sql_query)

2024-10-09 09:34:36,157 - Generating restricted schema for user request...
2024-10-09 09:34:37,644 - Identifying relevant tables for query...
2024-10-09 09:34:37,646 - - COUNTRIES
2024-10-09 09:34:37,647 - - SALES
2024-10-09 09:34:37,648 - - PRODUCTS
2024-10-09 09:34:37,649 - - CHANNELS
2024-10-09 09:34:37,649 - - CUSTOMERS
2024-10-09 09:34:37,649 - - HR_COUNTRIES
2024-10-09 09:34:45,916 - Reranker result:
2024-10-09 09:34:45,918 - ['SALES', 'PRODUCTS', 'CUSTOMERS', 'COUNTRIES']
2024-10-09 09:34:45,919 - 
2024-10-09 09:34:45,954 - Restricted schema generated.
2024-10-09 09:34:45,958 - Generating SQL query...
2024-10-09 09:34:50,689 - SQL query generated.
2024-10-09 09:35:02,053 - Found 20 rows..


In [6]:
# put the data in the chat history as the first message
msgs = [
    # message content must be a string, so the list of rows is converted with str()
    HumanMessage(content="These are the data for your analysis.\nData: \n" + str(rows)),
]

# have a look at the template
analyze_template.invoke({"msgs": msgs})

ChatPromptValue(messages=[SystemMessage(content='You are an AI assistant.\nYour task is to explain the provided data and respond to requests by referencing both the given data and the conversation history.\nBase your answers strictly on the provided information and prior messages in the conversation.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content="These are the data for your analysis.\nData: \n[{'prod_name': 'Envoy Ambassador', 'customer_name': 'Zoe Ballanger', 'country_name': 'United Kingdom', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Trevor Manson', 'country_name': 'Spain', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Chadwick Klemm', 'country_name': 'Denmark', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'customer_name': 'Zillah Driscoll', 'country_name': 'United Kingdom', 'amount_sold': Decimal('1782.72')}, {'prod_name': 'Envoy Ambassador', 'custo
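
As the output above shows, `str(rows)` embeds Python reprs such as `Decimal('1782.72')` in the prompt. A possible alternative, shown here only as a sketch, is to serialize the rows as JSON and render `Decimal` values as plain strings. The helper name `rows_to_json` and the variable `sample_rows` are hypothetical and not part of the notebook; the sample row is copied from the output above.

```python
import json
from decimal import Decimal


def rows_to_json(rows):
    """Serialize a list of dict rows, rendering Decimal values as plain strings."""

    def encode(value):
        # json.dumps calls this only for objects it cannot serialize itself
        if isinstance(value, Decimal):
            return str(value)
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    return json.dumps(rows, default=encode, indent=2)


# one row copied from the output above, used as sample input
sample_rows = [
    {
        "prod_name": "Envoy Ambassador",
        "customer_name": "Zoe Ballanger",
        "country_name": "United Kingdom",
        "amount_sold": Decimal("1782.72"),
    },
]

data_text = rows_to_json(sample_rows)
print(data_text)
```

In the notebook, `rows_to_json(rows)` could replace `str(rows)` when building the first `HumanMessage`. Whether the model handles JSON better than the Python repr is an assumption that would need to be checked.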

In [7]:
# the chain
analyze_chain = analyze_template | llm_c
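
The cells that follow repeat the same three steps for every turn: append the request to the history, invoke the chain with the full history, and append the reply. A minimal sketch of that pattern as a helper is shown below. The names `ask` and `fake_invoke` are hypothetical, and plain strings stand in for the LangChain message objects so the example runs without a model.

```python
from typing import Any, Callable, List


def ask(invoke: Callable[[dict], Any], history: List[Any], request: Any) -> Any:
    """Append the request, call the chain with the full history, store the reply."""
    history.append(request)
    reply = invoke({"msgs": history})
    history.append(reply)
    return reply


def fake_invoke(inputs: dict) -> str:
    # stand-in for analyze_chain.invoke: reports how many messages it received
    return f"reply based on {len(inputs['msgs'])} message(s)"


history = ["data message"]
first = ask(fake_invoke, history, "Create a report.")
second = ask(fake_invoke, history, "Format the result in a table.")

print(first)
print(second)
print(len(history))
```

With the real chain, the call would look like `ask(analyze_chain.invoke, msgs, HumanMessage(content="Create a report."))`. Because the whole list is sent on every call, the prompt grows with each turn.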

#### Simulate a conversation with a set of requests

In [8]:
%%time
# the first request
msgs.append(HumanMessage(content="Create a report."))
            
ai_message_out = analyze_chain.invoke({"msgs": msgs})
print(ai_message_out.content)

# save the result in history
msgs.append(ai_message_out)

**Sales Report for Envoy Ambassador**

**Summary:**

The provided data shows sales information for the product "Envoy Ambassador" across various countries. The report highlights the total sales, country-wise sales, and customer information.

**Total Sales:**

The total sales for Envoy Ambassador amount to **$35,654.40** (calculated by summing up the 'amount_sold' values).

**Country-wise Sales:**

The sales are distributed across the following countries:

1. **United Kingdom**: 6 customers, total sales: **$10,696.32**
2. **Germany**: 9 customers, total sales: **$16,044.48**
3. **Spain**: 2 customers, total sales: **$3,565.44**
4. **Denmark**: 1 customer, total sales: **$1,782.72**
5. **Italy**: 1 customer, total sales: **$1,782.72**
6. **France**: 1 customer, total sales: **$1,782.72**

**Customer Information:**

The data includes 20 customers who purchased the Envoy Ambassador product. The customers are from various countries, with the majority being from Germany (9 customers) and the
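
The totals in the report are computed by the model, so it can be worth cross-checking them with deterministic code. The sketch below groups rows by country and sums `amount_sold` with `Decimal` arithmetic. The helper name `totals_by_country` is hypothetical, and `visible_rows` contains only the four rows visible in the truncated prompt output earlier in the notebook, so its totals differ from the 20-row totals in the report.

```python
from collections import defaultdict
from decimal import Decimal

# the four rows visible in the truncated prompt output (not the full result set)
visible_rows = [
    {"customer_name": "Zoe Ballanger", "country_name": "United Kingdom",
     "amount_sold": Decimal("1782.72")},
    {"customer_name": "Trevor Manson", "country_name": "Spain",
     "amount_sold": Decimal("1782.72")},
    {"customer_name": "Chadwick Klemm", "country_name": "Denmark",
     "amount_sold": Decimal("1782.72")},
    {"customer_name": "Zillah Driscoll", "country_name": "United Kingdom",
     "amount_sold": Decimal("1782.72")},
]


def totals_by_country(rows):
    """Return (total amount per country, number of rows per country)."""
    totals = defaultdict(Decimal)  # Decimal() is Decimal('0')
    counts = defaultdict(int)
    for row in rows:
        totals[row["country_name"]] += row["amount_sold"]
        counts[row["country_name"]] += 1
    return dict(totals), dict(counts)


totals, counts = totals_by_country(visible_rows)
grand_total = sum(totals.values(), Decimal("0"))

for country, total in totals.items():
    print(f"{country}: {counts[country]} row(s), total {total}")
print("Grand total:", grand_total)
```

In the notebook, `totals_by_country(rows)` could be run on the full result and compared with the figures in the generated report.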

##### A loop with a set of additional questions (and answers)

In [9]:
additional_questions = ["Ok, show me only Country Sales for Spain and Italy",
                       "Ok, format all the result in a table.",
                       "No I wanted all the sales in a table, not only Spain and Italy",
                       "Organize the table putting closely rows with the same country",
                       """Oh, but I wanted in the table the rows with sales, 
                        organize the table by putting closely rows with the same country"""]

#
# loop to process the additional questions
#
for i, new_question in enumerate(additional_questions):
    print("")
    print(f"{i+1} Question:", new_question)
    print("")
    
    # adding the new question
    msgs.append(HumanMessage(new_question))

    ai_message_out = analyze_chain.invoke({"msgs": msgs})
    
    print(ai_message_out.content)
    print("")

    # save the result in history
    msgs.append(ai_message_out)


1 Question: Ok, show me only Country Sales for Spain and Italy

Here is the country-wise sales information for Spain and Italy:

1. **Spain**: 2 customers, total sales: **$3,565.44**
2. **Italy**: 1 customer, total sales: **$1,782.72**


2 Question: Ok, format all the result in a table.

Here is the country-wise sales information for Spain and Italy in a table format:

| **Country** | **Number of Customers** | **Total Sales** |
| --- | --- | --- |
| Spain | 2 | $3,565.44 |
| Italy | 1 | $1,782.72 |


3 Question: No I wanted all the sales in a table, not only Spain and Italy

Here is the complete sales data in a table format:

| **Country** | **Number of Customers** | **Total Sales** |
| --- | --- | --- |
| United Kingdom | 6 | $10,696.32 |
| Germany | 9 | $16,044.48 |
| Spain | 2 | $3,565.44 |
| Denmark | 1 | $1,782.72 |
| Italy | 1 | $1,782.72 |
| France | 1 | $1,782.72 |

Let me know if you'd like me to add any other columns or details to the table!


4 Question: Organize the table 