In [1]:
! python --version

Python 3.12.0


## 1. Install the required libraries

We start by installing the required libraries. Besides the LlamaIndex packages, we need psycopg2 (the PostgreSQL driver), SQLAlchemy (a Python SQL toolkit and ORM), and python-dotenv for loading environment variables.

In [2]:
%%capture
! pip install llama-index
! pip install psycopg2-binary
! pip install SQLAlchemy
! pip install python-dotenv
! pip install llama-index-llms-bedrock
! pip install llama-index-embeddings-bedrock

## 2. Connection to database

After loading the environment variables, we create the SQL engine with SQLAlchemy. The LlamaIndex calls use this engine later. Internally, SQLAlchemy talks to Postgres through the psycopg2 adapter. The cell also registers a LlamaDebugHandler so that LlamaIndex prints a trace when an operation finishes.

In [1]:
from sqlalchemy import create_engine
import os
from llama_index.core.callbacks import (
    CallbackManager,
    LlamaDebugHandler,
)
from dotenv import load_dotenv

load_dotenv(verbose=True, dotenv_path="../../../.env")

llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])

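# Read the credentials first: reusing double quotes inside an f-string needs Python 3.12+
pg_password = os.environ["PG_VECTOR_PW"]
pg_database = os.environ["PG_VECTOR_DB"]

engine = create_engine(
    f"postgresql+psycopg2://postgres:{pg_password}@localhost:5432/{pg_database}")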
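
If the database password contains characters such as `@` or `/`, interpolating it directly into the connection URL can produce a URL that does not parse as intended. A minimal sketch of one way to guard against that, using only the standard library; `build_postgres_url` is a hypothetical helper and not part of the notebook:

```python
from urllib.parse import quote_plus


def build_postgres_url(password, database, user="postgres",
                       host="localhost", port=5432):
    """Build a psycopg2 connection URL with the password percent-encoded."""
    # quote_plus escapes characters that would otherwise be read as URL syntax
    safe_password = quote_plus(password)
    return f"postgresql+psycopg2://{user}:{safe_password}@{host}:{port}/{database}"


# Example with a made-up password and database name
print(build_postgres_url("p@ss/word", "candles"))
```

In the notebook, the result could be passed to `create_engine` in place of the hand-written f-string.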

## 3. Testing the connection

To check that the database connection works, we run a SELECT against two of the tables:

In [2]:
# Testing our connection
with engine.connect() as connection:
    cursor = connection.exec_driver_sql("SELECT * FROM users LIMIT 5")
    print(cursor.fetchall())
    
with engine.connect() as connection:
    cursor = connection.exec_driver_sql("SELECT * FROM categories LIMIT 5")
    print(cursor.fetchall())

[(1, 'john_doe', 'john@example.com', 'password123', 'John', 'Doe', '123 Main St, Anytown, USA', '123-456-7890', datetime.datetime(2024, 7, 4, 7, 46, 50, 312399), datetime.datetime(2024, 7, 4, 7, 46, 50, 312399)), (2, 'jane_smith', 'jane@example.com', 'password456', 'Jane', 'Smith', '456 Elm St, Othertown, USA', '987-654-3210', datetime.datetime(2024, 7, 4, 7, 46, 50, 312399), datetime.datetime(2024, 7, 4, 7, 46, 50, 312399)), (3, 'bob_brown', 'bob@example.com', 'password789', 'Bob', 'Brown', '789 Oak St, Thistown, USA', '555-555-5555', datetime.datetime(2024, 7, 4, 7, 46, 50, 312399), datetime.datetime(2024, 7, 4, 7, 46, 50, 312399)), (4, 'alice_jones', 'alice@example.com', 'password1234', 'Alice', 'Jones', '321 Pine St, Yourtown, USA', '234-567-8901', datetime.datetime(2024, 7, 4, 7, 46, 50, 312399), datetime.datetime(2024, 7, 4, 7, 46, 50, 312399)), (5, 'charlie_davis', 'charlie@example.com', 'password5678', 'Charlie', 'Davis', '654 Maple St, Hishtown, USA', '345-678-9012', datetime.

## 4. Setting up the LLM Bedrock configuration

In LlamaIndex, the Settings object is a global configuration interface for your LLM, embedding model, and other miscellaneous options. Here we configure both the LLM and the embedding model using the AWS environment variables we loaded earlier.

We use Claude 3 Sonnet as the LLM and the Cohere multilingual model for embeddings; both are available through Amazon Bedrock.

In [3]:
from llama_index.core import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding

Settings.llm = Bedrock(
    model="anthropic.claude-3-sonnet-20240229-v1:0",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    aws_session_token=os.environ["AWS_SESSION_TOKEN"],
    region_name=os.environ["AWS_DEFAULT_REGION"],
)

Settings.embed_model = BedrockEmbedding(
    model_name="cohere.embed-multilingual-v3",
    region_name=os.environ["AWS_DEFAULT_REGION"],
)
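
The cell above reads several environment variables with `os.environ[...]`, which raises a bare `KeyError` on the first missing name. A small sketch of a check that reports every missing variable at once; `require_env` is a hypothetical helper introduced here for illustration:

```python
import os


def require_env(names, environ=None):
    """Return the requested variables, or raise listing all that are missing."""
    env = os.environ if environ is None else environ
    # Treat unset and empty values the same way
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Example with an explicit mapping instead of the real environment
print(require_env(["AWS_DEFAULT_REGION"], {"AWS_DEFAULT_REGION": "us-east-1"}))
```

Calling it with the names used in this notebook (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_DEFAULT_REGION`) right after `load_dotenv` would surface a misconfigured `.env` file early.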

## 5. Setting up the Database Schema index

The SQLDatabase object is a wrapper around the SQLAlchemy engine, and it is what LlamaIndex uses to interact with the database. We set it up with the tables we want to expose to the system; in this case, all six of them.

In [4]:
from llama_index.core import SQLDatabase

all_tables = [
    ("categories", "This table contains all the categories of scented candles sold in Candle Glow."),
    ("products", "This table contains the products list sold in Candle Glow."),
    ("users", "This table contains the users list who have purchased products from Candle Glow."),
    ("orders", "This table contains the orders list placed by users in Candle Glow."),
    ("order_items", "This table contains the order items list placed by users in Candle Glow."),
    ("reviews", "This table contains the reviews given by users for the products purchased in Candle Glow.")
]

sql_database = SQLDatabase(engine, include_tables=[table_name for table_name, _ in all_tables])

## 6. Building the table schema index and setting up the retriever

Here we create an internal index of the SQL tables using ObjectIndex. This is critical because at query time LlamaIndex needs to know which tables to query to answer the user’s question.

In this article we have a small database of only 6 tables, but in a production setting it is not uncommon to find systems with dozens or even hundreds of tables. So instead of ingesting every table schema for every question, we build an index of table schemas from the table names and descriptions we prepared earlier. Because the database is so small, the retriever below is configured with `similarity_top_k=len(all_tables)`, so it returns all six tables.

Finally, an SQLTableRetrieverQueryEngine is instantiated using the SQLDatabase and the table schema index that we just created.
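
To make the idea of retrieving tables by description more concrete, here is a toy sketch. It is not how LlamaIndex implements ObjectIndex: it swaps the embedding model for a simple bag-of-words cosine similarity so that it runs with the standard library alone, and the names `bag_of_words`, `cosine`, and `top_k_tables` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_tables(question, tables, k=2):
    """Return the names of the k tables whose name and description best match."""
    query = bag_of_words(question)
    scored = [
        (cosine(query, bag_of_words(f"{name} {description}")), name)
        for name, description in tables
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


toy_tables = [
    ("categories", "categories of scented candles"),
    ("reviews", "reviews given by users for products"),
    ("orders", "orders placed by users"),
]
print(top_k_tables("Which reviews mention products?", toy_tables, k=1))
```

With a real embedding model the match is semantic rather than lexical, but the flow is the same: score each table description against the question and keep only the best candidates.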

## 7. Formatting the results output

What’s the use of our results if we cannot read them easily? This step is a formatting convenience that makes the results easy to read. We use Markdown because it is plain text and is widely supported, including in this notebook.


In [5]:
from llama_index.core.indices.struct_store import SQLTableRetrieverQueryEngine
from llama_index.core.objects import (
    SQLTableNodeMapping,
    ObjectIndex,
    SQLTableSchema,
)
from llama_index.core import VectorStoreIndex
from IPython.display import Markdown, display

In [6]:
table_node_mapping = SQLTableNodeMapping(sql_database)

table_schema_objs = []
for table_name, table_description in all_tables:
    table_schema_objs.append(
        SQLTableSchema(table_name=table_name, context_str=table_description))
    
print(table_node_mapping)
print(table_schema_objs)

obj_index = ObjectIndex.from_objects(
    table_schema_objs,
    table_node_mapping,
    VectorStoreIndex,
    callback_manager=callback_manager,
)

query_engine = SQLTableRetrieverQueryEngine(
    sql_database,
    obj_index.as_retriever(similarity_top_k=len(all_tables)),
)

def format_sql_results_as_markdown_table(sql_results, headers):
    """
    Formats the SQL results as a markdown table
    """
    if not sql_results or not headers:
        return "No results found."
    
    header_row = "| " + " | ".join(headers) + " |"
    separator_row = "| " + " | ".join(["---"] * len(headers)) + " |"

    table_rows = []
    for row in sql_results:
        table_row = "|"
        for j, _ in enumerate(headers):
            table_row += f" {str(row[j])} |"
        table_rows.append(table_row)

    markdown_table = "\n".join([header_row, separator_row] + table_rows)
    return markdown_table

response_template = """
### Question

**{number}.** **{question}**

### Answer

{response}

### Generated SQL Query
```sql
{sql}
```

### SQL Results

{sql_results}

"""

def text_to_sql(query_engine, question, number=1):
    """
    Calls the query engine with the given question and displays the response as a markdown cell
    """
    engine_response = query_engine.query(question)
    if "result" in engine_response.metadata and "col_keys" in engine_response.metadata:
        display(Markdown(response_template.format(
                number=number,
                question=question,
                response=str(engine_response),
                sql=engine_response.metadata["sql_query"],
                sql_results=format_sql_results_as_markdown_table(
                    engine_response.metadata["result"],
                    engine_response.metadata["col_keys"]),
            )))
    else:
        print("No results found.")

<llama_index.core.objects.table_node_mapping.SQLTableNodeMapping object at 0x13f92db50>
[SQLTableSchema(table_name='categories', context_str='This table contains all the categories of scented candles sold in Candle Glow.'), SQLTableSchema(table_name='products', context_str='This table contains the products list sold in Candle Glow.'), SQLTableSchema(table_name='users', context_str='This table contains the users list who have purchased products from Candle Glow.'), SQLTableSchema(table_name='orders', context_str='This table contains the orders list placed by users in Candle Glow.'), SQLTableSchema(table_name='order_items', context_str='This table contains the order items list placed by users in Candle Glow.'), SQLTableSchema(table_name='reviews', context_str='This table contains the reviews given by users for the products purchased in Candle Glow.')]
**********
Trace: index_construction
**********
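
The formatter defined above inserts cell values verbatim, so a value that contains a pipe character would be rendered as extra columns. A hedged sketch of a variant that escapes pipes; `to_markdown_table` is a hypothetical name and this is not the function the notebook uses:

```python
def to_markdown_table(rows, headers):
    """Format rows as a Markdown table, escaping pipe characters in cells."""
    if not rows or not headers:
        return "No results found."

    def cell(value):
        # A literal "|" would otherwise be read as a column separator
        return str(value).replace("|", "\\|")

    lines = [
        "| " + " | ".join(cell(h) for h in headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        # Only as many values as there are headers are rendered
        lines.append("| " + " | ".join(cell(v) for v in row[:len(headers)]) + " |")
    return "\n".join(lines)


print(to_markdown_table([("Earthy", 484), ("Floral", 436)], ["name", "order_count"]))
```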


## 8. Sample invocation output

Here are a few sample questions and answers that show what this Text2SQL system can do. With LlamaIndex and a capable LLM such as Claude 3, you can connect to your SQL database and get insights using nothing but natural language.
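
When many questions are asked in a row, numbering them by hand is easy to get wrong. A minimal sketch of a loop that assigns the numbers automatically; `ask_all` is a hypothetical helper, and the `ask` callable here is a stand-in that only records its arguments:

```python
def ask_all(ask, questions, start=1):
    """Call ask(question, number) for each question with consecutive numbers."""
    for number, question in enumerate(questions, start=start):
        ask(question, number)


# Stand-in for the real query call: record what would have been asked
asked = []
ask_all(lambda question, number: asked.append((number, question)),
        ["How many tables are there in the database?",
         "What are the categories of products sold?"])
print(asked)
```

In this notebook, `ask` could be `lambda q, n: text_to_sql(query_engine, q, number=n)`.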


In [7]:
text_to_sql(query_engine, "How many tables are there in the database?", number=1)


### Question

**1.** **How many tables are there in the database?**

### Answer

Based on the SQL query and response, there are 6 tables in the database schema 'public'.

### Generated SQL Query
```sql
SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'public';
```

### SQL Results

| count |
| --- |
| 6 |



In [8]:
text_to_sql(query_engine, "What are the categories of products sold?", number=2)


### Question

**2.** **What are the categories of products sold?**

### Answer

Based on the SQL query and response, the categories of products sold are Earthy, Floral, Fresh, Fruity, and Spicy. These categories represent different scent profiles for candles, with Earthy candles having earthy scents like sandalwood, cedar, and pine, Floral candles having floral scents like rose, lavender, and jasmine, Fresh candles having fresh scents like mint, eucalyptus, and ocean breeze, Fruity candles having fruity scents like apple, orange, and berry, and Spicy candles having spicy scents like cinnamon, clove, and ginger.

### Generated SQL Query
```sql
SELECT name, description 
FROM categories
ORDER BY name;
```

### SQL Results

| name | description |
| --- | --- |
| Earthy | Candles with earthy scents like sandalwood, cedar, and pine. |
| Floral | Candles with floral scents like rose, lavender, and jasmine. |
| Fresh | Candles with fresh scents like mint, eucalyptus, and ocean breeze. |
| Fruity | Candles with fruity scents like apple, orange, and berry. |
| Spicy | Candles with spicy scents like cinnamon, clove, and ginger. |



In [11]:
question = """Give me a list of orders in the last week. Also include the user (firstname and lastname) who placed the order and the total amount of the order."""
text_to_sql(query_engine, question, number=3)


### Question

**3.** **Give me a list of orders in the last week. Also include the user (firstname and lastname) who placed the order and the total amount of the order.**

### Answer

Here is a summary of the orders placed in the last week:

A total of 26 orders were placed by 12 different users. The orders range in total amount from $17.26 to $222.62.

Some of the top orders by total amount include:

- An order for $222.62 placed by Rachel Garcia
- An order for $175.68 placed by Fiona Hall 
- An order for $174.09 placed by Ian Lee
- An order for $154.40 placed by Paula Martinez

The users with the most orders in the last week were:

- Jessica Adams with 3 orders totaling $151.82
- Bob Brown with 2 orders totaling $232.24  
- Edward Wilson with 2 orders totaling $177.52
- Nina Thomas with 2 orders totaling $251.86

The total revenue from these 26 orders over the past week was $2,537.18.

### Generated SQL Query
```sql
SELECT 
    o.order_id, 
    u.first_name, 
    u.last_name, 
    o.total_amount
FROM 
    orders o
JOIN 
    users u ON o.user_id = u.user_id
WHERE 
    o.order_date >= NOW() - INTERVAL '1 week'
ORDER BY 
    o.order_date DESC;
```

### SQL Results

| order_id | first_name | last_name | total_amount |
| --- | --- | --- | --- |
| 1117 | Charlie | Davis | 74.09 |
| 1121 | Edward | Wilson | 75.56 |
| 1120 | Jessica | Adams | 17.26 |
| 1119 | Nina | Thomas | 144.26 |
| 1118 | Hannah | Clark | 82.02 |
| 1114 | Alice | Jones | 94.21 |
| 1116 | Oliver | Miller | 71.09 |
| 1115 | Michael | Taylor | 38.12 |
| 1113 | Ian | Lee | 174.09 |
| 1109 | Nina | Thomas | 107.60 |
| 1110 | John | Doe | 39.69 |
| 1111 | Quincy | Rodriguez | 90.39 |
| 1112 | Rachel | Garcia | 222.62 |
| 1106 | Fiona | Hall | 44.00 |
| 1108 | Edward | Wilson | 101.96 |
| 1107 | Jessica | Adams | 25.34 |
| 1103 | Laura | Moore | 34.10 |
| 1105 | Jessica | Adams | 109.22 |
| 1104 | Fiona | Hall | 175.68 |
| 1100 | Bob | Brown | 101.80 |
| 1102 | Ian | Lee | 45.99 |
| 1101 | Oliver | Miller | 35.43 |
| 1099 | Laura | Moore | 136.36 |
| 1097 | Bob | Brown | 130.44 |
| 1098 | Jane | Smith | 55.12 |
| 1096 | Paula | Martinez | 154.40 |



In [13]:
question = """Give me a summary of sales from the last three months and group per month."""
text_to_sql(query_engine, question, number=4)


### Question

**4.** **Give me a summary of sales from the last three months and group per month.**

### Answer

Based on the query results, here is a summary of sales from the last three months grouped by month:

In July 2024, the total sales amounted to $1,402.30.
In June 2024, the total sales were $9,022.66, which was the highest among the three months.
In May 2024, the total sales were $7,953.47.
In April 2024, the total sales were $7,840.62.

The sales figures show that June 2024 had the highest sales, followed by May and April. July 2024 had the lowest sales among the four months, likely because the data only includes partial month sales.

### Generated SQL Query
```sql
SELECT 
    DATE_TRUNC('month', order_date) AS month,
    SUM(total_amount) AS total_sales
FROM 
    orders
WHERE
    order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '3 months')
GROUP BY
    month
ORDER BY
    month DESC;
```

### SQL Results

| month | total_sales |
| --- | --- |
| 2024-07-01 00:00:00 | 1402.30 |
| 2024-06-01 00:00:00 | 9022.66 |
| 2024-05-01 00:00:00 | 7953.47 |
| 2024-04-01 00:00:00 | 7840.62 |
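
The question asked for three months, yet the result has four rows. The generated filter truncates the cutoff to the start of the month, so the window covers three full months plus the current partial one. The sketch below reproduces that arithmetic in plain Python. It assumes the query ran on a day in July 2024 (4 July is used purely for illustration), and the helper names are hypothetical.

```python
from datetime import date


def month_start_n_months_ago(today, n):
    """First day of the month that lies n months before today's month."""
    index = today.year * 12 + (today.month - 1) - n
    return date(index // 12, index % 12 + 1, 1)


def month_buckets(today, n):
    """All month starts from the truncated cutoff up to the current month."""
    start = month_start_n_months_ago(today, n)
    buckets = []
    year, month = start.year, start.month
    while (year, month) <= (today.year, today.month):
        buckets.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return buckets


for bucket in month_buckets(date(2024, 7, 4), 3):
    print(bucket.isoformat())
```

If exactly three complete months are wanted, the question can say so explicitly, for example by asking to exclude the current month.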



In [16]:
question = """List top ten orders (name and description) along with the total number of items in each order in the last week."""
text_to_sql(query_engine, question, number=5)


### Question

**5.** **List top ten orders (name and description) along with the total number of items in each order in the last week.**

### Answer

Based on the SQL query and results, here is a synthesized response:

In the last week, the top 10 orders by total number of items were:

1. Oakmoss Oasis candle with 14 items ordered. This tranquil oakmoss-scented candle was the most popular.

2. Orange Zest candle with 12 items ordered. Customers loved this zesty orange scent. 

3. Minty Fresh candle with 11 items ordered. A refreshing mint aroma made this a fan favorite.

4. Pine Forest and Patchouli Passion candles tied with 9 items each. The refreshing pine and passionate patchouli scents were equally sought after.

6. Cedarwood Bliss candle also had 9 items ordered, making this earthy cedarwood scent a top seller.

7. Cinnamon Spice candle had 8 items ordered, satisfying cravings for a spicy cinnamon fragrance.

8. Ginger Glow, Sandalwood Serenity, and Clove Comfort candles rounded out the top 10, each with 7 items ordered. These warming ginger, serene sandalwood, and comforting clove aromas were popular choices.

The results highlight our customers' diverse preferences for different natural scents over the past week. From earthy to zesty to spicy, a variety of fragrances made it into the top 10 best-selling candles.

### Generated SQL Query
```sql
SELECT p.name, p.description, SUM(oi.quantity) AS total_items
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE o.order_date >= NOW() - INTERVAL '1 WEEK'
GROUP BY p.name, p.description
ORDER BY total_items DESC
LIMIT 10;
```

### SQL Results

| name | description | total_items |
| --- | --- | --- |
| Oakmoss Oasis | A tranquil oakmoss-scented candle. | 14 |
| Orange Zest | A zesty orange-scented candle. | 12 |
| Minty Fresh | A refreshing mint-scented candle. | 11 |
| Pine Forest | A refreshing pine-scented candle. | 9 |
| Patchouli Passion | A passionate patchouli-scented candle. | 9 |
| Cedarwood Bliss | An earthy cedarwood-scented candle. | 9 |
| Cinnamon Spice | A spicy cinnamon-scented candle. | 8 |
| Ginger Glow | A warming ginger-scented candle. | 7 |
| Sandalwood Serenity | A serene sandalwood-scented candle. | 7 |
| Clove Comfort | A comforting clove-scented candle. | 7 |



In [18]:
question = """Retrieve the average rating for each product, just return the top 10 products."""
text_to_sql(query_engine, question, number=6)


### Question

**6.** **Retrieve the average rating for each product, just return the top 10 products.**

### Answer

Based on the SQL query and results, here is a possible response:

The top 10 highest rated products based on average customer ratings are:

1. Nutmeg Nook with an average rating of 3.45
2. Ocean Breeze with an average rating of 3.41
3. Minty Fresh with an average rating of 3.26
4. Cedarwood Bliss with an average rating of 3.25  
5. Oakmoss Oasis with an average rating of 3.22
6. Eucalyptus Escape with an average rating of 3.19
7. Apple Delight with an average rating of 3.18
8. Cardamom Charm with an average rating of 3.15
9. Pineapple Paradise with an average rating of 3.11
10. Berry Bliss with an average rating of 3.10

The query retrieves the product names and calculates the average rating for each product from the reviews table. It orders the results by the average rating in descending order and limits the output to the top 10 products. This allows us to highlight the most highly rated products based on customer feedback.

### Generated SQL Query
```sql
SELECT p.name, AVG(r.rating) AS average_rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.name
ORDER BY average_rating DESC
LIMIT 10;
```

### SQL Results

| name | average_rating |
| --- | --- |
| Nutmeg Nook | 3.4500000000000000 |
| Ocean Breeze | 3.4117647058823529 |
| Minty Fresh | 3.2608695652173913 |
| Cedarwood Bliss | 3.2500000000000000 |
| Oakmoss Oasis | 3.2187500000000000 |
| Eucalyptus Escape | 3.1923076923076923 |
| Apple Delight | 3.1818181818181818 |
| Cardamom Charm | 3.1481481481481481 |
| Pineapple Paradise | 3.1111111111111111 |
| Berry Bliss | 3.0952380952380952 |
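
The long decimals in this table are consistent with the driver returning Python `Decimal` values for the `AVG` column, which `str()` renders in full. That is an assumption about the result types, not something the notebook prints. Under that assumption, a cell formatter along these lines would round such values for display; `format_cell` is a hypothetical helper:

```python
from decimal import Decimal, ROUND_HALF_UP


def format_cell(value, places=2):
    """Render Decimal values with a fixed number of places; pass others through."""
    if isinstance(value, Decimal):
        quantum = Decimal(10) ** -places  # e.g. Decimal("0.01") for two places
        return str(value.quantize(quantum, rounding=ROUND_HALF_UP))
    return str(value)


print(format_cell(Decimal("3.4117647058823529")))
print(format_cell("Ocean Breeze"))
```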



In [21]:
question = """Find the user (firstname and lastname) who placed the highest total amount of orders. And include the total amount of orders placed by this user."""
text_to_sql(query_engine, question, number=7)


### Question

**7.** **Find the user (firstname and lastname) who placed the highest total amount of orders. And include the total amount of orders placed by this user.**

### Answer

Based on the SQL query and the response, the user who placed the highest total amount of orders is Paula Martinez, with a total order amount of $6405.77.

### Generated SQL Query
```sql
SELECT u.first_name, u.last_name, SUM(o.total_amount) AS total_order_amount
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY total_order_amount DESC
LIMIT 1;
```

### SQL Results

| first_name | last_name | total_order_amount |
| --- | --- | --- |
| Paula | Martinez | 6405.77 |



In [22]:
question = """Get the month-by-month sales total for the past year."""
text_to_sql(query_engine, question, number=8)


### Question

**8.** **Get the month-by-month sales total for the past year.**

### Answer

Based on the SQL query and results, here is a synthesized response:

The month-by-month sales totals for the past year show that sales were highest in June 2024 at $9,022.66. The next highest months were March 2024 at $9,333.30 and December 2023 at $9,260.40. Sales were lowest in July 2023 at $6,493.30 and July 2024 at $1,402.30, though the current month is only partially complete. Overall, sales remained fairly consistent throughout the year, ranging from around $7,000 to $9,000 per month for most months, with a few peaks over $9,000. The total sales for the 12-month period was $100,847.32.

### Generated SQL Query
```sql
SELECT DATE_TRUNC('month', order_date) AS month, SUM(total_amount) AS total_sales
FROM orders
WHERE order_date >= NOW() - INTERVAL '1 year'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month DESC;
```

### SQL Results

| month | total_sales |
| --- | --- |
| 2024-07-01 00:00:00 | 1402.30 |
| 2024-06-01 00:00:00 | 9022.66 |
| 2024-05-01 00:00:00 | 7953.47 |
| 2024-04-01 00:00:00 | 7840.62 |
| 2024-03-01 00:00:00 | 9333.30 |
| 2024-02-01 00:00:00 | 8226.47 |
| 2024-01-01 00:00:00 | 8910.12 |
| 2023-12-01 00:00:00 | 9260.40 |
| 2023-11-01 00:00:00 | 8636.59 |
| 2023-10-01 00:00:00 | 7955.52 |
| 2023-09-01 00:00:00 | 8056.06 |
| 2023-08-01 00:00:00 | 7249.51 |
| 2023-07-01 00:00:00 | 6493.30 |



In [25]:
question = """Find the top three popular product categories based on the number of orders."""
text_to_sql(query_engine, question, number=9)


### Question

**9.** **Find the top three popular product categories based on the number of orders.**

### Answer

Based on the SQL query and results, the top three most popular product categories based on the number of orders are:

1. Spicy (484 orders)
2. Earthy (484 orders) 
3. Floral (436 orders)

The query first joins the categories, products, and order_items tables to connect the category information with the order details. It then groups the results by category name and counts the number of order_ids for each category. Finally, it orders the results by the order count in descending order and limits the output to the top 3 rows.

### Generated SQL Query
```sql
SELECT c.name, COUNT(oi.order_id) AS order_count
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY c.name
ORDER BY order_count DESC
LIMIT 3;
```

### SQL Results

| name | order_count |
| --- | --- |
| Spicy | 484 |
| Earthy | 484 |
| Floral | 436 |
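
One thing to keep in mind when reading this result: `COUNT(oi.order_id)` counts rows in `order_items`, so an order that contains two products from the same category contributes twice. Whether that matters depends on the data. The sketch below shows the difference on a tiny made-up table, using an in-memory SQLite database as a stand-in for Postgres; the function name is hypothetical.

```python
import sqlite3


def count_line_items_and_orders():
    """Compare COUNT(order_id) with COUNT(DISTINCT order_id) on toy data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript("""
            CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
            -- Order 1 has two line items, order 2 has one
            INSERT INTO order_items VALUES (1, 10), (1, 11), (2, 10);
        """)
        return conn.execute(
            "SELECT COUNT(order_id), COUNT(DISTINCT order_id) FROM order_items"
        ).fetchone()
    finally:
        conn.close()


line_items, distinct_orders = count_line_items_and_orders()
print(line_items, distinct_orders)
```

Asking explicitly for the number of distinct orders in the question is one way to steer the generated SQL toward `COUNT(DISTINCT ...)`.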



In [35]:
question = """Identify trends in product ratings over the last quarter and group by product and month."""
text_to_sql(query_engine, question, number=10)


### Question

**10.** **Identify trends in product ratings over the last quarter and group by product and month.**

### Answer

Based on the query results, here are the key trends in product ratings over the last quarter, grouped by product and month:

For the most recent month (July 2024):
- The top rated products with a perfect 5.0 average rating were Cedarwood Bliss, Nutmeg Nook, Rose Garden, Oakmoss Oasis, and Herbal Harmony.
- The lowest rated products were Pineapple Paradise and Eucalyptus Escape with an average of just 1.0.

In June 2024:
- Violet Petals had the highest average rating of 5.0.
- Several products like Citrus Splash, Eucalyptus Escape, Ocean Breeze, and Minty Fresh had relatively high ratings around 4.0.
- Lower rated products included Cardamom Charm, Herbal Harmony (1.5), Orange Zest and Sandalwood Serenity (2.0).

For May 2024: 
- Cinnamon Spice was the top rated product with 5.0 average.
- Tropical Mango, Herbal Harmony, Cedarwood Bliss and Nutmeg Nook all had high ratings around 4.0-4.5.
- The lowest rated were Ocean Breeze and Ginger Glow at just 1.0.

In April 2024:
- Rose Garden had a perfect 5.0 rating. 
- Apple Delight, Berry Bliss and Violet Petals were also highly rated around 4.0.
- The lowest rated products were Herbal Harmony, Pine Forest and Cinnamon Spice at just 1.0.

So in summary, the top consistently highly rated products were Cedarwood Bliss, Nutmeg Nook, Rose Garden and Violet Petals. The lowest rated tended to be Pineapple Paradise, Eucalyptus Escape, Herbal Harmony and Cinnamon Spice, though ratings varied month-to-month.

### Generated SQL Query
```sql
SELECT p.name, DATE_TRUNC('month', r.created_at) AS month, AVG(r.rating) AS avg_rating
FROM reviews r
JOIN products p ON r.product_id = p.product_id
WHERE r.created_at >= DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '3 MONTH'
GROUP BY p.name, DATE_TRUNC('month', r.created_at)
ORDER BY month DESC, avg_rating DESC;
```

### SQL Results

| name | month | avg_rating |
| --- | --- | --- |
| Cedarwood Bliss | 2024-07-01 00:00:00 | 5.0000000000000000 |
| Nutmeg Nook | 2024-07-01 00:00:00 | 5.0000000000000000 |
| Rose Garden | 2024-07-01 00:00:00 | 5.0000000000000000 |
| Oakmoss Oasis | 2024-07-01 00:00:00 | 5.0000000000000000 |
| Herbal Harmony | 2024-07-01 00:00:00 | 5.0000000000000000 |
| Berry Bliss | 2024-07-01 00:00:00 | 2.0000000000000000 |
| Pineapple Paradise | 2024-07-01 00:00:00 | 1.00000000000000000000 |
| Eucalyptus Escape | 2024-07-01 00:00:00 | 1.00000000000000000000 |
| Violet Petals | 2024-06-01 00:00:00 | 5.0000000000000000 |
| Citrus Splash | 2024-06-01 00:00:00 | 4.0000000000000000 |
| Eucalyptus Escape | 2024-06-01 00:00:00 | 4.0000000000000000 |
| Ocean Breeze | 2024-06-01 00:00:00 | 4.0000000000000000 |
| Minty Fresh | 2024-06-01 00:00:00 | 3.8000000000000000 |
| Clove Comfort | 2024-06-01 00:00:00 | 3.6666666666666667 |
| Pineapple Paradise | 2024-06-01 00:00:00 | 3.3333333333333333 |
| Nutmeg Nook | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Berry Bliss | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Cinnamon Spice | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Ginger Glow | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Tropical Mango | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Apple Delight | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Lavender Dream | 2024-06-01 00:00:00 | 3.0000000000000000 |
| Oakmoss Oasis | 2024-06-01 00:00:00 | 2.7500000000000000 |
| Patchouli Passion | 2024-06-01 00:00:00 | 2.6666666666666667 |
| Rose Garden | 2024-06-01 00:00:00 | 2.3333333333333333 |
| Orange Zest | 2024-06-01 00:00:00 | 2.0000000000000000 |
| Sandalwood Serenity | 2024-06-01 00:00:00 | 2.0000000000000000 |
| Cardamom Charm | 2024-06-01 00:00:00 | 1.5000000000000000 |
| Herbal Harmony | 2024-06-01 00:00:00 | 1.5000000000000000 |
| Cinnamon Spice | 2024-05-01 00:00:00 | 5.0000000000000000 |
| Tropical Mango | 2024-05-01 00:00:00 | 4.5000000000000000 |
| Herbal Harmony | 2024-05-01 00:00:00 | 4.0000000000000000 |
| Cedarwood Bliss | 2024-05-01 00:00:00 | 4.0000000000000000 |
| Nutmeg Nook | 2024-05-01 00:00:00 | 4.0000000000000000 |
| Orange Zest | 2024-05-01 00:00:00 | 3.6666666666666667 |
| Violet Petals | 2024-05-01 00:00:00 | 3.5000000000000000 |
| Patchouli Passion | 2024-05-01 00:00:00 | 3.5000000000000000 |
| Eucalyptus Escape | 2024-05-01 00:00:00 | 3.2000000000000000 |
| Minty Fresh | 2024-05-01 00:00:00 | 3.0000000000000000 |
| Clove Comfort | 2024-05-01 00:00:00 | 3.0000000000000000 |
| Cardamom Charm | 2024-05-01 00:00:00 | 2.7500000000000000 |
| Jasmine Bloom | 2024-05-01 00:00:00 | 2.6666666666666667 |
| Lily of the Valley | 2024-05-01 00:00:00 | 2.6666666666666667 |
| Rose Garden | 2024-05-01 00:00:00 | 2.5000000000000000 |
| Pineapple Paradise | 2024-05-01 00:00:00 | 2.5000000000000000 |
| Oakmoss Oasis | 2024-05-01 00:00:00 | 2.5000000000000000 |
| Berry Bliss | 2024-05-01 00:00:00 | 2.0000000000000000 |
| Apple Delight | 2024-05-01 00:00:00 | 2.0000000000000000 |
| Lavender Dream | 2024-05-01 00:00:00 | 2.0000000000000000 |
| Pine Forest | 2024-05-01 00:00:00 | 2.0000000000000000 |
| Citrus Splash | 2024-05-01 00:00:00 | 1.5000000000000000 |
| Ocean Breeze | 2024-05-01 00:00:00 | 1.00000000000000000000 |
| Ginger Glow | 2024-05-01 00:00:00 | 1.00000000000000000000 |
| Rose Garden | 2024-04-01 00:00:00 | 5.0000000000000000 |
| Apple Delight | 2024-04-01 00:00:00 | 4.0000000000000000 |
| Berry Bliss | 2024-04-01 00:00:00 | 4.0000000000000000 |
| Violet Petals | 2024-04-01 00:00:00 | 3.7500000000000000 |
| Nutmeg Nook | 2024-04-01 00:00:00 | 3.5000000000000000 |
| Patchouli Passion | 2024-04-01 00:00:00 | 3.5000000000000000 |
| Citrus Splash | 2024-04-01 00:00:00 | 3.3333333333333333 |
| Lily of the Valley | 2024-04-01 00:00:00 | 3.0000000000000000 |
| Minty Fresh | 2024-04-01 00:00:00 | 3.0000000000000000 |
| Oakmoss Oasis | 2024-04-01 00:00:00 | 3.0000000000000000 |
| Clove Comfort | 2024-04-01 00:00:00 | 3.0000000000000000 |
| Ginger Glow | 2024-04-01 00:00:00 | 3.0000000000000000 |
| Ocean Breeze | 2024-04-01 00:00:00 | 2.0000000000000000 |
| Pineapple Paradise | 2024-04-01 00:00:00 | 2.0000000000000000 |
| Sandalwood Serenity | 2024-04-01 00:00:00 | 2.0000000000000000 |
| Orange Zest | 2024-04-01 00:00:00 | 2.0000000000000000 |
| Jasmine Bloom | 2024-04-01 00:00:00 | 1.6666666666666667 |
| Tropical Mango | 2024-04-01 00:00:00 | 1.3333333333333333 |
| Herbal Harmony | 2024-04-01 00:00:00 | 1.00000000000000000000 |
| Pine Forest | 2024-04-01 00:00:00 | 1.00000000000000000000 |
| Cinnamon Spice | 2024-04-01 00:00:00 | 1.00000000000000000000 |



In [29]:
question = """Calculate the average order value (AOV) for each user (firstname and lastname). Just give me the top 10 users."""
text_to_sql(query_engine, question, number=11)


### Question

**11.** **Calculate the average order value (AOV) for each user (firstname and lastname). Just give me the top 10 users.**

### Answer

Here is the response synthesized from the SQL query results:

The top 10 users by average order value (AOV) are:

1. Bob Brown with an average order value of $100.22
2. Nina Thomas with an average order value of $97.26
3. Jane Smith with an average order value of $97.16
4. Paula Martinez with an average order value of $95.61
5. Alice Jones with an average order value of $93.67
6. Laura Moore with an average order value of $92.65
7. Hannah Clark with an average order value of $92.37
8. Diana Evans with an average order value of $90.87
9. George White with an average order value of $90.17
10. John Doe with an average order value of $89.36

The query calculates the average total order amount grouped by the first and last name of each user, orders the results by the average order value in descending order, and limits the output to the top 10 rows.

### Generated SQL Query
```sql
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS avg_order_value
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY avg_order_value DESC
LIMIT 10;
```

### SQL Results

| first_name | last_name | avg_order_value |
| --- | --- | --- |
| Bob | Brown | 100.2212244897959184 |
| Nina | Thomas | 97.2570000000000000 |
| Jane | Smith | 97.1598275862068966 |
| Paula | Martinez | 95.6085074626865672 |
| Alice | Jones | 93.6657446808510638 |
| Laura | Moore | 92.6505769230769231 |
| Hannah | Clark | 92.3651470588235294 |
| Diana | Evans | 90.8719642857142857 |
| George | White | 90.1746875000000000 |
| John | Doe | 89.3596774193548387 |



In [30]:
question = """Berechnen Sie den durchschnittlichen Bestellwert für jeden Benutzer (Vorname und Nachname). Geben Sie nur die Top-10-Benutzer an."""
# Now query in German (Calculate the average order value for each user (firstname and lastname). Just give me the top 10 users.)
text_to_sql(query_engine, question, number=11)


### Question

**11.** **Berechnen Sie den durchschnittlichen Bestellwert für jeden Benutzer (Vorname und Nachname). Geben Sie nur die Top-10-Benutzer an.**

### Answer

Die Abfrage berechnet den durchschnittlichen Bestellwert für jeden Benutzer und gibt die Top 10 Benutzer mit dem höchsten durchschnittlichen Bestellwert aus. Basierend auf den Ergebnissen lautet die Antwort:

Die Top 10 Benutzer mit dem höchsten durchschnittlichen Bestellwert sind:

1. Bob Brown mit einem durchschnittlichen Bestellwert von 100,22
2. Nina Thomas mit 97,26
3. Jane Smith mit 97,16 
4. Paula Martinez mit 95,61
5. Alice Jones mit 93,67
6. Laura Moore mit 92,65
7. Hannah Clark mit 92,37
8. Diana Evans mit 90,87
9. George White mit 90,17
10. John Doe mit 89,36

### Generated SQL Query
```sql
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS avg_order_value
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY avg_order_value DESC
LIMIT 10;
```

### SQL Results

| first_name | last_name | avg_order_value |
| --- | --- | --- |
| Bob | Brown | 100.2212244897959184 |
| Nina | Thomas | 97.2570000000000000 |
| Jane | Smith | 97.1598275862068966 |
| Paula | Martinez | 95.6085074626865672 |
| Alice | Jones | 93.6657446808510638 |
| Laura | Moore | 92.6505769230769231 |
| Hannah | Clark | 92.3651470588235294 |
| Diana | Evans | 90.8719642857142857 |
| George | White | 90.1746875000000000 |
| John | Doe | 89.3596774193548387 |



In [31]:
question = """각 사용자(이름 및 성)의 평균 주문 가치를 계산하십시오. 상위 10명의 사용자만 표시하십시오."""
# Same question in Korean: "Calculate the average order value for each user (first name and last name). Show only the top 10 users."
text_to_sql(query_engine, question, number=12)


### Question

**12.** **각 사용자(이름 및 성)의 평균 주문 가치를 계산하십시오. 상위 10명의 사용자만 표시하십시오.**

### Answer

이 쿼리는 각 사용자의 평균 주문 금액을 계산하고 상위 10명의 사용자를 표시합니다. 결과에 따르면 Bob Brown이 평균 주문 금액 $100.22로 가장 높았고, 그 다음으로 Nina Thomas, Jane Smith, Paula Martinez 순입니다. 상위 10명의 사용자 이름과 성, 그리고 각각의 평균 주문 금액이 표시되어 있습니다.

### Generated SQL Query
```sql
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS avg_order_value
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY avg_order_value DESC
LIMIT 10;
```

### SQL Results

| first_name | last_name | avg_order_value |
| --- | --- | --- |
| Bob | Brown | 100.2212244897959184 |
| Nina | Thomas | 97.2570000000000000 |
| Jane | Smith | 97.1598275862068966 |
| Paula | Martinez | 95.6085074626865672 |
| Alice | Jones | 93.6657446808510638 |
| Laura | Moore | 92.6505769230769231 |
| Hannah | Clark | 92.3651470588235294 |
| Diana | Evans | 90.8719642857142857 |
| George | White | 90.1746875000000000 |
| John | Doe | 89.3596774193548387 |



In [32]:
question = """Kalkulahin ang average na halaga ng order para sa bawat user (pangalan at apelyido). Ipakita lamang ang nangungunang 10 mga user."""
# Same question in Filipino: "Calculate the average order value for each user (first name and last name). Show only the top 10 users."
text_to_sql(query_engine, question, number=13)


### Question

**13.** **Kalkulahin ang average na halaga ng order para sa bawat user (pangalan at apelyido). Ipakita lamang ang nangungunang 10 mga user.**

### Answer

Batay sa query na iyon, nakuha natin ang nangungunang 10 mga user na may pinakamalaking average na halaga ng order. Ang resulta ay nakalista ng pangalan at apelyido ng user, kasama ang kanilang average order amount na nakasunod-sunod mula sa pinakamalaki hanggang ika-10. Halimbawa, si Bob Brown ang may pinakamalaking average order amount na $100.22, sinusundan ni Nina Thomas na $97.26, Jane Smith na $97.16, at iba pa. Ang query na ito ay nakatuon sa pagsusuri ng gastos ng mga customer batay sa kanilang mga nakalipas na order upang makapagbigay ng ideya kung sino ang mga pinakamahalagang customer sa negosyo.

### Generated SQL Query
```sql
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS average_order_amount
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY average_order_amount DESC
LIMIT 10;
```

### SQL Results

| first_name | last_name | average_order_amount |
| --- | --- | --- |
| Bob | Brown | 100.2212244897959184 |
| Nina | Thomas | 97.2570000000000000 |
| Jane | Smith | 97.1598275862068966 |
| Paula | Martinez | 95.6085074626865672 |
| Alice | Jones | 93.6657446808510638 |
| Laura | Moore | 92.6505769230769231 |
| Hannah | Clark | 92.3651470588235294 |
| Diana | Evans | 90.8719642857142857 |
| George | White | 90.1746875000000000 |
| John | Doe | 89.3596774193548387 |
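
The German and Korean runs produced the alias `avg_order_value`, while the Filipino run produced `average_order_amount`. The queries are otherwise the same. One way to check this across languages is to normalize the generated SQL before comparing it. The sketch below assumes a hypothetical helper, `normalize_sql`, which is not part of LlamaIndex and only performs a textual comparison (it does not prove semantic equivalence):

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, drop the trailing semicolon,
    and replace every `AS <alias>` name with a positional placeholder."""
    text = " ".join(sql.lower().split()).rstrip(";").strip()
    aliases = re.findall(r"\bas\s+([a-z_][a-z0-9_]*)", text)
    for i, alias in enumerate(aliases):
        text = re.sub(rf"\b{re.escape(alias)}\b", f"alias_{i}", text)
    return text


# Generated SQL copied from the German and Filipino runs above.
german_sql = """
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS avg_order_value
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY avg_order_value DESC
LIMIT 10;
"""

filipino_sql = """
SELECT u.first_name, u.last_name, AVG(o.total_amount) AS average_order_amount
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.first_name, u.last_name
ORDER BY average_order_amount DESC
LIMIT 10;
"""

# True when the two queries differ only in alias naming, case, or whitespace.
same_query = normalize_sql(german_sql) == normalize_sql(filipino_sql)
print(same_query)
```

A check like this only catches cosmetic differences. Queries that are written differently but return the same rows would still compare as unequal, so comparing the result tables remains the more reliable test.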

