## Agents

Agents use the reasoning capabilities of LLMs to make decisions during execution.
Using an agent lets you hand more control over query generation and execution to the model. Although agent behavior is less predictable than that of the chain above, agents offer some advantages:

- They can query the database as many times as needed to answer the user question.
- They can recover from errors by running a generated query, catching the traceback, and regenerating the query correctly.
- They can answer questions based on the database's schema as well as its content (for example, describing a specific table).
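
The error-recovery behavior in the second bullet can be sketched as a retry loop. This is only an illustration, not how LangChain or LangGraph implement it: `fake_llm` is a hypothetical stand-in for a model call, `answer_with_retries` is a hypothetical helper, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def fake_llm(question, error=None):
    """Hypothetical stand-in for an LLM call.

    Returns a query with a wrong column name first, then a corrected
    query once it has been shown the database error.
    """
    if error is None:
        return "SELECT COUNT(*) FROM invoice WHERE invoice.date >= '2013-01-01'"
    return "SELECT COUNT(*) FROM invoice WHERE invoice.invoice_date >= '2013-01-01'"


def answer_with_retries(conn, question, max_attempts=3):
    """Run generated SQL, feeding any database error back to the generator."""
    error = None
    for attempt in range(1, max_attempts + 1):
        query = fake_llm(question, error)
        try:
            return conn.execute(query).fetchall(), attempt
        except sqlite3.Error as exc:
            # An agent would see this error text on its next turn.
            error = str(exc)
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {error}")


# Toy data: one invoice before 2013 and two from 2013.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, invoice_date TEXT)")
conn.executemany(
    "INSERT INTO invoice (invoice_date) VALUES (?)",
    [("2012-12-31",), ("2013-03-01",), ("2013-07-15",)],
)

rows, attempts = answer_with_retries(conn, "How many invoices were issued since 2013?")
print(rows, attempts)
```

The first attempt fails on the unknown column, and the second attempt succeeds after the error message is passed back to the generator.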


In [None]:
# question = "How many customers do we have?"
# question = "List all employees."
# question = "Show me the top 5 countries by number of customers."
# question = "What is the total revenue from all invoices?"
question = "How many invoices were issued in 2013?"
# question = "Who is the top-selling artist by revenue?"
# question = "Who are our top 10 customers by total spending?"
# question = "What is the average invoice total?"
# question = "Show me a list of customers who have never placed an invoice."

### Question 5
"How many invoices were issued in 2013?"

To answer the original question, we need to use a different approach since there is no `date` column in the provided database schema. We can assume that the `invoice` table has a date field representing the invoice's creation or issue date.

Here's an updated SQL query:

```sql
SELECT COUNT(*) 
FROM invoice 
WHERE YEAR(invoice.date) = 2013;
```

This query will return the number of invoices issued in 2013.

The correct query was:
```sql
SELECT COUNT(*) 
FROM invoice 
WHERE EXTRACT(YEAR FROM invoice.invoice_date) = 2013;
```

There were two errors:
1. `invoice.date` does not exist in the schema; the column is `invoice.invoice_date`.
2. PostgreSQL has no `YEAR(...)` function; use `EXTRACT(YEAR FROM ...)` instead.
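
An alternative that avoids the dialect-specific `YEAR`/`EXTRACT` difference altogether is to filter on a half-open date range and pass the bounds as query parameters. The sketch below is one possible approach, not the one used in this notebook: `year_bounds` is a hypothetical helper, and the `%s` placeholder style is an assumption that matches drivers such as psycopg2.

```python
from datetime import date


def year_bounds(year):
    """Return the half-open range [Jan 1 of year, Jan 1 of year + 1)."""
    return date(year, 1, 1), date(year + 1, 1, 1)


# Parameterized query; the driver substitutes the two date values.
QUERY = (
    "SELECT COUNT(*) FROM invoice "
    "WHERE invoice_date >= %s AND invoice_date < %s"
)

start, end = year_bounds(2013)
print(QUERY)
print(start, end)
```

With a database cursor, the query and bounds would be passed together, for example `cursor.execute(QUERY, (start, end))`.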

### Question 6
"Who is the top-selling artist by revenue?"

query:
```sql
SELECT artist.name,
       SUM(invoice_line.unit_price * invoice_line.quantity) AS total_revenue
FROM invoice_line
JOIN track ON invoice_line.track_id = track.track_id
JOIN album ON track.album_id = album.album_id
JOIN artist ON album.artist_id = artist.artist_id
GROUP BY artist.artist_id
ORDER BY total_revenue DESC
LIMIT 5
```

result:
[('Iron Maiden', Decimal('138.60')), ('U2', Decimal('105.93')), ('Metallica', Decimal('90.09')), ('Led Zeppelin', Decimal('86.13')), ('Lost', Decimal('81.59'))]

response:
The top-selling artist by revenue is Iron Maiden, with a total revenue of $138.60.

### Question 7
"Who are our top 10 customers by total spending?"

```sql
SELECT 
    c.customer_id,
    SUM(i.unit_price * i.quantity) AS total_spending
FROM invoice i
JOIN invoice_line il ON i.invoice_id = il.invoice_id
JOIN track t ON il.track_id = t.track_id
JOIN album a ON t.album_id = a.album_id
JOIN artist ar ON a.artist_id = ar.artist_id
JOIN customer c ON ar.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY SUM(i.unit_price * i.quantity) DESC
LIMIT 10;
```

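Errors:
1. `invoice` does not have `unit_price` or `quantity` columns; they belong to `invoice_line`.
2. `artist` has no `customer_id` column; `customer` should be joined through `invoice.customer_id`.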
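
Mistakes like these can be caught before a query reaches the database by checking its column references against the known schema. The following is a rough sketch under stated assumptions: `SCHEMA` is a hand-written dictionary covering only three tables from the system prompt in this notebook, `unknown_columns` is a hypothetical helper, and the `(table, column)` pairs are listed by hand from the query above rather than parsed out of the SQL.

```python
# Partial schema, copied by hand from the system prompt in this notebook.
SCHEMA = {
    "invoice": {
        "invoice_id", "customer_id", "invoice_date", "billing_address",
        "billing_city", "billing_state", "billing_country",
        "billing_postal_code", "total",
    },
    "invoice_line": {"invoice_line_id", "invoice_id", "track_id", "unit_price", "quantity"},
    "artist": {"artist_id", "name"},
}


def unknown_columns(references, schema=SCHEMA):
    """Return the (table, column) pairs that do not exist in the schema."""
    return [(t, c) for t, c in references if c not in schema.get(t, set())]


# Column references used by the generated query, listed manually.
refs = [
    ("invoice", "unit_price"),
    ("invoice", "quantity"),
    ("invoice_line", "invoice_id"),
    ("artist", "customer_id"),
]

invalid = unknown_columns(refs)
print(invalid)
```

A real validator would need an SQL parser to resolve aliases such as `i` and `ar`; this sketch only shows the lookup step.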

In [2]:
from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("postgresql://postgres:chinook@localhost:55000/chinook")

In [3]:
from langchain_ollama import ChatOllama

# llm = ChatOllama(model="qwen3:1.7b", reasoning=False)
# llm = ChatOllama(model="qwen3:4b", reasoning=False)
llm = ChatOllama(model="llama3.2:3b", reasoning=False, temperature=0.1)

The SQLDatabaseToolkit includes tools that can:

- Create and execute queries
- Check query syntax
- Retrieve table descriptions
- ... and more

In [4]:
from langchain_community.agent_toolkits import SQLDatabaseToolkit

toolkit = SQLDatabaseToolkit(db=db, llm=llm)

tools = toolkit.get_tools()

### System Prompt

In [None]:
system_message = """
**Role & Goal:**
You are a precise and helpful SQL query agent for the Chinook music store database. Your primary function is to translate user requests into accurate, efficient, and syntactically correct PostgreSQL queries. After executing a query, you must present the raw data results to the user and provide a concise, one-sentence description of what the result set represents.

**Database Schema Conventions:**
*   **Table Names:** All tables are named in the singular form (e.g., `employee`, `album`).
*   **Primary Keys:** Primary key columns are named `<table>_id` in snake_case (e.g., `artist_id`, `invoice_id`).
*   **Foreign Keys:** Foreign key columns follow the same pattern, referencing the primary key of the related table (e.g., `support_rep_id` in the `customer` table references `employee_id` in the `employee` table).

**Critical Database Schema:**
Familiarize yourself with the following tables, their columns, and their relationships. Use this to construct accurate `JOIN` conditions.

**Table: artist**
*   `artist_id` (Primary Key)
*   `name`
*   *Description: Musical artists and bands.*

**Table: album**
*   `album_id` (Primary Key)
*   `title`
*   `artist_id` (Foreign Key to `artist.artist_id`)
*   *Description: Albums associated with a specific artist.*

**Table: media_type**
*   `media_type_id` (Primary Key)
*   `name`
*   *Description: Format of the media (e.g., MPEG, Protected AAC).*

**Table: genre**
*   `genre_id` (Primary Key)
*   `name`
*   *Description: Musical genre of a track.*

**Table: track**
*   `track_id` (Primary Key)
*   `name`
*   `album_id` (Foreign Key to `album.album_id`)
*   `media_type_id` (Foreign Key to `media_type.media_type_id`)
*   `genre_id` (Foreign Key to `genre.genre_id`)
*   `composer`
*   `milliseconds` (Duration)
*   `bytes` (Size)
*   `unit_price`
*   *Description: An individual song or musical piece.*

**Table: playlist**
*   `playlist_id` (Primary Key)
*   `name`
*   *Description: A curated collection of tracks.*

**Table: playlist_track**
*   `playlist_id` (Foreign Key to `playlist.playlist_id`, part of Primary Key)
*   `track_id` (Foreign Key to `track.track_id`, part of Primary Key)
*   *Description: Junction table linking playlists to their constituent tracks.*

**Table: employee**
*   `employee_id` (Primary Key)
*   `last_name`
*   `first_name`
*   `title`
*   `reports_to` (Foreign Key to `employee.employee_id`, self-referencing)
*   `birth_date`
*   `hire_date`
*   `address`
*   `city`
*   `state`
*   `country`
*   `postal_code`
*   `phone`
*   `fax`
*   `email`
*   *Description: Company employees, including sales support agents.*

**Table: customer**
*   `customer_id` (Primary Key)
*   `first_name`
*   `last_name`
*   `company`
*   `address`
*   `city`
*   `state`
*   `country`
*   `postal_code`
*   `phone`
*   `fax`
*   `email`
*   `support_rep_id` (Foreign Key to `employee.employee_id`)
*   *Description: Customers who purchase music.*

**Table: invoice**
*   `invoice_id` (Primary Key)
*   `customer_id` (Foreign Key to `customer.customer_id`)
*   `invoice_date`
*   `billing_address`
*   `billing_city`
*   `billing_state`
*   `billing_country`
*   `billing_postal_code`
*   `total`
*   *Description: A sales transaction, containing one or more line items.*

**Table: invoice_line**
*   `invoice_line_id` (Primary Key)
*   `invoice_id` (Foreign Key to `invoice.invoice_id`)
*   `track_id` (Foreign Key to `track.track_id`)
*   `unit_price`
*   `quantity`
*   *Description: Line items detailing the tracks purchased within an invoice.*

**Output Protocol:**
For every user request that requires a database query:
1.  **Think:** Confirm you understand the request and plan the necessary tables and `JOIN`s.
2.  **Query:** Formulate the precise PostgreSQL `SELECT` statement.
3.  **Execute:** (This step is handled by the system toolkit).
4.  **Present:**
    *   **First,** show the user the exact SQL query you generated.
    *   **Second,** display the resulting data in a clean, tabular format.
    *   **Finally,** provide a **short description** of the result, such as: "Result: A list of all albums by the artist 'Red Hot Chili Peppers'." or "Result: The top 5 customers by total revenue."

**Example Interaction:**
*   **User:** "Who are the top 3 selling artists by number of tracks sold?"
*   **You:**
    ```sql
    SELECT a.name AS artist_name, SUM(il.quantity) AS tracks_sold
    FROM artist a
    JOIN album al ON a.artist_id = al.artist_id
    JOIN track t ON al.album_id = t.album_id
    JOIN invoice_line il ON t.track_id = il.track_id
    GROUP BY a.artist_id, a.name
    ORDER BY tracks_sold DESC
    LIMIT 3;
    ```
    *[Query executed, result set returned]*
    ```
    | artist_name       | tracks_sold |
    |-------------------|-------------|
    | Iron Maiden       | 140         |
    | U2                | 107         |
    | Metallica         | 91          |
    ```
    **Result:** The top 3 selling artists ranked by the total quantity of tracks sold.
"""

### Initializing agent

In [6]:
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent

agent_executor = create_react_agent(llm, tools, prompt=system_message)

In [7]:
for step in agent_executor.stream(
    {"messages": [{"role": "user", "content": question}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Who are our top 10 customers by total spending?
Tool Calls:
  sql_db_query (1a970594-12b9-49ad-b0e6-31294092f9c5)
 Call ID: 1a970594-12b9-49ad-b0e6-31294092f9c5
  Args:
    query: SELECT customer_id, SUM(total) AS total_spending FROM invoice GROUP BY customer_id ORDER BY SUM(total) DESC LIMIT 10
Name: sql_db_query

[(6, Decimal('49.62')), (26, Decimal('47.62')), (57, Decimal('46.62')), (46, Decimal('45.62')), (45, Decimal('45.62')), (28, Decimal('43.62')), (24, Decimal('43.62')), (37, Decimal('43.62')), (25, Decimal('42.62')), (7, Decimal('42.62'))]

**Result:** A list of the top 10 customers by total spending, ranked in descending order. The customer ID and corresponding total spending amount are displayed.
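
The tool output above is a list of raw tuples, while the system prompt asks for a tabular presentation. If a table is needed regardless of how the model formats its answer, the rows can be formatted in plain Python. A small sketch: `format_table` is a hypothetical helper, and the rows are the first three copied from the output above.

```python
from decimal import Decimal


def format_table(headers, rows):
    """Render rows as a pipe-delimited text table with padded columns."""
    cells = [[str(value) for value in row] for row in rows]
    # Each column is as wide as its longest entry, header included.
    widths = [
        max([len(header)] + [len(row[i]) for row in cells])
        for i, header in enumerate(headers)
    ]

    def line(values):
        padded = (value.ljust(width) for value, width in zip(values, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "|" + "|".join("-" * (width + 2) for width in widths) + "|"
    return "\n".join([line(headers), separator] + [line(row) for row in cells])


rows = [(6, Decimal("49.62")), (26, Decimal("47.62")), (57, Decimal("46.62"))]
table = format_table(["customer_id", "total_spending"], rows)
print(table)
```

Formatting in code keeps the presentation deterministic, which can matter with small models that do not always follow the output protocol.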
