# Code 3 - SQL Agents with few Shot Prompting

## Step 1 - Connect to Database

### Import Packages

In [1]:
from sqlite3 import connect
from langchain_community.utilities.sql_database import SQLDatabase

### Create database connection for LLM (create db)

In [2]:
db = SQLDatabase.from_uri("sqlite:///chinook.db", sample_rows_in_table_info = 3)
print(len(db.get_usable_table_names()), db.get_usable_table_names())

11 ['Album', 'Artist', 'Customer', 'Employee', 'Genre', 'Invoice', 'InvoiceLine', 'MediaType', 'Playlist', 'PlaylistTrack', 'Track']


### Check Connection

In [3]:
con = connect("chinook.db")
cur = con.execute("Select Count(Distinct(AlbumId)) from Album;")
print(cur.fetchall())

[(347,)]


## Step 2 - Setup Langchain SQL Chain

### Import API Key

In [4]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [6]:
API_KEY = os.getenv("GROQ_API_KEY")

### Import Packages

In [7]:
from langchain.callbacks import get_openai_callback
from langchain_groq import ChatGroq

### Create an Instance of LLM with configuration & Check (creating llm)

In [8]:
llm = ChatGroq(model="mixtral-8x7b-32768", temperature=0)
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x000001CE4D5833D0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001CE4D578DD0>, temperature=1e-08, groq_api_key=SecretStr('**********'))

## Step 3 Setup Agent with Few Shot Examples

Advantages of SQL Agents:

1. Saves tokens - Retrieving only required schema & relevant tables.
2. Recover from errors by generating queries, catch the traceback and regenerate correctly
3. Chains run on a predefined pattern; Agents is defined by llm; The LLM then chooses the right tools to execute

### Import Packages

In [9]:
from langchain_community.agent_toolkits import create_sql_agent
from langchain.agents import AgentType

### Create few Shot Examples

In [10]:
examples = [
    {   "input": "List all artists.", 
        "query": "SELECT * FROM Artist;"},
    {
        "input": "Find all albums for the artist 'AC/DC'.",
        "query": "SELECT * FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'AC/DC');",
    },
    {
        "input": "List all tracks in the 'Rock' genre.",
        "query": "SELECT * FROM Track WHERE GenreId = (SELECT GenreId FROM Genre WHERE Name = 'Rock');",
    },
    {
        "input": "Find the total duration of all tracks.",
        "query": "SELECT SUM(Milliseconds) FROM Track;",
    },
    {
        "input": "List all customers from Canada.",
        "query": "SELECT * FROM Customer WHERE Country = 'Canada';",
    },
    {
        "input": "How many tracks are there in the album with ID 5?",
        "query": "SELECT COUNT(*) FROM Track WHERE AlbumId = 5;",
    },
    {
        "input": "Find the total number of Albums.",
        "query": "SELECT COUNT(DISTINT(AlbumId)) FROM Invoice;",
    },
    {
        "input": "List all tracks that are longer than 5 minutes.",
        "query": "SELECT * FROM Track WHERE Milliseconds > 300000;",
    },
    {
        "input": "Who are the top 5 customers by total purchase?",
        "query": "SELECT CustomerId, SUM(Total) AS TotalPurchase FROM Invoice GROUP BY CustomerId ORDER BY TotalPurchase DESC LIMIT 5;",
    },
    {
        "input": "How many employees are there",
        "query": 'SELECT COUNT(*) FROM "Employee"',
    },
]
print(len(examples))

10


### Import Template Packages

In [11]:
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

### Create System prefix Prompt

In [12]:
system_prefix = """You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the given tools. Only use the information returned by the tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

If the question does not seem related to the database, just return "I don't know" as the answer.

Here are some examples of user inputs and their corresponding SQL queries:"""

few_shot_prompt = FewShotPromptTemplate(
    examples = examples,
    example_prompt=PromptTemplate.from_template(
        "User input: {input}\nSQL query: {query}"
    ),
    input_variables=["input", "dialect", "top_k"],
    prefix=system_prefix,
    suffix=""
)

### Integrate few Shot Prompts with User Inputs

In [13]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, SystemMessagePromptTemplate

In [14]:
full_prompt = ChatPromptTemplate.from_messages(
    [
        SystemMessagePromptTemplate(prompt=few_shot_prompt),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)

#### Check Prompt

In [15]:
print(full_prompt.messages[0].to_json()['kwargs'])

{'prompt': FewShotPromptTemplate(input_variables=['dialect', 'top_k'], examples=[{'input': 'List all artists.', 'query': 'SELECT * FROM Artist;'}, {'input': "Find all albums for the artist 'AC/DC'.", 'query': "SELECT * FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'AC/DC');"}, {'input': "List all tracks in the 'Rock' genre.", 'query': "SELECT * FROM Track WHERE GenreId = (SELECT GenreId FROM Genre WHERE Name = 'Rock');"}, {'input': 'Find the total duration of all tracks.', 'query': 'SELECT SUM(Milliseconds) FROM Track;'}, {'input': 'List all customers from Canada.', 'query': "SELECT * FROM Customer WHERE Country = 'Canada';"}, {'input': 'How many tracks are there in the album with ID 5?', 'query': 'SELECT COUNT(*) FROM Track WHERE AlbumId = 5;'}, {'input': 'Find the total number of Albums.', 'query': 'SELECT COUNT(DISTINT(AlbumId)) FROM Invoice;'}, {'input': 'List all tracks that are longer than 5 minutes.', 'query': 'SELECT * FROM Track WHERE Milliseconds > 300

### Validate Prompt

In [16]:
prompt_val = full_prompt.invoke(
    {
        "input": "How many arists are there",
        "top_k": 5,
        "dialect": "SQLite",
        "agent_scratchpad": [],
    }
)
print(prompt_val.to_string())

System: You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct SQLite query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 5 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the given tools. Only use the information returned by the tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

If the question does not seem related to the database, just return "I don't 

In [17]:
agent_with_few_shots = create_sql_agent(
    llm=llm,
    db=db,
    prompt=full_prompt,
    verbose=True,
    agent_type="zero-shot-react-description",
    stream_runnable = False
)

ValueError: Prompt missing required variables: {'tools', 'tool_names'}

### Execute LLM Agent

In [43]:
with get_openai_callback() as cb:
    out = agent_with_few_shots.invoke({"input": "How many artists are there?"})
print(out)
print(cb)



[1m> Entering new SQL Agent Executor chain...[0m
[32;1m[1;3m
Invoking: `sql_db_query_checker` with `{'query': 'SELECT COUNT(*) FROM Artist'}`


[0m[36;1m[1;3mSELECT COUNT(\*) FROM Artist

The given SQL query does not contain any of the common mistakes mentioned. Therefore, the final SQL query is:

SELECT COUNT(\*) FROM Artist[0m[32;1m[1;3mThe final SQL query is:

SELECT COUNT(\*) FROM Artist

The query returns the number of rows in the Artist table.[0m

[1m> Finished chain.[0m
{'input': 'How many artists are there?', 'output': 'The final SQL query is:\n\nSELECT COUNT(\\*) FROM Artist\n\nThe query returns the number of rows in the Artist table.'}
Tokens Used: 5388
	Prompt Tokens: 5216
	Completion Tokens: 172
Successful Requests: 3
Total Cost (USD): $0.0


### Check Execution Flow

In [None]:
print(agent_with_few_shots)

# END - Next Topic Dynamic Few Shot Examples