## <b><font color='darkblue'>Answering Questions With SQL</font></b>
([source](https://courses.dataschool.io/view/courses/build-ai-agents-with-python/3021213-chapter-6-answering-questions-with-sql/10019656-downloading-a-sample-database))

![sql](images/1.jpg)

In [1]:
from langchain_google_genai import ChatGoogleGenerativeAI

### <b><font color='darkgreen'>Downloading a sample database</font></b>
([source](https://courses.dataschool.io/view/courses/build-ai-agents-with-python/3021213-chapter-6-answering-questions-with-sql/10019656-downloading-a-sample-database))
- [**SQL Q&A tutorial** (LangChain documentation)](https://python.langchain.com/docs/tutorials/sql_qa/)
- [**Chinook database** (GitHub)](https://github.com/lerocha/chinook-database)
- [**Starter code for this chapter** (GitHub)](https://gist.github.com/justmarkham/7a67fc434e4dc8b9b4f9857609591086)

![database](https://private-user-images.githubusercontent.com/135025/299867754-cea7a05a-5c36-40cd-84c7-488307a123f4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDg2OTMyMTAsIm5iZiI6MTc0ODY5MjkxMCwicGF0aCI6Ii8xMzUwMjUvMjk5ODY3NzU0LWNlYTdhMDVhLTVjMzYtNDBjZC04NGM3LTQ4ODMwN2ExMjNmNC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwNTMxJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDUzMVQxMjAxNTBaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT05OTY4Y2UzZDMzMGQ3OTFkNjIxMzliMzNiNTIxNTg5MWU0OTY2MDY2N2Q2NjFlNDU4YTIxZGE5OTgxMDg1N2NiJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.AsvjEG5HNOkRQoUpbBV5RV0mo9GdiAAJRLZBE1FkSX0)

In [2]:
from langchain_community.utilities.sql_database import SQLDatabase

db = SQLDatabase.from_uri('sqlite:///Chinook_Sqlite.sqlite')

In [3]:
db.get_usable_table_names()

['Album',
 'Artist',
 'Customer',
 'Employee',
 'Genre',
 'Invoice',
 'InvoiceLine',
 'MediaType',
 'Playlist',
 'PlaylistTrack',
 'Track']

In [4]:
db.run('SELECT * FROM Artist LIMIT 10')

"[(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith'), (4, 'Alanis Morissette'), (5, 'Alice In Chains'), (6, 'Antônio Carlos Jobim'), (7, 'Apocalyptica'), (8, 'Audioslave'), (9, 'BackBeat'), (10, 'Billy Cobham')]"

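For a sense of what `db.run` returns, here is a minimal sketch that uses only Python's built-in `sqlite3` module. It assumes a small in-memory table that mimics the first rows of `Artist` rather than the real Chinook file, and `run_query` is a hypothetical helper, not part of LangChain. Like the output above, it returns the rows as one string, which is a convenient form to pass to a language model.

```python
import sqlite3

# Hypothetical stand-in for the Chinook Artist table (in-memory, three rows only)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)')
conn.executemany('INSERT INTO Artist VALUES (?, ?)',
                 [(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith')])

def run_query(connection, sql):
    """Run a query and return all rows as a single string (hypothetical helper)."""
    rows = connection.execute(sql).fetchall()
    return str(rows)

result = run_query(conn, 'SELECT * FROM Artist ORDER BY ArtistId LIMIT 2')
print(result)
```
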
### <b><font color='darkgreen'>Getting the SQL toolkit</font></b>
([source](https://courses.dataschool.io/view/courses/build-ai-agents-with-python/3021213-chapter-6-answering-questions-with-sql/10020544-getting-the-sql-toolkit))
- [**SQLDatabase Toolkit** (LangChain documentation)](https://python.langchain.com/docs/integrations/tools/sql_database/)

In [5]:
from langchain_community.agent_toolkits import SQLDatabaseToolkit

In [6]:
sql_model = ChatGoogleGenerativeAI(
    model='gemini-2.0-flash',
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2)

In [7]:
toolkit = SQLDatabaseToolkit(db=db, llm=sql_model)
tool_list = toolkit.get_tools()

In [8]:
for tool in tool_list:
    print(f'{tool.name}:\n{tool.description}\n\n')

sql_db_query:
Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.


sql_db_schema:
Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3


sql_db_list_tables:
Input is an empty string, output is a comma-separated list of tables in the database.


sql_db_query_checker:
Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!




In [23]:
tool_list[0]

QuerySQLDatabaseTool(description="Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.", db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x7fa28cf11bb0>)
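
The tool descriptions above suggest what three of the four tools do. The following is a rough sketch of that behavior using plain `sqlite3` functions, not the toolkit's actual implementation. The function names and the tiny `Genre` table are assumptions for illustration, and the LLM-based `sql_db_query_checker` is left out. Note how `query` returns an error message instead of raising, so that a model could read the error and retry.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real database
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)')
conn.executemany('INSERT INTO Genre VALUES (?, ?)', [(1, 'Rock'), (2, 'Jazz')])

def list_tables(connection):
    """Return a comma-separated list of table names."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return ', '.join(name for (name,) in rows)

def table_schema(connection, table_name):
    """Return the CREATE TABLE statement for one table, or an error message."""
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row[0] if row else f'Error: table {table_name} not found'

def query(connection, sql):
    """Return the rows as a string, or an error message if the query fails."""
    try:
        return str(connection.execute(sql).fetchall())
    except sqlite3.Error as exc:
        return f'Error: {exc}'

print(list_tables(conn))
print(table_schema(conn, 'Genre'))
print(query(conn, 'SELECT Title FROM Genre'))  # Genre has no Title column
```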

### <b><font color='darkgreen'>Adding the toolkit to the workflow</font></b>
([source](https://courses.dataschool.io/view/courses/build-ai-agents-with-python/3021213-chapter-6-answering-questions-with-sql/10023180-adding-the-toolkit-to-the-workflow))
- [**SQL agent system prompt** (LangChain Hub)](https://smith.langchain.com/hub/langchain-ai/sql-agent-system-prompt)

In [35]:
from dotenv import load_dotenv
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.checkpoint.memory import MemorySaver
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain_community.utilities.sql_database import SQLDatabase

_ = load_dotenv()
db = SQLDatabase.from_uri('sqlite:///Chinook_Sqlite.sqlite')
sql_model = ChatGoogleGenerativeAI(
    model='gemini-2.0-flash',
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2)
toolkit = SQLDatabaseToolkit(db=db, llm=sql_model)
tool_list = toolkit.get_tools()

In [36]:
model_with_tools = ChatGoogleGenerativeAI(
    model='gemini-2.5-pro-preview-03-25',
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2).bind_tools(tool_list)

In [37]:
system_sql_prompt_message_str = '''
You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct SQLite query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 5 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.
DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.
To start you should ALWAYS look at the tables in the database to see what you can query.
Do NOT skip this step.
Then you should compose the SQL query and call the tool `sql_db_query` to get the information and answer the questions.
Do NOT just return the SQL string only to the user.'''
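
The prompt only asks the model not to modify the data. As an extra safeguard, one option is to make the SQLite connection itself reject writes. This is an assumption on my part and not part of the course code. The sketch below uses the built-in `sqlite3` module with SQLite's `PRAGMA query_only` on a hypothetical in-memory table; it does not show how to apply the same setting through `SQLDatabase`.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real database
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)')
conn.execute("INSERT INTO Artist VALUES (1, 'AC/DC')")
conn.commit()

# From here on, this connection refuses statements that would change the database
conn.execute('PRAGMA query_only = ON')

# Reads still work
rows = conn.execute('SELECT Name FROM Artist').fetchall()

# Writes raise an error instead of changing data
try:
    conn.execute('DELETE FROM Artist')
    write_blocked = False
except sqlite3.Error:
    write_blocked = True

remaining = conn.execute('SELECT COUNT(*) FROM Artist').fetchone()[0]
print(rows, write_blocked, remaining)
```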

In [38]:
prompt = ChatPromptTemplate([
    ('system', system_sql_prompt_message_str),
    ('human', '{messages}'),
])

In [26]:
prompt.invoke([{'messages': 'test'}])

ChatPromptValue(messages=[SystemMessage(content='\nYou are an agent designed to interact with a SQL database.\nGiven an input question, create a syntactically correct SQLite query to run, then look at the results of the query and return the answer.\nUnless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 5 results.\nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for the relevant columns given the question.\nYou have access to tools for interacting with the database.\nOnly use the below tools. Only use the information returned by the below tools to construct your final answer.\nYou MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.\nDO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.\nTo start you should 

In [39]:
chain = prompt | model_with_tools

In [40]:
call_tool = ToolNode(tool_list)

In [41]:
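def call_model(state: MessagesState, chain=chain):
    # Run the prompt and model on the conversation so far; the result is a single AI message
    response = chain.invoke(state["messages"])
    # MessagesState appends the returned message to the existing history
    return {"messages": response}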

In [42]:
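workflow = StateGraph(MessagesState)
workflow.add_node("model_node", call_model)
workflow.add_node("tools", call_tool)
workflow.add_edge(START, "model_node")
# Route to "tools" when the model requested a tool call, otherwise end the run
workflow.add_conditional_edges("model_node", tools_condition)
# After the tools run, hand their output back to the model
workflow.add_edge("tools", "model_node")

# Keep the conversation history for each thread_id in memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)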
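
The control flow of this graph can be pictured as a simple loop: call the model, run any tools it asked for, and repeat until the model answers without requesting a tool. The sketch below imitates that loop in plain Python. `fake_model`, `fake_tools`, and the dictionary-based messages are hypothetical stand-ins, not LangGraph objects, and the real graph also handles checkpointing and message types.

```python
def fake_model(messages):
    """Hypothetical stand-in for the LLM: request one tool call, then answer."""
    if not any(m['role'] == 'tool' for m in messages):
        return {'role': 'ai', 'content': '', 'tool_calls': ['sql_db_list_tables']}
    return {'role': 'ai',
            'content': 'Found tables: ' + messages[-1]['content'],
            'tool_calls': []}

def fake_tools(tool_calls):
    """Hypothetical stand-in for the tool node: one result per requested call."""
    return [{'role': 'tool', 'content': 'Album, Artist'} for _ in tool_calls]

def run_graph(question):
    messages = [{'role': 'human', 'content': question}]
    while True:
        reply = fake_model(messages)                      # plays the role of "model_node"
        messages.append(reply)
        if not reply['tool_calls']:                       # like tools_condition: no calls, so stop
            return messages
        messages.extend(fake_tools(reply['tool_calls']))  # plays the role of "tools", then loop

history = run_graph('Which tables exist?')
print([m['role'] for m in history])
print(history[-1]['content'])
```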

In [43]:
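def examine_chatbot(chat_id: int = 123, app=app):
    """Chat with the agent in a loop until the user types 'exit' or 'quit'."""
    # The thread_id tells the checkpointer which conversation history to use
    config = {"configurable": {"thread_id": chat_id}}

    while True:
        user_input = input("User:")

        if user_input in ["exit", "quit"]:
            print("AI: See you later!")
            break

        else:
            print("AI: ", end="")
            # Stream the full state after each step and print the newest message
            for state in app.stream({"messages": user_input}, config, stream_mode="values"):
                message = state['messages'][-1]
                message.pretty_print()
            print("\n")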

In [44]:
# Which country's customers spent the most?
examine_chatbot()

User: Which country's customers spent the most?



Which country's customers spent the most?
Tool Calls:
  sql_db_list_tables (d10d0076-09a7-45aa-bc2f-d64548444d20)
 Call ID: d10d0076-09a7-45aa-bc2f-d64548444d20
  Args:
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Tool Calls:
  sql_db_schema (4833a69b-a7d5-4dcd-b557-a3dcd123f2df)
 Call ID: 4833a69b-a7d5-4dcd-b557-a3dcd123f2df
  Args:
    table_names: Customer, Invoice
Name: sql_db_schema


CREATE TABLE "Customer" (
	"CustomerId" INTEGER NOT NULL, 
	"FirstName" NVARCHAR(40) NOT NULL, 
	"LastName" NVARCHAR(20) NOT NULL, 
	"Company" NVARCHAR(80), 
	"Address" NVARCHAR(70), 
	"City" NVARCHAR(40), 
	"State" NVARCHAR(40), 
	"Country" NVARCHAR(40), 
	"PostalCode" NVARCHAR(10), 
	"Phone" NVARCHAR(24), 
	"Fax" NVARCHAR(24), 
	"Email" NVARCHAR(60) NOT NULL, 
	"SupportRepId" INTEGER, 
	PRIMARY KEY ("CustomerId"), 
	FOREIGN KEY("SupportRepId") REFERENCES "Employee" ("EmployeeId")
)

/*
3 rows from Customer table

User: exit


AI: See you later!


In [33]:
db.run('''SELECT
  c.Country,
  SUM(i.Total) AS TotalSpent
FROM Customer AS c
JOIN Invoice AS i
  ON c.CustomerId = i.CustomerId
GROUP BY
  c.Country
ORDER BY
  TotalSpent DESC
LIMIT 5;''')

"[('USA', 523.06), ('Canada', 303.96), ('France', 195.1), ('Brazil', 190.1), ('Germany', 156.48)]"

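To see how the join and aggregation in this query work without the full database, here is a small sketch using the built-in `sqlite3` module. The table contents are made-up illustration data, not Chinook values, and only the columns the query needs are created.

```python
import sqlite3

# Hypothetical miniature versions of the Customer and Invoice tables
conn = sqlite3.connect(':memory:')
conn.executescript('''
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, Total REAL);
INSERT INTO Customer VALUES (1, 'USA'), (2, 'USA'), (3, 'Canada');
INSERT INTO Invoice VALUES (1, 1, 10.0), (2, 2, 5.0), (3, 3, 12.0), (4, 3, 1.0);
''')

# Same shape as the query above: join invoices to customers, then sum per country
rows = conn.execute('''
SELECT c.Country, SUM(i.Total) AS TotalSpent
FROM Customer AS c
JOIN Invoice AS i ON c.CustomerId = i.CustomerId
GROUP BY c.Country
ORDER BY TotalSpent DESC
LIMIT 5
''').fetchall()

print(rows)
```

Running a hand-written query like this next to the agent's answer is a simple way to check that the agent's result is plausible.
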
### <b><font color='darkgreen'>Updating the chatbot</font></b>
([source](https://courses.dataschool.io/view/courses/build-ai-agents-with-python/3021213-chapter-6-answering-questions-with-sql/10023733-updating-the-chatbot))
- [**Conclusion code for this chapter** (GitHub)](https://gist.github.com/justmarkham/69732803e77346b761da5b389d6447b5)