### **Test the sqldb**


In [2]:
from langchain_community.utilities import SQLDatabase


**Connecting to the sqldb**

In [3]:
db_path = "../data/sqldb.db"  # forward slashes avoid invalid "\d" / "\s" escape sequences
db = SQLDatabase.from_uri(f"sqlite:///{db_path}")

In [4]:
# Read the SQL file
with open("../data/sql/langchain.sql", "r") as file:
    sql_script = file.read()

# Split the script into individual statements on ";"
# (naive: this breaks if a string literal contains a semicolon)
sql_commands = sql_script.split(';')

# Execute each SQL statement
for command in sql_commands:
    command = command.strip()
    if command:  # skip empty fragments
        db.run(command)
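
Splitting on `;` misfires as soon as a statement carries a semicolon inside a string literal. As a minimal sketch of an alternative, the standard-library `sqlite3` module can parse a multi-statement script itself through `executescript`. The in-memory database and the `notes` table below are hypothetical, chosen only to make the example self-contained; in this notebook the connection would point at the same file as `db_path`.

```python
import sqlite3


def run_sql_script(conn: sqlite3.Connection, script: str) -> None:
    """Execute a multi-statement SQL script; SQLite splits the statements itself."""
    conn.executescript(script)
    conn.commit()


# Hypothetical script: note the semicolon inside the first string literal.
demo_script = """
CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
INSERT INTO notes (body) VALUES ('first; still one statement');
INSERT INTO notes (body) VALUES ('second');
"""

conn = sqlite3.connect(":memory:")
run_sql_script(conn, demo_script)

rows = conn.execute("SELECT body FROM notes ORDER BY id").fetchall()
print(rows)

# For comparison: a naive split on ";" sees four fragments instead of three statements.
naive_fragments = [part for part in demo_script.split(";") if part.strip()]
print(len(naive_fragments))
```

This bypasses the `SQLDatabase` wrapper for the load step only; the wrapper can still be created afterwards from the same file.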

In [5]:
# validate the connection to the sqldb
print(db.dialect)
print(db.get_usable_table_names())
db.run("SELECT * FROM Costs;")

sqlite
['Cost_Alerts', 'Costs', 'Projects', 'Providers', 'Resource_Utilization', 'Resources', 'User', 'User_Project_Access']


"[(1, 1, 1, 1000, '2024-01-15', 'Subscription', 'Monthly', 'Credit Card', 'INV1234', 'Cloud Storage for January'), (2, 2, 2, 500, '2024-02-10', 'Subscription', 'Quarterly', 'Bank Transfer', 'INV5678', 'SaaS License for Q1'), (3, 3, 3, 3000, '2024-03-05', 'Subscription', 'Annually', 'Credit Card', 'INV9101', 'Hardware Servers for 2024'), (4, 4, 4, 500, '2024-04-15', 'Operational', 'Monthly', 'Bank Transfer', 'INV1122', 'Firewall Protection for April'), (5, 5, 5, 1500, '2024-05-01', 'Subscription', 'Monthly', 'Credit Card', 'INV3344', 'Virtual Machines for May'), (6, 6, 6, 800, '2024-06-10', 'Subscription', 'Annually', 'Bank Transfer', 'INV5566', 'Data Backup Service for 2024'), (7, 7, 7, 1200, '2024-07-01', 'Subscription', 'Annually', 'Credit Card', 'INV7788', 'Server Hosting for 2024'), (8, 8, 8, 350, '2024-08-20', 'Operational', 'Monthly', 'Bank Transfer', 'INV9900', 'SaaS Management Tool for August'), (9, 2, 3, 2000, '2024-02-20', 'Subscription', 'Monthly', 'Credit Card', 'INV1236', 

### **Test your GPT model**

In [12]:
from openai import AzureOpenAI

# Set the API key and endpoint values directly
api_version = "2024-08-01-preview"
azure_endpoint = "https://ifd-copilot-internship-program.openai.azure.com"  # endpoint
api_key = ""  # API key
gpt_deployment_name = "gpt-4o-mini"

# Prepare the messages
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "hello"}
]

# Initialize the AzureOpenAI client
client = AzureOpenAI(
    api_version=api_version,
    azure_endpoint=azure_endpoint,
    api_key=api_key,
)

# Send the request and receive the response
response = client.chat.completions.create(
    model=gpt_deployment_name,
    messages=messages
)

# Print the returned result
print(response.choices[0].message.content)

Hello! How can I assist you today?
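
The cell above sets the key and endpoint directly in the notebook. One possible alternative, sketched here under assumptions, is to read them from environment variables so the key never lands in a saved notebook. The helper `load_azure_settings` is hypothetical, and the variable names `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` and `OPENAI_API_VERSION` are an assumed convention; adjust them to whatever your environment actually defines.

```python
import os
from typing import Mapping


def load_azure_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect Azure OpenAI settings from environment variables (assumed names)."""
    required = {
        "api_key": "AZURE_OPENAI_API_KEY",
        "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
    }
    # Fail early with a readable message instead of sending an empty key.
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    settings = {key: env[name] for key, name in required.items()}
    # Fall back to the API version used in this notebook when none is set.
    settings["api_version"] = env.get("OPENAI_API_VERSION", "2024-08-01-preview")
    return settings


# Example with a fake environment so the sketch runs without real credentials.
fake_env = {
    "AZURE_OPENAI_API_KEY": "not-a-real-key",
    "AZURE_OPENAI_ENDPOINT": "https://example.invalid",
}
settings = load_azure_settings(fake_env)
print(sorted(settings))
```

The returned dictionary uses the same keyword names as the `AzureOpenAI(...)` call above, so it could be passed as `AzureOpenAI(**load_azure_settings())`.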


### **1. SQL query chain** (build a SQL query chain with GPT)

In [13]:
from langchain.chat_models import AzureChatOpenAI


# Initialize the GPT model
llm = AzureChatOpenAI(
    openai_api_version=api_version,
    azure_endpoint=azure_endpoint,
    api_key=api_key,  # API key
    azure_deployment=gpt_deployment_name,  # deployment name
    model_name=gpt_deployment_name,  # model name
    temperature=0.0  # randomness of the answers
)

# Confirm that the model object was created
print("GPT model initialized successfully")

GPT model initialized successfully


In [14]:
from langchain.chains import create_sql_query_chain
# Build the SQL query chain (the GPT model combined with the database)
chain = create_sql_query_chain(llm, db)
# Invoke the chain
response = chain.invoke({"question": "How many Providers are there"})
print(response)
# Given a question, the chain generates the SQL statement and returns it (the query, not its result).

SQLQuery: SELECT COUNT("provider_id") AS "total_providers" FROM "Providers";


In [17]:
# Clean the response: drop the "SQLQuery:" prefix and surrounding whitespace.
# str.strip('SQLQuery:\n') strips a set of characters, not a prefix, so it is avoided here.
sql_query = response.removeprefix("SQLQuery:").strip()
print(sql_query)

SELECT COUNT("provider_id") AS "total_providers" FROM "Providers";
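
Cleaning the model's response deserves some care, because `str.strip` treats its argument as a set of characters rather than a prefix and can eat into the query itself. The helper below is a minimal sketch, not part of LangChain: `extract_sql` is a hypothetical name, and the handling of a Markdown code fence is an assumption about how a chat model might wrap its answer.

```python
import re

# Built at runtime so this example does not contain a literal code fence.
_FENCE = "`" * 3
_FENCED_SQL = re.compile(
    _FENCE + r"(?:sql)?\s*(.*?)" + _FENCE, flags=re.DOTALL | re.IGNORECASE
)


def extract_sql(response: str) -> str:
    """Return the bare SQL from a response such as 'SQLQuery: SELECT ...'."""
    text = response.strip()
    # Unwrap a Markdown code fence if the model added one.
    fenced = _FENCED_SQL.search(text)
    if fenced:
        text = fenced.group(1).strip()
    # Remove the literal prefix only; never a character set.
    if text.startswith("SQLQuery:"):
        text = text[len("SQLQuery:"):]
    return text.strip()


raw = "SQLQuery: SELECT name FROM staff ORDER BY salary"

# Character-set stripping also removes the trailing "ry" of "salary".
print(repr(raw.strip("SQLQuery:\n")))

# Prefix-based cleaning keeps the query intact.
print(repr(extract_sql(raw)))
```

The same helper also accepts a response in which the query arrives inside a fenced `sql` block.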


In [18]:
# Run the query against the db to get the actual result
db.run(sql_query)

'[(35,)]'

In [19]:
# Show the prompt the chain sends to the GPT model to build a SQL query from the input question.
# This helps to understand how the model handles the question and produces the corresponding SQL statement.
chain.get_prompts()[0].pretty_print()

You are a SQLite expert. Given an input question, first create a syntactically correct SQLite query to run, then look at the results of the query and return the answer to the input question.
Unless the user specifies in the question a specific number of examples to obtain, query for at most 5 results using the LIMIT clause as per SQLite. You can order the results to return the most informative data in the database.
Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in double quotes (") to denote them as delimited identifiers.
Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Pay attention to use date('now') function to get the current date, if the question involves "today".

Use the following format:

Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result

### **Add QuerySQLDataBaseTool to the chain**
Execute SQL query

**This is the most dangerous part of creating a SQL chain.** Consider carefully if it is OK to run automated queries over your data. Minimize the database connection permissions as much as possible. Consider adding a human approval step to your chains before query execution (see below).

We can use the QuerySQLDatabaseTool to easily add query execution to our chain:

In other words, the chain above only produces the query, and we decide whether to run it on the database with db.run.
Here, the query is generated and then run on the database automatically.
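
The two precautions named above, minimal permissions and a human approval step, can be sketched without LangChain using only the standard-library `sqlite3` module. Everything here is illustrative: `open_read_only` and `run_if_approved` are hypothetical helpers, the temporary database stands in for the real file, and the SELECT-only check is deliberately conservative (it also rejects valid queries that start with `WITH` or contain a semicolon inside a string).

```python
import os
import sqlite3
import tempfile
from pathlib import Path
from typing import Callable, Optional


def open_read_only(path: str) -> sqlite3.Connection:
    """Open a SQLite file so that any write attempt fails at the database level."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def run_if_approved(
    conn: sqlite3.Connection, sql: str, approve: Callable[[str], bool]
) -> Optional[list]:
    """Run `sql` only if it is a single SELECT and `approve(sql)` returns True."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("Only a single SELECT statement is allowed")
    if not approve(statement):
        return None  # the reviewer declined; nothing is executed
    return conn.execute(statement).fetchall()


# Build a small throwaway database to demonstrate the guard.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(demo_path)
writer.execute("CREATE TABLE Providers (provider_id INTEGER PRIMARY KEY)")
writer.executemany("INSERT INTO Providers VALUES (?)", [(1,), (2,), (3,)])
writer.commit()
writer.close()

reader = open_read_only(demo_path)
count = run_if_approved(
    reader,
    'SELECT COUNT("provider_id") FROM "Providers";',
    approve=lambda sql: True,  # stand-in for a real reviewer
)
print(count)
```

In an interactive notebook the `approve` callback could be something like `lambda sql: input(f"Run {sql!r}? [y/N] ") == "y"`, so that no generated query reaches the database without a person seeing it first.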

### **2. Agents**

LangChain's SQL agent provides a more flexible way of interacting with SQL databases. The main advantages of using the SQL Agent are:

- It can answer questions based on the databases’ schema as well as on the databases’ content (like describing a specific table).
- It can recover from errors by running a generated query, catching the traceback and regenerating it correctly.
- It can answer questions that require multiple dependent queries.
- It will save tokens by only considering the schema from relevant tables.

To initialize the agent, we use the create_sql_agent function. This agent contains the SQLDatabaseToolkit, which contains tools to:

- Create and execute queries
- Check query syntax
- Retrieve table descriptions
- …
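
The error-recovery behaviour listed above (run the query, catch the error, regenerate) can be illustrated with a small loop. This is a sketch of the idea only, not the agent's actual implementation: `query_with_retry` is a hypothetical helper, and `fake_generate_sql` stands in for the LLM call so the example runs without a model.

```python
import sqlite3
from typing import Callable, Optional


def query_with_retry(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
):
    """Ask for SQL, run it, and feed any database error back into the next attempt."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate_sql(question, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # passed to the generator on the next round
    raise RuntimeError(
        f"No valid query after {max_attempts} attempts; last error: {error}"
    )


def fake_generate_sql(question: str, error: Optional[str]) -> str:
    """Stand-in for the model: the first try uses a wrong table name, the second fixes it."""
    if error is None:
        return "SELECT COUNT(*) FROM Provider"  # table does not exist
    return "SELECT COUNT(*) FROM Providers"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Providers (provider_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Providers VALUES (?)", [(1,), (2,)])

final_sql, rows = query_with_retry(conn, "How many Providers are there", fake_generate_sql)
print(final_sql, rows)
```

With a real model, `generate_sql` would include the error text in the prompt so the next query can correct the mistake.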

In [20]:
from langchain_community.agent_toolkits import create_sql_agent

# Create a SQL agent
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
# verbose=True: prints detailed output during execution, making it easy to follow the steps the agent takes.

In [21]:
agent_executor.invoke(
    {
        "input": "List the total Cost per Project. Which Project has the most cost?"
    }
)
# The agent handles the question: it translates it into suitable SQL queries, runs them on the database, and returns the result.



> Entering new SQL Agent Executor chain...

Invoking: `sql_db_list_tables` with `{}`

Cost_Alerts, Costs, Projects, Providers, Resource_Utilization, Resources, User, User_Project_Access

Invoking: `sql_db_schema` with `{'table_names': 'Costs'}`

CREATE TABLE "Costs" (
	cost_id INTEGER, 
	fk_project_id INTEGER NOT NULL, 
	fk_resource_id INTEGER NOT NULL, 
	cost_amount DECIMAL(15, 2) NOT NULL, 
	cost_date DATE NOT NULL, 
	cost_type TEXT NOT NULL, 
	billing_cycle TEXT NOT NULL, 
	payment_method TEXT NOT NULL, 
	invoice_number TEXT NOT NULL, 
	notes TEXT, 
	PRIMARY KEY (cost_id), 
	FOREIGN KEY(fk_project_id) REFERENCES "Projects" (project_id), 
	FOREIGN KEY(fk_resource_id) REFERENCES "Resources" (resource_id), 
	UNIQUE (invoice_number), 
	CHECK (cost_amount > 0), 
	CHECK (cost_type IN ('Operational', 'Subscription')), 
	CHECK (billing_cycle IN ('Monthly', 'Quarterly', 'Annually')), 
	CHECK (payment_method IN ('

{'input': 'List the total Cost per Project. Which Project has the most cost?',
 'output': 'Here are the total costs per project, ordered by the highest cost:\n\n1. **Software Development**: $6,300\n2. **Hardware Setup**: $5,250\n3. **Data Security**: $4,100\n4. **SaaS Optimization**: $3,750\n5. **SaaS Integration**: $3,450\n6. **Network Optimization**: $2,950\n7. **Cloud Infrastructure**: $2,800\n8. **CloudMigration**: $1,850\n\nThe project with the most cost is **Software Development** with a total cost of **$6,300**.'}
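
Since the agent's answer is generated text, it can be worth cross-checking the totals with a hand-written aggregate. The sketch below runs on a tiny in-memory database with made-up rows; the `Costs` columns follow the schema printed in the trace, while the `project_name` column of `Projects` is an assumption because that table's schema is not shown in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Projects (
    project_id INTEGER PRIMARY KEY,
    project_name TEXT NOT NULL          -- assumed column name
);
CREATE TABLE Costs (
    cost_id INTEGER PRIMARY KEY,
    fk_project_id INTEGER NOT NULL REFERENCES Projects (project_id),
    cost_amount DECIMAL(15, 2) NOT NULL
);
-- Made-up rows, for illustration only.
INSERT INTO Projects VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO Costs (fk_project_id, cost_amount) VALUES (1, 100), (1, 250), (2, 300);
""")

# Total cost per project, highest first.
TOTAL_COST_PER_PROJECT = """
SELECT p.project_name, SUM(c.cost_amount) AS total_cost
FROM Costs AS c
JOIN Projects AS p ON p.project_id = c.fk_project_id
GROUP BY p.project_id, p.project_name
ORDER BY total_cost DESC
"""

totals = conn.execute(TOTAL_COST_PER_PROJECT).fetchall()
print(totals)
```

Against the real database, the same statement could be passed to `db.run(TOTAL_COST_PER_PROJECT)` once the column names are confirmed, and the first row compared with the project the agent reports as most expensive.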

In [22]:
# Pass a dict with an "input" key; the original {"..."} literal was a set, not a dict.
agent_executor.invoke({"input": "displaying cost alerts for projects where the cost exceeds the alert threshold"})



> Entering new SQL Agent Executor chain...

Invoking: `sql_db_list_tables` with `{}`

Cost_Alerts, Costs, Projects, Providers, Resource_Utilization, Resources, User, User_Project_Access

Invoking: `sql_db_schema` with `{'table_names': 'Cost_Alerts, Costs, Projects'}`


CREATE TABLE "Cost_Alerts" (
	alert_id INTEGER, 
	fk_project_id INTEGER, 
	fk_resource_id INTEGER, 
	alert_type TEXT NOT NULL, 
	threshold_amount DECIMAL(15, 2), 
	current_amount DECIMAL(15, 2), 
	alert_date DATETIME NOT NULL, 
	status TEXT DEFAULT 'Active', 
	notification_sent BOOLEAN DEFAULT FALSE, 
	PRIMARY KEY (alert_id), 
	FOREIGN KEY(fk_project_id) REFERENCES "Projects" (project_id), 
	FOREIGN KEY(fk_resource_id) REFERENCES "Resources" (resource_id), 
	CHECK (alert_type IN ('Budget Exceed', 'Unusual Spending')), 
	CHECK (threshold_amount > 0), 
	CHECK (current_amount >= 0), 
	CHECK (status IN ('Active', 'Resolved', 'Ignored')), 
	CHECK (

{'input': 'displaying cost alerts for projects where the cost exceeds the alert threshold',
 'output': 'The cost alerts for projects where the cost exceeds the alert threshold are as follows:\n\n1. **Project Name:** Hardware Setup\n   - **Alert Type:** Budget Exceed\n   - **Threshold Amount:** 3000\n   - **Current Amount:** 3200\n\n2. **Project Name:** CloudMigration\n   - **Alert Type:** Budget Exceed\n   - **Threshold Amount:** 1200\n   - **Current Amount:** 1500\n\nThese projects have costs that exceed their respective alert thresholds.'}