<a href="https://colab.research.google.com/github/maheshboj/agenticai_basics/blob/Langchain_components/DB_Agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Install Required Packages
---




In [1]:
%pip install langchain  langgraph  langchain-community

Collecting langchain-community
  Downloading langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain-classic<2.0.0,>=1.0.0 (from langchain-community)
  Downloading langchain_classic-1.0.0-py3-none-any.whl.metadata (3.9 kB)
Collecting requests<3.0.0,>=2.32.5 (from langchain-community)
  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting dataclasses-json<0.7.0,>=0.6.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7.0,>=0.6.7->langchain-community)
  Downloading marshmallow-3.26.1-py3-none-any.whl.metadata (7.3 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7.0,>=0.6.7->langchain-community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
Collecting langchain-text-splitters<2.0.0,>=1.0.0 (from langchain-classic<2.0.0,>=1.0.0->langchain-community)
  Downloading langchain_text_splitters-1.0.0

In [6]:
%pip install -U langchain-openai

Collecting langchain-openai
  Downloading langchain_openai-1.1.0-py3-none-any.whl.metadata (2.6 kB)
Downloading langchain_openai-1.1.0-py3-none-any.whl (84 kB)
Installing collected packages: langchain-openai
Successfully installed langchain-openai-1.1.0


In [7]:
from langchain.chat_models import init_chat_model

In [8]:
from google.colab import userdata
import os
os.environ['OPENAI_API_KEY']=userdata.get('OPENAI_KEY2')

In [9]:
llm = init_chat_model(model='gpt-4o-mini')

In [10]:
import pathlib

import requests

url = "https://storage.googleapis.com/benchmarks-artifacts/chinook/Chinook.db"
local_path = pathlib.Path("Chinook.db")

# Download the sample database only once; later runs reuse the local copy.
if local_path.exists():
    print(f"{local_path} already exists, skipping download.")
else:
    # A timeout keeps the cell from hanging if the server does not respond.
    response = requests.get(url, timeout=30)
    if response.status_code == 200:
        local_path.write_bytes(response.content)
        print(f"File downloaded and saved as {local_path}")
    else:
        print(f"Failed to download the file. Status code: {response.status_code}")

File downloaded and saved as Chinook.db


In [11]:
from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///Chinook.db")

print(f"Dialect: {db.dialect}")
print(f"Available tables: {db.get_usable_table_names()}")
print(f'Sample output: {db.run("SELECT * FROM Artist LIMIT 5;")}')

Dialect: sqlite
Available tables: ['Album', 'Artist', 'Customer', 'Employee', 'Genre', 'Invoice', 'InvoiceLine', 'MediaType', 'Playlist', 'PlaylistTrack', 'Track']
Sample output: [(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith'), (4, 'Alanis Morissette'), (5, 'Alice In Chains')]
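To make the output above less of a black box, here is a rough standard-library equivalent of listing tables and sampling rows. It is only a sketch: `SQLDatabase` is built on SQLAlchemy and does not necessarily work this way, the two-table schema is a hypothetical stand-in for Chinook.db, and `list_tables` is a helper name made up for this example.

```python
import sqlite3

# Hypothetical two-table schema standing in for Chinook.db.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Album (
        AlbumId INTEGER PRIMARY KEY,
        Title TEXT,
        ArtistId INTEGER REFERENCES Artist (ArtistId)
    );
    INSERT INTO Artist VALUES (1, 'AC/DC'), (2, 'Accept');
    """
)


def list_tables(connection):
    """Return user table names, read from SQLite's built-in catalog table."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


tables = list_tables(conn)
sample = conn.execute(
    "SELECT ArtistId, Name FROM Artist ORDER BY ArtistId LIMIT 5"
).fetchall()
print(f"Available tables: {tables}")
print(f"Sample output: {sample}")
```
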


In [13]:
from langchain_community.agent_toolkits import SQLDatabaseToolkit

toolkit = SQLDatabaseToolkit(db=db, llm=llm)

tools = toolkit.get_tools()

for tool in tools:
    print(f"{tool.name}: {tool.description}\n")

sql_db_query: Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.

sql_db_schema: Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3

sql_db_list_tables: Input is an empty string, output is a comma-separated list of tables in the database.

sql_db_query_checker: Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!



In [14]:
system_prompt = """
You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run,
then look at the results of the query and return the answer. Unless the user
specifies a specific number of examples they wish to obtain, always limit your
query to at most {top_k} results.

You can order the results by a relevant column to return the most interesting
examples in the database. Never query for all the columns from a specific table,
only ask for the relevant columns given the question.

You MUST double check your query before executing it. If you get an error while
executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the
database.

To start you should ALWAYS look at the tables in the database to see what you
can query. Do NOT skip this step.

Then you should query the schema of the most relevant tables.
""".format(
    dialect=db.dialect,
    top_k=5,
)

In [15]:
from langchain.agents import create_agent

agent = create_agent(
    llm,
    tools,
    system_prompt=system_prompt,
)

In [16]:
question = "Which genre on average has the longest tracks?"

for step in agent.stream(
    {"messages": [{"role": "user", "content": question}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Which genre on average has the longest tracks?
Tool Calls:
  sql_db_list_tables (call_TKN555fSwx6w5hoxzJDUJHmC)
 Call ID: call_TKN555fSwx6w5hoxzJDUJHmC
  Args:
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Tool Calls:
  sql_db_schema (call_yio4ksoEtlKzgLLMXz8BgKvF)
 Call ID: call_yio4ksoEtlKzgLLMXz8BgKvF
  Args:
    table_names: Genre
  sql_db_schema (call_Iom389goPfv7O7UTGzjzmTgu)
 Call ID: call_Iom389goPfv7O7UTGzjzmTgu
  Args:
    table_names: Track
Name: sql_db_schema


CREATE TABLE "Track" (
	"TrackId" INTEGER NOT NULL, 
	"Name" NVARCHAR(200) NOT NULL, 
	"AlbumId" INTEGER, 
	"MediaTypeId" INTEGER NOT NULL, 
	"GenreId" INTEGER, 
	"Composer" NVARCHAR(220), 
	"Milliseconds" INTEGER NOT NULL, 
	"Bytes" INTEGER, 
	"UnitPrice" NUMERIC(10, 2) NOT NULL, 
	PRIMARY KEY ("TrackId"), 
	FOREIGN KEY("MediaTypeId") REFERENCES "MediaType" ("MediaTypeId"), 
	FOREIGN KEY("GenreId") REFERENCES "Genre" ("GenreId"), 
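The captured output above stops before the agent's final query is shown. As a sketch of the kind of SQL this question calls for (not the agent's actual query), the example below joins `Track` to `Genre` and averages `Milliseconds` per genre. The rows are invented purely for illustration and are not Chinook data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        GenreId INTEGER REFERENCES Genre (GenreId),
        Milliseconds INTEGER NOT NULL
    );
    -- Made-up rows, not taken from Chinook.
    INSERT INTO Genre VALUES (1, 'Genre A'), (2, 'Genre B');
    INSERT INTO Track VALUES
        (1, 'a1', 1, 200000),
        (2, 'a2', 1, 400000),
        (3, 'b1', 2, 900000);
    """
)

# Average track length per genre, longest first, capped at top_k = 5.
QUERY = """
SELECT g.Name, AVG(t.Milliseconds) AS AvgMilliseconds
FROM Track AS t
JOIN Genre AS g ON g.GenreId = t.GenreId
GROUP BY g.GenreId, g.Name
ORDER BY AvgMilliseconds DESC
LIMIT 5
"""

result = conn.execute(QUERY).fetchall()
print(result)
```
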

In [17]:
question = "How many tables do we have in this db?"

for step in agent.stream(
    {"messages": [{"role": "user", "content": question}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


How many tables do we have in this db?
Tool Calls:
  sql_db_list_tables (call_SkH9CTv93bJtaXJVixqGB6wL)
 Call ID: call_SkH9CTv93bJtaXJVixqGB6wL
  Args:
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track

There are 11 tables in this database: Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, and Track.


In [18]:
question = "can you give me the tables and their relation ships"

In [20]:
for step in agent.stream(
    {"messages": [{"role": "user", "content": question}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


can you give me the tables and their relation ships
Tool Calls:
  sql_db_list_tables (call_0PbhX2dSLXrQ5tT5xAItYMHW)
 Call ID: call_0PbhX2dSLXrQ5tT5xAItYMHW
  Args:
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Tool Calls:
  sql_db_schema (call_F5Oa56xGdKPNaCN89UVr2bPG)
 Call ID: call_F5Oa56xGdKPNaCN89UVr2bPG
  Args:
    table_names: Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
  sql_db_query (call_kHK6OBu29wYYDJnfyIHFgNd2)
 Call ID: call_kHK6OBu29wYYDJnfyIHFgNd2
  Args:
    query: PRAGMA foreign_key_list(Album);
  sql_db_query (call_exuOePsHCLRQMdXDsWQobclS)
 Call ID: call_exuOePsHCLRQMdXDsWQobclS
  Args:
    query: PRAGMA foreign_key_list(Artist);
  sql_db_query (call_X4l2ICC0XvyIzIRCfh0Sgdqc)
 Call ID: call_X4l2ICC0XvyIzIRCfh0Sgdqc
  Args:
    query: PRAGMA foreign_key_list(Customer);
  sql_db_query (call_4aYlRDvpPYtoH4g1eV5wEuYQ)
 Call 
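The agent discovers relationships by running `PRAGMA foreign_key_list` once per table. The same idea can be sketched directly with the standard library. The schema below is a hypothetical two-table stand-in for Chinook.db, and `relationships` is a helper name introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Album (
        AlbumId INTEGER PRIMARY KEY,
        Title TEXT,
        ArtistId INTEGER,
        FOREIGN KEY (ArtistId) REFERENCES Artist (ArtistId)
    );
    """
)


def relationships(connection):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    links = []
    for table in tables:
        # PRAGMA rows are (id, seq, parent_table, from_column, to_column, ...).
        for row in connection.execute(f'PRAGMA foreign_key_list("{table}")'):
            _id, _seq, parent, child_col, parent_col = row[:5]
            links.append((table, child_col, parent, parent_col))
    return links


links = relationships(conn)
for child, child_col, parent, parent_col in links:
    print(f"{child}.{child_col} -> {parent}.{parent_col}")
```
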

In [19]:
agent.invoke({"messages": [{"role": "user", "content": question}]})

{'messages': [HumanMessage(content='can you give me the tables and their relation ships', additional_kwargs={}, response_metadata={}, id='1f681be5-84bf-4af2-9579-7f89ccc3146d'),
  AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 528, 'total_tokens': 540, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_50906f2aac', 'id': 'chatcmpl-ChUbQJ86wuJbC27LNvpfGGOXCtD4V', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--65e7f798-2041-41cc-9e16-5a6425b84c02-0', tool_calls=[{'name': 'sql_db_list_tables', 'args': {}, 'id': 'call_BuMRPqKAr5j0ybZF9ReXMGSn', 'type': 'tool_call'}], usage_metadata={'input_tokens': 528, 'output_token

# Adding Human-in-the-Loop to Oversee the Steps Taken by the Agent

In [21]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver


agent = create_agent(
    llm,
    tools,
    system_prompt=system_prompt,
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={"sql_db_query": True},
            description_prefix="Tool execution pending approval",
        ),
    ],
    checkpointer=InMemorySaver(),
)

The next cell streams the execution of the agent and prints each intermediate message, along with any tool-call interrupt, as it happens.

✅ Key Concepts

Streaming: The agent yields its state after each step instead of returning only once the full run has finished.

Threaded State: The checkpointer saves the run's state under the `thread_id` passed in `config`, so a paused run can be resumed later from the same point.

Interrupts: When the agent wants to call a tool listed in `interrupt_on` (here, `sql_db_query`), LangGraph pauses execution and emits an `__interrupt__` signal.
This allows you to inspect or approve the tool call before execution continues.
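The pause-and-approve flow described above can be illustrated without any framework. The sketch below uses a plain Python generator that pauses before each gated tool call and resumes once a decision is sent back in. This is not LangGraph's API: `agent_run`, `drive`, and the `"approve"` / `"reject"` strings are all made up for this example.

```python
def agent_run(tool_calls, interrupt_on):
    """Walk through tool calls, pausing for approval before gated ones."""
    results = []
    for name, args in tool_calls:
        if name in interrupt_on:
            # Pause here: hand the pending call to the caller and wait
            # for a decision to be sent back in.
            decision = yield {"tool": name, "args": args}
            if decision != "approve":
                results.append((name, "rejected"))
                continue
        results.append((name, "executed"))
    return results


def drive(tool_calls, interrupt_on, decide):
    """Run to completion, asking `decide` about every pending request."""
    run = agent_run(tool_calls, interrupt_on)
    try:
        request = next(run)
        while True:
            request = run.send(decide(request))
    except StopIteration as done:
        # A generator's return value travels on StopIteration.
        return done.value


calls = [("sql_db_list_tables", ""), ("sql_db_query", "SELECT 1")]
approved = drive(calls, {"sql_db_query"}, lambda request: "approve")
rejected = drive(calls, {"sql_db_query"}, lambda request: "reject")
print(approved)
print(rejected)
```

Only `sql_db_query` is gated, so `sql_db_list_tables` runs without asking. In LangGraph itself, a paused run is typically resumed by passing a `Command(resume=...)` object from `langgraph.types` together with the same `thread_id`; the exact payload expected by the middleware should be checked against its documentation.
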

In [22]:
question = "Which genre on average has the longest tracks?"
config = {"configurable": {"thread_id": "1"}}

for step in agent.stream(
    {"messages": [{"role": "user", "content": question}]},
    config,
    stream_mode="values",
):
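    if "messages" in step:
        # Regular step: print the newest message in the conversation state.
        step["messages"][-1].pretty_print()
    elif "__interrupt__" in step:
        # The run paused before a gated tool call; show what awaits approval.
        print("INTERRUPTED:")
        interrupt = step["__interrupt__"][0]
        for request in interrupt.value["action_requests"]:
            print(request["description"])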


Which genre on average has the longest tracks?
Tool Calls:
  sql_db_list_tables (call_dPhXjuSlRYPchbH2bZphxzGV)
 Call ID: call_dPhXjuSlRYPchbH2bZphxzGV
  Args:
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Tool Calls:
  sql_db_schema (call_Km365Rf5yynjVa9vSHa9BjsB)
 Call ID: call_Km365Rf5yynjVa9vSHa9BjsB
  Args:
    table_names: Genre
  sql_db_schema (call_0vAVpKwKIiygsvQyoFZtpg8Q)
 Call ID: call_0vAVpKwKIiygsvQyoFZtpg8Q
  Args:
    table_names: Track
Name: sql_db_schema


CREATE TABLE "Track" (
	"TrackId" INTEGER NOT NULL, 
	"Name" NVARCHAR(200) NOT NULL, 
	"AlbumId" INTEGER, 
	"MediaTypeId" INTEGER NOT NULL, 
	"GenreId" INTEGER, 
	"Composer" NVARCHAR(220), 
	"Milliseconds" INTEGER NOT NULL, 
	"Bytes" INTEGER, 
	"UnitPrice" NUMERIC(10, 2) NOT NULL, 
	PRIMARY KEY ("TrackId"), 
	FOREIGN KEY("MediaTypeId") REFERENCES "MediaType" ("MediaTypeId"), 
	FOREIGN KEY("GenreId") REFERENCES "Genre" ("GenreId"), 