# Perform SQL Queries in Natural Language

In this notebook, we provide an example of using a ReAct agent to perform SQL queries in natural language.
Other examples of using the ReAct agent can be found in [ReAct agent example](../conversation_with_react_agent/main.ipynb).

First, we initialize the AgentScope config and the database to query.

In [None]:
from sql_utils import create_sqlite_db_from_schema
from agentscope.models import load_model_by_config_name

from agentscope.message import Msg
import agentscope
from agentscope.agents import ReActAgent

from agentscope.service import ServiceResponse, ServiceExecStatus, ServiceFactory

from typing import Callable
agentscope.init(
    model_configs="./configs/model_configs.json",
)
example_db_id = "concert_singer"
db_schema_path = "./database/concert_singer/schema.sql"
db_sqlite_path = "./database/concert_singer/concert_singer.sqlite"
create_sqlite_db_from_schema(db_schema_path, db_sqlite_path)
loaded_model = load_model_by_config_name('gpt-4')
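
Before wiring up the agent, it can help to confirm that the schema was actually loaded into the SQLite file. The following is a minimal sketch using only the standard `sqlite3` module. The helper name `list_tables` and the one-table demo schema are illustrative assumptions and are not part of `sql_utils`; in the notebook you would pass `db_sqlite_path` instead of the temporary file.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path: str) -> list:
    """Return the names of all user tables in a SQLite file, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo on a throwaway database with a hypothetical one-table schema.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.sqlite")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE singer (Singer_ID INTEGER PRIMARY KEY, Name TEXT)")
    conn.commit()
    conn.close()
    demo_tables = list_tables(demo_path)

print(demo_tables)
```

An empty list here would suggest the schema file was not applied, which is easier to diagnose now than after the agent starts returning odd answers.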

Then, we define the tools the ReAct agent needs to perform SQL queries.

Namely, our agent should be able to generate a SQL query from a natural language question and execute that query to get the result.

In [2]:
from sql_utils import DailSQLPromptGenerator, SQLPrompt

def generate_sql_query(question: str, db_path: str) -> ServiceResponse:
    """
    Generate the SQL query based on the natural language question from the user.
    Args:
        question (`str`):
            The question asked by user in natural language.
        db_path (`str`):
            The path to the .sqlite file.
    """
    prompt_helper = DailSQLPromptGenerator(db_path)
    prepared_prompt = prompt_helper.generate_prompt({"content": question})
    
    def get_response_from_prompt(prompt: dict, model: Callable) -> str:
        """
        Generate response from prompt using LLM
        """
        messages = [{"role": "assistant", "content": prompt}]
        sql = model(messages).text
        sql = " ".join(sql.replace("\n", " ").split())
        sql = sql.strip().split("/*")[0]
        if sql.startswith("SELECT"):
            response = sql + "\n"
        elif sql.startswith(" "):
            response = "SELECT" + sql + "\n"
        else:
            response = "SELECT " + sql + "\n"
        return response

    sql_response = get_response_from_prompt(
        prepared_prompt["prompt"], model=loaded_model
    )

    return ServiceResponse(
        ServiceExecStatus.SUCCESS,
        sql_response
    )

from agentscope.service import query_sqlite

tools = [
    ServiceFactory.get(generate_sql_query, db_path=db_sqlite_path),
    ServiceFactory.get(query_sqlite, database=db_sqlite_path),
]
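
The post-processing inside `generate_sql_query` is easy to miss because it sits in a nested helper. As a sketch, the same steps can be written as a standalone function (the name `normalize_sql` is hypothetical): collapse whitespace, drop anything after a `/*` comment marker, and prepend `SELECT` when the model's completion does not already start with it. The last step presumably exists because the generated prompt ends with `SELECT` and the model continues from there; that is an assumption about `DailSQLPromptGenerator`, not something shown in this notebook.

```python
def normalize_sql(raw: str) -> str:
    """Normalize raw LLM output into a single-line SQL statement."""
    # Collapse newlines and runs of whitespace into single spaces.
    sql = " ".join(raw.replace("\n", " ").split())
    # Keep only the text before the first block comment, if any.
    # Unlike the helper above, this sketch also trims the trailing space.
    sql = sql.split("/*")[0].strip()
    # Prepend SELECT when the completion omits it (case-sensitive check).
    if not sql.startswith("SELECT"):
        sql = "SELECT " + sql
    return sql + "\n"


completed = normalize_sql("count(*) FROM singer")
cleaned = normalize_sql("SELECT Name\nFROM singer /* done */")
print(repr(completed))
print(repr(cleaned))
```

Because the prefix check is case-sensitive, a completion that begins with lowercase `select` would end up with a doubled keyword, so a stricter version might compare against `sql.upper()` instead.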

Now, we can initialize the ReAct agent with the defined tools and interact with it.

In [3]:
agent = ReActAgent(
    name="assistant",
    model_config_name='gpt-4',
    tools=tools,
    sys_prompt="You are a helpful agent that performs SQL queries based on natural language instructions.",
    verbose=True, # set verbose to True to show the reasoning process
)

from loguru import logger
describe_prompt = SQLPrompt().describe_schema(db_sqlite_path)
sql_description = loaded_model([{"role": "assistant", "content": describe_prompt}]).text
logger.chat(sql_description)


logger.chat("User: How many singers do we have? \n")
mss = Msg(
    name="user", 
    content="How many singers do we have?", 
    role="user"
)
sql_query_mss1 = agent(mss)

logger.chat("User: Show all countries and the number of singers in each country. \n")
mss = Msg(
    name="user", 
    content="Show all countries and the number of singers in each country.", 
    role="user"
)
sql_query_mss2 = agent(mss)

logger.chat("User: How many singers have held concert in multiple stadiums? Give me the name of these singers. \n")
mss = Msg(
    name="user", 
    content="How many singers have held concert in multiple stadiums? Give me the name of these singers.", 
    role="user"
)
sql_query_mss3 = agent(mss)


2024-04-03 17:11:41.076 | INFO     | agentscope.models.model:__init__:257 - Initialize model [gpt-4]
The database is composed of four tables which are: 

1. `stadium`: This holds the data about different stadiums. The columns include `Stadium_ID` (the primary key), `Location`, `Name`, `Capacity`, `Highest`, `Lowest`, and `Average`.

2. `singer`: This contains data related to singers. Its columns are `Singer_ID` (the primary key), `Name`, `Country`, `Song_Name`, `Song_release_year`, `Age`, and `Is_male` (a flag indicating the gender of the singer).

3. `concert`: This table stores the data for concerts. The columns here are `concert_ID` (the primary key), `concert_Name`, `Theme`, `Stadium_ID`, and `Year`. The `Stadium_ID` in this table corresponds to the `Stadium_ID` in the Stadiums table, establishing a link between the two. 

4. `singer_in_concert`: This essentially connects the `singer` and `concert` tables. It has two primary keys,

Now let's see how the agent has responded to our questions:

In [4]:
# "How many singers do we have?"
sql_query_mss1

{'id': 'bcef433886af4cb598af3e8c80eb765d',
 'timestamp': '2024-04-03 17:12:29',
 'name': 'assistant',
 'content': {'thought': 'We have successfully executed the SQL query and got the result indicating there are 6 singers in the database.',
  'speak': 'We have a total of 6 singers.'},
 'url': None,
 'role': 'assistant'}

In [5]:
# "Show all countries and the number of singers in each country."
sql_query_mss2

{'id': 'a2a300f250e14144b9373d01cc7a377c',
 'timestamp': '2024-04-03 17:12:50',
 'name': 'assistant',
 'content': {'thought': 'We have successfully executed the SQL query and got the result indicating how many singers are from each country in our database.',
  'speak': 'We have 4 singers from France, 1 singer from Netherlands and 1 singer from the United States.'},
 'url': None,
 'role': 'assistant'}

In [6]:
# "How many singers have held concert in multiple stadiums? Give me the name of these singers."
sql_query_mss3

{'id': '282f39c2d69e436682bab07c83e05311',
 'timestamp': '2024-04-03 17:13:39',
 'name': 'assistant',
 'content': {'thought': 'We have successfully executed the SQL query and got the results. It appears that there are four singers who have held concerts in multiple stadiums. These singers are John Nizinik, Justin Brown, Timbaland, and Tribal King.',
  'speak': 'We have four singers who have held concerts in multiple stadiums: John Nizinik, Justin Brown, Timbaland, and Tribal King.'},
 'url': None,
 'role': 'assistant'}
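
Because the agent's answers come from an LLM, it is worth checking them against SQL written by hand. The sketch below does this for the first two questions with the standard `sqlite3` module. It runs against an in-memory table filled with made-up rows so that it is self-contained; the column names follow the schema description printed earlier, but the data is hypothetical and does not reproduce the `concert_singer` contents. To check the real answers, connect to `db_sqlite_path` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE singer (Singer_ID INTEGER PRIMARY KEY, Name TEXT, Country TEXT)"
)
# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO singer (Name, Country) VALUES (?, ?)",
    [("A", "France"), ("B", "France"), ("C", "Netherlands")],
)

# "How many singers do we have?"
singer_count = conn.execute("SELECT count(*) FROM singer").fetchone()[0]

# "Show all countries and the number of singers in each country."
per_country = conn.execute(
    "SELECT Country, count(*) FROM singer GROUP BY Country ORDER BY Country"
).fetchall()
conn.close()

print(singer_count)
print(per_country)
```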

You can use the following code to build a conversation with the ReAct agent:

``` python
from agentscope.agents import UserAgent

user = UserAgent(name="User")

x = None
while True:
    x = user(x)
    if x.content == "exit":
        break
    x = agent(x)
```