# Building an evaluation dataset for SQL Agent

This notebook builds an evaluation dataset in which every record contains at least:
- `question`
- `SQL query`
- `natural language answer from standard agent`

The evaluation dataset is built from databases in the **Spider dataset**.

### Necessary imports

In [1]:
import os
import pandas as pd

from getpass import getpass
from langchain import LLMChain
from langchain.callbacks import get_openai_callback
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SQLDatabase

from evaluation_prompts import TARGET_PROMPT
from utils import CustomDatabase

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API Key: ")



### Loading the whole evaluation dataset

In [3]:
evaluation_df = pd.read_json('datasets/spider/train_spider.json')


In [2]:
interesting_databases = ['chinook_1', 'architecture']

Keep only the rows that belong to the databases of interest.

The `evaluation_databases` list will contain one `CustomDatabase` object per database.

In [4]:
evaluation_databases = []
for db_name in interesting_databases:
    evaluation_dataset = evaluation_df[evaluation_df['db_id'] == db_name]
    database = SQLDatabase.from_uri(f'sqlite:///datasets/spider/database/{db_name}/{db_name}.sqlite')
    evaluation_database = CustomDatabase(name=db_name, database=database, evaluation_dataset=evaluation_dataset)
    evaluation_databases.append(evaluation_database)

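Next, for each evaluation database, run every `query` in its evaluation dataset and store the output in a `query_result` column. pandas emits a `SettingWithCopyWarning` here because each `evaluation_dataset` is a filtered slice of `evaluation_df`; calling `.copy()` on the slice when it is created would avoid the warning.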

In [5]:
for db in evaluation_databases:
    db.run_queries()

84it [00:00, 2941.80it/s]
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.evaluation_dataset['query_result'] = results
17it [00:00, 3902.75it/s]
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.evaluation_dataset['query_result'] = results
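
The implementation of `CustomDatabase.run_queries` lives in `utils` and is not shown in this notebook. As a rough illustration of the step, here is a minimal sketch that uses only the standard-library `sqlite3` module. The `run_queries` function below, the toy `architect` table and its rows are assumptions made for this example, not the actual helper or Spider data.

```python
import sqlite3


def run_queries(connection, queries):
    """Run each SQL string and collect its rows, or the error text if it fails."""
    results = []
    for query in queries:
        try:
            results.append(connection.execute(query).fetchall())
        except sqlite3.Error as exc:
            # Keep going so that one bad query does not abort the whole batch.
            results.append(f"ERROR: {exc}")
    return results


# Toy in-memory database standing in for a Spider .sqlite file (hypothetical data).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE architect (id INTEGER, name TEXT)")
connection.executemany(
    "INSERT INTO architect VALUES (?, ?)", [(1, "first"), (2, "second")]
)

query_results = run_queries(
    connection,
    ["SELECT count(*) FROM architect", "SELECT name FROM missing_table"],
)
print(query_results)
```

Storing the error text instead of raising is one possible design choice: it keeps the result list aligned with the list of queries, so it can be assigned back as a column.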


Then we use an LLM chain to turn these query results into natural-language answers:

In [6]:
llm = ChatOpenAI(temperature=0)
target_llm_chain = LLMChain(llm=llm, prompt=TARGET_PROMPT)

In [9]:
# The [1:] slice skips chinook_1, so only architecture is parsed in this run.
for db in evaluation_databases[1:]:
    with get_openai_callback() as cb:
        print(f"Parsing results for {db.name}...")
        db.parse_query_results(target_llm_chain)
        print(f"Used {cb.total_tokens} tokens")

Parsing results for architecture...


Parsing results for architecture: 100%|██████████| 17/17 [00:23<00:00,  1.41s/it]

Used 2595 tokens



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.evaluation_dataset['nl_result'] = results
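
Neither `TARGET_PROMPT` nor `parse_query_results` is shown in this notebook, so the following is only a sketch of what the step might look like: each row is formatted into a prompt and passed to a callable that returns the answer. The template text, `build_prompt`, `parse_results` and the `fake_llm` stub are all hypothetical names introduced for illustration. In the notebook itself the call goes through `target_llm_chain`.

```python
# Hypothetical stand-in for TARGET_PROMPT; the real template is in evaluation_prompts.py.
EXAMPLE_TEMPLATE = (
    "Question: {question}\n"
    "SQL query: {query}\n"
    "Query result: {query_result}\n"
    "Answer the question in one sentence using only the query result."
)


def build_prompt(question, query, query_result):
    """Fill the template with one row of the evaluation dataset."""
    return EXAMPLE_TEMPLATE.format(
        question=question, query=query, query_result=query_result
    )


def parse_results(rows, answer_fn):
    """Call answer_fn (for example an LLM call) on the prompt built from each row."""
    return [answer_fn(build_prompt(**row)) for row in rows]


def fake_llm(prompt):
    """Stub that replaces the real model so the sketch runs offline."""
    return f"stub answer for a prompt of {len(prompt)} characters"


# Toy row (hypothetical data, not taken from Spider).
rows = [
    {
        "question": "How many architects are there?",
        "query": "SELECT count(*) FROM architect",
        "query_result": "[(2,)]",
    }
]
nl_results = parse_results(rows, fake_llm)
print(nl_results)
```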


### Saving evaluation dataset

Finally, keep only the relevant columns and write each dataset to a JSON file:

In [11]:
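# Only the databases parsed above ([1:], i.e. architecture) have an nl_result column.
columns = ['db_id', 'question', 'query', 'query_result', 'nl_result']
for db in evaluation_databases[1:]:
    db.evaluation_dataset[columns].to_json(f"datasets/custom/{db.name}_eval_dataset.json")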
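
With its default settings, `DataFrame.to_json` writes a column-oriented object of the form `{"column": {"row_label": value}}`. The sketch below shows one way to turn such a file back into per-row records using only the standard library. The `columns_to_records` helper and the sample content are assumptions for illustration. In practice, `pd.read_json` on the saved path should give back an equivalent DataFrame.

```python
import json


def columns_to_records(column_oriented):
    """Turn {"column": {"row_label": value}} into a list of per-row dicts."""
    first_column = next(iter(column_oriented.values()), {})
    row_labels = list(first_column)
    return [
        {column: values[label] for column, values in column_oriented.items()}
        for label in row_labels
    ]


# Toy content shaped like the saved file (hypothetical values, two columns only).
saved_text = json.dumps(
    {
        "question": {"0": "How many architects are there?", "1": "List all names."},
        "query": {"0": "SELECT count(*) FROM architect", "1": "SELECT name FROM architect"},
    }
)

records = columns_to_records(json.loads(saved_text))
for record in records:
    print(record["question"], "->", record["query"])
```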