Copyright (c) Microsoft Corporation.

Licensed under the MIT License.

# Text2SQL with AutoGen & Azure OpenAI

This notebook demonstrates how AutoGen agents can be integrated with Azure OpenAI to answer questions about a database, based on the schemas provided.

SQL generation uses a multi-shot approach, which gives more reliable results and reduces token usage. See README.md for details.

### Dependencies

To install dependencies for this demo:

`uv sync --package autogen_text_2_sql`

`uv add --editable text_2_sql_core`

In [None]:
# This is only needed for this notebook to work
import sys
from pathlib import Path

# Add the `src` directory (relative to the current working directory) to the import path
sys.path.append(str(Path.cwd() / "src"))

In [None]:
import dotenv
import logging
from autogen_agentchat.ui import Console
from autogen_text_2_sql.autogen_text_2_sql import AutoGenText2Sql

In [None]:
logging.basicConfig(level=logging.INFO)

In [None]:
dotenv.load_dotenv()

## Bot setup

In [None]:
agentic_text_2_sql = AutoGenText2Sql(
    engine_specific_rules=(
        "Use TOP X to limit the number of rows returned instead of LIMIT X. "
        "NEVER USE LIMIT X as it produces a syntax error."
    ),
    use_case="Analysing sales data across product categories.",
).agentic_flow
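
The `engine_specific_rules` string steers the model toward T-SQL's `TOP` and away from `LIMIT`. Because this is a prompt instruction and not a guarantee, a cheap check on the generated query can flag violations. The following is a minimal sketch under stated assumptions: `uses_limit_clause` is a hypothetical helper (not part of AutoGen or `autogen_text_2_sql`), and the regex only looks for a `LIMIT <number>` token, so it does not parse SQL and may misfire on string literals or comments.

```python
import re

# Hypothetical helper for illustration; not part of AutoGen or this repository.
# Matches a LIMIT keyword followed by an integer, case-insensitively.
LIMIT_CLAUSE = re.compile(r"\bLIMIT\s+\d+\b", re.IGNORECASE)


def uses_limit_clause(sql: str) -> bool:
    """Return True if the query text appears to contain a `LIMIT <n>` clause."""
    return LIMIT_CLAUSE.search(sql) is not None
```

For example, `uses_limit_clause("SELECT TOP 10 * FROM SalesLT.Address")` is `False`, while a query ending in `LIMIT 10` is flagged.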

### Generated Queries

In [6]:
result = agentic_text_2_sql.run_stream(task="What country did we sell the most to in June 2008?")
await Console(result)

INFO:httpx:HTTP Request: POST https://aoai-text2sql-adi.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-08-01-preview "HTTP/1.1 200 OK"
INFO:autogen_core.events:{"prompt_tokens": 62569, "completion_tokens": 191, "type": "LLMCall"}
INFO:autogen_core:Publishing message of type GroupChatMessage to all subscribers: {'message': TextMessage(source='sql_query_generation_agent', models_usage=RequestUsage(prompt_tokens=62569, completion_tokens=191), content="I am currently unable to access the database to execute the SQL query due to a firewall restriction. However, I can provide you with the SQL query that you can run in your own environment to find out which country had the highest sales in June 2008.\n\nHere is the SQL query:\n\n```sql\nSELECT a.CountryRegion, SUM(so.TotalDue) AS TotalSales\nFROM SalesLT.SalesOrderHeader so\nJOIN SalesLT.Address a ON so.ShipToAddressID = a.AddressID\nWHERE so.OrderDate >= '2008-06-01' AND so.OrderDate < '2008-07-01'\nGROUP B
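
In the log above, the agent's reply embeds the query in a fenced `sql` block inside the message text. The sketch below shows one way to pull such blocks out of a message string. `extract_sql_blocks` is a hypothetical helper, not part of AutoGen or this repository, and it assumes each block has both an opening and a closing fence; a truncated message with no closing fence yields no result.

```python
import re

# Hypothetical helper for illustration; not part of AutoGen or this repository.
FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this block readable
SQL_BLOCK = re.compile(FENCE + r"sql\s*\n(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_sql_blocks(message_text: str) -> list[str]:
    """Return the body of every complete sql-fenced block found in the text."""
    return [block.strip() for block in SQL_BLOCK.findall(message_text)]
```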

In [None]:
await agentic_text_2_sql.reset()
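
The cells in this section repeat one pattern: stream a question through the flow, render it with `Console`, then call `reset()` before the next question, presumably so that earlier conversation state does not carry over. A minimal sketch of that pattern as a single coroutine follows. `ask_and_reset` is a hypothetical helper, not part of AutoGen; it only assumes the flow exposes `run_stream(task=...)` and an awaitable `reset()`, as used in this notebook, and it takes the renderer as a parameter so the block stays self-contained.

```python
from typing import Any, Awaitable, Callable


async def ask_and_reset(
    flow: Any,
    question: str,
    render: Callable[[Any], Awaitable[Any]],
) -> Any:
    """Stream one question through the flow, render the stream, then reset the flow."""
    try:
        return await render(flow.run_stream(task=question))
    finally:
        # Reset even if rendering raised, so the next question starts clean.
        await flow.reset()
```

In this notebook, the call would look like `await ask_and_reset(agentic_text_2_sql, "What country did we sell the most to in June 2008?", Console)`.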

In [None]:
result = agentic_text_2_sql.run_stream(task="What is the total number of sales within 2008 for the mountain bike category?")
await Console(result)

In [None]:
await agentic_text_2_sql.reset()

In [None]:
result = agentic_text_2_sql.run_stream(task="What is the total number of sales within 2008?")
await Console(result)

In [None]:
await agentic_text_2_sql.reset()