# Building a SQL AI Assistant with LlamaIndex

[LlamaIndex](https://www.llamaindex.ai/) is a simple, flexible framework for building knowledge assistants using LLMs. It simplifies the process of connecting LLMs to private data by providing tools for data ingestion, indexing, and querying.

This tutorial follows the [Text-to-SQL Guide](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/#part-1-text-to-sql-query-engine) from the LlamaIndex documentation.

Here's a short version of an NL2SQL assistant built with LlamaIndex:

```python
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.llms.openai import OpenAI

# Initialize query engine
query_engine = NLSQLTableQueryEngine(sql_database=SQLDatabase(...), llm=OpenAI(...))

# Ask a natural-language question; the engine generates and runs the SQL
query_engine.query(query_str)
```

In [1]:
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.llms.openai import OpenAI
from sqlalchemy import create_engine

In [2]:
# Create the engine

# postgresql+psycopg://username:password@host:port/database
postgres_uri = "postgresql+psycopg://postgres:postgres@localhost:5432/olist_ecommerce"
engine = create_engine(postgres_uri, connect_args={"options": "-csearch_path=ecommerce,marketing"})
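
The URI above follows the `username:password@host:port/database` layout with the credentials hardcoded. As a minimal sketch, the same string could be assembled from environment variables, with the credentials percent-encoded so that characters such as `@` or `/` in a password are not read as URI delimiters. The helper `build_postgres_uri` and the `PG*` variable names are assumptions for illustration, not part of the original notebook; the defaults reproduce the tutorial's URI.

```python
import os
from urllib.parse import quote_plus


def build_postgres_uri(user, password, host, port, database, driver="psycopg"):
    """Assemble a SQLAlchemy-style PostgreSQL URI (hypothetical helper)."""
    # Percent-encode the credentials so characters like "@" or "/" are not
    # mistaken for URI delimiters.
    return (
        f"postgresql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# The PG* variable names are assumptions; the defaults mirror the tutorial's URI.
postgres_uri = build_postgres_uri(
    user=os.environ.get("PGUSER", "postgres"),
    password=os.environ.get("PGPASSWORD", "postgres"),
    host=os.environ.get("PGHOST", "localhost"),
    port=int(os.environ.get("PGPORT", "5432")),
    database=os.environ.get("PGDATABASE", "olist_ecommerce"),
)
```

The resulting `postgres_uri` can be passed to `create_engine` exactly as in the cell above.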

In [3]:
# Initialize the SQLDatabase
sql_database = SQLDatabase(engine)

# Show the available tables
sql_database.get_usable_table_names()

['closed_deals',
 'customers',
 'geolocation',
 'marketing_qualified_leads',
 'order_items',
 'order_payments',
 'order_reviews',
 'orders',
 'product_category_name_translations',
 'products',
 'sellers']

In [4]:
# Initialize the LLM
llm = OpenAI(temperature=0.1, model="gpt-4.1-mini")

In [5]:
# Initialize the query engine
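query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    llm=llm,
    text_to_sql_prompt=None,  # None falls back to the default text-to-SQL prompt
    response_synthesis_prompt=None,  # None falls back to the default response prompt
)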

In [6]:
# Target query
query_str = "How many orders were there in 2017?"

# Execute the query
result = query_engine.query(query_str)
display(result)

Response(response='There were 45,101 orders placed in the year 2017.', source_nodes=[NodeWithScore(node=TextNode(id_='9dfbdfd2-5a97-49b0-a63f-ba5140c82a56', embedding=None, metadata={'sql_query': "SELECT COUNT(*) AS total_orders_2017 FROM orders WHERE order_purchase_timestamp >= '2017-01-01' AND order_purchase_timestamp < '2018-01-01';", 'result': [(45101,)], 'col_keys': ['total_orders_2017']}, excluded_embed_metadata_keys=['sql_query', 'result', 'col_keys'], excluded_llm_metadata_keys=['sql_query', 'result', 'col_keys'], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text='[(45101,)]', mimetype='text/plain', start_char_idx=None, end_char_idx=None, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=None)], metadata={'9dfbdfd2-5a97-49b0-a63f-ba5140c82a56': {'sql_query': "SELECT COUNT(*) AS total_orders_2017 FROM orders WHERE order_purchase_timestamp >= '2017-01-01' AND order_purchase_timestamp < '2018-01-01';", 'result': [(45101,

In [7]:
print(f"""
SQL Query:
----------
{result.metadata["sql_query"]}

Query Output:
-------------
{result.metadata["result"]}

Final Answer:
-------------
{result.response}
""")


SQL Query:
----------
SELECT COUNT(*) AS total_orders_2017 FROM orders WHERE order_purchase_timestamp >= '2017-01-01' AND order_purchase_timestamp < '2018-01-01';

Query Output:
-------------
[(45101,)]

Final Answer:
-------------
There were 45,101 orders placed in the year 2017.
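
The generated SQL filters on a half-open range (`>= '2017-01-01'` and `< '2018-01-01'`) rather than `BETWEEN '2017-01-01' AND '2017-12-31'`. The sketch below illustrates why that matters for timestamp columns. It uses an in-memory SQLite table as a stand-in for the PostgreSQL `orders` table, and the five rows are made up for illustration rather than taken from the Olist data. SQLite compares these values as text, but the boundary behavior shown here matches what a timestamp comparison would produce.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, order_purchase_timestamp TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("a", "2016-12-31 23:59:59"),  # before 2017
        ("b", "2017-01-01 00:00:00"),  # first instant of 2017
        ("c", "2017-06-15 12:30:00"),
        ("d", "2017-12-31 23:59:59"),  # last second of 2017
        ("e", "2018-01-01 00:00:00"),  # first instant of 2018
    ],
)

# Half-open range, as in the SQL produced by the query engine.
half_open = conn.execute(
    "SELECT COUNT(*) FROM orders "
    "WHERE order_purchase_timestamp >= '2017-01-01' "
    "AND order_purchase_timestamp < '2018-01-01'"
).fetchone()[0]

# Closed range on dates only: misses orders placed after midnight on Dec 31.
closed = conn.execute(
    "SELECT COUNT(*) FROM orders "
    "WHERE order_purchase_timestamp BETWEEN '2017-01-01' AND '2017-12-31'"
).fetchone()[0]

print(half_open, closed)  # 3 2
```

The half-open filter counts all three 2017 rows, while the `BETWEEN` variant drops the order placed at 23:59:59 on December 31.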

