<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q">Community</a>
    </p>
</center>
<h1 align="center">SQL Router Query Engine Example</h1>

LlamaIndex provides high-level APIs that enable users to build powerful applications in a few lines of code. However, it can be challenging to understand what is going on under the hood and to pinpoint the cause of issues. Phoenix makes your LLM applications *observable* by visualizing the underlying structure of each call to your query engine and surfacing problematic `spans` of execution based on latency, token count, or other evaluation metrics.

In this tutorial, you will:
- Build a query engine that uses both a SQL retriever and a VectorStoreIndex using LlamaIndex
- Record trace data in [OpenInference tracing](https://github.com/Arize-ai/openinference) format using the global `arize_phoenix` handler
- Observe how a more complex LlamaIndex application might perform retrieval

ℹ️ This notebook requires an OpenAI API key.

## 1. Install Dependencies and Import Libraries

Install Phoenix, LlamaIndex, and OpenAI.

In [None]:
!pip install "arize-phoenix[evals]" "openai>=1" gcsfs nest-asyncio "llama-index>=0.10.3" "llama-index-core" "openinference-instrumentation-llama-index>=1.0.0" "llama-index-callbacks-arize-phoenix>=0.1.2" "llama-index-readers-wikipedia" "sqlalchemy" wikipedia

In [None]:
import os
from getpass import getpass

import openai
import pandas as pd
import phoenix as px
import wikipedia
from llama_index.core import Document, Settings, set_global_handler
from llama_index.core.indices import VectorStoreIndex
from llama_index.core.query_engine import NLSQLTableQueryEngine, RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool
from llama_index.core.utilities.sql_wrapper import SQLDatabase
from llama_index.llms.openai import OpenAI
from sqlalchemy import (
    create_engine,
    text,
)

pd.set_option("display.max_colwidth", 1000)

## 2. Launch Phoenix

You can run Phoenix in the background to collect trace data emitted by any LlamaIndex application that has been instrumented with the `OpenInferenceTraceCallbackHandler`. Phoenix supports LlamaIndex's [one-click observability](https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/one_click_observability.html), which automatically instruments your LlamaIndex application. Consult our [integration guide](https://docs.arize.com/phoenix/integrations/llamaindex) for a more detailed explanation of how to instrument your LlamaIndex application.

Launch Phoenix and follow the instructions in the cell output to open the Phoenix UI (the UI should be empty because we have yet to run the LlamaIndex application).

In [None]:
session = px.launch_app()

Enable Phoenix tracing within LlamaIndex by setting `arize_phoenix` as the global handler. This mounts Phoenix's [OpenInferenceTraceCallbackHandler](https://docs.arize.com/phoenix/integrations/llamaindex) as the global handler. Phoenix uses OpenInference traces, an open-source standard for capturing and storing LLM application traces that lets LLM applications integrate with LLM observability solutions such as Phoenix.

In [None]:
set_global_handler("arize_phoenix")
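
For intuition about what the handler records, a trace can be pictured as a tree of spans, each covering one step of a query (routing, retrieval, an LLM call). The sketch below uses a hypothetical `Span` dataclass with made-up names and timings. It is not the OpenInference schema, only an illustration of how per-span latency can point to a slow step.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Span:
    """Hypothetical, simplified span; not the OpenInference schema."""

    name: str
    latency_ms: float
    children: List["Span"] = field(default_factory=list)


def walk(span: Span) -> Iterator[Span]:
    """Yield a span and all of its descendants, depth first."""
    yield span
    for child in span.children:
        yield from walk(child)


# Made-up trace for a single routed query.
trace = Span(
    "query",
    2300.0,
    [
        Span("select_tool", 400.0),
        Span(
            "sql_retrieval",
            1900.0,
            [Span("llm_generate_sql", 1500.0), Span("run_sql", 20.0)],
        ),
    ],
)

# The slowest leaf span is often the first place to look.
leaves = [s for s in walk(trace) if not s.children]
slowest = max(leaves, key=lambda s: s.latency_ms)
print(slowest.name, slowest.latency_ms)
```

In Phoenix the same kind of inspection happens in the UI, where spans can be sorted and filtered instead of walked by hand.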

## 3. Configure Your OpenAI API Key

Set your OpenAI API key if it is not already set as an environment variable.

In [None]:
if not (openai_api_key := os.getenv("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")
openai.api_key = openai_api_key
os.environ["OPENAI_API_KEY"] = openai_api_key

## 3. Prepare Reference Data

First, we'll download a dataset that contains technical details of various digital cameras and convert it into an in-memory SQL database. The dataset is hosted on Kaggle; see the [dataset page](https://www.kaggle.com/datasets/crawford/1000-cameras-dataset) for more details.

In [None]:
camera_info = pd.read_parquet(
    "https://storage.googleapis.com/arize-phoenix-assets/datasets/structured/camera-info/cameras.parquet"
)

In [None]:
camera_info.head()

In [None]:
engine = create_engine("sqlite:///:memory:", future=True)
camera_info.to_sql("cameras", engine, index=False)

In [None]:
with engine.connect() as connection:
    result = connection.execute(text("SELECT * FROM cameras LIMIT 5")).all()

    for row in result:
        print(row)
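
Several columns in this dataset, such as `Release date` and `Weight (inc. batteries)`, have names that contain spaces or punctuation. As a minimal sketch, assuming a hypothetical three-column subset and made-up rows rather than the Kaggle data, the standard-library `sqlite3` module shows how to list a table's columns and why such names need double quotes in SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE cameras ("Model" TEXT, "Release date" INTEGER, "Weight (inc. batteries)" REAL)'
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO cameras VALUES (?, ?, ?)",
    [("Camera A", 2003, 320.0), ("Camera B", 2005, 180.0)],
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(cameras)")]
print(columns)

# Identifiers containing spaces or parentheses must be double-quoted.
lightest = conn.execute(
    'SELECT "Model" FROM cameras ORDER BY "Weight (inc. batteries)" LIMIT 1'
).fetchone()
print(lightest)
conn.close()
```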

Next, for more general queries about digital cameras, we'll download the Wikipedia page on Digital Cameras using the `wikipedia` SDK. We will convert this document into a LlamaIndex `VectorStoreIndex`.

In [None]:
# load the Digital Camera wikipedia page
page = wikipedia.page(pageid=52797)
# Document IDs are strings, so cast the page ID defensively
doc = Document(id_=str(page.pageid), text=page.content)

vector_indices = []
vector_index = VectorStoreIndex.from_documents([doc])
vector_indices.append(vector_index)
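
Conceptually, a vector index embeds chunks of text and returns the chunks closest to the embedded query. The toy sketch below substitutes word-count vectors and cosine similarity for a real embedding model and uses made-up chunks. `VectorStoreIndex` relies on a learned embedding model instead, so treat this only as an illustration of similarity-based retrieval.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up chunks standing in for pieces of a document.
chunks = [
    "early digital cameras used ccd image sensors",
    "memory cards store photos as image files",
    "lenses focus light onto the sensor",
]

query = "which image sensors did early cameras use"
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
print(best)
```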

## 4. Build LlamaIndex Application

Let's build a simple `RouterQueryEngine` from multiple query engine tools. Each query is routed either to the SQL retriever or to the vector index built over the "Digital Camera" Wikipedia page.

In [None]:
Settings.llm = OpenAI(temperature=0.0, model="gpt-4")

In [None]:
sql_database = SQLDatabase(engine, include_tables=["cameras"])

sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["cameras"],
)
sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    description=(
        "Useful for translating a natural language query into a SQL query over"
        " a table containing technical details about specific digital camera models: Model,"
        " Release date, Max resolution, Low resolution, Effective pixels, Zoom wide (W),"
        " Zoom tele (T), Normal focus range, Macro focus range, Storage included,"
        " Weight (inc. batteries), Dimensions, Price"
    ),
)

vector_query_engines = [index.as_query_engine() for index in vector_indices]
vector_tools = []
for query_engine in vector_query_engines:
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=query_engine,
        description="Useful for answering generic questions about digital cameras.",
    )
    vector_tools.append(vector_tool)

In [None]:
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=([sql_tool] + vector_tools),
)
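
The router picks one tool per query based on the tool descriptions. `LLMSingleSelector` asks the LLM to make that choice; the sketch below replaces the LLM with a crude keyword-overlap score purely to show the control flow. The `Tool` class, the `route` function, and the shortened descriptions are hypothetical and not part of LlamaIndex.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """Hypothetical stand-in for a query engine tool."""

    name: str
    description: str
    run: Callable[[str], str]


def route(query: str, tools: List[Tool]) -> Tool:
    """Pick the tool whose description shares the most words with the query."""
    words = set(query.lower().split())
    return max(tools, key=lambda t: len(words & set(t.description.lower().split())))


tools = [
    Tool(
        "sql",
        "technical details about specific camera models such as price weight and resolution",
        lambda q: "answer from SQL",
    ),
    Tool(
        "vector",
        "generic questions about the history and design of digital cameras",
        lambda q: "answer from vector index",
    ),
]

question = "what is the history of digital cameras"
chosen = route(question, tools)
print(chosen.name, "->", chosen.run(question))
```

Keyword overlap breaks down quickly on paraphrased questions, which is why a real router delegates the choice to an LLM. It also means that clear, distinct tool descriptions matter.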

## 5. Make Queries and Use Phoenix to View Spans

In [None]:
response = query_engine.query("What is the most expensive digital camera?")
print(str(response))

This query asks for specific details about a camera, so it is routed to the SQL retriever to get context for the response. The LLM-generated SQL can be seen in a Phoenix span.

![A view of the Phoenix UI showing SQL retrieval](https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/tracing/llama-index-sql-retrieval-tutorial/sql-retrieval.png)
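
One plausible shape for the SQL behind a "most expensive" question is a sort by price that keeps only the top row. The sketch below runs such a query with the standard-library `sqlite3` module against made-up rows. The statement is hypothetical, not captured LLM output, so check the span in Phoenix for the SQL that was actually generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE cameras ("Model" TEXT, "Price" REAL)')
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO cameras VALUES (?, ?)",
    [("Camera A", 179.0), ("Camera B", 1299.0), ("Camera C", 449.0)],
)

# Hypothetical query for "What is the most expensive digital camera?"
sql = 'SELECT "Model", "Price" FROM cameras ORDER BY "Price" DESC LIMIT 1'
top = conn.execute(sql).fetchone()
print(top)
conn.close()
```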

In [None]:
response = query_engine.query("Tell me about the history of digital camera sensors.")
print(str(response))

More general queries are routed to the vector index.

![A view of the Phoenix UI showing vector retrieval](https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/tracing/llama-index-sql-retrieval-tutorial/vectorstoreindex-retrieval.png)

## 6. Final Thoughts

LLM Traces and the accompanying OpenInference Tracing specification are designed to be a category of telemetry data used to understand the execution of LLMs and the surrounding application context. This is especially useful for understanding the behavior of more complex RAG applications that use multiple context retrieval strategies, such as mixing a SQL retriever with more common vector indexes.

For more details on Phoenix, LLM Tracing, and LLM Evals, check out our [documentation](https://docs.arize.com/phoenix/).