<a href="https://colab.research.google.com/github/sanjayakanungo/RAG/blob/main/docs/examples/query_engine/RouterQueryEngine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Router Query Engine
In this tutorial, we define a custom router query engine that selects one out of several candidate query engines to execute a query.

### Setup

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [1]:
!pip install llama-index-embeddings-openai
!pip install llama-index-llms-openai

In [2]:
!pip install llama-index

In [3]:
# NOTE: This is ONLY necessary in jupyter notebook.
# Details: Jupyter runs an event-loop behind the scenes.
#          This results in nested event-loops when we start an event-loop to make async queries.
#          This is normally not allowed; we use nest_asyncio to allow it for convenience.
import nest_asyncio

nest_asyncio.apply()
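
To see why the patch is needed, the following minimal sketch (standard library only, independent of LlamaIndex) tries to start a second event loop from inside a running one, which is the situation a notebook cell is in. The names `inner` and `outer` are illustrative, and the sketch is meant to be run as a plain script, without `nest_asyncio` applied.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    # An event loop is already running here, so asking asyncio
    # to start another one is rejected with a RuntimeError.
    coro = inner()
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never ran; close it to avoid a warning
        return f"RuntimeError: {exc}"
    return "no error"


result = asyncio.run(outer())
print(result)
```

With `nest_asyncio.apply()` in effect, the nested call is permitted instead of raising.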

### Global Models

In [4]:
import os

os.environ["OPENAI_API_KEY"] = "sk-"
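
Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. One possible alternative, sketched here with the standard library only, is to reuse a key that is already in the environment and prompt for it otherwise. The helper name `ensure_api_key` and its `prompt` parameter are hypothetical and not part of LlamaIndex or the OpenAI client.

```python
import os
from getpass import getpass


def ensure_api_key(var_name: str = "OPENAI_API_KEY", prompt=getpass) -> str:
    """Return the key from the environment, prompting only if it is missing."""
    key = os.environ.get(var_name, "")
    if not key:
        # getpass hides the typed value so it does not end up in cell output
        key = prompt(f"Enter {var_name}: ")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` before constructing the models would then replace the assignment above.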

In [5]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

### Load Data

We first show how to convert a Document into a set of Nodes and insert them into a DocumentStore.
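
As a rough mental model of this step, the sketch below splits a string into fixed-size chunks and stores them by id. It is a toy under stated assumptions: `ToyNode`, `split_into_nodes`, and the dict-based store are hypothetical stand-ins, and the split counts words, whereas the LlamaIndex node parser works on tokens and respects sentence boundaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import uuid


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node: a chunk of text plus an id."""
    text: str
    node_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def split_into_nodes(text: str, chunk_size: int) -> List[ToyNode]:
    """Split text into chunks of at most chunk_size words (not tokens)."""
    words = text.split()
    return [
        ToyNode(" ".join(words[i : i + chunk_size]))
        for i in range(0, len(words), chunk_size)
    ]


toy_nodes = split_into_nodes("one two three four five", chunk_size=2)
# The "document store" here is just a mapping from node id to node.
toy_docstore: Dict[str, ToyNode] = {n.node_id: n for n in toy_nodes}
print([n.text for n in toy_nodes])  # ['one two', 'three four', 'five']
```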

In [8]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader("/content/data").load_data()

In [9]:
from llama_index.core import Settings

# initialize settings (set chunk size)
Settings.chunk_size = 1024
nodes = Settings.node_parser.get_nodes_from_documents(documents)

In [10]:
from llama_index.core import StorageContext

# initialize storage context (by default it's in-memory)
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

### Define Summary Index and Vector Index over Same Data

In [11]:
from llama_index.core import SummaryIndex
from llama_index.core import VectorStoreIndex

summary_index = SummaryIndex(nodes, storage_context=storage_context)
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
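
The two indexes differ mainly in what they retrieve at query time: a summary index passes every node to the response synthesizer, while a vector index retrieves only the top-k nodes most similar to the query. The sketch below illustrates that contrast with plain Python. Bag-of-words cosine similarity is a deliberately crude stand-in for real embeddings, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bow(text: str) -> Counter:
    """Bag-of-words vector; a toy replacement for an embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_all(chunks: List[str]) -> List[str]:
    """Summary-style retrieval: every chunk is returned."""
    return list(chunks)


def retrieve_top_k(chunks: List[str], query: str, k: int = 1) -> List[str]:
    """Vector-style retrieval: only the k most similar chunks are returned."""
    q = bow(query)
    ranked = sorted(chunks, key=lambda c: cosine(bow(c), q), reverse=True)
    return ranked[:k]


chunks = [
    "prepare esxi hosts for deployment",
    "deploy the cloud builder appliance",
    "collect logs with the sos utility",
]
print(len(retrieve_all(chunks)))                      # 3
print(retrieve_top_k(chunks, "how to collect logs"))  # ['collect logs with the sos utility']
```

This is why the summary engine suits whole-document questions and the vector engine suits targeted lookups, which is the distinction the router relies on.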

### Define Query Engines and Set Metadata

In [12]:
list_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [13]:
from llama_index.core.tools import QueryEngineTool


list_tool = QueryEngineTool.from_defaults(
    query_engine=list_query_engine,
    description=(
        "Useful for summarization questions related to the VCF Deployment Guide"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the VCF Deployment Guide"
    ),
)

### Define Router Query Engine

There are several selectors available, each with some distinct attributes.

The LLM selectors prompt the LLM to output JSON, which is then parsed to decide which indexes to query.
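
The parsing step can be pictured roughly as follows. This is a sketch, not the library's parser: the `{"choice": ..., "reason": ...}` shape with 1-based choices is an assumption about the output format, and `ToySelection` and `parse_selection` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ToySelection:
    index: int   # zero-based position in the tool list
    reason: str


def parse_selection(raw_llm_output: str, num_tools: int) -> List[ToySelection]:
    # Keep only the JSON array, in case the model wrapped it in extra prose.
    start, end = raw_llm_output.find("["), raw_llm_output.rfind("]")
    if start == -1 or end == -1:
        raise ValueError("no JSON array found in LLM output")
    items = json.loads(raw_llm_output[start : end + 1])

    selections = []
    for item in items:
        index = int(item["choice"]) - 1  # assumed: choices are numbered from 1
        if not 0 <= index < num_tools:
            raise ValueError(f"choice {item['choice']} is out of range")
        selections.append(ToySelection(index=index, reason=item["reason"]))
    return selections


raw = 'Here is my answer: [{"choice": 2, "reason": "Asks for specific details."}]'
tool_names = ["list_tool", "vector_tool"]
selected = parse_selection(raw, num_tools=len(tool_names))
print([tool_names[s.index] for s in selected])  # ['vector_tool']
```

Because the model is free to emit malformed or out-of-range output, this path needs explicit validation. Producing structured objects directly removes that burden.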

The Pydantic selectors use the OpenAI Function Calling API to produce pydantic selection objects rather than parsing raw JSON. They are currently supported only by `gpt-4-0613` and `gpt-3.5-turbo-0613` (the default).

Each type of selector comes in two variants: one that routes to a single index, and one that routes to multiple.

#### PydanticSingleSelector

Use the OpenAI Function API to generate/parse pydantic objects under the hood for the router selector.

In [14]:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector, LLMMultiSelector
from llama_index.core.selectors import (
    PydanticMultiSelector,
    PydanticSingleSelector,
)


query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)

In [15]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

The document provides comprehensive guidance on deploying VMware Cloud Foundation, covering prerequisites, procedures, and troubleshooting methods. It includes information on preparing the environment, deploying the VMware Cloud Builder appliance, configuring ESXi hosts, creating custom ISO images for ESXi, deploying the management domain using ESXi hosts with external certificates, and utilizing the Supportability and Serviceability (SoS) Utility for troubleshooting. Additionally, it offers detailed instructions for running the SoS Utility on the VMware Cloud Builder appliance, log file collection options, JSON generator options, health check operations, and explanations of key terms and concepts related to VMware Cloud Foundation.


In [16]:
response = query_engine.query("How to deploy SDDC Manager?")
print(str(response))

To deploy SDDC Manager, you need to follow a series of steps. First, create a JSON file with the bring-up information for your environment and update the securitySpec section. Then, review and acknowledge the prerequisites and download the deployment parameter workbook from VMware Customer Connect. Fill in the workbook with the required information, upload it, and begin the validation process. Once validated, click "Deploy SDDC" to start the bring-up process. Monitor the status of the bring-up tasks in the UI, and after completion, download a detailed deployment report, finish the process, and launch SDDC Manager. Finally, power off the VMware Cloud Builder appliance.


#### LLMSingleSelector

Use OpenAI (or any other LLM) to parse generated JSON under the hood to select a sub-index for routing.

In [17]:
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)

In [18]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

The document provides comprehensive guidance on deploying VMware Cloud Foundation 5.1, covering prerequisites, deployment parameters, and troubleshooting. It includes detailed steps for preparing the environment, deploying the management domain using VMware Cloud Builder, creating custom ISO images for ESXi, configuring ESXi hosts with signed certificates, and utilizing the Supportability and Serviceability (SoS) Tool for troubleshooting. Additionally, it offers troubleshooting guidance, focusing on the use of the SoS Utility, and includes a glossary of terms related to VMware Cloud Foundation.


In [19]:
response = query_engine.query("How to deploy SDDC Manager?")
print(str(response))

To deploy SDDC Manager, you need to follow the procedure outlined in the VMware Cloud Foundation Deployment Guide. This involves creating a JSON file with the bring-up information, updating the securitySpec section, downloading the deployment parameter workbook from VMware Customer Connect, filling it in with the required information, uploading the completed workbook, and then beginning the validation of the uploaded file. During the bring-up process, the vCenter Server, NSX, and SDDC Manager appliances will be deployed, and the management domain will be created. Once the bring-up is completed, a detailed deployment report can be downloaded, and the process can be finished by launching SDDC Manager and powering off the VMware Cloud Builder appliance.


In [20]:
# [optional] look at selected results
print(str(response.metadata["selector_result"]))

selections=[SingleSelection(index=1, reason='The question is asking for specific context on how to deploy SDDC Manager, so choice 2 is more relevant as it focuses on retrieving specific context from the essay.')]


#### PydanticMultiSelector

If you expect queries to be routed to multiple indexes, use a multi selector. The multi selector sends the query to multiple sub-indexes and then aggregates all responses using a summary index to form a complete answer.
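
The fan-out-then-aggregate flow can be sketched in plain Python as below. Everything here is illustrative: the word-overlap selector replaces the LLM-based selector, the string-joining `combine` function replaces LLM-based summarization, and the engines are simple callables rather than LlamaIndex query engines.

```python
from typing import Callable, List

QueryFn = Callable[[str], str]


def toy_multi_selector(query: str, descriptions: List[str]) -> List[int]:
    """Pick every tool whose description shares a word with the query."""
    q_words = set(query.lower().split())
    return [
        i for i, d in enumerate(descriptions) if q_words & set(d.lower().split())
    ]


def route(
    query: str,
    engines: List[QueryFn],
    descriptions: List[str],
    combine: Callable[[List[str]], str],
) -> str:
    chosen = toy_multi_selector(query, descriptions)
    if not chosen:
        raise ValueError("selector did not choose any engine")
    answers = [engines[i](query) for i in chosen]
    # A single answer is returned as-is; several answers are aggregated.
    return answers[0] if len(answers) == 1 else combine(answers)


descriptions = ["summarization questions", "specific facts lookup", "keyword lookup"]
engines = [
    lambda q: "summary answer",
    lambda q: "vector answer",
    lambda q: "keyword answer",
]
answer = route("lookup deployment steps", engines, descriptions, " | ".join)
print(answer)  # vector answer | keyword answer
```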

In [21]:
from llama_index.core import SimpleKeywordTableIndex

keyword_index = SimpleKeywordTableIndex(nodes, storage_context=storage_context)

keyword_tool = QueryEngineTool.from_defaults(
    # use the keyword index here; the original passed vector_query_engine by mistake
    query_engine=keyword_index.as_query_engine(),
    description=(
        "Useful for retrieving specific context using keywords from VCF Deployment Guide"
    ),
)

In [22]:
query_engine = RouterQueryEngine(
    selector=PydanticMultiSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
        keyword_tool,
    ],
)

In [23]:
# This query could be answered by either the keyword or the vector query engine,
# so the selector may pick both and combine their responses
response = query_engine.query(
    "Please summarize the deployment steps of SDDC Manager"
)
print(str(response))

The deployment steps of SDDC Manager involve specifying deployment details, such as the hostname and IP address for the SDDC Manager VM, network pool name for the management domain network pool, and the Cloud Foundation Management Domain Name. Additionally, the VMware Cloud Foundation API Reference Guide is followed to deploy the management domain, and the Supportability and Serviceability (SoS) Tool is used for troubleshooting during the deployment stage. After a successful bring-up, the SoS Utility should only be run on the SDDC Manager appliance.


In [24]:
# [optional] look at selected results
print(str(response.metadata["selector_result"]))

selections=[SingleSelection(index=0, reason='Useful for summarization of a specific essay on a topic.')]
