<a href="https://colab.research.google.com/github/Ashish-Soni08/Playground/blob/main/LlamaIndex/RouterQueryEngine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Router Query Engine

In this tutorial, we use a router query engine, which chooses one of several candidate query engines to execute a user query.

[Documentation](https://gpt-index.readthedocs.io/en/stable/examples/query_engine/RouterQueryEngine.html)

# Setup

In [1]:
%%capture

!pip install llama-index

In [2]:
# NOTE: This is ONLY necessary in a Jupyter notebook.
# Details: Jupyter runs an event loop behind the scenes.
#          Starting another event loop to make async queries therefore results in nested event loops.
#          This is normally not allowed, so we use nest_asyncio to allow it for convenience.
import nest_asyncio

nest_asyncio.apply()
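
The restriction described in the comments above can be reproduced with the standard library alone. The following is a minimal sketch, not part of the original notebook: the names `inner` and `outer` are hypothetical, and it is meant to be run as a plain Python script without `nest_asyncio` applied (inside Jupyter, the top-level `asyncio.run` call would itself hit the already running loop).

```python
import asyncio


async def inner() -> int:
    return 42


async def outer() -> str:
    # At this point we are inside a running event loop, much like code in a Jupyter cell.
    coro = inner()
    try:
        asyncio.run(coro)  # attempts to start a second, nested event loop
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"
    return "nested loop allowed"


message = asyncio.run(outer())
print(message)
```

Without patching, the nested call is rejected with a `RuntimeError`; `nest_asyncio.apply()` patches asyncio so that such nested calls are permitted.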

In [3]:
import logging
import sys

# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set logger level to INFO

# Clear out any existing handlers
logger.handlers = []

# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)  # Set handler level to INFO

# Add the handler to the logger
logger.addHandler(handler)

from llama_index import (
    VectorStoreIndex,
    SummaryIndex,
    SimpleDirectoryReader,
    ServiceContext,
    StorageContext,
)

import openai
openai.api_key = ""  # set this to your OPENAI_API_KEY

NumExpr defaulting to 2 threads.


## Download Data

In [4]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/jerryjliu/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2023-10-28 07:15:12--  https://raw.githubusercontent.com/jerryjliu/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2023-10-28 07:15:12 (6.04 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



## Load Data

In [5]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# Define Summary Index and Vector Index over the Same Data

In [6]:
summary_index = SummaryIndex.from_documents(documents)
vector_index = VectorStoreIndex.from_documents(documents)

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


# Define Query Engines and Set Metadata

In [7]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [8]:
from llama_index.tools.query_engine import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description="Useful for summarization questions related to Paul Graham essay on What I Worked On.",
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for retrieving specific context from Paul Graham essay on What I Worked On.",
)

# Define Router Query Engine

There are several selectors available, each with some distinct attributes.

The LLM selectors prompt the LLM to output JSON, which is then parsed to determine which indexes to query.

The Pydantic selectors (currently supported only by gpt-4-0613 and gpt-3.5-turbo-0613, the default) use the OpenAI Function Call API to produce pydantic selection objects, rather than parsing raw JSON.

Each type of selector comes in two variants: one that selects a single index to route to, and one that selects multiple.
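
To make the selection step concrete, here is a library-free sketch of the idea: selector output is parsed into typed selection objects, and the query is dispatched to one tool or to several. Everything in it is an assumption for illustration, not LlamaIndex's implementation: the `Tool` and `Selection` classes, the `parse_selections` and `route` helpers, and the JSON shape with 1-based `choice` values are all hypothetical, and the "LLM output" is a hand-written string.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Selection:
    """Typed selection: 0-based tool index plus the selector's stated reason."""
    index: int
    reason: str


@dataclass
class Tool:
    """Hypothetical stand-in for a query engine tool."""
    description: str
    run: Callable[[str], str]


def parse_selections(raw_json: str, num_tools: int) -> List[Selection]:
    """Parse assumed selector output like [{"choice": 1, "reason": "..."}]."""
    selections = []
    for item in json.loads(raw_json):
        index = int(item["choice"]) - 1  # assumed 1-based numbering in the output
        if not 0 <= index < num_tools:
            raise ValueError(f"choice {item['choice']} is out of range")
        selections.append(Selection(index=index, reason=str(item["reason"])))
    return selections


def route(query: str, tools: List[Tool], raw_json: str, multi: bool = False) -> List[str]:
    """Run the query on the first selected tool, or on all of them if multi=True."""
    selections = parse_selections(raw_json, len(tools))
    if not selections:
        raise ValueError("selector output contained no choices")
    if not multi:
        selections = selections[:1]
    return [tools[s.index].run(query) for s in selections]


tools = [
    Tool("Useful for summarization questions.", lambda q: f"summary engine: {q}"),
    Tool("Useful for retrieving specific context.", lambda q: f"vector engine: {q}"),
]

# Hand-written strings standing in for what an LLM selector might return.
single_output = '[{"choice": 2, "reason": "Asks for a specific fact."}]'
multi_output = (
    '[{"choice": 1, "reason": "Needs an overview."},'
    ' {"choice": 2, "reason": "Also needs details."}]'
)

single_result = route("What happened next?", tools, single_output)
multi_result = route("Summarize and cite details.", tools, multi_output, multi=True)
print(single_result)
print(multi_result)
```

The single variant keeps only the first choice, while the multi variant runs every selected tool; in a real router the multiple responses would still need to be combined into one answer, which this sketch leaves out.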

## PydanticSingleSelector

In [9]:
from llama_index.query_engine.router_query_engine import RouterQueryEngine
from llama_index.selectors.llm_selectors import LLMSingleSelector
from llama_index.selectors.pydantic_selectors import (
    PydanticSingleSelector,
)
from IPython.display import display, HTML


query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
)

In [10]:
response = query_engine.query("What is the summary of the document?")

Selecting query engine 0: This choice is specifically mentioned as useful for summarization questions..
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1404 request_id=419cd57a9e440da0873dc3a4c25509d6 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=2909 request_id=82b561c67a80b84b2ba6efe745e1adf5 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=2978 request_id=1239b5027e18fded0f7deabefac9db0a response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3335 request_id=52f70f1da08abbc83b6c091509ff85f0 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3471 request_id=ae26eba0fa1c432dad1cb8673cbc40b3 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3606 requ

In [11]:
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

## LLMSingleSelector

In [12]:
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
)

In [13]:
response = query_engine.query("What is the summary of the document?")

Selecting query engine 0: The first choice is relevant because it mentions summarization questions related to the essay..
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1701 request_id=49b35c1309404eccea074ffc6b395829 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3012 request_id=ac6e67c0b8cc7b1e07e781b52fbeb1a7 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=2982 request_id=8bfd187e56388ac13a5a27986a07c869 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3855 request_id=ebae114951855e021463266940ec8f75 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3366 request_id=37b0a853e1721bde4f1ba7d455ca6b25 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions proce

In [14]:
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

In [15]:
response = query_engine.query("What did Paul Graham do after RICS?")

Selecting query engine 1: The question is asking for specific context about what Paul Graham did after RICS, which is better suited for retrieving specific context from the essay..


In [16]:
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))