<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/query_engine/RouterQueryEngine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Router Query Engine
In this tutorial, we define a custom router query engine that selects one out of several candidate query engines to execute a query.

### Setup

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [1]:
%pip install llama-index-embeddings-openai
%pip install llama-index-llms-openai


In [None]:
!pip install llama-index

In [31]:
from IPython.display import Markdown


In [None]:
# NOTE: This is ONLY necessary in a Jupyter notebook.
# Details: Jupyter runs an event loop behind the scenes.
#          Starting another event loop to make async queries results in nested event loops.
#          This is normally not allowed; nest_asyncio permits it for convenience.
# import nest_asyncio

# nest_asyncio.apply()
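
To illustrate the nested event-loop problem described in the note above, here is a minimal sketch that uses only the standard library. `fetch_answer` and `notebook_cell` are hypothetical names: the first stands in for an async query, the second for code running inside an already-active loop such as Jupyter's. It does not call LlamaIndex or `nest_asyncio`.

```python
import asyncio


async def fetch_answer() -> str:
    # Stand-in for an async query; a real engine would await an LLM call here.
    return "answer"


async def notebook_cell() -> str:
    # Simulates code that runs while an event loop is already active.
    coro = fetch_answer()
    try:
        asyncio.run(coro)  # tries to start a second loop inside the first one
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return f"RuntimeError: {exc}"
    return "no error"


outcome = asyncio.run(notebook_cell())
print(outcome)
```

The inner `asyncio.run` call is rejected with a `RuntimeError`, which is the situation `nest_asyncio.apply()` is meant to work around.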


## Global Models

In [10]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
import os
from google.colab import userdata

# set OpenAI API key in environment variable
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

In [27]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-4-turbo", temperature=0.2)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

### Load Data

We first show how to convert a Document into a set of Nodes and insert them into a DocumentStore.

In [15]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader("data").load_data()

In [17]:
from llama_index.core import Settings

# initialize settings (set chunk size)
Settings.chunk_size = 1024
nodes = Settings.node_parser.get_nodes_from_documents(documents)

In [18]:
from llama_index.core import StorageContext

# initialize storage context (by default it's in-memory)
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)
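
As a rough mental model of the two steps above (split a document into nodes, then register the nodes in a store keyed by ID), here is a toy sketch in plain Python. `ToyNode`, `split_into_nodes`, and `toy_docstore` are hypothetical names, and the splitter counts whitespace-separated words rather than tokens, so this is not how LlamaIndex's node parser or docstore is implemented.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class ToyNode:
    text: str
    # Each node gets a unique ID so it can be looked up in the store later.
    node_id: str = field(default_factory=lambda: str(uuid4()))


def split_into_nodes(text: str, chunk_size: int) -> list[ToyNode]:
    """Split text into chunks of at most `chunk_size` words (an assumption)."""
    words = text.split()
    return [
        ToyNode(" ".join(words[i : i + chunk_size]))
        for i in range(0, len(words), chunk_size)
    ]


# A plain dict stands in for an in-memory document store.
toy_docstore: dict[str, ToyNode] = {}
toy_nodes = split_into_nodes("one two three four five six seven", chunk_size=3)
for node in toy_nodes:
    toy_docstore[node.node_id] = node

print([n.text for n in toy_nodes])
# → ['one two three', 'four five six', 'seven']
```

Because several indexes can share one store, each index only needs to reference node IDs rather than hold its own copy of the text.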

### Define Summary Index and Vector Index over Same Data

In [19]:
from llama_index.core import SummaryIndex
from llama_index.core import VectorStoreIndex

summary_index = SummaryIndex(nodes, storage_context=storage_context)
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)

### Define Query Engines and Set Metadata

In [20]:
list_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [21]:
from llama_index.core.tools import QueryEngineTool


list_tool = QueryEngineTool.from_defaults(
    query_engine=list_query_engine,
    description=(
        "Useful for summarization questions"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context"
    ),
)

### Define Router Query Engine

There are several selectors available, each with some distinct attributes.

The LLM selectors use the LLM to output a JSON that is parsed, and the corresponding indexes are queried.

The Pydantic selectors use the OpenAI Function Calling API to produce pydantic selection objects rather than parsing raw JSON. They are currently supported only by `gpt-4-0613` and `gpt-3.5-turbo-0613` (the default).

For each type of selector, you can route to a single index or to multiple indexes.
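
The JSON-parsing step of an LLM selector can be sketched as follows. This is a simplified illustration, not the library's implementation: `parse_selection` is a hypothetical helper, the reply format (a JSON list of objects with 1-based `choice` and `reason` fields) is an assumption, and `fake_reply` is a hard-coded string standing in for a model response.

```python
import json

# Tool descriptions mirroring the two tools defined above.
choices = [
    "Useful for summarization questions",
    "Useful for retrieving specific context",
]


def parse_selection(raw_reply: str, num_choices: int, allow_multiple: bool) -> list[int]:
    """Turn a JSON reply into zero-based tool indexes, validating as we go."""
    items = json.loads(raw_reply)
    # Assumption: the prompt numbers choices from 1, so shift to zero-based.
    indexes = [int(item["choice"]) - 1 for item in items]
    for idx in indexes:
        if not 0 <= idx < num_choices:
            raise ValueError(f"choice {idx + 1} is out of range")
    if not allow_multiple and len(indexes) != 1:
        raise ValueError("a single selector must pick exactly one choice")
    return indexes


# Hard-coded stand-in for what a model might return.
fake_reply = '[{"choice": 2, "reason": "Asks for one specific fact."}]'
selected = parse_selection(fake_reply, len(choices), allow_multiple=False)
print(choices[selected[0]])
# → Useful for retrieving specific context
```

A Pydantic selector skips this manual parsing because the function-calling response is already validated against a schema.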

#### PydanticSingleSelector

Use the OpenAI Function API to generate/parse pydantic objects under the hood for the router selector.

In [22]:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector, LLMMultiSelector
from llama_index.core.selectors import (
    PydanticMultiSelector,
    PydanticSingleSelector,
)


query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)

In [23]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

The document contains information about various operating systems and their respective versions, including Windows 10 and 11, macOS Catalina, Big Sur, and Monterey, different versions of Linux such as Ubuntu, Fedora, and Debian, as well as Android, iOS, and Solaris.


In [24]:
response = query_engine.query("What is the last operating system on the list")
print(str(response))

The last operating system on the list is Solaris 11.4.


In [29]:
response = query_engine.query(
    "Given this list of operating system information for each one give the full os name including version details. Next check if the operating system is supported by crowdstrike security tool and add a comment yes if it is supported and no if it is not supported, or I don't know if you are not sure. explain your reasoning")
print(str(response))

Windows 10, version 10.0.19042
Windows 11, version 11.0.22000
macOS Catalina, version 10.15.7
macOS Big Sur, version 11.6
macOS Monterey, version 12.2.1
Ubuntu, version 20.04 LTS
Fedora, version 34
Debian, version 11
Android Pie, version 9
iOS, version 14.4
Solaris, version 11.4

Windows 10 and Windows 11 are supported by CrowdStrike security tool, so the comment is "yes" for both. macOS Catalina, macOS Big Sur, and macOS Monterey are also supported, so the comment is "yes" for them as well. Ubuntu, Fedora, and Debian are supported, so the comment is "yes" for these Linux distributions. Android Pie and iOS are also supported, so the comment is "yes" for them. However, Solaris is not supported by CrowdStrike security tool, so the comment is "no" for Solaris.


In [32]:
display(Markdown(f"<b>{response}</b>"))


#### LLMSingleSelector

Use OpenAI (or any other LLM) to parse generated JSON under the hood to select a sub-index for routing.

In [None]:
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)

In [None]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

The document provides a comprehensive account of the author's professional journey, covering his involvement in various projects such as Viaweb, Y Combinator, and Hacker News, as well as his transition to focusing on writing essays and working on Y Combinator. It also delves into his experiences with the Summer Founders Program, the growth and challenges of Y Combinator, personal struggles, and his return to working on Lisp. The author reflects on the challenges and successes encountered throughout his career, including funding startups, developing a new version of Arc, and the impact of Hacker News. Additionally, the document touches on the author's interactions with colleagues, his time in Italy, experiences with painting, and the completion of a new Lisp called Bel. Throughout, the author shares insights and lessons learned from his diverse experiences.


In [None]:
response = query_engine.query("What did Paul Graham do after RICS?")
print(str(response))

Paul Graham started painting after leaving Y Combinator. He wanted to see how good he could get if he really focused on it. After spending most of 2014 painting, he eventually ran out of steam and stopped working on it. He then started writing essays again and wrote a bunch of new ones over the next few months. In March 2015, he started working on Lisp again.


In [None]:
# [optional] look at selected results
print(str(response.metadata["selector_result"]))

selections=[SingleSelection(index=1, reason='The question is asking for specific context about what Paul Graham did after RICS, which would require retrieving specific information from his essay.')]


#### PydanticMultiSelector

If you expect queries to be routed to multiple indexes, use a multi selector. The multi selector sends the query to multiple sub-indexes and then aggregates all responses using a summary index to form a complete answer.
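
The fan-out and aggregation flow described above can be sketched in plain Python. All names here are hypothetical, the engines are stub functions, and `join_responses` is a trivial stand-in for the summarization step, so this shows the control flow only and is not the library's implementation.

```python
from typing import Callable

Engine = Callable[[str], str]


def summary_engine(query: str) -> str:
    return "summary answer"


def vector_engine(query: str) -> str:
    return "vector answer"


def keyword_engine(query: str) -> str:
    return "keyword answer"


def join_responses(responses: list[str]) -> str:
    # Stand-in for the summarization step that merges partial answers.
    return " | ".join(responses)


def route_multi(
    query: str,
    engines: list[Engine],
    selected: list[int],
    combine: Callable[[list[str]], str],
) -> str:
    """Send the query to every selected engine, then merge the responses."""
    responses = [engines[i](query) for i in selected]
    if len(responses) == 1:
        return responses[0]  # nothing to aggregate
    return combine(responses)


engines = [summary_engine, vector_engine, keyword_engine]
# Assume the selector picked the first and third tools.
combined = route_multi("example query", engines, [0, 2], join_responses)
print(combined)
# → summary answer | keyword answer
```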

In [None]:
from llama_index.core import SimpleKeywordTableIndex

keyword_index = SimpleKeywordTableIndex(nodes, storage_context=storage_context)

# Build the query engine from the keyword index so this tool routes to keyword retrieval.
keyword_query_engine = keyword_index.as_query_engine()

keyword_tool = QueryEngineTool.from_defaults(
    query_engine=keyword_query_engine,
    description=(
        "Useful for retrieving specific context using keywords from Paul"
        " Graham essay on What I Worked On."
    ),
)

In [None]:
query_engine = RouterQueryEngine(
    selector=PydanticMultiSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
        keyword_tool,
    ],
)

In [None]:
# This query could use either a keyword or vector query engine, so it will combine responses from both
response = query_engine.query(
    "What were notable events and people from the author's time at Interleaf"
    " and YC?"
)
print(str(response))

The author's time at Interleaf involved working on software for creating documents and learning valuable lessons about what not to do. Notable individuals associated with Y Combinator during the author's time there include Jessica Livingston, Robert Morris, and Sam Altman, who eventually became the second president of YC. The author's time at Y Combinator included notable events such as the creation of the Summer Founders Program, which attracted impressive individuals like Reddit, Justin Kan, Emmett Shear, Aaron Swartz, and Sam Altman.


In [None]:
# [optional] look at selected results
print(str(response.metadata["selector_result"]))

selections=[SingleSelection(index=0, reason='Summarization questions related to Paul Graham essay on What I Worked On.'), SingleSelection(index=2, reason='Retrieving specific context using keywords from Paul Graham essay on What I Worked On.')]
