# Router Engine

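Given a query, a router selects which query engine to run. This notebook builds a router over two query engines (summarization and vector retrieval) for a single document, a simple form of dynamic query understanding.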

## Setup

In [1]:
# this helper method is not required when running the code on lightning.ai
# it is only required when running the code locally
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

In [2]:
# this is required because Jupyter already runs an event loop; nest_asyncio patches asyncio to allow nested loops.
import nest_asyncio
nest_asyncio.apply()

## Load data

In [3]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["/teamspace/uploads/metagpt.pdf"]).load_data()

## Split document into nodes

In [4]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

## Define LLM and Embedding Model

In [5]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

## Define Indexes

- Summary Index 
- Vector Index 

Think of an index as a set of metadata over the data. Different indexes have different retrieval behavior. A vector index indexes nodes by their text embeddings, and querying it in LlamaIndex returns the nodes most similar to the query. A summary index is simpler: a query to it returns all of the nodes currently in the index.
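The difference between the two retrieval behaviors can be illustrated with a toy sketch. This is not LlamaIndex code: `embed`, `cosine`, `summary_retrieve`, and `vector_retrieve` are hypothetical names, and the bag-of-words "embedding" is only a stand-in for a real embedding model.

```python
import math
from collections import Counter

# Toy "nodes"; in the notebook these come from SentenceSplitter.
toy_nodes = [
    "MetaGPT assigns specific roles to agents",
    "Agents publish structured messages to a shared message pool",
    "The ablation study removes roles one at a time",
]

def embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def summary_retrieve(nodes, query):
    """Summary-index style: ignore the query and return every node."""
    return list(nodes)

def vector_retrieve(nodes, query, top_k=1):
    """Vector-index style: rank nodes by similarity to the query, keep the top_k."""
    q = embed(query)
    ranked = sorted(nodes, key=lambda n: cosine(embed(n), q), reverse=True)
    return ranked[:top_k]

all_hits = summary_retrieve(toy_nodes, "how do agents share messages")
best_hits = vector_retrieve(toy_nodes, "how do agents share messages", top_k=1)
print(len(all_hits))   # every node is returned
print(best_hits[0])    # only the closest node is returned
```

In this sketch the summary-style retriever hands all nodes to the synthesis step, which suits broad questions, while the vector-style retriever keeps only the closest matches, which suits narrow ones.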

In [6]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)

## Turn the indexes into query engines

Each query engine is an interface over the data stored in an index, combining retrieval with LLM synthesis. Each engine suits a certain kind of question, which is where the router becomes useful: it decides which engine should handle a given query.

In [7]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)

vector_query_engine = vector_index.as_query_engine()

## Create query tools

A query tool is simply a query engine with additional metadata. The metadata describes the kind of questions the tool can answer. For example, asking the router to "summarize" the article would select the summary_tool.

In [8]:
from llama_index.core.tools import QueryEngineTool

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

## Define Router

LlamaIndex provides several different selectors.

- The LLM selectors use the LLM to output JSON, which is parsed to decide which indexes are queried.
- The Pydantic selectors use the OpenAI Function Calling API to produce Pydantic selection objects rather than raw JSON.

In this example we use the LLMSingleSelector. The RouterQueryEngine takes a selector as well as a set of tools.
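To make the single-selector idea concrete, here is a minimal sketch of routing by parsed JSON. It does not reproduce LlamaIndex internals: `ToyTool`, `build_selector_prompt`, `fake_llm`, and `route` are hypothetical names, the prompt wording and 1-based choice numbering are assumptions, and the LLM is replaced by a keyword-matching stub so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ToyTool:
    """Hypothetical stand-in for a query tool: an engine plus a description."""
    description: str
    run: Callable[[str], str]

def build_selector_prompt(query: str, tools: List[ToyTool]) -> str:
    """Number the tool descriptions and ask for a JSON choice."""
    choices = "\n".join(f"({i + 1}) {t.description}" for i, t in enumerate(tools))
    return (
        f"Choices:\n{choices}\n"
        f"Question: {query}\n"
        'Answer with JSON like {"choice": 1, "reason": "..."}'
    )

def fake_llm(prompt: str) -> str:
    """Stub in place of a real LLM call: picks choice 1 for summary questions."""
    question = prompt.split("Question: ")[1].splitlines()[0].lower()
    choice = 1 if "summar" in question else 2
    return json.dumps({"choice": choice, "reason": "keyword match (stub)"})

def route(query: str, tools: List[ToyTool], llm: Callable[[str], str]) -> str:
    """Single-selector routing: ask the LLM, parse the JSON, run the chosen tool."""
    selection = json.loads(llm(build_selector_prompt(query, tools)))
    index = selection["choice"] - 1  # choices are 1-based in the prompt
    return tools[index].run(query)

tools = [
    ToyTool("Useful for summarization questions related to MetaGPT",
            lambda q: "summary engine"),
    ToyTool("Useful for retrieving specific context from the MetaGPT paper.",
            lambda q: "vector engine"),
]

print(route("What is the summary of the document?", tools, fake_llm))
print(route("How do agents share information with other agents?", tools, fake_llm))
```

The point of the sketch is that the selector sees only the tool descriptions, so the quality of routing depends on how clearly each description states what its tool is good for.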

In [17]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    # allows us to view the intermediate steps of the query engine
    verbose=True,
)

In [10]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

Selecting query engine 0: This choice indicates that the document is useful for summarization questions related to MetaGPT..
The document introduces MetaGPT, a meta-programming framework for multi-agent collaboration based on Large Language Models (LLMs). MetaGPT incorporates Standardized Operating Procedures (SOPs) to enhance the efficiency and accuracy of problem-solving processes by assigning specific roles to agents and streamlining workflows. It utilizes a communication protocol to enhance role communication efficiency and implements structured communication interfaces. MetaGPT achieves state-of-the-art performance in evaluations, demonstrating its robustness and efficiency in developing LLM-based multi-agent systems. Additionally, the document discusses the implementation of self-improvement mechanisms and multi-agent economies in software development using MetaGPT, exploring concepts like recursive self-improvement, self-referential mechanisms, dynamic collabo

In [11]:
print(len(response.source_nodes))

34


In [18]:
response = query_engine.query("How do agents share information with other agents?")
print(str(response))

Selecting query engine 1: Useful for retrieving specific context from the MetaGPT paper..
Agents share information with other agents by utilizing a shared message pool where they can publish structured messages and also subscribe to relevant messages based on their profiles. This shared message pool allows all agents to exchange messages directly, enabling them to access messages from other entities transparently. By storing information in this global message pool, agents can retrieve required information directly from the pool without the need to inquire about other agents and wait for their responses, thus enhancing communication efficiency.


## Let's put everything together

The code from the cells above has been collected into a helper function in utils. It builds a router query engine that supports both vector and summary search.

In [13]:
from utils import get_router_query_engine

query_engine = get_router_query_engine("/teamspace/uploads/metagpt.pdf")
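The contents of `utils` are not shown here. As a rough sketch, a helper of this kind could simply chain the earlier cells together. The function name `build_router_query_engine` is hypothetical (chosen to avoid implying this is the actual `utils` code), and the body assumes the same models, chunk size, and tool descriptions used above. Running it requires the `llama_index` packages and an OpenAI API key.

```python
def build_router_query_engine(file_path: str):
    """Hypothetical sketch: build a router over summary and vector engines for one file."""
    # Imports are local so that defining the function has no side effects.
    from llama_index.core import (
        Settings, SimpleDirectoryReader, SummaryIndex, VectorStoreIndex,
    )
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
    from llama_index.core.selectors import LLMSingleSelector
    from llama_index.core.tools import QueryEngineTool
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.llms.openai import OpenAI

    # Same global model settings as in the notebook cells above.
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

    # Load the document and split it into nodes.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    nodes = SentenceSplitter(chunk_size=1024).get_nodes_from_documents(documents)

    # One index per retrieval behavior.
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes)

    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_index.as_query_engine(
            response_mode="tree_summarize", use_async=True
        ),
        description="Useful for summarization questions related to MetaGPT",
    )
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(),
        description="Useful for retrieving specific context from the MetaGPT paper.",
    )

    # The selector chooses exactly one tool per query.
    return RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )
```

Wrapping the steps in one function keeps later notebooks short, at the cost of hiding the tool descriptions that drive routing. If routing looks wrong, those descriptions are the first thing to inspect.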

In [3]:
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))

Selecting query engine 1: The ablation study results are specific context from the MetaGPT paper, making choice 2 the most relevant..
The ablation study results indicate the effectiveness of MetaGPT in addressing challenges related to context efficiency, reducing hallucinations, and managing information overload in software development tasks. The study highlights how MetaGPT's unique designs successfully tackle issues such as eliminating ambiguity in natural language descriptions, maintaining information validity in lengthy contexts, reducing code hallucinations, and managing information overload through a global message pool and subscription mechanism.


In [14]:
response = query_engine.query("Does the paper provide specific coding example?")
print(str(response))

Selecting query engine 1: The paper provides specific context from the MetaGPT paper, which may include coding examples..
Yes, the paper provides specific coding examples in Table 8, which includes various software development tasks such as creating games, data processing programs, management programs, music transcriber, custom press releases, and interactive weather dashboard.
