----------------------
#### Router Engine
----------------------
- In LlamaIndex, a `RouterQueryEngine` dynamically routes each query to one of several underlying query engines or indexes, based on the content of the query.
- This approach is useful when working with multiple types of data or specialized indexes where specific queries are best handled by specific engines.

In [1]:
import openai

In [2]:
client = openai.OpenAI(
    # defaults to os.environ.get("OPENAI_API_KEY")
    # api_key=openai_api_key
)

#### asyncio

- Suppose we have an asynchronous function that:
    - awaits `asyncio.sleep` to simulate a delay, and
    - then prints a message.
- We’ll try running this function with and without `nest_asyncio` in an environment (like a Jupyter notebook) that already has an event loop running.

In [3]:
import asyncio

async def fetch_data():
    await asyncio.sleep(1)  # Simulates a delay (e.g., network call)
    print("Data fetched!")

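`Without nest_asyncio`
A top-level `await` works in a Jupyter notebook even without nest_asyncio, because IPython runs the cell on the notebook's already-running event loop. The error appears when code tries to start a second loop, for example with `asyncio.run(fetch_data())`, which raises a `RuntimeError` inside a running loop.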

In [4]:
await fetch_data()

Data fetched!


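The call succeeds because IPython supports top-level `await`; it does not show that nest_asyncio has already been applied. `nest_asyncio.apply()` is needed for libraries that call `asyncio.run()` or `loop.run_until_complete()` internally while the notebook's loop is running.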
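To see the failure that nest_asyncio works around, the sketch below calls `asyncio.run()` from inside a coroutine, which is the situation a library ends up in when it starts its own loop from a notebook cell. The `outer` function is a hypothetical helper written for this illustration, and `fetch_data` returns its message instead of printing it so the result is easy to inspect.

```python
import asyncio

async def fetch_data():
    await asyncio.sleep(0.01)  # Simulates a delay (e.g., network call)
    return "Data fetched!"

async def outer():
    # Inside this coroutine an event loop is already running,
    # just as it is inside a notebook cell.
    coro = fetch_data()
    try:
        asyncio.run(coro)  # tries to start a second event loop
    except RuntimeError as exc:
        coro.close()  # the coroutine never ran; close it to avoid a warning
        return f"RuntimeError: {exc}"
    return "no error"
```

In a plain script, run it with `print(asyncio.run(outer()))`; in a notebook, use `print(await outer())`. Either way the inner `asyncio.run` call is rejected unless `nest_asyncio.apply()` has been called first.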

In [5]:
import nest_asyncio

nest_asyncio.apply()

In [7]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [8]:
from llama_index.core.node_parser import SentenceSplitter

# Initialize the SentenceSplitter. chunk_size is measured in tokens (not characters):
# each chunk holds at most 1024 tokens, split on sentence boundaries where possible.
splitter = SentenceSplitter(chunk_size=1024)

# Split the documents into nodes
nodes = splitter.get_nodes_from_documents(documents)

len(nodes)

34
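The idea of sentence-aware chunking can be sketched in a few lines. This is a simplified illustration, not the `SentenceSplitter` implementation: it counts whitespace-separated words as a stand-in for tokens, ignores overlap between chunks, and the function name `split_into_chunks` is hypothetical.

```python
import re

def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most chunk_size words.

    Words stand in for tokens here. A single sentence longer than
    chunk_size is kept whole rather than cut mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > chunk_size:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

sample = "MetaGPT assigns roles. Agents follow SOPs. Messages are structured. Code is tested."
chunks = split_into_chunks(sample, chunk_size=6)
```

With a budget of 6 words, the four 3-word sentences are packed two per chunk, so `chunks` holds two entries.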

In [None]:
# Output the nodes
# for node in nodes:
#     print(node)

In [9]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

#### Define Summary Index and Vector Index over the Same Data

- Create two types of indexes, `SummaryIndex` and `VectorStoreIndex`, from the same collection of nodes.

`SummaryIndex`
- Purpose: SummaryIndex stores the nodes as a sequential list. It does not precompute a summary; at query time it passes every node to the response synthesizer, which makes it a good fit for questions about the document as a whole.

`VectorStoreIndex`
- Purpose: VectorStoreIndex computes an embedding for each node and, at query time, retrieves only the nodes most similar to the query. This suits questions about specific details.
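The practical difference between the two indexes is what they hand to the LLM at query time. The toy sketch below contrasts the two retrieval styles. It is an illustration under stated assumptions, not LlamaIndex code: `embed` is a bag-of-words stand-in for a real embedding model, and `list_retrieve` and `vector_retrieve` are hypothetical names.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector (assumption, not a real model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def list_retrieve(nodes: list[str], query: str) -> list[str]:
    # Summary-style retrieval: every node is returned, whatever the query.
    return list(nodes)

def vector_retrieve(nodes: list[str], query: str, top_k: int = 2) -> list[str]:
    # Vector-style retrieval: rank nodes by similarity and keep the top_k.
    q = embed(query)
    ranked = sorted(nodes, key=lambda n: cosine(embed(n), q), reverse=True)
    return ranked[:top_k]

toy_nodes = [
    "agents publish messages to a shared message pool",
    "the framework assigns roles such as architect and engineer",
    "executable feedback lets agents debug code at runtime",
]
query = "how do agents share messages"
```

Here `list_retrieve(toy_nodes, query)` returns all three nodes, while `vector_retrieve(toy_nodes, query, top_k=1)` returns only the node about the shared message pool.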

In [10]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index  = VectorStoreIndex(nodes)

#### Define Query Engines and Set Metadata

`Summary Query Engine`

- Purpose: This creates a query engine over the SummaryIndex. Because the index returns every node, the engine answers from the full document.
- Parameters:
    - `response_mode="tree_summarize"`: The engine summarizes the retrieved chunks in groups, then summarizes those summaries, repeating until a single answer remains.
    - `use_async=True`: The LLM calls for separate chunks run concurrently rather than one after another, which shortens the wait when there are many chunks.

- Usage: Use the summary_query_engine for a high-level overview of the document. Expect it to be slower and to use more tokens than a vector query, since it reads every node.
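The two parameters can be pictured together with a small sketch of hierarchical summarization. This is a simplified illustration, not the library's implementation: `fake_llm_summarize` is a stand-in for a real LLM call, and the hypothetical `fan_in` parameter replaces the real rule of grouping as many chunks as fit in the model's context window.

```python
import asyncio

async def fake_llm_summarize(texts: list[str]) -> str:
    # Stand-in for an LLM call (assumption): joins inputs instead of summarizing.
    await asyncio.sleep(0)
    return "(" + " + ".join(texts) + ")"

async def tree_summarize(chunks: list[str], fan_in: int = 2) -> str:
    """Summarize groups of fan_in chunks, then summarize the summaries."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = list(chunks)
    while len(level) > 1:
        groups = [level[i:i + fan_in] for i in range(0, len(level), fan_in)]
        # Calls within one level are independent, so they can run concurrently.
        level = list(await asyncio.gather(*(fake_llm_summarize(g) for g in groups)))
    return level[0]
```

In a plain script, `asyncio.run(tree_summarize(["a", "b", "c"]))` shows the tree shape in the nesting of the parentheses; in a notebook, use `await tree_summarize(["a", "b", "c"])` instead.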

In [11]:
summary_query_engine = summary_index.as_query_engine(
    response_mode= "tree_summarize",
    use_async    = True,
)

`Vector Index Engine`

- Purpose: This creates a query engine over the VectorStoreIndex that answers from the nodes most similar to the query.
- Parameters:
    - No additional parameters are provided, so the engine uses its defaults (including the default `similarity_top_k`, the number of nodes retrieved per query).

- Usage: Use the vector_query_engine for questions about specific facts or passages, where only a few relevant nodes are needed.

In [23]:
vector_query_engine = vector_index.as_query_engine()

Next, use `QueryEngineTool` from llama_index to wrap each query engine as a tool. Each tool pairs a query engine with a description that the router's selector reads when choosing where to send a query.

**SummaryTool**

- `Purpose`: This creates a QueryEngineTool that wraps the summary_query_engine and is configured to handle summarization-related queries.
- `Parameters`:
    - `query_engine = summary_query_engine`:
        - Specifies that the tool uses the summary_query_engine to answer queries.
        - The engine reads every node in the SummaryIndex.
    - `description`:
        - A brief description of the tool's purpose, here "Useful for summarization questions related to MetaGPT".
        - The router's selector reads this description to decide when to use the tool, so it should state clearly which questions the tool handles.
- `Usage`:
    - The summary_tool answers questions that require summarizing content from the indexed nodes, such as requests for an overview of the MetaGPT paper.

**VectorTool**

- `Purpose`: This creates a QueryEngineTool that wraps the vector_query_engine. It is designed to handle queries that require retrieving specific contextual information from the indexed data.
- `Parameters`:
    - `query_engine = vector_query_engine`:
        - Specifies that this tool uses the vector_query_engine for queries.
        - This engine is based on the VectorStoreIndex, so it retrieves the nodes most semantically similar to the query.
    - `description`:
        - Describes the tool's utility as "Useful for retrieving specific context from the MetaGPT paper."
        - This tells the selector that the tool is suitable for detailed searches and context retrieval.
- `Usage`: The vector_tool is ideal for finding specific information or context within the MetaGPT content.

In [13]:
from llama_index.core.tools import QueryEngineTool

summary_tool = QueryEngineTool.from_defaults(
    query_engine = summary_query_engine,
    description  = ("Useful for summarization questions related to MetaGPT"),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine = vector_query_engine,
    description  = ("Useful for retrieving specific context from the MetaGPT paper."),
)

`return_direct (bool, default = False)`

- An optional argument of `QueryEngineTool.from_defaults`. If set to True, a calling agent returns the tool's output directly as its final response instead of passing it back to the LLM for further processing.

`resolve_input_errors (bool, default = True)`

- If set to True, the tool tries to recover from malformed input (for example, unexpected keyword arguments) rather than raising an error.

#### Define Router Query Engine

- Create a `RouterQueryEngine` that combines multiple query engine tools (summary_tool and vector_tool) behind a single interface, so that each query is routed to the appropriate engine based on its content.

- `RouterQueryEngine`
    - Purpose: The RouterQueryEngine acts as a central query engine that directs each query to one of its sub-engines, as chosen by a selector. It's useful when you have multiple tools or engines for different types of queries and want the best one chosen automatically.
    - Usage: You send every query to the RouterQueryEngine, and it decides internally which tool (summary_tool or vector_tool) handles it.



In [14]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine

`LLMSingleSelector`

- Purpose: The LLMSingleSelector helps the RouterQueryEngine decide which query engine to use for a given query. It shows a language model (LLM) the query and the tool descriptions and asks it to pick exactly one tool.
- Usage: LLMSingleSelector.from_defaults() creates a selector that uses the default LLM (here, the one set in `Settings.llm`) to choose the most appropriate tool from the available options (summary_tool or vector_tool).

In [15]:
from llama_index.core.selectors import LLMSingleSelector

`Creating the Router Query Engine`

- Parameters:
    - selector=LLMSingleSelector.from_defaults(): Specifies that the LLMSingleSelector will be used to route queries to the appropriate tool.
    - query_engine_tools=[summary_tool, vector_tool]: Provides a list of query engine tools that the RouterQueryEngine can route queries to. In this case, it's the summary_tool and vector_tool.
    - verbose=True: Enables verbose mode, which means the engine will print additional information during query processing. This can be useful for debugging or understanding how queries are routed.

In [16]:
query_engine = RouterQueryEngine(
    selector           = LLMSingleSelector.from_defaults(),
    query_engine_tools = [
                            summary_tool,
                            vector_tool,
    ],
    verbose=True
)
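To make the routing step concrete, here is a toy sketch of the select-then-dispatch pattern. It is not how `RouterQueryEngine` is implemented: the `Tool` class, `keyword_selector`, and `route` are hypothetical names, and the selector counts shared words between the query and each description as a crude stand-in for asking an LLM.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    description: str
    run: Callable[[str], str]

def keyword_selector(query: str, tools: list[Tool]) -> int:
    # Stand-in for the LLM selector (assumption): pick the tool whose
    # description shares the most words with the query.
    q = set(query.lower().split())
    overlaps = [len(q & set(t.description.lower().split())) for t in tools]
    return overlaps.index(max(overlaps))

def route(query: str, tools: list[Tool]) -> str:
    # Select exactly one tool, then forward the query to it.
    choice = keyword_selector(query, tools)
    return tools[choice].run(query)

tools = [
    Tool("summarization questions about the whole paper", lambda q: "summary engine"),
    Tool("retrieving specific context and details", lambda q: "vector engine"),
]
```

Calling `route("give me a summarization of the paper", tools)` dispatches to the first tool, and `route("what specific details describe the message pool", tools)` to the second. As with the real selector, the quality of the routing depends on how well the descriptions distinguish the tools.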

In [17]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

Selecting query engine 0: The question asks for a summary of the document, which aligns with the purpose of choice 1, as it is useful for summarization questions..
The document introduces MetaGPT, a meta-programming framework designed for multi-agent collaboration utilizing large language models (LLMs). It addresses challenges in automated problem-solving by incorporating Standardized Operating Procedures (SOPs) to enhance the coherence and accuracy of task execution among agents. MetaGPT organizes agents into specialized roles, enabling efficient task decomposition and structured communication, which reduces errors and improves collaboration.

The framework employs a unique communication protocol that utilizes structured outputs instead of natural language dialogue, minimizing ambiguities. Additionally, it features an executable feedback mechanism that allows agents to debug and refine code during runtime, significantly improving code generation quality.

Experiment

In [18]:
print(len(response.source_nodes))

34

All 34 nodes appear as sources because the summary query engine reads every node in the index rather than a top-k subset.


In [19]:
response = query_engine.query("How do agents share information with other agents?")
print(str(response))

Selecting query engine 1: The question pertains to specific mechanisms of information sharing among agents, which would likely be detailed in the MetaGPT paper..
Agents share information with other agents through a shared message pool. They publish structured messages in this pool, allowing all agents to access and exchange information directly. This mechanism enhances communication efficiency by enabling agents to retrieve relevant information without needing to inquire about other agents. Additionally, agents can utilize a subscription mechanism to follow task-related information based on their role profiles, ensuring they receive only pertinent updates while avoiding information overload.


In [20]:
from utils import get_router_query_engine

query_engine = get_router_query_engine("metagpt.pdf")

In [22]:
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))

Selecting query engine 1: The question asks for specific results from an ablation study, which would require retrieving specific context from the MetaGPT paper..
The ablation study results show that MetaGPT effectively addresses challenges related to using large language models (LLMs) for software generation. By focusing on specific tasks like requirement analysis and package selection, MetaGPT guides the thinking process of LLMs, reducing issues such as code hallucinations, incomplete implementations, missing dependencies, and undiscovered bugs. Additionally, MetaGPT utilizes a global message pool and a subscription mechanism to tackle information overload, ensuring that relevant information is prioritized and efficiently communicated.


#### Summarization Prompts

These prompts aim to provide high-level summaries or specific insights about sections within the `MetaGPT` paper:

1. **Overview Summaries**
   - "Provide an executive summary of the MetaGPT paper."
   - "Summarize the main objectives and findings of the MetaGPT study."
   - "What are the primary contributions of MetaGPT according to the authors?"
   - "Summarize the introduction and motivation for creating MetaGPT."

2. **Technical Summaries**
   - "Summarize the architecture of MetaGPT, highlighting its core components."
   - "Provide a brief summary of the training methodologies used in MetaGPT."
   - "Summarize the different layers and attention mechanisms employed in MetaGPT."

3. **Experiment and Results Summaries**
   - "Summarize the experiments conducted in the MetaGPT study and their outcomes."
   - "What were the key findings and metrics used to evaluate MetaGPT's performance?"
   - "Summarize the results section, particularly focusing on how MetaGPT compares to other models."

4. **Use-Case and Application Summaries**
   - "Summarize the primary use cases for MetaGPT as discussed in the paper."
   - "Summarize the potential limitations of MetaGPT for practical applications."

5. **Future Directions and Conclusions**
   - "Summarize the conclusions drawn in the MetaGPT paper."

---

#### Question-Answering (QnA) Prompts

These prompts are crafted to extract specific answers or detailed explanations based on user queries about the `MetaGPT` paper:

1. **Basic Information**
   - "What is MetaGPT, and what problem does it aim to solve?"
   - "Who are the authors of the MetaGPT paper, and which institutions supported this research?"
   - "What are the main research questions addressed in MetaGPT?"

2. **Technical Details**
   - "How does MetaGPT differ architecturally from previous GPT models?"
   - "What training data was used to train MetaGPT, and how was it curated?"
   - "Can you describe the role of multi-head attention in MetaGPT's architecture?"
   - "What are the specific loss functions and optimization techniques used in MetaGPT?"

3. **Experiments and Evaluation**
   - "What benchmarks were used to evaluate MetaGPT's performance?"
   - "How does MetaGPT perform compared to other state-of-the-art models on the benchmarks?"
   - "What experimental setups were used to test MetaGPT, and what were the key variables?"
   - "How did MetaGPT handle language-specific tasks in the evaluation?"

4. **Practical Applications and Limitations**
   - "What are the primary applications for MetaGPT proposed in the paper?"
   - "What limitations do the authors note for the MetaGPT model in real-world scenarios?"
   - "How scalable is MetaGPT, according to the authors, and what are its resource requirements?"

5. **Future Directions**
   - "What future research directions are suggested for improving MetaGPT?"
   - "Are there any planned updates or variations of MetaGPT to address specific limitations?"
   - "How does MetaGPT's architecture support potential future improvements or modifications?"


In [27]:
# QnA
response = query_engine.query("What training data was used to train MetaGPT, and how was it curated?")
print(str(response))

Selecting query engine 1: The question asks for specific details about the training data and curation process, which would be found in the MetaGPT paper..
The training data used to train MetaGPT consisted of 70 diverse software development tasks. These tasks were carefully curated to cover a range of scenarios and challenges within the software development domain. The dataset included detailed prompts and names for each task, with a focus on ensuring a comprehensive representation of software development requirements.


In [31]:
# QnA
response = query_engine.query("Are there any planned updates or variations of MetaGPT to address specific limitations?")
print(str(response))

Selecting query engine 1: The question pertains to specific limitations and potential updates or variations of MetaGPT, which would require retrieving specific context from the MetaGPT paper..
There are no specific mentions of planned updates or variations of MetaGPT to address specific limitations in the provided context information.
