# Lesson 1: Router Engine

## Setup

In [1]:
from customlib.utils.helper import get_openai_api_key, print_response

OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio

# Jupyter already runs an event loop; patch asyncio so nested async calls work.
nest_asyncio.apply()

## Load Data

In [16]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["../resources/metagpt.pdf"]).load_data()

## Define LLM and Embedding model

In [4]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

In [5]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

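# Stub subclass: overrides _prepare_chat_with_tools with a no-op, apparently so the
# class can be instantiated in this environment. Tool calling is not used in this lesson.
class ConcreteOpenAI(OpenAI):
    def _prepare_chat_with_tools(self):
        pass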

Settings.llm = ConcreteOpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

## Define Summary Index and Vector Index over the Same Data

In [6]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)

## Define Query Engines and Set Metadata

In [7]:
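# Summary engine: "tree_summarize" combines the chunks hierarchically into one answer;
# use_async=True lets the underlying LLM calls run concurrently.
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
# Vector engine: default settings, answering from the most similar chunks only.
vector_query_engine = vector_index.as_query_engine()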

In [8]:
from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

## Define Router Query Engine

In [9]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
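
Conceptually, a single-selector router compares the query against each tool's description and forwards the query to exactly one engine. The sketch below illustrates only that control flow, with no LLM call: plain word overlap stands in for the LLM's judgment, and `ToyTool`, `select_tool`, and `route` are hypothetical names, not LlamaIndex APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyTool:
    """Hypothetical stand-in for a QueryEngineTool: a description plus a callable."""
    description: str
    run: Callable[[str], str]


def _words(text: str) -> set:
    # Lowercase and strip basic punctuation so "paper." matches "paper".
    return {w.strip(".,?!").lower() for w in text.split()} - {""}


def select_tool(query: str, tools: List[ToyTool]) -> int:
    """Return the index of the tool whose description shares the most words with the query.

    Assumption: word overlap replaces the prompt that an LLM-based selector would issue.
    Ties go to the lowest index.
    """
    scores = [len(_words(query) & _words(t.description)) for t in tools]
    return scores.index(max(scores))


def route(query: str, tools: List[ToyTool]) -> str:
    """Pick exactly one tool and forward the query to it."""
    return tools[select_tool(query, tools)].run(query)


tools = [
    ToyTool("Useful for summarization questions related to MetaGPT",
            lambda q: "summary engine"),
    ToyTool("Useful for retrieving specific context from the MetaGPT paper.",
            lambda q: "vector engine"),
]

print(route("Give me a summarization of MetaGPT", tools))
print(route("Retrieve the specific context about agents from the paper", tools))
```

In the real router the choice depends on how well the descriptions distinguish the tools, which is why each `QueryEngineTool` above carries a description of what it is good for.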

In [10]:
response = query_engine.query("What is the summary of the document?")
print_response(str(response))

Selecting query engine 0: Useful for summarization questions related to MetaGPT.

The document introduces MetaGPT, a meta-programming framework for LLM-based multi-agent systems that enhances collaboration efficiency through human-like Standardized Operating Procedures (SOPs). MetaGPT assigns specific roles to agents, streamlines workflows, and improves task decomposition. By utilizing structured outputs and a communication protocol, MetaGPT ensures consistent and accurate problem-solving processes. The framework achieves state-of-the-art performance in code generation benchmarks and offers a robust platform for developing LLM-based multi-agent systems. Additionally, the document discusses the software development process using MetaGPT, emphasizing the importance of user input commands, collaboration with a professional team, creating a Product Requirement Document, system design, task breakdown, code generation, and unit testing. The document also addresses the performance of GPT models in the HumanEval benchmark, highlighting sensitivity to prompts, limitations, ethical concerns, and how MetaGPT addresses challenges in software generation.

In [11]:
print(len(response.source_nodes))

34
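
A count this large is consistent with the summary engine having been selected: a summary index hands every node to the response synthesizer, whereas a vector index returns only the few nodes most similar to the query. The toy sketch below contrasts the two retrieval strategies on made-up two-dimensional vectors. The function names, the vectors, and `k=2` are illustrative assumptions, not values read from the notebook.

```python
import math
from typing import List, Sequence, Tuple

Node = Tuple[str, Sequence[float]]  # (node id, toy embedding)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_all(nodes: List[Node]) -> List[Node]:
    """Summary-style retrieval: every node is passed on, regardless of the query."""
    return list(nodes)


def retrieve_top_k(query_vec: Sequence[float], nodes: List[Node], k: int = 2) -> List[Node]:
    """Vector-style retrieval: keep only the k nodes most similar to the query."""
    ranked = sorted(nodes, key=lambda n: cosine(query_vec, n[1]), reverse=True)
    return ranked[:k]


# 34 toy nodes; similarity to the query [1, 0] decreases as i grows.
nodes = [(f"chunk-{i}", [1.0, float(i)]) for i in range(34)]
query_vec = [1.0, 0.0]

print(len(retrieve_all(nodes)))
print([name for name, _ in retrieve_top_k(query_vec, nodes)])
```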


In [12]:
response = query_engine.query(
    "How do agents share information with other agents?"
)
print_response(str(response))

Selecting query engine 1: This choice is more relevant as it specifically mentions retrieving specific context from the MetaGPT paper, which would likely include information on how agents share information with other agents..

Agents share information with other agents by utilizing a shared message pool where they can publish structured messages. Additionally, agents can subscribe to relevant messages based on their profiles. This approach allows all agents to exchange messages directly and access information from other entities transparently, enhancing communication efficiency.

## Let's put everything together

In [13]:
from customlib.utils.utils import get_router_query_engine

query_engine = get_router_query_engine("resources/metagpt.pdf")
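
The source of `get_router_query_engine` is not shown in this notebook. A plausible reconstruction, assuming it simply chains the steps from the earlier cells, might look like the sketch below. The name `build_router_query_engine` and its optional `llm` and `embed_model` parameters are hypothetical. Every LlamaIndex call is copied from the cells above.

```python
def build_router_query_engine(file_path: str, llm=None, embed_model=None):
    """Hypothetical reconstruction of the helper: load, split, index, wrap, route."""
    # Imports are local so that defining the function does not require llama-index.
    from llama_index.core import (
        Settings,
        SimpleDirectoryReader,
        SummaryIndex,
        VectorStoreIndex,
    )
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
    from llama_index.core.selectors import LLMSingleSelector
    from llama_index.core.tools import QueryEngineTool

    # Assumption: callers may override the global models; otherwise Settings is left as is.
    if llm is not None:
        Settings.llm = llm
    if embed_model is not None:
        Settings.embed_model = embed_model

    # Load the file and split it into chunks, as in the "Load Data" cells.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    nodes = SentenceSplitter(chunk_size=1024).get_nodes_from_documents(documents)

    # Build both indexes over the same nodes.
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes)

    # Wrap each query engine in a tool whose description guides the selector.
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_index.as_query_engine(
            response_mode="tree_summarize",
            use_async=True,
        ),
        description="Useful for summarization questions related to MetaGPT",
    )
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(),
        description="Useful for retrieving specific context from the MetaGPT paper.",
    )

    return RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )
```

Calling it requires llama-index to be installed and an OpenAI API key to be configured, just like the cells above.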

In [14]:
response = query_engine.query("Tell me about the ablation study results?")
print_response(str(response))

Selecting query engine 1: Ablation study results are specific context from the MetaGPT paper, making choice 2 the most relevant..

The ablation study results show that MetaGPT effectively addresses challenges related to context utilization, code hallucinations, and information overload in software development. By focusing on unfolding natural language descriptions accurately, maintaining information validity, and utilizing a global message pool with a subscription mechanism, MetaGPT demonstrates improvements in eliminating ambiguity, reducing code hallucinations, and managing information overload efficiently.