# Lesson 1: Router Engine

Welcome to Lesson 1.

To access the `requirements.txt` file, the PDF data file required for this lesson, and the `helper` and `utils` modules, go to the `File` menu and select `Open...`.

I hope you enjoy this course!

## Setup

In [1]:
from helper import get_openai_api_key

OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio

nest_asyncio.apply()
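
A notebook kernel already runs an asyncio event loop, and plain asyncio refuses to start a second loop from inside a running one. `nest_asyncio.apply()` patches asyncio to allow that nesting, which is presumably why it is applied here before the query engine is built with `use_async=True`. The standard-library sketch below shows the error that occurs without the patch; the `inner` and `outer` names are illustrative and not part of the lesson.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    # This coroutine runs inside an event loop, like code in a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # unpatched asyncio refuses to start a nested loop
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```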

## Load Data

The command below downloads the paper (uncomment it to run):

#!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

**Note**: The PDF file is included with this lesson. To access it, go to the `File` menu and select `Open...`.

In [3]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [4]:
documents

[Document(id_='d52a01d2-5d58-4329-914e-60a5ef438840', embedding=None, metadata={'page_label': '1', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-02-25', 'last_modified_date': '2025-02-25'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text_resource=MediaResource(embeddings=None, data=None, text='Preprint\nMETAGPT: M ETA PROGRAMMING FOR A\nMULTI -AGENT COLLABORATIVE FRAMEWORK\nSirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,\nCeyao Zhang4, Jinlin Wang1, Zili Wang, Steven Ka Shing Yau5, Zijuan Lin4,\nLiyang Zhou6, Chenyu Ran1, Lingfeng Xiao1,7, Chenglin Wu1†, J¨urgen Schmidhuber2,8\n1D

In [5]:
documents[0].text

'Preprint\nMETAGPT: M ETA PROGRAMMING FOR A\nMULTI -AGENT COLLABORATIVE FRAMEWORK\nSirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,\nCeyao Zhang4, Jinlin Wang1, Zili Wang, Steven Ka Shing Yau5, Zijuan Lin4,\nLiyang Zhou6, Chenyu Ran1, Lingfeng Xiao1,7, Chenglin Wu1†, J¨urgen Schmidhuber2,8\n1DeepWisdom, 2AI Initiative, King Abdullah University of Science and Technology,\n3Xiamen University, 4The Chinese University of Hong Kong, Shenzhen,\n5Nanjing University, 6University of Pennsylvania,\n7University of California, Berkeley, 8The Swiss AI Lab IDSIA/USI/SUPSI\nABSTRACT\nRemarkable progress has been made on automated problem solving through so-\ncieties of agents based on large language models (LLMs). Existing LLM-based\nmulti-agent systems can already solve simple dialogue tasks. Solutions to more\ncomplex tasks, however, are complicated through logic inconsistencies due to\ncascading hallucinations caused by naively chaining LLMs. Here we introduce\nMetaGPT,

## Split into Nodes and Define LLM and Embedding Model

In [6]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
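
To make the chunking step concrete, here is a minimal standard-library sketch of sentence-aware splitting. It is not the `SentenceSplitter` implementation: it assumes whitespace-separated words as a stand-in for tokens and leaves out chunk overlap, and the function name `split_into_chunks` is hypothetical.

```python
import re
from typing import List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Greedily pack whole sentences into chunks of roughly max_words words.

    A single sentence longer than max_words is kept whole in its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Agents publish messages. Agents subscribe to messages. The pool is shared."
chunks = split_into_chunks(sample, max_words=8)
for chunk in chunks:
    print(chunk)
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each chunk readable on its own, which matters when chunks are later embedded or summarized independently.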

In [7]:
nodes

[TextNode(id_='ed5d1f28-ee7c-4ac0-b517-90abf24a3faf', embedding=None, metadata={'page_label': '1', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-02-25', 'last_modified_date': '2025-02-25'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='d52a01d2-5d58-4329-914e-60a5ef438840', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '1', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-02-25', 'last_modified_date': '2025-02-25'}, hash='f4cbab9cbec17be6078f188ad2c44f92ff9f75e1bfb7ca7f4c34c19f7342d2fb')}, metadata_template='{key}: {val

In [8]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

## Define Summary Index and Vector Index over the Same Data

In [9]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
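
The two indexes differ mainly in what they retrieve at query time: a summary index passes every node on, while a vector index keeps only the nodes most similar to the query. The toy sketch below illustrates that contrast using the standard library only. It assumes a bag-of-words cosine similarity as a crude stand-in for real embeddings, and all names in it (`summary_retrieve`, `vector_retrieve`, `toy_nodes`) are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bow(text: str) -> Counter:
    """Bag-of-words vector; a crude stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summary_retrieve(nodes: List[str], query: str) -> List[str]:
    # Summary-style retrieval: every node is returned, regardless of the query.
    return list(nodes)


def vector_retrieve(nodes: List[str], query: str, top_k: int = 2) -> List[str]:
    # Vector-style retrieval: rank nodes by similarity to the query, keep top_k.
    q = bow(query)
    ranked = sorted(nodes, key=lambda n: cosine(bow(n), q), reverse=True)
    return ranked[:top_k]


toy_nodes = [
    "agents share messages through a shared message pool",
    "the ablation study removes roles one at a time",
    "the drawing app uses tkinter and pillow",
]
all_hits = summary_retrieve(toy_nodes, "how do agents share messages")
top_hits = vector_retrieve(toy_nodes, "how do agents share messages", top_k=1)
print(len(all_hits), top_hits)
```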

## Define Query Engines and Set Metadata

In [10]:
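summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",  # build the answer by recursively summarizing chunks
    use_async=True,  # issue the underlying LLM calls concurrently
)
# the vector engine keeps its defaults and retrieves only the most similar nodes
vector_query_engine = vector_index.as_query_engine()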

In [11]:
from llama_index.core.tools import QueryEngineTool

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

## Define Router Query Engine

In [12]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
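
Conceptually, the router reads each tool's description, picks one tool for the incoming query, and forwards the query to it. The sketch below mimics that flow without any LLM: it swaps in a simple word-overlap score for the selector, so it is an illustration of the routing idea rather than of how `LLMSingleSelector` decides. `ToyTool`, `select_tool`, and `route` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyTool:
    description: str
    run: Callable[[str], str]


def select_tool(tools: List[ToyTool], query: str) -> int:
    """Return the index of the tool whose description shares most words with the query."""
    query_words = set(query.lower().split())
    overlaps = [
        len(query_words & set(tool.description.lower().split())) for tool in tools
    ]
    return overlaps.index(max(overlaps))  # ties resolve to the first tool


def route(tools: List[ToyTool], query: str) -> Tuple[int, str]:
    # Pick exactly one tool, then delegate the query to it.
    index = select_tool(tools, query)
    return index, tools[index].run(query)


toy_tools = [
    ToyTool("summarize the whole document", lambda q: "summary engine answered"),
    ToyTool("retrieve specific facts about agents", lambda q: "vector engine answered"),
]

print(route(toy_tools, "summarize the document please"))
print(route(toy_tools, "what facts about agents matter"))
```

Because selection depends entirely on the descriptions, writing them precisely is the main lever for steering which engine handles a query.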

In [13]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

Selecting query engine 0: Useful for summarization questions related to MetaGPT.
The document introduces MetaGPT, a meta-programming framework that utilizes Standardized Operating Procedures (SOPs) to enhance multi-agent systems based on Large Language Models (LLMs). MetaGPT incorporates role specialization, efficient communication protocols, and an executable feedback mechanism to improve code generation quality. Through extensive experiments and comparisons with existing frameworks, MetaGPT demonstrates state-of-the-art performance in various benchmarks. It also provides detailed information about the development process of a software application called the "Drawing App" using MetaGPT, covering aspects such as requirements gathering, UI design, system architecture, implementation approach, testing, and task allocation. The use of Python libraries like Tkinter and Pillow, along with unit testing, is highlighted. The document discusses the performance of MetaGPT in g

In [14]:
print(len(response.source_nodes))

34


In [15]:
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))

Selecting query engine 1: This choice is more relevant as it specifically mentions retrieving specific context from the MetaGPT paper, which would likely include information on how agents share information with other agents..
Agents share information with other agents by utilizing a shared message pool where they can publish structured messages and subscribe to relevant messages based on their profiles. This shared message pool allows all agents to exchange messages directly, enabling them to access messages from other entities transparently. By storing information in this global message pool, agents can retrieve required information without the need to inquire about other agents and await their responses, thus enhancing communication efficiency.


## Let's put everything together

In [16]:
from utils import get_router_query_engine

query_engine = get_router_query_engine("metagpt.pdf")
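
The source of `get_router_query_engine` is not shown in this lesson. As a rough guide, the sketch below reassembles the cells above into a single function. It is an assumption about what such a helper might look like, not the actual `utils` code, and the name `build_router_query_engine` is hypothetical. It relies on `Settings.llm` and `Settings.embed_model` having been configured as in the earlier cell.

```python
def build_router_query_engine(file_path: str):
    """Hypothetical helper: load a PDF and return a router over two query engines."""
    # Imports live inside the function so that defining it does not require
    # llama_index to be installed; calling it does.
    from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
    from llama_index.core.selectors import LLMSingleSelector
    from llama_index.core.tools import QueryEngineTool

    # Load the document and split it into nodes, as in the cells above.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    nodes = SentenceSplitter(chunk_size=1024).get_nodes_from_documents(documents)

    # Build both indexes over the same nodes.
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes)

    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_index.as_query_engine(
            response_mode="tree_summarize",
            use_async=True,
        ),
        description="Useful for summarization questions related to MetaGPT",
    )
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(),
        description="Useful for retrieving specific context from the MetaGPT paper.",
    )

    # The selector chooses one of the two tools for each query.
    return RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )
```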

In [17]:
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))

Selecting query engine 1: Ablation study results are specific context from the MetaGPT paper, making choice 2 the most relevant..
The ablation study results show that MetaGPT addresses challenges related to context utilization, code hallucinations, and information overload effectively. It focuses on unfolding natural language descriptions accurately, maintaining information validity in lengthy contexts, and reducing hallucinations in software generation. Additionally, MetaGPT uses a global message pool and a subscription mechanism to tackle information overload by streamlining communication and filtering out irrelevant contexts.
