# Chapter 2: Router Query Engine

## 1. Introduction

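In this chapter we look at the simplest form of <strong>Agentic Retrieval-Augmented Generation (Agentic RAG)</strong>. Given a <strong>query</strong>, a <strong>router</strong> interprets the query and selects one of several <strong>query engines</strong> to execute it.

We will build a simple router that handles both <strong>question answering and summarization over a single document</strong>.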


<pre style="color: blue;">
                               +---------------------------------+
                               |                                 |
                               | Input: What is the summary of   |
                               | the MetaGPT document?           |
                               +---------------+-----------------+
                                               |
                        +----------------------+--------------------+
                        |                                           |
                        |            Router Query Engine            |
                        |                                           |
                        +-------+-----------------------------------+
                                |
            +-------------------+-------------------+
            |                                       |
    +-------+-------+                       +-------+-------+
    |               |                       |               |
    | Query Engine  |                       | Query Engine  |
    |     (QA)      |                       |(Summarization)|
    |               |                       |               |
    +-------+-------+                       +-------+-------+
            |                                       |
    +-------+-------+                       +-------+-------+
    |               |                       |               |
    |  Vector Index |                       | Summary Index |
    |               |                       |               |
    +---------------+                       +-------+-------+
                                                    |
                                        +-----------+-------------+
                                        |                         |
                                        | Output: Summary of      |
                                        | the MetaGPT document    |
                                        +-------------------------+
</pre>

In [1]:
from dotenv import load_dotenv
from dotenv import find_dotenv

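# Create a .env file in this directory containing the line OPENAI_API_KEY="your OpenAI API key".
# load_dotenv reads the .env file and exports OPENAI_API_KEY as an environment variable.
# The OpenAI models used later read that environment variable for authentication.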
_ = load_dotenv(find_dotenv())

Because `OPENAI_API_KEY` has now been loaded as an environment variable, you can inspect it as follows.

```python
import os
os.getenv("OPENAI_API_KEY")
```
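
Printing the key in full can leak it through saved notebook output. As a minimal sketch, the hypothetical helper `mask_secret` below (our own illustration, not part of dotenv or LlamaIndex) shows only the first few characters, which is enough to confirm that the variable is set.

```python
import os


def mask_secret(value, visible=4):
    """Return a display-safe version of a secret (hypothetical helper)."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)
    # Keep the first `visible` characters and hide the rest.
    return value[:visible] + "*" * (len(value) - visible)


print(mask_secret(os.getenv("OPENAI_API_KEY")))
```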
<br>

In [2]:
import nest_asyncio

nest_asyncio.apply()

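To load and process data efficiently, some LlamaIndex features use asyncio for asynchronous I/O. By design, however, asyncio does not support nested event loops: starting another event loop, or running a task synchronously, while a loop is already running raises an exception, typically `RuntimeError: This event loop is already running`. This is exactly the situation in an interactive environment such as Jupyter Notebook, where the kernel already runs an event loop, so asynchronous calls made from a code cell end up nested inside it.

To work around this, the nest_asyncio module patches asyncio so that functions such as `asyncio.run()` and `loop.run_until_complete()` can be called while an event loop is already running. With the patch applied, asynchronous tasks execute correctly in Jupyter Notebook even though a loop is already active.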
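
The restriction is easy to reproduce with the standard library alone. The sketch below calls `loop.run_until_complete()` from inside a coroutine that is already running on that loop, which is roughly what happens when library code tries to drive the loop from a notebook cell. The names `inner` and `outer` are made up for illustration, and nest_asyncio is deliberately not applied here.

```python
import asyncio


async def inner():
    return 42


async def outer():
    # We are already inside a running event loop at this point.
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        # Driving the same loop again is rejected by asyncio.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

Run as a plain script, this prints the error message rather than `no error`, which is the behavior that `nest_asyncio.apply()` is meant to relax.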

## 2. Loading the Data

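This chapter uses the document `metagpt.pdf`, which the code below expects at `data/metagpt.pdf`. If the file is not already there, you can download it with the following command.

```bash
!mkdir -p data && wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O data/metagpt.pdf
```

`wget` is a download command commonly available on Linux. On Windows or macOS you can use curl instead.

```bash
!curl -s --create-dirs -o data/metagpt.pdf "https://openreview.net/pdf?id=VtmBAGCN7o"
```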
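
If neither command is available, the same download can be done from Python itself. This is only a sketch using the standard library: the helper name `download` is our own, and the destination `data/metagpt.pdf` is the path that the loading cell below reads from. The actual call is left commented out so that the block performs no network access unless you enable it.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def download(url, dest):
    """Fetch `url` into `dest`, skipping the download if the file already exists."""
    target = Path(dest)
    target.parent.mkdir(parents=True, exist_ok=True)  # create data/ if needed
    if target.exists():
        return target
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


# Uncomment to fetch the paper:
# download("https://openreview.net/pdf?id=VtmBAGCN7o", "data/metagpt.pdf")
```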

### 2.1 Loading the Data as Documents

In [3]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(input_files=["data/metagpt.pdf"]).load_data()

In [4]:
type(documents)

list

In [5]:
len(documents)

29

<br>

The loaded data is a Python list (`list`) containing 29 elements.

Let's look at the type of its elements.
<br>

In [6]:
type(documents[0])

llama_index.core.schema.Document

<br>

Each element is LlamaIndex's own document type (`llama_index.core.schema.Document`).

<br>

In [7]:
documents[0].to_dict().keys()

dict_keys(['id_', 'embedding', 'metadata', 'excluded_embed_metadata_keys', 'excluded_llm_metadata_keys', 'relationships', 'text', 'start_char_idx', 'end_char_idx', 'text_template', 'metadata_template', 'metadata_seperator', 'class_name'])

<br>

Here we convert a LlamaIndex document to a dictionary (`dict`) and print its keys (`keys`), which shows what information each document carries.

<br>

In [8]:
from llama_index.core.utils import get_tokenizer
tokenizer = get_tokenizer()
print("------------------------------------")
print("Token count per document")
print("------------------------------------")
print([len(tokenizer(doc.text)) for doc in documents])
print()

------------------------------------
Token count per document
------------------------------------
[881, 667, 1088, 562, 337, 859, 1050, 528, 816, 1052, 1128, 1080, 993, 355, 904, 155, 591, 313, 419, 248, 238, 472, 540, 878, 867, 304, 53, 550, 813]



### 2.2 Splitting Documents into Nodes

In [9]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

In [10]:
len(nodes)

34

In [11]:
type(nodes[0])

llama_index.core.schema.TextNode

In [12]:
nodes[0].to_dict().keys()

dict_keys(['id_', 'embedding', 'metadata', 'excluded_embed_metadata_keys', 'excluded_llm_metadata_keys', 'relationships', 'text', 'start_char_idx', 'end_char_idx', 'text_template', 'metadata_template', 'metadata_seperator', 'class_name'])

<br>

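We further split the LlamaIndex documents (`documents`) into LlamaIndex nodes (`llama_index.core.schema.TextNode`). Each node contains at most 1024 tokens. After splitting, the 29 documents become 34 nodes.

A node also exposes the same set of keys as a document.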

<br>

In [13]:
print("------------------------------------")
print("Token count per node")
print("------------------------------------")
print([len(tokenizer(node.text)) for node in nodes])
print()
print("----------------------------------------------------------------")
print("The third document is split into two nodes: purple + green is the first node, green + blue is the second.")
print("In other words, the green part appears in both nodes, i.e. the two nodes overlap.")
print("----------------------------------------------------------------")
from src.tool import highlight_doc

highlight_doc(doc=documents[2].text,
              splited_node1=nodes[2].text,
              splited_node2=nodes[3].text)


------------------------------------
Token count per node
------------------------------------
[881, 667, 968, 304, 562, 337, 859, 988, 87, 528, 816, 761, 478, 955, 360, 966, 278, 993, 355, 904, 155, 591, 313, 419, 248, 238, 472, 540, 878, 867, 304, 53, 550, 813]

----------------------------------------------------------------
The third document is split into two nodes: purple + green is the first node, green + blue is the second.
In other words, the green part appears in both nodes, i.e. the two nodes overlap.
----------------------------------------------------------------
Preprint
•We introduce MetaGPT, a meta-programming framework for multi-agent collaboration based on
LLMs. It is highly convenient and flexible, with well-defined functions like role definition and
message sharing, making it a useful platform for developing LLM-based multi-agent systems.
•Our innovative integration of human-like SOPs throughout MetaGPT’s design significantly en-
hances its robustness, reducing unproductive collaboration among LLM-based agents. Furthermore,
we introduce a novel executive feedback mechanism that d
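
The size of each overlap can be estimated from the token counts printed above. The sketch below assumes that every document longer than the 1024-token chunk size was split into exactly two nodes, which is consistent with 29 documents becoming 34 nodes. The overlap is then roughly the sum of the two node lengths minus the document length; the figure is approximate because tokenization at the cut points may differ slightly.

```python
CHUNK_SIZE = 1024

# Token counts copied from the outputs above.
doc_tokens = [881, 667, 1088, 562, 337, 859, 1050, 528, 816, 1052, 1128, 1080,
              993, 355, 904, 155, 591, 313, 419, 248, 238, 472, 540, 878, 867,
              304, 53, 550, 813]
node_tokens = [881, 667, 968, 304, 562, 337, 859, 988, 87, 528, 816, 761, 478,
               955, 360, 966, 278, 993, 355, 904, 155, 591, 313, 419, 248, 238,
               472, 540, 878, 867, 304, 53, 550, 813]

overlaps = {}   # document index -> estimated overlap in tokens
consumed = 0    # number of nodes matched to documents so far
for i, n_doc in enumerate(doc_tokens):
    if n_doc <= CHUNK_SIZE:
        # Short documents map to a single node of the same length.
        assert node_tokens[consumed] == n_doc
        consumed += 1
    else:
        # Assumption: an oversized document yields exactly two nodes.
        first, second = node_tokens[consumed], node_tokens[consumed + 1]
        overlaps[i] = first + second - n_doc
        consumed += 2

print(overlaps)  # {2: 184, 6: 25, 9: 187, 10: 187, 11: 164}
```

For the third document (index 2) this gives 968 + 304 - 1088 = 184 shared tokens. All estimates stay below the splitter's `chunk_overlap` setting, which we believe defaults to 200 tokens for `SentenceSplitter`.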

## 3. Defining the Models

In [14]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

## 4. Building the Summary Index and the Vector Index

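We build the summary index and the vector index from the nodes (`nodes`) produced by the loading and splitting steps above. Building the vector index calls the embedding model (`embed_model`) to embed every node, whereas the summary index simply stores the nodes in sequence. Because the embedding model here is `OpenAIEmbedding`, the `OPENAI_API_KEY` environment variable loaded at the start of the chapter is used for authentication.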

In [15]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)

## 5. Defining the Query Engines

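We build a query engine on top of each index, then wrap each engine as a query engine tool by attaching metadata.

The metadata is a description of the tool. The router query engine uses these descriptions to pick the most suitable query engine for a given query.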


In [16]:
from llama_index.core.tools import QueryEngineTool

# Build the summary query engine on top of summary_index
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)

# Build the vector query engine on top of vector_index
vector_query_engine = vector_index.as_query_engine()


# Attach the metadata and build the summary query engine tool.
# The summary tool answers summarization questions about MetaGPT.
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

# Attach the metadata and build the vector query engine tool.
# The vector tool answers specific questions about MetaGPT.
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

## 6. Defining the Router Query Engine

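The router query engine selects one of several candidate query engines to execute a query. It takes two arguments:
- Selector (`selector`): chooses one option based on the query and the metadata of each candidate query engine.
- Candidate query engine tools (`query_engine_tools`): the list of candidate query engines.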
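
The routing pattern itself can be sketched without any LLM. In the toy version below, `Tool`, `keyword_selector` and `route` are hypothetical names of our own, and a keyword rule stands in for the LLM-based selector. It is not how LlamaIndex implements selection; it only illustrates the control flow, in which the selector returns the index of one tool and the router dispatches the query to it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tool:
    description: str            # metadata the selector reads
    run: Callable[[str], str]   # stand-in for a query engine


def keyword_selector(query: str, tools: List[Tool]) -> int:
    """Toy replacement for an LLM selector: returns the index of the chosen tool."""
    wants_summary = any(w in query.lower() for w in ("summary", "summarize", "overview"))
    for i, tool in enumerate(tools):
        is_summary_tool = "summarization" in tool.description.lower()
        if is_summary_tool == wants_summary:
            return i
    return 0  # fall back to the first tool


def route(query: str, tools: List[Tool], selector=keyword_selector) -> Tuple[int, str]:
    index = selector(query, tools)
    return index, tools[index].run(query)


tools = [
    Tool("Useful for summarization questions related to MetaGPT",
         lambda q: f"[summary engine] {q}"),
    Tool("Useful for retrieving specific context from the MetaGPT paper.",
         lambda q: f"[vector engine] {q}"),
]

print(route("What is the summary of the document?", tools))
print(route("How do agents share information with other agents?", tools))
```

With this rule the first query goes to tool 0 and the second to tool 1. An LLM-based selector makes the same kind of decision from the descriptions, but its choice is not guaranteed to match a fixed rule like this one.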

In [17]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)

In [18]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

Selecting query engine 0: Useful for summarization questions related to MetaGPT.
The document introduces MetaGPT, a meta-programming framework that utilizes Standardized Operating Procedures (SOPs) to enhance the problem-solving capabilities of multi-agent systems based on Large Language Models (LLMs). MetaGPT assigns specific roles to agents, streamlines workflows, and improves task decomposition. It models agents as a simulated software company, emphasizing role specialization, workflow management, and efficient sharing mechanisms. The framework incorporates an executable feedback mechanism to enhance code generation quality during runtime and has demonstrated state-of-the-art performance in various benchmarks. The document also discusses the development process for software projects using MetaGPT, highlighting the structured workflow involving different team members and the successful generation of functional applications. Additionally, it addresses the performanc

In [19]:
print(len(response.source_nodes))

34
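
The 34 source nodes match the 34 nodes built earlier: the summary query engine read every node, whereas a vector query engine only retrieves the nodes most similar to the query. The sketch below contrasts the two retrieval styles on made-up 2-dimensional embeddings. The functions `summary_retrieve` and `vector_retrieve`, and the value `top_k=2`, are illustrative assumptions rather than LlamaIndex code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def summary_retrieve(nodes):
    # Summary-style retrieval: hand every node to the response synthesizer.
    return list(nodes)


def vector_retrieve(query_vec, nodes, top_k=2):
    # Vector-style retrieval: keep only the top_k most similar nodes.
    ranked = sorted(nodes, key=lambda n: cosine(query_vec, n["embedding"]), reverse=True)
    return ranked[:top_k]


nodes = [
    {"id": "n0", "embedding": [1.0, 0.0]},
    {"id": "n1", "embedding": [0.8, 0.6]},
    {"id": "n2", "embedding": [0.0, 1.0]},
    {"id": "n3", "embedding": [-1.0, 0.0]},
]
query_vec = [0.9, 0.1]

print(len(summary_retrieve(nodes)))                        # 4
print([n["id"] for n in vector_retrieve(query_vec, nodes)])  # ['n0', 'n1']
```

This is why summarization questions suit the summary engine (it sees the whole document) while pointed questions suit the vector engine (it reads only the few relevant chunks, which is cheaper).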


In [20]:
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))

Selecting query engine 0: Useful for summarization questions related to MetaGPT.
Agents share information with other agents by utilizing shared message pools, subscribing to relevant messages based on their profiles, reviewing previous feedback to make necessary adjustments, generating and exchanging various documents and artifacts, and utilizing mechanisms such as message pools and subscriptions within the workflow management framework. These methods facilitate efficient communication and collaboration within the multi-agent system, ensuring that information is exchanged transparently and effectively among the agents.


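## 7. Putting It All Together
We wrap all of the steps above in a single helper function, `get_router_query_engine`, and use it to build the router query engine directly.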

In [21]:
from src.utils import get_router_query_engine

query_engine = get_router_query_engine("data/metagpt.pdf")

In [22]:
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))

Selecting query engine 1: Ablation study results are specific context from the MetaGPT paper, making choice 2 the most relevant..
The ablation study results demonstrate the effectiveness of MetaGPT in addressing challenges related to information overload and reducing hallucinations in software generation tasks. By utilizing a global message pool and a subscription mechanism, MetaGPT successfully manages excessive or irrelevant information, ensuring efficient communication and enhancing the relevance and utility of the information provided. This design approach is crucial in optimizing the performance of the system in handling complex software development tasks.


In [23]:
response = query_engine.query("请使用中文简要概括文档")
print(str(response))

Selecting query engine 0: The question is asking for a summary related to MetaGPT, which is mentioned in choice 1..
这些文档介绍了一个名为MetaGPT的元编程框架，旨在增强基于大型语言模型的多智能体系统的问题解决能力。MetaGPT模拟软件公司代理，利用SOPs、角色专业化和信息共享机制，提高代码生成质量并取得先进性能。研究还涉及软件开发流程、人工智能模型评估以及多智能体协作等主题。


In [24]:
response = query_engine.query("请用告诉我关于ablation的研究结果?")
print(str(response))

Selecting query engine 1: This choice is more relevant as it mentions retrieving specific context, which would be necessary to understand the research results on ablation..
Ablation studies have shown that removing certain components or features from the system can have a significant impact on the overall performance. By systematically disabling or removing specific elements, researchers can evaluate the contribution and importance of each part to the system's functionality and effectiveness.
