# Lesson 4: Building a Multi-Document Agent

## Setup

In [None]:
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

Setting the key directly, as below, is faster.

In [1]:
import os
import openai

os.environ["OPENAI_API_KEY"] = " "  # replace the blank with your own key
openai.api_key = os.environ["OPENAI_API_KEY"]
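Hardcoding a key in a notebook cell makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has already been exported in the shell as `OPENAI_API_KEY`; the helper name `load_openai_api_key` is hypothetical and not part of the lesson's `helper` module:

```python
import os
from collections.abc import Mapping


def load_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; `env` is injectable so the function
    can be exercised without touching the real environment.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting the notebook."
        )
    return key
```

Usage would then be `openai.api_key = load_openai_api_key()`, which fails early instead of sending a blank key to the API.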

In [2]:
import nest_asyncio
nest_asyncio.apply()

## 1. Setup an agent over 3 papers

**Note**: The pdf files are included with this lesson. To access these papers, go to the `File` menu and select `Open...`.

In [3]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

In [5]:
from utils3 import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: selfrag.pdf


In [6]:
initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [7]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")

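Why 6? The author wants to showcase LlamaIndex's autonomous planning ability: a summary tool and a vector search tool are created for each of the 3 documents, which gives 6 tools in total.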

In [8]:
len(initial_tools)

6
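The count of 6 is just 3 papers times 2 tools, flattened in paper order. A small sketch of that bookkeeping, using a stand-in `fake_get_doc_tools` (hypothetical, returning name strings instead of real tool objects) so it runs without LlamaIndex or an API key:

```python
from pathlib import Path

papers = ["metagpt.pdf", "longlora.pdf", "selfrag.pdf"]


def fake_get_doc_tools(file_path: str, name: str) -> tuple[str, str]:
    # Stand-in for utils3.get_doc_tools: returns placeholder names,
    # not real FunctionTool / QueryEngineTool objects.
    return f"vector_tool_{name}", f"summary_tool_{name}"


# One [vector_tool, summary_tool] pair per paper, keyed by file name.
paper_to_tools = {p: list(fake_get_doc_tools(p, Path(p).stem)) for p in papers}

# Flatten in paper order, exactly like the list comprehension in the notebook.
flat_tools = [t for p in papers for t in paper_to_tools[p]]

print(len(flat_tools))
print(flat_tools)
```

The flattened order alternates vector tool and summary tool per paper, which matches the alternating `FunctionTool` / `QueryEngineTool` objects printed by the notebook.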

In [13]:
type(initial_tools)

list

In [15]:
for tool in initial_tools:
    print(tool)

<llama_index.core.tools.function_tool.FunctionTool object at 0x156d762b0>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x156dedc10>
<llama_index.core.tools.function_tool.FunctionTool object at 0x156ded310>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x157b93f10>
<llama_index.core.tools.function_tool.FunctionTool object at 0x156d3f850>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x156df0910>


In [16]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools, 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

RAG over a single document

In [17]:
response = agent.query(
    "Tell me about the evaluation dataset used in LongLoRA, "
    "and then tell me about the evaluation results"
)

Added user message to memory: Tell me about the evaluation dataset used in LongLoRA, and then tell me about the evaluation results
=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation dataset"}
=== Function Output ===
PG19 test split
=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation results"}
=== Function Output ===
The evaluation results include reporting perplexity for models and baselines on proof-pile (Azerbayev et al., 2022) and PG19 datasets. The models achieve better perplexity with longer context sizes, indicating the effectiveness of the fine-tuning method. The perplexity decreases as the context size increases, with improvements observed when increasing the context window size. Additionally, the maximum context length that can be fine-tuned on a single 8 × A100 machine is examined, with Llama2 models extended to different context lengths showing promising results.
=== LLM Response ===
T

RAG over multiple documents

In [18]:
response = agent.query("Give me a summary of both Self-RAG and LongLoRA")
print(str(response))

Added user message to memory: Give me a summary of both Self-RAG and LongLoRA
=== Calling Function ===
Calling function: summary_tool_selfrag with args: {"input": "Self-RAG"}
=== Function Output ===
Self-RAG is a framework that enhances the quality and factuality of a large language model through retrieval and self-reflection. It trains a single arbitrary language model to adaptively retrieve passages on-demand, generate and reflect on retrieved passages and its own generations using special tokens called reflection tokens. This framework enables the language model to tailor its behavior to diverse task requirements during the inference phase, leading to significant performance improvements on various tasks compared to state-of-the-art language models and retrieval-augmented models.
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "LongLoRA"}
=== Function Output ===
LongLoRA is an efficient method used for extending the context length of Large Langu
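The verbose logs above show the pattern the agent follows: the LLM names a tool and supplies JSON arguments, and the framework looks the tool up by name and calls it. A toy sketch of that dispatch step only, with stand-in tools that return canned strings; the function `dispatch` and the canned outputs are hypothetical and do not reproduce LlamaIndex internals:

```python
import json
from typing import Callable


def dispatch(tools: dict[str, Callable[..., str]], name: str, raw_args: str) -> str:
    """Look up a tool by name and call it with JSON-encoded keyword arguments."""
    if name not in tools:
        raise KeyError(f"Unknown tool: {name}")
    return tools[name](**json.loads(raw_args))


def summary_tool_selfrag(input: str) -> str:
    # Stand-in: a real summary tool would query a summary index.
    return f"[summary of selfrag for: {input}]"


def summary_tool_longlora(input: str) -> str:
    # Stand-in: a real summary tool would query a summary index.
    return f"[summary of longlora for: {input}]"


tools = {
    "summary_tool_selfrag": summary_tool_selfrag,
    "summary_tool_longlora": summary_tool_longlora,
}

# The two calls recorded in the log for the Self-RAG / LongLoRA query.
calls = [
    ("summary_tool_selfrag", '{"input": "Self-RAG"}'),
    ("summary_tool_longlora", '{"input": "LongLoRA"}'),
]
outputs = [dispatch(tools, name, args) for name, args in calls]
print(outputs)
```

Because lookup is by name, each tool name has to be unique and acceptable to the LLM provider, which becomes relevant in the next experiment.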

### Trying it with Chinese documents

In [54]:
papers_cn = [
    "ChatGPT_0_to_1.pdf",
    "Data_analysis_must_know.pdf",
    "Elon_Musk's_biography.pdf",
]

In [55]:
from utils3 import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers_cn:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: ChatGPT_0_to_1.pdf
Getting tools for paper: Data_analysis_must_know.pdf
Getting tools for paper: Elon_Musk's_biography.pdf


In [61]:
initial_tools = [t for paper in papers_cn for t in paper_to_tools_dict[paper]]

In [62]:
for tool in initial_tools:
    print(tool)

<llama_index.core.tools.function_tool.FunctionTool object at 0x16975a610>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x167bcae50>
<llama_index.core.tools.function_tool.FunctionTool object at 0x290630b50>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x290e66a90>
<llama_index.core.tools.function_tool.FunctionTool object at 0x29c506640>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x2a51e09d0>


In [63]:
from llama_index.llms.openai import OpenAI

llm_cn = OpenAI(model="gpt-3.5-turbo")

In [64]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools, 
    llm=llm_cn, 
    verbose=True
)
agent = AgentRunner(agent_worker)

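At first glance it looks as if Chinese is not supported. The error below points elsewhere, though: `tools[4].function.name` does not match the pattern `^[a-zA-Z0-9_-]+$`. With two tools per paper, `tools[4]` is the first tool of the third file, `Elon_Musk's_biography.pdf`, and the tool name derived from that file name contains an apostrophe. This is most likely why the English query further down fails with the same error.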

In [59]:
response = agent.query(
    "给我总结下马斯克的人生信条, "
    "并且再给我简单描述马斯克收购twitter的事件"
)

Added user message to memory: 给我总结下马斯克的人生信条, 并且再给我简单描述马斯克收购twitter的事件


Retrying llama_index.llms.openai.base.OpenAI._chat in 0.295800935877943 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}.
Retrying llama_index.llms.openai.base.OpenAI._chat in 0.7058150481631946 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}.
Retrying llama_index.llms.openai.base.OpenAI._chat in 1.5867561643343606 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches

BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}

In [65]:
response = agent.query(
    "Summarize Musk's life creed for me,"
    "And then provide me with a brief description of Musk's acquisition of Twitter"
)

Added user message to memory: Summarize Musk's life creed for me,And then provide me with a brief description of Musk's acquisition of Twitter


Retrying llama_index.llms.openai.base.OpenAI._chat in 0.9207053610321947 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}.
Retrying llama_index.llms.openai.base.OpenAI._chat in 1.7135073996153745 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}.
Retrying llama_index.llms.openai.base.OpenAI._chat in 3.631525834718196 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches

BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[4].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[4].function.name', 'code': 'invalid_value'}}
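Both failures report the same thing: a tool name that does not match `^[a-zA-Z0-9_-]+$`. Since the tool names are built from `Path(paper).stem`, one possible workaround is to sanitize the stem before passing it to `get_doc_tools`. A minimal sketch, where `sanitize_tool_name` is a hypothetical helper and replacing disallowed characters with underscores is just one reasonable choice:

```python
import re

# Pattern quoted in the OpenAI error message above.
ALLOWED_NAME = re.compile(r"^[a-zA-Z0-9_-]+$")


def sanitize_tool_name(name: str) -> str:
    """Replace every character outside [a-zA-Z0-9_-] with an underscore.

    Hypothetical helper for illustration, not part of utils3 or LlamaIndex.
    """
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "_", name)
    if not cleaned:
        raise ValueError("Tool name is empty after sanitizing")
    return cleaned


stem = "Elon_Musk's_biography"
safe_stem = sanitize_tool_name(stem)
print(safe_stem)
print(bool(ALLOWED_NAME.match(stem)), bool(ALLOWED_NAME.match(safe_stem)))
```

In the loop this would read `get_doc_tools(paper, sanitize_tool_name(Path(paper).stem))`. Note that a file name written entirely in Chinese characters would collapse to underscores under this scheme, so distinct files could end up with the same tool name; an explicit ASCII alias per file would be safer in that case.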

Try again with English documents: DALL_E_3_System_Card and Customer_Relationship_Management_Model. Both file names contain only letters, digits, and underscores.

In [66]:
papers_cn = [
    "DALL_E_3_System_Card.pdf",
    "Customer_Relationship_Management_Model.pdf",
]

In [68]:
from utils3 import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers_cn:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: DALL_E_3_System_Card.pdf
Getting tools for paper: Customer_Relationship_Management_Model.pdf


In [69]:
initial_tools = [t for paper in papers_cn for t in paper_to_tools_dict[paper]]

In [70]:
for tool in initial_tools:
    print(tool)

<llama_index.core.tools.function_tool.FunctionTool object at 0x2a7d48700>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x2a2c9fc70>
<llama_index.core.tools.function_tool.FunctionTool object at 0x2a2ab11c0>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x2a2ab1550>


In [71]:
from llama_index.llms.openai import OpenAI

llm_cn = OpenAI(model="gpt-3.5-turbo")

In [72]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools, 
    llm=llm_cn, 
    verbose=True
)
agent = AgentRunner(agent_worker)

In [73]:
response = agent.query(
    "Summarize DALL_E_3_System_Card for me,"
    "And then  me what is Future Work about DALL_E_3"
)

Added user message to memory: Summarize DALL_E_3_System_Card for me,And then  me what is Future Work about DALL_E_3
=== Calling Function ===
Calling function: summary_tool_DALL_E_3_System_Card with args: {"input": "Summarize DALL_E_3_System_Card"}
=== Function Output ===
The DALL_E_3_System_Card provides detailed insights into the capabilities and considerations of the DALL ·E 3 model. It covers aspects such as improvements from early versions to launch versions, strategies to combat unsolicited racy content, addressing bias and representation issues, considerations regarding body image, risks related to dis- and misinformation, and the generation of images of public figures. The card also discusses the model's ability to generate images based on text prompts, challenges in handling requests related to public figures, CBRN risks, copyright and trademarks, artist styles, potential misuse, inaccuracies in scientific information generation, and considerations for commercial use. Additiona

## 2. Setup an agent over 11 papers

### Download 11 ICLR papers

In [19]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=LzPWWPAdY4",
    "https://openreview.net/pdf?id=VTF8yNQM66",
    "https://openreview.net/pdf?id=hSyW5go0v8",
    "https://openreview.net/pdf?id=9WD9KwssyT",
    "https://openreview.net/pdf?id=yV6fD7LYkF",
    "https://openreview.net/pdf?id=hnrB5YHoYu",
    "https://openreview.net/pdf?id=WbWtOYIzIK",
    "https://openreview.net/pdf?id=c5pwL0Soay",
    "https://openreview.net/pdf?id=TpD2aG1h0D"
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "loftq.pdf",
    "swebench.pdf",
    "selfrag.pdf",
    "zipformer.pdf",
    "values.pdf",
    "finetune_fair_diffusion.pdf",
    "knowledge_card.pdf",
    "metra.pdf",
    "vr_mcl.pdf"
]

To download these papers, uncomment and run the code below:


    #for url, paper in zip(urls, papers):
         #!wget "{url}" -O "{paper}"
    
    
**Note**: The pdf files are included with this lesson. To access these papers, go to the `File` menu and select `Open...`.
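Outside a notebook, where the `!wget` shell escape is unavailable, the same download loop can be sketched with the standard library. The function name `download_papers` and the `dest_dir` parameter are assumptions made for this example; it skips files that already exist so that re-running the cell is cheap:

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_papers(urls: list[str], papers: list[str], dest_dir: str = ".") -> list[Path]:
    """Download each URL to the matching file name inside dest_dir.

    Hypothetical helper for illustration; existing files are not re-downloaded.
    """
    if len(urls) != len(papers):
        raise ValueError("urls and papers must have the same length")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url, paper in zip(urls, papers):
        target = dest / paper
        if not target.exists():
            urlretrieve(url, str(target))
        saved.append(target)
    return saved


# Example call with the lists defined above (requires network access):
# download_papers(urls, papers)
```

Pairing `urls` and `papers` by position is the same convention the commented-out `wget` loop relies on, so the two lists have to stay in the same order.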

In [21]:
from utils3 import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: loftq.pdf
Getting tools for paper: swebench.pdf
Getting tools for paper: selfrag.pdf
Getting tools for paper: zipformer.pdf
Getting tools for paper: values.pdf
Getting tools for paper: finetune_fair_diffusion.pdf
Getting tools for paper: knowledge_card.pdf
Getting tools for paper: metra.pdf
Getting tools for paper: vr_mcl.pdf


### Extend the Agent with Tool Retrieval

In [22]:
all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [32]:
len(all_tools)

22

In [29]:
for paper in papers:
    for t in paper_to_tools_dict[paper]:
        print(t)

<llama_index.core.tools.function_tool.FunctionTool object at 0x163736fa0>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x1638cf4f0>
<llama_index.core.tools.function_tool.FunctionTool object at 0x157d1c970>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x157b59fa0>
<llama_index.core.tools.function_tool.FunctionTool object at 0x16393d0d0>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x163a77c70>
<llama_index.core.tools.function_tool.FunctionTool object at 0x1639ed130>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x16439b5b0>
<llama_index.core.tools.function_tool.FunctionTool object at 0x156d2e040>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x163955310>
<llama_index.core.tools.function_tool.FunctionTool object at 0x163736850>
<llama_index.core.tools.query_engine.QueryEngineTool object at 0x1639ed8e0>
<llama_index.core.tools.function_tool.FunctionTool object at 0x164385970>
<llama_index.core.tools.qu

In [34]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

First decide which tools to use: the retriever picks the 3 most relevant tools.

In [33]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)

In [35]:
tools = obj_retriever.retrieve(
    "Tell me about the eval dataset used in MetaGPT and SWE-Bench"
)

In [40]:
len(tools)

3

In [39]:
print(tools)

[<llama_index.core.tools.query_engine.QueryEngineTool object at 0x1638cf4f0>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x163ffa580>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x16439b5b0>]


In [41]:
tools[3].metadata

IndexError: list index out of range

In [36]:
tools[2].metadata

ToolMetadata(description='Useful for summarization questions related to swebench', name='summary_tool_swebench', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)

In [37]:
tools[1].metadata

ToolMetadata(description='Useful for summarization questions related to metra', name='summary_tool_metra', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)

In [38]:
tools[0].metadata

ToolMetadata(description='Useful for summarization questions related to metagpt', name='summary_tool_metagpt', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)
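The retriever scores each tool against the query and keeps the top `k`. In the notebook that scoring is done with embeddings inside a `VectorStoreIndex`; the sketch below swaps in a much cruder word-overlap cosine similarity purely to show the top-k mechanics without an API key. The function names are hypothetical, and the descriptions are the three shown in the `ToolMetadata` output above:

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase, split on whitespace, drop punctuation ("SWE-Bench" -> "swebench").
    words = (re.sub(r"[^a-z0-9]", "", w) for w in text.lower().split())
    return [w for w in words if w]


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_top_k(query: str, descriptions: dict[str, str], k: int = 3) -> list[str]:
    # Score every tool description against the query; ties break by tool name.
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(desc))), name)
        for name, desc in descriptions.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


descriptions = {
    "summary_tool_metagpt": "Useful for summarization questions related to metagpt",
    "summary_tool_metra": "Useful for summarization questions related to metra",
    "summary_tool_swebench": "Useful for summarization questions related to swebench",
}
top = retrieve_top_k(
    "Tell me about the eval dataset used in MetaGPT and SWE-Bench", descriptions, k=2
)
print(top)
```

Word overlap and embeddings rank differently: the embedding-based retriever above also returned `summary_tool_metra` among its 3 results, which a pure keyword match would score at zero. Either way, the agent only ever sees the retrieved subset rather than all 22 tools.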

In [42]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm, 
    system_prompt=""" \
You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

""",
    verbose=True
)
agent = AgentRunner(agent_worker)

In [43]:
response = agent.query(
    "Tell me about the evaluation dataset used "
    "in MetaGPT and compare it against SWE-Bench"
)
print(str(response))

Added user message to memory: Tell me about the evaluation dataset used in MetaGPT and compare it against SWE-Bench
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "evaluation dataset used in MetaGPT"}
=== Function Output ===
The evaluation dataset used in MetaGPT includes HumanEval, MBPP, and SoftwareDev.
=== Calling Function ===
Calling function: summary_tool_swebench with args: {"input": "evaluation dataset used in SWE-Bench"}
=== Function Output ===
The evaluation dataset used in SWE-Bench consists of task instances extracted from real GitHub issues and corresponding pull requests across 12 popular Python repositories. It includes task instructions, issue text, retrieved files and documentation, an example patch file, and a prompt for generating the patch file. The dataset is constructed by scraping pull requests from the top 100 packages of the top 5,000 most downloaded PyPI libraries in August 2023, ensuring that the PRs selected have a "Merge

In [44]:
response = agent.query(
    "Compare and contrast the LoRA papers (LongLoRA, LoftQ). "
    "Analyze the approach in each paper first. "
)

Added user message to memory: Compare and contrast the LoRA papers (LongLoRA, LoftQ). Analyze the approach in each paper first. 
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "Approach in LongLoRA"}
=== Function Output ===
The approach in LongLoRA involves efficiently extending the context length of large language models (LLMs) to significantly larger sizes while maintaining minimal accuracy compromise. It introduces shifted sparse attention (S2-Attn) during fine-tuning, where attention is split into groups and conducted individually within each group, allowing for training models with longer context lengths. LongLoRA saves trainable parameters and memory costs compared to full fine-tuning by using a low-rank decomposition method for weight updates. Additionally, LongLoRA emphasizes the importance of learnable embedding and normalization layers for effective and efficient fine-tuning of LLMs to longer context lengths.
=== Calling Function ===
Cal