# Lesson 4: Building a Multi-Document Agent

## Setup

In [1]:
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio
nest_asyncio.apply()

## 1. Set up an agent over 3 papers

**Note**: The PDF files are included with this lesson. To access these papers, go to the `File` menu and select `Open...`.

In [3]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "pdf/metagpt.pdf",
    "pdf/longlora.pdf",
    "pdf/selfrag.pdf",
]

In [4]:
from utils import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: pdf/metagpt.pdf
Getting tools for paper: pdf/longlora.pdf
Getting tools for paper: pdf/selfrag.pdf
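The second argument passed to `get_doc_tools` is the file stem of each path, and the agent traces later in this lesson show tool names such as `summary_tool_longlora`. The sketch below shows how such names could be derived from the paths. Since `utils.get_doc_tools` is not shown here, the `tool_names` helper and the `vector_tool_` prefix are assumptions for illustration only.

```python
from pathlib import Path


def tool_names(paper_path: str) -> tuple[str, str]:
    """Return hypothetical (vector, summary) tool names for a paper path."""
    name = Path(paper_path).stem  # "pdf/metagpt.pdf" -> "metagpt"
    # "summary_tool_<stem>" matches the agent logs in this lesson;
    # "vector_tool_<stem>" is an assumed naming scheme.
    return f"vector_tool_{name}", f"summary_tool_{name}"


for paper in ["pdf/metagpt.pdf", "pdf/longlora.pdf", "pdf/selfrag.pdf"]:
    print(tool_names(paper))
```

Distinct, descriptive names matter because the LLM picks a tool by its name and description.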


In [5]:
initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [6]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")

In [7]:
len(initial_tools)

6
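The count of 6 follows from the nested comprehension in cell 5: it walks the papers in order and emits each paper's two tools, so 3 papers × 2 tools = 6. A minimal sketch of the same flattening, with placeholder strings standing in for the real tool objects (an assumption for illustration):

```python
papers = ["pdf/metagpt.pdf", "pdf/longlora.pdf", "pdf/selfrag.pdf"]

# Placeholder strings stand in for the real vector and summary tool objects.
paper_to_tools = {p: [f"vector:{p}", f"summary:{p}"] for p in papers}

# Outer loop over papers, inner loop over each paper's tools,
# flattened into a single list in paper order.
flat_tools = [t for paper in papers for t in paper_to_tools[paper]]

print(len(flat_tools))
```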

In [8]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools, 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

In [9]:
response = agent.query(
    "Tell me about the evaluation dataset used in LongLoRA, "
    "and then tell me about the evaluation results"
)

Added user message to memory: Tell me about the evaluation dataset used in LongLoRA, and then tell me about the evaluation results
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "evaluation dataset used in LongLoRA"}
=== Function Output ===
The evaluation dataset used in LongLoRA is the PG19 test split.
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "evaluation results of LongLoRA"}
=== Function Output ===
The evaluation results of LongLoRA demonstrate comparable or superior performance to other Llama2-based long-context models across different benchmarks. LongLoRA showcases efficiency in terms of training hours and GPU memory usage, presenting significant enhancements in training speed and memory efficiency compared to full fine-tuning. Additionally, the incorporation of the S2-Attn mechanism in LongLoRA reduces FLOPs and training hours, making it a more efficient choice for extending context lengths.
=== LL

In [10]:
response = agent.query("Give me a summary of both Self-RAG and LongLoRA")
print(str(response))

Added user message to memory: Give me a summary of both Self-RAG and LongLoRA
=== Calling Function ===
Calling function: summary_tool_selfrag with args: {"input": "Self-RAG"}
=== Function Output ===
Self-RAG is a framework that enhances the quality and factuality of large language models by incorporating retrieval on demand and self-reflection. It involves training a single arbitrary LM to adaptively retrieve passages, generate text informed by these passages, and critique its own output using special reflection tokens. This framework significantly outperforms other LLMs and retrieval-augmented models on various tasks, demonstrating its effectiveness in improving generation quality, factuality, and citation accuracy. Additionally, Self-RAG evaluates the factual accuracy and relevance of generated text by training a Critic LM to predict the necessity of retrieval for generating responses and a Generator LM to produce responses based on the predicted reflection tokens.
=== Calling Functi

## 2. Set up an agent over 11 papers

### Download 11 ICLR papers

In [12]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=LzPWWPAdY4",
    "https://openreview.net/pdf?id=VTF8yNQM66",
    "https://openreview.net/pdf?id=hSyW5go0v8",
    "https://openreview.net/pdf?id=9WD9KwssyT",
    "https://openreview.net/pdf?id=yV6fD7LYkF",
    "https://openreview.net/pdf?id=hnrB5YHoYu",
    "https://openreview.net/pdf?id=WbWtOYIzIK",
    "https://openreview.net/pdf?id=c5pwL0Soay",
    "https://openreview.net/pdf?id=TpD2aG1h0D"
]

papers = [
    "pdf/metagpt.pdf",
    "pdf/longlora.pdf",
    "pdf/loftq.pdf",
    "pdf/swebench.pdf",
    "pdf/selfrag.pdf",
    "pdf/zipformer.pdf",
    "pdf/values.pdf",
    "pdf/finetune_fair_diffusion.pdf",
    "pdf/knowledge_card.pdf",
    "pdf/metra.pdf",
    "pdf/vr_mcl.pdf"
]

To download these papers yourself, uncomment and run the following loop:


    # for url, paper in zip(urls, papers):
    #     !wget "{url}" -O "{paper}"
    
    
**Note**: The PDF files are included with this lesson. To access these papers, go to the `File` menu and select `Open...`.
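If `wget` is not available, the same download loop could be written in plain Python. The sketch below is one possible version, not part of the lesson's code: `download_papers` and its `fetch` parameter are hypothetical names, and it skips any file that already exists so the bundled PDFs are not fetched again. Whether the server accepts a plain `urllib` request is not verified here.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Iterable


def download_papers(
    urls: Iterable[str],
    papers: Iterable[str],
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> list[str]:
    """Download each URL to its paired path, skipping files that already exist.

    Returns the list of paths that were actually fetched.
    """
    fetched = []
    for url, paper in zip(urls, papers):
        target = Path(paper)
        if target.exists():
            continue  # already present, e.g. bundled with the lesson
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(target))
        fetched.append(paper)
    return fetched
```

With the `urls` and `papers` lists defined above, calling `download_papers(urls, papers)` would fetch only the missing files. The `fetch` argument exists so the function can be exercised without network access.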

In [13]:
from utils import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: pdf/metagpt.pdf
Getting tools for paper: pdf/longlora.pdf
Getting tools for paper: pdf/loftq.pdf
Getting tools for paper: pdf/swebench.pdf
Getting tools for paper: pdf/selfrag.pdf
Getting tools for paper: pdf/zipformer.pdf
Getting tools for paper: pdf/values.pdf
Getting tools for paper: pdf/finetune_fair_diffusion.pdf
Getting tools for paper: pdf/knowledge_card.pdf
Getting tools for paper: pdf/metra.pdf
Getting tools for paper: pdf/vr_mcl.pdf


### Extend the Agent with Tool Retrieval

In [14]:
all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [15]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [16]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)

In [17]:
tools = obj_retriever.retrieve(
    "Tell me about the eval dataset used in MetaGPT and SWE-Bench"
)

In [18]:
tools[2].metadata

ToolMetadata(description='Use ONLY IF you want to get a holistic summary of MetaGPT. Do NOT use if you have specific questions over MetaGPT.', name='summary_tool_finetune_fair_diffusion', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)
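Conceptually, the object retriever scores every tool against the query and hands the agent only the `similarity_top_k` best matches, so the LLM sees 3 tools per query instead of all 22. The ranking is approximate: the third result above carries a name for a different paper than the ones in the query. The sketch below illustrates the top-k idea with a toy bag-of-words cosine similarity in place of a real embedding model. The `Tool` dataclass, `embed`, `cosine`, `retrieve_tools`, and the tool descriptions are all hypothetical and are not LlamaIndex APIs.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    description: str


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts, standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_tools(tools: list[Tool], query: str, top_k: int = 3) -> list[Tool]:
    """Return the top_k tools whose descriptions are most similar to the query."""
    q = embed(query)
    ranked = sorted(tools, key=lambda t: cosine(embed(t.description), q), reverse=True)
    return ranked[:top_k]


# Made-up descriptions for illustration only.
tools = [
    Tool("summary_tool_metagpt", "Holistic summary of the MetaGPT paper"),
    Tool("summary_tool_swebench", "Holistic summary of the SWE-Bench paper"),
    Tool("summary_tool_zipformer", "Holistic summary of the Zipformer paper"),
]

for tool in retrieve_tools(tools, "eval dataset used in MetaGPT", top_k=1):
    print(tool.name)
```

Because only the retrieved tools are passed to the LLM, the quality of tool names and descriptions directly affects which tools the agent can call.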

In [20]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm, 
    system_prompt=""" \
You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

""",
    verbose=True
)
agent = AgentRunner(agent_worker)

In [21]:
response = agent.query(
    "Tell me about the evaluation dataset used "
    "in MetaGPT and compare it against SWE-Bench"
)
print(str(response))

Added user message to memory: Tell me about the evaluation dataset used in MetaGPT and compare it against SWE-Bench
=== Calling Function ===
Calling function: summary_tool_metra with args: {"input": "evaluation dataset used in MetaGPT"}
=== Function Output ===
The evaluation dataset used in MetaGPT is not explicitly mentioned in the provided context information.
=== Calling Function ===
Calling function: summary_tool_swebench with args: {"input": "evaluation dataset used in SWE-Bench"}
=== Function Output ===
The evaluation dataset used in SWE-Bench is constructed by scraping pull requests from the top 100 most downloaded PyPI libraries. Task instances are created from merged pull requests that resolve issues in the repository and introduce new tests. Each task instance consists of a codebase snapshot, a description of the issue to be resolved, and the associated pull request's code changes. The dataset is continuously updated to include new task instances from popular repositories, en

In [22]:
response = agent.query(
    "Compare and contrast the LoRA papers (LongLoRA, LoftQ). "
    "Analyze the approach in each paper first. "
)

Added user message to memory: Compare and contrast the LoRA papers (LongLoRA, LoftQ). Analyze the approach in each paper first. 
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "LongLoRA paper"}
=== Function Output ===
The LongLoRA paper introduces an efficient fine-tuning approach for extending the context length of Large Language Models (LLMs) using Shifted Sparse Attention (S2-Attn) to approximate standard self-attention patterns during training. It incorporates trainable normalization and embedding layers to bridge the gap between Low-rank Adaptation (LoRA) and full fine-tuning, showcasing improved performance in extending context lengths for Llama2 models with minimal accuracy compromise. Additionally, the paper introduces an Action Units Relation Learning framework comprising the Action Units Relation Transformer (ART) and Tampered AU Prediction (TAP) components for forgery detection, achieving state-of-the-art performance on cross-dataset an