In [1]:
import openai
import nest_asyncio

nest_asyncio.apply()

In [2]:
openai.api_key = 'YOUR_API_KEY'

## Setuping an agent over 3 papers

In [3]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

In [5]:
from utils import get_doc_tools
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper : {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper : metagpt.pdf
Getting tools for paper : longlora.pdf
Getting tools for paper : selfrag.pdf


In [6]:
initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]
initial_tools

[<llama_index.core.tools.function_tool.FunctionTool at 0x1600ee990>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x1600f3ce0>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x16009a000>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x16006b0e0>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x160090830>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x16017adb0>]

In [7]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")

In [8]:
len(initial_tools)

6

In [9]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools,
    llm=llm,
    verbose=True
    
)

agent = AgentRunner(agent_worker)

In [10]:
response = agent.query(
    "Tell me about the evaluation dataset used in LongLoRA,"
    "and then tell me about the evaluation results"
)

Added user message to memory: Tell me about the evaluation dataset used in LongLoRA,and then tell me about the evaluation results
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "evaluation dataset"}
=== Function Output ===
The evaluation dataset used in the experiments is the PG19 dataset.
=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "evaluation results"}
=== Function Output ===
The evaluation results demonstrate that the proposed models achieve state-of-the-art performance on various benchmarks, such as LongBench and LEval. These models have been fine-tuned to different context lengths, ranging from 8192 to 65536, and have shown competitive perplexity scores. Additionally, an efficiency analysis indicates that the models, particularly with the S2-Attn mechanism, reduce computational overhead and training hours significantly compared to full fine-tuning methods. The models exhibit promising performance in t

In [12]:
response = agent.query("Give me a summary of both Self-RAG and LongLoRA")

Added user message to memory: Give me a summary of both Self-RAG and LongLoRA
=== Calling Function ===
Calling function: summary_tool_selfrag with args: {"input": "Self-RAG"}
=== Function Output ===
Self-RAG is a framework designed to improve the quality and factuality of large language models by incorporating retrieval on demand and self-reflection mechanisms. It enables a single arbitrary language model to retrieve, generate, and evaluate text passages and its own outputs using reflection tokens. By allowing the language model to adjust its behavior during the inference phase, Self-RAG aims to enhance performance across various tasks. Additionally, the system evaluates the factual precision and usefulness of text outputs by assessing aspects such as factual relevance, supportiveness, and overall utility. Through a structured evaluation process involving instructions, evidence, and output sentences, Self-RAG seeks to enhance the quality and reliability of text generation models.
=== C

## Setting an agent over 11 papers

In [13]:
paper_to_tools_dict = {}

for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem )
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: selfrag.pdf


### Extending the Agent With Tool Retrieval

In [14]:
all_tools = [t for paer in papers for t in paper_to_tools_dict[paper]]

In [15]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [16]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)

In [17]:
tools = obj_retriever.retrieve(
    "Tell me about the eval dataset used in MetaGPT and SWE-Bench "
)

In [19]:
tools[1].metadata # returns second most relevant object

ToolMetadata(description='vector_tool_selfrag(query: str, page_numbers: Optional[List[str]] = None) -> str\nUse to answer questions over the MetaGPT paper.\n    \n        Useful if you have specific questions over the MetaGPT paper.\n        Always leave page_numbers as None UNLESS there is a specific page you want to search for.\n    \n        Args:\n            query (str): the string query to be embedded.\n            page_numbers (Optional[List[str]]): Filter by set of pages. Leave as NONE \n                if we want to perform a vector search\n                over all pages. Otherwise, filter by the set of specified pages.\n        \n        ', name='vector_tool_selfrag', fn_schema=<class 'llama_index.core.tools.utils.vector_tool_selfrag'>, return_direct=False)

In [21]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm, 
    system_prompt=""" \
You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

""",
    verbose=True
)
agent = AgentRunner(agent_worker)

In [22]:
response = agent.query(
    "Tell me about the evaluation dataset used"
    "in MetaGPT and commpare it against SWE-Bench"
)
print(str(response))

Added user message to memory: Tell me about the evaluation dataset usedin MetaGPT and commpare it against SWE-Bench
=== Calling Function ===
Calling function: vector_tool_selfrag with args: {"query": "evaluation dataset", "page_numbers": ["6"]}
=== Function Output ===
The evaluation dataset includes a factverification dataset about public health (PubHealth; Zhang et al. 2023) and a multiple-choice reasoning dataset created from scientific exams (ARC-Challenge; Clark et al. 2018).
=== Calling Function ===
Calling function: vector_tool_selfrag with args: {"query": "SWE-Bench"}
=== Function Output ===
SWE-Bench is a benchmark for software engineering tasks.
=== LLM Response ===
The evaluation dataset used in MetaGPT includes a fact verification dataset about public health (PubHealth) and a multiple-choice reasoning dataset created from scientific exams (ARC-Challenge). On the other hand, SWE-Bench is a benchmark for software engineering tasks.
The evaluation dataset used in MetaGPT includ

In [23]:
response = agent.query(
    "Compare and contrast the LoRA papers (LongLora, LoftQ),"
    "Analyze the approach in each paper first."
)

Added user message to memory: Compare and contrast the LoRA papers (LongLora, LoftQ),Analyze the approach in each paper first.
=== Calling Function ===
Calling function: vector_tool_selfrag with args: {"query": "LongLora paper approach"}
=== Function Output ===
The LongLora paper approach focuses on improving the factuality of language model outputs by addressing issues such as misinformation and incorrect advice. The method aims to enhance performance, factuality, and citation accuracy, while also acknowledging the potential for generating outputs that may not be fully supported by citations. The authors emphasize the importance of self-reflection and detailed attribution to assist users in verifying factual errors in the model's outputs.
=== Calling Function ===
Calling function: vector_tool_selfrag with args: {"query": "LoftQ paper approach"}
=== Function Output ===
The paper approach involves aiming to enhance the factuality of outputs generated by large language models, with a foc