# building an autonomous research agent
- rag with llamaindex
- 1. routing: add decision making to route requests to multiple tools

- 2. tool use: create an interface for the agent to select a tool and generate the right arguments for that tool

- 3. multi-step reasoning: use the LLM to reason over a range of tools while maintaining memory throughout the process

- 4. let users optionally inject guidance during intermediate steps (during reasoning)

In [13]:
import sys
import os
import nest_asyncio  # lets LlamaIndex's async calls run inside Jupyter's event loop
import openai
from helper import load_api_key

# importing nest_asyncio is not enough; apply() patches the running event loop
nest_asyncio.apply()

OPENAI_API_KEY = load_api_key('openai_pat.txt')
OPENAI_ORG_KEY = load_api_key('openai_org.txt')
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

# from openai import OpenAI

# # Set up the OpenAI client
# client = OpenAI(
#     api_key=OPENAI_API_KEY,
#     organization=OPENAI_ORG_KEY
# )

#### test API connections

In [12]:
# test API connection
from llama_index.llms.openai import OpenAI

resp = OpenAI().complete("Paul Graham is ")
print(resp)

CompletionResponse(text='a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on programming languages and web development. Graham has also written several influential essays on technology, startups, and entrepreneurship.', additional_kwargs={}, raw=ChatCompletion(id='chatcmpl-ACzfKF8IQI2pMnZ6qc3HthTMyYt5H', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on programming languages and web development. Graham has also written several influential essays on technology, startups, and entrepreneurship.', refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1727660342, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=Comple

In [None]:
from llama_index.llms.openai import OpenAI

llm = OpenAI()
resp = llm.stream_complete("Paul Graham is ")
for r in resp:
    print(r.delta, end="")

In [14]:
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = OpenAI().chat(messages)
print(resp)

assistant: Ahoy matey! The name's Captain Rainbowbeard, the most colorful pirate on the seven seas! What can I do for ye today?


## Routing

### load & parse data, define indexes

In [15]:
# load data
from llama_index.core import SimpleDirectoryReader

# read the PDF into parsed documents (the MetaGPT paper, ICLR 2024, on multi-agent collaboration)
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [16]:
# split the documents into evenly sized chunks,
# breaking on sentence boundaries where possible
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
# split the documents into nodes
nodes = splitter.get_nodes_from_documents(documents)
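
To make the splitting step concrete, here is a minimal pure-Python sketch of sentence-aware chunking. It is not how `SentenceSplitter` is implemented: the function name `split_into_chunks`, the regex used for sentence boundaries, and the word-based budget are all assumptions for illustration (the real `chunk_size` is measured in tokens, not words).

```python
import re


def split_into_chunks(text: str, max_words: int = 50) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    Hypothetical helper for illustration only. A single sentence longer
    than max_words is kept whole and becomes its own oversized chunk.
    """
    # Naive sentence boundary: whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Close the current chunk if this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Agents plan tasks. Agents write code. Agents review code. Agents run tests."
for chunk in split_into_chunks(sample, max_words=6):
    print(chunk)
```

The point of the design is that a chunk never cuts a sentence in half, so each node stays readable on its own when it is later retrieved.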

In [19]:
# global config settings:
# specify the LLM and the embedding model to use
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from utils import *

# Settings.llm = OpenAI(model="gpt-3.5-turbo") 
# Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
Settings.llm = OpenAI(model="gpt-4o") 
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")

In [20]:
# define two indexes over these nodes: a summary index and a vector index
# (an index is a set of metadata over our data; we can query an index, and different indexes have different retrieval behaviors)

from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
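
The difference in retrieval behavior between the two indexes can be sketched in plain Python. This is a toy model, not LlamaIndex internals: the bag-of-words `embed` function stands in for a real embedding model, and `summary_retrieve`, `vector_retrieve`, and `toy_nodes` are hypothetical names. The idea it illustrates is that a summary-style index hands every node to the LLM, while a vector-style index returns only the few nodes most similar to the query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summary_retrieve(nodes: list[str], query: str) -> list[str]:
    """Summary-style retrieval: return every node, whatever the query is."""
    return list(nodes)


def vector_retrieve(nodes: list[str], query: str, top_k: int = 2) -> list[str]:
    """Vector-style retrieval: return the top_k nodes most similar to the query."""
    q = embed(query)
    ranked = sorted(nodes, key=lambda node: cosine(embed(node), q), reverse=True)
    return ranked[:top_k]


toy_nodes = [
    "agents publish messages to a shared pool",
    "roles include product manager and engineer",
    "agents subscribe to messages by role",
    "the framework is evaluated on HumanEval and MBPP",
]
query = "how do agents share messages"

print(len(summary_retrieve(toy_nodes, query)))  # all nodes
print(vector_retrieve(toy_nodes, query))        # only the closest matches
```

This is why the summary index suits "summarize the whole document" questions and the vector index suits targeted lookups.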

### define query engines and tools

In [22]:
# step 1 of 2: turn each index into a query engine
# (one query engine is derived from each index)

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()



# step 2 of 2: wrap each query engine in a query tool
# (the description is the metadata the router uses to choose between tools)

from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

### Define Router 
- llm powered
- pydantic

In [23]:
# LLM powered single selector
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
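
What the router does can be sketched without any LLM. In the cell above, `LLMSingleSelector` asks the LLM to pick exactly one tool based on the tool descriptions. The toy below replaces that LLM call with a simple word-overlap score, which is an assumption made purely so the example runs offline; `Tool`, `select_tool`, and `route` are hypothetical names, not LlamaIndex classes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Hypothetical stand-in for a query tool: a description plus a callable."""
    name: str
    description: str
    run: Callable[[str], str]


def select_tool(query: str, tools: list[Tool]) -> int:
    """Stand-in for the LLM selector.

    Picks the tool whose description shares the most words with the query.
    A real selector would prompt an LLM with the descriptions instead.
    """
    query_words = set(query.lower().replace("?", "").split())
    scores = [len(query_words & set(t.description.lower().split())) for t in tools]
    return scores.index(max(scores))


def route(query: str, tools: list[Tool]) -> tuple[int, str]:
    """Select exactly one tool, then forward the query to it."""
    idx = select_tool(query, tools)
    return idx, tools[idx].run(query)


tools = [
    Tool(
        name="summary_tool",
        description="Useful for a summary or overview of the whole document",
        run=lambda q: f"[summary engine] {q}",
    ),
    Tool(
        name="vector_tool",
        description="Useful for retrieving specific details or context from the paper",
        run=lambda q: f"[vector engine] {q}",
    ),
]

for q in ["What is the summary of the document?",
          "Which specific details describe the message pool?"]:
    idx, answer = route(q, tools)
    print(f"Selecting query engine {idx}: {answer}")
```

Either way, the quality of the routing depends on the tool descriptions, so it pays to write them carefully.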

### test query 

In [24]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

Selecting query engine 0: The question asks for a summary of the document, which aligns with the purpose of summarization questions related to MetaGPT..
The document introduces MetaGPT, a meta-programming framework designed for multi-agent collaboration using Large Language Models (LLMs). MetaGPT incorporates Standardized Operating Procedures (SOPs) to streamline workflows and reduce errors in complex tasks. It assigns specific roles to agents, such as Product Manager, Architect, Engineer, and QA Engineer, to break down tasks into manageable subtasks. The framework uses structured communication interfaces and a publish-subscribe mechanism to enhance efficiency and minimize information overload. MetaGPT also includes an executable feedback mechanism to iteratively improve code quality during runtime. The framework has demonstrated state-of-the-art performance on benchmarks like HumanEval and MBPP, and it excels in handling complex software development tasks compared t

In [25]:
print(len(response.source_nodes))

34


In [26]:
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))

Selecting query engine 1: The question asks for specific context about how agents share information, which aligns with retrieving specific context from the MetaGPT paper..
Agents share information with other agents by using a shared message pool to publish structured messages. They can also subscribe to relevant messages based on their profiles, allowing them to obtain directional information from other roles and public information from the environment.


### put all above into a module: get_router_query_engine

In [27]:
from utils import get_router_query_engine

query_engine = get_router_query_engine("metagpt.pdf")
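
The contents of `utils.get_router_query_engine` are not shown in this notebook. As a rough sketch, a helper that bundles the cells above might look like the following. The name `build_router_query_engine`, its optional `llm` and `embed_model` parameters, and the default models are assumptions; the LlamaIndex calls themselves mirror the ones used earlier.

```python
def build_router_query_engine(file_path: str, llm=None, embed_model=None):
    """Hypothetical helper: build a router over a summary and a vector index.

    Assumed to mirror the notebook cells above; the real helper in
    utils.py may differ.
    """
    # Imports are deferred so that defining this function does not
    # require llama_index to be installed.
    from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
    from llama_index.core.selectors import LLMSingleSelector
    from llama_index.core.tools import QueryEngineTool
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.llms.openai import OpenAI

    # Fall back to the models used earlier in the notebook.
    llm = llm or OpenAI(model="gpt-4o")
    embed_model = embed_model or OpenAIEmbedding(model="text-embedding-3-large")

    # Load and chunk the document.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    nodes = SentenceSplitter(chunk_size=1024).get_nodes_from_documents(documents)

    # Build both indexes over the same nodes.
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)

    # Wrap each index as a query engine, then as a described tool.
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_index.as_query_engine(
            response_mode="tree_summarize", use_async=True, llm=llm
        ),
        description="Useful for summarization questions related to MetaGPT",
    )
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(llm=llm),
        description="Useful for retrieving specific context from the MetaGPT paper.",
    )

    # Route each query to exactly one of the two tools.
    return RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(llm=llm),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )


# Usage (needs llama_index, an OpenAI API key, and the PDF on disk):
# query_engine = build_router_query_engine("metagpt.pdf")
```

Note that the tool descriptions here are hard-coded for the MetaGPT paper; a reusable helper would probably take them as parameters.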

In [28]:
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))

Selecting query engine 1: The question 'Tell me about the ablation study results?' is asking for specific context from the MetaGPT paper, which aligns with choice 2..
The ablation study results demonstrate the effectiveness of MetaGPT in addressing challenges related to context utilization, reducing hallucinations in software generation, and managing information overload. The study highlights how MetaGPT's unique designs successfully tackle issues such as ambiguity in natural language descriptions, maintaining information validity in lengthy contexts, reducing code hallucinations, and handling information overload through a global message pool and subscription mechanism.
