## Build Agentic RAG using Vertex AI and LlamaIndex

In [None]:
# @title Copyright & License (click to expand)
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

### Prerequisites
- Set up a Google Cloud project
- Create a Cloud Storage bucket in that project


References:
- https://colab.sandbox.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/vector_stores/VertexAIVectorSearchDemo.ipynb#scrollTo=_X0bKO2mnBHK

- https://learn.deeplearning.ai/courses/building-agentic-rag-with-llamaindex/lesson/2/router-query-engine


### Install libraries

In [None]:
!pip install llama-index llama-index-vector-stores-vertexaivectorsearch llama-index-llms-vertex llama-index-embeddings-vertex

In [None]:
#!pip install --upgrade git+https://github.com/wadave/llama_index.git@main

In [None]:
#Colab only
# Automatically restart kernel after installs so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

In [None]:
#JupyterLab instance
!gcloud auth login 

In [None]:
#Colab only
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

In [1]:
# TODO : Set values as per your requirements

# Project and Storage Constants
PROJECT_ID = "astute-psyche-419021"
REGION = "us-central1"
GCS_BUCKET_NAME = "astute-psyche-419021-bucket"
GCS_BUCKET_URI = f"gs://{GCS_BUCKET_NAME}"

# textembedding-gecko@003 produces 768-dimensional embeddings.
# If you use a different embedding model, set this to that model's output dimension.
VS_DIMENSIONS = 768

# Vertex AI Vector Search Index configuration
# Parameter descriptions:
# https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex#google_cloud_aiplatform_MatchingEngineIndex_create_tree_ah_index
VS_INDEX_NAME = "vertex_vector_search_index"  # @param {type:"string"}
VS_INDEX_ENDPOINT_NAME = "vector_search_endpoint"  # @param {type:"string"}

Download Sample Files

In [45]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=LzPWWPAdY4",
    "https://openreview.net/pdf?id=VTF8yNQM66",
    "https://openreview.net/pdf?id=hSyW5go0v8",
    "https://openreview.net/pdf?id=9WD9KwssyT",
    "https://openreview.net/pdf?id=yV6fD7LYkF",
    "https://openreview.net/pdf?id=hnrB5YHoYu",
    "https://openreview.net/pdf?id=WbWtOYIzIK",
    "https://openreview.net/pdf?id=c5pwL0Soay",
    "https://openreview.net/pdf?id=TpD2aG1h0D"
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "loftq.pdf",
    "swebench.pdf",
    "selfrag.pdf",
    "zipformer.pdf",
    "values.pdf",
    "finetune_fair_diffusion.pdf",
    "knowledge_card.pdf",
    "metra.pdf",
    "vr_mcl.pdf"
]
import requests

def download_file(url, file_path):
    """Downloads a file from a given URL and saves it to the specified file path.

    Args:
        url: The URL of the file to download.
        file_path: The path to save the downloaded file.
    """

    response = requests.get(url, stream=True)
    response.raise_for_status()  # Raise an exception for non-200 status codes

    with open(file_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:  # Filter out keep-alive new chunks
                f.write(chunk)

    print(f"Downloaded file from {url} to {file_path}")


for url, paper in zip(urls, papers):
    download_file(url, paper)

Downloaded file from https://openreview.net/pdf?id=VtmBAGCN7o to metagpt.pdf
Downloaded file from https://openreview.net/pdf?id=6PmJoRfdaK to longlora.pdf
Downloaded file from https://openreview.net/pdf?id=LzPWWPAdY4 to loftq.pdf
Downloaded file from https://openreview.net/pdf?id=VTF8yNQM66 to swebench.pdf
Downloaded file from https://openreview.net/pdf?id=hSyW5go0v8 to selfrag.pdf
Downloaded file from https://openreview.net/pdf?id=9WD9KwssyT to zipformer.pdf
Downloaded file from https://openreview.net/pdf?id=yV6fD7LYkF to values.pdf
Downloaded file from https://openreview.net/pdf?id=hnrB5YHoYu to finetune_fair_diffusion.pdf
Downloaded file from https://openreview.net/pdf?id=WbWtOYIzIK to knowledge_card.pdf
Downloaded file from https://openreview.net/pdf?id=c5pwL0Soay to metra.pdf
Downloaded file from https://openreview.net/pdf?id=TpD2aG1h0D to vr_mcl.pdf
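
The download loop relies on `urls` and `papers` having the same length. `zip` stops at the shorter list without complaint, so a missing entry would silently skip a download. A minimal sketch of a guard follows; `pair_downloads` is a hypothetical helper that is not part of the original notebook.

```python
from typing import List, Tuple


def pair_downloads(urls: List[str], file_names: List[str]) -> List[Tuple[str, str]]:
    """Pair each URL with its target file name, failing loudly on a length mismatch."""
    if len(urls) != len(file_names):
        raise ValueError(
            f"Got {len(urls)} URLs but {len(file_names)} file names; the lists must match."
        )
    return list(zip(urls, file_names))


# Example with two of the papers used in this notebook.
pairs = pair_downloads(
    [
        "https://openreview.net/pdf?id=VtmBAGCN7o",
        "https://openreview.net/pdf?id=6PmJoRfdaK",
    ],
    ["metagpt.pdf", "longlora.pdf"],
)
for url, file_name in pairs:
    print(f"{url} -> {file_name}")
```

The returned pairs can be passed to `download_file` one by one in place of the bare `zip`.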


In [2]:
import nest_asyncio
nest_asyncio.apply()

In [3]:
papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

In [4]:
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

### Option 1: Create a new Vertex AI Vector Search

Create an empty index

In [5]:
# check if index exists
index_names = [
    index.resource_name
    for index in aiplatform.MatchingEngineIndex.list(
        filter=f"display_name={VS_INDEX_NAME}"
    )
]

if len(index_names) == 0:
    print(f"Creating Vector Search index {VS_INDEX_NAME} ...")
    vs_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=VS_INDEX_NAME,
        dimensions=VS_DIMENSIONS,
        distance_measure_type="DOT_PRODUCT_DISTANCE",
        approximate_neighbors_count=150,
        shard_size="SHARD_SIZE_SMALL",
        index_update_method="STREAM_UPDATE",  # allowed values BATCH_UPDATE , STREAM_UPDATE
    )
    print(
        f"Vector Search index {vs_index.display_name} created with resource name {vs_index.resource_name}"
    )
else:
    vs_index = aiplatform.MatchingEngineIndex(index_name=index_names[0])
    print(
        f"Vector Search index {vs_index.display_name} exists with resource name {vs_index.resource_name}"
    )

Vector Search index vertex_vector_search_index exists with resource name projects/77923429797/locations/us-central1/indexes/8741403313842421760
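
The index is created with `distance_measure_type="DOT_PRODUCT_DISTANCE"`, so neighbours are ranked (approximately, since this is an approximate nearest-neighbour index) by the dot product between the query embedding and each stored embedding. The toy example below shows that ranking with made-up 3-dimensional vectors; the real index uses 768 dimensions, and the vectors, document names, and helper functions are invented for illustration.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot_product(query: List[float], docs: Dict[str, List[float]]) -> List[str]:
    """Return document names ordered from highest to lowest dot product."""
    return sorted(docs, key=lambda name: dot(query, docs[name]), reverse=True)


# Invented toy embeddings; a larger dot product means "more similar".
toy_docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.5, 0.5, 0.5],
}
toy_query = [1.0, 0.0, 0.0]

ranking = rank_by_dot_product(toy_query, toy_docs)
print(ranking)
```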


Create an endpoint

In [None]:
endpoint_names = [
    endpoint.resource_name
    for endpoint in aiplatform.MatchingEngineIndexEndpoint.list(
        filter=f"display_name={VS_INDEX_ENDPOINT_NAME}"
    )
]

if len(endpoint_names) == 0:
    print(
        f"Creating Vector Search index endpoint {VS_INDEX_ENDPOINT_NAME} ..."
    )
    vs_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=VS_INDEX_ENDPOINT_NAME, public_endpoint_enabled=True
    )
    print(
        f"Vector Search index endpoint {vs_endpoint.display_name} created with resource name {vs_endpoint.resource_name}"
    )
else:
    vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=endpoint_names[0]
    )
    print(
        f"Vector Search index endpoint {vs_endpoint.display_name} exists with resource name {vs_endpoint.resource_name}"
    )

Deploy index to endpoint

In [None]:
# Check whether the index is already deployed to an endpoint.
# A new deployment takes about 30 minutes to finish.
index_endpoints = [
    (deployed_index.index_endpoint, deployed_index.deployed_index_id)
    for deployed_index in vs_index.deployed_indexes
]

if len(index_endpoints) == 0:
    print(
        f"Deploying Vector Search index {vs_index.display_name} at endpoint {vs_endpoint.display_name} ..."
    )
    vs_deployed_index = vs_endpoint.deploy_index(
        index=vs_index,
        deployed_index_id=VS_INDEX_NAME,
        display_name=VS_INDEX_NAME,
        machine_type="e2-standard-16",
        min_replica_count=1,
        max_replica_count=1,
    )
    print(
        f"Vector Search index {vs_index.display_name} is deployed at endpoint {vs_deployed_index.display_name}"
    )
else:
    vs_deployed_index = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoints[0][0]
    )
    print(
        f"Vector Search index {vs_index.display_name} is already deployed at endpoint {vs_deployed_index.display_name}"
    )

### Option 2: Use an existing Vertex AI Vector Search

In [5]:
# TODO : replace the ID below with your actual index ID
vs_index = aiplatform.MatchingEngineIndex(index_name='8741403313842421760')

# TODO : replace the ID below with your actual endpoint ID
vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name='5953112194546663424'
)
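
The Option 1 cells print full resource names such as `projects/77923429797/locations/us-central1/indexes/8741403313842421760`, while the cell above passes only the trailing numeric ID. A small sketch for extracting that ID from a printed resource name follows; `resource_id` is a hypothetical helper, not part of the `aiplatform` SDK.

```python
def resource_id(resource_name: str) -> str:
    """Return the last path segment of a resource name (the numeric ID)."""
    return resource_name.rstrip("/").split("/")[-1]


# Resource name copied from the Option 1 output above.
index_resource_name = (
    "projects/77923429797/locations/us-central1/indexes/8741403313842421760"
)
print(resource_id(index_resource_name))
```

A bare ID passes through unchanged, so the helper can be applied to either form.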

#### Import libraries

In [55]:
# import modules needed
from pathlib import Path
from typing import List

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    SummaryIndex,
    VectorStoreIndex,
)
from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import TextNode
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import FilterCondition
from llama_index.core.vector_stores.types import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.llms.vertex import Vertex
from llama_index.vector_stores.vertexaivectorsearch import VertexAIVectorStore

### Set up Vector Search Store

In [7]:
# setup vector store
vector_store = VertexAIVectorStore(
    project_id=PROJECT_ID,
    region=REGION,
    index_id=vs_index.name,
    endpoint_id=vs_endpoint.name,
    gcs_bucket_name=GCS_BUCKET_NAME,
)

# set storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [8]:
# configure embedding model
embed_model = VertexTextEmbedding(
    model_name="textembedding-gecko@003",
    project=PROJECT_ID,
    location=REGION,
)

vertex_gemini = Vertex(model="gemini-1.5-pro-preview-0514", temperature=0, additional_kwargs={})

# Configure the models used for indexing and querying: the embedding model and the LLM
Settings.embed_model = embed_model
Settings.llm = vertex_gemini

### Task 1: Router query engine

In [9]:

# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [10]:
# define index from vector store
vector_index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/indexes/8741403313842421760
MatchingEngineIndex index Upserted datapoints. Resource name: projects/77923429797/locations/us-central1/indexes/8741403313842421760


In [11]:
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
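
`SentenceSplitter(chunk_size=1024)` breaks each page into chunks of bounded size while trying to keep sentences intact. The sketch below illustrates that idea only and is not LlamaIndex's implementation: it counts whitespace-separated words rather than tokens, uses a naive regex to find sentence boundaries, and `chunk_sentences` is a hypothetical helper.

```python
import re
from typing import List


def chunk_sentences(text: str, max_words: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "MetaGPT assigns roles to agents. "
    "Agents exchange structured messages. "
    "Code is tested with executable feedback."
)
chunks = chunk_sentences(sample, max_words=10)
for chunk in chunks:
    print(repr(chunk))
```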

In [12]:
summary_index = SummaryIndex(nodes)

In [13]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [14]:
summary_query_engine.query("what's the summary of the document?")

merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information from multiple sources is below.
---------------------
page_label: 1
file_path: metagpt.pdf

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University 

Response(response='A new framework called MetaGPT leverages the power of large language models and a structured, team-based approach to develop software autonomously. This method, inspired by real-world software development processes, results in more efficient and high-quality code compared to other systems.  The framework shows great promise in revolutionizing software development, though it still has some limitations. \n', source_nodes=[NodeWithScore(node=TextNode(id_='3bb2b85e-6a71-4c58-8951-f303afa89ff3', embedding=None, metadata={'page_label': '1', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-06-06', 'last_modified_date': '2024-06-06'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], r

In [15]:
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

In [16]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)

In [17]:
response = query_engine.query("What is the summary of the document?")
print(str(response))

merged_message user: Some choices are given below. It is provided in a numbered list (1 to 2), where each item in the list corresponds to a summary.
---------------------
(1) Useful for summarization questions related to MetaGPT

(2) Useful for retrieving specific context from the MetaGPT paper.
---------------------
Using only the choices above and not prior knowledge, return the choice that is most relevant to the question: 'What is the summary of the document?'


The output should be ONLY JSON formatted as a JSON instance.

Here is an example:
[
    {{
        choice: 1,
        reason: "<insert reason for choice>"
    }},
    ...
]

Selecting query engine 0: The question 'What is the summary of the document?' is asking for a concise overview of the document's content. Option (1), which mentions usefulness for summarization questions, directly addresses this need..
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer th

In [18]:
print(len(response.source_nodes))

34


In [19]:
response = query_engine.query("How do agents share information with other agents?")
print(str(response))

merged_message user: Some choices are given below. It is provided in a numbered list (1 to 2), where each item in the list corresponds to a summary.
---------------------
(1) Useful for summarization questions related to MetaGPT

(2) Useful for retrieving specific context from the MetaGPT paper.
---------------------
Using only the choices above and not prior knowledge, return the choice that is most relevant to the question: 'How do agents share information with other agents?'


The output should be ONLY JSON formatted as a JSON instance.

Here is an example:
[
    {{
        choice: 1,
        reason: "<insert reason for choice>"
    }},
    ...
]

Selecting query engine 1: We need to retrieve specific context from the MetaGPT paper to answer how agents share information. This is not a summarization question..
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not p
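
The selector prompt in the logs above asks the LLM for a JSON list of `choice`/`reason` items with 1-based choices, and the router then reports a 0-based engine index ("Selecting query engine 0" for choice 1). The sketch below shows that mapping, assuming the model returns well-formed JSON. `pick_engine` and the sample reply are hypothetical, and this is not LlamaIndex's own output parser.

```python
import json
from typing import List, Tuple


def pick_engine(llm_reply: str, tool_descriptions: List[str]) -> Tuple[int, str]:
    """Map the first 1-based choice in the reply to a 0-based tool index."""
    selections = json.loads(llm_reply)
    first = selections[0]
    index = int(first["choice"]) - 1
    if not 0 <= index < len(tool_descriptions):
        raise ValueError(f"Choice {first['choice']} is outside the list of tools.")
    return index, first["reason"]


# Tool descriptions copied from the router cells above.
tool_descriptions = [
    "Useful for summarization questions related to MetaGPT",
    "Useful for retrieving specific context from the MetaGPT paper.",
]
# Invented reply for illustration.
sample_reply = '[{"choice": 2, "reason": "The question asks for a specific detail."}]'

engine_index, reason = pick_engine(sample_reply, tool_descriptions)
print(engine_index, "->", tool_descriptions[engine_index])
```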

### Task 2: Tool calling

Auto-retrieval tool

In [20]:
query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What are some high-level results of MetaGPT?",
)

merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information is below.
---------------------
page_label: 2
file_path: metagpt.pdf

Preprint
Figure 1: The software development SOPs between MetaGPT and real-world human teams.
In software engineering, SOPs promote collaboration among various roles. MetaGPT showcases
its ability to decompose complex tasks into specific actionable procedures assigned to various roles
(e.g., Product Manager, Architect, Engineer, etc.).
documents, design artifacts, flowcharts, and interface specifications. The use of intermediate struc-
tured outputs significantly increases the success rate of target code generation. Because it help

In [21]:

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)

from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

In [22]:
print(str(response))

This system achieves state-of-the-art results in code generation, with 85.9% and 87.7% in Pass@1. Additionally, it has a 100% task completion rate. 



In [23]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-06-06', 'last_modified_date': '2024-06-06'}
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-06-05', 'last_modified_date': '2024-06-05'}


Define auto-retrieval tool

In [24]:

def vector_query(
    query: str,
    page_numbers: List[str]
) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (List[str]): Filter by set of pages. Leave BLANK if we want to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.

    """

    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]

    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response


vector_query_tool = FunctionTool.from_defaults(
    
    fn=vector_query,
    #name='vector_query'
    
)
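
`FilterCondition.OR` means a chunk is kept when its `page_label` matches any of the requested pages. The sketch below reproduces that semantics on plain dictionaries so it can run without a vector store. `matches_any` is a hypothetical helper, the sample chunks are invented, and treating an empty filter list as "no filtering" is an assumption of this sketch rather than documented vector store behaviour.

```python
from typing import Dict, List


def matches_any(metadata: Dict[str, str], filters: List[Dict[str, str]]) -> bool:
    """Return True if the metadata satisfies at least one key/value filter."""
    if not filters:
        # Assumption of this sketch: no filters means every chunk is eligible.
        return True
    return any(metadata.get(f["key"]) == f["value"] for f in filters)


# Invented chunk metadata in the same shape as the notebook's source nodes.
toy_chunks = [
    {"page_label": "1", "file_name": "metagpt.pdf"},
    {"page_label": "2", "file_name": "metagpt.pdf"},
    {"page_label": "8", "file_name": "metagpt.pdf"},
]
page_filters = [{"key": "page_label", "value": p} for p in ["2", "8"]]

selected = [c["page_label"] for c in toy_chunks if matches_any(c, page_filters)]
print(selected)
```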

In [25]:
def summary_query(
    query: str,
) -> str:
    """Answer summarization questions over the whole document.

    query (str): the summarization question to answer.
    """
    summary_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )

    response = summary_engine.query(query)
    return response


summary_tool = FunctionTool.from_defaults(
    
    fn=summary_query,
    #name='summary_query'
    
)

In [26]:
from llama_index.llms.vertex import Vertex
vertex_gemini = Vertex(model="gemini-1.5-flash")
response = vertex_gemini.predict_and_call(
    [vector_query_tool],
    "What are the high-level results of MetaGPT as described on page 2?",
    verbose=True
)

merged_message user: What are the high-level results of MetaGPT as described on page 2?
Function Calling, No Text Content
=== Calling Function ===
Calling function: vector_query with args: {"query": "What are the high-level results of MetaGPT as described on page 2?", "page_numbers": ["2"]}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information is below.
---------------------
page_label: 2
file_path: metagpt.pdf

Preprint
Figure 1: The software development SOPs between MetaGPT and real-world human teams.
In software engineering, SOPs promote collaboration among various roles. MetaGPT showcases
its ability to decompose complex tasks into specific actiona

In [27]:
response = vertex_gemini.predict_and_call(
    [summary_tool],
    "What is the summary of the document?",
    verbose=True
)

merged_message user: What is the summary of the document?
Function Calling, No Text Content
=== Calling Function ===
Calling function: summary_query with args: {"query": "What is the summary of the document?"}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information from multiple sources is below.
---------------------
page_label: 1
file_path: metagpt.pdf

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,

In [28]:
response = vertex_gemini.predict_and_call(
    [vector_query_tool, summary_tool],
    "What are the MetaGPT comparisons with ChatDev described on page 8?",
    verbose=True
)

merged_message user: What are the MetaGPT comparisons with ChatDev described on page 8?
Function Calling, No Text Content
=== Calling Function ===
Calling function: vector_query with args: {"query": "MetaGPT comparisons with ChatDev described on page 8", "page_numbers": ["8"]}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information is below.
---------------------
page_label: 8
file_path: metagpt.pdf

Preprint
Figure 5: Demo softwares developed by MetaGPT.
in these two public benchmarks. Moreover, as shown in Table 1, MetaGPT outperforms ChatDev on
the challenging SoftwareDev dataset in nearly all metrics. For example, considering the executabil-
ity, Met

In [29]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-06-06', 'last_modified_date': '2024-06-06'}


In [30]:
response = vertex_gemini.predict_and_call(
    [summary_tool, vector_query_tool],
    "What is a summary of the paper?",
    verbose=True
)

merged_message user: What is a summary of the paper?
Function Calling, No Text Content
=== Calling Function ===
Calling function: summary_query with args: {"query": "What is a summary of the paper?"}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information from multiple sources is below.
---------------------
page_label: 1
file_path: metagpt.pdf

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWis

### Task 3: Building an Agent Reasoning Loop

In [31]:
# Build the vector query and summary tools for a single PDF file
from typing import Optional, Tuple

def get_doc_tools(
    file_path: str,
    name: str,
) -> Tuple[FunctionTool, FunctionTool]:
    """Get vector query and summary query tools from a document."""

    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    summary_index = SummaryIndex(nodes)

    def vector_query(
        query: str,
        page_numbers: Optional[List[str]] = None
    ) -> str:
        """Use to answer questions over a given paper.

        Useful if you have specific questions over the paper.
        Always leave page_numbers as None UNLESS there is a specific page you want to search for.

        Args:
            query (str): the string query to be embedded.
            page_numbers (Optional[List[str]]): Filter by set of pages. Leave as NONE
                if we want to perform a vector search
                over all pages. Otherwise, filter by the set of specified pages.

        """

        page_numbers = page_numbers or []
        metadata_dicts = [
            {"key": "page_label", "value": p} for p in page_numbers
        ]

        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(
                metadata_dicts,
                condition=FilterCondition.OR
            )
        )
        response = query_engine.query(query)
        return response


    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

    def summary_query(
        query: str,
    ) -> str:
        """Answer summarization questions over the whole document.

        query (str): the summarization question to answer.
        """
        summary_engine = summary_index.as_query_engine(
            response_mode="tree_summarize",
            use_async=True,
        )
    
        response = summary_engine.query(query)
        return response
    
    
    summary_tool = FunctionTool.from_defaults(
        
        fn=summary_query,
        name=f'summary_tool_{name}'
        
    )

    return vector_query_tool, summary_tool

In [32]:
vector_query_tool, summary_tool = get_doc_tools("metagpt.pdf", "metagpt")

Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/indexes/8741403313842421760
MatchingEngineIndex index Upserted datapoints. Resource name: projects/77923429797/locations/us-central1/indexes/8741403313842421760


In [33]:

vertex_gemini = Vertex(model="gemini-1.5-pro-preview-0514")

In [34]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_query_tool, summary_tool], 
    llm=vertex_gemini, 
    verbose=True
)
agent = AgentRunner(agent_worker)

In [35]:
response = vertex_gemini.predict_and_call(
    [vector_query_tool, summary_tool],
    "what's the summary of the paper?",
    verbose=True
)


merged_message user: what's the summary of the paper?
Function Calling, No Text Content
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"query": "What is this paper about?"}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information from multiple sources is below.
---------------------
page_label: 1
file_path: metagpt.pdf

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepW

In [43]:
response = agent.query(
    #"what's the summary of the paper?"
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
merged_message user: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
Function Calling, No Text Content
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "What are the roles of agents in MetaGPT and how do they communicate?"}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information is below.
---------------------
page_label: 4
file_path: metagpt.pdf

Preprint
Figure 2: An example of the communication protocol (left) and iterative programming with exe-
cutable feedba

In [44]:
response = agent.chat(
    "Tell me about the evaluation datasets used."
)

Added user message to memory: Tell me about the evaluation datasets used.
merged_message user: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
merged_message model: 
merged_message user: Five roles — a product manager, architect, project manager, engineer, and QA engineer — are defined to represent a software company setting. Each role has a distinct profile that includes their name, goal, constraints, context, and skills. For example, a product manager might use web search tools, while an engineer might execute code.  All agents operate on a 'react-style' behavior, meaning they monitor their environment for important observations, such as messages from other agents, and react accordingly. These messages can either directly trigger actions or provide assistance in completing a task.  These agents communicate through a shared message pool, publishing structured messages and subscribing to relevant ones based on their roles. This structured commun
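
The log above shows the earlier question and its answer being sent to the model together with the new message, which is how `agent.chat` can resolve a follow-up such as "the evaluation datasets used" without the paper being named again. A toy sketch of that running conversation follows. `ToyChatMemory` is invented for illustration and is not LlamaIndex's memory class.

```python
from typing import Dict, List


class ToyChatMemory:
    """Keeps every message so each new turn is sent with the full history."""

    def __init__(self) -> None:
        self.messages: List[Dict[str, str]] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def build_prompt(self, new_user_message: str) -> List[Dict[str, str]]:
        """Record the new user message and return everything the model would see."""
        self.add("user", new_user_message)
        return list(self.messages)


memory = ToyChatMemory()
memory.build_prompt("Tell me about the agent roles in MetaGPT.")
memory.add("assistant", "(answer describing the agent roles)")
prompt = memory.build_prompt("Tell me about the evaluation datasets used.")

for message in prompt:
    print(f"{message['role']}: {message['content']}")
```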


### Task 4: Multi-document agent

In [66]:
papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "loftq.pdf",
    "swebench.pdf",
    "selfrag.pdf",
    "zipformer.pdf",
    "values.pdf",
    "finetune_fair_diffusion.pdf",
    "knowledge_card.pdf",
    "metra.pdf",  
]

In [67]:
paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/indexes/8741403313842421760
MatchingEngineIndex index Upserted datapoints. Resource name: projects/77923429797/locations/us-central1/indexes/8741403313842421760
Getting tools for paper: longlora.pdf
Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/indexes/8741403313842421760
MatchingEngineIndex index Upserted datapoints. Resource name: projects/77923429797/locations/us-central1/indexes/8741403313842421760
Getting tools for paper: loftq.pdf
Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/indexes/8741403313842421760
MatchingEngineIndex index Upserted datapoints. Resource name: projects/77923429797/locations/us-central1/indexes/8741403313842421760
Getting tools for paper: swebench.pdf
Upserting datapoints MatchingEngineIndex index: projects/77923429797/locations/us-central1/index

In [83]:
all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [84]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [85]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)
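
With `similarity_top_k=3`, the retriever hands the agent only the three tools that best match each query, rather than all twenty. The sketch below illustrates that selection step with a toy bag-of-words cosine similarity. It is not how `ObjectIndex` works internally (the real index relies on the configured embedding model), and the tool descriptions and the `top_k_tools` helper are invented for illustration:

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_tools(query: str, tool_descriptions: dict, k: int = 3) -> list:
    """Return the names of the k tools whose descriptions best match the query."""
    q = bag_of_words(query)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: cosine(q, bag_of_words(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical descriptions; the real ones come from get_doc_tools
toy_tools = {
    "vector_tool_metagpt": "Answer specific questions about the MetaGPT paper",
    "summary_tool_metagpt": "Summarize the MetaGPT paper",
    "vector_tool_swebench": "Answer specific questions about the SWE-Bench paper",
    "summary_tool_swebench": "Summarize the SWE-Bench paper",
    "vector_tool_loftq": "Answer specific questions about the LoftQ paper",
    "summary_tool_loftq": "Summarize the LoftQ paper",
}

print(top_k_tools("Summarize the LoftQ paper", toy_tools, k=3))
```

One consequence of a small `k`: a query that spans several papers may need more tools than the retriever returns, so raising `similarity_top_k` is worth trying if the agent appears to miss a relevant paper.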

In [86]:
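# Pass a tool retriever instead of a fixed tool list, so only the
# top-k most relevant tools are offered to the LLM for each query.
agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=vertex_gemini,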
    system_prompt=""" \
You are an agent designed to answer queries over a set of given papers.
Please use the tools provided to answer a question as possible. Do not rely on prior knowledge. Summarize your answer\

""",
    verbose=True
)
agent = AgentRunner(agent_worker)

In [89]:
response = agent.query(
    "What is the evaluation dataset used in MetaGPT? Compare it against SWE-Bench"
)
print(str(response))

Added user message to memory: What is the evaluation dataset used in MetaGPT? Compare it against SWE-Bench
merged_message user:  You are an agent designed to answer queries over a set of given papers.
Please use the tools provided to answer a question as possible. Do not rely on prior knowledge. Summarize your answer

What is the evaluation dataset used in MetaGPT? Compare it against SWE-Bench
Function Calling, No Text Content
=== Calling Function ===
Calling function: vector_tool_swebench with args: {"query": "What is the evaluation dataset used in MetaGPT? Compare it against SWE-Bench"}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information is below.


In [90]:
response = agent.query(
    "Compare and contrast the LoRA papers (LongLoRA, LoftQ). "
    "Analyze the approach in each paper first. "
)

Added user message to memory: Compare and contrast the LoRA papers (LongLoRA, LoftQ). Analyze the approach in each paper first. 
merged_message user:  You are an agent designed to answer queries over a set of given papers.
Please use the tools provided to answer a question as possible. Do not rely on prior knowledge. Summarize your answer

Compare and contrast the LoRA papers (LongLoRA, LoftQ). Analyze the approach in each paper first. 
Function Calling, No Text Content
=== Calling Function ===
Calling function: summary_tool_loftq with args: {"query": "Summarize the LoftQ paper."}
merged_message user: You are an expert Q&A system that is trusted around the world.
Always answer the query using the provided context information, and not prior knowledge.
Some rules to follow:
1. Never directly reference the given context in your answer.
2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.
Context information from multiple sour