# 3 - Leverage LlamaIndex with Vertex AI Vector Search to perform question answering RAG

## Multi-Document RAG

So far we have established how to set up the necessary Google Cloud resources, set up a LlamaIndex agent, and customize prompts for RAG question answering.

In this section we will cover Multi-Document Agents that can effectively answer different sets of questions over a larger set of documents. Questions can include QA and summaries over an individual document or across documents.

To do this we will follow these steps:

+ set up a "document agent" over each document: each document agent can do QA and summarization within its own document
+ set up a top-level agent over this set of document agents: it retrieves the relevant tools, then answers the question from the set of tool responses
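
The two-level layout described in these steps can be pictured with a small, library-free sketch. Everything in it (`DocAgent`, `TopAgent`, the keyword matching, the file names and sentences) is a hypothetical stand-in for illustration and not LlamaIndex API; the agents built later in this section delegate both levels to an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class DocAgent:
    """Hypothetical stand-in for a per-document agent."""
    title: str
    sentences: list[str]

    def answer(self, question: str) -> str:
        # "QA": return sentences sharing at least one word with the question.
        words = set(question.lower().split())
        hits = [s for s in self.sentences if words & set(s.lower().split())]
        return " ".join(hits)

    def summarize(self) -> str:
        # "Summarization": naive stand-in that returns the first sentence.
        return self.sentences[0] if self.sentences else ""


@dataclass
class TopAgent:
    """Hypothetical top-level agent that fans a question out to doc agents."""
    doc_agents: list[DocAgent] = field(default_factory=list)

    def answer(self, question: str) -> dict[str, str]:
        # Collect the non-empty answers, keyed by document title.
        answers = {a.title: a.answer(question) for a in self.doc_agents}
        return {title: text for title, text in answers.items() if text}


# Made-up documents, used only to exercise the sketch.
top = TopAgent([
    DocAgent("grid.pdf", ["Substations convert voltage levels.", "Cables link islands."]),
    DocAgent("storage.pdf", ["Batteries smooth demand peaks."]),
])
result = top.answer("what do batteries do")
print(result)
```

The point of the split is that each document agent only ever sees its own document, while the top-level agent only sees the document agents' answers.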

### Set Up

In [1]:
#Imports
import os
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    Settings,
    SummaryIndex
)
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.llms.vertex import Vertex
from llama_index.core.agent import ReActAgent
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.objects import ObjectIndex
from llama_index.vector_stores.vertexaivectorsearch import VertexAIVectorStore


In [2]:
PROJECT_ID = "" #TODO - add your project-id here from the console
REGION = ""  #TODO - add your region here from the console
GCS_BUCKET = "llamaindex_gcs_bucket"  # @param {type:"string"}
DOC_FOLDER = "./data"

vs_index_id = "" #TODO - add your Vertex AI Vector Search index ID from setup here
vs_endpoint_id = ""  #TODO - add your Vertex AI Vector Search deployed endpoint ID from setup here
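
Because several of these values start out as empty placeholders, it can help to fail fast before any cloud call is made. The helper below is a minimal sketch of such a check; `missing_settings` is a hypothetical name, and the example values are made up rather than taken from a real project.

```python
def missing_settings(settings: dict[str, str]) -> list[str]:
    """Return the names of settings that are empty or whitespace-only."""
    return [name for name, value in settings.items() if not value.strip()]


# Example values only; in the notebook these would be the variables defined above.
example = {
    "PROJECT_ID": "my-project",
    "REGION": "",
    "vs_index_id": "1234567890",
    "vs_endpoint_id": "   ",
}
unset = missing_settings(example)
print(unset)
```

In the notebook, one option would be to pass the real variables in a dictionary and raise a `ValueError` if the returned list is not empty.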

### Add more documents

In [13]:
def ingest_multi_document():
    import os
    doc_dict = {}

    for filename in os.listdir(DOC_FOLDER):
        doc_dict[filename] = SimpleDirectoryReader(
            input_files=[os.path.join(DOC_FOLDER, filename)]
        ).load_data()

    return doc_dict
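
Note that `os.listdir` also returns subdirectories and hidden entries (for example `.ipynb_checkpoints`), which may not be loadable as documents. The sketch below shows one way to filter the listing first; `list_document_files` and its default extension list are assumptions for illustration, not part of the original notebook.

```python
import os
import tempfile


def list_document_files(folder, extensions=(".pdf", ".txt")):
    """Return sorted file names in `folder` that are regular, non-hidden files
    with one of the given extensions (compared case-insensitively)."""
    names = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip hidden entries and anything that is not a regular file.
        if name.startswith(".") or not os.path.isfile(path):
            continue
        if os.path.splitext(name)[1].lower() in extensions:
            names.append(name)
    return names


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "B.PDF", ".DS_Store", "notes.md"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "subfolder.pdf"))
    found = list_document_files(tmp)

print(found)
```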

### Initialize Storage

In [17]:
def initialize_llm_and_storage(vs_index, vs_endpoint):
    """
    Initializes a Vertex AI vector store from a Vector Search index and deployed endpoint.
    Configures the embedding model (text-embedding-004) and the LLM (Gemini) through Settings.
    """
    # setup storage
    vector_store = VertexAIVectorStore(
        project_id=PROJECT_ID,
        region=REGION,
        index_id=vs_index,
        endpoint_id=vs_endpoint,
        gcs_bucket_name=GCS_BUCKET,
    )

    # set storage context
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

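    # Configure the embedding model and LLM globally through Settings
    embedding_model = VertexTextEmbedding("text-embedding-004")
    llm = Vertex("gemini-pro")

    Settings.embed_model = embedding_model
    Settings.llm = llm

    # Return the storage context so the caller can build indexes on the vector store
    return storage_context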


In [18]:
# Initialize storage and load documents
storage_context = initialize_llm_and_storage(vs_index_id, vs_endpoint_id)
multi_docs = ingest_multi_document()

### Build Document Level Agents

First we will build document agents for each document. 

For each document we will create two query engines, one for semantic search and one for summarization. These query engines will be converted into tools that can be passed to an agent.

We will be using the ReAct agent (short for "Reasoning and Acting"). This LLM-powered agent works in steps: it reasons about the question, chooses a tool to call, observes the result, and repeats until it can answer. LlamaIndex agents can operate in both "read" and "write" modes; the tools in this section only read, through search and summarization.
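
The reason, act, observe cycle behind the ReAct name can be sketched without any model access. The loop below is a simplified illustration, not LlamaIndex's implementation: `react_loop`, the `Action: name[input]` text format, and the scripted stand-in for the LLM are all assumptions made for this example.

```python
import re


def react_loop(question, llm_step, tools, max_steps=5):
    """Minimal ReAct-style loop: the model alternates between picking a tool
    (action) and reading its result (observation) until it emits an answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        output = llm_step("\n".join(transcript))
        transcript.append(output)
        final = re.search(r"Answer: (.*)", output)
        if final:
            return final.group(1), transcript
        action = re.search(r"Action: (\w+)\[(.*)\]", output)
        if action is None:
            break  # the model produced neither an action nor an answer
        tool_name, tool_input = action.groups()
        observation = tools[tool_name](tool_input)
        transcript.append(f"Observation: {observation}")
    return None, transcript


# Scripted stand-in for an LLM so the sketch runs without any model access.
def scripted_llm(prompt):
    if "Observation:" not in prompt:
        return "Thought: I should look this up.\nAction: vector_tool[substation capacity]"
    return "Thought: I have what I need.\nAnswer: The tool returned a stub result."


tools = {"vector_tool": lambda query: f"stub result for '{query}'"}
answer, transcript = react_loop("What is the capacity?", scripted_llm, tools)
print(answer)
```

With `verbose=True`, the agents built in this section print a comparable sequence of thoughts, actions, and observations.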

In [9]:
def build_document_level_agents(documents, storage_context):
    """
    Sets up a vector search tool and a summarization tool for each document, then builds one agent per document from those tools.
    """

    node_parser = SentenceSplitter()

    #Build agents dictionary
    agents = {}
    query_engines = {}

    for doc_title in documents:
        
        # A Node is a "chunk" of a source Document (a text chunk, an image, or other).
        # Like Documents, Nodes carry metadata and relationship information with other nodes.
        nodes = node_parser.get_nodes_from_documents(documents[doc_title], show_progress=True)

        #Build query index
        vector_index = VectorStoreIndex.from_documents(
            documents[doc_title], storage_context=storage_context
        )
        #Build summary index
        summary_index = SummaryIndex(nodes)

        #Define engines
        vector_query_engine = vector_index.as_query_engine()
        summary_query_engine = summary_index.as_query_engine()

        #Define tools
        query_engine_tools = [
            QueryEngineTool(
                query_engine=vector_query_engine,
                metadata=ToolMetadata(
                    name="vector_tool",
                    description=(
                        "Useful for questions related to specific aspects of"
                        f" {doc_title}."
                    ),
                ),
            ),
            QueryEngineTool(
                query_engine=summary_query_engine,
                metadata=ToolMetadata(
                    name="summary_tool",
                    description=(
                        "Useful for any requests that require a holistic summary"
                        f" of EVERYTHING about {doc_title}. For questions about"
                        " more specific sections, please use the vector_tool."
                    ),
                ),
            ),
        ]

        #Build agent
        llm = Vertex("gemini-pro")
        agent = ReActAgent.from_tools(
            query_engine_tools,
            llm=llm,
            verbose=True,
            system_prompt=f"""\
            You are a specialized agent designed to answer queries about {doc_title}.
            You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
            """,
            )

        agents[doc_title] = agent
        query_engines[doc_title] = vector_index.as_query_engine(
            similarity_top_k=2
        )

    return agents


In [None]:
agents = build_document_level_agents(multi_docs, storage_context)
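
To make the idea of splitting a document into chunks concrete, here is a deliberately simplified sketch of windowing with overlap. It splits on whitespace only, whereas `SentenceSplitter` works on tokens and tries to respect sentence boundaries; `chunk_words` and its parameters are hypothetical and only illustrate the general idea.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into word windows of `chunk_size`, each sharing `overlap`
    words with the previous window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


text = "one two three four five six seven eight nine ten eleven twelve"
chunks = chunk_words(text, chunk_size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap means that a sentence falling on a chunk boundary still appears in full in at least one of the neighboring chunks, provided it is shorter than the overlap.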

### Build Top Level Agent

We build a top-level agent that can orchestrate across the different document agents to answer any user query. This agent takes all of the document agents built above as tools.
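
With many documents, passing every tool to the agent at once would bloat its prompt, so this section's top-level agent first retrieves the few tools whose descriptions best match the query. The sketch below illustrates that retrieval step with a bag-of-words cosine similarity. This is a simplification, since the `ObjectIndex` used in this section relies on embeddings, and the tool names and descriptions are made up.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_tools(query, descriptions, top_k=2):
    """Return the names of the `top_k` tools whose descriptions best match the query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), name) for name, text in descriptions.items()]
    # Highest score first; ties are broken alphabetically by tool name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k]]


# Hypothetical tool descriptions, for illustration only.
descriptions = {
    "tool_hvdc": "questions about hvdc converter stations and control systems",
    "tool_storage": "questions about battery storage projects",
    "tool_wind": "questions about offshore wind farms",
}
selected = retrieve_tools("which battery storage projects exist", descriptions, top_k=2)
print(selected)
```

Because retrieval depends on the descriptions alone, specific and distinct tool descriptions tend to matter more than the tool names.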



In [20]:
def build_top_level_agent(agents):
    #This agent takes in all document agents as tools
    all_tools = []

    for filename, agent in agents.items():
        summary = (
            f"This content contains a research article about {filename}. Use"
            f" this tool if you want to answer any questions about {filename}.\n"
        )
        # Drop the extension with splitext; str.rstrip(".pdf") would strip any
        # trailing '.', 'p', 'd', or 'f' characters rather than the suffix.
        tool_name = f"tool_{os.path.splitext(filename)[0]}"
        doc_tool = QueryEngineTool(
            query_engine=agent,
            metadata=ToolMetadata(
                name=tool_name,
                description=summary,
            ),
        )
        
        all_tools.append(doc_tool)


    #define an "object" index and retriever over these tools
    obj_index = ObjectIndex.from_objects(
        all_tools,
        index_cls=VectorStoreIndex,
    )   

    #Create top level agent
    top_agent = ReActAgent.from_tools(
        tool_retriever=obj_index.as_retriever(similarity_top_k=3),
        system_prompt=""" \
            You are an agent designed to answer queries about energy systems.
            Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

            """,
        verbose=True,
    )
    
    return top_agent

In [21]:
top_level_agent = build_top_level_agent(agents)
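
Deriving tool names from file names needs some care. `str.rstrip(".pdf")` removes any trailing run of the characters `.`, `p`, `d`, and `f` rather than the literal suffix, and file names can contain spaces or punctuation that some function-calling APIs may reject in tool names. The helper below is a sketch of one way to handle both; `tool_name_from_filename` and the example file names are hypothetical.

```python
import os
import re


def tool_name_from_filename(filename: str) -> str:
    """Build a tool name from a file name: drop the extension, then replace
    anything other than letters, digits, and underscores with an underscore."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return "tool_" + re.sub(r"[^A-Za-z0-9_]", "_", stem)


# str.rstrip treats its argument as a set of characters, not a suffix,
# so the trailing "p" of "map" is stripped along with ".pdf".
stripped = "tool_map.pdf".rstrip(".pdf")
safe = tool_name_from_filename("map.pdf")
spaced = tool_name_from_filename("HVDC report (2020).pdf")
print(stripped, safe, spaced)
```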

### Perform Multi-Document QA RAG

Now we can query the top-level agent. We will experiment with various question types that require lookup in individual documents and across multiple documents.

In [None]:
#QA over a specific doc
response = top_level_agent.query("Tell me about the Higashi-Shimizu Substation of Chubu Electric Power Grid")

In [None]:
# summaries across documents 
response = top_level_agent.query("What are all projects introduced after the Great East Japan Earthquake?")

In [None]:
#cross document QA
response = top_level_agent.query("List out all the technologies that are used to stabilize the power system?")

In [None]:
response = top_level_agent.query("Explain to me what the building blocks of the MACH control and protection system are and where it is used.")

## Conclusion

Congratulations! You have now built a multi-document RAG workflow using LlamaIndex on Vertex AI.

You built tools that perform summarization and semantic search for each document, then gave each document a ReAct agent that uses those tools to perform complex tasks over the data.

Finally, you built a top-level agent that can draw on all of the document-level agents to answer complex questions that may span different documents.