<a href="https://colab.research.google.com/github/duanzhihua/-transformer-english2chinese-/blob/main/research_agent_databricks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a Research Agent with Databricks

In this notebook we show you how to build a complete agent reasoning loop. Instead of calling tools in a single shot, an agent can reason over tools across multiple steps. This requires the agent to maintain state across the loop.
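
To make the idea of a stateful loop concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex implements its agents: `run_agent_loop`, `toy_policy`, and `toy_tools` are hypothetical names, and the "policy" is a hard-coded stand-in for the LLM. The point is only that the history of earlier tool calls is passed back in at every step.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One step of history: (tool name, tool input, observation).
Step = Tuple[str, str, str]


def run_agent_loop(
    choose_action: Callable[[str, List[Step]], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Call tools until the policy returns ("finish", answer) or steps run out."""
    history: List[Step] = []  # state carried across iterations
    for _ in range(max_steps):
        action, argument = choose_action(task, history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append((action, argument, observation))
    return None, history


def toy_policy(task: str, history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for the LLM: look something up once, then answer."""
    if not history:
        return "lookup", task
    return "finish", f"Answer based on: {history[-1][2]}"


toy_tools = {"lookup": lambda query: f"notes about {query}"}
answer, history = run_agent_loop(toy_policy, toy_tools, "Self-RAG")
print(answer)
```

A single-shot tool call corresponds to `max_steps=1` with no history; the loop only becomes useful once later decisions can depend on earlier observations.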

We will use the `ReActAgentWorker` implementation, wrapped in an `AgentRunner`. It drives tool use through a Thought/Action/Observation loop, as the traces later in this notebook show.

### Setup


In [None]:
%pip install llama-index==0.10.28
%pip install llama-index-llms-databricks
%pip install llama-index-embeddings-huggingface
%pip install llama-parse

In [None]:
import os
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
# databricks api key
api_key = ""

In [None]:
from llama_index.llms.databricks import Databricks
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
llm = Databricks(
    model="databricks-meta-llama-3-70b-instruct",
    api_key=api_key,
    api_base="https://<cluster_id>.cloud.databricks.com/serving-endpoints",
)

Settings.llm = llm
Settings.embed_model = embed_model


## Download 3 ICLR 2024 papers and parse them with LlamaParse

Let's parse 3 ICLR 2024 research papers using LlamaParse.

In [None]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

In [None]:
for url, paper in zip(urls, papers):
    !wget "{url}" -O "{paper}"

--2024-06-12 00:14:21--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘metagpt.pdf’


2024-06-12 00:14:22 (31.6 MB/s) - ‘metagpt.pdf’ saved [16911937/16911937]

--2024-06-12 00:14:22--  https://openreview.net/pdf?id=6PmJoRfdaK
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1168720 (1.1M) [application/pdf]
Saving to: ‘longlora.pdf’


2024-06-12 00:14:22 (4.18 MB/s) - ‘longlora.pdf’ saved [1168720/1168720]

--2024-06-12 00:14:22--  https://openreview.net/pdf?id=hSyW5go0v8
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP req

In [None]:
from llama_parse import LlamaParse

In [None]:
from typing import List

from llama_index.core.schema import Document


def _load_data(file_path: str) -> List[Document]:
    """Parse a PDF with LlamaParse and return one Document per page."""
    parser = LlamaParse(result_type="text")
    json_objs = parser.get_json_result(file_path)
    json_list = json_objs[0]["pages"]
    docs = []
    for item in json_list:
        # keep the page number so queries can later filter by page
        doc = Document(
            text=item["text"], metadata={"page_label": item["page"]}
        )
        docs.append(doc)
    return docs
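
To see what `_load_data` does without calling the LlamaParse service, the sketch below applies the same per-page mapping to a hand-written sample. The sample payload and the `pages_to_records` helper are hypothetical. The payload only mirrors the fields the function above reads (`pages`, `page`, `text`), and plain dicts stand in for `Document` objects.

```python
from typing import Any, Dict, List


def pages_to_records(json_objs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn a parsed-PDF payload into one record per page, tagged with its page number."""
    records = []
    for item in json_objs[0]["pages"]:
        records.append(
            {"text": item["text"], "metadata": {"page_label": item["page"]}}
        )
    return records


# Hypothetical payload shaped like the fields _load_data accesses.
sample_result = [
    {
        "pages": [
            {"page": 1, "text": "Abstract text"},
            {"page": 2, "text": "Introduction text"},
        ]
    }
]

records = pages_to_records(sample_result)
for record in records:
    print(record["metadata"]["page_label"], record["text"])
```

Storing the page number as `page_label` metadata is what later lets the vector tool restrict a search to specific pages.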

### Convert papers to Tools

In [None]:
# TODO: abstract all of this into a function that takes in a PDF file name

from llama_index.core import VectorStoreIndex, SummaryIndex
from llama_parse import LlamaParse
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import MetadataFilters, FilterCondition
from typing import List, Optional, Tuple


def get_doc_tools(
    file_path: str,
    name: str,
) -> Tuple[FunctionTool, QueryEngineTool]:
    """Get vector query and summary query tools from a document."""

    # load documents
    # documents = LlamaParse(result_type="text").load_data(file_path)
    documents = _load_data(file_path)
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex(nodes)

    def vector_query(
        query: str,
        page_numbers: Optional[List[int]] = None
    ) -> str:
        """Use to answer questions over a given paper.

        Useful if you have specific questions over the paper.
        Always leave page_numbers as None UNLESS there is a specific page you want to search for.

        Args:
            query (str): the string query to be embedded.
            page_numbers (Optional[List[int]]): Filter by set of pages. Leave as NONE
                if we want to perform a vector search
                over all pages. Otherwise, filter by the set of specified pages.

        """

        page_numbers = page_numbers or []
        metadata_dicts = [
            {"key": "page_label", "value": p} for p in page_numbers
        ]

        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(
                metadata_dicts,
                condition=FilterCondition.OR
            )
        )
        response = query_engine.query(query)
        return response


    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=(
            f"Useful for summarization questions related to {name}"
        ),
    )

    return vector_query_tool, summary_tool
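
The page filter in `vector_query` combines one `page_label` filter per requested page with an OR condition, and the docstring states that leaving `page_numbers` empty searches all pages. The plain-Python sketch below illustrates that selection logic on dicts. `filter_by_pages` and `sample_nodes` are hypothetical and do not reproduce the vector store's internals.

```python
from typing import Any, Dict, List, Optional


def filter_by_pages(
    nodes: List[Dict[str, Any]],
    page_numbers: Optional[List[int]] = None,
) -> List[Dict[str, Any]]:
    """Keep nodes whose page_label matches ANY requested page (OR semantics)."""
    page_numbers = page_numbers or []
    if not page_numbers:
        # No filter requested: every page is a candidate for the search.
        return list(nodes)
    wanted = set(page_numbers)
    return [node for node in nodes if node["page_label"] in wanted]


# Hypothetical nodes carrying the same metadata key used above.
sample_nodes = [
    {"page_label": 1, "text": "Abstract"},
    {"page_label": 8, "text": "Comparison with ChatDev"},
    {"page_label": 9, "text": "Conclusion"},
]

print([n["text"] for n in filter_by_pages(sample_nodes, [8])])
print(len(filter_by_pages(sample_nodes)))
```

In the real tool the similarity search then runs only over the nodes that survive this filter, with `similarity_top_k=2` limiting how many are returned.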

In [None]:
from pathlib import Path

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Started parsing the file under job_id cac11eca-0617-449c-98ef-c43594d8b8ff
Getting tools for paper: longlora.pdf
Started parsing the file under job_id cac11eca-06e5-45c6-9470-84d6903584a0
Getting tools for paper: selfrag.pdf
Started parsing the file under job_id cac11eca-4420-43e8-8b78-1bcab890bd15


## Set up an agent over 3 papers

We now set up our agent over the 3 papers. We do this by combining the vector and summary tools for each document into a single list and passing it to the agent.

In [None]:
initial_papers = ["metagpt.pdf", "selfrag.pdf", "longlora.pdf"]
initial_tools = [t for paper in initial_papers for t in paper_to_tools_dict[paper]]
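
As a quick illustration of what the agent ends up choosing from, the sketch below rebuilds only the tool names, using the same `vector_tool_{name}` / `summary_tool_{name}` pattern and `Path(paper).stem` as above. It creates no real tools; `tool_names` is a hypothetical variable for illustration.

```python
from pathlib import Path

initial_papers = ["metagpt.pdf", "selfrag.pdf", "longlora.pdf"]

# Same nesting order as the list comprehension above: papers outer, tools inner.
tool_names = [
    f"{kind}_tool_{Path(paper).stem}"
    for paper in initial_papers
    for kind in ("vector", "summary")
]

for tool_name in tool_names:
    print(tool_name)
```

With three papers the agent sees six tools in one flat list, so the paper name embedded in each tool name is the main signal it has for picking the right document.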

In [None]:
# tmp = paper_to_tools_dict["selfrag.pdf"][1]("summary")
# print(str(tmp))

In [None]:
from llama_index.core.agent import ReActAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = ReActAgentWorker.from_tools(
    initial_tools,
    # llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

We first query the agent.

In [None]:
response = agent.query(
    "Tell me about the evaluation dataset used in Self-RAG, and then tell me about the evaluation results"
)



APIConnectionError: Connection error.

In [None]:
print(str(response))

The evaluation dataset used in Self-RAG is not explicitly mentioned, but the model is evaluated on a diverse set of tasks, including Open-domain QA, reasoning, and fact verification tasks. The evaluation results show that Self-RAG significantly outperforms state-of-the-art LLMs and retrieval-augmented models on these tasks, and achieves significant gains in improving factuality and citation accuracy for long-form generations relative to these models.


In [None]:
response = agent.query("What are the MetaGPT comparisons with ChatDev described on page 8 of the MetaGPT paper?")
print(str(response))

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: vector_tool_metagpt
Action Input: {'query': 'What are the MetaGPT comparisons with ChatDev described on page 8 of the MetaGPT paper?', 'page_numbers': [8]}
Observation: The comparisons between MetaGPT and ChatDev are described in terms of several metrics, including executability, running times, token usage, code statistics, productivity, and human revision cost. Specifically, MetaGPT outperforms ChatDev in nearly all metrics, achieving a higher executability score, requiring less time, and having better code statistics, productivity, and human revision cost.
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The MetaGPT comparisons with ChatDev described on page 8 of the MetaGPT paper are in terms of several metrics, including executability, running times, token usage, code stat

In [None]:
response = agent.query(
    "Compare the complexity of the approaches in Self-RAG and MetaGPT. Which approach uses more tokens?"
)

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: summary_tool_selfrag
Action Input: {'input': 'Compare the complexity of the approaches in Self-RAG and MetaGPT. Which approach uses more tokens?'}
Observation: Error: Error code: 429 - {'error_code': 'REQUEST_LIMIT_EXCEEDED', 'message': 'REQUEST_LIMIT_EXCEEDED: Exceeded workspace rate limit for databricks-meta-llama-3-70b-instruct. Please use a provisioned throughput Foundation Model APIs endpoint for a higher rate limit.'}
Thought: It seems like I've hit a rate limit. Let me try a different tool to get the information I need.
Action: summary_tool_metagpt
Action Input: {'input': 'Compare the complexity of the approaches in Self-RAG and MetaGPT. Which approach uses more tokens?'}
Observation: It is difficult to directly compare the complexity of the approaches in Self-RAG and MetaGPT, as Self-RAG is not explicitly described in the provided context. However, based on the information available, it appears that MetaGPT is a more complex approach that likely uses more tokens.

MetaGPT involves a multi-agent system with role specialization, workflow, and structured communication, which suggests a more intricate architecture. Additionally, MetaGPT incorporates executable feedback, a self-correction mechanism, and a publish-subscribe mechanism, which adds to its complexity.

In terms of token usage, MetaGPT's approach seems to require more tokens due to the generation of documents, diagrams, and code. The iterative programming process also involves writing and executing code, which would require more tokens.

While it is difficult to make a direct comparison without more information about Self-RAG, it can be inferred that MetaGPT's approach is more complex an

In [None]:
print(str(response))

MetaGPT is a more complex approach compared to Self-RAG, and it uses more tokens. Specifically, MetaGPT typically uses 24,613 or 31,255 tokens to generate code, whereas the token usage in Self-RAG is unknown.
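
The trace for this query shows the serving endpoint returning a 429 `REQUEST_LIMIT_EXCEEDED` error partway through the loop, after which the agent answered from the MetaGPT summary alone. One common way to soften rate limits is to retry with exponential backoff. The sketch below is a minimal, library-independent illustration: `RateLimitError` and `call_with_backoff` are hypothetical names, not part of LlamaIndex or the Databricks client, and real code would catch whatever exception the client actually raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's 429 exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting base_delay * 2**attempt between rate-limited attempts."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2**attempt)
    raise RuntimeError("unreachable")


# Demo: a call that is rate limited twice, then succeeds.
# Delays are recorded instead of slept so the example runs instantly.
attempts = {"count": 0}
delays = []


def flaky_query() -> str:
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError("REQUEST_LIMIT_EXCEEDED")
    return "summary of Self-RAG"


result = call_with_backoff(flaky_query, sleep=delays.append)
print(result, delays)
```

Wrapping a tool's query call this way would let the Self-RAG summary tool succeed on a later attempt instead of leaving the agent to compare the two papers with only one side available.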
