# LlamaIndex with Tool Calling

-   Type hints and docstrings are a MUST for the tool to work properly
-   You can pre-filter the index by metadata, e.g. `page_label`, before the query engine retrieves nodes for the LLM
-   Similarly, you can let the user pass a list of metadata values to filter the index
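-   Still, you need a capable model for tool calling to work reliably: in the last cells, `llama3` routes a page-specific question to `summary_tool` and the call returns an error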
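
To see why type hints and docstrings matter, here is a minimal sketch of the kind of information a tool wrapper can read from a plain function. `describe_tool` is a hypothetical helper written for illustration; it is not how `FunctionTool` is implemented, only a stand-in using the standard library.

```python
import inspect
from typing import get_type_hints


def add(x: int, y: int) -> int:
    """Adds two integers together"""
    return x + y


def describe_tool(fn):
    """Hypothetical helper: collect what a wrapper can learn from a function."""
    hints = get_type_hints(fn)
    return_type = hints.pop("return", None)
    return {
        "name": fn.__name__,
        # Without a docstring the model gets no description of the tool
        "description": inspect.getdoc(fn) or "",
        # Without type hints the argument schema would be empty
        "parameters": {
            name: getattr(tp, "__name__", str(tp)) for name, tp in hints.items()
        },
        "returns": getattr(return_type, "__name__", None),
    }


spec = describe_tool(add)
print(spec)
```

If the hints or the docstring are missing, the corresponding fields come back empty, which leaves the model guessing about what the tool does and what to pass.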


In [1]:
import nest_asyncio

nest_asyncio.apply()
!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

--2024-06-21 08:59:08--  https://openreview.net/pdf?id=VtmBAGCN7o
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘metagpt.pdf’


2024-06-21 08:59:10 (8.95 MB/s) - ‘metagpt.pdf’ saved [16911937/16911937]



## Simple Tool

In [38]:
from llama_index.core.tools import FunctionTool


# Type hints and docstrings are a MUST for the tool to work properly
def add(x: int, y: int) -> int:
    """Adds two integers together"""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on two integers"""
    return (x + y) * (x + y)


add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
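
Conceptually, a tool call boils down to a tool name plus a dict of arguments (the `Action` and `Action Input` lines in a verbose trace). The sketch below shows that dispatch step in plain Python. `TOOLS` and `dispatch` are hypothetical names for illustration, not part of LlamaIndex.

```python
from typing import Any, Callable, Dict


def add(x: int, y: int) -> int:
    """Adds two integers together"""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on two integers"""
    return (x + y) * (x + y)


# Hypothetical registry: tool name -> Python callable
TOOLS: Dict[str, Callable[..., Any]] = {"add": add, "mystery": mystery}


def dispatch(action: str, action_input: Dict[str, Any]) -> Any:
    """Run the named tool with the given keyword arguments."""
    if action not in TOOLS:
        raise KeyError(f"Unknown tool: {action}")
    return TOOLS[action](**action_input)


result = dispatch("mystery", {"x": 3, "y": 4})
print(result)  # (3 + 4) * (3 + 4) = 49
```

The model's job is only to pick the name and fill in the arguments; the arithmetic itself runs in ordinary Python.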

In [39]:
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3")
response = llm.predict_and_call(
    [add_tool, mystery_tool],
    "Tell me the output of the mystery function on 3 and 4",
    verbose=True,
)
print(str(response))

Thought: The current language of the user is not specified. I need to use a tool to help me answer the question.
Action: mystery
Action Input: {'x': 3, 'y': 4}
Observation: 49
49


## Auto-Retrieval Tool

In [40]:
from llama_index.core.readers import SimpleDirectoryReader

documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [41]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: data/metagpt.pdf
file_type: application/pdf
file_size: 16715764
creation_date: 2024-06-21
last_modified_date: 2024-06-21

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks, howe

In [42]:

from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding

Settings.llm = Ollama(model="llama3")
Settings.embed_model = OllamaEmbedding(model_name="llama3")

# import os
# from llama_index.llms.groq import Groq

# llm = Groq(model="llama3-70b-8192",api_key=os.getenv("GROQ_API_KEY"))
# Settings.llm = llm

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [43]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts([{"key": "page_label", "value": "2"}]),
)

response = query_engine.query("What are some high-level results of MetaGPT?")

In [44]:
print(str(response))

for n in response.source_nodes:
    print(n.metadata)

MetaGPT achieves a state-of-the-art (SoTA) with 85.9% and 87.7% in Pass@1 for code generation benchmarks, demonstrating its robustness and efficiency. Additionally, it achieves a 100% task completion rate in experimental evaluations.
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16715764, 'creation_date': '2024-06-21', 'last_modified_date': '2024-06-21'}
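
The effect of the metadata filter can be sketched without any library: drop every node whose metadata does not match, then rank only the survivors by similarity. The toy vectors and the `filtered_top_k` helper are assumptions made for illustration; the real vector store may do this differently.

```python
import math
from typing import Dict, List, Tuple

# Toy nodes: (metadata, embedding). The vectors are made up for illustration.
NODES: List[Tuple[Dict[str, str], List[float]]] = [
    ({"page_label": "1"}, [1.0, 0.0]),
    ({"page_label": "2"}, [0.6, 0.8]),
    ({"page_label": "2"}, [0.0, 1.0]),
    ({"page_label": "3"}, [0.8, 0.6]),
]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def filtered_top_k(query_vec: List[float], key: str, value: str, k: int) -> List[int]:
    """Keep nodes whose metadata[key] == value, then rank the survivors."""
    candidates = [
        (i, vec) for i, (meta, vec) in enumerate(NODES) if meta.get(key) == value
    ]
    ranked = sorted(
        candidates, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )
    return [i for i, _ in ranked[:k]]


hits = filtered_top_k([1.0, 0.0], key="page_label", value="2", k=2)
print(hits)
```

Node 0 is the closest match to the query overall, but it sits on page 1 and never reaches the ranking step. Note that `page_label` is a string in the metadata printed above, so in this sketch the filter value has to be `"2"` rather than `2`.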


## Enhanced Auto-Retrieval Tool


In [45]:
from typing import List, Optional
from llama_index.core.vector_stores import FilterCondition


def vector_query(query: str, page_numbers: Optional[List[str]] = None) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (Optional[List[str]]): Filter by set of pages. Leave as None to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.

    """

    # Treat a missing value as "no page filter"
    page_numbers = page_numbers or []
    metadata_dicts = [{"key": "page_label", "value": p} for p in page_numbers]

    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts, condition=FilterCondition.OR
        ),
    )
    # The Response object itself is returned (despite the `str` hint) so that
    # `source_nodes` can be inspected afterwards
    response = query_engine.query(query)
    return response


vector_query_tool = FunctionTool.from_defaults(name="vector_tool", fn=vector_query)

In [46]:
response = llm.predict_and_call(
    [vector_query_tool],
    "What are the high-level results of MetaGPT as described on page 2?",
    verbose=True,
)

Thought: The current language of the user is English. I need to use a tool to help me answer the question.
Action: vector_tool
Action Input: {'query': 'high-level results MetaGPT', 'page_numbers': ['2']}
Observation: 85.9% and 87.7%.

In [47]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16715764, 'creation_date': '2024-06-21', 'last_modified_date': '2024-06-21'}
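
Why `FilterCondition.OR`? Each node carries exactly one `page_label`, so a list of several pages only makes sense if a node may match any one of them. The sketch below shows the difference with a hypothetical `matches` helper; it illustrates the logic only and is not LlamaIndex's filter implementation.

```python
from typing import Dict, List


def matches(
    metadata: Dict[str, str], filters: List[Dict[str, str]], condition: str = "and"
) -> bool:
    """Hypothetical exact-match check that combines several filters."""
    if not filters:
        return True  # assumption: no filters means no restriction
    results = [metadata.get(f["key"]) == f["value"] for f in filters]
    return any(results) if condition == "or" else all(results)


page_filters = [{"key": "page_label", "value": p} for p in ["2", "8"]]
pages = [{"page_label": str(n)} for n in range(1, 10)]

with_or = [m["page_label"] for m in pages if matches(m, page_filters, "or")]
with_and = [m["page_label"] for m in pages if matches(m, page_filters, "and")]
print(with_or)
print(with_and)
```

With OR, pages 2 and 8 survive; with AND, nothing does, because no single node can be on page 2 and page 8 at once.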


## Extra Tools

In [48]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=("Useful if you want to get a summary of MetaGPT"),
)
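
As a rough mental model of `response_mode="tree_summarize"`: groups of chunks are summarized, then the summaries are summarized again, until one answer remains. The sketch below uses a fixed `fan_in` and a toy `toy_summarize` function in place of an LLM call. Both are assumptions for illustration; how LlamaIndex actually groups chunks is not shown here.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str], summarize: Callable[[List[str]], str], fan_in: int = 2
) -> str:
    """Combine chunks level by level until a single summary remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        # Each group in a level is independent of the others
        level = [
            summarize(level[i : i + fan_in]) for i in range(0, len(level), fan_in)
        ]
    return level[0]


def toy_summarize(texts: List[str]) -> str:
    """Stand-in for an LLM call: wrap the joined inputs in brackets."""
    return "[" + " + ".join(texts) + "]"


summary = tree_summarize(["a", "b", "c", "d", "e"], toy_summarize)
print(summary)
```

The nesting of brackets in the printed string shows the tree shape. Because the calls within one level do not depend on each other, they are natural candidates for running concurrently.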

In [49]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What are the MetaGPT comparisons with ChatDev described on page 8?",
    verbose=True,
)

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: summary_tool
Action Input: {'input': 'page 8'}
Observation: Error:

In [50]:
for n in response.source_nodes:
    print(n.metadata)

In [51]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], "What is a summary of the paper?", verbose=True
)

Thought: The current language of the user is English. I need to use a tool to help me answer the question.
Action: summary_tool
Action Input: {'input': 'The paper'}
Observation: Error: