# Lesson 2: Tool Calling

## Setup

In [1]:
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio
nest_asyncio.apply()

## 1. Define a Simple Tool

This lesson gives a basic overview of tool calling. We will show how to define a tool interface from a Python function; using LlamaIndex abstractions, the LLM automatically infers the parameters from the function's signature.

To illustrate this, let's define two toy calculator functions, `add()` and `mystery()`. The core abstraction in LlamaIndex is the `FunctionTool`, which wraps any Python function we give it. Here we wrap both `add()` and `mystery()` in a `FunctionTool`.

Both `add()` and `mystery()` have type annotations for the `x` and `y` parameters, as well as a docstring. This is not just stylistic: the annotations and the docstring are used as part of the prompt for the LLM.

In [3]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int: 
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
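
To make concrete why the annotations and docstring matter, here is a minimal sketch of the kind of information that can be read off a Python function with the standard `inspect` module. It is an illustration only, not how LlamaIndex builds its tool schema internally; `describe_tool` is a hypothetical helper introduced for this example.

```python
import inspect


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def describe_tool(fn):
    """Hypothetical helper: collect a function's name, docstring, and parameter types."""
    signature = inspect.signature(fn)
    parameters = {
        # Use the type's short name when the annotation is a class such as int.
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": parameters,
    }


tool_spec = describe_tool(add)
print(tool_spec)
```

Without the annotations or the docstring, the description would carry much less information for a model to work with, which is why both are worth writing carefully.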

`FunctionTool` integrates natively with the function calling capabilities of many LLMs, including OpenAI's. To pass the tools to an LLM, we import the LLM module and then call `predict_and_call()`. This code snippet imports the OpenAI module explicitly, and the model is `gpt-3.5-turbo`.

Then we call `predict_and_call()` on the LLM. It takes a set of tools together with either an input prompt string or a series of chat messages. It then decides which tool to call, calls it, and returns the final response.

In the verbose output we see the intermediate step: `mystery()` is called with the arguments `x=2` and `y=9`, so the LLM picked the right tool and inferred the right parameters. The output is `11 * 11 = 121`, which is the right answer. Note that this simple example is an expanded version of the router from Lesson 1: the LLM not only picks the tool, but also decides what parameters to give it.

In [4]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool], 
    "Tell me the output of the mystery function on 2 and 9", 
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
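
The "pick a tool, then call it" step in the output above can be sketched without any LLM. In this simplified illustration the model's decision is replaced by a hard-coded dictionary copied from the verbose output; `TOOLS`, `tool_call`, and `execute_tool_call` are hypothetical names for this example and are not part of LlamaIndex.

```python
import json


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


# Registry mapping tool names to the Python callables behind them.
TOOLS = {"add": add, "mystery": mystery}

# Stand-in for the model's decision: a tool name plus JSON-encoded arguments.
tool_call = {"name": "mystery", "arguments": '{"x": 2, "y": 9}'}


def execute_tool_call(call, tools):
    """Look up the named tool and call it with the decoded arguments."""
    fn = tools[call["name"]]
    kwargs = json.loads(call["arguments"])
    return fn(**kwargs)


result = execute_tool_call(tool_call, TOOLS)
print(result)  # 121
```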


## 2. Define an Auto-Retrieval Tool

### Load Data

Let's use this key concept to define a slightly more sophisticated agentic layer on top of vector search. The LLM can choose vector search, and we can also get it to infer metadata filters, a structured list of tags that helps return a more precise set of search results.

We will use the MetaGPT PDF. Pay attention to the nodes (the chunks) themselves, because we will look at the actual metadata attached to them.

Similar to the last lesson, we will use `SimpleDirectoryReader()` from LlamaIndex to load in the parsed representation of the PDF file. Next we will use `SentenceSplitter()` to split these documents into a set of even chunks, with a chunk size of 1024. Each node represents a chunk, and let's look at the content of an example chunk.

To download this paper, below is the needed code:

#!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

**Note**: The PDF file is included with this lesson. To access it, go to the `File` menu and select `Open...`.

In [5]:
from llama_index.core import SimpleDirectoryReader
# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [6]:
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

Let's look at the content of the first chunk using `node.get_content()`. We set `metadata_mode="all"`, a special setting that prints the content of the node together with the metadata attached to the document, which is propagated to every node.

Printing this gives a parsed representation of the paper's front page, with the metadata shown at the very top: `page_label: 1`, `file_name: metagpt.pdf`, `file_path`, `file_type: application/pdf`, `file_size`, and the creation and last-modified dates.

We will pay special attention to the page labels: if we try a different node, we get back a different page number. In other words, every chunk carries a page number annotation.

In [7]:
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-10-22
last_modified_date: 2024-06-24

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks, however, 

In [8]:
print(nodes[4].get_content(metadata_mode="all"))

page_label: 4
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-10-22
last_modified_date: 2024-06-24

Preprint
Figure 2: An example of the communication protocol (left) and iterative programming with exe-
cutable feedback (right). Left: Agents use a shared message pool to publish structured messages.
They can also subscribe to relevant messages based on their profiles. Right : After generating the
initial code, the Engineer agent runs and checks for errors. If errors occur, the agent checks past
messages stored in memory and compares them with the PRD, system design, and code files.
3 M ETAGPT: A M ETA-PROGRAMMING FRAMEWORK
MetaGPT is a meta-programming framework for LLM-based multi-agent systems. Sec. 3.1 pro-
vides an explanation of role specialization, workflow and structured communication in this frame-
work, and illustrates how to organize a multi-agent system within the context of SOPs. Sec. 3.2
presents a communicati
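
The way page-level metadata ends up on every chunk can be mimicked with a toy splitter. This is a deliberately simplified sketch: it splits on a fixed number of words, whereas `SentenceSplitter` works with a token budget and sentence boundaries. The function `split_with_metadata` and the toy pages are assumptions made for illustration.

```python
def split_with_metadata(pages, chunk_words=5):
    """Split each page into fixed-size word chunks; each chunk copies its page's metadata."""
    chunks = []
    for page in pages:
        words = page["text"].split()
        for start in range(0, len(words), chunk_words):
            chunks.append(
                {
                    "text": " ".join(words[start:start + chunk_words]),
                    # Copy so that chunks do not share one mutable dict.
                    "metadata": dict(page["metadata"]),
                }
            )
    return chunks


# Toy stand-ins for parsed PDF pages.
pages = [
    {"text": "one two three four five six seven",
     "metadata": {"page_label": "1", "file_name": "toy.pdf"}},
    {"text": "eight nine",
     "metadata": {"page_label": "2", "file_name": "toy.pdf"}},
]

chunks = split_with_metadata(pages)
labels = [chunk["metadata"]["page_label"] for chunk in chunks]
print(labels)  # ['1', '1', '2']
```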

Next we define a vector store index over these nodes, which builds a RAG indexing pipeline: it adds an embedding for each node, and from it we get back a query engine.

Unlike in Lesson 1, we can query this RAG pipeline with metadata filters. To show how metadata filtering works, we import `MetadataFilters` and specify a filter where `page_label` is `"2"`, in addition to `similarity_top_k=2`.

Once we define this query engine and ask for some high-level results of MetaGPT, we get back a response. We first look at the response string, which outlines the overall results of MetaGPT.

Crucially, we then look at the page number in `source_nodes`. Iterating through the source nodes, we print out the metadata attached to each. We see that the filter restricts the search to pages where `page_label` is `"2"`.

In [9]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [10]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What are some high-level results of MetaGPT?", 
)

In [11]:
print(str(response))

MetaGPT achieves a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1, surpassing other popular frameworks like AutoGPT, LangChain, AgentVerse, and ChatDev. Additionally, MetaGPT demonstrates robustness and efficiency by achieving a 100% task completion rate in experimental evaluations, highlighting its effectiveness in handling higher levels of software complexity and offering extensive functionality.


In [12]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-10-22', 'last_modified_date': '2024-06-24'}
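
Conceptually, a filtered vector query combines two steps: keep only the nodes whose metadata matches, then rank the survivors by similarity to the query and take the top `k`. The sketch below shows that idea with toy two-dimensional embeddings and cosine similarity. It is an assumption-laden illustration rather than a description of what any particular vector store does (stores differ in when and how they apply filters); `filtered_top_k` and `toy_nodes` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def filtered_top_k(query_vec, nodes, key, value, k):
    """Keep nodes whose metadata[key] equals value, then return the k most similar."""
    candidates = [n for n in nodes if n["metadata"].get(key) == value]
    ranked = sorted(
        candidates,
        key=lambda n: cosine(query_vec, n["embedding"]),
        reverse=True,
    )
    return ranked[:k]


toy_nodes = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"page_label": "1"}},
    {"id": "b", "embedding": [0.9, 0.1], "metadata": {"page_label": "2"}},
    {"id": "c", "embedding": [0.0, 1.0], "metadata": {"page_label": "2"}},
]

hits = filtered_top_k([1.0, 0.0], toy_nodes, "page_label", "2", k=2)
hit_ids = [n["id"] for n in hits]
print(hit_ids)  # ['b', 'c']
```

Node `a` is the closest match to the query vector, but it is excluded because its page label is `"1"`.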


### Define the Auto-Retrieval Tool

We will define a Python function that encapsulates the data retrieval tool. We define a function called `vector_query`, which takes in the query and the page numbers. This allows us to perform a vector search over an index, along with specifying page numbers as a metadata filter. 

At the very end, we define `vector_query_tool` with `FunctionTool.from_defaults()`, passing in `vector_query()`, which allows us to use it with a language model.

In [13]:
from typing import List
from llama_index.core.vector_stores import FilterCondition


def vector_query(
    query: str, 
    page_numbers: List[str]
) -> str:
    """Perform a vector search over an index.
    
    query (str): the string query to be embedded.
    page_numbers (List[str]): Filter by set of pages. Leave BLANK if we want to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.
    
    """

    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]
    
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response
    

vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
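
The `FilterCondition.OR` in `vector_query()` means a node qualifies if its page label matches any of the requested pages. A plain-Python sketch of that matching rule follows. Treating an empty list as "no filtering" is an assumption taken from the intent stated in the docstring, not a verified statement about library behavior; `matches_any_page` is a hypothetical helper.

```python
def matches_any_page(metadata, page_numbers):
    """Return True if the node's page_label equals any requested page (OR semantics)."""
    if not page_numbers:
        # Assumption for this sketch: an empty list means "search all pages".
        return True
    return any(metadata.get("page_label") == page for page in page_numbers)


# Toy metadata for a five-page document; page labels are strings, as in the lesson.
toy_metadata = [{"page_label": str(i)} for i in range(1, 6)]

selected = [m["page_label"] for m in toy_metadata if matches_any_page(m, ["2", "4"])]
unfiltered = [m["page_label"] for m in toy_metadata if matches_any_page(m, [])]

print(selected)    # ['2', '4']
print(unfiltered)  # ['1', '2', '3', '4', '5']
```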

Let's try calling this tool with `gpt-3.5-turbo`. We will find that the LLM is able to infer both the query string and the metadata filters. We call `predict_and_call()` with `vector_query_tool` and ask the same question as before, this time about the high-level results of MetaGPT as described on page 2.

We see that the LLM formulates the right query, "high-level results of MetaGPT", and specifies the page numbers, here `["2"]`. We get back the correct answer. As before, we verify the source nodes: one source node is returned and its page label is 2.

In [14]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool], 
    "What are the high-level results of MetaGPT as described on page 2?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2"]}
=== Function Output ===
MetaGPT achieves a new state-of-the-art (SoTA) in code generation benchmarks with 85.9% and 87.7% in Pass@1. It stands out in handling higher levels of software complexity and offering extensive functionality, demonstrating a 100% task completion rate in experimental evaluations.


In [15]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-10-22', 'last_modified_date': '2024-06-24'}


## Let's add some other tools!

Finally, we can bring in the summary tool from the router example in Lesson 1 and combine it with the vector tool to create an overall tool-picking system. This code sets up a summary index over the same set of nodes and wraps it in a summary tool.

In [16]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of MetaGPT"
    ),
)

Now let's try tool calling again. The LLM has the slightly harder task of picking the right tool in addition to inferring the function parameters. We ask "What are the MetaGPT comparisons with ChatDev described on page 8?" It still calls the vector tool, with `page_numbers` set to `["8"]`, and gives back the right answer. We can verify this by printing out the source nodes.

In [17]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What are the MetaGPT comparisons with ChatDev described on page 8?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "MetaGPT comparisons with ChatDev", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT outperforms ChatDev in various aspects such as executability, running times, token usage, code statistic, productivity, and human revision cost. It demonstrates superior performance in generating code efficiently and effectively compared to ChatDev.


In [18]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-10-22', 'last_modified_date': '2024-06-24'}


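Lastly, we can ask "What is a summary of the paper?" to show that the LLM can still pick the summary tool when necessary. In the run shown below it does pick `summary_tool`. However, the `input` it passes is a guess about the paper's topic rather than a request for a summary, so the function output only rejects that guess instead of summarizing the paper.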

In [19]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What is a summary of the paper?", 
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "This paper discusses the impact of climate change on global food security and explores potential solutions to address this issue."}
=== Function Output ===
The paper does not discuss the impact of climate change on global food security or explore potential solutions to address this issue.
