# Lesson 2: Tool Calling

In the previous lesson we showed how to use LLMs to decide which pipeline to use. That was a simplified example of tool calling. Here we extend the concept: we use an LLM not only to select a function, but also to infer the arguments to pass to it.

**Tool calling** describes an interface that allows an LLM to interact with external environments. In a basic RAG pipeline, LLMs are used only for synthesis. In the previous lesson we used them in a slightly more sophisticated fashion, namely to pick functions to execute.

Allowing an LLM to infer the argument to be passed to a function adds a layer of query understanding on top of a RAG pipeline. This allows the LLM to figure out how to use a vector DB instead of just consuming its output.

## Setup

In [1]:
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio
nest_asyncio.apply()

## 1. Define a Simple Tool

We will define a tool interface from a function. The LLM will automatically infer the parameters from the signature of the function. We consider two toy calculator functions. The main LlamaIndex abstraction is `FunctionTool`, which wraps a function, in this case via the `from_defaults(fn=...)` method. The two functions have type annotations and docstrings. This is important because this information is used to prompt the LLM.

In [3]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int: 
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
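
As a rough illustration of why the type annotations and docstrings matter, the sketch below uses only the standard library to collect the name, description, and parameter types of a function. The `describe_function` helper is hypothetical and is not how `FunctionTool` is implemented; it only shows what information a function's signature and docstring expose.

```python
import inspect


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def describe_function(fn) -> dict:
    """Hypothetical helper: collect what a signature and docstring expose."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        # Map each parameter name to the name of its annotated type.
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


description = describe_function(add)
print(description)
```

Without the annotations and the docstring, such a description would carry only parameter names, which gives a model much less to work with when choosing a tool and filling in its arguments.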

To pass the tools to an LLM, we import the LLM class and call `predict_and_call()`. This method takes a list of tools together with a prompt string or a series of chat messages. The LLM then decides which tool to call, the tool is executed, and the final response is returned.

In [4]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool], 
    "Tell me the output of the mystery function on 2 and 9", 
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
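
The log above shows the two pieces the LLM produced: a tool name and a JSON object of arguments. A minimal sketch of the dispatch step that follows, assuming the model's choice arrives as a name plus a JSON string (the `TOOLS` registry and the `execute_tool_call` helper are hypothetical and not part of LlamaIndex):

```python
import json


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"add": add, "mystery": mystery}


def execute_tool_call(name: str, raw_arguments: str):
    """Look up the selected tool and call it with the inferred arguments."""
    arguments = json.loads(raw_arguments)
    return TOOLS[name](**arguments)


# The call logged above: mystery with args {"x": 2, "y": 9}.
result = execute_tool_call("mystery", '{"x": 2, "y": 9}')
print(result)  # (2 + 9) * (2 + 9) = 121
```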


## 2. Define an Auto-Retrieval Tool

Let's apply the same concept to a slightly more sophisticated tool built on top of vector search. Not only can the LLM choose vector search, but it can also infer metadata filters: a structured list of tags that helps return a more precise set of results.

### Load Data

We use the same paper used in lesson 1. Here we will pay more attention to the individual chunks, as these contain the metadata.

To download the paper yourself, uncomment and run the following command:

#!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

**Note**: The PDF file is included with this lesson. To access it, go to the `File` menu and select `Open...`.

In [5]:
from llama_index.core import SimpleDirectoryReader
# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [6]:
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

Let's take a look at the content of the first chunk. The option `metadata_mode="all"` shows the metadata attached to the document, which is propagated to every node. The metadata appears at the very top of the output. The page label differs from node to node.

In [7]:
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-07-13
last_modified_date: 2024-06-24

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks, however, 
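
The layout of the output above (a block of `key: value` lines followed by the chunk text) can be mimicked with a simplified sketch. It assumes each chunk is a plain dict; the `render_with_metadata` helper and the toy chunks are hypothetical and only imitate the visible format, not LlamaIndex's implementation.

```python
def render_with_metadata(text: str, metadata: dict) -> str:
    """Hypothetical helper: put `key: value` lines above the chunk text."""
    header = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return f"{header}\n\n{text}"


# Document-level metadata shared by all chunks of the same file.
file_metadata = {"file_name": "metagpt.pdf", "file_type": "application/pdf"}

# Each toy chunk adds its own page label on top of the shared metadata.
chunks = [
    {"text": "Preprint ...", "metadata": {"page_label": "1", **file_metadata}},
    {"text": "... more text ...", "metadata": {"page_label": "2", **file_metadata}},
]

rendered = render_with_metadata(chunks[0]["text"], chunks[0]["metadata"])
print(rendered)
```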

Next we define a vector store index, which builds a RAG indexing pipeline over these nodes, adds an embedding for each node, and gives back a query engine. Unlike in lesson 1, we can query this pipeline with metadata filters, which is done via the `MetadataFilters` class.

In [8]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [9]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What are some high-level results of MetaGPT?", 
)

In [10]:
print(str(response))

MetaGPT achieves a new state-of-the-art (SoTA) in code generation benchmarks with 85.9% and 87.7% in Pass@1. It stands out in handling higher levels of software complexity and offering extensive functionality. In experimental evaluations, MetaGPT achieves a 100% task completion rate, demonstrating robustness and efficiency in terms of time and token costs.


If we inspect the metadata of the returned nodes, we can see that the results were only extracted from page 2.

In [11]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-07-13', 'last_modified_date': '2024-06-24'}
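
The effect of a metadata filter can be sketched without a vector store: first keep only the nodes whose metadata matches, then rank the survivors by similarity. The snippet below is a toy illustration under that assumption. The node dicts, the made-up `score` values, and the `filtered_top_k` helper are all hypothetical, and a real vector store may apply filters differently.

```python
def filtered_top_k(nodes, key, value, top_k=2):
    """Keep nodes whose metadata[key] == value, then return the top_k by score."""
    matching = [node for node in nodes if node["metadata"].get(key) == value]
    return sorted(matching, key=lambda node: node["score"], reverse=True)[:top_k]


# Toy nodes with made-up similarity scores for some query.
toy_nodes = [
    {"id": "a", "score": 0.91, "metadata": {"page_label": "1"}},
    {"id": "b", "score": 0.85, "metadata": {"page_label": "2"}},
    {"id": "c", "score": 0.40, "metadata": {"page_label": "2"}},
    {"id": "d", "score": 0.88, "metadata": {"page_label": "3"}},
]

results = filtered_top_k(toy_nodes, key="page_label", value="2")
print([node["id"] for node in results])  # ['b', 'c']
```

Nodes `a` and `d` score higher than `c`, but they are never considered because they fail the page filter.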


### Define the Auto-Retrieval Tool

We will create a function that integrates metadata filters into a retrieval tool. The function enables more precise retrieval by accepting a query string and a list of metadata filter values, here page numbers. The LLM can then infer the relevant filters from the user's query. More generally, metadata can consist of section IDs, headers, footers, etc. More capable models such as GPT-4 tend to be better at inferring many metadata filters at once.

Here we define a `vector_query()` function that we then pass to `FunctionTool.from_defaults()`.

In [12]:
from typing import List
from llama_index.core.vector_stores import FilterCondition


def vector_query(
    query: str, 
    page_numbers: List[str]
) -> str:
    """Perform a vector search over an index.
    
    query (str): the string query to be embedded.
    page_numbers (List[str]): Filter by set of pages. Leave BLANK if we want to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.
    
    """

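    # Build one exact-match filter per requested page.
    # An empty list produces no filters, so the search is meant to cover all pages.
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]

    # Combine the filters with OR so a node on any requested page matches.
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response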
    

vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
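
The `FilterCondition.OR` setting matters when several pages are requested: a node has exactly one `page_label`, so requiring all page filters to match at once would exclude every node. A toy sketch of the two combination rules in plain Python (the `matches` helper and the sample metadata are hypothetical, not LlamaIndex code):

```python
def matches(metadata: dict, filters: list, condition: str = "or") -> bool:
    """Check exact-match filters against one node's metadata."""
    if not filters:
        # No filters means no restriction: every node passes.
        return True
    checks = [metadata.get(f["key"]) == f["value"] for f in filters]
    return any(checks) if condition == "or" else all(checks)


toy_metadata = [{"page_label": str(page)} for page in range(1, 5)]
page_filters = [{"key": "page_label", "value": p} for p in ["2", "3"]]

or_pages = [m["page_label"] for m in toy_metadata if matches(m, page_filters, "or")]
and_pages = [m["page_label"] for m in toy_metadata if matches(m, page_filters, "and")]
all_pages = [m["page_label"] for m in toy_metadata if matches(m, [], "or")]

print(or_pages)   # ['2', '3']
print(and_pages)  # []
print(all_pages)  # ['1', '2', '3', '4']
```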

In [13]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool], 
    "What are the high-level results of MetaGPT as described on page 2?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2"]}
=== Function Output ===
MetaGPT achieves a new state-of-the-art (SoTA) in code generation benchmarks with 85.9% and 87.7% in Pass@1. It stands out in handling higher levels of software complexity and offering extensive functionality, achieving a 100% task completion rate in experimental evaluations.


In [14]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-07-13', 'last_modified_date': '2024-06-24'}


## Let's add some other tools!

We can combine the vector tool with the summary tool from lesson 1 to build an overall tool-picking system. The code below builds a summary index over the same set of nodes and wraps it into a summary tool.

In [15]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of MetaGPT"
    ),
)

Note that, in the case below, the pipeline uses the vector tool.

In [16]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What are the MetaGPT comparisons with ChatDev described on page 8?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "MetaGPT comparisons with ChatDev", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT outperforms ChatDev in various aspects such as executability, running times, token usage, code statistic, productivity, and human revision cost. It achieves a higher score in executability, takes less time for software generation, uses more tokens but is more efficient in generating code lines, produces more code files and lines of code per file, and requires less human revision cost compared to ChatDev.


In [17]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-07-13', 'last_modified_date': '2024-06-24'}


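Here, instead, the summary tool is used. Note that in this run the LLM passed an invented statement about the paper as the tool input rather than the user's request, so the output below is not an actual summary.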

In [18]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What is a summary of the paper?", 
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "The paper discusses the impact of climate change on biodiversity and ecosystems."}
=== Function Output ===
The paper does not discuss the impact of climate change on biodiversity and ecosystems.
