<a href="https://colab.research.google.com/github/abdulsamadkhan/AgenticRag/blob/main/ToolCalling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Agentic RAG Application using LlamaIndex - Tool Calling

In the last tutorial, we learned how to use an LLM to pick one of several RAG pipelines using RouterQueryEngine. In this tutorial, we will see how an LLM can not only pick the function or RAG pipeline to execute but also infer its arguments. The last tutorial was therefore a simple case of tool calling; in this tutorial, we will use tool calling with arguments.

Note that tool calling adds a layer of understanding on top of the RAG pipeline, which enables users to ask complex queries and get back more precise answers.


### Tool Calling
Tool calling enables an LLM to interact with the external environment through a dynamic interface. Through this interface, the LLM not only selects the appropriate function but also infers the arguments needed to execute it.
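
To make the idea concrete, here is a minimal sketch of the mechanics in plain Python. It assumes the model's decision arrives as a JSON string naming a tool and its arguments. The `TOOLS` registry, the `dispatch` helper, the `multiply` function, and the `model_output` string are illustrative stand-ins, not part of LlamaIndex or the OpenAI API.

```python
import json


def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y


def multiply(x: int, y: int) -> int:
    """Multiply two integers."""
    return x * y


# Hypothetical registry mapping tool names to Python callables
TOOLS = {"add": add, "multiply": multiply}


def dispatch(tool_call_json: str):
    """Parse a tool call and run the named tool with the supplied arguments."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]          # tool selection
    return fn(**call["arguments"])    # argument passing


# Stand-in for what a model might emit for "What is 4 times 5?"
model_output = '{"name": "multiply", "arguments": {"x": 4, "y": 5}}'
result = dispatch(model_output)
print(result)
```

The model only decides which tool to use and with which arguments; the surrounding program is what actually executes the function.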

# Requirements
**Installing libraries:** This will install the necessary libraries.

In [None]:
!pip install llama-index==0.10.27
!pip install llama-index-llms-openai==0.1.15
!pip install llama-index-embeddings-openai==0.1.7

We'll also use the nest-asyncio library, since LlamaIndex uses a lot of asyncio functionality in the background.

In [2]:
import nest_asyncio
nest_asyncio.apply()

**Setting the OpenAI key:** Load the OpenAI API key in the Colab environment.

In [None]:
#loading openAI API Key
!pip install openai
import openai
from google.colab import userdata
openai.api_key = userdata.get('OPENAI_KEY')
from openai import OpenAI

client = OpenAI(
    # This is the default and can be omitted
    api_key=openai.api_key,
)


# Creating a function Tool
A function tool allows users to easily convert any user-defined function into a tool. It can also auto-infer the function schema.
We will demonstrate the idea of a function tool through a simple example. We define two toy functions, `add` and `mystery`. Note that both functions have type annotations and a docstring. Both are important because they are used as part of the prompt for the LLM.
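
As a rough illustration of why the annotations and docstring matter, the sketch below collects the same information with Python's standard `inspect` module. The `describe_tool` helper is hypothetical and is not how `FunctionTool` is implemented; it only shows what a schema-inference step has available to work with.

```python
import inspect


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def describe_tool(fn) -> dict:
    """Build a simple tool description from a function's signature and docstring."""
    signature = inspect.signature(fn)
    parameters = {
        # Use the annotation's name (e.g. "int") when it has one
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": parameters,
    }


tool_description = describe_tool(add)
print(tool_description)
```

Without the annotations, the parameter types would be unknown, and without the docstring, the model would have only the function name to judge what the tool does.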



In [4]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

The function tool integrates natively with the function-calling capabilities of many LLMs, including OpenAI's. The `predict_and_call` method of the LLM takes a set of function tools and a query, calls the right tool with the inferred parameters, and returns the result.

In [5]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool],
    "Tell me the output of the mystery function on 2 and 9",
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121


From the output we see that the LLM called the right tool with the right parameters. This extends the router example: not only is the right tool picked, but its parameters are inferred as well.

# Vector Search with Metadata
The LLM can not only choose vector search but also infer metadata filters. These filters help return more precise results. Let's experiment with a PDF file and use its metadata to retrieve specific information.

**Download PDF:** You can change the code to download a PDF of your interest.

In [6]:
# Import necessary libraries
import requests

# Define the URL and filename
url = "https://arxiv.org/pdf/2308.00352"
filename = "MetaGPT.pdf"

# Send a GET request to the URL
response = requests.get(url)

# Check for successful response
if response.status_code == 200:
    # Open the file in write binary mode
    with open(filename, "wb") as output_file:
        # Write the content of the response to the file
        output_file.write(response.content)
    print("Successfully downloaded the PDF file")
else:
    print("Error: Failed to download the PDF file")

Successfully downloaded the PDF file


**Loading data and creating chunks**





In [7]:
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter

# load documents
documents = SimpleDirectoryReader(input_files=["MetaGPT.pdf"]).load_data()
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

Let us look at the content of the first chunk.

In [None]:
print(nodes[0].get_content(metadata_mode="all"))

We will use the `page_label` during filtering.

**Creation of Vector Index**

The lines of code below create a vector index over the given list of nodes.

In [9]:
from llama_index.core import  VectorStoreIndex
vector_index = VectorStoreIndex(nodes)
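
To give an intuition for what a vector index does at query time, here is a toy sketch of similarity ranking in plain Python. The three-dimensional vectors and chunk names are made up for illustration; a real index stores embeddings produced by an embedding model, and this is not LlamaIndex's implementation.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings" standing in for real chunk embeddings
toy_index: Dict[str, List[float]] = {
    "chunk about benchmarks": [0.9, 0.1, 0.0],
    "chunk about agent roles": [0.0, 1.0, 0.0],
    "chunk about related work": [0.2, 0.2, 0.9],
}


def top_k(query_vector: List[float], index: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Score every chunk against the query and keep the k most similar."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


query_vector = [0.8, 0.2, 0.1]  # made-up embedding of the user's question
ranked = top_k(query_vector, toy_index, k=2)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```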

We can query this RAG pipeline using a metadata filter.

In [10]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What are some high-level results of MetaGPT?",
)

Let's explore the source nodes

In [11]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'MetaGPT.pdf', 'file_path': 'MetaGPT.pdf', 'file_type': 'application/pdf', 'file_size': 16715764, 'creation_date': '2024-05-22', 'last_modified_date': '2024-05-22'}


This shows that the search is restricted to page 2.
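
Conceptually, a metadata filter narrows the set of candidate chunks, and similarity ranking then picks the best of what remains. The sketch below mimics that with toy data; the chunk texts and `score` values are invented, and `filtered_top_k` is a hypothetical helper rather than a LlamaIndex function.

```python
from typing import Dict, List

# Toy chunks standing in for parsed PDF nodes; scores are made-up similarity values
chunks: List[Dict] = [
    {"text": "Abstract and overview", "page_label": "1", "score": 0.95},
    {"text": "Benchmark highlights", "page_label": "2", "score": 0.80},
    {"text": "Framework diagram", "page_label": "2", "score": 0.60},
    {"text": "Role definitions", "page_label": "2", "score": 0.40},
    {"text": "Comparison with ChatDev", "page_label": "8", "score": 0.90},
]


def filtered_top_k(chunks: List[Dict], key: str, value: str, top_k: int) -> List[Dict]:
    """Keep only chunks whose metadata matches, then return the top_k by score."""
    candidates = [c for c in chunks if c.get(key) == value]
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return ranked[:top_k]


# Note that the page label is a string, as in the metadata printed above
results = filtered_top_k(chunks, key="page_label", value="2", top_k=2)
for chunk in results:
    print(chunk["page_label"], chunk["text"])
```

Even though the page 1 and page 8 chunks have higher scores in this toy data, they never reach the ranking step because the filter excludes them first.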

# Retrieval tool
We can integrate these metadata filters into a retrieval tool function. This function takes the query string and optional metadata filters such as page numbers. The LLM can infer the relevant metadata filter (e.g., page number) from the user's query. We could define other types of metadata filters as well, such as section IDs or headers.
Let us define a function called `vector_query`, which takes the query and a list of page numbers. It performs a vector search over the index with the page numbers applied as a metadata filter.


In [12]:
from typing import List, Optional
from llama_index.core.vector_stores import FilterCondition, MetadataFilters


def vector_query(
    query: str,
    page_numbers: Optional[List[str]] = None
) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (Optional[List[str]]): Filter by set of pages. Leave as NONE if we want to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.

    """

    # Treat a missing argument as "no page filter"
    page_numbers = page_numbers or []
    # One exact-match filter per requested page
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]

    # OR condition: a chunk qualifies if it is on any of the requested pages
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response



Let us wrap `vector_query` in a function tool. Passing the function to `FunctionTool.from_defaults` lets us use it with the language model.

In [13]:
vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
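
The `FilterCondition.OR` used in `vector_query` above means a chunk qualifies if its page label equals any of the requested pages. The sketch below illustrates the difference between OR and AND in plain Python. The `matches` helper is hypothetical, and treating an empty filter list as "match everything" is an assumption that mirrors the intent stated in the function's docstring, not verified library behavior.

```python
from typing import Dict, List


def matches(metadata: Dict[str, str], filters: List[Dict[str, str]], condition: str = "or") -> bool:
    """Check a chunk's metadata against exact-match filters combined with OR or AND."""
    if not filters:
        # Assumption for this sketch: no filters means no restriction
        return True
    checks = [metadata.get(f["key"]) == f["value"] for f in filters]
    return any(checks) if condition == "or" else all(checks)


page_filters = [{"key": "page_label", "value": p} for p in ["2", "8"]]

on_page_8_or = matches({"page_label": "8"}, page_filters, "or")
on_page_8_and = matches({"page_label": "8"}, page_filters, "and")
no_filters = matches({"page_label": "5"}, [], "or")

print(on_page_8_or)   # True: page 8 is one of the requested pages
print(on_page_8_and)  # False: a chunk cannot be on page 2 and page 8 at once
print(no_filters)     # True: nothing to filter on
```

With AND, asking for several pages at once would match nothing, since each chunk carries a single page label. That is why OR is the sensible choice for a list of pages.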

In [14]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool],
    "What are the high-level results of MetaGPT as described on page 2?",
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2"]}
=== Function Output ===
MetaGPT achieves a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1. It outperforms other popular frameworks for creating complex software projects and stands out in handling higher levels of software complexity while offering extensive functionality. In experimental evaluations, MetaGPT demonstrates a 100% task completion rate, showcasing its robustness and efficiency in terms of time and token costs.


We can see from the output that the LLM successfully called `vector_tool` and inferred the `page_numbers` argument. We can now print the response.

In [15]:
print(str(response))

MetaGPT achieves a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1. It outperforms other popular frameworks for creating complex software projects and stands out in handling higher levels of software complexity while offering extensive functionality. In experimental evaluations, MetaGPT demonstrates a 100% task completion rate, showcasing its robustness and efficiency in terms of time and token costs.


Let us inspect the sources of the response; we will see that the answer was generated from the page specified in the query.

In [16]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'MetaGPT.pdf', 'file_path': 'MetaGPT.pdf', 'file_type': 'application/pdf', 'file_size': 16715764, 'creation_date': '2024-05-22', 'last_modified_date': '2024-05-22'}


Let us add a `summary_tool` alongside the existing `vector_query_tool` to test more complex scenarios.

In [17]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of MetaGPT"
    ),
)

In [18]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What are the MetaGPT comparisons with ChatDev described on page 8?",
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "MetaGPT comparisons with ChatDev", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT demonstrates higher executability, lower running times, and lower token usage compared to ChatDev. Additionally, MetaGPT has more code files, more lines of code per file, and more total code lines than ChatDev. In terms of productivity, MetaGPT outperforms ChatDev. However, ChatDev shows lower human revision cost compared to MetaGPT.


In [19]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '8', 'file_name': 'MetaGPT.pdf', 'file_path': 'MetaGPT.pdf', 'file_type': 'application/pdf', 'file_size': 16715764, 'creation_date': '2024-05-22', 'last_modified_date': '2024-05-22'}


From the output we can see that the LLM successfully called the relevant function with the inferred page number.

In [20]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What is a summary of the paper?",
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "Please provide a summary of the paper."}
=== Function Output ===
The paper introduces MetaGPT, a meta-programming framework for LLM-based multi-agent systems that enhances problem-solving processes through collaborative interactions. MetaGPT incorporates Standard Operating Procedures (SOPs) to streamline workflows, assign roles to agents, and ensure structured communication. By leveraging human-like domain expertise, MetaGPT improves the quality and coherence of solutions generated by multi-agent collaborations. The framework achieves state-of-the-art performance in code generation benchmarks and offers a robust and efficient platform for developing LLM-based multi-agent systems. Additionally, MetaGPT focuses on recursive self-improvement mechanisms, multi-agent economies in software development, and a structured approach to transforming abstract requirements into detailed designs, showcasing its effectivenes

# Resources
* https://learn.deeplearning.ai/courses/building-agentic-rag-with-llamaindex
