<a href="https://colab.research.google.com/github/sudhirshahu51/RAG/blob/main/2_Tool_Calling_using_LlamaIndex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [49]:
%%writefile requirements.txt

llama-index==0.10.27
llama-index-llms-text-generation-inference
llama-index-llms-huggingface
llama-index-llms-huggingface-api
sentence-transformers #for embedding model
llama-index-embeddings-huggingface
llama-index-embeddings-instructor

Overwriting requirements.txt


In [50]:
!pip install -r requirements.txt --quiet

**Login for HuggingFace API for LLM**

In [51]:
import os
from google.colab import userdata
key = userdata.get('HUGGING_FACE')
os.environ['HUGGINGFACEHUB_API_TOKEN'] = key

In [52]:
import textwrap
from typing import List, Optional
from llama_index.core import Settings
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

In [53]:
# to make async work well with jupyter notebook
import nest_asyncio
nest_asyncio.apply()

1. Define a Simple Tool

* `FunctionTool` can wrap any Python function and make it available to the LLM.
* The function's type annotations and docstring matter: they guide the LLM in choosing the appropriate tool.
* The LLM not only chooses the tool but also decides which arguments to pass to the function.
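To see why annotations and docstrings matter, the following minimal sketch builds a plain-text tool description from a function using only the standard library. The helper `describe_tool` is hypothetical and is not LlamaIndex's implementation; it only illustrates the kind of information an LLM can be shown when it picks a tool.

```python
import inspect


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def describe_tool(fn):
    """Build a plain-text description from a function's signature and docstring.

    Hypothetical helper for illustration; not part of LlamaIndex.
    """
    signature = inspect.signature(fn)  # carries parameter names and type annotations
    doc = inspect.getdoc(fn) or ""     # the docstring says what the tool is for
    return f"{fn.__name__}{signature}\n{doc}"


description = describe_tool(add)
print(description)
```

Without the annotations and the docstring, the description would shrink to a bare name and parameter list, leaving the model little to base its choice on.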



In [54]:
from llama_index.core.tools import FunctionTool

In [55]:
def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)

add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [56]:
#from llama_index.llms.openai import OpenAI
llm = HuggingFaceInferenceAPI(
    model_name="HuggingFaceH4/zephyr-7b-alpha",
    task='complete',
    token=key)
#llm = OpenAI(model="gpt-3.5-turbo")


In [57]:
completion_response = llm.complete("To infinity, and")
print(completion_response)


 beyond!

The Toy Story franchise has been a beloved part of pop culture for over two decades, and it's not slowing down anytime soon. The latest installment, Toy Story 4, is set to hit theaters this summer, and it's already generating buzz.

The movie follows the adventures of Woody, Buzz, and the gang as they embark on a new adventure with a new toy, Forky. The trailer for the movie has been released, and it's already getting fans excited for the film.

One of the most exciting things about Toy Story 4 is the return of some beloved characters. Bo Peep, who was last seen in Toy Story 2, is back and looking better than ever. She's now a modern, independent woman, and her new look has been getting a lot of attention.

Another exciting addition to the movie is the introduction of new characters, including Forky, who is voiced by Tony Hale. Forky is a spork with a popsicle stick for a handle, and he's not exactly thrilled about being a toy.

The trailer for Toy Story 4 has been viewed ove

In [58]:
from typing import Union

In [59]:
user_msg = "Tell me the output of the mystery function on 2 and 92"

In [64]:
response = llm.predict_and_call(
    [add_tool, mystery_tool],
    user_msg,
    verbose=True
)
print(str(response))

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: mystery
Action Input: {'x': 2, 'y': 92}
Observation: 8836
8836


In [63]:
response

AgentChatResponse(response='8836', sources=[ToolOutput(content='8836', tool_name='mystery', raw_input={'args': (), 'kwargs': {'x': 2, 'y': 92}}, raw_output=8836, is_error=False)], source_nodes=[], is_dummy_stream=False, metadata=None)
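The verbose trace above follows a Thought / Action / Action Input / Observation pattern. As a rough sketch of what happens between "Action Input" and "Observation", the code below parses such a trace and dispatches to the named Python function. The `TOOLS` registry and the `dispatch` helper are hypothetical and far simpler than whatever `predict_and_call` does internally.

```python
import ast
import re


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"add": add, "mystery": mystery}


def dispatch(llm_text: str):
    """Parse the 'Action:' and 'Action Input:' lines and call the named tool."""
    action = re.search(r"^Action:\s*(\w+)\s*$", llm_text, re.MULTILINE)
    action_input = re.search(r"^Action Input:\s*(\{.*\})\s*$", llm_text, re.MULTILINE)
    if action is None or action_input is None:
        raise ValueError("no tool call found in the model output")
    # literal_eval safely turns the dict literal into keyword arguments.
    kwargs = ast.literal_eval(action_input.group(1))
    return TOOLS[action.group(1)](**kwargs)


trace = """Thought: I need to use a tool to help me answer the question.
Action: mystery
Action Input: {'x': 2, 'y': 92}"""

result = dispatch(trace)
print(result)  # (2 + 92) * (2 + 92) = 8836
```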


In [48]:
# Hugging Face models may not accept the chat messages passed by the LLM wrapper; in case of errors, transform the messages into a single prompt using the function below.
def messages_to_prompt(messages):
    message_str = "\n".join([str(x) for x in messages])
    return f"[INST] {message_str} [/INST] "

#llm = HuggingFaceLLM(..., messages_to_prompt=messages_to_prompt)

'[INST] T\ne\nl\nl\n \nm\ne\n \nt\nh\ne\n \no\nu\nt\np\nu\nt\n \no\nf\n \nt\nh\ne\n \nm\ny\ns\nt\ne\nr\ny\n \nf\nu\nn\nc\nt\ni\no\nn\n \no\nn\n \n2\n \na\nn\nd\n \n9\n2 [/INST] '
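The output above has a newline between every character because the function was given a plain string: iterating over a `str` yields its characters, and each one is then joined with `"\n"`. A minimal sketch of a more forgiving variant, assuming a bare string should be treated as a single message:

```python
def messages_to_prompt(messages):
    """Join messages into one [INST] prompt.

    Assumption for this sketch: a bare string counts as a single message
    instead of being iterated character by character.
    """
    if isinstance(messages, str):
        messages = [messages]
    message_str = "\n".join(str(m) for m in messages)
    return f"[INST] {message_str} [/INST] "


prompt = messages_to_prompt("Tell me the output of the mystery function on 2 and 92")
print(repr(prompt))
```

Passing a list still works as before, with one message per line inside the `[INST]` block.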

2. Define an Auto-Retrieval Tool

In [66]:
from llama_index.core import SimpleDirectoryReader
#Load Documents
path =  "/content/drive/MyDrive/Colab Notebooks/Data/metagpt.pdf"
documents =  SimpleDirectoryReader(input_files = [path]).load_data()

In [67]:
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size = 1024, chunk_overlap = 200)
nodes = splitter.get_nodes_from_documents(documents)
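The `chunk_size` and `chunk_overlap` arguments control how much text each node holds and how much neighbouring nodes share. The sketch below shows the overlap idea on a toy list of items with a fixed-size sliding window. It is a simplification: `SentenceSplitter` counts tokens and tries to respect sentence boundaries, which this hypothetical `chunk_items` helper does not.

```python
def chunk_items(items, chunk_size, chunk_overlap):
    """Split a list into windows of chunk_size that share chunk_overlap items."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(items), step):
        chunks.append(items[start:start + chunk_size])
        if start + chunk_size >= len(items):
            break  # the last window already reaches the end
    return chunks


chunks = chunk_items(list(range(10)), chunk_size=4, chunk_overlap=1)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

The shared items at the window edges are what keep a sentence that straddles a boundary retrievable from either chunk.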

In [68]:
# Take a look at the content of the first chunk along with its metadata.
print(nodes[0].get_content(metadata_mode = "all"))

page_label: 1
file_name: metagpt.pdf
file_path: /content/drive/MyDrive/Colab Notebooks/Data/metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-12-22
last_modified_date: 2024-12-20

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks

In [70]:
#Embedding Model and LLM
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5") # Try using Stella too dunzhang/stella_en_400M_v5 https://huggingface.co/spaces/mteb/leaderboard


In [80]:
Settings.llm = llm
Settings.embed_model = embed_model

In [74]:
Settings.embed_model

HuggingFaceEmbedding(model_name='BAAI/bge-base-en-v1.5', embed_batch_size=10, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7cdd258e5f30>, num_workers=None, max_length=512, normalize=True, query_instruction=None, text_instruction=None, cache_folder=None)

In [75]:
from llama_index.core import VectorStoreIndex
vector_index = VectorStoreIndex(nodes, embed_model = embed_model)


In [81]:
query_engine = vector_index.as_query_engine(similarity_top_k=4, verbose=True)

In [82]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
              llm = llm,
              similarity_top_k = 4,
              filters =  MetadataFilters.from_dicts(
                  [
                      {'key': "page_label", 'value': '2' }
                  ]
              )
)

In [84]:
response = query_engine.query("What are some high-level results of MetaGPT?")
print(textwrap.fill(str(response), 100))

  MetaGPT achieves a new state-of-the-art (SoTA) with 85.9% and 87.7% in Pass@1 in code generation
benchmarks. In experimental evaluations, MetaGPT achieves a 100% task completion rate, demonstrating
the robustness and efficiency (time and token costs) of our design.


In [87]:
for i in response.source_nodes:
    print(i.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': '/content/drive/MyDrive/Colab Notebooks/Data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-12-22', 'last_modified_date': '2024-12-20'}


Define the Auto-Retrieval Tool
* Integrate metadata filters into a retrieval tool function.
* The function enables more precise retrieval by accepting a query string and optional metadata filters, such as page numbers.
* The LLM can infer the relevant metadata filters from the user's query.
* Other kinds of metadata filters, such as section IDs and headers, can be defined in the same way.
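The effect of a page filter combined with an OR condition can be sketched in plain Python, independent of the vector index. The `NODES` list and the `filter_by_pages` helper below are placeholders invented for illustration. Page labels are strings, matching the `page_label` metadata printed earlier.

```python
from typing import Dict, List, Optional

# Placeholder nodes; the real index stores text chunks with richer metadata.
NODES: List[Dict] = [
    {"text": "title and abstract", "metadata": {"page_label": "1"}},
    {"text": "high-level results", "metadata": {"page_label": "2"}},
    {"text": "comparison with ChatDev", "metadata": {"page_label": "8"}},
]


def filter_by_pages(nodes: List[Dict], page_numbers: Optional[List[str]] = None) -> List[Dict]:
    """Keep nodes whose page_label equals ANY requested page (OR semantics)."""
    if not page_numbers:
        return list(nodes)  # no filter given: search over all pages
    wanted = set(page_numbers)
    return [n for n in nodes if n["metadata"].get("page_label") in wanted]


selected = filter_by_pages(NODES, ["2", "8"])
print([n["text"] for n in selected])
```

In the retrieval tool, the similarity search then runs only over the nodes that survive this kind of filter.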


In [88]:
from typing import List
from llama_index.core.vector_stores import FilterCondition

In [89]:
def vector_query(
    query: str,
    page_numbers: Optional[List[str]] = None
) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (Optional[List[str]]): Filter by set of pages. Leave as None to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.

    """

    # Treat a missing filter as "no pages specified".
    page_numbers = page_numbers or []
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]

    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    # Returns the response object itself; str(response) gives the answer text.
    return response


In [90]:
vector_query_tool = FunctionTool.from_defaults(
    name = "vector_tool",
    fn =  vector_query
)

In [91]:
query = "What are the high-level results of MetaGPT as described on page 2?"

In [92]:
response = llm.predict_and_call(
    [vector_query_tool],
    user_msg=query,
    verbose=True
)

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: vector_tool
Action Input: {'query': 'high-level results of metagpt as described on page 2', 'page_numbers': ['2']}
Observation:

On page 2, the high-level results of MetaGPT are described as achieving a new state-of-the-art (SoTA) with 85.9% and 87.7% in Pass@1 in code generation benchmarks, compared to other popular frameworks for creating complex software projects. Additionally, MetaGPT offers extensive functionality and achieves a 100% task completion rate in experimental evaluations, demonstrating its robustness and efficiency.

In [93]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': '/content/drive/MyDrive/Colab Notebooks/Data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-12-22', 'last_modified_date': '2024-12-20'}


Let's add some other tools!

In [95]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    verbose=True
)

summary_tool = QueryEngineTool.from_defaults(
    name =  "summary_tool",
    query_engine=summary_query_engine,
    description=("Useful for summarization questions.")
)

In [96]:
query = "What are the MetaGPT comparisons with ChatDev described on page 8?"

In [97]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    user_msg=query,
    verbose=True
)

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: vector_tool
Action Input: {'query': 'MetaGPT comparisons with ChatDev described on page 8', 'page_numbers': ['8']}
Observation:

On page 8, the paper presents a comparison between MetaGPT and ChatDev on the SoftwareDev dataset. The results show that MetaGPT outperforms ChatDev in nearly all metrics, including executability, running times, token usage, code statistics, and productivity. MetaGPT also requires less human revision cost. The paper highlights the benefits of SOPs in collaborations between multiple agents and demonstrates the autonomous software generation capabilities of MetaGPT through visualization samples.

In [98]:
for i in response.source_nodes:
    print(i.metadata)

{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': '/content/drive/MyDrive/Colab Notebooks/Data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-12-22', 'last_modified_date': '2024-12-20'}


Check whether the LLM can pick the right tool.

In [99]:
query = "What is the summary of the paper?"

In [100]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    user_msg=query,
    verbose=True
)

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: summary_tool
Action Input: {'input': 'The paper discusses the use of machine learning algorithms for predicting stock prices. The authors propose a new algorithm that outperforms existing methods in terms of accuracy and speed. The results are validated on a large dataset and the algorithm is shown to be robust to various market conditions. The paper also provides insights into the behavior of the stock market and suggests potential applications for the proposed algorithm.'}
Observation:

Yes, the paper discusses the use of machine learning algorithms for predicting stock prices. The authors propose a new algorithm that outperforms existing methods in terms of accuracy and speed. The results are validated on a large dataset and the algorithm is shown to be robust to various market conditions. The paper also provides insights into the behavior 