# Lesson 2: Tool Calling

## Setup

In [1]:
from helper import get_openai_api_key
OPENAI_API_KEY = get_openai_api_key()

In [2]:
import nest_asyncio
nest_asyncio.apply()

## 1. Define a Simple Tool

In [3]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int: 
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [4]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool], 
    "Tell me the output of the mystery function on 2 and 9", 
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
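
For intuition about what the model has to work with when it picks a tool, the sketch below uses only the standard library to pull a name, a description, and parameter types out of a function's signature and docstring. It illustrates the general idea only: `describe_function` is a hypothetical helper, not part of LlamaIndex, and it makes no claim about how `FunctionTool.from_defaults` is implemented.

```python
import inspect


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


def describe_function(fn) -> dict:
    """Hypothetical helper: summarize a function the way a tool schema might."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


spec = describe_function(mystery)
print(spec)

# Replay the call the model chose above: mystery with x=2, y=9.
result = mystery(**{"x": 2, "y": 9})
print(result)  # (2 + 9) * (2 + 9) = 121
```

The docstring for `mystery` is deliberately vague, so the function name in the prompt is the main signal that points the model at this tool.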


## 2. Define an Auto-Retrieval Tool

### Load Data

To download the paper yourself, uncomment and run the following command:

#!wget "https://arxiv.org/pdf/2405.13063" -O "AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf"

**Note**: The PDF file is included with this lesson. To access it, go to the `File` menu and select `Open...`.

In [5]:
from llama_index.core import SimpleDirectoryReader
# load documents
documents = SimpleDirectoryReader(input_files=["AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf"]).load_data()

In [6]:
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

In [7]:
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf
file_path: AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf
file_type: application/pdf
file_size: 9864973
creation_date: 2024-06-08
last_modified_date: 2024-06-08

AURORA : A F OUNDATION MODEL OF THE ATMOSPHERE
Cristian Bodnar*, 1, Wessel P. Bruinsma*, 1, Ana Lucic*, 1, Megan Stanley*, 1,
Johannes Brandstetter3,†, Patrick Garvan1, Maik Riechert1, Jonathan Weyn2, Haiyu Dong2,
Anna Vaughan4, Jayesh K. Gupta5,†, Kit Tambiratnam2, Alex Archibald4, Elizabeth Heider1,
Max Welling6,†, Richard E. Turner1, 4, and Paris Perdikaris1
1Microsoft Research AI for Science
2Microsoft Corporation3JKU Linz4University of Cambridge5Poly Corporation6University of Amsterdam
*Equal contribution†Work done while at Microsoft Research
ABSTRACT
Deep learning foundation models are revolutionizing many facets of science by leveraging vast
amounts of data to learn general-purpose representations that can be adapted to tackle diverse
downstream t

In [8]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [9]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What is the primary difference between Aurora and classical atmospheric simulation approaches?", 
)

In [10]:
print(str(response))

Aurora is optimized to minimize a loss on multiple heterogeneous datasets with different resolutions, variables, and pressure levels, and is fine-tuned in two stages using Low Rank Adaptation (LoRA). In contrast, classical atmospheric simulation approaches do not typically involve optimization on multiple heterogeneous datasets or utilize Low Rank Adaptation for fine-tuning.


In [11]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}
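
Conceptually, a metadata filter narrows the set of candidate nodes, and similarity ranking then picks the top results from what is left. The toy below illustrates that with hand-made records and made-up scores. `ToyNode`, the scores, and `filtered_top_k` are assumptions for illustration and do not reflect how `VectorStoreIndex` stores or ranks nodes.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    text: str
    page_label: str
    score: float  # made-up similarity score for one fixed query


def filtered_top_k(nodes, page_label, k=2):
    """Keep nodes on the given page, then return the k highest-scoring ones."""
    candidates = [n for n in nodes if n.page_label == page_label]
    return sorted(candidates, key=lambda n: n.score, reverse=True)[:k]


toy_nodes = [
    ToyNode("intro", "1", 0.91),
    ToyNode("method overview", "2", 0.72),
    ToyNode("training details", "2", 0.65),
    ToyNode("more method notes", "2", 0.40),
    ToyNode("results", "5", 0.88),
]

top = filtered_top_k(toy_nodes, page_label="2", k=2)
for node in top:
    print(node.page_label, node.text, node.score)
```

Note that the highest-scoring toy node overall (page 1) never appears in the result, because the filter removes it before ranking. If fewer than `k` nodes survive the filter, fewer than `k` are returned, which matches the single source node printed above.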


### Define the Auto-Retrieval Tool

In [12]:
from typing import List
from llama_index.core.vector_stores import FilterCondition


def vector_query(
    query: str, 
    page_numbers: List[str]
) -> str:
    """Perform a vector search over an index.
    
    query (str): the string query to be embedded.
    page_numbers (List[str]): Filter by set of pages. Leave BLANK if we want to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.
    
    """

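    # Build one exact-match filter per requested page.
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]
    
    # Combine the filters with OR so that a node on any of the pages qualifies.
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    # Return the full response object rather than a plain string,
    # so that the source nodes can still be inspected afterwards.
    response = query_engine.query(query)
    return response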
    

vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
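
The list comprehension in `vector_query` produces one exact-match filter per page, and `FilterCondition.OR` means a node passes if any of those filters matches. A standard-library sketch of that logic follows. `build_page_filters` and `matches_any` are hypothetical helpers, and treating an empty list as "no filtering" is an assumption of this sketch that mirrors the docstring's intent, not verified library behavior.

```python
from typing import Dict, List


def build_page_filters(page_numbers: List[str]) -> List[Dict[str, str]]:
    """One exact-match filter dict per requested page."""
    return [{"key": "page_label", "value": p} for p in page_numbers]


def matches_any(metadata: Dict[str, str], filters: List[Dict[str, str]]) -> bool:
    """OR semantics: pass if any filter matches; no filters means no restriction."""
    if not filters:
        return True
    return any(metadata.get(f["key"]) == f["value"] for f in filters)


filters = build_page_filters(["2", "5"])
print(filters)
print(matches_any({"page_label": "5"}, filters))  # True
print(matches_any({"page_label": "3"}, filters))  # False
print(matches_any({"page_label": "3"}, []))       # True
```

Page labels are compared as strings here because the node metadata printed earlier stores `page_label` as a string such as `'2'`.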

In [13]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool], 
    "How does Aurora adapt to distribution shifts in CAMS data, as discussed on page 5?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "Aurora adapt to distribution shifts in CAMS data", "page_numbers": ["5"]}
=== Function Output ===
Aurora does not explicitly use emissions data as inputs like CAMS does. Instead, it learns from historical data affected by natural and anthropogenic factors to implicitly account for these effects. It is fine-tuned on CAMS analysis data and also incorporates lower resolution CAMS reanalysis data in the process. This adaptation allows Aurora to perform competitively with CAMS on various targets, demonstrating its ability to handle distribution shifts in the data.


In [14]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '5', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}
{'page_label': '5', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}


## 3. Add Other Tools

In [15]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of AURORA"
    ),
)
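
The `tree_summarize` response mode builds its answer bottom-up: chunks are summarized in groups, and the resulting summaries are summarized again until a single one remains. The sketch below mimics that shape with a stand-in `combine` function that only joins text. The helper names and the fan-out of 2 are assumptions for illustration; a real run would call an LLM at each merge step and would group chunks by how much text fits in the prompt.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str],
    combine: Callable[[List[str]], str],
    fan_out: int = 2,
) -> str:
    """Repeatedly merge groups of `fan_out` chunks until one summary is left."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2 so each level shrinks")
    level = chunks
    while len(level) > 1:
        level = [
            combine(level[i : i + fan_out])
            for i in range(0, len(level), fan_out)
        ]
    return level[0]


def join_in_brackets(group: List[str]) -> str:
    """Stand-in for an LLM call: make the grouping visible in the output."""
    return "(" + " + ".join(group) + ")"


summary = tree_summarize(["p1", "p2", "p3", "p4", "p5"], join_in_brackets)
print(summary)
```

The nesting of the brackets in the printed string shows which chunks were merged at which level.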

In [16]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "How does Aurora outperform state-of-the-art classical simulation tools and specialized deep learning models as described on page 2?", 
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "Aurora outperforms state-of-the-art classical simulation tools and specialized deep learning models by leveraging a novel approach that combines the strengths of both classical and deep learning methods. This hybrid approach allows Aurora to achieve superior performance in terms of accuracy, efficiency, and scalability compared to existing tools. Additionally, Aurora's unique architecture enables it to handle complex simulations with high-dimensional data more effectively than traditional simulation tools or specialized deep learning models."}
=== Function Output ===
Aurora's innovative hybrid approach, combining classical simulation tools and specialized deep learning models, allows it to achieve superior performance in accuracy, efficiency, and scalability compared to existing tools. Its unique architecture enables effective handling of complex simulations with high-dimensional data, surpassing the capabili

In [17]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '1', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}
{'page_label': '2', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}
{'page_label': '3', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 9864973, 'creation_date': '2024-06-08', 'last_modified_date': '2024-06-08'}
{'page_label': '3', 'file_name': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_path': 'AURORA A FOUNDATION MODEL OF THE ATMOSPHERE.pdf', 'file_type': 'application/pdf', 'file_size': 

In [18]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What is a summary of the paper?", 
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "paper"}
=== Function Output ===
The paper discussed in the context is focused on the development, implementation, and performance evaluation of the Aurora model, a foundation model of the atmosphere. It highlights Aurora's superiority in atmospheric forecasting compared to traditional simulation tools and specialized deep learning models. The paper emphasizes Aurora's ability to outperform operational CAMS and IFS-HRES models in predicting various atmospheric variables and phenomena, showcasing its competitive forecasting capabilities. Additionally, it mentions Aurora's success in adapting to high-resolution data settings efficiently and capturing extreme weather events accurately.
