# Function Calling with Gemini 

**Learning Objectives**

1. Learn about function calling and relevant use cases
1. Learn how to implement function calling with Gemini Pro
1. Learn patterns for handling function calls in a chat session
1. Learn how function calling can be used in different situations and use cases 

Function calling allows developers to define custom functions and provide them to Gemini. While processing a query, Gemini can choose to delegate certain data processing tasks to these functions. Gemini does not call the functions itself. Instead, it returns structured output that includes the name of the selected function and the arguments it should be called with. You can use this output to invoke external APIs, perform mathematical computations, extract structured data, and more. You then pass the function's result back to the model so that it can complete its answer to the query.

<img src="https://cloud.google.com/static/vertex-ai/generative-ai/docs/multimodal/images/function-calling.png" alt="Function Calling" class="center">

In [1]:
from google.cloud import aiplatform

print(aiplatform.__version__)

1.43.0


In [2]:
from typing import Any, Callable, Optional, Tuple, Union

from google.cloud import bigquery
from vertexai.generative_models import (
    ChatSession,
    Content,
    FunctionDeclaration,
    GenerationConfig,
    GenerationResponse,
    GenerativeModel,
    Part,
    Tool,
)



In [3]:
REGION = "us-central1"
PROJECT = !(gcloud config get-value core/project)
PROJECT = PROJECT[0]

## Chat Session with Function Calling
First, let's think about how function calling can be implemented within a chat session. When the model returns a function call, we do not return to the user. Instead, we invoke a Python function that executes the requested function with the provided arguments, then feed the result back into the model. This may happen multiple times (e.g. model returns function call -> function call response fed back into model -> model returns another function call -> ... ).

Our goal is to create a simple chat session class that implements a reasoning loop in its `send_message` method. The class should be instantiated with:
1) `model`: An instance of `GenerativeModel`
2) `tool_handler_fn`: A Python callable that accepts the function call name (str) and the function call arguments (dict). It should implement the logic of the function call itself and return the result.

For example, if we had a tool with function calls for reading from and writing to a database, then we may have a `tool_handler_fn` that looks like:

```python
def tool_handler_fn(fn_name, fn_args):
    """Assumes read_row takes a row_id parameter and write_row takes a row parameter."""
    if fn_name == "read_row":
        result = db.read_row(fn_args["row_id"])
    elif fn_name == "write_row":
        result = db.write_row(fn_args["row"])
    else:
        # Fail loudly instead of reaching an unbound `result` below
        raise ValueError(f"Unknown function call: {fn_name}")
    return result
```
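
As the number of functions grows, the `if`/`elif` chain can be swapped for a lookup table. The sketch below is one possible way to do that, not the approach this notebook uses. `FakeDB` and `HANDLERS` are hypothetical names: `FakeDB` is an in-memory stand-in for the `db` object above, which the notebook does not define.

```python
from typing import Any, Callable, Dict


class FakeDB:
    """Hypothetical in-memory stand-in for the `db` object used above."""

    def __init__(self) -> None:
        self.rows: Dict[int, dict] = {}

    def read_row(self, row_id: int) -> dict:
        # Return an empty dict when the row does not exist
        return self.rows.get(row_id, {})

    def write_row(self, row: dict) -> dict:
        self.rows[row["id"]] = row
        return {"status": "ok", "row_id": row["id"]}


db = FakeDB()

# Map each function call name to a callable that unpacks its arguments
HANDLERS: Dict[str, Callable[[dict], Any]] = {
    "read_row": lambda args: db.read_row(args["row_id"]),
    "write_row": lambda args: db.write_row(args["row"]),
}


def tool_handler_fn(fn_name: str, fn_args: dict) -> Any:
    """Route a function call to its implementation; reject unknown names."""
    try:
        handler = HANDLERS[fn_name]
    except KeyError:
        raise ValueError(f"Unknown function call: {fn_name}") from None
    return handler(fn_args)


write_result = tool_handler_fn("write_row", {"row": {"id": 1, "name": "example"}})
read_result = tool_handler_fn("read_row", {"row_id": 1})
print(write_result)
print(read_result)
```

Adding a new function then only requires a new entry in `HANDLERS`, and the handler signature stays the same.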

In [4]:
class ChatAgent:
    def __init__(
        self,
        model: GenerativeModel,
        tool_handler_fn: Callable[[str, dict], Any],
        max_iterative_calls: int = 5,
    ):
        self.tool_handler_fn = tool_handler_fn
        self.chat_session = model.start_chat()
        self.max_iterative_calls = max_iterative_calls

    def send_message(self, message: str) -> GenerationResponse:
        response = self.chat_session.send_message(message)

        # This is None if a function call was not triggered
        fn_call = response.candidates[0].content.parts[0].function_call

        num_calls = 0
        # Reasoning loop. If fn_call is None then we never enter this
        # and simply return the response
        while fn_call:
            if num_calls >= self.max_iterative_calls:
                break

            # Handle the function call
            fn_call_response = self.tool_handler_fn(
                fn_call.name, dict(fn_call.args)
            )
            num_calls += 1

            # Send the function call result back to the model
            response = self.chat_session.send_message(
                Part.from_function_response(
                    name=fn_call.name,
                    response={
                        "content": fn_call_response,
                    },
                ),
            )

            # If the response is another function call then we want to
            # stay in the reasoning loop and keep calling functions.
            fn_call = response.candidates[0].content.parts[0].function_call

        return response
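
To see the reasoning loop in isolation, the sketch below replays a scripted conversation through the same control flow without calling Gemini. Everything in it is a hypothetical stand-in rather than part of the Vertex AI SDK: `FakeChatSession`, `FakeResponse`, `FakeFunctionCall`, `run_reasoning_loop`, `echo_handler`, and the `step_one`/`step_two` function names.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class FakeFunctionCall:
    """Hypothetical stand-in for the function call in a model response."""
    name: str
    args: dict


@dataclass
class FakeResponse:
    """Hypothetical response holding either a function call or final text."""
    function_call: Optional[FakeFunctionCall] = None
    text: str = ""


@dataclass
class FakeChatSession:
    """Replays a scripted list of responses instead of calling a model."""
    script: List[FakeResponse]
    received: List[Any] = field(default_factory=list)

    def send_message(self, message: Any) -> FakeResponse:
        self.received.append(message)
        return self.script.pop(0)


def make_script() -> List[FakeResponse]:
    # Two function calls followed by a final text answer
    return [
        FakeResponse(function_call=FakeFunctionCall("step_one", {"x": 1})),
        FakeResponse(function_call=FakeFunctionCall("step_two", {"x": 2})),
        FakeResponse(text="Done after two function calls."),
    ]


def run_reasoning_loop(
    session: FakeChatSession,
    message: str,
    tool_handler_fn: Callable[[str, dict], Any],
    max_iterative_calls: int = 5,
) -> Tuple[FakeResponse, int]:
    """Handle function calls until the model answers or the cap is hit."""
    response = session.send_message(message)
    num_calls = 0
    while response.function_call and num_calls < max_iterative_calls:
        fn_call = response.function_call
        result = tool_handler_fn(fn_call.name, fn_call.args)
        num_calls += 1
        # Feed the function result back in and read the next response
        response = session.send_message(
            {"name": fn_call.name, "response": {"content": result}}
        )
    return response, num_calls


def echo_handler(fn_name: str, fn_args: dict) -> str:
    return f"{fn_name} received {fn_args}"


final, calls = run_reasoning_loop(
    FakeChatSession(make_script()), "hello", echo_handler
)
capped, capped_calls = run_reasoning_loop(
    FakeChatSession(make_script()), "hello", echo_handler, max_iterative_calls=1
)
print(final.text, calls)
print(capped.function_call.name, capped_calls)
```

In the capped run, the loop stops while the latest response still holds an unanswered function call, so callers may want to check for that case before reading the text of the response.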

## Simple API Example
Now that we have a way to use function calling in a chat session, let's implement some common use cases. Imagine you want to call an API to get the current weather for a specific location, when a user asks for it. This requires a mechanism to identify that the current weather is being asked for, and also to extract the location required in the API request. 

With function calling, this is fairly straightforward. Simply define a function declaration with the intent and required parameters.

In [5]:
current_weather_func = FunctionDeclaration(
    name="current_weather",
    description="Get the current weather at a specified location",
    parameters={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "Location",
            }
        },
        "required": ["location"],
    },
)

# Simulate a function that calls a weather API


def current_weather(location: str) -> dict:
    print("Executing current_weather function...")
    api_response = {
        "location": "New York City",
        "temperature": "55 degrees (F)",
        "wind": "8 mph",
        "wind_direction": "West",
        "skies": "clear/sunny",
        "chance_of_rain": "0%",
    }
    return api_response
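
The argument values in a function call are produced by the model, so it can be worth checking them against the declaration before calling a real API. The sketch below is a minimal, standard-library check against the same schema dictionary used in the declaration above. `find_argument_problems` and `PYTHON_TYPES` are hypothetical helpers, not part of the SDK, and the check only covers required names and top-level types.

```python
from typing import List

# Same JSON-schema-style dict passed as `parameters` to the declaration above
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string", "description": "Location"},
    },
    "required": ["location"],
}

# Hypothetical mapping from schema type names to Python types
PYTHON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def find_argument_problems(schema: dict, fn_args: dict) -> List[str]:
    """Return human-readable problems; an empty list means the args look fine."""
    problems = []
    for name in schema.get("required", []):
        if name not in fn_args:
            problems.append(f"missing required argument: {name}")
    for name, value in fn_args.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = PYTHON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            problems.append(f"argument {name} should be of type {spec['type']}")
    return problems


print(find_argument_problems(weather_parameters, {"location": "New York City"}))
print(find_argument_problems(weather_parameters, {}))
```

A handler could return the list of problems to the model as the function response instead of raising, giving the model a chance to retry with corrected arguments.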

Instantiate a `Tool` with the single function declaration, then instantiate the model with that `Tool`. The tool handler function, which is invoked when the model returns a function call, is written a few cells further down.

In [6]:
# Tools can wrap around one or multiple functions
weather_tool = Tool(
    function_declarations=[current_weather_func],
)


# Instantiate model with weather tool
model = GenerativeModel(
    "gemini-1.0-pro-001",
    tools=[weather_tool],
)

Send a message to the model directly, without using `ChatAgent`, to see what a function call response looks like.

In [7]:
chat = model.start_chat()
response = chat.send_message("What is the weather like in New York City?")
response

candidates {
  content {
    role: "model"
    parts {
      function_call {
        name: "current_weather"
        args {
          fields {
            key: "location"
            value {
              string_value: "New York City"
            }
          }
        }
      }
    }
  }
  finish_reason: STOP
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
}
usage_metadata {
  prompt_token_count: 24
  candidates_token_count: 7
  total_token_count: 31
}

Notice that instead of returning a text response, Gemini returned the name of the function to call and the arguments to call it with. Now implement the handler function that `ChatAgent` will be instantiated with. `ChatAgent` passes it the function name and arguments any time Gemini returns a function call.

In [8]:
def weather_tool_handler_fn(fn_name: str, fn_args: dict) -> dict:
    if fn_name == "current_weather":
        return current_weather(fn_args["location"])
    else:
        raise ValueError(f"Unknown function call: {fn_name}")


chat = ChatAgent(model=model, tool_handler_fn=weather_tool_handler_fn)
response = chat.send_message("What is the weather like in New York City?")
response

Executing current_weather function...


candidates {
  content {
    role: "model"
    parts {
      text: "The current weather in New York City is clear and sunny with a temperature of 55 degrees Fahrenheit. There is an 0% chance of rain. The wind is blowing at 8 mph from the west."
    }
  }
  finish_reason: STOP
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
  }
}
usage_metadata {
  prompt_token_count: 65
  candidates_token_count: 43
  total_token_count: 108
}

If we take a look at the chat history, we can see that the model returned a function call and that our handler function was invoked, which in turn invoked the Python function simulating an API. The result was then sent back to the model and incorporated into its response about the weather!

In [9]:
chat.chat_session.history

[role: "user"
 parts {
   text: "What is the weather like in New York City?"
 },
 role: "model"
 parts {
   function_call {
     name: "current_weather"
     args {
       fields {
         key: "location"
         value {
           string_value: "New York City"
         }
       }
     }
   }
 },
 role: "user"
 parts {
   function_response {
     name: "current_weather"
     response {
       fields {
         key: "content"
         value {
           struct_value {
             fields {
               key: "chance_of_rain"
               value {
                 string_value: "0%"
               }
             }
             fields {
               key: "location"
               value {
                 string_value: "New York City"
               }
             }
             fields {
               key: "skies"
               value {
                 string_value: "clear/sunny"
               }
             }
             fields {
               key: "temperature"
               

## Function calling to perform mathematical operations
Function calling can also help in an area where LLMs have long struggled: mathematics. Language models build up rich representations of natural language, but often cannot perform mathematical operations correctly and consistently. We can improve consistency and accuracy by creating a tool that lets the model identify when a mathematical operation is needed and request a function call that actually performs it.

Create function declarations for simple mathematical operations (addition, subtraction, multiplication, division).

#### Exercise
Create a `FunctionDeclaration` for each operation, reusing the shared `parameters` schema.

In [10]:
parameters = {
    "type": "object",
    "properties": {
        "first_number": {
            "type": "number",
            "description": "First number",
        },
        "second_number": {"type": "number", "description": "Second number"},
    },
    "required": ["first_number", "second_number"],
}

# TODO: Create function declarations for core math operations
add_two_numbers_func = None
subtract_two_numbers_func = None
multiply_two_numbers_func = None
divide_two_numbers_func = None

math_tool = Tool(
    function_declarations=[
        add_two_numbers_func,
        subtract_two_numbers_func,
        multiply_two_numbers_func,
        divide_two_numbers_func,
    ],
)

Instead of simulating the response from functions, let's actually write the Python functions that we will call with the arguments provided when Gemini responds with a function call.

In [11]:
# Define functions for each function declaration used in the math tool
add_two_numbers = lambda a, b: a + b
subtract_two_numbers = lambda a, b: a - b
multiply_two_numbers = lambda a, b: a * b
divide_two_numbers = lambda a, b: a / b
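
Note that the `divide_two_numbers` lambda raises `ZeroDivisionError` when the second number is zero, and an uncaught exception in a handler would interrupt the `ChatAgent` loop. The sketch below shows one possible alternative that returns an error message for the model to relay, similar to how `sql_query` later in this notebook returns errors as text. The name `safe_divide` and the wording of the message are assumptions.

```python
from typing import Union


def safe_divide(a: float, b: float) -> Union[float, str]:
    """Divide a by b, returning an error string instead of raising when b is 0."""
    try:
        return a / b
    except ZeroDivisionError:
        return "Error: division by zero is undefined"


print(safe_divide(25.0, 5.0))
print(safe_divide(1.0, 0.0))
```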

#### Exercise 

Implement a handler function to route function calls

In [12]:
def handle_math_fn_call(fn_name: str, fn_args: dict) -> Union[int, float]:
    """Handles math tool function calls."""

    print(f"Function calling: {fn_name} with args: {fn_args}")
    a = fn_args["first_number"]
    b = fn_args["second_number"]

    # TODO: Complete this function to handle different function calls

Instantiate a model, chat agent, and test out some queries!

In [13]:
model = GenerativeModel(
    "gemini-1.0-pro-001",
    tools=[math_tool],
    generation_config=GenerationConfig(temperature=0.0),
)

chat = ChatAgent(model=model, tool_handler_fn=handle_math_fn_call)

In [14]:
response = chat.send_message("What is one plus one?")
response.text

Function calling: add_two_numbers with args: {'second_number': 1.0, 'first_number': 1.0}


'The answer is 2.'

In [15]:
response = chat.send_message("Thanks! What is (5 * 5) / (4 + 1)?")
response.text

Function calling: multiply_two_numbers with args: {'second_number': 5.0, 'first_number': 5.0}
Function calling: add_two_numbers with args: {'second_number': 1.0, 'first_number': 4.0}
Function calling: divide_two_numbers with args: {'first_number': 25.0, 'second_number': 5.0}


'The answer is 5.'

Notice how Gemini called more than one function, sequentially and logically, in order to answer the question. Very cool! 

## Natural Language to SQL with Database Execution 
Function calling also suits systems that need to generate SQL, execute it, and use the results to answer a question. Start by creating a dataset in BigQuery and copying a public table into it.

In [16]:
# Create the dataset
!bq mk --location="US" iowa_liquor_sales

Dataset 'kylesteckler-sandbox:iowa_liquor_sales' successfully created.


Create the table by querying the public data

In [17]:
%%bigquery
CREATE OR REPLACE TABLE iowa_liquor_sales.sales AS 
SELECT * FROM `bigquery-public-data.iowa_liquor_sales.sales`


Update the schema of your table so that it includes the column descriptions from the public table

In [18]:
SCHEMA_FILE = "liquor_sales_schema.json"
!bq show --schema --format=prettyjson bigquery-public-data:iowa_liquor_sales.sales > {SCHEMA_FILE}
!bq update {PROJECT}:iowa_liquor_sales.sales {SCHEMA_FILE}

Table 'kylesteckler-sandbox:iowa_liquor_sales.sales' successfully updated.


Create function declarations and instantiate a new `Tool`. The function declarations should cover:
1) Listing available tables: `list_available_tables`
2) Retrieving information about a specific table and its schema: `get_table_info`
3) Retrieving information from BigQuery to answer a user's question: `sql_query`

#### Exercise 
Implement a tool that can generate SQL and execute queries.

In [19]:
# Since we only have one table we will just hardcode the response from this function if triggered
list_available_tables_func = FunctionDeclaration(
    name="list_available_tables",
    description="Get and list all available BigQuery tables with fully qualified IDs.",
    parameters={"type": "object", "properties": {}},
)

get_table_info_func = FunctionDeclaration(
    name="get_table_info",
    description="Get information about a BigQuery table and its schema so you can better answer user questions.",
    parameters={
        "type": "object",
        "properties": {
            "table_id": {
                "type": "string",
                "description": "Fully qualified ID of BigQuery table",
            }
        },
    },
)

sql_query_func = FunctionDeclaration(
    name="sql_query",
    description="Get information from data in BigQuery using SQL queries",
    parameters={
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": f"SQL query on a single line (no \\n characters) that will help give answers to users' questions when run on BigQuery. Only query tables in project: {PROJECT}",
            }
        },
        "required": [
            "query",
        ],
    },
)

# TODO: Instantiate the query tool using the above function declarations
query_tool = None

Now we need to create the Python functions that will be executed when the model returns any of these function calls. The functions should be implemented as follows:
* `list_available_tables` should accept no parameters and simply return a list containing the ID of the BigQuery table created above: `f"{PROJECT}.iowa_liquor_sales.sales"`
* `get_table_info` should accept a `table_id` parameter and use the BigQuery client library to retrieve table information and schema
* `sql_query` should accept a `query_str` parameter and use the BigQuery client library to execute a SQL query and return the results

In [20]:
def list_available_tables():
    return [f"{PROJECT}.iowa_liquor_sales.sales"]


def get_table_info(table_id: str) -> dict:
    """Returns dict from BigQuery API with table information"""
    bq_client = bigquery.Client()
    return bq_client.get_table(table_id).to_api_repr()


def sql_query(query_str: str):
    bq_client = bigquery.Client()
    try:
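        # Flatten the query onto one line: turn literal "\n" sequences and
        # real newlines into spaces (so adjacent tokens are not fused), then
        # drop any stray backslashes
        query_str = (
            query_str.replace("\\n", " ").replace("\n", " ").replace("\\", "")
        )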
        # print(query_str)
        query_job = bq_client.query(query_str)
        result = query_job.result()
        result = str([dict(x) for x in result])
        return result
    except Exception as e:
        return f"Error from BigQuery Query API: {str(e)}"
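
`sql_query` returns every row of the result as one string, which can become very large if the model writes a broad query. The sketch below shows one possible way to cap and serialize rows before handing them back to the model. `rows_to_text` and its `max_rows` default are assumptions, and the sample rows are invented purely for illustration. It accepts any iterable of mappings, relying on the same `dict(row)` conversion used above.

```python
import datetime
import itertools
import json
from typing import Any, Iterable, Mapping


def rows_to_text(rows: Iterable[Mapping[str, Any]], max_rows: int = 50) -> str:
    """Serialize at most max_rows rows to a JSON string."""
    limited = [dict(row) for row in itertools.islice(rows, max_rows)]
    # default=str converts values json cannot encode natively, such as dates
    return json.dumps(limited, default=str)


# Invented sample rows, standing in for a query result
sample_rows = [
    {"store_name": "STORE A", "date": datetime.date(2024, 1, 1), "bottles_sold": 3},
    {"store_name": "STORE B", "date": datetime.date(2024, 1, 2), "bottles_sold": 5},
    {"store_name": "STORE C", "date": datetime.date(2024, 1, 3), "bottles_sold": 7},
]

text = rows_to_text(sample_rows, max_rows=2)
print(text)
```

JSON output also tends to be easier to inspect or parse later than the `str()` of a Python list.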

Create a Python function to handle function calls returned by the model and invoke the needed logic

In [21]:
def handle_query_fn_call(fn_name: str, fn_args: dict):
    """Handles query tool function calls."""

    print(f"Function calling: {fn_name} with args: {str(fn_args)}\n")
    if fn_name == "list_available_tables":
        result = list_available_tables()

    elif fn_name == "get_table_info":
        result = get_table_info(fn_args["table_id"])

    elif fn_name == "sql_query":
        result = sql_query(fn_args["query"])

    else:
        raise ValueError(f"Unknown function call: {fn_name}")

    return result

Instantiate a model, chat agent, and test out some queries!

In [22]:
model = GenerativeModel(
    "gemini-1.0-pro-001",
    tools=[query_tool],
    generation_config=GenerationConfig(temperature=0.0),
)
chat = ChatAgent(model=model, tool_handler_fn=handle_query_fn_call)

In [23]:
# Insert an initialization prompt before the first chat to help guide model behavior and output style/format

init_prompt = """
    Please give a concise and easy to understand answer to any questions. 
    Only use information that you learn by querying the BigQuery table. 
    Do not make up information. Be sure to look at which tables are available 
    and get the info of any relevant tables before trying to write a query. 
    
    Question:
"""

prompt = "Which store has sold the most bottles of all time?"
response = chat.send_message(init_prompt + prompt)
print(response.text)

Function calling: list_available_tables with args: {}

Function calling: get_table_info with args: {'table_id': 'kylesteckler-sandbox.iowa_liquor_sales.sales'}

Function calling: sql_query with args: {'query': 'SELECT store_name, SUM(bottles_sold) AS total_bottles_sold FROM `kylesteckler-sandbox.iowa_liquor_sales.sales` GROUP BY store_name ORDER BY total_bottles_sold DESC LIMIT 1'}

The store with the highest total number of bottles sold is HY-VEE #3 / BDI / DES MOINES, with 7,971,741 bottles sold.


In [24]:
response = chat.send_message(
    "Interesting! What is the most popular bottle of all time?"
)
print(response.text)

Function calling: sql_query with args: {'query': 'SELECT item_description, SUM(bottles_sold) AS total_bottles_sold FROM `kylesteckler-sandbox.iowa_liquor_sales.sales` GROUP BY item_description ORDER BY total_bottles_sold DESC LIMIT 1'}

The most popular bottle of all time is FIREBALL CINNAMON WHISKEY, with 17,082,679 bottles sold.


In [25]:
response = chat.send_message(
    "What are the five most popular bottles in polk county?"
)
print(response.text)

Function calling: sql_query with args: {'query': "SELECT item_description, SUM(bottles_sold) AS total_bottles_sold FROM `kylesteckler-sandbox.iowa_liquor_sales.sales` WHERE county = 'POLK' GROUP BY item_description ORDER BY total_bottles_sold DESC LIMIT 5"}

The five most popular bottles in Polk County are:

1. FIREBALL CINNAMON WHISKEY (4,840,632 bottles sold)
2. TITOS HANDMADE VODKA (2,412,290 bottles sold)
3. BLACK VELVET (2,190,747 bottles sold)
4. HAWKEYE VODKA (1,866,134 bottles sold)
5. FIREBALL CINNAMON (1,029,414 bottles sold)


In [26]:
response = chat.send_message(
    "What vendors have made the most revenue selling liquor?"
)
print(response.text)

Function calling: sql_query with args: {'query': 'SELECT vendor_name, SUM(sale_dollars) AS total_revenue FROM `kylesteckler-sandbox.iowa_liquor_sales.sales` GROUP BY vendor_name ORDER BY total_revenue DESC LIMIT 5'}

The top 5 vendors with the highest total revenue are:

1. DIAGEO AMERICAS ($896,548,524.26)
2. SAZERAC COMPANY INC ($345,882,242.31)
3. JIM BEAM BRANDS ($321,401,583.08)
4. PERNOD RICARD USA ($199,918,196.21)
5. HEAVEN HILL BRANDS ($184,274,291.31)


Feel free to execute the generated SQL to verify/validate the responses! An easy way to do this is in a code cell with `%%bigquery` at the top. For example:

```
%%bigquery
SELECT ... 
FROM ... 
```

## Function calling for entity extraction 
In the previous examples we used entity extraction to pass parameters along to another function, API, or client library. However, you might want to only perform the entity extraction step, and stop there without actually invoking anything else. You can think of this functionality as a convenient way to transform unstructured text data into structured fields.

For example, we can easily build a log extractor that transforms raw logs into structured data with details about error messages.

Start by specifying the function declaration.

In [27]:
extract_log_data_func = FunctionDeclaration(
    name="extract_log_data",
    description="Extracts specific details from errors in log data",
    parameters={
        "type": "object",
        "properties": {
            "errors": {
                "type": "array",
                "description": "Errors",
                "items": {
                    "description": "Details of the error",
                    "type": "object",
                    "properties": {
                        "error_message": {
                            "type": "string",
                            "description": "Full error message",
                        },
                        "error_code": {
                            "type": "string",
                            "description": "Error code",
                        },
                        "error_type": {
                            "type": "string",
                            "description": "Error type",
                        },
                    },
                },
            }
        },
    },
)

error_extraction_tool = Tool(
    function_declarations=[extract_log_data_func],
)

In [28]:
model = GenerativeModel(
    "gemini-1.0-pro-001",
    tools=[error_extraction_tool],
    generation_config=GenerationConfig(temperature=0.0),
)

prompt = """
[15:43:28] ERROR: Could not process image upload: Unsupported file format. (Error Code: 308)
[15:44:10] INFO: Search index updated successfully. 
[15:45:02] ERROR: Service dependency unavailable (payment gateway). Retrying... (Error Code: 5522) 
[15:45:33] ERROR: Application crashed due to out-of-memory exception. (Error Code: 9001) 
"""
response = model.generate_content(prompt)
function_call = response.candidates[0].content.parts[0].function_call

In [29]:
for err in dict(function_call.args).get("errors"):
    print(dict(err))

{'error_type': 'ERROR', 'error_message': 'Could not process image upload: Unsupported file format.', 'error_code': '308'}
{'error_type': 'ERROR', 'error_message': 'Service dependency unavailable (payment gateway). Retrying...', 'error_code': '5522'}
{'error_message': 'Application crashed due to out-of-memory exception.', 'error_type': 'ERROR', 'error_code': '9001'}
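
Once extracted, the dictionaries can be loaded into typed records for downstream use. The sketch below does this for the three errors printed above. The `LogError` dataclass and `to_log_errors` helper are hypothetical names, and missing keys default to an empty string because the declaration marks no field as required.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogError:
    """Hypothetical record type mirroring the fields in the declaration."""
    error_type: str
    error_code: str
    error_message: str


# The three dictionaries printed by the cell above
extracted = [
    {"error_type": "ERROR", "error_message": "Could not process image upload: Unsupported file format.", "error_code": "308"},
    {"error_type": "ERROR", "error_message": "Service dependency unavailable (payment gateway). Retrying...", "error_code": "5522"},
    {"error_message": "Application crashed due to out-of-memory exception.", "error_type": "ERROR", "error_code": "9001"},
]


def to_log_errors(items: List[dict]) -> List[LogError]:
    """Convert extracted dicts to LogError records, tolerating missing keys."""
    return [
        LogError(
            error_type=item.get("error_type", ""),
            error_code=item.get("error_code", ""),
            error_message=item.get("error_message", ""),
        )
        for item in items
    ]


log_errors = to_log_errors(extracted)
codes = [err.error_code for err in log_errors]
print(codes)
```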


Function calling is an incredibly versatile tool! 

Copyright 2024 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.