simonleewm/vertexai-function-calling


Supercharge Gemini: A Beginner's Guide to Function Calling in Vertex AI

Large Language Models (LLMs) like Google's Gemini are incredibly powerful at understanding and generating human-like text. They can write poems, summarize articles, and even answer complex questions. But what happens when you need them to do something they weren't explicitly trained for, like performing precise mathematical calculations, looking up real-time data from an API, or interacting with a private database?

This is where Function Calling comes in. It's a powerful feature that allows you to connect an LLM to your own custom code, effectively giving it superpowers. Instead of trying to perform a task it's not good at (like math), the model can instead call a reliable, purpose-built function you've provided.

In this guide, we'll walk through a hands-on lab where you'll teach Gemini how to do basic math. You'll learn how to:

  1. Define custom Python functions.
  2. Describe those functions to Gemini so it knows when and how to use them.
  3. Build a loop that lets Gemini call your code and use the results to answer user questions.

By the end, you'll have a solid understanding of the ReAct (Reason and Act) pattern and be ready to connect Gemini to any external tool or service you can imagine.
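Before touching any SDK, the ReAct pattern can be sketched in plain Python. In this toy version (every name here is illustrative, not part of Vertex AI), a fake "model" either requests a tool call or returns a final answer, and our loop keeps executing tools until an answer arrives:

```python
# A toy ReAct loop -- illustrative only; real model calls come later via the SDK.
def toy_model(question, observations):
    """Pretend model: asks for the 'add' tool once, then answers."""
    if not observations:
        return {"tool": "add", "args": {"a": 2, "b": 3}}  # Act: request a tool call
    return {"answer": f"The result is {observations[-1]}"}  # Reason with the observation

TOOLS = {"add": lambda a, b: a + b}

def react_loop(question):
    observations = []
    while True:
        step = toy_model(question, observations)
        if "answer" in step:                      # model produced a final answer
            return step["answer"]
        func = TOOLS[step["tool"]]                # model asked us to act
        observations.append(func(**step["args"]))

print(react_loop("What is 2 + 3?"))  # -> The result is 5
```

The real lab follows exactly this shape: the "toy model" becomes Gemini, and the `TOOLS` dict becomes our math functions.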


Lab Setup: Your Google Cloud Environment

First, we need to get our digital workspace ready. We'll be using Vertex AI Workbench, a Jupyter notebook environment hosted on Google Cloud.

1. Open Vertex AI Workbench

  • In the Google Cloud Console, use the navigation menu (☰) to go to Vertex AI > Workbench.
  • You'll see a pre-configured notebook instance named generative-ai-jupyterlab. Click OPEN JUPYTERLAB. This will open the familiar Jupyter interface in a new tab.

2. Create a New Notebook

  • In the JupyterLab launcher, under the "Notebook" section, click on Python 3.
  • A new Untitled.ipynb file will be created. Right-click on its tab and rename it to function_lab.ipynb.

Step 1: Preparing the Notebook

With our notebook open, the first step is to install the necessary libraries and initialize the Vertex AI SDK.

Install and Initialize

Run the following cells one by one.

  1. Install the Vertex AI SDK. This library allows our Python code to communicate with Google Cloud's AI services.

    ! pip3 install --upgrade --quiet --user google-cloud-aiplatform==1.88.0
  2. Restart the kernel. This is a crucial step to ensure the notebook can use the library we just installed.

    # Restart kernel after installs so that your environment can access the new packages
    import IPython
    
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
  3. Initialize Vertex AI. This cell gets your Google Cloud Project ID and tells the SDK which project and region to work in.

    import vertexai
    
    PROJECT_ID = ! gcloud config get-value project
    PROJECT_ID = PROJECT_ID[0]
    LOCATION = "us-central1"
    
    vertexai.init(project=PROJECT_ID, location=LOCATION)
  4. Import required libraries. We'll need these classes to define our functions and interact with the Gemini model.

    import requests
    from vertexai.generative_models import (
        Content,
        FunctionDeclaration,
        GenerationConfig,
        GenerativeModel,
        Part,
        Tool,
    )

Step 2: Define Your Python Functions

This is where we create the "tools" we'll give to Gemini. For this lab, we'll create two simple functions: one for addition and one for multiplication.

# Define a function to add two numerical inputs and return the result.
def add(a: float, b: float) -> float:
    """Adds two numbers."""
    print(f"Calling add function with {a} and {b}")
    return a + b

# Define a function to subtract two numerical inputs and return the result.
def subtract(a: float, b: float) -> float:
    """Subtracts the second number from the first number."""
    print(f"Calling subtract function with {a} and {b}")
    return a - b

# Define a function to multiply two numerical inputs and return the result.
def multiply(a: float, b: float) -> float:
    """Multiplies two numbers."""
    print(f"Calling multiply function with {a} and {b}")
    return a * b

Notice the docstrings ("""Adds two numbers.""") and type hints (a: float). In this lab the model never reads the Python source directly — in Step 3 we'll translate this same information into a structured declaration it can understand — but keeping it in the code makes the two easy to keep in sync.
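You can see exactly what metadata Python exposes from those definitions using only the standard library. This snippet pulls out the docstring, parameter names, and type annotations — the same facts we'll restate in the function declarations:

```python
import inspect

# Same function as in Step 2, re-declared so this snippet runs standalone.
def add(a: float, b: float) -> float:
    """Adds two numbers."""
    return a + b

sig = inspect.signature(add)
print(add.__doc__)                      # Adds two numbers.
print(list(sig.parameters))             # ['a', 'b']
print(sig.parameters["a"].annotation)   # <class 'float'>
```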


Step 3: Describe the Functions to Gemini

Just having the Python code isn't enough. We need to create a structured description, or a "manual," for each function that Gemini can understand. We do this using FunctionDeclaration.

This tells the model:

  • The function's name (add).
  • A description of what it does ("Adds two numbers").
  • The parameters it expects (a and b, both of which are numbers).
# Create FunctionDeclarations for your functions
add_func = FunctionDeclaration(
    name="add",
    description="Adds two numbers",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["a", "b"],
    },
)
subtract_func = FunctionDeclaration(
    name="subtract",
    description="Subtracts the second number from the first number",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["a", "b"],
    },
)
multiply_func = FunctionDeclaration(
    name="multiply",
    description="Multiplies two numbers",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["a", "b"],
    },
)
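Since all three declarations share an identical two-number schema, you could also generate the parameters dict with a small helper (the helper name is ours, not part of the SDK) and pass its result to each FunctionDeclaration:

```python
def two_number_params() -> dict:
    """JSON-Schema-style parameters block shared by add, subtract, and multiply."""
    return {
        "type": "object",
        "properties": {
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["a", "b"],
    }

# e.g. FunctionDeclaration(name="add", description="Adds two numbers",
#                          parameters=two_number_params())
```

This keeps the three declarations from drifting apart if you later change the schema.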

Step 4: Equip and Instruct the Model

Now we bundle our function declarations into a Tool and initialize the Gemini model, giving it the tool and a set of instructions.

The system_instruction is critical. It's our chance to guide the model's behavior, telling it when and how to use the tools we've provided.

# Bundle the functions into a single tool
math_tool = Tool(function_declarations=[add_func, subtract_func, multiply_func])

# Define system instructions
system_instruction = """
- Fulfill the user's instructions, including telling jokes.
- Answer the user's question using the appropriate function tool if available.
- You may call the 'multiply', 'add', and 'subtract' functions one after the other if needed.
- Use the 'multiply' function only for multiplication-related tasks.
- Use the 'add' function only for addition-related tasks.
- Use the 'subtract' function only for subtraction-related tasks.
- A function tool may only be invoked if its capabilities are an exact match for the described task. Avoid speculative or improper use of tools.
- In the absence of a suitable tool, respond to the user with a generic answer without performing the mathematical calculation yourself.
    """

# Initialize the model
model = GenerativeModel(
    model_name="gemini-1.5-flash-001",
    generation_config=GenerationConfig(temperature=0),
    system_instruction=system_instruction,
    tools=[math_tool]
)

# Start a chat session
chat = model.start_chat()

Step 5: Handle the Model's Response

This is the most important part. When we send a prompt to the model, it might respond with text or it might respond with a request to call one of our functions. Our code needs to handle both cases.

This function creates a loop:

  1. You ask a question.
  2. Gemini decides if it needs a tool. If so, it returns a function_call with the right arguments.
  3. Your code executes the function and sends the result back to Gemini.
  4. Gemini uses the result to formulate a final, human-friendly answer.
def handle_response(response):
    """Handles the model's response, invoking functions if necessary."""
    # Check if the model wants to call a function
    if response.candidates[0].content.parts[0].function_call:
        function_call = response.candidates[0].content.parts[0].function_call
    else:
        # If not, just print the text response and exit
        print(response.text)
        return

    # If the model wants to call the 'add' function...
    if function_call.name == "add":
        args = {key: value for key, value in function_call.args.items()}
        result = add(a=args["a"], b=args["b"])
        # Send the result back to the model
        response = chat.send_message(
            Part.from_function_response(name=function_call.name, response={"result": result})
        )
        # Continue the loop to get the final answer
        handle_response(response)

    # If the model wants to call the 'multiply' function...
    elif function_call.name == "multiply":
        args = {key: value for key, value in function_call.args.items()}
        result = multiply(a=args["a"], b=args["b"])
        # Send the result back to the model
        response = chat.send_message(
            Part.from_function_response(name=function_call.name, response={"result": result})
        )
        # Continue the loop to get the final answer
        handle_response(response)

    # If the model wants to call the 'subtract' function...
    elif function_call.name == "subtract":
        args = {key: value for key, value in function_call.args.items()}
        result = subtract(a=args["a"], b=args["b"])
        # Send the result back to the model
        response = chat.send_message(
            Part.from_function_response(name=function_call.name, response={"result": result})
        )
        # Continue the loop to get the final answer
        handle_response(response)
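The three branches above are identical except for the function they call, so once everything works you can collapse them into a dispatch table. The sketch below shows only the lookup-and-invoke step (with the Step 2 functions re-declared so it runs standalone); sending the result back via `chat.send_message` works exactly as in the handler above:

```python
# Re-declared Step 2 implementations so this snippet runs standalone.
def add(a, b): return a + b
def subtract(a, b): return a - b
def multiply(a, b): return a * b

# Map each declared function name to its Python implementation.
FUNCTIONS = {"add": add, "subtract": subtract, "multiply": multiply}

def dispatch(function_name: str, args: dict):
    """Look up and invoke the implementation the model asked for."""
    func = FUNCTIONS.get(function_name)
    if func is None:
        raise ValueError(f"Model requested unknown function: {function_name}")
    return func(**args)

print(dispatch("multiply", {"a": 5, "b": 10}))  # -> 50
```

Adding a fourth tool then means one new function, one new FunctionDeclaration, and one new dict entry — no new `elif` branch.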

Step 6: Putting It All to the Test!

Let's see our creation in action. We'll send a few different prompts to our chat session and let handle_response do its magic.

A non-math question:

response = chat.send_message("Tell me a joke?")
handle_response(response)
# Expected Output: Why don't scientists trust atoms? Because they make up everything!

A multiplication problem:

response = chat.send_message("I have 5 cakes each with 10 slices. How many slices do I have?")
handle_response(response)
# Expected Output:
# Calling multiply function with 5 and 10
# You have 50 slices.

An addition problem:

response = chat.send_message("John brought 4 cakes. Mary brought 3 cakes. How many cakes did they bring together?")
handle_response(response)
# Expected Output:
# Calling add function with 4 and 3
# They brought 7 cakes together.

A multi-step problem:

response = chat.send_message("John brought 4 cakes, but Mary ate 2 cakes. There are 10 slices per cake. How many slices are left?")
handle_response(response)
# Expected Output:
# Calling subtract function with 4 and 2
# Calling multiply function with 2 and 10
# There are 20 slices left.
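You can verify the chained arithmetic Gemini just performed by composing the Step 2 functions directly (re-declared here so the snippet runs standalone):

```python
def subtract(a, b): return a - b
def multiply(a, b): return a * b

# (4 cakes - 2 eaten) * 10 slices per cake
slices_left = multiply(subtract(4, 2), 10)
print(slices_left)  # -> 20
```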

Congratulations! You've successfully extended Gemini's capabilities with your own code. You've given it a calculator, and it knows how to use it to solve complex, multi-step word problems. From here, the possibilities are endless. You could connect it to a weather API, a product database, or any other tool that would make your AI application smarter and more useful.
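As a taste of that, here is what a hypothetical `get_weather` tool might look like. Everything below is made up for illustration — the function, its canned data, and the schema; a real version would call an actual weather API — but the shape (a Python function plus a JSON-Schema-style parameters block) is exactly what you built in Steps 2 and 3:

```python
def get_weather(city: str) -> dict:
    """Hypothetical tool: returns canned weather data instead of calling a real API."""
    fake_data = {"London": {"temp_c": 14, "conditions": "cloudy"}}
    return fake_data.get(city, {"temp_c": None, "conditions": "unknown"})

# The matching FunctionDeclaration parameters, in the same style as Step 3:
weather_params = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name, e.g. 'London'"},
    },
    "required": ["city"],
}

print(get_weather("London"))  # -> {'temp_c': 14, 'conditions': 'cloudy'}
```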
