In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

## Overview

### Gemini

Gemini is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases.

### Calling functions from Gemini

[Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling) in Gemini lets developers create a description of a function in their code, then pass that description to a language model in a request. The response from the model includes the name of a function that matches the description and the arguments to call it with.



### Objectives

In this tutorial, you will learn how to use the Vertex AI Gemini API with the Vertex AI SDK for Python to make function calls with the Gemini 1.5 Pro (`gemini-1.5-pro`) model.

You will complete the following tasks:

- Install the Vertex AI SDK for Python
- Use the Vertex AI Gemini API to interact with the Gemini 1.5 Pro (`gemini-1.5-pro`) model:
    - Use Function Calling in a chat session to answer users' questions about products in the Google Store
    - Use Function Calling to geocode addresses with a maps API
    - Use Function Calling for entity extraction on raw logging data

## Getting Started


### Install Vertex AI SDK for Python


In [None]:
!pip3 install --upgrade --user --quiet google-cloud-aiplatform


### Restart current runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel.

In [3]:
# Restart kernel after installs so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench).

In [24]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [25]:
PROJECT_ID = "qwiklabs-gcp-01-32be716382e6"  # @param {type:"string"}
LOCATION = "europe-west1"  # @param {type:"string"}

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

## Code Examples

### Import libraries


In [26]:
import requests
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerationConfig,
    GenerativeModel,
    Part,
    Tool,
)

### Chat example: Using Function Calling in a chat session to answer user's questions about the Google Store

In this example, you'll use Function Calling along with the chat modality in the Gemini model to help customers get information about products in the Google Store.

You'll start by defining three functions: one to get product information, another to get the location of the closest stores, and one more to place an order:

In [None]:

fetch_product_details = FunctionDeclaration(
    name="fetch_product_details",
    description="Retrieve the stock amount and identifier for a specific product",
    parameters={
        "type": "object",
        "properties": {
            "item_name": {"type": "string", "description": "Name of the product"}
        },
    },
)

locate_nearest_store = FunctionDeclaration(
    name="locate_nearest_store",
    description="Find the location of the nearest store",
    parameters={
        "type": "object",
        "properties": {"region": {"type": "string", "description": "User's location"}},
    },
)

create_order_request = FunctionDeclaration(
    name="create_order_request",
    description="Submit a request to place an order",
    parameters={
        "type": "object",
        "properties": {
            "item": {"type": "string", "description": "Name of the product"},
            "delivery_address": {"type": "string", "description": "Shipping address for the order"},
        },
    },
)


# Confirm which function declarations are now available to the model
print("Declared product details function: 'fetch_product_details'")
print("Declared store location function: 'locate_nearest_store'")
print("Declared order placement function: 'create_order_request'")


Note that function parameters are specified as a Python dictionary in accordance with the [OpenAPI JSON schema format](https://spec.openapis.org/oas/v3.0.3#schemawr).
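Because the parameters are an ordinary dictionary, you can inspect them with plain Python before using them. The sketch below is only an illustration: `check_args` is a hypothetical helper, and the `required` list is an addition that the declarations in this notebook do not include. It checks required names, unexpected names, and string-typed values only, so it is not a full schema validator.

```python
# Parameter schema in the same dictionary style as the declarations above.
# The "required" list is an assumption added for illustration.
product_schema = {
    "type": "object",
    "properties": {
        "item_name": {"type": "string", "description": "Name of the product"}
    },
    "required": ["item_name"],
}


def check_args(schema: dict, args: dict) -> list:
    """Return a list of problems found in args (hypothetical helper)."""
    problems = []
    properties = schema.get("properties", {})
    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    # Every supplied argument must be declared and, if typed as string, be a str.
    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
        elif properties[name].get("type") == "string" and not isinstance(value, str):
            problems.append(f"argument {name} should be a string")
    return problems


ok_problems = check_args(product_schema, {"item_name": "Pixel 8 Pro"})
bad_problems = check_args(product_schema, {"product": 8})
print(ok_problems)
print(bad_problems)
```

A check like this can be useful before forwarding model-generated arguments to a backend, since the model's arguments are predictions rather than guaranteed-valid input.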

Define a tool that allows the Gemini model to select from the set of 3 functions:

In [None]:
retail_tool = Tool(
    function_declarations=[
        fetch_product_details,
        locate_nearest_store,
        create_order_request,
    ],
)

Now you can initialize the Gemini model with Function Calling in a multi-turn chat session.

You can specify the `tools` kwarg when initializing the model to avoid having to send this kwarg with every subsequent request:

In [29]:
model = GenerativeModel(
    "gemini-1.5-pro-001",
    generation_config=GenerationConfig(temperature=0),
    tools=[retail_tool],
)
chat = model.start_chat()

*Note: The temperature parameter controls the degree of randomness in generation. Lower temperatures suit functions that require deterministic parameter values, while higher temperatures suit functions whose parameters accept more diverse or creative values. At a temperature of 0, the model always selects the highest-probability tokens, so responses for a given prompt are mostly deterministic, although a small amount of variation is still possible.*
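To build intuition for the temperature setting, here is a small conceptual sketch of temperature scaling applied to a softmax over candidate scores. It illustrates the general technique only and is not a description of how Gemini is implemented; the function name and the scores are made up for the example, and a temperature of 0 is treated here as always picking the top-scoring option.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature.

    A temperature of 0 is treated as greedy selection: all probability
    goes to the highest-scoring option.
    """
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
scores = [2.0, 1.0, 0.5]
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring candidate, while higher temperatures spread it more evenly across the alternatives.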

We're ready to chat! Let's start the conversation by asking if a certain product is in stock:

In [None]:

query_product_availability = """
Do you have the Pixel 8 Pro in stock?
"""

# Send the question and inspect the first part of the model's reply
response_message = chat.send_message(query_product_availability)
response_content = response_message.candidates[0].content.parts[0]

print("Query sent to check Pixel 8 Pro availability. Response received:")
print(response_content)


function_call {
  name: "fetch_product_details"
  args {
    fields {
      key: "item_name"
      value {
        string_value: "Pixel 8 Pro"
      }
    }
  }
}

The response from the Gemini API consists of a structured data object that contains the name and parameters of the function that Gemini selected out of the available functions.
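As a rough sketch of how you might read that structured object, the snippet below pulls the function name and arguments into plain Python values. It assumes the part exposes `function_call.name` and `function_call.args`, as parts in the Vertex AI SDK do; a `SimpleNamespace` stand-in is used so the example runs without a live API call, and `extract_function_call` is a hypothetical helper. With a real response, `args` is a map-like object, so converting it with `dict(...)` is an assumption that holds for flat arguments like these.

```python
from types import SimpleNamespace

# Stand-in for a part returned by the model: a function name plus
# keyword arguments, using one of the functions declared earlier.
part = SimpleNamespace(
    function_call=SimpleNamespace(
        name="fetch_product_details",
        args={"item_name": "Pixel 8 Pro"},
    )
)


def extract_function_call(part) -> tuple:
    """Return (name, args) as plain Python values (hypothetical helper)."""
    call = part.function_call
    return call.name, dict(call.args)


name, args = extract_function_call(part)
print(name, args)
```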

Since this notebook focuses on the ability to extract function parameters and generate function calls, you'll use mock data to feed synthetic responses back to the Gemini model rather than sending a request to an API server (not to worry, we'll make an actual API call in a later example!):

In [None]:

# Synthetic payload standing in for a real product lookup (values are placeholders)
polar_bear_api_response = {"sku": "EXAMPLE-SKU-001", "in_stock": "yes"}

In reality, you would execute function calls against an external system or database using your desired client library or REST API.
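One common way to organize that step is a lookup table from function names to the code that fulfils them. The sketch below is a minimal illustration under stated assumptions: the three handlers are hypothetical local stand-ins that return synthetic data instead of calling a real backend, and `HANDLERS` and `run_function_call` are names invented for this example.

```python
# Hypothetical local handlers standing in for calls to real backend systems.
def fetch_product_details(item_name: str) -> dict:
    return {"item_name": item_name, "in_stock": "yes"}  # synthetic data


def locate_nearest_store(region: str) -> dict:
    return {"region": region, "store": "placeholder store address"}  # synthetic data


def create_order_request(item: str, delivery_address: str) -> dict:
    return {"item": item, "delivery_address": delivery_address, "status": "received"}


# Map each declared function name to the code that fulfils it.
HANDLERS = {
    "fetch_product_details": fetch_product_details,
    "locate_nearest_store": locate_nearest_store,
    "create_order_request": create_order_request,
}


def run_function_call(name: str, args: dict) -> dict:
    """Execute the handler the model asked for, or report an unknown name."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown function: {name}"}
    return handler(**args)


result = run_function_call("fetch_product_details", {"item_name": "Pixel 8 Pro"})
unknown = run_function_call("get_weather", {})
print(result)
print(unknown)
```

Returning an error payload for an unknown name, rather than raising, keeps the application in control when the model predicts a function that was never declared.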

Now, you can pass the response from the (mock) API request and generate a response for the end user:

In [None]:
response = chat.send_message(
    Part.from_function_response(
        name="fetch_product_details",
        response={
            "content": polar_bear_api_response,
        },
    ),
)
response.text


Next, the user might ask where they can buy a different phone from a nearby store:

In [None]:
prompt = """
What about the Pixel 8? Is there a store in
Mountain View, CA that I can visit to try one out?
"""

response = chat.send_message(prompt)
response.candidates[0].content.parts[0]


Again, you get a response with structured data, and the Gemini model selected the `fetch_product_details` function. This happened because the user asked about the "Pixel 8" phone this time rather than the "Pixel 8 Pro" phone.

Now you can build another synthetic payload that would come from an external API:

In [None]:
# Here you can use your preferred method to make an API request and get a response.
# In this example, we'll use synthetic data to simulate a payload from an external API response.

polar_bear_api_response = {"sku": "EXAMPLE-SKU-002", "in_stock": "yes"}


Again, you can pass the response from the (mock) API request back to the Gemini model:

In [None]:
response = chat.send_message(
    Part.from_function_response(
        name="fetch_product_details",
        response={
            # Use the payload defined in the previous cell
            "content": polar_bear_api_response,
        },
    ),
)
response.candidates[0].content.parts[0]

function_call {
  name: "locate_nearest_store"
  args {
    fields {
      key: "region"
      value {
        string_value: "Mountain View, CA"
      }
    }
  }
}

Wait a minute! Why did the Gemini API respond with a second function call to `locate_nearest_store` this time rather than a natural language summary? Look closely at the prompt that you used in this conversation turn a few cells up, and you'll notice that the user asked about a product *and* the location of a store.

In cases like this when two or more functions are defined (or when the model predicts multiple function calls to the same function), the Gemini model might sometimes return back-to-back or parallel function call responses within a single conversation turn.

This is expected behavior since the Gemini model predicts which functions it should call at runtime, what order it should call dependent functions in, and which function calls can be parallelized, so that the model can gather enough information to generate a natural language response.
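One way to handle back-to-back function calls is a loop that keeps fulfilling them until the model replies with text. The sketch below shows that control flow only: `ScriptedChat`, `make_call_part`, `make_text_part`, and `answer` are hypothetical stand-ins so the example runs without a live API call. With the real SDK you would send messages with `chat.send_message` and wrap results with `Part.from_function_response`, and the way a real part signals that it carries no function call may differ from the `None` check used here.

```python
from types import SimpleNamespace


def make_call_part(name, args):
    """Build a stand-in for a model part that requests a function call."""
    return SimpleNamespace(function_call=SimpleNamespace(name=name, args=args), text=None)


def make_text_part(text):
    """Build a stand-in for a model part that carries plain text."""
    return SimpleNamespace(function_call=None, text=text)


class ScriptedChat:
    """Fake chat session that replays a fixed list of model parts."""

    def __init__(self, scripted_parts):
        self._parts = list(scripted_parts)
        self.sent = []  # everything the application sent, for inspection

    def send(self, message):
        self.sent.append(message)
        return self._parts.pop(0)


def answer(chat, prompt, handlers, max_rounds=5):
    """Keep fulfilling function calls until the model replies with text."""
    part = chat.send(prompt)
    rounds = 0
    while part.function_call is not None:
        if rounds == max_rounds:
            raise RuntimeError("model kept requesting function calls")
        call = part.function_call
        result = handlers[call.name](**call.args)
        # Feed the function result back so the model can continue.
        part = chat.send({"name": call.name, "response": {"content": result}})
        rounds += 1
    return part.text


# Synthetic handlers and a scripted two-call conversation turn.
handlers = {
    "fetch_product_details": lambda item_name: {"item_name": item_name, "in_stock": "yes"},
    "locate_nearest_store": lambda region: {"region": region, "store": "placeholder address"},
}
chat_session = ScriptedChat([
    make_call_part("fetch_product_details", {"item_name": "Pixel 8"}),
    make_call_part("locate_nearest_store", {"region": "Mountain View, CA"}),
    make_text_part("The Pixel 8 is in stock at a store near Mountain View."),
])
final_text = answer(chat_session, "What about the Pixel 8?", handlers)
print(final_text)
```

The `max_rounds` cap is a design choice for this sketch: it stops the loop if the model never settles on a text answer.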

Not to worry! You can repeat the same steps as before and build another synthetic payload that would come from an external API:

In [None]:

# Here you can use your preferred method to make an API request and get a response.
# In this example, we'll use synthetic data to simulate a store-locator payload.
polar_bear_habitat_response = {"store": "Example store address, Mountain View, CA"}


And you can pass the response from the (mock) API request back to the Gemini model:

In [None]:
response = chat.send_message(
    Part.from_function_response(
        name="locate_nearest_store",
        response={
            # Send the store payload built in the previous cell
            "content": polar_bear_habitat_response,
        },
    ),
)
response.text


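Finally, when the user asks to place an order and the model responds with a `create_order_request` function call, send the payload from the (mock) external API call so that the Gemini API returns a natural language summary to the end user: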

In [None]:

# Synthetic payload standing in for a real order-placement call
polar_bear_order_response = {
    "order_status": "confirmed",
    "order_number": 78910,
    "delivery_estimate": "3 days",
    "items": ["Pixel 8 Pro"],
}

# The name must match the function declared in the tool
order_response_message = chat.send_message(
    Part.from_function_response(
        name="create_order_request",
        response={"content": polar_bear_order_response},
    )
)

order_response_content = order_response_message.candidates[0].content.parts[0]

print("Order payload sent to the model. Response received:")
print(order_response_content)

