In [None]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Intro to Batch Predictions with the Gemini API


<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_prediction.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fbatch-prediction%2Fintro_batch_prediction.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/batch-prediction/intro_batch_prediction.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/bigquery/import?url=https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_prediction.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/bigquery/v1/32px.svg" alt="BigQuery Studio logo"><br> Open in BigQuery Studio
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_prediction.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>

| | |
|-|-|
|Author(s) | [Eric Dong](https://github.com/gericdong), [Holt Skinner](https://github.com/holtskinner) |

## Overview

Unlike online (synchronous) requests, where you send one input request at a time, batch predictions with the Gemini API in Vertex AI let you send a large number of multimodal requests to a Gemini model in a single batch request. The model responses are then written asynchronously to your output location in [Cloud Storage](https://cloud.google.com/storage/docs/introduction) or [BigQuery](https://cloud.google.com/bigquery/docs/storage_overview).

Batch predictions are generally more efficient and cost-effective than online predictions when processing a large number of inputs that are not latency sensitive.

To learn more, see the [Get batch predictions for Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini) page.

### Objectives

In this tutorial, you learn how to make batch predictions with the Gemini API in Vertex AI. This tutorial shows how to use **Cloud Storage** and **BigQuery** as input sources and output locations.

You will complete the following tasks:

- Preparing batch inputs and an output location
- Submitting a batch prediction job
- Retrieving batch prediction results


## Get started

### Install Google Gen AI SDK


In [None]:
%pip install --upgrade --quiet google-genai pandas google-cloud-storage google-cloud-bigquery

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Import libraries


In [1]:
from datetime import datetime
import time

from google import genai
from google.cloud import bigquery
from google.genai.types import CreateBatchJobConfig
import pandas as pd

### Set Google Cloud project information and create client

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [2]:
import os

PROJECT_ID = "[your-project-id]"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

In [3]:
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

### Load model

You can find a list of the Gemini models that support batch predictions in the [Multimodal models that support batch predictions](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini#multimodal_models_that_support_batch_predictions) page.

This tutorial uses the Gemini 2.0 Flash (`gemini-2.0-flash-001`) model.

In [4]:
MODEL_ID = "gemini-2.0-flash-001"  # @param {type:"string", isTemplate: true}

## Cloud Storage

### Prepare batch inputs

The input for batch requests specifies the items to send to your model for prediction. You can learn more about the batch input formats in the [Batch text generation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini#prepare_your_inputs) page.

This tutorial uses Cloud Storage as an example. The requirements for Cloud Storage input are:

- File format: [JSON Lines (JSONL)](https://jsonlines.org/)
- Located in `us-central1`
- Appropriate read permissions for the service account

Each request that you send to a model can include parameters that control how the model generates a response. Learn more about Gemini parameters in the [Experiment with parameter values](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values) page.

This is one of the example requests in the input JSONL file `batch_requests_for_multimodal_input_2.jsonl`:

```json
{"request":{"contents": [{"role": "user", "parts": [{"text": "List objects in this image."}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/image/office-desk.jpeg", "mime_type": "image/jpeg"}}]}],"generationConfig":{"temperature": 0.4}}}
```
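
As a minimal sketch of how such a file could be produced, the snippet below builds request lines with the same shape as the example above and serializes them as JSONL. The helper name `build_request_line` is hypothetical, and pairing the same prompt with both sample image URIs is an assumption for illustration, not a description of the sample dataset's contents.

```python
import json


def build_request_line(
    prompt: str, file_uri: str, mime_type: str, temperature: float = 0.4
) -> str:
    """Return one JSONL line shaped like the example request (hypothetical helper)."""
    request = {
        "request": {
            "contents": [
                {
                    "role": "user",
                    "parts": [
                        {"text": prompt},
                        {"file_data": {"file_uri": file_uri, "mime_type": mime_type}},
                    ],
                }
            ],
            "generationConfig": {"temperature": temperature},
        }
    }
    # JSONL requires exactly one JSON object per line, so no pretty-printing.
    return json.dumps(request)


image_uris = [
    "gs://cloud-samples-data/generative-ai/image/office-desk.jpeg",
    "gs://cloud-samples-data/generative-ai/image/gardening-tools.jpeg",
]
jsonl_lines = [
    build_request_line("List objects in this image.", uri, "image/jpeg")
    for uri in image_uris
]
jsonl_text = "\n".join(jsonl_lines) + "\n"
print(jsonl_text)
```

To use a file like this as `INPUT_DATA`, you would upload it to a Cloud Storage bucket that the batch job can read.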

In [7]:
INPUT_DATA = "gs://cloud-samples-data/generative-ai/batch/batch_requests_for_multimodal_input_2.jsonl"  # @param {type:"string"}

### Prepare batch output location

When a batch prediction task completes, the output is stored in the location that you specified in your request.

- The location is a Cloud Storage URI prefix, for example `gs://path/to/output/data`.
- Set `BUCKET_URI` to the URI of your own Cloud Storage bucket. If you leave the placeholder unchanged, this notebook creates a bucket named `gs://PROJECT_ID-TIMESTAMP`.

In [5]:
BUCKET_URI = "[your-cloud-storage-bucket]"  # @param {type:"string"}

if BUCKET_URI == "[your-cloud-storage-bucket]":
    TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
    BUCKET_URI = f"gs://{PROJECT_ID}-{TIMESTAMP}"

    ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}

### Send a batch prediction request

To make a batch prediction request, you specify a source model ID, an input source and an output location where Vertex AI stores the batch prediction results.

To learn more, see the [Batch prediction API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/batch-prediction-api) page.


In [None]:
gcs_batch_job = client.batches.create(
    model=MODEL_ID,
    src=INPUT_DATA,
    config=CreateBatchJobConfig(dest=BUCKET_URI),
)
gcs_batch_job.name

Print out the job status and other properties. You can also check the status in the Cloud Console at https://console.cloud.google.com/vertex-ai/batch-predictions

In [None]:
gcs_batch_job = client.batches.get(name=gcs_batch_job.name)
gcs_batch_job

Optionally, you can list all the batch prediction jobs in the project.

In [None]:
for job in client.batches.list():
    print(job.name, job.create_time, job.state)

### Wait for the batch prediction job to complete

Depending on the number of input items that you submitted, a batch generation task can take some time to complete. You can use the following code to check the job status and wait for the job to complete.

In [None]:
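# Refresh the job until it reaches a terminal state.
# A new job starts as queued or pending, so checking only for
# "JOB_STATE_RUNNING" would exit the loop before the job has run.
completed_states = (
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_PAUSED",
)
while gcs_batch_job.state not in completed_states:
    time.sleep(5)
    gcs_batch_job = client.batches.get(name=gcs_batch_job.name)

# Check if the job succeeds
if gcs_batch_job.state == "JOB_STATE_SUCCEEDED":
    print("Job succeeded!")
else:
    print(f"Job failed: {gcs_batch_job.error}")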
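
For longer jobs, a fixed 5-second poll with no upper bound may not be ideal. The following is a sketch of a generic polling helper with a timeout and capped exponential backoff. The helper `wait_for_terminal_state`, its parameters, and the tuple of terminal state names are assumptions for illustration; check the job states reported by your SDK version before relying on them.

```python
import time
from typing import Callable

# Assumed terminal states; verify against the states your SDK version reports.
TERMINAL_STATES = (
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_PAUSED",
)


def wait_for_terminal_state(
    get_state: Callable[[], str],
    timeout_s: float = 3600.0,
    initial_delay_s: float = 5.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a terminal state or the timeout elapses."""
    delay = initial_delay_s
    waited = 0.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"Job still in {state} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Double the delay after each poll, up to the cap.
        delay = min(delay * 2, max_delay_s)


# Demo with a fake state sequence so the sketch runs without a real job.
fake_states = iter(["JOB_STATE_PENDING", "JOB_STATE_RUNNING", "JOB_STATE_SUCCEEDED"])
final_state = wait_for_terminal_state(lambda: next(fake_states), sleep=lambda s: None)
print(final_state)
```

With a real job, `get_state` could be something like `lambda: client.batches.get(name=gcs_batch_job.name).state`, assuming the returned state compares equal to these strings as it does in the cell above.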

### Retrieve batch prediction results

When a batch prediction task is complete, the prediction output is stored as JSONL in the Cloud Storage location that you specified in your request.

The file name should look like this: `{gcs_batch_job.dest.gcs_uri}/prediction-model-TIMESTAMP/predictions.jsonl`

Example output:

```json
{"status": "", "processed_time": "2024-11-13T14:04:28.376+00:00", "request": {"contents": [{"parts": [{"file_data": null, "text": "List objects in this image."}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/image/gardening-tools.jpeg", "mime_type": "image/jpeg"}, "text": null}], "role": "user"}], "generationConfig": {"temperature": 0.4}}, "response": {"candidates": [{"avgLogprobs": -0.10394711927934126, "content": {"parts": [{"text": "Here's a list of the objects in the image:\n\n* **Watering can:** A green plastic watering can with a white rose head.\n* **Plant:** A small plant (possibly oregano) in a terracotta pot.\n* **Terracotta pots:** Two terracotta pots, one containing the plant and another empty, stacked on top of each other.\n* **Gardening gloves:** A pair of striped gardening gloves.\n* **Gardening tools:** A small trowel and a hand cultivator (hoe).  Both are green with black handles."}], "role": "model"}, "finishReason": "STOP"}], "modelVersion": "gemini-1.5-flash-002@default", "usageMetadata": {"candidatesTokenCount": 110, "promptTokenCount": 264, "totalTokenCount": 374}}}
```
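
To locate the output file programmatically, one option is to list the objects under the output prefix and keep those named `predictions.jsonl`. The sketch below shows only the filtering step on a list of object names; the helper `find_prediction_files` and the sample object names (including the timestamp format) are hypothetical. In practice the names could come from the `google-cloud-storage` client, for example by iterating over `storage.Client().list_blobs(bucket_name, prefix=prefix)`.

```python
from typing import Iterable, List


def find_prediction_files(
    object_names: Iterable[str], output_prefix: str = ""
) -> List[str]:
    """Return object names under output_prefix whose file name is predictions.jsonl."""
    prefix = output_prefix.strip("/")
    matches = []
    for name in object_names:
        # Skip objects outside the requested prefix, if one was given.
        if prefix and not name.startswith(prefix + "/"):
            continue
        if name.rsplit("/", 1)[-1] == "predictions.jsonl":
            matches.append(name)
    return sorted(matches)


# Hypothetical object names, following the path pattern described above.
sample_names = [
    "output/prediction-model-2024-11-13T14:00:00.000000Z/predictions.jsonl",
    "output/prediction-model-2024-11-13T14:00:00.000000Z/other-file.txt",
    "unrelated/predictions.jsonl",
]
print(find_prediction_files(sample_names, output_prefix="output"))
```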


The example code below shows how to load the `.jsonl` file in the Cloud Storage output location into a Pandas DataFrame and print out the object.

You can retrieve the specific responses in the `response` field.

In [None]:
# Load the JSONL file into a DataFrame
df = pd.read_json(f"{gcs_batch_job.dest.gcs_uri}/*/predictions.jsonl", lines=True)
df = df.join(pd.json_normalize(df["response"], "candidates"))
df
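
To pull just the generated text out of each row, a small helper can walk the `response` structure shown in the example output above. This is a sketch that assumes each response has the `candidates[].content.parts[].text` shape from that example; the helper name `extract_text` is hypothetical, and rows with a missing or malformed response yield an empty string.

```python
from typing import Any


def extract_text(response: Any) -> str:
    """Concatenate the text parts of the first candidate, or return '' if absent."""
    if not isinstance(response, dict):
        return ""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = (candidates[0].get("content") or {}).get("parts") or []
    return "".join(part.get("text") or "" for part in parts)


# Trimmed-down response in the shape of the example output above.
sample_response = {
    "candidates": [
        {
            "content": {
                "parts": [{"text": "Here's a list of the objects in the image:"}],
                "role": "model",
            },
            "finishReason": "STOP",
        }
    ]
}
print(extract_text(sample_response))
print(repr(extract_text(None)))
```

With the DataFrame above, this could be applied as `df["response"].apply(extract_text)`.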

## BigQuery

### Prepare batch inputs

The input for batch requests specifies the items to send to your model for prediction, so it must be structured correctly. For more details, see the [Batch text generation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini#prepare_your_inputs) page.

This tutorial uses **BigQuery** as an example. To use a BigQuery table as input:
- Ensure the dataset is created in a supported region (e.g., `us-central1`). Multi-region locations (e.g., `us`) are not allowed.  
- The input table must include a `request` column of type `JSON` or `STRING` containing valid JSON, structured as a [`GenerateContentRequest`](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference).  
- Additional columns can use any BigQuery data type except `array`, `struct`, `range`, `datetime`, and `geography`. They are ignored during generation but are included in the output table. The column names `response` and `status` are reserved for output.  
- Only public YouTube or Cloud Storage URIs are supported in the `fileData` or `file_data` field.  
- Requests can include parameters to customize the model's output. Learn more in the [Gemini parameters guide](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values).
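
As an illustration of the `request` column requirement, the sketch below builds a row whose `request` value is a JSON string and rejects extra columns that use a reserved name. The helper `make_input_row`, the extra `image_id` column, and the choice to serialize the request as a `STRING` are assumptions for illustration; loading the rows into a table is out of scope here.

```python
import json

# `response` and `status` are reserved for output, per the list above.
RESERVED_COLUMNS = {"response", "status"}


def make_input_row(prompt: str, file_uri: str, mime_type: str, **extra_columns) -> dict:
    """Build one row with a JSON-string `request` column plus pass-through columns."""
    clashes = (RESERVED_COLUMNS | {"request"}) & set(extra_columns)
    if clashes:
        raise ValueError(f"Column names not allowed here: {sorted(clashes)}")
    request = {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": prompt},
                    {"file_data": {"file_uri": file_uri, "mime_type": mime_type}},
                ],
            }
        ]
    }
    return {"request": json.dumps(request), **extra_columns}


row = make_input_row(
    "List objects in this image.",
    "gs://cloud-samples-data/generative-ai/image/office-desk.jpeg",
    "image/jpeg",
    image_id="office-desk",  # hypothetical pass-through column
)
print(row)
```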

This is an example BigQuery table with sample requests:

In [23]:
INPUT_DATA = "bq://storage-samples.generative_ai.batch_requests_for_multimodal_input_2"  # @param {type:"string"}

You can query the BigQuery table to review the input data.

In [None]:
bq_client = bigquery.Client(project=PROJECT_ID)

bq_table_id = INPUT_DATA.replace("bq://", "")
sql = f"""
        SELECT *
        FROM {bq_table_id}
        """

query_result = bq_client.query(sql)

df = query_result.result().to_dataframe()
df.head()

### Prepare batch output location

When a batch prediction task completes, the output is stored in the location that you specified in your request.

- The location is a BigQuery URI, for example a dataset prefix such as `bq://projectId.bqDatasetId`.
- If you don't specify an output location in the request, `bq://PROJECT_ID.gen_ai_batch_prediction.predictions_TIMESTAMP` is used.

This tutorial uses a **BigQuery** table as the output location.

- Set `BQ_OUTPUT_URI` to the URI of your own BigQuery table.
- If you leave the placeholder unchanged, this notebook creates the dataset `gen_ai_batch_prediction` in your project (if it doesn't already exist) and writes the output to a new table named `prediction_result_TIMESTAMP` in that dataset.
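
Because a `bq://` URI can name either a dataset or a table, it can help to check its shape before submitting a job. The sketch below splits such a URI into its components; the helper `parse_bq_uri`, the `BigQueryTarget` type, and the sample project ID are hypothetical, and the code assumes project IDs contain no dots (domain-scoped project IDs would need extra handling).

```python
from typing import NamedTuple, Optional


class BigQueryTarget(NamedTuple):
    project: str
    dataset: str
    table: Optional[str]  # None when the URI names only a dataset


def parse_bq_uri(uri: str) -> BigQueryTarget:
    """Split 'bq://project.dataset[.table]' into its parts."""
    scheme = "bq://"
    if not uri.startswith(scheme):
        raise ValueError(f"Expected a URI starting with {scheme!r}: {uri!r}")
    parts = uri[len(scheme):].split(".")
    # Require two or three non-empty components.
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"Expected bq://project.dataset[.table]: {uri!r}")
    table = parts[2] if len(parts) == 3 else None
    return BigQueryTarget(parts[0], parts[1], table)


print(parse_bq_uri("bq://my-project.gen_ai_batch_prediction"))
print(parse_bq_uri("bq://my-project.gen_ai_batch_prediction.predictions_20240101000000"))
```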

In [None]:
BQ_OUTPUT_URI = "[your-bigquery-table]"  # @param {type:"string"}

if BQ_OUTPUT_URI == "[your-bigquery-table]":
    bq_dataset_id = "gen_ai_batch_prediction"

    # The output table will be created automatically if it doesn't exist
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    bq_table_id = f"prediction_result_{timestamp}"
    BQ_OUTPUT_URI = f"bq://{PROJECT_ID}.{bq_dataset_id}.{bq_table_id}"

    bq_dataset = bigquery.Dataset(f"{PROJECT_ID}.{bq_dataset_id}")
    bq_dataset.location = "us-central1"

    bq_dataset = bq_client.create_dataset(bq_dataset, exists_ok=True, timeout=30)
    print(
        f"Created BigQuery dataset {bq_client.project}.{bq_dataset.dataset_id} for batch prediction output."
    )

print(f"BigQuery output URI: {BQ_OUTPUT_URI}")

### Send a batch prediction request

To make a batch prediction request, you specify a source model ID, an input source and an output location where Vertex AI stores the batch prediction results.

To learn more, see the [Batch prediction API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/batch-prediction-api) page.


In [None]:
bq_batch_job = client.batches.create(
    model=MODEL_ID,
    src=INPUT_DATA,
    config=CreateBatchJobConfig(dest=BQ_OUTPUT_URI),
)
bq_batch_job.name

Print out the job status and other properties. You can also check the status in the Cloud Console at https://console.cloud.google.com/vertex-ai/batch-predictions

In [None]:
bq_batch_job = client.batches.get(name=bq_batch_job.name)
bq_batch_job

Optionally, you can list all the batch prediction jobs in the project.

In [None]:
for job in client.batches.list():
    print(job.name, job.create_time, job.state)

### Wait for the batch prediction job to complete

Depending on the number of input items that you submitted, a batch generation task can take some time to complete. You can use the following code to check the job status and wait for the job to complete.

In [None]:
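# Refresh the job until it reaches a terminal state.
# A new job starts as queued or pending, so checking only for
# "JOB_STATE_RUNNING" would exit the loop before the job has run.
completed_states = (
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_PAUSED",
)
while bq_batch_job.state not in completed_states:
    time.sleep(5)
    bq_batch_job = client.batches.get(name=bq_batch_job.name)

# Check if the job succeeds
if bq_batch_job.state == "JOB_STATE_SUCCEEDED":
    print("Job succeeded!")
else:
    print(f"Job failed: {bq_batch_job.error}")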

### Retrieve batch prediction results

When a batch prediction task is complete, the prediction output is stored in the location that you specified in your request. The location is also available on the job object as `bq_batch_job.dest.bigquery_uri` (or `dest.gcs_uri` for Cloud Storage output).

- With BigQuery, the output is written to a table in the output dataset. If you provided your own `BQ_OUTPUT_URI`, the output is stored at that location.
- If you left the placeholder unchanged, this notebook created the dataset `bq://PROJECT_ID.gen_ai_batch_prediction` and named the output table `prediction_result_TIMESTAMP`.
- If a request does not name an output table, the table name is formed by appending the timestamp of when the batch prediction job started to `predictions_`.

You can use the example code below to retrieve predictions and store them into a Pandas DataFrame.


In [None]:
bq_table_id = bq_batch_job.dest.bigquery_uri.replace("bq://", "")

sql = f"""
        SELECT *
        FROM {bq_table_id}
        """

query_result = bq_client.query(sql)

df = query_result.result().to_dataframe()
df.head()
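
The `usageMetadata` field in each response (visible in the Cloud Storage example output earlier) can be summed to estimate total token usage for a job. The sketch below assumes the `response` values are either dicts or JSON strings with that shape; the helper `total_token_usage` is hypothetical, and the second sample response is made up for illustration.

```python
import json
from collections import Counter
from typing import Any, Iterable

TOKEN_FIELDS = ("promptTokenCount", "candidatesTokenCount", "totalTokenCount")


def total_token_usage(responses: Iterable[Any]) -> dict:
    """Sum usageMetadata token counts across responses, skipping unusable rows."""
    totals = Counter({field: 0 for field in TOKEN_FIELDS})
    for response in responses:
        # Accept JSON strings as well as already-parsed dicts.
        if isinstance(response, str):
            try:
                response = json.loads(response)
            except json.JSONDecodeError:
                continue
        if not isinstance(response, dict):
            continue
        usage = response.get("usageMetadata") or {}
        for field in TOKEN_FIELDS:
            totals[field] += int(usage.get(field) or 0)
    return dict(totals)


sample_responses = [
    # Counts taken from the example output shown earlier.
    {"usageMetadata": {"candidatesTokenCount": 110, "promptTokenCount": 264, "totalTokenCount": 374}},
    # Made-up second response, given as a JSON string.
    json.dumps({"usageMetadata": {"candidatesTokenCount": 10, "promptTokenCount": 20, "totalTokenCount": 30}}),
    None,  # for example, a row with no response
]
print(total_token_usage(sample_responses))
```

With the DataFrame above, this could be called as `total_token_usage(df["response"])`.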

## Cleaning up

Clean up resources created in this notebook.

In [None]:
# Delete the batch prediction jobs
if gcs_batch_job:
    client.batches.delete(name=gcs_batch_job.name)
if bq_batch_job:
    client.batches.delete(name=bq_batch_job.name)