In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Analyze Multimodal Data in BigQuery

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/applying-llms-to-data/multimodal-analysis-bigquery/analyze_multimodal_data_bigquery.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fuse-cases%2Fapplying-llms-to-data%2Fmultimodal-analysis-bigquery%2Fanalyze_multimodal_data_bigquery.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/use-cases/applying-llms-to-data/multimodal-analysis-bigquery/analyze_multimodal_data_bigquery.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/bigquery/import?url=https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/applying-llms-to-data/multimodal-analysis-bigquery/analyze_multimodal_data_bigquery.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/bigquery/v1/32px.svg" alt="BigQuery Studio logo"><br> Open in BigQuery Studio
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/applying-llms-to-data/multimodal-analysis-bigquery/analyze_multimodal_data_bigquery.ipynb">
      <img width="32px" src="https://www.svgrepo.com/download/217753/github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>

| Author |
| --- |
| [Jeff Nelson](https://github.com/jeffonelson) |

## Overview

This notebook provides a hands-on example of BigQuery's powerful multimodal capabilities. You'll learn how to perform sophisticated, AI-driven analysis on your data - both structured and unstructured - right from your familiar SQL environment.

Organizations typically store data in separate locations:
* There's **structured data** stored in neat rows and columns, like in [BigQuery tables](https://cloud.google.com/bigquery/docs/introduction)
* And there's **unstructured data**, which includes images, audio, video, and more. This generally lives in a cloud object store like [Google Cloud Storage (GCS)](https://cloud.google.com/storage/docs/introduction).

The difficulty is querying both types of data together. For example, asking a question about customer satisfaction by analyzing the *audio* of a support call alongside customer support history.

Fortunately, [BigQuery's multimodal](https://cloud.google.com/bigquery/docs/analyze-multimodal-data) and [generative AI capabilities](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model) cover these use cases, making it possible to analyze both structured and unstructured data in a single query.

### Objectives

You will learn to:

* Understand what an [`ObjectRef`](https://cloud.google.com/bigquery/docs/analyze-multimodal-data#objectref_values) is, and how it bridges the gap between data in BigQuery and files in Google Cloud Storage (GCS).
* Create `ObjectRef`s in two different ways.
* Combine structured tables with unstructured files to create a single, unified view.
* Run generative AI models over your new multimodal tables to extract insights.

### Services and Costs

This tutorial uses the following billable components of Google Cloud:

* **BigQuery**: [Pricing](https://cloud.google.com/bigquery/pricing)

* **BigQuery ML**: [Pricing](https://cloud.google.com/bigquery/pricing#bqml)

* **Vertex AI**: [Pricing](https://cloud.google.com/vertex-ai/generative-ai/pricing)

You can use the [Pricing Calculator](https://cloud.google.com/products/calculator) to generate a cost estimate based on your projected usage.

---

## Before you begin

### Set up your Google Cloud project
**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

3. [Enable the BigQuery, BigQuery Connection, and Vertex AI APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,bigqueryconnection.googleapis.com,aiplatform.googleapis.com).

4. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

### Set your project ID

In [None]:
PROJECT_ID = "YOUR-PROJECT-ID"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}
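
Before running the rest of the notebook, it can help to confirm that the `YOUR-PROJECT-ID` placeholder was actually replaced. The following is a minimal sketch that checks only the general shape of a project ID (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen). The helper name `looks_like_project_id` is hypothetical, and the check does not confirm that the project exists or that you can access it.

```python
import re

# General project ID shape: 6-30 characters, lowercase letters, digits, and
# hyphens; must start with a letter and must not end with a hyphen.
# Legacy domain-scoped IDs (containing ":" or ".") are not handled here.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(project_id: str) -> bool:
    """Return True if the string has the general shape of a project ID."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


# The unedited placeholder contains uppercase letters, so it is rejected.
print(looks_like_project_id("YOUR-PROJECT-ID"))
print(looks_like_project_id("my-demo-project-123"))
```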

### Authenticate to your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Colab Enterprise in BigQuery Studio or Vertex AI**
* Do nothing as you are already authenticated.

**2. Colab Consumer: run the following:**

In [None]:
from google.colab import auth
auth.authenticate_user()

**3. Local JupyterLab instance, uncomment and run the following:**



In [None]:
# ! gcloud auth login

### Create BigQuery Cloud resource connection

You will need to create a [Cloud resource connection](https://cloud.google.com/bigquery/docs/create-cloud-resource-connection) to enable BigQuery to interact with Vertex AI services.

In [None]:
!bq mk --connection --location=us \
    --connection_type=CLOUD_RESOURCE test_connection

### Set permissions for Service Account

The resource connection service account requires certain project-level permissions to interact with Vertex AI and Google Cloud Storage.

In [None]:
SERVICE_ACCT = !bq show --format=prettyjson --connection us.test_connection | grep "serviceAccountId" | cut -d '"' -f 4
SERVICE_ACCT_EMAIL = SERVICE_ACCT[-1]
print(SERVICE_ACCT_EMAIL)
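
The `grep`/`cut` pipeline above depends on the exact text layout of the `bq show` output. As an alternative sketch, the same value could be pulled out by parsing the JSON. The helper name `find_service_account_id` and the sample payload below are made up for illustration, and the assumption that the service account sits under a `serviceAccountId` key comes only from the `grep` pattern used above.

```python
import json


def find_service_account_id(connection_json: str) -> str:
    """Search parsed connection JSON for the first "serviceAccountId" value."""

    def walk(node):
        # Depth-first search through nested dicts and lists.
        if isinstance(node, dict):
            if "serviceAccountId" in node:
                return node["serviceAccountId"]
            for value in node.values():
                found = walk(value)
                if found is not None:
                    return found
        elif isinstance(node, list):
            for item in node:
                found = walk(item)
                if found is not None:
                    return found
        return None

    result = walk(json.loads(connection_json))
    if result is None:
        raise ValueError("no serviceAccountId field found")
    return result


# Made-up sample shaped like prettyjson connection output (values are placeholders).
sample = """
{
  "cloudResource": {
    "serviceAccountId": "example-sa@example-project.iam.gserviceaccount.com"
  },
  "name": "projects/123/locations/us/connections/test_connection"
}
"""
print(find_service_account_id(sample))
```

Searching recursively keeps the sketch independent of where exactly the key is nested.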

In [None]:
import time

!gcloud projects add-iam-policy-binding --format=none $PROJECT_ID --member=serviceAccount:$SERVICE_ACCT_EMAIL --role='roles/storage.objectViewer'
!gcloud projects add-iam-policy-binding --format=none $PROJECT_ID --member=serviceAccount:$SERVICE_ACCT_EMAIL --role='roles/aiplatform.user'

# Wait ~60 seconds, to give IAM updates time to propagate. Otherwise, subsequent cells may fail.
time.sleep(60)
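
A fixed 60-second sleep is simple, but it can wait longer than necessary or, occasionally, not long enough. One alternative is to retry the first dependent step with increasing delays until it succeeds. The sketch below is a generic pattern and not part of this tutorial's required setup; `retry_with_backoff` is a hypothetical helper, and `operation` stands for whatever call you want to retry.

```python
import time


def retry_with_backoff(operation, attempts=5, initial_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits initial_delay seconds after the first failure and multiplies the
    wait by `factor` after each further failure. The last error is re-raised.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # A real notebook would likely catch only permission-related errors.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= factor


# Demo with a stand-in operation that fails twice before succeeding.
calls = {"count": 0}
waits = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("permissions not propagated yet")
    return "ok"


# A recording function replaces time.sleep so the demo finishes instantly.
result = retry_with_backoff(flaky, sleep=waits.append)
print(result, waits)
```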

### Create a BigQuery Dataset

Running the following query creates a [BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets-intro) called **`bq_mm_tutorial`** to house any tables or remote models for this tutorial:

In [None]:
%%bigquery --project {PROJECT_ID}

-- IF NOT EXISTS lets this cell be re-run without failing on an existing dataset.
CREATE SCHEMA IF NOT EXISTS `bq_mm_tutorial` OPTIONS (location = 'US');

### Set Colab display options

Colab includes the `google.colab.data_table` package that can be used to display large pandas dataframes as an interactive data table. It can be enabled with:

In [2]:
%load_ext google.colab.data_table

---

## What is an `ObjectRef`?

[`ObjectRef`](https://cloud.google.com/bigquery/docs/analyze-multimodal-data#objectref_values) is the foundation of multimodal analysis in BigQuery. Think of it as a secure pointer to a file (image, PDF, video, etc.) in GCS. It doesn't hold the file's data itself, but it contains the information BigQuery needs to find and access the file during a query.

An `ObjectRef` is stored as a `STRUCT` and contains details like the file's URI path (i.e. its direct GCS "address"), a secure authorizer, and other metadata. By using `ObjectRef` columns in your BigQuery tables, you can effectively query against GCS objects right alongside your structured data.
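
To make the "pointer, not payload" idea concrete, here is a small Python stand-in for the shape of an `ObjectRef`. This is only an illustration: the class `ObjectRefSketch` is hypothetical, and the field names (`uri`, `version`, `authorizer`, `details`) are an assumption about the `STRUCT` layout that you should confirm against the linked documentation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ObjectRefSketch:
    """Illustrative stand-in for an ObjectRef STRUCT (field names are assumptions)."""

    uri: str                           # GCS path of the referenced file
    authorizer: Optional[str] = None   # connection used to access the file
    version: Optional[str] = None      # object version, if known
    details: Dict[str, Any] = field(default_factory=dict)  # additional metadata

    def bucket_and_object(self) -> Tuple[str, str]:
        """Split the gs:// URI into (bucket, object name)."""
        if not self.uri.startswith("gs://"):
            raise ValueError(f"not a GCS URI: {self.uri}")
        bucket, _, name = self.uri[len("gs://"):].partition("/")
        return bucket, name


ref = ObjectRefSketch(
    uri="gs://sample-data-and-media/311-demo/images/311-20250409-1515EF.png",
    authorizer="us.test_connection",
)
print(ref.bucket_and_object())
```

Note that the object holds no image bytes, only the location of the file and the connection used to reach it.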

### Creating `ObjectRef`s

There are two primary ways to create tables with `ObjectRef` columns. We'll detail them in two short demo scenarios.
* Scenario 1: Use an [object table](https://cloud.google.com/bigquery/docs/object-table-introduction) to automatically generate an `ObjectRef` for every file in a GCS bucket
* Scenario 2: Use [built-in SQL functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/objectref_functions) like `OBJ.MAKE_REF()` to create `ObjectRef`s programmatically from URIs in an existing table

---

## Scenario 1: Analyze Customer Service Calls

### 1. Load structured data

Let's begin with a practical example. Imagine we have a standard BigQuery table with details about customer support calls. The table contains a `call_id` column, which is a unique identifier for a call. Let's load this sample data into BigQuery.

In [None]:
%%bigquery --project {PROJECT_ID}

LOAD DATA OVERWRITE bq_mm_tutorial.calls
FROM FILES (
  uris = ['gs://sample-data-and-media/customer-support/tables/calls'],
  format = 'PARQUET'
);

Taking a peek at the table, we can see the `call_id` column alongside other call attributes.

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT * FROM bq_mm_tutorial.calls;

| | call_id | company_name | company_revenue | customer_name | product_name | call_reason |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | call_1_20250411045535 | Global Tech Enterprises | 12500000 | Sarah Miller | Global Shield Antivirus | Antivirus subscription activation issue |
| 1 | call_3_20250411045541 | Global Tech Enterprises | 9800000 | Robert Jones | Global Secure VPN | VPN service disconnecting frequently |
| 2 | call_5_20250411045547 | Global Tech Enterprises | 12000000 | Michael Brown | Global Protect Data Backup | Need to upgrade data backup plan |
| 3 | call_2_20250411045538 | Synergy Corp | 18000000 | David Martin | Synergy Cloud Storage | Inquiry about enterprise cloud storage pricing |
| 4 | call_4_20250411045544 | Synergy Corp | 20000000 | Emily Carter | Synergy Project Manager | Praise for excellent customer support |


### 2. Create an Object Table

We also have audio files stored in GCS corresponding to these customer support calls. The following creates an [object table](https://cloud.google.com/bigquery/docs/object-table-introduction) over a bucket containing customer service calls.

The object table is a read-only BigQuery table that mirrors the contents of a GCS bucket directly and [automatically generates an `ObjectRef`](https://cloud.google.com/bigquery/docs/analyze-multimodal-data#object_tables) for each file.

In [None]:
%%bigquery --project {PROJECT_ID}

CREATE OR REPLACE EXTERNAL TABLE `bq_mm_tutorial.object_table`
WITH CONNECTION `us.test_connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://sample-data-and-media/customer-support/calls/*.mp3']
);

Let's check the table contents. Notice the `ref` column, which is an `ObjectRef` that we can use.

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT * FROM `bq_mm_tutorial.object_table`;

| | uri | generation | content_type | size | md5_hash | updated | metadata | ref |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | gs://sample-data-and-media/customer-support/ca... | 1753936794296133 | audio/mpeg | 65952 | 1e99258238c0830c9a59a1a45d5dd095 | 2025-07-31 04:39:54.327000+00:00 | [] | {'uri': 'gs://sample-data-and-media/customer-s... |
| 1 | gs://sample-data-and-media/customer-support/ca... | 1753936794657505 | audio/mpeg | 65376 | c8c575ce7cf437c840c28cbb84174e43 | 2025-07-31 04:39:54.689000+00:00 | [] | {'uri': 'gs://sample-data-and-media/customer-s... |
| 2 | gs://sample-data-and-media/customer-support/ca... | 1753936794931315 | audio/mpeg | 75744 | de6342b8cea147fdba8a3e3d1edcdeee | 2025-07-31 04:39:54.963000+00:00 | [] | {'uri': 'gs://sample-data-and-media/customer-s... |
| 3 | gs://sample-data-and-media/customer-support/ca... | 1753936795174045 | audio/mpeg | 65184 | 960ca9e91bca3335b18c2ec1e37c84f2 | 2025-07-31 04:39:55.204000+00:00 | [] | {'uri': 'gs://sample-data-and-media/customer-s... |
| 4 | gs://sample-data-and-media/customer-support/ca... | 1753936795493499 | audio/mpeg | 63552 | 4b846ecfda4b7e57761b55ed4b8b1e2b | 2025-07-31 04:39:55.523000+00:00 | [] | {'uri': 'gs://sample-data-and-media/customer-s... |


### 3. Create a multimodal table

We'll first join the `calls` table that contains structured attributes to the `object_table` table, which contains the `ObjectRef` column, `ref`. We'll refer to this as a "multimodal table" because it contains structured and unstructured fields.

In [None]:
%%bigquery --project {PROJECT_ID}

CREATE OR REPLACE TABLE `bq_mm_tutorial.calls_combined` AS
SELECT
  c.*,
  o.ref
FROM
  `bq_mm_tutorial.calls` AS c
LEFT JOIN
  `bq_mm_tutorial.object_table` AS o
ON
  c.call_id = REGEXP_EXTRACT(o.uri, r'calls/([^.]+)')
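
The join key above is derived from the file path: the pattern `calls/([^.]+)` captures everything after `calls/` up to the first dot. The following Python sketch applies the same pattern with the standard `re` module to show what it extracts. The file name is a hypothetical example inferred from a `call_id` in the `calls` table, and `call_id_from_uri` is an illustrative helper, not part of the notebook.

```python
import re
from typing import Optional

# Same pattern as the join condition: capture the text after "calls/" up to the first dot.
CALL_ID_PATTERN = re.compile(r"calls/([^.]+)")


def call_id_from_uri(uri: str) -> Optional[str]:
    """Return the captured call ID, or None when the URI does not match."""
    match = CALL_ID_PATTERN.search(uri)
    return match.group(1) if match else None


# Hypothetical file name, inferred from a call_id in the calls table.
uri = "gs://sample-data-and-media/customer-support/calls/call_1_20250411045535.mp3"
print(call_id_from_uri(uri))
print(call_id_from_uri("gs://sample-data-and-media/other/file.mp3"))
```

Because the query uses a `LEFT JOIN`, a call without a matching audio file would still appear in `calls_combined`, with a `NULL` value in `ref`.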

We now have a single table, `calls_combined`, that contains traditional structured data like `company_revenue` and `product_name` alongside a reference to the audio of each customer support call.

We can run a single query against both data types.

### 4. Run a [multimodal query](https://cloud.google.com/bigquery/docs/analyze-multimodal-data#generative_ai_functions)

Since `calls_combined` contains structured data and pointers to audio in GCS, we can filter the table on both in the same query.

In this case, we'll look for "high value" companies whose call audio indicates they're looking to purchase something, using [`AI.GENERATE_BOOL`](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate-bool).

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT company_name, customer_name, product_name
FROM `bq_mm_tutorial.calls_combined`
WHERE company_revenue > 15000000
AND
  AI.GENERATE_BOOL(
    prompt => ("Wants to buy something", ref),
    connection_id => "us.test_connection").result
;

| | company_name | customer_name | product_name |
| --- | --- | --- | --- |
| 0 | Synergy Corp | David Martin | Synergy Cloud Storage |
| 1 | Synergy Corp | Emily Carter | Synergy Project Manager |


### 5. Run a multimodal query on an ARRAY of `ObjectRef`s

Because an `ObjectRef` is a `STRUCT`, multiple `ObjectRef` values can be collected into an array and passed to Gemini for inference in a single call.

In this example, we generate high-level themes customers are calling about, grouped by `company_name` using [`AI.GENERATE`](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate).

In [3]:
%%bigquery --project {PROJECT_ID}

SELECT
  company_name,
  AI.GENERATE(
    ('Give 1-2 word themes customers are calling about', refs),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.5-flash',
    output_schema => 'themes ARRAY<STRING>').themes
FROM (SELECT company_name, ARRAY_AGG(ref) AS refs FROM `bq_mm_tutorial.calls_combined` GROUP BY company_name);


| | company_name | themes |
| --- | --- | --- |
| 0 | Synergy Corp | [Customer Service, Cloud Storage] |
| 1 | Global Tech Enterprises | [VPN, Disconnect, Refund, Data Backup, Storage... |
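
The subquery in this step uses `ARRAY_AGG(ref) ... GROUP BY company_name` to collapse many rows into one array of references per company. As a rough Python analogy of that grouping (with made-up bucket and file names, and plain URI strings standing in for `ObjectRef` values):

```python
from collections import defaultdict

# Illustrative rows: (company_name, ref URI). Bucket and file names are hypothetical.
rows = [
    ("Global Tech Enterprises", "gs://example-bucket/calls/call_1.mp3"),
    ("Synergy Corp", "gs://example-bucket/calls/call_2.mp3"),
    ("Global Tech Enterprises", "gs://example-bucket/calls/call_3.mp3"),
]

# One list of references per company, like ARRAY_AGG(ref) ... GROUP BY company_name.
refs_by_company = defaultdict(list)
for company_name, ref in rows:
    refs_by_company[company_name].append(ref)

for company_name, refs in refs_by_company.items():
    print(company_name, len(refs))
```

Each company then yields a single model call that receives all of its audio references at once. In SQL, the order of elements inside `ARRAY_AGG` is not guaranteed unless you add an `ORDER BY` inside the aggregate.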


### Recap

In this scenario we:
* Began with a table called `calls` containing structured data
* Created an object table over audio files in GCS. This contained an `ObjectRef` column
* Created a multimodal table called `calls_combined`
* Ran a query against the `calls_combined` table using structured *and* unstructured data in a `WHERE` clause
* Used an array of `ObjectRef`s to understand key themes by `company_name`

This scenario used an object table to create our `ObjectRef` column. In the next scenario, we'll create it programmatically.



---



## Scenario 2: City 311 Response

In this scenario, we'll work with a common public sector dataset: [311 reports](https://en.wikipedia.org/wiki/311_(telephone_number)). Citizens submit issues (like potholes or graffiti) that include text descriptions and often upload supporting media like photos, audio recordings, or videos.

Our goal is to use multimodal analysis to automatically triage these reports, assess their urgency, and route them to the correct city department.

### 1. Load structured data

We'll begin by adding two tables to our BigQuery environment:
* **`reports`** contains tabular 311 report data, such as `ticket_id`, `location`, `description`, and more.
* **`media`** contains `uri` fields corresponding to any media associated with a 311 ticket (e.g. images, audio, video).

In [None]:
%%bigquery --project {PROJECT_ID}

LOAD DATA OVERWRITE bq_mm_tutorial.reports
FROM FILES (
  uris = ['gs://sample-data-and-media/311-demo/tables/city_311_reports'],
  format = 'PARQUET'
);

LOAD DATA OVERWRITE bq_mm_tutorial.media
FROM FILES (
  uris = ['gs://sample-data-and-media/311-demo/tables/city_311_media'],
  format = 'PARQUET'
);

### 2. Programmatically create an `ObjectRef`

In **Scenario 1**, you learned to create an `ObjectRef` using [object tables](https://cloud.google.com/bigquery/docs/analyze-multimodal-data#object_tables).

In this scenario, you'll use a more dynamic workflow, where you create `ObjectRef`s using a set of built-in SQL functions.

* [`OBJ.MAKE_REF`](https://cloud.google.com/bigquery/docs/reference/standard-sql/objectref_functions#objmake_ref) takes the string from the `uri` column and converts it into an `ObjectRef`.

* [`OBJ.FETCH_METADATA`](https://cloud.google.com/bigquery/docs/reference/standard-sql/objectref_functions#objfetch_metadata) populates the `ObjectRef` with important file metadata from GCS.

Here's an example with `OBJ.MAKE_REF` that references a single object with a connection:

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT OBJ.MAKE_REF('gs://sample-data-and-media/311-demo/images/311-20250409-1515EF.png', 'us.test_connection') AS image_ref;

| | image_ref |
| --- | --- |
| 0 | {'uri': 'gs://sample-data-and-media/311-demo/i... |


We can then get [GCS metadata](https://cloud.google.com/storage/docs/metadata) for the object by wrapping the `OBJ.FETCH_METADATA` function around the prior query.

This provides additional metadata, like a GCS object version, the object type, when it was last updated, and more.

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT OBJ.FETCH_METADATA(OBJ.MAKE_REF('gs://sample-data-and-media/311-demo/images/311-20250409-1515EF.png', 'us.test_connection')) AS image_ref;

| | image_ref |
| --- | --- |
| 0 | {'uri': 'gs://sample-data-and-media/311-demo/i... |


### 3. Create a multimodal table

To create our multimodal table, `reports_mm`, we join the `reports` and `media` tables and convert the `audio_uri`, `image_uri`, and `video_uri` fields to `ObjectRef` fields using the `OBJ.` functions.

Note that a multimodal table can have multiple `ObjectRef` columns and you can alias these columns.

In [None]:
%%bigquery --project {PROJECT_ID}

CREATE OR REPLACE TABLE `bq_mm_tutorial.reports_mm` AS
SELECT
  r.*,
  OBJ.FETCH_METADATA(OBJ.MAKE_REF(m.audio_uri, 'us.test_connection')) AS audio_ref,
  OBJ.FETCH_METADATA(OBJ.MAKE_REF(m.image_uri, 'us.test_connection')) AS image_ref,
  OBJ.FETCH_METADATA(OBJ.MAKE_REF(m.video_uri, 'us.test_connection')) AS video_ref
FROM `bq_mm_tutorial.reports` r
LEFT JOIN `bq_mm_tutorial.media` m
ON r.ticket_id = m.ticket_id;

SELECT * FROM `bq_mm_tutorial.reports_mm`;

| | ticket_id | timestamp | reporter_id | location | location_description | category_reported | description | status | district | audio_ref | image_ref | video_ref |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 311-20250409-52F19E | 2025-04-09 17:28:21+00:00 | Citizen-4A2E | POINT(-118.245 34.052) | Large trash pile in the alley behind the build... | Illegal Dumping/Trash Pile | There is a huge pile of trash in the alley and... | Open | North | {'uri': 'gs://n25-311-demo/calls/311-20250409-... | {'uri': 'gs://n25-311-demo/images/311-20250409... | {'uri': 'gs://n25-311-demo/videos/311-20250409... |
| 1 | 311-20250409-A57DB4 | 2025-04-09 17:29:21+00:00 | Citizen-77E1 | POINT(-118.33 34.09) | Leaking fire hydrant on Highland Ave near the ... | Leaking Fire Hydrant | A leaky fire hydrant that has been creating a ... | Open | North | {'uri': 'gs://n25-311-demo/calls/311-20250409-... | {'uri': 'gs://n25-311-demo/images/311-20250409... | {'uri': 'gs://n25-311-demo/videos/311-20250409... |
| 2 | 311-20250409-59911E | 2025-04-09 17:28:21+00:00 | Citizen-69DD | POINT(-118.385 34.06) | Car parked on the sidewalk on Beverly Hills Bl... | Abandoned Vehicle | Someone drove their car on the sidewalk and ab... | Open | North | {'uri': 'gs://n25-311-demo/calls/311-20250409-... | {'uri': 'gs://n25-311-demo/images/311-20250409... | {'uri': 'gs://n25-311-demo/videos/311-20250409... |
| 3 | 311-20250409-1515EF | 2025-04-09 17:35:21+00:00 | Citizen-5DFC | POINT(-118.341 34.063) | Large pothole on Sunset Blvd near the bus stop... | Pothole | There is a small pothole near the bus stop tha... | Open | South | {'uri': 'gs://n25-311-demo/calls/311-20250409-... | {'uri': 'gs://n25-311-demo/images/311-20250409... | {'uri': 'gs://n25-311-demo/videos/311-20250409... |
| 4 | 311-20250409-60290D | 2025-04-09 17:16:21+00:00 | Citizen-4B3B | POINT(-118.35 34.055) | Large sinkhole on the sidewalk near apartment ... | Major Sinkhole | A huge sinkhole that looks like it will eat a ... | Open | South | {'uri': 'gs://n25-311-demo/calls/311-20250409-... | {'uri': 'gs://n25-311-demo/images/311-20250409... | {'uri': 'gs://n25-311-demo/videos/311-20250409... |
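
The three `OBJ.FETCH_METADATA(OBJ.MAKE_REF(...))` expressions in this step differ only in the column name. If a table had many URI columns, the expressions could be generated rather than typed by hand. The sketch below is one way to do that in Python. The helper `object_ref_columns` and its `*_uri` to `*_ref` naming rule are assumptions made for illustration, and because it interpolates names into SQL text, it should only be used with trusted, hard-coded column names.

```python
def object_ref_columns(uri_columns, connection_id, table_alias="m"):
    """Build one OBJ.FETCH_METADATA(OBJ.MAKE_REF(...)) expression per URI column."""
    expressions = []
    for column in uri_columns:
        if not column.endswith("_uri"):
            raise ValueError(f"expected a column ending in '_uri': {column}")
        # audio_uri -> audio_ref, image_uri -> image_ref, and so on.
        ref_name = column[: -len("_uri")] + "_ref"
        expressions.append(
            f"OBJ.FETCH_METADATA(OBJ.MAKE_REF({table_alias}.{column}, '{connection_id}')) AS {ref_name}"
        )
    return ",\n  ".join(expressions)


select_list = object_ref_columns(
    ["audio_uri", "image_uri", "video_uri"], "us.test_connection"
)
print(select_list)
```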


### 4. Run a multimodal query with one `ObjectRef`

Our `reports_mm` table has a column called `description`, which provides a text description of a reported incident.

However, there may be additional, secondary issues that AI can help infer from provided media.

In this example, we pass the text `category_reported` alongside a prompt and then append `image_ref`. In this way, the [`AI.GENERATE_TABLE`](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-table) function passes text **and** an image in the same Gemini model call before returning the results.

But for BigQuery's `AI.GENERATE_TABLE` to call Gemini to parse unstructured data, we first need to create a [Remote Model](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model). We'll create that and move to our multimodal query.

In [4]:
%%bigquery --project {PROJECT_ID}

CREATE OR REPLACE MODEL `bq_mm_tutorial.gemini`
REMOTE WITH CONNECTION `us.test_connection`
  OPTIONS(ENDPOINT = 'gemini-2.5-flash');

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT
  ticket_id,
  category_reported,
  secondary_category,
  description
FROM AI.GENERATE_TABLE(
  MODEL `bq_mm_tutorial.gemini`,
  (
    SELECT (
      category_reported, ' is the primary issue. '
      'Note a secondary issue, only if you find something severe needing attention. '
      'Inspect the background and in the object.'
      , image_ref
    ) AS prompt,
     ticket_id,
     category_reported,
     description
    FROM `bq_mm_tutorial.reports_mm`
  ),
  STRUCT(
     "secondary_category ARRAY<STRING>" AS output_schema
  )
);

| | ticket_id | category_reported | secondary_category | description |
| --- | --- | --- | --- | --- |
| 0 | 311-20250409-59911E | Abandoned Vehicle | [Damaged Tree] | Someone drove their car on the sidewalk and ab... |
| 1 | 311-20250409-60290D | Major Sinkhole | [Widespread pavement cracks] | A huge sinkhole that looks like it will eat a ... |
| 2 | 311-20250409-52F19E | Illegal Dumping/Trash Pile | [Graffiti] | There is a huge pile of trash in the alley and... |
| 3 | 311-20250409-1515EF | Pothole | [Cracked Road] | There is a small pothole near the bus stop tha... |
| 4 | 311-20250409-A57DB4 | Leaking Fire Hydrant | [Slipping Hazard] | A leaky fire hydrant that has been creating a ... |
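
The `output_schema` option in the query above is a plain string of `name TYPE` pairs, which is what turns the model's response into typed columns such as `secondary_category`. When a schema has more than a couple of fields, it may be easier to assemble that string from a list. This is a minimal sketch: `build_output_schema` is a hypothetical helper, and the `severity` field is invented purely to show how several fields are joined.

```python
def build_output_schema(fields):
    """Join (name, type) pairs into a schema string such as 'a STRING, b INT64'."""
    return ", ".join(f"{name} {sql_type}" for name, sql_type in fields)


# The single-field schema used in the query above.
print(build_output_schema([("secondary_category", "ARRAY<STRING>")]))

# A two-field variant; "severity" is a made-up field for illustration only.
print(build_output_schema([("secondary_category", "ARRAY<STRING>"), ("severity", "INT64")]))
```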


### 5. Run a multimodal query with multiple `ObjectRef`s

We can pass **multiple** `ObjectRef` fields in the same call to a Gemini model. In this example, we pass:
* The text description a user reported
* An image (`image_ref`)
* Call audio (`audio_ref`)
* Video (`video_ref`)

Providing all of this multimodal data in the same function call allows the Gemini model to perform a complete analysis that takes all four pieces of information into account at once. The output helps a 311 operator identify which tickets are most urgent and need immediate attention.

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT
  ticket_id,
  issue,
  secondary_issue,
  description AS original_description,
  ai_summary_description,
  recommended_action,
  urgency_score,
  city_response_department
FROM AI.GENERATE_TABLE(
  MODEL `bq_mm_tutorial.gemini`,
  (
    SELECT (
      'Describe the primary issue in the photo in 2-3 words. Be descriptive. '
      'Only if you notice a secondary issue, list it too. Inspect the background and everywhere. '
      'Rate the urgency score for the city to respond from 1-10 where 1 is low, 10 is absolutely critical to safety. '
      'Write an AI generated description of the issue taking into account the text, image, audio, and video. '
      'Write a 1 sentence description to city dispatch for a recommended action. '
      'Assign a single city response department (e.g. Roads, Sanitation, Parks, Fire)'
      ,  description, image_ref, audio_ref, video_ref
    ) AS prompt,
    ticket_id,
    description
    FROM `bq_mm_tutorial.reports_mm`
  ),
  STRUCT(
     "issue STRING, secondary_issue STRING, urgency_score INT64, ai_summary_description STRING, recommended_action STRING, city_response_department STRING" AS output_schema
  )
)
ORDER BY urgency_score DESC;

| | ticket_id | issue | secondary_issue | original_description | ai_summary_description | recommended_action | urgency_score | city_response_department |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 311-20250409-52F19E | Alleyway blocked by trash | Graffiti, fire hazard | There is a huge pile of trash in the alley and... | The scene depicts an alleyway completely obstr... | Dispatch Sanitation and Fire Departments immed... | 10 | Fire |
| 1 | 311-20250409-60290D | Massive sidewalk sinkhole | Extensive pavement cracking | A huge sinkhole that looks like it will eat a ... | A substantial and rapidly expanding sinkhole h... | Dispatch emergency crews immediately to cordon... | 9 | Public Works |
| 2 | 311-20250409-1515EF | Deep road pothole | Extensive road cracking | There is a small pothole near the bus stop tha... | A significant and deep pothole is present in t... | Dispatch a crew to immediately repair the larg... | 9 | Roads |
| 3 | 311-20250409-59911E | Car blocking sidewalk | Abandoned vehicle | Someone drove their car on the sidewalk and ab... | A silver car is illegally parked on a bustling... | Dispatch a tow truck to remove the illegally p... | 7 | Police |
| 4 | 311-20250409-A57DB4 | Leaking fire hydrant | | A leaky fire hydrant that has been creating a ... | A red fire hydrant located on Highland Avenue ... | Dispatch a crew to Highland Avenue to address ... | 7 | Water Department |
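
Once the model has produced an `urgency_score` and a `city_response_department` for each ticket, routing becomes ordinary data processing. The sketch below uses the ticket IDs, scores, and departments from the output above to build a per-department queue of urgent tickets. The cutoff of 9 and the helper name `urgent_queue` are hypothetical choices for illustration, not part of the tutorial.

```python
from collections import defaultdict

# (ticket_id, urgency_score, city_response_department) copied from the output above.
results = [
    ("311-20250409-52F19E", 10, "Fire"),
    ("311-20250409-60290D", 9, "Public Works"),
    ("311-20250409-1515EF", 9, "Roads"),
    ("311-20250409-59911E", 7, "Police"),
    ("311-20250409-A57DB4", 7, "Water Department"),
]

URGENT_THRESHOLD = 9  # hypothetical cutoff for immediate dispatch


def urgent_queue(rows, threshold=URGENT_THRESHOLD):
    """Group tickets at or above the threshold by department, most urgent first."""
    queue = defaultdict(list)
    # sorted() is stable, so tickets with equal scores keep their input order.
    for ticket_id, score, department in sorted(rows, key=lambda r: r[1], reverse=True):
        if score >= threshold:
            queue[department].append(ticket_id)
    return dict(queue)


print(urgent_queue(results))
```

Since model output can vary between runs, the scores you see may differ from the ones shown here.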


### 6. Run a multimodal query on an ARRAY of `ObjectRef`s

Of course, we can also aggregate `STRING` and `ObjectRef` fields as an array and pass them to a Gemini model too.

In this example, we'll return the top issues that need attention in the North and South districts of the city. This may be useful for citywide reporting.

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT
  district,
  summary
FROM AI.GENERATE_TABLE(
  MODEL `bq_mm_tutorial.gemini`,
  (
    SELECT (
      'Describe the primary issues that need to be fixed in this district'
      , ARRAY_AGG(description), ARRAY_AGG(image_ref), ARRAY_AGG(video_ref), ARRAY_AGG(audio_ref)
    ) AS prompt,
    district
    FROM `bq_mm_tutorial.reports_mm`
    GROUP BY district
  ),
  STRUCT(
     "summary ARRAY<STRING>" AS output_schema
  )
);

| | district | summary |
| --- | --- | --- |
| 0 | South | [A small pothole near the bus stop needs to be... |
| 1 | North | [Massive pile of trash in the alley causing ob... |


### Recap

In this scenario we:
* Began with two tables, `reports` and `media`, containing structured data and `uri` columns pointing to GCS objects
* Explored how to programmatically create an `ObjectRef` using `OBJ.` functions
* Created a multimodal table called `reports_mm`
* Ran several queries against the `reports_mm` table using structured data and one *or more* `ObjectRef` columns
* Ran a multimodal query against an ARRAY of `ObjectRef`s to produce aggregated insights


---


## Cleaning Up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
# Delete the BigQuery tables
! bq rm --table -f bq_mm_tutorial.object_table
! bq rm --table -f bq_mm_tutorial.calls
! bq rm --table -f bq_mm_tutorial.calls_combined
! bq rm --table -f bq_mm_tutorial.media
! bq rm --table -f bq_mm_tutorial.reports
! bq rm --table -f bq_mm_tutorial.reports_mm

# Delete the remote model
! bq rm --model -f bq_mm_tutorial.gemini

# Delete the remote connection
! bq rm --connection --project_id=$PROJECT_ID --location=us test_connection

# Delete the BigQuery dataset
! bq rm -r -f $PROJECT_ID:bq_mm_tutorial