In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Use Gemini and OSS Text-Embedding Models Against Your BigQuery Data

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/open-models/use-cases/bigquery_ml_text_embedding_inference.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fopen-models%2Fuse-cases%2Fbigquery_ml_text_embedding_inference.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/open-models/use-cases/bigquery_ml_text_embedding_inference.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/bigquery/import?url=https://github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/use-cases/bigquery_ml_text_embedding_inference.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/bigquery/v1/32px.svg" alt="BigQuery Studio logo"><br> Open in BigQuery Studio
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/use-cases/bigquery_ml_text_embedding_inference.ipynb">
      <img width="32px" src="https://www.svgrepo.com/download/217753/github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>


| Author(s) |
| --- |
| [Jasper Xu](https://github.com/ZehaoXU), [Haiyang Qi](https://github.com/pursuitdan) |

## Overview

This notebook showcases a simple end-to-end process for generating text embeddings using BigQuery in conjunction with both Google's Gemini and OSS text embedding models. We use Google's `gemini-embedding-001` and the open-source `multilingual-e5-small` model as examples, and the process involves:

- Deploying the `multilingual-e5-small` model from Hugging Face to Vertex AI.
- Creating remote models in BigQuery for the Gemini embedding model and for the deployed OSS model endpoint.
- Using the `ML.GENERATE_EMBEDDING` function to generate embeddings from text data with both models.
- Cleaning up deployed resources to manage costs.

## Costs
This tutorial uses billable components of Google Cloud:

- Vertex AI
- BigQuery

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [BigQuery pricing](https://cloud.google.com/bigquery/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.



## Get started

### Install Google Vertex AI SDK and other required packages


In [4]:
%pip install --upgrade google-cloud-aiplatform

Collecting google-cloud-aiplatform
  Downloading google_cloud_aiplatform-1.111.0-py2.py3-none-any.whl.metadata (38 kB)
Collecting anyio<5.0.0,>=4.8.0 (from google-genai<2.0.0,>=1.0.0->google-cloud-aiplatform)
  Using cached anyio-4.10.0-py3-none-any.whl.metadata (4.0 kB)
Downloading google_cloud_aiplatform-1.111.0-py2.py3-none-any.whl (8.0 MB)
Using cached anyio-4.10.0-py3-none-any.whl (107 kB)
Installing collected packages: anyio, google-cloud-aiplatform
  Attempting uninstall: anyio
    Found existing installation: anyio 3.7.1
    Uninstalling anyio-3.7.1:
      Successfully uninstalled anyio-3.7.1
  Attempting uninstall: google-cloud-aiplatform
    Found existing installation: google-cloud-aiplatform 1.74.0
    Uninstalling google-cloud-aiplatform-1.74.0:
      Successfully uninstalled google-cloud-aiplatform-1.74.0
ERROR: pip's dependency resolver does not

### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information

To get started using Vertex AI and BigQuery, you must have an existing Google Cloud project with the [Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com) and the [BigQuery API](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com) enabled.

In [1]:
# Use the environment variable if the user doesn't provide Project ID.
import os

PROJECT_ID = "jasperxu-test"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = "us-central1" #@param {type: "string", placeholder: "[your-preferred-location]", isTemplate: true}

from google.cloud import aiplatform
import vertexai

aiplatform.init(project=PROJECT_ID, location=LOCATION)

vertexai.init(
    project=PROJECT_ID,
    location=LOCATION,
)

from google.cloud import bigquery

# Pass the project explicitly so the client does not depend on ambient defaults.
bq_client = bigquery.Client(project=PROJECT_ID)
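
One detail of the fallback above is worth noting: `str(os.environ.get("GOOGLE_CLOUD_PROJECT"))` evaluates to the literal string `"None"` when the variable is unset, so a missing project only surfaces later as an API error. The following is a minimal sketch of a stricter alternative. The helper name `resolve_project_id` and its `env` parameter are hypothetical and not part of the original notebook.

```python
import os

PLACEHOLDER = "[your-project-id]"  # same placeholder the cell above checks for


def resolve_project_id(configured, env=None):
    """Return the configured project ID, else GOOGLE_CLOUD_PROJECT, else raise.

    `env` is a hypothetical injection point for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    if configured and configured != PLACEHOLDER:
        return configured
    fallback = env.get("GOOGLE_CLOUD_PROJECT")
    if not fallback:
        raise ValueError(
            "Set PROJECT_ID or the GOOGLE_CLOUD_PROJECT environment variable."
        )
    return fallback
```

With this helper, `PROJECT_ID = resolve_project_id(PROJECT_ID)` fails immediately with a readable message instead of passing `"None"` on to the clients.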

### Create a New BigQuery Dataset

This dataset will house any tables and models created throughout this notebook.

In [14]:
!bq mk --location={LOCATION} --dataset --project_id={PROJECT_ID} demo_dataset

Dataset 'jasperxu-test:demo_dataset' successfully created.


## Use Gemini Text Embedding Model in BigQuery

First, let's explore how to generate embeddings using the state-of-the-art Gemini model directly in BigQuery. This process involves two simple steps: creating a remote model and then using it for inference.

### Create a Remote Model in BigQuery

Before you can generate embeddings, you need to create a remote model in BigQuery that points to `gemini-embedding-001`, using the statement below:

In [5]:
%%bigquery --project $PROJECT_ID

CREATE OR REPLACE MODEL demo_dataset.gemini_embedding_model
REMOTE WITH CONNECTION DEFAULT
OPTIONS(endpoint="gemini-embedding-001")


### Generate Embeddings

Once the model is created, you can call the `ML.GENERATE_EMBEDDING` function to generate embeddings. The following code generates embeddings for 10,000 records from the public `bigquery-public-data.hacker_news.full` table.

In [7]:
%%bigquery --project $PROJECT_ID

SELECT
  *
FROM
  ML.GENERATE_EMBEDDING(
    MODEL demo_dataset.gemini_embedding_model,
    (
      SELECT
        text AS content
      FROM
        bigquery-public-data.hacker_news.full
      WHERE
        text IS NOT NULL
      LIMIT 10000
    )
  );



Unnamed: 0,ml_generate_embedding_result,ml_generate_embedding_statistics,ml_generate_embedding_status,content
0,"[-0.014899546280503273, -0.014863749034702778,...","{""token_count"":2,""truncated"":false}",,Arcology
1,"[0.008587316609919071, -0.019451744854450226, ...","{""token_count"":2,""truncated"":false}",,Soapbox
2,"[-0.0326937772333622, 0.011020665988326073, 0....","{""token_count"":3,""truncated"":false}",,Microsoft! LOL
3,"[-0.0005699801840819418, 0.002384473569691181,...","{""token_count"":3,""truncated"":false}",,Online Collaboration Site
4,"[-0.026845205575227737, 0.008600174449384212, ...","{""token_count"":4,""truncated"":false}",,Total agreement here.
...,...,...,...,...
95,"[0.0019643178675323725, 0.0009948242222890258,...","{""token_count"":399,""truncated"":false}",,"Java used to be my hero, ever since the day I ..."
96,"[0.00833573192358017, 0.04018007591366768, 0.0...","{""token_count"":402,""truncated"":false}",,"""But even if you have positive cashflow, it ge..."
97,"[-0.01720813289284706, -0.020577890798449516, ...","{""token_count"":619,""truncated"":false}",,One of the problems with our YC interview was ...
98,"[-0.018108271062374115, -0.0021691489964723587...","{""token_count"":852,""truncated"":false}",,"Wow, this is awesome. Unfortunately it will pr..."
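
The result above carries two bookkeeping columns alongside the vectors: `ml_generate_embedding_statistics` (a JSON string with `token_count` and `truncated`) and `ml_generate_embedding_status`, which is empty for rows that succeeded. As a rough sketch of how you might check a result set before using it, the helper below counts failed and truncated rows. The function name `summarize_embedding_rows` and the third sample row (including its status message) are hypothetical, made up to illustrate a failure.

```python
import json


def summarize_embedding_rows(rows):
    """Count failed and truncated rows in an ML.GENERATE_EMBEDDING result.

    `rows` is an iterable of dicts keyed by the output column names.
    """
    summary = {"total": 0, "failed": 0, "truncated": 0, "tokens": 0}
    for row in rows:
        summary["total"] += 1
        # A non-empty status means the row was not embedded successfully.
        if row.get("ml_generate_embedding_status"):
            summary["failed"] += 1
            continue
        stats = row.get("ml_generate_embedding_statistics")
        if isinstance(stats, str):
            stats = json.loads(stats)
        stats = stats or {}
        summary["tokens"] += int(stats.get("token_count", 0))
        if stats.get("truncated"):
            summary["truncated"] += 1
    return summary


sample_rows = [
    {
        "content": "Arcology",
        "ml_generate_embedding_statistics": '{"token_count":2,"truncated":false}',
        "ml_generate_embedding_status": "",
    },
    {
        "content": "Soapbox",
        "ml_generate_embedding_statistics": '{"token_count":2,"truncated":false}',
        "ml_generate_embedding_status": "",
    },
    {
        # Hypothetical failed row, for illustration only.
        "content": "example text",
        "ml_generate_embedding_statistics": None,
        "ml_generate_embedding_status": "hypothetical error message",
    },
]

summary = summarize_embedding_rows(sample_rows)
print(summary)
```

If the query result is loaded into a pandas DataFrame named `df` (an assumed name), `summarize_embedding_rows(df.to_dict("records"))` would apply the same check to the real output.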


## Use an OSS Text Embedding Model in BigQuery

Now, let's walk through using an open-source model. This approach gives you more flexibility and control over quality and scalability. Unlike the managed Gemini model, it requires hosting the model yourself on a Vertex AI endpoint.

### Deploy an OSS Model to a Vertex AI Endpoint

First, you need to choose an open-source model from a repository like [Hugging Face](https://huggingface.co/models?other=text-embeddings-inference&sort=trending) and deploy it to a Vertex AI endpoint. For this example, we use [`intfloat/multilingual-e5-small`](https://huggingface.co/intfloat/multilingual-e5-small), which delivers respectable performance (ranking 38th on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) at the time of writing) while being scalable and cost-effective. The following code deploys the model, which creates a prediction server with dedicated resources for your use.

The model is served by default on a single `g2-standard-12` machine replica with one `NVIDIA_L4` GPU. You can adjust the `min_replica_count`, `max_replica_count`, and `machine_type` to balance scalability and cost.

In [7]:
from vertexai import model_garden

model = model_garden.OpenModel("publishers/intfloat/models/e5@multilingual-e5-small")

# BigQuery currently supports only public shared endpoints; dedicated endpoints are not supported.
endpoint = model.deploy(dedicated_endpoint_disabled=True)

INFO:vertexai.model_garden._model_garden:Deploying model: publishers/intfloat/models/e5@multilingual-e5-small
INFO:vertexai.model_garden._model_garden:LRO: projects/1079516131935/locations/us-central1/operations/2454382374381682688
INFO:vertexai.model_garden._model_garden:Start time: 2025-09-04 20:25:31.568048
INFO:vertexai.model_garden._model_garden:End time: 2025-09-04 20:28:37.448734
INFO:vertexai.model_garden._model_garden:Endpoint: projects/1079516131935/locations/us-central1/endpoints/mg-endpoint-be417de5-4728-4efd-9cb7-392a7a82cf63
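
The deployment settings named earlier (`machine_type`, `min_replica_count`, `max_replica_count`) can be collected and validated before deploying. The sketch below only builds a dictionary of keyword arguments; it assumes `model.deploy` accepts these names as the text suggests, so check the SDK reference for your installed version before relying on it. The helper name `build_deploy_kwargs` is hypothetical, and accelerator settings may also need to be specified when the machine type changes.

```python
def build_deploy_kwargs(
    machine_type="g2-standard-12",
    min_replica_count=1,
    max_replica_count=1,
):
    """Validate replica settings and return keyword arguments for a deployment.

    The defaults mirror the single-replica deployment described in this notebook.
    """
    if min_replica_count < 1:
        raise ValueError("min_replica_count must be at least 1")
    if max_replica_count < min_replica_count:
        raise ValueError("max_replica_count must be >= min_replica_count")
    return {
        "machine_type": machine_type,
        "min_replica_count": min_replica_count,
        "max_replica_count": max_replica_count,
        # BigQuery needs a public shared endpoint, as in the cell above.
        "dedicated_endpoint_disabled": True,
    }


deploy_kwargs = build_deploy_kwargs(max_replica_count=4)
print(deploy_kwargs)

# Assumed usage, left commented out because it creates billable resources:
# endpoint = model.deploy(**deploy_kwargs)
```

Raising `max_replica_count` lets the endpoint scale out for large batch jobs at a higher hourly cost, while keeping `min_replica_count` at 1 limits the idle cost.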


### Create a Remote Model in BigQuery

Similar to the Gemini workflow, you need to create a remote model in BigQuery. However, this time the model will point to the URL of the Vertex AI endpoint you just created. This tells BigQuery where to send the data for embedding generation.

In [8]:
ENDPOINT_ID = f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint.name}"
print("Endpoint ID: ", ENDPOINT_ID)

query = f"""
CREATE OR REPLACE MODEL demo_dataset.multilingual_e5_small
REMOTE WITH CONNECTION DEFAULT
OPTIONS(
  endpoint='{ENDPOINT_ID}'
);
"""

bq_client.query_and_wait(query).to_dataframe()

Endpoint ID:  https://us-central1-aiplatform.googleapis.com/v1/projects/jasperxu-test/locations/us-central1/endpoints/mg-endpoint-be417de5-4728-4efd-9cb7-392a7a82cf63
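
Both remote models in this notebook use the same `CREATE OR REPLACE MODEL` statement and differ only in the `endpoint` option: a model name for Gemini, a full URL for the self-hosted endpoint. The sketch below factors that out into two small helpers. The function names `vertex_endpoint_url` and `remote_model_ddl` are hypothetical; the URL layout and the SQL text are copied from the cells above.

```python
def vertex_endpoint_url(project_id, location, endpoint_name):
    """Build the Vertex AI endpoint URL in the format used by the cell above."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/endpoints/{endpoint_name}"
    )


def remote_model_ddl(model_path, endpoint):
    """Return the CREATE MODEL statement for a BigQuery remote model."""
    if "'" in endpoint:
        # Guard against breaking out of the quoted SQL string literal.
        raise ValueError("endpoint must not contain single quotes")
    return (
        f"CREATE OR REPLACE MODEL {model_path}\n"
        "REMOTE WITH CONNECTION DEFAULT\n"
        f"OPTIONS(endpoint='{endpoint}');"
    )


url = vertex_endpoint_url(
    "jasperxu-test",
    "us-central1",
    "mg-endpoint-be417de5-4728-4efd-9cb7-392a7a82cf63",
)
print(remote_model_ddl("demo_dataset.multilingual_e5_small", url))
print(remote_model_ddl("demo_dataset.gemini_embedding_model", "gemini-embedding-001"))
```

Either statement can then be passed to `bq_client.query_and_wait(...)` as in the cell above.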


### Generate Embeddings

With the model created, you can use the exact same `ML.GENERATE_EMBEDDING` function as before. With this E5 model and the default deployment settings, embedding the more than 38M non-null rows in the Hacker News dataset takes around 2 hours and 10 minutes.

In [9]:
%%bigquery --project $PROJECT_ID

SELECT
  *
FROM
  ML.GENERATE_EMBEDDING(
    MODEL demo_dataset.multilingual_e5_small,
    (
      SELECT
        text AS content
      FROM
        bigquery-public-data.hacker_news.full
      WHERE
        text IS NOT NULL
      LIMIT 10000
    )
  );


Unnamed: 0,ml_generate_embedding_result,ml_generate_embedding_status,content
0,"[0.046088517, -0.04064304, -0.071781255, -0.02...",,I couldn't agree with you more. It isn't unti...
1,"[0.0047469777, -0.005203721, -0.056900736, -0....",,"It depends on your software. In general, here ..."
2,"[0.059171423, -0.035687443, -0.06522188, -0.01...",,The link to 'The Other Road Ahead' is broken.<...
3,"[0.036872875, -0.0104556475, -0.056996714, -0....",,"Wanted to point out a typo. You have:<p>""Thank..."
4,"[0.0034843513, -0.010216616, -0.032130696, -0....",,I no longer consult but if I had a small busin...
...,...,...,...
9995,"[0.020694472, 0.013985133, -0.053725068, -0.02...",,"As much as I really don't like the guy, I thin..."
9996,"[0.01669293, 0.03440433, -0.08619712, -0.04722...",,What do you think of Fantasy Interactive (<a h...
9997,"[0.012449909, -0.03812472, -0.036249734, -0.04...",,I don't think it comes down solely to the desi...
9998,"[0.015606218, 0.027555749, -0.027346795, -0.03...",,One designer not mentioned on this list is Jas...
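
The timing quoted for the full dataset (about 2 hours and 10 minutes for roughly 38M rows on the default deployment) implies an average throughput that you can use for back-of-the-envelope planning. The sketch below does that arithmetic. It treats 38M as an exact row count and assumes throughput scales linearly with row count, which ignores startup overhead and differences in text length, so read the results as rough estimates only.

```python
# Figures quoted in this notebook for the default single-replica deployment.
ROWS_EMBEDDED = 38_000_000           # "over 38M" rows, treated here as exact
ELAPSED_SECONDS = 2 * 3600 + 10 * 60  # 2 hours 10 minutes

rows_per_second = ROWS_EMBEDDED / ELAPSED_SECONDS


def estimated_seconds(row_count, rate=rows_per_second):
    """Naive linear estimate of embedding time for `row_count` rows."""
    return row_count / rate


print(f"Average throughput: {rows_per_second:.0f} rows/s")
print(f"Estimated time for 1M rows: {estimated_seconds(1_000_000) / 60:.1f} min")
```

The average works out to a little under 4,900 rows per second for this model and configuration.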


### Delete the Vertex AI Model and Endpoint

This is a critical step for cost management. Because you deployed an OSS model to an active endpoint, it continues to incur costs even when idle. The following code undeploys the model from the endpoint and then deletes the endpoint, which stops the billing.

For batch workloads, the most cost-effective pattern is to deploy the model, run your inference job, and immediately undeploy it. You can achieve this by running these steps sequentially.

In [10]:
endpoint.undeploy_all()
endpoint.delete()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the BigQuery dataset created in this demo, assuming you have already run the code block above to delete the resources on the Vertex AI side:

In [15]:
!bq rm -r -f --dataset {PROJECT_ID}:demo_dataset