##### Copyright 2025 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on ai.google.dev</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers.ipynb""><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://kaggle.com/kernels/welcome?src=https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers.ipynb"><img src="https://www.kaggle.com/static/images/logos/kaggle-logo-transparent-300.png" height="32" width="70"/>Run in Kaggle</a>
  </td>
  <td>
    <a target="_blank" href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/google/generative-ai-docs/main/site/en/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers.ipynb"><img src="https://ai.google.dev/images/cloud-icon.svg" width="40" />Open in Vertex AI</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

# Fine-tune EmbeddingGemma

Fine-tuning helps close the gap between a model's general-purpose understanding and the specialized, high-performance accuracy that your application requires. Since no single model is perfect for every task, fine-tuning adapts it to your specific domain.

Imagine your company, "Shibuya Financial" offers various complex financial products like investment trusts, NISA accounts (a tax-advantaged savings account), and home loans. Your customer support team uses an internal knowledge base to quickly find answers to customer questions.

## Setup

Before starting this tutorial, complete the following steps:

* Get access to EmbeddingGemma by logging into [Hugging Face](https://huggingface.co/google/embeddinggemma-300M) and selecting **Acknowledge license** for a Gemma model.
* Generate a Hugging Face [Access Token](https://huggingface.co/docs/hub/en/security-tokens#how-to-manage-user-access-token) and use it to login from Colab.

This notebook will run on either CPU or GPU.

### Install Python packages

Install the libraries required for running the EmbeddingGemma model and generating embeddings. Sentence Transformers is a Python framework for text and image embeddings. For more information, see the [Sentence Transformers](https://www.sbert.net/) documentation.

In [1]:
!pip install -U sentence-transformers git+https://github.com/huggingface/transformers@v4.56.0-Embedding-Gemma-preview

Collecting git+https://github.com/huggingface/transformers@v4.56.0-Embedding-Gemma-preview
  Cloning https://github.com/huggingface/transformers (to revision v4.56.0-Embedding-Gemma-preview) to /tmp/pip-req-build-khkyzca5
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-khkyzca5
  Running command git checkout -q df86ccad5d183a07127e7dc001bf53020d885fbc
  Resolved https://github.com/huggingface/transformers to commit df86ccad5d183a07127e7dc001bf53020d885fbc
  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
Building wheels for collected packages: transformers
  Building wheel for transformers (pyproject.toml) ... [?25l[?25hdone
  Created wheel for transformers: filename=transformers-4.57.0.dev0-py3-none-any.whl size=12604538 sha256=92752b76fb35d2f4591e256d870ce2e68ab4440a093833b67326377cd83908a1
  S

After you have accepted the license, you need a valid Hugging Face Token to access the model.

In [2]:
# Login into Hugging Face Hub
from huggingface_hub import login
login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

### Load Model

Use the `sentence-transformers` libraries to create an instance of a model class with EmbeddingGemma.

In [3]:
import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"

model_id = "google/embeddinggemma-300M"
model = SentenceTransformer(model_id).to(device=device)

print(f"Device: {model.device}")
print(model)
print("Total number of parameters in the model:", sum([p.numel() for _, p in model.named_parameters()]))

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


modules.json:   0%|          | 0.00/573 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/997 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/16.7k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/58.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/1.49k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/1.21G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/1.16M [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/4.69M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/33.4M [00:00<?, ?B/s]

added_tokens.json:   0%|          | 0.00/35.0 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/662 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/312 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/134 [00:00<?, ?B/s]

2_Dense/model.safetensors:   0%|          | 0.00/9.44M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/134 [00:00<?, ?B/s]

3_Dense/model.safetensors:   0%|          | 0.00/9.44M [00:00<?, ?B/s]

Device: cpu
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
Total number of parameters in the model: 307581696


## Prepare the Fine-Tuning Dataset

This is the most crucial part. You need to create a dataset that teaches the model what "similar" means in your specific context. This data is often structured as triplets: (anchor, positive, negative)

- Anchor: The original query or sentence.
- Positive: A sentence that is semantically very similar or identical to the anchor.
- Negative: A sentence that is on a related topic but semantically distinct.

In this example, we only prepared 3 triplets, but for a real application, you would need a much larger dataset to perform well.

In [4]:
from datasets import Dataset

dataset = [
    ["How do I open a NISA account?", "What is the procedure for starting a new tax-free investment account?", "I want to check the balance of my regular savings account."],
    ["Are there fees for making an early repayment on a home loan?", "If I pay back my house loan early, will there be any costs?", "What is the management fee for this investment trust?"],
    ["What is the coverage for medical insurance?", "Tell me about the benefits of the health insurance plan.", "What is the cancellation policy for my life insurance?"],
]

# Convert the list-based dataset into a list of dictionaries.
data_as_dicts = [ {"anchor": row[0], "positive": row[1], "negative": row[2]} for row in dataset ]

# Create a Hugging Face `Dataset` object from the list of dictionaries.
train_dataset = Dataset.from_list(data_as_dicts)
print(train_dataset)

Dataset({
    features: ['anchor', 'positive', 'negative'],
    num_rows: 3
})


## Before Fine-Tuning

A search for "tax-free investment" might have given the following results, with similarity scores:

1. Document: Opening a NISA account (Score: 0.45)
2. Document: Opening a Regular Saving Account (Score: 0.48) <- *Similar score, potentially confusing*
3. Document: Home Loan Application Guide (Score: 0.42)

> Note: To generate optimal embeddings with EmbeddingGemma, you should add an "instructional prompt" or "task" to the beginning of your input text. You will use `STS` for sentence similarity. For details on all available EmbeddingGemma prompts, see the [model card](http://ai.google.dev/gemma/docs/embeddinggemma/model_card#prompt_instructions).

In [5]:
task_name = "STS"

def get_scores(query, documents):
  # Calculate embeddings by calling model.encode()
  query_embeddings = model.encode(query, prompt=task_name)
  doc_embeddings = model.encode(documents, prompt=task_name)

  # Calculate the embedding similarities
  similarities = model.similarity(query_embeddings, doc_embeddings)

  for idx, doc in enumerate(documents):
    print("Document: ", doc, "-> 🤖 Score: ", similarities.numpy()[0][idx])

query = "I want to start a tax-free installment investment, what should I do?"
documents = ["Opening a NISA Account", "Opening a Regular Savings Account", "Home Loan Application Guide"]

get_scores(query, documents)

Document:  Opening a NISA Account -> 🤖 Score:  0.4569875
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.48092675
Document:  Home Loan Application Guide -> 🤖 Score:  0.42127043


## Training

Using a framework like `sentence-transformers` in Python, the base model gradually learns the subtle distinctions in your financial vocabulary.

In [6]:
from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.losses import MultipleNegativesRankingLoss
from transformers import TrainerCallback

loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    # Required parameter:
    output_dir="my-embedding-gemma",
    # Optional training parameters:
    prompts=model.prompts[task_name],    # use model's prompt to train
    num_train_epochs=5,
    per_device_train_batch_size=1,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    # Optional tracking/debugging parameters:
    logging_steps=train_dataset.num_rows,
    report_to="none",
)

class MyCallback(TrainerCallback):
    "A callback that evaluates the model at the end of eopch"
    def __init__(self, evaluate):
        self.evaluate = evaluate # evaluate function

    def on_log(self, args, state, control, **kwargs):
        # Evaluate the model using text generation
        print(f"Step {state.global_step} finished. Running evaluation:")
        self.evaluate()

def evaluate():
  get_scores(query, documents)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
    callbacks=[MyCallback(evaluate)]
)
trainer.train()

Computing widget examples:   0%|          | 0/1 [00:00<?, ?example/s]



Step,Training Loss
3,0.0483
6,0.0
9,0.0
12,0.0
15,0.0


Step 3 finished. Running evaluation:
Document:  Opening a NISA Account -> 🤖 Score:  0.6449201
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.4412307
Document:  Home Loan Application Guide -> 🤖 Score:  0.46752426
Step 6 finished. Running evaluation:
Document:  Opening a NISA Account -> 🤖 Score:  0.6887378
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.3406956
Document:  Home Loan Application Guide -> 🤖 Score:  0.50065374
Step 9 finished. Running evaluation:
Document:  Opening a NISA Account -> 🤖 Score:  0.7148903
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.30480397
Document:  Home Loan Application Guide -> 🤖 Score:  0.52454764
Step 12 finished. Running evaluation:
Document:  Opening a NISA Account -> 🤖 Score:  0.72614646
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.29255384
Document:  Home Loan Application Guide -> 🤖 Score:  0.53700054
Step 15 finished. Running evaluation:
Document:  Opening a NISA Account -> 🤖 Score:  0.72940296


TrainOutput(global_step=15, training_loss=0.009651267528511198, metrics={'train_runtime': 138.3298, 'train_samples_per_second': 0.108, 'train_steps_per_second': 0.108, 'total_flos': 0.0, 'train_loss': 0.009651267528511198, 'epoch': 5.0})

## After Fine-Tuning

The same search now yields much clearer results:

1. Document: Opening a NISA account (Score: 0.72) <- *Much more confident*
2. Document: Opening a Regular Saving Account (Score: 0.28) <- *Clearly less relevant*
3. Document: Home Loan Application Guide (Score: 0.54)

In [7]:
get_scores(query, documents)

Document:  Opening a NISA Account -> 🤖 Score:  0.72940296
Document:  Opening a Regular Savings Account -> 🤖 Score:  0.28930163
Document:  Home Loan Application Guide -> 🤖 Score:  0.5408772


To upload your model to the Hugging Face Hub, you can use the `push_to_hub` method from the Sentence Transformers library.

Uploading your model makes it easy to access for inference directly from the Hub, share with others, and version your work. Once uploaded, anyone can load your model with a single line of code, simply by referencing its unique model ID `<username>/my-embedding-gemma`


In [8]:
# Push to Hub
model.push_to_hub("eea-embedding-gemma")

Processing Files (0 / 0)                : |          |  0.00B /  0.00B            

New Data Upload                         : |          |  0.00B /  0.00B            

  /tmp/tmpdq18u_xv/tokenizer.model      : 100%|##########| 4.69MB / 4.69MB            

  /tmp/tmpdq18u_xv/tokenizer.json       :  74%|#######4  | 24.9MB / 33.4MB            

  ...pdq18u_xv/3_Dense/model.safetensors: 100%|##########| 9.44MB / 9.44MB            

  /tmp/tmpdq18u_xv/model.safetensors    :   2%|2         | 25.1MB / 1.21GB            

  ...pdq18u_xv/2_Dense/model.safetensors: 100%|##########| 9.44MB / 9.44MB            

'https://huggingface.co/EmmanuelEA/eea-embedding-gemma/commit/d5b127ac3a39c43881fa8411c25ec53896830810'

## Summary and next steps

You have now learned how to adapt an EmbeddingGemma model for a specific domain by fine-tuning it with the Sentence Transformers library.

Explore what more you can do with EmbeddingGemma:

* [Training Overview](https://sbert.net/docs/sentence_transformer/training_overview.html) in Sentence Transformers Documentation
* [Generate embeddings with Sentence Transformers](https://ai.google.dev/gemma/docs/embeddinggemma/inference-embeddinggemma-with-sentence-transformers)
* [Simple RAG example](https://github.com/google-gemini/gemma-cookbook/blob/main/Gemma/%5BGemma_3%5DRAG_with_EmbeddingGemma.ipynb) in the Gemma Cookbook
