##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Tuning Quickstart with Python

<a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Tuning.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

In this notebook, you'll learn how to get started with model tuning.

## What is model tuning?

Prompt design strategies such as few-shot prompting may not always produce the results you need. Use model tuning to improve a model's performance on specific tasks, or to help the model adhere to specific output requirements when instructions aren't sufficient and you have a set of examples that demonstrate the outputs you want.

The goal of model tuning is to further improve the performance of the model for your specific task. Model tuning works by providing the model with a training dataset containing many examples of the task. For niche tasks, you can get significant improvements in model performance by tuning the model on a modest number of examples.

Your training data should be structured as examples with prompt inputs and expected response outputs. The goal is to teach the model to mimic the desired behavior or task by giving it many examples that illustrate it.

You can also tune models using example data directly in Google AI Studio.

## Setup

In [138]:
%pip install -q -U "google-genai>=1.0.0"

In [139]:
from google import genai

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example.

In [140]:
from google.colab import userdata
client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))

You can list the available models with the `client.models.list()` method.

In [141]:
for m in client.models.list():
    print(m.name)

models/embedding-gecko-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-1.5-flash-8b-exp-0827
models/gemini-1.5-flash-8b-exp-0924
models/gemini-2.5-pro-exp-03-25
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-04-17
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash-preview-04-17-thinking
models/gemini-2.5-pro-preview-05-06
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview

## Prepare your dataset

Before you can start fine-tuning, you need a dataset to tune the model with. For the best performance, the examples in the dataset should be of high quality, diverse, and representative of real inputs and outputs.

For this example, you will tune a model to generate the next number in the sequence. For example, if the input is `1`, the model should output `2`. If the input is `one hundred`, the output should be `one hundred one`.

The dataset for tuning the model can be one of the following types:
1. `Iterable` of dicts or tuples.
2. `Mapping` of `Iterable[str]`.
3. CSV file.
4. JSON file.

To learn more about preparing a dataset for fine-tuning, see the [model-tuning documentation](https://ai.google.dev/gemini-api/docs/model-tuning#prepare-dataset).

Note: In general, you need between 100 and 500 examples to significantly change the behavior of the model.
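Before tuning, it can help to run a quick sanity check over your examples. The following is a minimal sketch, not part of the tuning API: the helper name `check_examples` and its `min_examples` threshold are hypothetical, chosen here to mirror the guidance above (enough examples, no empty fields, no repeated inputs).

```python
from collections import Counter


def check_examples(examples, min_examples=100):
    """Return a list of human-readable warnings for a list of example dicts.

    Hypothetical helper: each example is expected to be a dict with
    'text_input' and 'output' string fields.
    """
    warnings = []
    if len(examples) < min_examples:
        warnings.append(
            f"only {len(examples)} examples (suggested minimum: {min_examples})"
        )
    # Flag examples whose input or output is missing or blank.
    for i, ex in enumerate(examples):
        if not ex.get("text_input", "").strip() or not ex.get("output", "").strip():
            warnings.append(f"example {i} has an empty input or output")
    # Flag inputs that appear more than once.
    counts = Counter(ex.get("text_input") for ex in examples)
    for text, n in counts.items():
        if n > 1:
            warnings.append(f"input {text!r} appears {n} times")
    return warnings


sample = [
    {"text_input": "1", "output": "2"},
    {"text_input": "1", "output": "2"},
    {"text_input": "seven", "output": ""},
]
for warning in check_examples(sample):
    print(warning)
```

A check like this only catches mechanical problems; whether the examples are diverse and representative still needs a manual review.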

The following sections illustrate how to provide the dataset as an `Iterable` or a CSV file.

### Training data as an `Iterable`

Data can be an `Iterable` of:
* `{'text_input': text_input, 'output': output}` dicts
* `(text_input, output)` tuples.
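If you have data in both shapes, you may want to normalize it to a single form first. The sketch below assumes a hypothetical helper, `to_example_dicts`, that converts `(text_input, output)` tuples into the dict form and passes dicts through unchanged.

```python
def to_example_dicts(examples):
    """Normalize dicts or (text_input, output) tuples into example dicts."""
    normalized = []
    for ex in examples:
        if isinstance(ex, dict):
            # Keep only the two fields the tuning examples use.
            normalized.append({"text_input": ex["text_input"], "output": ex["output"]})
        else:
            text_input, output = ex  # expects a 2-item tuple
            normalized.append({"text_input": text_input, "output": output})
    return normalized


tuple_data = [("1", "2"), ("twenty two", "twenty three")]
print(to_example_dicts(tuple_data))
```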

In [142]:
# Provide data as a list of dicts

dict_data =[
    {
          'text_input': '1',
          'output': '2',
    },{
          'text_input': '3',
          'output': '4',
    },{
          'text_input': '-3',
          'output': '-2',
    },{
          'text_input': 'twenty two',
          'output': 'twenty three',
    },{
          'text_input': 'two hundred',
          'output': 'two hundred one',
    },{
          'text_input': 'ninety nine',
          'output': 'one hundred',
    },{
          'text_input': '8',
          'output': '9',
    },{
          'text_input': '-98',
          'output': '-97',
    },{
          'text_input': '1,000',
          'output': '1,001',
    },{
          'text_input': '10,100,000',
          'output': '10,100,001',
    },{
          'text_input': 'thirteen',
          'output': 'fourteen',
    },{
          'text_input': 'eighty',
          'output': 'eighty one',
    },{
          'text_input': 'one',
          'output': 'two',
    },{
          'text_input': 'three',
          'output': 'four',
    },{
          'text_input': 'seven',
          'output': 'eight',
    }
]

### Training data as a CSV file

You can provide your CSV file to the tuning API in one of the following ways:
  * A path of type `str` or `pathlib.Path` to a local CSV file.
  * A URL to the CSV file.
  * The public URL of a Google Sheets file.

For this example, you will provide the path to a local CSV file containing the training dataset as `pathlib.Path` to the tuning API.


Run the following cell to create the CSV file, `data.csv`.
The CSV file has the default columns, `text_input` for the input and `output` for the output.


In [143]:
%%writefile data.csv
text_input,output
1,2
3,4
-3,-2
twenty two,twenty three
two hundred,two hundred one
ninety nine,one hundred
8,9
-98,-97
"1,000","1,001"
"1,01,00,000","1,01,00,001"
thirteen,fourteen
eighty,eighty one
one,two
three,four
seven,eight

Overwriting data.csv


If your CSV file doesn't have the default field names, you can specify your input and output fields directly in the `create_tuned_model` function.

```
create_tuned_model(
    training_data = <csv file path>,
    ...
    input_key= <input field name>,
    output_key = <output field name>
)
```
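If you load the rows yourself instead, the same remapping can be done in plain Python. This is a sketch under assumptions: `with_default_keys` is a hypothetical helper, and the column names `prompt` and `answer` are made up for illustration.

```python
def with_default_keys(rows, input_key, output_key):
    """Rename custom field names to the default 'text_input' and 'output'."""
    return [
        {"text_input": row[input_key], "output": row[output_key]}
        for row in rows
    ]


# Rows that use hypothetical custom column names.
custom_rows = [
    {"prompt": "1", "answer": "2"},
    {"prompt": "ninety nine", "answer": "one hundred"},
]
renamed_rows = with_default_keys(custom_rows, input_key="prompt", output_key="answer")
print(renamed_rows)
```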

Get the CSV file path as a `pathlib.Path` object.

In [144]:
import pathlib

# Provide data as a CSV file `pathlib.Path` object.
csv_file=pathlib.Path('data.csv')

### Pass your dataset as training data

In [145]:
# Select the dataset to tune with. dict_data is used here because the
# TuningDataset cell below reads each example as a dict.
train_data = dict_data
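The cell that builds the `TuningDataset` later in this notebook indexes each example as a dict, so a CSV path would need to be loaded into dicts first. A minimal sketch using only the standard library follows; `load_examples` is a hypothetical helper, and the demo writes a throwaway CSV file so that it runs on its own.

```python
import csv
import pathlib
import tempfile


def load_examples(source):
    """Return a list of example dicts from a list or a CSV file path."""
    if isinstance(source, (str, pathlib.Path)):
        # csv.DictReader uses the header row as keys and handles quoted commas.
        with open(source, newline="", encoding="utf-8") as f:
            return [dict(row) for row in csv.DictReader(f)]
    return list(source)


# Demo with a temporary CSV file that uses the default column names.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = pathlib.Path(tmp) / "demo.csv"
    demo_path.write_text(
        'text_input,output\n"1,000","1,001"\none,two\n', encoding="utf-8"
    )
    demo_examples = load_examples(demo_path)

print(demo_examples)
```

With a helper like this, `train_data = load_examples(csv_file)` would give the same list-of-dicts shape as `dict_data`.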

## Create tuned model

Get the list of models available for tuning.


In [146]:
# List models that support tuning
tunable_models = [
    m.name
    for m in client.models.list()
    if "createTunedModel" in m.supported_actions
]

print(tunable_models)

['models/gemini-1.5-flash-001-tuning']


Select the source model for tuning.


In [147]:
# Pick the last “flash” model for tuning
base_model = [m for m in tunable_models if 'flash' in m][-1]

print(base_model)  # e.g. 'models/gemini-1.5-flash-001-tuning'

models/gemini-1.5-flash-001-tuning



To create a tuned model, you need to convert your dataset into a `TuningDataset` and pass it to the `client.tunings.tune()` method. The tuning process requires specifying the base model, training dataset, and configuration parameters.

In [148]:
import random
from google.genai import types

# Wrap your raw train_data into a TuningDataset
training_dataset = types.TuningDataset(
    examples=[
        types.TuningExample(
            text_input=example['text_input'],
            output=example['output']
        )
        for example in train_data
    ]
)

# Pick a unique display name for the tuned model
name = f'generate-num-{random.randint(0, 10000)}'

# Launch the tuning job
tuning_job = client.tunings.tune(
    base_model=base_model,              # e.g., 'models/gemini-1.5-flash-001-tuning'
    training_dataset=training_dataset,  # your types.TuningDataset
    config=types.CreateTuningJobConfig(
        epoch_count=100,
        batch_size=4,
        learning_rate=0.001,
        tuned_model_display_name=name,
    )
)
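To get a feel for what these hyperparameters mean for the 15-example dataset above, here is a rough back-of-the-envelope estimate. It assumes each epoch is one full pass over the data in batches and that a final partial batch still counts as a step; the service's actual scheduling is not described in this notebook.

```python
import math

num_examples = 15   # size of dict_data in this notebook
batch_size = 4      # from the tuning config above
epoch_count = 100   # from the tuning config above

# Assumption: a final partial batch still counts as one step.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * epoch_count
examples_seen = num_examples * epoch_count

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
print(f"examples seen in total: {examples_seen}")
```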

Your tuned model is immediately added to the list of tuned models, but its state is set to "JOB_STATE_RUNNING" while the model is tuned.

In [149]:
model = client.tunings.get(name=tuning_job.name)

print("Model: ", model)
print("Model state: ", model.state)

Model:  name='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3' state=<JobState.JOB_STATE_RUNNING: 'JOB_STATE_RUNNING'> create_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 161390, tzinfo=TzInfo(UTC)) start_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 804150, tzinfo=TzInfo(UTC)) end_time=None update_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 161390, tzinfo=TzInfo(UTC)) error=None description=None base_model='models/gemini-1.5-flash-001-tuning' tuned_model=TunedModel(model='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3', endpoint='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3', checkpoints=None) supervised_tuning_spec=None tuning_data_stats=None encryption_spec=None partner_model_tuning_spec=None distillation_spec=None experiment=None labels=None pipeline_job=None tuned_model_display_name=None
Model state:  JobState.JOB_STATE_RUNNING


### Check tuning progress

Print the tuning job to check its state:

In [150]:
print(model)

name='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3' state=<JobState.JOB_STATE_RUNNING: 'JOB_STATE_RUNNING'> create_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 161390, tzinfo=TzInfo(UTC)) start_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 804150, tzinfo=TzInfo(UTC)) end_time=None update_time=datetime.datetime(2025, 5, 22, 5, 35, 21, 161390, tzinfo=TzInfo(UTC)) error=None description=None base_model='models/gemini-1.5-flash-001-tuning' tuned_model=TunedModel(model='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3', endpoint='tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3', checkpoints=None) supervised_tuning_spec=None tuning_data_stats=None encryption_spec=None partner_model_tuning_spec=None distillation_spec=None experiment=None labels=None pipeline_job=None tuned_model_display_name=None


Wait for the training to finish by polling `client.tunings.get()` until the job reaches a terminal state:

In [153]:
import sys, time
from itertools import cycle
from google.genai.types import JobState

job_name = tuning_job.name

# Fetch initial status
current_status = client.tunings.get(name=job_name)

spinner = cycle(["|", "/", "-", "\\"])
print("Waiting for tuning job to complete ", end="", flush=True)

# Loop until the job reaches any terminal state (SUCCEEDED, FAILED, CANCELLED)
while not current_status.has_ended:
    sys.stdout.write(next(spinner))
    sys.stdout.flush()
    time.sleep(0.2)
    sys.stdout.write("\b")
    # Refresh status from the server
    current_status = client.tunings.get(name=job_name)

print(f"\nTuning job finished with state: {current_status.state.name}")

Waiting for tuning job to complete |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-

You can cancel your tuning job any time using the `cancel()` method. Uncomment the line below and run the code cell to cancel your job before it finishes.

In [None]:
# tuning_job.cancel()

Once the tuning is complete, you can view the loss curve from the tuning results. The [loss curve](https://ai.google.dev/docs/model_tuning_guidance) shows how much the model's predictions deviate from the ideal outputs.

Note: `result()` is deprecated in the new google-genai SDK, so the plotting code below is commented out.

In [16]:
# import pandas as pd
# import seaborn as sns

# model = operation.result()

# snapshots = pd.DataFrame(model.tuning_task.snapshots)

# sns.lineplot(data=snapshots, x = 'epoch', y='mean_loss')


## Evaluate your model

Use the `client.tunings.get()` method with the name of your tuning job to fetch the finished job, then pass its tuned model to `client.models.generate_content()` to test your model's performance.

In [154]:
model = client.tunings.get(name=tuning_job.name)

In [155]:
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="55"
)

print(result.text)

56


In [156]:
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="123455"
)

print(result.text)

123456


In [157]:
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="four"
)

print(result.text)

five


In [158]:
# French 4
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="quatre"
)

print(result.text) # French 5 is "cinq"

cinq


In [159]:
# Roman numeral 3
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="III"
)

print(result.text) # Roman numeral 4 is IV

IV


In [160]:
# Japanese 7
result = client.models.generate_content(
    model=model.tuned_model.model,
    contents="七"
)

print(result.text) # Japanese 8 is 八!

八


The model really seems to have picked up the task despite the limited examples, but "next" is a simple concept. See the [tuning guide](https://ai.google.dev/docs/model_tuning_guidance) for more guidance on improving performance.
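Spot checks like the ones above can be turned into a small exact-match evaluation. The sketch below is illustrative only: `exact_match_accuracy` is a hypothetical helper, and `fake_generate` is a stand-in that handles digits only, so the example runs without calling the API.

```python
def exact_match_accuracy(generate, cases):
    """Return (accuracy, failures) for a list of (prompt, expected) pairs."""
    correct = 0
    failures = []
    for prompt, expected in cases:
        answer = generate(prompt).strip()
        if answer == expected:
            correct += 1
        else:
            failures.append((prompt, expected, answer))
    return correct / len(cases), failures


def fake_generate(prompt):
    """Stand-in for the tuned model: increments plain integers only."""
    if prompt.lstrip("-").isdigit():
        return str(int(prompt) + 1)
    return "?"


cases = [("55", "56"), ("123455", "123456"), ("four", "five"), ("-3", "-2")]
accuracy, failures = exact_match_accuracy(fake_generate, cases)
print(accuracy, failures)
```

To evaluate the tuned model instead, `generate` could wrap the call used in the cells above, for example a function that returns `client.models.generate_content(model=model.tuned_model.model, contents=prompt).text`.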

## Update the description

You can update the description of your tuned model any time using the `client.models.update()` method.

In [161]:
# update only the description field
updated_model = client.models.update(
    model=model.name,
    config=types.UpdateModelConfig(
        description="This is my model."
    )
)

print("New description:", updated_model.description)

New description: This is my model.


In [162]:
refetched = client.models.get(model=model.name)

refetched.description

'This is my model.'

## Delete the model

You can clean up your tuned model list by deleting models you no longer need. Use the `client.models.delete()` method to delete a model. If you canceled any tuning jobs, you may want to delete those as their performance may be unpredictable.

In [163]:
client.models.delete(model=model.name)

DeleteModelResponse()

The model no longer exists:

In [165]:
try:
    # attempt to re-fetch the deleted tuned model
    deleted_model = client.models.get(model=model.name)
    print(deleted_model)

except Exception as e:
    # catch any other unexpected errors
    print(f"{type(e).__name__}: {e}")

ClientError: 404 NOT_FOUND. {'error': {'code': 404, 'message': 'Tuned model tunedModels/generatenum2528-394ofcsjtbyv7yeo7rzk9pj3 does not exist.', 'status': 'NOT_FOUND'}}
