##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Exploring Advanced Parameters with Updated Models

<a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/New_in_002.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

This notebook explores the new options added with the 002 versions of the 1.5 series models:

* Candidate count
* Presence and frequency penalties
* Response logprobs


**Note**:

* This notebook was originally designed for `002` models. It has been updated to work with the latest models in the SDK, focusing on advanced parameter exploration.

## Setup

Install the latest version of the `google-genai` SDK:

In [1]:
%pip install -q "google-genai"

Note: you may need to restart the kernel to use updated packages.


Import the package and set up your API key:

In [2]:
from google import genai

In [3]:
import os
api_key = os.getenv("GEMINI_API_KEY")

In [4]:
client = genai.Client(api_key=api_key)

Import other packages.

In [5]:
from IPython.display import display, Markdown, HTML

Check model availability:

In [6]:
# Try the simplest possible call first
try:
    response = client.models.generate_content(
        model="gemini-2.5-flash", 
        contents="Hello"
    )
    print("Success! API key works")
    print(response.text)
except Exception as e:
    print(f"Still failing: {e}")


Success! API key works
Hello! How can I help you today?


In [7]:
for model in client.models.list():
    print(model.name)

models/embedding-gecko-001
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/gemini-2.0-flash-thinking-exp-1219
models/gemini-2.5-flash-preview-tts
models/gemini-2.5-pro-preview-tts
models/learnlm-2.0-flash-experimental
models/gemma-3-1b-it
models/gemma-3-4b-it
models/gemma-3-12b-it
models/gemma-3-27b-it
models/gemma-3n-e4b-it
mo

In [8]:
model_name = "models/gemini-2.5-flash"
test_prompt="Why don't people have tails?"

## Quick refresher on `generation_config` [Optional]

In [9]:
from google.genai import types

In [10]:
client = genai.Client()

response = client.models.generate_content(
    model=model_name,
    contents='hello',
    config=types.GenerateContentConfig(
        temperature=1.0,
        max_output_tokens=5,
    ),
)

Note:

* Each `generate_content` request carries its generation settings in the `config` argument (`chat.send_message` accepts a `config` argument as well).
* With the `google-genai` SDK there is no model object to initialize: you pass `config` with each `client.models.generate_content` call. For chats, a default `config` can be set in `client.chats.create`, and a `config` passed to `chat.send_message` applies to that request.
* You can pass `config` as either a Python `dict` or a `types.GenerateContentConfig`.
* If you're ever unsure about the available parameters, check `types.GenerateContentConfig`.
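
Because the configuration can be a plain Python `dict`, one way to reuse shared settings while overriding a few per request is to merge dictionaries before the call. The sketch below is only an illustration: `BASE_CONFIG` and `with_overrides` are hypothetical names, not part of the SDK, and no request is sent.

```python
# Hypothetical helper, not part of the google-genai SDK: it only merges dicts.
BASE_CONFIG = {"temperature": 1.0, "max_output_tokens": 5}


def with_overrides(base, **overrides):
    """Return a new config dict where per-request values win over the base."""
    merged = dict(base)        # copy so the shared base is never mutated
    merged.update(overrides)   # per-request settings take precedence
    return merged


request_config = with_overrides(BASE_CONFIG, max_output_tokens=256, candidate_count=2)
print(request_config)
```

The resulting `request_config` could then be passed as the `config` argument of a request, in the same way the later cells pass `dict(candidate_count=2)`.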

## Candidate count

Starting with the `002` models, you can set `candidate_count` to a value greater than 1.

In [11]:
model = model_name

In [12]:
generation_config = dict(candidate_count=2)

In [13]:
response = client.models.generate_content(
    model= model,
    contents=test_prompt, 
    config=generation_config
)

Here the `.text` quick-accessor returns the text result from the first candidate when multiple candidates are present.

In [14]:
try:
  response.text  # Does not raise with multiple candidates: it logs a warning and returns the first candidate's text.
except ValueError as e:
  print(e)

there are 2 candidates, returning text result from the first candidate. Access response.candidates directly to get the result from other candidates.


When multiple candidates are present, iterate over `response.candidates` to access the content of each candidate individually:

In [15]:
for candidate in response.candidates:
  display(Markdown(candidate.content.parts[0].text))
  display(Markdown("-------------"))


The simple answer is **evolution**. Humans, along with other great apes (like chimpanzees, gorillas, and orangutans), evolved in a way that made tails unnecessary and even disadvantageous for our primary mode of locomotion and lifestyle.

Here's a more detailed breakdown:

1.  **Our Ancestors Had Tails:** Our very distant primate ancestors *did* have tails. These were useful for balance when climbing through trees, for grasping branches (in some cases, like spider monkeys, who have prehensile tails), and for communication.

2.  **The Shift to Apes:** The evolutionary lineage leading to humans diverged from monkeys millions of years ago. One of the key distinctions between Old World Monkeys (many of whom have tails) and Apes (none of whom have external tails) is this very feature.

3.  **Bipedalism and Upright Posture:** This is the most significant factor. As our ancestors began to spend more time on the ground and eventually evolved to walk upright on two legs (bipedalism):
    *   **Balance:** A tail, which helps quadrupedal animals (four-legged) balance, became less useful for bipedal balance. Our upright posture and changes in our spine and pelvis allow us to balance differently.
    *   **Energy Cost:** Building and maintaining a tail (with bones, muscles, nerves, and skin) requires energy and resources. If it's not providing a significant advantage, natural selection will favor individuals who don't expend those resources on a redundant structure.
    *   **Hindrance:** A tail might even have become a hindrance, potentially getting in the way or becoming a liability when navigating varied terrain or escaping predators while walking upright.

4.  **Changes in Pelvis and Spine:** The evolution of bipedalism profoundly changed the structure of our pelvis and the curvature of our spine. These adaptations are crucial for supporting our body weight and maintaining balance while walking upright. A tail would require a different anatomical configuration that wouldn't fit well with our current structure.

5.  **The Coccyx (Tailbone):** We still have a remnant of a tail! It's called the **coccyx**, or tailbone. This small, fused bone at the very bottom of our spine is the vestigial (non-functional remnant) of our ancestral tail. It serves a minor purpose today, providing an attachment point for some muscles and ligaments.

6.  **Embryonic Development:** Interestingly, human embryos *do* have a tail-like structure (a "caudal eminence") during early development, around weeks 4-6. However, this structure normally recedes and is absorbed, fusing to form the coccyx. This fleeting embryonic tail is a powerful piece of evidence for our evolutionary history.

In rare cases, babies are born with what is sometimes called a "true human tail." This is a developmental anomaly (an atavism) where the embryonic tail structure doesn't fully regress. These are typically soft tissue growths and are usually removed surgically.

-------------

It's a fascinating question that delves deep into our evolutionary history! The short answer is that **humans, along with other great apes (chimpanzees, gorillas, orangutans, and bonobos), lost their tails over millions of years of evolution because it was no longer advantageous and eventually became a disadvantage for their changing lifestyles.**

Here's a breakdown of why:

1.  **Our Ancestors Had Tails:** Our very distant primate ancestors, like many monkeys today, definitely had tails. These tails were incredibly useful for:
    *   **Balance:** Especially for arboreal (tree-dwelling) animals, a tail acts like a counterbalance, helping them navigate branches.
    *   **Locomotion:** Some monkeys have prehensile tails, which can grip branches and act as a "fifth limb."
    *   **Communication:** Tails can be used to signal warnings, attract mates, or display dominance.

2.  **The Evolutionary Split: Apes vs. Monkeys:**
    *   Around 20-25 million years ago, the lineage that would lead to apes and humans diverged from the lineage of Old World monkeys. This was a critical point.
    *   As our ape ancestors began to adopt different methods of moving through trees – like **brachiation** (swinging arm-over-arm) or spending more time upright – the tail became less essential for balance. In fact, a long, cumbersome tail could even become a hindrance.

3.  **Advantages of Losing the Tail:**
    *   **Improved Upright Posture:** For ancestors who were starting to spend more time on the ground or adopting more upright postures in trees, a tail wasn't needed for balance. A streamlined torso became more efficient.
    *   **Enhanced Stability for Sitting:** Losing the tail allowed for a more stable, broad base for sitting, which is important for a variety of activities, from eating to tool-making.
    *   **Reduced Energy Expenditure:** Growing and maintaining a tail takes energy. If it's not useful, that energy is better spent elsewhere.
    *   **Simplified Body Plan:** A tail is an appendage that can be injured or caught. Removing it simplified the body plan for a changing lifestyle.

4.  **The Vestige: Our Coccyx (Tailbone):**
    *   While we don't have an external tail, we do have a remnant: the **coccyx**, or tailbone. This small, triangular bone at the very end of our spine is made up of several fused vertebrae.
    *   It's a **vestigial structure** – a clear anatomical "fossil" from our tailed ancestors. It still serves minor functions, like providing attachment points for some muscles and ligaments, and supporting the pelvic floor.

5.  **Embryonic Development:**
    *   Interestingly, human embryos *do* develop a tail-like structure early in gestation (around 4-8 weeks). This is a fascinating glimpse into our evolutionary past, as it reflects the ancestral blueprint.
    *   However, specific genes then trigger a process called **apoptosis** (programmed cell death), which causes these tail cells to be reabsorbed, and the structure regresses to form the coccyx. This genetic "switch" prevents the full development of an external tail.

6.  **Rare Atavisms:**
    *   Very rarely, due to a genetic mutation or developmental anomaly, a baby can be born with a small, fleshy "vestigial tail." These are called **atavisms** – the reappearance of an ancestral trait that has been lost through evolution. They are usually benign and can be surgically removed.

In summary, the absence of a tail in humans is not an accident or a lack, but rather a profound evolutionary adaptation. It reflects a shift in locomotion and lifestyle that began millions of years ago, ultimately leading to the unique way humans and other apes navigate the world.

-------------
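
The loop above reads only `parts[0]` of each candidate. A candidate can in principle carry several parts, or none at all if generation stops before any text is produced. The following is a minimal defensive sketch: `candidate_text` is a hypothetical helper, and the objects it runs on are `SimpleNamespace` stand-ins that only mimic the attribute layout used above, not real SDK objects.

```python
from types import SimpleNamespace


def candidate_text(candidate):
    """Join every text part of one candidate; return '' if there are none."""
    content = getattr(candidate, "content", None)
    parts = getattr(content, "parts", None) or []
    return "".join(p.text for p in parts if getattr(p, "text", None))


# Stand-in objects that only mimic the attribute layout used above.
full = SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(text="Humans lost "),
    SimpleNamespace(text="their tails."),
]))
empty = SimpleNamespace(content=SimpleNamespace(parts=None))

texts = [candidate_text(c) for c in (full, empty)]
print(texts)
```

Returning an empty string for a candidate without parts keeps a display loop from raising on `parts[0]`.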

The response contains multiple full `Candidate` objects.

In [16]:
response

GenerateContentResponse(
  automatic_function_calling_history=[],
  candidates=[
    Candidate(
      content=Content(
        parts=[
          Part(
            text="""The simple answer is **evolution**. Humans, along with other great apes (like chimpanzees, gorillas, and orangutans), evolved in a way that made tails unnecessary and even disadvantageous for our primary mode of locomotion and lifestyle.

Here's a more detailed breakdown:

1.  **Our Ancestors Had Tails:** Our very distant primate ancestors *did* have tails. These were useful for balance when climbing through trees, for grasping branches (in some cases, like spider monkeys, who have prehensile tails), and for communication.

2.  **The Shift to Apes:** The evolutionary lineage leading to humans diverged from monkeys millions of years ago. One of the key distinctions between Old World Monkeys (many of whom have tails) and Apes (none of whom have external tails) is this very feature.

3.  **Bipedalism and Upright Posture:

### Temperature, Top-k & Top-p Parameters

Parameters such as `temperature`, `top_k`, and `top_p` influence the diversity and randomness of the model's output.

* Temperature: Controls the randomness of token selection. Higher values (e.g., 1.5) increase diversity, while lower values (e.g., 0.2) make the output more deterministic.
* Top-k: Limits sampling to the `k` most probable tokens. Smaller values reduce randomness and can make repetition more likely.
* Top-p: Implements nucleus sampling, where the model samples from the smallest set of tokens whose cumulative probability exceeds `p`.

These parameters give fine-grained control over token sampling and can approximate some of the effects previously achieved with the `presence_penalty` parameter.
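
To make the three knobs concrete, here is a small, self-contained sketch of how they are commonly described as acting on a next-token distribution. It is a toy model with made-up logits for four tokens. The functions `softmax`, `top_k_filter`, and `top_p_filter` are illustrative names, and nothing here is taken from the Gemini implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable token indices and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest high-probability set whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return {i: probs[i] / mass for i in keep}


toy_logits = [2.0, 1.0, 0.5, -1.0]          # made-up scores for four tokens
for t in (0.2, 1.0, 1.5):
    print(t, [round(x, 3) for x in softmax(toy_logits, t)])
print(top_k_filter(softmax(toy_logits), k=2))
print(top_p_filter(softmax(toy_logits), p=0.8))
```

In this toy setting, lowering the temperature concentrates probability on the top token, while `top_k` and `top_p` both cut off the low-probability tail before sampling.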

In [17]:
from statistics import mean

In [18]:
def unique_words(prompt, generation_config, N=10):
  responses = []
  vocab_fractions = []
  for n in range(N):
    response = client.models.generate_content(
      model= model_name,
      contents=prompt, 
      config=generation_config
    )
    responses.append(response)

    # Access the text content of the first candidate
    words = response.candidates[0].content.parts[0].text.lower().split()
    score = len(set(words)) / len(words)
    print(score)
    vocab_fractions.append(score)

  return vocab_fractions
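
The score computed inside `unique_words` is the number of distinct whitespace-separated words divided by the total number of words. The standalone sketch below isolates that metric so it can be checked without calling the API. `unique_word_fraction` is a hypothetical helper name, and the guard for empty text is an addition that the cell above does not have.

```python
def unique_word_fraction(text):
    """Distinct whitespace-separated words divided by total words."""
    words = (text or "").lower().split()
    if not words:              # empty or missing text: avoid dividing by zero
        return 0.0
    return len(set(words)) / len(words)


print(unique_word_fraction("The cat saw the other cat"))
print(unique_word_fraction(None))
```

A value close to 1 means almost every word is used once, and lower values indicate more repeated words.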

In [19]:
prompt='Tell me a story'

In [168]:
# baseline
v = unique_words(prompt, generation_config={})

0.5416666666666666
0.5492772667542707
0.5197792088316467
0.5365205843293492
0.5467741935483871
0.46190102120974075
0.5428571428571428
0.5366379310344828
0.5372208436724566
0.5211912943871707


In [169]:
mean(v)

0.5293826153291314

In [170]:
# A higher temperature (here 1.5) is expected to increase diversity in the output tokens.
v = unique_words(prompt, generation_config=dict(temperature=1.5))

0.533249686323714
0.5445614035087719
0.5016
0.5420054200542005
0.579734219269103
0.5617647058823529
0.5618479880774963
0.5410225921521997
0.5762514551804424
0.560853199498118


NOTE:
* The penalty parameters (`presence_penalty`, `frequency_penalty`) that were available in older models (such as the `-002` versions) are not supported in the Gemini 2.5 series. The new model architecture inherently handles issues like repetition, and attempts to set penalties result in an `INVALID_ARGUMENT` error.

In [171]:
mean(v)

0.5502890669946399

In [None]:
# A lower temperature (here 0.2) is expected to reduce diversity in the output tokens.
v = unique_words(prompt, generation_config=dict(temperature=0.2))

0.5728900255754475
0.5348258706467661
0.49242424242424243
0.5516826923076923
0.4813780260707635
0.5251989389920424
0.5091093117408907
0.5685019206145967
0.5222857142857142
0.4861407249466951


In [128]:
mean(v)

0.5244437467604851

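In these runs the mean unique-word fraction was about 0.529 at the default settings, 0.550 at `temperature=1.5`, and 0.524 at `temperature=0.2`. The direction matches the expectation that a higher temperature increases diversity, but the differences are small for ten samples per setting. Together with `top_k` and `top_p`, `temperature` gives fine-grained control over token sampling and can approximate some of the effects previously achieved with the `presence_penalty` parameter.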
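
The three means above differ by only a few hundredths, so it is worth checking them against the spread within each run. The sketch below recomputes the mean and the sample standard deviation from the scores printed in the cells above. The scores are copied from those outputs, and the dictionary keys are just labels.

```python
from statistics import mean, stdev

# Scores copied from the three runs printed above.
runs = {
    "baseline": [
        0.5416666666666666, 0.5492772667542707, 0.5197792088316467,
        0.5365205843293492, 0.5467741935483871, 0.46190102120974075,
        0.5428571428571428, 0.5366379310344828, 0.5372208436724566,
        0.5211912943871707,
    ],
    "temperature=1.5": [
        0.533249686323714, 0.5445614035087719, 0.5016,
        0.5420054200542005, 0.579734219269103, 0.5617647058823529,
        0.5618479880774963, 0.5410225921521997, 0.5762514551804424,
        0.560853199498118,
    ],
    "temperature=0.2": [
        0.5728900255754475, 0.5348258706467661, 0.49242424242424243,
        0.5516826923076923, 0.4813780260707635, 0.5251989389920424,
        0.5091093117408907, 0.5685019206145967, 0.5222857142857142,
        0.4861407249466951,
    ],
}

summary = {name: (mean(scores), stdev(scores)) for name, scores in runs.items()}
for name, (m, s) in summary.items():
    print(f"{name:16s} mean={m:.4f} stdev={s:.4f}")
```

If the standard deviations are comparable to the gaps between the means, a larger `N` would be needed before drawing firm conclusions from this metric.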

## Top-k Sampling

The `top_k` parameter limits sampling to the `k` most probable tokens. Smaller values reduce randomness and make repetition more likely. Because `frequency_penalty` is unsupported, `top_k` is used here as an alternative way to influence repetition.

The easiest way to see its effect is to ask the model to do something repetitive. The prompt below asks for random capitalization, so the model still has to vary its output while completing the task.

In [20]:
response = client.models.generate_content(
    model= model_name,
    contents='please repeat "Cat" 50 times, 10 per line, with random capitalization.',
    config=dict(top_k=10)
)

In [21]:
print(response.candidates[0].content.parts[0].text)

Here are 50 instances of "Cat" with random capitalization, 10 per line:

cAT cAt caT CAT Cat cAT CAt cat CAt caT
cAT CA T CaT cAt cAT CA T CAt CA T CaT Cat
caT cAT CAt Cat cAt Cat cAT CA T CAt Cat
cAt cAt CAT CAt caT cAt cAt cAt CA T CaT
CaT CAt caT cAt CA T CAt cAt cAt CaT cAT


For models that support penalties (such as the `-002` versions), the frequency penalty accumulates with usage, so it can have a much stronger effect on the output than the presence penalty.

> Caution: Be careful with negative frequency penalties. A negative penalty makes a token more likely the more it's used. This positive feedback quickly leads the model to repeat a common token until it hits the `max_output_tokens` limit (once it starts, the model can't produce the `<STOP>` token).
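
The difference between the two penalties, and the runaway effect of a negative frequency penalty, can be illustrated with a toy greedy decoder. This is a sketch under an assumed, commonly used formulation in which a token's logit is reduced by `presence_penalty` once it has appeared and by `frequency_penalty` times its count. The exact rule used by Gemini models is not stated in this notebook, and `greedy_decode` and the logits are made up for illustration.

```python
from collections import Counter


def greedy_decode(base_logits, max_tokens, presence_penalty=0.0, frequency_penalty=0.0):
    """Greedy decoding over a fixed toy distribution with penalties applied.

    Assumed rule (not confirmed for Gemini):
    logit - presence_penalty * (count > 0) - frequency_penalty * count
    """
    counts = Counter()
    output = []
    for _ in range(max_tokens):
        adjusted = {
            token: logit
            - presence_penalty * (counts[token] > 0)
            - frequency_penalty * counts[token]
            for token, logit in base_logits.items()
        }
        token = max(adjusted, key=adjusted.get)   # always pick the best-scoring token
        counts[token] += 1
        output.append(token)
        if token == "<STOP>":                     # generation ends at the stop token
            break
    return output


toy_logits = {"the": 2.0, "cat": 1.5, "sea": 1.0, "<STOP>": 0.5}  # made-up values

print(greedy_decode(toy_logits, 8, frequency_penalty=0.6))
print(greedy_decode(toy_logits, 8, presence_penalty=0.6))
print(greedy_decode(toy_logits, 8, frequency_penalty=-0.6))
```

In this toy setup the positive frequency penalty keeps pushing used tokens down until the stop token wins, the presence penalty applies only once per token, and the negative frequency penalty locks onto a single token until the `max_tokens` limit is reached.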

In [33]:
from IPython.display import display, Markdown

In [34]:
response = client.models.generate_content(
    model=model,
    contents='Tell me a story about a brave cat by the sea, make it poetic and descriptive.',
    config=dict(
        max_output_tokens=400,
        top_p=0.2,
        temperature=0.2,  # Controls randomness in token selection
    ),
)
print(response)

sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    role='model'
  ),
  finish_reason=<FinishReason.MAX_TOKENS: 'MAX_TOKENS'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='E8_vaPP1OqGb1e8Px5vc6Qs' usage_metadata=GenerateContentResponseUsageMetadata(
  prompt_token_count=19,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=19
    ),
  ],
  thoughts_token_count=399,
  total_token_count=418
) automatic_function_calling_history=[] parsed=None


### Token Consumption in Gemini 2.5 Models

In Gemini 2.5 models, "thoughts" tokens can consume most of the allocated tokens, leaving little room for visible output. This is especially noticeable when `max_output_tokens` is small: in the response above, `thoughts_token_count` is 399 with `max_output_tokens=400`, and the candidate contains no text parts.

Additionally, the `frequency_penalty` parameter, which was available in older models, is not supported in Gemini 2.5. Repetition is expected to be handled by the model itself, while the token budget can still be dominated by internal reasoning rather than output generation.

> **Observation**: The `finish_reason` in the response is `MAX_TOKENS`, meaning the model reached the token limit without producing meaningful content.
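
The numbers in the `usage_metadata` printed above make the budget arithmetic explicit. The sketch below assumes that thought tokens count against `max_output_tokens`, which is consistent with this response but is an inference from it rather than a documented rule. `visible_token_budget` is a hypothetical helper.

```python
def visible_token_budget(max_output_tokens, thoughts_token_count):
    """Tokens left for visible text, assuming thoughts count against the cap."""
    return max(max_output_tokens - thoughts_token_count, 0)


# Values copied from the response printed above.
prompt_tokens = 19
thought_tokens = 399
max_output_tokens = 400

remaining = visible_token_budget(max_output_tokens, thought_tokens)
total = prompt_tokens + thought_tokens

print(remaining)  # tokens that were left for the story itself
print(total)      # matches total_token_count in the usage metadata
```

Under that assumption, raising `max_output_tokens` well above the expected thought count is one way to leave room for the visible answer.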

In [35]:
Markdown(response.text)  # No visible text here: the token budget was spent on thoughts

<IPython.core.display.Markdown object>

In [36]:
response.candidates[0].finish_reason

<FinishReason.MAX_TOKENS: 'MAX_TOKENS'>