In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Intro to Gemini 2.5 Flash-Lite

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2_5_flash_lite.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fgetting-started%2Fintro_gemini_2_5_flash_lite.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/getting-started/intro_gemini_2_5_flash_lite.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2_5_flash_lite.ipynb">
      <img width="32px" src="https://www.svgrepo.com/download/217753/github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>

| Authors |
| --- |
| [Eric Dong](https://github.com/gericdong) |
| [Holt Skinner](https://github.com/holtskinner) |

## Overview

Gemini 2.5 Flash-Lite is Google's most cost-effective Gemini 2.5 model yet, optimized for performance in high-volume workloads. Delivering higher performance than the previous 2.0 Flash and Flash-Lite models and significantly improved latency, it's ideal for tasks like classification, translation, intelligent routing, and other cost-sensitive, high-scale operations.

### Objectives

In this tutorial, you will learn how to use the Gemini API and the Google Gen AI SDK for Python with the Gemini 2.5 Flash-Lite model.

You will complete the following tasks:

- Generate text
- Control thinking budget
- View summarized thoughts
- Set model parameters and system instruction
- Use safety filters
- Use controlled generation
- Use search as a tool
- Use code execution

Gemini 2.5 Flash-Lite also supports other standard Gemini features, such as multimodal input, automatic and manual function calling, counting tokens, and chat completions. These features are not covered in this tutorial.


## Getting Started

### Install the Google Gen AI SDK for Python


In [None]:
%pip install --upgrade --quiet google-genai

### Authenticate your notebook environment

If you are running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Authenticate to Vertex AI on Google Cloud

You'll need to set up authentication by choosing **one** of the following methods:

1.  **Use a Google Cloud Project:** (Recommended for most users)
    - See instructions [Set up a project and development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)
2.  **Use a Vertex AI API Key (Express Mode):** For quick experimentation.
    - [Get an API Key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview)
    - See tutorial [Getting started with Gemini using Vertex AI in Express Mode](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_express.ipynb).

This tutorial uses a Google Cloud Project for authentication.

In [None]:
import os

PROJECT_ID = "[your-project-id]"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = "global"
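
When `GOOGLE_CLOUD_PROJECT` is unset, `str(os.environ.get(...))` evaluates to the literal string `"None"`, which only fails later when a request is sent. A minimal sketch of a stricter lookup, assuming you would rather fail early; `resolve_project_id` and `PLACEHOLDER` are hypothetical names, not part of the SDK:

```python
import os

PLACEHOLDER = "[your-project-id]"


def resolve_project_id(configured: str, env_var: str = "GOOGLE_CLOUD_PROJECT") -> str:
    """Return the configured project ID, falling back to an environment variable.

    Hypothetical helper: raises instead of returning the string "None"
    when neither source provides a value.
    """
    if configured and configured != PLACEHOLDER:
        return configured
    from_env = os.environ.get(env_var, "").strip()
    if not from_env:
        raise ValueError(f"Set PROJECT_ID or the {env_var} environment variable.")
    return from_env
```

In this notebook it could be called as `PROJECT_ID = resolve_project_id(PROJECT_ID)`.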

### Import libraries


In [None]:
from typing import List

from IPython.display import HTML, Markdown, display
from google import genai
from google.genai.types import (
    GenerateContentConfig,
    GoogleSearch,
    HarmBlockThreshold,
    HarmCategory,
    SafetySetting,
    ThinkingConfig,
    Tool,
    ToolCodeExecution,
)
from pydantic import BaseModel

### Connect to the generative AI service on Vertex AI

In [None]:
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

## Use Gemini 2.5 Flash-Lite

Learn more about the [Gemini 2.5 Flash-Lite model](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-lite). The model ID used in this notebook is `gemini-2.5-flash-lite-preview-06-17`.

In [None]:
MODEL_ID = "gemini-2.5-flash-lite-preview-06-17"  # @param {type: "string"}

### Send Your First Prompt

- Use the `generate_content` method to generate responses to your prompts. You can pass text and other multimodal input to `generate_content`.
- Use the `generate_content_stream` method to stream the response; the model returns chunks as soon as they are generated.
- Use the `.text` property to get the text content of the response. By default, Gemini outputs formatted text using [Markdown](https://daringfireball.net/projects/markdown/) syntax.

In [None]:
response = client.models.generate_content(
    model=MODEL_ID,
    contents="Why is the sky blue?",
)

display(Markdown(response.text))
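
The streaming method mentioned above is not demonstrated in this notebook. The sketch below shows one way it might be used, assuming `client` is the `genai.Client` created earlier; `stream_text` is a hypothetical helper name.

```python
def stream_text(client, model_id: str, prompt: str) -> str:
    """Stream a response, print chunks as they arrive, and return the full text.

    `client` is expected to behave like a `genai.Client`.
    """
    pieces = []
    for chunk in client.models.generate_content_stream(
        model=model_id, contents=prompt
    ):
        # A chunk may carry no text, so skip empty ones.
        if chunk.text:
            print(chunk.text, end="")
            pieces.append(chunk.text)
    return "".join(pieces)
```

In this notebook it could be called as `stream_text(client, MODEL_ID, "Why is the sky blue?")`.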

### Set Model Parameters and System Instruction

- You can set [Gemini API parameters](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#parameters) in each model request to control how the model generates a response.
- A [system instruction](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/system-instruction-introduction) gives the model additional context to understand the task, provide more customized responses, and adhere to guidelines throughout the user interaction.
- You can also configure [safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) to adjust how the model handles different content categories.


In [None]:
system_instruction = """
Persona: You are 'The Curator,' a sophisticated AI historian. Your tone is eloquent and engaging.
Task: Create a thematic travel itinerary that creatively combines the user's interests.
Rules:
- Use Markdown for all formatting.
- Each day must have a ### Theme:.
- Every day must include these exact bolded sections: Morning:, Afternoon:, Evening:, The Curator's Note:, and Required Attire:.
- Directly connect the itinerary to all of the user's stated interests.
"""

prompt = """
Destination: Florence, Italy
Duration: 2 Days
Interests: Medici family banking and the use of light/optics in Renaissance art.
Goal: Create an itinerary showing how money and light converged to create Florentine masterpieces.
"""

safety_settings = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    ),
]

response = client.models.generate_content(
    model=MODEL_ID,
    contents=prompt,
    config=GenerateContentConfig(
        system_instruction=system_instruction,
        temperature=0.7,
        top_p=0.95,
        candidate_count=1,
        max_output_tokens=2048,
        safety_settings=safety_settings,
    ),
)

display(Markdown(response.text))
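
A system instruction guides the model but does not guarantee compliance, so it can be worth checking the output against the rules you set. The sketch below is a rough sanity check for the bolded sections required above; `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, and the check looks at the whole text rather than each day separately.

```python
import re

REQUIRED_SECTIONS = [
    "Morning",
    "Afternoon",
    "Evening",
    "The Curator's Note",
    "Required Attire",
]


def missing_sections(markdown_text: str) -> list[str]:
    """Return required section labels that never appear in bold in the text."""
    missing = []
    for label in REQUIRED_SECTIONS:
        # Accept both **Label:** and **Label**: since colon placement can vary.
        pattern = r"\*\*\s*" + re.escape(label) + r":?\s*\*\*"
        if not re.search(pattern, markdown_text):
            missing.append(label)
    return missing
```

In this notebook it could be called as `missing_sections(response.text)`; an empty list means every label was found at least once.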

### Control Thinking Budget

Gemini 2.5 Flash-Lite is a hybrid reasoning model that can spend extra thinking on a task and use tools to maximize response accuracy. Unlike in Gemini 2.5 Flash, thinking is off by default in Gemini 2.5 Flash-Lite. You can set the optional `thinking_budget` parameter to control how much the model thinks for a given prompt.


- `thinking_budget=0` : Thinking OFF (default)
- `thinking_budget=-1`: Dynamic thinking
- `thinking_budget` between `512` and `24576`: Fixed thinking budget, in tokens

Then use the `generate_content` or `generate_content_stream` method to send a request to generate content with the `ThinkingConfig`.
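
The three accepted forms listed above can be checked locally before a request is sent. This is a minimal sketch based only on the values stated in this section; `validate_thinking_budget` is a hypothetical helper, and the service may enforce different limits for other models.

```python
def validate_thinking_budget(budget: int) -> int:
    """Return the budget unchanged if it matches one of the documented forms."""
    if budget in (0, -1):
        return budget  # 0 turns thinking off, -1 requests dynamic thinking
    if 512 <= budget <= 24576:
        return budget  # explicit budget within the documented range
    raise ValueError(
        f"thinking_budget must be 0, -1, or between 512 and 24576, got {budget}"
    )
```

The result can then be passed as `ThinkingConfig(thinking_budget=validate_thinking_budget(THINKING_BUDGET))`.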


In [None]:
THINKING_BUDGET = 1024  # @param {type: "integer"}

response = client.models.generate_content(
    model=MODEL_ID,
    contents="How many R's are in the word strawberry?",
    config=GenerateContentConfig(
        thinking_config=ThinkingConfig(
            include_thoughts=True,
            thinking_budget=THINKING_BUDGET,
        )
    ),
)

display(Markdown(response.text))

Optionally, you can print the `usage_metadata` and token counts from the model response.

In [None]:
print(f"prompt_token_count: {response.usage_metadata.prompt_token_count}")
print(f"candidates_token_count: {response.usage_metadata.candidates_token_count}")
print(f"thoughts_token_count: {response.usage_metadata.thoughts_token_count}")
print(f"total_token_count: {response.usage_metadata.total_token_count}")
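
The SDK types these counts as optional, so a field such as `thoughts_token_count` may be `None`, for example when thinking is off (an assumption worth verifying for your request). A small sketch that normalizes missing counts to zero; `summarize_usage` is a hypothetical helper:

```python
def summarize_usage(usage_metadata) -> dict[str, int]:
    """Collect the token counts into a dict, treating missing values as 0."""
    fields = (
        "prompt_token_count",
        "candidates_token_count",
        "thoughts_token_count",
        "total_token_count",
    )
    return {name: getattr(usage_metadata, name, None) or 0 for name in fields}
```

In this notebook it could be called as `summarize_usage(response.usage_metadata)`.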

### View Summarized Thoughts

You can set the `include_thoughts` flag to `True` to have the model return a summary of its "thoughts" in addition to the final answer.  When you make a request with this setting, the response will contain multiple parts.  You can check the `part.thought` field to identify which part contains the model's thoughts.

In [None]:
response = client.models.generate_content(
    model=MODEL_ID,
    contents="How many R's are in the word strawberry?",
    config=GenerateContentConfig(
        thinking_config=ThinkingConfig(include_thoughts=True, thinking_budget=-1)
    ),
)

for part in response.candidates[0].content.parts:
    if part.thought:
        display(Markdown(f"**Thoughts**: {part.text}"))
    else:
        display(Markdown(f"**Answer**: {part.text}"))

## Use Controlled Generation

Controlled generation allows you to define a response schema to specify the structure of the model's output, including field names and expected data types.  The model's output will strictly follow the provided schema.

In [None]:
class Recipe(BaseModel):
    name: str
    description: str
    ingredients: list[str]


class RecipeList(BaseModel):
    recipes: List[Recipe]


response = client.models.generate_content(
    model=MODEL_ID,
    contents="List a few popular cookie recipes and their ingredients.",
    config=GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=RecipeList,
    ),
)

recipe_list: RecipeList = response.parsed  # parsed into the response_schema type
for recipe in recipe_list.recipes:
    print(f"Recipe: {recipe.name}")
    print(f"Ingredients: {recipe.ingredients}")
    print("\n")
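
Because the response MIME type is `application/json`, the same data should also be available as a JSON string in `response.text`. The sketch below parses such a string with the standard library only; `RecipeRecord` and `parse_recipes` are hypothetical names, and the sample payload is written by hand for illustration, not real model output.

```python
import json
from dataclasses import dataclass


@dataclass
class RecipeRecord:
    name: str
    description: str
    ingredients: list[str]


def parse_recipes(json_text: str) -> list[RecipeRecord]:
    """Parse JSON shaped like RecipeList into plain dataclasses."""
    payload = json.loads(json_text)
    # Unexpected or missing keys raise TypeError, which acts as a light check.
    return [RecipeRecord(**item) for item in payload["recipes"]]


# Illustrative payload written by hand, not real model output.
sample = (
    '{"recipes": [{"name": "Sugar Cookies", '
    '"description": "Simple and sweet.", '
    '"ingredients": ["flour", "sugar", "butter"]}]}'
)
for record in parse_recipes(sample):
    print(record.name, len(record.ingredients))
```

With a live response, `parse_recipes(response.text)` would take the place of the hand-written sample.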

## Grounding with Google Search (Search as a Tool)

Google Search is available as a built-in tool in Gemini 2.5 Flash-Lite, a capability the 2.0 Flash-Lite model does not offer. With the tool enabled, the model decides when to use Google Search to improve the accuracy and recency of its responses.

In [None]:
google_search_tool = Tool(google_search=GoogleSearch())

response = client.models.generate_content(
    model=MODEL_ID,
    contents="What is the current temperature in Austin, TX?",
    config=GenerateContentConfig(
        tools=[google_search_tool],
    ),
)

display(Markdown(response.text))

HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content)
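
Besides the rendered search entry point, the grounding metadata can also list the web sources behind an answer. This is a sketch, assuming the `grounding_chunks` field and its `web.title` and `web.uri` attributes are populated for your request; `list_sources` is a hypothetical helper.

```python
def list_sources(response) -> list[tuple[str, str]]:
    """Return (title, uri) pairs for web sources cited by the first candidate."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return []
    metadata = getattr(candidates[0], "grounding_metadata", None)
    chunks = getattr(metadata, "grounding_chunks", None) or []
    sources = []
    for chunk in chunks:
        web = getattr(chunk, "web", None)
        # Skip chunks that are not web results or have no link.
        if web is not None and web.uri:
            sources.append((web.title or "", web.uri))
    return sources
```

In this notebook it could be called as `list_sources(response)` right after the search request.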

## Code Execution

The Gemini API's code execution feature enables the model to generate and run Python code.  The model can learn iteratively from the code's results to arrive at a final output.

In [None]:
code_execution_tool = Tool(code_execution=ToolCodeExecution())

response = client.models.generate_content(
    model=MODEL_ID,
    contents="Calculate the 20th Fibonacci number. Then find the nearest palindrome to it.",
    config=GenerateContentConfig(
        tools=[code_execution_tool],
        temperature=0,
    ),
)

display(
    Markdown(
        f"""
### Code
```py
{response.executable_code}
```
### Output
```
{response.code_execution_result}
```
"""
    )
)
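
When the model iterates, a response can contain several rounds of code and output. The sketch below walks the response parts in order instead of reading only the top-level properties; it assumes each part exposes `executable_code.code`, `code_execution_result.output`, or `text`, and `collect_code_steps` is a hypothetical helper.

```python
def collect_code_steps(response) -> list[tuple[str, str]]:
    """Return (kind, content) pairs in the order the model produced them.

    kind is one of "code", "output", or "text".
    """
    steps = []
    for part in response.candidates[0].content.parts:
        if getattr(part, "executable_code", None) is not None:
            steps.append(("code", part.executable_code.code))
        elif getattr(part, "code_execution_result", None) is not None:
            steps.append(("output", part.code_execution_result.output))
        elif getattr(part, "text", None):
            steps.append(("text", part.text))
    return steps
```

In this notebook it could be called as `collect_code_steps(response)` after the code execution request.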

## What's next

- Explore other notebooks in the [Google Cloud Generative AI GitHub repository](https://github.com/GoogleCloudPlatform/generative-ai).
- Explore AI models in [Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models).