##### Copyright 2025 Patrick Loeber

In [None]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Workshop: Build with Gemini (Part 3)

<a target="_blank" href="https://colab.research.google.com/github/patrickloeber/workshop-build-with-gemini/blob/main/notebooks/part-3-thinking-and-tools.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This workshop teaches how to build with Gemini using the Gemini API and Python SDK.

Course outline:

- **[Part 1: Quickstart + Text prompting](https://github.com/patrickloeber/workshop-build-with-gemini/blob/main/notebooks/part-1-text-prompting.ipynb)**

- **[Part 2: Multimodal understanding (image, video, audio, docs, code)](https://github.com/patrickloeber/workshop-build-with-gemini/blob/main/notebooks/part-2-multimodal-understanding.ipynb)**

- **Part 3 (this notebook): Thinking models + agentic capabilities (tool usage)**
  - Thinking models
  - Structured outputs
  - Code execution
  - Grounding with Google Search
  - Function calling
  - Final exercise: Give Gemini access to the PokéAPI to answer Pokémon questions

## 0. Use Google AI Studio as a playground

Explore and play with all models in the [Google AI Studio](https://aistudio.google.com/apikey).

## 1. Setup

Get a free API key in [Google AI Studio](https://aistudio.google.com/apikey) and set up the [Google Gen AI Python SDK](https://github.com/googleapis/python-genai).

In [1]:
%pip install -U -q google-genai

In [2]:
from google.colab import userdata

GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
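
The cell above only works inside Google Colab. As a minimal sketch for running the notebook elsewhere, the helper below tries Colab secrets first and falls back to an environment variable. The function name `load_api_key` is an assumption made for illustration; it is not part of the SDK.

```python
import os


def load_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from Colab secrets if available, else from the environment."""
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata

        key = userdata.get(name)
    except Exception:
        # Outside Colab, or if the secret cannot be read: use an environment variable.
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Set {name} as a Colab secret or as an environment variable.")
    return key
```

With this in place, `GOOGLE_API_KEY = load_api_key()` could replace the Colab-specific lookup.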

In [3]:
from google import genai
from google.genai import types

client = genai.Client(api_key=GOOGLE_API_KEY)

## Thinking models

Starting with Gemini 2.5, all models have thinking capabilities. These models use an internal "thinking process" during response generation. This process contributes to their improved reasoning capabilities and allows them to solve complex tasks, particularly complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents.

Thinking models are also great at working with tools to perform actions beyond generating text. This allows them to interact with external systems, execute code, or access real-time information, incorporating the results into their reasoning and final response.

(Note: Tools are also available with Gemini 2.0 models.)

In [None]:
# 2.5 Pro
MODEL = "gemini-2.5-pro-exp-03-25"  # with paid tier: gemini-2.5-pro-preview-03-25

# 2.5 Flash
# MODEL = "gemini-2.5-flash-preview-04-17"

In [6]:
response = client.models.generate_content(
    model=MODEL,
    contents="If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?"
)

print(response.text)

It still takes **5 minutes** to boil three eggs.

You can boil them all at the same time in the same pot of water. The cooking time for each egg doesn't change based on how many others are in the pot (as long as the pot is big enough and the water stays boiling).


## **!! Exercise !!** ##

- Go to [Google AI Studio](https://ai.dev/?model=gemini-2.5-pro-preview-03-25), use Gemini 2.5 Pro, give it a complex task, and observe the thinking process. For example, create a p5js game in one shot:

```
Make a p5js soccer game simulation. There should be 2 teams and each player on the team should have their path traveled displayed. Add live stats on the right side and score in the top bar. no HTML
```

## Structured output

Gemini generates unstructured text by default, but some applications require structured text. For these use cases, you can constrain Gemini to respond with JSON, a structured data format suitable for automated processing. You can also constrain the model to respond with one of the options specified in an enum.

In [None]:
from pydantic import BaseModel

class Recipe(BaseModel):
  recipe_name: str
  ingredients: list[str]

response = client.models.generate_content(
    model=MODEL,
    contents='List three popular cookie recipes. Be sure to include the amounts of ingredients.',
    config={
        'response_mime_type': 'application/json',
        'response_schema': list[Recipe],
    },
)
# Use the response as a JSON string.
print(response.text)

# Use instantiated objects.
my_recipes: list[Recipe] = response.parsed
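
If you prefer to handle the JSON string yourself, for example to add your own checks, the response text can be parsed with the standard library. The sketch below assumes the model returned a JSON array matching the `Recipe` schema. The `RecipeRecord` dataclass, the `parse_recipes` helper, and the sample string are made up for illustration and stand in for `response.text`.

```python
import json
from dataclasses import dataclass


@dataclass
class RecipeRecord:
    recipe_name: str
    ingredients: list[str]


def parse_recipes(raw_json: str) -> list[RecipeRecord]:
    """Parse a JSON array of recipes and check that the expected fields exist."""
    items = json.loads(raw_json)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of recipes")
    recipes = []
    for item in items:
        # A missing key raises KeyError, which surfaces schema mismatches early.
        recipes.append(
            RecipeRecord(
                recipe_name=str(item["recipe_name"]),
                ingredients=[str(entry) for entry in item["ingredients"]],
            )
        )
    return recipes


# Made-up sample standing in for response.text
sample_text = (
    '[{"recipe_name": "Chocolate Chip Cookies",'
    ' "ingredients": ["2 cups flour", "1 cup sugar"]}]'
)
recipes = parse_recipes(sample_text)
print(recipes[0].recipe_name, len(recipes[0].ingredients))
```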

Constrain to enums:

In [7]:
response = client.models.generate_content(
    model=MODEL,
    contents='What type of food is a banana?',
    config={
        'response_mime_type': 'text/x.enum',
        'response_schema': {
            "type": "STRING",
            "enum": ["froot", "vegetable", "grains", "protein foods", "dairy"],
        },
    },
)

print(response.text)

froot


Or use the built-in Python enum class:

In [8]:
import enum

class FOOD(enum.Enum):
  FROOT = "froot"
  VEGETABLE = "vegetable"
  GRAINS = "grains"
  PROTEIN_FOODS = "protein foods"
  DAIRY = "dairy"

response = client.models.generate_content(
    model=MODEL,
    contents='What type of food is cheese?',
    config={
        'response_mime_type': 'text/x.enum',
        'response_schema': FOOD,
    },
)

print(response.text)

dairy
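
Since `response.text` is a plain string, downstream code may want to turn it back into an enum member. Here is one way this could look. `parse_enum` is a hypothetical helper, and `Category` is a shortened stand-in for the enum above.

```python
import enum
from typing import Optional, Type, TypeVar

E = TypeVar("E", bound=enum.Enum)


def parse_enum(enum_cls: Type[E], text: str) -> Optional[E]:
    """Map response text to an enum member by value; return None if nothing matches."""
    try:
        # Strip whitespace first, since responses can end with a newline.
        return enum_cls(text.strip())
    except ValueError:
        return None


class Category(enum.Enum):  # hypothetical, shortened stand-in for the enum above
    VEGETABLE = "vegetable"
    DAIRY = "dairy"


print(parse_enum(Category, "dairy\n"))
print(parse_enum(Category, "unknown"))
```

Returning `None` instead of raising keeps the decision about unexpected values with the caller.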


## Code execution

The code execution feature enables the model to generate and run Python code and learn iteratively from the results until it arrives at a final output. You can use this code execution capability to build applications that benefit from code-based reasoning and that produce text output. For example, you could use code execution in an application that solves equations or processes text.

In [10]:
from google.genai import types

# In your prompt, instruct the model to generate and run code

response = client.models.generate_content(
  model=MODEL,
  contents='What is the sum of the first 50 prime numbers? '
           'Generate and run code for the calculation.',
  config=types.GenerateContentConfig(
    tools=[types.Tool(
      code_execution=types.ToolCodeExecution()
    )]
  )
)

In [11]:
response

GenerateContentResponse(candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text="Okay, I can help with that.\n\nHere's the plan:\n1.  Define a function `is_prime(n)` to check if a given number `n` is prime.\n2.  Initialize a counter for primes found and a variable for the sum.\n3.  Iterate through numbers starting from 2.\n4.  If a number is prime, add it to the sum and increment the prime counter.\n5.  Stop when 50 primes have been found.\n6.  Print the final sum.\n\nHere is the Python code to perform the calculation:\n"), Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=ExecutableCode(code='import math\n\ndef is_prime(n):\n    """Checks if a number n is prime."""\n    if n <= 1:\n        return False\n    if n == 2:\n        return True\n    if n % 2 == 0:\n        return False\n    # Check

In [12]:
import html

from IPython.display import Image, Markdown, Code, HTML, display

def display_code_execution_result(response):
  for part in response.candidates[0].content.parts:
    if part.text is not None:
      display(Markdown(part.text))
    if part.executable_code is not None:
      # Escape the generated code so characters like "<" render literally; the style sets the background color
      code_html = f'<pre style="background-color: #6a0cad;">{html.escape(part.executable_code.code)}</pre>'
      display(HTML(code_html))
    if part.code_execution_result is not None:
      display(Markdown("#### Output"))
      display(Markdown(part.code_execution_result.output))
    if part.inline_data is not None:
      display(Image(data=part.inline_data.data, format="png"))
    display(Markdown("---"))

display_code_execution_result(response)

Okay, I can help with that.

Here's the plan:
1.  Define a function `is_prime(n)` to check if a given number `n` is prime.
2.  Initialize a counter for primes found and a variable for the sum.
3.  Iterate through numbers starting from 2.
4.  If a number is prime, add it to the sum and increment the prime counter.
5.  Stop when 50 primes have been found.
6.  Print the final sum.

Here is the Python code to perform the calculation:


---

---

#### Output

The sum of the first 50 prime numbers is: 5117


---

**Findings:**

Based on the executed code:
The sum of the first 50 prime numbers is **5117**.

---
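
The reported result is easy to verify locally. This is an independent sketch, not the code the model generated. It uses trial division and adds up primes until the requested count is reached.

```python
def is_prime(n: int) -> bool:
    """Trial division; sufficient for the small numbers involved here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def sum_first_primes(count: int) -> int:
    """Add up the first `count` prime numbers."""
    total, found, candidate = 0, 0, 2
    while found < count:
        if is_prime(candidate):
            total += candidate
            found += 1
        candidate += 1
    return total


print(sum_first_primes(50))  # 5117
```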

## Grounding with Google Search

If Google Search is configured as a tool, Gemini can decide when to use Google Search to improve the accuracy and recency of responses.

Here's a question about a recent event without Google Search:



In [16]:
response = client.models.generate_content(
    model=MODEL,
    contents="Who won the super bowl in 2025?",
)

print(response.text)

The Super Bowl in 2025 (Super Bowl LIX) hasn't happened yet!

It is scheduled to be played on **February 9, 2025**, at the Caesars Superdome in New Orleans, Louisiana. It will determine the champion of the 2024 NFL season.

We'll have to wait until then to find out who wins!


In [17]:
from google.genai.types import Tool, GenerateContentConfig, GoogleSearch

google_search_tool = Tool(
    google_search = GoogleSearch()
)

response = client.models.generate_content(
    model=MODEL,
    contents="Who won the super bowl in 2025?",
    config=GenerateContentConfig(
        tools=[google_search_tool],
        response_modalities=["TEXT"],
    )
)

In [18]:
for part in response.candidates[0].content.parts:
    print(part.text)

The **Philadelphia Eagles** won Super Bowl LIX in 2025.

Here are some details about the game:

*   **Date:** February 9, 2025
*   **Location:** Caesars Superdome, New Orleans, Louisiana
*   **Matchup:** Philadelphia Eagles (NFC Champion) vs. Kansas City Chiefs (AFC Champion and two-time defending Super Bowl champion)
*   **Final Score:** Philadelphia Eagles 40, Kansas City Chiefs 22
*   **Outcome:** The Eagles secured their second Super Bowl title in franchise history, preventing the Chiefs from achieving an unprecedented three consecutive Super Bowl wins.
*   **MVP:** Eagles quarterback Jalen Hurts was named Super Bowl MVP. He threw for 221 yards and two touchdowns, and rushed for 72 yards and another touchdown.


In [19]:
# To get grounding metadata as web content.
HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content)
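
Besides the rendered search widget, the grounding metadata can also list the web sources behind the answer. The sketch below assumes the metadata exposes `grounding_chunks` whose entries carry `web.title` and `web.uri`. Check these field names against your installed SDK version. `list_grounding_sources` is a hypothetical helper, and the `SimpleNamespace` object only imitates a response so the example runs offline.

```python
from types import SimpleNamespace


def list_grounding_sources(response) -> list[tuple[str, str]]:
    """Return (title, uri) pairs for web sources attached to the first candidate."""
    metadata = getattr(response.candidates[0], "grounding_metadata", None)
    chunks = getattr(metadata, "grounding_chunks", None) or []
    sources = []
    for chunk in chunks:
        web = getattr(chunk, "web", None)
        if web is not None:
            sources.append((web.title, web.uri))
    return sources


# Stand-in object for illustration only; a real call would pass the API response.
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            grounding_metadata=SimpleNamespace(
                grounding_chunks=[
                    SimpleNamespace(
                        web=SimpleNamespace(
                            title="example.com",
                            uri="https://example.com/article",
                        )
                    )
                ]
            )
        )
    ]
)
print(list_grounding_sources(fake_response))
```

The `getattr` fallbacks mean the helper returns an empty list when a response was not grounded.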

#### **!! Exercise !!**

Use Gemini with Google Search to get the current weather and the forecast for the coming weekend in Berlin.

In [20]:
from google.genai.types import Tool, GenerateContentConfig, GoogleSearch

google_search_tool = Tool(
    google_search = GoogleSearch()
)

response = client.models.generate_content(
    model=MODEL,
    contents="Get the current weather in Berlin. Also get the forecast for the weekend",
    config=GenerateContentConfig(
        tools=[google_search_tool],
        response_modalities=["TEXT"],
    )
)

for part in response.candidates[0].content.parts:
    print(part.text)

Based on the search results, here is the current weather and weekend forecast for Berlin:

**Current Weather in Berlin:**

*   The current temperature is around 11-15°C.
*   Conditions are currently cloudy or a mix of sun and clouds. Some sources mention possible showers, but generally dry.
*   The temperature is expected to rise to a high of about 19°C today.
*   The wind is generally weak.

**Weekend Weather Forecast for Berlin:**

*   **Saturday:** The weather is expected to be lightly clouded or a mix of sun and clouds. Maximum temperatures are forecasted to be around 13°C to 16°C, with minimums around 3°C to 9°C. There is a low chance of precipitation.
*   **Sunday:** Similar conditions to Saturday are expected, with partly cloudy skies or a mix of sun and clouds. Temperatures might be slightly warmer, with highs around 15°C to 16°C and lows around 4°C to 5°C. There is also a low chance of precipitation.

Please note that weather forecasts can change, especially specific details l

## Function calling

Function calling lets you connect models to external tools and APIs. Instead of generating text responses, the model understands when to call specific functions and provides the necessary parameters to execute real-world actions.

In [21]:
from google.genai import types

# Define the function declaration for the model
weather_function = {
    "name": "get_current_temperature",
    "description": "Gets the current temperature for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name",
            },
        },
        "required": ["location"],
    },
}

# Configure the client and tools
tools = types.Tool(function_declarations=[weather_function])

# Send request with function declarations
response = client.models.generate_content(
    model=MODEL,
    contents="What's the temperature in London?",
    config=types.GenerateContentConfig(tools=[tools])
)

Check for a function call:

In [22]:
if response.candidates[0].content.parts[0].function_call:
    function_call = response.candidates[0].content.parts[0].function_call
    print(f"Function to call: {function_call.name}")
    print(f"Arguments: {function_call.args}")
    #  In a real app, you would call your function here:
    #  result = get_current_temperature(**function_call.args)
else:
    print("No function call found in the response.")
    print(response.text)

Function to call: get_current_temperature
Arguments: {'location': 'London'}
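
The model only proposes the call; your application decides whether and how to run it. A minimal dispatch sketch could look like this. It assumes the function call object exposes `name` and `args` as printed above. The stub implementation, the `AVAILABLE_FUNCTIONS` registry, and the `dispatch` helper are hypothetical.

```python
from types import SimpleNamespace


def get_current_temperature(location: str) -> dict:
    """Hypothetical stub; a real app would query a weather service here."""
    return {"location": location, "temperature": 25, "unit": "Celsius"}


# Registry of the functions the model is allowed to trigger.
AVAILABLE_FUNCTIONS = {"get_current_temperature": get_current_temperature}


def dispatch(function_call) -> dict:
    """Look up the requested function by name and call it with the model's arguments."""
    handler = AVAILABLE_FUNCTIONS.get(function_call.name)
    if handler is None:
        raise ValueError(f"Model requested an unknown function: {function_call.name}")
    return handler(**dict(function_call.args))


# Stand-in for response.candidates[0].content.parts[0].function_call
fake_call = SimpleNamespace(name="get_current_temperature", args={"location": "London"})
result = dispatch(fake_call)
print(result)
```

An explicit registry keeps the model from invoking anything you did not intend to expose. The result would then typically be sent back to the model in a follow-up request so it can phrase the final answer.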


### Automatic Function Calling (Python Only)

When using the Python SDK, you can provide Python functions directly as tools.

The SDK handles the function call and returns the final text.

In [23]:
# Define the function with type hints and docstring
def get_current_temperature(location: str) -> dict:
    """Gets the current temperature for a given location.

    Args:
        location: The city and country, e.g. San Francisco, USA

    Returns:
        A dictionary containing the temperature and unit.
    """
    # ... (implementation) ...
    return {"temperature": 25, "unit": "Celsius"}


response = client.models.generate_content(
    model=MODEL,
    contents="What's the temperature in Boston?",
    config=types.GenerateContentConfig(
        tools=[get_current_temperature],
        # To disable automatic function calling, you can set this:
        # automatic_function_calling=types.AutomaticFunctionCallingConfig(disable=True)
    )
)

print(response.text)

The current temperature in Boston, USA is 25 degrees Celsius.


Check the function calling history:

In [24]:
for content in response.automatic_function_calling_history:
    for part in content.parts:
        if part.function_call:
            print(part.function_call)

id=None args={'location': 'Boston, USA'} name='get_current_temperature'
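
To see why type hints and a docstring matter here, it helps to look at how a declaration like the manual `weather_function` dictionary could be derived from a plain Python function. The following is a simplified illustration, not the SDK's actual implementation. `sketch_declaration` and its type mapping are assumptions for this example.

```python
import inspect

# Simplified mapping from Python annotations to JSON-schema-style type names.
_TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_declaration(func) -> dict:
    """Build a declaration-like dict from a function's signature and docstring."""
    signature = inspect.signature(func)
    properties = {}
    required = []
    for name, param in signature.parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _TYPE_NAMES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_current_temperature(location: str) -> dict:
    """Gets the current temperature for a given location."""
    return {"temperature": 25, "unit": "Celsius"}


declaration = sketch_declaration(get_current_temperature)
print(declaration)
```

Without the annotation and the docstring, the model would have much less to go on when deciding whether and how to call the function.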


## Exercise: Get Pokémon stats

- Define a function that can work with the PokéAPI and get Pokémon stats.
- Endpoint to use: `GET https://pokeapi.co/api/v2/pokemon/<pokemon_name>`
- Call Gemini and give it access to the function, then answer questions like: `"What stats does the Pokemon Squirtle have?"`


In [25]:
import requests

def get_pokemon_info(pokemon: str) -> dict:
    """Gets pokemon info for a given pokemon name.

    Args:
        pokemon: The name of the pokemon.

    Returns:
        A dictionary containing the info.
    """
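    resp = requests.get(f"https://pokeapi.co/api/v2/pokemon/{pokemon.lower()}", timeout=10)
    # Fail with a clear HTTP error (e.g. 404 for an unknown name) before parsing JSON.
    resp.raise_for_status()
    return resp.json()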


response = client.models.generate_content(
    model=MODEL,
    contents="What stats does the Pokemon Squirtle have?",
    config=types.GenerateContentConfig(tools=[get_pokemon_info])
)

print(response.text)

Squirtle has the following base stats:
*   **HP**: 44
*   **Attack**: 48
*   **Defense**: 65
*   **Special Attack**: 50
*   **Special Defense**: 64
*   **Speed**: 43


In [26]:
for content in response.automatic_function_calling_history:
    for part in content.parts:
        if part.function_call:
            print(part.function_call)

id=None args={'pokemon': 'Squirtle'} name='get_pokemon_info'
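
The full PokéAPI response is large and also contains moves, sprites, and other fields the question does not need. One possible refinement is to return only the base stats from the tool function. The sketch below assumes the `stats` list layout of the `/pokemon` endpoint. `extract_base_stats` is a hypothetical helper, and the sample payload is a trimmed stand-in that reuses two values from the output above.

```python
def extract_base_stats(payload: dict) -> dict:
    """Reduce a PokéAPI /pokemon response to a {stat_name: base_stat} mapping."""
    return {
        entry["stat"]["name"]: entry["base_stat"]
        for entry in payload.get("stats", [])
    }


# Trimmed stand-in for resp.json(); the real response has many more fields.
sample_payload = {
    "name": "squirtle",
    "stats": [
        {"base_stat": 44, "effort": 0, "stat": {"name": "hp"}},
        {"base_stat": 48, "effort": 0, "stat": {"name": "attack"}},
    ],
}
print(extract_base_stats(sample_payload))
```

Returning `extract_base_stats(resp.json())` from `get_pokemon_info` would keep the function response that is passed back to the model small.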


## Recap & Next steps

Awesome work! You learned about thinking models with advanced reasoning capabilities and how to combine Gemini with tools for agentic use cases.

More helpful resources:

- [Thinking docs](https://ai.google.dev/gemini-api/docs/thinking)
- [Structured output docs](https://ai.google.dev/gemini-api/docs/structured-output?lang=python)
- [Code execution docs](https://ai.google.dev/gemini-api/docs/code-execution?lang=python)
- [Grounding docs](https://ai.google.dev/gemini-api/docs/grounding?lang=python)
- [Function calling docs](https://ai.google.dev/gemini-api/docs/function-calling?example=weather)

🎉🎉**Congratulations, you completed the workshop!**🎉🎉

**Next steps**: There's even more you can do with Gemini that we didn't cover in this workshop:

- [Image creation and editing with Gemini 2.0](https://github.com/patrickloeber/genai-tutorials/blob/main/notebooks/gemini-image-editing.ipynb)
- [Live API: Talk to Gemini and share your camera](https://aistudio.google.com/live) & [Live API cookbook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Get_started_LiveAPI.ipynb)
