##### Copyright 2024 Google LLC.

In [30]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Introduction to Gemini 2.0 Flash Thinking

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/gemini-2/thinking.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

[Gemini 2.0 Flash Thinking](https://ai.google.dev/gemini-api/docs/thinking-mode) is an experimental model that explicitly showcases its thoughts. Built on the speed and performance of Gemini 2.0 Flash, this model is trained to use thoughts in a way that leads to stronger reasoning capabilities.

You'll see examples of those reasoning capabilities with [code understanding](#scrollTo=GAa7sCD7tuMW), [geometry](#scrollTo=ADiJV-fFyjRe) and [math](#scrollTo=EXPPWpt6ttJZ) problems and for [generating questions](#scrollTo=dtBDPf4kAyG1) adapted to a specific level of knowledge.

As you will see, the model exposes its thoughts, so you can inspect its reasoning and see how it reached its conclusions.

## 0/ Setup

This section installs the SDK, sets it up using your [API key](../quickstarts/Authentication.ipynb), and imports the relevant libraries.

If you want to jump straight to the examples, collapse this section (click on the little arrow on the left of the title) and run it. Don't skip running it, otherwise nothing will work.

### Install SDK

The new **[Google Gen AI SDK](https://ai.google.dev/gemini-api/docs/sdks)** provides programmatic access to Gemini 2.0 (and previous models) using both the [Google AI for Developers](https://ai.google.dev/gemini-api/docs) and [Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/overview) APIs. With a few exceptions, code that runs on one platform will run on both. This means that you can prototype an application using the Developer API and then migrate the application to Vertex AI without rewriting your code.

You can find more details about this new SDK in the [documentation](https://ai.google.dev/gemini-api/docs/sdks) or in the [Getting started](../gemini-2/get_started.ipynb) notebook.

In [31]:
!pip install -U -q "google-genai>=0.3.0"

### Set up your API key

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](../quickstarts/Authentication.ipynb) for an example.

In [32]:
from google.colab import userdata

GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
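
If you run this notebook outside Colab, `google.colab` is not available. The sketch below is one possible fallback, not part of the original notebook: it assumes you exported the key as an environment variable with the same name, and `load_api_key` is a hypothetical helper name.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the key from Colab Secrets when available, else from the environment."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
            if key:
                return key
        except Exception:
            # Secret missing or not accessible: fall through to the environment.
            pass

    # Assumption: outside Colab, the key was exported as an environment variable.
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found in Colab Secrets or in ${name}")
    return key
```

You would then write `GOOGLE_API_KEY = load_api_key()` instead of calling `userdata.get` directly.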

### Initialize SDK client

With the new SDK, you now only need to initialize a client with your API key (or OAuth if using [Vertex AI](https://link_to_vertex_AI)). The model is now set in each call.

In [33]:
from google import genai
from google.genai import types

client = genai.Client(api_key=GOOGLE_API_KEY)

### Check the "thinking" model info

The [Gemini 2.0 Flash Thinking](https://ai.google.dev/gemini-api/docs/thinking-mode) model is optimized for complex tasks that need multiple rounds of strategizing and iterative solving.

For extended information about each of the Gemini models, check the [documentation](https://ai.google.dev/gemini-api/docs/models/gemini).


In [34]:
from pprint import pprint
pprint(client.models.get(model="gemini-2.0-flash-thinking-exp").model_dump(exclude_defaults=True))

### Imports

In [35]:
import json
from PIL import Image
from IPython.display import display, Markdown

# 1/ Examples

Here are some fairly complex examples of problems the Gemini 2.0 thinking model can solve.

In each of them, you can select different models to see how this new model compares to its predecessors.

In some cases, you'll still get the right answer from the other models. When that happens, re-run the cell a couple of times and you'll see that Gemini 2.0 thinking is more consistent thanks to its thinking step.
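
To make "more consistent" measurable, you can re-run a prompt several times and count how often each answer comes back. This is only a sketch: `tally_answers` and `consistency` are hypothetical helpers, and `ask` is any function you supply that takes a prompt and returns the answer text.

```python
from collections import Counter


def tally_answers(ask, prompt, runs=5):
    """Call ask(prompt) `runs` times and count each normalized answer."""
    counts = Counter()
    for _ in range(runs):
        answer = ask(prompt)
        # Normalize whitespace and case so trivially different strings match.
        counts[answer.strip().lower()] += 1
    return counts


def consistency(counts):
    """Share of runs that agree with the most common answer (1.0 = always the same)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts.most_common(1)[0][1] / total
```

With the client from the setup section, `ask` could be `lambda p: client.models.generate_content(model=model_name, contents=p).text`. Free-form answers rarely match word for word, so in practice you would extract the final value (a number, for instance) inside `ask` before tallying.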

## Example #1: code simplification

First, try with a simple code comprehension and simplification example.

In [36]:
response = client.models.generate_content(
    model="gemini-2.0-flash-thinking-exp",
    contents='How can I simplify this? `(Math.round(radius/pixelsPerMile * 10) / 10).toFixed(1);`'
)

pprint(response.candidates[0].content)

As you can see, the response has multiple parts. While you could use `response.text` to get all of it at once as usual, it's more interesting to check each part separately when using the thinking model.

The first part contains the "inner thoughts" of the model. This is where it analyzes the problem and comes up with its strategy:

In [37]:
# First part is the inner thoughts of the model
Markdown(response.candidates[0].content.parts[0].text)

Most of the time you won't need to check the thoughts, as you'll mostly be interested in the answer. But having access to them gives you a way to check where the answer comes from and how the model came up with it. It's not a black box anymore!

If you are using the `v1alpha` API, you'll see a `thought=True`, indicating that the first part is indeed thoughts.

Then the second part is the actual answer:

In [38]:
# Second part is the response from the model
Markdown(response.candidates[0].content.parts[1].text)
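
Indexing `parts[0]` and `parts[1]` assumes the response always has exactly one thoughts part followed by one answer part. The sketch below is a more defensive way to separate them. `split_thoughts` is a hypothetical helper, not an SDK function. It assumes each part has a `text` attribute and, when the API provides it, a `thought` flag. Without the flag it falls back to treating the first part as the thoughts.

```python
from types import SimpleNamespace


def split_thoughts(parts):
    """Return (thoughts, answer) as two strings from a list of response parts."""
    text_parts = [p for p in parts if getattr(p, "text", None)]

    if any(getattr(p, "thought", False) for p in text_parts):
        # The API flagged the thoughts explicitly: trust the flag.
        thoughts = [p.text for p in text_parts if getattr(p, "thought", False)]
        answer = [p.text for p in text_parts if not getattr(p, "thought", False)]
    elif len(text_parts) > 1:
        # No flag: assume the first part is the thoughts, the rest is the answer.
        thoughts = [text_parts[0].text]
        answer = [p.text for p in text_parts[1:]]
    else:
        # Single part (for example from a non-thinking model): answer only.
        thoughts = []
        answer = [p.text for p in text_parts]

    return "\n\n".join(thoughts), "\n\n".join(answer)


# Stand-in parts for illustration; real ones come from
# response.candidates[0].content.parts
demo_parts = [
    SimpleNamespace(text="Let me analyze the expression...", thought=True),
    SimpleNamespace(text="You can drop the outer rounding.", thought=None),
]
thoughts, answer = split_thoughts(demo_parts)
```

On a real response you would call `split_thoughts(response.candidates[0].content.parts)` and pass each string to `Markdown`.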

As a comparison, here's what you'd get with the "classic" [Gemini 2.0](https://ai.google.dev/gemini-api/docs/models/gemini-v2) model.

Unlike the thinking model, the normal model does not articulate its thoughts and tries to answer right away, which can lead to overly simple answers to complex problems.

In [None]:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents='How can I simplify this? `(Math.round(radius/pixelsPerMile * 10) / 10).toFixed(1);`'
)

Markdown(response.text)

## Example #2: Geometry problem (with image)

This geometry problem requires complex reasoning and also uses Gemini's multimodal abilities to read the image.

In [40]:
!wget https://storage.googleapis.com/generativeai-downloads/images/geometry.png -O geometry.png -q

im = Image.open("geometry.png").resize((256,256))
im

In [41]:
model_name = "gemini-2.0-flash-thinking-exp" # @param ["gemini-1.5-flash-8b","gemini-1.5-flash","gemini-1.5-pro","gemini-2.0-flash-exp", "gemini-2.0-flash-thinking-exp"] {"allow-input":true}

response = client.models.generate_content(
    model=model_name,
    contents=[
        im,
        "What's the area of the overlapping region?"
    ]
)

# Non-thinking models return a single part, so only show thoughts when present
parts = response.candidates[0].content.parts
if len(parts) > 1:
    display(Markdown("## Thoughts"))
    display(Markdown(parts[0].text))
display(Markdown("## Answer"))
display(Markdown(parts[-1].text))

## Example #3: Brain teaser with a twist

Here's another brain teaser based on an image. This time it looks like a mathematical problem, but it cannot actually be solved mathematically. If you check the thoughts of the model, you'll see that it realizes this and comes up with an out-of-the-box solution.

In [47]:
!wget https://storage.googleapis.com/generativeai-downloads/images/pool.png -O pool.png -q

im = Image.open("pool.png")
im

In [57]:
model_name = "gemini-2.0-flash-thinking-exp" # @param ["gemini-1.5-flash-8b","gemini-1.5-flash","gemini-1.5-pro","gemini-2.0-flash-exp", "gemini-2.0-flash-thinking-exp"] {"allow-input":true}

response = client.models.generate_content(
    model=model_name,
    contents=[
        im,
        "How do I use three of these numbers to sum up to 30?"
    ]
)

# Non-thinking models return a single part, so only show thoughts when present
parts = response.candidates[0].content.parts
if len(parts) > 1:
    display(Markdown("## Thoughts"))
    display(Markdown(parts[0].text))
display(Markdown("## Answer"))
display(Markdown(parts[-1].text))
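
If you want to convince yourself that a puzzle like this has no plain arithmetic solution, a brute-force search is enough. The sketch below uses placeholder values that were not read from `pool.png`. It illustrates one common reason such puzzles are unsolvable: the sum of three odd numbers is always odd, so it can never be 30. Replace `odd_numbers` with the numbers you see in the image.

```python
from itertools import combinations


def triples_summing_to(numbers, target):
    """Return every combination of three numbers whose sum equals the target."""
    return [combo for combo in combinations(numbers, 3) if sum(combo) == target]


# Placeholder values, NOT taken from the image: any set of odd numbers behaves the same.
odd_numbers = [1, 3, 5, 7, 9, 11, 13, 15]
print(triples_summing_to(odd_numbers, 30))  # → []
```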

## Example #4: Generating questions for a specific level of knowledge

This time, the question requires a few types of knowledge, including what is relevant to the Physics C exam. The generated questions are not the interesting part. The reasoning behind them is, because it shows that they are not just randomly generated.


In [44]:
model_name = "gemini-2.0-flash-thinking-exp" # @param ["gemini-1.5-flash-8b","gemini-1.5-flash","gemini-1.5-pro","gemini-2.0-flash-exp", "gemini-2.0-flash-thinking-exp"] {"allow-input":true}

response = client.models.generate_content(
    model=model_name,
    contents="Give me a practice question I can use for the AP Physics C exam?"
)

# Non-thinking models return a single part, so only show thoughts when present
parts = response.candidates[0].content.parts
if len(parts) > 1:
    display(Markdown("## Thoughts"))
    display(Markdown(parts[0].text))
display(Markdown("## Answer"))
display(Markdown(parts[-1].text))

# Next Steps

Try the [Gemini 2.0 Flash Thinking](https://aistudio.google.com/app/prompts/new_chat?model=gemini-2.0-flash-thinking-exp) model in AI Studio with all your crazy problems and brain teasers.

For more examples of the Gemini 2.0 capabilities, check the [Gemini 2.0 folder of the cookbook](https://github.com/google-gemini/cookbook/blob/main/gemini-2/). You'll learn how to use the [Live API](live_api_starter.ipynb), juggle with [multiple tools](./plotting_and_mapping.ipynb) or use Gemini 2.0 [spatial understanding](./spatial_understanding.ipynb) abilities.