# How to use Google Gemini models with Python
In this notebook we look into:
1. The basics of using a Google Gemini model with just a few lines of code.
2. Which settings you can adjust to tune the model's behaviour for your use case.

## Google API setup

To use a Google Gemini model, you'll need to create an API key and store it in your Google Colab Secrets.


1. Get your API key from Google AI Studio [here](https://aistudio.google.com/app/apikey).
2. Open your Colab secrets (click the key icon in the left sidebar).
3. Give the secret a name, for instance `GOOGLE_API_KEY`, and paste the key into `Value`.
4. Toggle `Notebook access` to give this specific notebook access to the API key.

🔑 Note that this API key will now be available in your secrets every time you open or create a Colab notebook. You'll still need to grant access explicitly in each notebook, though.


💸 You can start **free of charge**; you are only limited in the number of requests you can make to Gemini per minute and per day.
- I recommend starting with `gemini-1.5-flash`: the free plan gives you 15 requests per minute and 1,500 requests per day, which is plenty to start.
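To stay under a per-minute quota like the one above, you can space out your calls on the client side. The class below is only a minimal sketch: `RequestThrottle` is a hypothetical helper, not part of the Google library, and it assumes that evenly spacing requests is good enough for your use case.

```python
import time


class RequestThrottle:
    """Hypothetical helper: spaces calls so at most `per_minute` happen per minute."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute  # seconds between two calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay


# 15 requests per minute means one request every 4 seconds.
throttle = RequestThrottle(per_minute=15)
```

Call `throttle.wait()` right before each request to the model. This does not track the daily quota, so you would still need to watch that separately.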

In [1]:
# if you are running the notebook outside of Google Colab, uncomment this line to install the Google Generative AI library.
#!pip install -q -U google-generativeai

In [2]:
import google.generativeai as genai
import os

In [3]:
from google.colab import userdata
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

In [4]:
genai.configure(api_key=GOOGLE_API_KEY)
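If you run this notebook outside of Google Colab, `google.colab` is not available. One possible workaround, sketched below, is to fall back to an environment variable. `load_api_key` is a hypothetical helper name, and it assumes you exported the key under the same name in your shell.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the key from Colab secrets when possible, else from the environment."""
    try:
        # This import only works inside Google Colab.
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Outside Colab (or if the secret is not accessible), use the environment.
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found under the name {name!r}")
    return key
```

You would then pass the result to `genai.configure(api_key=load_api_key())`.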

## Simple inference with Gemini model from Google
Text generation is very simple.
- You need to create a `model` instance.
    - This is where you provide a `model_name` 🧠
- Then you call the `generate_content()` function.
    - This is where you provide the user input query 💬


## Google Models
All the models we look at are multimodal: besides text, they also accept audio, images, and video. We focus on text for now.

I recommend testing the models in the following order (from cheaper to more expensive and more capable).
1. `gemini-1.5-flash`: Fast and versatile performance across a wide variety of tasks.
    - Input token limit: **1 million!**
    - Latest update: September 2024.
    - 15 requests per minute, 1,500 requests per day on the free plan.
2. `gemini-1.5-pro`: For complex reasoning tasks requiring more intelligence.
    - Input token limit: **2 million 😱!!**
    - Latest update: October 2024.
    - 2 requests per minute, 50 requests per day on the free plan.

If you need more capacity, you'll have to configure a billing account. Pricing details are [here](https://ai.google.dev/pricing#1_5flash).

🗞️  Gemini 2.0 is arriving with more modalities and a thinking mode: `gemini-2.0-flash-exp`, with a knowledge cutoff of August 2024.


In [5]:
model = genai.GenerativeModel(model_name="gemini-1.5-flash")
response = model.generate_content("Write a very short poem about an astronaut on the Moon")
print(response.text)

Gray dust, a giant's leap,
Silent, stark, a crater deep.
Star-strewn void, a flag unfurls,
Earth a jewel, in distant swirls.



In [6]:
model = genai.GenerativeModel(model_name="gemini-2.0-flash-exp")
response = model.generate_content("Write a very short poem about an astronaut on the Moon")
print(response.text)

Dusty boots on silver ground,
Earth a marble, small and round.
Silence screams, a lonely sound,
Moon's cold gaze, forever bound.



# Advanced Parameters

[Here](https://ai.google.dev/gemini-api/docs/models/generative-models#model-parameters) is the documentation if you want to dig deeper, but I'll show you the most important parameters below.

**Having a conversation**


In [8]:
# How to make an interactive chat?
model = genai.GenerativeModel("gemini-1.5-flash")
chat = model.start_chat(
    history=[
        {"role": "user", "parts": "Hello"},
        {"role": "model", "parts": "Great to meet you. What would you like to know?"},
    ]
)
response = chat.send_message("I have 2 dogs in my house.")
print("Model response 1:\n")
print(response.text)
print("Model response 2:\n")
response = chat.send_message("How many paws are in my house?")
print(response.text)

Model response 1:

That's wonderful!  Do you have any questions about them, or would you like to tell me more about them?  I'd love to hear about your furry friends!

Model response 2:

If you have two dogs, and each dog has four paws, there are eight paws in your house.
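The `history` argument used above is a plain list of dictionaries that alternate between the `user` and `model` roles. If you keep past exchanges as pairs, a small helper can build that list for you. This is only a sketch: `build_history` is a hypothetical name, and it assumes every turn has both a user message and a model reply.

```python
def build_history(turns):
    """Convert (user_text, model_text) pairs into the list format used by `history`."""
    history = []
    for user_text, model_text in turns:
        history.append({"role": "user", "parts": user_text})
        history.append({"role": "model", "parts": model_text})
    return history


history = build_history([
    ("Hello", "Great to meet you. What would you like to know?"),
])
print(history)
```

The resulting list can be passed as `model.start_chat(history=history)`.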



**Configure text generation parameters**

Every prompt you send to the model includes [parameters](https://ai.google.dev/gemini-api/docs/models/generative-models#model-parameters) that control how the model generates responses. You can use [GenerationConfig](https://ai.google.dev/api/rest/v1/GenerationConfig) to configure these parameters. If you don't configure the parameters, the model uses default options, which can vary by model.

In [9]:
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Tell me a story about a magic backpack.",
    generation_config=genai.types.GenerationConfig(
        candidate_count=1, # Number of generated responses to return.
        stop_sequences=["x"], # Character sequences that stop generation when produced. Rarely used.
        max_output_tokens=20, # The maximum number of tokens to include in a response candidate.
        temperature=1.0, # Controls the randomness of the output: between 0.0 and 2.0.
        presence_penalty=1.0, # Penalizes tokens that already appeared in the response.
        top_k=10 # Sample only from the 10 most probable next tokens.
    ),
)

print(response.text)

Elara lived a life dictated by dust and deadlines. A struggling artist in a bustling city, her


**Note**: As with the OpenAI API, you can also control the following parameters (for full details, see [GenerationConfig](https://ai.google.dev/api/rest/v1/GenerationConfig)).
- `response_schema`
- `top_p`
- `top_k`
- `presence_penalty`
- `frequency_penalty`
- `logprobs`
- ...
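If you reuse the same settings across many calls, it can help to build them in one place and check them before sending a request. The sketch below uses a plain dictionary; as far as I know the library accepts a dict in place of `GenerationConfig`, but treat that as an assumption and check the documentation. `make_generation_config` is a hypothetical helper, and its default of 256 output tokens is an arbitrary choice.

```python
def make_generation_config(temperature=1.0, max_output_tokens=256, top_p=None, top_k=None):
    """Build a settings dict, rejecting values outside the documented ranges."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if max_output_tokens < 1:
        raise ValueError("max_output_tokens must be at least 1")
    config = {"temperature": temperature, "max_output_tokens": max_output_tokens}
    # Only include the optional parameters that were explicitly set,
    # so the model keeps its own defaults for the others.
    if top_p is not None:
        config["top_p"] = top_p
    if top_k is not None:
        config["top_k"] = top_k
    return config


creative = make_generation_config(temperature=1.5, top_k=40)
precise = make_generation_config(temperature=0.0, max_output_tokens=100)
print(creative)
print(precise)
```

Each dictionary would then be passed as the `generation_config` argument of `generate_content()`.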

In [10]:
# You can also count tokens easily
print(response.usage_metadata)

prompt_token_count: 10
candidates_token_count: 20
total_token_count: 30
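In the output above, `candidates_token_count` equals the `max_output_tokens=20` we set, which is why the story stops mid-sentence. You can use that to flag responses that were probably cut off. The sketch below is only an illustration: `hit_output_limit` is a hypothetical helper, and the `SimpleNamespace` object stands in for `response.usage_metadata` so the cell runs without calling the API.

```python
from types import SimpleNamespace


def hit_output_limit(usage, max_output_tokens):
    """Return True when the response used up its whole output token budget."""
    return usage.candidates_token_count >= max_output_tokens


# Stand-in for response.usage_metadata, using the numbers printed above.
usage = SimpleNamespace(
    prompt_token_count=10,
    candidates_token_count=20,
    total_token_count=30,
)

if hit_output_limit(usage, max_output_tokens=20):
    print("The response was probably truncated; consider raising max_output_tokens.")
```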




# Streaming

In [11]:
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Write a story about a magic backpack.", stream=True)
for chunk in response:
    print(chunk.text)
    print("_" * 80)

El
________________________________________________________________________________
ara clutched the worn leather straps of the backpack, its faded crimson a stark contrast
________________________________________________________________________________
 to the grey, rain-slicked cobblestones of the market square.
________________________________________________________________________________
 It wasn’t just any backpack; it was her grandmother’s, a relic whispered to possess a touch of magic.  Elara, a struggling artist
________________________________________________________________________________
 with a talent for painting but a dearth of luck, had dismissed the tales as folklore until last week.

That week, she'd been utterly broke,
________________________________________________________________________________
 staring at a blank canvas and an empty stomach.  In desperation, she’d opened the ancient backpack, intending to use it for its intended purpose – carrying her meager b
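When streaming, you often want to show the text as it arrives and still keep the full answer at the end. Here is a minimal sketch of that pattern: `collect_stream` is a hypothetical helper, and the fake chunks below only mimic the `.text` attribute of real chunks so the cell runs without calling the API.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_chunk=None):
    """Pass each chunk's text to `on_chunk` (if given) and return the full text."""
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk.text)
        parts.append(chunk.text)
    return "".join(parts)


# Fake chunks standing in for a streamed response.
fake_chunks = [
    SimpleNamespace(text="El"),
    SimpleNamespace(text="ara clutched the worn leather straps"),
    SimpleNamespace(text=" of the backpack."),
]

full_text = collect_stream(fake_chunks, on_chunk=lambda t: print(t, end=""))
print()
print(len(full_text), "characters received")
```

With a real call, you would pass the object returned by `generate_content(..., stream=True)` instead of `fake_chunks`.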