# Gemini API

The API (Application Programming Interface) of a large language model (LLM) is a set of predefined functions, protocols, and tools that let developers interact with the model programmatically. Through this interface, external applications can send prompts (text, or text combined with images if the LLM is multimodal), specify generation settings such as temperature and system instructions, and receive the generated outputs. The API essentially provides a way to integrate the LLM’s capabilities into other code.

Google provides a dedicated [GitHub Repository](https://github.com/google-gemini/cookbook) with code examples for using the API. The materials that we will see here are adapted from [this notebook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Get_started.ipynb).

## Getting an API Key

The Gemini API has a free tier: you can register with a *personal Gmail account* and obtain an API key without entering any billing information. The free tier limits usage rates and the models you can use, but it is a great place to start exploring LLM APIs.

- Open [Google AI Studio](https://aistudio.google.com/app/u/1/prompts/new_chat)
  
<img src="../img/gemini_ai_studio.png" width="700"/>

- You will be prompted to log in. Please log in with your **personal Google account**.
- If you are logged in with your institutional account, you might see that you are not allowed by the organization admin to use these services.

<img src="../img/gemini_ai_studio_institutional.png" width="700"/>

- Click `Get API key` in the upper-right corner.

<img src="../img/gemini_ai_studio_api.png" width="700"/>

- Click `Create API key` in the right corner below the navigation bar. If you are doing this for the first time, you should see a pop-up that asks you to agree to the terms of use.

<img src="../img/gemini_ai_studio_consent.png" width="500"/>

- After you agree, you should see another pop-up that lets you create an API key in an existing project, or in a new one if you do not have any projects on Google Cloud Services.

<img src="../img/gemini_ai_studio_create_api.png" width="400"/>

- Your API key should appear on the screen. Please copy this key for the next steps.

<img src="../img/gemini_ai_studio_api_key.png" width="600"/>



## API keys and .env files

In lesson 4-3_api_NYT.ipynb we saw one way to safely use an API key in a Jupyter Notebook. Another way is to use a `.env` file. This is particularly useful when you are coding in a repository, where you might accidentally push your secret API keys to GitHub. You will also need to deal with `.env` files if you ever want to deploy your code.

- Create a new file in your repository called `.env`. This is the entire filename. Leave it empty for now.
- Create another file called `.gitignore`.
  You can list anything in the `.gitignore` file that you want Git to not track. This prevents you from accidentally committing and pushing files that should stay on your machine.
- On an empty line in the `.gitignore` file, type `.env`, then commit and push the change.
  This tells Git to ignore the `.env` file going forward.
- Now you can save your API key to your `.env` file. Open your `.env` file and type `GEMINI_API_KEY="YOUR_API_KEY"`, replacing `YOUR_API_KEY` with the key you copied.
- Save your `.env` file and close it. Make sure that the `.env` file is not listed in the source control section of your editor.
  
Managing API keys and other secrets is an important skill to learn. A great place to start reading more about this topic is this [blog post](https://dev.to/jakewitcher/using-env-files-for-environment-variables-in-python-applications-55a1). While preparing these materials, I found a new [VSCode extension](https://marketplace.visualstudio.com/items?itemName=josephdavidwilsonjr.api-vault), which I have not tested yet, so I can neither recommend it nor advise against it. If you want, you can read about it [here](https://medium.com/@dingersandks/why-every-developers-api-keys-are-probably-in-the-wrong-place-and-how-a-vs-code-extension-finally-c966d081d132). If nothing else, this post shows why API keys are so important.

Additionally, if you are using Google Colab, you can use the Colab Secrets manager to securely store your API keys. More details can be found in this [Authentication notebook](https://github.com/google-gemini/cookbook/blob/8d7b26bfb8701a31ad16ac549de2753819bbafc9/quickstarts/Authentication.ipynb).

In [None]:
# Let's load our API key from the environment
# we will use the python-dotenv package to load the .env file 
# https://pypi.org/project/python-dotenv/

import os
from dotenv import load_dotenv

load_dotenv() # By default load_dotenv will look for the .env file in the current working directory

api_key = os.getenv("GEMINI_API_KEY")

if api_key:
    print("API key loaded successfully.")
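
Note that the cell above prints nothing at all if the key was not found, which is easy to miss. Below is a minimal sketch of a stricter check that fails loudly instead. The helper name `require_env` and the variable `DEMO_API_KEY` are made up for this illustration; in the notebook you would call it with `GEMINI_API_KEY` after `load_dotenv()`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that your .env file exists and was loaded."
        )
    return value


# Demonstration with a placeholder variable (hypothetical name, not a real key)
os.environ["DEMO_API_KEY"] = "not-a-real-key"
demo_key = require_env("DEMO_API_KEY")
print("API key loaded successfully.")
```

Raising an error early means a missing key shows up here, rather than as a confusing authentication failure several cells later.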

In [None]:
# Install the google-genai package
%pip install -qU 'google-genai>=1.0.0'

In [None]:
# Client is essentially how we interact with the Gemini API

from google import genai
from google.genai import types

client = genai.Client(api_key=api_key)

In [None]:
# We can now specify the model we want to use
# More on models here: https://ai.google.dev/gemini-api/docs/models
# Currently, these are the available models: ["gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-flash-preview-05-20","gemini-2.5-pro-preview-06-05"]

model_id = "gemini-2.5-flash-preview-05-20"

## API Use Examples

### Text Prompts

Use the `generate_content` method to generate responses to your prompts. You can pass text directly to `generate_content` and use the `.text` property to get the text content of the response. Note that the `.text` property works when there is only one part in the output.

In [None]:
# Let's see if this is working by generating some content

from IPython.display import Markdown

response = client.models.generate_content(
    model=model_id,
    contents="What's the largest planet in our solar system?"
)

Markdown(response.text)

### Count tokens

Tokens are the basic inputs to the Gemini models. You can use the `count_tokens` method to calculate the number of input tokens before sending a request to the Gemini API.

In [None]:
response = client.models.count_tokens(
    model=model_id,
    contents="What's the highest mountain in Africa?",
)

print(response)
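
If you only need a ballpark figure and do not want to spend an API call, a character-based estimate can help. The sketch below assumes the commonly cited rule of thumb of roughly four characters per token for English text; that ratio is an approximation, not how the tokenizer works, and `rough_token_estimate` is a hypothetical helper. Use `count_tokens` whenever you need the real number.

```python
import math


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate from character count (heuristic, not a tokenizer)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


prompt_text = "What's the highest mountain in Africa?"
print(len(prompt_text), "characters, roughly", rough_token_estimate(prompt_text), "tokens")
```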

### Send multimodal prompts

Use the Gemini 2.0 model (`gemini-2.0-flash`) or a newer **multimodal model** that supports multimodal prompts. You can include text, PDF documents, images, audio and video in your prompt requests and get text or code responses.

In this first example, you'll download an image from a specified URL as bytes and then write those bytes to a local file named `jetpack.png`.


In [None]:
import requests
import pathlib
from PIL import Image

IMG = "https://storage.googleapis.com/generativeai-downloads/data/jetpack.png" # @param {type: "string"}

img_bytes = requests.get(IMG).content

img_path = pathlib.Path('jetpack.png')
img_path.write_bytes(img_bytes)
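
A failed download can leave you with an error page saved under an image filename. One cheap sanity check, sketched below, is to look at the first eight bytes: every PNG file starts with the same fixed signature. The helper `looks_like_png` is a made-up name for this illustration, and the byte strings in the demo are stand-ins rather than real downloads.

```python
# The fixed 8-byte signature that starts every PNG file
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data: bytes) -> bool:
    """Return True if the bytes start with the PNG file signature."""
    return data[:8] == PNG_SIGNATURE


# Stand-in byte strings for illustration
print(looks_like_png(PNG_SIGNATURE + b"...rest of the file..."))
print(looks_like_png(b"<html>Not Found</html>"))
```

In the notebook you could call `looks_like_png(img_bytes)` before writing the file to disk.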

In this second example, you'll open a previously saved image, create a thumbnail of it and then generate a short blog post based on the thumbnail, displaying both the thumbnail and the generated blog post.

In [None]:
from IPython.display import display, Markdown
image = Image.open(img_path)
image.thumbnail([512,512])

response = client.models.generate_content(
    model=model_id,
    contents=[
        image,
        "Write a short and engaging blog post based on this picture."
    ]
)

display(image)
Markdown(response.text)

### Configure model parameters

You can include parameter values in each call that you send to a model to control how the model generates a response. Learn more about [experimenting with parameter values](https://ai.google.dev/gemini-api/docs/text-generation?lang=node#configure).


In [None]:
response = client.models.generate_content(
    model=model_id,
    contents="Tell me how the internet works, but pretend I'm a puppy who only understands squeaky toys.",
    config=types.GenerateContentConfig(
        temperature=0.4,
        top_p=0.95,
        top_k=20,
        candidate_count=1,
        seed=5,
        stop_sequences=["STOP!"],
        presence_penalty=0.0,
        frequency_penalty=0.0,
    )
)

Markdown(response.text)

### Configurations

Let's take a look at this part of our code more closely.

```python
config=types.GenerateContentConfig(
        temperature=0.4,
        top_p=0.95,
        top_k=20,
        candidate_count=1,
        seed=5,
        stop_sequences=["STOP!"],
        presence_penalty=0.0,
        frequency_penalty=0.0,
    )
```

1. **temperature**
   controls the randomness of the model’s output. A low temperature (close to 0) makes the model more focused and deterministic, meaning it will give more predictable and safer responses. A higher temperature (close to 1) makes the model more random and creative, producing more varied and less predictable outputs.

2. **top_p**
   is part of a technique called "nucleus sampling." At each step the model ranks the candidate tokens by probability and keeps only the smallest set whose cumulative probability reaches `top_p`; the next token is sampled from that set. This narrows the pool of candidates, making the output more coherent and less likely to contain random or nonsensical words. A value of 0.95 means that the model samples from the most probable tokens that together account for 95% of the probability mass, which allows diversity in the output while cutting off the unlikely tail.

3. **top_k**
   controls how many of the most likely next words are considered at each step. If top_k=20, the model will only consider the top 20 most probable words for generating each word in the output.

4. **candidate_count**
   defines how many different "candidates" (or possible completions) the model should generate. If set to 1, the model will only generate one response. If set to a higher number, the model can generate multiple different responses from which you can choose.

5. **seed**
   A "seed" is a starting point for the random number generator used in generating text. With a fixed seed, the model tries to produce the same result each time for the same input and configuration, which is useful for reproducibility, although identical output is not strictly guaranteed. If you want different results each time, you can change the seed. The number 5 here is just an arbitrary fixed starting point for randomness.

6. **stop_sequences**
   These are special sequences or words that tell the model when to stop generating text. When the model generates the sequence "STOP!" (or any other stop sequence you provide), it will halt further generation.

7. **presence_penalty**
   This parameter controls how much the model should avoid repeating concepts or phrases. If set to a positive value, the model is discouraged from using the same words or phrases too frequently in the output. A value of 0.0 means that there is no penalty for repeating words or ideas, so the model is free to repeat as needed.

8. **frequency_penalty**
   Similar to the presence penalty, this controls how much the model should avoid repeating the same words or phrases. The difference is that the frequency penalty focuses more on how often a word or phrase is used, rather than just its presence. Again, a value of 0.0 means there is no penalty for frequent repetition, allowing the model to repeat words as often as necessary.
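
To make `temperature`, `top_k`, `top_p` and `seed` more concrete, here is a toy sketch of how one next token could be sampled. It is an illustration under assumptions, not Gemini's actual implementation: the function `sample_next_token`, the four-word vocabulary, and the scores are all made up, and real systems may apply the filters in a different order.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Sample one token from a {token: score} dict using temperature, top-k and top-p."""
    rng = rng or random.Random()

    # Temperature 0 is treated as greedy decoding: always pick the best token.
    if temperature <= 0:
        return max(logits, key=logits.get)

    # Temperature: divide the scores, so low values sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}

    # Softmax (subtract the max for numerical stability), then rank by probability.
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # top_k: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    # Draw one token; random.choices renormalises the remaining weights.
    tokens = [tok for tok, _ in ranked]
    weights = [prob for _, prob in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for the next word after "The largest planet is ..."
toy_logits = {"Jupiter": 4.0, "Saturn": 2.5, "Earth": 1.0, "banana": -2.0}
rng = random.Random(5)  # fixed seed, so repeated runs give the same draws
draws = [
    sample_next_token(toy_logits, temperature=0.4, top_k=20, top_p=0.95, rng=rng)
    for _ in range(5)
]
print(draws)
```

With these settings the low temperature pushes almost all of the probability onto one token, so the nucleus kept by `top_p` contains a single candidate. Try raising `temperature` to 1.5 to see the other words start to appear.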



### Configure safety filters

The Gemini API provides safety filters that you can adjust across multiple filter categories to restrict or allow certain types of content. You can use these filters to adjust what is appropriate for your use case. See the [Configure safety filters](https://ai.google.dev/gemini-api/docs/safety-settings) page for details.

In this example, you'll use a safety filter to only block highly dangerous content, when requesting the generation of potentially disrespectful phrases.

In [None]:
prompt = """
    Write a list of 2 disrespectful things that I might say to the universe after stubbing my toe in the dark.
"""

safety_settings = [
    types.SafetySetting(
        category="HARM_CATEGORY_DANGEROUS_CONTENT",
        threshold="BLOCK_ONLY_HIGH",
    ),
]

response = client.models.generate_content(
    model=model_id,
    contents=prompt,
    config=types.GenerateContentConfig(
        safety_settings=safety_settings,
    ),
)

Markdown(response.text)

Safety settings and content moderation are important aspects of LLM research. Shameless plug: I wrote a [paper](https://aclanthology.org/2025.latechclfl-1.20/) about how the approaches to and implementations of safety in Gemini interfere with the processing of historical texts, and what this might tell us about LLMs more broadly.

### System Instructions

You can guide the behavior of Gemini models with system instructions.

In [None]:
system_instruction = """
  You are an expert software developer and a helpful coding assistant.
  You are able to generate high-quality code in any programming language.
"""

code_config = types.GenerateContentConfig(
    system_instruction=system_instruction,
)

response = client.models.generate_content(
    model=model_id,
    config=code_config,
    contents="How can I implement a binary search algorithm in Python?", 
)

Markdown(response.text)


Note: Here we skipped the multi-turn chat example, which you can study, if you are interested, in [this notebook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Get_started.ipynb).

### Generate JSON

The [controlled generation](https://ai.google.dev/gemini-api/docs/structured-output?lang=python#generate-json) capability in the Gemini API allows you to constrain the model output to a structured format. You can provide the schemas as Pydantic models or as a JSON string.

In [None]:
from pydantic import BaseModel
import json

class Recipe(BaseModel):
    recipe_name: str
    recipe_description: str
    recipe_ingredients: list[str]

response = client.models.generate_content(
    model=model_id,
    contents="Provide a popular cookie recipe and its ingredients.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=Recipe,
    ),
)

print(json.dumps(json.loads(response.text), indent=4))
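
Once the output is JSON, the natural next step is to turn it into a Python object and check that the expected fields are really there. The sketch below does this with the standard library only. The class `RecipeRecord`, the helper `parse_recipe`, and the sample text are all made up for illustration; the sample is hand-written, not an actual model response.

```python
import json
from dataclasses import dataclass


@dataclass
class RecipeRecord:
    recipe_name: str
    recipe_description: str
    recipe_ingredients: list


def parse_recipe(raw_json: str) -> RecipeRecord:
    """Parse a JSON string and check that it has the expected fields and types."""
    data = json.loads(raw_json)
    record = RecipeRecord(**data)  # raises TypeError on missing or unexpected keys
    if not isinstance(record.recipe_name, str):
        raise TypeError("recipe_name must be a string")
    if not isinstance(record.recipe_description, str):
        raise TypeError("recipe_description must be a string")
    ingredients = record.recipe_ingredients
    if not (isinstance(ingredients, list) and all(isinstance(i, str) for i in ingredients)):
        raise TypeError("recipe_ingredients must be a list of strings")
    return record


# Hand-written sample text standing in for response.text
sample_text = (
    '{"recipe_name": "Chocolate Chip Cookies", '
    '"recipe_description": "A classic cookie.", '
    '"recipe_ingredients": ["flour", "butter", "sugar", "chocolate chips"]}'
)
record = parse_recipe(sample_text)
print(record.recipe_name, "-", len(record.recipe_ingredients), "ingredients")
```

Since the notebook already defines `Recipe` as a Pydantic model, `Recipe.model_validate_json(response.text)` (Pydantic v2) should give you the same kind of check with much less code.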

### Generate Images

Gemini can output images directly as part of a conversation:


In [None]:
# IPython's Image is imported under an alias so that it does not shadow
# the PIL Image we imported in the multimodal example above
from IPython.display import Image as IPyImage, Markdown, display

response = client.models.generate_content(
    model="gemini-2.0-flash-exp", # note the change in model
    contents='Hi, can you create a 3d rendered image of a pig with wings and a top hat flying over a happy futuristic scifi city with lots of greenery?',
    config=types.GenerateContentConfig(
        response_modalities=['Text', 'Image']
    )
)

# The response can contain several parts: text parts and inline image data
for part in response.candidates[0].content.parts:
    if part.text is not None:
        display(Markdown(part.text))
    elif part.inline_data is not None:
        mime = part.inline_data.mime_type
        print(mime)
        data = part.inline_data.data
        display(IPyImage(data=data))

### Generate content stream

By default, the model returns a response after completing the entire generation process. You can also use the `generate_content_stream` method to stream the response as it's being generated, and the model will return chunks of the response as soon as they're generated.

In [None]:
for chunk in client.models.generate_content_stream(
    model=model_id,
    contents="Tell me a story about a lonely robot who finds friendship in a most unexpected place."
):
  print(chunk.text, end="")
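
If you also want the complete text once streaming has finished, you can collect the chunks while printing them. The sketch below uses a fake stream so that it runs without an API call: `fake_stream` and `collect_stream` are made-up names, and the assumption that some chunks may carry no text (`None`) is a precaution rather than documented behavior.

```python
from types import SimpleNamespace


def fake_stream():
    """Stand-in for a streaming response: yields objects with a .text attribute."""
    for piece in ["Once upon a time, ", "a lonely robot ", None, "met a cat."]:
        yield SimpleNamespace(text=piece)


def collect_stream(chunks):
    """Print each chunk as it arrives and return the full text at the end."""
    parts = []
    for chunk in chunks:
        if chunk.text:  # skip chunks that carry no text
            print(chunk.text, end="", flush=True)
            parts.append(chunk.text)
    print()
    return "".join(parts)


story = collect_stream(fake_stream())
```

With the real API you would pass the result of `client.models.generate_content_stream(...)` to `collect_stream` in place of `fake_stream()`.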

### Uploading files

We can upload files through the API so that the model can generate content from them. This is especially useful when a file is too large to send inline with the prompt, or when you want to reuse the same file across several requests. Note that an uploaded file still has to fit in the model's context window. In this case, you'll use a 400-page transcript from [Apollo 11](https://www.nasa.gov/history/alsj/a11/a11trans.html) for the text file example and, for the PDF example, a PDF of the article [Smoothly editing material properties of objects with text-to-image models and synthetic data](https://research.google/blog/smoothly-editing-material-properties-of-objects-with-text-to-image-models-and-synthetic-data/) from the Google Research Blog.


In [None]:
# Prepare the file to be uploaded
TEXT = "https://storage.googleapis.com/generativeai-downloads/data/a11.txt"  # @param {type: "string"}
text_bytes = requests.get(TEXT).content

text_path = pathlib.Path('a11.txt')
text_path.write_bytes(text_bytes)

In [None]:
# Upload the file using the API
file_upload = client.files.upload(file=text_path)

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        file_upload,
        "Can you give me a summary of this information please?",
    ]
)

Markdown(response.text)

In [None]:
# Prepare the file to be uploaded
PDF = "https://storage.googleapis.com/generativeai-downloads/data/Smoothly%20editing%20material%20properties%20of%20objects%20with%20text-to-image%20models%20and%20synthetic%20data.pdf"  # @param {type: "string"}
pdf_bytes = requests.get(PDF).content

pdf_path = pathlib.Path('article.pdf')
pdf_path.write_bytes(pdf_bytes)

In [None]:
# Upload the file using the API
file_upload = client.files.upload(file=pdf_path)

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        file_upload,
        "Can you summarize this file as a bulleted list?",
    ]
)

Markdown(response.text)

Note: Similar to text files and PDFs, you can upload videos, images, and sound files. Examples of those can be found on the same Google example notebook.

### Use URL context

The URL Context tool empowers Gemini models to directly access, process, and understand content from user-provided web page URLs. This is key for enabling dynamic agentic workflows, allowing models to independently research, analyze articles, and synthesize information from the web as part of their reasoning process.

In this example, you will use two links as references and ask Gemini to find the differences between the recipes on each page:


In [None]:
prompt = """
Compare recipes from https://www.food.com/recipe/homemade-cream-of-broccoli-soup-271210
and from https://www.allrecipes.com/recipe/13313/best-cream-of-broccoli-soup/,
list the key differences between them.
"""

tools = []
tools.append(types.Tool(url_context=types.UrlContext()))  # pass an instance, not the class itself

config = types.GenerateContentConfig(
    tools=tools,
)

response = client.models.generate_content(
      contents=[prompt],
      model="gemini-2.0-flash",
      config=config
)

Markdown(response.text)

### Get text embeddings

You can get text embeddings for a snippet of text by using the `embed_content` method with the `gemini-embedding-exp-03-07` or `text-embedding-004` model.

The Gemini Embeddings model produces an output with 3072 dimensions by default. However, you can choose an output dimensionality between 1 and 3072. See the [embeddings guide](https://ai.google.dev/gemini-api/docs/embeddings) for more details and [this notebook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Embeddings.ipynb) for more in-depth examples.


In [None]:
TEXT_EMBEDDING_MODEL_ID = "gemini-embedding-exp-03-07"

response = client.models.embed_content(
    model=TEXT_EMBEDDING_MODEL_ID,
    contents=[
        "How do I get a driver's license/learner's permit?",
        "How do I renew my driver's license?",
        "How do I change my address on my driver's license?"
        ],
    config=types.EmbedContentConfig(output_dimensionality=512)  # ask for 512-dimensional vectors instead of the default
)

print(response.embeddings)

In [None]:
len(response.embeddings)

In [None]:
print(len(response.embeddings[0].values))
print((response.embeddings[0].values[:4], '...'))

**Remember `numpy`?**

An embedding is a dense, continuous vector representation of data (such as words, sentences, or images) that captures its semantic meaning in a high-dimensional space. Embeddings are typically represented as 1-D numpy arrays, i.e. vectors. Each element in the array corresponds to a feature or dimension in the vector space.
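
One common thing to do with such vectors is to compare them: texts with similar meaning should end up pointing in similar directions. Here is a small sketch of cosine similarity in plain Python. The three 4-dimensional vectors are invented for illustration (real embeddings have hundreds or thousands of dimensions), and `cosine_similarity` is our own helper, not part of the Gemini API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT real embeddings
license_question = [0.9, 0.1, 0.3, 0.0]
renewal_question = [0.8, 0.2, 0.4, 0.1]
weather_question = [0.0, 0.9, 0.1, 0.8]

print(round(cosine_similarity(license_question, renewal_question), 3))
print(round(cosine_similarity(license_question, weather_question), 3))
```

With real embeddings you would pass `response.embeddings[0].values` and `response.embeddings[1].values` to the same function, or do the equivalent computation with numpy arrays.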

More on this soon!

## Example Application

Below is an example of what we can do with API calls and how we can integrate it into our code. 

We will build a mini translator using Gemini API!

### Dataset

For machine translation, we need parallel data. Luckily, we have some datasets available for our use, such as [Helsinki NLP's OPUS MT Datasets](https://github.com/Helsinki-NLP/OPUS-MT-train/blob/master/models/de-en/README.md).

I downloaded the [test set](https://object.pouta.csc.fi/OPUS-MT-models/de-en/opus-2019-12-04.test.txt), reformatted it, and uploaded it to our Data folder as `de-en.csv`.

In [None]:
import pandas as pd

translation_set = pd.read_csv("../Data/de-en.csv")
translation_set.head()

In [None]:
translation_set.describe()

In [None]:
# We have 5000 examples and we don't need all of them for our example, so let's sample 10 of them

sampled_translations = translation_set.sample(n=10, random_state=42).reset_index(drop=True)
sampled_translations

### Evaluations



In [None]:
# Evaluation
# https://pypi.org/project/sacrebleu/

%pip install -q sacrebleu

In [None]:
import sacrebleu

def evaluate_translation(model_translation, references):
    """
    Evaluate translation quality using the BLEU and chrF metrics.

    `model_translation` is a list of translated sentences.
    `references` is a list of reference streams: each inner list holds one
    reference translation per sentence, in the same order as `model_translation`.
    """
    bleu = sacrebleu.corpus_bleu(model_translation, references)
    chrf = sacrebleu.corpus_chrf(model_translation, references)
    return bleu, chrf

In [None]:
help(sacrebleu.corpus_bleu)

In [None]:
help(sacrebleu.corpus_chrf)
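
One detail in the help text is easy to get wrong: the corpus-level functions expect the references grouped by reference set (one list per set, each with one sentence per hypothesis), not grouped by sentence. If your data is organized per sentence, you need to transpose it. A minimal sketch, where `to_reference_streams` is a hypothetical helper and the example sentences are invented:

```python
def to_reference_streams(per_sentence_refs):
    """Turn [[ref_a, ref_b], ...] (one inner list per sentence) into
    [[ref_a of every sentence], [ref_b of every sentence]]."""
    counts = {len(refs) for refs in per_sentence_refs}
    if len(counts) > 1:
        raise ValueError("every sentence needs the same number of references")
    return [list(stream) for stream in zip(*per_sentence_refs)]


# Two sentences, each with two reference translations (made-up examples)
per_sentence = [
    ["I think we should go now.", "I think we'd better go now."],
    ["Where is the station?", "Where is the train station?"],
]
streams = to_reference_streams(per_sentence)
for stream in streams:
    print(stream)
```

The result has two inner lists of two sentences each, which is the shape you would pass as the second argument to `sacrebleu.corpus_bleu`.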

### Gemini Translator

I used [this notebook](https://github.com/google-gemini/cookbook/blob/main/examples/Translate_a_Public_Domain_Book.ipynb) as a reference.

In [None]:
# Let's start building our API client for translation

system_instruction = """You are a helpful translation assistant. 
You can translate text from German to English.
You will be given a sentence in German, and you should return the translation in English.
Do not return any additional text, just the translation.
"""

In [None]:
safety_settings =[
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=types.HarmBlockThreshold.BLOCK_NONE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=types.HarmBlockThreshold.BLOCK_NONE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold=types.HarmBlockThreshold.BLOCK_NONE,
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=types.HarmBlockThreshold.BLOCK_NONE,
    )
]

In [None]:
def gemini_translate(prompt):
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt,
        config = types.GenerateContentConfig(
            system_instruction=system_instruction,
            safety_settings=safety_settings,
            temperature=0.1, # a low temperature keeps the translation predictable
            top_p=0.95,
            top_k=20,
            candidate_count=1, # we want only one translation
            seed=5,
        )
    )
    # Uncomment the next line to see what the whole response looks like
    # print(response)

    # Return only the translated text
    return response.text

In [None]:
# let's see if our translation function works

example = translation_set.iloc[0]
print(f"German: {example['german']}")

In [None]:
print(gemini_translate(example['german']))

This is what the whole response looks like:

candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, inline_data=None, file_data=None, thought_signature=None, code_execution_result=None, executable_code=None, function_call=None, function_response=None, text="I think we'd better go now.")], role='model'), citation_metadata=None, finish_message=None, token_count=None, finish_reason=<FinishReason.STOP: 'STOP'>, url_context_metadata=None, avg_logprobs=None, grounding_metadata=None, index=0, logprobs_result=None, safety_ratings=None)] create_time=None response_id=None model_version='models/gemini-2.5-flash-preview-05-20' prompt_feedback=None usage_metadata=GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=9, candidates_tokens_details=None, prompt_token_count=60, prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=60)], thoughts_token_count=163, tool_use_prompt_token_count=None, tool_use_prompt_tokens_details=None, total_token_count=232, traffic_type=None) automatic_function_calling_history=[] parsed=None
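
The `usage_metadata` at the end of this output is worth a second look, because it tells you what you are billed and rate-limited on. The short check below copies the four counts from the response above and confirms how they add up.

```python
# Token counts copied from the usage_metadata in the response shown above
prompt_token_count = 60        # system instruction + German sentence
candidates_token_count = 9     # the visible translation
thoughts_token_count = 163     # the model's internal "thinking"
total_token_count = 232

computed_total = prompt_token_count + candidates_token_count + thoughts_token_count
print(computed_total, computed_total == total_token_count)
```

In this particular response, the thinking tokens far outnumber the nine tokens of the actual translation, which is something to keep in mind when you estimate usage for a thinking model.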


In [None]:
# Add a new column for the translations
sampled_translations['gemini_translation'] = sampled_translations['german'].apply(gemini_translate)

# Display the updated DataFrame
sampled_translations
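
Because `.apply` sends one request per row, a longer sample can run into the free-tier rate limits mentioned at the beginning. A common remedy is to retry with exponential backoff. The sketch below is generic and runs without the API: `call_with_retries` and `flaky_translate` are made-up names, and in real code you would catch the specific rate-limit error from the client library rather than every `Exception`.

```python
import time


def call_with_retries(func, *args, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(*args); after a failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return func(*args)
        except Exception as error:  # assumption: narrow this to the rate-limit error
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the error surface
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({error}); retrying in {delay:.2f}s")
            sleep(delay)


calls = {"count": 0}


def flaky_translate(sentence):
    """Hypothetical stand-in for gemini_translate: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("rate limit exceeded")
    return f"translated: {sentence}"


result = call_with_retries(flaky_translate, "Guten Morgen", base_delay=0.01)
print(result)
```

In the notebook, this could be used as `sampled_translations['german'].apply(lambda s: call_with_retries(gemini_translate, s))`.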

In [None]:
# Evaluate Gemini translations against the reference translations
references = [sampled_translations['english1'].tolist(), sampled_translations['english2'].tolist()]
model_translations = sampled_translations['gemini_translation'].tolist()

# Initialize lists to store scores
bleu_scores = []
chrf_scores = []

# Loop through each row to evaluate BLEU and chrF sentence by sentence
for index, row in sampled_translations.iterrows():
    # sacrebleu expects one list per reference set, each with one sentence per hypothesis
    refs = [[row['english1']], [row['english2']]]
    hyp = [row['gemini_translation']]
    bleu, chrf = evaluate_translation(hyp, refs)
    bleu_scores.append(round(bleu.score, 2))
    chrf_scores.append(round(chrf.score, 2))

# Add scores as new columns
sampled_translations['bleu_score'] = bleu_scores
sampled_translations['chrf_score'] = chrf_scores

print(sampled_translations[['german', 'gemini_translation', 'bleu_score', 'chrf_score']])