# Introduction to Gemini

The Gemini API can generate many kinds of output, such as text, images, video, and audio.

## Text generation

In [28]:
from google import genai

# The client gets the API key from the environment variable `GEMINI_API_KEY`.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Generate some funny jokes about data engineering. Give 5 points",
)
print(response.text)

Here are 5 funny jokes about data engineering:

1.  **Why did the data engineer get kicked out of the orchestra?**
    He kept yelling, "Your pipes are broken, and the whole show is failing!"

2.  **A data engineer's favorite bedtime story starts with...**
    "Once upon a time, in a distributed system far, far away, a single byte went missing, and the entire downstream report was off by 0.0001%..."

3.  **My therapist told me I need to confront the things that scare me.**
    So, I walked into the office, stared at our production database's `ALTER TABLE` statement, and screamed.

4.  **What's a data engineer's worst nightmare?**
    Being told the "data lake" is actually just a folder on someone's desktop named "misc\_stuff\_final\_final\_v2.zip".

5.  **How many data engineers does it take to change a lightbulb?**
    None. They'll build an automated pipeline to detect darkness, trigger a JIRA ticket for the facilities team, and then spend three days debugging why the sensor reported

In [58]:
response.__dict__.keys()

dict_keys(['sdk_http_response', 'candidates', 'create_time', 'model_version', 'prompt_feedback', 'response_id', 'usage_metadata', 'automatic_function_calling_history', 'parsed'])

### Analyze tokens

Tokens are the basic units of text that LLMs use to process and generate language. An LLM splits text into these smaller units; as a rough simplification, you can think of one word as one token. Tokens are also what you pay for when you use the API.
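To see how many tokens a prompt contains before sending it, the SDK offers a `count_tokens` call on `client.models`. The sketch below wraps it in a small helper. The helper name `count_prompt_tokens` is hypothetical, and it assumes you pass in a `genai.Client()` like the one created earlier.

```python
def count_prompt_tokens(client, prompt, model="gemini-2.5-flash"):
    """Ask the API how many tokens `prompt` contains, without generating a reply.

    `client` is assumed to be a google-genai client, e.g. genai.Client().
    This helper is illustrative and not part of the SDK.
    """
    result = client.models.count_tokens(model=model, contents=prompt)
    return result.total_tokens
```

With the client from the first cell, `count_prompt_tokens(client, "Generate some funny jokes about data engineering. Give 5 points")` would return the prompt's token count as an integer.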

The free tier of the Gemini API allows the following for Gemini 2.5 Flash, as of 2025-11-11:

- Requests per minute (RPM): 10
- Tokens per minute (TPM): 250 000
- Requests per day (RPD): 250

[Check documentation here for latest limits](https://ai.google.dev/gemini-api/docs/rate-limits)

It is possible to upgrade to higher tiers, which offer more generous rate limits but come at a higher cost.
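One way to stay under a requests-per-minute limit is to throttle calls on the client side. The following is a minimal sketch of a sliding-window limiter. The class `RequestThrottle` is a hypothetical helper, not part of the SDK, and it only tracks request counts, not tokens per minute or requests per day.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window request limiter (illustrative helper, not an SDK class)."""

    def __init__(self, max_requests=10, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of recent requests, oldest first

    def wait_for_slot(self):
        """Block until another request fits in the window, then record it."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest recorded request falls out of the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
```

Calling `throttle.wait_for_slot()` right before each `client.models.generate_content(...)` call would keep a loop of requests at or below 10 per minute with the default settings.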

In [29]:
metadata = response.usage_metadata
metadata

GenerateContentResponseUsageMetadata(
  candidates_token_count=262,
  prompt_token_count=13,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=13
    ),
  ],
  thoughts_token_count=972,
  total_token_count=1247
)

In [30]:
from pydantic import BaseModel

# GenerateContentResponseUsageMetadata is a pydantic model, which means we can
# use the dot operator to access its attributes
isinstance(metadata, BaseModel)

True

In [31]:
print("Output tokens - number of tokens in the model's response")
print(f"{metadata.candidates_token_count = }")

Output tokens - number of tokens in the model's response
metadata.candidates_token_count = 262


In [32]:
print("Tokens in user input")
print(f"{metadata.prompt_token_count = }")

Tokens in user input
metadata.prompt_token_count = 13


In [None]:
# a lot of tokens used here
print("Tokens used for internal thinking")
print(f"{metadata.thoughts_token_count = }")

Tokens used for internal thinking
metadata.thoughts_token_count = 972


In [None]:
print("Total tokens used - this is the billing number")
print(f"{metadata.total_token_count = }")

Total tokens used - this is the billing number
metadata.total_token_count = 1247
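For this response, the total is the sum of the prompt, output, and thinking tokens. The sketch below checks that with a stand-in object holding the numbers printed above. The helper `sum_token_counts` is hypothetical, and the identity may not hold for requests that involve other token categories.

```python
from types import SimpleNamespace


def sum_token_counts(metadata):
    """Add prompt, output, and thinking tokens; a missing count is treated as 0."""
    parts = (
        metadata.prompt_token_count,
        metadata.candidates_token_count,
        metadata.thoughts_token_count,
    )
    return sum(count or 0 for count in parts)


# Stand-in for response.usage_metadata, using the numbers printed above.
example = SimpleNamespace(
    prompt_token_count=13,
    candidates_token_count=262,
    thoughts_token_count=972,
    total_token_count=1247,
)
print(sum_token_counts(example) == example.total_token_count)  # True
```

The same helper should accept the real `response.usage_metadata` object, since it only reads those three attributes.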


### Thinking 

- The thinking process consumes many tokens and also takes a long time. If you want quicker answers and can accept less thinking, set the `thinking_budget`; a value of 0 turns thinking off for this model.


In [44]:
from google.genai import types

# The client gets the API key from the environment variable `GEMINI_API_KEY`.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Generate some funny jokes about data engineering. Give 5 points",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)
    ),
)
print(response.text)

Here are 5 funny jokes about data engineering:

1. **Why did the data engineer break up with their partner?** Because they kept bringing up old, unstructured data from their past, and the engineer just couldn't process it anymore.

2. **What's a data engineer's favorite type of music?** Anything with a good *flow*... especially if it's orchestrated.

3. **My doctor told me I needed to reduce my stress, so I became a data engineer.** Now, instead of worrying about my own problems, I worry about data pipelines failing at 3 AM. Much better!

4. **A data engineer walks into a bar.** The bartender asks, "What can I get for you?" The engineer replies, "Just give me all your raw ingredients, and I'll figure out how to make a beer that actually works tomorrow morning... maybe."

5. **You know you're a data engineer when...** you see a beautifully organized spreadsheet and your first thought isn't "Wow," but "I bet I could automate that into a robust, scalable ELT process."


In [61]:
print(repr(response.usage_metadata))

print("Ah much cheaper, but is the result as good as the thinking?")

GenerateContentResponseUsageMetadata(
  candidates_token_count=233,
  prompt_token_count=13,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=13
    ),
  ],
  total_token_count=246
)
Ah much cheaper, but is the result as good as the thinking?
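To put the difference in numbers, the two runs can be compared directly. This is a small sketch using the token counts copied from the two `usage_metadata` outputs in this notebook. The dictionary layout and variable names are illustrative only.

```python
# Token counts copied from the two usage_metadata outputs in this notebook.
with_thinking = {"prompt": 13, "output": 262, "thoughts": 972, "total": 1247}
# thoughts_token_count was absent from the second response, so it is entered as 0.
without_thinking = {"prompt": 13, "output": 233, "thoughts": 0, "total": 246}

saved = with_thinking["total"] - without_thinking["total"]
thinking_share = with_thinking["thoughts"] / with_thinking["total"]

print(f"Tokens saved by disabling thinking: {saved}")
print(f"Share of the first run spent on thinking: {thinking_share:.0%}")
```

For these two runs, the saving is 1001 tokens, and about 78% of the first run's tokens went to thinking. The numbers say nothing about answer quality, which still has to be judged by reading the two outputs.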


## System instructions