# **Gemini API quickstart**

In [1]:
from dotenv import load_dotenv
import os
load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")

from google import genai

client = genai.Client(api_key=api_key)

response = client.models.generate_content(
    model="gemini-2.0-flash", contents="Explain how AI works in a few words"
)
print(response.text)

AI learns patterns from data to make predictions or decisions.
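
`os.getenv` returns `None` when the variable is missing, and the error then only shows up later, at request time. A minimal sketch of a fail-fast check, assuming the key lives in `GOOGLE_API_KEY` as in the cell above; `require_api_key` is a hypothetical helper, not part of the SDK:

```python
import os


def require_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    value = os.getenv(var_name)
    if not value:
        # Covers both an unset variable and an empty string.
        raise RuntimeError(
            f"{var_name} is not set; add it to your .env file or export it in your shell."
        )
    return value
```

Calling `api_key = require_api_key()` right after `load_dotenv()` would surface a missing key before any client is created.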



## Working with local images
If you use Pillow (the Python imaging library), you can pass PIL image objects directly in `contents`.

In [12]:
from google import genai
from google.genai import types

import PIL.Image

image = PIL.Image.open('data/interview.png')

client = genai.Client(api_key=api_key)
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["What is this image?", image])

print(response.text)

The image shows a person sitting at a desk using a computer. There are also books on the desk.


## Base64 encoded images
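You can send an image from a public URL by downloading it and passing its bytes inline, where it travels as a Base64 payload. The following example downloads the image with the third-party `requests` library (not part of the standard library) and wraps the bytes with `types.Part.from_bytes`: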

In [10]:
from google import genai
from google.genai import types

import requests

image_path = "https://static.toiimg.com/thumb/msid-118701222,imgsize-13452,width-400,resizemode-4/118701222.jpg"
image = requests.get(image_path)

client = genai.Client(api_key=api_key)
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=["What is this image?",
              types.Part.from_bytes(data=image.content, mime_type="image/jpeg")])

print(response.text)

The image shows a man with dark curly hair and a beard holding a microphone in his hands. He is wearing a black shirt and appears to be speaking or presenting something. The background is a solid black color.



In [None]:
from google import genai
from google.genai import types

import requests

image_path = "https://yt3.googleusercontent.com/ytc/AIdro_msUercxyRDA0NKQsaIS0IXMTCF_GId4oiTzQYyKbP4AI4=s900-c-k-c0x00ffffff-no-rj"
image = requests.get(image_path)

client = genai.Client(api_key=api_key)
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=["Can you identify who is in the image? He is a famous comedian.",
              types.Part.from_bytes(data=image.content, mime_type="image/jpeg")])

print(response.text)

That is comedian Samay Raina.
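
The cells above hand raw bytes to `types.Part.from_bytes`, so no manual encoding step appears in them. If you need the Base64 text yourself (for example, when building a JSON request by hand), the standard-library `base64` module is enough. A minimal sketch; `to_base64` is a hypothetical helper and the sample bytes are just the first four bytes of a typical JPEG header:

```python
import base64


def to_base64(raw: bytes) -> str:
    """Encode raw bytes as an ASCII Base64 string."""
    return base64.b64encode(raw).decode("ascii")


# First four bytes of a typical JPEG file (FF D8 FF E0).
sample = b"\xff\xd8\xff\xe0"
encoded = to_base64(sample)
print(encoded)  # /9j/4A==

# Decoding returns the original bytes unchanged.
assert base64.b64decode(encoded) == sample
```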


## Multiple images
To prompt with multiple images, pass them all in the `contents` list of a single `generate_content` call. They can be in any supported format, including inline bytes (Base64) or PIL images.

In [13]:
from google import genai
from google.genai import types

import pathlib
import PIL.Image
import requests

image_path_1 = "data/interview.png"  # Replace with the actual path to your first image
image_path_2 = "data/meditation.png" # Replace with the actual path to your second image

image_url_1 = "https://yt3.googleusercontent.com/ytc/AIdro_msUercxyRDA0NKQsaIS0IXMTCF_GId4oiTzQYyKbP4AI4=s900-c-k-c0x00ffffff-no-rj" # Replace with the actual URL to your third image

# First image: a PIL image object
pil_image = PIL.Image.open(image_path_1)

# Second image: raw bytes from disk; the MIME type must match the file (.png here)
b64_image = types.Part.from_bytes(
    data=pathlib.Path(image_path_2).read_bytes(),
    mime_type="image/png"
)

# Third image: downloaded bytes, wrapped in a Part (a raw Response object is not valid content)
downloaded_image = types.Part.from_bytes(
    data=requests.get(image_url_1).content,
    mime_type="image/jpeg"
)

client = genai.Client(api_key=api_key)
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=["What do these images have in common?",
              pil_image, b64_image, downloaded_image])

print(response.text)

Here's what the images have in common:

*   **Sitting at a Desk/Table:** Each image depicts a person engaging in an activity while seated at a desk or table.
*   **Learning/Study Related:** They all convey the theme of learning, studying, or engaging in intellectual activities.
*   **Modern/Digital Element:** There are hints of modern or digital elements in both images. The first image shows a person working on a computer and the second image depicts someone with headphones listening to an audiobook.
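
When images come from mixed sources, a hard-coded `mime_type` is easy to get wrong. A minimal sketch that derives it from the file extension with the standard-library `mimetypes` module; `guess_image_mime` is a hypothetical helper, not part of the SDK, and it only looks at the extension, not the file contents:

```python
import mimetypes


def guess_image_mime(path: str) -> str:
    """Guess an image MIME type from the file extension, or raise ValueError."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    return mime


print(guess_image_mime("data/interview.png"))  # image/png
print(guess_image_mime("photo.jpg"))           # image/jpeg
```

The result can be passed as the `mime_type` argument of `types.Part.from_bytes`.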


## Large image payloads
When the files and system instructions you intend to send total more than 20 MB, use the File API to upload the files instead of sending them inline.

**Note** - Use the media.upload method of the File API (called through `client.files.upload` in the Python SDK, as in the cell below) to upload an image of any size.
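
A rough way to decide between inline bytes and the File API is to add up the sizes before sending. A minimal sketch, assuming the limit means 20 × 1024 × 1024 bytes and ignoring any encoding overhead in the real request; `total_payload_bytes` and `needs_file_api` are hypothetical helpers, not SDK functions:

```python
from pathlib import Path

# Assumption: "20 MB" is treated as 20 * 1024 * 1024 bytes.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def total_payload_bytes(paths, text: str = "") -> int:
    """Sum the on-disk size of each file plus the UTF-8 size of the text."""
    file_bytes = sum(Path(p).stat().st_size for p in paths)
    return file_bytes + len(text.encode("utf-8"))


def needs_file_api(paths, text: str = "", limit: int = INLINE_LIMIT_BYTES) -> bool:
    """Return True when the estimated payload exceeds the inline limit."""
    return total_payload_bytes(paths, text) > limit
```

Because this is only an estimate, it is safer to switch to the File API some way below the limit rather than exactly at it.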

In [14]:
from google import genai

client = genai.Client(api_key=api_key)

img_path = "data/interview.png"
file_ref = client.files.upload(file=img_path)
print(f'{file_ref=}')

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=["What can you tell me about the objects in this image?",
              file_ref])

print(response.text)

file_ref=File(name='files/lm3hp6jwkp8p', display_name=None, mime_type='image/png', size_bytes=66555, create_time=datetime.datetime(2025, 4, 6, 20, 12, 32, 793926, tzinfo=TzInfo(UTC)), expiration_time=datetime.datetime(2025, 4, 8, 20, 12, 32, 766088, tzinfo=TzInfo(UTC)), update_time=datetime.datetime(2025, 4, 6, 20, 12, 32, 793926, tzinfo=TzInfo(UTC)), sha256_hash='NTk2NTdhNjUzZWQxZjg2YTgyZmQyYTc4NjRmNzkyZjA5YWFkMTkzNTI3NDZjMjg3ZDMxYjgwODk2YTlkYWJjMw==', uri='https://generativelanguage.googleapis.com/v1beta/files/lm3hp6jwkp8p', download_uri=None, state=<FileState.ACTIVE: 'ACTIVE'>, source=<FileSource.UPLOADED: 'UPLOADED'>, video_metadata=None, error=None)
The image depicts a desk setup which includes:

*   **A computer monitor:** A screen used for displaying information from a computer.
*   **A keyboard:** An input device used to type information into the computer.
*   **Books:** Stacks of books placed on the desk.
