In [1]:
import os
import openai
from dotenv import load_dotenv

In [2]:
load_dotenv()  # Load environment variables from the .env file

# Fail early if the API key is missing
api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    raise ValueError("OPENAI_API_KEY environment variable not set")

# Pass the key explicitly; this client can be reused by the cells below
client = openai.OpenAI(api_key=api_key)

# Analyze images
Vision is a model's ability to "see" and understand images. If an image contains text, the model can read and interpret that text as well. It can recognize most visual elements, including objects, shapes, colors, and textures, although some limitations apply.

We can provide images as input to generation requests in multiple ways:
- By providing a fully qualified URL to an image file
- By providing an image as a Base64-encoded data URL
- By providing a file ID (created with the Files API)

To send multiple images in a single request, include several image entries in the content array. Keep in mind that images count as tokens and **will be billed accordingly**.
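As a minimal sketch of the multi-image case, the helper below builds a content array with one text entry followed by one `input_image` entry per URL. The helper name `build_image_content` and the `example.com` URLs are hypothetical placeholders, not part of the original notebook. The optional `detail` field is assumed to accept `"low"`, `"high"`, or `"auto"`; check the current API documentation before relying on it.

```python
def build_image_content(prompt, image_urls, detail=None):
    """Build a content array: one text entry, then one entry per image URL."""
    content = [{"type": "input_text", "text": prompt}]
    for url in image_urls:
        item = {"type": "input_image", "image_url": url}
        if detail is not None:
            # Assumption: "detail" controls image resolution handling
            item["detail"] = detail
        content.append(item)
    return content


# Placeholder URLs for illustration only
urls = [
    "https://example.com/first.jpg",
    "https://example.com/second.jpg",
]

message = {
    "role": "user",
    "content": build_image_content("What do these images have in common?", urls),
}
```

The resulting `message` could then be passed as `input=[message]` to `client.responses.create`, in the same way as the single-image requests in this notebook. Each extra image adds to the token count of the request.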

## Image as URL

In [3]:
client = openai.OpenAI()

response = client.responses.create(
    model="gpt-4.1-mini",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "what's in this image?"},
            {
                "type": "input_image",
                "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
            },
        ],
    }],
)

print(response.output_text)

The image shows a scenic landscape with a wooden boardwalk path running through a lush green field. The field is filled with tall grass and some scattered bushes. In the distance, there are more trees and shrubs. The sky above is wide and blue, with wispy clouds spread across it, suggesting a calm and pleasant day. The overall scene is bright, vibrant, and tranquil.


## Image as Base64

In [None]:
import base64

client = openai.OpenAI()

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Path to your image
image_path = "arquivos/otter.png"

# Getting the Base64 string
base64_image = encode_image(image_path)

response = client.responses.create(
    model="gpt-4.1-nano",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what's in this image?" },
                {
                    "type": "input_image",
                    # The MIME type in the data URL must match the file (otter.png is a PNG)
                    "image_url": f"data:image/png;base64,{base64_image}",
                },
            ],
        }
    ],
)

print(response.output_text)

This image is an illustration depicting a veterinarian and a young boy using stethoscopes to check the health of a small otter. The veterinarian is wearing a white coat, and both he and the boy have friendly expressions as they care for the animal. The scene takes place indoors, with a plant visible in the background.
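The MIME type in a data URL should match the actual image format. A minimal sketch of one way to avoid hard-coding it, assuming the file extension reflects the real format: the standard-library `mimetypes` module guesses the type from the file name. The helper name `make_data_url` is hypothetical and not part of the original notebook.

```python
import base64
import mimetypes


def make_data_url(data: bytes, filename: str) -> str:
    """Build a Base64 data URL, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

With a real file, this could be called as `make_data_url(open(image_path, "rb").read(), image_path)` and the result used as the `image_url` value. Guessing from the extension does not inspect the file contents, so a misnamed file would still produce a mismatched prefix.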


## Image as File ID

In [5]:
client = openai.OpenAI()

# Function to upload a file with the Files API and return its ID
def create_file(file_path):
    with open(file_path, "rb") as file_content:
        result = client.files.create(
            file=file_content,
            purpose="vision",
        )
        return result.id

# Getting the file ID
file_id = create_file("arquivos/otter.png")

response = client.responses.create(
    model="gpt-4.1-nano",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "what's in this image?"},
            {
                "type": "input_image",
                "file_id": file_id,
            },
        ],
    }],
)

print(response.output_text)

The image depicts a veterinarian examining a small animal, which appears to be a ferret. The veterinarian is using a stethoscope to listen to the animal's heart, and a young boy is holding the ferret. The setting looks like a veterinary clinic or an examination room.

