# Images and Vision

## Basic Connection and Packages

### Importing OpenAI and Initializing the Client

To begin, we'll import the `OpenAI` class from the `openai` library, which allows us to interact with the OpenAI API. Next, we initialize a client instance, which we'll use to send requests and receive responses from the OpenAI models.

In [1]:
"""
A simple example of using the OpenAI API.
It uses the OpenAI Python client library to open a connection to the API.
The client reads the OPENAI_API_KEY environment variable to authenticate.
"""
from openai import OpenAI

client = OpenAI()
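
Because `OpenAI()` reads the key from the environment, a missing `OPENAI_API_KEY` only surfaces as an error later on. The sketch below checks for the key up front and passes it to the client explicitly. The helper names `get_api_key` and `make_client` are illustrative assumptions, not part of the `openai` library.

```python
import os


def get_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` is any mapping; it defaults to os.environ and exists only so the
    check can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating the client."
        )
    return key


def make_client():
    """Build a client with the key passed explicitly (hypothetical helper)."""
    # Imported here so the key check above works even without the package.
    from openai import OpenAI

    return OpenAI(api_key=get_api_key())
```

Calling `client = make_client()` then behaves like `OpenAI()`, but fails immediately with a readable message when the variable is absent.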

## Passing an Image URL
In the following code cell, we'll use **"gpt-4o"** to analyze an image. We pass the model an image URL along with the prompt *"Tell me what is in this image."* The model examines the picture and generates a descriptive response, which we then print. This demonstrates how the model can interpret visual content alongside text instructions.


<img src="https://upload.wikimedia.org/wikipedia/commons/5/53/202412_Taiwan_Railway_Haifeng_EMU500_Tourist_Train_at_Houlong_Station.jpg" width="512" height="512">



In [13]:

response = client.responses.create(
    model="gpt-4o",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Tell me what is in this image."},
            {
                "type": "input_image",
                "image_url": "https://upload.wikimedia.org/wikipedia/commons/5/53/202412_Taiwan_Railway_Haifeng_EMU500_Tourist_Train_at_Houlong_Station.jpg",
            },
        ],
    }],
)

print(response.output_text)

The image shows a light green train at a station platform. The train has two large windows and two smaller circular lights on the front. There are visible cables and mechanical parts on the lower front end. The word "HAIFENG" is written on the train. The platform is empty, and there are buildings and trees in the background.
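
The request above always has the same shape: one user message whose content pairs an `input_text` item with an `input_image` item. A minimal sketch of wrapping that shape in reusable functions follows. The names `build_image_input` and `describe_image` are assumptions introduced for illustration, and the default model simply mirrors the cell above.

```python
def build_image_input(prompt, image_url):
    """Return the `input` list for one user turn with a prompt and an image URL."""
    return [{
        "role": "user",
        "content": [
            {"type": "input_text", "text": prompt},
            {"type": "input_image", "image_url": image_url},
        ],
    }]


def describe_image(client, prompt, image_url, model="gpt-4o"):
    """Send the prompt and image to the model and return the response text.

    `client` is expected to be an `OpenAI()` instance like the one created earlier.
    """
    response = client.responses.create(
        model=model,
        input=build_image_input(prompt, image_url),
    )
    return response.output_text
```

With the client from the first cell, `print(describe_image(client, "Tell me what is in this image.", url))` would reproduce the call above for any image `url`.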


## Low vs High Resolution
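Now let's look at low versus high resolution. An `input_image` item accepts an optional `detail` field (`low`, `high`, or `auto`) that controls how much of the image's resolution the model processes. When the field is omitted, as in the cell below, the API defaults to `auto`.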

In [None]:
response = client.responses.create(
    model="gpt-4o",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Tell me what is in this image and where it was taken. Look up the location and tell me more about it."},
            {
                "type": "input_image",
                "image_url": "https://upload.wikimedia.org/wikipedia/commons/5/53/202412_Taiwan_Railway_Haifeng_EMU500_Tourist_Train_at_Houlong_Station.jpg",
            },
        ],
    }],
)

print(response.output_text)
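
To compare the two settings directly, the same request can be sent once with `"detail": "low"` and once with `"detail": "high"`. The sketch below is one way to do that, assuming the `detail` field on `input_image` and the `usage.input_tokens` field on the response. The helper names `build_image_input` and `compare_detail_levels` are hypothetical.

```python
ALLOWED_DETAIL = ("low", "high", "auto")


def build_image_input(prompt, image_url, detail="auto"):
    """Return the `input` list for one user turn, with an explicit detail level."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {ALLOWED_DETAIL}, got {detail!r}")
    return [{
        "role": "user",
        "content": [
            {"type": "input_text", "text": prompt},
            {"type": "input_image", "image_url": image_url, "detail": detail},
        ],
    }]


def compare_detail_levels(client, prompt, image_url, model="gpt-4o"):
    """Run the same prompt at low and high detail and collect text and token use."""
    results = {}
    for detail in ("low", "high"):
        response = client.responses.create(
            model=model,
            input=build_image_input(prompt, image_url, detail),
        )
        results[detail] = {
            "text": response.output_text,
            # Input token count, useful for seeing the cost of each setting.
            "input_tokens": response.usage.input_tokens,
        }
    return results
```

Printing `results["low"]` next to `results["high"]` makes it easy to see whether the extra resolution changes the description, for example whether small text on the train is still read correctly.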