In [1]:
!pip install openai

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")



# 🖼️ Using an Image with GPT (from the internet)

With the `gpt-4o` model, we can send **both text and images** to GPT in the same request.

In this example, we send:
- A **text prompt**: *"Give me the name of the animal in the image."*
- A **URL pointing to an image** from the web

The model will analyze the image and respond with the most likely answer.

In [8]:
url = "https://upload.wikimedia.org/wikipedia/commons/f/f0/Ophiopteris_antipodum.JPG"

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "user",
            "content":
            [
                {
                    "type": "text",
                    "text": """
                    Return a JSON object with a single key "name" containing
                    the name of the animal in the image.
                    No other text or explanation."""
                },
                {
                    "type": "image_url",
                    "image_url": {"url": url}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

{
    "name": "Brittle Star"
}
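
Because the reply is a JSON string, it can be turned into a Python dictionary with the standard `json` module. The sketch below assumes the reply uses the key `"name"`, as in the output above; the model chooses the key unless the prompt fixes it, so the helper `extract_name` (a hypothetical name, not part of the OpenAI library) returns `None` when the key is missing or the text is not valid JSON.

```python
import json

# Sample reply copied from the output above. In the notebook you would pass
# response.choices[0].message.content instead.
raw_reply = '{"name": "Brittle Star"}'


def extract_name(reply_text):
    """Return the "name" field of a JSON reply, or None if it cannot be read."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        # The reply was not valid JSON.
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not an object (for example a list or a string).
        return None
    # Assumption: the model used the key "name", as in the output above.
    return data.get("name")


animal = extract_name(raw_reply)
print(animal)
```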


# 🖼️ Using Multiple Images with GPT (from the internet)

In [None]:
url1 = "https://upload.wikimedia.org/wikipedia/commons/f/f0/Ophiopteris_antipodum.JPG"
url2 = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4c/Echinaster_sepositus_Linosa_092.jpg/2560px-Echinaster_sepositus_Linosa_092.jpg"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content":
            [
                {
                    "type": "text",
                    "text": "Do both images show the same kind of animal?"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": url1}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": url2}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

No, these are not the same animals. The first image shows a brittle star, characterized by its long, slender, flexible arms. The second image shows a sea star (or starfish), which has thicker, less flexible arms. Both belong to the echinoderm phylum but are different types of animals.
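
The same pattern extends to any number of images: the `content` list holds one text part followed by one `image_url` part per image. As a minimal sketch, the helper below builds that list from a prompt and a list of URLs. The name `build_image_content` is hypothetical and only used for illustration.

```python
def build_image_content(prompt, image_urls):
    """Build the "content" list for a user message: one text part, then one part per image."""
    content = [{"type": "text", "text": prompt}]
    for image_url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return content


# The two image URLs used in the cell above.
urls = [
    "https://upload.wikimedia.org/wikipedia/commons/f/f0/Ophiopteris_antipodum.JPG",
    "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4c/Echinaster_sepositus_Linosa_092.jpg/2560px-Echinaster_sepositus_Linosa_092.jpg",
]

# This list can be passed as messages=... to client.chat.completions.create.
messages = [
    {
        "role": "user",
        "content": build_image_content("Do both images show the same kind of animal?", urls),
    }
]

print(len(messages[0]["content"]))
```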


# 🖼️ Using an Image with GPT (from a local file)

You can also ask ChatGPT questions about **images stored on your computer**, not just ones from a URL.

To have a local file to work with, run the following cell. It downloads the image from the internet and saves it as `downloaded_image.jpg`.

In [2]:
from PIL import Image
import requests
from io import BytesIO
from IPython.display import Image as IPImage, display

image_url = "https://upload.wikimedia.org/wikipedia/commons/f/f0/Ophiopteris_antipodum.JPG"

# Add headers to make it look like a browser
headers = {
    "User-Agent": "Mozilla/5.0"
}

response = requests.get(image_url, headers=headers)

# Check that the request succeeded and that the content type is an image
if response.ok and "image" in response.headers.get("Content-Type", ""):
    img = Image.open(BytesIO(response.content))

    # Save the image locally
    local_filename = "downloaded_image.jpg"
    img.save(local_filename)

    # Display the saved image inline in Colab and confirm the save
    display(IPImage(filename=local_filename))
    print(f"✅ Image saved as '{local_filename}'")

else:
    print("⚠️ The content is not an image. Check the URL.")


✅ Image saved as 'downloaded_image.jpg'


The code below transforms the local image into a special format that ChatGPT can understand.

🧠 **You don't need to understand how it works** — just run the cell!

It converts the image into a text-based format (called **base64**) so it can be sent directly in the API call.


In [3]:
import base64

# Encode image as base64
image_path = "downloaded_image.jpg"
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode("utf-8")
data_url = f"data:image/jpeg;base64,{base64_image}"
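
The cell above hard-codes `image/jpeg` in the data URL, which is correct for this file. If you want to reuse the idea with other image types, one possible sketch is a small helper that guesses the MIME type from the file extension with the standard `mimetypes` module. The name `file_to_data_url` and its `default_mime` parameter are hypothetical, chosen for this example only.

```python
import base64
import mimetypes


def file_to_data_url(path, default_mime="image/jpeg"):
    """Read a local file and return it as a base64 data URL."""
    # Guess the MIME type from the file extension (for example .png -> image/png).
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        # Assumption: fall back to JPEG when the extension is unknown.
        mime_type = default_mime
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Example usage in the notebook (requires the file saved by the earlier cell):
# data_url = file_to_data_url("downloaded_image.jpg")
```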

The **image** is now encoded as a data URL and can be sent to GPT in place of a web URL.

In [4]:


# Send the image and prompt
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Give me the name of the animal in the image."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]
)

# Print the model's response
print(response.choices[0].message.content)

This is a black brittle star, a type of echinoderm similar to a starfish. It is known for its long, flexible arms that can move independently.
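
Since the request looks the same for a web URL and for a base64 data URL, the call can be wrapped in one reusable function. This is only a sketch: `ask_about_image` is a hypothetical helper name, and it takes the `client` created at the top of the notebook as an argument.

```python
def ask_about_image(client, prompt, image_ref, model="gpt-4o"):
    """Send one text prompt and one image (web URL or data URL) and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_ref}},
                ],
            }
        ],
    )
    return response.choices[0].message.content


# Example usage in the notebook (needs a valid API key and the variables defined earlier):
# print(ask_about_image(client, "Give me the name of the animal in the image.", data_url))
```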


# ✏️ Practice Questions

1. **Try it with a different image**

   🔍 Find another image on the internet — it can be an animal, an object, a famous place, etc.  
   📸 Copy the image URL and replace the value of `url` in the code.  
   ✏️ Then, change the question to something relevant, like:

   - "What is shown in this photo?"
   - "What kind of bird is this?"
   - "Is this a historical building?"

   👉 Run the code and see how ChatGPT responds!

   For example, you could start with this image: https://upload.wikimedia.org/wikipedia/commons/f/f2/Mercedes-Benz_W115_front_20080816.jpg