# OpenAI Model Discovery and Multimodal Capability Check

This notebook demonstrates how to:

1. Configure network access (proxy)
2. Load an OpenAI API key from a `.env` file
3. Query all OpenAI models available to the account
4. Test programmatically whether a given model accepts image input

The goal is **inspection and understanding**.

In [1]:
import os

# DWD proxy
os.environ["HTTP_PROXY"]  = "http://ofsquid.dwd.de:8080"
os.environ["HTTPS_PROXY"] = "http://ofsquid.dwd.de:8080"

# Also set the lowercase variants: some tools (e.g. curl) only read http_proxy in lowercase
os.environ["http_proxy"]  = os.environ["HTTP_PROXY"]
os.environ["https_proxy"] = os.environ["HTTPS_PROXY"]
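
To confirm that the proxy settings are visible to Python, one option is to ask the standard library which proxies it picks up from the environment. The following is a minimal sketch: the helper name `set_proxy` and the example URL are placeholders, not part of the original notebook.

```python
import os
import urllib.request


def set_proxy(proxy_url):
    """Set all four proxy variables and return what urllib detects.

    Hypothetical helper for illustration only.
    """
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = proxy_url
    # getproxies() returns a dict mapping scheme names to proxy URLs.
    # When proxy environment variables are set, they are used as the source.
    return urllib.request.getproxies()


if __name__ == "__main__":
    # Placeholder URL; substitute the proxy of your own network.
    print(set_proxy("http://proxy.example:8080"))
```

If the `http` and `https` entries show the expected URL, libraries that honour these environment variables should route their requests through the proxy.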

In [2]:
from dotenv import load_dotenv
import os
from openai import OpenAI

# Load environment variables from .env
load_dotenv()

# Create OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

print("API key loaded:", bool(os.getenv("OPENAI_API_KEY")))

API key loaded: True


In [3]:
# List all models accessible to this OpenAI account
models = client.models.list()

print("Available models:\n")
for m in models.data:
    print(" -", m.id)

Available models:

 - gpt-4-0613
 - gpt-4
 - gpt-3.5-turbo
 - chatgpt-image-latest
 - gpt-4o-mini-tts-2025-03-20
 - gpt-4o-mini-tts-2025-12-15
 - gpt-realtime-mini-2025-12-15
 - gpt-audio-mini-2025-12-15
 - davinci-002
 - babbage-002
 - gpt-3.5-turbo-instruct
 - gpt-3.5-turbo-instruct-0914
 - dall-e-3
 - dall-e-2
 - gpt-4-1106-preview
 - gpt-3.5-turbo-1106
 - tts-1-hd
 - tts-1-1106
 - tts-1-hd-1106
 - text-embedding-3-small
 - text-embedding-3-large
 - gpt-4-0125-preview
 - gpt-4-turbo-preview
 - gpt-3.5-turbo-0125
 - gpt-4-turbo
 - gpt-4-turbo-2024-04-09
 - gpt-4o
 - gpt-4o-2024-05-13
 - gpt-4o-mini-2024-07-18
 - gpt-4o-mini
 - gpt-4o-2024-08-06
 - chatgpt-4o-latest
 - gpt-4o-audio-preview
 - gpt-4o-realtime-preview
 - omni-moderation-latest
 - omni-moderation-2024-09-26
 - gpt-4o-realtime-preview-2024-12-17
 - gpt-4o-audio-preview-2024-12-17
 - gpt-4o-mini-realtime-preview-2024-12-17
 - gpt-4o-mini-audio-preview-2024-12-17
 - o1-2024-12-17
 - o1
 - gpt-4o-mini-realtime-preview
 - gpt
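
A flat list of this length is hard to scan. As a rough sketch, the IDs can be grouped by a leading family prefix. The prefix list and the helper name `group_by_family` are assumptions chosen for illustration; they are not an official taxonomy of OpenAI models.

```python
from collections import defaultdict

# Assumed prefixes, ordered so that longer ones ("gpt-4o") match before
# shorter ones ("gpt-4").
FAMILY_PREFIXES = ("gpt-4o", "gpt-4", "gpt-3.5", "o1", "dall-e", "tts", "text-embedding")


def group_by_family(model_ids, prefixes=FAMILY_PREFIXES):
    """Group model IDs by the first matching prefix; unmatched IDs go to 'other'."""
    groups = defaultdict(list)
    for model_id in sorted(model_ids):
        family = next((p for p in prefixes if model_id.startswith(p)), "other")
        groups[family].append(model_id)
    return dict(groups)


if __name__ == "__main__":
    sample = ["gpt-4o-mini", "gpt-4", "dall-e-3", "davinci-002", "tts-1-hd"]
    for family, ids in group_by_family(sample).items():
        print(f"{family}: {', '.join(ids)}")
```

With the client from this notebook, the call would be `group_by_family(m.id for m in models.data)`.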

### Loading and Encoding an Image for Multimodal Input

OpenAI multimodal models expect images to be provided as URLs or
base64-encoded image data.

In this notebook, the radar reflectivity image is:
1. loaded from disk,
2. encoded as base64,
3. embedded into a data URL (`data:image/png;base64,...`).

This representation can be passed directly to vision-enabled models.

In [4]:
import base64

# Path to the image used for multimodal testing
image_path = "radar_map_germany.png"

# Read image and encode as base64 string
with open(image_path, "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

print("Image successfully loaded and encoded.")


Image successfully loaded and encoded.
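
The cell above stores only the base64 string, and the `data:image/png;base64,` prefix is added later. For images that are not PNG files, the MIME type in the prefix has to match the file. A possible sketch that builds the complete data URL in one step is shown below; the helper name `to_data_url` is hypothetical, and the MIME type is guessed from the file extension only.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path):
    """Encode an image file as a data URL (hypothetical helper).

    The MIME type is guessed from the file extension, not from the
    file contents.
    """
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognised image type: {path}")
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"
```

For the file used in this notebook, `to_data_url("radar_map_germany.png")` would return a string starting with `data:image/png;base64,`.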


### Testing Multimodal Capabilities of OpenAI Models

The following function tests whether a given OpenAI model can process image inputs.

It sends a minimal multimodal request consisting of:
- a short text prompt, and
- a base64-encoded PNG image.

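If the model supports image input, it returns a textual description of the image.
If not, the API rejects the request and the resulting exception is caught and printed.

Because every exception is reported the same way, a failure can also stem from a
network, proxy, or authentication problem. The approach is therefore meant only for
**demonstration and inspection**, not for automatic capability discovery.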

In [5]:
def test_model(model_name, image_b64):
    """
    Test whether a given OpenAI model supports image input.

    Parameters
    ----------
    model_name : str
        Name of the OpenAI model to test.
    image_b64 : str
        Base64-encoded PNG image.

    The function sends a minimal multimodal request consisting of:
    - a short text instruction
    - an image passed via image_url

    If the model supports vision input, a textual description is printed.
    If not, the API raises an error which is caught and reported.
    """
    try:
        response = client.chat.completions.create(
            model=model_name,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Describe this image briefly."},
                    {"type": "image_url",
                     "image_url": {
                         "url": f"data:image/png;base64,{image_b64}"
                     }}
                ]
            }],
            max_tokens=120,
        )

        print(f"\n[{model_name}]")
        print(response.choices[0].message.content)

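    except Exception as e:
        # Any failure lands here: unsupported image input, but also
        # network, proxy, or authentication errors.
        print(f"\n[{model_name}] FAILED:", e)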


In [6]:
# Run the test on a model that accepts image input
test_model("gpt-5-chat-latest", image_b64)


[gpt-5-chat-latest]
This image is a weather radar map titled “DWD Radar Reflectivity with State Borders.” It shows radar reflectivity data over Central Europe, including Germany and surrounding countries. Different colors represent varying levels of reflectivity (in dBZ), with the color scale on the right ranging from 0 (dark blue) to 40 (dark red). The areas with green, yellow, and red colors indicate stronger radar returns, likely corresponding to regions of heavier precipitation or storms.


In [7]:
# Now test a model that does not support image input
test_model("gpt-3.5-turbo", image_b64)


[gpt-3.5-turbo] FAILED: Error code: 400 - {'error': {'message': 'Invalid content type. image_url is only supported by certain models.', 'type': 'invalid_request_error', 'param': 'messages.[0].content.[1].type', 'code': None}}
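
The error above carries HTTP status 400, which points to a rejected request rather than a connectivity problem. A hedged sketch for telling such cases apart is given below. It assumes that HTTP-level errors raised by the OpenAI Python SDK expose a `status_code` attribute and that connection errors do not. The function name `classify_failure` and its labels are made up for illustration.

```python
def classify_failure(exc):
    """Map an exception to a coarse failure label (hypothetical helper).

    Assumption: exceptions for HTTP error responses have a `status_code`
    attribute; exceptions without one never received a response.
    """
    status = getattr(exc, "status_code", None)
    if status is None:
        return "no_response"      # e.g. network or proxy problem
    if status == 400:
        return "bad_request"      # e.g. unsupported image input
    if status in (401, 403):
        return "auth_error"       # invalid key or missing permission
    if status == 404:
        return "model_not_found"  # unknown or inaccessible model
    if status == 429:
        return "rate_limited"
    return f"http_{status}"
```

Only the `bad_request` label hints that the model itself refused the image. Even then, the error message should be read before drawing a conclusion, since other malformed requests also produce status 400.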
