# Images

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/9juhnke/llm-api-gwdg-saia/main?filepath=images.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/9juhnke/llm-api-gwdg-saia/blob/main/images.ipynb)

This notebook shows how to process images via the SAIA API (see [README.md](./README.md)).

In [None]:
# To install the required package, uncomment the next line
# !pip install openai

The API is compatible with the [OpenAI Image API](https://platform.openai.com/docs/guides/vision), so the official `openai` Python client can be used. The following minimal example sends a local image, encoded as a base64 string, to a vision model.

In [None]:
from openai import OpenAI
import base64

# API configuration
api_key = "<Your API_KEY>" # Insert your API Key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "qwen2.5-vl-72b-instruct" # Choose a model from: {gemma-3-27b-it, qwen2.5-vl-72b-instruct, internvl2.5-8b}

# Path to your image
image_path = "test-image.png"

# Start OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
    )

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
base64_image = encode_image(image_path)

# Get response
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user",  "content": [
            {"type": "text", "text": "What is in this image?"},
            # The MIME type in the data URL should match the file (test-image.png is a PNG)
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}},
            ],
        }
    ],
)
# Print only the response text; use print(response.model_dump_json(indent=2)) to see the full response as JSON
print(response.choices[0].message.content)

The image shows a collection of various musical instruments. Here is a list of the instruments visible:

1. **French Horn** - A brass instrument with a wide, circular body and a large flared bell.
2. **Violin** - A stringed instrument with a small body and four strings.
3. **Cello** - A larger stringed instrument with four strings, typically played while seated.
4. **Double Bass** - The largest and lowest-pitched bowed string instrument.
5. **Flute** - A woodwind instrument that is played by blowing across a hole in the mouthpiece.
6. **Clarinet** - A single-reed woodwind instrument.
7. **Trombone** - A brass instrument with a slide mechanism.
8. **Harp** - A large stringed instrument with a triangular frame and strings of varying lengths.
9. **Snare Drum** - A percussion instrument with a shallow, cylindrical body.
10. **Timpani (Kettle Drum)** - A type of drum with a large, deep bowl-shaped kettleshaped head.

These instruments are commonly used in classical orchestras and other musi
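
The data URL in the example above carries a MIME type next to the base64 payload. As a minimal sketch, the MIME type can be inferred from the file extension with the standard library instead of being written by hand. The helper name `image_to_data_url` is hypothetical and not part of the SAIA or OpenAI API, and whether the server actually checks the MIME type is not something this notebook confirms.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path: str) -> str:
    """Return a base64 data URL whose MIME type is guessed from the file extension.

    Hypothetical helper for illustration; it only inspects the extension,
    not the actual file content.
    """
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type from {image_path!r}")
    # Read the raw bytes and encode them as base64 text
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be passed as the `url` value of the `image_url` part, in place of the hand-built f-string.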

Alternatively, pass the URL of an image hosted on the web instead of a base64 string:

In [4]:
# Get response
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/a/ad/The_Panathenaic_Stadium_on_April_22%2C_2021.jpg"},},
            ],
        }
    ],
)

# Print only the response text; use print(response.model_dump_json(indent=2)) to see the full response as JSON
print(response.choices[0].message.content)

This image shows the Panathenaic Stadium, also known as the Panathinaiko Stadium, located in Athens, Greece. It is the only stadium in the world built entirely of marble and is located in the Kotzia. The stadium was used for the first modern Olympic Games in 1896 and has been used for special events and the finale of the Olympic torch relay until today. The stadium's U-shaped running track is clearly visible, along with its tiered stone seating and flagpoles surrounding the field. In the background, modern buildings, trees, and mountains can be seen under a clear sky.
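
Both requests in this notebook use the same message structure and differ only in the string passed as `url`: a base64 data URL for local files, or an `http(s)` URL for images on the web. The following sketch wraps that structure in a small function. The name `build_image_messages` is hypothetical and only illustrates the payload layout; it is not part of the `openai` package.

```python
def build_image_messages(prompt: str, image_url: str) -> list:
    """Build a single-turn chat payload with one text part and one image part.

    Hypothetical helper: `image_url` may be an http(s) URL or a base64 data URL.
    """
    if not image_url.startswith(("http://", "https://", "data:")):
        raise ValueError("image_url must be an http(s) URL or a data URL")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]
```

Assuming the `client` and `model` defined earlier in the notebook, the result can be passed directly as `client.chat.completions.create(model=model, messages=build_image_messages("What's in this image?", url))`.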
