# Chat with vision models

**If you're looking for the web application, check the src/ folder.**

This notebook is provided only for manual experimentation with the vision model.

## Authenticate to OpenAI

The following code creates an OpenAI client that connects to an Azure OpenAI account, GitHub Models, or a local Ollama model, depending on the `OPENAI_HOST` environment variable (default: `github`). See the README for instructions on configuring the `.env` file.

In [22]:
import os

import azure.identity
import openai
from dotenv import load_dotenv

load_dotenv(".env", override=True)

openai_host = os.getenv("OPENAI_HOST", "github")
model_name = os.getenv("OPENAI_MODEL", "gpt-4o")

if openai_host == "local":
    print("Using local OpenAI-compatible API with no key")
    openai_client = openai.OpenAI(api_key="no-key-required", base_url=os.environ["LOCAL_OPENAI_ENDPOINT"])
elif openai_host == "github":
    print("Using GitHub Models with GITHUB_TOKEN as key")
    openai_client = openai.OpenAI(
        api_key=os.environ["GITHUB_TOKEN"],
        base_url="https://models.inference.ai.azure.com",
    )
elif openai_host == "azure" and os.getenv("AZURE_OPENAI_KEY_FOR_CHATVISION"):
    # Authenticate using an Azure OpenAI API key
    # This is generally discouraged, but is provided as a convenience
    print("Using Azure OpenAI with key")
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY_FOR_CHATVISION"],
    )
elif openai_host == "azure" and os.getenv("AZURE_OPENAI_ENDPOINT"):
    tenant_id = os.environ["AZURE_TENANT_ID"]
    print("Using Azure OpenAI with Azure Developer CLI credential for tenant id", tenant_id)
    default_credential = azure.identity.AzureDeveloperCliCredential(tenant_id=tenant_id)
    token_provider = azure.identity.get_bearer_token_provider(
        default_credential, "https://cognitiveservices.azure.com/.default"
    )
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        azure_ad_token_provider=token_provider,
    )
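else:
    # Fail early instead of hitting a NameError on openai_client in later cells
    raise ValueError(f"Unsupported or incomplete configuration for OPENAI_HOST={openai_host!r}")
print(f"Using model {model_name}")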

Using GitHub Models with GITHUB_TOKEN as key
Using model gpt-4o
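
The authentication cell reads several environment variables, and a missing one surfaces as a `KeyError` partway through. As a minimal sketch, a check like the following could report what is missing before any client is built. The helper name `missing_env_vars` and the `REQUIRED_VARS` table are hypothetical and derived only from the variables that cell reads. The `azure` entry assumes keyless authentication; with key authentication, `AZURE_OPENAI_KEY_FOR_CHATVISION` would take the place of `AZURE_TENANT_ID`.

```python
import os

# Hypothetical table: variables the authentication cell reads for each host.
# The "azure" entry assumes keyless (Azure Developer CLI credential) auth.
REQUIRED_VARS = {
    "local": ["LOCAL_OPENAI_ENDPOINT"],
    "github": ["GITHUB_TOKEN"],
    "azure": ["AZURE_OPENAI_ENDPOINT", "AZURE_TENANT_ID"],
}


def missing_env_vars(host, env=None):
    """Return the names of required variables that are unset or empty for `host`."""
    env = os.environ if env is None else env
    if host not in REQUIRED_VARS:
        raise ValueError(f"Unknown OPENAI_HOST: {host!r}")
    return [name for name in REQUIRED_VARS[host] if not env.get(name)]


# Example with explicit dictionaries instead of the real environment
print(missing_env_vars("github", env={"GITHUB_TOKEN": "example-token"}))
print(missing_env_vars("azure", env={"AZURE_OPENAI_ENDPOINT": "https://example.invalid"}))
```

Calling `missing_env_vars(openai_host)` right after `load_dotenv` would check the real environment.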


## Send an image by URL

In [None]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Is this a unicorn?", "type": "text"},
            {
                "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg"},
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=model_name, messages=messages, temperature=0.5)

print(response.choices[0].message.content)

No, this is not a unicorn. This is an illustration of an aurochs, an extinct species of large wild cattle that once roamed Europe, Asia, and North Africa. Unicorns are mythical creatures typically depicted with a single horn on their forehead, while this animal clearly has two horns.
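
Every request in this notebook uses the same message shape: one user message whose `content` list pairs a `text` part with an `image_url` part. As a rough sketch, a small helper could build that structure; the name `build_image_message` is hypothetical and not part of the OpenAI SDK.

```python
def build_image_message(prompt, image_url):
    """Build one user message that pairs a text prompt with an image URL or data URI."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


message = build_image_message(
    "Is this a unicorn?",
    "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg",
)
print(message["content"][0]["text"])
print(message["content"][1]["image_url"]["url"])
```

The result can be passed as `messages=[message]` to `openai_client.chat.completions.create`, just like the hand-written list above.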


## Send an image by data URI

In [14]:
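import base64
import mimetypes


def open_image_as_base64(filename):
    """Read an image file and return it as a base64-encoded data URI."""
    # Guess the MIME type from the file extension so that JPEGs are not labeled as PNG
    mime_type = mimetypes.guess_type(filename)[0] or "image/png"
    with open(filename, "rb") as image_file:
        image_data = image_file.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")
    return f"data:{mime_type};base64,{image_base64}"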
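
A data URI embeds the file contents directly in the URL string: a `data:` prefix, a media type, the `;base64,` marker, and then the base64-encoded bytes. The following sketch builds one from a few in-memory bytes and decodes it again, to show the format without needing an image file. The stand-in bytes are only the 8-byte PNG signature, not a complete image.

```python
import base64

# Stand-in bytes (the 8-byte PNG file signature); not a complete image
raw_bytes = b"\x89PNG\r\n\x1a\n"

# Encode: "data:" + media type + ";base64," + base64 payload
data_uri = "data:image/png;base64," + base64.b64encode(raw_bytes).decode("utf-8")
print(data_uri)

# Decode: split the header from the payload, then reverse the base64 step
header, payload = data_uri.split(",", 1)
media_type = header[len("data:"):].split(";")[0]
decoded = base64.b64decode(payload)
print(media_type, decoded == raw_bytes)
```

Because base64 expands data by roughly a third, large images produce correspondingly larger request bodies than sending a URL.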

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "are these alligators or crocodiles?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("mystery_reptile.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

These are crocodiles. You can tell by their slender, V-shaped snouts, which are a characteristic feature of crocodiles, as opposed to the broader, U-shaped snouts of alligators.


## Use cases for image analysis

### Accessibility

#### Assistance for vision-impaired users

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "is there anything good for vegans on this menu?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("menu.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

This menu doesn't seem to have any explicitly vegan dishes, as many options contain meat, seafood, cheese, or other animal-derived ingredients. However, some dishes could potentially be modified to make them vegan-friendly. Here are some dishes that seem like they may be adapted:

### Antipasti:
- **Spinaci Soffritti (8)**: Fresh spinach sautéed with lemon and garlic. Ensure no butter or animal-based oils are used for preparation.
- **Bruschetta Trio (18)**: The avocado, smoked salmon, and crème fraiche bruschetta sounds flexible. Ask if it can be served without the salmon and crème fraiche, and confirm the base bread is vegan.

### Zuppe & Insalate:
- **Panzanella con Fagioli (18)**: A vine tomato and bread salad mixed with onions, beans, cucumbers, and avocado. Ask the kitchen to confirm that Carmine's House Vinaigrette is vegan.
- **Insalata Di Mista (13)**: Seasonal greens tossed in the house vinaigrette. Double-check if the dressing is vegan and skip the option of adding chicken o

#### Automated image captioning

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Suggest an alt text for this image", "type": "text"},
                {"image_url": {"url": open_image_as_base64("azure_arch.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

Diagram showing the architecture of an Azure-based deployment for a containerized chat application. The central component is a "Container App" connected to various Azure resources. It links to a "Container Apps Environment" at the top, Azure Cognitive Services ("Azure AI Services") on the top right, and "Managed Identity" on the bottom left. Other connected resources include "Log Analytics Workspace" for monitoring, a "Container Registry" for image storage, and an Azure "Key Vault" for secret management. The flow of dependencies is visually represented with arrows between the components, illustrating their interactions.


### Business process automation

#### Insurance claim processing

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "system",
            "content": (
                "You are an AI assistant that helps auto insurance companies process claims. "
                "You accept images of damaged cars that are submitted with claims, and you are able to make judgments "
                "about the causes of automobile damage, and the validity of claims regarding that damage."
            ),
        },
        {
            "role": "user",
            "content": [
                {"text": "Claim states that this damage is due to hail. Is it valid?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dented_car.jpg")}, "type": "image_url"},
            ],
        },
    ],
)

print(response.choices[0].message.content)

The damage shown in this image is not consistent with hail damage. Hail damage is typically characterized by multiple small, round dents across the surface of the vehicle, usually on the hood, roof, and trunk, as well as occasionally cracked or broken windows.

The significant and centralized deformation of the hood, along with substantial damage to the front grille and surrounding areas, suggests impact with a large object (e.g., another vehicle, a tree, or a pole), not hailstones. Based on this observation, the claim is not valid if it is attributed solely to hail.


#### Graph analysis

In [19]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "What zone are we losing the most trees in?", "type": "text"},
            {
                "image_url": {
                    "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1f/20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg/1280px-20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg.png"
                },
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=model_name, messages=messages, temperature=0.5)

print(response.choices[0].message.content)

The zone where we are losing the most trees is the **Tropical zone**, represented by the dark green bars in the graph. It consistently shows the largest amount of tree cover loss compared to the Boreal, Temperate, and Subtropical zones.


#### Table analysis

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "What's the cheapest plant?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("page_0.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

The cheapest plant listed on the availability sheet is *Agrostis pallens* (Thringrass) under the "Grass" category, priced at **$0.58** per stub.


#### Appliance support

In [None]:
response = openai_client.chat.completions.create(
    model=model_name,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "How do I set this to wash the dishes quickly?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dishwasher.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

To wash the dishes quickly with this Bosch dishwasher, follow these steps:

1. **Turn on the dishwasher**: Press the **"On/Off"** button.
2. **Select the quick program**: Press the **"Quick 45°"** button. This is a designated fast wash program, typically for lightly soiled dishes.
3. **Optional - Use VarioSpeed**: If you want to make the cycle even faster, press the **"VarioSpeed"** button, which reduces the cycle time further by increasing energy and water usage.
4. **Start the dishwasher**: Press the **"Start"** button to begin the cycle.

That's it! Your dishwasher will now wash the dishes quickly. Keep in mind that this setting is best suited for lightly soiled dishes and not for heavy loads or tough stains.
