# Chat with vision models

**If you're looking for the web application, check the src/ folder.**

This notebook is provided only for manual experimentation with the vision model.

## Authenticate to OpenAI

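The following code creates an OpenAI client for either a local OpenAI-compatible endpoint (such as Ollama) or an Azure OpenAI account, authenticating to Azure with an API key or with Azure Developer CLI credentials. See the README for instructions on configuring the `.env` file.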

In [1]:
import os

import azure.identity
import openai
from dotenv import load_dotenv

load_dotenv(".env", override=True)

openai_host = os.getenv("OPENAI_HOST")
if openai_host == "local":
    print("Using local OpenAI-compatible API with no key")
    openai_client = openai.OpenAI(api_key="no-key-required", base_url=os.environ["LOCAL_OPENAI_ENDPOINT"])
elif os.getenv("AZURE_OPENAI_KEY_FOR_CHATVISION"):
    # Authenticate using an Azure OpenAI API key
    # This is generally discouraged, but is provided as a convenience
    print("Using Azure OpenAI with key")
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY_FOR_CHATVISION"],
    )
elif os.getenv("AZURE_OPENAI_ENDPOINT"):
    tenant_id = os.environ["AZURE_TENANT_ID"]
    print("Using Azure OpenAI with Azure Developer CLI credential for tenant id", tenant_id)
    default_credential = azure.identity.AzureDeveloperCliCredential(tenant_id=tenant_id)
    token_provider = azure.identity.get_bearer_token_provider(
        default_credential, "https://cognitiveservices.azure.com/.default"
    )
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        azure_ad_token_provider=token_provider,
    )

Using Azure OpenAI with Azure Developer CLI credential for tenant id e47e6fc9-3a2c-454a-8b8f-90cc6972fb77
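The authentication cell checks its settings in a fixed order, so a `local` host wins over any Azure settings, and an Azure key wins over the credential flow. The sketch below mirrors only that precedence. It assumes a hypothetical helper, `select_auth_mode`, which is not part of the notebook or of any library, and it takes a plain dict rather than reading `os.environ`. The endpoint value is a placeholder.

```python
def select_auth_mode(env):
    """Return which branch of the authentication cell would run for these settings."""
    if env.get("OPENAI_HOST") == "local":
        return "local"
    if env.get("AZURE_OPENAI_KEY_FOR_CHATVISION"):
        return "azure-key"
    if env.get("AZURE_OPENAI_ENDPOINT"):
        return "azure-cli-credential"
    # No branch matches: in the notebook, openai_client would never be assigned
    return None


# Placeholder settings for illustration only
settings = {"OPENAI_HOST": "local", "AZURE_OPENAI_ENDPOINT": "https://example.invalid"}
print(select_auth_mode(settings))
```

If the helper returns `None` for your settings, the later cells would likely fail with a `NameError` on `openai_client`, which usually points to a missing or incomplete `.env` file.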


## Send an image by URL

In [None]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Is this a unicorn?", "type": "text"},
            {
                "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg"},
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=os.environ["OPENAI_MODEL"], messages=messages, temperature=0.5)

print(response.choices[0].message.content)

No, this is not a unicorn. The animal in the image is a bovine, likely an aurochs or a similar type of wild cattle. Unicorns are mythical creatures typically depicted as horses with a single horn on their foreheads. This animal has two horns, which is characteristic of cattle.
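Every request in this notebook uses the same message shape: one user message whose `content` is a list holding a `text` part and an `image_url` part. A minimal sketch of a helper that builds that shape is shown here. The name `build_image_message` is hypothetical and is not defined elsewhere in the notebook.

```python
def build_image_message(question, image_url):
    """Build one user message that pairs a text question with an image URL or data URI."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages = [
    build_image_message(
        "Is this a unicorn?",
        "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg",
    )
]
print(messages[0]["content"][1]["type"])
```

The resulting `messages` list can be passed to `openai_client.chat.completions.create` in the same way as the hand-written list above.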


## Send an image by Data URI



In [2]:
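import base64
import mimetypes


def open_image_as_base64(filename):
    """Read an image file and return it as a base64-encoded data URI."""
    with open(filename, "rb") as image_file:
        image_data = image_file.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")
    # Guess the MIME type from the file extension (e.g. image/jpeg for .jpg); fall back to PNG
    mime_type = mimetypes.guess_type(filename)[0] or "image/png"
    return f"data:{mime_type};base64,{image_base64}"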

In [3]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "are these alligators or crocodiles?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("mystery_reptile.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

These appear to be young crocodiles. One key distinguishing feature between crocodiles and alligators is the shape of their snout; crocodiles typically have a V-shaped, pointed snout, whereas alligators have a wider, U-shaped snout. The individuals in the photo have the more pointed snout characteristic of crocodiles.
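A data URI has the form `data:<MIME type>;base64,<payload>`, so the image travels inside the request body instead of being fetched from a URL. To check what a helper like `open_image_as_base64` produced, the string can be split back into its parts. The sketch below assumes a hypothetical `parse_data_uri` function and uses a few placeholder bytes rather than a real image.

```python
import base64


def parse_data_uri(data_uri):
    """Split a base64 data URI into its MIME type and decoded bytes."""
    header, encoded = data_uri.split(",", 1)
    # header looks like "data:image/png;base64"
    mime_type = header[len("data:"):].split(";")[0]
    return mime_type, base64.b64decode(encoded)


# Placeholder bytes for illustration; not a valid image file
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("utf-8")
mime_type, raw = parse_data_uri(sample)
print(mime_type, len(raw))
```

Base64 encodes every 3 bytes as 4 characters, so the encoded payload is roughly a third larger than the original file.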


## Use cases for image analysis

### Accessibility

#### Assistance for vision-impaired users

In [11]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "is there anything good for vegans on this menu?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("menu.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

Based on the menu provided, here are some options that may be suitable for vegans:

1. **Bruschetta Trio** (check for vegan options in tapenade)
2. **Spinaci Soffitti** (confirm without any non-vegan ingredients)
3. **Panzanella con Fagioli** (make sure to omit any non-vegan additions)
4. **Insalata Di Mixta** (check the dressing for vegan compatibility)

Be sure to confirm with the restaurant about any potential animal-derived ingredients in dressings or garnishes.


#### Automated image captioning

In [12]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Suggest an alt text for this image", "type": "text"},
                {"image_url": {"url": open_image_as_base64("azure_arch.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

Flowchart illustrating a cloud infrastructure setup for a chat application. It includes the following components: a Container Apps Environment, Azure AI services, a Container App, Managed Identity, Log Analytics workspace, a Container Registry, and a Key Vault. Arrows indicate relationships and workflows among these components.


### Business process automation

#### Insurance claim processing

In [9]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "system",
            "content": (
                "You are an AI assistant that helps auto insurance companies process claims. "
                "You accept images of damaged cars that are submitted with claims, and you are able to make judgments "
                "about the causes of automobile damage, and the validity of claims regarding that damage."
            ),
        },
        {
            "role": "user",
            "content": [
                {"text": "Claim states that this damage is due to hail. Is it valid?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dented_car.jpg")}, "type": "image_url"},
            ],
        },
    ],
)

print(response.choices[0].message.content)

The damage shown in the image appears to be extensive and structural, likely indicating a more significant impact or collision rather than hail damage. Hail damage typically results in dents and minor surface imperfections, rather than crumpling or major deformation as seen here. Based on this assessment, the claim stating that the damage is due to hail may not be valid.


#### Graph analysis

In [8]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "What zone are we losing the most trees in?", "type": "text"},
            {
                "image_url": {
                    "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1f/20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg/1280px-20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg.png"
                },
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=os.environ["OPENAI_MODEL"], messages=messages, temperature=0.5)

print(response.choices[0].message.content)

The zone losing the most trees, as indicated in the chart, is the **tropical zone**. It consistently shows the highest annual loss in tree cover compared to the boreal, temperate, and subtropical zones.


#### Table analysis

In [13]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "What's the cheapest plant?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("page_0.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

The cheapest plant on the list is **Agrostis pallens (Thingrass)**, priced at **$0.58**.


#### Appliance support

In [10]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "How do I set this to wash the dishes quickly?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dishwasher.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

To wash the dishes quickly using this dishwasher, you should select the **Quick** wash setting. Here’s how to do it:

1. **Turn on the dishwasher** by pressing the **On/Off** button.
2. **Select the Quick program** by pressing the button labeled "Quick" (usually indicated with a time of 45 minutes).
3. **Check any additional options** (like Half Load if applicable) to optimize for fewer dishes.
4. **Press the Start button** to begin the cycle.

This should give you a fast wash for your dishes.
