# Chat with vision models

**If you're looking for the web application, check the src/ folder.**

This notebook is provided only for manual experimentation with the vision model.

## Authenticate to OpenAI

The following code connects to an OpenAI-compatible API using one of three backends: an Azure OpenAI account, GitHub Models, or a local model server such as Ollama. See the README for instructions on configuring the `.env` file.

In [1]:
import os

import azure.identity
import openai
from dotenv import load_dotenv

load_dotenv(".env", override=True)

openai_host = os.getenv("OPENAI_HOST")
if openai_host == "local":
    # Use a local endpoint like llamafile server
    print("Using local OpenAI-compatible API with no key")
    openai_client = openai.OpenAI(api_key="no-key-required", base_url=os.getenv("LOCAL_OPENAI_ENDPOINT"))
elif openai_host == "github":
    print("Using GitHub-hosted model")
    openai_client = openai.OpenAI(
        api_key=os.environ["GITHUB_TOKEN"],
        base_url=os.environ["GITHUB_MODELS_ENDPOINT"],
    )
elif os.getenv("AZURE_OPENAI_KEY"):
    # Authenticate using an Azure OpenAI API key
    # This is generally discouraged, but is provided for developers
    # that want to develop locally inside the Docker container.
    print("Using Azure OpenAI with key")
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_key=os.getenv("AZURE_OPENAI_KEY"),
    )
elif os.getenv("AZURE_OPENAI_ENDPOINT"):
    # Authenticate using the default Azure credential chain
    # See https://docs.microsoft.com/azure/developer/python/azure-sdk-authenticate#defaultazurecredential
    # This will *not* work inside a local Docker container.
    # If using managed user-assigned identity, make sure that AZURE_CLIENT_ID is set
    # to the client ID of the user-assigned identity.
    print("Using Azure OpenAI with default credential")
    default_credential = azure.identity.DefaultAzureCredential(exclude_shared_token_cache_credential=True)
    token_provider = azure.identity.get_bearer_token_provider(
        default_credential, "https://cognitiveservices.azure.com/.default"
    )
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-05-01-preview",
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        azure_ad_token_provider=token_provider,
    )
else:
    # Fail early with a clear message instead of a later NameError on openai_client
    raise ValueError("No OpenAI backend configured: set OPENAI_HOST or the AZURE_OPENAI_* variables in .env")

print("Model name:", os.getenv("OPENAI_MODEL"))

Using GitHub-hosted model
Model name: Phi-3.5-vision-instruct
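
The authentication cell checks its settings in a fixed order, so an earlier match wins even when later variables are also set. The sketch below mirrors that order as a pure function, which can help when debugging a `.env` file. The function name `select_backend` and the returned labels are illustrative and not part of the project.

```python
def select_backend(env):
    """Return which branch the authentication cell would take for the given settings."""
    if env.get("OPENAI_HOST") == "local":
        return "local"
    if env.get("OPENAI_HOST") == "github":
        return "github"
    if env.get("AZURE_OPENAI_KEY"):
        return "azure-key"
    if env.get("AZURE_OPENAI_ENDPOINT"):
        return "azure-credential"
    # None means no branch matches
    return None


# Hypothetical settings for illustration only; real values come from your .env file.
example_env = {"OPENAI_HOST": "github", "AZURE_OPENAI_KEY": "placeholder"}
print(select_backend(example_env))
```

In the notebook itself, `select_backend(os.environ)` would report the branch for the currently loaded settings.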


## Send an image by URL

In [2]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Is this a unicorn?", "type": "text"},
            {
                "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg"},
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=os.environ["OPENAI_MODEL"], messages=messages, temperature=0.5)

print(response.choices[0].message.content)

 No, this is not a unicorn. The image depicts an ox or bull, which is a large, domesticated animal with a hump, dewlaps, and long horns.
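
Every request in this notebook uses the same message shape: one user message whose `content` list holds a text part and an image part. As a minimal sketch, a small helper can build that structure. The name `build_image_message` is hypothetical and does not appear in the project.

```python
def build_image_message(prompt, image_url):
    """Build a single user message holding a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


example_messages = [
    build_image_message(
        "Is this a unicorn?",
        "https://upload.wikimedia.org/wikipedia/commons/6/6e/Ur-painting.jpg",
    )
]
print(example_messages[0]["content"][0]["text"])
```

The resulting list can be passed as the `messages` argument in place of the literal used above.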


## Send an image by data URI

In [3]:
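import base64
import mimetypes


def open_image_as_base64(filename):
    """Read an image file and return it as a base64-encoded data URI."""
    with open(filename, "rb") as image_file:
        image_data = image_file.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")
    # Guess the MIME type from the file extension (e.g. image/jpeg for .jpg); fall back to PNG
    mime_type = mimetypes.guess_type(filename)[0] or "image/png"
    return f"data:{mime_type};base64,{image_base64}"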
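
A data URI embeds the file contents directly in the request: a `data:` prefix, the MIME type, the `;base64` marker, a comma, and then the base64-encoded bytes. The sketch below builds one from raw bytes and parses it back, which is a quick way to confirm that the declared type and the payload are what you expect. The names `to_data_uri` and `parse_data_uri` are illustrative and not part of the notebook.

```python
import base64


def to_data_uri(data, mime_type):
    """Encode raw bytes as a data URI with the given MIME type."""
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def parse_data_uri(uri):
    """Split a base64 data URI back into (mime_type, raw_bytes)."""
    header, encoded = uri.split(",", 1)
    mime_type = header[len("data:"):].split(";")[0]
    return mime_type, base64.b64decode(encoded)


# The first eight bytes of any PNG file, used here as stand-in image data.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature, "image/png")
print(uri)
print(parse_data_uri(uri) == ("image/png", png_signature))
```

Base64 encoding makes the payload roughly a third larger than the original file, so large images can produce very long requests.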

In [4]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "how could I make this into a unicorn though??", "type": "text"},
                {"image_url": {"url": open_image_as_base64("ur.jpg")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

 Sorry, the task is not possible as unicorns do not exist in reality.


## Use cases for image analysis

### Graph analysis

In [5]:
messages = [
    {
        "role": "user",
        "content": [
            {"text": "What zone are we losing the most trees in?", "type": "text"},
            {
                "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1f/20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg/1280px-20210331_Global_tree_cover_loss_-_World_Resources_Institute.svg.png"},
                "type": "image_url",
            },
        ],
    }
]
response = openai_client.chat.completions.create(model=os.environ["OPENAI_MODEL"], messages=messages, temperature=0.5)

print(response.choices[0].message.content)

 Tropical


### Insurance claim processing

In [6]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "system",
            "content": (
                "You are an AI assistant that helps auto insurance companies process claims. "
                "You accept images of damaged cars that are submitted with claims, and you are able to make judgments "
                "about the causes of automobile damage, and the validity of claims regarding that damage."
            ),
        },
        {
            "role": "user",
            "content": [
                {"text": "Claim states that this damage is due to hail. Is it valid?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dented_car.jpg")}, "type": "image_url"},
            ],
        },
    ],
)

print(response.choices[0].message.content)

 Based on the image provided, it appears that the vehicle has significant front-end damage with a shattered windshield and bumper. The damage pattern does not typically align with hail damage, which generally results in smooth, rounded dents rather than the jagged and shattered appearance seen here. It is more likely that this damage was caused by a high-speed collision or a similar event. Therefore, it would not be valid to claim this damage is due to hail based on the image provided.


### Appliance help

In [7]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "How do I set this to wash the dishes quickly?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("dishwasher.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

 Sorry, I cannot answer this question. The image shows a close-up of the control panel of a dishwasher with various settings and a digital timer display showing 1:05, but there is no information provided in the image that indicates how to set the dishwasher to wash the dishes quickly.


### Assistance for visually impaired users

In [8]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "is there anything good for vegans on this menu?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("menu.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

 Based on the menu items listed, there are several vegan-friendly options available. These include the following:

1. Antipasto Assorted - This includes assorted antipasto that is free from animal products.
2. Bruschetta Fiore - This is a fresh basil and tomato bruschetta that is vegan-friendly.
3. Farinata Di Pannocchia - A vegan frittata made without eggs.
4. Fresh Spinach with Lemon and Garlic - This dish contains no animal products and is suitable for vegans.
5. Grilled Hearts of Romanesco with Panettone - This is a vegan-friendly dish that includes Romanesco hearts and panettone.
6. Insalata Di Balsamic Vinagrete - This is a seasonal green salad with balsamic vinagrette that is vegan-friendly.
7. Insalata Di Pellino - This is a classic Caesar salad with house-made croutons, and it is vegan-friendly.
8. Insalata Di Pelle - This is a seasonal green salad with carrot, parsley, and radish, and it is vegan-friendly.
9. Insalata Di Pelle - This is a seasonal green salad with parsley, ra
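
The reply above stops mid-word, which suggests, though the notebook does not confirm it, that generation hit a token limit. Chat completion responses carry a `finish_reason` on each choice, and the value `"length"` indicates the output was cut off by that limit. The following is a minimal sketch of such a check. The helper name `was_truncated` is hypothetical, and the stand-in object only imitates the shape of a real response.

```python
from types import SimpleNamespace


def was_truncated(response):
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in object shaped like a chat completion response, for illustration only.
fake_response = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
print(was_truncated(fake_response))
```

If the check returns `True` for a real `response`, passing a larger `max_tokens` value to `chat.completions.create` may yield the full answer, depending on what the model host allows.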

### Automated image captioning

In [9]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Suggest an alt text for this image", "type": "text"},
                {"image_url": {"url": open_image_as_base64("azure_arch.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

 This image appears to be a diagram illustrating various components of a cloud-based application environment, specifically related to chat services and container applications. The components include a container app environment, managed identity, a chat-app-cnbr6e7fwgzj2-id-aca, a chat-app-cnbr6e7fwgzj2-ca, a chatappcnbr6e7fwgzj2-log, a chatappcnbr6e7f-vault, a chatapp-cnbr6e7fwgzj2-log, a chatappcnbr6e7f-vault, and a container registry. These components are likely part of a larger Azure cloud service architecture.


### Table analysis

In [10]:
response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "What's the cheapest plant?", "type": "text"},
                {"image_url": {"url": open_image_as_base64("page_0.png")}, "type": "image_url"},
            ],
        }
    ],
)

print(response.choices[0].message.content)

 Based on the image, the cheapest plant listed is the 'California poppy' priced at $1.65, which is the lowest price compared to other plants listed in the image.
