# Analyze Images using Azure OpenAI

Prerequisites:
1. Create an Azure OpenAI resource.
2. Deploy a GPT-4 or later model that accepts image input (this notebook uses gpt-4o).

## Load Azure Configuration

In [1]:
import os

azure_openai_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azure_openai_key = os.getenv("AZURE_OPENAI_API_KEY")
azure_openai_deployment = os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME")
azure_openai_api_version = os.getenv("AZURE_OPENAI_API_VERSION")
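
`os.getenv` returns `None` for any variable that is not set, which would only surface later as an error from the client. A minimal sketch of an early check is shown below; treating all four variables as required, and the helper name `find_missing_settings`, are assumptions made for illustration.

```python
import os

# Variable names are taken from the cell above; treating all four as
# required is an assumption of this sketch.
REQUIRED_SETTINGS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
)


def find_missing_settings(env=os.environ):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


missing = find_missing_settings()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.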

## Get and Prepare the Image

In [3]:
import base64
from pathlib import Path

# Create a Path object for the image file
image_path = Path("generated_image.jpg")

# Using a context manager to open the file with Path.open()
with image_path.open("rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode("utf-8")

# Prepare the image content in the format the Azure OpenAI service expects
content_images = [
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
]
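
The cell above hard-codes `image/jpeg` in the data URL, which only matches JPEG files. The sketch below shows one way to derive the MIME type from the file name with the standard `mimetypes` module. The helper names `to_data_url` and `image_part` are hypothetical and not part of the original notebook.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path, data):
    """Build a base64 data URL for raw image bytes, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(Path(path).name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file name: {path}")
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def image_part(path):
    """Read an image file and wrap it in the same content-part shape used above."""
    path = Path(path)
    return {"type": "image_url", "image_url": {"url": to_data_url(path, path.read_bytes())}}
```

With these helpers, several images could be prepared with a list comprehension such as `[image_part(p) for p in paths]`, where `paths` is a list of image file paths.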

## Create a Client

In [4]:
from openai import AsyncAzureOpenAI

# Create the Vision client
vision_client = AsyncAzureOpenAI(
    api_key=azure_openai_key,
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint
)
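# Use the deployment name loaded from the environment, falling back to "gpt-4o"
vision_deployment_name = azure_openai_deployment or "gpt-4o"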

## Analyze the Image

In [5]:
# Define the user prompt for the image description
user_prompt = "Describe this image in detail."

# Send a request to the Azure OpenAI service to analyze the image and generate a description
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the generated description of the image (an f-string also copes with empty content)
print(f"Response: {response.choices[0].message.content}")


Response: The image captures a serene residential street scene at what appears to be either early morning or late afternoon due to the soft lighting. The central focus of the image is on a cat, possibly a domestic short-haired cat with a sleek white and grey coat and distinct markings on its face, body, and legs. The cat is positioned in the middle of the street, heading towards the photographer but looking slightly to the side, giving a sense of curiosity about its surroundings.

The street is lined with modern houses that feature clean architectural lines and appear to be well-maintained. These are likely suburban homes, each with a small garden space in front. The scene includes details such as green plants, a blue recycling bin, and various foliage which adds a touch of color and liveliness. On the left side, there are blossoming pink flowers, contributing to the overall appeal of the neighborhood. 

In the background, the image fades into a slightly blurred effect, showcasing more
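
Because the request sets `max_tokens=1000`, a long description can be cut off when it reaches that limit. The chat completions response reports why generation stopped in `finish_reason`, and the value `"length"` indicates the token limit was hit. A minimal sketch of that check follows; the helper name `was_truncated` is hypothetical, and the stand-in object exists only so the example runs without calling the service.

```python
from types import SimpleNamespace


def was_truncated(response):
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in with the same attribute path as a chat completion response,
# used here only so the sketch runs without a live request.
fake_response = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])

if was_truncated(fake_response):
    print("The reply hit max_tokens; consider raising the limit.")
```

In the notebook, the same function could be called with the real `response` object right after the request.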

## Getting results similar to Azure AI Vision: Image Analysis

In [6]:
# Define the user prompt for caption generation
user_prompt = "Provide me 10 captions for this image."

# Send a request to the Azure OpenAI service to generate captions for the image
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the generated captions
print(f"Response: {response.choices[0].message.content}")


Response: 1. "Morning patrol: this cat takes its neighborhood watch duties seriously."
2. "Caught in the act of exploring the tranquil morning streets."
3. "This kitty's got places to be and things to see!"
4. "On the lookout for adventure in the quiet suburbs."
5. "Curiosity leads the way for this feline wanderer."
6. "Who's crossing the street? Just your friendly neighborhood cat."
7. "Exploring the block, one paw at a time."
8. "The street is my runway, and I'm ready to strut!"
9. "A cat’s-eye view of suburban life."
10. "Morning strolls with whiskers and wonder."
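
The captions come back as one numbered block of text. If they are needed as separate strings, a small parser can split them. The sketch below assumes each caption sits on its own line in the form `1. "caption"`, as in the output above. The model is not guaranteed to keep that layout, and `parse_captions` is a hypothetical helper.

```python
import re


def parse_captions(text):
    """Extract captions from lines shaped like: 1. "caption"."""
    captions = []
    for line in text.splitlines():
        match = re.match(r"\s*\d+\.\s*(.+?)\s*$", line)
        if match:
            # Drop surrounding straight or curly double quotes, if any.
            captions.append(match.group(1).strip('"“”'))
    return captions


sample = '1. "Exploring the block, one paw at a time."\n2. "A cat’s-eye view of suburban life."'
for caption in parse_captions(sample):
    print(caption)
```

Lines that do not start with a number and a period are skipped, so any introductory sentence the model adds is ignored.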


## Get Structured Output

In [7]:
# Define the user prompt requesting a structured JSON response
user_prompt = """Analyze this image.
    Provide the response in the following JSON format.
    {
        "description": "Describe the image in fewer than 50 words",
        "category": "one of: cat, dog, mouse"
    }

"""

# Send a request to the Azure OpenAI service to analyze the image and return structured JSON
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the structured response
print(f"Response: {response.choices[0].message.content}")

Response: {
    "description": "A cat walking on a suburban street with houses and plants in the background.",
    "category": "cat"
}
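
The reply is still a string, so it has to be parsed before the fields can be used. The sketch below parses it with `json.loads` and checks the category against the values named in the prompt. It assumes the model may wrap the JSON in a Markdown code fence, which some replies do. `parse_analysis` and `ALLOWED_CATEGORIES` are hypothetical names.

```python
import json

ALLOWED_CATEGORIES = {"cat", "dog", "mouse"}  # values named in the prompt above
FENCE = "`" * 3  # Markdown code fence marker


def parse_analysis(text):
    """Parse the model's reply into a dict and check the category value."""
    cleaned = text.strip()
    # Drop a surrounding Markdown code fence (with optional "json" tag) if present.
    if cleaned.startswith(FENCE):
        cleaned = cleaned.strip("`").strip()
        if cleaned.lower().startswith("json"):
            cleaned = cleaned[4:]
    result = json.loads(cleaned)
    if not isinstance(result, dict):
        raise ValueError("Expected a JSON object")
    if result.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"Unexpected category: {result.get('category')!r}")
    return result


reply = '{"description": "A cat walking on a suburban street.", "category": "cat"}'
analysis = parse_analysis(reply)
print(analysis["category"])
```

The chat completions API also accepts a `response_format` parameter for requesting JSON output. Whether it is available depends on the deployed model and API version, so verify it for your deployment before relying on it.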
