# Analyze Images using Azure OpenAI

Prerequisites:
1. Create an Azure OpenAI resource.
2. Deploy a vision-capable GPT-4-class model (this notebook uses gpt-4o).

## Load Azure Configuration

In [1]:
import os

azure_openai_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azure_openai_key = os.getenv("AZURE_OPENAI_API_KEY")
azure_openai_deployment = os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME")
azure_openai_api_version = os.getenv("AZURE_OPENAI_API_VERSION")
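
`os.getenv` returns `None` for any variable that is not set, and that typically only surfaces later as a confusing client error. The following is a minimal sketch of an early check, not part of the original notebook; `REQUIRED_VARS` and `find_missing_settings` are hypothetical names introduced here for illustration.

```python
import os

# Names of the environment variables read in the cell above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
)


def find_missing_settings(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report missing settings before any request is attempted.
missing = find_missing_settings()
if missing:
    print("Missing settings: " + ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.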

## Get and Prepare the Image

In [4]:
import base64
from pathlib import Path

# Create a Path object for the image file
image_path = Path("images/generated_image.jpg")

# Using a context manager to open the file with Path.open()
with image_path.open("rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode("utf-8")

# Prepare the image content in the required format for the Azure OpenAI service
content_images = [
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
]
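
The cell above hardcodes the `image/jpeg` MIME type and handles a single file. As a hedged sketch of how the same preparation step might be generalized, the helpers below guess the MIME type from the file extension using the standard `mimetypes` module. `image_to_data_url` and `to_image_content` are hypothetical names, not part of the original notebook or of the `openai` package.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    path = Path(path)
    # Guess the MIME type from the file extension (e.g. .jpg -> image/jpeg).
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def to_image_content(paths):
    """Build one image_url content part per image path."""
    return [
        {"type": "image_url", "image_url": {"url": image_to_data_url(p)}}
        for p in paths
    ]
```

Under these assumptions, `to_image_content(["images/generated_image.jpg"])` would produce a list with the same shape as `content_images`.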

## Create a Client

In [5]:
from openai import AsyncAzureOpenAI

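# Create the async Azure OpenAI client used for the vision requests
vision_client = AsyncAzureOpenAI(
    api_key=azure_openai_key,
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
)
# With Azure OpenAI, this must be the name of your deployment, not just the model name
vision_deployment_name = "gpt-4o"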

## Analyze the Image

In [6]:
# Define the user prompt for the image description
user_prompt = "Describe this image in detail."

# Send a request to the Azure OpenAI service to analyze the image and generate a description
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the generated description of the image
# (an f-string avoids a TypeError if the message content is None)
print(f"Response: {response.choices[0].message.content}")


Response: The image depicts a quiet residential street during early dawn. A cat is seen walking in the middle of the road, facing away from the camera and heading further down the street. The street is wet, suggesting it may have rained recently, and there is a slight mist in the air contributing to the serene and calm atmosphere.

On either side of the street, houses with pitched roofs and garages are lined up neatly. The sky has a light grayish-blue tint, indicating the start of a new day. Streetlights are on, casting a soft, warm light over the scene. Various objects such as trash bins and a parked car are visible in front of the houses, adding to the suburban setting. There is also a noticeable gradient in the road's surface, with a darker patch in the center, possibly due to water pooling or various tire tracks. The overall mood of the image is tranquil and peaceful.
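
The request above has a fixed shape: a single user message whose content is one text part followed by the image parts. Since the remaining cells repeat that shape with a different prompt, it could be factored out. The following is a minimal sketch under that assumption; `build_vision_messages` is a hypothetical helper, not something provided by the `openai` package.

```python
def build_vision_messages(prompt, image_parts):
    """Build the messages list for a single-turn prompt with images."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                *image_parts,  # image_url parts, e.g. content_images
            ],
        }
    ]
```

With this helper, a request would pass `messages=build_vision_messages(user_prompt, content_images)` to `vision_client.chat.completions.create`.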


## Getting Results Similar to Image Analysis

In [7]:
# Define the user prompt requesting captions for the image
user_prompt = "Provide me 10 captions for this image."

# Send a request to the Azure OpenAI service to analyze the image and generate captions
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the generated captions
print(f"Response: {response.choices[0].message.content}")


Response: 1. "Morning stroll before the neighborhood wakes up."
2. "This kitty is off on an early morning adventure."
3. "Exploring the quiet streets all alone."
4. "Curiosity leads the way."
5. "A cat's journey through an empty road."
6. "Early bird catches the worm, or in this case, cat!"
7. "When the world belongs to just you and the sunrise."
8. "Paws on the pavement, eyes on the road."
9. "A peaceful start to the day with a feline friend."
10. "Walking towards a new day."
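
The captions come back as a single numbered string. If they are needed as individual items, the text can be split line by line. This is a rough sketch that assumes the model keeps the `N. "caption"` layout seen in the output above, which is not guaranteed; `parse_numbered_captions` is a hypothetical helper.

```python
import re


def parse_numbered_captions(text):
    """Extract captions from lines such as '1. "Some caption."'."""
    captions = []
    for line in text.splitlines():
        # Match a leading number followed by '.' or ')' and capture the rest.
        match = re.match(r"^\s*\d+[.)]\s*(.+?)\s*$", line)
        if match:
            # Drop the surrounding double quotes, if present.
            captions.append(match.group(1).strip('"'))
    return captions
```

Lines that do not start with a number are ignored, so any introductory sentence the model adds is skipped.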


In [9]:
# Define the user prompt requesting a structured JSON response
user_prompt = """Analyze this image.
    Respond in the following JSON format:
    {
        "description": "Describe the image in fewer than 50 words",
        "category": "cat, dog, or mouse"
    }

"""

# Send a request to the Azure OpenAI service to analyze the image and return JSON
response = await vision_client.chat.completions.create(
    model=vision_deployment_name,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt,
                },
                *content_images,  # Include the image content in the request
            ],
        }
    ],
    max_tokens=1000,  # Set the maximum number of tokens for the response
)

# Print the generated JSON response
print(f"Response: {response.choices[0].message.content}")

Response: ```json
{
    "description": "A cat walks down a residential street lined with houses and cars during early morning.",
    "category": "cat"
}
```
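
As the output above shows, the model wrapped its JSON in a Markdown code fence, so passing the raw text to `json.loads` would fail. The sketch below removes an optional fence before parsing. It assumes the response contains at most one fenced block; `extract_json` is a hypothetical helper, not part of the original notebook.

```python
import json
import re

# The fence marker is built programmatically so that this example
# does not contain a literal fence of its own.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json(text):
    """Parse JSON from a response, with or without a surrounding code fence."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)
```

For the response above, `extract_json(response.choices[0].message.content)["category"]` would then be usable as an ordinary string. `json.loads` still raises `json.JSONDecodeError` if the model returns malformed JSON, so callers may want to handle that case.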
