# Azure OpenAI GPT-4o Vision in fashion

GPT-4o with Vision capabilities on Azure OpenAI service is a large multimodal model (LMM) developed by OpenAI that can analyze images and provide textual responses to questions about them. It incorporates both natural language processing and visual understanding. With enhanced mode, you can use the Azure AI Vision features to generate additional insights from the images.
> https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure%2Cglobal-standard%2Cstandard-chat-completions#gpt-4o-and-gpt-4-turbo

In this lab, you will learn how the Azure OpenAI GPT-4o model can be applied to the fashion industry.
The model will be instructed to extract fashion features from images and create content for garment promotion.

In [None]:
#%pip install openai --upgrade

Import libraries

In [2]:
import base64
import datetime
import json
import openai
import os
import requests
import sys

from dotenv import load_dotenv
from io import BytesIO
from PIL import Image

In [None]:
def check_openai_version():
    """
    Check the installed openai package version
    """
    installed_version = openai.__version__

    try:
        # Use the major version only; float(installed_version[:3]) would misread "1.10.0" as 1.1
        version_number = int(installed_version.split(".")[0])
    except ValueError:
        print("Invalid OpenAI version format")
        return

    print(f"Installed OpenAI version: {installed_version}")

    if version_number < 1.0:
        print("[Warning] You should upgrade OpenAI to have version >= 1.0.0")
        print("To upgrade, run: %pip install openai --upgrade")
    else:
        print(f"[OK] OpenAI version {installed_version} is >= 1.0.0")

print(f"Python version: {sys.version}")
check_openai_version()

## 1. Azure Open AI

Load environment variables

In [3]:
load_dotenv(override=True)

# Azure Open AI
openai.api_type = "azure"
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai.api_version = os.getenv("AZURE_OPENAI_API_VERSION")

model = os.getenv("AZURE_OPENAI_MODEL")  # Deployment name of your GPT-4o model in Azure OpenAI Studio
print("Model: ",  model)

Model:  gpt-4o
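
If one of the settings is missing from the `.env` file, `os.getenv` silently returns `None` and the failure only shows up later as an HTTP error. The following is a minimal sketch of an early check, assuming the four variable names used in the cell above; the helper name `missing_env_vars` is hypothetical and not part of the lab.

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_MODEL",
)


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("[Warning] Missing environment variables:", ", ".join(missing))
else:
    print("[OK] All Azure OpenAI settings are present")
```

Running this right after `load_dotenv` makes a misnamed or empty variable visible before any request is sent.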


## 2. Functions

Function to display image from file

In [None]:
# helper function to display image
def image_view(image_file):
    """
    View image
    """
    if not os.path.exists(image_file):
        print(f"[Error] Image file {image_file} does not exist.")
        return None

    else:
        print(image_file)
        img = Image.open(image_file)
        display(img)

Function to generate image description with focus on fashion features

In [4]:
def gpt4V_fashion(image_file):
    """
    GPT4-Vision
    """
    # Checking if file exists
    if not os.path.exists(image_file):
        print(f"[Error] Image file {image_file} does not exist.")
        return None

    # Endpoint
    base_url = f"{openai.api_base}/openai/deployments/{model}"
    gpt4vision_endpoint = f"{base_url}/chat/completions?api-version={openai.api_version}"


    # Header
    headers = {"Content-Type": "application/json", "api-key": openai.api_key}

    # Encoded image
    # Read inside a context manager so the file handle is closed promptly
    with open(image_file, "rb") as f:
        encoded_image = base64.b64encode(f.read()).decode("ascii")

    context = """ 
    You are a fashion expert, familiar with identifying features of fashion articles from images.
    A user uploads an image and asks you to describe one particular piece in the shot: jacket, shoes, pants, \
    watches, etc.
    """

    prompt = """
    You respond with your analysis of the following fields:

    1. ITEM'S TYPE: Identify if it's a top, bottom, dress, outerwear, footwear, bag, jewelry...
    2. BRAND: Identify the brand of the item.
    3. COLOR: Note the main color(s) and any secondary colors.
    4. PATTERN: Identify any visible patterns such as stripes, florals, animal print, or geometric designs.\
    Feel free to use any other patterns here.
    5. MATERIAL: Best guess at the material that the item is made from.
    6. FEATURES: Note any unique details or embellishments, like embroidery, sequins, studs, fringes, buttons,
    zippers...
    7. ITEM TYPE SPECIFIC: For each type of item, feel free to add any additional descriptions that are relevant \
    to help describe the item. For example, for a jacket you can include the neck and sleeve design, plus the length.
    8. MISC.: Anything else important that you notice.
    9. SIZE: Print the size of the item if you get it from the image.
    10. ITEM SUMMARY: Write a one line summary for this item.
    11. ITEM CLASSIFICATION: Classify this item into CLOTHES, BAG, SHOES, WATCH or OTHERS.
    12. ITEM TAGS: Generate 10 tags to describe this item. Each tag should be separated with a comma.
    13. STORIES: Write multiple stories about this product in 5 lines.
    14. TWITTER PUBLICATION: Write a Twitter ad for this item with some hashtags and emojis.
    15. ECOMMERCE AD: Generate an item description for a publication on an ecommerce website with a selling message.
    16. SWEDISH ECOMMERCE AD: Generate an item description in Swedish for a publication on an ecommerce website with \
    a selling message.

    The output should be a numbered list. Print an empty line between each item starting at item 12.
    """

    # Prompt
    json_data = {
        "messages": [
            {
                "role": "system",
                "content": [{"type": "text", "text": context}],
            },
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
                    },
                ],
            },
        ],
        "max_tokens": 4000,
        "temperature": 0.7,
    }

    # Results
    response = requests.post(
        gpt4vision_endpoint, headers=headers, data=json.dumps(json_data)
    )

    if response.status_code == 200:
        now = str(datetime.datetime.today().strftime("%d-%b-%Y %H:%M:%S"))
        print(f"Analysis of image: {image_file}")
        resp = json.loads(response.text)["choices"][0]["message"]["content"]
        print("\033[1;31;34m")
        print(resp)
        print("\n\033[1;31;32mDone:", now)
        print(
            "\033[1;31;32m[Note] These results are generated by an AI (Azure Open AI GPT4-Vision)"
        )

    elif response.status_code == 429:
        print(
            "[429 Error] Too many requests. Please wait a couple of seconds and try again."
        )

    else:
        print("[Error] Error code:", response.status_code)
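
The 429 branch above asks the reader to wait and rerun the cell by hand. One possible way to automate that is a small retry loop with exponential backoff. This is a sketch under assumptions, not part of the original lab: the helper name `call_with_retry` and its parameters `max_attempts`, `base_delay`, and `sleep` are hypothetical, and the delays are illustrative defaults.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call `send()` until it returns a status other than 429 or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, such as a `requests.Response`.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Double the wait after each rate-limited attempt: 2s, 4s, 8s, ...
            delay = base_delay * (2 ** attempt)
            print(f"[429] Rate limited, retrying in {delay:.0f}s")
            sleep(delay)
    # Still rate limited after the last attempt; let the caller handle it.
    return response
```

Inside `gpt4V_fashion`, the `requests.post(...)` call could be wrapped as `call_with_retry(lambda: requests.post(gpt4vision_endpoint, headers=headers, data=json.dumps(json_data)))`, leaving the existing status-code handling unchanged.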

## 3. Examples

Let's analyse an image.

In [None]:
image_file = "../data/fashion/image1.jpg"

image_view(image_file)

Run the GPT-4o analysis. Notice that the model can produce very rich content in one go.

In [None]:
# Run GPT4-Vision
gpt4V_fashion(image_file)
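
Because the prompt asks for a numbered list of named fields, the response text can be split back into a dictionary for downstream use. The sketch below assumes the model follows a `N. FIELD NAME: text` layout, which is not guaranteed, and that `gpt4V_fashion` has been changed to return `resp` (as written it only prints). The helper `parse_fields` is hypothetical, and the sample string is hand-written for illustration, not real model output.

```python
import re

# Matches lines such as "3. COLOR: Navy blue" (an assumed output shape).
FIELD_PATTERN = re.compile(r"^\s*(\d+)\.\s*([^:]+):\s*(.*)$")


def parse_fields(text):
    """Map each field name to its text, appending continuation lines."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            # A new numbered field starts here.
            current = match.group(2).strip()
            fields[current] = match.group(3).strip()
        elif current and line.strip():
            # A non-empty line without a number continues the previous field.
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return fields


# Hand-written sample for illustration only, not real model output.
sample = """1. ITEM'S TYPE: Outerwear
3. COLOR: Navy blue
10. ITEM SUMMARY: A navy jacket
with a zip front."""
print(parse_fields(sample))
```

A stricter alternative would be to ask the model for JSON in the prompt, but that changes the prompt design used in this lab.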

### Another example

In [None]:
image_file = "../data/fashion/image2.jpg"
image_view(image_file)
gpt4V_fashion(image_file)

### Another example

In [None]:
image_file = "../data/fashion/image3.png"
image_view(image_file)
gpt4V_fashion(image_file)
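
This example is a PNG file, while `gpt4V_fashion` always labels the payload as `image/jpeg` in the data URL. A minimal sketch of deriving the MIME type from the file name with the standard `mimetypes` module follows; the helper name `image_data_url` and its `default_mime` parameter are hypothetical, and whether the service needs the exact type is not something this lab states.

```python
import base64
import mimetypes


def image_data_url(image_file, default_mime="image/jpeg"):
    """Return a base64 data URL whose MIME type is guessed from the file name."""
    mime_type, _ = mimetypes.guess_type(image_file)
    if mime_type is None or not mime_type.startswith("image/"):
        # Unknown or non-image extension: fall back to the default.
        mime_type = default_mime
    with open(image_file, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

In the request body, the `"url"` value could then be set to `image_data_url(image_file)` instead of the hard-coded f-string.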

### Another example

In [None]:
image_file = "../data/fashion/image4.jpg"
image_view(image_file)
gpt4V_fashion(image_file)

### Another example

In [None]:
image_file = "../data/fashion/image5.jpg"
image_view(image_file)
gpt4V_fashion(image_file)

### Another example

In [None]:
image_file = "../data/fashion/image6.png"
image_view(image_file)
gpt4V_fashion(image_file)

### Another example

In [None]:
image_file = "../data/fashion/image7.jpg"
image_view(image_file)
gpt4V_fashion(image_file)

### Another example

In [None]:
image_file = "../data/fashion/image8.jpg"

image_view(image_file)
gpt4V_fashion(image_file)

## 4. Challenge
1. Create a description for your own fashion image(s).
2. Modify the prompt in the `gpt4V_fashion` function to generate different content for an image.
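
For the second task, one option is to assemble the prompt from a list of fields rather than editing the long string by hand. The sketch below is only a starting point: `build_prompt` is a hypothetical helper, the `CARE INSTRUCTIONS` field is an invented example, and `gpt4V_fashion` would need a `prompt` parameter added before it could accept the result.

```python
def build_prompt(fields):
    """Build a numbered field list from (name, instruction) pairs."""
    lines = ["You respond with your analysis of the following fields:", ""]
    for number, (name, instruction) in enumerate(fields, start=1):
        lines.append(f"{number}. {name}: {instruction}")
    lines += ["", "The output should be a numbered list."]
    return "\n".join(lines)


# Hypothetical field set for the challenge; swap in your own.
custom_fields = [
    ("ITEM'S TYPE", "Identify if it's a top, bottom, dress, outerwear, footwear, bag, jewelry..."),
    ("COLOR", "Note the main color(s) and any secondary colors."),
    ("CARE INSTRUCTIONS", "Suggest how to wash and store the item."),
]
print(build_prompt(custom_fields))
```

Keeping the fields in a list makes it easy to add, remove, or reorder them without renumbering the prompt manually.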


In [None]:
image_file = "..." # Add your image file here

image_view(image_file)
gpt4V_fashion(image_file)