# OpenAI Vision API (Preview)

This notebook demonstrates how to use the OpenAI Vision API to determine whether a product is in stock or out of stock from an image. The API uses GPT-4 with vision (the `gpt-4-vision-preview` model) to analyze the image and return a text response.

__Note:__ The OpenAI Vision API is currently in preview and is not available to the public. You need access to the API to run this notebook.

__Note:__ Processing images with the OpenAI Vision API may incur additional costs. With the current pricing and the images tested in this notebook, the cost was approximately $0.01 per image.
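
As a rough sanity check on that figure, the sketch below converts a token count into an approximate per-image cost. The token total and image count come from the first batch run later in this notebook (21,260 tokens for 10 products with 2 images each). The price of $0.01 per 1,000 tokens is an assumption for illustration, and applying one flat rate to both prompt and completion tokens is a simplification. `estimate_cost_per_image` is a hypothetical helper, not part of the OpenAI SDK.

```python
def estimate_cost_per_image(total_tokens: int, num_images: int, usd_per_1k_tokens: float) -> float:
    """Approximate cost of one image, assuming one flat price for all tokens."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    total_cost = total_tokens / 1000 * usd_per_1k_tokens
    return total_cost / num_images


# 21,260 tokens were reported for the first batch (10 products x 2 images).
# The rate of 0.01 USD per 1K tokens is an assumed value; check current pricing.
cost = estimate_cost_per_image(total_tokens=21260, num_images=20, usd_per_1k_tokens=0.01)
print(f"Approximate cost per image: ${cost:.4f}")
```

Under these assumptions the estimate lands close to the $0.01 per image quoted above.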

In [32]:
import base64
import json
import os
import re
import requests
import pandas as pd
from openai import OpenAI
from IPython.display import display, Image

openai_api_key = os.environ.get("OPENAI_API_KEY")
client = OpenAI(api_key=openai_api_key)

class CustomOpenAIResponse:
    def __init__(self, content: str, total_tokens: int):
        self.content = content
        self.total_tokens = total_tokens

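# Keep only ASCII letters and whitespace, so an answer like "Yes." becomes "Yes".
# The raw string avoids Python's invalid-escape warning for "\s".
content_pattern = re.compile(r'[^a-zA-Z\s]')

def get_openai_vision_response(prompt_text: str, img_url: str) -> CustomOpenAIResponse: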
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt_text},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": img_url,
                        },
                    },
                ],
            }
        ],
        max_tokens=100,
    )
    
    content = content_pattern.sub('', response.choices[0].message.content)
    return CustomOpenAIResponse(content, response.usage.total_tokens)
    
def load_input_data(input_file_path):
    with open(input_file_path, 'r') as file:
        json_data = json.load(file)
    return json_data

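def load_base64_image_from_url(img_url):
    # Download the image and return its content as a base64-encoded string.
    response = requests.get(img_url, timeout=30)
    response.raise_for_status()  # fail early instead of encoding an error page
    img_data = base64.b64encode(response.content).decode('utf-8')
    return img_data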

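def color_cells(row):
    # Green only when both answers match expectations
    # ("Yes" for the in-stock image, "No" for the out-of-stock image); red otherwise.
    if row["In-Stock"] == "Yes" and row["Out-of-Stock"] == "No":
        return ["background-color: #296644"] * len(row)
    else:
        return ["background-color: #82323a"] * len(row)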
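
To see which rows end up green, the sketch below applies the same rule as `color_cells` to two hand-written rows. Plain dictionaries stand in for the `pandas` row that `Styler.apply(..., axis=1)` would pass in, and the sample rows are made up for illustration.

```python
GREEN = "background-color: #296644"
RED = "background-color: #82323a"


def color_cells(row):
    # Same rule as the notebook helper: one style string per cell in the row.
    if row["In-Stock"] == "Yes" and row["Out-of-Stock"] == "No":
        return [GREEN] * len(row)
    return [RED] * len(row)


# Hypothetical rows; the site names are placeholders.
expected_row = {"Source": "shop-a.example", "In-Stock": "Yes", "Out-of-Stock": "No"}
unexpected_row = {"Source": "shop-b.example", "In-Stock": "Yes", "Out-of-Stock": "Yes"}

styles_expected = color_cells(expected_row)
styles_unexpected = color_cells(unexpected_row)
print(styles_expected)
print(styles_unexpected)
```

Note that a red row does not say which of the two answers was wrong, only that at least one differed from the expected pair.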

## Execution

The code below loads the input data from the `input.json` file and asks the OpenAI Vision API whether each product is available, once for its in-stock image and once for its out-of-stock image. The results are displayed in a table: a row is green when both answers match expectations (`Yes` for the in-stock image and `No` for the out-of-stock image) and red otherwise.
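
The layout of `input.json` is not shown in this notebook. Judging from the keys that `process_data` reads (`name`, `in_stock`, `out_of_stock`), it is presumably a JSON array of objects like the one sketched below. The site name and URLs are placeholders, and the temporary file only stands in for the real input file.

```python
import json
import os
import tempfile

# Hypothetical entry: one product page with two image URLs.
sample_input = [
    {
        "name": "shop.example",
        "in_stock": "https://shop.example/images/in_stock.png",
        "out_of_stock": "https://shop.example/images/out_of_stock.png",
    }
]

# Round-trip through a temporary file to mimic loading the input file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "input.json")
    with open(path, "w") as f:
        json.dump(sample_input, f, indent=2)
    with open(path, "r") as f:
        loaded = json.load(f)

# Check that every entry has the keys the processing loop expects.
required_keys = {"name", "in_stock", "out_of_stock"}
missing = [entry for entry in loaded if not required_keys.issubset(entry)]
print(f"{len(loaded)} entries loaded, {len(missing)} with missing keys")
```

A check like this before calling the API can catch a malformed entry without spending tokens on the entries that precede it.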

In [35]:
img_size = 500

prompt = "Is the product available for purchase? Please answer 'Yes' or 'No' only."

def process_data(data):
    total_tokens_used = 0
    df = pd.DataFrame(columns=["Source", "In-Stock", "Out-of-Stock", "IS Image", "OOS Image"])

    for i, item in enumerate(data):
        # Ask the same question about the in-stock image...
        in_stock_img_url = item['in_stock']
        in_stock_response = get_openai_vision_response(prompt, in_stock_img_url)
        total_tokens_used += in_stock_response.total_tokens

        # ...and about the out-of-stock image.
        out_of_stock_img_url = item['out_of_stock']
        out_of_stock_response = get_openai_vision_response(prompt, out_of_stock_img_url)
        total_tokens_used += out_of_stock_response.total_tokens

        # Embed the images as HTML so they render inside the styled table.
        in_stock_img_tag = f'<img src="{in_stock_img_url}" width="{img_size}" height="{img_size}">'
        out_of_stock_img_tag = f'<img src="{out_of_stock_img_url}" width="{img_size}" height="{img_size}">'

        df.loc[i] = [item["name"], in_stock_response.content, out_of_stock_response.content, in_stock_img_tag, out_of_stock_img_tag]

    print(f"Total tokens used: {total_tokens_used}")
    return df
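
`process_data` passes each image URL straight to the API, so the images must be reachable from OpenAI's servers. For images that are not publicly accessible, the Chat Completions API also accepts a base64 data URL in the `url` field, which is presumably what the otherwise unused `load_base64_image_from_url` helper was meant for. A minimal sketch, where `to_data_url` is a hypothetical helper and the MIME type must match the actual image format:

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Wrap raw image bytes in a data URL of the form data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for real image content.
data_url = to_data_url(b"abc", mime_type="image/png")
print(data_url)  # data:image/png;base64,YWJj
```

The resulting string could then be passed as `img_url` to `get_openai_vision_response` in place of an HTTP URL.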

In [37]:
batch_data = load_input_data('../input.json')
output_df = process_data(batch_data)
output_df.style.apply(color_cells, axis=1)

Total tokens used: 21260


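| | Source | In-Stock | Out-of-Stock | IS Image | OOS Image |
|---|---|---|---|---|---|
| 0 | alkosto.com | Yes | No | | |
| 1 | alza.cz | Yes | No | | |
| 2 | bestbuy.com | Yes | No | | |
| 3 | biccamera.com | Yes | No | | |
| 4 | coolblue.nl | Yes | No | | |
| 5 | costco.com | Yes | No | | |
| 6 | digitec.ch.de | Yes | No | | |
| 7 | yodobashi.com | Yes | No | | |
| 8 | yamada.denkiweb.com | Yes | No | | |
| 9 | walmart.com | Yes | No | | |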


In [39]:
batch_data = load_input_data('../input2.json')
output_df = process_data(batch_data)
output_df.style.apply(color_cells, axis=1)

Total tokens used: 19221


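| | Source | In-Stock | Out-of-Stock | IS Image | OOS Image |
|---|---|---|---|---|---|
| 0 | casasbahia.com.br | Yes | Yes | | |
| 1 | magazineluiza.com.br | No | Yes | | |
| 2 | pbtech.co.nz | Yes | No | | |
| 3 | alza.cz | Yes | Yes | | |
| 4 | staple.ca | Yes | Yes | | |
| 5 | amazon.fr | Yes | Yes | | |
| 6 | otto.de | Yes | Yes | | |
| 7 | currys.co.uk | No | Yes | | |
| 8 | ksdenki.com | Yes | Yes | | |
| 9 | edion.com | Yes | Yes | | |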
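
The result tables can be summarized by counting how often the model gave the expected answer (`Yes` for the in-stock image, `No` for the out-of-stock image). The sketch below does this for the second batch, with the answers copied by hand from the table above. `summarize` is a hypothetical helper, and the counts are only meaningful if the images in the input file are labeled correctly.

```python
def summarize(results):
    """Count expected answers in a list of (in_stock_answer, out_of_stock_answer) pairs."""
    in_stock_correct = sum(1 for in_stock, _ in results if in_stock == "Yes")
    out_of_stock_correct = sum(1 for _, out_of_stock in results if out_of_stock == "No")
    both_correct = sum(
        1 for in_stock, out_of_stock in results
        if in_stock == "Yes" and out_of_stock == "No"
    )
    return in_stock_correct, out_of_stock_correct, both_correct


# Answers from the second batch, in table order.
second_batch = [
    ("Yes", "Yes"),  # casasbahia.com.br
    ("No", "Yes"),   # magazineluiza.com.br
    ("Yes", "No"),   # pbtech.co.nz
    ("Yes", "Yes"),  # alza.cz
    ("Yes", "Yes"),  # staple.ca
    ("Yes", "Yes"),  # amazon.fr
    ("Yes", "Yes"),  # otto.de
    ("No", "Yes"),   # currys.co.uk
    ("Yes", "Yes"),  # ksdenki.com
    ("Yes", "Yes"),  # edion.com
]

in_ok, oos_ok, both_ok = summarize(second_batch)
total = len(second_batch)
print(f"In-stock images answered as expected:     {in_ok}/{total}")
print(f"Out-of-stock images answered as expected: {oos_ok}/{total}")
print(f"Rows with both answers as expected:       {both_ok}/{total}")
```

For this batch the model gave the expected answer for 8 of 10 in-stock images but only 1 of 10 out-of-stock images, so a single row is fully correct. By the same count, every row of the first batch is correct.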


In [3]:
url = 'https://i2-prod.mirror.co.uk/incoming/article7539650.ece/ALTERNATES/s1200/Chihuahua-or-Muffin.jpg'
prompt = 'please give the exact number of how many are chihuahuas and cupcakes inside the image?'

display(Image(url=url))

custom_response = get_openai_vision_response(prompt, url)
print(custom_response.content)

The image is a humorous mix of Chihuahuas and blueberry muffins designed to play on the visual similarities between the two To identify each you need to look closely as the Chihuahuas have eyes and noses that reflect light differently than the blueberries and their texture varies from the more granular look of the muffins

In this image there appears to be an equal number of Chihuahuas and blueberry muffins with each type alternating in the grid
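
The missing punctuation in the response above is a side effect of `content_pattern`, which removes every character that is not an ASCII letter or whitespace. That is convenient for one-word `Yes`/`No` answers, but it also deletes digits, so a prompt that asks for a count cannot show numbers in its output. A small, self-contained demonstration of that behavior, using made-up input strings:

```python
import re

# Same pattern as in the helper cell: drop anything that is not a letter or whitespace.
content_pattern = re.compile(r"[^a-zA-Z\s]")


def clean(text: str) -> str:
    return content_pattern.sub("", text)


print(clean("Yes."))                     # Yes
print(clean("There are 8 chihuahuas."))  # There are  chihuahuas
```

For free-form questions like the one in this cell, printing `response.choices[0].message.content` without the cleanup step would preserve any numbers the model returns.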
