<h1 align="center">REST API Video Samples</h1>
<hr>
   
# Chat Completions

In [3]:
import json
import os
import re
%run shared_functions.ipynb

### Setup Parameters


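Here we set up the parameters used in the rest of the notebook. vision_api_endpoint, video_index_name, and video_id are loaded from the _config.json_ file, vision_api_key is read from the VISION_API_KEY environment variable, and video_SAS_url is set directly in the cell.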

In [4]:
# Load config values
with open(r'config.json') as config_file:
    config_details = json.load(config_file)

# Read the vision resource key from the VISION_API_KEY environment variable
vision_api_key = os.getenv("VISION_API_KEY")

# The base URL for your vision resource endpoint, e.g. "https://<your-resource-name>.cognitiveservices.azure.com"
# You must create your resource in the East US region.
vision_api_endpoint = config_details['VISION_API_ENDPOINT']

# Insert your video SAS URL, e.g. https://<your-storage-account-name>.blob.core.windows.net/<your-container-name>/<your-video-name>?<SAS-token>
# A sample video URL is used here; to use your own video, replace it with config_details["VIDEO_SAS_URL"].
video_SAS_url = "https://gpt4vsamples.blob.core.windows.net/videos/Microsoft%20Copilot%20Short.mp4"

# This index name must be unique
video_index_name = config_details["VIDEO_INDEX_NAME"]

# This video ID must be unique
video_id = config_details["VIDEO_INDEX_ID"]
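
The cell above reads three keys from _config.json_ and one environment variable, so a missing value only surfaces later as a failed API call. A minimal sketch of an early check follows. The helper name `find_missing_settings` is hypothetical and not part of shared_functions.ipynb; the key names are the ones used in the cell above.

```python
import os

# Keys read from config.json in the setup cell.
REQUIRED_CONFIG_KEYS = ("VISION_API_ENDPOINT", "VIDEO_INDEX_NAME", "VIDEO_INDEX_ID")


def find_missing_settings(config, environ=None,
                          required_keys=REQUIRED_CONFIG_KEYS,
                          env_var="VISION_API_KEY"):
    """Return the names of settings that are absent or empty.

    Hypothetical helper for illustration: `config` is the dict loaded from
    config.json, and `environ` defaults to the process environment.
    """
    environ = os.environ if environ is None else environ
    missing = [key for key in required_keys if not config.get(key)]
    if not environ.get(env_var):
        missing.append(env_var)
    return missing
```

Calling `find_missing_settings(config_details)` right after loading the file returns an empty list when everything is set, which makes it easy to stop before any request is sent.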

### Create Video Index


In [5]:
# You only need to run this cell once to create the index
process_video_indexing(vision_api_endpoint, vision_api_key, video_index_name, video_SAS_url, video_id)

201 {"name":"new-test-zhi-v5","userData":{},"features":[{"name":"vision","modelVersion":"2023-05-31","domain":"surveillance"},{"name":"speech","modelVersion":"2023-06-30","domain":"generic"}],"eTag":"\"d9b271e7335e40b2aab35f0c69562040\"","createdDateTime":"2023-12-05T18:14:55.4017333Z","lastModifiedDateTime":"2023-12-05T18:14:55.4017333Z"}
202 {"name":"my-ingestion","state":"Running","batchName":"e8adb161-5bb5-4a77-bd2c-738086c815a2","createdDateTime":"2023-12-05T18:14:56.2923663Z","lastModifiedDateTime":"2023-12-05T18:14:56.6048786Z"}
{'value': [{'name': 'my-ingestion', 'state': 'Completed', 'batchName': 'e8adb161-5bb5-4a77-bd2c-738086c815a2', 'createdDateTime': '2023-12-05T18:14:56.2923663Z', 'lastModifiedDateTime': '2023-12-05T18:18:37.8844717Z'}]}
Ingestion completed.
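
The output above shows the ingestion moving from `Running` to `Completed` over a few minutes, which suggests the helper polls the ingestion state until it finishes. The actual logic lives in shared_functions.ipynb and is not shown here, so the following is only a sketch of such a polling loop under stated assumptions: `wait_for_ingestion` and `fetch_ingestions` are hypothetical names, the payload shape is taken from the output above, and the `Failed` state is an assumption.

```python
import time


def wait_for_ingestion(fetch_ingestions, poll_seconds=10.0, max_polls=60,
                       sleep=time.sleep):
    """Poll until every ingestion reports "Completed".

    `fetch_ingestions` is a hypothetical callable returning a dict shaped like
    {'value': [{'name': ..., 'state': ...}]}, as in the cell output above.
    Returns "Completed", "Failed" (assumed state name), or "TimedOut".
    """
    for _ in range(max_polls):
        payload = fetch_ingestions()
        states = [item.get("state") for item in payload.get("value", [])]
        if states and all(state == "Completed" for state in states):
            return "Completed"
        if any(state == "Failed" for state in states):
            return "Failed"
        sleep(poll_seconds)  # wait before asking again
    return "TimedOut"
```

Passing the fetch step in as a callable keeps the loop independent of how the request is made, and bounding the number of polls avoids waiting forever on a stuck ingestion.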


### Call GPT-4V API with Video Index

In [6]:
# System messages and user prompt
sys_message = """
Your task is to assist in analyzing and optimizing creative assets. 
You will be presented with advertisement videos for products. 
First describe the video in detail paying close attention to Product characteristics highlighted, 
Background images, Lighting, Color Palette and Human characteristics for persons in the video. 
Finally provide a summary of the video and talk about the main message the advertisement video tries to convey to the viewer. 
"""
user_prompt = "Summarize the ad video"

messages = [
            {
                "role": "system",
                "content": [
                    {
                        "type": "text", 
                        "text": sys_message
                    }
                ]
            },
            {
                "role": "user",
                "content": [
                        {
                            "type": "acv_document_id",
                            "acv_document_id": video_id
                        }
                    ]
            },
            {
                "role": "user",
                "content": [
                        {
                            "type": "text",
                            "text": user_prompt # The user's prompt
                        }
                    ]
            }
        ]

vision_api_config = {
    "endpoint": vision_api_endpoint,
    "key": vision_api_key
}

video_config = {
    "video_SAS_url": video_SAS_url,
    "video_index_name": video_index_name,
}

# Call GPT-4V API and print the response
try:
    response = call_GPT4V_video(messages, vision_api=vision_api_config, video_index=video_config)
    text = response['choices'][0]['message']['content']
    # Split on whitespace that follows "." or "?", skipping abbreviations such as "e.g." or "Dr."
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
    for sentence in sentences:  # Print each sentence on its own line
        print(sentence)
except Exception as e:
    raise SystemExit(f"Failed to call GPT-4V API. Error: {e}")

The video appears to be a visually stimulating advertisement for a digital or software product.
It begins with a blurred image that sets a serene mood with pastel colors.
At timestamp 00:00:04.6000000, the video presents a dynamic logo with a vibrant color palette of blues, pinks, and purples, suggesting creativity and innovation.
By 00:00:09.2000000, we see a user interface that seems to be for a computer application or operating system, indicating the product's utility in organizing and enhancing digital workspaces.

At 00:00:13.8000000, there's a close-up of a digital screen with simulated features and a disclaimer noting that the screens are simulated, implying that the product is in development and features may vary.
The focus on the visuals continues with abstract, soft textures and colors at 00:00:18.4000000, which could represent the ease and fluidity of using the software.

The video continues to highlight the software's features through various screen displays set against a b