# REST API Video Samples

## Objective
Conduct Q&A over video inputs with GPT-4V.

## Time

You should expect to spend 5-10 minutes running this sample.

## Before you begin

### Installation

In [None]:
%pip install -r ../requirements.txt

### Parameters
Set the following values in the _config.json_ file: GPT-4V_DEPLOYMENT_NAME, OPENAI_API_BASE, OPENAI_API_VERSION, VISION_API_ENDPOINT, VIDEO_SAS_URL, VIDEO_INDEX_NAME, and VIDEO_INDEX_ID.

```json
{
    "GPT-4V_DEPLOYMENT_NAME":"<GPT-4V Deployment Name>",
    "OPENAI_API_BASE":"https://<Your Azure Resource Name>.openai.azure.com",
    "OPENAI_API_VERSION":"<OpenAI API Version>",

    "VISION_API_ENDPOINT": "https://<Your Azure Vision Resource Name>.cognitiveservices.azure.com",

    "VIDEO_SAS_URL": "<Your Azure Blob Storage SAS URL>",
    "VIDEO_INDEX_NAME": "<Your Azure Video Index Name>",
    "VIDEO_INDEX_ID": "<Your Azure Video Index ID>"    
}
```
Add OPENAI_API_KEY and VISION_API_KEY as environment variables, with \<Your API Key Value\> and \<Your VISION Key Value\> as their respective values.
 <br>
      
      WINDOWS Users: 
         setx OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
         setx VISION_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

      MACOS/LINUX Users: 
         export OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
         export VISION_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
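A quick way to confirm that both keys are visible to Python is sketched below. The helper name `missing_env_vars` and the `REQUIRED_ENV_VARS` tuple are illustrative and not part of the sample's `shared_functions` module.

```python
import os
from typing import Iterable, List, Mapping, Optional

# The two variable names this sample asks you to define.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "VISION_API_KEY")


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or blank in the given environment.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_ENV_VARS)
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All required environment variables are set.")
```

Note that `setx` on Windows only affects processes started afterwards, so a notebook server that was already running will probably need to be restarted before the check passes.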

Next, load the configuration from the _config.json_ file and set up vision_api_key, vision_api_endpoint, video_SAS_url, video_index_name, and video_id.

In [None]:
import json
import os
import re
from pathlib import Path
import sys

parent_dir = Path.cwd().parent
sys.path.append(str(parent_dir))
from shared_functions import call_GPT4V_video, process_video_indexing

should_cleanup: bool = False

current_script_dir = Path(parent_dir)

# Load config values
with (current_script_dir / "config.json").open() as config_file:
    config_details = json.load(config_file)

# Setting up the vision resource key
vision_api_key = os.getenv("VISION_API_KEY")

# The base URL for your vision resource endpoint, e.g. "https://<your-resource-name>.cognitiveservices.azure.com"
# You must create your resource in the East US region.
vision_api_endpoint = config_details["VISION_API_ENDPOINT"]

# Insert your video SAS URL, e.g. https://<your-storage-account-name>.blob.core.windows.net/<your-container-name>/<your-video-name>?<SAS-token>
video_SAS_url = "https://gpt4vsamples.blob.core.windows.net/videos/Microsoft%20Copilot%20Short.mp4"  # config_details["VIDEO_SAS_URL"]

# This index name must be unique
video_index_name = "copilot-video-demo-index"  # config_details["VIDEO_INDEX_NAME"]

# This video ID must be unique
video_id = "copilot-video-1"  # config_details["VIDEO_INDEX_ID"]
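
Because _config.json_ ships with `<...>` placeholders, it can be useful to check that every documented key has been filled in before going further. The following is a minimal sketch; `find_unset_keys` and `load_config` are hypothetical helpers, and the rule "a value containing both `<` and `>` is still a placeholder" is an assumption based on the template shown earlier.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Keys listed in the Parameters section of this sample.
REQUIRED_KEYS = (
    "GPT-4V_DEPLOYMENT_NAME",
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "VISION_API_ENDPOINT",
    "VIDEO_SAS_URL",
    "VIDEO_INDEX_NAME",
    "VIDEO_INDEX_ID",
)


def find_unset_keys(config: Dict[str, Any]) -> List[str]:
    """Return required keys that are missing, empty, or still a '<...>' placeholder."""
    unset = []
    for key in REQUIRED_KEYS:
        value = str(config.get(key, "")).strip()
        # Assumption: an unedited template value still contains angle brackets.
        if not value or ("<" in value and ">" in value):
            unset.append(key)
    return unset


def load_config(path: Path) -> Dict[str, Any]:
    """Read a JSON config file into a dictionary."""
    with path.open(encoding="utf-8") as config_file:
        return json.load(config_file)
```

For example, `find_unset_keys(load_config(Path("../config.json")))` would list the keys that still need attention. The cell above hardcodes the video URL, index name, and video ID, so the three VIDEO_* keys only matter once you switch to the commented-out `config_details[...]` lookups.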

### Create Video Index


In [None]:
# You only need to run this cell once to create the index
process_video_indexing(vision_api_endpoint, vision_api_key, video_index_name, video_SAS_url, video_id)
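
If indexing fails, one thing worth ruling out is a malformed video URL. The sketch below splits a blob URL of the shape described earlier (`https://<account>.blob.core.windows.net/<container>/<video>?<SAS-token>`) into its parts using only the standard library. `parse_blob_url` and `BlobUrlParts` are illustrative names, and the check is purely syntactic: it does not contact Azure or verify that a SAS token is valid.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlsplit

BLOB_HOST_SUFFIX = ".blob.core.windows.net"


class BlobUrlParts(NamedTuple):
    account: str
    container: str
    blob_name: str
    has_query: bool  # True if a query string (such as a SAS token) is present


def parse_blob_url(url: str) -> BlobUrlParts:
    """Split an Azure Blob Storage URL into account, container, and blob name.

    Hypothetical helper for illustration; raises ValueError on unexpected shapes.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme != "https" or not host.endswith(BLOB_HOST_SUFFIX):
        raise ValueError(f"Not an https blob URL: {url!r}")
    container, _, blob_name = parts.path.lstrip("/").partition("/")
    if not container or not blob_name:
        raise ValueError(f"URL must include a container and a blob name: {url!r}")
    account = host[: -len(BLOB_HOST_SUFFIX)]
    return BlobUrlParts(account, container, unquote(blob_name), bool(parts.query))
```

Applied to the sample URL used in this notebook, the helper reports the account `gpt4vsamples`, the container `videos`, and no query string, which suggests that particular video is readable without a SAS token.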

### Call GPT-4V API with Video Index

In [None]:
# System messages and user prompt
sys_message = """
Your task is to assist in analyzing and optimizing creative assets. 
You will be presented with advertisement videos for products. 
First describe the video in detail paying close attention to Product characteristics highlighted, 
Background images, Lighting, Color Palette and Human characteristics for persons in the video. 
Finally provide a summary of the video and talk about the main message the advertisement video tries to convey to the viewer. 
"""
user_prompt = "Summarize the ad video"

messages = [
    {"role": "system", "content": [{"type": "text", "text": sys_message}]},
    {"role": "user", "content": [{"type": "acv_document_id", "acv_document_id": video_id}]},
    {"role": "user", "content": [{"type": "text", "text": user_prompt}]},  # The user's question about the video
]

vision_api_config = {"endpoint": vision_api_endpoint, "key": vision_api_key}

video_config = {
    "video_SAS_url": video_SAS_url,
    "video_index_name": video_index_name,
}

# Call GPT-4V API and print the response
try:
    response = call_GPT4V_video(messages, vision_api=vision_api_config, video_index=video_config)
    text = response["choices"][0]["message"]["content"]
    sentences = re.split(r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s", text)
    for sentence in sentences:  # Print the content of the response
        print(sentence)
except Exception as e:
    print(f"Failed to call GPT-4V API. Error: {e}")
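
The regular expression in the cell above splits the response into sentences so that each prints on its own line. The standalone sketch below reuses the same pattern on a made-up string to show what it does and does not treat as a boundary; `split_sentences` is an illustrative wrapper, not part of the sample.

```python
import re
from typing import List

# Same pattern as in the cell above. It matches a whitespace character that:
#   - follows "." or "?"                          (?<=\.|\?)
#   - is not preceded by a form such as "i.e."    (?<!\w\.\w.)
#   - is not preceded by a form such as "Mr."     (?<![A-Z][a-z]\.)
SENTENCE_BOUNDARY = re.compile(r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s")


def split_sentences(text: str) -> List[str]:
    """Split text at sentence boundaries using the pattern above."""
    return SENTENCE_BOUNDARY.split(text)


if __name__ == "__main__":
    sample = "Copilot drafts the email. Does it help? Yes, i.e. it saves time. Mr. Lee agrees."
    for sentence in split_sentences(sample):
        print(sentence)
```

The pattern is a heuristic: it does not split after "!", and it will not split after a sentence that ends in a capitalized two-letter word such as "Go.", because that looks the same as an abbreviation like "Mr.".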

## Cleaning up

To clean up the Azure resources used in this example, delete the individual resources you created for it.

If you made a resource group specifically to run this example, you could instead [delete the resource group](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/delete-resource-group).

In [None]:
if should_cleanup:
    # {{TODO: Add resource cleanup}}
    pass