# REST API Video Samples

## Objective
Run question answering over video input with GPT-4 Turbo with Vision by sampling frames from a video and sending them to the model.

## Time

You should expect to spend 5-10 minutes running this sample.

## Before you begin

### Prerequisites

- An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services/)
- An Azure AI resource. For guidance on setting up an Azure AI resource, see: [How to create and manage an Azure AI resource](https://review.learn.microsoft.com/en-us/azure/ai-studio/how-to/create-azure-ai-resource?branch=main)

#### Installation

In [None]:
%pip install -r ../requirements.txt

### Parameters
Set the following configuration values: `GPT-4V_DEPLOYMENT_NAME`, `OPENAI_API_BASE`, and `OPENAI_API_VERSION`.

Add `OPENAI_API_KEY` as an environment variable, with your API key as its value.

Windows users:

    setx OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

macOS/Linux users:

    export OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
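
Before continuing, it can help to confirm that the notebook kernel actually sees the key. On Windows, `setx` only affects processes started afterwards, so a kernel that is already running may need a restart. A minimal sketch of such a check follows; the `missing_env_vars` helper is hypothetical and not part of this sample's shared functions.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Check the variable this sample relies on (hypothetical helper, see above)
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("OPENAI_API_KEY is set.")
```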

In [None]:
# Setting up the deployment name
deployment_name: str = "<your GPT-4 Turbo with Vision deployment name>"
# The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
openai_api_base: str = "<your resource base URL>"
# The Azure OpenAI API version to use. Choose a version that supports GPT-4 Turbo with Vision.
# Versions follow the YYYY-MM-DD date structure, optionally followed by a suffix such as -preview.
openai_api_version: str = "<your OpenAI API version>"

# Insert your video SAS URL, e.g. https://<your-storage-account-name>.blob.core.windows.net/<your-container-name>/<your-video-name>?<SAS-token>
video_SAS_url: str = "https://gpt4vsamples.blob.core.windows.net/videos/Microsoft%20Copilot%20Short.mp4"

should_cleanup: bool = False
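
The deployment name, base URL, and API version typically combine into the REST endpoint that receives the request. The sketch below assumes the standard Azure OpenAI chat completions route. `build_chat_completions_url` is a hypothetical helper, and the `call_GPT4V` function in `shared_functions` may build its URL differently. The values in the usage line are placeholders.

```python
from urllib.parse import quote, urlencode


def build_chat_completions_url(api_base: str, deployment: str, api_version: str) -> str:
    """Assemble a chat completions URL (assumed route, not taken from shared_functions)."""
    base = api_base.rstrip("/")  # Tolerate a trailing slash in the base URL
    path = f"/openai/deployments/{quote(deployment, safe='')}/chat/completions"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


# Placeholder values for illustration only
print(
    build_chat_completions_url(
        "https://example-resource.openai.azure.com/", "my-gpt4v", "2023-12-01-preview"
    )
)
```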

## Connect to your project
To start, create a config file with your project details. This sample and other samples can use the file to connect to your workspace.

In [None]:
import json
from pathlib import Path

config = {
    "GPT-4V_DEPLOYMENT_NAME": deployment_name,
    "OPENAI_API_BASE": openai_api_base,
    "OPENAI_API_VERSION": openai_api_version,
}

p = Path("../config.json")

with p.open(mode="w") as file:
    json.dump(config, file)
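
Other notebooks can then read the same file back. A minimal sketch, assuming the three keys written in the cell above: `load_config` and `REQUIRED_KEYS` are illustrative names rather than part of `shared_functions`, and the demonstration uses a temporary file so that it does not touch `../config.json`.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("GPT-4V_DEPLOYMENT_NAME", "OPENAI_API_BASE", "OPENAI_API_VERSION")


def load_config(path: Path) -> dict:
    """Load a JSON config file and fail early if an expected key is missing."""
    with path.open() as file:
        loaded = json.load(file)
    missing = [key for key in REQUIRED_KEYS if key not in loaded]
    if missing:
        raise KeyError(f"Config is missing keys: {missing}")
    return loaded


# Round trip through a temporary file with placeholder values
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "config.json"
    demo_path.write_text(json.dumps(dict.fromkeys(REQUIRED_KEYS, "placeholder")))
    demo_config = load_config(demo_path)
    print(sorted(demo_config))
```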

## Run this Example

In [None]:
import re
import sys

# Make the parent directory importable so that shared_functions can be found
parent_dir = Path.cwd().parent
sys.path.append(str(parent_dir))
from shared_functions import call_GPT4V, sample_frames, download_video
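
`sample_frames` is defined in `shared_functions` and is not shown in this notebook. As a rough illustration of what uniform sampling means, the sketch below picks the midpoint of each of `num_frames` equal segments of the clip. The `uniform_timestamps` helper and the midpoint choice are assumptions; the real helper may choose its timestamps differently.

```python
from typing import List


def uniform_timestamps(duration: float, num_frames: int) -> List[float]:
    """Return the midpoints (in seconds) of num_frames equal segments of the clip."""
    if duration <= 0 or num_frames <= 0:
        return []
    segment = duration / num_frames
    return [round((i + 0.5) * segment, 2) for i in range(num_frames)]


# Ten timestamps for a hypothetical 30-second clip
print(uniform_timestamps(30.0, 10))
```

With moviepy, a frame at time `t` could then be read with `video.get_frame(t)` before being encoded for the request.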

### Call GPT-4 Turbo with Vision API

In [None]:
from moviepy.editor import VideoFileClip

# System messages and user prompt
sys_message = """
Your task is to assist in analyzing and optimizing creative assets. 
You will be presented with advertisement videos for products. 
First describe the video in detail paying close attention to Product characteristics highlighted, 
Background images, Lighting, Color Palette and Human characteristics for persons in the video. 
Finally provide a summary of the video and talk about the main message the advertisement video tries to convey to the viewer. 
"""
user_prompt = "Summarize the ad video."

local_file_path = "downloaded_video.mp4"
assert download_video(video_SAS_url, local_file_path), "Failed to download video."

num_frames_to_sample = 10
video = VideoFileClip(local_file_path)
base64_frames = sample_frames(video, num_frames_to_sample)
content = [{"type": "text", "text": user_prompt}]
content.append({"type": "text", "text": f"Below are {num_frames_to_sample} frames uniformly sampled from the video."})
for frame in base64_frames:
    content.append({"type": "text", "text": f"Frame at time {frame[0]} seconds:"})
    content.append({"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{frame[1]}", "detail": "low"}})

messages = [
    {"role": "system", "content": [{"type": "text", "text": sys_message}]},
    {
        "role": "user",
        "content": content,
    },  # User prompt followed by the sampled frames
]

# Call GPT-4 Turbo with Vision API and print the response
try:
    response = call_GPT4V(messages)
    text = response["choices"][0]["message"]["content"]
    sentences = re.split(r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s", text)
    for sentence in sentences:  # Print the content of the response
        print(sentence)
except Exception as e:
    print(f"Failed to call GPT-4 Turbo with Vision API. Error: {e}")
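
The regular expression in the cell above splits the response at whitespace that follows a period or question mark, while skipping abbreviations such as `Mr.` or `e.g.`. The sketch below applies the same pattern to a made-up string (not real model output) so the behavior can be checked without calling the API. `split_sentences` is an illustrative name.

```python
import re

# Same pattern as the cell above, compiled once for reuse
SENTENCE_BOUNDARY = re.compile(r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s")


def split_sentences(text: str) -> list:
    """Split text into sentences at whitespace that follows '.' or '?'."""
    return SENTENCE_BOUNDARY.split(text)


# Made-up sample text for illustration
sample = "The ad features Mr. Smith at a desk. Is the lighting warm? Yes."
for sentence in split_sentences(sample):
    print(sentence)
```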

## Cleaning up

To clean up all Azure resources used in this example, delete the individual resources you created in this tutorial.

If you made a resource group specifically to run this example, you could instead [delete the resource group](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/delete-resource-group).

In [None]:
if should_cleanup:
    video.close()  # Release the file handle held by moviepy before deleting
    Path(local_file_path).unlink()  # Delete the downloaded video
    # TODO: Add resource cleanup