# Generating Video Chapters

This notebook demonstrates how to automatically generate video chapters using **Azure Content Understanding**. The service analyzes the video and uses a large language model (LLM) to segment it into meaningful chapters and scenes, each with a concise description and timestamps. The result is a structured table of contents for the video.

## Pre-requisites
1. Please follow the [README](../README.md#configure-azure-ai-service-resource) to create the essential resource needed for this sample.
2. Install the required packages.

In [None]:
%pip install -r ../requirements.txt

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class providing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, this acts as a lightweight SDK. Please fill in the constants **AZURE_AI_ENDPOINT** and **AZURE_AI_API_VERSION** with your Azure AI Service information. Optionally, you may provide **AZURE_AI_API_KEY** if your setup requires key-based authentication.

> ⚠️ Important:
Please update the code below to match your Azure authentication method.
Look for the `# IMPORTANT` comments and kindly modify those sections accordingly.
Skipping this step may cause the sample to not run correctly.

> ⚠️ Note: A subscription key works, but a token provider backed by Microsoft Entra ID (formerly Azure Active Directory, AAD) is safer and strongly recommended for production environments.

In [None]:
import logging
import json
import os
import sys
import uuid
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

load_dotenv(find_dotenv())
logging.basicConfig(level=logging.INFO)

# For authentication, you can use either token-based authentication or a subscription key; only one method is required.
AZURE_AI_ENDPOINT = os.getenv("AZURE_AI_ENDPOINT")
# IMPORTANT: Please replace with your actual subscription key or set it in the ".env" file if not using token authentication.
AZURE_AI_API_KEY = os.getenv("AZURE_AI_API_KEY")
AZURE_AI_API_VERSION = os.getenv("AZURE_AI_API_VERSION", "2025-05-01-preview")

# Add the parent directory to the path to use shared modules
parent_dir = Path.cwd().parent
sys.path.append(str(parent_dir))
from python.content_understanding_client import AzureContentUnderstandingClient

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    # IMPORTANT: Comment out token_provider if using subscription key
    token_provider=token_provider,
    # IMPORTANT: Uncomment this if using subscription key
    # subscription_key=AZURE_AI_API_KEY,
    x_ms_useragent="azure-ai-content-understanding-python/video_chapters",  # This header is used for sample usage telemetry; please comment out this line if you wish to opt out.
)
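The two `# IMPORTANT` toggles above can also be expressed as a small helper, so the authentication method follows the environment instead of being edited by hand. This is a minimal sketch, not part of the sample: `build_auth_kwargs` is a hypothetical name, and it assumes only the `token_provider` and `subscription_key` keyword arguments shown in the cell above.

```python
from typing import Any, Callable, Dict, Optional


def build_auth_kwargs(
    api_key: Optional[str],
    token_provider: Optional[Callable[[], str]],
) -> Dict[str, Any]:
    """Return exactly one authentication keyword argument for the client.

    Hypothetical helper: a non-empty key is treated as an explicit opt-in to
    key-based authentication; otherwise the token provider is used.
    """
    if api_key:
        return {"subscription_key": api_key}
    if token_provider is not None:
        return {"token_provider": token_provider}
    raise ValueError("Provide either AZURE_AI_API_KEY or a token provider.")
```

With this helper, the client could be built by passing `**build_auth_kwargs(AZURE_AI_API_KEY, token_provider)` in place of the two commented lines. The key takes precedence only because `DefaultAzureCredential()` can always be constructed, so checking the token provider first would never reach the key branch. Leaving `AZURE_AI_API_KEY` unset keeps the recommended token-based path.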

## File to Analyze

Use the following variable to specify the file to analyze. This tutorial uses a short example video. To try your own videos, replace the file path accordingly. For longer videos, you may need to increase the `timeout_seconds` argument passed to `poll_result`.

In [None]:
VIDEO_FILE_PATH = Path("../data/FlightSimulator.mp4")
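Before submitting your own video, it can help to confirm that the path resolves to a real file and to see how large it is, since larger files generally need a longer polling timeout. The following is a minimal sketch; `describe_video_file` is a hypothetical helper, not part of the sample utilities.

```python
from pathlib import Path


def describe_video_file(path: Path) -> str:
    """Return a short size summary, or raise if the file is missing."""
    if not path.is_file():
        raise FileNotFoundError(f"Video file not found: {path}")
    # Report the size in mebibytes, rounded to one decimal place.
    size_mb = path.stat().st_size / (1024 * 1024)
    return f"{path.name}: {size_mb:.1f} MB"
```

Calling `describe_video_file(VIDEO_FILE_PATH)` early surfaces a wrong relative path before any request is sent to the service.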

## 1. Automated Chapter Discovery

We first let Content Understanding call an LLM and apply its own reasoning to segment the video into meaningful chapters, each with a concise description and timestamps.

The custom analyzer schema template for this approach is [video_chapters_dynamic.json](../analyzer_templates/video_chapters_dynamic.json). This file defines the schema and configuration for a custom video analyzer that uses AI to dynamically generate chapters and scenes based on its understanding of the video content. The service analyzes the video and identifies the most meaningful chapter divisions, providing start and end timestamps for each. The configuration section allows flexible segmentation, enabling the model to reason about the content and organize chapters appropriately.

In this example, we will use the utility class `AzureContentUnderstandingClient` to load the analyzer schema from the template file and submit it to the Azure Content Understanding service. Then, we will analyze the video and generate the desired chapter and scene structure.

### Load Analyzer Template

In [None]:
analyzer_template_path = "../analyzer_templates/video_chapters_dynamic.json"
with open(analyzer_template_path, 'r') as f:
    template_content = json.load(f)
    print(json.dumps(template_content, indent=2))

### Create and Run Video Chapter Analyzer

In [None]:
video_analyzer_id = "video_scene_chapter" + "_" + str(uuid.uuid4())

print(f"Creating chapter video analyzer: {video_analyzer_id}")
response = client.begin_create_analyzer(video_analyzer_id, analyzer_template_path=analyzer_template_path)
result = client.poll_result(response)
print("✅ Chapter video analyzer created successfully!")

print(f"Analyzing chapter video: {VIDEO_FILE_PATH}")
print("⏳ Note: Video analysis may take significantly longer than document analysis...")
response = client.begin_analyze(video_analyzer_id, file_location=VIDEO_FILE_PATH)
result_json = client.poll_result(response, timeout_seconds=360)  # Video analysis may take longer, please adjust timeout as needed

print("Video Content Understanding result: ")
print(json.dumps(result_json, indent=2))  # Print the analysis result, not the analyzer-creation result

### Extract Video Content from Analyzed Result
Use the following utility to display results for your visual review.

In [None]:
from IPython.display import display, HTML
from python.chapters_utility import ChaptersFormatter

# For dynamic chaptering, pass dynamic=True
full_html = ChaptersFormatter.format_chapters_output(VIDEO_FILE_PATH, result_json, dynamic=True)
display(HTML(full_html))
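Outside a notebook, the same chapter information can be rendered as a plain-text table of contents. The sketch below is illustrative only. It assumes each chapter has already been extracted into a dictionary with the hypothetical keys `start_ms`, `end_ms`, and `title`, which are not the service's actual output field names.

```python
from typing import Dict, List, Union

# Hypothetical, simplified chapter record; not the service's response schema.
Chapter = Dict[str, Union[int, str]]


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to an HH:MM:SS string (fractions are dropped)."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_table_of_contents(chapters: List[Chapter]) -> List[str]:
    """Return one line per chapter, ordered by start time."""
    ordered = sorted(chapters, key=lambda c: int(c["start_ms"]))
    return [
        f"{format_timestamp(int(c['start_ms']))} - "
        f"{format_timestamp(int(c['end_ms']))}  {c['title']}"
        for c in ordered
    ]
```

For example, a chapter starting at 75,000 ms is listed under `00:01:15`.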

### [Optional] Clean Up Video Chapter Analyzer

Note: In production environments, analyzers are typically kept for reuse rather than deleted.

In [None]:
client.delete_analyzer(video_analyzer_id)
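When experimenting, an analyzer can be left behind if a cell fails before the cleanup step runs. One way to guard against that is a context manager that always calls `delete_analyzer` on exit. This is a sketch under the assumption that the client exposes only the `delete_analyzer(analyzer_id)` method used above; `temporary_analyzer` is a hypothetical name.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def temporary_analyzer(client: Any, analyzer_id: str) -> Iterator[str]:
    """Yield the analyzer ID and delete the analyzer when the block exits.

    Hypothetical helper for samples and tests; it relies only on the
    client's delete_analyzer method.
    """
    try:
        yield analyzer_id
    finally:
        # Runs on normal exit and when the block raises.
        client.delete_analyzer(analyzer_id)
```

Creation and analysis would then go inside `with temporary_analyzer(client, video_analyzer_id):`. This pattern suits throwaway analyzers. For analyzers you intend to reuse, skip it.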

## 2. Structured Chapter Creation

Alternatively, you can define a specific structure for chaptering that Content Understanding will use to guide the model. When your request includes information detailing clear chapter types and organization, the service can instruct the model to segment the video according to your desired structure, ensuring consistent and predictable chapter creation.

The custom analyzer schema template for this approach is [video_chapters_structured.json](../analyzer_templates/video_chapters_structured.json). This file defines the schema and configuration for a custom video analyzer. In this example, it specifies how a video should be segmented into chapters and scenes, including three chapter types: **"Topic Introduction"**, **"Details About the Work Done"**, and **"Conclusion or Results"**. Each segment contains a list of scenes, with each scene described by a short description, start timestamp, and end timestamp. The configuration section controls segmentation behavior and other analysis options, while the fieldSchema section outlines the expected structure of the output, ensuring chapters and scenes are clearly organized and non-overlapping.
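Because the schema asks for non-overlapping chapters and scenes, a quick check on the returned time ranges can confirm that the output meets that expectation. The sketch below assumes the start and end timestamps have already been extracted into `(start_ms, end_ms)` tuples. That representation is an assumption for illustration, not the service's response format.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (start_ms, end_ms), an assumed representation


def find_overlaps(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Return pairs of segments whose time ranges overlap.

    Segments that merely touch (one ends exactly where the next starts)
    are not reported as overlapping.
    """
    overlaps: List[Tuple[Segment, Segment]] = []
    latest: Optional[Segment] = None  # segment with the furthest end so far
    for current in sorted(segments):
        if latest is not None and current[0] < latest[1]:
            overlaps.append((latest, current))
        if latest is None or current[1] > latest[1]:
            latest = current
    return overlaps
```

An empty list means the segments are cleanly ordered and do not overlap.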

As in the first approach, we use the utility class `AzureContentUnderstandingClient` to load the analyzer schema from the template file and submit it to the Azure Content Understanding service, then analyze the video to generate the chapter and scene structure.

### Load Analyzer Template

In [None]:
analyzer_template_path = "../analyzer_templates/video_chapters_structured.json"
with open(analyzer_template_path, 'r') as f:
    template_content = json.load(f)
    print(json.dumps(template_content, indent=2))
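Printing the whole template can be verbose. A compact summary of the configuration and `fieldSchema` sections described above is sometimes easier to scan. This sketch assumes the loaded template is a dictionary with a `config` object and a `fieldSchema` object containing a `fields` mapping. The `config` and `fields` key names are assumptions about the template layout, so adjust them to match your file.

```python
from typing import Any, Dict, List


def summarize_template(template: Dict[str, Any]) -> Dict[str, List[str]]:
    """List the configuration option names and top-level field names.

    The "config" and "fields" keys are assumed; missing sections
    produce empty lists rather than errors.
    """
    config = template.get("config") or {}
    fields = (template.get("fieldSchema") or {}).get("fields") or {}
    return {
        "config_keys": sorted(config),
        "field_names": sorted(fields),
    }
```

Calling `summarize_template(template_content)` after loading the JSON gives a quick view of which options and fields the analyzer defines.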

### Create and Run Video Chapter Analyzer

In [None]:
video_analyzer_id = "video_scene_chapter" + "_" + str(uuid.uuid4())

print(f"Creating chapter video analyzer: {video_analyzer_id}")
response = client.begin_create_analyzer(video_analyzer_id, analyzer_template_path=analyzer_template_path)
result = client.poll_result(response)
print("✅ Chapter video analyzer created successfully!")

print(f"Analyzing chapter video: {VIDEO_FILE_PATH}")
print("⏳ Note: Video analysis may take significantly longer than document analysis...")
response = client.begin_analyze(video_analyzer_id, file_location=VIDEO_FILE_PATH)
result_json = client.poll_result(response, timeout_seconds=360)  # Video analysis may take longer, please adjust the timeout as needed

print("Video Content Understanding result: ")
print(json.dumps(result_json, indent=2))  # Print the analysis result, not the analyzer-creation result

### Extract Video Content from Analyzed Result
Use the following utility to display results for your visual review.

In [None]:
from IPython.display import display, HTML
from python.chapters_utility import ChaptersFormatter

# For structured chaptering, use the default (dynamic=False)
full_html = ChaptersFormatter.format_chapters_output(VIDEO_FILE_PATH, result_json)
display(HTML(full_html))

### [Optional] Clean Up Video Chapter Analyzer

Note: In production environments, analyzers are typically kept for reuse rather than deleted.

In [None]:
client.delete_analyzer(video_analyzer_id)

# Conclusion

This notebook demonstrated how **Azure Content Understanding** can automatically generate chapters and scenes from video content, either by letting the model choose the segmentation or by guiding it with a predefined chapter structure. The resulting table of contents makes long videos easier to navigate and search.