# Extract Content from Your File

This notebook demonstrates how to use the Content Understanding API to extract semantic content from multimodal files.

## Prerequisites
1. Ensure your Azure AI service is configured by following the [configuration steps](../README.md#configure-azure-ai-service-resource).
2. Install the required packages to run this sample.

In [None]:
%pip install -r ../requirements.txt

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class that provides functions to interact with the Content Understanding API. Until the official Content Understanding SDK is released, it serves as a lightweight SDK.
>
> Fill in the constants **AZURE_AI_ENDPOINT**, **API_VERSION**, and **AZURE_AI_API_KEY** with the details from your Azure AI Service. **AZURE_AI_API_KEY** is only needed if you use key-based authentication instead of token authentication.

> ⚠️ Important:
> You must update the code below to use your preferred Azure authentication method.
> Look for the `# IMPORTANT` comments in the code and modify those sections accordingly.
> Skipping this step may prevent the sample from running correctly.

> ⚠️ Note: While using a subscription key is supported, it is strongly recommended to use a token provider with Azure Active Directory (AAD) for enhanced security in production environments.

In [None]:
from datetime import datetime
import logging
import os
import sys
from typing import Any, Optional
from dotenv import find_dotenv, load_dotenv

# Add the parent directory to the Python path to import the sample_helper module
sys.path.append(os.path.join(os.path.dirname(os.getcwd()), 'python'))
from content_understanding_client import AzureContentUnderstandingClient
from extension.sample_helper import save_json_to_file 
from azure.identity import DefaultAzureCredential

load_dotenv(find_dotenv())
logging.basicConfig(level=logging.INFO)

# For authentication, you can use either token-based auth or subscription key; only one is required
AZURE_AI_ENDPOINT = os.getenv("AZURE_AI_ENDPOINT")
# IMPORTANT: Replace with your actual subscription key or set it in your ".env" file if not using token authentication
AZURE_AI_API_KEY = os.getenv("AZURE_AI_API_KEY")
API_VERSION = "2025-11-01"

# Create token provider for Azure AD authentication
def token_provider():
    credential = DefaultAzureCredential()
    token = credential.get_token("https://cognitiveservices.azure.com/.default")
    return token.token

# Create the Content Understanding client
try:
    client = AzureContentUnderstandingClient(
        endpoint=AZURE_AI_ENDPOINT,
        api_version=API_VERSION,
        subscription_key=AZURE_AI_API_KEY,
        token_provider=token_provider if not AZURE_AI_API_KEY else None,
        x_ms_useragent="azure-ai-content-understanding-python-sample-ga"    # The user agent is used for tracking sample usage and does not provide identity information. You can change this if you want to opt out of tracking.
    )
    credential_type = "Subscription Key" if AZURE_AI_API_KEY else "Azure AD Token"
    print("✅ Client created successfully")
    print(f"   Endpoint: {AZURE_AI_ENDPOINT}")
    print(f"   Credential: {credential_type}")
    print(f"   API Version: {API_VERSION}")
except Exception as e:
    credential_type = "Subscription Key" if AZURE_AI_API_KEY else "Azure AD Token"
    print("❌ Failed to create client")
    print(f"   Endpoint: {AZURE_AI_ENDPOINT}")
    print(f"   Credential: {credential_type}")
    print(f"   Error: {e}")
    raise

## Configure Model Deployments for Prebuilt Analyzers

> **💡 Note:** This step is only required **once per Azure Content Understanding resource**, unless the GPT deployment has been changed. You can skip this section if:
> - This configuration has already been run once for your resource, or
> - Your administrator has already configured the model deployments for you

Before using prebuilt analyzers, you need to configure the default model deployment mappings. This tells Content Understanding which model deployments to use.

**Model Requirements:**
- **GPT-4.1** - Required for most prebuilt analyzers (e.g., `prebuilt-invoice`, `prebuilt-receipt`, `prebuilt-idDocument`)
- **GPT-4.1-mini** - Required for RAG analyzers (e.g., `prebuilt-documentSearch`, `prebuilt-audioSearch`, `prebuilt-videoSearch`)
- **text-embedding-3-large** - Required for all prebuilt analyzers that use embeddings

**Prerequisites:**
1. Deploy **GPT-4.1**, **GPT-4.1-mini**, and **text-embedding-3-large** models in Azure AI Foundry (see README.md for instructions)
2. Set `GPT_4_1_DEPLOYMENT`, `GPT_4_1_MINI_DEPLOYMENT`, and `TEXT_EMBEDDING_3_LARGE_DEPLOYMENT` in your `.env` file with the deployment names

In [None]:
# Get model deployment names from environment variables
GPT_4_1_DEPLOYMENT = os.getenv("GPT_4_1_DEPLOYMENT")
GPT_4_1_MINI_DEPLOYMENT = os.getenv("GPT_4_1_MINI_DEPLOYMENT")
TEXT_EMBEDDING_3_LARGE_DEPLOYMENT = os.getenv("TEXT_EMBEDDING_3_LARGE_DEPLOYMENT")

# Check if required deployments are configured
missing_deployments = []
if not GPT_4_1_DEPLOYMENT:
    missing_deployments.append("GPT_4_1_DEPLOYMENT")
if not GPT_4_1_MINI_DEPLOYMENT:
    missing_deployments.append("GPT_4_1_MINI_DEPLOYMENT")
if not TEXT_EMBEDDING_3_LARGE_DEPLOYMENT:
    missing_deployments.append("TEXT_EMBEDDING_3_LARGE_DEPLOYMENT")

if missing_deployments:
    print("⚠️  Warning: Missing required model deployment configuration(s):")
    for deployment in missing_deployments:
        print(f"   - {deployment}")
    print("\n   Prebuilt analyzers require GPT-4.1, GPT-4.1-mini, and text-embedding-3-large deployments.")
    print("   Please:")
    print("   1. Deploy all three models in Azure AI Foundry")
    print("   2. Add the following to notebooks/.env:")
    print("      GPT_4_1_DEPLOYMENT=<your-gpt-4.1-deployment-name>")
    print("      GPT_4_1_MINI_DEPLOYMENT=<your-gpt-4.1-mini-deployment-name>")
    print("      TEXT_EMBEDDING_3_LARGE_DEPLOYMENT=<your-text-embedding-3-large-deployment-name>")
    print("   3. Restart the kernel and run this cell again")
else:
    print("📋 Configuring default model deployments...")
    print(f"   GPT-4.1 deployment: {GPT_4_1_DEPLOYMENT}")
    print(f"   GPT-4.1-mini deployment: {GPT_4_1_MINI_DEPLOYMENT}")
    print(f"   text-embedding-3-large deployment: {TEXT_EMBEDDING_3_LARGE_DEPLOYMENT}")
    
    try:
        # Update defaults to map model names to your deployments
        result = client.update_defaults({
            "gpt-4.1": GPT_4_1_DEPLOYMENT,
            "gpt-4.1-mini": GPT_4_1_MINI_DEPLOYMENT,
            "text-embedding-3-large": TEXT_EMBEDDING_3_LARGE_DEPLOYMENT
        })
        
        print("✅ Default model deployments configured successfully")
        print("   Model mappings:")
        for model, deployment in result.get("modelDeployments", {}).items():
            print(f"     {model} → {deployment}")
    except Exception as e:
        print(f"❌ Failed to configure defaults: {e}")
        print(f"   This may happen if:")
        print(f"   - One or more deployment names don't exist in your Azure AI Foundry project")
        print(f"   - You don't have permission to update defaults")
        raise

## Document Content

The `prebuilt-documentSearch` analyzer transforms unstructured documents into structured, machine-readable data optimized for retrieval-augmented generation (RAG) and automated workflows. It extracts content and layout elements while preserving document structure and semantic relationships.

Key capabilities include:
1. **Content Analysis:** Extracts text (printed and handwritten), selection marks, barcodes (12+ types), mathematical formulas (LaTeX), hyperlinks, and annotations.
2. **Figure Analysis:** Generates descriptions for images/charts/diagrams, converts charts to Chart.js syntax, and diagrams to Mermaid.js syntax.
3. **Structure Analysis:** Identifies paragraphs with contextual roles (title, section heading, page header/footer), detects tables with complex layouts (merged cells, multi-page), and maps hierarchical sections.
4. **GitHub Flavored Markdown:** Outputs richly formatted markdown that preserves document structure for LLM comprehension and AI-powered analysis.
5. **Broad Format Support:** Processes PDFs, images, Office documents (Word, Excel, PowerPoint), text files (HTML, Markdown), structured files (XML, JSON, CSV), and email formats (EML, MSG).

For detailed information about document elements and markdown representation, see [Document elements](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/document/elements) and [Document markdown](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/document/markdown).

> **Note:** Figure analysis (descriptions and chart/diagram analysis) is only supported for PDF and image file formats.

In [None]:
# Analyze document from local file
analyzer_sample_file = '../data/invoice.pdf'
analyzer_id = 'prebuilt-documentSearch'

print(f"🔍 Analyzing {analyzer_sample_file} with {analyzer_id}...")
response = client.begin_analyze_binary(
    analyzer_id=analyzer_id,
    file_location=analyzer_sample_file,
)

result = client.poll_result(response)

print("\n📄 Markdown Content:")
print("=" * 50)
# Extract markdown from the first content element; fall back to an empty dict
# so the checks below still work when the service returns no contents
contents = result.get("result", {}).get("contents", [])
content = contents[0] if contents else {}
markdown = content.get("markdown", "")
print(markdown)
print("=" * 50)

# Check if this is document content to access document-specific properties
if content.get("kind") == "document":
    document_content = content
    start_page = document_content.get("startPageNumber")
    end_page = document_content.get("endPageNumber")
    print("\n📚 Document Information:")
    print(f"Start page: {start_page}")
    print(f"End page: {end_page}")
    if start_page is not None and end_page is not None:
        print(f"Total pages: {end_page - start_page + 1}")

    # Check for pages
    pages = document_content.get("pages")
    if pages is not None:
        unit = document_content.get("unit", "units")
        print(f"\n📄 Pages ({len(pages)}):")
        for page in pages:
            print(f"  Page {page.get('pageNumber')}: {page.get('width')} x {page.get('height')} {unit}")

    # Check if there are tables in the document
    tables = document_content.get("tables")
    if tables is not None:
        print(f"\n📊 Tables ({len(tables)}):")
        for table_number, table in enumerate(tables, start=1):
            row_count = table.get("rowCount")
            col_count = table.get("columnCount")
            print(f"  Table {table_number}: {row_count} rows x {col_count} columns")
else:
    print("\n📚 Document Information: Not available for this content type")

# Save the result
saved_json_path = save_json_to_file(result, filename_prefix="content_analyzers_analyze_binary")
print(f"\n📋 Full analysis result saved. Review the complete JSON at: {saved_json_path}")
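
The inspection logic in the cell above can also be packaged as a small helper that returns a summary instead of printing it. The following is a minimal sketch, not part of the sample: `summarize_document` is a hypothetical name, the sample dictionary is hand-written rather than real service output, and the helper assumes only the result fields this notebook already reads (`kind`, `markdown`, `startPageNumber`, `endPageNumber`, `tables`, `rowCount`, `columnCount`).

```python
from typing import Any, Dict, List, Optional


def summarize_document(result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a compact summary of the first content element, or None.

    Hypothetical helper: returns None when there are no contents or when the
    first content element is not of kind "document".
    """
    contents: List[Dict[str, Any]] = result.get("result", {}).get("contents", [])
    if not contents or contents[0].get("kind") != "document":
        return None

    doc = contents[0]
    start = doc.get("startPageNumber")
    end = doc.get("endPageNumber")
    # Only compute a page count when both bounds are present
    page_count = end - start + 1 if start is not None and end is not None else None
    tables = doc.get("tables") or []
    return {
        "page_count": page_count,
        "markdown_chars": len(doc.get("markdown", "")),
        "table_shapes": [(t.get("rowCount"), t.get("columnCount")) for t in tables],
    }


# Hand-written sample shaped like the fields used in this notebook (assumption)
sample_result = {
    "result": {
        "contents": [
            {
                "kind": "document",
                "markdown": "# Invoice",
                "startPageNumber": 1,
                "endPageNumber": 2,
                "tables": [{"rowCount": 3, "columnCount": 4}],
            }
        ]
    }
}

summary = summarize_document(sample_result)
print(summary)
```

Returning `None` for missing or non-document content lets callers decide how to report the gap, rather than failing on an undefined variable.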

### Analyzing Documents from URLs

You can also analyze documents directly from publicly accessible URLs without downloading them first. This is useful for processing documents hosted on web servers, cloud storage, or GitHub repositories.
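
If the same code path needs to accept both local files and URLs, one option is to inspect the location string and dispatch accordingly. The sketch below is illustrative only: `is_http_url` and `begin_analyze` are hypothetical helpers, and the only client calls used are `begin_analyze_url` and `begin_analyze_binary` with the keyword arguments shown in this notebook. A stand-in client is used so the example runs without an Azure resource.

```python
from urllib.parse import urlparse


def is_http_url(location: str) -> bool:
    """Return True when location looks like an http(s) URL rather than a file path."""
    parsed = urlparse(location)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def begin_analyze(client, analyzer_id: str, location: str):
    """Dispatch to the URL or binary entry point of the sample client (hypothetical helper)."""
    if is_http_url(location):
        return client.begin_analyze_url(analyzer_id=analyzer_id, url=location)
    return client.begin_analyze_binary(analyzer_id=analyzer_id, file_location=location)


class FakeClient:
    """Stand-in for AzureContentUnderstandingClient, for illustration only."""

    def begin_analyze_url(self, analyzer_id: str, url: str):
        return ("url", analyzer_id, url)

    def begin_analyze_binary(self, analyzer_id: str, file_location: str):
        return ("binary", analyzer_id, file_location)


fake = FakeClient()
print(begin_analyze(fake, "prebuilt-documentSearch", "https://example.com/invoice.pdf"))
print(begin_analyze(fake, "prebuilt-documentSearch", "../data/invoice.pdf"))
```

Checking the scheme explicitly, instead of testing for any non-empty scheme, keeps Windows paths such as `C:\data\invoice.pdf` on the binary path.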

In [None]:
# Analyze document from URL
document_url = 'https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/invoice.pdf'
analyzer_id = 'prebuilt-documentSearch'

print(f"🔍 Analyzing document from URL: {document_url}")
print(f"📊 Using analyzer: {analyzer_id}\n")

response = client.begin_analyze_url(
    analyzer_id=analyzer_id,
    url=document_url,
)

result = client.poll_result(response)

print("\n📄 Markdown Content:")
print("=" * 50)
# Extract markdown from the first content element; fall back to an empty dict
# so the checks below still work when the service returns no contents
contents = result.get("result", {}).get("contents", [])
content = contents[0] if contents else {}
markdown = content.get("markdown", "")
print(markdown)
print("=" * 50)

# Check if this is document content to access document-specific properties
if content.get("kind") == "document":
    document_content = content
    start_page = document_content.get("startPageNumber")
    end_page = document_content.get("endPageNumber")
    print("\n📚 Document Information:")
    print(f"Start page: {start_page}")
    print(f"End page: {end_page}")
    if start_page is not None and end_page is not None:
        print(f"Total pages: {end_page - start_page + 1}")

    # Check for pages
    pages = document_content.get("pages")
    if pages is not None:
        unit = document_content.get("unit", "units")
        print(f"\n📄 Pages ({len(pages)}):")
        for page in pages:
            print(f"  Page {page.get('pageNumber')}: {page.get('width')} x {page.get('height')} {unit}")

    # Check if there are tables in the document
    tables = document_content.get("tables")
    if tables is not None:
        print(f"\n📊 Tables ({len(tables)}):")
        for table_number, table in enumerate(tables, start=1):
            row_count = table.get("rowCount")
            col_count = table.get("columnCount")
            print(f"  Table {table_number}: {row_count} rows x {col_count} columns")
else:
    print("\n📚 Document Information: Not available for this content type")

# Save the result
saved_json_path = save_json_to_file(result, filename_prefix="content_analyzers_url_document")
print(f"\n📋 Full analysis result saved. Review the complete JSON at: {saved_json_path}")

## Audio Content
The `prebuilt-audioSearch` analyzer provides conversation analysis capabilities for audio files. It automatically transcribes audio content, performs speaker diarization to distinguish between speakers, and generates conversation summaries. The analyzer supports multilingual transcription and outputs transcripts in standard WebVTT format.

Key features include:
1. **Transcription:** Converts conversational audio into searchable text with sentence-level and word-level timestamps.
2. **Speaker Diarization:** Distinguishes between speakers in a conversation, attributing parts of the transcript to specific speakers (e.g., "Speaker 1", "Speaker 2").
3. **Timing Information:** Precise timing data in milliseconds (startTimeMs, endTimeMs) for each phrase, crucial for audio-text synchronization.
4. **Summary Generation:** Automatically generates a summary of the conversation for quick understanding.
5. **Multilingual Support:** Supports automatic language detection and multilingual transcription across multiple locales.
6. **Markdown Output:** Structured markdown format with WebVTT transcripts preserving speaker identification and timing.

For detailed information about audio markdown format and capabilities, see [Audio overview](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/audio/overview) and [AudioVisual markdown representation](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/video/markdown).
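
Because timing data is reported in milliseconds while WebVTT cues use `hh:mm:ss.ttt` timestamps, a small conversion helper can be handy when aligning the two. This is a minimal sketch under that assumption; `ms_to_webvtt` and `cue_timing` are hypothetical names, and the sketch makes no claim about the exact formatting the service emits.

```python
def ms_to_webvtt(ms: int) -> str:
    """Format a millisecond offset as a WebVTT-style timestamp (hh:mm:ss.ttt)."""
    hours, remainder = divmod(ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def cue_timing(start_ms: int, end_ms: int) -> str:
    """Build a WebVTT cue timing line from start and end offsets in milliseconds."""
    return f"{ms_to_webvtt(start_ms)} --> {ms_to_webvtt(end_ms)}"


print(cue_timing(1_500, 3_723_004))  # 00:00:01.500 --> 01:02:03.004
```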

In [None]:
analyzer_sample_file = '../data/audio.wav'
analyzer_id = 'prebuilt-audioSearch'

# Analyze the audio file with the prebuilt analyzer
print(f"🔍 Analyzing audio file from path: {analyzer_sample_file} with analyzer '{analyzer_id}'...")

# Begin audio analysis operation
print(f"🎬 Starting audio analysis with analyzer '{analyzer_id}'...")
analysis_response = client.begin_analyze_binary(
    analyzer_id=analyzer_id,
    file_location=analyzer_sample_file,
)

# Wait for analysis completion
print("⏳ Waiting for audio analysis to complete...")
analysis_result = client.poll_result(analysis_response)
print("✅ Audio analysis completed successfully!")

print("\n📄 Markdown Content:")
print("=" * 50)
# Extract markdown from the first content element; fall back to an empty dict
# so the checks below still work when the service returns no contents
contents = analysis_result.get("result", {}).get("contents", [])
content = contents[0] if contents else {}
markdown = content.get("markdown", "")
print(markdown)
print("=" * 50)

# Check if this is audio-visual content to access audio-visual properties
if content.get("kind") == "audioVisual":
    audio_visual_content = content
    print("\n🎙️ Audio-Visual Information:")

    # Basic Audio-Visual Details
    try:
        start_time = audio_visual_content.get("startTimeMs")
        end_time = audio_visual_content.get("endTimeMs")
        duration_sec = (end_time - start_time) / 1000
        print(f"Start Time: {start_time} ms")
        print(f"End Time: {end_time} ms")
        print(f"Duration: {duration_sec:.2f} seconds")
    except (KeyError, TypeError):
        print("❌ Missing basic audio-visual content details.")

    # Transcript Phrases (limit to 10)
    transcript_phrases = audio_visual_content.get("transcriptPhrases", [])
    if transcript_phrases:
        print(f"\n📝 Transcript Phrases ({min(len(transcript_phrases), 10)}):")
        for idx, phrase in enumerate(transcript_phrases[:10]):
            print(f"  {idx + 1}. Speaker: {phrase.get('speaker')}")
            print(f"     Text: {phrase.get('text')}")
            print(f"     Start: {phrase.get('startTimeMs')} ms, End: {phrase.get('endTimeMs')} ms")
            confidence = phrase.get('confidence', 0)
            print(f"     Confidence: {confidence:.2%}")
            print(f"     Locale: {phrase.get('locale')}")
        if len(transcript_phrases) > 10:
            print(f"  ... and {len(transcript_phrases) - 10} more.")
    else:
        print("\n📝 No transcript phrases available.")

    # Markdown Preview
    if markdown:
        print("\n🎵 Markdown Content Preview:")
        print(markdown[:200] + ("..." if len(markdown) > 200 else ""))
    else:
        print("\n🎵 No Markdown content available.")
else:
    print("\n🎙️ Audio-Visual Information: Not available for this content type.")

# Save the result
saved_json_path = save_json_to_file(analysis_result, filename_prefix="content_analyzers_audio")
print(f"\n📋 Full analysis result saved. Review the complete JSON at: {saved_json_path}")
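
Once transcript phrases carry a speaker label and millisecond timings, simple aggregates such as talk time per speaker follow directly. The sketch below is an illustration, not part of the sample: `speaking_time_ms` is a hypothetical helper, the phrases are hand-written, and only the `speaker`, `startTimeMs`, and `endTimeMs` fields read in the cell above are assumed.

```python
from collections import defaultdict
from typing import Any, Dict, List


def speaking_time_ms(phrases: List[Dict[str, Any]]) -> Dict[str, int]:
    """Sum phrase durations per speaker, skipping phrases without timing data."""
    totals: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        start = phrase.get("startTimeMs")
        end = phrase.get("endTimeMs")
        if start is None or end is None:
            continue
        speaker = phrase.get("speaker") or "unknown"
        # Guard against inverted timings so totals never decrease
        totals[speaker] += max(0, end - start)
    return dict(totals)


# Hand-written phrases shaped like the fields printed above (not real output)
sample_phrases = [
    {"speaker": "Speaker 1", "startTimeMs": 0, "endTimeMs": 1200},
    {"speaker": "Speaker 2", "startTimeMs": 1200, "endTimeMs": 2500},
    {"speaker": "Speaker 1", "startTimeMs": 2500, "endTimeMs": 4000},
    {"speaker": "Speaker 2", "startTimeMs": None, "endTimeMs": 4100},
]

print(speaking_time_ms(sample_phrases))  # {'Speaker 1': 2700, 'Speaker 2': 1300}
```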


## Video Content
The `prebuilt-videoSearch` analyzer provides comprehensive analysis of video content, combining visual frame extraction, audio transcription, and AI-powered insights. It transforms raw video into RAG-ready structured output in both Markdown and JSON formats, enabling applications like media asset management, content categorization, and retrieval-augmented generation.

Key features include:
1. **Transcription with Diarization:** Converts audio to searchable WebVTT transcripts with speaker identification and multilingual support (same language handling as audio).
2. **Key Frame Extraction:** Intelligently extracts representative frames (~1 FPS) from each scene, embedded as markdown image references with timestamps (e.g., `![](keyFrame.400.jpg)`).
3. **Shot Detection:** Identifies video segment boundaries aligned with camera cuts and scene transitions, providing timestamps in `cameraShotTimesMs`.
4. **Segment-based Analysis:** Analyzes multiple frames per segment to identify actions, events, and themes rather than individual frame analysis.
5. **Custom Field Extraction:** Define business-specific fields (brands, categories, sentiment) that the generative model extracts from visual and audio content.
6. **Structured Output:** Content organized in GitHub Flavored Markdown with precise temporal alignment and JSON with detailed metadata.

For detailed information about video capabilities, elements, and markdown format, see [Video overview](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/video/overview), [Video elements](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/video/elements), and [AudioVisual markdown representation](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/video/markdown).
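
Since key frames appear in the markdown as image references such as `![](keyFrame.400.jpg)`, their numeric components can be recovered with a regular expression. This is a rough sketch under two assumptions: that every reference follows the `keyFrame.<number>.jpg` pattern of the example above, and that the number is a time offset in milliseconds. `keyframe_times_from_markdown` is a hypothetical helper and the markdown string is hand-written.

```python
import re
from typing import List

# Matches markdown image references of the form ![alt](keyFrame.<digits>.jpg)
KEYFRAME_REF = re.compile(r"!\[[^\]]*\]\(keyFrame\.(\d+)\.jpg\)")


def keyframe_times_from_markdown(markdown: str) -> List[int]:
    """Return the sorted, de-duplicated numbers found in keyFrame image references."""
    return sorted({int(value) for value in KEYFRAME_REF.findall(markdown)})


# Hand-written markdown for illustration (not real service output)
sample_markdown = (
    "![](keyFrame.400.jpg)\n"
    "Some transcript text\n"
    "![](keyFrame.1200.jpg) ![](keyFrame.400.jpg)\n"
)

print(keyframe_times_from_markdown(sample_markdown))  # [400, 1200]
```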

In [None]:
from typing import Dict
from datetime import datetime


# Utility function to save keyframe images
def save_keyframe_image_to_file(
    image_content: bytes,
    keyframe_id: str,
    test_name: str,
    test_py_file_dir: str,
    identifier: Optional[str] = None,
    output_dir: str = "test_output",
) -> str:
    """Save keyframe image to output file using pytest naming convention.

    Args:
        image_content: The binary image content to save
        keyframe_id: The keyframe ID (e.g., "keyframes/733")
        test_name: Name of the test case (e.g., function name)
        test_py_file_dir: Directory where pytest files are located
        identifier: Optional unique identifier to avoid conflicts (e.g., analyzer_id)
        output_dir: Directory name to save the output file (default: "test_output")

    Returns:
        str: Path to the saved image file

    Raises:
        OSError: If there are issues creating directory or writing file
    """
    # Generate timestamp and frame ID
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Extract the frame time from the keyframe path (e.g., "keyframes/733" -> "733")
    if "/" in keyframe_id:
        frame_id = keyframe_id.split("/")[-1]
    else:
        # Fallback: use as-is if no slash found
        frame_id = keyframe_id

    # Create output directory if it doesn't exist
    output_dir_path = os.path.join(test_py_file_dir, output_dir)
    os.makedirs(output_dir_path, exist_ok=True)

    # Generate output filename with optional identifier to avoid conflicts
    if identifier:
        output_filename = f"{test_name}_{identifier}_{timestamp}_{frame_id}.jpg"
    else:
        output_filename = f"{test_name}_{timestamp}_{frame_id}.jpg"

    saved_file_path = os.path.join(output_dir_path, output_filename)

    # Write the image content to file
    with open(saved_file_path, "wb") as image_file:
        image_file.write(image_content)

    print(f"Image file saved to: {saved_file_path}")
    return saved_file_path


analyzer_sample_file = '../data/FlightSimulator.mp4'
analyzer_id = 'prebuilt-videoSearch'

# Analyze the video file with the prebuilt analyzer
print(f"🔍 Analyzing video file from path: {analyzer_sample_file} with analyzer '{analyzer_id}'...")

# Begin video analysis operation
print(f"🎬 Starting video analysis with analyzer '{analyzer_id}'...")
analysis_response = client.begin_analyze_binary(
    analyzer_id=analyzer_id,
    file_location=analyzer_sample_file,
)

# Wait for analysis completion
print("⏳ Waiting for video analysis to complete...")
analysis_result = client.poll_result(analysis_response)
print("✅ Video analysis completed successfully!")

print("\n📄 Markdown Content:")
print("=" * 50)
# Extract markdown from the first content element; fall back to an empty dict
# so the checks below still work when the service returns no contents
contents = analysis_result.get("result", {}).get("contents", [])
content = contents[0] if contents else {}
markdown = content.get("markdown", "")
print(markdown)
print("=" * 50)

# Video results use the "audioVisual" content kind, the same kind as audio results
if content.get("kind") == "audioVisual":
    video_visual_content = content
    print("\n🎬 Video-Visual Information:")

    # Basic Video-Visual Details
    try:
        start_time = video_visual_content.get("startTimeMs")
        end_time = video_visual_content.get("endTimeMs")
        duration_sec = (end_time - start_time) / 1000
        print(f"Start Time: {start_time} ms")
        print(f"End Time: {end_time} ms")
        print(f"Duration: {duration_sec:.2f} seconds")
    except (KeyError, TypeError):
        print("❌ Missing basic audio-visual content details.")

    # Transcript Phrases (limit to 10)
    transcript_phrases = video_visual_content.get("transcriptPhrases", [])
    if transcript_phrases:
        print(f"\n📝 Transcript Phrases ({min(len(transcript_phrases), 10)}):")
        for idx, phrase in enumerate(transcript_phrases[:10]):
            print(f"  {idx + 1}. Speaker: {phrase.get('speaker')}")
            print(f"     Text: {phrase.get('text')}")
            print(f"     Start: {phrase.get('startTimeMs')} ms, End: {phrase.get('endTimeMs')} ms")
            confidence = phrase.get('confidence', 0)
            print(f"     Confidence: {confidence:.2%}")
            print(f"     Locale: {phrase.get('locale')}")
        if len(transcript_phrases) > 10:
            print(f"  ... and {len(transcript_phrases) - 10} more.")
    else:
        print("\n📝 No transcript phrases available.")

    # Key Frames (accept both the keyFrameTimesMs and KeyFrameTimesMs spellings)
    key_frame_times_ms = video_visual_content.get("keyFrameTimesMs") or video_visual_content.get("KeyFrameTimesMs", [])
    if key_frame_times_ms:
        print(f"\n🖼️ Key Frames ({len(key_frame_times_ms)}):")
        for idx, key_frame_time in enumerate(key_frame_times_ms[:5]):
            print(f"  Frame {idx + 1}: Time {key_frame_time} ms")
        if len(key_frame_times_ms) > 5:
            print(f"  ... and {len(key_frame_times_ms) - 5} more.")
    else:
        print("\n🖼️ No key frame data available.")

    # Markdown Preview
    if markdown:
        print("\n🎵 Markdown Content Preview:")
        print(markdown[:200] + ("..." if len(markdown) > 200 else ""))
    else:
        print("\n🎵 No Markdown content available.")
else:
    print("\n🎬 Video-Visual Information: Not available for this content type.")

# Save the result
saved_json_path = save_json_to_file(analysis_result, filename_prefix="content_analyzers_video")
print(f"\n📋 Full analysis result saved. Review the complete JSON at: {saved_json_path}")

# Keyframe Processing
def extract_keyframe_ids(analysis_result: Dict[str, Any]) -> list[str]:
    """
    Extract all keyframe IDs from the analysis result.

    Args:
        analysis_result: The analysis result from the analyzer
    Returns:
        List of keyframe IDs (e.g., 'keyframes/1000', 'keyframes/2000')
    """
    print("Starting keyframe extraction from analysis result...")
    keyframe_ids = []
    contents = analysis_result.get("result", {}).get("contents", [])
    for idx, content in enumerate(contents):
        if content.get("kind") == "audioVisual":
            print(f"Found audioVisual content at index {idx}:")
            # Support both keyFrameTimesMs and KeyFrameTimesMs for forward compatibility
            key_frame_times_ms = content.get("keyFrameTimesMs") or content.get("KeyFrameTimesMs", [])
            if key_frame_times_ms:
                print(f"  Found {len(key_frame_times_ms)} keyframe timestamps")
                for time_ms in key_frame_times_ms:
                    keyframe_id = f"keyframes/{time_ms}"
                    keyframe_ids.append(keyframe_id)
            else:
                print(f"  No keyframe timestamps found in this audioVisual content.")
    print(f"Extracted {len(keyframe_ids)} total keyframe IDs: {keyframe_ids}")
    return keyframe_ids

keyframe_ids = extract_keyframe_ids(analysis_result)
if keyframe_ids:
    # Download only the first few keyframes to keep the sample fast
    files_to_download = keyframe_ids[:3]
    print(f"\n🖼️ Downloading {len(files_to_download)} of {len(keyframe_ids)} keyframe images...")
    print(f"Files to download: {files_to_download}")

    for keyframe_id in files_to_download:
        print(f"Getting result file: {keyframe_id}")

        # Get the result file (keyframe image)
        image_content = client.get_result_file(
            analyze_response=analysis_response,
            file_id=keyframe_id,
        )

        if image_content:
            print(f"Retrieved image file for {keyframe_id} ({len(image_content)} bytes)")

            # Save the image file
            saved_file_path = save_keyframe_image_to_file(
                image_content=image_content,
                keyframe_id=keyframe_id,
                test_name="content_extraction_video",
                test_py_file_dir=os.getcwd(),
                identifier=analyzer_id
            )
            print(f"✅ Saved keyframe image to: {saved_file_path}")
        else:
            print(f"❌ No image content retrieved for keyframe: {keyframe_id}")
else:
    print("\n❌ No keyframe IDs found in analysis result.")
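
The cell above downloads only the first few keyframes, which all come from the start of the video. To preview the whole video, one alternative is to pick keyframes spread evenly across the list. The following is a minimal sketch of that idea; `sample_evenly` is a hypothetical helper and the keyframe IDs are made up for illustration, following the `keyframes/<time_ms>` form built above.

```python
from typing import List, Sequence


def sample_evenly(items: Sequence[str], count: int) -> List[str]:
    """Pick up to `count` items spread evenly across `items`, keeping order."""
    if count <= 0 or not items:
        return []
    if count >= len(items):
        return list(items)
    if count == 1:
        # A single sample is taken from the middle of the sequence
        return [items[len(items) // 2]]
    last = len(items) - 1
    # Integer arithmetic keeps the first and last items and avoids rounding surprises
    return [items[i * last // (count - 1)] for i in range(count)]


# Made-up keyframe IDs for illustration
example_ids = [f"keyframes/{t}" for t in range(0, 10_000, 1_000)]
print(sample_evenly(example_ids, 3))  # ['keyframes/0', 'keyframes/4000', 'keyframes/9000']
```

The returned list could then take the place of the first-three slice before the download loop runs.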

## Congratulations!

You've successfully learned how to extract content from multimodal files using Azure Content Understanding! You explored:

- **Document extraction** with the `prebuilt-documentSearch` analyzer
- **Audio transcription** with speaker diarization using `prebuilt-audioSearch`
- **Video analysis** with keyframe extraction using `prebuilt-videoSearch`

### Learn More

To dive deeper into Azure Content Understanding capabilities:

- **[Content Understanding Overview](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/overview)** - Comprehensive introduction to the service
- **[What's New](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/whats-new)** - Latest features and updates
- **[Content Extraction Guide](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/document/overview#content-extraction)** - Detailed documentation on extraction capabilities

Explore other notebooks in this repository to learn about custom analyzers, field extraction, and advanced scenarios!