# Extract Content from Your File

This notebook demonstrates how to use the Content Understanding API to extract semantic content from multimodal files.

## Prerequisites
1. Ensure the Azure AI service is configured by following these [steps](../README.md#configure-azure-ai-service-resource).
2. Install the required packages to run the sample.

In [1]:
%pip install -r ../requirements.txt

Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, it can be regarded as a lightweight SDK.
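
The client needs a non-empty endpoint, and a missing `AZURE_AI_ENDPOINT` tends to surface later as a confusing request error. A minimal sketch of an early check follows; `require_env` is a hypothetical helper, not part of the sample's utility class.

```python
import os
from typing import Optional


def require_env(name: str, default: Optional[str] = None) -> str:
    """Return an environment variable's value, or fail early with a clear message.

    Hypothetical helper for illustration; it is not part of the sample client.
    """
    value = os.getenv(name, default)
    if not value or not value.strip():
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Add it to your .env file or export it before running the notebook."
        )
    return value.strip()
```

It could be used as `AZURE_AI_ENDPOINT = require_env("AZURE_AI_ENDPOINT")` in place of the plain `os.getenv` call.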


In [2]:
import logging
import json
import os
import sys
import uuid
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

load_dotenv(find_dotenv())
logging.basicConfig(level=logging.INFO)

AZURE_AI_ENDPOINT = os.getenv("AZURE_AI_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_AI_API_VERSION", "2025-05-01-preview")

# Add the parent directory to the path to use shared modules
parent_dir = Path(Path.cwd()).parent
sys.path.append(str(parent_dir))
from python.content_understanding_client import AzureContentUnderstandingClient

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    # This header is used for sample usage telemetry. Comment out the next line to opt out.
    x_ms_useragent="azure-ai-content-understanding-python/content_extraction",
)

# Utility function to save images
from PIL import Image
from io import BytesIO
import re

def save_image(image_id: str, response):
    """Fetch an image produced by an analyze operation and save it under .cache/ as a JPEG."""
    raw_image = client.get_image_from_analyze_operation(
        analyze_response=response,
        image_id=image_id,
    )
    image = Image.open(BytesIO(raw_image))
    # image.show()
    Path(".cache").mkdir(exist_ok=True)
    image.save(f".cache/{image_id}.jpg", "JPEG")


INFO:azure.identity._credentials.environment:No environment configuration found.
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use IMDS
INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'User-Agent': 'azsdk-python-identity/1.23.0 Python/3.11.12 (Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.36)'
No body was attached to the request
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use IMDS
INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'User-Agent': 'azsdk-python-identity/1.23.0 Python/3.11.12 (Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.36)'
No body was attached to the request
INFO:az

## Document Content

The Content Understanding API extracts all textual content from a document file. It also performs layout analysis to identify and categorize the tables and figures in the document, and returns the result as structured markdown.



In [3]:
ANALYZER_SAMPLE_FILE = '../data/invoice.pdf'
ANALYZER_ID = 'prebuilt-documentAnalyzer'

# Analyze the file
response = client.begin_analyze(ANALYZER_ID, file_location=ANALYZER_SAMPLE_FILE)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))

INFO:python.content_understanding_client:Analyzing file ../data/invoice.pdf with analyzer: prebuilt-documentAnalyzer
INFO:python.content_understanding_client:Request f2603415-b56b-4a74-90d9-7f25c3735512 in progress ...
INFO:python.content_understanding_client:Request f2603415-b56b-4a74-90d9-7f25c3735512 in progress ...
INFO:python.content_understanding_client:Request result is ready after 6.14 seconds.


{
  "id": "f2603415-b56b-4a74-90d9-7f25c3735512",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-documentAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "2025-07-01T06:37:52Z",
    "contents": [
      {
        "markdown": "CONTOSO LTD.\n\n\n# INVOICE\n\nContoso Headquarters\n123 456th St\nNew York, NY, 10001\n\nINVOICE: INV-100\n\nINVOICE DATE: 11/15/2019\n\nDUE DATE: 12/15/2019\n\nCUSTOMER NAME: MICROSOFT CORPORATION\n\nSERVICE PERIOD: 10/14/2019 - 11/14/2019\n\nCUSTOMER ID: CID-12345\n\nMicrosoft Corp\n123 Other St,\nRedmond WA, 98052\n\nBILL TO:\n\nMicrosoft Finance\n\n123 Bill St,\n\nRedmond WA, 98052\n\nSHIP TO:\n\nMicrosoft Delivery\n\n123 Ship St,\n\nRedmond WA, 98052\n\nSERVICE ADDRESS:\nMicrosoft Services\n123 Service St,\nRedmond WA, 98052\n\n\n<table>\n<tr>\n<th>SALESPERSON</th>\n<th>P.O. NUMBER</th>\n<th>REQUISITIONER</th>\n<th>SHIPPED VIA</th>\n<th>F.O.B. POINT</th>\n<th>TERMS</th>\n</tr>\n<tr>\n<td></td>\n<td>PO-3333</td>\n<td></td

> The markdown output contains layout information, which is very useful for Retrieval-Augmented Generation (RAG) scenarios. You can paste the markdown into a viewer such as Visual Studio Code and preview the layout structure.

In [4]:
print(result_json["result"]["contents"][0]["markdown"])

CONTOSO LTD.


# INVOICE

Contoso Headquarters
123 456th St
New York, NY, 10001

INVOICE: INV-100

INVOICE DATE: 11/15/2019

DUE DATE: 12/15/2019

CUSTOMER NAME: MICROSOFT CORPORATION

SERVICE PERIOD: 10/14/2019 - 11/14/2019

CUSTOMER ID: CID-12345

Microsoft Corp
123 Other St,
Redmond WA, 98052

BILL TO:

Microsoft Finance

123 Bill St,

Redmond WA, 98052

SHIP TO:

Microsoft Delivery

123 Ship St,

Redmond WA, 98052

SERVICE ADDRESS:
Microsoft Services
123 Service St,
Redmond WA, 98052


<table>
<tr>
<th>SALESPERSON</th>
<th>P.O. NUMBER</th>
<th>REQUISITIONER</th>
<th>SHIPPED VIA</th>
<th>F.O.B. POINT</th>
<th>TERMS</th>
</tr>
<tr>
<td></td>
<td>PO-3333</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</table>


<table>
<tr>
<th>DATE</th>
<th>ITEM CODE</th>
<th>DESCRIPTION</th>
<th>QTY</th>
<th>UM</th>
<th>PRICE</th>
<th>TAX</th>
<th>AMOUNT</th>
</tr>
<tr>
<td>3/4/2021</td>
<td>A123</td>
<td>Consulting Services</td>
<td>2</td>
<td>hours</td>
<td>$30.00</td>
<td>$6.00</td>
<td>$60.0
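
For the RAG scenario mentioned above, the markdown is usually split into chunks before indexing. The sketch below shows one simple option, splitting on markdown headings. `split_markdown_by_heading` is a hypothetical helper, and heading-based chunking is only one of several reasonable strategies.

```python
import re
from typing import List, Tuple


def split_markdown_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split markdown into (heading, body) chunks.

    Text that appears before the first heading is returned with an empty heading.
    """
    chunks: List[Tuple[str, str]] = []
    heading = ""
    body_lines: List[str] = []

    def flush() -> None:
        # Emit the current chunk only if it has a heading or non-blank body text.
        if heading or any(line.strip() for line in body_lines):
            chunks.append((heading, "\n".join(body_lines).strip()))

    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(1).strip()
            body_lines = []
        else:
            body_lines.append(line)
    flush()
    return chunks


# Sample taken from the start of the invoice markdown shown above.
sample = "CONTOSO LTD.\n\n\n# INVOICE\n\nContoso Headquarters\n123 456th St\nNew York, NY, 10001"
for chunk_heading, chunk_body in split_markdown_by_heading(sample):
    print(repr(chunk_heading), "->", repr(chunk_body))
```

This sketch treats any line starting with `#` as a heading, so it would also split on such lines inside a code fence within the markdown. HTML tables stay inside the chunk of the heading that precedes them.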

> The result also includes layout information: ```words``` and ```lines``` under the ```pages``` node, paragraph details under ```paragraphs```, and tables under ```tables```.

In [5]:
print(json.dumps(result_json["result"]["contents"][0], indent=2))


{
  "markdown": "CONTOSO LTD.\n\n\n# INVOICE\n\nContoso Headquarters\n123 456th St\nNew York, NY, 10001\n\nINVOICE: INV-100\n\nINVOICE DATE: 11/15/2019\n\nDUE DATE: 12/15/2019\n\nCUSTOMER NAME: MICROSOFT CORPORATION\n\nSERVICE PERIOD: 10/14/2019 - 11/14/2019\n\nCUSTOMER ID: CID-12345\n\nMicrosoft Corp\n123 Other St,\nRedmond WA, 98052\n\nBILL TO:\n\nMicrosoft Finance\n\n123 Bill St,\n\nRedmond WA, 98052\n\nSHIP TO:\n\nMicrosoft Delivery\n\n123 Ship St,\n\nRedmond WA, 98052\n\nSERVICE ADDRESS:\nMicrosoft Services\n123 Service St,\nRedmond WA, 98052\n\n\n<table>\n<tr>\n<th>SALESPERSON</th>\n<th>P.O. NUMBER</th>\n<th>REQUISITIONER</th>\n<th>SHIPPED VIA</th>\n<th>F.O.B. POINT</th>\n<th>TERMS</th>\n</tr>\n<tr>\n<td></td>\n<td>PO-3333</td>\n<td></td>\n<td></td>\n<td></td>\n<td></td>\n</tr>\n</table>\n\n\n<table>\n<tr>\n<th>DATE</th>\n<th>ITEM CODE</th>\n<th>DESCRIPTION</th>\n<th>QTY</th>\n<th>UM</th>\n<th>PRICE</th>\n<th>TAX</th>\n<th>AMOUNT</th>\n</tr>\n<tr>\n<td>3/4/2021</td>\n<td>A123</td

> The following statement returns the structural information of the tables in the document.

In [6]:
print(json.dumps(result_json["result"]["contents"][0]["tables"], indent=2))

[
  {
    "rowCount": 2,
    "columnCount": 6,
    "cells": [
      {
        "kind": "columnHeader",
        "rowIndex": 0,
        "columnIndex": 0,
        "rowSpan": 1,
        "columnSpan": 1,
        "content": "SALESPERSON",
        "source": "D(1,0.5389,4.5514,1.7505,4.5514,1.7505,4.8364,0.5389,4.8364)",
        "span": {
          "offset": 512,
          "length": 11
        },
        "elements": [
          "/paragraphs/19"
        ]
      },
      {
        "kind": "columnHeader",
        "rowIndex": 0,
        "columnIndex": 1,
        "rowSpan": 1,
        "columnSpan": 1,
        "content": "P.O. NUMBER",
        "source": "D(1,1.7505,4.5514,3.2399,4.5591,3.2399,4.8364,1.7505,4.8364)",
        "span": {
          "offset": 533,
          "length": 11
        },
        "elements": [
          "/paragraphs/20"
        ]
      },
      {
        "kind": "columnHeader",
        "rowIndex": 0,
        "columnIndex": 2,
        "rowSpan": 1,
        "columnSpan": 1,
        
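
The cell list above is flat, so a common next step is to rebuild each table as rows and columns. The following is a minimal sketch that relies only on the keys visible in the output (`rowCount`, `columnCount`, `rowIndex`, `columnIndex`, `content`). `table_to_grid` is a hypothetical helper, and it places a spanning cell's text in its top-left position only.

```python
from typing import Any, Dict, List


def table_to_grid(table: Dict[str, Any]) -> List[List[str]]:
    """Arrange a table's flat cell list into a row-major grid of strings."""
    grid = [["" for _ in range(table["columnCount"])] for _ in range(table["rowCount"])]
    for cell in table.get("cells", []):
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell.get("content", "")
    return grid


# Abridged from the first table above, keeping only the keys the function reads.
sample_table = {
    "rowCount": 2,
    "columnCount": 6,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "content": "SALESPERSON"},
        {"rowIndex": 0, "columnIndex": 1, "content": "P.O. NUMBER"},
        {"rowIndex": 1, "columnIndex": 1, "content": "PO-3333"},
    ],
}
for row in table_to_grid(sample_table):
    print(row)
```

With the real response, the same function could be applied to each entry of `result_json["result"]["contents"][0]["tables"]`.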

## Audio Content
Our API output facilitates detailed analysis of spoken language, allowing developers to utilize the data for various applications, such as voice recognition, customer service analytics, and conversational AI. The structure of the output makes it easy to extract and analyze different components of the conversation for further processing or insights.

1. Speaker Identification: Each phrase is attributed to a specific speaker (in this case, "Speaker 2"). This allows for clarity in conversations with multiple participants.
1. Timing Information: Each transcription includes precise timing data:
    - startTimeMs: The time (in milliseconds) when the phrase begins.
    - endTimeMs: The time (in milliseconds) when the phrase ends.
    This information is crucial for applications like video subtitles, allowing synchronization between the audio and the text.
1. Text Content: The actual spoken text is provided, which in this instance is "Thank you for calling Woodgrove Travel." This is the main content of the transcription.
1. Confidence Score: Each transcription phrase includes a confidence score (0.933 in this case), indicating the likelihood that the transcription is accurate. A higher score suggests greater reliability.
1. Word-Level Breakdown: The transcription is further broken down into individual words, each with its own timing information. This allows for detailed analysis of speech patterns and can be useful for applications such as language processing or speech recognition improvement.
1. Locale Specification: The locale is specified as "en-US," indicating that the transcription is in American English. This is important for ensuring that the transcription algorithms account for regional dialects and pronunciations.
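
The timing fields described above are what make subtitle generation possible. Below is a minimal sketch that turns a list of phrases into WebVTT text. `startTimeMs` and `endTimeMs` are named in the list above; the `speaker` and `text` keys, and the idea that the phrases arrive as a list of dictionaries, are assumptions to check against your own response.

```python
from typing import Any, Dict, Iterable


def ms_to_timestamp(ms: int) -> str:
    """Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def phrases_to_webvtt(phrases: Iterable[Dict[str, Any]]) -> str:
    """Build WebVTT text with one cue per phrase, using a voice tag for the speaker."""
    lines = ["WEBVTT", ""]
    for phrase in phrases:
        start = ms_to_timestamp(phrase["startTimeMs"])
        end = ms_to_timestamp(phrase["endTimeMs"])
        lines.append(f"{start} --> {end}")
        # "speaker" and "text" are assumed key names, not confirmed by the source.
        lines.append(f"<v {phrase.get('speaker', 'Unknown')}>{phrase.get('text', '')}")
        lines.append("")
    return "\n".join(lines)


# The speaker, text, and confidence come from the description above;
# the timing values are made up for illustration.
sample_phrases = [
    {
        "speaker": "Speaker 2",
        "startTimeMs": 720,
        "endTimeMs": 2720,
        "text": "Thank you for calling Woodgrove Travel.",
        "confidence": 0.933,
    }
]
print(phrases_to_webvtt(sample_phrases))
```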


In [None]:
ANALYZER_SAMPLE_FILE = '../data/audio.wav'
ANALYZER_ID = 'prebuilt-audioAnalyzer'

# Analyze the file
response = client.begin_analyze(ANALYZER_ID, file_location=ANALYZER_SAMPLE_FILE)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))

## Video Content
Video output provides detailed information about audiovisual content, specifically video shots. Here are the key features it offers:

1. Shot Information: Each shot is defined by a start and end time, along with a unique identifier. For example, Shot 0:0.0 to 0:2.800 includes a transcript and key frames.
1. Transcript: The API includes a transcript of the audio, formatted in WEBVTT, which allows for easy synchronization with the video. It captures spoken content and specifies the timing of the dialogue.
1. Key Frames: It provides a series of key frames (images) that represent important moments in the video shot, allowing users to visualize the content at specific timestamps.
1. Description: Each shot is accompanied by a description, providing context about the visuals presented. This helps in understanding the scene or subject matter without watching the video.
1. Audio Visual Metadata: Details about the video such as dimensions (width and height), type (audiovisual), and the presence of key frame timestamps are included.
1. Transcript Phrases: The output includes specific phrases from the transcript, along with timing and speaker information, enhancing the usability for applications like closed captioning or search functionalities.
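
Because the transcript is embedded in the markdown as WEBVTT text, it can be parsed back into structured cues. The sketch below assumes the cue layout seen in this notebook's video output (`MM:SS.mmm --> MM:SS.mmm` followed by a `<Speaker N>` prefix); `parse_cues` is a hypothetical helper and does not handle timestamps that include hours.

```python
import re
from typing import Any, Dict, List

# One cue: a timing line, then a line that starts with a <speaker> tag.
CUE_PATTERN = re.compile(
    r"(?P<start>\d{2}:\d{2}\.\d{3}) --> (?P<end>\d{2}:\d{2}\.\d{3})\n"
    r"<(?P<speaker>[^>]+)>(?P<text>[^\n]*)"
)


def timestamp_to_ms(stamp: str) -> int:
    """Convert an MM:SS.mmm timestamp to milliseconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60_000 + round(float(seconds) * 1000)


def parse_cues(webvtt: str) -> List[Dict[str, Any]]:
    """Extract start time, end time, speaker, and text for every cue."""
    return [
        {
            "startMs": timestamp_to_ms(match["start"]),
            "endMs": timestamp_to_ms(match["end"]),
            "speaker": match["speaker"],
            "text": match["text"].strip(),
        }
        for match in CUE_PATTERN.finditer(webvtt)
    ]


# First cue of the FlightSimulator.mp4 transcript from this notebook's output.
sample_vtt = (
    "WEBVTT\n\n"
    "00:01.400 --> 00:06.640\n"
    "<Speaker 1>When it comes to the neural TTS, in order to get a good voice, "
    "it's better to have good data.\n"
)
for cue in parse_cues(sample_vtt):
    print(cue)
```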

In [4]:
ANALYZER_SAMPLE_FILE = '../data/FlightSimulator.mp4'
ANALYZER_ID = 'prebuilt-videoAnalyzer'

# Analyze the file
response = client.begin_analyze(ANALYZER_ID, file_location=ANALYZER_SAMPLE_FILE)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))

# Save keyframes (optional)
keyframe_ids = set()
result_data = result_json.get("result", {})
contents = result_data.get("contents", [])

# Iterate over contents to find keyframes if available
for content in contents:
    # Extract keyframe IDs from "markdown" if it exists and is a string
    markdown_content = content.get("markdown", "")
    if isinstance(markdown_content, str):
        keyframe_ids.update(re.findall(r"(keyFrame\.\d+)\.jpg", markdown_content))

# Output the results
print("Unique Keyframe IDs:", keyframe_ids)

# Save all keyframe images
for keyframe_id in keyframe_ids:
    save_image(keyframe_id, response)

INFO:python.content_understanding_client:Analyzing file ../data/FlightSimulator.mp4 with analyzer: prebuilt-videoAnalyzer
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be-48db-879d-f013d89cb01b in progress ...
INFO:python.content_understanding_client:Request 1b5b095c-49be

{
  "id": "1b5b095c-49be-48db-879d-f013d89cb01b",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-videoAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "2025-07-01T06:47:20Z",
    "stringEncoding": "utf8",
    "contents": [
      {
        "markdown": "# Video: 00:00.000 => 00:43.866\nWidth: 1080\nHeight: 608\n\n## Segment 1: 00:00.000 => 00:07.367\nThe video begins with an aerial view of a scenic island, featuring the logos of Flight Simulator and Microsoft Azure AI. This is followed by a brief glimpse of a person speaking in an office setting, and then transitions to a visual representation of audio waveforms, indicating a discussion about neural TTS and the importance of good data for generating natural voices.\n\nTranscript\n```\nWEBVTT\n\n00:01.400 --> 00:06.640\n<Speaker 1>When it comes to the neural TTS, in order to get a good voice, it's better to have good data.\n\n00:07.120 --> 00:13.280\n<Speaker 2>To achieve that, we build a universal T

## Custom Video Inspection Analyzer (uSWS)
This section demonstrates how to create and use your custom uSWS schema video inspection analyzer for room-by-room condition grading.

In [7]:
# First, create your custom uSWS video inspection analyzer
import uuid

ANALYZER_TEMPLATE = "../analyzer_templates/uSWS_schema-video-inspection.json"
CUSTOM_ANALYZER_ID = "uSWS-video-inspection-" + str(uuid.uuid4())

print(f"Creating custom analyzer with ID: {CUSTOM_ANALYZER_ID}")

# Check if template file exists first
import os
if not os.path.exists(ANALYZER_TEMPLATE):
    print(f"❌ Error: Template file {ANALYZER_TEMPLATE} does not exist!")
else:
    print(f"✅ Template file {ANALYZER_TEMPLATE} found")
    
    # Validate JSON structure
    try:
        with open(ANALYZER_TEMPLATE, 'r') as f:
            template_content = json.load(f)
        print("✅ JSON template is valid")
        print(f"📋 Template description: {template_content.get('description', 'No description')}")
    except json.JSONDecodeError as e:
        print(f"❌ JSON validation error: {e}")

# Create the analyzer from your template
try:
    print("📝 Loading analyzer template...")
    response = client.begin_create_analyzer(CUSTOM_ANALYZER_ID, analyzer_template_path=ANALYZER_TEMPLATE)
    print("⏳ Waiting for analyzer creation to complete...")
    result = client.poll_result(response)

    print("Analyzer creation result:")
    print(json.dumps(result, indent=2))

    if result.get("status", "").lower() == "succeeded":
        print(f"✅ Custom analyzer '{CUSTOM_ANALYZER_ID}' created successfully!")
    else:
        print(f"❌ Failed to create analyzer.")
        print(f"Status: {result.get('status', 'Unknown')}")
        print(f"Error details: {result.get('error', 'No error details')}")
        if 'errors' in result:
            print(f"Validation errors: {result['errors']}")
        if 'message' in result:
            print(f"Error message: {result['message']}")
            
except Exception as e:
    print(f"❌ Exception during analyzer creation: {str(e)}")
    result = {"status": "failed", "error": str(e)}

Creating custom analyzer with ID: uSWS-video-inspection-b5bf153c-4e03-4c0b-98d7-daa41f82bdc6
✅ Template file ../analyzer_templates/uSWS_schema-video-inspection.json found
✅ JSON template is valid
📋 Template description: Social-Housing Exit-Inspection Analyzer – room-by-room condition grading
📝 Loading analyzer template...


INFO:python.content_understanding_client:Analyzer uSWS-video-inspection-b5bf153c-4e03-4c0b-98d7-daa41f82bdc6 create request accepted.


⏳ Waiting for analyzer creation to complete...


INFO:python.content_understanding_client:Request result is ready after 0.00 seconds.


Analyzer creation result:
{
  "id": "b6e91016-2651-4a9c-bc89-216680b97b13",
  "status": "Succeeded",
  "result": {
    "analyzerId": "uSWS-video-inspection-b5bf153c-4e03-4c0b-98d7-daa41f82bdc6",
    "description": "Social-Housing Exit-Inspection Analyzer \u2013 room-by-room condition grading",
    "createdAt": "2025-07-01T07:12:09Z",
    "lastModifiedAt": "2025-07-01T07:12:09Z",
    "baseAnalyzerId": "prebuilt-videoAnalyzer",
    "config": {
      "locales": [
        "en-US"
      ],
      "returnDetails": true,
      "enableFace": false,
      "disableContentFiltering": false,
      "segmentationMode": "custom",
      "segmentationDefinition": "Start a new segment whenever the camera clearly enters a different room or space (doorway threshold or verbal cue such as 'now the kitchen'). Follow the inspector\u2019s anti-clockwise path template.",
      "disableFaceBlurring": false
    },
    "fieldSchema": {
      "fields": {
        "Segments": {
          "type": "array",
          "de

In [None]:
# Now use your custom analyzer to analyze the uSWS video file
ANALYZER_TEST_FILE = '../data/uSWS_data/IMG_2601.mov'

print(f"Analyzing {ANALYZER_TEST_FILE} with custom analyzer {CUSTOM_ANALYZER_ID}")

# Check if file exists first
import os
if not os.path.exists(ANALYZER_TEST_FILE):
    print(f"❌ Error: File {ANALYZER_TEST_FILE} does not exist!")
else:
    print(f"✅ File {ANALYZER_TEST_FILE} found")

# Analyze file with your custom uSWS video inspection analyzer
try:
    response = client.begin_analyze(CUSTOM_ANALYZER_ID, file_location=ANALYZER_TEST_FILE)
    result_json = client.poll_result(response)
    
    print("Custom analysis result:")
    print(json.dumps(result_json, indent=2))
    
except Exception as e:
    print(f"❌ Error during analysis: {str(e)}")
    result_json = {"status": "failed", "error": str(e)}


Analyzing ../data/uSWS_data/IMG_2601.mov with custom analyzer uSWS-video-inspection-b5bf153c-4e03-4c0b-98d7-daa41f82bdc6
✅ File ../data/uSWS_data/IMG_2601.mov found


INFO:python.content_understanding_client:Analyzing file ../data/uSWS_data/IMG_2601.mov with analyzer: uSWS-video-inspection-b5bf153c-4e03-4c0b-98d7-daa41f82bdc6
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_understanding_client:Request b16f1bb3-b58d-4ff8-9197-e04d00c3dfef in progress ...
INFO:python.content_und

❌ Error during analysis: Operation timed out after 120.00 seconds.
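
The run above hit the 120-second limit, which is plausible for a longer video. `poll_result` accepts a `timeout_seconds` argument, as the next cell shows, so the simplest fix is to pass a larger value. For reference, the general pattern of polling with a deadline and a growing wait interval can be sketched as follows. This is not the client's implementation; `poll_until_done`, its parameters, and the `"Running"` and `"failed"` status strings are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_seconds: float = 300.0,
    initial_interval: float = 2.0,
    max_interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status until it reports a terminal status or the deadline passes."""
    deadline = clock() + timeout_seconds
    interval = initial_interval
    while True:
        result = fetch_status()
        status = str(result.get("status", "")).lower()
        if status in ("succeeded", "failed"):
            return result
        # Give up if the next wait would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"Operation timed out after {timeout_seconds:.2f} seconds.")
        sleep(interval)
        # Double the wait each round, up to max_interval.
        interval = min(interval * 2, max_interval)


# Demo with a fake status source so the sketch runs without any service call.
fake_responses = iter([{"status": "Running"}, {"status": "Running"}, {"status": "Succeeded"}])
final = poll_until_done(lambda: next(fake_responses), timeout_seconds=60, sleep=lambda s: None)
print(final["status"])
```

Injecting `sleep` and `clock` keeps the function easy to test without real waiting.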


In [None]:
# First, let's test with the standard prebuilt analyzer to ensure the video file works
print("🧪 Testing with prebuilt-videoAnalyzer first...")

try:
    test_response = client.begin_analyze('prebuilt-videoAnalyzer', file_location=ANALYZER_TEST_FILE)
    test_result = client.poll_result(test_response, timeout_seconds=300)  # 5 minutes
    
    if test_result.get("status", "").lower() == "succeeded":
        print("✅ Prebuilt analyzer successful - video file is valid")
        
        # Get video info
        if "result" in test_result and "contents" in test_result["result"]:
            contents = test_result["result"]["contents"]
            if contents:
                content = contents[0]
                if "videoInfo" in content:
                    video_info = content["videoInfo"]
                    duration = video_info.get("durationMs", 0) / 1000  # Convert to seconds
                    width = video_info.get("width", "Unknown")
                    height = video_info.get("height", "Unknown")
                    print(f"📹 Video info: {duration:.1f}s, {width}x{height}")
                    
                    if duration > 120:  # More than 2 minutes
                        print("⚠️  Long video detected - custom analysis may take extra time")
    else:
        print(f"❌ Prebuilt analyzer failed: {test_result.get('error', 'Unknown error')}")
        print("🚫 Skipping custom analyzer test")
        
except Exception as e:
    print(f"❌ Prebuilt test failed: {str(e)}")
    print("🚫 Video file may have issues")

print("-" * 50)

In [9]:
# Extract structured room inspection data from the results
# The service reports "Succeeded" with a capital S, so compare case-insensitively
if result_json.get("status", "").lower() == "succeeded":
    result_data = result_json.get("result", {})
    contents = result_data.get("contents", [])
    
    print("\n🏠 uSWS ROOM-BY-ROOM INSPECTION RESULTS:")
    print("=" * 60)
    
    for content_idx, content in enumerate(contents, 1):
        print(f"\n📹 Analysis Content {content_idx}:")
        
        # Look for the structured field data
        fields = content.get("fields", {})
        segments = fields.get("Segments", [])
        
        if segments and isinstance(segments, list):
            print(f"   Found {len(segments)} room segments:")
            
            for i, segment in enumerate(segments, 1):
                print(f"\n   📍 Room Segment {i}:")
                print(f"      Segment ID: {segment.get('SegmentId', 'N/A')}")
                print(f"      Room Type: {segment.get('RoomType', 'Unknown')}")
                print(f"      Overall Condition Score: {segment.get('OverallRoomCondition', 'N/A')}/10")
                
                # Display detailed attributes
                attributes = segment.get('Attributes', '')
                if attributes:
                    print(f"\n      🔍 Room Attributes:")
                    for line in attributes.split('\n'):
                        if line.strip():
                            print(f"         • {line.strip()}")
                
                # Display issues found
                issues = segment.get('IssuesList', '')
                if issues:
                    print(f"\n      ⚠️  Issues Found:")
                    for line in issues.split('\n'):
                        if line.strip():
                            print(f"         🚨 {line.strip()}")
                
                # Display safety checks
                safety_checks = segment.get('AdditionalSafetyChecks', '')
                if safety_checks:
                    print(f"\n      🛡️  Safety Checks:")
                    for line in safety_checks.split('\n'):
                        if line.strip():
                            print(f"         🔍 {line.strip()}")
                
                print("      " + "-" * 50)
        else:
            print("   ⚠️  No room segments found in the analysis results")
            if fields:
                print(f"   Available fields: {list(fields.keys())}")
        
        # Save keyframes for visual reference
        markdown_content = content.get("markdown", "")
        if isinstance(markdown_content, str):
            keyframe_ids = set(re.findall(r"(keyFrame\.\d+)\.jpg", markdown_content))
            if keyframe_ids:
                print(f"\n   💾 Saving {len(keyframe_ids)} keyframe images for visual reference...")
                for keyframe_id in keyframe_ids:
                    save_image(keyframe_id, response)
                print(f"   ✅ Images saved to .cache/ folder")
        
        print("\n" + "=" * 60)

else:
    print(f"❌ Analysis failed")
    print(f"Status: {result_json.get('status', 'Unknown')}")
    print(f"Error: {result_json.get('error', 'Unknown error')}")
    
    # Print additional error details if available
    if 'details' in result_json:
        print(f"Details: {result_json['details']}")
    if 'message' in result_json:
        print(f"Message: {result_json['message']}")
    if 'errors' in result_json:
        print(f"Validation Errors: {result_json['errors']}")

❌ Analysis failed
Status: failed
Error: Operation timed out after 120.00 seconds.
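
When the analysis does succeed, note that the cell above treats `fields["Segments"]` as a plain list of dictionaries. Field values may instead arrive wrapped in typed envelopes such as `valueArray`, `valueObject`, and `valueString`. That shape is an assumption here, since no successful custom result appears in this notebook, so check it against your own response. Under that assumption, a small recursive helper (hypothetical, named `unwrap_field`) can convert the payload to plain Python values first:

```python
from typing import Any


def unwrap_field(field: Any) -> Any:
    """Recursively strip typed envelopes, returning plain Python values.

    Assumes envelopes look like {"type": ..., "valueXxx": ...}; anything else
    is returned unchanged.
    """
    if not isinstance(field, dict) or "type" not in field:
        return field
    if "valueArray" in field:
        return [unwrap_field(item) for item in field["valueArray"]]
    if "valueObject" in field:
        return {key: unwrap_field(value) for key, value in field["valueObject"].items()}
    for key, value in field.items():
        if key.startswith("value"):
            return value
    return None  # The envelope carried a type but no value


# Hypothetical payload: the key names follow the cell above, the values are made up.
sample_fields = {
    "Segments": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "RoomType": {"type": "string", "valueString": "Kitchen"},
                    "OverallRoomCondition": {"type": "integer", "valueInteger": 7},
                },
            }
        ],
    }
}
segments = unwrap_field(sample_fields["Segments"])
print(segments)
```

After unwrapping, `segments` is a list of plain dictionaries, which is the shape the display loop above expects.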


In [9]:
# Optional: Clean up by deleting the custom analyzer when done
# Uncomment the line below if you want to delete the analyzer after use
# client.delete_analyzer(CUSTOM_ANALYZER_ID)
# print(f"🗑️ Custom analyzer '{CUSTOM_ANALYZER_ID}' deleted")

print(f"\n📋 Your custom analyzer ID is: {CUSTOM_ANALYZER_ID}")
print("💡 Tip: Save this ID if you want to reuse the analyzer later without recreating it!")


📋 Your custom analyzer ID is: uSWS-video-inspection-0ed4a082-9df0-4712-8fe1-687e9b0239fd
💡 Tip: Save this ID if you want to reuse the analyzer later without recreating it!


## Video Content with Face
This is a gated feature. To request access, complete the [Azure AI Resource Face Gating](https://learn.microsoft.com/en-us/legal/cognitive-services/computer-vision/limited-access-identity?context=%2Fazure%2Fai-services%2Fcomputer-vision%2Fcontext%2Fcontext#registration-process) registration process, and select `[Video Indexer] Facial Identification (1:N or 1:1 matching) to search for a face in a media or entertainment video archive to find a face within a video and generate metadata for media or entertainment use cases only` in the registration form.

In [None]:
ANALYZER_SAMPLE_FILE = '../data/FlightSimulator.mp4'
ANALYZER_ID = 'prebuilt-videoAnalyzer'

# Analyze the file
response = client.begin_analyze(ANALYZER_ID, file_location=ANALYZER_SAMPLE_FILE)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))

### Get and Save Key Frames and Face Thumbnails

In [None]:
# Initialize sets for unique face IDs and keyframe IDs
face_ids = set()
keyframe_ids = set()

# Extract unique face IDs safely
result_data = result_json.get("result", {})
contents = result_data.get("contents", [])

# Iterate over contents to find faces and keyframes if available
for content in contents:
    # Safely retrieve face IDs if "faces" exists and is a list
    faces = content.get("faces", [])
    if isinstance(faces, list):
        for face in faces:
            face_id = face.get("faceId")
            if face_id:
                face_ids.add(f"face.{face_id}")

    # Extract keyframe IDs from "markdown" if it exists and is a string
    markdown_content = content.get("markdown", "")
    if isinstance(markdown_content, str):
        keyframe_ids.update(re.findall(r"(keyFrame\.\d+)\.jpg", markdown_content))

# Output the results
print("Unique Face IDs:", face_ids)
print("Unique Keyframe IDs:", keyframe_ids)

# Save all face images
for face_id in face_ids:
    save_image(face_id, response)

# Save all keyframe images
for keyframe_id in keyframe_ids:
    save_image(keyframe_id, response)