# Extract Content from Your File

This notebook demonstrates how to use the Content Understanding API to extract semantic content from multimodal files.

## Prerequisites
1. Ensure the Azure AI service is configured by following these [steps](../README.md#configure-azure-ai-service-resource).
2. Install the required packages to run the sample.

In [1]:
%pip install -r ../requirements.txt

Defaulting to user installation because normal site-packages is not writeable

[notice] A new release of pip is available: 24.0 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, it can be regarded as a lightweight SDK.


In [2]:
import logging
import json
import os
import sys
import uuid
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

load_dotenv(find_dotenv())
logging.basicConfig(level=logging.INFO)

AZURE_AI_ENDPOINT = os.getenv("AZURE_AI_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_AI_API_VERSION", "2025-05-01-preview")

# Add the parent directory to the path to use shared modules
parent_dir = Path.cwd().parent
sys.path.append(str(parent_dir))
from python.content_understanding_client import AzureContentUnderstandingClient

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    x_ms_useragent="azure-ai-content-understanding-python/content_extraction", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.
)

# Utility function to save images
from PIL import Image
from io import BytesIO
import re

def save_image(image_id: str, response):
    """Fetch an image produced by an analyze operation and save it under .cache/ as a JPEG."""
    raw_image = client.get_image_from_analyze_operation(
        analyze_response=response,
        image_id=image_id,
    )
    image = Image.open(BytesIO(raw_image))
    # image.show()
    Path(".cache").mkdir(exist_ok=True)
    image.save(f".cache/{image_id}.jpg", "JPEG")


INFO:azure.identity._credentials.environment:No environment configuration found.
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use IMDS
INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'User-Agent': 'azsdk-python-identity/1.23.0 Python/3.11.12 (Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.36)'
No body was attached to the request
INFO:az

## Video Content
The video output describes audiovisual content at the level of individual video shots. Its key features are:

1. Shot Information: Each shot is defined by a start and end time, along with a unique identifier. For example, Shot 0:0.0 to 0:2.800 includes a transcript and key frames.
1. Transcript: The API includes a transcript of the audio, formatted in WEBVTT, which allows for easy synchronization with the video. It captures spoken content and specifies the timing of the dialogue.
1. Key Frames: It provides a series of key frames (images) that represent important moments in the video shot, allowing users to visualize the content at specific timestamps.
1. Description: Each shot is accompanied by a description, providing context about the visuals presented. This helps in understanding the scene or subject matter without watching the video.
1. Audio Visual Metadata: Details about the video such as dimensions (width and height), type (audiovisual), and the presence of key frame timestamps are included.
1. Transcript Phrases: The output includes specific phrases from the transcript, along with timing and speaker information, enhancing the usability for applications like closed captioning or search functionalities.
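
As a rough illustration of how the markdown in the video output can be post-processed, the sketch below pulls key-frame references and WEBVTT cues out of a markdown string. The sample string is modeled on the output format shown in this notebook and is abbreviated. The helper names `extract_keyframes` and `extract_cues` are hypothetical and not part of the Content Understanding client. The sketch also assumes `mm:ss.ttt` timestamps and that the number in a key-frame name is a millisecond offset, which matches the samples here but is not guaranteed.

```python
import re

# Sample modeled on the markdown emitted by the video analyzer (abbreviated).
SAMPLE_MARKDOWN = (
    "# Video: 00:00.000 => 00:43.866\n"
    "## Segment 1: 00:00.000 => 00:07.367\n"
    "Transcript\n"
    "WEBVTT\n\n"
    "00:01.400 --> 00:06.640\n"
    "<Speaker 1>When it comes to the neural TTS, in order to get a good voice, "
    "it's better to have good data.\n\n"
    "Key Frames\n"
    "- 00:01.100 ![](keyFrame.1100.jpg)\n"
    "- 00:02.200 ![](keyFrame.2200.jpg)\n"
)

# Matches image references such as ![](keyFrame.1100.jpg).
KEYFRAME_RE = re.compile(r"!\[\]\((keyFrame\.(\d+))\.jpg\)")
# Matches a cue timing line followed by a "<Speaker N>text" line.
CUE_RE = re.compile(
    r"(\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}\.\d{3})\n<([^>]+)>(.*)"
)


def extract_keyframes(markdown: str) -> list[tuple[str, int]]:
    """Return unique (image_id, offset_ms) pairs sorted by offset."""
    found = {(m.group(1), int(m.group(2))) for m in KEYFRAME_RE.finditer(markdown)}
    return sorted(found, key=lambda item: item[1])


def extract_cues(markdown: str) -> list[dict]:
    """Return one dict per transcript cue with timing, speaker, and text."""
    return [
        {"start": start, "end": end, "speaker": speaker, "text": text.strip()}
        for start, end, speaker, text in CUE_RE.findall(markdown)
    ]


keyframes = extract_keyframes(SAMPLE_MARKDOWN)
cues = extract_cues(SAMPLE_MARKDOWN)
print(keyframes)
print(cues)
```

The image IDs returned by `extract_keyframes` have the same `keyFrame.<number>` form that the notebook passes to `save_image`.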

In [3]:
ANALYZER_SAMPLE_FILE = '../data/FlightSimulator.mp4'
ANALYZER_ID = 'prebuilt-videoAnalyzer'

# Analyze the file
response = client.begin_analyze(ANALYZER_ID, file_location=ANALYZER_SAMPLE_FILE)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))

# Save keyframes (optional)
keyframe_ids = set()
result_data = result_json.get("result", {})
contents = result_data.get("contents", [])

# Iterate over contents to find keyframes if available
for content in contents:
    # Extract keyframe IDs from "markdown" if it exists and is a string
    markdown_content = content.get("markdown", "")
    if isinstance(markdown_content, str):
        keyframe_ids.update(re.findall(r"(keyFrame\.\d+)\.jpg", markdown_content))

# Output the results
print("Unique Keyframe IDs:", keyframe_ids)

# Save all keyframe images
for keyframe_id in keyframe_ids:
    save_image(keyframe_id, response)

INFO:python.content_understanding_client:Analyzing file ../data/FlightSimulator.mp4 with analyzer: prebuilt-videoAnalyzer
INFO:python.content_understanding_client:Request df2b77b0-ba72-4044-ab5a-5c8adaa40c96 in progress ...
INFO:python.content_understanding_client:Request df2b77b0-ba72

{
  "id": "df2b77b0-ba72-4044-ab5a-5c8adaa40c96",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-videoAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "2025-07-01T07:54:33Z",
    "stringEncoding": "utf8",
    "contents": [
      {
        "markdown": "# Video: 00:00.000 => 00:43.866\nWidth: 1080\nHeight: 608\n\n## Segment 1: 00:00.000 => 00:07.367\nThe video begins with an aerial view of a scenic island, featuring the logos of Flight Simulator and Microsoft Azure AI. This is followed by a waveform visualization, indicating a discussion about neural TTS (Text-to-Speech) technology and the importance of good data for achieving high-quality voice synthesis.\n\nTranscript\n```\nWEBVTT\n\n00:01.400 --> 00:06.640\n<Speaker 1>When it comes to the neural TTS, in order to get a good voice, it's better to have good data.\n\n00:07.120 --> 00:13.280\n<Speaker 2>To achieve that, we build a universal TTS model based on 3000 hours of data.\n```\n\nKey Frames\n- 

## Custom Video Inspection Analyzer (uSWS)
This section demonstrates how to create a custom video inspection analyzer from the uSWS schema template and use it for room-by-room condition grading.
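
For orientation, here is a minimal sketch of what an analyzer template of this kind might look like when expressed as a Python dictionary. The top-level keys and the `config` keys mirror the analyzer creation result printed in this notebook. Everything else is an assumption for illustration: the description text, the shortened segmentation wording, the `items` and `properties` layout, and the `missing_keys` helper with its key list. The actual template file may differ.

```python
import json

# Hypothetical, heavily abbreviated template. Only the key names at the top
# level and inside "config" are taken from the creation result shown in this
# notebook; the nested field layout is an illustrative guess.
template = {
    "description": "Hypothetical room-by-room inspection analyzer",
    "baseAnalyzerId": "prebuilt-videoAnalyzer",
    "config": {
        "locales": ["en-US"],
        "returnDetails": True,
        "segmentationMode": "custom",
        "segmentationDefinition": (
            "Start a new segment whenever the camera clearly enters "
            "a different room or space."
        ),
    },
    "fieldSchema": {
        "fields": {
            "Segments": {
                "type": "array",
                "items": {  # assumed layout for array elements
                    "type": "object",
                    "properties": {
                        "SegmentId": {"type": "string"},
                        "RoomType": {"type": "string"},
                    },
                },
            }
        }
    },
}

# Keys this sketch expects to find; not an authoritative service requirement.
EXPECTED_KEYS = ("baseAnalyzerId", "config", "fieldSchema")


def missing_keys(candidate: dict) -> list[str]:
    """Return the expected top-level keys that are absent from a template."""
    return [key for key in EXPECTED_KEYS if key not in candidate]


print(missing_keys(template))
print(json.dumps(template, indent=2)[:200])
```

A quick structural check like this can catch an incomplete template before it is sent to the service, where errors take longer to surface.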

In [4]:
# First, create your custom uSWS video inspection analyzer
import uuid

ANALYZER_TEMPLATE = "../analyzer_templates/uSWS_schema-video-inspection.json"
CUSTOM_ANALYZER_ID = "uSWS-video-inspection-" + str(uuid.uuid4())

print(f"Creating custom analyzer with ID: {CUSTOM_ANALYZER_ID}")

# Check if template file exists first
import os
if not os.path.exists(ANALYZER_TEMPLATE):
    print(f"Error: Template file {ANALYZER_TEMPLATE} does not exist!")
else:
    print(f"Template file {ANALYZER_TEMPLATE} found")
    
    # Validate JSON structure
    try:
        with open(ANALYZER_TEMPLATE, 'r') as f:
            template_content = json.load(f)
        print("JSON template is valid")
        print(f"Template description: {template_content.get('description', 'No description')}")
    except json.JSONDecodeError as e:
        print(f"JSON validation error: {e}")

# Create the analyzer from your template
try:
    print("Loading analyzer template...")
    response = client.begin_create_analyzer(CUSTOM_ANALYZER_ID, analyzer_template_path=ANALYZER_TEMPLATE)
    print("Waiting for analyzer creation to complete...")
    result = client.poll_result(response)

    print("Analyzer creation result:")
    print(json.dumps(result, indent=2))

    if result.get("status", "").lower() == "succeeded":
        print(f"Custom analyzer '{CUSTOM_ANALYZER_ID}' created successfully!")
    else:
        print("Failed to create analyzer.")
        print(f"Status: {result.get('status', 'Unknown')}")
        print(f"Error details: {result.get('error', 'No error details')}")
        if 'errors' in result:
            print(f"Validation errors: {result['errors']}")
        if 'message' in result:
            print(f"Error message: {result['message']}")
            
except Exception as e:
    print(f"Exception during analyzer creation: {str(e)}")
    result = {"status": "failed", "error": str(e)}

Creating custom analyzer with ID: uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42
Template file ../analyzer_templates/uSWS_schema-video-inspection.json found
JSON template is valid
Template description: Social-Housing Exit-Inspection Analyzer – room-by-room condition grading
Loading analyzer template...


INFO:python.content_understanding_client:Analyzer uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42 create request accepted.


Waiting for analyzer creation to complete...


INFO:python.content_understanding_client:Request result is ready after 0.00 seconds.


Analyzer creation result:
{
  "id": "a1890ca8-0270-496c-a507-eb90b7ee919e",
  "status": "Succeeded",
  "result": {
    "analyzerId": "uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42",
    "description": "Social-Housing Exit-Inspection Analyzer \u2013 room-by-room condition grading",
    "createdAt": "2025-07-01T07:57:59Z",
    "lastModifiedAt": "2025-07-01T07:57:59Z",
    "baseAnalyzerId": "prebuilt-videoAnalyzer",
    "config": {
      "locales": [
        "en-US"
      ],
      "returnDetails": true,
      "enableFace": false,
      "disableContentFiltering": false,
      "segmentationMode": "custom",
      "segmentationDefinition": "Start a new segment whenever the camera clearly enters a different room or space (doorway threshold or verbal cue such as 'now the kitchen'). Follow the inspector\u2019s anti-clockwise path template.",
      "disableFaceBlurring": false
    },
    "fieldSchema": {
      "fields": {
        "Segments": {
          "type": "array",
          "de

In [5]:
# Test with the standard prebuilt analyzer first to validate the uSWS video file
ANALYZER_TEST_FILE = '../data/uSWS_data/IMG_2601.mov'

print("Testing with prebuilt-videoAnalyzer first...")

try:
    test_response = client.begin_analyze('prebuilt-videoAnalyzer', file_location=ANALYZER_TEST_FILE)
    test_result = client.poll_result(test_response, timeout_seconds=300)  # 5 minutes
    
    if test_result.get("status", "").lower() == "succeeded":
        print("Prebuilt analyzer successful - video file is valid")
        
        # Get video info
        if "result" in test_result and "contents" in test_result["result"]:
            contents = test_result["result"]["contents"]
            if contents:
                content = contents[0]
                if "videoInfo" in content:
                    video_info = content["videoInfo"]
                    duration = video_info.get("durationMs", 0) / 1000  # Convert to seconds
                    width = video_info.get("width", "Unknown")
                    height = video_info.get("height", "Unknown")
                    print(f"Video info: {duration:.1f}s, {width}x{height}")
                    
                    if duration > 120:  # More than 2 minutes
                        print("Warning: Long video detected - custom analysis may take extra time")
    else:
        print(f"Prebuilt analyzer failed: {test_result.get('error', 'Unknown error')}")
        print("Skipping custom analyzer test")
        
except Exception as e:
    print(f"Prebuilt test failed: {str(e)}")
    print("Video file may have issues")

print("-" * 50)

Testing with prebuilt-videoAnalyzer first...


INFO:python.content_understanding_client:Analyzing file ../data/uSWS_data/IMG_2601.mov with analyzer: prebuilt-videoAnalyzer
INFO:python.content_understanding_client:Request 6eacbbbe-7789-40a0-87a5-3d4202efa638 in progress ...
INFO:python.content_understanding_client:Request 6eacbbbe-7

Prebuilt analyzer successful - video file is valid
--------------------------------------------------


In [8]:
# Now use your custom analyzer to analyze the uSWS video file
print(f"Analyzing {ANALYZER_TEST_FILE} with custom analyzer {CUSTOM_ANALYZER_ID}")

# Check if file exists first (os is already imported above)
if not os.path.exists(ANALYZER_TEST_FILE):
    print(f"Error: File {ANALYZER_TEST_FILE} does not exist!")
else:
    print(f"File {ANALYZER_TEST_FILE} found")

# Analyze file with your custom uSWS video inspection analyzer
try:
    response = client.begin_analyze(CUSTOM_ANALYZER_ID, file_location=ANALYZER_TEST_FILE)
    # result_json = client.poll_result(response)  # default timeout is 2 minutes
    result_json = client.poll_result(response, timeout_seconds=600)  # 10 minutes
    
    print("Custom analysis result:")
    print(json.dumps(result_json, indent=2))
    
except Exception as e:
    print(f"Error during analysis: {str(e)}")
    # A UUID is a unique run identifier, not a timestamp, so name the key accordingly
    result_json = {"status": "failed", "error": str(e), "run_id": str(uuid.uuid4())}

# Save the result to a JSON file (regardless of success or failure)
if result_json:
    run_id = str(uuid.uuid4())[:8]  # Short unique identifier (not a timestamp)
    status = result_json.get("status", "unknown")
    output_filename = f"uSWS_analysis_result_{status}_{run_id}.json"
    output_path = f".cache/{output_filename}"
    
    # Ensure .cache directory exists
    Path(".cache").mkdir(exist_ok=True)
    
    # Save the result to file
    with open(output_path, 'w') as f:
        json.dump(result_json, f, indent=2)
    
    print(f"Analysis result saved to: {output_path}")
else:
    print("No result data to save")

Analyzing ../data/uSWS_data/IMG_2601.mov with custom analyzer uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42
File ../data/uSWS_data/IMG_2601.mov found


INFO:python.content_understanding_client:Analyzing file ../data/uSWS_data/IMG_2601.mov with analyzer: uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42
INFO:python.content_understanding_client:Request be2eae31-ad45-4a4e-b8dd-06ec70759a6a in progress ...
INFO:python.content_und

Custom analysis result:
{
  "id": "be2eae31-ad45-4a4e-b8dd-06ec70759a6a",
  "status": "Succeeded",
  "result": {
    "analyzerId": "uSWS-video-inspection-17bd06d4-213c-485f-a6d3-405d36e72d42",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "2025-07-01T08:24:24Z",
    "stringEncoding": "utf8",
    "contents": [
      {
        "markdown": "# Video: 00:00.000 => 04:02.710\nWidth: 1920\nHeight: 1080\n\n## Segment 1: 00:01.100 => 00:28.600\nThe video begins with an exterior view of a house, showing a well-maintained garden and a pathway leading to the entrance. The house has striped awnings over the windows and a covered walkway. There are trash bins and some trash bags near the bins.\n\n\n\nKey Frames\n- 00:01.100 ![](keyFrame.1100.jpg)\n- 00:02.200 ![](keyFrame.2200.jpg)\n- 00:03.300 ![](keyFrame.3300.jpg)\n- 00:04.400 ![](keyFrame.4400.jpg)\n- 00:05.500 ![](keyFrame.5500.jpg)\n- 00:06.600 ![](keyFrame.6600.jpg)\n- 00:07.700 ![](keyFrame.7700.jpg)\n- 00:08.800 ![](keyFrame.8800

In [None]:
# Extract structured room inspection data from the results
if result_json.get("status", "").lower() == "succeeded":
    result_data = result_json.get("result", {})
    contents = result_data.get("contents", [])
    
    print("\nuSWS ROOM-BY-ROOM INSPECTION RESULTS:")
    print("=" * 60)
    
    for content_idx, content in enumerate(contents, 1):
        print(f"\nAnalysis Content {content_idx}:")
        
        # Look for the structured field data
        fields = content.get("fields", {})
        segments = fields.get("Segments", [])
        
        if segments and isinstance(segments, list):
            print(f"   Found {len(segments)} room segments:")
            
            for i, segment in enumerate(segments, 1):
                print(f"\n   Room Segment {i}:")
                print(f"      Segment ID: {segment.get('SegmentId', 'N/A')}")
                print(f"      Room Type: {segment.get('RoomType', 'Unknown')}")
                print(f"      Overall Condition Score: {segment.get('OverallRoomCondition', 'N/A')}/10")
                
                # Display detailed attributes
                attributes = segment.get('Attributes', '')
                if attributes:
                    print(f"\n      Room Attributes:")
                    for line in attributes.split('\n'):
                        if line.strip():
                            print(f"         - {line.strip()}")
                
                # Display issues found
                issues = segment.get('IssuesList', '')
                if issues:
                    print(f"\n      Issues Found:")
                    for line in issues.split('\n'):
                        if line.strip():
                            print(f"         Issue: {line.strip()}")
                
                # Display safety checks
                safety_checks = segment.get('AdditionalSafetyChecks', '')
                if safety_checks:
                    print(f"\n      Safety Checks:")
                    for line in safety_checks.split('\n'):
                        if line.strip():
                            print(f"         Check: {line.strip()}")
                
                print("      " + "-" * 50)
        else:
            print("   Warning: No room segments found in the analysis results")
            if fields:
                print(f"   Available fields: {list(fields.keys())}")
        
        # Save keyframes for visual reference
        markdown_content = content.get("markdown", "")
        if isinstance(markdown_content, str):
            keyframe_ids = set(re.findall(r"(keyFrame\.\d+)\.jpg", markdown_content))
            if keyframe_ids:
                print(f"\n   Saving {len(keyframe_ids)} keyframe images for visual reference...")
                for keyframe_id in keyframe_ids:
                    save_image(keyframe_id, response)
                print(f"   Images saved to .cache/ folder")
        
        print("\n" + "=" * 60)

else:
    print("Analysis failed")
    print(f"Status: {result_json.get('status', 'Unknown')}")
    print(f"Error: {result_json.get('error', 'Unknown error')}")
    
    # Print additional error details if available
    if 'details' in result_json:
        print(f"Details: {result_json['details']}")
    if 'message' in result_json:
        print(f"Message: {result_json['message']}")
    if 'errors' in result_json:
        print(f"Validation Errors: {result_json['errors']}")
        
    # Debug: Show the actual status for troubleshooting
    print(f"Debug - Actual status value: '{result_json.get('status', 'None')}'")
    print(f"Debug - Status type: {type(result_json.get('status', 'None'))}")

Analysis failed
Status: Succeeded
Error: Unknown error
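
The extraction loop above expects `fields["Segments"]` to be a plain list of dictionaries with string values. If the service instead returns each field wrapped in a typed envelope, the `isinstance(segments, list)` check would not match and no segments would be printed. The sketch below shows one way to flatten such envelopes. It assumes that arrays sit under `valueArray`, objects under `valueObject`, and scalars under keys such as `valueString` or `valueNumber`. That shape is an assumption about the response and is not confirmed by the truncated output in this notebook. The helper `unwrap_field` and the sample data are hypothetical.

```python
from typing import Any


def unwrap_field(field: Any) -> Any:
    """Recursively strip the assumed typed wrappers from a field value."""
    # Plain values and dicts without a "type" marker are returned unchanged.
    if not isinstance(field, dict) or "type" not in field:
        return field
    if "valueArray" in field:
        return [unwrap_field(item) for item in field["valueArray"]]
    if "valueObject" in field:
        return {
            name: unwrap_field(value)
            for name, value in field["valueObject"].items()
        }
    # Scalars: return the first "value*" entry, e.g. valueString or valueNumber.
    for key, value in field.items():
        if key.startswith("value"):
            return value
    return None  # wrapper present but nothing was extracted


# Hypothetical sample in the assumed wrapped shape.
sample_fields = {
    "Segments": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "SegmentId": {"type": "string", "valueString": "seg-1"},
                    "RoomType": {"type": "string", "valueString": "Kitchen"},
                    "OverallRoomCondition": {"type": "number", "valueNumber": 7},
                    "IssuesList": {"type": "string"},
                },
            }
        ],
    }
}

segments = unwrap_field(sample_fields["Segments"])
print(segments)
```

With a helper like this, the loop could call `unwrap_field(fields.get("Segments", []))` and keep the rest of its logic, since plain lists pass through unchanged.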


In [None]:
# Optional: Clean up by deleting the custom analyzer when done
# Uncomment the line below if you want to delete the analyzer after use
# client.delete_analyzer(CUSTOM_ANALYZER_ID)
# print(f"Custom analyzer '{CUSTOM_ANALYZER_ID}' deleted")

print(f"\nYour custom analyzer ID is: {CUSTOM_ANALYZER_ID}")
print("Tip: Save this ID if you want to reuse the analyzer later without recreating it!")