# Conduct Complex Analysis with Pro Mode

> #################################################################################
>
> **Note:** Pro mode is currently available only for `document` data.  
> [Supported file types](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/service-limits#document-and-text): pdf, tiff, jpg, jpeg, png, bmp, heif
>
> #################################################################################

This notebook demonstrates how to use **Pro mode** in Azure AI Content Understanding to enhance your analyzer with multiple inputs and optional reference data. Pro mode is designed for advanced use cases, particularly those requiring multi-step reasoning and complex decision-making (for example, identifying inconsistencies, drawing inferences, and making sophisticated decisions). Pro mode allows input from multiple content files and includes the option to provide reference data at analyzer creation time.

In this walkthrough, you'll learn how to:
1. Create an analyzer with a schema and reference data.
2. Analyze your files using Pro mode.

For more details on Pro mode, see the [Azure AI Content Understanding: Standard and Pro Modes](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/concepts/standard-pro-modes) documentation.

## Prerequisites
1. Ensure the Azure AI service is configured by following the [setup steps](../README.md#configure-azure-ai-service-resource).
2. If using reference documents, please follow [Set env for reference doc](../docs/set_env_for_training_data_and_reference_doc.md) to configure reference document environment variables in the [.env](./.env) file.
    - You can set `REFERENCE_DOC_SAS_URL` directly with the SAS URL for your Azure Blob container.
    - Alternatively, set both `REFERENCE_DOC_STORAGE_ACCOUNT_NAME` and `REFERENCE_DOC_CONTAINER_NAME` so that the SAS URL can be generated automatically during a later step.
    - Also, set `REFERENCE_DOC_PATH` to specify the folder path within the container where reference documents will be uploaded.
    > ⚠️ Note: Reference documents are optional in Pro mode. You can run Pro mode using only input documents. For example, the service can reason across two or more input files without any reference data.
3. Install the required packages to run the sample.

In [None]:
%pip install -r ../requirements.txt
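
Before running the later cells, it can help to confirm which of the reference document settings above are present. The following is a minimal sketch and not part of the sample's helper code: `describe_reference_config` is a hypothetical function that only inspects a mapping of environment variables and reports how the SAS URL would be obtained.

```python
import os
from typing import Mapping


def describe_reference_config(env: Mapping[str, str]) -> str:
    """Report how the reference document SAS URL would be obtained.

    Hypothetical helper for illustration; it only reads the variable
    names listed in the prerequisites and performs no network calls.
    """
    if env.get("REFERENCE_DOC_SAS_URL"):
        return "sas_url"  # SAS URL supplied directly
    if env.get("REFERENCE_DOC_STORAGE_ACCOUNT_NAME") and env.get("REFERENCE_DOC_CONTAINER_NAME"):
        return "generate"  # SAS URL can be generated in a later step
    return "none"  # no reference storage configured; input documents only


print(describe_reference_config(os.environ))
```

A result of `none` is not an error, because reference documents are optional in Pro mode.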

## Create Azure Content Understanding Client
> The cells below use the `ContentUnderstandingClient` class from the `azure.ai.contentunderstanding` package. Set `AZURE_CONTENT_UNDERSTANDING_ENDPOINT` in your [.env](./.env) file, and optionally set `AZURE_CONTENT_UNDERSTANDING_KEY` to authenticate with a subscription key. If no key is set, the code falls back to `DefaultAzureCredential`.

> ⚠️ Important:
> The code below uses a subscription key when `AZURE_CONTENT_UNDERSTANDING_KEY` is set and `DefaultAzureCredential` otherwise.
> If you authenticate differently, update the credential setup to match your Azure authentication method; otherwise the sample may not run correctly.

> ⚠️ Note: A subscription key works, but token-based authentication with Microsoft Entra ID (formerly Azure Active Directory) is recommended and safer for production environments.

In [None]:
import logging
import json
import os
from pathlib import Path
import sys
from dotenv import load_dotenv
import base64
from azure.storage.blob import ContainerSasPermissions
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import (
    AnalyzeResult,
    AnalyzeInput,
    ContentAnalyzer,
    ContentAnalyzerConfig,
    AnalysisMode,
    ProcessingLocation,
    AudioVisualContent,
    FieldSchema,
    FieldDefinition,
    FieldType,
    GenerationMethod,
)
from datetime import datetime
from typing import Any
import uuid

# Add the sibling "python" directory to the Python path to import the extension helper modules
sys.path.append(os.path.join(os.path.dirname(os.getcwd()), 'python'))
from extension.document_processor import DocumentProcessor
from extension.sample_helper import (
    extract_operation_id_from_poller,
    PollerType,
    save_json_to_file,
)

load_dotenv()
logging.basicConfig(level=logging.INFO)

endpoint = os.environ.get("AZURE_CONTENT_UNDERSTANDING_ENDPOINT")
# Use AzureKeyCredential if AZURE_CONTENT_UNDERSTANDING_KEY is set; otherwise fall back to DefaultAzureCredential
key = os.getenv("AZURE_CONTENT_UNDERSTANDING_KEY")
credential = AzureKeyCredential(key) if key else DefaultAzureCredential()
# Create the ContentUnderstandingClient
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
print("✅ ContentUnderstandingClient created successfully")

try:
    processor = DocumentProcessor(client)
    print("✅ DocumentProcessor created successfully")
except Exception as e:
    print(f"❌ Failed to create DocumentProcessor: {e}")
    raise

## Prepare Reference Data
In this step, we will:
- Use `REFERENCE_DOC_PATH` and the SAS URL-related environment variables that were set in the Prerequisites.
- Attempt to obtain the SAS URL from the `REFERENCE_DOC_SAS_URL` environment variable.
  If this is not set, the SAS URL will be generated automatically using the `REFERENCE_DOC_STORAGE_ACCOUNT_NAME` and `REFERENCE_DOC_CONTAINER_NAME` environment variables.
- Use Azure AI service to extract OCR results from reference documents if needed.
- Generate a reference `.jsonl` file.
- Upload these files to the designated Azure Blob Storage.
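
The `.jsonl` file holds one JSON object per line. As a rough sketch of that step, a manifest could be assembled as shown here. The key names `file` and `resultFile` and the helper `build_reference_manifest` are assumptions for illustration only; in this notebook the real file is produced by `processor.generate_knowledge_base_on_blob`.

```python
import json


def build_reference_manifest(document_names: list[str]) -> str:
    """Return JSON Lines text with one entry per reference document.

    The keys "file" and "resultFile" are assumed for illustration only.
    """
    lines = []
    for name in sorted(document_names):
        # Pair each document with an OCR result file named "<document>.result.json"
        entry = {"file": name, "resultFile": f"{name}.result.json"}
        lines.append(json.dumps(entry))
    return "\n".join(lines) + "\n" if lines else ""


print(build_reference_manifest(["contract.pdf", "invoice.pdf"]), end="")
```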


In [None]:
# Load reference storage configuration from environment
reference_doc_path = os.getenv("REFERENCE_DOC_PATH") or f"reference_docs_{uuid.uuid4().hex[:8]}"
reference_doc_sas_url = os.getenv("REFERENCE_DOC_SAS_URL")

if not reference_doc_path.endswith("/"):
    reference_doc_path += "/"

if not reference_doc_sas_url:
    reference_doc_storage_account_name = os.getenv("REFERENCE_DOC_STORAGE_ACCOUNT_NAME")
    reference_doc_container_name = os.getenv("REFERENCE_DOC_CONTAINER_NAME")
    print(f"REFERENCE_DOC_STORAGE_ACCOUNT_NAME: {reference_doc_storage_account_name}")
    print(f"REFERENCE_DOC_CONTAINER_NAME: {reference_doc_container_name}")

    if reference_doc_storage_account_name and reference_doc_container_name:
        # We require "Write" permission to upload, modify, or append blobs
        reference_doc_sas_url = processor.generate_container_sas_url(
            account_name=reference_doc_storage_account_name,
            container_name=reference_doc_container_name,
            permissions=ContainerSasPermissions(read=True, write=True, list=True),
            expiry_hours=1,
        )

> ⚠️ Note: Reference documents are optional in Pro mode. You can run Pro mode using only input documents. For example, the service can reason across two or more input files without any reference data. To skip reference document preparation, please skip or comment out the code in the following section.

In [None]:
# Set skip_analyze to True if you already have OCR results for the documents in the reference_docs folder
# Please name the OCR result files with the same name as the original document filenames including extension, and add the suffix ".result.json"
# For example, if the original document is "invoice.pdf", the OCR result file should be named "invoice.pdf.result.json"
# NOTE: Please comment out the following line if you do not have any reference documents.
reference_docs = "../data/field_extraction_pro_mode/invoice_contract_verification/reference_docs"
print(f"REFERENCE_DOCS: {reference_docs}")
print(f"REFERENCE_DOC_SAS_URL: {reference_doc_sas_url}")
print(f"REFERENCE_DOC_PATH: {reference_doc_path}")
await processor.generate_knowledge_base_on_blob(reference_docs, reference_doc_sas_url, reference_doc_path, skip_analyze=False)
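
Before setting `skip_analyze=True`, you may want to check that every reference document has a matching OCR result file. The sketch below is a hypothetical helper (`find_missing_ocr_results` is not part of `DocumentProcessor`); it only applies the `<filename>.result.json` naming rule described in the cell comments.

```python
from pathlib import Path

RESULT_SUFFIX = ".result.json"


def find_missing_ocr_results(folder: str) -> list[str]:
    """List documents in `folder` that lack a '<name>.result.json' file."""
    names = {p.name for p in Path(folder).iterdir() if p.is_file()}
    # Anything that is not itself an OCR result file counts as a document
    documents = [n for n in names if not n.endswith(RESULT_SUFFIX)]
    return sorted(n for n in documents if n + RESULT_SUFFIX not in names)
```

Calling `find_missing_ocr_results(reference_docs)` returns an empty list when every document has a result file under this naming rule; any names it returns would still need OCR.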

## Create Analyzer with Defined Schema for Pro Mode
Before creating the analyzer, assign a relevant name to the `analyzer_id` variable for your task. Here, we append a timestamp and a random suffix so this cell can be run multiple times to create different analyzers.

We use **reference_doc_sas_url** and **reference_doc_path** configured in the [.env](./.env) file and utilized in the previous step.

In [None]:
analyzer_id = f"pro-mode-sample-{datetime.now().strftime('%Y%m%d-%H%M%S')}-{uuid.uuid4().hex[:8]}"

# Create a custom analyzer using object model
content_analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-documentAnalyzer",
    field_schema=FieldSchema(
        name="InvoiceContractVerification",
        description="Analyze invoice to confirm total consistency with signed contract.",
        fields={
            "PaymentTermsInconsistencies": FieldDefinition(
                type=FieldType.ARRAY,
                method=GenerationMethod.GENERATE,
                description="List all areas of inconsistency identified in the invoice with corresponding evidence.",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    method=GenerationMethod.GENERATE,
                    description="Area of inconsistency in the invoice with the company's contracts.",
                    properties={
                        "Evidence": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Evidence or reasoning for the inconsistency in the invoice."
                        ),
                        "InvoiceField": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Invoice field or the aspect that is inconsistent with the contract."
                        )
                    }
                )
            ),
            "ItemInconsistencies": FieldDefinition(
                type=FieldType.ARRAY,
                method=GenerationMethod.GENERATE,
                description="List all areas of inconsistency identified in the invoice in the goods or services sold (including detailed specifications for every line item).",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    method=GenerationMethod.GENERATE,
                    description="Area of inconsistency in the invoice with the company's contracts.",
                    properties={
                        "Evidence": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Evidence or reasoning for the inconsistency in the invoice."
                        ),
                        "InvoiceField": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Invoice field or the aspect that is inconsistent with the contract."
                        )
                    }
                )
            ),
            "BillingLogisticsInconsistencies": FieldDefinition(
                type=FieldType.ARRAY,
                method=GenerationMethod.GENERATE,
                description="List all areas of inconsistency identified in the invoice regarding billing logistics and administrative or legal issues.",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    method=GenerationMethod.GENERATE,
                    description="Area of inconsistency in the invoice with the company's contracts.",
                    properties={
                        "Evidence": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Evidence or reasoning for the inconsistency in the invoice."
                        ),
                        "InvoiceField": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Invoice field or the aspect that is inconsistent with the contract."
                        )
                    }
                )
            ),
            "PaymentScheduleInconsistencies": FieldDefinition(
                type=FieldType.ARRAY,
                method=GenerationMethod.GENERATE,
                description="List all areas of inconsistency identified in the invoice with corresponding evidence.",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    method=GenerationMethod.GENERATE,
                    description="Area of inconsistency in the invoice with the company's contracts.",
                    properties={
                        "Evidence": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Evidence or reasoning for the inconsistency in the invoice."
                        ),
                        "InvoiceField": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Invoice field or the aspect that is inconsistent with the contract."
                        )
                    }
                )
            ),
            "TaxOrDiscountInconsistencies": FieldDefinition(
                type=FieldType.ARRAY,
                method=GenerationMethod.GENERATE,
                description="List all areas of inconsistency identified in the invoice with corresponding evidence regarding taxes or discounts.",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    method=GenerationMethod.GENERATE,
                    description="Area of inconsistency in the invoice with the company's contracts.",
                    properties={
                        "Evidence": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Evidence or reasoning for the inconsistency in the invoice."
                        ),
                        "InvoiceField": FieldDefinition(
                            type=FieldType.STRING,
                            method=GenerationMethod.GENERATE,
                            description="Invoice field or the aspect that is inconsistent with the contract."
                        )
                    }
                )
            )
        }
    ),
    mode=AnalysisMode.PRO,
    processing_location=ProcessingLocation.GLOBAL,
    knowledge_sources=[{
        "kind": "reference",
        "containerUrl": reference_doc_sas_url,
        "prefix": reference_doc_path,
        "fileListPath": processor.KNOWLEDGE_SOURCE_LIST_FILE_NAME,
    }],
)
print("KNOWLEDGE_SOURCE_LIST_FILE_NAME", processor.KNOWLEDGE_SOURCE_LIST_FILE_NAME)
print(f"🔧 Creating custom analyzer '{analyzer_id}'...")
poller = await client.content_analyzers.begin_create_or_replace(
    analyzer_id=analyzer_id,
    resource=content_analyzer,
)

# Extract operation ID from the poller
operation_id = extract_operation_id_from_poller(
    poller, PollerType.ANALYZER_CREATION
)
print(f"📋 Extracted creation operation ID: {operation_id}")

# Wait for the analyzer to be created
print("⏳ Waiting for analyzer creation to complete...")
await poller.result()
print(f"✅ Analyzer '{analyzer_id}' created successfully!")
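
The five fields in this schema share the same array-of-objects shape and differ only in name and description. As a sketch of how that repetition could be factored out, the helper below builds the same structure as plain dictionaries. `inconsistency_field` is a hypothetical helper, and the dictionary keys (`type`, `method`, `description`, `items`, `properties`) are assumed to mirror the JSON form of the schema rather than being taken from the SDK object model used in the cell.

```python
import json


def inconsistency_field(description: str) -> dict:
    """Build one array-of-objects field with Evidence and InvoiceField."""

    def text(desc: str) -> dict:
        # Leaf string field generated by the model
        return {"type": "string", "method": "generate", "description": desc}

    return {
        "type": "array",
        "method": "generate",
        "description": description,
        "items": {
            "type": "object",
            "method": "generate",
            "description": "Area of inconsistency in the invoice with the company's contracts.",
            "properties": {
                "Evidence": text("Evidence or reasoning for the inconsistency in the invoice."),
                "InvoiceField": text("Invoice field or the aspect that is inconsistent with the contract."),
            },
        },
    }


fields = {
    "PaymentTermsInconsistencies": inconsistency_field(
        "List all areas of inconsistency identified in the invoice with corresponding evidence."
    ),
    "TaxOrDiscountInconsistencies": inconsistency_field(
        "List all areas of inconsistency identified in the invoice with corresponding evidence regarding taxes or discounts."
    ),
}
print(json.dumps(fields, indent=2))
```

Only two of the five fields are shown; the others would follow the same pattern with their own descriptions.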

## Use Created Analyzer to Analyze Input Documents
After the analyzer is successfully created, it can be used to analyze input files.
> NOTE: Pro mode performs multi-step reasoning and may require more time for analysis.
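
Because a Pro mode call can run for a while, you may want an upper bound on how long the notebook waits. The sketch below uses `asyncio.wait_for` from the standard library. `slow_analysis` is a stand-in coroutine rather than a service call, `wait_with_timeout` is a hypothetical helper, and any timeout value you choose is arbitrary.

```python
import asyncio


async def slow_analysis(seconds: float) -> str:
    """Stand-in for a long-running analysis; sleeps, then returns."""
    await asyncio.sleep(seconds)
    return "done"


async def wait_with_timeout(awaitable, timeout_seconds: float):
    """Await `awaitable`, returning None if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return None
```

In a notebook cell this could be called as `await wait_with_timeout(slow_analysis(0.01), timeout_seconds=1.0)`. With a real poller, the awaited object could be `analysis_poller.result()`; note that `asyncio.wait_for` cancels the awaited task when the timeout expires.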

In [None]:
input_docs = "../data/field_extraction_pro_mode/invoice_contract_verification/input_docs/contoso_lifts_invoice.pdf"

print(f"📄 Reading document file: {input_docs}")
with open(input_docs, "rb") as f:
    pdf_content = f.read()

print(f"🔍 Starting document analysis with analyzer '{analyzer_id}'...")
analysis_poller = await client.content_analyzers.begin_analyze_binary(
    analyzer_id=analyzer_id,
    input=pdf_content,
    content_type="application/pdf"
)

# Wait for analysis completion
print("⏳ Waiting for document analysis to complete...")
analysis_result = await analysis_poller.result()
print("✅ Document analysis completed successfully!")

# Extract operation ID for get_result
analysis_operation_id = extract_operation_id_from_poller(
    analysis_poller, PollerType.ANALYZE_CALL
)
print(f"📋 Extracted analysis operation ID: {analysis_operation_id}")

# Get the analysis result using the operation ID
print(
    f"🔍 Getting analysis result using operation ID '{analysis_operation_id}'..."
)
operation_status = await client.content_analyzers.get_result(
    operation_id=analysis_operation_id,
)

# The actual analysis result is in operation_status.result
operation_result = operation_status.result

# Create output directory if it doesn't exist
output_dir = "output"
os.makedirs(output_dir, exist_ok=True)

saved_file_path = os.path.join(output_dir, f"{analyzer_id}_result.json")
with open(saved_file_path, "w", encoding="utf-8") as file:
    json.dump(operation_result.as_dict(), file, indent=2)

logging.info(f"Full analyzer result saved to: {saved_file_path}")
print(f"💾 Analysis result saved to: {saved_file_path}")

> Let's review the extracted fields with Pro mode.

In [None]:
print(json.dumps(operation_result.as_dict(), indent=2))
fields = operation_result.contents[0].fields
print(fields)

> As seen in the `PaymentTermsInconsistencies` field, for example, the purchase contract details the payment terms agreed upon prior to the service. However, the implied payment terms on the invoice conflict with these. Pro mode identified the corresponding contract from the reference documents and analyzed it alongside the invoice to discover this inconsistency.
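
To scan the findings without reading the full JSON, the nested field values can be flattened into rows. This is a sketch under an assumption: it expects the dictionary form of the fields to use `valueArray`, `valueObject`, and `valueString` keys, which you should confirm against your own saved result file. `flatten_inconsistencies` and the sample data are illustrative only.

```python
def flatten_inconsistencies(fields: dict) -> list[tuple[str, str, str]]:
    """Return (category, invoice_field, evidence) rows from a fields dict."""
    rows = []
    for category, field in fields.items():
        # A field with no findings may omit "valueArray" entirely
        for item in field.get("valueArray", []):
            obj = item.get("valueObject", {})
            invoice_field = obj.get("InvoiceField", {}).get("valueString", "")
            evidence = obj.get("Evidence", {}).get("valueString", "")
            rows.append((category, invoice_field, evidence))
    return rows


# Made-up sample shaped like the assumed result format.
sample_fields = {
    "PaymentTermsInconsistencies": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "InvoiceField": {"type": "string", "valueString": "Payment terms"},
                    "Evidence": {"type": "string", "valueString": "Example evidence text."},
                },
            }
        ],
    },
    "ItemInconsistencies": {"type": "array"},
}

for row in flatten_inconsistencies(sample_fields):
    print(row)
```

If the key names match your result, the same function could be applied to `operation_result.as_dict()["contents"][0]["fields"]`.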

## Delete Existing Analyzer in Content Understanding Service
This step is optional. Deleting the test analyzer keeps it from persisting in your service; if you skip this step, the analyzer remains available for reuse.

In [None]:
await client.content_analyzers.delete(analyzer_id)

## Bonus Sample
This sample demonstrates how Pro mode supports multi-document input and advanced reasoning. Unlike Document Standard Mode, which processes one document at a time, Pro mode can analyze multiple documents within a single analysis call. The service not only processes each document independently but also cross-references documents to perform reasoning across them, enabling deeper insights and validation.

### First, Set Up Variables for the Second Sample

In [None]:
# Define paths for analyzer template, input documents, and reference documents of the second sample
analyzer_template_2 = "../analyzer_templates/insurance_claims_review_pro_mode.json"
input_docs_2 = "../data/field_extraction_pro_mode/insurance_claims_review/input_docs"
reference_docs_2 = "../data/field_extraction_pro_mode/insurance_claims_review/reference_docs"

# Load reference storage configuration from environment
reference_doc_path_2 = reference_doc_path.rstrip("/") + "_2/"  # NOTE: Use a different path for the second sample
analyzer_id_2 = "pro-mode-sample-" + str(uuid.uuid4())

### Generate Knowledge Base for the Second Sample
Upload [reference documents](../data/field_extraction_pro_mode/insurance_claims_review/reference_docs/) with existing OCR results for the second sample. These documents contain driver coverage policies useful for reviewing insurance claims.

In [None]:
logging.info("Start generating knowledge base for the second sample...")
# Reuse the same blob container
await processor.generate_knowledge_base_on_blob(reference_docs_2, reference_doc_sas_url, reference_doc_path_2, skip_analyze=True)

### Create Analyzer for the Second Sample
Reuse the existing `ContentUnderstandingClient` instance (`client`) created earlier.

In [None]:
analyzer_id = f"pro-mode-sample-{datetime.now().strftime('%Y%m%d-%H%M%S')}-{uuid.uuid4().hex[:8]}"
# Create a custom analyzer using object model
print(f"🔧 Creating custom analyzer '{analyzer_id}' for bonus...")

bonus_content_analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-documentAnalyzer",
    description="Bonus content analyzer for cleanup demonstration",
    field_schema=FieldSchema(
        name="InsuranceClaimsReview",
        description="Analyze documents for insurance claim approval strictly according to the provided insurance policy. Consider all aspects of the insurance claim documents, any potential discrepancies found among the documents, any claims that should be flagged for review, etc.",
        fields={
            "CarBrand": FieldDefinition(
                type=FieldType.STRING,
                description="Brand of the damaged vehicle.",
            ),
            "CarColor": FieldDefinition(
                type=FieldType.STRING,
                description="Color of the damaged vehicle. Only use color name from 17 web colors. Use CamelCase naming convention.",
            ),
            "CarModel": FieldDefinition(
                type=FieldType.STRING,
                description="Model of the damaged vehicle. Do not include brand name. Leave empty if not found.",
            ),
            "LicensePlate": FieldDefinition(
                type=FieldType.STRING,
                description="License plate number of the damaged vehicle.",
            ),
            "VIN": FieldDefinition(
                type=FieldType.STRING,
                description="VIN of the damaged vehicle. Leave empty if not found.",
            ),
            "ReportingOfficer": FieldDefinition(
                type=FieldType.STRING,
                description="Name of the reporting officer for the incident.",
            ),
            "LineItemCorroboration": FieldDefinition(
                type=FieldType.ARRAY,
                description="Validation of all of the line items on the claim, including parts, services, labors, materials, shipping and other costs and fees. When in doubt about adherence to the policy, mark as suspicious.",
                items_property=FieldDefinition(
                    type=FieldType.OBJECT,
                    description="Entry in the line item analysis table to analyze the pertinent information for the line item.",
                    properties={
                        "LineItemName": FieldDefinition(
                            type=FieldType.STRING,
                            description="Name of the line item in the claim.",
                        ),
                        "IdentifiedVehiclePart": FieldDefinition(
                            type=FieldType.STRING,
                            description="The relevant associated vehicle part for this line item",
                            enum=[
                                "BODY_TRIM",
                                "DRIVER_SIDE_DRIVER_DOOR",
                                "DRIVER_SIDE_DRIVER_HANDLE",
                                "DRIVER_SIDE_FRONT_TIRE",
                                "DRIVER_SIDE_FRONT_WHEEL",
                                "DRIVER_SIDE_FUEL_CAP",
                                "DRIVER_SIDE_PASSENGER_DOOR",
                                "DRIVER_SIDE_PASSENGER_HANDLE",
                                "DRIVER_SIDE_PASSENGER_WINDOW",
                                "DRIVER_SIDE_REAR_HEADLAMP",
                                "DRIVER_SIDE_REAR_TIRE",
                                "DRIVER_SIDE_REAR_WHEEL",
                                "DRIVER_SIDE_SIDE_WINDOW",
                                "DRIVER_SIDE_WINDOW",
                                "DRIVER_SIDE_WING_MIRROR",
                                "FRONT_BONNET",
                                "FRONT_BUMPER_LOWER",
                                "FRONT_BUMPER_UPPER",
                                "FRONT_DRIVER_SIDE_FOG_LIGHT",
                                "FRONT_DRIVER_SIDE_HEADLAMP",
                                "FRONT_GRILL",
                                "FRONT_NUMBER_PLATE",
                                "FRONT_PASSENGER_SIDE_FOG_LIGHT",
                                "FRONT_PASSENGER_SIDE_HEADLAMP",
                                "FRONT_WINDSHIELD",
                                "PASSENGER_SIDE_DRIVER_DOOR",
                                "PASSENGER_SIDE_DRIVER_HANDLE",
                                "PASSENGER_SIDE_FRONT_TIRE",
                                "PASSENGER_SIDE_FRONT_WHEEL",
                                "PASSENGER_SIDE_PASSENGER_DOOR",
                                "PASSENGER_SIDE_PASSENGER_HANDLE",
                                "PASSENGER_SIDE_PASSENGER_WINDOW",
                                "PASSENGER_SIDE_REAR_HEADLAMP",
                                "PASSENGER_SIDE_REAR_TIRE",
                                "PASSENGER_SIDE_REAR_WHEEL",
                                "PASSENGER_SIDE_SIDE_WINDOW",
                                "PASSENGER_SIDE_WINDOW",
                                "PASSENGER_SIDE_WING_MIRROR",
                                "REAR_BUMPER",
                                "REAR_NUMBER_PLATE",
                                "REAR_TRUNK",
                                "REAR_WINDSHIELD",
                                "ROOF_PANEL",
                                "OTHER"
                            ]
                        ),
                        "Cost": FieldDefinition(
                            type=FieldType.NUMBER,
                            description="The cost of this line item on the claim.",
                        ),
                        "Evidence": FieldDefinition(
                            type=FieldType.ARRAY,
                            description="The evidence for this line item entry, a list of the document with analyzed evidence supporting the claim formatted as <document name>/<evidence>. One of the insurance policy documents must be one of the documents.",
                            items_property=FieldDefinition(
                                type=FieldType.STRING,
                            )
                        ),
                        "ClaimStatus": FieldDefinition(
                            type=FieldType.STRING,
                            description="Determined by confidence in whether the claim should be approved based on the evidence. Item should be compliant to insurance policy and required for repairing the vehicle. Only use 'confirmed' for items explicitly approvable according to the policy. If unsure, use 'suspicious'.",
                            enum=[
                                "confirmed",
                                "suspicious",
                                "unconfirmed"
                            ],
                            enum_descriptions={
                                "confirmed": "Completely and explicitly corroborated by the policy.",
                                "suspicious": "Only partially verified, questionable, or otherwise uncertain evidence to approve automatically. Requires human review.",
                                "unconfirmed": "Explicitly not approved by the policy."
                            }
                        )
                    },
                )
            )
        }
    ),
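    mode=AnalysisMode.PRO,
    processing_location=ProcessingLocation.GLOBAL,
    # Attach the reference documents uploaded for the second sample
    knowledge_sources=[{
        "kind": "reference",
        "containerUrl": reference_doc_sas_url,
        "prefix": reference_doc_path_2,
        "fileListPath": processor.KNOWLEDGE_SOURCE_LIST_FILE_NAME,
    }],
)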

# Start the analyzer creation operation
poller = await client.content_analyzers.begin_create_or_replace(
    analyzer_id=analyzer_id,
    resource=bonus_content_analyzer,
)

# Extract operation ID from the poller
operation_id = extract_operation_id_from_poller(
    poller, PollerType.ANALYZER_CREATION
)
print(f"📋 Extracted creation operation ID: {operation_id}")

# Wait for the analyzer to be created
print("⏳ Waiting for analyzer creation to complete...")
await poller.result()
print(f"✅ Analyzer '{analyzer_id}' created successfully!")

### Analyze Multiple Input Documents with the Second Analyzer
The [input_docs_2](../data/field_extraction_pro_mode/insurance_claims_review/input_docs/) folder contains two PDFs as input: a car accident report and a repair estimate.

The first document includes details such as the car’s license plate number, vehicle model, and incident information.
The second document provides a breakdown of the estimated repair costs.

Because of the complexity of this multi-document scenario, analysis may take several minutes.

In [None]:
inputs_data: list[AnalyzeInput] = []
input_dir = Path(input_docs_2)

for file_path in input_dir.glob("*"):
    if file_path.is_file() and processor.is_supported_doc_type_by_file_path(file_path, is_document=True):
        # Get relative path and replace separators with underscores
        relative_path = file_path.relative_to(input_dir)
        name = str(relative_path).replace(os.sep, '_').replace('/', '_').replace('\\', '_')

        with open(file_path, 'rb') as f:
            file_data = f.read()
            base64_data = base64.b64encode(file_data).decode('utf-8')
        
        inputs_data.append({
            'name': name,
            'data': base64_data
        })

analysis_poller = await client.content_analyzers.begin_analyze(
    analyzer_id=analyzer_id, 
    inputs=inputs_data, 
    content_type="application/json")

# Wait for analysis completion
print("⏳ Waiting for document analysis to complete...")
analysis_result = await analysis_poller.result()
print("✅ Document analysis completed successfully!")

# Extract operation ID for get_result
analysis_operation_id = extract_operation_id_from_poller(
    analysis_poller, PollerType.ANALYZE_CALL
)
print(f"📋 Extracted analysis operation ID: {analysis_operation_id}")

# Get the analysis result using the operation ID
print(
    f"🔍 Getting analysis result using operation ID '{analysis_operation_id}'..."
)
operation_status = await client.content_analyzers.get_result(
    operation_id=analysis_operation_id,
)

print("✅ Analysis result retrieved successfully!")
print(f"   Operation ID: {operation_status.id}")
print(f"   Status: {operation_status.status}")

# The actual analysis result is in operation_status.result
operation_result = operation_status.result
if operation_result is None:
    # Stop here: the lines below need a result to work with
    raise RuntimeError("⚠️  No analysis result available")

print(f"   Result contains {len(operation_result.contents)} contents")

# Save the analysis result to a file
saved_file_path = save_json_to_file(
    result=operation_result.as_dict(),
    filename_prefix="bonus_content_analyzer_get_result",
)
print(f"💾 Analysis result saved to: {saved_file_path}")

### Review Analyze Result

In [None]:
print(json.dumps(operation_result.as_dict()["contents"][0]["fields"], indent=2))

### Examine `LineItemCorroboration` Field in Detail

> The `ReportingOfficer` field is only present in the car accident report, while fields such as `VIN` come exclusively from the repair estimate document. This demonstrates how information is extracted from both documents to generate a single unified result.
> 
> Multiple input documents are combined to produce one consolidated output. This is a single-analysis result, unlike a batch model where N input documents yield N outputs.

In [None]:
# Convert to a plain dict first so the nested field can be serialized directly
line_items = operation_result.as_dict()["contents"][0]["fields"]["LineItemCorroboration"]
print(json.dumps(line_items, indent=2))

> Within the `LineItemCorroboration` field, each line item from the *repair estimate document* is extracted along with its claim status and evidence. Items not covered by the policy, such as a Starbucks drink and hotel stay, are marked as suspicious. Valid damage repairs supported by the claim documents and permitted under the policy are confirmed.
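
For a quick overview, the line items can be tallied by claim status. The sketch below assumes the dictionary form of the field uses `valueArray`, `valueObject`, `valueString`, and `valueNumber` keys, which you should verify against your saved result. `summarize_line_items` and the sample data are hypothetical and do not come from the sample documents.

```python
from collections import defaultdict


def summarize_line_items(line_items_field: dict) -> dict[str, float]:
    """Sum the Cost of each line item, grouped by ClaimStatus."""
    totals: dict[str, float] = defaultdict(float)
    for item in line_items_field.get("valueArray", []):
        obj = item.get("valueObject", {})
        status = obj.get("ClaimStatus", {}).get("valueString", "unknown")
        cost = obj.get("Cost", {}).get("valueNumber", 0.0)
        totals[status] += cost
    return dict(totals)


def _item(name: str, cost: float, status: str) -> dict:
    """Build one made-up line item in the assumed result shape."""
    return {
        "type": "object",
        "valueObject": {
            "LineItemName": {"type": "string", "valueString": name},
            "Cost": {"type": "number", "valueNumber": cost},
            "ClaimStatus": {"type": "string", "valueString": status},
        },
    }


sample_line_items = {
    "type": "array",
    "valueArray": [
        _item("Bumper repair", 450.0, "confirmed"),
        _item("Coffee", 5.5, "suspicious"),
        _item("Door panel", 300.0, "confirmed"),
    ],
}

print(summarize_line_items(sample_line_items))
```

If the key names match, the function could be applied to `operation_result.as_dict()["contents"][0]["fields"]["LineItemCorroboration"]` to see how much of the claim falls under each status.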

### [Optional] Delete the Analyzer for the Second Sample After Use

In [None]:
# The second analyzer was created under `analyzer_id`, so delete it by that ID
await client.content_analyzers.delete(analyzer_id)