# Extract Custom Fields from Your File

This notebook demonstrates how to use analyzers to extract custom fields from your input files.

## Prerequisites
1. Ensure your Azure AI service is configured by following the [configuration steps](../README.md#configure-azure-ai-service-resource).
2. Install the required packages to run this sample.

In [None]:
%pip install -r ../requirements.txt

## Analyzer Templates

Below is a collection of analyzer templates designed to extract fields from various types of input files.

You can modify these templates to fit your own requirements. For additional verified templates from Microsoft, see the [analyzer templates README](../analyzer_templates/README.md).

In [None]:
extraction_templates = {
    # Extract fields from invoices (without grounding sources or confidence scores).
    "invoice": ('../analyzer_templates/invoice.json', '../data/invoice.pdf'),

    # Extract fields from invoices, including grounding sources and confidence scores (optional add-on).
    "invoice_field_source": ('../analyzer_templates/invoice_field_source.json', '../data/invoice.pdf'),

    # Extract insights from call recordings (e.g., summary, topics, mentioned companies, and people).
    "call_recording": ('../analyzer_templates/call_recording_analytics.json', '../data/callCenterRecording.mp3'),

    # Extract summary and sentiment from conversational audio (e.g., customer service calls).
    "conversation_audio": ('../analyzer_templates/conversational_audio_analytics.json', '../data/callCenterRecording.mp3'),

    # Extract descriptions and sentiment analysis from marketing videos.
    "marketing_video": ('../analyzer_templates/marketing_video.json', '../data/FlightSimulator.mp4'),
}
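Each entry in the dictionary above pairs a template path with a sample file path, both relative to the notebook's working directory. Before creating an analyzer, it can help to confirm that both files exist. The following is a minimal sketch of such a check; `find_missing_files` and its `base_dir` parameter are hypothetical helpers introduced here for illustration and are not part of the sample repository.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def find_missing_files(
    templates: Dict[str, Tuple[str, str]],
    base_dir: Optional[Path] = None,
) -> List[str]:
    """Return "name: path" entries for template or sample files that are missing.

    Hypothetical helper: `templates` has the same shape as `extraction_templates`
    (name -> (template_path, sample_file_path)).
    """
    base = base_dir if base_dir is not None else Path.cwd()
    missing = []
    for name, (template_path, sample_path) in templates.items():
        for path in (template_path, sample_path):
            # Relative paths are resolved against the chosen base directory.
            if not (base / path).is_file():
                missing.append(f"{name}: {path}")
    return missing
```

In the notebook, calling `find_missing_files(extraction_templates)` would return an empty list when every referenced file is present.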

Specify the analyzer template you want to use. The analyzer ID is generated later, when the analyzer is created from the selected template.

In [None]:
import uuid

ANALYZER_TEMPLATE = "invoice"

(analyzer_template_path, analyzer_sample_file_path) = extraction_templates[ANALYZER_TEMPLATE]

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class that provides functions to interact with the Content Understanding API. Until the official Content Understanding SDK is released, this client serves as a lightweight SDK.

In [None]:
import logging
import json
import os
import sys
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

load_dotenv(find_dotenv())
logging.basicConfig(level=logging.INFO)

AZURE_AI_ENDPOINT = os.getenv("AZURE_AI_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_AI_API_VERSION", "2025-05-01-preview")

# Add the parent directory to the path to import shared modules
parent_dir = Path.cwd().parent
sys.path.append(str(parent_dir))
from python.content_understanding_client import AzureContentUnderstandingClient

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    # x_ms_useragent="azure-ai-content-understanding-python/field_extraction", # This header is used for sample usage telemetry. It is commented out here, so no telemetry header is sent; uncomment this line to opt in.
)
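The cell above reads `AZURE_AI_ENDPOINT` with `os.getenv`, which returns `None` when the variable is not set, so a missing value would be passed to the client as `None`. A minimal sketch of an early check follows; `require_setting` is a hypothetical helper introduced here, not a function from the sample repository.

```python
import os
from typing import Optional


def require_setting(name: str, default: Optional[str] = None) -> str:
    """Return an environment setting, or raise if it is missing or blank.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name, default)
    if value is None or not value.strip():
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Add it to your .env file or export it before running the notebook."
        )
    return value.strip()
```

For example, `AZURE_AI_ENDPOINT = require_setting("AZURE_AI_ENDPOINT")` would stop with a clear message before any client is constructed.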

## Create Analyzer from the Template

In [None]:
CUSTOM_ANALYZER_ID = "field-extraction-sample-" + str(uuid.uuid4())
response = client.begin_create_analyzer(CUSTOM_ANALYZER_ID, analyzer_template_path=analyzer_template_path)
result = client.poll_result(response)

print(json.dumps(result, indent=2))

## Extract Fields Using the Analyzer

Once the analyzer is successfully created, you can use it to analyze your input files.

In [None]:
response = client.begin_analyze(CUSTOM_ANALYZER_ID, file_location=analyzer_sample_file_path)
result_json = client.poll_result(response)

print(json.dumps(result_json, indent=2))
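The printed JSON can be long, so a compact view of the extracted fields is often easier to read. The sketch below assumes the response nests fields under `result` → `contents` → `fields`, with each field carrying its value under a key that starts with `value` (for example `valueString`). That layout is an assumption here and is not confirmed by this notebook, so check it against the JSON printed above. `summarize_fields` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_fields(result_json: Dict[str, Any]) -> Dict[str, Any]:
    """Map each extracted field name to its value.

    Assumes (not confirmed by the notebook) the shape:
    {"result": {"contents": [{"fields": {"Name": {"type": ..., "valueXxx": ...}}}]}}
    """
    summary: Dict[str, Any] = {}
    contents = result_json.get("result", {}).get("contents", [])
    for content in contents:
        for field_name, field in content.get("fields", {}).items():
            # Take the first key that starts with "value"; None if the field is empty.
            value = next(
                (v for k, v in field.items() if k.startswith("value")), None
            )
            summary[field_name] = value
    return summary
```

With the assumed layout, `summarize_fields(result_json)` would return a flat dictionary keyed by field name. If the same field name appears in more than one content item, the last occurrence wins.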

## Clean Up
Optionally, delete the sample analyzer from your resource. In typical usage, you would keep the analyzer and reuse it to analyze multiple files.

In [None]:
client.delete_analyzer(CUSTOM_ANALYZER_ID)
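To illustrate the reuse pattern mentioned above, the following sketch runs one analyzer over several files and optionally deletes it afterwards, even if an analysis fails. It relies only on the `begin_analyze`, `poll_result`, and `delete_analyzer` calls used in this notebook; `analyze_files` and its `delete_when_done` flag are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Iterable


def analyze_files(
    client: Any,
    analyzer_id: str,
    file_paths: Iterable[str],
    delete_when_done: bool = False,
) -> Dict[str, Any]:
    """Analyze several files with one analyzer and return results keyed by path.

    Hypothetical helper: `client` is expected to expose begin_analyze,
    poll_result, and delete_analyzer as used elsewhere in this notebook.
    """
    results: Dict[str, Any] = {}
    try:
        for path in file_paths:
            response = client.begin_analyze(analyzer_id, file_location=path)
            results[path] = client.poll_result(response)
    finally:
        # Clean up even when one of the analyses raises.
        if delete_when_done:
            client.delete_analyzer(analyzer_id)
    return results
```

In the notebook, this could be called as `analyze_files(client, CUSTOM_ANALYZER_ID, [analyzer_sample_file_path], delete_when_done=True)` in place of the separate analyze and delete cells.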