# Manage Analyzers in Your Resource

This notebook demonstrates how to create a simple analyzer and manage its lifecycle: create, list, inspect, and delete.

Source: https://github.com/Azure-Samples/azure-ai-content-understanding-python.git

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. 


Until the official Content Understanding SDK is released, you can treat this class as a lightweight SDK.


In [1]:
import logging
import json
import os
import sys
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

In [2]:
load_dotenv(override=True, dotenv_path=find_dotenv())

True

In [3]:
logging.basicConfig(level=logging.INFO)

In [4]:
AZURE_AI_ENDPOINT = os.getenv("AZURE_CU_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_CU_API_VERSION", "2024-12-01-preview")
print(f"Current Azure Content Understanding endpoint: {AZURE_AI_ENDPOINT}")
print(f"Current Azure Content Understanding API version: {AZURE_AI_API_VERSION}")

Current Azure Content Understanding endpoint: https://ep-ai-services.services.ai.azure.com/
Current Azure Content Understanding API version: 2024-12-01-preview
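If `AZURE_CU_ENDPOINT` is missing from the environment, the cell above prints `None`, and the problem may only surface later when the client is used. A minimal sketch of a fail-fast check follows. The helper name `require_endpoint` is hypothetical and not part of the sample repository, and the `https://` requirement is an assumption based on the endpoint shown in the output.

```python
def require_endpoint(value):
    """Return the endpoint with surrounding whitespace removed, or raise.

    Hypothetical helper for illustration; not part of the sample repository.
    """
    if not value or not value.strip():
        raise ValueError("AZURE_CU_ENDPOINT is not set; add it to your .env file.")
    endpoint = value.strip()
    # Assumption: the service endpoint is always served over HTTPS.
    if not endpoint.startswith("https://"):
        raise ValueError(f"Endpoint should start with https://, got: {endpoint!r}")
    return endpoint
```

One way to use it is to wrap the lookup: `AZURE_AI_ENDPOINT = require_endpoint(os.getenv("AZURE_CU_ENDPOINT"))`.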


In [None]:
# only if necessary, add the parent directory to the path to use shared modules
# parent_dir = Path(Path.cwd()).parent
# sys.path.append(str(parent_dir))

# import the utility class AzureContentUnderstandingClient, which is a wrapper around the Azure Content Understanding REST API client
from python.content_understanding_client import AzureContentUnderstandingClient

In [6]:
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

INFO:azure.identity._credentials.environment:No environment configuration found.
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use IMDS


In [7]:
client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    x_ms_useragent="azure-ai-content-understanding-python/analyzer_management", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.
)

INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'User-Agent': 'azsdk-python-identity/1.21.0 Python/3.10.16 (Windows-10-10.0.26100-SP0)'
No body was attached to the request
INFO:azure.identity._credentials.chained:DefaultAzureCredential acquired a token from AzureCliCredential


## Create a simple analyzer
We first create an analyzer from a template, here the call recording analytics template, which generates fields such as a summary, topics, and companies mentioned.
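The cell below submits a create request and then waits for the long-running operation to finish. The sketch here illustrates the general poll-until-done pattern under stated assumptions; it is not the implementation of `poll_result` in the utility class. The function name `poll_until_done`, the `get_status` callable, and every status value other than `Succeeded` (which appears in the sample output) are assumptions.

```python
import time


def poll_until_done(get_status, timeout_seconds=120.0, interval_seconds=2.0):
    """Call get_status() repeatedly until the operation succeeds or fails.

    get_status is a hypothetical zero-argument callable returning a dict
    with a "status" key, similar in shape to the JSON printed in this notebook.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        payload = get_status()
        status = str(payload.get("status", "")).lower()
        if status == "succeeded":
            return payload
        if status == "failed":  # assumed failure status name
            raise RuntimeError(f"Operation failed: {payload}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation not finished after {timeout_seconds} seconds")
        time.sleep(interval_seconds)
```

Using a deadline rather than a fixed retry count keeps the total wait bounded even if individual status calls are slow.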

In [12]:
import uuid

ANALYZER_TEMPLATE = "analyzer_templates/call_recording_analytics.json"
ANALYZER_ID = "analyzer-management-sample-" + str(uuid.uuid4())

response = client.begin_create_analyzer(ANALYZER_ID, analyzer_template_path=ANALYZER_TEMPLATE)
result = client.poll_result(response)

print(json.dumps(result, indent=2))

INFO:python.content_understanding_client:Analyzer analyzer-management-sample-bff712bc-bcf4-4223-b47b-51318af4dba7 create request accepted.
INFO:python.content_understanding_client:Request d7c24c35-b146-4d76-a088-b2c5f78a8087 in progress ...
INFO:python.content_understanding_client:Request d7c24c35-b146-4d76-a088-b2c5f78a8087 in progress ...
INFO:python.content_understanding_client:Request result is ready after 5.04 seconds.


{
  "id": "d7c24c35-b146-4d76-a088-b2c5f78a8087",
  "status": "Succeeded",
  "result": {
    "analyzerId": "analyzer-management-sample-bff712bc-bcf4-4223-b47b-51318af4dba7",
    "description": "Sample call recording analytics",
    "createdAt": "2025-04-02T15:54:50Z",
    "lastModifiedAt": "2025-04-02T15:54:53Z",
    "config": {
      "locales": [
        "en-US"
      ],
      "returnDetails": true,
      "disableContentFiltering": false
    },
    "fieldSchema": {
      "fields": {
        "Summary": {
          "type": "string",
          "method": "generate",
          "description": "A one-paragraph summary"
        },
        "Topics": {
          "type": "array",
          "method": "generate",
          "description": "Top 5 topics mentioned",
          "items": {
            "type": "string"
          }
        },
        "Companies": {
          "type": "array",
          "method": "generate",
          "description": "List of companies mentioned",
          "items": {
      

## List all analyzers created in your resource

After the analyzer is successfully created, it appears in the list of analyzers in your resource, alongside the prebuilt analyzers and any custom analyzers created earlier.

In [13]:
all_analyzers = client.get_all_analyzers()
print(f"Number of analyzers in your resource: {len(all_analyzers['value'])}")
print(f"First 3 analyzer details: {json.dumps(all_analyzers['value'][:3], indent=2)}")

Number of analyzers in your resource: 13
First 3 analyzer details: [
  {
    "analyzerId": "prebuilt-read",
    "description": "Extract content elements such as words, barcodes, and formulas from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": false,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "prebuilt-layout",
    "description": "Extract various content and layout elements such as words, paragraphs, and tables from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": true,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "analyzer-management-sample-bff712bc-bcf4-4223-b47b-51318af4dba7",
    "description": "Sample call recording analytics",
    "createdAt": "2025-04-02T15:54:50Z",

In [14]:
import pandas as pd
df_analyzers = pd.json_normalize(all_analyzers['value'])
df_analyzers_sorted = df_analyzers.sort_values(by='createdAt', ascending=True)
df_analyzers_sorted

Unnamed: 0,analyzerId,description,warnings,status,scenario,config.returnDetails,config.enableOcr,config.enableLayout,config.enableBarcode,config.enableFormula,...,fieldSchema.fields.TripDetails.items.properties.DestinationCity.description,fieldSchema.fields.TripDetails.items.properties.ReturnDate.type,fieldSchema.fields.TripDetails.items.properties.ReturnDate.method,fieldSchema.fields.TripDetails.items.properties.ReturnDate.description,fieldSchema.fields.Signature.type,fieldSchema.fields.Signature.method,fieldSchema.fields.Signature.description,fieldSchema.fields.InsuranceCorp.type,fieldSchema.fields.InsuranceCorp.method,fieldSchema.fields.InsuranceCorp.description
12,travel-insurance-analyzer,,[],ready,document,True,True,True,False,False,...,Trip city\t,date,extract,Date of return,string,extract,Policyholder signature,,,
8,lab-receipt-analyzer,Receipt analyzer for laboratories expenses,[],ready,image,False,,,,,...,,,,,,,,,,
3,auto-labeling-model-1743436442370-230,,[],ready,image,False,,,,,...,,,,,,,,,,
7,lab-receipt-analyzer-v2,Receipt analyzer for laboratories expenses wit...,[],ready,image,False,,,,,...,,,,,,,,,,
11,travel-insurance-analyzer-v2,Added company name field extraction,[],ready,document,True,True,True,False,False,...,Trip city\t,date,extract,Date of return,string,extract,Policyholder signature,string,extract,Insurance company name
4,auto-labeling-model-1743437717122-749,,[],ready,document,True,True,True,False,False,...,,,,,,,,,,
6,invoice-analyzer-v0,Sample invoice layout,[],ready,document,True,True,True,False,False,...,,,,,,,,,,
5,auto-labeling-model-1743441022830-535,,[],ready,image,False,,,,,...,,,,,,,,,,
10,people-detection-analyzer,Detects if the photo contains people,[],ready,image,False,,,,,...,,,,,,,,,,
9,my-invoice-analyzer,A personal assistant for collecting invoice data,[],ready,document,True,True,True,False,False,...,,,,,,,,,,
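In the listing above, the analyzers that ship with the service have IDs beginning with `prebuilt-`. A small sketch that separates them from custom analyzers follows, assuming that prefix convention holds for every built-in analyzer (an inference from the sample output, not a documented guarantee). `split_analyzers` is a hypothetical helper, and the sample list is abbreviated from the output shown.

```python
def split_analyzers(analyzers):
    """Return (prebuilt_ids, custom_ids) based on the 'prebuilt-' ID prefix."""
    prebuilt, custom = [], []
    for analyzer in analyzers:
        analyzer_id = analyzer.get("analyzerId", "")
        # Assumption: built-in analyzers always use the 'prebuilt-' prefix.
        if analyzer_id.startswith("prebuilt-"):
            prebuilt.append(analyzer_id)
        else:
            custom.append(analyzer_id)
    return prebuilt, custom


# Abbreviated stand-in for all_analyzers['value'].
sample = [
    {"analyzerId": "prebuilt-read", "scenario": "document"},
    {"analyzerId": "prebuilt-layout", "scenario": "document"},
    {"analyzerId": "my-invoice-analyzer", "scenario": "document"},
]
prebuilt_ids, custom_ids = split_analyzers(sample)
print("Prebuilt:", prebuilt_ids)
print("Custom:", custom_ids)
```

With the live response, pass `all_analyzers['value']` in place of `sample`.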


## Get analyzer details with id

Keep track of the analyzer ID you assign at creation time. You can use it later to look up the analyzer's full definition.

In [15]:
result = client.get_analyzer_detail_by_id(ANALYZER_ID)
print(json.dumps(result, indent=2))

{
  "analyzerId": "analyzer-management-sample-bff712bc-bcf4-4223-b47b-51318af4dba7",
  "description": "Sample call recording analytics",
  "createdAt": "2025-04-02T15:54:50Z",
  "lastModifiedAt": "2025-04-02T15:54:53Z",
  "config": {
    "locales": [
      "en-US"
    ],
    "returnDetails": true,
    "disableContentFiltering": false
  },
  "fieldSchema": {
    "fields": {
      "Summary": {
        "type": "string",
        "method": "generate",
        "description": "A one-paragraph summary"
      },
      "Topics": {
        "type": "array",
        "method": "generate",
        "description": "Top 5 topics mentioned",
        "items": {
          "type": "string"
        }
      },
      "Companies": {
        "type": "array",
        "method": "generate",
        "description": "List of companies mentioned",
        "items": {
          "type": "string"
        }
      },
      "People": {
        "type": "array",
        "method": "generate",
        "description": "List of peop
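The full definition is verbose, so a compact view of the field schema can be easier to scan. The sketch below walks `fieldSchema.fields` and reports each field's name, type, and method. `summarize_fields` is a hypothetical helper, and the sample dictionary copies only the first three fields from the output above.

```python
def summarize_fields(analyzer):
    """Return (name, type, method) tuples for each field in an analyzer definition."""
    fields = analyzer.get("fieldSchema", {}).get("fields", {})
    rows = []
    for name, spec in fields.items():
        field_type = spec.get("type", "?")
        if field_type == "array":
            # Show the element type for arrays, e.g. array[string].
            item_type = spec.get("items", {}).get("type", "?")
            field_type = f"array[{item_type}]"
        rows.append((name, field_type, spec.get("method", "?")))
    return rows


# Abbreviated stand-in for the dictionary returned by get_analyzer_detail_by_id.
sample = {
    "fieldSchema": {
        "fields": {
            "Summary": {"type": "string", "method": "generate"},
            "Topics": {"type": "array", "method": "generate", "items": {"type": "string"}},
            "Companies": {"type": "array", "method": "generate", "items": {"type": "string"}},
        }
    }
}
for name, field_type, method in summarize_fields(sample):
    print(f"{name:<10} {field_type:<14} {method}")
```

With the live response, pass `result` in place of `sample`.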

## Delete Analyzer
If you don't need an analyzer anymore, delete it with its id.

In [16]:
client.delete_analyzer(ANALYZER_ID)

INFO:python.content_understanding_client:Analyzer analyzer-management-sample-bff712bc-bcf4-4223-b47b-51318af4dba7 deleted.


<Response [204]>
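Each run of this notebook creates an analyzer with a fresh UUID suffix, so interrupted runs can leave samples behind. A hedged sketch of a cleanup helper follows. It reuses the two client calls shown in this notebook, `get_all_analyzers()` and `delete_analyzer()`. The function name `delete_analyzers_with_prefix` and the `dry_run` flag are hypothetical additions, and the response shape (`{"value": [...]}`) is assumed to match the listing output shown earlier.

```python
def delete_analyzers_with_prefix(client, prefix, dry_run=True):
    """Delete every analyzer whose ID starts with prefix; return the matching IDs.

    With dry_run=True (the default) nothing is deleted, so the matches
    can be reviewed first.
    """
    response = client.get_all_analyzers()
    matching = [
        analyzer["analyzerId"]
        for analyzer in response.get("value", [])
        if analyzer.get("analyzerId", "").startswith(prefix)
    ]
    if not dry_run:
        for analyzer_id in matching:
            client.delete_analyzer(analyzer_id)
    return matching
```

Calling `delete_analyzers_with_prefix(client, "analyzer-management-sample-")` only lists the leftover samples. Pass `dry_run=False` to remove them.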