# Manage Analyzers in Your Resource

This notebook demonstrates how to create a simple analyzer and manage its lifecycle.

Source: https://github.com/Azure-Samples/azure-ai-content-understanding-python.git

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. Until the official Content Understanding SDK is released, this class can be treated as a lightweight SDK.


In [24]:
import logging
import json
import os
import sys
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

In [25]:
load_dotenv(dotenv_path='../infra/credentials.env', override=True)

True

In [26]:
logging.basicConfig(level=logging.INFO)

In [27]:
AZURE_AI_ENDPOINT = os.getenv("AZURE_CU_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_CU_API_VERSION", "2024-12-01-preview")
print(f"Current Azure Content Understanding endpoint: {AZURE_AI_ENDPOINT}")
print(f"Current Azure Content Understanding API version: {AZURE_AI_API_VERSION}")

Current Azure Content Understanding endpoint: https://ai-aifoundryupskillinghub687267079310.services.ai.azure.com/
Current Azure Content Understanding API version: 2024-12-01-preview
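
If `AZURE_CU_ENDPOINT` is missing from the environment, `os.getenv` returns `None` and the problem only surfaces later, when the client is used. A minimal sketch of an early check follows; the helper name `load_cu_settings` is hypothetical and not part of the sample repository, and the default API version simply mirrors the one used in the cell above.

```python
import os


def load_cu_settings(env=None):
    """Return (endpoint, api_version); raise early if the endpoint is unset.

    Hypothetical helper for illustration only. `env` can be any mapping,
    which makes the function easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    endpoint = env.get("AZURE_CU_ENDPOINT")
    if not endpoint:
        raise ValueError("AZURE_CU_ENDPOINT is not set; check your .env file.")
    # Same fallback as the notebook cell above.
    api_version = env.get("AZURE_CU_API_VERSION", "2024-12-01-preview")
    return endpoint, api_version
```

Calling `load_cu_settings()` right after `load_dotenv(...)` would fail fast with a readable message instead of an authentication or URL error further down.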


In [28]:
# only if necessary, add the parent directory to the path to use shared modules
# parent_dir = Path(Path.cwd()).parent
# sys.path.append(str(parent_dir))

# import the utility class AzureContentUnderstandingClient, which is a wrapper around the Azure Content Understanding REST API client
from python.content_understanding_client import AzureContentUnderstandingClient

In [29]:
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

INFO:azure.identity._credentials.environment:No environment configuration found.
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use Azure ML managed identity


In [30]:
client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    x_ms_useragent="azure-ai-content-understanding-python/analyzer_management", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.
)

INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://127.0.0.1:46808/MSI/auth?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'secret': 'REDACTED'
    'User-Agent': 'azsdk-python-identity/1.21.0 Python/3.10.16 (Linux-5.15.0-1073-azure-x86_64-with-glibc2.31)'
No body was attached to the request
INFO:azure.core.pipeline.policies.http_logging_policy:Response status: 200
Response headers:
    'Content-Type': 'application/json'
    'Date': 'Thu, 10 Apr 2025 15:11:16 GMT'
    'Transfer-Encoding': 'chunked'
INFO:azure.identity._credentials.chained:DefaultAzureCredential acquired a token from ManagedIdentityCredential


## Create a simple analyzer
We first create an analyzer from a template to extract invoice fields.

In [40]:
import uuid

ANALYZER_TEMPLATE = "analyzer_templates/my_invoice_analyzer_template.json"  # alternative template: call_recording_analytics.json
ANALYZER_ID = "my_invoice_analyzer"
# ANALYZER_ID = "analyzer-management-sample-" + str(uuid.uuid4())

response = client.begin_create_analyzer(ANALYZER_ID, analyzer_template_path=ANALYZER_TEMPLATE)
result = client.poll_result(response)

print(json.dumps(result, indent=2))

INFO:python.content_understanding_client:Analyzer my_invoice_analyzer create request accepted.
INFO:python.content_understanding_client:Request 196099c2-656b-41da-8306-13c5b66fde82 in progress ...
INFO:python.content_understanding_client:Request 196099c2-656b-41da-8306-13c5b66fde82 in progress ...
INFO:python.content_understanding_client:Request 196099c2-656b-41da-8306-13c5b66fde82 in progress ...


{
  "id": "196099c2-656b-41da-8306-13c5b66fde82",
  "status": "Succeeded",
  "result": {
    "analyzerId": "my_invoice_analyzer",
    "description": "Sample invoice analyzer",
    "createdAt": "2025-04-10T15:18:00Z",
    "lastModifiedAt": "2025-04-10T15:18:06Z",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": true,
      "enableBarcode": false,
      "enableFormula": false,
      "disableContentFiltering": false
    },
    "fieldSchema": {
      "fields": {
        "VendorName": {
          "type": "string",
          "method": "extract",
          "description": "Vendor issuing the invoice"
        },
        "Items": {
          "type": "array",
          "method": "extract",
          "items": {
            "type": "object",
            "properties": {
              "Description": {
                "type": "string",
                "method": "extract",
                "description": "Description of the item"
              },
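
The log lines above show the create request being accepted and then checked repeatedly until its status becomes `Succeeded`. How `poll_result` is implemented is not shown in this notebook; the following is only a generic sketch of that polling pattern. The function name `poll_until_done`, the `get_status` callable, and the `Failed` status value are assumptions made for illustration.

```python
import time


def poll_until_done(get_status, timeout_seconds=120.0, interval_seconds=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it reports a terminal status or time runs out.

    get_status is a hypothetical zero-argument callable returning a dict
    with a "status" key, shaped like the operation payload printed above.
    """
    deadline = clock() + timeout_seconds
    while True:
        payload = get_status()
        status = str(payload.get("status", "")).lower()
        if status == "succeeded":
            return payload
        if status == "failed":  # assumed terminal failure value
            raise RuntimeError(f"Operation failed: {payload}")
        if clock() >= deadline:
            raise TimeoutError(f"Operation still '{status}' after {timeout_seconds}s")
        sleep(interval_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply.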
              

## List all analyzers created in your resource

After the analyzer is created, you can list all analyzers in your resource, including the prebuilt ones, to confirm that it is present.

In [37]:
all_analyzers = client.get_all_analyzers()
print(f"Number of analyzers in your resource: {len(all_analyzers['value'])}")
print(f"First 3 analyzer details: {json.dumps(all_analyzers['value'][:3], indent=2)}")

Number of analyzers in your resource: 4
First 3 analyzer details: [
  {
    "analyzerId": "prebuilt-read",
    "description": "Extract content elements such as words, barcodes, and formulas from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": false,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "prebuilt-layout",
    "description": "Extract various content and layout elements such as words, paragraphs, and tables from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": true,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "analyzer-management-sample-a706ced3-c0c1-4603-b1bc-d560c3fec847",
    "description": "Sample invoice analyzer",
    "createdAt": "2025-04-10T15:15:27Z",
    "las

In [38]:
import pandas as pd
df_analyzers = pd.json_normalize(all_analyzers['value'])
df_analyzers_sorted = df_analyzers.sort_values(by='createdAt', ascending=True)
df_analyzers_sorted

Unnamed: 0,analyzerId,description,warnings,status,scenario,config.returnDetails,config.enableOcr,config.enableLayout,config.enableBarcode,config.enableFormula,...,fieldSchema.fields.People.items.properties.Role.description,fieldSchema.fields.Sentiment.type,fieldSchema.fields.Sentiment.method,fieldSchema.fields.Sentiment.description,fieldSchema.fields.Sentiment.enum,fieldSchema.fields.Categories.type,fieldSchema.fields.Categories.method,fieldSchema.fields.Categories.description,fieldSchema.fields.Categories.items.type,fieldSchema.fields.Categories.items.enum
3,my_invoice_analyzer,Sample call recording analytics,[],ready,callCenter,True,,,,,...,Person's title/role,string,classify,Overall sentiment,"[Positive, Neutral, Negative]",array,classify,List of relevant categories,string,"[Agriculture, Business, Finance, Health, Insur..."
2,analyzer-management-sample-a706ced3-c0c1-4603-...,Sample invoice analyzer,[],ready,document,True,True,True,False,False,...,,,,,,,,,,
0,prebuilt-read,"Extract content elements such as words, barcod...",[],undefined,document,True,True,False,False,False,...,,,,,,,,,,
1,prebuilt-layout,Extract various content and layout elements su...,[],undefined,document,True,True,True,False,False,...,,,,,,,,,,
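
The listing mixes prebuilt analyzers with the ones you created. A small sketch for separating the two is shown below. It assumes the response has the `{"value": [...]}` shape printed above and that prebuilt analyzers can be recognized by the `prebuilt-` prefix seen in this output; the helper name `split_analyzers` is hypothetical.

```python
def split_analyzers(list_response):
    """Split analyzer ids into (prebuilt, custom) lists.

    Assumes ids starting with "prebuilt-" are built in, as in the
    output above; that convention is an assumption, not a documented rule.
    """
    prebuilt, custom = [], []
    for analyzer in list_response.get("value", []):
        analyzer_id = analyzer.get("analyzerId", "")
        if analyzer_id.startswith("prebuilt-"):
            prebuilt.append(analyzer_id)
        else:
            custom.append(analyzer_id)
    return prebuilt, custom


# Example with a hand-written response in the same shape as the real one.
sample = {"value": [{"analyzerId": "prebuilt-read"},
                    {"analyzerId": "my_invoice_analyzer"}]}
prebuilt_ids, custom_ids = split_analyzers(sample)
print(prebuilt_ids, custom_ids)
```

With the real client, `split_analyzers(all_analyzers)` would operate on the same dictionary that the cell above prints.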


## Get analyzer details with id

Keep track of the analyzer id you used at creation. You can use it later to look up the detailed analyzer definition.

In [12]:
result = client.get_analyzer_detail_by_id(ANALYZER_ID)
print(json.dumps(result, indent=2))

{
  "analyzerId": "analyzer-management-sample-3d7b2f34-bde3-44de-bddf-ab506efd01e1",
  "description": "Sample call recording analytics",
  "createdAt": "2025-04-10T14:49:52Z",
  "lastModifiedAt": "2025-04-10T14:49:52Z",
  "config": {
    "locales": [
      "en-US"
    ],
    "returnDetails": true,
    "disableContentFiltering": false
  },
  "fieldSchema": {
    "fields": {
      "Summary": {
        "type": "string",
        "method": "generate",
        "description": "A one-paragraph summary"
      },
      "Topics": {
        "type": "array",
        "method": "generate",
        "description": "Top 5 topics mentioned",
        "items": {
          "type": "string"
        }
      },
      "Companies": {
        "type": "array",
        "method": "generate",
        "description": "List of companies mentioned",
        "items": {
          "type": "string"
        }
      },
      "People": {
        "type": "array",
        "method": "generate",
        "description": "List of peop
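
The full definition can be long. To get a quick overview of the fields an analyzer defines, you can walk `fieldSchema.fields` in the returned dictionary. The sketch below assumes the structure shown in the output above; `summarize_fields` is a hypothetical helper, not part of the client class.

```python
def summarize_fields(analyzer):
    """Return a list of (field name, type, method) tuples for an analyzer.

    Assumes the analyzer dict has the fieldSchema -> fields layout shown
    above; missing keys yield an empty list or None entries, not errors.
    """
    fields = analyzer.get("fieldSchema", {}).get("fields", {})
    return [(name, spec.get("type"), spec.get("method"))
            for name, spec in fields.items()]


# Example with a trimmed, hand-written definition in the same shape.
sample_analyzer = {
    "analyzerId": "demo",
    "fieldSchema": {"fields": {
        "Summary": {"type": "string", "method": "generate"},
        "Topics": {"type": "array", "method": "generate"},
    }},
}
for name, field_type, method in summarize_fields(sample_analyzer):
    print(f"{name}: {field_type} ({method})")
```

Applied to the notebook's `result`, this would list only top-level fields; nested `items` and `properties` are not expanded here.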

## Delete Analyzer
If you no longer need an analyzer, delete it by its id.

In [13]:
client.delete_analyzer(ANALYZER_ID)

INFO:python.content_understanding_client:Analyzer analyzer-management-sample-3d7b2f34-bde3-44de-bddf-ab506efd01e1 deleted.


<Response [204]>

In [39]:
client.delete_analyzer("my_invoice_analyzer")

INFO:python.content_understanding_client:Analyzer my_invoice_analyzer deleted.


<Response [204]>
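
Repeated runs with the commented-out `uuid`-based id leave analyzers named `analyzer-management-sample-...` behind, as the earlier listing shows. A minimal sketch for picking those out for cleanup follows; the helper name `select_by_prefix` is hypothetical, and the prefix is taken from the ids printed earlier in this notebook.

```python
def select_by_prefix(list_response, prefix="analyzer-management-sample-"):
    """Return the ids in a list response that start with the given prefix."""
    return [
        analyzer["analyzerId"]
        for analyzer in list_response.get("value", [])
        if analyzer.get("analyzerId", "").startswith(prefix)
    ]


# Hypothetical usage with the notebook's client (not executed here):
# for analyzer_id in select_by_prefix(client.get_all_analyzers()):
#     client.delete_analyzer(analyzer_id)
```

Reviewing the selected ids before deleting them is advisable, since deletion cannot be undone from this notebook.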