# Manage Analyzers in Your Resource

This notebook demonstrates how to create a simple analyzer and manage its lifecycle.

Source: https://github.com/Azure-Samples/azure-ai-content-understanding-python.git

## Create Azure AI Content Understanding Client

> The [AzureContentUnderstandingClient](python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API.

Until the official Content Understanding SDK is released, this class can be regarded as a lightweight SDK.


In [2]:
import logging
import json
import os
import sys
from pathlib import Path
from dotenv import find_dotenv, load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

In [19]:
load_dotenv(dotenv_path='../infra/credentials.env', override=True)

True

In [4]:
logging.basicConfig(level=logging.INFO)

In [5]:
AZURE_AI_ENDPOINT = os.getenv("AZURE_CU_ENDPOINT")
AZURE_AI_API_VERSION = os.getenv("AZURE_CU_API_VERSION", "2024-12-01-preview")
print(f"Current Azure Content Understanding endpoint: {AZURE_AI_ENDPOINT}")
print(f"Current Azure Content Understanding API version: {AZURE_AI_API_VERSION}")

Current Azure Content Understanding endpoint: https://ai-aifoundryupskillinghub687267079310.services.ai.azure.com/
Current Azure Content Understanding API version: 2024-12-01-preview
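If `AZURE_CU_ENDPOINT` is missing from the `.env` file, `os.getenv` returns `None` and the problem may only surface later, in a less obvious place. A minimal sketch of an early check follows. `require_env` is a hypothetical helper introduced for illustration; it is not part of the sample repository.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of the sample repository.
    """
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable and an empty string.
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file."
        )
    return value
```

With this helper, the endpoint lookup above could be written as `AZURE_AI_ENDPOINT = require_env("AZURE_CU_ENDPOINT")`, while the API version can keep its default via `os.getenv`.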


In [6]:
# only if necessary, add the parent directory to the path to use shared modules
# parent_dir = Path(Path.cwd()).parent
# sys.path.append(str(parent_dir))

# import the utility class AzureContentUnderstandingClient, which is a wrapper around the Azure Content Understanding REST API client
from python.content_understanding_client import AzureContentUnderstandingClient

In [7]:
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

INFO:azure.identity._credentials.environment:No environment configuration found.
INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use Azure ML managed identity


In [8]:
client = AzureContentUnderstandingClient(
    endpoint=AZURE_AI_ENDPOINT,
    api_version=AZURE_AI_API_VERSION,
    token_provider=token_provider,
    # This header is used for sample usage telemetry. Comment out the next line to opt out.
    x_ms_useragent="azure-ai-content-understanding-python/analyzer_management",
)

INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://127.0.0.1:46808/MSI/auth?api-version=REDACTED&resource=REDACTED'
Request method: 'GET'
Request headers:
    'secret': 'REDACTED'
    'User-Agent': 'azsdk-python-identity/1.21.0 Python/3.10.16 (Linux-5.15.0-1073-azure-x86_64-with-glibc2.31)'
No body was attached to the request
INFO:azure.core.pipeline.policies.http_logging_policy:Response status: 200
Response headers:
    'Content-Type': 'application/json'
    'Date': 'Tue, 15 Apr 2025 16:04:38 GMT'
    'Transfer-Encoding': 'chunked'
INFO:azure.identity._credentials.chained:DefaultAzureCredential acquired a token from ManagedIdentityCredential


## Create a simple analyzer
We first create an analyzer from a template to extract invoice fields.
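The template file itself is not shown in this notebook. As a rough sketch, the snippet below writes a template whose layout is reconstructed from the creation response printed in this section. The layout is an assumption: the real `my_invoice_analyzer_template.json` may contain more fields or settings, and the `scenario` key is inferred from the `scenario` value reported for other analyzers in the listing output. The file is written to a temporary directory so that it cannot overwrite the real template.

```python
import json
import tempfile
from pathlib import Path

# Assumed template layout, reconstructed from the creation response in this notebook.
# Only fields whose full definition is visible in that response are included.
invoice_template = {
    "description": "Invoice analyzer template",
    "scenario": "document",  # assumption, see lead-in
    "config": {
        "returnDetails": False,
        "enableOcr": True,
        "enableLayout": True,
    },
    "fieldSchema": {
        "fields": {
            "InvoiceNr": {
                "type": "string",
                "method": "extract",
                "description": "The ID of the invoice",
            },
            "InvoiceDate": {
                "type": "date",
                "method": "extract",
                "description": "The date when the invoice was issued",
            },
            "DueDate": {
                "type": "date",
                "method": "extract",
                "description": "The invoice due date",
            },
        }
    },
}

# Hypothetical file name, chosen so the real template is left untouched.
template_path = Path(tempfile.gettempdir()) / "example_invoice_analyzer_template.json"
template_path.write_text(json.dumps(invoice_template, indent=2), encoding="utf-8")

field_count = len(invoice_template["fieldSchema"]["fields"])
print(f"Wrote {field_count} field definitions to {template_path}")
```

Each entry under `fields` pairs a field name with a type, an extraction method, and a natural-language description.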

In [21]:
import uuid

ANALYZER_TEMPLATE = "analyzer_templates/my_invoice_analyzer_template.json"
ANALYZER_ID = "my_invoice_20250415"
# Alternatively, generate a unique id on every run:
# ANALYZER_ID = "analyzer-management-sample-" + str(uuid.uuid4())

response = client.begin_create_analyzer(ANALYZER_ID, analyzer_template_path=ANALYZER_TEMPLATE)
result = client.poll_result(response)

print(json.dumps(result, indent=2))

INFO:python.content_understanding_client:Analyzer my_invoice_20250415 create request accepted.
INFO:python.content_understanding_client:Request 93c6127b-92ca-46ec-9aa6-66ab7357935d in progress ...


{
  "id": "93c6127b-92ca-46ec-9aa6-66ab7357935d",
  "status": "Succeeded",
  "result": {
    "analyzerId": "my_invoice_20250415",
    "description": "Invoice analyzer template",
    "createdAt": "2025-04-15T16:11:27Z",
    "lastModifiedAt": "2025-04-15T16:11:28Z",
    "config": {
      "returnDetails": false,
      "enableOcr": true,
      "enableLayout": true,
      "enableBarcode": false,
      "enableFormula": false,
      "disableContentFiltering": false
    },
    "fieldSchema": {
      "fields": {
        "InvoiceNr": {
          "type": "string",
          "method": "extract",
          "description": "The ID of the invoice"
        },
        "InvoiceDate": {
          "type": "date",
          "method": "extract",
          "description": "The date when the invoice was issued"
        },
        "DueDate": {
          "type": "date",
          "method": "extract",
          "description": "The invoice due date"
        },
        "Company": {
          "type": "string",
      

## List all analyzers created in your resource

After the analyzer is successfully created, we can use it to analyze our input files. It also appears in the list of analyzers in your resource, together with the prebuilt analyzers.

In [17]:
all_analyzers = client.get_all_analyzers()
print(f"Number of analyzers in your resource: {len(all_analyzers['value'])}")
print(f"First 3 analyzer details: {json.dumps(all_analyzers['value'][:3], indent=2)}")

Number of analyzers in your resource: 11
First 3 analyzer details: [
  {
    "analyzerId": "prebuilt-read",
    "description": "Extract content elements such as words, barcodes, and formulas from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": false,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "prebuilt-layout",
    "description": "Extract various content and layout elements such as words, paragraphs, and tables from documents.",
    "config": {
      "returnDetails": true,
      "enableOcr": true,
      "enableLayout": true,
      "enableBarcode": false,
      "enableFormula": false
    },
    "status": "undefined",
    "scenario": "document"
  },
  {
    "analyzerId": "analyzer-management-sample-a706ced3-c0c1-4603-b1bc-d560c3fec847",
    "description": "Sample invoice analyzer",
    "createdAt": "2025-04-10T15:15:27Z",
    "la

In [18]:
import pandas as pd
df_analyzers = pd.json_normalize(all_analyzers['value'])
df_analyzers_sorted = df_analyzers.sort_values(by='createdAt', ascending=False)
df_analyzers_sorted[['analyzerId', 'createdAt']]

| index | analyzerId | createdAt |
|---|---|---|
| 8 | my_invoice_20250415 | 2025-04-15T16:08:50Z |
| 10 | my_invoice_analyzer_elisa | 2025-04-11T10:13:32Z |
| 4 | content-understanding-search-sample-4d23a4ef-c... | 2025-04-10T16:42:26Z |
| 5 | content-understanding-search-sample-52bb9c5b-9... | 2025-04-10T16:42:07Z |
| 7 | content-understanding-search-sample-c189f090-8... | 2025-04-10T16:37:34Z |
| 3 | content-understanding-search-sample-06f395bb-2... | 2025-04-10T15:59:37Z |
| 6 | content-understanding-search-sample-b4d7ed6b-7... | 2025-04-10T15:41:08Z |
| 9 | my_invoice_analyzer | 2025-04-10T15:18:00Z |
| 2 | analyzer-management-sample-a706ced3-c0c1-4603-... | 2025-04-10T15:15:27Z |
| 0 | prebuilt-read | |
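
Prebuilt analyzers such as `prebuilt-read` have no `createdAt` value, which is why they end up at the bottom of the sorted output. The same split can be sketched in plain Python, assuming (as the listing output suggests) that only custom analyzers carry a `createdAt` timestamp. `split_analyzers` is a hypothetical helper, and the sample data is abbreviated from the output above.

```python
def split_analyzers(analyzers):
    """Split a list of analyzer dicts into (custom, prebuilt).

    Assumption: only custom analyzers have a non-empty "createdAt" value.
    Custom analyzers are returned newest first.
    """
    custom = [a for a in analyzers if a.get("createdAt")]
    prebuilt = [a for a in analyzers if not a.get("createdAt")]
    # Timestamps in the same ISO 8601 UTC format sort correctly as strings.
    custom.sort(key=lambda a: a["createdAt"], reverse=True)
    return custom, prebuilt


# Abbreviated sample shaped like all_analyzers["value"].
sample = [
    {"analyzerId": "prebuilt-read"},
    {"analyzerId": "my_invoice_analyzer", "createdAt": "2025-04-10T15:18:00Z"},
    {"analyzerId": "my_invoice_20250415", "createdAt": "2025-04-15T16:08:50Z"},
]

custom, prebuilt = split_analyzers(sample)
print("custom:", [a["analyzerId"] for a in custom])
print("prebuilt:", [a["analyzerId"] for a in prebuilt])
```

In the notebook, the call would be `split_analyzers(all_analyzers["value"])`.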


## Get analyzer details with id

Keep track of the analyzer id you chose at creation time. You can use it afterwards to look up the detailed analyzer definition.

In [None]:
result = client.get_analyzer_detail_by_id(ANALYZER_ID)
print(json.dumps(result, indent=2))
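
The full definition can be long, so a compact summary of the field schema is often easier to read. The sketch below assumes that `get_analyzer_detail_by_id` returns a dict shaped like the `result` object in the creation response earlier in this notebook; this cell has no recorded output, so that shape is not confirmed here. `summarize_fields` is a hypothetical helper.

```python
def summarize_fields(analyzer_detail):
    """Return (name, type, method) tuples for each field in the schema.

    Assumes the dict has the "fieldSchema" -> "fields" layout seen in the
    creation response; missing keys yield an empty list instead of an error.
    """
    fields = analyzer_detail.get("fieldSchema", {}).get("fields", {})
    return [
        (name, spec.get("type"), spec.get("method"))
        for name, spec in fields.items()
    ]


# Sample abbreviated from the creation response in this notebook.
sample_detail = {
    "analyzerId": "my_invoice_20250415",
    "fieldSchema": {
        "fields": {
            "InvoiceNr": {"type": "string", "method": "extract"},
            "InvoiceDate": {"type": "date", "method": "extract"},
        }
    },
}

for name, field_type, method in summarize_fields(sample_detail):
    print(f"{name}: type={field_type}, method={method}")
```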

## Delete Analyzer
If you don't need an analyzer anymore, delete it with its id.

In [20]:
client.delete_analyzer(ANALYZER_ID)

INFO:python.content_understanding_client:Analyzer my_invoice_20250415 deleted.


<Response [204]>

In [None]:
client.delete_analyzer("my_invoice_analyzer")
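
Sample runs tend to leave many analyzers behind, as the listing earlier in this notebook shows. A minimal sketch for deleting every analyzer whose id starts with a given prefix follows. It relies only on `get_all_analyzers` and `delete_analyzer`, the two client methods used in this notebook. `delete_analyzers_with_prefix` and `FakeClient` are hypothetical names: the fake stands in for the real client so that the sketch runs without an Azure resource, and its ids are made up.

```python
def delete_analyzers_with_prefix(client, prefix):
    """Delete all analyzers whose id starts with prefix; return the deleted ids.

    Prebuilt analyzers are skipped as a safety measure.
    """
    deleted = []
    for analyzer in client.get_all_analyzers()["value"]:
        analyzer_id = analyzer["analyzerId"]
        if analyzer_id.startswith("prebuilt-"):
            continue
        if analyzer_id.startswith(prefix):
            client.delete_analyzer(analyzer_id)
            deleted.append(analyzer_id)
    return deleted


class FakeClient:
    """In-memory stand-in for the real client, for illustration only."""

    def __init__(self, analyzer_ids):
        self._ids = list(analyzer_ids)

    def get_all_analyzers(self):
        # Mirrors the {"value": [...]} shape used in this notebook.
        return {"value": [{"analyzerId": i} for i in self._ids]}

    def delete_analyzer(self, analyzer_id):
        self._ids.remove(analyzer_id)


fake = FakeClient([
    "prebuilt-read",
    "analyzer-management-sample-1",
    "analyzer-management-sample-2",
    "my_invoice_analyzer",
])
deleted = delete_analyzers_with_prefix(fake, "analyzer-management-sample-")
remaining = [a["analyzerId"] for a in fake.get_all_analyzers()["value"]]
print("deleted:", deleted)
print("remaining:", remaining)
```

Deletion cannot be undone, so it is worth printing the matching ids and reviewing them before passing the real `client`.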