#### https://learn.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer?view=azure-python

# Azure Form Recognizer Document Analysis Script

This script sends a document (e.g., a PDF) to the Azure Form Recognizer API and analyzes it with a custom-trained model.

## Prerequisites

1. **Azure Account**: Ensure you have an active Azure account.
2. **Azure Form Recognizer Resource**: Create a Form Recognizer resource in the Azure portal and obtain the:
   - Endpoint URL
   - API Key
3. **Custom Model**: Train and deploy a custom model in Azure Form Recognizer and note the model ID.
4. **Python SDK**: Install the Azure Form Recognizer Python SDK:
   ```bash
   pip install azure-ai-formrecognizer
   ```

```python
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# Azure Form Recognizer setup
endpoint = "https://fin-tech-ai2.cognitiveservices.azure.com/"  # Replace with your resource endpoint
api_key = "<your-api-key>"  # Replace with your key; never commit a real key
azure_model_id = "adani1.1"  # Replace with your custom model ID

document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(api_key),
)

# Path to the PDF document (raw string so backslashes are not treated as escapes)
pdf_path = r"C:\Users\ASUS\Desktop\split_pdfs\checklist\c6.pdf"  # Replace with the actual path to your PDF file

# Submit the document and wait for the long-running analysis to finish
with open(pdf_path, "rb") as f:
    poller = document_analysis_client.begin_analyze_document(azure_model_id, f)
    result = poller.result()

print(type(result))
# <class 'azure.ai.formrecognizer._models.AnalyzeResult'>
```
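
Credentials are safer kept out of the source file. A minimal sketch of reading the endpoint, key, and model ID from environment variables instead. The variable names `FORM_RECOGNIZER_ENDPOINT`, `FORM_RECOGNIZER_KEY`, and `FORM_RECOGNIZER_MODEL_ID` and the helper `load_settings` are assumptions chosen for illustration, not names the SDK requires.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; use whatever your environment defines.
REQUIRED_VARS = (
    "FORM_RECOGNIZER_ENDPOINT",
    "FORM_RECOGNIZER_KEY",
    "FORM_RECOGNIZER_MODEL_ID",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return endpoint, key, and model ID, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "endpoint": env["FORM_RECOGNIZER_ENDPOINT"],
        "api_key": env["FORM_RECOGNIZER_KEY"],
        "model_id": env["FORM_RECOGNIZER_MODEL_ID"],
    }
```

The returned values can be passed to `AzureKeyCredential(...)` and `DocumentAnalysisClient(...)` in place of the literals in the script.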


`AnalyzeResult` is part of the Azure Form Recognizer Python SDK. The output above shows its internal module path, `azure.ai.formrecognizer._models.AnalyzeResult`, but the class can be imported directly from `azure.ai.formrecognizer`. It represents the **result of an analyzed document** and gives structured access to everything the analysis extracted.

### Overview of `AnalyzeResult`

The `AnalyzeResult` class contains attributes that store details of the analyzed document, such as text content, tables, key-value pairs, and styles. `begin_analyze_document()` returns a poller for the long-running operation, and calling `result()` on that poller returns the `AnalyzeResult`.

---

### Key Attributes of `AnalyzeResult`

1. **`model_id`**:
   - The ID of the custom model used for analysis.

2. **`content`**:
   - The full concatenated text content of the document.

3. **`pages`**:
   - A list of `DocumentPage` objects. Each page describes its own layout, including:
     - **Lines** (`lines`): Extracted lines of text.
     - **Words** (`words`): Individual words, each with a bounding polygon and a confidence score.
     - **Selection Marks** (`selection_marks`): Marked fields like checkboxes.
   - Tables are not stored per page. They are listed in the top-level `tables` attribute, and each table's bounding regions record the page it appears on.

4. **`documents`**:
   - A list of `AnalyzedDocument` objects. Each object represents structured data, typically from forms or invoices.
     - **Fields**: Key-value pairs extracted from the document, represented as a dictionary where:
       - Key: Field name.
       - Value: A `DocumentField` object with the extracted value, confidence, and bounding regions.

5. **`tables`**:
   - A list of `DocumentTable` objects representing tables in the document, each reporting its size as `row_count` and `column_count`.
     - **Cells**: Information about each cell, including its row/column index and content.

6. **`styles`**:
   - A list of detected styles in the document, such as whether the text is handwritten or printed.

7. **`key_value_pairs`** (Optional):
   - A list of extracted key-value pairs in the document. Depending on the model and service API version, this list may be empty or `None`.

8. **`languages`**:
   - Detected languages in the document.
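
Several of these attributes point back into `content` through spans. A minimal sketch of using `styles` to pull out the handwritten text, assuming each span's `offset` and `length` index directly into `result.content` as Python string positions. The helper name `handwritten_text` and the stand-in object are hypothetical, and the stand-in exists only so the sketch runs without calling the service.

```python
from types import SimpleNamespace


def handwritten_text(result) -> list:
    """Collect the text of every span that a style marks as handwritten."""
    snippets = []
    for style in result.styles or []:
        if not style.is_handwritten:
            continue
        for span in style.spans:
            snippets.append(result.content[span.offset:span.offset + span.length])
    return snippets


# Stand-in with the same attribute names as AnalyzeResult (made-up data).
fake_result = SimpleNamespace(
    content="Name: Jane Doe",
    styles=[
        SimpleNamespace(
            is_handwritten=True,
            spans=[SimpleNamespace(offset=6, length=8)],
        )
    ],
)
print(handwritten_text(fake_result))  # prints ['Jane Doe']
```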

---

### Example Usage

You can access the attributes of an `AnalyzeResult` object like this:

```python
# Print model ID used for analysis
print("Model ID:", result.model_id)

# Access content of the entire document
print("Document Content:", result.content)

# Iterate over pages (lines carry no confidence score; only words do)
for page in result.pages:
    print(f"Page {page.page_number}:")
    print("Lines of text:")
    for line in page.lines:
        print(f"  {line.content}")

# Access tables (size is exposed as row_count and column_count)
for table in result.tables:
    print(f"Table with {table.row_count} rows and {table.column_count} columns:")
    for cell in table.cells:
        print(f"  Cell ({cell.row_index}, {cell.column_index}): {cell.content}")

# Access the fields the model extracted for each analyzed document
if result.documents:
    for doc in result.documents:
        print(f"Document type: {doc.doc_type}")
        for field_name, field in doc.fields.items():
            print(f"  Field '{field_name}': {field.value} (confidence: {field.confidence})")
```
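
A flat list of cells is awkward for further processing. Since `DocumentTable` reports its size as `row_count` and `column_count`, and each cell carries `row_index`, `column_index`, and `content`, the cells can be arranged into a 2D grid. A minimal sketch, where `table_to_grid` is a hypothetical helper and the stand-in table holds made-up data so the code runs offline:

```python
from types import SimpleNamespace


def table_to_grid(table) -> list:
    """Place each cell's text at its (row_index, column_index) position."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in with the same attribute names as DocumentTable (made-up data).
fake_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Qty"),
        SimpleNamespace(row_index=1, column_index=0, content="Bolt"),
        SimpleNamespace(row_index=1, column_index=1, content="4"),
    ],
)
grid = table_to_grid(fake_table)
print(grid)  # prints [['Item', 'Qty'], ['Bolt', '4']]
```

Cells that span several rows or columns are written only at their top-left position in this sketch; the remaining covered positions stay empty strings.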

---

### Practical Use Cases

- **Forms**: Extract structured data like names, dates, and amounts from forms.
- **Invoices/Receipts**: Retrieve line items, totals, and vendor details.
- **Tables**: Extract tabular data for further analysis.
- **Document Layouts**: Analyze the structure of documents, including headers, footers, and paragraphs.
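
For the forms and invoice cases, a common next step is to keep only the fields the model is reasonably sure about. A minimal sketch under stated assumptions: `confident_fields` is a hypothetical helper, the 0.8 threshold is an arbitrary illustration rather than a recommended value, and the stand-in document holds made-up data.

```python
from types import SimpleNamespace


def confident_fields(document, min_confidence: float = 0.8) -> dict:
    """Map field names to values, keeping fields at or above the threshold."""
    kept = {}
    for name, field in (document.fields or {}).items():
        if field is None:
            continue
        confidence = field.confidence
        if confidence is not None and confidence >= min_confidence:
            kept[name] = field.value
    return kept


# Stand-in with the same attribute names as AnalyzedDocument (made-up data).
fake_document = SimpleNamespace(
    doc_type="invoice",
    fields={
        "Total": SimpleNamespace(value=120.5, confidence=0.97),
        "Vendor": SimpleNamespace(value="Acme", confidence=0.41),
    },
)
print(confident_fields(fake_document))  # prints {'Total': 120.5}
```

Fields below the threshold are dropped here; in practice they might instead be routed to manual review.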

---

### Documentation

Refer to the official [Azure Form Recognizer SDK documentation](https://learn.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.analyzeresult?view=azure-python) for a detailed API reference.