## Azure Document Intelligence and Computer Vision Overview

Azure Document Intelligence, part of Azure's AI services, uses machine learning and computer vision to analyze and extract information from documents such as invoices, receipts, and tax forms. Document AI models such as **LayoutLMv3** from Microsoft Research illustrate the kind of technique involved: they are pre-trained on a document's text, layout, and image together, which makes them effective at capturing complex document structures and the relationships between text elements. This kind of modeling lets Azure Document Intelligence automate the tedious work of manually reviewing documents and entering data into structured formats by identifying key fields, extracting text, and recognizing entities.

Computer vision is a broader field of artificial intelligence that allows machines to interpret and understand visual inputs such as images or scanned documents. In the context of document intelligence, computer vision helps systems "read" documents, recognize patterns, and extract meaningful information from unstructured or semi-structured data sources, making it an essential tool for automating tasks such as document processing, compliance, and auditing.

In this section, we use Azure Document Intelligence to analyze and extract data from tax-related documents, automating the process of identifying document types and extracting relevant fields like wages, taxes, and filing statuses. Azure Document Intelligence can be explored further at [documentintelligence.ai.azure.com](https://documentintelligence.ai.azure.com).

Below is a breakdown of the steps we follow:

1. **Installing Required Packages**  
   We begin by installing the necessary packages, including `azure-ai-documentintelligence` and `xlsxwriter`, which are essential for document analysis and exporting the results to Excel.



In [1]:
!pip install azure-ai-documentintelligence==1.0.0b4 xlsxwriter



2. **Importing Libraries**  
   We import the classes needed to call the service: `DocumentIntelligenceClient` to analyze documents, `AzureKeyCredential` to authenticate, and the request and result models. `pandas`, which organizes and exports the extracted data, is imported later in step 8.
   
3. **Setting up Azure Credentials**  
   Azure Document Intelligence requires an endpoint and a key to authenticate and access the service. These credentials are obtained from the Azure portal and allow us to interact with the Azure API.

In [2]:
# Import libraries
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest, AnalyzeResult

# Endpoint and key copied from the Azure portal
endpoint = "https://aiballstate.cognitiveservices.azure.com/"
key = "Alex key"  # placeholder: replace with your own key and keep real keys out of shared notebooks
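
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. One common alternative is to read both values from environment variables. The sketch below is a minimal illustration, not part of the original notebook; the variable names `DOCUMENTINTELLIGENCE_ENDPOINT` and `DOCUMENTINTELLIGENCE_API_KEY` and the helper `load_credentials` are assumptions chosen for this example.

```python
import os


def load_credentials(env=os.environ):
    """Return (endpoint, key) read from environment variables.

    The variable names are assumptions made for this sketch; use whatever
    names your own environment defines.
    """
    names = ("DOCUMENTINTELLIGENCE_ENDPOINT", "DOCUMENTINTELLIGENCE_API_KEY")
    values = [env.get(name) for name in names]
    # Fail early with a clear message instead of sending an empty key to Azure
    missing = [name for name, value in zip(names, values) if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values[0], values[1]
```

With both variables set, `endpoint, key = load_credentials()` could replace the two assignments in the cell above.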

4. **Document URL**  
   The URL of a tax document (in PDF format) is provided. This document will be analyzed using Azure’s prebuilt models designed for specific document types, such as tax forms.

In [3]:
tax_url = "https://raw.githubusercontent.com/alexanderjwhite/2024-AI-Ball-State/main/us-tax.pdf"
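
Because the document is passed by URL, the service has to download it itself, so the link needs to be an absolute web address that is reachable from outside your machine. A quick local check can catch obvious mistakes (a local file path, a missing scheme) before a request is sent. This is only a heuristic sketch; the helper name `looks_like_public_pdf_url` is hypothetical, and it cannot tell whether the URL is actually reachable.

```python
from urllib.parse import urlparse


def looks_like_public_pdf_url(url):
    """Heuristic check: absolute http(s) URL whose path ends in .pdf."""
    parts = urlparse(url)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and parts.path.lower().endswith(".pdf")
    )


tax_url = "https://raw.githubusercontent.com/alexanderjwhite/2024-AI-Ball-State/main/us-tax.pdf"
print(looks_like_public_pdf_url(tax_url))       # absolute https URL ending in .pdf
print(looks_like_public_pdf_url("us-tax.pdf"))  # local file name, not a URL
```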

5. **Initializing the Azure Client**  
   We create an instance of the `DocumentIntelligenceClient` using the provided endpoint and credentials. This client is used to send requests to the Azure service, allowing us to analyze documents and receive extracted data.

In [4]:
document_intelligence_client = DocumentIntelligenceClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
    api_version="2024-07-31-preview",
)

6. **Running the Document Analysis**  
   The client sends a request to the Azure service to analyze the document with `prebuilt-tax.us`, a prebuilt model for US tax documents. The model identifies each form type in the file (W-2, 1040, and so on) and extracts its key fields. Analysis runs as a long-running operation, so the call returns a poller rather than the result itself.

In [5]:
poller = document_intelligence_client.begin_analyze_document(
    "prebuilt-tax.us", AnalyzeDocumentRequest(url_source=tax_url)
)

7. **Extracting Document Data**  
   Once the analysis is complete, the results are processed to extract document types and fields. For a W-2 form, for example, the extracted fields include wages, tax year, and federal income tax withheld. Each field carries a confidence score between 0 and 1 that estimates how reliable the extraction is.

In [6]:
tax_docs = poller.result()
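
Confidence scores are most useful when they drive a decision, for example routing uncertain fields to a person for review. The sketch below shows one way to do that, assuming fields behave like dictionaries with an optional `confidence` key (as they are accessed later in this section). The helper `fields_needing_review`, the 0.9 threshold, and the sample values are all assumptions for illustration, not service output.

```python
def fields_needing_review(fields, threshold=0.9):
    """Return (name, confidence) for fields below the threshold or without a score."""
    flagged = []
    for name, info in fields.items():
        confidence = info.get("confidence")
        if confidence is None or confidence < threshold:
            flagged.append((name, confidence))
    return flagged


# Hypothetical field data shaped like the service's output
sample_fields = {
    "TaxYear": {"valueString": "2018", "confidence": 0.999},
    "ControlNumber": {"valueString": "000086242", "confidence": 0.62},
    "Employee": {"type": "object"},  # no confidence reported
}

print(fields_needing_review(sample_fields))
# [('ControlNumber', 0.62), ('Employee', None)]
```

The right threshold depends on how costly an extraction error is for the task at hand.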

In [8]:
# tax_docs

In [9]:
for document in tax_docs.documents:
    print(document['docType'])

tax.us.w2
tax.us.1040.2022
tax.us.1040ScheduleD.2022
tax.us.1040ScheduleD.2022
tax.us.1098E
tax.us.1099DIV.2022
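
The document types printed above appear to follow a `tax.us.<form>[.<year>]` pattern, and the same type can occur more than once in a file. The sketch below splits each type into a form name and an optional year and counts the forms. The pattern is inferred from this output only, and `split_doc_type` is a hypothetical helper, so treat it as an illustration rather than a documented format.

```python
from collections import Counter

# Document types copied from the output above
doc_types = [
    "tax.us.w2",
    "tax.us.1040.2022",
    "tax.us.1040ScheduleD.2022",
    "tax.us.1040ScheduleD.2022",
    "tax.us.1098E",
    "tax.us.1099DIV.2022",
]


def split_doc_type(doc_type):
    """Split 'tax.us.<form>[.<year>]' into (form, year); year is None if absent."""
    form_parts = doc_type.split(".")[2:]
    year = None
    # Treat a trailing four-digit segment as the year, but never the form itself
    if len(form_parts) > 1 and form_parts[-1].isdigit() and len(form_parts[-1]) == 4:
        year = int(form_parts[-1])
        form_parts = form_parts[:-1]
    return ".".join(form_parts), year


form_counts = Counter(split_doc_type(t)[0] for t in doc_types)
print(form_counts)
```

Counting forms this way makes it obvious that Schedule D appears twice, which matters later when each document is written to its own Excel sheet.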


In [11]:
# Optional: uncomment to list the field names extracted for each document
# for document in tax_docs.documents:
#     print(document['docType'])
#     for field_name in document['fields'].keys():  # avoid reusing the name `key`, which holds the API key
#         print(field_name)


8. **Organizing Data in a DataFrame**  
   The extracted fields are structured into a pandas DataFrame, where each row contains a field name, its corresponding value, and the confidence score of the extraction. This allows us to easily manipulate and analyze the data.

In [12]:
import pandas as pd
fields = tax_docs.documents[0]['fields']
data = []
for field_name, field_info in fields.items():
    # Extract field value (handle different value types)
    if 'valueString' in field_info:
        value = field_info['valueString']
    elif 'valueNumber' in field_info:
        value = field_info['valueNumber']
    else:
        value = None

    # Extract confidence
    confidence = field_info.get('confidence', None)

    # Append to the data list
    data.append([field_name, value, confidence])

# Create DataFrame
df = pd.DataFrame(data, columns=['Field Name', 'Field Value', 'Confidence'])
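
The loop above keeps only string and number values, so fields of other types (dates, integers, nested objects, arrays) end up as `None`. The sketch below shows one possible way to cover more cases by recursing into `valueObject` and `valueArray` and falling back to the field's raw `content`. It works on plain dictionaries shaped like the service's JSON; the helper `flatten_fields` and the sample data are hypothetical.

```python
# Keys that hold a single scalar value in the service's field JSON
SCALAR_KEYS = (
    "valueString", "valueNumber", "valueInteger", "valueDate", "valueTime",
    "valuePhoneNumber", "valueBoolean", "valueSelectionMark", "valueCountryRegion",
)


def flatten_fields(fields, prefix=""):
    """Return (name, value, confidence) rows, expanding nested objects and arrays."""
    rows = []
    for name, info in fields.items():
        full_name = f"{prefix}{name}"
        if "valueObject" in info:
            # Nested fields get dotted names such as 'Employee.Name'
            rows.extend(flatten_fields(info["valueObject"], prefix=full_name + "."))
        elif "valueArray" in info:
            # Array items get indexed names such as 'StateTaxInfos[0]'
            for i, item in enumerate(info["valueArray"]):
                rows.extend(flatten_fields({f"{full_name}[{i}]": item}))
        else:
            # Use the first scalar value present, else the raw text content
            value = next((info[k] for k in SCALAR_KEYS if k in info), info.get("content"))
            rows.append((full_name, value, info.get("confidence")))
    return rows


# Hypothetical sample shaped like a W-2 result
sample = {
    "TaxYear": {"valueString": "2018", "confidence": 0.999},
    "Employee": {"valueObject": {
        "Name": {"valueString": "Jane Doe", "confidence": 0.98},
    }},
    "StateTaxInfos": {"valueArray": [
        {"valueObject": {"State": {"valueString": "IN", "confidence": 0.97}}},
    ]},
    "Address": {"valueAddress": {"city": "Muncie"}, "content": "1 Main St, Muncie", "confidence": 0.9},
}

for row in flatten_fields(sample):
    print(row)
```

The resulting rows have the same three columns as the DataFrame above, so they could be passed to `pd.DataFrame` in the same way.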

In [13]:
df

| | Field Name | Field Value | Confidence |
|---|---|---|---|
| 0 | TaxYear | 2018 | 0.999 |
| 1 | W2Copy | Copy 2 -- To Be Filed with Employee's State, C... | 0.999 |
| 2 | W2FormVariant | W-2 | 0.999 |
| 3 | ControlNumber | 000086242 | 0.999 |
| 4 | WagesTipsAndOtherCompensation | 37160.56 | 0.999 |
| 5 | FederalIncomeTaxWithheld | 3894.54 | 0.999 |
| 6 | SocialSecurityWages | 37160.56 | 0.999 |
| 7 | SocialSecurityTaxWithheld | 2303.95 | 0.999 |
| 8 | MedicareWagesAndTips | 37160.56 | 0.999 |
| 9 | MedicareTaxWithheld | 538.83 | 1.0 |
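
High confidence scores can be complemented with a simple arithmetic cross-check on the extracted numbers. On a W-2, the employee's Social Security tax is normally 6.2% of Social Security wages and Medicare tax is 1.45% of Medicare wages. The sketch below applies those standard rates to the values shown above; it ignores the Social Security wage base cap and the additional Medicare tax on high earners, which do not apply at this wage level. The helper `withholding_matches` and the one-cent tolerance are assumptions for this example.

```python
# Values copied from the extracted W-2 fields above
w2 = {
    "SocialSecurityWages": 37160.56,
    "SocialSecurityTaxWithheld": 2303.95,
    "MedicareWagesAndTips": 37160.56,
    "MedicareTaxWithheld": 538.83,
}

SOCIAL_SECURITY_RATE = 0.062  # employee share
MEDICARE_RATE = 0.0145        # employee share


def withholding_matches(wages, withheld, rate, tolerance=0.01):
    """True if the withheld amount equals wages * rate to within the tolerance."""
    return abs(wages * rate - withheld) <= tolerance


print(withholding_matches(w2["SocialSecurityWages"], w2["SocialSecurityTaxWithheld"], SOCIAL_SECURITY_RATE))
print(withholding_matches(w2["MedicareWagesAndTips"], w2["MedicareTaxWithheld"], MEDICARE_RATE))
```

Both checks pass for this document: 37160.56 × 0.062 ≈ 2303.95 and 37160.56 × 0.0145 ≈ 538.83.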


9. **Saving Data to an Excel File**  
   pandas writes the extracted data to an Excel file using `xlsxwriter` as its engine. The fields of each document go to a sheet named after its document type, which keeps the extracted information organized and easy to find.

In [14]:
# Create an Excel writer; the context manager saves the file on exit
# (ExcelWriter.save() was deprecated and later removed from pandas)
used_sheet_names = set()
with pd.ExcelWriter('tax_documents.xlsx', engine='xlsxwriter') as excel_writer:
    # Iterate through each document in tax_docs.documents
    for doc in tax_docs.documents:
        # Extract document type, falling back to 'Unknown' if it is missing
        doc_type = doc.get('docType', 'Unknown')

        # Prepare data for the table
        data = []
        for field_name, field_info in (doc.get('fields') or {}).items():
            # Extract field value (only string and number values are handled here)
            if 'valueString' in field_info:
                value = field_info['valueString']
            elif 'valueNumber' in field_info:
                value = field_info['valueNumber']
            else:
                value = None

            # Extract confidence
            confidence = field_info.get('confidence', None)

            # Append to the data list
            data.append([field_name, value, confidence])

        # Create DataFrame
        df = pd.DataFrame(data, columns=['Field Name', 'Field Value', 'Confidence'])

        # Excel sheet names are limited to 31 characters and must be unique;
        # number repeated document types so they do not share a sheet
        sheet_name = doc_type[:31]
        counter = 2
        while sheet_name in used_sheet_names:
            suffix = f"_{counter}"
            sheet_name = doc_type[:31 - len(suffix)] + suffix
            counter += 1
        used_sheet_names.add(sheet_name)

        # Write each DataFrame to its own sheet in the Excel file
        df.to_excel(excel_writer, sheet_name=sheet_name, index=False)

# Output confirmation
print("Excel file with multiple document tabs has been created.")


Excel file with multiple document tabs has been created.




10. **Final Output**  
   Once the Excel file is created, a confirmation message is printed, indicating the successful completion of the process. The Excel file contains multiple tabs, each representing a different document type (W-2, 1040, etc.), with detailed field extractions for each.

This workflow shows how Azure Document Intelligence can automate document processing by extracting and organizing data from complex documents such as tax forms. Combining AI and computer vision in this way reduces manual data entry and produces structured output for further analysis, while the confidence scores indicate which fields may still need a manual check.