# 02. Azure AI Document Intelligence - Invoice Model

> https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/overview?view=doc-intel-4.0.0

## A. Create an AI Document Intelligence resource and set up the environment to run the notebook

**_Prerequisite_:** <br>

**AI Document Intelligence resource**: <br>
To create an AI Document Intelligence resource in your Azure subscription, follow the steps at https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/create-document-intelligence-resource?view=doc-intel-4.0.0


Open your newly created Document Intelligence resource in the Azure portal. On the **Keys and Endpoint** page, copy the **Key1** and **Endpoint** values and store them in a `credentials.env` file as `FORM_RECOGNIZER_KEY` and `FORM_RECOGNIZER_ENDPOINT`. The code cell in section C loads both values from that file.

**Environment:**

1. **AML workspace**: Use Python 3.10 or above, i.e. select **Python 3.10 - SDK v2** as the kernel in the AML notebook. <br>
2. **VS Code**: Select **Python 3.10** or above, or set up a virtual environment by following the steps at https://code.visualstudio.com/docs/python/environments

Skip this step if you have already done it.
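
A quick way to confirm that the selected kernel or interpreter meets the Python 3.10 requirement is to check `sys.version_info` in the first cell. This is a minimal sketch; the helper name `meets_minimum` is an illustrative assumption and not part of the original notebook.

```python
import sys

# Minimum interpreter version stated in the environment notes above
MIN_VERSION = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the running interpreter version and whether it is new enough
status = "OK" if meets_minimum() else "is older than 3.10"
print("Python", sys.version.split()[0], status)
```

If the check reports an older version, switch the kernel (AML) or the interpreter (VS Code) before continuing.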

## B. Install AI Doc Intelligence library

In [1]:
# Install the azure-ai-formrecognizer Python library and restart the kernel after installation
# python-dotenv is included because section C imports load_dotenv from it
# Skip this step if you have already installed these libraries
%pip install azure-ai-formrecognizer python-dotenv --upgrade --user

Note: you may need to restart the kernel to use updated packages.


## C. Setting up AI Doc Intelligence endpoint and key

In [1]:
# Import the os module for interacting with the operating system
import os
# Import for handling resource not found errors
from azure.core.exceptions import ResourceNotFoundError
# Import for authenticating with the Azure service
from azure.core.credentials import AzureKeyCredential
# Import for analysing forms
from azure.ai.formrecognizer import DocumentAnalysisClient, AnalyzeResult

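# Import for loading environment variables from a .env file
from dotenv import load_dotenv

# Load the key and endpoint from credentials.env
load_dotenv("credentials.env")
api_key = os.getenv("FORM_RECOGNIZER_KEY")
endpoint = os.getenv("FORM_RECOGNIZER_ENDPOINT")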

# Create a DocumentAnalysisClient instance
# This client is used to interact with the Azure Form Recognizer service
# It is initialized with your endpoint and key
form_recognizer_client = DocumentAnalysisClient(endpoint, AzureKeyCredential(api_key))
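
The cell above reads `FORM_RECOGNIZER_KEY` and `FORM_RECOGNIZER_ENDPOINT` with `os.getenv`, which returns `None` when a variable is missing, so a misnamed or absent `credentials.env` only surfaces later as a less obvious error. The sketch below shows one way to fail early with a clear message. The helper `read_required_env` and the placeholder values are assumptions for illustration and are not part of the original notebook or the Azure SDK.

```python
import os


def read_required_env(names, environ=os.environ):
    """Return {name: value} for each name, raising if any is missing or empty."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Usage with stand-in values (placeholders, not real credentials)
settings = read_required_env(
    ["FORM_RECOGNIZER_KEY", "FORM_RECOGNIZER_ENDPOINT"],
    environ={
        "FORM_RECOGNIZER_KEY": "<your-key>",
        "FORM_RECOGNIZER_ENDPOINT": "https://<your-resource>.cognitiveservices.azure.com/",
    },
)
print(sorted(settings))
```

In the notebook, calling `read_required_env([...])` without the `environ` argument after `load_dotenv("credentials.env")` would check the real environment instead of the stand-in dictionary.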

## D. Extract the Invoice Document

In [2]:
# Define the URL of the sample invoice document to analyze
# You can point this URL at your own invoice document, but make sure the service has access to it
invoiceUrl = "XXXX"

# Start the analysis of the invoice document using the prebuilt invoice model
# The result is a poller object that can be used to check the status of the operation
poller = form_recognizer_client.begin_analyze_document_from_url("prebuilt-invoice", invoiceUrl)

# Print the poller object. This is useful for debugging purposes to ensure the analysis has started correctly.
print(poller)

# Get the result of the analysis
result = poller.result()

# Print the documents in the result. This will display the analyzed information from the invoice document.
print(result.documents)

<azure.core.polling._poller.LROPoller object at 0x7f7ca069a830>
[AnalyzedDocument(doc_type=invoice, bounding_regions=[BoundingRegion(page_number=1, polygon=[Point(x=0.0, y=0.0), Point(x=8.5, y=0.0), Point(x=8.5, y=11.0), Point(x=0.0, y=11.0)])], spans=[DocumentSpan(offset=0, length=226)], fields={'CustomerAddress': DocumentField(value_type=address, value=AddressValue(house_number=1020, po_box=None, road=Enterprise Way
Sunnayvale, CA 87659, city=None, state=None, postal_code=None, country_region=None, street_address=1020 Enterprise Way
Sunnayvale, CA 87659, unit=None, city_district=None, state_district=None, suburb=None, house=None, level=None), content=1020 Enterprise Way
Sunnayvale, CA 87659, bounding_regions=[BoundingRegion(page_number=1, polygon=[Point(x=5.186, y=1.6901), Point(x=6.6615, y=1.6901), Point(x=6.6615, y=2.0482), Point(x=5.186, y=2.0482)])], spans=[DocumentSpan(offset=60, length=19), DocumentSpan(offset=97, length=20)], confidence=0.992), 'CustomerAddressRecipient': Docu
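
The raw `print(result.documents)` output is hard to read. Each analyzed document exposes a `fields` dictionary whose values carry `content` and `confidence` attributes, as the output above shows, so a small helper can turn it into a compact summary. This is a minimal sketch: `summarize_fields` is a hypothetical helper, and the sample data uses `SimpleNamespace` stand-ins rather than real SDK objects so that it runs without a service call. `ExampleLowConfidenceField` is an invented name for illustration only.

```python
from types import SimpleNamespace


def summarize_fields(fields, min_confidence=0.0):
    """Return (name, content, confidence) tuples, sorted by name,
    for fields whose confidence is at least min_confidence."""
    rows = []
    for name, field in fields.items():
        # Assumption: a missing confidence (None) is treated as 0.0
        confidence = field.confidence if field.confidence is not None else 0.0
        if confidence >= min_confidence:
            rows.append((name, field.content, confidence))
    return sorted(rows)


# Stand-in data shaped like the fields in the output above
sample_fields = {
    "CustomerAddress": SimpleNamespace(
        content="1020 Enterprise Way\nSunnayvale, CA 87659", confidence=0.992
    ),
    # Hypothetical field, included only to show the confidence filter
    "ExampleLowConfidenceField": SimpleNamespace(content="?", confidence=0.2),
}

for name, content, confidence in summarize_fields(sample_fields, min_confidence=0.5):
    print(f"{name}: {content!r} (confidence {confidence})")
```

With a real analysis result, the same helper could be called as `summarize_fields(result.documents[0].fields)`, assuming the result contains at least one document.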

## E. Extracted invoice insights / response in JSON format

In [3]:
# Import necessary libraries
# json for handling JSON data
# datetime and time for generating timestamps
# AzureJSONEncoder for serializing Python objects to JSON
# urlparse for parsing URLs
import json
import datetime
import time
from azure.core.serialization import AzureJSONEncoder
from urllib.parse import urlparse


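# Generate a unique filename from the current timestamp and the basename of the URL
# The .json extension marks the file as JSON, matching what is written to it below
timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d%H%M%S')
url_basename = os.path.splitext(os.path.basename(urlparse(invoiceUrl).path))[0]
filename = timestamp + "_" + url_basename + ".json"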

# Convert the result of the analysis (which is a model) to a dictionary
analyze_result_dict = result.to_dict()

# Open a new file with the generated filename and write the dictionary to it as JSON
# Use the AzureJSONEncoder to handle any types that aren't serializable by default
with open(filename, 'w') as f:
    json.dump(analyze_result_dict, f, cls=AzureJSONEncoder, indent=4)

# Convert the dictionary back to the original model
model = AnalyzeResult.from_dict(analyze_result_dict)

# Print some information about the model
print("--------------JSON Response from Model Starts---------------------")
# use the model as normal
print("Model ID: '{}'".format(model.model_id))
print("Number of pages analyzed {}".format(len(model.pages)))
print("API version used: {}".format(model.api_version))

# Print the dictionary (which is the JSON response from the model)
print(json.dumps(analyze_result_dict,cls=AzureJSONEncoder,indent=4))
print("--------------JSON Response from Model Ends---------------------")

--------------JSON Response from Model Starts---------------------
Model ID: 'prebuilt-invoice'
Number of pages analyzed 1
API version used: 2023-07-31
{
    "api_version": "2023-07-31",
    "model_id": "prebuilt-invoice",
    "content": "Contoso\nAddress:\nInvoice For: Microsoft\n1 Redmond way Suite\n1020 Enterprise Way\n6000 Redmond, WA\nSunnayvale, CA 87659\n99243\nInvoice Number\nInvoice Date\nInvoice Due Date\nCharges\nVAT ID\n34278587\n6/18/2017\n6/24/2017\n$56,651.49\nPT",
    "languages": [],
    "pages": [
        {
            "page_number": 1,
            "angle": null,
            "width": 8.5,
            "height": 11.0,
            "unit": "inch",
            "lines": [
                {
                    "content": "Contoso",
                    "polygon": [
                        {
                            "x": 0.5301,
                            "y": 1.1458
                        },
                        {
                            "x": 1.4517,
             