# Azure Document Intelligence Custom Template User Feedback Loop Experiment

This experiment demonstrates how to replicate the custom model training process of [Azure AI Document Intelligence](https://learn.microsoft.com/en-GB/azure/ai-services/document-intelligence/overview) Studio. It shows how you might implement a user feedback loop that lets developers of custom models in Azure AI Document Intelligence collect corrections from users and use them to improve the quality of document processing results.

This notebook provides an interactive user feedback experience, enabling a user to analyze a document using a trained model, visualize the analysis results overlaid on the document, and correct any incorrectly identified or missing fields. This implementation could be replicated in any client application using your chosen framework's capabilities, such as React, Angular, or Vue.js.

> **Note**: This notebook provides _one_ potential approach to user interaction, which can be adapted in many ways to suit your use case.

## Pre-requisites

> **Note**: Before continuing, please ensure that the [`Setup-Environment.ps1`](./Setup-Environment.ps1) script has been run to deploy the required infrastructure to Azure. This includes the Azure AI Document Intelligence resource and the Azure Storage account for creating a custom model.

This notebook uses [Dev Containers](https://code.visualstudio.com/docs/remote/containers) to ensure that all the required dependencies are available in a consistent local development environment.

The following are required to run this notebook:

- [Visual Studio Code](https://code.visualstudio.com/)
- [Docker Desktop](https://www.docker.com/products/docker-desktop)
- [Remote - Containers extension for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)

> **Note**: The Dev Container is pre-configured with the required dependencies and extensions. You can run this notebook outside of a Dev Container, but you will need to manually install the required dependencies including Poppler, Tesseract, and OpenCV.

The Dev Container will include the following dependencies by default:

- Debian 11 (Bullseye) base image
- Python 3.12
  - azure-ai-formrecognizer - for interacting with the Azure AI Document Intelligence service
  - azure-core - shared core functionality used by the Azure SDK client libraries
  - ipycanvas - for rendering the document and allowing the user to draw over it
  - ipykernel - for running the notebook
  - notebook - for running the notebook
  - opencv-python-headless - for image processing
  - pdf2image - for converting PDFs to images
  - pytesseract - for performing OCR on the document
- Poppler - used by pdf2image to convert PDFs to images
- Tesseract OCR - used by pytesseract to perform OCR on the document
- Python3 OpenCV - used for image processing

## Import Requirements

The following code block imports the required dependencies for this notebook.

It also configures the following:

- Set up the local working directory.
- Load local environment variables based on the output of the [`Setup-Environment.ps1`](./Setup-Environment.ps1) script run. The environment variables will be available in the [`.env`](./.env) file.
- Initialize the credential that will be used to authenticate with the Azure services.

> **Note**: The [`Setup-Environment.ps1`](./Setup-Environment.ps1) script is not run as part of this notebook. It must be run separately, prior to running this notebook, to deploy the required infrastructure to Azure.

In [25]:
import os

from dotenv import dotenv_values
from azure.identity import DefaultAzureCredential
from modules.app_settings import AppSettings
from modules.model_training_client import ModelTrainingClient
from modules.document_canvas import DocumentCanvas
from modules.document_intelligence_label import DocumentIntelligenceLabel
from modules.document_intelligence_result_formatter import DocumentIntelligenceResultFormatter



In [26]:
working_dir = os.path.abspath('')


In [None]:
print(working_dir)

In [27]:
settings = AppSettings(dotenv_values(f"{working_dir}/config.env"))


In [28]:
azure_credential = DefaultAzureCredential(
    exclude_environment_credential=True,
    exclude_managed_identity_credential=True,
    exclude_shared_token_cache_credential=True,
    exclude_interactive_browser_credential=True,
    exclude_powershell_credential=True,
    exclude_visual_studio_code_credential=False,
    exclude_cli_credential=False
)
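
Before creating any clients, it can help to fail fast when a required setting is missing. The following is a minimal sketch, assuming the loaded settings behave like a plain mapping of names to values (as the result of `dotenv_values` does). The key names `EXAMPLE_ENDPOINT` and `EXAMPLE_STORAGE_ACCOUNT` are hypothetical placeholders; the real names are whatever the setup script writes.

```python
# Hypothetical setting names, used for illustration only.
REQUIRED_KEYS = ("EXAMPLE_ENDPOINT", "EXAMPLE_STORAGE_ACCOUNT")


def missing_settings(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `values`."""
    return [key for key in required if not values.get(key)]


# An in-memory mapping shaped like the result of dotenv_values().
example_values = {
    "EXAMPLE_ENDPOINT": "https://example.invalid/",
    "EXAMPLE_STORAGE_ACCOUNT": "",
}

missing = missing_settings(example_values)
if missing:
    print(f"Missing settings: {', '.join(missing)}")
```

A check like this could run immediately after the settings are loaded, so that a configuration problem surfaces before any Azure call is made.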

## Create a custom extraction model in Azure AI Document Intelligence

This experiment comes prepared with the data required to train a custom model. The data is located in the [`model_training`](./model_training/) directory and contains a set of invoices that will be used to create the initial custom model.

The following code blocks will create a model training client, using the [`ModelTrainingClient`](./modules/model_training_client.py), and run it to upload the files to an Azure Storage blob container and train the model using Azure AI Document Intelligence.

In [29]:
# The name of the model
model_name = 'invoices' 

# The version of the model
initial_model_version = '1.0.0'

# The name of the model that will be registered in Azure AI Document Intelligence
initial_model_id = f"{model_name}-{initial_model_version}"

In [30]:
model_training_client = ModelTrainingClient(settings=settings, use_azure_credential=False, azure_credential=azure_credential)

In [31]:
# Uploads the initial training set to Azure Blob Storage. Model training is initiated in a later cell using the uploaded data.
model_training_client.upload_training_data(f"{working_dir}/model_training")


## Demonstrate a user feedback loop experience for improving the model

The user feedback loop is a mechanism that allows users of your model to provide feedback on the quality of the results generated by the model from interactions they have with it using their own data.

The following code blocks emulate how a user experience flow might be presented within an intelligent application that interfaces with Azure AI Document Intelligence.

The steps include:

- Analyzing a document using the custom model (this is required for providing the user feedback experience) and the prebuilt-layout model (this is required for the training of the custom model).
- Visualizing the analysis results overlaid on the document.
- Allowing the user to correct any incorrectly identified or missing fields.
- Providing the corrected data to the model for retraining.
- Using the retrained model to analyze a document.
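
One way to decide which fields to surface to the user for correction is to look at the confidence reported for each extracted field. The sketch below assumes the saved custom model result follows the REST response shape, with extracted fields under `analyzeResult.documents[].fields` and a `confidence` value per field. The threshold and the sample values are illustrative only, not output from this experiment.

```python
def fields_needing_review(result, threshold=0.8):
    """Return (field_name, confidence) pairs whose confidence is missing or below the threshold."""
    flagged = []
    for document in result.get("analyzeResult", {}).get("documents", []):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            if confidence is None or confidence < threshold:
                flagged.append((name, confidence))
    return flagged


# Illustrative sample only; the confidence values are made up.
sample_result = {
    "analyzeResult": {
        "documents": [
            {
                "fields": {
                    "Customer Name": {"content": "MADE Apps", "confidence": 0.95},
                    "Customer Signature": {"confidence": 0.31},
                }
            }
        ]
    }
}

for name, confidence in fields_needing_review(sample_result):
    print(f"Review suggested for '{name}' (confidence: {confidence})")
```

A client application could use such a list to highlight the regions most likely to need correction, while still letting the user edit any field.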

In [32]:
# The name of the PDF file the user is providing.
pdf_file_name = 'Invoice_6.pdf'

# The file name without its extension, e.g. 'Invoice_6'.
base_file_name = os.path.splitext(pdf_file_name)[0]

# The directory containing the PDF file.
pdf_dir = os.path.join(working_dir, 'pdfs')

# The file path to the PDF file for loading.
pdf_path = os.path.join(pdf_dir, pdf_file_name)

# The file path to where the required JSON result from Azure AI Document Intelligence layout analysis will be stored.
pdf_ocr_path = os.path.join(pdf_dir, f"{pdf_file_name}.ocr.json")

# The file path to where the initial analysis of the user feedback document will be stored.
pdf_feedback_path = os.path.join(pdf_dir, f"{pdf_file_name}.ocr_{initial_model_version}.json")

# The file path to where the required JSON result for Azure AI Document Intelligence labels will be stored after user feedback.
pdf_labels_path = os.path.join(pdf_dir, f"{pdf_file_name}.labels.json")

# The file path to where the required document fields are, based on the original model training data.
document_fields_path = os.path.join(working_dir, 'model_training', 'fields.json')
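
The naming convention used in the cell above can also be captured in a small helper, so that every document a user submits gets the same set of file names. This is a sketch using `pathlib`; the function name `feedback_paths` and the dictionary keys are hypothetical and not part of the experiment's modules.

```python
from pathlib import Path


def feedback_paths(pdf_dir, pdf_file_name, model_version):
    """Build the file paths used for one user feedback document."""
    pdf_dir = Path(pdf_dir)
    return {
        "pdf": pdf_dir / pdf_file_name,
        "ocr": pdf_dir / f"{pdf_file_name}.ocr.json",
        "feedback": pdf_dir / f"{pdf_file_name}.ocr_{model_version}.json",
        "labels": pdf_dir / f"{pdf_file_name}.labels.json",
    }


paths = feedback_paths("pdfs", "Invoice_6.pdf", "1.0.0")
for purpose, path in paths.items():
    print(f"{purpose}: {path.name}")
```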

In [39]:
# Resets the sample environment to only contain the initial training set. This is only necessary if the sample has been run previously.
model_training_client.delete_training_data(base_file_name)

'Invoice_6'

In [41]:
# Train the initial model on the training set, excluding the user feedback document removed above.

invoice_model = model_training_client.create_model(model_name=initial_model_id)

### Run layout analysis on the document using Azure AI Document Intelligence

This step will use the Azure AI Document Intelligence service to perform layout analysis on the PDF document. When complete, the files will be saved to the `./pdfs` directory with the name format `<pdf_file_name>.ocr.json`.

> **Note**: These specific steps do not need to be run every time. The layout analysis is only required to be run once to capture the initial state of the document.

In [42]:
# Retraining a model requires that the OCR result provided in the training data set is created using the 'prebuilt-layout' model.
model_training_client.run_layout_analysis(pdf_path, pdf_ocr_path, 'prebuilt-layout')

{'status': 'succeeded',
 'createdDateTime': '2024-07-08T01:50:48Z',
 'lastUpdatedDateTime': '2024-07-08T01:50:48Z',
 'analyzeResult': {'apiVersion': '2023-07-31',
  'modelId': 'prebuilt-layout',
  'content': 'CONTOSO Innovation drives progress\n111st Avenue Redmond, WA 67891 (201) 555-0101 (201) 555-0102\nINVOICE\nCUSTOMER MADE Apps Kingdom Street London W2 6BD United Kingdom\nISSUED: 2/27/2024\nPRODUCT ID\nUNIT PRICE\nQUANTITY\nTOTAL PRICE\n5-08-XX\n9.50\n1.0\n9.50\n5-09-XX\n5.00\n5.0\n25.00\n5-13-XX\n11.50\n1.0\n11.50\n5-14-XX\n5.25\n3.5\n18.38\nTOTAL\n10.5\n64.38\nDistributor signature:\nDate: 2/27/2024\nCustomer signature:\nDate:',
  'languages': [],
  'pages': [{'pageNumber': 1,
    'angle': None,
    'width': 8.5,
    'height': 11.0,
    'unit': 'inch',
    'lines': [{'content': 'CONTOSO',
      'polygon': [1.0697,
       1.1936,
       2.2348,
       1.184,
       2.2348,
       1.4084,
       1.0697,
       1.418],
      'spans': [{'offset': 0, 'length': 7}]},
     {'content': 

In [43]:
# To provide feedback, the user first analyzes the document using your initial model.
model_training_client.run_layout_analysis(pdf_path, pdf_feedback_path, initial_model_id)

{'status': 'succeeded',
 'createdDateTime': '2024-07-08T01:50:58Z',
 'lastUpdatedDateTime': '2024-07-08T01:50:58Z',
 'analyzeResult': {'apiVersion': '2023-07-31',
  'modelId': 'invoices-1.0.0',
  'content': 'CONTOSO Innovation drives progress\n111st Avenue Redmond, WA 67891 (201) 555-0101 (201) 555-0102\nINVOICE\nCUSTOMER MADE Apps Kingdom Street London W2 6BD United Kingdom\nISSUED: 2/27/2024\nPRODUCT ID\nUNIT PRICE\nQUANTITY\nTOTAL PRICE\n5-08-XX\n9.50\n1.0\n9.50\n5-09-XX\n5.00\n5.0\n25.00\n5-13-XX\n11.50\n1.0\n11.50\n5-14-XX\n5.25\n3.5\n18.38\nTOTAL\n10.5\n64.38\nDistributor signature:\nDate: 2/27/2024\nCustomer signature:\nDate:',
  'languages': [],
  'pages': [{'pageNumber': 1,
    'angle': None,
    'width': 8.5,
    'height': 11.0,
    'unit': 'inch',
    'lines': [{'content': 'CONTOSO',
      'polygon': [1.0697,
       1.1936,
       2.2348,
       1.184,
       2.2348,
       1.4084,
       1.0697,
       1.418],
      'spans': [{'offset': 0, 'length': 7}]},
     {'content': '
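
The results printed above share the same outline: an `analyzeResult` containing `pages`, each with `lines` that carry `content` and a `polygon`. The following sketch walks that structure. It assumes the saved JSON files have the same shape as the printed output, and it uses only the first line shown above as sample data.

```python
def iter_lines(result):
    """Yield (page_number, content, polygon) for every line in an analysis result."""
    for page in result["analyzeResult"]["pages"]:
        for line in page.get("lines", []):
            yield page["pageNumber"], line["content"], line["polygon"]


# Trimmed sample containing only the first line from the output above.
sample_result = {
    "analyzeResult": {
        "pages": [
            {
                "pageNumber": 1,
                "width": 8.5,
                "height": 11.0,
                "unit": "inch",
                "lines": [
                    {
                        "content": "CONTOSO",
                        "polygon": [1.0697, 1.1936, 2.2348, 1.184,
                                    2.2348, 1.4084, 1.0697, 1.418],
                    }
                ],
            }
        ]
    }
}

for page_number, content, polygon in iter_lines(sample_result):
    print(page_number, content, polygon[:2])
```

In practice the result would be read from the saved file with `json.load` rather than defined inline.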

### Display the document in the notebook for user feedback

This step will render the document inside the notebook for the user to interact with. This is only a visual representation for the purpose of this experiment, and in a real-world scenario, this would be implemented in a client application.

The following code block will:

1. Load a document and store each page as an image.
1. Display the rendered image below as an interactive element in an output cell, rendering the output of the layout analysis over the image as label regions.
1. Allow you to move, remove, and resize label regions on the rendered image, and add fields to correct any incorrectly identified or missing fields.
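
To draw label regions over a page image, the polygons from the analysis result (reported in inches for this document) have to be converted to pixel coordinates. The sketch below shows one way this could be done. The function name and the 200 DPI render resolution are assumptions for illustration; `DocumentCanvas` may do this differently.

```python
def polygon_to_pixel_bbox(polygon, dpi=200):
    """Convert an 8-value polygon in inches to a pixel bounding box.

    The polygon is a flat list [x1, y1, x2, y2, x3, y3, x4, y4].
    The dpi default is an assumed render resolution.
    """
    xs = polygon[0::2]
    ys = polygon[1::2]
    left = min(xs) * dpi
    top = min(ys) * dpi
    return {
        "x": left,
        "y": top,
        "width": max(xs) * dpi - left,
        "height": max(ys) * dpi - top,
    }


# Polygon of the 'CONTOSO' line from the layout analysis output.
contoso_polygon = [1.0697, 1.1936, 2.2348, 1.184, 2.2348, 1.4084, 1.0697, 1.418]
bbox = polygon_to_pixel_bbox(contoso_polygon)
print(bbox)
```

Taking the minimum and maximum of the corner coordinates gives an axis-aligned box, which is simpler to move and resize than a free-form polygon at the cost of ignoring slight rotation.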

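> **Note**: If you are running this notebook outside of the Dev Container on Windows, commands such as the following create a virtual environment and install the additional Python packages:

py -m venv pymupdf-venv
.\pymupdf-venv\Scripts\activate
python -m pip install --upgrade pip
pip install PyMuPDF
pip install pytesseract
pip install opencv-python
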

In [44]:
doc_canvas = DocumentCanvas(working_dir)

canvases = doc_canvas.load_pdf(pdf_path, document_fields_path, pdf_feedback_path)
for canvas in canvases:
    display(canvas)

BBoxWidget(bboxes=[{'x': 1093.0, 'y': 1241.0, 'width': 57.00000000000003, 'height': 28.999999999999915, 'label…

### Process the user feedback into Document Intelligence labels format

Once the user has corrected the document analysis, the following code will process the label regions into the labels JSON format used by the Azure AI Document Intelligence service. The files will be saved to the `./pdfs` directory with the name format `<pdf_file_name>.labels.json`.

In a real-world scenario, labels may be presented alongside the rendered document UI, connected to the label regions, to allow the user to update the text and field, and then retrain the model using the updated labels and PDF documents.

The following code blocks will render the label regions as UI inputs. The inputs will be pre-populated, and you can update the details of each label associated with the document before re-training.

In [45]:
labels = [DocumentIntelligenceLabel(label_region, doc_canvas.fields) for label_region in doc_canvas.get_document_labels()]
    
for label in labels:
    display(label.render())

VBox(children=(Dropdown(description='Field:', index=5, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=6, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=11, options=('', 'Customer Name', 'Customer Address', 'Iss…

VBox(children=(Dropdown(description='Field:', index=8, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=2, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=7, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=9, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=1, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=3, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

VBox(children=(Dropdown(description='Field:', index=4, options=('', 'Customer Name', 'Customer Address', 'Issu…

### Create the labels JSON file

Once the user has updated the labels and text associated with the label regions, the following code block will create the labels JSON file in the format required by the Azure AI Document Intelligence service. The file will be saved to the `./pdfs` directory with the name format `<pdf_file_name>.labels.json`.
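
In the labels format, each bounding box is a flat list of eight values describing the four corners of a region, expressed as fractions of the page width and height. The sketch below converts a pixel rectangle into that form. The function name is hypothetical, and the 1700 x 2200 pixel page size is an assumption for illustration; it is not taken from `DocumentIntelligenceResultFormatter`.

```python
def bbox_to_normalized_polygon(bbox, image_width, image_height):
    """Convert a pixel bounding box to a normalized 8-value polygon.

    Corners are emitted in the order top-left, top-right, bottom-right, bottom-left.
    """
    left = bbox["x"] / image_width
    top = bbox["y"] / image_height
    right = (bbox["x"] + bbox["width"]) / image_width
    bottom = (bbox["y"] + bbox["height"]) / image_height
    return [left, top, right, top, right, bottom, left, bottom]


# Hypothetical pixel region on an assumed 1700 x 2200 pixel page image.
region = {"x": 1330, "y": 406, "width": 155, "height": 34}
polygon = bbox_to_normalized_polygon(region, image_width=1700, image_height=2200)
print(polygon)
```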

In [46]:
DocumentIntelligenceResultFormatter.save_to_labels_json(labels, pdf_file_name, pdf_labels_path)

{'$schema': 'https://schema.cognitiveservices.azure.com/formrecognizer/2021-03-01/labels.json',
 'document': 'Invoice_6.pdf',
 'labels': [{'label': 'Customer Address',
   'value': [{'page': 1,
     'text': 'Kingdom Street London W2 6BD United Kingdom',
     'boundingBoxes': [[0.7564705882352941,
       0.20045454545454544,
       0.8735294117647059,
       0.20045454545454544,
       0.8735294117647059,
       0.26045454545454544,
       0.7564705882352941,
       0.26045454545454544]]}]},
  {'label': 'Customer Name',
   'value': [{'page': 1,
     'text': 'MADE Apps',
     'boundingBoxes': [[0.7823529411764706,
       0.1845454545454545,
       0.8735294117647059,
       0.1845454545454545,
       0.8735294117647059,
       0.2,
       0.7823529411764706,
       0.2]]}]},
  {'label': 'Customer Signature',
   'value': [{'page': 1,
     'text': '',
     'boundingBoxes': [[0.29470588235294115,
       0.6831818181818182,
       0.5788235294117647,
       0.6831818181818182,
       0.578823
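
The structure printed above can be assembled from a flat list of labeled regions. The following is a minimal sketch of that assembly, not the implementation of `save_to_labels_json`; the function name and the shape of the input regions are assumptions. Regions that share a label are grouped into a single entry.

```python
import json

LABELS_SCHEMA = "https://schema.cognitiveservices.azure.com/formrecognizer/2021-03-01/labels.json"


def build_labels_document(document_name, labeled_regions):
    """Group regions by label and return a labels document as a dictionary.

    Each region is a dict with 'label', 'page', 'text', and 'polygon' keys (assumed shape).
    """
    labels = {}
    for region in labeled_regions:
        entry = labels.setdefault(region["label"], {"label": region["label"], "value": []})
        entry["value"].append({
            "page": region["page"],
            "text": region["text"],
            "boundingBoxes": [region["polygon"]],
        })
    return {"$schema": LABELS_SCHEMA, "document": document_name, "labels": list(labels.values())}


# One region taken from the output above.
regions = [
    {
        "label": "Customer Name",
        "page": 1,
        "text": "MADE Apps",
        "polygon": [0.7823529411764706, 0.1845454545454545, 0.8735294117647059,
                    0.1845454545454545, 0.8735294117647059, 0.2, 0.7823529411764706, 0.2],
    }
]

labels_document = build_labels_document("Invoice_6.pdf", regions)
print(json.dumps(labels_document, indent=2))
```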

## Retrain the model using the updated labels and PDF documents

The next step emulates a post-user feedback loop experience, where the updated labels and PDF documents are used to retrain the model using the Azure AI Document Intelligence service. 

As the application developer, you would typically do this manually: review your users' feedback, select the appropriate documents to retrain the model with, and then process them through the Azure AI Document Intelligence service.

The following code blocks will upload the updated labels and PDF documents to the Azure Storage blob container for the model training data set, and then retrain the model using the Azure AI Document Intelligence service. The updated model will then be used to analyze a document.

In [47]:
# The version of the updated model. In this example, adding a new training document is treated as a minor version change.
updated_model_version = "1.1.0"

# The name of the model that will be registered in Azure AI Document Intelligence
updated_model_id = f"{model_name}-{updated_model_version}"

# Uploads the updated user feedback documents to Azure Blob Storage and initiates model training using both the existing and new data.
model_training_client.upload_training_data(pdf_dir)
updated_model = model_training_client.create_model(model_name=updated_model_id)
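
The version change from `1.0.0` to `1.1.0` is written by hand in the cell above. If retraining happens regularly, the next model ID could be derived from the previous version instead. This is a small sketch assuming three-part `major.minor.patch` version strings; the helper name `bump_minor` is hypothetical.

```python
def bump_minor(version):
    """Return the next minor version, resetting the patch number to zero."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


next_version = bump_minor("1.0.0")
next_model_id = f"invoices-{next_version}"
print(next_model_id)
```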

In [48]:
# The file path to where the updated analysis of the user feedback document will be stored.
pdf_updated_analysis_path = os.path.join(pdf_dir, f"{pdf_file_name}.ocr_{updated_model_version}.json")

# Run layout analysis with the updated model
model_training_client.run_layout_analysis(pdf_path, pdf_updated_analysis_path, updated_model_id)

{'status': 'succeeded',
 'createdDateTime': '2024-07-08T01:52:43Z',
 'lastUpdatedDateTime': '2024-07-08T01:52:43Z',
 'analyzeResult': {'apiVersion': '2023-07-31',
  'modelId': 'invoices-1.1.0',
  'content': 'CONTOSO Innovation drives progress\n111st Avenue Redmond, WA 67891 (201) 555-0101 (201) 555-0102\nINVOICE\nCUSTOMER MADE Apps Kingdom Street London W2 6BD United Kingdom\nISSUED: 2/27/2024\nPRODUCT ID\nUNIT PRICE\nQUANTITY\nTOTAL PRICE\n5-08-XX\n9.50\n1.0\n9.50\n5-09-XX\n5.00\n5.0\n25.00\n5-13-XX\n11.50\n1.0\n11.50\n5-14-XX\n5.25\n3.5\n18.38\nTOTAL\n10.5\n64.38\nDistributor signature:\nDate: 2/27/2024\nCustomer signature:\nDate:',
  'languages': [],
  'pages': [{'pageNumber': 1,
    'angle': None,
    'width': 8.5,
    'height': 11.0,
    'unit': 'inch',
    'lines': [{'content': 'CONTOSO',
      'polygon': [1.0697,
       1.1936,
       2.2348,
       1.184,
       2.2348,
       1.4084,
       1.0697,
       1.418],
      'spans': [{'offset': 0, 'length': 7}]},
     {'content': '

In [49]:
doc_canvas = DocumentCanvas(working_dir)

canvases = doc_canvas.load_pdf(pdf_path, document_fields_path, pdf_updated_analysis_path)
for canvas in canvases:
    display(canvas)

BBoxWidget(bboxes=[{'x': 501.0, 'y': 1391.0, 'width': 443.99999999999994, 'height': 140.00000000000003, 'label…
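
Besides inspecting the canvas visually, the two saved analyses can be compared field by field to see what retraining changed. The sketch below assumes both results follow the REST response shape, with extracted fields under `analyzeResult.documents[].fields` and the extracted text in `content`. The sample values are illustrative only and are not output from this experiment.

```python
def field_contents(result):
    """Map each extracted field name to its content."""
    fields = {}
    for document in result.get("analyzeResult", {}).get("documents", []):
        for name, field in document.get("fields", {}).items():
            fields[name] = field.get("content")
    return fields


def changed_fields(before, after):
    """Return {field_name: (old_content, new_content)} for fields that differ."""
    old, new = field_contents(before), field_contents(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Illustrative samples only; the contents are made up.
initial_result = {"analyzeResult": {"documents": [{"fields": {
    "Customer Name": {"content": "MADE"},
    "Customer Address": {"content": "Kingdom Street London W2 6BD United Kingdom"},
}}]}}
updated_result = {"analyzeResult": {"documents": [{"fields": {
    "Customer Name": {"content": "MADE Apps"},
    "Customer Address": {"content": "Kingdom Street London W2 6BD United Kingdom"},
}}]}}

for name, (old, new) in changed_fields(initial_result, updated_result).items():
    print(f"{name}: {old!r} -> {new!r}")
```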