# Azure Document Intelligence Custom Template User Feedback Loop Experiment

This notebook demonstrates a simple way to let a user draw over a document that has been processed by Azure AI Document Intelligence. The user provides feedback on the quality of the results by highlighting incorrect or missing information, and the notebook captures the highlighted regions as corrections.

The goal is to showcase how developers of custom models in Azure AI Document Intelligence can collect user feedback and use it to retrain and improve their models.

> **Note**: This notebook only showcases the potential user interaction. The outputs follow the labels JSON schema used by the Azure AI Document Intelligence service. The actual feedback processing and retraining of the model are not implemented in this notebook.

## Pre-requisites

The notebook uses [Dev Containers](https://code.visualstudio.com/docs/remote/containers) to ensure that all the required dependencies are available in a consistent local development environment.

The following are required to run this notebook:

- [Visual Studio Code](https://code.visualstudio.com/)
- [Docker Desktop](https://www.docker.com/products/docker-desktop)
- [Dev Containers extension for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) (formerly named Remote - Containers)

> **Note**: The notebook is designed to run in a [Dev Container](https://code.visualstudio.com/docs/remote/containers) in Visual Studio Code. The Dev Container is pre-configured with the required dependencies and extensions. You can run this notebook outside of a Dev Container, but you will need to manually install the required dependencies including Poppler, Tesseract, and OpenCV.

The Dev Container will include the following dependencies by default:

- Debian 11 (Bullseye) base image
- Python 3.12
  - ipycanvas - for rendering the document and allowing the user to draw over it
  - ipykernel - the Python kernel that executes the notebook cells
  - notebook - the Jupyter Notebook server
  - opencv-python-headless - for image processing
  - pdf2image - for converting PDFs to images
  - pytesseract - for performing OCR on the document
- Poppler - used by pdf2image to convert PDFs to images
- Tesseract OCR - used by pytesseract to perform OCR on the document
- Python3 OpenCV - used for image processing

## Setup

The following cell sets up the imports and constants required by the notebook, including the path to the sample invoice PDF, and creates the directories used to store the page images and the labels JSON output.

In [None]:
from pdf2image import convert_from_path
from ipycanvas import Canvas
from ipywidgets import Image
import pytesseract
import os
import cv2
import json

working_dir = os.path.abspath('')

pdf_file_name = 'Invoice_1'
pdf_path = os.path.join(working_dir, 'pdfs', f'{pdf_file_name}.pdf')

images_dir = os.path.join(working_dir, 'images')
os.makedirs(images_dir, exist_ok=True)

labels_dir = os.path.join(working_dir, 'labels')
os.makedirs(labels_dir, exist_ok=True)

## Define object for tracking the user feedback options

The following class defines the rectangular border that the user draws over the document to provide feedback. It tracks the start and end coordinates of the border and provides methods for drawing the border, normalizing the coordinates for the labels JSON output, and extracting the text within the border using OCR.

In [None]:
class SquareBorder:
    def __init__(self, image_path_ref: str, page_ref: int, border_width=2, border_color='black'):
        self.image_path_ref = image_path_ref
        self.page_ref = page_ref
        self.border_width = border_width
        self.border_color = border_color

    def start(self, x, y):
        self.start_x = x
        self.start_y = y

    def end(self, x, y):
        self.end_x = x
        self.end_y = y

    def draw(self, canvas: Canvas):
        canvas.stroke_style = self.border_color
        canvas.line_width = self.border_width
        canvas.stroke_rect(self.start_x, self.start_y, self.end_x - self.start_x, self.end_y - self.start_y)
        self.normalize(canvas)

    def normalize(self, canvas: Canvas):
        # normalize the square_border pixels 0..1
        self.start_x_normalized = self.start_x / canvas.width
        self.start_y_normalized = self.start_y / canvas.height
        self.end_x_normalized = self.end_x / canvas.width
        self.end_y_normalized = self.end_y / canvas.height

    def extract_text(self):
        # Order the corners so the crop works regardless of drag direction
        left, right = sorted((int(self.start_x), int(self.end_x)))
        top, bottom = sorted((int(self.start_y), int(self.end_y)))

        img = cv2.imread(self.image_path_ref)
        crop_img = img[top:bottom, left:right]
        # An empty crop (e.g. a click without a drag) has no text to extract
        self.text = pytesseract.image_to_string(crop_img) if crop_img.size else ''
        return self.text

    def get_bounding_box(self):
        # Corners as normalized x, y pairs, clockwise from the top-left
        left, right = sorted((self.start_x_normalized, self.end_x_normalized))
        top, bottom = sorted((self.start_y_normalized, self.end_y_normalized))
        return [left, top, right, top, right, bottom, left, bottom]

    def as_label(self):
        return {
            "label": "", 
            "value": [
                {
                    "page": self.page_ref,
                    "text": self.extract_text(),
                    "boundingBoxes": [self.get_bounding_box()]
                }
            ]
        }
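
The normalization and corner ordering described above can be tried in isolation, without a canvas or an image. The following is a minimal sketch under stated assumptions: `to_bounding_box` is a hypothetical helper that is not part of the notebook, and it assumes the labels format expects eight values (x, y pairs) ordered clockwise from the top-left corner, each scaled to the 0..1 range by the page width and height.

```python
def to_bounding_box(start, end, width, height):
    """Hypothetical helper: convert two drag points in pixels into eight
    normalized values (x, y pairs) ordered clockwise from the top-left."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Sorting makes the result independent of the drag direction
    left, right = sorted((start[0] / width, end[0] / width))
    top, bottom = sorted((start[1] / height, end[1] / height))
    return [left, top, right, top, right, bottom, left, bottom]

# Dragging from top-left to bottom-right, and the same rectangle in reverse
box = to_bounding_box((100, 50), (300, 150), width=1000, height=500)
reversed_box = to_bounding_box((300, 150), (100, 50), width=1000, height=500)
```

Both calls describe the same rectangle, so they produce the same list regardless of where the drag started.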

## Load the PDF document into view for user feedback

The code below performs these steps:

1. Load the PDF document and store each page as an image using pdf2image.
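1. Display the rendered image using Canvas below as an interactive element in an output cell. **Note**: The canvas matches the pixel size of the image produced by pdf2image, which rasterizes each page at its default resolution (200 DPI unless a `dpi` argument is passed).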
1. Allow you to draw borders over the rendered image by clicking/holding, dragging, and releasing the mouse.

> **Note**: This simple demonstration does not allow drawn borders to be removed or edited once drawn. To start again, you will need to re-run the cell.

In [None]:
square_borders = []

def handle_mouse_down_start_draw(canvas: Canvas, x, y):
    square_border = SquareBorder(canvas.image_path_ref, canvas.page_ref)
    square_border.start(x, y)
    square_borders.append(square_border)

def handle_mouse_down_end_draw(canvas: Canvas, x, y):
    square_border = square_borders[-1]
    square_border.end(x, y)
    square_border.draw(canvas)

def load_pdf(file_path: str):
    pages = convert_from_path(file_path, fmt='jpeg')

    print(f'Loaded {len(pages)} pages')

    canvases = [Canvas(width=page.width, height=page.height) for page in pages]

    for i, page in enumerate(pages):
        page_ref = i + 1
        image_path_ref = os.path.join(images_dir, f'{pdf_file_name}_Page_{page_ref}.jpg')

        page.save(image_path_ref, 'JPEG')
    
        canvases[i].image_path_ref = image_path_ref
        canvases[i].page_ref = page_ref

        canvases[i].draw_image(Image.from_file(image_path_ref), 0, 0, pages[i].width, pages[i].height)
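        # Bind the current canvas as a default argument. A plain closure over `i`
        # would resolve to the last page for every callback once the loop finishes.
        canvases[i].on_mouse_down(lambda x, y, canvas=canvases[i]: handle_mouse_down_start_draw(canvas, x, y))
        canvases[i].on_mouse_up(lambda x, y, canvas=canvases[i]: handle_mouse_down_end_draw(canvas, x, y))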

    return canvases

canvases = load_pdf(pdf_path)

canvases[0]
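
When callbacks are registered inside a loop, as with the per-page mouse handlers above, Python closures look up loop variables when the callback runs rather than when it is defined. The following sketch uses plain functions with hypothetical names (no ipycanvas involved) to show the difference between closing over the loop variable and binding the value as a default argument.

```python
def make_callbacks_late(pages):
    # Each lambda closes over `i`, which is only read when the callback runs
    callbacks = []
    for i in range(len(pages)):
        callbacks.append(lambda: pages[i])
    return callbacks

def make_callbacks_bound(pages):
    # A default argument is evaluated at definition time, once per iteration
    callbacks = []
    for i in range(len(pages)):
        callbacks.append(lambda page=pages[i]: page)
    return callbacks

pages = ["page 1", "page 2", "page 3"]
late = [cb() for cb in make_callbacks_late(pages)]     # every callback sees the last page
bound = [cb() for cb in make_callbacks_bound(pages)]   # each callback keeps its own page
```

With a single-page PDF both variants behave the same, so this kind of issue tends to surface only on multi-page documents.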

## Process the user feedback into Document Intelligence labels format

Once the user has drawn borders over the document, the following code converts them into the labels JSON format used by the Azure AI Document Intelligence service. The resulting file is saved to the `./labels` directory.

In a real-world scenario, the labels JSON files could be loaded into a UI to allow the user to update the label names associated with the custom model, and then retrain the model using the updated labels and PDF documents.

In [None]:
labels = {
    "$schema": "https://schema.cognitiveservices.azure.com/formrecognizer/2021-03-01/labels.json",
    "document": f"{pdf_file_name}.pdf",
    "labels": [square_border.as_label() for square_border in square_borders]
}

labels_file_path = os.path.join(labels_dir, f'{pdf_file_name}.labels.json')
with open(labels_file_path, 'w') as f:
    json.dump(labels, f, indent=4)
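
As a rough illustration of the follow-up step in which label names are filled in, the sketch below assigns a field name to each empty `label` entry in a labels document. It is an assumption-laden example rather than part of the notebook: `assign_label_names`, the field names, and the sample values are all hypothetical, and the code works on an in-memory dictionary shaped like the output written above instead of calling any Azure service.

```python
import copy

def assign_label_names(labels_doc, names):
    """Hypothetical helper: return a copy of a labels document in which the
    i-th label is given the i-th name. Extra labels keep their current name."""
    updated = copy.deepcopy(labels_doc)
    for label, name in zip(updated["labels"], names):
        label["label"] = name
    return updated

# Made-up sample values, shaped like the notebook's labels output
sample = {
    "document": "Invoice_1.pdf",
    "labels": [
        {"label": "", "value": [{"page": 1, "text": "INV-100",
                                 "boundingBoxes": [[0.1, 0.1, 0.3, 0.1, 0.3, 0.2, 0.1, 0.2]]}]},
        {"label": "", "value": [{"page": 1, "text": "$110.00",
                                 "boundingBoxes": [[0.6, 0.8, 0.8, 0.8, 0.8, 0.9, 0.6, 0.9]]}]},
    ],
}

named = assign_label_names(sample, ["InvoiceId", "InvoiceTotal"])
```

Working on a deep copy leaves the original feedback untouched, which may be useful if the raw user input needs to be kept alongside the curated labels.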