# Signature Detection by Reading Pixels

* Author: docai-incubator@google.com

## Disclaimer

This tool is not supported by the Google engineering team or product team. It is provided and supported on a best-effort basis by the **DocAI Incubator Team**. No guarantees of performance are implied.

## Objective

This document outlines a procedure for detecting whether a signature is present in a document by reading the pixels inside a bounding box, given as normalized coordinates, around the expected signature location. When calling the detection function, the user must set two values: a) the blank-line pixel count and b) the signature pixel count (black pixels only).
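
To make the coordinate convention concrete, the following minimal sketch converts a normalized bounding box (values between 0 and 1) into pixel coordinates for a given image size. The helper name `to_pixel_box` and the sample values are illustrative assumptions, not part of the tool.

```python
def to_pixel_box(coords: dict, width: int, height: int) -> tuple:
    """Scale a normalized box (0..1) to integer pixel coordinates.

    `to_pixel_box` is a hypothetical helper used only for illustration.
    """
    return (
        int(coords["min_x"] * width),
        int(coords["min_y"] * height),
        int(coords["max_x"] * width),
        int(coords["max_y"] * height),
    )


# Illustrative values: a box on a 1000 x 2000 pixel page.
sample_box = {"min_x": 0.25, "min_y": 0.5, "max_x": 0.75, "max_y": 0.625}
pixel_box = to_pixel_box(sample_box, width=1000, height=2000)
print(pixel_box)  # (250, 1000, 750, 1250)
```

The resulting tuple follows the `(left, upper, right, lower)` order that `PIL.Image.crop` expects.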

## Prerequisites
* Access to a Vertex AI Notebook or Google Colab
* Python
* Python libraries such as cv2, PIL, base64, io, and numpy

## Step-by-Step Procedure

### 1. Importing Required Modules

In [None]:
!wget https://raw.githubusercontent.com/GoogleCloudPlatform/document-ai-samples/main/incubator-tools/best-practices/utilities/utilities.py

In [None]:
import io
from io import BytesIO
import base64
import numpy as np
import json
import cv2
from PIL import Image, ImageDraw, ImageFont
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union

### 2. Set Up the Inputs

In [None]:
# Path to the JSON file containing OCR data
json_file_path = "Handwritten_1-0.json"

# The text anchor indicating where the applicant's signature starts
start_anchor_text = "Applicant's Signature:"

# The text anchor indicating where the signature area ends or 'None' if not specified
end_anchor_text = "Date:"  # Change to None if there is no end anchor

# Factor by which the bounding box's height should exceed the height of the start anchor text
# If None, the larger of the start and end anchor text bbox heights will be used
height_of_signature_bbox = (
    3  # Example: 3 times the height of the start anchor text bbox
)

# Factor by which the bounding box's length should be extended
# If None, an end anchor text must be provided to determine the length
length_of_signature_bbox = (
    1.5  # Example: 1.5 times the width of the start anchor text bbox
)

# Black-pixel count at or below which the signature area is treated as a blank line
blankLinePixelCount = 600  # Adjust value based on image resolution and clarity

# Minimum number of pixels that define the presence of a signature
signatureThresholdPixelCount = 1200  # Adjust value based on image density and size
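
One possible way to choose these two thresholds is to count black pixels in a crop known to contain only the blank signature line and in a crop known to contain a signature, then pick values in between. The sketch below does this on synthetic NumPy arrays; `count_black_pixels` is a hypothetical helper and the array sizes are made up. It uses the same cutoff of 127 that `detect_signature` applies when thresholding.

```python
import numpy as np


def count_black_pixels(gray: np.ndarray, cutoff: int = 127) -> int:
    """Count pixels at or below `cutoff` (black after binary thresholding)."""
    return int(np.count_nonzero(gray <= cutoff))


# Synthetic 60 x 400 white crop with a 2-pixel-thick printed line.
blank_region = np.full((60, 400), 255, dtype=np.uint8)
blank_region[50:52, :] = 0

# Same crop with a crude 20 x 100 block of "ink" standing in for a signature.
signed_region = blank_region.copy()
signed_region[15:35, 100:200] = 0

blank_count = count_black_pixels(blank_region)
signed_count = count_black_pixels(signed_region)
print(blank_count, signed_count)  # 800 2800
```

With these synthetic numbers, any signature threshold between 800 and 2800 would separate the two cases. Real scans need values measured on representative documents.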

### 3. Run the Required Functions

In [None]:
def get_start_and_end_index(json_data: dict, anchor_text: str) -> tuple[int, int]:
    """
    Finds the start and end index of the given anchor text in the document JSON.

    Args:
        json_data (dict): The JSON data containing the document text.
        anchor_text (str): The anchor text whose start and end indices need to be found.

    Returns:
        tuple[int, int]: The start and end index of the anchor text.
    """
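    start_index = json_data["text"].find(anchor_text)
    if start_index == -1:
        # str.find returns -1 when the anchor is absent; fail early with a clear message
        raise ValueError(f"Anchor text not found in document: {anchor_text!r}")
    end_index = start_index + len(anchor_text)
    return start_index, end_index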


def calculate_new_bbox(
    start_anchor_coords: dict,
    end_anchor_coords: dict,
    height_of_signature_bbox: Optional[float] = None,
) -> dict:
    """
    Calculates a new bounding box between two anchor coordinates.

    Args:
        start_anchor_coords (dict): The coordinates of the start anchor.
        end_anchor_coords (dict): The coordinates of the end anchor.
        height_of_signature_bbox (float, optional): Multiplier applied to the start anchor's height to get the box height. If not provided, the larger of the two anchor heights is used.

    Returns:
        dict: The new bounding box coordinates with min_x, min_y, max_x, and max_y.
    """
    new_min_x = start_anchor_coords["max_x"]
    new_max_x = end_anchor_coords["min_x"]
    start_anchor_height = start_anchor_coords["max_y"] - start_anchor_coords["min_y"]
    end_anchor_height = end_anchor_coords["max_y"] - end_anchor_coords["min_y"]
    if height_of_signature_bbox is not None:
        new_height = height_of_signature_bbox * start_anchor_height
    else:
        new_height = max(start_anchor_height, end_anchor_height)
    average_min_y = (start_anchor_coords["min_y"] + end_anchor_coords["min_y"]) / 2
    new_min_y = average_min_y - new_height / 2
    new_max_y = average_min_y + new_height / 2
    return {
        "min_x": new_min_x,
        "min_y": new_min_y,
        "max_x": new_max_x,
        "max_y": new_max_y,
    }


def detect_signature(
    json_data: dict,
    coords: dict,
    blankLinePixelCount: int,
    signatureThresholdPixelCount: int,
) -> bool:
    """
    Detects if a signature is present in the bounding box region.

    Args:
        json_data (dict): The JSON data containing the document pages.
        coords (dict): Coordinates of the region to be checked for a signature.
        blankLinePixelCount (int): Black-pixel count at or below which the region is treated as a blank line.
        signatureThresholdPixelCount (int): The minimum number of black pixels required to detect a signature.

    Returns:
        bool: True if a signature is detected, False otherwise.
    """
    image_data = json_data["pages"][0]["image"]["content"]
    image_bytes = base64.b64decode(image_data)
    image = Image.open(io.BytesIO(image_bytes))
    width, height = image.size
    min_x = int(coords["min_x"] * width)
    min_y = int(coords["min_y"] * height)
    max_x = int(coords["max_x"] * width)
    max_y = int(coords["max_y"] * height)
    cropped_image = image.crop((min_x, min_y, max_x, max_y))
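    # Convert the crop to grayscale in memory; this avoids a lossy JPEG round trip
    # and also works for image modes (e.g. RGBA) that cannot be saved as JPEG
    cropped_img = np.array(cropped_image.convert("L"))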
    _, cropped_bw_image = cv2.threshold(cropped_img, 127, 255, cv2.THRESH_BINARY)
    pixel_value, occurrence = np.unique(cropped_bw_image, return_counts=True)
    d = dict(zip(pixel_value, occurrence))
    print(d)
    cropped_black_pixel = d.get(0, 0)
    if (
        cropped_black_pixel > blankLinePixelCount
        and cropped_black_pixel > signatureThresholdPixelCount
    ):
        print("Signature Detected")
        return True
    else:
        print("No Signature Detected")
        return False


def visualize(json_data: dict, coords: dict, signature_present: bool) -> None:
    """
    Visualizes the bounding box for the detected signature on the image.

    Args:
        json_data (dict): The JSON data containing the document pages.
        coords (dict): Coordinates of the bounding box to visualize.
        signature_present (bool): Whether a signature was detected (True) or not (False).

    Returns:
        None
    """
    image_data = json_data["pages"][0]["image"]["content"]
    image_bytes = base64.b64decode(image_data)
    image = Image.open(io.BytesIO(image_bytes))
    draw = ImageDraw.Draw(image)
    width, height = image.size
    min_x = coords["min_x"] * width
    min_y = coords["min_y"] * height
    max_x = coords["max_x"] * width
    max_y = coords["max_y"] * height
    polygon = [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]
    # Green outline when a signature was detected, red otherwise
    color = "green" if signature_present else "red"
    draw.polygon(polygon, outline=color, width=6)
    display(image)


def get_token(json_dict: dict, page: int, start_index: int, end_index: int) -> dict:
    """
    Retrieves the bounding box for the tokens within a given text segment range.

    Args:
        json_dict (dict): The JSON data containing page and token information.
        page (int): The page number to extract tokens from.
        start_index (int): The starting index of the text segment.
        end_index (int): The ending index of the text segment.

    Returns:
        dict: A dictionary containing the bounding box coordinates (min_x, min_y, max_x, max_y).
    """
    temp_xy = {"x": [], "y": []}
    start_check = start_index - 2
    end_check = end_index + 2

    for token in json_dict["pages"][page]["tokens"]:
        text_segments = token["layout"]["textAnchor"].get("textSegments", [])

        for segment in text_segments:
            start_temp = int(segment.get("startIndex", "0"))
            end_temp = int(segment["endIndex"])

            if (
                start_temp >= start_check
                and end_temp <= end_check
                and (end_temp - start_temp) > 3
            ):
                normalized_vertices = token["layout"]["boundingPoly"][
                    "normalizedVertices"
                ]
                for vertex in normalized_vertices:
                    temp_xy["x"].append(vertex.get("x", 0))
                    temp_xy["y"].append(vertex.get("y", 0))

    min_x = min(temp_xy["x"], default=None)
    min_y = min(temp_xy["y"], default=None)
    max_x = max(temp_xy["x"], default=None)
    max_y = max(temp_xy["y"], default=None)

    return {"min_x": min_x, "min_y": min_y, "max_x": max_x, "max_y": max_y}


def calc_arbitrary_bbox_coords(start_anchor_coords: dict) -> dict:
    """
    Calculates bounding box coordinates to the right of the start anchor, scaled by the module-level
    height_of_signature_bbox and length_of_signature_bbox factors (both must be set, not None).

    Args:
        start_anchor_coords (dict): Coordinates of the start anchor.
    Returns:
        dict: New bounding box coordinates.
    """
    current_height = start_anchor_coords["max_y"] - start_anchor_coords["min_y"]
    current_width = start_anchor_coords["max_x"] - start_anchor_coords["min_x"]
    new_height = height_of_signature_bbox * current_height
    new_length = length_of_signature_bbox * current_width
    new_box_coords = {
        "min_x": start_anchor_coords["max_x"],
        "min_y": start_anchor_coords["min_y"] - new_height / 2,
        "max_x": start_anchor_coords["max_x"] + new_length,
        "max_y": start_anchor_coords["max_y"] + new_height / 2,
    }
    return new_box_coords
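
As a worked example of the arithmetic in `calculate_new_bbox`, the sketch below recomputes the box between two anchors using made-up normalized coordinates. The function `box_between_anchors` is a simplified stand-in written for illustration, and the anchor values are assumptions chosen to keep the numbers easy to follow.

```python
def box_between_anchors(start: dict, end: dict, height_factor: float) -> dict:
    """Simplified stand-in for the box computed between two anchors."""
    new_height = height_factor * (start["max_y"] - start["min_y"])
    center_y = (start["min_y"] + end["min_y"]) / 2  # average of the top edges
    return {
        "min_x": start["max_x"],  # box starts where the start anchor ends
        "min_y": center_y - new_height / 2,
        "max_x": end["min_x"],  # box ends where the end anchor begins
        "max_y": center_y + new_height / 2,
    }


# Hypothetical anchors on the same text line (normalized coordinates).
start_anchor = {"min_x": 0.125, "min_y": 0.5, "max_x": 0.375, "max_y": 0.53125}
end_anchor = {"min_x": 0.75, "min_y": 0.5, "max_x": 0.8125, "max_y": 0.53125}

box = box_between_anchors(start_anchor, end_anchor, height_factor=4)
print(box)
```

Here the start anchor is 0.03125 tall, so a factor of 4 gives a box 0.125 tall, centered vertically on the average top edge of the two anchors (0.5) and spanning horizontally from 0.375 to 0.75.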

### 4. Run the Code

In [None]:
with open(json_file_path, "r") as f:
    json_data = json.load(f)

start_anchor_start_index, start_anchor_end_index = get_start_and_end_index(
    json_data, start_anchor_text
)
start_anchor_coords = get_token(
    json_data, 0, start_anchor_start_index, start_anchor_end_index
)

if end_anchor_text is None:
    signature_bbox_coords = calc_arbitrary_bbox_coords(start_anchor_coords)
else:
    end_anchor_start_index, end_anchor_end_index = get_start_and_end_index(
        json_data, end_anchor_text
    )
    end_anchor_coords = get_token(
        json_data, 0, end_anchor_start_index, end_anchor_end_index
    )
    signature_bbox_coords = calculate_new_bbox(
        start_anchor_coords, end_anchor_coords, height_of_signature_bbox
    )

signature_present = detect_signature(
    json_data, signature_bbox_coords, blankLinePixelCount, signatureThresholdPixelCount
)
# Comment out the line below if you don't want to visualize the output.
visualize(json_data, signature_bbox_coords, signature_present)

### 5. Output

Running the code above computes the signature bounding box from the provided parameters, prints whether a signature was detected in that region, and displays the page image with the box outlined in green (signature detected) or red (no signature detected).


<img src="./Images/image_output.png" width=800 height=400 ></img>