# Introduction

In this notebook, we use two distinct Optical Character Recognition (OCR) methods to handle different types of text extraction tasks. Each approach is tailored to address specific needs based on the nature of the text and the format of the documents. Here's a breakdown of the two OCR functions and their purposes:

## 1. MRZ OCR for Passports/Documents

### Purpose

The **MRZ OCR** function is designed specifically for reading Machine Readable Zones (MRZ) found in passports and similar documents. MRZ is a standardized area in these documents that contains critical information in a fixed format, making it ideal for specialized OCR.


### Why MRZ OCR?

The MRZ area has a well-defined format, which simplifies the extraction process. Using a dedicated function for MRZ allows for precise extraction of essential personal information from passports and documents.

## 2. Comprehensive OCR for General English Text

### Purpose

The **Comprehensive OCR** function is used for extracting general English text from various types of documents and images. This approach is well-suited for processing documents with varied text layouts and extracting English paragraphs of text.

### Why Comprehensive OCR?

For documents with varied text layouts and general English text, a comprehensive OCR approach is necessary. It allows for flexible and adaptable text extraction, accommodating diverse text structures and content types in English paragraphs.

## Summary

By employing both MRZ-specific and comprehensive OCR functions, we can effectively handle different text extraction needs:

- **MRZ OCR** for structured, standardized documents with machine readable zones, such as passports and government IDs, ensuring accurate extraction of critical information.
- **Comprehensive OCR** for general English text extraction tasks, providing flexibility and adaptability for documents with varied text layouts and paragraph structures.

This dual approach ensures that we can address a wide range of text extraction challenges efficiently and accurately.


In [None]:
# Install Tesseract OCR engine using the package manager
!sudo apt update  # Update package lists for upgrades and new package installations
!sudo apt install -y tesseract-ocr  # Install Tesseract OCR with automatic confirmation

# Install Python libraries
!pip install pytesseract  # Python wrapper for Tesseract OCR
!pip install -q easyocr  # EasyOCR library for Optical Character Recognition
!pip install -q PassportEye  # Library for parsing and OCR of passport MRZs
!pip install -q keras_ocr  # Keras-based Optical Character Recognition library
!pip install -q pyspellchecker  # Simple spell checking library
!pip install -q nltk  # Natural Language Toolkit for text processing

<h3 style='font-weight: bold'>Import necessary packages</h3>

In [None]:
import os
import string as st
from dateutil import parser
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import cv2
import numpy as np
from passporteye import read_mrz
import easyocr
from spellchecker import SpellChecker
import difflib
import datetime
import json
import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
import warnings


# Suppress warnings
warnings.filterwarnings('ignore')

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


True

<h3 style='font-weight: bold'>Load easyOCR engine</h3>

In [None]:
# Load the OCR engine (EasyOCR)
reader = easyocr.Reader(lang_list=['en'], gpu=True)  # Enable GPU if available




# **MRZ OCR** (For passports and documents with machine readable zones)

# Explanation of Functions and OCR Method

## 1. `parse_date`

### Purpose
Parses a date string and formats it into 'DD/MM/YYYY'. Includes an option to correct the year if it appears to be in the future.

### Parameters
- `string` (str): The date string to parse.
- `fix_year` (bool): If `True`, adjusts the year if it is greater than the current year.

### Returns
- `str`: The parsed date in 'DD/MM/YYYY' format.

### Details
- Uses `parser.parse` from the `dateutil` library to interpret the date string in a year-first format.
- If `fix_year` is enabled and the year is in the future, adjusts the year by subtracting 100 years.
- Returns the date in a standardized format.
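
The century correction is easiest to see on six-digit `YYMMDD` values, which is how MRZ dates are encoded. The following is a minimal sketch, not the notebook's implementation: it swaps `dateutil` for the standard library's `datetime.strptime`, whose `%y` rule maps 00-68 to 2000-2068 and 69-99 to 1969-1999. The function name `parse_mrz_date` and the `today` parameter are hypothetical and exist only to make the example reproducible.

```python
from datetime import datetime

def parse_mrz_date(yymmdd, fix_year=False, today=None):
    """Parse a 6-digit YYMMDD string and return it as 'DD/MM/YYYY'.

    Hypothetical stdlib-only variant of parse_date; `today` is an
    illustrative parameter so the result does not depend on the clock.
    """
    today = today or datetime.now()
    date = datetime.strptime(yymmdd, '%y%m%d').date()
    # A year in the future (e.g. for a birth date) means the wrong
    # century was guessed, so move it back by 100 years.
    if fix_year and date.year > today.year:
        date = date.replace(year=date.year - 100)
    return date.strftime('%d/%m/%Y')

reference = datetime(2024, 1, 1)
print(parse_mrz_date('740812', fix_year=True, today=reference))  # 12/08/1974
print(parse_mrz_date('300101', fix_year=True, today=reference))  # 01/01/1930
print(parse_mrz_date('300101', today=reference))                 # 01/01/2030
```

The last call shows why `fix_year` is left off for expiration dates: a year in the future is expected there.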

## 2. `clean`

### Purpose
Cleans a string by removing all non-alphanumeric characters and converting it to uppercase.

### Parameters
- `string` (str): The string to clean.

### Returns
- `str`: The cleaned string.

### Details
- Removes characters that are not letters or digits.
- Converts the string to uppercase for consistency.

## 3. `get_gender`

### Purpose
Normalizes the gender code from a given input.

### Parameters
- `code` (str): The gender code to interpret. Expected values are 'M', 'F', or other values.

### Returns
- `str`: 'M' for male, 'F' for female, or an empty string if the code is not recognized.

### Details
- Converts the code to uppercase and validates it.
- Returns 'M' or 'F' based on the input or '' if the input does not match expected values.

## 4. `print_data`

### Purpose
Prints the key-value pairs from a dictionary in a formatted manner.

### Parameters
- `data` (dict): A dictionary containing the data to be printed.

### Details
- Replaces underscores with spaces and capitalizes the first letter of each key (for example, `date_of_birth` becomes `Date of birth`).
- Prints each key-value pair in a readable format.

## 5. `ocr`

### Purpose
Extracts and processes personal information from a passport image using OCR.

### Parameters
- `img_name` (str): Path to the passport image.

### Returns
- `dict`: Extracted personal information.

### Details

1. **Extract MRZ from Image**:
   - Uses `read_mrz` to extract the MRZ (Machine Readable Zone) from the passport image.
   - Saves the MRZ Region of Interest (ROI) to a temporary file.

2. **Read and Preprocess MRZ Image**:
   - Loads and resizes the MRZ image for OCR processing.
   - Defines allowed characters for OCR and reads text using EasyOCR.

3. **Process MRZ Lines**:
   - Processes MRZ lines based on the number of lines and their lengths:
     - **Type 1**: 3 lines of at least 30 characters each. Processes the document type, issuing country, document number, date of birth, gender, expiration date, nationality, surname, and name.
     - **Type 2**: 2 lines of at least 36 characters each. Processes the document type, issuing country, surname, name, document number, nationality, date of birth, gender, expiration date, and personal number.
     - **Type 3**: 2 lines of at least 44 characters each. Processes the passport type, issuing country, surname, name, passport number, nationality, date of birth, gender, expiration date, and personal number.

4. **Extract Personal Information**:
   - Extracts and formats personal details such as name, surname, gender, date of birth, nationality, passport type, passport number, issuing country, expiration date, and personal number.
   - Cleans and formats the extracted data.

5. **Clean Up**:
   - Deletes the temporary image file used for processing.
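
The parsing functions in this notebook skip the MRZ check digits. As an optional extra, those digits can be used to flag OCR misreads. The sketch below assumes the ICAO Doc 9303 convention (weights 7, 3, 1 repeating; digits keep their value, `A`-`Z` map to 10-35, `<` counts as 0; the check digit is the weighted sum modulo 10). The helper names are hypothetical, and the sample line is a commonly reproduced specimen rather than output from this notebook. The slice offsets match those used for Type 3 documents.

```python
def mrz_char_value(ch):
    """Map one MRZ character to its numeric value (assumed ICAO 9303 rule)."""
    if '0' <= ch <= '9':
        return int(ch)
    if 'A' <= ch <= 'Z':
        return ord(ch) - ord('A') + 10
    if ch == '<':
        return 0
    raise ValueError(f'Unexpected MRZ character: {ch!r}')

def mrz_check_digit(field):
    """Weighted sum of the field's characters (weights 7, 3, 1), modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10

# Second line of a specimen Type 3 MRZ (44 characters).
line2 = 'L898902C36UTO7408122F1204159ZE184226B<<<<<10'

# (field value, printed check digit) pairs, using the Type 3 offsets.
fields = {
    'passport_number': (line2[0:9], line2[9]),
    'date_of_birth': (line2[13:19], line2[19]),
    'expiration_date': (line2[21:27], line2[27]),
}

checks = {
    name: mrz_check_digit(value) == int(digit)
    for name, (value, digit) in fields.items()
}
print(checks)
# {'passport_number': True, 'date_of_birth': True, 'expiration_date': True}
```

A `False` entry would suggest that at least one character in that field, or its check digit, was misread.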


In [None]:
def parse_date(string, fix_year=False):
    """
    Parses a date string and optionally fixes the year if it's in the future.

    Parameters:
    - string (str): The date string to parse.
    - fix_year (bool): If True, adjusts the year if it is greater than the current year.

    Returns:
    - str: The parsed date in 'DD/MM/YYYY' format.
    """
    try:
        date = parser.parse(string, yearfirst=True).date()
        if fix_year and date.year > datetime.datetime.now().year:
            date = date.replace(year=date.year - 100)
        return date.strftime('%d/%m/%Y')
    except ValueError as e:
        print(f"Error parsing date: {e}")
        return None

def clean(string):
    """
    Cleans a string by removing all non-alphanumeric characters and converting to uppercase.

    Parameters:
    - string (str): The string to clean.

    Returns:
    - str: The cleaned string.
    """
    return ''.join([i for i in string if i.isalnum()]).upper()

def get_gender(code):
    """
    Returns the normalized gender code from a given input.

    Parameters:
    - code (str): The gender code to interpret. Expected to be 'M', 'F', or other values.

    Returns:
    - str: 'M' for male, 'F' for female, or an empty string if the code is not recognized.
    """
    normalized_code = code.upper()
    return normalized_code if normalized_code in ['M', 'F'] else ''

def print_data(data):
    """
    Prints the key-value pairs from a dictionary in a formatted manner.

    Parameters:
    - data (dict): A dictionary containing the data to be printed.
    """
    for key, value in data.items():
        formatted_key = key.replace('_', ' ').capitalize()
        print(f'{formatted_key}\t:\t{value}')

def process_mrz_type1(lines):
    """
    Processes MRZ lines of type 1 to extract user information.

    Parameters:
    - lines (list): List of MRZ lines extracted from the image.

    Returns:
    - dict: Extracted user information.
    """
    user_info = {}

    # Process Row 1
    user_info['document_type'] = clean(lines[0][0:1])
    user_info['document_type'] += clean(lines[0][1:2])  # Type character (e.g., I, A, or C)
    user_info['issuing_country'] = clean(lines[0][2:5])  # Issuing Country (ISO 3166-1 code)
    user_info['document_number'] = clean(lines[0][5:14])  # Document Number
    # Skip Check Digit over Document Number (position 15) and Optional (16-30)

    # Process Row 2
    user_info['date_of_birth'] = parse_date(lines[1][0:6], fix_year=True)  # Date of Birth (YYMMDD)
    user_info['gender'] = get_gender(lines[1][7:8])  # Sex (M, F, or <)
    user_info['expiration_date'] = parse_date(lines[1][8:14])  # Expiration Date (YYMMDD)
    user_info['nationality'] = clean(lines[1][15:18])  # Nationality
    # Skip Check Digit over Expiration Date (position 15) and Optional1 (19-29)

    # Process Row 3
    names = lines[2].replace('<', ' ').strip().split()
    user_info['surname'] = names[0] if names else ''
    user_info['name'] = ' '.join(names[1:]) if len(names) > 1 else ''

    return user_info

def process_mrz_type2(lines):
    """
    Processes MRZ lines of type 2 to extract user information.

    Parameters:
    - lines (list): List of MRZ lines extracted from the image.

    Returns:
    - dict: Extracted user information.
    """
    user_info = {}
    user_info['document_type'] = clean(lines[0][0:2])
    user_info['issuing_country'] = clean(lines[0][2:5])
    names = lines[0][5:].replace('<', ' ').split()
    user_info['surname'] = names[0] if names else ''
    user_info['name'] = ' '.join(names[1:]) if len(names) > 1 else ''
    user_info['document_number'] = clean(lines[1][0:9])
    user_info['nationality'] = clean(lines[1][10:13])
    user_info['date_of_birth'] = parse_date(lines[1][13:19], fix_year=True)
    user_info['gender'] = get_gender(lines[1][20])
    user_info['expiration_date'] = parse_date(lines[1][21:27])
    user_info['personal_number'] = clean(lines[1][28:35])
    return user_info

def process_mrz_type3(lines):
    """
    Processes MRZ lines of type 3 to extract user information.

    Parameters:
    - lines (list): List of MRZ lines extracted from the image.

    Returns:
    - dict: Extracted user information.
    """
    user_info = {}
    user_info['passport_type'] = clean(lines[0][0:2])
    user_info['issuing_country'] = clean(lines[0][2:5])
    names = lines[0][5:44].replace('<', ' ').split()
    user_info['surname'] = names[0] if names else ''
    user_info['name'] = ' '.join(names[1:]) if len(names) > 1 else ''
    user_info['passport_number'] = clean(lines[1][0:9])
    user_info['nationality'] = clean(lines[1][10:13])
    user_info['date_of_birth'] = parse_date(lines[1][13:19], fix_year=True)
    user_info['gender'] = get_gender(lines[1][20])
    user_info['expiration_date'] = parse_date(lines[1][21:27])
    user_info['personal_number'] = clean(lines[1][28:42])
    return user_info

def ocr(img_name):
    """
    Extracts and processes personal information from the passport image.

    Parameters:
    - img_name (str): Path to the passport image.

    Returns:
    - dict: Extracted personal information.
    """
    user_info = {}
    temp_image_path = 'tmp.png'

    # Extract MRZ from image
    mrz = read_mrz(img_name, save_roi=True)
    if not mrz:
        print(f'Machine cannot read image {img_name}.')
        return user_info

    # Save and process MRZ image
    mpimg.imsave(temp_image_path, mrz.aux['roi'], cmap='gray')
    img = cv2.imread(temp_image_path)
    img = cv2.resize(img, None, fx=2, fy=2)  # Increase resolution for better OCR

    # Define allowed characters for OCR
    allowlist = st.ascii_letters + st.digits + '< '
    lines = reader.readtext(img, paragraph=False, detail=0, allowlist=allowlist)

    # Process MRZ lines based on their structure.
    # Check the longer two-line format first: any line of 44+ characters also
    # satisfies the 36-character test, so the reverse order never reaches type 3.
    if len(lines) == 3 and all(len(line) >= 30 for line in lines):
        user_info = process_mrz_type1(lines)
    elif len(lines) == 2 and all(len(line) >= 44 for line in lines):
        user_info = process_mrz_type3(lines)
    elif len(lines) == 2 and all(len(line) >= 36 for line in lines):
        user_info = process_mrz_type2(lines)
    else:
        print("Unrecognized MRZ format")

    # Clean up temporary image file
    os.remove(temp_image_path)
    return user_info


# Example Usage of MRZ OCR Function

The `ocr` function is designed to extract personal information from passport images by specifically reading and interpreting the MRZ (Machine Readable Zone) text. It returns structured data related to personal information found in the MRZ area.

**Usage**

- To use the OCR functionality, call the `ocr` function with the path to the image file. This function will process the MRZ area of the image to extract personal details.
- The extracted personal information will be returned as a structured dictionary. The MRZ image will be saved temporarily, processed, and cleaned up after extraction.

**Note:** Uncomment the following code block to use the ocr function. Replace 'image_path' with the actual path to your image file.


In [None]:
#img_name = 'image_path'
#%time data = ocr(img_name)
#print_data(data)

# **Testing Framework:**

## Accuracy Calculation Functions

These functions are designed to evaluate the performance of OCR (Optical Character Recognition) systems by comparing the recognized text with the ground truth text. They calculate and print both character and word accuracies.
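
The character accuracy used in this notebook compares the two strings position by position, so one dropped or inserted character shifts everything after it and lowers the score sharply. The sketch below contrasts that with an alignment-based similarity from `difflib.SequenceMatcher`. This is an alternative measure, not the notebook's metric, and the function names `positional_accuracy` and `aligned_similarity` are hypothetical.

```python
import difflib

def positional_accuracy(gt, rec):
    """Percentage of ground-truth characters matched at the same index."""
    if not gt:
        return 0.0
    matches = sum(1 for a, b in zip(gt, rec) if a == b)
    return matches / len(gt) * 100

def aligned_similarity(gt, rec):
    """Similarity in percent after aligning the strings with difflib."""
    return difflib.SequenceMatcher(None, gt, rec).ratio() * 100

gt = 'ERIKSSON'
rec = 'ERIKSON'  # OCR dropped one 'S'

print(f'Positional: {positional_accuracy(gt, rec):.1f}%')  # Positional: 62.5%
print(f'Aligned:    {aligned_similarity(gt, rec):.1f}%')
```

The positional score treats every character after the missing `S` as wrong, while the aligned score stays high because only one character differs. Which measure fits better depends on whether shifted characters should count as errors.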



In [None]:
def character_accuracy(gt, rec):
    """
    Calculate character accuracy between ground truth and recognized text.

    Args:
        gt (str): Ground truth text.
        rec (str): Recognized text.

    Returns:
        float: Character accuracy percentage.
    """
    correct_chars = sum(1 for gt_char, rec_char in zip(gt, rec) if gt_char == rec_char)
    total_chars = len(gt)
    return (correct_chars / total_chars) * 100 if total_chars > 0 else 0

def word_accuracy(gt, rec):
    """
    Calculate word accuracy between ground truth and recognized text.

    Args:
        gt (str): Ground truth text.
        rec (str): Recognized text.

    Returns:
        float: Word accuracy percentage.
    """
    gt_words = gt.split()
    rec_words = rec.split()
    correct_words = sum(1 for gt_word, rec_word in zip(gt_words, rec_words) if gt_word == rec_word)
    total_words = len(gt_words)
    return (correct_words / total_words) * 100 if total_words > 0 else 0

def calculate_accuracies(gt_dict, rec_dict):
    """
    Calculate and print character and word accuracies for each field in the ground truth and recognized dictionaries.

    Args:
        gt_dict (dict): Dictionary with ground truth text.
        rec_dict (dict): Dictionary with recognized text.
    """
    char_acc_total = 0
    word_acc_total = 0
    fields_count = 0  # Count of non-empty fields

    for key in gt_dict:
        gt = (gt_dict[key] or "").strip().lower()
        # Recognized values may be missing or None (e.g. an unparseable date)
        rec = (rec_dict.get(key) or "").strip().lower()

        # Skip empty ground truth fields
        if gt == "":
            continue

        # Calculate accuracies
        char_acc = character_accuracy(gt, rec)
        word_acc = word_accuracy(gt, rec)
        char_acc_total += char_acc
        word_acc_total += word_acc
        fields_count += 1

        # Print field-wise accuracy
        print(f"{key}:")
        print(f"  Ground Truth: {gt}")
        print(f"  Recognized: {rec}")
        print(f"  Character Accuracy: {char_acc:.2f}%")
        print(f"  Word Accuracy: {word_acc:.2f}%\n")

    # Print overall accuracies
    if fields_count > 0:
        overall_char_acc = char_acc_total / fields_count
        overall_word_acc = word_acc_total / fields_count

        print(f"Overall Character Accuracy: {overall_char_acc:.2f}%")
        print(f"Overall Word Accuracy: {overall_word_acc:.2f}%")
    else:
        print("No non-empty fields to calculate accuracy.")



## Gathering Actual Data from User Input

To evaluate the OCR results, you need to compare them with the actual data. Use the following code snippet to collect the actual data from the user.

### Code Explanation

The code snippet prompts the user to enter various pieces of information related to the passport or document. Each input is converted to uppercase (where applicable) and stored in a dictionary called `actual_data`.

In [None]:
# Get actual data from user input
actual_data = {}
print("Enter the actual data:")
actual_data['name'] = input("Name: ").upper()
actual_data['surname'] = input("Surname: ").upper()
actual_data['gender'] = input("Gender (M/F): ").upper()
actual_data['date_of_birth'] = input("Date of Birth (DD/MM/YYYY): ")
actual_data['nationality'] = input("Nationality: ").upper()
actual_data['passport_type'] = input("Passport Type: ").upper()
actual_data['passport_number'] = input("Passport Number: ")
actual_data['issuing_country'] = input("Issuing Country: ").upper()
actual_data['expiration_date'] = input("Expiration Date (DD/MM/YYYY): ")
actual_data['personal_number'] = input("Personal Number: ")

## Calculate and Display Accuracies

After gathering the actual data and extracting the OCR results, you can calculate and display the accuracy of the OCR results compared to the actual data. Use the `calculate_accuracies` function to compute both character and word accuracies for each field and overall.


In [None]:
# Calculate accuracies (`data` is the dictionary returned by ocr() in the MRZ example cell above)
calculate_accuracies(actual_data, data)

# **Comprehensive OCR System**:

### Imports and Initialization

1. **Imports**:
    - `cv2`, `numpy`, `easyocr`, `json`, `nltk`, `matplotlib.pyplot`: Libraries for image processing, OCR, natural language processing, and visualization.
    - `SpellChecker` from `spellchecker`: For spell checking.
    - `word_tokenize` and `pos_tag` from `nltk.tokenize` and `nltk.tag`: For text tokenization and part-of-speech tagging.

2. **NLTK Downloads**:
    - Downloads necessary NLTK data for tokenization and POS tagging.

3. **Spell Checker Initialization**:
    - Initializes the spell checker to correct spelling errors.

### Functions

1. **`show_image(title, image)`**:
    - **Purpose**: Displays an image with a given title.
    - **Parameters**:
        - `title`: Title of the image window.
        - `image`: Image data in numpy array format.
    - **Functionality**:
        - Uses `matplotlib` to display the image with a title.
        - Converts the image from BGR to RGB for correct color representation.

2. **`correct_text(text)`**:
    - **Purpose**: Corrects spelling in the provided text, ignoring proper nouns and numbers.
    - **Parameters**:
        - `text`: The text to be corrected.
    - **Functionality**:
        - Tokenizes the text into words.
        - Tags each word with its part-of-speech (POS).
        - Uses the spell checker to correct misspelled words, except for proper nouns and numbers.

3. **`preprocess_image(img_path)`**:
    - **Purpose**: Preprocesses the image to prepare it for OCR.
    - **Parameters**:
        - `img_path`: Path to the image file.
    - **Functionality**:
        - Loads the image using `cv2.imread`.
        - Resizes the image if its dimensions exceed a maximum value (`max_dim`).
        - Converts the image to grayscale to simplify OCR processing.
    - **Returns**:
        - `gray_image`: The preprocessed grayscale image.
        - `original_image`: The original color image.

4. **`ocr2(img_path)`**:
    - **Purpose**: Extracts and corrects text from an image using OCR, then draws bounding boxes around detected text.
    - **Parameters**:
        - `img_path`: Path to the image file.
    - **Functionality**:
        - **Preprocessing**: Calls `preprocess_image` to get the grayscale and original images.
        - **OCR Processing**: Initializes the `easyocr.Reader` and uses it to detect text in the grayscale image.
        - **Text Correction and Annotation**:
            - For each detected text segment, corrects the text using `correct_text`.
            - Draws a bounding box around the detected text in the original image.
            - Adds the corrected text and bounding box data to a JSON list.
        - **Display**: Uses `show_image` to display the original image with annotated text.
    - **Returns**:
        - `json_data`: A list of dictionaries containing the detected text, bounding boxes, and confidence scores.
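
The resizing rule in `preprocess_image` can be sketched without OpenCV. This is a minimal illustration of the arithmetic only; the function name `scaled_size` is hypothetical, and the default `max_dim=1000` mirrors the value used in the notebook.

```python
def scaled_size(width, height, max_dim=1000):
    """Return (new_width, new_height) so the longest side is at most max_dim.

    The aspect ratio is preserved; images that already fit are unchanged.
    """
    longest = max(width, height)
    if longest <= max_dim:
        return width, height
    scale = max_dim / float(longest)
    # int() truncates, so the shorter side may lose a fraction of a pixel.
    return int(width * scale), int(height * scale)

print(scaled_size(2000, 1500))  # (1000, 750)
print(scaled_size(800, 600))    # (800, 600)
```

The resulting tuple is in `(width, height)` order, which is the order `cv2.resize` expects for its size argument.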

### Summary

1. **Image Preprocessing**: Resize and convert the image to grayscale for OCR.
2. **OCR Processing**: Detect text from the preprocessed image.
3. **Text Correction**: Improve the detected text by correcting spelling errors.
4. **Annotation**: Draw bounding boxes around detected text on the original image.
5. **Display and Output**: Show the annotated image and return the JSON data with text, bounding boxes, and confidence scores.
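
The shape of one output record can be sketched with a made-up detection. The `sample_result` below is a hypothetical `(bbox, text, confidence)` triple in the form that `ocr2` unpacks from `reader.readtext`; the values are invented for illustration, and `to_record` is a hypothetical helper, not a function from the notebook.

```python
import json

# Hypothetical detection: four corner points, the text, and a confidence score.
sample_result = (
    [[10.0, 20.0], [110.0, 20.0], [110.0, 50.0], [10.0, 50.0]],
    'Hello world',
    0.97,
)

def to_record(result):
    """Convert one (bbox, text, confidence) triple into a JSON-friendly dict."""
    bbox, text, prob = result
    # Corner coordinates are cast to int, as done before drawing the boxes.
    corners = [tuple(map(int, point)) for point in bbox]
    return {"text": text, "bounding_box": corners, "confidence": float(prob)}

record = to_record(sample_result)
serialized = json.dumps([record], indent=2)
print(serialized)
```

Note that `json.dumps` writes the corner tuples as JSON arrays, so they come back as lists when the output is parsed again.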

This code helps in processing images, extracting text, and providing an annotated view of the detected text, making it suitable for tasks like verifying OCR accuracy.


In [None]:
import cv2
import numpy as np
import easyocr
import json
import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from spellchecker import SpellChecker
import matplotlib.pyplot as plt

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

# Initialize the spell checker
spell = SpellChecker()

def show_image(title, image):
    """
    Displays an image with a title.

    Parameters:
    - title (str): Title of the image.
    - image (numpy.ndarray): The image to display.
    """
    plt.figure(figsize=(10, 10))
    plt.title(title)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB) if len(image.shape) == 3 else image, cmap='gray')
    plt.axis('off')
    plt.show()

def correct_text(text):
    """
    Corrects the text using spell checker, except for proper nouns and numbers.

    Parameters:
    - text (str): The text to correct.

    Returns:
    - str: The corrected text.
    """
    # Tokenize the text
    tokens = word_tokenize(text)

    # Perform part-of-speech tagging
    tagged = pos_tag(tokens)

    corrected_words = []
    for word, tag in tagged:
        # Only correct words that are not proper nouns (NNP, NNPS) or numbers (CD)
        if tag not in ('NNP', 'NNPS', 'CD'):
            corrected_word = spell.correction(word)
            corrected_words.append(corrected_word if corrected_word else word)
        else:
            corrected_words.append(word)

    return ' '.join(corrected_words)

def preprocess_image(img_path):
    """
    Preprocesses the image for OCR.

    Parameters:
    - img_path (str): Path to the image file.

    Returns:
    - gray_image (numpy.ndarray): The resized grayscale image used for OCR.
    - image (numpy.ndarray): The resized color image used for annotation.
    """
    # Load and resize the image
    image = cv2.imread(img_path)

    if image is None:
        raise ValueError(f"Error: Could not load image {img_path}")

    # Resize the image
    height, width = image.shape[:2]
    max_dim = 1000  # Resize to a maximum dimension of 1000 pixels
    if max(height, width) > max_dim:
        scale = max_dim / float(max(height, width))
        new_size = (int(width * scale), int(height * scale))
        image = cv2.resize(image, new_size, interpolation=cv2.INTER_AREA)

    # Convert to grayscale
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return gray_image, image

def ocr2(img_path):
    """
    Processes an image to extract and correct text using OCR.

    Parameters:
    - img_path (str): Path to the image file.

    Returns:
    - json_data (list): List of dictionaries containing text, bounding boxes, and confidence scores.
    """
    gray_image, original_image = preprocess_image(img_path)

    # Initialize the OCR reader
    reader = easyocr.Reader(['en'])

    # Read text from preprocessed image
    results = reader.readtext(gray_image)

    json_data = []

    if results:
        # Draw bounding boxes and corrected text on the original image
        for (bbox, text, prob) in results:
            corrected_text = correct_text(text)

            # Draw bounding box
            p0, p1, p2, p3 = bbox
            p0 = tuple(map(int, p0))
            p1 = tuple(map(int, p1))
            p2 = tuple(map(int, p2))
            p3 = tuple(map(int, p3))

            cv2.rectangle(original_image, p0, p2, (0, 255, 0), 2)

            # Put corrected text label
            cv2.putText(original_image, corrected_text, p0, cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)

            # Add data to JSON list
            json_data.append({
                "text": corrected_text,
                "bounding_box": [p0, p1, p2, p3],
                "confidence": prob
            })

        # Display the image with bounding boxes
        show_image("Image with Bounding Boxes", original_image)

    else:
        print("No text detected by OCR.")

    # Return JSON data
    return json_data


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


**Usage:**

To use the OCR functionality, call the `ocr2` function with the path to the image file.
The processed image with bounding boxes will be displayed, and the recognized text, bounding box coordinates, and confidence scores will be returned as a list of dictionaries that can be serialized to JSON.

**Note:** Uncomment the following code block to use the `ocr2` function. Replace 'image_path' with the actual path to your image file.

In [None]:
# Example usage

#img_path = 'image_path'  # Change to your image path
#%time results = ocr2(img_path)
#print(json.dumps(results, indent=2))