# Intelligent Document Processing System Demo

This notebook demonstrates how to use the Intelligent Document Processing System to extract information from documents using OCR and computer vision techniques.

## Overview

1. Setting up and configuring the system
2. Loading and preprocessing images
3. Performing OCR and extracting text
4. Detecting and analyzing regions of interest
5. Extracting structured data
6. Generating reports

In [None]:
# Import required libraries
import os
import yaml
import cv2
import numpy as np
from pathlib import Path
from src.ocr.processor import DocumentProcessor
from src.vision.contour_detector import ContourDetector
from src.ocr.data_extractor import DataExtractor
from src.ocr.report_generator import ReportGenerator

# Load configuration
with open('../config.yaml', 'r') as f:
    config = yaml.safe_load(f)

## Creating a Sample Document

To demonstrate the system, we render several lines of text onto a blank white image with OpenCV and save it to disk for processing.

In [None]:
# Create a sample document image
def create_sample_document():
    # Create a white grayscale canvas: 800 rows (height) x 600 columns (width)
    img = np.full((800, 600), 255, dtype=np.uint8)
    
    # Add some text
    font = cv2.FONT_HERSHEY_SIMPLEX
    text_items = [
        ('Technical Document', (50, 50), 1.5),
        ('Document ID: TECH-2025-001', (50, 100), 1),
        ('Date: 02/10/2025', (50, 150), 1),
        ('Contact: info@example.com', (50, 200), 1),
        ('Phone: +1-234-567-8900', (50, 250), 1),
        ('Measurements:', (50, 300), 1),
        ('Length: 123.45 mm', (70, 350), 0.8),
        ('Width: 45.67 cm', (70, 400), 0.8),
        ('Weight: 892.3 g', (70, 450), 0.8),
        ('Total Cost: $1,234.56', (50, 500), 1)
    ]
    
    for text, pos, scale in text_items:
        cv2.putText(img, text, pos, font, scale, (0, 0, 0), 2)
    
    return img

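# Create and save the sample document
sample_img = create_sample_document()
sample_path = '../data/input/sample_doc.png'
os.makedirs(os.path.dirname(sample_path), exist_ok=True)
# cv2.imwrite returns False when the write fails, so check the result
if not cv2.imwrite(sample_path, sample_img):
    raise IOError(f"Could not write {sample_path}")

# Display the image (display is imported explicitly because later cells call it)
from IPython.display import Image, display
display(Image(filename=sample_path))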

## Document Processing Pipeline

Now let's process the document through our pipeline:
1. OCR processing
2. Contour detection
3. Data extraction
4. Report generation
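
The contour-detection stage in the next cell is configured with `min_area` and `max_area`. As a rough illustration of what such bounds usually do, the sketch below filters candidate bounding boxes by area. It is an assumption about the behaviour, not the actual `ContourDetector` code; `filter_regions_by_area` and the sample boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h), the layout printed for each region later on


def filter_regions_by_area(boxes: List[Box], min_area: int, max_area: int) -> List[Box]:
    """Keep boxes whose w * h lies within [min_area, max_area] (hypothetical helper)."""
    return [b for b in boxes if min_area <= b[2] * b[3] <= max_area]


# Hypothetical candidates: a speck of noise, a text line, and a full-page box
candidates = [(5, 5, 3, 3), (50, 30, 400, 40), (0, 0, 600, 800)]
kept = filter_regions_by_area(candidates, min_area=100, max_area=100_000)
print(kept)
```

With bounds like these, tiny specks and page-sized boxes are discarded and only text-sized regions remain. A real detector may measure the contour's enclosed area rather than its bounding box.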

In [None]:
# Initialize components
doc_processor = DocumentProcessor(ocr_engine=config['ocr']['engine'])
contour_detector = ContourDetector(
    min_area=config['image']['contours']['min_area'],
    max_area=config['image']['contours']['max_area']
)
data_extractor = DataExtractor()
report_generator = ReportGenerator(config['paths']['output_dir'])

# Process the document
print("1. Processing document with OCR...")
text = doc_processor.process(sample_path)
print("\nExtracted text:")
print("-" * 50)
print(text)
print("-" * 50)

In [None]:
# Detect and extract regions
print("\n2. Detecting regions of interest...")
regions = contour_detector.process_image(sample_path, visualize=True)
print(f"Found {len(regions)} regions of interest")

# Display some regions
for i, (roi, coords) in enumerate(regions[:3]):
    print(f"\nRegion {i+1} coordinates (x, y, w, h):", coords)
    
    # Save and display ROI
    roi_path = f'../data/input/roi_{i}.png'
    cv2.imwrite(roi_path, roi)
    display(Image(filename=roi_path))

In [None]:
# Extract structured data
print("\n3. Extracting structured data...")
extracted_data = data_extractor.process_text(text)

print("\nExtracted Data:")
print("-" * 50)
for key, values in extracted_data.items():
    if isinstance(values, list) and key != 'tables':
        print(f"{key}:")
        for value in values:
            print(f"  - {value}")
print("-" * 50)

In [None]:
# Generate reports
print("\n4. Generating reports...")
output_files = report_generator.generate_reports(
    extracted_data,
    base_filename="sample_doc"
)

print("\nGenerated Reports:")
print("-" * 50)
for report_type, file_path in output_files.items():
    print(f"{report_type}: {file_path}")
print("-" * 50)

# Display the visualization if available
if 'visualization' in output_files:
    display(Image(filename=output_files['visualization']))

## Conclusion

This notebook demonstrated the complete pipeline of the Intelligent Document Processing System:

1. Created a sample document containing several types of information
2. Ran OCR on it to extract text
3. Detected and extracted regions of interest using computer vision
4. Extracted structured data using pattern matching
5. Generated comprehensive reports in multiple formats
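
The pattern-matching step can be pictured as a few regular expressions run over the OCR text. This is a minimal sketch under assumptions: the `PATTERNS` table, its keys, and `extract_fields` are illustrative and are not taken from `DataExtractor`, and the input is the text drawn onto the sample document, assuming OCR reads it back cleanly.

```python
import re
from typing import Dict, List

# Hypothetical patterns; the real DataExtractor may use different ones
PATTERNS: Dict[str, str] = {
    "emails": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
    "dates": r"\b\d{2}/\d{2}/\d{4}\b",
    "measurements": r"\b\d+(?:\.\d+)?\s?(?:mm|cm|g)\b",
    "amounts": r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?",
}


def extract_fields(text: str) -> Dict[str, List[str]]:
    """Return every non-overlapping match for each named pattern."""
    return {name: re.findall(pattern, text) for name, pattern in PATTERNS.items()}


sample_text = """Date: 02/10/2025
Contact: info@example.com
Length: 123.45 mm
Width: 45.67 cm
Weight: 892.3 g
Total Cost: $1,234.56"""

fields = extract_fields(sample_text)
for name, values in fields.items():
    print(f"{name}: {values}")
```

Keeping the patterns in a single dictionary means a new field type only needs one new entry. Real OCR output is noisier than this input, so production patterns usually need to be more tolerant.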

The system can be adapted to other document types by:
- Adjusting the configuration parameters in `config.yaml`
- Adding new patterns for data extraction
- Customizing the report generation formats and layouts
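
Before adjusting `config.yaml`, it can help to confirm that it still contains the keys this notebook reads. The sketch below checks a loaded config dictionary for those four key paths. `REQUIRED_KEYS`, `missing_keys`, and the example values are hypothetical; only the key names come from the cells above.

```python
from typing import Any, Dict, List, Tuple

# Key paths read in this notebook; the values in example_config are placeholders
REQUIRED_KEYS: List[Tuple[str, ...]] = [
    ("ocr", "engine"),
    ("image", "contours", "min_area"),
    ("image", "contours", "max_area"),
    ("paths", "output_dir"),
]


def missing_keys(config: Dict[str, Any]) -> List[str]:
    """Return the dotted key paths that are absent from config (hypothetical helper)."""
    missing = []
    for path in REQUIRED_KEYS:
        node: Any = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


example_config = {
    "ocr": {"engine": "tesseract"},  # placeholder engine name
    "image": {"contours": {"min_area": 100}},  # max_area left out on purpose
    "paths": {"output_dir": "../data/output"},  # placeholder path
}
print(missing_keys(example_config))
```

Running a check like this right after `yaml.safe_load` reports a missing setting by name, which is easier to act on than a `KeyError` raised partway through the pipeline.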