# 📸 Image Processing Pipeline - Step by Step

This notebook demonstrates the complete image processing pipeline used in **PrintChakra**, from raw image input to final processed output.

## Overview
- **Input**: `image.jpg` (raw document photo)
- **Output**: Processed, enhanced document image
- **Pipeline Steps**: 12 stages of image processing
- **Each step shows**: Input → Processing → Output → Debug Info

## Requirements
- OpenCV (cv2)
- NumPy
- Matplotlib
- PIL/Pillow
- pytesseract (for OCR)

Let's begin! 🚀

## Step 1: Import Required Libraries

Import all necessary libraries for image processing, visualization, and OCR.

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import pytesseract
import os
import sys
from datetime import datetime

# Configure matplotlib for inline display
%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 6)

# Print library versions
print("=" * 60)
print("📦 LIBRARY VERSIONS")
print("=" * 60)
print(f"OpenCV version: {cv2.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Python version: {sys.version.split()[0]}")
print("=" * 60)
print("✅ All libraries imported successfully!")
print("=" * 60)

## Step 2: Load and Display Input Image

Load the input image `image.jpg` from the current directory and verify it was loaded successfully.

**Expected Input**: File path to `image.jpg`  
**Expected Output**: Loaded image as NumPy array with shape (height, width, channels)

In [None]:
# Input: File path
input_image_path = 'image.jpg'

print("=" * 60)
print("📂 LOADING INPUT IMAGE")
print("=" * 60)
print(f"Input file: {input_image_path}")

# Load image
try:
    original_image = cv2.imread(input_image_path)
    
    if original_image is None:
        raise FileNotFoundError(f"❌ Error: Could not load image from '{input_image_path}'")
    
    # Debug info
    print(f"✅ Image loaded successfully!")
    print(f"Image shape: {original_image.shape} (height, width, channels)")
    print(f"Data type: {original_image.dtype}")
    print(f"File size: {os.path.getsize(input_image_path) / 1024:.2f} KB")
    print(f"Color space: BGR (OpenCV default)")
    
    # Convert BGR to RGB for display
    original_image_rgb = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
    
    # Display
    plt.figure(figsize=(10, 8))
    plt.imshow(original_image_rgb)
    plt.title('Original Input Image', fontsize=14, fontweight='bold')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
    
    print("=" * 60)
    
except Exception as e:
    print(f"❌ ERROR: {str(e)}")
    print(f"Please ensure '{input_image_path}' exists in the current directory")
    print("=" * 60)

## Step 3: Convert to Grayscale

Convert the BGR color image to grayscale. This reduces the image from 3 channels (BGR) to 1 channel (grayscale), simplifying further processing.

**Expected Input**: BGR image (height, width, 3)  
**Expected Output**: Grayscale image (height, width)
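
As a rough illustration of what the conversion computes, OpenCV documents `COLOR_BGR2GRAY` as a weighted sum of the three channels: Y = 0.299·R + 0.587·G + 0.114·B. The sketch below applies that formula to single pixels in plain Python. `bgr_to_gray` is a hypothetical helper, and OpenCV's optimized implementation may differ by a rounding step.

```python
def bgr_to_gray(pixel):
    """Weighted luminance of one (B, G, R) pixel, rounded to an integer in 0..255."""
    b, g, r = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# Pixels are written in OpenCV's BGR channel order
samples = {
    "white": (255, 255, 255),
    "blue": (255, 0, 0),
    "green": (0, 255, 0),
    "red": (0, 0, 255),
}
for name, pixel in samples.items():
    print(f"{name:>5}: {bgr_to_gray(pixel)}")
```

Green contributes most to the result and blue least, so a pure green pixel comes out brighter than a pure blue one.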

In [None]:
# Input: BGR color image
print("=" * 60)
print("🎨 GRAYSCALE CONVERSION")
print("=" * 60)
print(f"Input shape: {original_image.shape}")
print(f"Input channels: {original_image.shape[2]}")

# Convert to grayscale
gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)

# Debug info
print(f"Output shape: {gray.shape}")
print(f"Output channels: 1 (grayscale)")
print(f"Data type: {gray.dtype}")
print(f"Value range: [{gray.min()}, {gray.max()}]")
print(f"Mean intensity: {gray.mean():.2f}")
print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(original_image_rgb)
axes[0].set_title('Input: Color Image', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(gray, cmap='gray')
axes[1].set_title('Output: Grayscale Image', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Grayscale conversion complete!")

## Step 4: Apply Gaussian Blur

Apply Gaussian blur to reduce noise and smooth the image. This helps improve edge detection in later steps.

**Expected Input**: Grayscale image  
**Expected Output**: Blurred grayscale image with reduced noise
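
With `sigma = 0`, OpenCV derives the standard deviation from the kernel size. The formula documented for `cv2.getGaussianKernel` is sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8. The plain-Python sketch below evaluates that formula and builds normalized 1D weights. `auto_sigma` and `gaussian_kernel_1d` are hypothetical helper names, and OpenCV may substitute precomputed coefficients for small kernels, so its exact weights can differ.

```python
import math

def auto_sigma(ksize):
    """Sigma derived from the kernel size, per the documented formula."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

def gaussian_kernel_1d(ksize, sigma):
    """Normalized 1D Gaussian weights centered on the middle element."""
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]

sigma_5 = auto_sigma(5)
weights_5 = gaussian_kernel_1d(5, sigma_5)
print(f"sigma for a 5-wide kernel: {sigma_5:.2f}")
print("weights:", [round(w, 4) for w in weights_5])
```

A 2D blur can apply these weights along rows and then along columns, because the Gaussian kernel is separable.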

In [None]:
# Input: Grayscale image
kernel_size = (5, 5)
sigma = 0

print("=" * 60)
print("🌫️  GAUSSIAN BLUR")
print("=" * 60)
print(f"Input shape: {gray.shape}")
print(f"Kernel size: {kernel_size}")
print(f"Sigma: {sigma} (auto-calculated)")

# Apply Gaussian blur
blurred = cv2.GaussianBlur(gray, kernel_size, sigma)

# Debug info
print(f"Output shape: {blurred.shape}")
print(f"Blur effect - before mean: {gray.mean():.2f}, after mean: {blurred.mean():.2f}")
print(f"Standard deviation - before: {gray.std():.2f}, after: {blurred.std():.2f}")
print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(gray, cmap='gray')
axes[0].set_title('Input: Sharp Grayscale', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(blurred, cmap='gray')
axes[1].set_title(f'Output: Blurred (kernel={kernel_size})', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Gaussian blur applied successfully!")

## Step 5: Edge Detection (Canny)

Apply Canny edge detection to find edges in the image. This is crucial for detecting document boundaries.

**Expected Input**: Blurred grayscale image  
**Expected Output**: Binary edge map (white edges on black background)
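
The two thresholds drive Canny's hysteresis stage: gradient magnitudes above the upper threshold are accepted as edges, magnitudes below the lower threshold are rejected, and values in between are kept only when they connect to an accepted edge. Below is a simplified 1D sketch of that rule in plain Python. `hysteresis_1d` and the sample magnitudes are illustrative assumptions; the real algorithm works on 2D gradients after non-maximum suppression.

```python
def hysteresis_1d(magnitudes, low, high):
    """Keep strong values, plus weak values reachable from a strong one."""
    n = len(magnitudes)
    keep = [False] * n

    # Seed with strong edges (above the upper threshold)
    stack = [i for i, m in enumerate(magnitudes) if m > high]
    for i in stack:
        keep[i] = True

    # Grow outward through neighbors that exceed the lower threshold
    while stack:
        i = stack.pop()
        for j in (i - 1, i + 1):
            if 0 <= j < n and not keep[j] and magnitudes[j] > low:
                keep[j] = True
                stack.append(j)

    return [255 if k else 0 for k in keep]

# Hypothetical gradient magnitudes along one scan line
gradient = [10, 60, 160, 70, 20, 80, 30]
print(hysteresis_1d(gradient, low=50, high=150))
```

The value 80 lies between the thresholds but is dropped because it does not touch a strong edge, while 60 and 70 survive because they neighbor the 160.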

In [None]:
# Input: Blurred grayscale image
threshold1 = 50
threshold2 = 150

print("=" * 60)
print("🔍 CANNY EDGE DETECTION")
print("=" * 60)
print(f"Input shape: {blurred.shape}")
print(f"Lower threshold: {threshold1}")
print(f"Upper threshold: {threshold2}")

# Apply Canny edge detection
edges = cv2.Canny(blurred, threshold1, threshold2)

# Debug info
print(f"Output shape: {edges.shape}")
print(f"Output range: [{edges.min()}, {edges.max()}]")
print(f"Edge pixels: {np.count_nonzero(edges)} ({100 * np.count_nonzero(edges) / edges.size:.2f}%)")
print(f"Non-edge pixels: {edges.size - np.count_nonzero(edges)}")
print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(blurred, cmap='gray')
axes[0].set_title('Input: Blurred Image', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(edges, cmap='gray')
axes[1].set_title(f'Output: Detected Edges (t1={threshold1}, t2={threshold2})', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Edge detection complete!")

## Step 6: Binary Thresholding

Convert the grayscale image to a pure binary image (black and white only) using Otsu's thresholding method.

**Expected Input**: Grayscale image  
**Expected Output**: Binary image (0 or 255 values only)
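
Otsu's method picks the threshold automatically by testing every candidate value and keeping the one that best separates the pixels into a dark and a bright class (maximum between-class variance). A minimal plain-Python sketch of that search follows. `otsu_threshold` and the sample values are assumptions for illustration, and OpenCV's implementation may break ties differently.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(256):
        weight_low += hist[t]          # number of pixels with value <= t
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue                   # one class is empty, nothing to compare
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t

# Hypothetical sample: dark ink values followed by bright paper values
sample = [20, 22, 25, 28, 30, 200, 205, 210, 215, 220]
t = otsu_threshold(sample)
binary_sample = [255 if p > t else 0 for p in sample]
print("threshold:", t)
print("binary:", binary_sample)
```

Pixels above the chosen threshold become 255 and the rest become 0, which matches the behaviour of `cv2.THRESH_BINARY`.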

In [None]:
# Input: Grayscale image
print("=" * 60)
print("⚫⚪ BINARY THRESHOLDING (Otsu's Method)")
print("=" * 60)
print(f"Input shape: {gray.shape}")
print(f"Input value range: [{gray.min()}, {gray.max()}]")

# Apply Otsu's thresholding
threshold_value, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Debug info
print(f"Auto-calculated threshold: {threshold_value:.2f}")
print(f"Output shape: {binary.shape}")
print(f"Output values: {np.unique(binary)} (only 0 and 255)")
print(f"White pixels: {np.count_nonzero(binary)} ({100 * np.count_nonzero(binary) / binary.size:.2f}%)")
print(f"Black pixels: {binary.size - np.count_nonzero(binary)} ({100 * (1 - np.count_nonzero(binary) / binary.size):.2f}%)")
print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(gray, cmap='gray')
axes[0].set_title('Input: Grayscale (256 levels)', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(binary, cmap='gray')
axes[1].set_title(f'Output: Binary (threshold={threshold_value:.0f})', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Binary thresholding complete!")

## Step 7: Morphological Operations (Erosion & Dilation)

Clean up the binary image with erosion followed by dilation (a morphological *opening*). Erosion removes small white specks; dilation then restores the surviving white regions to roughly their original size.

**Expected Input**: Binary image  
**Expected Output**: Cleaned binary image with small white noise removed
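
To make the effect concrete, here is a small plain-Python sketch of erosion and dilation with a 3x3 kernel on a hand-written grid. `morph_3x3` and the grid are hypothetical. Neighbors outside the image are ignored, which is meant to mirror OpenCV's default border handling for these operations.

```python
def morph_3x3(img, reduce_fn):
    """Apply min (erosion) or max (dilation) over each pixel's 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the in-bounds neighbors, including the pixel itself
            window = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = reduce_fn(window)
    return out

W = 255
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, W, W, W, 0, 0, 0],
    [0, W, W, W, 0, W, 0],   # the lone W in column 5 is noise
    [0, W, W, W, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
eroded_grid = morph_3x3(grid, min)          # erosion
opened_grid = morph_3x3(eroded_grid, max)   # dilation of the eroded result
for row in opened_grid:
    print("".join("#" if v else "." for v in row))
```

The isolated pixel disappears during erosion and never comes back, while the 3x3 block shrinks to its center and is then restored by dilation.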

In [None]:
# Input: Binary image
kernel_size = (3, 3)
iterations = 1

print("=" * 60)
print("🔨 MORPHOLOGICAL OPERATIONS")
print("=" * 60)
print(f"Input shape: {binary.shape}")
print(f"Kernel size: {kernel_size}")
print(f"Iterations: {iterations}")

# Create kernel
kernel = np.ones(kernel_size, np.uint8)

# Erosion (removes small white noise)
eroded = cv2.erode(binary, kernel, iterations=iterations)
print(f"\n1️⃣ Erosion complete")
print(f"   White pixels before: {np.count_nonzero(binary)}")
print(f"   White pixels after: {np.count_nonzero(eroded)}")
print(f"   Pixels removed: {np.count_nonzero(binary) - np.count_nonzero(eroded)}")

# Dilation (fills small black gaps)
dilated = cv2.dilate(eroded, kernel, iterations=iterations)
print(f"\n2️⃣ Dilation complete")
print(f"   White pixels before: {np.count_nonzero(eroded)}")
print(f"   White pixels after: {np.count_nonzero(dilated)}")
print(f"   Pixels added: {np.count_nonzero(dilated) - np.count_nonzero(eroded)}")

print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

axes[0].imshow(binary, cmap='gray')
axes[0].set_title('Input: Original Binary', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(eroded, cmap='gray')
axes[1].set_title('After Erosion (noise removed)', fontsize=12, fontweight='bold')
axes[1].axis('off')

axes[2].imshow(dilated, cmap='gray')
axes[2].set_title('After Dilation (gaps filled)', fontsize=12, fontweight='bold')
axes[2].axis('off')

plt.tight_layout()
plt.show()

print("✅ Morphological operations complete!")

## Step 8: Contour Detection

Find and draw contours (outlines) in the Canny edge map from Step 5. The largest contours are candidates for the document boundary.

**Expected Input**: Binary edge map from Step 5  
**Expected Output**: List of contours and image with contours drawn
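
`cv2.contourArea` is documented as computing area with Green's formula over the contour's vertices, and `cv2.arcLength(cnt, True)` measures the closed outline. For a simple polygon, that area calculation reduces to the shoelace formula. A plain-Python sketch, with hypothetical helpers `polygon_area` and `polygon_perimeter`:

```python
import math

def polygon_area(points):
    """Area of a closed polygon from its vertices (shoelace formula)."""
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

def polygon_perimeter(points):
    """Length of the closed outline through the vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

# A 4 x 3 rectangle given by its corner points
rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
print("area:", polygon_area(rect))
print("perimeter:", polygon_perimeter(rect))
```

Because the area comes from vertices rather than filled pixels, thin or open edge fragments can report an area close to zero even when they are long, so sorting by area tends to favor closed outlines.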

In [None]:
# Input: Binary/edge image
print("=" * 60)
print("📐 CONTOUR DETECTION")
print("=" * 60)
print(f"Input shape: {edges.shape}")

# Find contours ([-2:] keeps this working on OpenCV 3.x, which returns three values)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Debug info
print(f"Total contours found: {len(contours)}")

# Sort by area (largest first)
sorted_contours = sorted(contours, key=cv2.contourArea, reverse=True)

# Display info about top 5 largest contours
print(f"\nTop 5 largest contours:")
for i, cnt in enumerate(sorted_contours[:5]):
    area = cv2.contourArea(cnt)
    perimeter = cv2.arcLength(cnt, True)
    print(f"  {i+1}. Area: {area:.0f} px², Perimeter: {perimeter:.0f} px")

print("=" * 60)

# Draw contours on color image
contour_image = original_image.copy()
cv2.drawContours(contour_image, sorted_contours[:10], -1, (0, 255, 0), 3)
contour_image_rgb = cv2.cvtColor(contour_image, cv2.COLOR_BGR2RGB)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(edges, cmap='gray')
axes[0].set_title('Input: Edge Map', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(contour_image_rgb)
axes[1].set_title(f'Output: Top 10 Contours (total: {len(contours)})', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Contour detection complete!")

## Step 9: Image Resizing and Scaling

Resize the image to specific dimensions. This is useful for standardizing output or reducing file size.

**Expected Input**: Grayscale image from Step 3  
**Expected Output**: Resized image with new dimensions
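
The target height follows from the original aspect ratio. A minimal sketch of that calculation, using a hypothetical helper `fit_to_width` and made-up image sizes (it rounds to the nearest pixel, whereas the cell below truncates with `int`):

```python
def fit_to_width(width, height, target_width):
    """Return (new_width, new_height) that preserves the aspect ratio."""
    new_height = round(target_width * height / width)
    return target_width, max(1, new_height)   # never return a zero height

print(fit_to_width(4000, 3000, 800))   # landscape photo
print(fit_to_width(2480, 3508, 800))   # portrait page scan
```

Note that `cv2.resize` expects its size argument as `(width, height)`, the reverse of NumPy's `(height, width)` shape order. OpenCV's documentation also suggests `cv2.INTER_AREA` when shrinking an image, so it may be worth comparing against `INTER_LINEAR` for downscaled documents.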

In [None]:
# Input: Grayscale image
target_width = 800  # Output width in pixels; height follows from the aspect ratio

print("=" * 60)
print("📏 IMAGE RESIZING")
print("=" * 60)
print(f"Original dimensions: {gray.shape[1]}x{gray.shape[0]} (width x height)")

# Calculate new dimensions maintaining aspect ratio
aspect_ratio = gray.shape[0] / gray.shape[1]
target_height = int(target_width * aspect_ratio)

print(f"Target width: {target_width}px")
print(f"Calculated height: {target_height}px (maintaining aspect ratio)")
print(f"Interpolation method: INTER_LINEAR")

# Resize image
resized = cv2.resize(gray, (target_width, target_height), interpolation=cv2.INTER_LINEAR)

# Debug info
original_pixels = gray.shape[0] * gray.shape[1]
resized_pixels = resized.shape[0] * resized.shape[1]
scale_factor = resized_pixels / original_pixels

print(f"\nOutput dimensions: {resized.shape[1]}x{resized.shape[0]}")
print(f"Original pixels: {original_pixels:,}")
print(f"Resized pixels: {resized_pixels:,}")
print(f"Pixel-count ratio (resized / original): {scale_factor:.2%}")
print("=" * 60)

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(gray, cmap='gray')
axes[0].set_title(f'Input: {gray.shape[1]}x{gray.shape[0]}', fontsize=12, fontweight='bold')
axes[0].axis('off')

axes[1].imshow(resized, cmap='gray')
axes[1].set_title(f'Output: {resized.shape[1]}x{resized.shape[0]}', fontsize=12, fontweight='bold')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Image resizing complete!")

## Step 10: Histogram Equalization

Apply histogram equalization to enhance contrast and improve image quality for better OCR results.

**Expected Input**: Grayscale image  
**Expected Output**: Contrast-enhanced image with more uniform intensity distribution
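
Histogram equalization remaps each intensity through the image's cumulative histogram, so that frequently occurring values are spread across the full 0 to 255 range. The plain-Python sketch below shows the textbook version of that mapping. `equalize` and the sample values are assumptions for illustration, and `cv2.equalizeHist` may round slightly differently.

```python
def equalize(pixels):
    """Map intensities through the normalized cumulative histogram (CDF)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    # Running total of pixel counts up to each intensity
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)   # CDF at the darkest value present
    span = len(pixels) - cdf_min
    if span == 0:
        return list(pixels)                   # flat image: nothing to stretch

    lookup = [max(0, round((c - cdf_min) * 255 / span)) for c in cdf]
    return [lookup[p] for p in pixels]

# Hypothetical low-contrast sample confined to the range 50..200
sample = [50, 100, 100, 150, 150, 200]
print(equalize(sample))
```

The darkest input value maps to 0 and the brightest to 255, which is the contrast stretch visible in the output histogram.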

In [None]:
# Input: Grayscale image
print("=" * 60)
print("📊 HISTOGRAM EQUALIZATION")
print("=" * 60)
print(f"Input shape: {gray.shape}")
print(f"Input value range: [{gray.min()}, {gray.max()}]")
print(f"Input mean: {gray.mean():.2f}, std: {gray.std():.2f}")

# Apply histogram equalization
equalized = cv2.equalizeHist(gray)

# Debug info
print(f"\nOutput value range: [{equalized.min()}, {equalized.max()}]")
print(f"Output mean: {equalized.mean():.2f}, std: {equalized.std():.2f}")
print(f"Contrast improvement: {equalized.std() / gray.std():.2f}x")
print("=" * 60)

# Create figure with images and histograms
fig = plt.figure(figsize=(16, 8))

# Original image
ax1 = plt.subplot(2, 2, 1)
ax1.imshow(gray, cmap='gray')
ax1.set_title('Input: Original Grayscale', fontsize=12, fontweight='bold')
ax1.axis('off')

# Original histogram
ax2 = plt.subplot(2, 2, 2)
ax2.hist(gray.ravel(), bins=256, range=[0, 256], color='blue', alpha=0.7)
ax2.set_title('Input Histogram', fontsize=12, fontweight='bold')
ax2.set_xlabel('Pixel Intensity')
ax2.set_ylabel('Frequency')
ax2.grid(alpha=0.3)

# Equalized image
ax3 = plt.subplot(2, 2, 3)
ax3.imshow(equalized, cmap='gray')
ax3.set_title('Output: Equalized (Enhanced Contrast)', fontsize=12, fontweight='bold')
ax3.axis('off')

# Equalized histogram
ax4 = plt.subplot(2, 2, 4)
ax4.hist(equalized.ravel(), bins=256, range=[0, 256], color='green', alpha=0.7)
ax4.set_title('Output Histogram (More Uniform)', fontsize=12, fontweight='bold')
ax4.set_xlabel('Pixel Intensity')
ax4.set_ylabel('Frequency')
ax4.grid(alpha=0.3)

plt.tight_layout()
plt.show()

print("✅ Histogram equalization complete!")

## Step 11: OCR Text Extraction (Optional)

Extract text from the processed image using Tesseract OCR.

**Expected Input**: Processed grayscale/binary image  
**Expected Output**: Extracted text string

**Note**: Requires Tesseract OCR to be installed on your system.
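
`pytesseract.image_to_data` returns one row per layout element, so page, block, and line rows appear alongside words, with empty text and a confidence of -1. A minimal sketch of filtering by confidence on a hand-written sample follows. The dictionary contents and the `confident_words` helper are assumptions for illustration, not real OCR output.

```python
def confident_words(data, min_conf=60):
    """Return recognized words whose confidence exceeds min_conf."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        if not text.strip():
            continue                      # layout rows carry no text
        if float(conf) > min_conf:        # conf may be an int, float, or str
            words.append(text)
    return words

# Hypothetical sample shaped like image_to_data(..., output_type=Output.DICT)
sample = {
    "text": ["", "Invoice", " ", "Total", "42"],
    "conf": ["-1", "96.2", -1, 55, "88"],
}
print(confident_words(sample))
```

Converting with `float` rather than `int` avoids a `ValueError` when a confidence arrives as a decimal string.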

In [None]:
# Input: Processed image (use equalized for better results)
print("=" * 60)
print("📝 OCR TEXT EXTRACTION")
print("=" * 60)
print(f"Input shape: {equalized.shape}")

try:
    # Perform OCR
    text = pytesseract.image_to_string(equalized, lang='eng')
    
    # Debug info
    text_length = len(text.strip())
    word_count = len(text.split())
    line_count = len([line for line in text.split('\n') if line.strip()])
    
    print(f"✅ OCR completed successfully!")
    print(f"Characters extracted: {text_length}")
    print(f"Word count: {word_count}")
    print(f"Line count: {line_count}")
    print("=" * 60)
    
    # Display extracted text
    print("\n📄 EXTRACTED TEXT:")
    print("-" * 60)
    if text.strip():
        print(text)
    else:
        print("(No text detected)")
    print("-" * 60)
    
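    # Get per-element OCR data (includes layout rows with empty text and conf == -1)
    data = pytesseract.image_to_data(equalized, lang='eng', output_type=pytesseract.Output.DICT)
    # Count only rows that contain recognized text
    word_indices = [i for i, word in enumerate(data['text']) if word.strip()]
    # conf may be an int, float, or numeric string depending on the pytesseract version
    confident_words = [i for i in word_indices if float(data['conf'][i]) > 60]
    
    print(f"\nConfidence analysis:")
    print(f"Total words detected: {len(word_indices)}")
    print(f"High confidence (>60%): {len(confident_words)} words")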
    
except Exception as e:
    print(f"❌ OCR Error: {str(e)}")
    print("Make sure Tesseract OCR is installed and in your PATH")
    print("Windows: https://github.com/UB-Mannheim/tesseract/wiki")
    
print("=" * 60)

## Step 12: Save Final Processed Image

Save the processed image to disk for later use.

**Expected Input**: Final processed image  
**Expected Output**: Saved image file `output.jpg`

In [None]:
# Input: Final processed image (using equalized as final output)
output_filename = 'output.jpg'

print("=" * 60)
print("💾 SAVING PROCESSED IMAGE")
print("=" * 60)
print(f"Output filename: {output_filename}")
print(f"Image shape: {equalized.shape}")
print(f"Data type: {equalized.dtype}")

# Save the image
try:
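    # cv2.imwrite returns False on failure instead of raising an exception
    saved = cv2.imwrite(output_filename, equalized)
    
    # Verify the write succeeded (a stale file from an earlier run is not enough)
    if saved and os.path.exists(output_filename):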
        file_size = os.path.getsize(output_filename)
        print(f"✅ Image saved successfully!")
        print(f"File size: {file_size / 1024:.2f} KB")
        print(f"Full path: {os.path.abspath(output_filename)}")
        
        # Display before/after comparison
        fig, axes = plt.subplots(1, 2, figsize=(14, 6))
        
        axes[0].imshow(original_image_rgb)
        axes[0].set_title('BEFORE: Original Input', fontsize=12, fontweight='bold')
        axes[0].axis('off')
        
        axes[1].imshow(equalized, cmap='gray')
        axes[1].set_title('AFTER: Final Processed Output', fontsize=12, fontweight='bold')
        axes[1].axis('off')
        
        plt.tight_layout()
        plt.show()
        
    else:
        print(f"❌ Error: File was not saved")
        
except Exception as e:
    print(f"❌ Save Error: {str(e)}")

print("=" * 60)

## 🎉 Processing Complete!

### Summary of Pipeline Steps

1. ✅ **Import Libraries** - Imported OpenCV, NumPy, Matplotlib, Pillow, and pytesseract
2. ✅ **Load Image** - Loaded `image.jpg` and verified that it was read successfully
3. ✅ **Grayscale Conversion** - Reduced from 3 channels to 1
4. ✅ **Gaussian Blur** - Reduced noise for better edge detection
5. ✅ **Edge Detection** - Applied Canny algorithm to find boundaries
6. ✅ **Binary Thresholding** - Converted to black & white using Otsu's method
7. ✅ **Morphological Ops** - Cleaned image with erosion & dilation
8. ✅ **Contour Detection** - Found and ranked candidate document boundaries
9. ✅ **Resizing** - Standardized dimensions
10. ✅ **Histogram Equalization** - Enhanced contrast
11. ✅ **OCR Extraction (Optional)** - Extracted text using Tesseract
12. ✅ **Save Output** - Saved final processed image

### Next Steps

- Use `output.jpg` as your processed image
- Adjust parameters in individual cells for better results
- Experiment with different threshold values
- Try different morphological kernel sizes

### Troubleshooting

- **Image not loading?** Ensure `image.jpg` exists in the current directory
- **OCR errors?** Install Tesseract OCR and add to PATH
- **Poor results?** Adjust blur kernel size, Canny thresholds, or binary threshold

---

**PrintChakra Image Processing Pipeline** - Version 2.0.0