# Tutorial: ROI Extraction & Morphological Operations
## Week 8, Day 4 - Module 3: Image Processing & DNNs

**Course**: Deep Neural Network Architectures (21CSE558T)  
**Date**: Wednesday, October 9, 2025  
**Duration**: 1 Hour (60 minutes)  

---

## Learning Objectives
1. Extract Regions of Interest (ROI) from images using OpenCV
2. Apply morphological operations to clean binary images
3. Build an integrated image processing pipeline
4. Solve real-world image preprocessing problems

## Setup: Import Libraries and Create Output Directory

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os

# Create output directory
os.makedirs('output', exist_ok=True)

# Configure matplotlib for inline display
plt.rcParams['figure.figsize'] = (12, 8)

# Helper function to display images in Jupyter
def show_images(images, titles, rows=1, cols=None, figsize=(15, 8), cmap='gray'):
    """Display multiple images in a grid"""
    if cols is None:
        cols = len(images)
    
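    # squeeze=False always returns a 2-D array of axes, so a single flatten
    # handles every grid shape (1x1, 1xN, Nx1, NxM)
    fig, axes = plt.subplots(rows, cols, figsize=figsize, squeeze=False)
    axes = axes.flatten()
    
    for ax, img, title in zip(axes, images, titles):
        if img.ndim == 3 and img.shape[2] == 3:
            # BGR to RGB conversion for color images
            ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        else:
            # Fix the display range (8-bit images assumed) so that an all-white
            # ROI is not auto-rescaled and drawn as black
            ax.imshow(img, cmap=cmap, vmin=0, vmax=255)
        ax.set_title(title)
        ax.axis('off')
    
    # Hide grid cells that have no image (e.g. 5 images in a 2x3 grid)
    for ax in axes[len(images):]:
        ax.axis('off')
    
    plt.tight_layout()
    plt.show()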

print("✓ Libraries imported successfully")
print("✓ Output directory created")
print("✓ Helper functions defined")

---
# Part 1: ROI Extraction (25 minutes)

## Concept: Region of Interest (ROI)
A Region of Interest (ROI) is the portion of an image selected for further processing or analysis; everything outside it is ignored.

**Real-World Examples:**
- Cropping faces from group photos
- Extracting license plates from traffic cameras
- Isolating tumor regions in medical imaging

## Method 1: Rectangular ROI Extraction (Array Slicing)

In [None]:
# Create a sample image (or load your own)
# For demo purposes, we'll create a synthetic image
img = np.zeros((400, 600, 3), dtype=np.uint8)

# Draw some shapes
cv2.rectangle(img, (50, 50), (250, 250), (255, 0, 0), -1)  # Blue rectangle
cv2.circle(img, (450, 200), 80, (0, 255, 0), -1)  # Green circle
cv2.putText(img, 'ROI DEMO', (200, 350), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)

print(f"Image shape: {img.shape}")
print(f"Image dimensions: {img.shape[1]} x {img.shape[0]} (width x height)")

# Display
show_images([img], ['Original Image'])

In [None]:
# Define ROI coordinates
x, y, w, h = 50, 50, 200, 200  # Extract the blue rectangle

# Visualize ROI selection (draw rectangle)
img_with_roi = img.copy()
cv2.rectangle(img_with_roi, (x, y), (x+w, y+h), (0, 255, 255), 3)  # Yellow rectangle

# Extract ROI using array slicing
roi = img[y:y+h, x:x+w]

print(f"ROI coordinates: x={x}, y={y}, width={w}, height={h}")
print(f"ROI shape: {roi.shape}")

# Display results
show_images([img_with_roi, roi], 
            ['ROI Selection (Yellow Box)', 'Extracted ROI'],
            cols=2)

# Save ROI
cv2.imwrite('output/roi_extracted.jpg', roi)
print("✓ ROI saved to output/roi_extracted.jpg")
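
One detail about array slicing is worth checking before editing an extracted ROI: NumPy slicing returns a view, so writes to the ROI also change the source image. The sketch below uses a throwaway array named `demo` in place of `img` (the names `demo`, `roi_view`, and `roi_copy` are illustrative, not part of the tutorial code) and compares a plain slice with an explicit `.copy()`.

```python
import numpy as np

demo = np.zeros((400, 600, 3), dtype=np.uint8)
x, y, w, h = 50, 50, 200, 200

roi_view = demo[y:y + h, x:x + w]          # view: shares memory with demo
roi_copy = demo[y:y + h, x:x + w].copy()   # independent copy

roi_copy[:] = 255                          # does not touch demo
copy_left_source_intact = int(demo.sum()) == 0

roi_view[:] = 255                          # writes through to demo
view_changed_source = int(demo[y, x, 0]) == 255

print("view shares memory:", np.shares_memory(roi_view, demo))
print("copy shares memory:", np.shares_memory(roi_copy, demo))
```

If the ROI will be modified (drawn on, thresholded in place), taking a `.copy()` keeps the original image untouched.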

### Exercise 1.1: Extract Multiple ROIs

In [None]:
# TODO: Extract the green circle as a separate ROI
# Hint: The circle center is at (450, 200) with radius 80

# Define coordinates for circle ROI
x2 = 370  # 450 - 80
y2 = 120  # 200 - 80
w2 = 160  # diameter = 2 * radius
h2 = 160

# Extract circle ROI
roi_circle = img[y2:y2+h2, x2:x2+w2]

# Display
show_images([img, roi_circle], 
            ['Original Image', 'Circle ROI'],
            cols=2)

print("✓ Exercise 1.1 completed!")
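
The circle in this exercise sits well inside the image, but a box computed as center minus radius can go negative or run past the edge for other objects. Negative slice indices wrap around in NumPy and can silently produce an empty ROI. A minimal sketch of clipping the box first, assuming a hypothetical helper named `clamp_box` (it is not an OpenCV function):

```python
import numpy as np

def clamp_box(x, y, w, h, img_w, img_h):
    """Clip an (x, y, w, h) box to the image bounds (hypothetical helper)."""
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, img_w), min(y + h, img_h)
    return x0, y0, max(x1 - x0, 0), max(y1 - y0, 0)

demo = np.zeros((400, 600, 3), dtype=np.uint8)

# A circle of radius 80 centered at (560, 200) pokes past the right edge
cx, cy, r = 560, 200, 80
x, y, w, h = clamp_box(cx - r, cy - r, 2 * r, 2 * r, demo.shape[1], demo.shape[0])
roi = demo[y:y + h, x:x + w]
print("clamped box:", (x, y, w, h), "ROI shape:", roi.shape)

# Without clamping, a negative start index wraps around and the slice is empty
unclamped = demo[-30:70, -30:70]
print("unclamped shape:", unclamped.shape)
```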

## Method 2: Contour-Based ROI Extraction

In [None]:
# Create image with multiple objects (coins simulation)
coins_img = np.zeros((400, 600), dtype=np.uint8)

# Draw circles (simulate coins)
cv2.circle(coins_img, (100, 100), 40, 255, -1)
cv2.circle(coins_img, (250, 150), 50, 255, -1)
cv2.circle(coins_img, (450, 120), 45, 255, -1)
cv2.circle(coins_img, (200, 300), 35, 255, -1)
cv2.circle(coins_img, (400, 300), 55, 255, -1)

# Add some noise
cv2.circle(coins_img, (50, 350), 5, 255, -1)  # Small noise

print(f"Coins image shape: {coins_img.shape}")
show_images([coins_img], ['Binary Image with Objects'])

In [None]:
# Find contours
contours, hierarchy = cv2.findContours(coins_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

print(f"Total contours found: {len(contours)}")

# Visualize all contours
coins_color = cv2.cvtColor(coins_img, cv2.COLOR_GRAY2BGR)
cv2.drawContours(coins_color, contours, -1, (0, 255, 0), 2)

show_images([coins_color], ['Detected Contours (Green)'])

In [None]:
# Extract ROI for each contour with filtering
coins_with_boxes = cv2.cvtColor(coins_img, cv2.COLOR_GRAY2BGR)
extracted_rois = []
valid_count = 0

for i, contour in enumerate(contours):
    # Calculate area to filter noise
    area = cv2.contourArea(contour)
    
    # Filter: only process contours with area > 500 (removes small noise)
    if area > 500:
        # Get bounding rectangle
        x, y, w, h = cv2.boundingRect(contour)
        
        # Draw bounding box
        cv2.rectangle(coins_with_boxes, (x, y), (x+w, y+h), (255, 0, 0), 2)
        
        # Extract ROI
        roi = coins_img[y:y+h, x:x+w]
        extracted_rois.append(roi)
        
        # Save individual ROI
        cv2.imwrite(f'output/coin_{valid_count}.jpg', roi)
        
        print(f"Coin {valid_count}: Area={area:.0f}, Size={w}x{h}")
        valid_count += 1

print(f"\n✓ Extracted {valid_count} valid objects (filtered out noise)")

# Display results
show_images([coins_with_boxes], ['Bounding Boxes (Blue)'])

In [None]:
# Display all extracted ROIs in a grid
if len(extracted_rois) > 0:
    titles = [f'Coin {i}' for i in range(len(extracted_rois))]
    show_images(extracted_rois, titles, rows=2, cols=3)
    print("✓ All extracted coins displayed")
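
The grid above is hard-coded to 2 rows by 3 columns, which fits the five coins in this demo. When the number of ROIs is not known in advance, the grid can be derived from the count instead. A small sketch, assuming a hypothetical helper `grid_shape` whose result would be passed as `rows` and `cols` to `show_images`:

```python
import math

def grid_shape(n_images, max_cols=3):
    """Return (rows, cols) large enough to hold n_images (hypothetical helper)."""
    cols = max(1, min(n_images, max_cols))
    rows = max(1, math.ceil(n_images / cols))
    return rows, cols

for n in (1, 5, 7):
    print(n, "images ->", grid_shape(n))
```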

### Exercise 1.2: Contour-Based Extraction Practice

In [None]:
# TODO: Modify the area threshold and observe the results
# Try different values: 100, 500, 1000, 2000

min_area = 1000  # Change this value

filtered_count = 0
for contour in contours:
    if cv2.contourArea(contour) > min_area:
        filtered_count += 1

print(f"With min_area = {min_area}:")
print(f"  Objects detected: {filtered_count}")
print(f"  Objects filtered out: {len(contours) - filtered_count}")

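# Question: What's the optimal min_area to keep all coins but remove noise?
# Answer: Any threshold between the noise dot's area (roughly 80 px, radius 5)
# and the smallest coin's area (roughly 3,800 px, radius 35) works.
# 500-800 leaves a comfortable margin on both sides; all four values suggested
# above (100, 500, 1000, 2000) keep the five coins and drop the noise dot.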
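
The usable threshold range can be estimated from the radii used to draw the shapes. The sketch below uses the ideal circle area πr²; `cv2.contourArea` measures the polygon traced along the boundary pixels, so its values will differ slightly from these estimates.

```python
import math

coin_radii = [40, 50, 45, 35, 55]   # radii drawn in the coins image
noise_radius = 5                    # radius of the noise dot

coin_areas = [math.pi * r ** 2 for r in coin_radii]
noise_area = math.pi * noise_radius ** 2

print(f"noise dot area     ~ {noise_area:.0f} px")
print(f"smallest coin area ~ {min(coin_areas):.0f} px")

# Count what each candidate threshold would keep
for min_area in (100, 500, 1000, 2000):
    coins_kept = sum(area > min_area for area in coin_areas)
    noise_kept = noise_area > min_area
    print(f"min_area={min_area}: coins kept={coins_kept}, noise kept={noise_kept}")
```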

---
# Part 2: Morphological Operations (25 minutes)

## Concept: Morphological Operations
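Morphological operations reshape the white (foreground) regions of a binary image by sliding a small structuring element (kernel) over it. Each operation only affects features smaller than the kernel, so the kernel size must be matched to the noise or holes being targeted.

**Key Operations:**
- **Erosion**: Shrinks objects; erases white specks smaller than the kernel
- **Dilation**: Expands objects; closes black holes smaller than the kernel
- **Opening**: Erosion → Dilation (removes small white objects, restores the rest to size)
- **Closing**: Dilation → Erosion (fills small black holes, restores the rest to size)
- **Gradient**: Dilation - Erosion (leaves a band along object boundaries)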
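
To make the definitions above concrete, here is a minimal NumPy-only sketch: erosion is a neighborhood minimum, dilation a neighborhood maximum, and the other three operations are compositions of the two. The functions `erode` and `dilate` are illustrative stand-ins written for this example, assuming a square k×k kernel of ones; they are not the OpenCV implementations.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode(img, k):
    """Each output pixel is the minimum of its k x k neighborhood."""
    padded = np.pad(img, k // 2, mode="constant", constant_values=255)
    return sliding_window_view(padded, (k, k)).min(axis=(2, 3))

def dilate(img, k):
    """Each output pixel is the maximum of its k x k neighborhood."""
    padded = np.pad(img, k // 2, mode="constant", constant_values=0)
    return sliding_window_view(padded, (k, k)).max(axis=(2, 3))

img = np.zeros((16, 16), dtype=np.uint8)
img[2:9, 2:9] = 255      # 7x7 object
img[5, 5] = 0            # one-pixel hole inside it
img[13, 13] = 255        # one-pixel speck outside it

opened = dilate(erode(img, 3), 3)          # speck removed, hole kept
closed = erode(dilate(img, 3), 3)          # hole filled, speck kept
gradient = dilate(img, 3) - erode(img, 3)  # band along every boundary

counts = {name: int(np.count_nonzero(a))
          for name, a in [("original", img), ("opened", opened), ("closed", closed)]}
print("white pixels:", counts)
```

The counts come out as 49, 48, and 50: opening drops the speck but leaves the hole, while closing fills the hole but leaves the speck.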

## Demo: All Morphological Operations

In [None]:
# Create a noisy binary image
noisy_img = np.zeros((300, 300), dtype=np.uint8)

# Main object (filled rectangle)
cv2.rectangle(noisy_img, (50, 50), (250, 250), 255, -1)

# Add holes (simulate broken parts)
cv2.circle(noisy_img, (100, 100), 10, 0, -1)
cv2.circle(noisy_img, (200, 150), 15, 0, -1)

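# Add noise (small white dots); seed the generator so the figure is reproducible
np.random.seed(42)
for _ in range(20):
    # Cast to plain int to guard against OpenCV versions that are strict about point types
    x, y = (int(v) for v in np.random.randint(10, 290, 2))
    cv2.circle(noisy_img, (x, y), 3, 255, -1)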

show_images([noisy_img], ['Noisy Binary Image'])
print("Created noisy image with:")
print("  - Main object (rectangle)")
print("  - Holes inside object (black circles)")
print("  - Noise outside object (white dots)")

In [None]:
# Define structuring elements (kernels)
kernel = np.ones((5, 5), np.uint8)  # used for erosion, dilation, and gradient

# Opening and closing only remove features that the kernel cannot fit inside,
# so these two kernels are sized to the data: the noise dots have radius 3
# (a 5x5 square still fits inside one) and the larger hole has radius 15.
kernel_open = np.ones((7, 7), np.uint8)
kernel_close = np.ones((23, 23), np.uint8)

# Apply all morphological operations
erosion = cv2.erode(noisy_img, kernel, iterations=1)
dilation = cv2.dilate(noisy_img, kernel, iterations=1)
opening = cv2.morphologyEx(noisy_img, cv2.MORPH_OPEN, kernel_open)
closing = cv2.morphologyEx(noisy_img, cv2.MORPH_CLOSE, kernel_close)
gradient = cv2.morphologyEx(noisy_img, cv2.MORPH_GRADIENT, kernel)

# Display all results
images = [noisy_img, erosion, dilation, opening, closing, gradient]
titles = ['Original', 'Erosion\n(Shrink)', 'Dilation\n(Expand)', 
          'Opening\n(Remove Noise)', 'Closing\n(Fill Holes)', 'Gradient\n(Edges)']

show_images(images, titles, rows=2, cols=3, figsize=(18, 10))

print("\n📊 Observations:")
print("  Erosion (5×5): Shrank the object and the dots, widened the holes")
print("  Dilation (5×5): Grew the object and the dots, narrowed the holes")
print("  Opening (7×7): Removed isolated dots but preserved the main object")
print("  Closing (23×23): Filled both holes; nearby dots may get bridged together")
print("  Gradient (5×5): Outlined every boundary, including holes and dots")

## Exercise 2.1: Noise Removal with Opening

In [None]:
# Create a document-like image with text and noise
doc_img = np.zeros((200, 400), dtype=np.uint8)

# Simulate text (white rectangles)
cv2.rectangle(doc_img, (20, 50), (100, 70), 255, -1)
cv2.rectangle(doc_img, (120, 50), (200, 70), 255, -1)
cv2.rectangle(doc_img, (20, 90), (150, 110), 255, -1)
cv2.rectangle(doc_img, (20, 130), (180, 150), 255, -1)

# Add salt noise: turn about 5% of the pixels white
# (OR-ing can only add white; pepper would need a separate step that sets pixels to 0)
noise = (np.random.rand(*doc_img.shape) < 0.05).astype(np.uint8) * 255
doc_noisy = cv2.bitwise_or(doc_img, noise)

show_images([doc_noisy], ['Noisy Document'])
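
For experiments with other noise levels, it helps to control the noise density explicitly and to check it afterwards. The sketch below is a NumPy-only illustration of full salt-and-pepper noise at a chosen density; the names `density`, `hits`, and `salt` and the seed value are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded so the result is repeatable
clean = np.zeros((200, 400), dtype=np.uint8)
clean[50:71, 20:101] = 255              # one text-like block

density = 0.05                          # fraction of pixels to corrupt
hits = rng.random(clean.shape) < density
salt = rng.random(clean.shape) < 0.5    # about half the hits turn white, the rest black

noisy = clean.copy()
noisy[hits & salt] = 255                # salt: white pixels
noisy[hits & ~salt] = 0                 # pepper: black pixels

print(f"corrupted fraction: {hits.mean():.3f}")
print(f"values present: {np.unique(noisy)}")
```

The image stays strictly binary (only 0 and 255), which is what the morphological operations in this section expect.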

In [None]:
# TODO: Apply opening to remove noise
# Experiment with different kernel sizes

# Try different kernels
kernel_3x3 = np.ones((3, 3), np.uint8)
kernel_5x5 = np.ones((5, 5), np.uint8)
kernel_7x7 = np.ones((7, 7), np.uint8)

# Apply opening with different kernels
opening_3x3 = cv2.morphologyEx(doc_noisy, cv2.MORPH_OPEN, kernel_3x3)
opening_5x5 = cv2.morphologyEx(doc_noisy, cv2.MORPH_OPEN, kernel_5x5)
opening_7x7 = cv2.morphologyEx(doc_noisy, cv2.MORPH_OPEN, kernel_7x7)

# Display comparison
images = [doc_noisy, opening_3x3, opening_5x5, opening_7x7]
titles = ['Noisy Original', 'Opening 3×3', 'Opening 5×5', 'Opening 7×7']

show_images(images, titles, rows=1, cols=4, figsize=(20, 5))

print("\n🔍 Kernel Size Effects:")
print("  3×3: Removes specks narrower than 3 px; text blocks untouched")
print("  5×5: Also removes small clusters narrower than 5 px")
print("  7×7: Most aggressive; would erase any stroke thinner than 7 px")
print("\n✓ Exercise 2.1 completed!")

## Exercise 2.2: Fill Holes with Closing

In [None]:
# Create text with holes
text_img = np.zeros((150, 300), dtype=np.uint8)

# Draw text-like shapes with intentional holes
cv2.rectangle(text_img, (20, 50), (80, 100), 255, -1)
cv2.circle(text_img, (50, 75), 10, 0, -1)  # Hole 1

cv2.rectangle(text_img, (100, 50), (160, 100), 255, -1)
cv2.circle(text_img, (130, 75), 8, 0, -1)  # Hole 2

cv2.rectangle(text_img, (180, 50), (240, 100), 255, -1)
cv2.circle(text_img, (210, 75), 12, 0, -1)  # Hole 3

show_images([text_img], ['Text with Holes'])
print("Created text with internal holes (black circles)")

In [None]:
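# TODO: Apply closing to fill holes
# The kernel must be too large to fit inside the biggest hole (radius 12),
# but no wider than the 19-pixel gaps between the shapes, or closing would merge them.
kernel = np.ones((19, 19), np.uint8)
closing = cv2.morphologyEx(text_img, cv2.MORPH_CLOSE, kernel)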

# Display comparison
show_images([text_img, closing], 
            ['Text with Holes', 'After Closing (Holes Filled)'],
            cols=2)

print("✓ Holes successfully filled!")
print("✓ Exercise 2.2 completed!")

## Exercise 2.3: Morphological Gradient for Edge Detection

In [None]:
# Use the coins image from earlier
kernel = np.ones((5, 5), np.uint8)

# Apply morphological gradient
gradient = cv2.morphologyEx(coins_img, cv2.MORPH_GRADIENT, kernel)

# Compare with Canny edge detection
canny_edges = cv2.Canny(coins_img, 100, 200)

# Display comparison
show_images([coins_img, gradient, canny_edges], 
            ['Original', 'Morphological Gradient', 'Canny Edges'],
            cols=3, figsize=(18, 5))

print("\n📊 Comparison:")
print("  Morphological Gradient: Thick edges, shape-based")
print("  Canny: Thin edges, intensity-based")
print("\n✓ Exercise 2.3 completed!")
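
The thickness of the morphological gradient follows from the kernel size: dilation pushes the boundary outward by half a kernel and erosion pulls it inward by the same amount, so a k×k kernel leaves a band about k-1 pixels wide. A one-dimensional NumPy sketch of that arithmetic, with `gradient_width` as an illustrative helper name:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gradient_width(k):
    """Width in pixels of the gradient band across a single black-to-white step."""
    row = np.zeros(40, dtype=np.uint8)
    row[20:] = 255                      # step edge at index 20
    pad = k // 2
    dilated = sliding_window_view(np.pad(row, pad, constant_values=0), k).max(axis=1)
    eroded = sliding_window_view(np.pad(row, pad, constant_values=255), k).min(axis=1)
    return int(np.count_nonzero(dilated - eroded))

widths = {k: gradient_width(k) for k in (3, 5, 7)}
print(widths)   # {3: 2, 5: 4, 7: 6}
```

This is why the 5×5 gradient above looks thicker than the one-pixel Canny edges.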

---
# Part 3: Integration Exercise (10 minutes)

## Mini-Project: Complete Document Processing Pipeline

In [None]:
# Create a complex document image
document = np.zeros((400, 600), dtype=np.uint8)

# Add text blocks
cv2.rectangle(document, (50, 50), (250, 120), 255, -1)
cv2.rectangle(document, (300, 50), (550, 120), 255, -1)
cv2.rectangle(document, (50, 150), (550, 220), 255, -1)
cv2.rectangle(document, (50, 250), (350, 320), 255, -1)

# Add holes in text
cv2.circle(document, (150, 85), 8, 0, -1)
cv2.circle(document, (425, 85), 10, 0, -1)
cv2.circle(document, (300, 185), 12, 0, -1)

# Add noise
for _ in range(50):
    x, y = np.random.randint(10, 590), np.random.randint(10, 390)
    cv2.circle(document, (x, y), 2, 255, -1)

show_images([document], ['Complex Document: Noisy with Holes'])

In [None]:
def process_document_pipeline(binary_image, min_area=1000):
    """
    Complete document processing pipeline
    
    Steps:
    1. Remove noise (Opening)
    2. Fill holes (Closing)
    3. Find text regions (Contours)
    4. Extract ROIs
    """
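    # Step 1: Remove noise with opening
    # (5x5 is the smallest odd square kernel that does not fit inside a radius-2 noise dot)
    kernel_open = np.ones((5, 5), np.uint8)
    cleaned = cv2.morphologyEx(binary_image, cv2.MORPH_OPEN, kernel_open)
    
    # Step 2: Fill holes with closing
    # (19x19 cannot fit inside the radius-12 hole, yet is narrower than the gaps between blocks)
    kernel_close = np.ones((19, 19), np.uint8)
    filled = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel_close)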
    
    # Step 3: Find contours
    contours, _ = cv2.findContours(filled, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    # Step 4: Extract ROIs
    rois = []
    valid_count = 0
    
    # Create visualization image
    result_img = cv2.cvtColor(filled, cv2.COLOR_GRAY2BGR)
    
    for contour in contours:
        area = cv2.contourArea(contour)
        if area > min_area:
            x, y, w, h = cv2.boundingRect(contour)
            
            # Draw bounding box
            cv2.rectangle(result_img, (x, y), (x+w, y+h), (0, 255, 0), 2)
            
            # Extract ROI
            roi = filled[y:y+h, x:x+w]
            rois.append(roi)
            
            # Save ROI
            cv2.imwrite(f'output/text_region_{valid_count}.jpg', roi)
            valid_count += 1
    
    return cleaned, filled, result_img, rois, valid_count

# Run the pipeline
cleaned, filled, result, rois, num_regions = process_document_pipeline(document)

# Display pipeline stages
show_images([document, cleaned, filled, result], 
            ['1. Original', '2. After Opening', '3. After Closing', '4. Detected Regions'],
            rows=1, cols=4, figsize=(20, 5))

print(f"\n✅ Pipeline Results:")
print(f"  ✓ Noise removed")
print(f"  ✓ Holes filled")
print(f"  ✓ {num_regions} text regions extracted")
print(f"  ✓ ROIs saved to output/ folder")

In [None]:
# Display all extracted text regions
if len(rois) > 0:
    titles = [f'Text Region {i}' for i in range(len(rois))]
    show_images(rois, titles, rows=2, cols=2, figsize=(12, 8))
    print("✓ All text regions successfully extracted!")

---
# Summary and Key Takeaways

## ✅ What We Learned

### ROI Extraction:
1. **Rectangular ROI**: `img[y:y+h, x:x+w]` for known coordinates
2. **Contour-based ROI**: Automatic extraction using `cv2.findContours()` and `cv2.boundingRect()`
3. **Filtering**: Use `cv2.contourArea()` to remove noise

### Morphological Operations:
1. **Opening (Erode→Dilate)**: Removes white specks smaller than the kernel, preserves larger objects
2. **Closing (Dilate→Erode)**: Fills holes smaller than the kernel, connects nearby gaps
3. **Gradient**: Edge detection for binary images
4. **Kernel size**: Must exceed the noise or hole being targeted; too large a kernel merges neighbors or erases fine detail

### Pipeline Thinking:
1. Process in stages: Segment → Clean → Extract
2. Save intermediate results for debugging
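3. Run morphology on the binary mask, but crop ROIs from the original image when one exists (the demo crops the cleaned mask only because its input is already binary)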
4. Filter by area/size to remove unwanted regions
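
The third point can be sketched without OpenCV: find the bounding box on the binary mask, then apply the same coordinates to the original image. In the sketch below, `original` and `mask` are synthetic stand-ins, and the `np.nonzero` bounding box is a simplification that assumes a single object; with several objects, `cv2.findContours` and `cv2.boundingRect` would supply one box per contour as in the tutorial.

```python
import numpy as np

original = np.full((100, 150, 3), 30, dtype=np.uint8)   # stand-in for a color photo
original[40:70, 50:110] = (0, 200, 255)                  # the object, in color

mask = np.zeros(original.shape[:2], dtype=np.uint8)      # stand-in for the cleaned binary mask
mask[40:70, 50:110] = 255

# Bounding box of the white pixels in the mask
ys, xs = np.nonzero(mask)
x, y = int(xs.min()), int(ys.min())
w, h = int(xs.max()) - x + 1, int(ys.max()) - y + 1

roi_from_mask = mask[y:y + h, x:x + w]           # all white: shape information only
roi_from_original = original[y:y + h, x:x + w]   # keeps color and texture

print("box:", (x, y, w, h))
print("mask ROI:", roi_from_mask.shape, "original ROI:", roi_from_original.shape)
```

Cropping from the original keeps the pixel content that later feature extraction needs; the mask ROI only records where the object is.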

---

## 🔗 Connection to Future Topics

- **Week 9**: Feature extraction from these ROIs (shape, color, texture)
- **Module 4**: R-CNN uses ROI extraction for object detection
- **Real Applications**: Document digitization, medical imaging, quality control

---

## 📝 Practice Exercises

Try these on your own:
1. Load a real document image and extract text regions
2. Process a photo with multiple objects (fruits, coins, etc.)
3. Build a pipeline for license plate detection
4. Experiment with different kernel shapes (cross, ellipse)
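
As a starting point for the last exercise, the sketch below builds a cross and a disc kernel by hand in NumPy so their shapes can be inspected. OpenCV offers `cv2.getStructuringElement` with `cv2.MORPH_RECT`, `cv2.MORPH_CROSS`, and `cv2.MORPH_ELLIPSE` for the same purpose; the helpers `cross_kernel` and `disc_kernel` here are illustrative, and the exact pixels of OpenCV's ellipse may differ from this disc.

```python
import numpy as np

def cross_kernel(k):
    """k x k kernel with ones along the middle row and middle column."""
    kernel = np.zeros((k, k), dtype=np.uint8)
    kernel[k // 2, :] = 1
    kernel[:, k // 2] = 1
    return kernel

def disc_kernel(k):
    """k x k kernel with ones inside a circle of radius k // 2."""
    r = k // 2
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    return (xx ** 2 + yy ** 2 <= r ** 2).astype(np.uint8)

cross = cross_kernel(5)
disc = disc_kernel(5)
print(cross)
print(disc)
```

Either array can be passed as the kernel argument of `cv2.erode`, `cv2.dilate`, or `cv2.morphologyEx` in place of `np.ones((5, 5), np.uint8)`.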

---

**End of Tutorial**

*Course: 21CSE558T - Deep Neural Network Architectures*  
*Module 3: Image Processing & Deep Neural Networks*  
*Week 8, Day 4 - October 9, 2025*