# 2. Preprocessing and Segmentation

**Paper Reference:** Section 3.2 - Model (Segmentation Module)

This notebook implements the preprocessing and building segmentation pipeline.

## Overview

From the paper:
> "After resizing each input image to 512×512 pixels and normalizing pixel intensities, images were processed using ReFineNet, a pretrained segmentation network. Post-processing further refined these masks by applying morphological opening to eliminate small artifacts and reduce noise, followed by the watershed algorithm."

**Pipeline:**
1. Image resizing to 512×512
2. Pixel intensity normalization
3. ReFineNet segmentation (pretrained)
4. Test-time augmentation (TTA)
5. Morphological post-processing
6. Watershed segmentation
7. Size filtering (500–100,000 pixels)

## 2.1 Environment Setup

In [None]:
# Core imports
import os
import numpy as np
import cv2
from pathlib import Path
import matplotlib.pyplot as plt
from PIL import Image
import warnings
warnings.filterwarnings('ignore')

# Deep learning imports
import tensorflow as tf
from tensorflow.keras.preprocessing import image as keras_image

# SciPy (distance transform) and scikit-image (peak detection, watershed, region properties)
from scipy import ndimage
from skimage import morphology, measure, segmentation
from skimage.feature import peak_local_max

print(f"TensorFlow version: {tf.__version__}")
print(f"GPUs available: {tf.config.list_physical_devices('GPU')}")

## 2.2 Configuration

In [None]:
# Image specifications (Paper Section 3.2)
IMAGE_SIZE = 512              # Target image size
MODEL_INPUT_SIZE = 224        # DenseNet201 input size

# Segmentation parameters (Paper Section 3.2)
MIN_BUILDING_AREA = 500       # Minimum segment area in pixels
MAX_BUILDING_AREA = 100000    # Maximum segment area in pixels

# Morphological operation kernel size
MORPH_KERNEL_SIZE = 5

# Building classes
BUILDING_CLASSES = ['Commercial', 'High', 'Hospital', 'Industrial', 'Multi', 'Schools', 'Single']

print(f"Image Size: {IMAGE_SIZE}x{IMAGE_SIZE}")
print(f"Building Area Range: {MIN_BUILDING_AREA}-{MAX_BUILDING_AREA} pixels")

## 2.3 Image Preprocessing

Paper Section 3.2:
> "After resizing each input image to 512×512 pixels and normalizing pixel intensities..."

In [None]:
def load_and_preprocess_image(image_path, target_size=IMAGE_SIZE):
    """
    Load and preprocess satellite image for segmentation.
    
    Paper Reference: Section 3.2
    "After resizing each input image to 512×512 pixels and 
     normalizing pixel intensities..."
    
    Args:
        image_path (str): Path to input image
        target_size (int): Target image size
        
    Returns:
        np.array: Preprocessed image (normalized, resized)
    """
    # Load image
    img = cv2.imread(str(image_path))
    if img is None:
        raise ValueError(f"Could not load image: {image_path}")
    
    # Convert BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    # Resize to target size
    img = cv2.resize(img, (target_size, target_size))
    
    # Normalize pixel intensities to [0, 1]
    img_normalized = img.astype(np.float32) / 255.0
    
    return img_normalized

def prepare_for_model(img, model_input_size=MODEL_INPUT_SIZE):
    """
    Prepare image for DenseNet201 classification.
    
    Args:
        img (np.array): Input image
        model_input_size (int): Model input size (224 for DenseNet)
        
    Returns:
        np.array: Image ready for model prediction
    """
    # Resize for model input
    img_resized = cv2.resize(img, (model_input_size, model_input_size))
    
    # Add batch dimension
    img_batch = np.expand_dims(img_resized, axis=0)
    
    return img_batch

print("Preprocessing functions defined.")

## 2.4 Test-Time Augmentation (TTA)

Paper Section 3.2:
> "To further improve mask robustness against variations in building orientation and appearance, we employed test-time augmentation (TTA). TTA involved generating predictions from horizontally and vertically flipped versions of each image and averaging these predictions."

In [None]:
def apply_tta(image):
    """
    Apply Test-Time Augmentation (TTA) for robust predictions.
    
    Paper Reference: Section 3.2
    "TTA involved generating predictions from horizontally and vertically 
     flipped versions of each image and averaging these predictions."
    
    Args:
        image (np.array): Input image
        
    Returns:
        list: List of augmented images [original, h_flip, v_flip, hv_flip]
    """
    augmented = [
        image,                                    # Original
        np.fliplr(image),                        # Horizontal flip
        np.flipud(image),                        # Vertical flip
        np.flipud(np.fliplr(image))              # Both flips
    ]
    
    return augmented

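def average_tta_predictions(predictions_list):
    """
    Average predictions from TTA-augmented images.
    
    For spatial outputs such as segmentation masks, each prediction must be
    flipped back to the original orientation before it is passed in; a plain
    mean over unaligned masks would blend mirrored copies of each building.
    
    Args:
        predictions_list (list): Prediction arrays of identical shape,
            already aligned to the original orientation
        
    Returns:
        np.array: Element-wise mean of the predictions
    """
    return np.mean(predictions_list, axis=0)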

print("TTA functions defined.")
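
When TTA is applied to a segmentation model, the prediction made on a flipped image is itself flipped, so it has to be mapped back before averaging. The sketch below shows one way this might look. `predict_with_tta`, `predict_fn`, and `toy_model` are hypothetical names introduced for illustration; `toy_model` is a stand-in, not ReFineNet.

```python
import numpy as np

# Each flip is its own inverse, so the same function maps a prediction back
FLIPS = [
    lambda a: a,                          # original
    np.fliplr,                            # horizontal flip
    np.flipud,                            # vertical flip
    lambda a: np.flipud(np.fliplr(a)),    # both flips
]

def predict_with_tta(image, predict_fn):
    """Average flip-augmented predictions after undoing each flip.

    predict_fn is a hypothetical callable mapping an (H, W, C) image to an
    (H, W) probability map; it stands in for the segmentation model.
    """
    aligned = [flip(predict_fn(flip(image))) for flip in FLIPS]
    return np.mean(aligned, axis=0)

def toy_model(img):
    """Toy stand-in: marks pixels whose first channel exceeds 0.5."""
    return (img[..., 0] > 0.5).astype(np.float32)

# 8x8 image with one bright block in the upper-left corner
image = np.zeros((8, 8, 3), dtype=np.float32)
image[1:3, 1:4, 0] = 1.0

aligned_mask = predict_with_tta(image, toy_model)

# Averaging without undoing the flips spreads the block across all four corners
naive_mask = np.mean([toy_model(flip(image)) for flip in FLIPS], axis=0)

print(np.array_equal(aligned_mask, toy_model(image)))
print(naive_mask.max())
```

Because the toy model is flip-equivariant, the aligned average reproduces the direct prediction, while the naive average dilutes the block's confidence. A real network is only approximately equivariant, which is where averaging aligned predictions can help.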

## 2.5 Morphological Post-Processing

Paper Section 3.2:
> "Post-processing further refined these masks by applying morphological opening to eliminate small artifacts and reduce noise."

In [None]:
def morphological_postprocess(mask, kernel_size=MORPH_KERNEL_SIZE):
    """
    Apply morphological operations to refine segmentation mask.
    
    Paper Reference: Section 3.2
    "Post-processing further refined these masks by applying morphological 
     opening to eliminate small artifacts and reduce noise."
    
    Args:
        mask (np.array): Binary segmentation mask
        kernel_size (int): Size of morphological kernel
        
    Returns:
        np.array: Refined binary mask
    """
    # Create elliptical kernel for morphological operations
    kernel = cv2.getStructuringElement(
        cv2.MORPH_ELLIPSE, 
        (kernel_size, kernel_size)
    )
    
    # Morphological opening: erosion followed by dilation
    # Removes small artifacts while preserving larger structures
    mask_opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    
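    # Closing (dilation followed by erosion) fills small holes left after opening.
    # Note: this step is always applied here, although the quoted paper text
    # mentions only opening.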
    mask_closed = cv2.morphologyEx(mask_opened, cv2.MORPH_CLOSE, kernel)
    
    return mask_closed.astype(np.uint8)

print("Morphological post-processing function defined.")

## 2.6 Watershed Segmentation

Paper Section 3.2:
> "...followed by the watershed algorithm, chosen for its efficacy in segmenting connected or overlapping building structures."

In [None]:
def watershed_segmentation(mask, image):
    """
    Apply watershed algorithm to separate connected buildings.
    
    Paper Reference: Section 3.2 (Meyer, 1994)
    "...followed by the watershed algorithm, chosen for its efficacy in 
     segmenting connected or overlapping building structures."
    
    Args:
        mask (np.array): Binary segmentation mask
        image (np.array): Original image (not used by the current implementation)
        
    Returns:
        np.array: Labeled segments array
    """
    # Compute distance transform
    distance = ndimage.distance_transform_edt(mask)
    
    # Find local maxima as markers
    local_max = peak_local_max(
        distance, 
        min_distance=20,
        labels=mask
    )
    
    # Create markers for watershed
    markers = np.zeros(distance.shape, dtype=bool)
    markers[tuple(local_max.T)] = True
    markers = measure.label(markers)
    
    # Apply watershed
    labels = segmentation.watershed(-distance, markers, mask=mask)
    
    return labels

print("Watershed segmentation function defined.")
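
The marker-based watershed above can be illustrated on two overlapping discs, which form a single connected blob in the binary mask. This is a minimal sketch modelled on the standard scikit-image usage pattern; the disc positions and `min_distance=10` are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy import ndimage
from skimage.feature import peak_local_max
from skimage.measure import label
from skimage.segmentation import watershed

# Two overlapping discs forming one connected foreground region
yy, xx = np.indices((80, 80))
disc1 = (xx - 28) ** 2 + (yy - 28) ** 2 < 16 ** 2
disc2 = (xx - 44) ** 2 + (yy - 52) ** 2 < 20 ** 2
mask = disc1 | disc2

# Distance to the nearest background pixel peaks near each disc centre
distance = ndimage.distance_transform_edt(mask)

# One marker per local maximum of the distance map
coords = peak_local_max(distance, min_distance=10, labels=label(mask))
markers = np.zeros(mask.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)

# Flood the inverted distance map from the markers, restricted to the mask
labels = watershed(-distance, markers, mask=mask)

print("connected components in mask:", label(mask).max())
print("watershed regions:", labels.max())
```

Connected-component labelling alone would treat the blob as one object, whereas the watershed splits it along the narrow neck between the two distance peaks. The `min_distance` setting controls how close two peaks may be before they are merged into one marker, so it effectively sets the smallest separable building spacing.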

## 2.7 Building Segment Extraction

Paper Section 3.2:
> "We filtered segmented regions by size, retaining only those within a pixel area range of 500–100,000 pixels."

In [None]:
def extract_building_segments(labels, image, 
                              min_area=MIN_BUILDING_AREA, 
                              max_area=MAX_BUILDING_AREA):
    """
    Extract individual building segments within size constraints.
    
    Paper Reference: Section 3.2
    "We filtered segmented regions by size, retaining only those within 
     a pixel area range of 500–100,000 pixels."
    
    Args:
        labels (np.array): Labeled segments from watershed
        image (np.array): Original image
        min_area (int): Minimum building area in pixels
        max_area (int): Maximum building area in pixels
        
    Returns:
        list: List of dictionaries with building crops and metadata
    """
    buildings = []
    
    for region in measure.regionprops(labels):
        # Filter by area
        if min_area <= region.area <= max_area:
            # Get bounding box
            minr, minc, maxr, maxc = region.bbox
            
            # Extract building crop
            building_crop = image[minr:maxr, minc:maxc]
            
            buildings.append({
                'crop': building_crop,
                'bbox': (minr, minc, maxr, maxc),
                'area': region.area,
                'centroid': region.centroid
            })
    
    return buildings

print("Building extraction function defined.")
print(f"Area filter: {MIN_BUILDING_AREA} - {MAX_BUILDING_AREA} pixels")
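
The area filter can be checked on a hand-built label image whose region sizes are known in advance. The layout below is invented for illustration; the bounds match the 500–100,000 pixel range quoted above.

```python
import numpy as np
from skimage import measure

# Hand-built label image with three regions of known area
labels = np.zeros((400, 400), dtype=int)
labels[0:350, 0:350] = 1        # 122,500 px: above the upper bound
labels[360:365, 0:5] = 2        # 25 px: below the lower bound
labels[360:390, 100:130] = 3    # 900 px: inside the range

MIN_AREA, MAX_AREA = 500, 100_000

# Keep (label, area, bbox) for regions whose pixel count is within bounds
kept = [
    (region.label, int(region.area), region.bbox)
    for region in measure.regionprops(labels)
    if MIN_AREA <= region.area <= MAX_AREA
]
print(kept)
```

The bounding box returned by `regionprops` is half-open, `(min_row, min_col, max_row, max_col)`, so slicing `image[minr:maxr, minc:maxc]` covers the region exactly. Note that the crop is rectangular and can include pixels from neighbouring regions.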

## 2.8 Complete Segmentation Pipeline

In [None]:
def segment_buildings(image_path, segmentation_model=None):
    """
    Complete building segmentation pipeline.
    
    Paper Reference: Section 3.2
    Full pipeline as described in methodology.
    
    Args:
        image_path (str): Path to satellite image
        segmentation_model: Pretrained segmentation model (ReFineNet);
            currently unused because Step 3 relies on a threshold placeholder
        
    Returns:
        dict: Segmentation results with buildings and visualization
    """
    # Step 1: Load and preprocess
    image = load_and_preprocess_image(image_path)
    
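    # Step 2: Build TTA variants
    # Note: the placeholder in Step 3 does not consume these; they are the
    # intended inputs for the segmentation model once it is wired in
    tta_images = apply_tta(image)
    
    # Step 3: Get segmentation mask
    # Placeholder: Otsu thresholding on the grayscale image stands in for
    # ReFineNet, so `segmentation_model` is currently ignored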
    gray = cv2.cvtColor((image * 255).astype(np.uint8), cv2.COLOR_RGB2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    
    # Step 4: Morphological post-processing
    mask_refined = morphological_postprocess(mask)
    
    # Step 5: Watershed segmentation
    labels = watershed_segmentation(mask_refined, image)
    
    # Step 6: Extract buildings
    buildings = extract_building_segments(labels, image)
    
    return {
        'original': image,
        'mask': mask_refined,
        'labels': labels,
        'buildings': buildings,
        'num_buildings': len(buildings)
    }

print("Complete segmentation pipeline defined.")

## 2.9 Visualization

In [None]:
def visualize_segmentation(results, save_path=None):
    """
    Visualize segmentation results (Figure 5 in paper).
    
    Paper Reference: Figure 5 - Visual Comparison of Segmentation Stages
    """
    fig, axes = plt.subplots(1, 4, figsize=(16, 4))
    
    # Original image
    axes[0].imshow(results['original'])
    axes[0].set_title('Original Image')
    axes[0].axis('off')
    
    # Segmentation mask
    axes[1].imshow(results['mask'], cmap='gray')
    axes[1].set_title('Segmentation Mask')
    axes[1].axis('off')
    
    # Labeled regions
    axes[2].imshow(results['labels'], cmap='nipy_spectral')
    axes[2].set_title(f'Watershed Labels\n({results["num_buildings"]} buildings)')
    axes[2].axis('off')
    
    # Overlay
    overlay = results['original'].copy()
    for building in results['buildings']:
        minr, minc, maxr, maxc = building['bbox']
        cv2.rectangle(overlay, (minc, minr), (maxc, maxr), (0, 1, 0), 2)
    axes[3].imshow(overlay)
    axes[3].set_title('Detected Buildings')
    axes[3].axis('off')
    
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=150, bbox_inches='tight')
    
    plt.show()

print("Visualization function defined.")

## Summary

This notebook implements the preprocessing and segmentation pipeline:

1. **Image Preprocessing**: Resize to 512×512, normalize pixel intensities to [0, 1]
2. **Test-Time Augmentation**: Horizontal and vertical flips for robust predictions
3. **Segmentation**: ReFineNet in the paper; this notebook uses an Otsu threshold as a placeholder
4. **Morphological Refinement**: Opening to remove artifacts, followed by closing to fill small holes
5. **Watershed Segmentation**: Separate connected or overlapping buildings
6. **Size Filtering**: Keep segments with an area of 500–100,000 pixels

**Next Step**: `03_model_training.ipynb` - Train DenseNet201 classifier