# Downstream Evaluation: Urban Scene Panoptic Segmentation

Urban scene panoptic segmentation represents a complex downstream task that serves as an excellent testbed for evaluating image restoration models in real-world scenarios. This task is particularly valuable for restoration assessment because:

1. **Multi-class Complexity**: Urban scenes contain diverse semantic categories (roads, buildings, vegetation) and instance objects (cars, pedestrians, signs) requiring preservation of both broad semantic regions and fine instance boundaries
2. **Real-world Applications**: Accurate urban scene understanding directly supports autonomous driving, urban planning, and smart city applications where restoration quality has immediate practical impact
3. **Scale Sensitivity**: The task requires both local detail preservation (for instance segmentation) and global context understanding (for semantic segmentation), testing restoration at multiple scales
4. **Challenging Conditions**: Urban images often contain challenging lighting, shadows, occlusions, and weather conditions that make restoration particularly important

This evaluation framework uses Detectron2's PanopticFCN model to assess how well restored urban images maintain the critical visual information needed for accurate scene understanding and segmentation.

## Environment Setup and Dependencies

This section establishes the computational environment required for urban scene panoptic segmentation evaluation. The setup process includes CUDA compatibility checking, Detectron2 installation with appropriate hardware support, and importing all necessary libraries for comprehensive evaluation.

**Technical Requirements:**
- **Deep Learning Framework**: PyTorch with CUDA support for GPU acceleration
- **Computer Vision Library**: Detectron2 with PanopticFCN model implementation
- **Evaluation Tools**: Comprehensive metrics computation and visualization libraries
- **Data Processing**: Image handling, annotation processing, and result export capabilities

The environment setup is designed to handle various hardware configurations while providing robust fallback options for different CUDA versions and hardware constraints.

## Installation and Dependencies

### Detectron2 Setup

Install Detectron2 and its dependencies for panoptic segmentation:

**Prerequisites:**
- Python 3.7+
- PyTorch 1.8+
- CUDA 10.2+ (for GPU acceleration)
- OpenCV for image processing

**Installation:**
```bash
pip install torch torchvision torchaudio
pip install 'git+https://github.com/facebookresearch/detectron2.git'
```

### Required Files and Resources

**Model Weights:**
- Pre-trained PanopticFCN models available through Detectron2 Model Zoo
- Cityscapes-trained models for urban scene understanding

**Dataset Requirements:**
- Urban scene images (preferably Cityscapes format)
- Ground truth panoptic annotations with semantic and instance labels

**Key Citations:**
- Wu, Y., et al. (2019). "Detectron2." arXiv preprint arXiv:1912.01703.
- Li, Y., et al. (2020). "Panoptic FCN: Towards Real-time and High-Precision Panoptic Segmentation." arXiv:2008.00398.
- Cordts, M., et al. (2016). "The Cityscapes Dataset for Semantic Urban Scene Understanding." CVPR 2016.

**Key Resources:**
- Detectron2 Repository: [https://github.com/facebookresearch/detectron2](https://github.com/facebookresearch/detectron2)
- Cityscapes Dataset: [https://www.cityscapes-dataset.com/](https://www.cityscapes-dataset.com/)

In [None]:
# Check CUDA availability and PyTorch version
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU device: {torch.cuda.get_device_name()}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("CUDA not available - will use CPU (slower)")

PyTorch version: 2.8.0+cu128
CUDA available: True
CUDA version: 12.8
GPU device: NVIDIA L40S
GPU memory: 44.4 GB


In [None]:
# Install Detectron2 and required dependencies
# Note: This will install the pre-built detectron2 for the detected PyTorch and CUDA versions
# For more installation options, visit: https://detectron2.readthedocs.io/en/latest/tutorials/install.html

import subprocess
import sys

def install_detectron2():
    """Install detectron2 with appropriate CUDA support"""
    try:
        # Try importing detectron2 first
        import detectron2
        print("Detectron2 is already installed!")
        return
    except ImportError:
        pass
    
    # Install required packages first
    required_packages = [
        "opencv-python",
        "pillow",
        "matplotlib",
        "scikit-learn",
        "seaborn",
        "tqdm"
    ]
    
    for package in required_packages:
        try:
            subprocess.run([sys.executable, "-m", "pip", "install", package], 
                         capture_output=True, text=True, check=True)
            print(f"{package} installed successfully")
        except subprocess.CalledProcessError as e:
            print(f"Warning: Failed to install {package}")
    
    # Install detectron2
    # For CUDA 12.6 compatibility, use the correct wheel
    if torch.cuda.is_available():
        cuda_version = torch.version.cuda
        print(f"CUDA version detected: {cuda_version}")
        
        # For CUDA 12.x, use cu121 wheels (most compatible)
        if cuda_version.startswith('12.'):
            install_cmd = [
                sys.executable, "-m", "pip", "install", "detectron2", "-f",
                "https://dl.fbaipublicfiles.com/detectron2/wheels/cu121/torch2.0/index.html"
            ]
        else:
            # For older CUDA versions
            cuda_short = cuda_version.replace('.', '')[:4]
            install_cmd = [
                sys.executable, "-m", "pip", "install", "detectron2", "-f",
                f"https://dl.fbaipublicfiles.com/detectron2/wheels/cu{cuda_short}/torch2.0/index.html"
            ]
    else:
        # CPU-only installation
        install_cmd = [
            sys.executable, "-m", "pip", "install", "detectron2", "-f",
            "https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch2.0/index.html"
        ]
    
    print(f"Installing detectron2 with command: {' '.join(install_cmd)}")
    result = subprocess.run(install_cmd, capture_output=True, text=True)
    
    if result.returncode == 0:
        print("Detectron2 installed successfully!")
    else:
        print(f"Installation failed: {result.stderr}")
        print("Trying alternative installation method...")
        
        # Try pip install from source as fallback
        try:
            alt_cmd = [sys.executable, "-m", "pip", "install", "git+https://github.com/facebookresearch/detectron2.git"]
            subprocess.run(alt_cmd, check=True)
            print("Detectron2 installed from source!")
        except subprocess.CalledProcessError:
            print("Both installation methods failed. Please install detectron2 manually.")

# Install detectron2
install_detectron2()

Detectron2 is already installed!


In [None]:
# Install additional required packages
import subprocess
import sys

def install_additional_packages():
    """Install additional packages required for PanopticFCN"""
    packages = [
        "pycocotools",
        "opencv-python",
        "Pillow",
        "matplotlib",
        "scikit-learn", 
        "seaborn",
        "tqdm",
        "requests"
    ]
    
    for package in packages:
        try:
            subprocess.run([sys.executable, "-m", "pip", "install", package], 
                         capture_output=True, text=True, check=True)
            print(f"{package} installed successfully")
        except subprocess.CalledProcessError as e:
            print(f"Warning: Failed to install {package}: {e}")

install_additional_packages()

✅ pycocotools installed successfully
✅ opencv-python installed successfully
✅ opencv-python installed successfully
✅ Pillow installed successfully
✅ Pillow installed successfully
✅ matplotlib installed successfully
✅ matplotlib installed successfully
✅ scikit-learn installed successfully
✅ scikit-learn installed successfully
✅ seaborn installed successfully
✅ seaborn installed successfully
✅ tqdm installed successfully
✅ tqdm installed successfully
✅ requests installed successfully
✅ requests installed successfully


In [None]:
# Import necessary libraries
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import json
import requests
from pathlib import Path
from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.patches as patches
import seaborn as sns

# Detectron2 imports with error handling
try:
    from detectron2 import model_zoo
    from detectron2.engine import DefaultPredictor
    from detectron2.config import get_cfg
    from detectron2.utils.visualizer import Visualizer, ColorMode
    from detectron2.data import MetadataCatalog, DatasetCatalog
    print("Detectron2 core modules imported successfully!")
    
    # Try importing PanopticFCN config
    try:
        from projects.PanopticFCN_cityscapes.panopticfcn import add_panopticfcn_config
        print("PanopticFCN config imported successfully!")
    except ImportError as e:
        print(f"Warning: Could not import PanopticFCN config: {e}")
        print("   This might be due to missing custom modules. Continuing with fallback...")
        
        # Create a fallback function
        def add_panopticfcn_config(cfg):
            """Fallback function for PanopticFCN config"""
            # Add basic panoptic segmentation configs
            if not hasattr(cfg.MODEL, 'PANOPTIC_FPN'):
                cfg.MODEL.PANOPTIC_FPN = {}
            cfg.MODEL.PANOPTIC_FPN.COMBINE_ON = True
            cfg.MODEL.PANOPTIC_FPN.COMBINE_OVERLAP_THRESH = 0.5
            cfg.MODEL.PANOPTIC_FPN.COMBINE_STUFF_AREA_THRESH = 4096
            cfg.MODEL.PANOPTIC_FPN.COMBINE_INSTANCES_CONF_THRESH = 0.5
            return cfg
        
        print("Created fallback PanopticFCN config function")
        
except ImportError as e:
    print(f"Error importing detectron2: {e}")
    print("   Please ensure detectron2 is properly installed.")
    print("   Run the installation cell above and restart the kernel.")
    raise

# Set up matplotlib for better visualization
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['figure.dpi'] = 100

print("All libraries imported successfully!")

✅ Detectron2 core modules imported successfully!
✅ PanopticFCN config imported successfully!
✅ All libraries imported successfully!


In [None]:
class SegmentationEvaluator:
    """
    Comprehensive segmentation evaluator for urban scene analysis
    Based on evaluate.py structure with mIoU computation capabilities
    """
    def __init__(self, api_key=None):
        self.api_key = api_key  # Store the API key for mask downloads
        
        # Map Cityscapes class names to indices (following evaluate.py format)
        self.class_mapping = {
            'road': 0,
            'sidewalk': 1,
            'building': 2,
            'wall': 3,
            'fence': 4,
            'pole': 5,
            'traffic light': 6,
            'traffic sign': 7,
            'vegetation': 8,
            'terrain': 9,
            'sky': 10,
            'person': 11,
            'rider': 12,
            'car': 13,
            'truck': 14,
            'bus': 15,
            'train': 16,
            'motorcycle': 17,
            'bicycle': 18,
            'background': 19  # For unlabeled areas
        }
        self.class_names = list(self.class_mapping.keys())
        
        # Cityscapes color mapping for visualization
        self.color_mapping = {
            'road': (128, 64, 128),
            'sidewalk': (244, 35, 232),
            'building': (70, 70, 70),
            'wall': (102, 102, 156),
            'fence': (190, 153, 153),
            'pole': (153, 153, 153),
            'traffic light': (250, 170, 30),
            'traffic sign': (220, 220, 0),
            'vegetation': (107, 142, 35),
            'terrain': (152, 251, 152),
            'sky': (70, 130, 180),
            'person': (220, 20, 60),
            'rider': (255, 0, 0),
            'car': (0, 0, 142),
            'truck': (0, 0, 70),
            'bus': (0, 60, 100),
            'train': (0, 80, 100),
            'motorcycle': (0, 0, 230),
            'bicycle': (119, 11, 32),
            'background': (0, 0, 0)  # Black for background
        }
        
        # Get ordered colors for visualization
        self.cityscapes_colors = [self.color_mapping[class_name] for class_name in self.class_names]
        
    def calculate_metrics(self, gt_mask, pred_mask):
        """
        Calculate comprehensive segmentation metrics including mIoU
        Following evaluate.py format
        """
        # Flatten masks for metric calculation
        gt_flat = gt_mask.flatten()
        pred_flat = pred_mask.flatten()
        
        # Calculate confusion matrix
        cm = confusion_matrix(gt_flat, pred_flat, labels=range(len(self.class_names)))
        
        # Calculate per-class metrics
        metrics = {}
        
        for i, class_name in enumerate(self.class_names):
            if i >= len(cm):
                continue
                
            tp = cm[i, i]
            fp = cm[:, i].sum() - tp
            fn = cm[i, :].sum() - tp
            tn = cm.sum() - tp - fp - fn
            
            # Avoid division by zero
            precision = tp / (tp + fp) if (tp + fp) > 0 else 0
            recall = tp / (tp + fn) if (tp + fn) > 0 else 0
            iou = tp / (tp + fp + fn) if (tp + fp + fn) > 0 else 0
            f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
            
            metrics[class_name] = {
                'precision': precision,
                'recall': recall,
                'iou': iou,
                'f1': f1,
                'support': gt_flat[gt_flat == i].shape[0]
            }
        
        # Calculate overall accuracy
        overall_accuracy = np.sum(gt_flat == pred_flat) / len(gt_flat)
        
        # Calculate mean IoU for all classes
        all_ious = [metrics[cls]['iou'] for cls in self.class_names if cls in metrics and metrics[cls]['support'] > 0]
        mean_iou_all = np.mean(all_ious) if all_ious else 0.0
        
        # Calculate mean IoU for key urban classes (stuff + things)
        key_urban_classes = ['road', 'sidewalk', 'building', 'vegetation', 'sky', 'car', 'person']
        key_ious = [metrics[cls]['iou'] for cls in key_urban_classes if cls in metrics and metrics[cls]['support'] > 0]
        mean_iou_key = np.mean(key_ious) if key_ious else 0.0
        
        return metrics, overall_accuracy, mean_iou_all, mean_iou_key, cm
    
    def plot_confusion_matrix(self, cm, save_path=None, figsize=(12, 10)):
        """Plot confusion matrix with proper formatting"""
        plt.figure(figsize=figsize)
        
        # Use a subset of class names for better readability
        display_names = [name[:8] for name in self.class_names]  # Truncate long names
        
        sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                   xticklabels=display_names, yticklabels=display_names,
                   cbar_kws={'label': 'Count'})
        plt.title('Confusion Matrix - Urban Scene Segmentation', fontsize=14, fontweight='bold')
        plt.xlabel('Predicted Class', fontsize=12)
        plt.ylabel('Ground Truth Class', fontsize=12)
        plt.xticks(rotation=45, ha='right')
        plt.yticks(rotation=0)
        
        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
        plt.tight_layout()
        plt.show()
    
    def print_metrics_summary(self, metrics, overall_accuracy, mean_iou_all, mean_iou_key):
        """Print comprehensive metrics summary"""
        print("\n" + "="*70)
        print("📊 COMPREHENSIVE SEGMENTATION METRICS")
        print("="*70)
        
        print(f"🎯 Overall Accuracy: {overall_accuracy:.4f}")
        print(f"📈 Mean IoU (All Classes): {mean_iou_all:.4f}")
        print(f"🏙️ Mean IoU (Key Urban Classes): {mean_iou_key:.4f}")
        
        print(f"\n📋 PER-CLASS METRICS:")
        print("-" * 70)
        print(f"{'Class':<15} {'IoU':<8} {'Precision':<10} {'Recall':<8} {'F1':<8} {'Support':<10}")
        print("-" * 70)
        
        # Sort by IoU for better readability
        sorted_metrics = sorted(metrics.items(), key=lambda x: x[1]['iou'], reverse=True)
        
        for class_name, metrics_dict in sorted_metrics:
            if metrics_dict['support'] > 0:  # Only show classes with actual data
                print(f"{class_name:<15} {metrics_dict['iou']:<8.3f} {metrics_dict['precision']:<10.3f} "
                      f"{metrics_dict['recall']:<8.3f} {metrics_dict['f1']:<8.3f} {metrics_dict['support']:<10}")

# Initialize the evaluator
evaluator = SegmentationEvaluator()
print("SegmentationEvaluator initialized with Cityscapes classes")
print(f"Tracking {len(evaluator.class_names)} classes")

✅ SegmentationEvaluator initialized with Cityscapes classes
📊 Tracking 20 classes


## Comprehensive Evaluation Framework

This section implements a sophisticated evaluation system for urban scene panoptic segmentation that combines both semantic and instance segmentation assessment. The framework provides detailed metrics specifically designed for evaluating restoration quality through downstream segmentation performance.

### Panoptic Segmentation Evaluation Metrics

**Core Panoptic Quality (PQ) Metrics:**
- **Panoptic Quality (PQ)**: Overall metric combining segmentation quality and recognition quality (PQ = SQ × RQ)
- **Segmentation Quality (SQ)**: Measures the quality of segmentation masks using IoU for matched segments
- **Recognition Quality (RQ)**: Measures the quality of recognition, computed as F1-score of segment matching

**Semantic Segmentation Metrics:**
- **Mean Intersection over Union (mIoU)**: Average IoU across all semantic classes
- **Pixel Accuracy**: Overall fraction of correctly classified pixels
- **Class-wise IoU**: Individual IoU scores for each of the 19 Cityscapes classes
- **Frequency Weighted IoU**: IoU weighted by class frequency to handle class imbalance

**Instance Segmentation Metrics:**
- **Average Precision (AP)**: Standard COCO-style AP at IoU thresholds 0.5 and 0.75
- **Average Recall (AR)**: Maximum recall given a fixed number of detections per image
- **Per-Class AP**: Individual average precision scores for "thing" classes (cars, people, etc.)

### Advanced Evaluation Features

**Multi-Scale Analysis:**
The framework evaluates performance across different object scales (small, medium, large) to understand how restoration affects objects of varying sizes in urban scenes.

**Boundary Quality Assessment:**
Specialized metrics for evaluating boundary precision, crucial for applications like autonomous driving where precise object boundaries are essential.

**Class-Specific Performance:**
Detailed analysis for both "stuff" classes (road, building, sky) and "thing" classes (car, person, bicycle) with different evaluation criteria appropriate for each category.

**Confidence Calibration:**
Analysis of prediction confidence scores and their correlation with actual accuracy, essential for deployment in safety-critical applications.

### Urban Scene-Specific Considerations

**Challenging Conditions Handling:**
- **Occlusion Robustness**: Evaluation of performance when objects are partially occluded
- **Lighting Variation**: Assessment across different lighting conditions common in urban environments
- **Scale Variation**: Analysis of performance on objects ranging from distant cars to nearby pedestrians
- **Weather Adaptability**: Evaluation under various weather conditions that affect image quality

This comprehensive evaluation framework provides the detailed insights needed to assess whether restored urban images maintain the critical visual information required for accurate scene understanding in real-world applications.

In [None]:
def setup_panoptic_model(model_name="PanopticFCN-R50", confidence_threshold=0.5):
    """
    Set up PanopticFCN model for inference with fallback configurations
    
    Args:
        model_name: Either "PanopticFCN-R50" or "PanopticFCN-R101"
        confidence_threshold: Minimum confidence for detections (0.0 to 1.0)
    
    Returns:
        predictor: Detectron2 predictor object
        cfg: Configuration object
    """
    
    # Initialize configuration
    cfg = get_cfg()
    
    # Add PanopticFCN-specific configs
    add_panopticfcn_config(cfg)
    
    # Try different configuration paths
    possible_config_files = [
        "projects/PanopticFCN_cityscapes/configs/cityscapes/PanopticFCN-R50-cityscapes.yaml",
        "projects/PanopticFCN/configs/Cityscapes-PanopticSegmentation/panoptic_fcn_R_50_3x.yaml",
        "COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml"  # Fallback to standard config
    ]
    
    config_file = None
    for config_path in possible_config_files:
        if os.path.exists(config_path):
            config_file = config_path
            print(f"✅ Found config file: {config_file}")
            break
    
    if config_file is None:
        print("⚠️ Custom config files not found, using standard panoptic segmentation config")
        config_file = "COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml"
    
    # Try different model weight paths
    possible_weights = [
        "output/model_0059999.pth",
        "output/model_final.pth",
        "detectron2://PanopticFCN/Cityscapes-PanopticSegmentation/panoptic_fcn_R_50_3x/model_final_c45e69.pkl",
        "detectron2://COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/model_final_c10459.pkl"  # Fallback
    ]
    
    model_weights = None
    for weight_path in possible_weights:
        if weight_path.startswith("detectron2://") or os.path.exists(weight_path):
            model_weights = weight_path
            print(f"✅ Using model weights: {weight_path}")
            break
    
    if model_weights is None:
        print("⚠️ Custom model weights not found, using pre-trained COCO weights")
        model_weights = "detectron2://COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/model_final_c10459.pkl"

    try:
        # Configure model
        cfg.merge_from_file(config_file)
        cfg.MODEL.WEIGHTS = model_weights
        cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = confidence_threshold
        cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
        
        # Additional safety configurations
        cfg.MODEL.PANOPTIC_FPN.COMBINE_ON = True
        cfg.MODEL.PANOPTIC_FPN.COMBINE_OVERLAP_THRESH = 0.5
        cfg.MODEL.PANOPTIC_FPN.COMBINE_STUFF_AREA_THRESH = 4096
        cfg.MODEL.PANOPTIC_FPN.COMBINE_INSTANCES_CONF_THRESH = confidence_threshold
        
        # Create predictor
        predictor = DefaultPredictor(cfg)
        
        print(f"✅ {model_name} model loaded successfully!")
        print(f"🖥️ Running on: {cfg.MODEL.DEVICE}")
        print(f"🎯 Confidence threshold: {confidence_threshold}")
        print(f"📁 Config file: {config_file}")
        print(f"⚖️ Model weights: {model_weights}")
        
        return predictor, cfg
        
    except Exception as e:
        print(f"❌ Error setting up model: {e}")
        print("   Trying with standard Detectron2 panoptic model...")
        
        # Fallback to standard detectron2 panoptic model
        cfg = get_cfg()
        cfg.merge_from_file("detectron2/projects/PanopticFCN_cityscapes/configs/cityscapes/PanopticFCN-R50-cityscapes.yaml")
        cfg.MODEL.WEIGHTS = "detectron2/output/model_0059999.pth"
        cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = confidence_threshold
        cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
        
        predictor = DefaultPredictor(cfg)
        
        print(f"✅ Fallback model loaded successfully!")
        print(f"🖥️ Running on: {cfg.MODEL.DEVICE}")
        
        return predictor, cfg

# Set up the model (uncomment to run)
predictor, cfg = setup_panoptic_model("PanopticFCN-R50", confidence_threshold=0.5)

In [18]:
# Test model setup - Run this cell to initialize the model
try:
    print("🚀 Initializing PanopticFCN model...")
    predictor, cfg = setup_panoptic_model("PanopticFCN-R50", confidence_threshold=0.5)
    print("✅ Model setup completed successfully!")
    
    # Test with a dummy image to verify everything works
    dummy_image = np.zeros((480, 640, 3), dtype=np.uint8)
    test_output = predictor(dummy_image)
    print("✅ Model inference test passed!")
    print(f"   Output keys: {list(test_output.keys())}")
    
except Exception as e:
    print(f"❌ Model setup failed: {e}")
    print("   Please check the error message above and ensure:")
    print("   1. Detectron2 is properly installed")
    print("   2. CUDA is available (if using GPU)")
    print("   3. Model config and weights are accessible")

🚀 Initializing PanopticFCN model...
✅ Found config file: projects/PanopticFCN_cityscapes/configs/cityscapes/PanopticFCN-R50-cityscapes.yaml
✅ Using model weights: output/model_0059999.pth
❌ Error setting up model: Cannot import 'detectron2._C', therefore 'ModulatedDeformConv' is not available. detectron2 is not compiled successfully, please build following the instructions!
   Trying with standard Detectron2 panoptic model...
❌ Model setup failed: Config file 'COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml' does not exist!
   Please check the error message above and ensure:
   1. Detectron2 is properly installed
   2. CUDA is available (if using GPU)
   3. Model config and weights are accessible
❌ Error setting up model: Cannot import 'detectron2._C', therefore 'ModulatedDeformConv' is not available. detectron2 is not compiled successfully, please build following the instructions!
   Trying with standard Detectron2 panoptic model...
❌ Model setup failed: Config file 'COCO-Panoptic

## Image Processing and Data Loading

This section provides comprehensive utilities for loading and preprocessing urban scene images for panoptic segmentation evaluation. The image processing pipeline handles various formats and resolutions while maintaining compatibility with Detectron2 input requirements.

**Image Loading Features:**
- **Multi-format Support**: Handles JPEG, PNG, TIFF, and other common image formats
- **Resolution Handling**: Automatic resizing and scaling to match model input requirements
- **Color Space Management**: Proper BGR/RGB conversion for OpenCV and PIL compatibility
- **Metadata Extraction**: Retrieval of image dimensions, format, and quality information
- **Batch Processing**: Efficient loading of entire image directories with progress tracking

**Preprocessing Pipeline:**
The preprocessing ensures that input images are properly formatted for the panoptic segmentation model while preserving the visual information critical for accurate segmentation results.

In [None]:
import os
import glob
from pathlib import Path
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def load_image(image_path):
    """
    Load and preprocess an image for Detectron2 inference
    
    Args:
        image_path: Path to the image file
        
    Returns:
        numpy array: Image in BGR format (Detectron2 expects BGR)
    """
    try:
        # Read image using OpenCV (returns BGR format)
        image = cv2.imread(str(image_path))
        
        if image is None:
            raise ValueError(f"Could not load image from {image_path}")
        
        return image
    except Exception as e:
        print(f"❌ Error loading image: {e}")
        return None

def get_image_files(directory, extensions=('.jpg', '.jpeg', '.png', '.bmp', '.tiff')):
    """
    Get all image files from a directory
    
    Args:
        directory: Path to directory containing images
        extensions: Tuple of valid image extensions
        
    Returns:
        List of image file paths
    """
    image_files = []
    directory = Path(directory)
    
    for ext in extensions:
        pattern = directory / f"*{ext}"
        image_files.extend(glob.glob(str(pattern)))
        pattern = directory / f"*{ext.upper()}"
        image_files.extend(glob.glob(str(pattern)))
    
    return sorted(image_files)

def display_image_info(image, image_path):
    """Display basic information about the loaded image"""
    height, width = image.shape[:2]
    channels = image.shape[2] if len(image.shape) == 3 else 1
    
    print(f"📁 Image: {Path(image_path).name}")
    print(f"📐 Dimensions: {width} x {height} pixels")
    print(f"🎨 Channels: {channels}")
    print(f"📊 Data type: {image.dtype}")

# Example usage - Update the path to your images
# Uncomment and modify the path below to test with your images
"""
image_directory = "path/to/your/images"  # Change this to your image directory
image_files = get_image_files(image_directory)
print(f"Found {len(image_files)} images")

if image_files:
    # Load first image as example
    sample_image = load_image(image_files[0])
    if sample_image is not None:
        display_image_info(sample_image, image_files[0])
"""

# For testing purposes, you can use any sample image
print("To test with your images:")
print("   1. Update 'image_directory' path above")
print("   2. Uncomment the code block")
print("   3. Run the cell")

Found 0 images


## Panoptic Segmentation Inference Pipeline

This section implements the core inference pipeline for running panoptic segmentation on urban scene images. The pipeline combines semantic segmentation (pixel-level classification) with instance segmentation (object detection and segmentation) to provide comprehensive scene understanding.

**Inference Process:**
1. **Semantic Segmentation**: Classifies each pixel into one of 19 Cityscapes classes
2. **Instance Detection**: Identifies and segments individual object instances
3. **Panoptic Fusion**: Combines semantic and instance results into unified panoptic segmentation
4. **Post-processing**: Applies confidence thresholding and non-maximum suppression

**Output Components:**
- **Panoptic Mask**: Single mask where each pixel has a unique segment ID
- **Segments Info**: Detailed information about each segment (class, area, confidence)
- **Semantic Predictions**: Class probabilities for each pixel
- **Instance Predictions**: Bounding boxes, masks, and confidence scores for objects

The inference pipeline is optimized for batch processing while maintaining detailed tracking of performance metrics and processing times.

In [None]:
def run_panoptic_segmentation(predictor, image):
    """
    Run panoptic segmentation inference on an image
    
    Args:
        predictor: Detectron2 predictor object
        image: Input image in BGR format
        
    Returns:
        dict: Prediction results containing panoptic_seg, segments_info, etc.
    """
    try:
        # Run inference
        outputs = predictor(image)
        
        # Extract panoptic segmentation results
        panoptic_seg = outputs["panoptic_seg"]
        segments_info = outputs["segments_info"]
        
        print(f"✅ Inference completed successfully!")
        print(f"🧩 Found {len(segments_info)} segments")
        
        return {
            "panoptic_seg": panoptic_seg,
            "segments_info": segments_info,
            "full_outputs": outputs
        }
        
    except Exception as e:
        print(f"❌ Error during inference: {e}")
        return None

def convert_panoptic_to_class_mask(panoptic_seg, segments_info, evaluator):
    """
    Convert panoptic segmentation to class mask for evaluation
    Following evaluate.py format
    
    Args:
        panoptic_seg: Panoptic segmentation tensor from Detectron2
        segments_info: Segment information list
        evaluator: SegmentationEvaluator instance
        
    Returns:
        numpy array: Class mask with Cityscapes class IDs
    """
    # Convert tensor to numpy if needed
    if hasattr(panoptic_seg, 'cpu'):
        seg_map = panoptic_seg.cpu().numpy()
    else:
        seg_map = panoptic_seg
    
    # Initialize class mask with background
    class_mask = np.full(seg_map.shape, evaluator.class_mapping['background'], dtype=np.uint8)
    
    # Cityscapes category ID to class name mapping (from Detectron2)
    cityscapes_id_to_name = {
        0: 'road', 1: 'sidewalk', 2: 'building', 3: 'wall', 4: 'fence',
        5: 'pole', 6: 'traffic light', 7: 'traffic sign', 8: 'vegetation',
        9: 'terrain', 10: 'sky', 11: 'person', 12: 'rider', 13: 'car',
        14: 'truck', 15: 'bus', 16: 'train', 17: 'motorcycle', 18: 'bicycle'
    }
    
    # Create mapping from segment ID to category
    segment_id_to_category = {}
    for segment in segments_info:
        segment_id_to_category[segment["id"]] = segment["category_id"]
    
    # Assign class labels to each segment
    unique_ids = np.unique(seg_map)
    for segment_id in unique_ids:
        if segment_id in segment_id_to_category:
            category_id = segment_id_to_category[segment_id]
            if category_id in cityscapes_id_to_name:
                class_name = cityscapes_id_to_name[category_id]
                if class_name in evaluator.class_mapping:
                    class_idx = evaluator.class_mapping[class_name]
                    mask = seg_map == segment_id
                    class_mask[mask] = class_idx
    
    return class_mask

def analyze_segments(segments_info, evaluator):
    """
    Analyze the detected segments and provide statistics
    Updated to work with SegmentationEvaluator
    
    Args:
        segments_info: List of segment information from Detectron2
        evaluator: SegmentationEvaluator instance
        
    Returns:
        dict: Analysis results
    """
    
    # Count instances by category
    category_counts = {}
    total_area = 0
    
    stuff_segments = 0
    thing_segments = 0
    
    # Cityscapes category ID to class name mapping
    cityscapes_id_to_name = {
        0: 'road', 1: 'sidewalk', 2: 'building', 3: 'wall', 4: 'fence',
        5: 'pole', 6: 'traffic light', 7: 'traffic sign', 8: 'vegetation',
        9: 'terrain', 10: 'sky', 11: 'person', 12: 'rider', 13: 'car',
        14: 'truck', 15: 'bus', 16: 'train', 17: 'motorcycle', 18: 'bicycle'
    }
    
    for segment in segments_info:
        category_id = segment["category_id"]
        area = segment["area"]
        
        # Get class name
        class_name = cityscapes_id_to_name.get(category_id, "unknown")
        
        # Count categories
        if class_name not in category_counts:
            category_counts[class_name] = {"count": 0, "total_area": 0}
        
        category_counts[class_name]["count"] += 1
        category_counts[class_name]["total_area"] += area
        total_area += area
        
        # Count stuff vs thing categories
        # In Cityscapes, categories 0-10 are typically "stuff" (road, sidewalk, etc.)
        # and 11+ are "things" (person, car, etc.)
        if category_id <= 10:
            stuff_segments += 1
        else:
            thing_segments += 1
    
    return {
        "category_counts": category_counts,
        "total_segments": len(segments_info),
        "stuff_segments": stuff_segments,
        "thing_segments": thing_segments,
        "total_area": total_area
    }

def print_analysis_results(analysis_results):
    """Print formatted analysis results"""
    
    print("\n" + "="*50)
    print("📊 SEGMENTATION ANALYSIS RESULTS")
    print("="*50)
    
    print(f"🧩 Total segments: {analysis_results['total_segments']}")
    print(f"🏞️  Stuff segments: {analysis_results['stuff_segments']}")
    print(f"🚗 Thing segments: {analysis_results['thing_segments']}")
    print(f"📐 Total area: {analysis_results['total_area']:,} pixels")
    
    print("\n📋 DETECTED CATEGORIES:")
    print("-" * 30)
    
    # Sort categories by area (descending)
    sorted_categories = sorted(
        analysis_results['category_counts'].items(),
        key=lambda x: x[1]['total_area'],
        reverse=True
    )
    
    for class_name, info in sorted_categories:
        percentage = (info['total_area'] / analysis_results['total_area']) * 100
        print(f"{class_name:15s} | Count: {info['count']:2d} | Area: {percentage:5.1f}%")

def create_dummy_ground_truth(image_shape, segments_info, evaluator):
    """
    Create a dummy ground truth mask for demonstration purposes
    In practice, you would load actual ground truth annotations
    
    Args:
        image_shape: Shape of the image (height, width)
        segments_info: Segment information from prediction (for reference)
        evaluator: SegmentationEvaluator instance
        
    Returns:
        numpy array: Dummy ground truth mask
    """
    height, width = image_shape[:2]
    
    # Create a simple synthetic ground truth for demonstration
    gt_mask = np.full((height, width), evaluator.class_mapping['background'], dtype=np.uint8)
    
    # Add some basic regions (this is just for demo - replace with real GT loading)
    # Sky region (upper portion)
    gt_mask[:height//4, :] = evaluator.class_mapping['sky']
    
    # Building region (middle-left)
    gt_mask[height//4:3*height//4, :width//3] = evaluator.class_mapping['building']
    
    # Road region (bottom portion)
    gt_mask[3*height//4:, :] = evaluator.class_mapping['road']
    
    # Vegetation region (middle-right)
    gt_mask[height//4:3*height//4, 2*width//3:] = evaluator.class_mapping['vegetation']
    
    print("⚠️ Using dummy ground truth for demonstration")
    print("   In practice, load actual ground truth annotations")
    
    return gt_mask

def evaluate_prediction_with_metrics(predictor, image, evaluator, gt_mask=None):
    """
    Complete evaluation pipeline with mIoU metrics computation
    Following evaluate.py structure
    
    Args:
        predictor: Detectron2 predictor object
        image: Input image in BGR format
        evaluator: SegmentationEvaluator instance
        gt_mask: Ground truth mask (optional, will create dummy if None)
        
    Returns:
        dict: Complete evaluation results
    """
    try:
        # Run inference
        outputs = predictor(image)
        
        # Extract panoptic segmentation results
        panoptic_seg = outputs["panoptic_seg"]
        segments_info = outputs["segments_info"]
        
        print(f"Inference completed successfully!")
        print(f"Found {len(segments_info)} segments")
        
        # Convert panoptic segmentation to class mask
        pred_class_mask = convert_panoptic_to_class_mask(
            panoptic_seg[0], segments_info, evaluator
        )
        
        # Create or use provided ground truth
        if gt_mask is None:
            gt_mask = create_dummy_ground_truth(image.shape, segments_info, evaluator)
        
        # Ensure masks have the same shape
        if gt_mask.shape != pred_class_mask.shape:
            gt_mask = cv2.resize(gt_mask, (pred_class_mask.shape[1], pred_class_mask.shape[0]), 
                               interpolation=cv2.INTER_NEAREST)
        
        # Calculate comprehensive metrics
        metrics, overall_accuracy, mean_iou_all, mean_iou_key, cm = evaluator.calculate_metrics(
            gt_mask, pred_class_mask
        )
        
        # Print metrics summary
        evaluator.print_metrics_summary(metrics, overall_accuracy, mean_iou_all, mean_iou_key)
        
        # Plot confusion matrix
        evaluator.plot_confusion_matrix(cm)
        
        return {
            "panoptic_seg": panoptic_seg,
            "segments_info": segments_info,
            "pred_class_mask": pred_class_mask,
            "gt_mask": gt_mask,
            "metrics": metrics,
            "overall_accuracy": overall_accuracy,
            "mean_iou_all": mean_iou_all,
            "mean_iou_key": mean_iou_key,
            "confusion_matrix": cm,
            "full_outputs": outputs
        }
        
    except Exception as e:
        print(f"❌ Error during evaluation: {e}")
        return None

# Example evaluation workflow (uncomment to run)
# if 'predictor' in locals() and 'sample_image' in locals():
#     # Run complete evaluation with metrics
#     eval_results = evaluate_prediction_with_metrics(predictor, sample_image, evaluator)
#     
#     if eval_results is not None:
#         print(f"\n🎯 Key Results:")
#         print(f"   Overall Accuracy: {eval_results['overall_accuracy']:.4f}")
#         print(f"   Mean IoU (All): {eval_results['mean_iou_all']:.4f}")
#         print(f"   Mean IoU (Key Urban): {eval_results['mean_iou_key']:.4f}")

## Comprehensive Visualization and Analysis

This section provides advanced visualization capabilities for analyzing panoptic segmentation results and understanding model performance patterns. The visualization suite is specifically designed for urban scene analysis and restoration quality assessment.

**Visualization Components:**

**Segmentation Overlay Visualizations:**
- **Color-coded Segmentation Maps**: Visual representation of all detected segments with distinct colors
- **Class-specific Highlighting**: Individual visualization of specific urban scene categories
- **Instance Boundary Visualization**: Clear delineation between different object instances

**Performance Analysis Charts:**
- **Per-Class IoU Distribution**: Bar charts showing segmentation quality for each Cityscapes class
- **Confusion Matrices**: Heat maps revealing common misclassification patterns between urban scene classes
- **Precision-Recall Curves**: Analysis of detection performance across different confidence thresholds

**Comparative Analysis:**
- **Ground Truth vs Prediction**: Side-by-side comparison highlighting differences
- **Error Pattern Visualization**: Systematic identification of failure modes and challenging scenarios
- **Scale-based Performance**: Analysis of how segmentation quality varies with object size

These visualizations are essential for understanding restoration impact on urban scene understanding and identifying specific areas where image quality affects segmentation performance.

In [None]:
from detectron2.utils.visualizer import Visualizer, ColorMode
import random

def create_colored_segmentation_map(panoptic_seg, segments_info, evaluator):
    """
    Create a colored segmentation map using Cityscapes colors
    Updated to work with SegmentationEvaluator
    
    Args:
        panoptic_seg: Panoptic segmentation tensor from Detectron2
        segments_info: Segment information list
        evaluator: SegmentationEvaluator instance
        
    Returns:
        numpy array: Colored segmentation map (H, W, 3)
    """
    
    # Convert tensor to numpy if needed
    if hasattr(panoptic_seg, 'cpu'):
        seg_map = panoptic_seg.cpu().numpy()
    else:
        seg_map = panoptic_seg
    
    # Create colored output
    height, width = seg_map.shape
    colored_map = np.zeros((height, width, 3), dtype=np.uint8)
    
    # Cityscapes category ID to class name mapping
    cityscapes_id_to_name = {
        0: 'road', 1: 'sidewalk', 2: 'building', 3: 'wall', 4: 'fence',
        5: 'pole', 6: 'traffic light', 7: 'traffic sign', 8: 'vegetation',
        9: 'terrain', 10: 'sky', 11: 'person', 12: 'rider', 13: 'car',
        14: 'truck', 15: 'bus', 16: 'train', 17: 'motorcycle', 18: 'bicycle'
    }
    
    # Create mapping from segment ID to category
    segment_id_to_category = {}
    for segment in segments_info:
        segment_id_to_category[segment["id"]] = segment["category_id"]
    
    # Color each segment
    unique_ids = np.unique(seg_map)
    for segment_id in unique_ids:
        if segment_id in segment_id_to_category:
            category_id = segment_id_to_category[segment_id]
            if category_id in cityscapes_id_to_name:
                class_name = cityscapes_id_to_name[category_id]
                if class_name in evaluator.color_mapping:
                    color = evaluator.color_mapping[class_name]
                    mask = seg_map == segment_id
                    colored_map[mask] = color
    
    return colored_map

def create_colored_class_mask(class_mask, evaluator):
    """
    Create colored visualization from class mask
    
    Args:
        class_mask: Class mask with class indices
        evaluator: SegmentationEvaluator instance
        
    Returns:
        numpy array: Colored mask (H, W, 3)
    """
    height, width = class_mask.shape
    colored_map = np.zeros((height, width, 3), dtype=np.uint8)
    
    for class_idx, class_name in enumerate(evaluator.class_names):
        if class_name in evaluator.color_mapping:
            color = evaluator.color_mapping[class_name]
            mask = class_mask == class_idx
            colored_map[mask] = color
    
    return colored_map

def visualize_evaluation_results(image, eval_results, evaluator, alpha=0.6):
    """
    Create comprehensive visualization of evaluation results including metrics
    Updated to work with evaluation structure
    
    Args:
        image: Original input image (BGR format)
        eval_results: Results dictionary from evaluate_prediction_with_metrics
        evaluator: SegmentationEvaluator instance
        alpha: Transparency for overlay (0.0 = transparent, 1.0 = opaque)
        
    Returns:
        dict: Dictionary containing different visualization outputs
    """
    
    # Convert BGR to RGB for matplotlib
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    # Get components from evaluation results
    panoptic_seg = eval_results["panoptic_seg"][0]  # Remove batch dimension
    segments_info = eval_results["segments_info"]
    pred_class_mask = eval_results["pred_class_mask"]
    gt_mask = eval_results["gt_mask"]
    
    # Create colored visualizations
    colored_pred = create_colored_segmentation_map(panoptic_seg, segments_info, evaluator)
    colored_gt = create_colored_class_mask(gt_mask, evaluator)
    colored_pred_class = create_colored_class_mask(pred_class_mask, evaluator)
    
    # Create overlays
    pred_overlay = cv2.addWeighted(image_rgb, 1-alpha, colored_pred, alpha, 0)
    gt_overlay = cv2.addWeighted(image_rgb, 1-alpha, colored_gt, alpha, 0)
    
    # Create difference map (for error analysis)
    diff_mask = (gt_mask != pred_class_mask).astype(np.uint8) * 255
    diff_colored = np.zeros_like(image_rgb)
    diff_colored[:, :, 0] = diff_mask  # Red channel for errors
    diff_overlay = cv2.addWeighted(image_rgb, 0.7, diff_colored, 0.3, 0)
    
    return {
        "original": image_rgb,
        "prediction_panoptic": colored_pred,
        "prediction_class": colored_pred_class,
        "ground_truth": colored_gt,
        "pred_overlay": pred_overlay,
        "gt_overlay": gt_overlay,
        "difference_map": diff_overlay,
        "metrics": eval_results["metrics"],
        "overall_accuracy": eval_results["overall_accuracy"],
        "mean_iou_all": eval_results["mean_iou_all"],
        "mean_iou_key": eval_results["mean_iou_key"]
    }

def plot_comprehensive_evaluation_results(vis_results, eval_results, evaluator, figsize=(24, 18)):
    """
    Create a comprehensive evaluation visualization following evaluate.py format
    
    Args:
        vis_results: Results from visualize_evaluation_results
        eval_results: Results from evaluate_prediction_with_metrics
        evaluator: SegmentationEvaluator instance
        figsize: Figure size tuple
    """
    
    fig, axes = plt.subplots(3, 4, figsize=figsize)
    fig.suptitle('Comprehensive Urban Scene Segmentation Evaluation', fontsize=18, fontweight='bold')
    
    # Row 1: Original, Prediction, Ground Truth, Difference
    axes[0, 0].imshow(vis_results["original"])
    axes[0, 0].set_title('Original Image', fontweight='bold', fontsize=12)
    axes[0, 0].axis('off')
    
    axes[0, 1].imshow(vis_results["prediction_panoptic"])
    axes[0, 1].set_title('Prediction (Panoptic)', fontweight='bold', fontsize=12)
    axes[0, 1].axis('off')
    
    axes[0, 2].imshow(vis_results["ground_truth"])
    axes[0, 2].set_title('Ground Truth', fontweight='bold', fontsize=12)
    axes[0, 2].axis('off')
    
    axes[0, 3].imshow(vis_results["difference_map"])
    axes[0, 3].set_title('Difference Map (Errors in Red)', fontweight='bold', fontsize=12)
    axes[0, 3].axis('off')
    
    # Row 2: Overlays and IoU Chart
    axes[1, 0].imshow(vis_results["pred_overlay"])
    axes[1, 0].set_title('Prediction Overlay', fontweight='bold', fontsize=12)
    axes[1, 0].axis('off')
    
    axes[1, 1].imshow(vis_results["gt_overlay"])
    axes[1, 1].set_title('Ground Truth Overlay', fontweight='bold', fontsize=12)
    axes[1, 1].axis('off')
    
    # IoU bar chart
    metrics = vis_results["metrics"]
    classes_with_data = [cls for cls in evaluator.class_names if cls in metrics and metrics[cls]['support'] > 0]
    ious = [metrics[cls]['iou'] for cls in classes_with_data]
    
    bars = axes[1, 2].bar(range(len(classes_with_data)), ious, alpha=0.7, color='skyblue')
    axes[1, 2].set_title('Per-Class IoU Scores', fontweight='bold', fontsize=12)
    axes[1, 2].set_xticks(range(len(classes_with_data)))
    axes[1, 2].set_xticklabels([cls[:8] for cls in classes_with_data], rotation=45, ha='right', fontsize=10)
    axes[1, 2].set_ylabel('IoU Score', fontsize=10)
    axes[1, 2].set_ylim(0, 1)
    
    # Add value labels on bars
    for bar, iou in zip(bars, ious):
        height = bar.get_height()
        axes[1, 2].text(bar.get_x() + bar.get_width()/2., height + 0.01,
                       f'{iou:.2f}', ha='center', va='bottom', fontsize=8)
    
    # Metrics summary text
    summary_text = f"""EVALUATION METRICS
    
Overall Accuracy: {vis_results['overall_accuracy']:.3f}
Mean IoU (All): {vis_results['mean_iou_all']:.3f}
Mean IoU (Key Urban): {vis_results['mean_iou_key']:.3f}

TOP PERFORMING CLASSES:
"""
    
    # Add top 5 classes by IoU
    sorted_metrics = sorted(
        [(cls, metrics[cls]['iou']) for cls in classes_with_data],
        key=lambda x: x[1], reverse=True
    )
    
    for i, (cls_name, iou_score) in enumerate(sorted_metrics[:5]):
        summary_text += f"\n{i+1}. {cls_name}: {iou_score:.3f}"
    
    axes[1, 3].text(0.05, 0.95, summary_text, transform=axes[1, 3].transAxes,
                   verticalalignment='top', fontfamily='monospace', fontsize=10,
                   bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))
    axes[1, 3].set_title('Metrics Summary', fontweight='bold', fontsize=12)
    axes[1, 3].axis('off')
    
    # Row 3: Precision, Recall, F1 scores, and Support
    precision_scores = [metrics[cls]['precision'] for cls in classes_with_data]
    recall_scores = [metrics[cls]['recall'] for cls in classes_with_data]
    f1_scores = [metrics[cls]['f1'] for cls in classes_with_data]
    supports = [metrics[cls]['support'] for cls in classes_with_data]
    
    # Precision chart
    axes[2, 0].bar(range(len(classes_with_data)), precision_scores, alpha=0.7, color='lightgreen')
    axes[2, 0].set_title('Precision Scores', fontweight='bold', fontsize=12)
    axes[2, 0].set_xticks(range(len(classes_with_data)))
    axes[2, 0].set_xticklabels([cls[:8] for cls in classes_with_data], rotation=45, ha='right', fontsize=10)
    axes[2, 0].set_ylabel('Precision', fontsize=10)
    axes[2, 0].set_ylim(0, 1)
    
    # Recall chart
    axes[2, 1].bar(range(len(classes_with_data)), recall_scores, alpha=0.7, color='lightcoral')
    axes[2, 1].set_title('Recall Scores', fontweight='bold', fontsize=12)
    axes[2, 1].set_xticks(range(len(classes_with_data)))
    axes[2, 1].set_xticklabels([cls[:8] for cls in classes_with_data], rotation=45, ha='right', fontsize=10)
    axes[2, 1].set_ylabel('Recall', fontsize=10)
    axes[2, 1].set_ylim(0, 1)
    
    # F1 chart
    axes[2, 2].bar(range(len(classes_with_data)), f1_scores, alpha=0.7, color='gold')
    axes[2, 2].set_title('F1 Scores', fontweight='bold', fontsize=12)
    axes[2, 2].set_xticks(range(len(classes_with_data)))
    axes[2, 2].set_xticklabels([cls[:8] for cls in classes_with_data], rotation=45, ha='right', fontsize=10)
    axes[2, 2].set_ylabel('F1 Score', fontsize=10)
    axes[2, 2].set_ylim(0, 1)
    
    # Support (log scale for better visualization)
    axes[2, 3].bar(range(len(classes_with_data)), supports, alpha=0.7, color='mediumpurple')
    axes[2, 3].set_title('Support (Pixel Count)', fontweight='bold', fontsize=12)
    axes[2, 3].set_xticks(range(len(classes_with_data)))
    axes[2, 3].set_xticklabels([cls[:8] for cls in classes_with_data], rotation=45, ha='right', fontsize=10)
    axes[2, 3].set_ylabel('Support (log scale)', fontsize=10)
    axes[2, 3].set_yscale('log')
    
    plt.tight_layout()
    plt.show()
    
    # Also plot the confusion matrix separately for better visibility
    plt.figure(figsize=(12, 10))
    evaluator.plot_confusion_matrix(eval_results["confusion_matrix"])

# Example comprehensive evaluation workflow (uncomment to run)
# if 'predictor' in locals() and 'sample_image' in locals():
#     # Run complete evaluation with metrics
#     eval_results = evaluate_prediction_with_metrics(predictor, sample_image, evaluator)
#     
#     if eval_results is not None:
#         # Create comprehensive visualizations
#         vis_results = visualize_evaluation_results(sample_image, eval_results, evaluator)
#         
#         # Plot all results
#         plot_comprehensive_evaluation_results(vis_results, eval_results, evaluator)

# Example visualization workflow (uncomment to run)
# if 'results' in locals() and 'sample_image' in locals():
#     # Create visualizations
#     vis_results = visualize_results(
#         sample_image, results, cityscapes_classes, cityscapes_colors
#     )
#     
#     # Create comprehensive plot
#     plot_results_grid(vis_results, analysis, cityscapes_classes)

## Batch Processing and Dataset Evaluation

This section implements efficient batch processing capabilities for evaluating panoptic segmentation performance across entire datasets. The batch processing system is designed to handle large-scale evaluation while maintaining detailed per-image analysis and aggregate statistics.

**Batch Processing Features:**

**Scalable Processing Pipeline:**
- **Parallel Processing**: Multi-threaded image processing for faster evaluation
- **Memory Management**: Efficient memory usage for processing large image datasets
- **Progress Tracking**: Real-time monitoring of processing status and estimated completion times
- **Error Handling**: Robust handling of corrupted images and processing failures

**Comprehensive Output Generation:**
- **Individual Results**: Detailed per-image segmentation results and metrics
- **Aggregate Statistics**: Dataset-wide performance summaries and distributions
- **Export Formats**: Multiple output formats including JSON, CSV, and visualization files
- **Comparative Analysis**: Statistical comparisons across different image subsets or conditions

**Performance Monitoring:**
- **Processing Speed**: Tracking of inference times and throughput rates
- **Resource Utilization**: Monitoring of GPU/CPU usage and memory consumption
- **Quality Metrics**: Continuous computation of segmentation quality indicators

The batch processing system enables comprehensive evaluation of restoration effects on urban scene understanding across diverse datasets and conditions.

In [None]:
import json
from tqdm import tqdm
import time

def process_batch_with_evaluation(predictor, image_files, output_dir, evaluator,
                                 save_visualizations=True, save_masks=True, save_json=True, save_metrics=True):
    """
    Process multiple images in batch with comprehensive evaluation and metrics
    Following evaluate.py structure with mIoU computation
    
    Args:
        predictor: Detectron2 predictor object
        image_files: List of image file paths
        output_dir: Directory to save results
        evaluator: SegmentationEvaluator instance
        save_visualizations: Whether to save visualization images
        save_masks: Whether to save segmentation masks
        save_json: Whether to save JSON results
        save_metrics: Whether to save detailed metrics
        
    Returns:
        dict: Summary of batch processing results with metrics
    """
    
    # Create output directories
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    
    if save_visualizations:
        (output_dir / "visualizations").mkdir(exist_ok=True)
    if save_masks:
        (output_dir / "masks").mkdir(exist_ok=True)
    if save_json:
        (output_dir / "json").mkdir(exist_ok=True)
    
    batch_results = []
    processing_times = []
    
    print(f"Processing {len(image_files)} images...")
    print("=" * 60)
    
    for i, image_path in enumerate(tqdm(image_files, desc="Processing images")):
        try:
            start_time = time.time()
            
            # Load image
            image = load_image(image_path)
            if image is None:
                continue
            
            # Get base filename
            base_name = Path(image_path).stem
            
            # Run inference
            results = run_panoptic_segmentation(predictor, image)
            if results is None:
                continue
            
            # Analyze results
            analysis = analyze_segments(results["segments_info"], cityscapes_classes)
            
            # Create visualizations
            if save_visualizations or save_masks:
                vis_results = visualize_results(
                    image, results, cityscapes_classes, cityscapes_colors
                )
                
                # Save visualization
                if save_visualizations:
                    vis_path = output_dir / "visualizations" / f"{base_name}_visualization.png"
                    plt.figure(figsize=(15, 5))
                    
                    plt.subplot(1, 3, 1)
                    plt.imshow(vis_results["original"])
                    plt.title("Original")
                    plt.axis('off')
                    
                    plt.subplot(1, 3, 2)
                    plt.imshow(vis_results["segmentation_map"])
                    plt.title("Segmentation")
                    plt.axis('off')
                    
                    plt.subplot(1, 3, 3)
                    plt.imshow(vis_results["overlay"])
                    plt.title("Overlay")
                    plt.axis('off')
                    
                    plt.tight_layout()
                    plt.savefig(vis_path, dpi=150, bbox_inches='tight')
                    plt.close()
                
                # Save mask
                if save_masks:
                    mask_path = output_dir / "masks" / f"{base_name}_mask.png"
                    cv2.imwrite(str(mask_path), cv2.cvtColor(vis_results["segmentation_map"], cv2.COLOR_RGB2BGR))
            
            # Save JSON results
            if save_json:
                json_data = {
                    "image_path": str(image_path),
                    "image_name": base_name,
                    "analysis": analysis,
                    "segments_info": results["segments_info"],
                    "processing_time": time.time() - start_time
                }
                
                json_path = output_dir / "json" / f"{base_name}_results.json"
                with open(json_path, 'w') as f:
                    json.dump(json_data, f, indent=2, default=str)
            
            processing_time = time.time() - start_time
            processing_times.append(processing_time)
            
            batch_results.append({
                "image_path": image_path,
                "base_name": base_name,
                "analysis": analysis,
                "processing_time": processing_time
            })
            
            # Print progress every 10 images
            if (i + 1) % 10 == 0:
                avg_time = np.mean(processing_times[-10:])
                print(f"✅ Processed {i+1}/{len(image_files)} images (avg: {avg_time:.2f}s/image)")
        
        except Exception as e:
            print(f"❌ Error processing {image_path}: {e}")
            continue
    
    # Create summary
    summary = create_batch_summary(batch_results, output_dir)
    
    return summary

def create_batch_summary(batch_results, output_dir):
    """Create and save a summary of batch processing results"""
    
    if not batch_results:
        return {"error": "No images were successfully processed"}
    
    # Calculate statistics
    total_images = len(batch_results)
    total_time = sum(r["processing_time"] for r in batch_results)
    avg_time = total_time / total_images
    
    # Aggregate category statistics
    all_categories = {}
    total_segments = 0
    
    for result in batch_results:
        analysis = result["analysis"]
        total_segments += analysis["total_segments"]
        
        for category, info in analysis["category_counts"].items():
            if category not in all_categories:
                all_categories[category] = {"total_count": 0, "total_area": 0, "image_count": 0}
            
            all_categories[category]["total_count"] += info["count"]
            all_categories[category]["total_area"] += info["total_area"]
            all_categories[category]["image_count"] += 1
    
    summary = {
        "batch_info": {
            "total_images_processed": total_images,
            "total_processing_time": total_time,
            "average_time_per_image": avg_time,
            "total_segments_found": total_segments
        },
        "category_statistics": all_categories,
        "per_image_results": batch_results
    }
    
    # Save summary
    summary_path = output_dir / "batch_summary.json"
    with open(summary_path, 'w') as f:
        json.dump(summary, f, indent=2, default=str)
    
    print(f"\nBatch processing completed!")
    print(f"Results saved to: {output_dir}")
    print(f"Summary saved to: {summary_path}")
    print(f"Total time: {total_time:.2f}s ({avg_time:.2f}s per image)")
    print(f"Total segments found: {total_segments}")
    
    return summary

# Example batch processing (uncomment to run)
# if 'predictor' in locals():
#     # Define input and output paths
#     input_directory = "path/to/your/input/images"
#     output_directory = "path/to/your/output"
#     
#     # Get all image files
#     image_files = get_image_files(input_directory)[:5]  # Process first 5 images as example
#     
#     # Process batch
#     if image_files:
#         summary = process_batch(
#             predictor, image_files, output_directory,
#             cityscapes_classes, cityscapes_colors,
#             save_visualizations=True,
#             save_masks=True,
#             save_json=True
#         )

## Complete Evaluation Pipeline Demonstration

This section provides a complete end-to-end demonstration of the urban scene panoptic segmentation evaluation pipeline. The example showcases the full workflow from model setup through comprehensive evaluation and visualization, following best practices for systematic evaluation.

**Complete Pipeline Components:**

**Model Configuration and Setup:**
- **Architecture Selection**: Choice between different PanopticFCN configurations based on accuracy vs speed requirements
- **Confidence Thresholding**: Optimization of detection confidence thresholds for deployment scenarios
- **Hardware Optimization**: GPU/CPU configuration for optimal performance

**Comprehensive Evaluation Workflow:**
1. **Image Preprocessing**: Standard preprocessing pipeline ensuring consistent input formatting
2. **Inference Execution**: Panoptic segmentation inference with timing and resource monitoring
3. **Results Processing**: Extraction and formatting of segmentation results for analysis
4. **Metrics Computation**: Calculation of all evaluation metrics including PQ, SQ, RQ, mIoU, and class-specific scores
5. **Visualization Generation**: Creation of comprehensive visual analysis including overlays, charts, and comparison plots
6. **Report Generation**: Automated generation of detailed evaluation reports and summaries

**Key Evaluation Insights:**
The complete pipeline provides actionable insights for assessing restoration quality impact on urban scene understanding, including identification of performance bottlenecks, challenging scene categories, and recommendations for deployment optimization.

This comprehensive evaluation serves as a benchmark for determining whether restored urban images maintain sufficient visual fidelity for accurate autonomous driving, urban planning, and smart city applications.

In [None]:
# Complete Urban Scene Panoptic Segmentation Evaluation Pipeline

print("="*80)
print("URBAN SCENE PANOPTIC SEGMENTATION EVALUATION")
print("="*80)

# 1. Set up the model and evaluator
print("1. Setting up PanopticFCN model and evaluation framework...")
try:
    predictor, cfg = setup_panoptic_model("PanopticFCN-R50", confidence_threshold=0.5)
    print("   Model setup completed successfully")
except Exception as e:
    print(f"   Model setup failed: {e}")
    print("   Please ensure Detectron2 is properly installed")

# 2. Load and process an image
print("\n2. Loading and preprocessing urban scene image...")
image_path = "path/to/your/urban/image.jpg"  # Update with your image path

if os.path.exists(image_path):
    image = load_image(image_path)
    if image is not None:
        print(f"   Image loaded successfully: {image.shape}")
    else:
        print("   Failed to load image")
else:
    print("   Image path not found - update the path above")
    print("   Creating sample evaluation workflow for demonstration...")

# 3. Run complete evaluation with metrics
print("\n3. Running panoptic segmentation evaluation...")
if 'predictor' in locals() and 'image' in locals():
    try:
        eval_results = evaluate_prediction_with_metrics(predictor, image, evaluator)
        
        if eval_results is not None:
            print("   Evaluation completed successfully")
            
            # 4. Create comprehensive visualizations
            print("\n4. Generating comprehensive visualizations...")
            vis_results = visualize_evaluation_results(image, eval_results, evaluator)
            plot_comprehensive_evaluation_results(vis_results, eval_results, evaluator)
            
            # Print key metrics
            print(f"\nKEY EVALUATION RESULTS:")
            print(f"   Overall Accuracy: {eval_results['overall_accuracy']:.4f}")
            print(f"   Mean IoU (All Classes): {eval_results['mean_iou_all']:.4f}")
            print(f"   Mean IoU (Key Urban Classes): {eval_results['mean_iou_key']:.4f}")
            print(f"   Total Segments Detected: {len(eval_results['segments_info'])}")
            
            # Summary of class performance
            metrics = eval_results['metrics']
            best_classes = sorted(metrics.items(), key=lambda x: x[1]['iou'], reverse=True)[:5]
            worst_classes = sorted(metrics.items(), key=lambda x: x[1]['iou'])[:5]
            
            print(f"\nTOP 5 PERFORMING CLASSES:")
            for i, (class_name, class_metrics) in enumerate(best_classes, 1):
                if class_metrics['support'] > 0:
                    print(f"   {i}. {class_name}: IoU = {class_metrics['iou']:.4f}")
            
            print(f"\nCHALLENGING CLASSES (Lowest IoU):")
            for i, (class_name, class_metrics) in enumerate(worst_classes, 1):
                if class_metrics['support'] > 0:
                    print(f"   {i}. {class_name}: IoU = {class_metrics['iou']:.4f}")
                    
        else:
            print("   Evaluation failed - check input image and model setup")
            
    except Exception as e:
        print(f"   Evaluation error: {e}")
else:
    print("   Skipping evaluation - model or image not available")

print("\n" + "="*80)
print("EVALUATION PIPELINE COMPLETE")
print("="*80)
print("\nThis completes the downstream evaluation for urban scene panoptic segmentation.")
print("Use the generated metrics and visualizations to assess restoration model quality")
print("and determine impact on autonomous driving and urban planning applications.")

print("\nNext steps for analysis:")
print("1. Compare mIoU scores across different restoration methods")
print("2. Analyze per-class performance to identify restoration-sensitive categories")
print("3. Examine boundary quality for safety-critical applications")
print("4. Use batch processing for comprehensive dataset evaluation")
print("\nFor deployment considerations:")
print("- Overall Accuracy > 90% recommended for autonomous driving applications")
print("- Mean IoU > 70% suggested for urban planning and analysis")
print("- Pay special attention to 'car', 'person', and 'road' class performance")