<a href="https://colab.research.google.com/github/FarrelAD/Hology-8-2025-Data-Mining-PRIVATE/blob/dev%2Ffarrel/notebooks/farrel/nb_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Regression-Based Crowd Counting Model

This notebook implements a **direct regression approach** for crowd counting, which is fundamentally different from the density map approach. Instead of generating pixel-wise density maps, we'll:

1. **Extract global features** from images using pretrained CNN backbones
2. **Directly predict crowd counts** using regression layers
3. **Use simpler loss functions** (MSE/MAE) on scalar count values
4. **Leverage transfer learning** from ImageNet pretrained models

## Key Advantages of the Regression Approach
- **Faster training** - no need to generate density maps
- **Lower memory usage** - direct scalar prediction
- **Transfer learning** - can use powerful pretrained backbones
- **Interpretable outputs** - direct count prediction
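
The four steps listed above can be sketched in a few lines. This is a minimal illustration, not the model trained later in this notebook: `toy_backbone` and `count_head` are hypothetical stand-ins for a pretrained backbone and its regression layers, and the images and counts are made up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained CNN backbone (illustration only).
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((1, 1)),  # global pooling: one feature vector per image
    nn.Flatten(),
)
count_head = nn.Linear(8, 1)  # regression layer: features -> scalar count

images = torch.randn(4, 3, 64, 64)                  # dummy batch of 4 RGB images
true_counts = torch.tensor([3.0, 12.0, 0.0, 47.0])  # made-up labels

pred_counts = count_head(toy_backbone(images)).squeeze(-1)  # shape: (4,)
loss = nn.functional.mse_loss(pred_counts, true_counts)     # loss on scalar counts

print(pred_counts.shape, loss.item())
```

No density map appears anywhere in this sketch: the only supervision signal is one number per image.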

# 1. Project Setup and Data Loading

## Import Libraries

In [None]:
import numpy as np
import cv2
import os
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
import torchvision.models as models
from PIL import Image
from typing import Optional, Tuple, Dict, Any, List
import warnings
warnings.filterwarnings('ignore')

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

## Dataset Download and Setup
The dataset download code is kept the same as in the reference notebook.

In [None]:
# @title Setup Kaggle secret key
!pip install -q kaggle

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

In [None]:
# @title Setup dataset in Colab
import zipfile
import os
from google.colab import drive

drive.mount('/content/drive')

# Paths
zip_path = "/content/penyisihan-hology-8-0-2025-data-mining.zip"
drive_extract_path = "/content/drive/MyDrive/PROJECTS/Cognivio/Percobaan Hology 8 2025/dataset"
local_dataset_path = "/content/dataset"  # for current session

# ---------------------------
# Step 1: Download zip (if not exists in /content)
# ---------------------------
if not os.path.exists(zip_path):
    print("Dataset not found locally, downloading...")
    !kaggle competitions download -c penyisihan-hology-8-0-2025-data-mining -p /content
else:
    print("Dataset already exists, skipping download.")

# ---------------------------
# Step 2: Extract to Google Drive (for backup)
# ---------------------------
os.makedirs(drive_extract_path, exist_ok=True)

if not os.listdir(drive_extract_path):  # Check if folder is empty
    print("Extracting dataset to Google Drive...")
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(drive_extract_path)
    print("Dataset extracted to:", drive_extract_path)
else:
    print("Dataset already extracted at:", drive_extract_path)

# ---------------------------
# Step 3: Copy dataset to local /content (faster training)
# ---------------------------
if not os.path.exists(local_dataset_path):
    print("Copying dataset to Colab local storage (/content)...")
    !cp -r "$drive_extract_path" "$local_dataset_path"
else:
    print("Dataset already available in Colab local storage.")

# ---------------------------
# Step 4: Define dataset paths for training
# ---------------------------
TRAIN_IMG_DIR = os.path.join(local_dataset_path, "train", "images")
TRAIN_LBL_DIR = os.path.join(local_dataset_path, "train", "labels")
TEST_IMG_DIR  = os.path.join(local_dataset_path, "test", "images")

print("Train images:", TRAIN_IMG_DIR)
print("Train labels:", TRAIN_LBL_DIR)
print("Test images:", TEST_IMG_DIR)

# 2. Data Preprocessing and Augmentation

For regression-based crowd counting, we focus on global image features rather than spatial density information.
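
The label loader in this section reads one JSON file per image and keeps only its `human_num` field. The sketch below shows that pairing on a temporary file. The file name `1.json` and the count `23` are made up for illustration, and any other keys the real label files may contain are not shown.

```python
import json
import os
import tempfile

# Hypothetical label content; only the 'human_num' key is taken from this notebook.
example_label = {"human_num": 23}

with tempfile.TemporaryDirectory() as tmp_dir:
    label_path = os.path.join(tmp_dir, "1.json")  # made-up file name
    with open(label_path, "w") as f:
        json.dump(example_label, f)

    with open(label_path, "r") as f:
        data = json.load(f)

    stem = os.path.splitext(os.path.basename(label_path))[0]
    image_name = stem + ".jpg"        # label '1.json' pairs with image '1.jpg'
    count = data.get("human_num", 0)  # a missing key falls back to 0

print(image_name, count)
```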

In [None]:
def load_crowd_counts(label_dir: str) -> Dict[str, int]:
    """
    Load crowd counts from JSON label files.
    Returns a dictionary mapping image names to their crowd counts.
    """
    crowd_counts = {}
    label_files = [f for f in os.listdir(label_dir) if f.endswith('.json')]
    
    for label_file in label_files:
        img_name = label_file.replace('.json', '.jpg')
        label_path = os.path.join(label_dir, label_file)
        
        with open(label_path, 'r') as f:
            data = json.load(f)
            crowd_counts[img_name] = data.get('human_num', 0)
    
    return crowd_counts

def analyze_dataset_statistics(crowd_counts: Dict[str, int]) -> None:
    """Analyze and visualize dataset statistics."""
    counts = list(crowd_counts.values())
    
    print("Dataset Statistics:")
    print(f"Total images: {len(counts)}")
    print(f"Min count: {min(counts)}")
    print(f"Max count: {max(counts)}")
    print(f"Mean count: {np.mean(counts):.2f}")
    print(f"Median count: {np.median(counts):.2f}")
    print(f"Std deviation: {np.std(counts):.2f}")
    
    # Visualize distribution
    plt.figure(figsize=(12, 4))
    
    plt.subplot(1, 2, 1)
    plt.hist(counts, bins=50, alpha=0.7, edgecolor='black')
    plt.xlabel('Crowd Count')
    plt.ylabel('Number of Images')
    plt.title('Distribution of Crowd Counts')
    plt.grid(True, alpha=0.3)
    
    plt.subplot(1, 2, 2)
    plt.boxplot(counts)
    plt.ylabel('Crowd Count')
    plt.title('Crowd Count Box Plot')
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Load and analyze crowd counts
crowd_counts = load_crowd_counts(TRAIN_LBL_DIR)
analyze_dataset_statistics(crowd_counts)

In [None]:
# Define data augmentation strategies
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    transforms.RandomAdjustSharpness(sharpness_factor=2, p=0.3),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet normalization
])

val_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

test_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

print("Data preprocessing and augmentation transforms defined.")
print("Using ImageNet normalization for transfer learning compatibility.")

# 3. Feature-based Regression Model Architecture

We'll implement multiple architecture options using different pretrained backbones.
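
When several backbones are swapped in and out, the size of the feature vector that reaches the regression head is easy to get wrong. One way to check it, sketched here under the assumption that a dummy forward pass is cheap enough, is to measure the output size directly. `infer_feature_dim` and `toy_backbone` are hypothetical names introduced for this example and are not part of torchvision.

```python
import torch
import torch.nn as nn

def infer_feature_dim(backbone: nn.Module, input_size: int = 224) -> int:
    """Run one dummy image through the backbone and return the flattened feature size."""
    was_training = backbone.training
    backbone.eval()  # avoid updating BatchNorm statistics with dummy data
    with torch.no_grad():
        features = backbone(torch.zeros(1, 3, input_size, input_size))
    backbone.train(was_training)  # restore the original mode
    return features.flatten(start_dim=1).shape[1]

# Hypothetical toy backbone standing in for a torchvision model.
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((2, 2)),
    nn.Flatten(),
)

feature_dim = infer_feature_dim(toy_backbone, input_size=64)
print(feature_dim)  # 16 channels * 2 * 2 spatial positions
```

The measured value can then be compared with the `in_features` of the first linear layer in the head before any training starts.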

In [None]:
class CrowdCountingRegressor(nn.Module):
    """
    Regression-based crowd counting model using pretrained CNN backbones.
    """
    def __init__(
        self, 
        backbone_name: str = 'resnet50', 
        pretrained: bool = True, 
        dropout_rate: float = 0.5
    ):
        super(CrowdCountingRegressor, self).__init__()
        
        self.backbone_name = backbone_name
        
        # Load pretrained backbone
        if backbone_name == 'resnet50':
            self.backbone = models.resnet50(pretrained=pretrained)
            feature_dim = self.backbone.fc.in_features
            self.backbone.fc = nn.Identity()  # Remove final classification layer
            
        elif backbone_name == 'efficientnet_b0':
            self.backbone = models.efficientnet_b0(pretrained=pretrained)
            feature_dim = self.backbone.classifier[1].in_features
            self.backbone.classifier = nn.Identity()
            
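        elif backbone_name == 'vgg16':
            self.backbone = models.vgg16(pretrained=pretrained)
            feature_dim = self.backbone.classifier[6].in_features
            # Replace only the last layer so the output stays 4096-dimensional;
            # dropping the whole classifier would output 25088 features instead.
            self.backbone.classifier[6] = nn.Identity()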
            
        elif backbone_name == 'mobilenet_v2':
            self.backbone = models.mobilenet_v2(pretrained=pretrained)
            feature_dim = self.backbone.classifier[1].in_features
            self.backbone.classifier = nn.Identity()
            
        else:
            raise ValueError(f"Unsupported backbone: {backbone_name}")
        
        # Regression head
        self.regression_head = nn.Sequential(
            nn.AdaptiveAvgPool2d((1, 1)),  # Global Average Pooling
            nn.Flatten(),
            nn.Dropout(dropout_rate),
            nn.Linear(feature_dim, 512),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(512, 128),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(128, 1),  # Single output for count
            nn.ReLU()  # Ensure non-negative predictions
        )
        
        # Initialize regression head weights
        self._initialize_regression_head()
    
    def _initialize_regression_head(self):
        """Initialize regression head weights."""
        for m in self.regression_head.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.constant_(m.bias, 0)
    
    def forward(self, x):
        # Extract features using backbone
        features = self.backbone(x)
        
        # Handle different backbone output formats
        if len(features.shape) == 4:  # For backbones that output 4D tensors
            features = self.regression_head(features)
        else:  # For backbones that already apply global pooling
            features = features.view(features.size(0), -1)
            features = self.regression_head[2:](features)  # Skip pooling and flatten
            
        return features.squeeze(-1)  # (B, 1) -> (B,); keeps the batch dim when B == 1

def create_model(
    backbone_name: str = 'resnet50', 
    pretrained: bool = True
) -> CrowdCountingRegressor:
    """Factory function to create crowd counting models."""
    model = CrowdCountingRegressor(backbone_name=backbone_name, pretrained=pretrained)
    return model

# Test different model architectures
available_backbones = ['resnet50', 'efficientnet_b0', 'vgg16', 'mobilenet_v2']

for backbone in available_backbones:
    try:
        model = create_model(backbone_name=backbone)
        total_params = sum(p.numel() for p in model.parameters())
        trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
        
        print(f"{backbone}:")
        print(f"  Total parameters: {total_params:,}")
        print(f"  Trainable parameters: {trainable_params:,}")
        print()
    except Exception as e:
        print(f"Error with {backbone}: {e}")

# 4. Dataset Class for Regression Approach

Custom PyTorch Dataset class optimized for regression-based crowd counting.
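
The contract that matters for training is the shape of what the loader yields: images as a `(B, 3, H, W)` tensor and counts as a `(B,)` float tensor. The sketch below checks this with a hypothetical in-memory dataset (`ToyCountDataset`, random images, made-up counts) so it runs without any files on disk.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyCountDataset(Dataset):
    """Hypothetical in-memory dataset that mimics the (image, count) contract."""
    def __init__(self, num_items: int = 5):
        self.images = torch.rand(num_items, 3, 32, 32)          # random stand-in images
        self.counts = [float(i * 3) for i in range(num_items)]  # made-up counts

    def __len__(self) -> int:
        return len(self.counts)

    def __getitem__(self, idx: int):
        # The count is a 0-dim float32 tensor, so a batch collates to shape (B,)
        return self.images[idx], torch.tensor(self.counts[idx], dtype=torch.float32)

loader = DataLoader(ToyCountDataset(), batch_size=4, shuffle=False)
batch_shapes = [(imgs.shape, cnts.shape) for imgs, cnts in loader]
print(batch_shapes)
```

With five items and a batch size of four, the last batch holds a single sample, so its targets have shape `(1,)`. Model outputs need to keep the batch dimension for the loss shapes to line up in that case.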

In [None]:
class CrowdCountingRegressionDataset(Dataset):
    """
    PyTorch Dataset for regression-based crowd counting.
    Returns (image, count) pairs for direct count prediction.
    """
    def __init__(
        self,
        image_dir: str,
        crowd_counts: Dict[str, int],
        image_files: List[str],
        transform: Optional[transforms.Compose] = None,
        return_meta: bool = False
    ):
        self.image_dir = image_dir
        self.crowd_counts = crowd_counts
        self.image_files = image_files
        self.transform = transform
        self.return_meta = return_meta
        
        # Filter files that have corresponding labels
        self.valid_files = [f for f in image_files if f in crowd_counts]
        
        print(f"Dataset created with {len(self.valid_files)} images")
        
    def __len__(self) -> int:
        return len(self.valid_files)
    
    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]:
        img_name = self.valid_files[idx]
        img_path = os.path.join(self.image_dir, img_name)
        
        # Load image
        image = Image.open(img_path).convert('RGB')
        
        # Apply transforms
        if self.transform:
            image = self.transform(image)
        
        # Get crowd count
        count = float(self.crowd_counts[img_name])
        count_tensor = torch.tensor(count, dtype=torch.float32)
        
        if self.return_meta:
            return {
                'image': image,
                'count': count_tensor,
                'image_name': img_name
            }
        
        return image, count_tensor

def create_train_val_split(
    image_dir: str,
    crowd_counts: Dict[str, int],
    val_size: float = 0.2,
    random_state: int = 42
) -> Tuple[List[str], List[str]]:
    """
    Create stratified train-validation split based on crowd count ranges.
    """
    image_files = list(crowd_counts.keys())
    counts = [crowd_counts[f] for f in image_files]
    
    # Create stratification bins based on count ranges
    bins = [0, 5, 15, 30, 50, 100, float('inf')]
    strata = np.digitize(counts, bins)
    
    train_files, val_files = train_test_split(
        image_files,
        test_size=val_size,
        random_state=random_state,
        stratify=strata
    )
    
    return train_files, val_files

# Create train-validation split
train_files, val_files = create_train_val_split(TRAIN_IMG_DIR, crowd_counts, val_size=0.2)

print(f"Training set: {len(train_files)} images")
print(f"Validation set: {len(val_files)} images")

# Analyze split distribution
train_counts = [crowd_counts[f] for f in train_files]
val_counts = [crowd_counts[f] for f in val_files]

print(f"\nTraining set statistics:")
print(f"  Mean: {np.mean(train_counts):.2f}, Std: {np.std(train_counts):.2f}")
print(f"Validation set statistics:")
print(f"  Mean: {np.mean(val_counts):.2f}, Std: {np.std(val_counts):.2f}")
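
To make the stratification in `create_train_val_split` concrete, the sketch below shows which stratum `np.digitize` assigns to a few counts, using the same bin edges. The example counts are made up.

```python
import numpy as np

bins = [0, 5, 15, 30, 50, 100, float('inf')]  # same edges as create_train_val_split
example_counts = [0, 3, 5, 14, 15, 29, 30, 75, 100, 250]  # made-up counts

# digitize returns i such that bins[i-1] <= count < bins[i]
strata = np.digitize(example_counts, bins)
for c, s in zip(example_counts, strata):
    print(f"count {c:3d} -> stratum {s}")

# Number of examples per stratum
stratum_sizes = {int(s): int(n) for s, n in zip(*np.unique(strata, return_counts=True))}
print(stratum_sizes)
```

Each bin is closed on the left, so a count of 5 falls in the second stratum rather than the first. Checking the stratum sizes is worthwhile because scikit-learn's stratified split raises an error when a stratum contains only one image.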

In [None]:
# Create datasets
train_dataset = CrowdCountingRegressionDataset(
    image_dir=TRAIN_IMG_DIR,
    crowd_counts=crowd_counts,
    image_files=train_files,
    transform=train_transforms
)

val_dataset = CrowdCountingRegressionDataset(
    image_dir=TRAIN_IMG_DIR,
    crowd_counts=crowd_counts,
    image_files=val_files,
    transform=val_transforms
)

# Create data loaders
BATCH_SIZE = 16
NUM_WORKERS = 4

train_loader = DataLoader(
    train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=NUM_WORKERS,
    pin_memory=True
)

val_loader = DataLoader(
    val_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=NUM_WORKERS,
    pin_memory=True
)

print(f"Data loaders created:")
print(f"  Training batches: {len(train_loader)}")
print(f"  Validation batches: {len(val_loader)}")

# Visualize sample batch
def visualize_sample_batch(data_loader, dataset_name="Dataset"):
    """Visualize a sample batch from the data loader."""
    images, counts = next(iter(data_loader))
    
    fig, axes = plt.subplots(2, 4, figsize=(16, 8))
    axes = axes.flatten()
    
    for i in range(min(8, len(images))):
        img = images[i]
        # Denormalize image for visualization
        img = img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
        img = torch.clamp(img, 0, 1)
        img = img.permute(1, 2, 0).numpy()
        
        axes[i].imshow(img)
        axes[i].set_title(f'Count: {counts[i].item():.0f}')
        axes[i].axis('off')
    
    plt.suptitle(f'Sample Batch from {dataset_name}')
    plt.tight_layout()
    plt.show()

visualize_sample_batch(train_loader, "Training Set")

# 5. Model Training with Feature Extraction

This section implements the training pipeline with validation monitoring, learning rate scheduling, and early stopping.
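
The learning rate schedule used in this section is `ReduceLROnPlateau` with `factor=0.5` and `patience=5`. The sketch below feeds it a made-up validation loss that never improves, to show when the learning rate is cut. The dummy parameter exists only so that an optimizer can be built; no training takes place.

```python
import torch
import torch.nn as nn
import torch.optim as optim

param = nn.Parameter(torch.zeros(1))  # dummy parameter; no real training
optimizer = optim.Adam([param], lr=0.001)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=5
)

fake_val_losses = [1.0] * 10  # made-up losses that never improve
lrs = []
for val_loss in fake_val_losses:
    scheduler.step(val_loss)
    lrs.append(optimizer.param_groups[0]['lr'])

print(lrs)
```

The scheduler tolerates `patience` epochs without improvement and halves the learning rate on the next one. Early stopping with `patience=10` therefore leaves room for at least one such reduction before training is aborted.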

In [None]:
class EarlyStopping:
    """Early stopping to prevent overfitting."""
    def __init__(self, patience=7, min_delta=0, restore_best_weights=True):
        self.patience = patience
        self.min_delta = min_delta
        self.restore_best_weights = restore_best_weights
        self.best_loss = None
        self.counter = 0
        self.best_weights = None
        
    def __call__(self, val_loss, model):
        if self.best_loss is None:
            self.best_loss = val_loss
            self.save_checkpoint(model)
        elif val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.counter = 0
            self.save_checkpoint(model)
        else:
            self.counter += 1
            
        if self.counter >= self.patience:
            if self.restore_best_weights:
                model.load_state_dict(self.best_weights)
            return True
        return False
    
    def save_checkpoint(self, model):
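        # Clone every tensor: dict.copy() is shallow and would track the live weights
        self.best_weights = {k: v.detach().clone() for k, v in model.state_dict().items()}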

def train_model(
    model: nn.Module,
    train_loader: DataLoader,
    val_loader: DataLoader,
    num_epochs: int = 50,
    learning_rate: float = 0.001,
    patience: int = 10,
    loss_function: str = 'mse'
) -> Dict[str, List[float]]:
    """
    Train the crowd counting regression model.
    """
    # Loss function selection
    if loss_function == 'mse':
        criterion = nn.MSELoss()
    elif loss_function == 'mae':
        criterion = nn.L1Loss()
    elif loss_function == 'huber':
        criterion = nn.SmoothL1Loss()
    else:
        raise ValueError(f"Unsupported loss function: {loss_function}")
    
    # Optimizer and scheduler
    optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=1e-4)
    # `verbose` is deprecated in recent PyTorch releases; the LR is logged per epoch below
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)
    
    # Early stopping
    early_stopping = EarlyStopping(patience=patience, min_delta=0.01)
    
    # Training history
    history = {
        'train_loss': [],
        'val_loss': [],
        'train_mae': [],
        'val_mae': [],
        'learning_rate': []
    }
    
    model.to(device)
    
    print(f"Starting training with {loss_function.upper()} loss...")
    print(f"Model: {model.backbone_name}")
    print(f"Device: {device}")
    print("-" * 60)
    
    for epoch in range(num_epochs):
        # Training phase
        model.train()
        train_loss = 0.0
        train_mae = 0.0
        
        for batch_idx, (images, targets) in enumerate(train_loader):
            images, targets = images.to(device), targets.to(device)
            
            optimizer.zero_grad()
            outputs = model(images)
            
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()
            
            train_loss += loss.item()
            train_mae += nn.L1Loss()(outputs, targets).item()
        
        # Validation phase
        model.eval()
        val_loss = 0.0
        val_mae = 0.0
        
        with torch.no_grad():
            for images, targets in val_loader:
                images, targets = images.to(device), targets.to(device)
                outputs = model(images)
                
                loss = criterion(outputs, targets)
                val_loss += loss.item()
                val_mae += nn.L1Loss()(outputs, targets).item()
        
        # Calculate average losses
        train_loss /= len(train_loader)
        val_loss /= len(val_loader)
        train_mae /= len(train_loader)
        val_mae /= len(val_loader)
        
        # Update learning rate
        scheduler.step(val_loss)
        current_lr = optimizer.param_groups[0]['lr']
        
        # Store history
        history['train_loss'].append(train_loss)
        history['val_loss'].append(val_loss)
        history['train_mae'].append(train_mae)
        history['val_mae'].append(val_mae)
        history['learning_rate'].append(current_lr)
        
        # Print progress
        if (epoch + 1) % 5 == 0 or epoch == 0:
            print(f'Epoch {epoch+1:3d}/{num_epochs} | '
                  f'Train Loss: {train_loss:.4f} | Val Loss: {val_loss:.4f} | '
                  f'Train MAE: {train_mae:.4f} | Val MAE: {val_mae:.4f} | '
                  f'LR: {current_lr:.6f}')
        
        # Early stopping check
        if early_stopping(val_loss, model):
            print(f'Early stopping triggered at epoch {epoch+1}')
            break
    
    print("\nTraining completed!")
    return history

# Initialize and train model
model = create_model(backbone_name='resnet50', pretrained=True)

# Train the model
training_history = train_model(
    model=model,
    train_loader=train_loader,
    val_loader=val_loader,
    num_epochs=50,
    learning_rate=0.001,
    patience=10,
    loss_function='mse'
)

# 6. Evaluation and Performance Metrics

Comprehensive evaluation using multiple regression metrics and statistical analysis.
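
As a small worked example of the main metrics, the sketch below computes MAE, RMSE, and MAPE by hand with NumPy. The four targets and predictions are made up so that the arithmetic is easy to follow.

```python
import numpy as np

targets = np.array([10.0, 20.0, 30.0, 40.0])      # made-up true counts
predictions = np.array([12.0, 18.0, 33.0, 40.0])  # made-up predictions

errors = predictions - targets
mae = np.mean(np.abs(errors))                   # mean absolute error
rmse = np.sqrt(np.mean(errors ** 2))            # root mean squared error
mape = np.mean(np.abs(errors) / targets) * 100  # valid here: no zero targets

print(f"MAE={mae:.2f}, RMSE={rmse:.2f}, MAPE={mape:.1f}%")
```

The absolute errors are 2, 2, 3, and 0, so MAE is 1.75. RMSE is the square root of 4.25 and weights the largest error more heavily. MAPE divides by the true count, which is why images with a count of zero need separate handling.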

In [None]:
def evaluate_model(
    model: nn.Module,
    data_loader: DataLoader,
    dataset_name: str = "Dataset"
) -> Tuple[Dict[str, float], np.ndarray, np.ndarray]:
    """
    Comprehensive model evaluation with multiple metrics.
    """
    model.eval()
    all_predictions = []
    all_targets = []
    
    with torch.no_grad():
        for images, targets in data_loader:
            images, targets = images.to(device), targets.to(device)
            outputs = model(images)
            
            all_predictions.extend(outputs.cpu().numpy())
            all_targets.extend(targets.cpu().numpy())
    
    predictions = np.array(all_predictions)
    targets = np.array(all_targets)
    
    # Calculate metrics
    mae = mean_absolute_error(targets, predictions)
    mse = mean_squared_error(targets, predictions)
    rmse = np.sqrt(mse)
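    # MAPE is undefined for zero targets, so compute it over non-zero targets only
    nonzero = targets != 0
    mape = (
        np.mean(np.abs((targets[nonzero] - predictions[nonzero]) / targets[nonzero])) * 100
        if nonzero.any() else float('nan')
    )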
    r2 = r2_score(targets, predictions)
    
    # Additional metrics
    correlation = np.corrcoef(targets, predictions)[0, 1]
    max_error = np.max(np.abs(targets - predictions))
    
    metrics = {
        'MAE': mae,
        'MSE': mse,
        'RMSE': rmse,
        'MAPE': mape,
        'R²': r2,
        'Correlation': correlation,
        'Max Error': max_error,
        'Mean Target': np.mean(targets),
        'Mean Prediction': np.mean(predictions)
    }
    
    print(f"\n{dataset_name} Evaluation Results:")
    print("-" * 40)
    for metric, value in metrics.items():
        print(f"{metric:15s}: {value:8.4f}")
    
    return metrics, predictions, targets

# Evaluate on training and validation sets
train_metrics, train_preds, train_targets = evaluate_model(model, train_loader, "Training Set")
val_metrics, val_preds, val_targets = evaluate_model(model, val_loader, "Validation Set")

In [None]:
def plot_training_history(history: Dict[str, List[float]]) -> None:
    """Plot training history curves."""
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Loss curves
    axes[0, 0].plot(history['train_loss'], label='Training Loss', color='blue')
    axes[0, 0].plot(history['val_loss'], label='Validation Loss', color='red')
    axes[0, 0].set_title('Training and Validation Loss')
    axes[0, 0].set_xlabel('Epoch')
    axes[0, 0].set_ylabel('Loss')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # MAE curves
    axes[0, 1].plot(history['train_mae'], label='Training MAE', color='blue')
    axes[0, 1].plot(history['val_mae'], label='Validation MAE', color='red')
    axes[0, 1].set_title('Training and Validation MAE')
    axes[0, 1].set_xlabel('Epoch')
    axes[0, 1].set_ylabel('Mean Absolute Error')
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    # Learning rate
    axes[1, 0].plot(history['learning_rate'], color='green')
    axes[1, 0].set_title('Learning Rate Schedule')
    axes[1, 0].set_xlabel('Epoch')
    axes[1, 0].set_ylabel('Learning Rate')
    axes[1, 0].set_yscale('log')
    axes[1, 0].grid(True, alpha=0.3)
    
    # Loss difference
    loss_diff = np.array(history['val_loss']) - np.array(history['train_loss'])
    axes[1, 1].plot(loss_diff, color='purple')
    axes[1, 1].set_title('Validation - Training Loss')
    axes[1, 1].set_xlabel('Epoch')
    axes[1, 1].set_ylabel('Loss Difference')
    axes[1, 1].grid(True, alpha=0.3)
    axes[1, 1].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    
    plt.tight_layout()
    plt.show()

plot_training_history(training_history)

# 7. Visualization of Predictions

Create comprehensive visualizations to analyze model performance and prediction quality.
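
One of the plots in this section is a cumulative error distribution. The quantity behind it can be read as "what share of images is predicted within a given error", sketched below on made-up absolute errors. `share_within` is a hypothetical helper introduced for this example.

```python
import numpy as np

abs_errors = np.array([0.5, 1.0, 2.0, 4.0, 9.0])  # made-up absolute errors

def share_within(abs_errors: np.ndarray, threshold: float) -> float:
    """Percentage of images whose absolute error is at most `threshold`."""
    return float(np.mean(abs_errors <= threshold) * 100)

for threshold in (1, 2, 5):
    print(f"error <= {threshold}: {share_within(abs_errors, threshold):.0f}% of images")
```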

In [None]:
def create_prediction_visualizations(
    predictions: np.ndarray,
    targets: np.ndarray,
    dataset_name: str = "Dataset"
) -> None:
    """
    Create comprehensive prediction visualization plots.
    """
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    # 1. Scatter plot: Predictions vs Targets
    axes[0, 0].scatter(targets, predictions, alpha=0.6, s=30)
    axes[0, 0].plot([targets.min(), targets.max()], [targets.min(), targets.max()], 'r--', lw=2)
    axes[0, 0].set_xlabel('True Count')
    axes[0, 0].set_ylabel('Predicted Count')
    axes[0, 0].set_title(f'{dataset_name}: Predictions vs True Values')
    axes[0, 0].grid(True, alpha=0.3)
    
    # Add correlation coefficient
    correlation = np.corrcoef(targets, predictions)[0, 1]
    axes[0, 0].text(0.05, 0.95, f'Correlation: {correlation:.3f}', 
                    transform=axes[0, 0].transAxes, bbox=dict(boxstyle="round", facecolor='wheat'))
    
    # 2. Residual plot
    residuals = predictions - targets
    axes[0, 1].scatter(targets, residuals, alpha=0.6, s=30)
    axes[0, 1].axhline(y=0, color='r', linestyle='--')
    axes[0, 1].set_xlabel('True Count')
    axes[0, 1].set_ylabel('Residuals (Predicted - True)')
    axes[0, 1].set_title(f'{dataset_name}: Residual Plot')
    axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Error distribution
    axes[0, 2].hist(residuals, bins=30, alpha=0.7, edgecolor='black')
    axes[0, 2].axvline(x=0, color='r', linestyle='--')
    axes[0, 2].set_xlabel('Residuals')
    axes[0, 2].set_ylabel('Frequency')
    axes[0, 2].set_title(f'{dataset_name}: Error Distribution')
    axes[0, 2].grid(True, alpha=0.3)
    
    # 4. Absolute error vs true count
    abs_errors = np.abs(residuals)
    axes[1, 0].scatter(targets, abs_errors, alpha=0.6, s=30)
    axes[1, 0].set_xlabel('True Count')
    axes[1, 0].set_ylabel('Absolute Error')
    axes[1, 0].set_title(f'{dataset_name}: Absolute Error vs True Count')
    axes[1, 0].grid(True, alpha=0.3)
    
    # 5. Percentage error vs true count
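    # Percentage error is undefined for zero targets; mark those points as NaN (not plotted)
    safe_targets = np.where(targets > 0, targets, np.nan)
    percentage_errors = np.abs(residuals) / safe_targets * 100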
    axes[1, 1].scatter(targets, percentage_errors, alpha=0.6, s=30)
    axes[1, 1].set_xlabel('True Count')
    axes[1, 1].set_ylabel('Absolute Percentage Error (%)')
    axes[1, 1].set_title(f'{dataset_name}: Percentage Error vs True Count')
    axes[1, 1].grid(True, alpha=0.3)
    
    # 6. Cumulative error distribution
    sorted_abs_errors = np.sort(abs_errors)
    cumulative_percentages = np.arange(1, len(sorted_abs_errors) + 1) / len(sorted_abs_errors) * 100
    axes[1, 2].plot(sorted_abs_errors, cumulative_percentages)
    axes[1, 2].set_xlabel('Absolute Error')
    axes[1, 2].set_ylabel('Cumulative Percentage (%)')
    axes[1, 2].set_title(f'{dataset_name}: Cumulative Error Distribution')
    axes[1, 2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Print error statistics
    print(f"\n{dataset_name} Error Analysis:")
    print(f"Mean Absolute Error: {np.mean(abs_errors):.2f}")
    print(f"Median Absolute Error: {np.median(abs_errors):.2f}")
    print(f"90th Percentile Error: {np.percentile(abs_errors, 90):.2f}")
    print(f"95th Percentile Error: {np.percentile(abs_errors, 95):.2f}")
    print(f"Max Absolute Error: {np.max(abs_errors):.2f}")

# Create visualizations for both datasets
create_prediction_visualizations(train_preds, train_targets, "Training Set")
create_prediction_visualizations(val_preds, val_targets, "Validation Set")

In [None]:
def visualize_sample_predictions(
    model: nn.Module,
    dataset: CrowdCountingRegressionDataset,
    num_samples: int = 12,
    sort_by_error: bool = True
) -> None:
    """
    Visualize sample predictions with images.
    """
    model.eval()
    
    # Get predictions for entire dataset
    dataset.return_meta = True
    sample_loader = DataLoader(dataset, batch_size=1, shuffle=False)
    
    samples = []
    with torch.no_grad():
        for data in sample_loader:
            image = data['image'].to(device)
            true_count = data['count'].item()
            image_name = data['image_name'][0]
            
            pred_count = model(image).item()
            error = abs(pred_count - true_count)
            
            # Denormalize image for visualization
            img_viz = data['image'].squeeze()
            img_viz = img_viz * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
            img_viz = torch.clamp(img_viz, 0, 1).permute(1, 2, 0).numpy()
            
            samples.append({
                'image': img_viz,
                'true_count': true_count,
                'pred_count': pred_count,
                'error': error,
                'image_name': image_name
            })
    
    # Sort by error if requested
    if sort_by_error:
        samples.sort(key=lambda x: x['error'], reverse=True)
    
    # Create visualization
    cols = 4
    rows = (num_samples + cols - 1) // cols
    fig, axes = plt.subplots(rows, cols, figsize=(16, 4 * rows))
    axes = np.atleast_1d(axes).flatten()  # works for a single row as well as a grid
    
    for i in range(num_samples):
        if i < len(samples):
            sample = samples[i]
            axes[i].imshow(sample['image'])
            axes[i].set_title(f"True: {sample['true_count']:.0f}, Pred: {sample['pred_count']:.1f}\n"
                             f"Error: {sample['error']:.1f}", fontsize=10)
            axes[i].axis('off')
        else:
            axes[i].axis('off')
    
    title = "Worst Predictions" if sort_by_error else "Sample Predictions"
    plt.suptitle(f'{title} - Validation Set', fontsize=16)
    plt.tight_layout()
    plt.show()
    
    dataset.return_meta = False  # Reset

# Visualize the worst predictions, then the first samples in dataset order
visualize_sample_predictions(model, val_dataset, num_samples=12, sort_by_error=True)
visualize_sample_predictions(model, val_dataset, num_samples=12, sort_by_error=False)

# 8. Test Set Prediction and Submission

Generate predictions for the test set and create the final submission file.
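
The post-processing applied to test predictions comes down to two steps: turn raw model outputs into non-negative integers, then order the rows numerically by file name. The sketch below shows both on made-up values. The column names `image_id` and `predicted_count` follow this notebook, while the vectorized `np.rint` and `np.clip` calls are an alternative to the per-element rounding and are used here only for brevity.

```python
import os
import numpy as np
import pandas as pd

raw_preds = np.array([2.6, -0.3, 7.5, 2.4])    # made-up model outputs
names = ['10.jpg', '2.jpg', '1.jpg', '3.jpg']  # made-up file names

# Round to the nearest integer and clip at zero so counts are non-negative ints
int_preds = np.clip(np.rint(raw_preds), 0, None).astype(int)

submission = pd.DataFrame({'image_id': names, 'predicted_count': int_preds})

# Sort numerically by file stem so '2.jpg' comes before '10.jpg'
order = submission['image_id'].map(lambda n: int(os.path.splitext(n)[0]))
submission = submission.loc[order.sort_values().index].reset_index(drop=True)
print(submission)
```

A plain string sort would place `10.jpg` before `2.jpg`, which is why the sort key is the integer file stem.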

In [None]:
class TestDataset(Dataset):
    """Dataset class for test images without labels."""
    def __init__(self, image_dir: str, transform: transforms.Compose):
        self.image_dir = image_dir
        self.transform = transform
        self.image_files = sorted([f for f in os.listdir(image_dir) if f.endswith('.jpg')],
                                 key=lambda x: int(os.path.splitext(x)[0]))
        
    def __len__(self):
        return len(self.image_files)
    
    def __getitem__(self, idx):
        img_name = self.image_files[idx]
        img_path = os.path.join(self.image_dir, img_name)
        
        image = Image.open(img_path).convert('RGB')
        if self.transform:
            image = self.transform(image)
            
        return image, img_name

def generate_test_predictions(
    model: nn.Module,
    test_dir: str,
    output_file: str = 'submission.csv'
) -> pd.DataFrame:
    """
    Generate predictions for test set and create submission file.
    """
    print("Generating test set predictions...")
    
    # Create test dataset
    test_dataset = TestDataset(test_dir, test_transforms)
    test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False, num_workers=4)
    
    model.eval()
    predictions = []
    image_names = []
    
    with torch.no_grad():
        for images, names in test_loader:
            images = images.to(device)
            outputs = model(images)
            
            # Flatten so that (B,) and (B, 1) outputs both yield one scalar per image,
            # then round to non-negative integer counts
            batch_preds = outputs.cpu().numpy().reshape(-1)
            predictions.extend(max(0, int(round(float(pred)))) for pred in batch_preds)
            image_names.extend(names)
    
    # Create submission DataFrame
    submission_df = pd.DataFrame({
        'image_id': image_names,
        'predicted_count': predictions
    })
    
    # Sort by image_id to ensure correct order
    submission_df['sort_key'] = submission_df['image_id'].apply(lambda x: int(os.path.splitext(x)[0]))
    submission_df = submission_df.sort_values('sort_key').drop('sort_key', axis=1).reset_index(drop=True)
    
    # Save submission file
    submission_df.to_csv(output_file, index=False)
    
    print(f"Submission file '{output_file}' created successfully!")
    print(f"Generated predictions for {len(submission_df)} test images")
    
    # Display statistics
    pred_counts = submission_df['predicted_count'].values
    print(f"\nTest Predictions Statistics:")
    print(f"Min: {pred_counts.min()}")
    print(f"Max: {pred_counts.max()}")
    print(f"Mean: {pred_counts.mean():.2f}")
    print(f"Median: {np.median(pred_counts):.2f}")
    print(f"Std: {pred_counts.std():.2f}")
    
    # Visualize prediction distribution
    plt.figure(figsize=(12, 4))
    
    plt.subplot(1, 2, 1)
    plt.hist(pred_counts, bins=30, alpha=0.7, edgecolor='black')
    plt.xlabel('Predicted Count')
    plt.ylabel('Frequency')
    plt.title('Distribution of Test Predictions')
    plt.grid(True, alpha=0.3)
    
    plt.subplot(1, 2, 2)
    plt.boxplot(pred_counts)
    plt.ylabel('Predicted Count')
    plt.title('Test Predictions Box Plot')
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    return submission_df

# Generate test predictions
submission_df = generate_test_predictions(model, TEST_IMG_DIR, 'regression_submission.csv')

# Display first 10 rows of submission
print("\nFirst 10 rows of submission:")
print(submission_df.head(10))

In [None]:
# Visualize sample test predictions
def visualize_test_predictions(
    model: nn.Module,
    test_dir: str,
    submission_df: pd.DataFrame,
    num_samples: int = 12
) -> None:
    """Visualize sample test predictions."""
    test_dataset = TestDataset(test_dir, test_transforms)
    
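    # Sample random images (never more than the dataset holds)
    num_samples = min(num_samples, len(test_dataset))
    indices = np.random.choice(len(test_dataset), num_samples, replace=False)
    
    model.eval()
    
    # Size the grid from num_samples instead of assuming exactly 12 images
    n_cols = 4
    n_rows = max(1, (num_samples + n_cols - 1) // n_cols)
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(4 * n_cols, 4 * n_rows))
    axes = np.atleast_1d(axes).flatten()
    for ax in axes:
        ax.axis('off')  # hide panels that stay empty
    
    with torch.no_grad():
        for i, idx in enumerate(indices):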
            image, img_name = test_dataset[idx]
            
            # Get prediction
            image_batch = image.unsqueeze(0).to(device)
            pred_count = model(image_batch).item()
            
            # Denormalize image for visualization
            img_viz = image
            img_viz = img_viz * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
            img_viz = torch.clamp(img_viz, 0, 1).permute(1, 2, 0).numpy()
            
            axes[i].imshow(img_viz)
            axes[i].set_title(f'{img_name}\nPredicted: {pred_count:.1f}')
            axes[i].axis('off')
    
    plt.suptitle('Sample Test Predictions', fontsize=16)
    plt.tight_layout()
    plt.show()

visualize_test_predictions(model, TEST_IMG_DIR, submission_df, num_samples=12)

## Model Summary and Next Steps

### Model Performance Summary:
- **Architecture**: ResNet50 + Regression Head
- **Training Strategy**: Transfer learning with ImageNet pretrained weights
- **Loss Function**: MSE with L2 regularization
- **Validation MAE**: see `val_metrics['MAE']` from the validation run
- **Validation R²**: see `val_metrics['R²']` from the validation run

### Key Advantages of the Regression Approach:
1. **Direct count prediction** - No density maps have to be generated for training
2. **Fast inference** - A single forward pass yields the count, with no density map to sum afterwards
3. **Transfer learning** - ImageNet-pretrained backbones supply the features
4. **Memory efficient** - The model outputs one scalar per image rather than a pixel-wise density map

### Potential Improvements:
1. **Ensemble methods** - Combine multiple backbone architectures
2. **Data augmentation** - More sophisticated augmentation strategies
3. **Loss function tuning** - Experiment with Huber loss or custom losses
4. **Multi-scale training** - Train on different image resolutions
5. **Feature engineering** - Add crowd density-specific features
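
To illustrate item 3: the Huber loss is quadratic for small errors and linear for large ones, so a few badly mispredicted dense crowds pull on the fit less than they do under MSE. The plain-Python sketch below only illustrates the definition. The `delta=10.0` threshold and the residuals are made-up assumptions, and in training one would more likely swap the criterion for PyTorch's built-in `torch.nn.HuberLoss`.

```python
from typing import Callable, Sequence


def huber(error: float, delta: float = 1.0) -> float:
    """Huber loss for one residual: quadratic within +/-delta, linear outside."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * abs_error ** 2
    return delta * (abs_error - 0.5 * delta)


def mean_loss(errors: Sequence[float], loss_fn: Callable[[float], float]) -> float:
    """Average a per-residual loss over all residuals."""
    return sum(loss_fn(e) for e in errors) / len(errors)


# Illustrative residuals (predicted - true count); the last one is an outlier
errors = [2.0, -3.0, 1.0, 150.0]

mse = mean_loss(errors, lambda e: e ** 2)
hub = mean_loss(errors, lambda e: huber(e, delta=10.0))

print(f"MSE:   {mse:.2f}")
print(f"Huber: {hub:.2f}")
```

The outlier dominates the MSE far more than the Huber value, which is the behavior that makes the loss worth trying on datasets with a long tail of very dense scenes.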

### Next Steps:
1. Experiment with different backbones (EfficientNet, Vision Transformers)
2. Implement model ensembling
3. Analyze failure cases and improve data preprocessing
4. Consider hybrid approaches combining regression with attention mechanisms
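
As a sketch of the ensembling step, one simple option is to average the raw per-image counts from several models and round only once at the end. The function below is hypothetical and works on plain lists so that it runs without trained models. The model names and numbers are placeholders, and it assumes every model predicts the same images in the same order.

```python
from typing import Dict, List, Optional


def ensemble_counts(
    preds_by_model: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[int]:
    """Weighted average of per-image predictions, rounded to non-negative ints."""
    names = list(preds_by_model)
    if not names:
        raise ValueError("need at least one model")
    n_images = len(preds_by_model[names[0]])
    if any(len(preds_by_model[name]) != n_images for name in names):
        raise ValueError("all models must predict the same number of images")

    if weights is None:
        weights = {name: 1.0 for name in names}  # plain average
    total_weight = sum(weights[name] for name in names)

    counts = []
    for i in range(n_images):
        avg = sum(weights[name] * preds_by_model[name][i] for name in names) / total_weight
        counts.append(max(0, int(round(avg))))  # average first, round once
    return counts


# Placeholder predictions for three images from two hypothetical backbones
raw = {
    "resnet50": [10.2, 48.9, -0.4],
    "efficientnet": [12.0, 51.5, 0.1],
}
print(ensemble_counts(raw))
print(ensemble_counts(raw, weights={"resnet50": 3.0, "efficientnet": 1.0}))
```

Weights could, for example, be chosen from each model's validation MAE, though that choice is left open here.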