# Research Question 3: GPU-Accelerated Stain Normalization Analysis

## 🚀 GPU-Optimized Version for High-Performance Computing

**Research Question**: Does stain normalization improve U-Net-based nuclei instance segmentation on the PanNuke dataset compared to unnormalized data?

### Key Optimizations in This Version:
- **GPU Acceleration**: PyTorch-based implementation with CUDA support
- **Batch Processing**: Process multiple images simultaneously
- **Memory Management**: Efficient GPU memory usage
- **Performance**: Roughly 20-70x speedup over the CPU version, as implied by the runtime estimates below (4-7 hours vs. 6-12 minutes)
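
As a rough illustration of the batch-processing idea, the sketch below groups image paths into fixed-size chunks so that each chunk could be normalized in a single GPU call. `iter_batches` and the file names are hypothetical, written only for this example; the actual batching logic inside `src.preprocessing.vahadane_gpu` is not shown in this notebook and may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Illustrative file names only; the final batch may be smaller than the others.
paths = [f"img_{i:04d}.png" for i in range(5)]
batches = list(iter_batches(paths, batch_size=2))
```

Chunking by index keeps the input order, which makes it straightforward to match normalized outputs back to their source files.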

### Expected Performance:
- **CPU Version**: 4-7 hours for full dataset (5,072 images)
- **GPU Version**: 6-12 minutes for full dataset
- **Batch Size**: Automatically adjusted based on GPU memory
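
One plausible way to adjust the batch size to the available GPU memory is sketched below. This is a heuristic under stated assumptions, not the rule used by the project's normalizer: `estimate_batch_size` is a hypothetical helper, the 256x256 RGB float32 image size is assumed, and `overhead_factor` is a guessed multiplier for intermediate tensors.

```python
def estimate_batch_size(
    free_bytes: int,
    height: int = 256,             # assumed patch height
    width: int = 256,              # assumed patch width
    channels: int = 3,             # RGB
    bytes_per_value: int = 4,      # float32
    overhead_factor: float = 8.0,  # assumed headroom for intermediate tensors
    max_batch: int = 256,          # assumed upper cap
) -> int:
    """Return a batch size that should fit in `free_bytes` (heuristic sketch)."""
    per_image = height * width * channels * bytes_per_value * overhead_factor
    fits = int(free_bytes // per_image)
    # Always process at least one image, and never exceed the cap.
    return max(1, min(max_batch, fits))


large_gpu = estimate_batch_size(8 * 1024**3)    # 8 GiB free: capped at max_batch
small_gpu = estimate_batch_size(100 * 1024**2)  # 100 MiB free
```

On a CUDA machine, the free-memory figure could come from `torch.cuda.mem_get_info()`, which returns the free and total bytes for the current device.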

In [None]:
# GPU-Optimized imports and setup
import torch
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import cv2
from pathlib import Path
import os
import sys
import warnings
from PIL import Image
import random
import time
from collections import defaultdict
from typing import List, Dict, Tuple
import gc

# Setup paths
project_root = Path('/Users/shubhangmalviya/Documents/Projects/Walsh College/HistoPathologyResearch')
sys.path.append(str(project_root))

# Import our GPU-accelerated stain normalization implementation
from src.preprocessing.vahadane_gpu import GPUVahadaneNormalizer, create_gpu_normalizer

# Set style and warnings
try:
    plt.style.use('seaborn-v0_8')  # style name used by Matplotlib >= 3.6
except OSError:
    plt.style.use('seaborn')  # fallback for older Matplotlib releases
sns.set_palette('husl')
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
random.seed(42)
np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

print("🚀 GPU-Optimized Stain Normalization EDA Initialized")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("⚠️  Running on CPU - GPU not available")
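
The seeding steps in the cell above could be collected into one helper, sketched below. `seed_everything` is a hypothetical name, not an existing function in this project. The cuDNN flags are an optional addition that trades some speed for more repeatable GPU results, and even with them set, bit-identical runs are not guaranteed across hardware or library versions.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch RNGs where available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; skip
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
        # Optional: prefer deterministic cuDNN kernels over autotuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


seed_everything(42)
first_draw = random.random()
seed_everything(42)
second_draw = random.random()  # same value as first_draw after reseeding
```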