# 🔬 Ovarian Cancer Segmentation Lab

Welcome to this comprehensive lab on medical image segmentation for ovarian cancer detection! In this lab, you'll work with volumetric CT scan data to develop an advanced deep learning solution for automated cancer tissue identification.

## 📋 Task Overview
Your goal is to develop a 3D U-Net model that can accurately segment CT volumes into three distinct classes:
- **Class 0**: Background tissue
- **Class 1**: Primary ovarian cancer
- **Class 2**: Metastatic tissue

## 🎯 Learning Objectives
By completing this lab, you will:
- Master working with medical imaging data in NIfTI format
- Implement and understand the 3D U-Net architecture
- Learn effective training strategies for medical image segmentation
- Develop skills in evaluating and validating medical imaging models
- Gain practical experience with real-world medical data

## 🔍 Clinical Relevance
Accurate segmentation of ovarian cancer tissues is crucial for:
- Early detection and diagnosis
- Treatment planning and monitoring
- Assessment of disease progression
- Research and clinical trials

Let's dive in and build a solution that could make a real difference in healthcare! 🚀


# 1️⃣ Environment Setup and Dependencies

Before we begin our implementation, let's set up our development environment with all necessary packages and configurations.

## 📦 Required Packages
We'll be using the following key libraries:
- **PyTorch**: For deep learning model implementation
- **NiBabel**: For handling medical imaging data in NIfTI format
- **scikit-image**: For image processing and transformations
- **NumPy**: For numerical computations
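- **Matplotlib**: For visualization
- **scikit-learn**: For the train/validation split
- **gdown**: For downloading the dataset from Google Drive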

## 🖥️ Hardware Requirements
- GPU with CUDA support (recommended)
- Sufficient RAM for 3D volume processing
- Adequate storage for medical imaging data

## ⚙️ Configuration
We'll set up:
- CUDA device if available
- Random seeds for reproducibility
- Memory optimization settings
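
The configuration steps above can be sketched as follows. This is a minimal illustration rather than the lab's own setup cell: `seed_everything` is a hypothetical helper name, and the device is chosen once and kept in a variable instead of changing PyTorch's global default tensor type.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed the Python, NumPy, and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds the CUDA generators when present


# Choose the device once; fall back to CPU when CUDA is unavailable
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Re-seeding reproduces the same random draw
seed_everything(42)
a = torch.rand(3)
seed_everything(42)
b = torch.rand(3)
print(device, torch.equal(a, b))
```

Tensors and the model can then be moved with `.to(device)`, which works unchanged on CPU-only machines.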


In [None]:
# Install required packages
!pip install numpy --quiet
!pip install scipy --quiet
!pip install scikit-learn --quiet
!pip install scikit-image nibabel gdown torch torchvision --quiet

# Import necessary libraries
import os
import numpy as np
import matplotlib.pyplot as plt
import nibabel as nib
from skimage import transform
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from torch.optim.lr_scheduler import ReduceLROnPlateau
from sklearn.model_selection import train_test_split

# Select the compute device (falls back to CPU when CUDA is unavailable)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
if device.type == 'cuda':
    print(f"GPU Device: {torch.cuda.get_device_name(0)}")
    print(f"Total GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("WARNING: CUDA is not available. Please make sure to enable GPU in Runtime > Change runtime type")

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)


# 2️⃣ Data Acquisition and Preprocessing

## 📥 Dataset Download
First, we'll download our dataset containing CT scans and their corresponding segmentation masks. The data is stored in NIfTI format (`.nii.gz`), which is commonly used for medical imaging.

## 🗂️ Data Organization
The dataset is organized into two main directories:
- `Data_Subsample/CT/`: Contains the CT scan volumes
- `Data_Subsample/Segmentation/`: Contains the corresponding segmentation masks
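
Because every CT volume needs a matching mask, it can help to check the pairing before training. The sketch below assumes that a CT file and its mask share the same filename, which this lab does not state explicitly; `find_unpaired` is a hypothetical helper and the filenames are made up.

```python
def find_unpaired(ct_names, seg_names):
    """Return (CT files without a mask, masks without a CT file)."""
    ct_set, seg_set = set(ct_names), set(seg_names)
    return sorted(ct_set - seg_set), sorted(seg_set - ct_set)


# Made-up filenames, for illustration only
ct_names = ["case_01.nii.gz", "case_02.nii.gz", "case_03.nii.gz"]
seg_names = ["case_01.nii.gz", "case_03.nii.gz"]

missing_masks, missing_cts = find_unpaired(ct_names, seg_names)
print(missing_masks)  # ['case_02.nii.gz']
print(missing_cts)    # []
```

If the naming convention differs (for example a suffix on the mask files), the comparison would need to strip that suffix first.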

## 💾 Data Loading
Let's download and extract the dataset, then verify our data structure:


In [None]:
# Download and extract dataset
file_id = '1Wo4h6ZVIFygVvqd68ApwWIdPQk3l7gkO'
output = 'Data_Subsample.zip'

if not os.path.exists(output):
    import subprocess
    # Pass the file ID positionally (the --id flag is deprecated in recent gdown releases)
    subprocess.run(['gdown', file_id, '-O', output], check=True)

# Extract data if not already extracted
if not os.path.exists('Data_Subsample'):
    import zipfile
    with zipfile.ZipFile(output, 'r') as zip_ref:
        zip_ref.extractall('.')

# List available files
ct_files = sorted([f for f in os.listdir('Data_Subsample/CT') if f.endswith('.nii.gz')])
seg_files = sorted([f for f in os.listdir('Data_Subsample/Segmentation') if f.endswith('.nii.gz')])

print(f'Number of CT volumes: {len(ct_files)}')
print(f'Number of segmentation masks: {len(seg_files)}')


# 3️⃣ Loss Functions and Metrics

For medical image segmentation, choosing appropriate loss functions is crucial. We'll implement two key components:

## 🎯 Dice Loss
The Dice coefficient (equivalent to the F1 score when computed on binary masks) is particularly useful for segmentation tasks because it:
- Is less dominated by the large background class than voxel-wise accuracy
- Focuses on overlap between predictions and ground truth
- Ranges from 0 (no overlap) to 1 (perfect overlap)
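
As a small worked example, the Dice coefficient for two binary masks is twice the overlap divided by the sum of the two mask sizes. The NumPy sketch below uses a hypothetical `dice_coefficient` helper and toy masks; the `smooth` term mirrors the smoothing constant used by the loss in this lab.

```python
import numpy as np


def dice_coefficient(pred, target, smooth=1e-5):
    """Dice = 2 * |pred AND target| / (|pred| + |target|), with smoothing."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)


pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
target = np.array([[0, 1, 1, 0],
                   [0, 1, 1, 0]])

# 2 overlapping voxels, 4 voxels in each mask: 2 * 2 / (4 + 4) = 0.5
partial = dice_coefficient(pred, target)
perfect = dice_coefficient(pred, pred)
print(round(partial, 3), round(perfect, 3))  # 0.5 1.0
```

With the smoothing term, two empty masks also score 1.0, which avoids a division by zero for classes that are absent from a volume.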

## 🔄 Combined Loss
We'll combine Dice Loss with weighted Cross-Entropy to:
- Balance between pixel-wise and region-based segmentation quality
- Handle class imbalance through dynamic class weights
- Provide smoother gradients during training
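
To make the weighting scheme concrete, here is a NumPy sketch of how class weights of this kind can be computed: the background gets a fixed small weight, and the remaining weight is split between the cancer classes in proportion to their inverse frequency. The voxel counts are made up, and `cancer_focused_weights` is a hypothetical helper name.

```python
import numpy as np


def cancer_focused_weights(class_counts, background_weight=0.1, smooth=1e-5):
    """Fixed background weight; cancer classes scaled by inverse frequency."""
    counts = np.asarray(class_counts, dtype=np.float64)
    n_classes = counts.size
    weights = np.empty(n_classes)
    weights[0] = background_weight
    weights[1:] = (1.0 - background_weight) / (n_classes - 1)

    # Inverse-frequency factors for the cancer classes, normalized to sum to 1
    inv_freq = counts.sum() / (counts[1:] * n_classes + smooth)
    weights[1:] *= inv_freq / inv_freq.sum()
    return weights


# Made-up voxel counts: background, primary tumor, metastasis
weights = cancer_focused_weights([900, 80, 20])
print(np.round(weights, 3))  # approximately [0.1, 0.09, 0.36]
```

In this example the rarer class receives four times the weight of the more common cancer class, while the more common one ends up slightly below the background weight. That side effect is worth keeping in mind when tuning `background_weight`.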

Let's implement these loss functions:


In [None]:
class DiceLoss(nn.Module):
    """Dice Loss for multi-class 3D segmentation"""
    def __init__(self, smooth=1e-5):
        super(DiceLoss, self).__init__()
        self.smooth = smooth

    def forward(self, predictions, targets):
        # predictions shape: (batch_size, n_classes, d1, d2, d3)
        # targets shape: (batch_size, d1, d2, d3)

        # Convert predictions to probabilities
        predictions = F.softmax(predictions, dim=1)

        # One-hot encode targets
        n_classes = predictions.shape[1]
        one_hot_targets = F.one_hot(targets, n_classes).permute(0, 4, 1, 2, 3).float()

        # Calculate Dice score for each class
        numerator = 2 * (predictions * one_hot_targets).sum(dim=(2, 3, 4))
        denominator = predictions.sum(dim=(2, 3, 4)) + one_hot_targets.sum(dim=(2, 3, 4))
        dice_scores = (numerator + self.smooth) / (denominator + self.smooth)

        # Average over classes and batch
        return 1 - dice_scores.mean()

class CombinedLoss(nn.Module):
    """Combined Dice and weighted Cross-Entropy loss with focus on cancer classes"""
    def __init__(self, smooth=1e-5, ce_weight=0.5, background_weight=0.1):
        super(CombinedLoss, self).__init__()
        self.smooth = smooth
        self.ce_weight = ce_weight
        self.background_weight = background_weight
        self.dice_loss = DiceLoss(smooth=smooth)

    def forward(self, predictions, targets):
        # Calculate class weights with reduced background weight
        n_classes = predictions.shape[1]
        class_counts = torch.bincount(targets.flatten(), minlength=n_classes).float()
        total_pixels = class_counts.sum()
        
        # Modify weights to focus on cancer classes
        class_weights = torch.zeros_like(class_counts)
        class_weights[0] = self.background_weight  # Background class
        class_weights[1:] = (1.0 - self.background_weight) / (n_classes - 1)  # Cancer classes
        
        # Scale weights by inverse frequency within the cancer classes present
        # in this batch. Absent classes keep their base weight: they do not
        # contribute to the CE loss and would otherwise absorb all the weight.
        cancer_counts = class_counts[1:]  # Counts for cancer classes
        present = cancer_counts > 0
        if present.any():  # Avoid division by zero
            inv_freq = total_pixels / (cancer_counts[present] * n_classes + self.smooth)
            class_weights[1:][present] *= inv_freq / inv_freq.sum()  # Normalize
        
        class_weights = class_weights.to(predictions.device)

        # Dice Loss (focusing on cancer classes)
        dice_loss = self.dice_loss(predictions, targets)

        # Weighted Cross Entropy Loss
        ce_loss = F.cross_entropy(predictions, targets, weight=class_weights)

        # Combine losses
        return dice_loss + self.ce_weight * ce_loss


# 4️⃣ Dataset and Model Architecture

## 📊 Dataset Implementation
We'll create a custom PyTorch Dataset class that:
- Loads and preprocesses 3D medical images
- Handles data normalization and augmentation
- Returns (CT, mask) tensor pairs that a `DataLoader` batches for training
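
The normalization step can be illustrated on synthetic data. This is a sketch of percentile-based scaling under the assumption that clipping to the 1st and 99th percentiles is acceptable for the data at hand; the `eps` guard for constant volumes and the random intensities are additions for illustration.

```python
import numpy as np


def normalize_volume(volume, low_pct=1, high_pct=99, eps=1e-8):
    """Clip to the given percentiles, then rescale to roughly [0, 1]."""
    p_low, p_high = np.percentile(volume, (low_pct, high_pct))
    clipped = np.clip(volume, p_low, p_high)
    # eps avoids a division by zero when the volume is constant
    return (clipped - p_low) / (p_high - p_low + eps)


rng = np.random.default_rng(0)
volume = rng.normal(loc=40.0, scale=200.0, size=(16, 16, 8))  # synthetic intensities
volume[0, 0, 0] = 3000.0  # simulated bright outlier, e.g. metal

norm = normalize_volume(volume)
constant = normalize_volume(np.full((4, 4, 4), 7.0))
print(norm.min() >= 0.0, norm.max() <= 1.0, np.isfinite(constant).all())
```

Clipping before rescaling keeps a single extreme voxel from compressing the rest of the intensity range.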

## 🏗️ Model Architecture
Our 3D U-Net implementation includes:
- Encoder path with increasing feature channels
- Decoder path with skip connections
- Advanced features:
  - Batch normalization for stable training
  - Residual connections for better gradient flow
  - Dropout for regularization
  - Squeeze-and-Excitation blocks for channel attention
  - Deep supervision heads at each decoder level (used during training only)
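
Since the preprocessing keeps the original number of slices, the depth of a volume may not be divisible by 8, and three rounds of pooling followed by upsampling then produce sizes that no longer match the skip connections. The pure-Python sketch below traces this for one axis; `trace_decoder_depths` is a hypothetical helper, and it assumes 2x max pooling that rounds down and 2x transposed convolutions.

```python
def trace_decoder_depths(depth, n_pools=3):
    """Return (upsampled_depth, skip_depth) pairs, deepest decoder level first."""
    # Each 2x max pooling halves the size, rounding down
    encoder_depths = [depth]
    for _ in range(n_pools):
        encoder_depths.append(encoder_depths[-1] // 2)

    pairs = []
    current = encoder_depths[-1]  # bottleneck depth
    for skip in reversed(encoder_depths[:-1]):
        upsampled = current * 2  # a 2x transposed convolution doubles the size
        pairs.append((upsampled, skip))
        current = skip  # after resizing (if needed) x matches the skip tensor
    return pairs


print(trace_decoder_depths(64))  # [(16, 16), (32, 32), (64, 64)]
print(trace_decoder_depths(45))  # [(10, 11), (22, 22), (44, 45)]
```

For a depth of 45, two of the three decoder levels are off by one slice, which is why the decoder needs a resizing step before concatenating with the skip connection.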

Let's implement these components:


In [None]:
class OvarianCancerDataset(Dataset):
    """Dataset class for 3D ovarian cancer segmentation"""
    def __init__(self, ct_files, seg_files, target_xy_size=128, is_train=True):
        self.ct_files = ct_files
        self.seg_files = seg_files
        self.target_xy_size = target_xy_size
        self.is_train = is_train

    def normalize_volume(self, volume):
        """Normalize volume to [0,1] range with robust scaling"""
        p1, p99 = np.percentile(volume, (1, 99))
        volume = np.clip(volume, p1, p99)
        # Small epsilon guards against division by zero for constant volumes
        volume = (volume - p1) / (p99 - p1 + 1e-8)
        return volume

    def load_volume(self, file_path):
        """Load a NIfTI volume and return its data"""
        return nib.load(file_path).get_fdata()

    def resize_volume(self, volume, is_mask=False):
        """Resize volume while preserving z-dimension"""
        resized_slices = []
        for z in range(volume.shape[2]):
            if is_mask:
                # Nearest neighbor for masks to preserve labels
                slice_2d = transform.resize(volume[:,:,z], 
                                         (self.target_xy_size, self.target_xy_size), 
                                         order=0, 
                                         preserve_range=True,
                                         anti_aliasing=False)
            else:
                # Bilinear interpolation for CT images
                slice_2d = transform.resize(volume[:,:,z], 
                                         (self.target_xy_size, self.target_xy_size), 
                                         order=1,
                                         preserve_range=True,
                                         anti_aliasing=True)
            resized_slices.append(slice_2d)
        return np.stack(resized_slices, axis=2)

    def augment_volume(self, ct_vol, seg_vol):
        """Apply data augmentation"""
        if not self.is_train:
            return ct_vol, seg_vol

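        # Random flips (copy so the arrays keep positive strides, which
        # PyTorch requires when converting from NumPy)
        if np.random.random() > 0.5:
            ct_vol = np.flip(ct_vol, axis=0).copy()
            seg_vol = np.flip(seg_vol, axis=0).copy()
        if np.random.random() > 0.5:
            ct_vol = np.flip(ct_vol, axis=1).copy()
            seg_vol = np.flip(seg_vol, axis=1).copy()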

        # Random rotation
        if np.random.random() > 0.5:
            angle = np.random.uniform(-15, 15)
            for z in range(ct_vol.shape[2]):
                ct_vol[:,:,z] = transform.rotate(ct_vol[:,:,z], angle, 
                                               mode='reflect', 
                                               preserve_range=True)
                seg_vol[:,:,z] = transform.rotate(seg_vol[:,:,z], angle, 
                                                mode='reflect', 
                                                order=0, 
                                                preserve_range=True)

        return ct_vol, seg_vol

    def preprocess_volume(self, ct_path, seg_path):
        """Load and preprocess a single volume pair"""
        # Load volumes
        ct_vol = self.load_volume(ct_path)
        seg_vol = self.load_volume(seg_path)

        # Resize volumes while preserving z-dimension
        ct_vol = self.resize_volume(ct_vol, is_mask=False)
        seg_vol = self.resize_volume(seg_vol, is_mask=True)

        # Normalize CT volume
        ct_vol = self.normalize_volume(ct_vol)

        # Apply augmentation
        if self.is_train:
            ct_vol, seg_vol = self.augment_volume(ct_vol, seg_vol)

        # Ensure segmentation values are integers
        seg_vol = np.round(seg_vol).astype(np.int64)

        return ct_vol, seg_vol

    def __len__(self):
        return len(self.ct_files)

    def __getitem__(self, idx):
        ct_path = os.path.join('Data_Subsample/CT', self.ct_files[idx])
        seg_path = os.path.join('Data_Subsample/Segmentation', self.seg_files[idx])

        # Load and preprocess
        ct_vol, seg_vol = self.preprocess_volume(ct_path, seg_path)

        # Convert to torch tensors and add channel dimension
        ct_vol = torch.FloatTensor(ct_vol).unsqueeze(0)
        seg_vol = torch.LongTensor(seg_vol)
        
        return ct_vol, seg_vol

# Split data into training and validation sets
train_ct, val_ct, train_seg, val_seg = train_test_split(
    ct_files, seg_files, test_size=0.2, random_state=42
)

# Create datasets
train_dataset = OvarianCancerDataset(train_ct, train_seg)
val_dataset = OvarianCancerDataset(val_ct, val_seg, is_train=False)  # No augmentation for validation

# Set random seed for reproducibility
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

# Create dataloaders
train_loader = DataLoader(
    train_dataset, 
    batch_size=1, 
    shuffle=True,  # Let PyTorch handle shuffling
    num_workers=0,  # Run in main process
    pin_memory=True if torch.cuda.is_available() else False
)

val_loader = DataLoader(
    val_dataset, 
    batch_size=1,
    shuffle=False,
    num_workers=0,  # Run in main process
    pin_memory=True if torch.cuda.is_available() else False
)

print(f'Training samples: {len(train_dataset)}')
print(f'Validation samples: {len(val_dataset)}')


In [None]:
class SEBlock(nn.Module):
    """Squeeze-and-Excitation block for channel attention"""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1, 1)
        return x * y

class DoubleConv(nn.Module):
    """Enhanced double convolution block with SE attention"""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.double_conv = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),
            nn.ReLU(inplace=True)
        )
        self.se = SEBlock(out_channels)
        self.residual = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=1),
            nn.BatchNorm3d(out_channels)
        )

    def forward(self, x):
        main = self.double_conv(x)
        main = self.se(main)
        residual = self.residual(x)
        return F.relu(main + residual)

class UNet3D(nn.Module):
    """Enhanced 3D U-Net with SE attention, residual connections, and deep supervision"""
    def __init__(self, in_channels=1, out_channels=3, features=[32, 64, 128, 256]):
        super(UNet3D, self).__init__()
        self.encoder = nn.ModuleList()
        self.decoder = nn.ModuleList()
        self.deep_supervision = nn.ModuleList()
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)

        # Encoder (the deepest level is provided by the bottleneck below)
        in_channels_temp = in_channels
        for feature in features[:-1]:
            self.encoder.append(DoubleConv(in_channels_temp, feature))
            in_channels_temp = feature

        # Decoder with deep supervision, built from the deepest level upward
        for i in reversed(range(len(features) - 1)):
            feature = features[i]
            # Upsampling
            self.decoder.append(
                nn.Sequential(
                    nn.ConvTranspose3d(
                        features[i + 1],
                        feature,
                        kernel_size=2,
                        stride=2
                    ),
                    nn.BatchNorm3d(feature),
                    nn.ReLU(inplace=True)
                )
            )
            # Double conv after concatenation
            self.decoder.append(DoubleConv(feature * 2, feature))

            # Deep supervision outputs
            self.deep_supervision.append(
                nn.Conv3d(feature, out_channels, kernel_size=1)
            )

        self.bottleneck = DoubleConv(features[-2], features[-1])
        self.final_conv = nn.Conv3d(features[0], out_channels, kernel_size=1)

        # Advanced regularization
        self.dropout = nn.Dropout3d(p=0.3)
        self.spatial_dropout = nn.Dropout3d(p=0.1)

    def forward(self, x):
        skip_connections = []
        deep_outputs = []

        # Encoder
        for encoder in self.encoder:
            x = encoder(x)
            x = self.spatial_dropout(x)  # Spatial dropout for feature map augmentation
            skip_connections.append(x)
            x = self.pool(x)
            x = self.dropout(x)

        x = self.bottleneck(x)

        # Decoder with deep supervision
        skip_connections = skip_connections[::-1]
        for idx in range(0, len(self.decoder), 2):
            x = self.decoder[idx](x)
            skip = skip_connections[idx//2]

            # Handle different sizes
            if x.shape != skip.shape:
                x = F.interpolate(x, size=skip.shape[2:])

            concat_skip = torch.cat((skip, x), dim=1)
            x = self.decoder[idx+1](concat_skip)
            x = self.dropout(x)

            # Deep supervision output
            deep_out = self.deep_supervision[idx//2](x)
            deep_outputs.append(deep_out)

        # Final output
        final_out = self.final_conv(x)
        
        if self.training:
            # During training, return main output and deep supervision outputs
            return final_out, deep_outputs
        else:
            # During inference, return only the main output
            return final_out

# Initialize model and move to GPU if available
model = UNet3D(in_channels=1, out_channels=3)
if torch.cuda.is_available():
    model = model.cuda()

print(f"Model parameters: {sum(p.numel() for p in model.parameters()):,}")
print(f"Model architecture:\n{model}")


# 5️⃣ Training and Evaluation

## 🏃‍♂️ Training Process
Our training pipeline includes:
- Batch-wise training with GPU acceleration
- Learning rate scheduling with OneCycleLR (warm-up followed by annealing)
- Early stopping to prevent overfitting
- Model checkpointing to save best weights
- Memory optimization with periodic cache clearing
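
The early-stopping rule in this list can be sketched in isolation. The `EarlyStopping` class below is a hypothetical illustration that tracks a metric to maximize (such as validation Dice); the per-epoch scores are made up.

```python
class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("-inf")
        self.counter = 0

    def step(self, metric):
        """Record a new validation metric; return True if training should stop."""
        if metric > self.best:
            self.best = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience


stopper = EarlyStopping(patience=2)
dice_per_epoch = [0.40, 0.55, 0.54, 0.53, 0.60]  # made-up validation scores

stopped_at = None
for epoch, dice in enumerate(dice_per_epoch, start=1):
    if stopper.step(dice):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)  # 4 0.55
```

With a patience of 2, training stops at epoch 4 and never sees the better score at epoch 5, which illustrates the trade-off behind choosing a larger patience.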

## 📈 Evaluation Metrics
We'll monitor:
- Dice coefficient per class
- Overall segmentation accuracy
- Class-wise precision and recall
- Training and validation loss curves
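
Class-wise precision and recall can be computed directly from the predicted and reference label arrays. The following NumPy sketch uses a hypothetical `per_class_precision_recall` helper and a toy 1D example; the same comparisons apply element-wise to full 3D volumes.

```python
import numpy as np


def per_class_precision_recall(pred, target, n_classes=3):
    """Return {class: (precision, recall)}; NaN when a value is undefined."""
    results = {}
    for c in range(n_classes):
        tp = np.sum((pred == c) & (target == c))
        fp = np.sum((pred == c) & (target != c))
        fn = np.sum((pred != c) & (target == c))
        precision = tp / (tp + fp) if tp + fp > 0 else float("nan")
        recall = tp / (tp + fn) if tp + fn > 0 else float("nan")
        results[c] = (float(precision), float(recall))
    return results


target = np.array([0, 0, 0, 0, 1, 1, 2, 2])
pred = np.array([0, 0, 0, 1, 1, 1, 2, 0])

metrics = per_class_precision_recall(pred, target)
for c, (precision, recall) in metrics.items():
    print(f"class {c}: precision={precision:.2f}, recall={recall:.2f}")
```

Here class 2 has perfect precision but only 50% recall, the kind of missed-lesion pattern that an averaged Dice score can hide.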

## 🔍 Visualization
During and after training, we'll visualize:
- Sample predictions on validation data
- Training progress and learning curves
- Segmentation overlays on CT slices
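
One way to build a segmentation overlay is to blend a color for each class into a grayscale slice. The sketch below assumes the CT slice is already normalized to [0, 1]; `overlay_mask` and the color assignment are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical color map: class index -> RGB in [0, 1]
CLASS_COLORS = {1: (1.0, 0.0, 0.0), 2: (0.0, 0.0, 1.0)}


def overlay_mask(ct_slice, mask_slice, alpha=0.4):
    """Blend class colors onto a grayscale slice with values in [0, 1]."""
    rgb = np.repeat(ct_slice[..., None], 3, axis=-1).astype(np.float64)
    for label, color in CLASS_COLORS.items():
        region = mask_slice == label
        rgb[region] = (1 - alpha) * rgb[region] + alpha * np.asarray(color)
    return rgb


ct_slice = np.full((4, 4), 0.5)
mask_slice = np.zeros((4, 4), dtype=np.int64)
mask_slice[1, 1] = 1  # primary tumor voxel
mask_slice[2, 2] = 2  # metastasis voxel

overlay = overlay_mask(ct_slice, mask_slice)
print(overlay.shape)  # (4, 4, 3)
```

The resulting array can be displayed with `plt.imshow(overlay)`; background voxels keep their original gray value.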

Let's implement the training loop and evaluation functions:


In [None]:
def train_epoch(model, loader, optimizer, criterion, deep_weight=0.5):
    """Train the model for one epoch with deep supervision"""
    model.train()
    total_loss = 0
    num_batches = len(loader)

    for batch_idx, (data, target) in enumerate(loader):
        # Move data to GPU if available
        if torch.cuda.is_available():
            data = data.cuda()
            target = target.cuda()

        # Clear gradients
        optimizer.zero_grad()

        # Forward pass with deep supervision
        main_output, deep_outputs = model(data)
        
        # Calculate main loss
        main_loss = criterion(main_output, target)
        
        # Calculate deep supervision losses
        deep_loss = 0
        for deep_out in deep_outputs:
            # Resize deep supervision output to match target size if needed
            if deep_out.shape[2:] != target.shape[1:]:
                deep_out = F.interpolate(deep_out, size=target.shape[1:], mode='trilinear', align_corners=False)
            deep_loss += criterion(deep_out, target)
        
        # Combine main and deep supervision losses
        loss = main_loss + deep_weight * (deep_loss / len(deep_outputs))

        # Backward pass
        loss.backward()
        optimizer.step()

        # Update metrics
        total_loss += loss.item()

        # Print progress and clear cache
        if batch_idx % 5 == 0:
            print(f'Batch {batch_idx}/{num_batches}, Loss: {loss.item():.4f}')
            if torch.cuda.is_available():
                torch.cuda.empty_cache()  # Clear GPU cache periodically

    return total_loss / num_batches

def validate(model, loader, criterion):
    """Validate the model"""
    model.eval()
    total_loss = 0
    dice_scores = []
    num_batches = 0

    with torch.no_grad():
        for data, target in loader:
            # Move data to GPU if available
            if torch.cuda.is_available():
                data = data.cuda()
                target = target.cuda()
            
            # Model returns only main output during validation
            output = model(data)
            loss = criterion(output, target)
            
            # Calculate Dice score
            pred = F.softmax(output, dim=1)
            pred = torch.argmax(pred, dim=1)
            for i in range(3):  # 3 classes
                dice = calculate_dice_score(pred == i, target == i)
                dice_scores.append(dice)
            
            total_loss += loss.item()
            num_batches += 1

    avg_dice = sum(dice_scores) / len(dice_scores)
    return total_loss / num_batches, avg_dice

def calculate_dice_score(pred, target):
    """Calculate Dice score for binary masks"""
    intersection = (pred & target).sum().float()
    total = pred.sum() + target.sum()  # Sum of both mask sizes (not the set union)
    if total == 0:
        return 1.0  # Define empty as perfect match
    return (2.0 * intersection / total).item()

def predict_volume(model, ct_volume):
    """Generate predictions for a single volume"""
    model.eval()
    with torch.no_grad():
        if torch.cuda.is_available():
            ct_volume = ct_volume.cuda()
        pred = model(ct_volume.unsqueeze(0))
        pred = F.softmax(pred, dim=1)
        pred = torch.argmax(pred, dim=1)
    return pred[0].cpu().numpy()

# Training setup with focus on cancer classes
criterion = CombinedLoss(
    ce_weight=0.7,  # Balance between Dice and CE
    background_weight=0.1,  # Reduce focus on background class
    smooth=1e-5
)

# AdamW optimizer with the AMSGrad variant
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-3,
    weight_decay=0.01,
    amsgrad=True  # Use AMSGrad variant
)

# Learning rate scheduler (stepped once per epoch in the training loop below,
# so the schedule is defined over epochs rather than batches)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,
    epochs=50,
    steps_per_epoch=1,
    pct_start=0.3,  # Warm-up period
    div_factor=25,  # Initial lr = max_lr/25
    final_div_factor=1000,  # Min lr = initial_lr/1000
)

# Training loop with early stopping and checkpointing on validation Dice
n_epochs = 50  # Must match the `epochs` passed to the scheduler above
best_val_dice = 0.0  # Track best Dice score instead of loss
patience = 10  # Epochs without improvement before stopping
patience_counter = 0
history = {'train_loss': [], 'val_loss': [], 'val_dice': [], 'lr': []}

print("Starting training...")
for epoch in range(n_epochs):
    print(f"\nEpoch {epoch+1}/{n_epochs}")
    
    # Training phase with deep supervision
    train_loss = train_epoch(model, train_loader, optimizer, criterion, deep_weight=0.5)
    print(f"Training Loss: {train_loss:.4f}")
    
    # Validation phase
    val_loss, val_dice = validate(model, val_loader, criterion)
    print(f"Validation Loss: {val_loss:.4f}, Validation Dice: {val_dice:.4f}")
    
    # Learning rate scheduling
    scheduler.step()
    current_lr = optimizer.param_groups[0]['lr']
    print(f"Current learning rate: {current_lr:.2e}")
    
    # Update history
    history['train_loss'].append(train_loss)
    history['val_loss'].append(val_loss)
    history['val_dice'].append(val_dice)
    history['lr'].append(current_lr)
    
    # Model checkpointing based on Dice score
    if val_dice > best_val_dice:
        best_val_dice = val_dice
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'scheduler_state_dict': scheduler.state_dict(),
            'best_dice': best_val_dice,
            'history': history
        }, 'best_model.pth')
        patience_counter = 0
        print(f"Saved new best model with Dice score: {best_val_dice:.4f}!")
    else:
        patience_counter += 1
    
    # Early stopping
    if patience_counter >= patience:
        print(f"\nEarly stopping triggered after {epoch+1} epochs")
        print(f"Best validation Dice score: {best_val_dice:.4f}")
        break
    
    # Clear GPU memory
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


# 6️⃣ Discussion and Future Work

## 🔍 Technical Analysis
Consider and discuss the following aspects of your implementation:

### Data Challenges
- What difficulties did you encounter with the medical imaging data?
- How effective was your preprocessing pipeline?
- What additional data augmentation techniques could be beneficial?

### Model Performance
- How well did the model segment different tissue types?
- What were the main sources of errors?
- How could the architecture be improved?

## 🏥 Clinical Impact
Reflect on the clinical applications:

### Current Capabilities
- How reliable is the model for clinical use?
- What are the limitations of the current implementation?
- How does it compare to human expert performance?

### Future Improvements
- What additional validation would be needed for clinical deployment?
- How could the model be integrated into clinical workflows?
- What safety measures should be implemented?

## 🚀 Next Steps
Consider these potential improvements:

### Technical Enhancements
- Implement additional data augmentation techniques
- Experiment with different model architectures
- Add uncertainty quantification
- Optimize for inference speed

### Clinical Integration
- Develop a user-friendly interface
- Add reporting and visualization tools
- Implement quality assurance measures
- Design clinical validation studies

Write your answers and reflections below:


# 7️⃣ Discussion Questions

Please answer the following questions based on your implementation and results:

1. **Data Analysis**
   - What challenges did you encounter with the medical imaging data?
   - How did you handle class imbalance?

2. **Model Performance**
   - How well did the model perform on different classes?
   - What were the main sources of error?

3. **Clinical Relevance**
   - How might this model be useful in a clinical setting?
   - What additional validation would be needed?

4. **Improvements**
   - What modifications could improve the model's performance?
   - How could the preprocessing pipeline be enhanced?

Write your answers below:

1. Data Analysis:
   > Your answer here

2. Model Performance:
   > Your answer here

3. Clinical Relevance:
   > Your answer here

4. Improvements:
   > Your answer here
