# Brain Tumor Segmentation with U-Net

This notebook trains a U-Net model for brain tumor segmentation and compares **ARSVD vs SVD** compression methods.

## Pipeline Options:
- **Option A**: Run the complete pipeline in one command (training + compression)
- **Option B**: Step by step (train first, then run the compression experiments separately)

### Quick Start:
1. Mount Google Drive and extract dataset
2. Clone the repository
3. Install dependencies
4. Run Option A (complete pipeline) OR Option B (step-by-step)

## Step 1: Mount Google Drive and Extract Dataset

In [None]:
import os
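# Silence TensorFlow's C++ logging (level 3 hides INFO, WARNING, and ERROR messages)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Let TensorFlow allocate GPU memory on demand instead of reserving it all up front
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'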

from google.colab import drive
drive.mount('/content/drive')

In [None]:
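# Extract the brain tumor dataset
# Upload your dataset zip file to Google Drive first, then update the path below
# The commands in this notebook expect the extracted data under /content/dataset/data
# -o overwrites without prompting, so re-running this cell does not hang; -q keeps the output short
!unzip -o -q /content/drive/MyDrive/brain_tumor_dataset.zip -d /content/dataset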
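
Before extracting, it can help to confirm that the archive has the layout the later cells rely on. The commands below pass `--data_root /content/dataset/data`, which suggests (but does not guarantee) that the zip contains a top-level `data/` folder. The following sketch lists the archive's top-level entries; the helper name `top_level_names` is hypothetical and not part of the repository.

```python
import os
import zipfile

def top_level_names(member_names):
    """Return the sorted, de-duplicated first path component of each archive member."""
    return sorted({name.split("/")[0] for name in member_names if name})

# Path copied from the unzip cell; adjust it if your archive lives elsewhere.
zip_path = "/content/drive/MyDrive/brain_tumor_dataset.zip"

if os.path.exists(zip_path):
    with zipfile.ZipFile(zip_path) as archive:
        entries = top_level_names(archive.namelist())
    print("Top-level entries:", entries)
    if "data" not in entries:
        # Assumption: the pipeline scripts read from <extract dir>/data
        print("No top-level 'data' folder found; adjust --data_root accordingly.")
else:
    print(f"Archive not found: {zip_path}")
```

If the archive uses a different top-level folder, pass that folder to `--data_root` instead of `/content/dataset/data`.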

## Step 2: Clone the GitHub Repository

In [None]:
# Clone the repository
!git clone https://github.com/Innocentaman/arsvd.git

# Navigate to the project directory
%cd arsvd

# List the contents to verify
!ls -la

## Step 3: Install Dependencies

In [None]:
# Install required packages
# TensorFlow comes preinstalled in Colab (2.19.0 when this notebook was written), so it is not listed here
!pip install opencv-python numpy pandas scikit-learn tqdm scipy matplotlib seaborn

---

## OPTION A: Complete Pipeline (One Command)

**This runs EVERYTHING in a single command:**
1. ✅ Train U-Net model
2. ✅ Evaluate on test set
3. ✅ Run ARSVD compression experiments
4. ✅ Run SVD compression experiments
5. ✅ Generate comparison plots

### Recommended for:
- Full training runs (50+ epochs)
- Complete experiments
- Production use

### Parameters:
- `--epochs 1`: Quick smoke test; change to `--epochs 50` or higher for real training
- `--run_compression`: Enables the compression experiments after training
- `--svd_ranks`: Comma-separated fixed ranks to test for SVD
- `--arsvd_taus`: Test different entropy thresholds
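
The two sweep parameters pick the rank in different ways: `--svd_ranks` fixes it in advance, while `--arsvd_taus` sets an entropy threshold from which a rank is derived. The repository's selection rule is not shown in this notebook, so the sketch below is an assumption rather than the actual implementation: it treats the normalized squared singular values as a probability distribution and keeps the smallest rank whose cumulative entropy reaches `tau` times the total entropy. The function names `fixed_rank` and `entropy_rank` are hypothetical.

```python
import math

def fixed_rank(singular_values, rank):
    """Fixed-rank SVD keeps the same number of components whatever the spectrum looks like."""
    return min(rank, len(singular_values))

def entropy_rank(singular_values, tau):
    """Smallest k whose cumulative spectral entropy reaches tau * total entropy (assumed rule)."""
    energy = [s * s for s in singular_values]
    total = sum(energy)
    # Zero singular values contribute nothing to the entropy, so they are skipped
    probs = [e / total for e in energy if e > 0]
    terms = [-p * math.log(p) for p in probs]
    if not terms:
        return 0
    total_entropy = sum(terms)
    if total_entropy == 0:
        # A single non-zero singular value: rank 1 captures everything
        return 1
    running = 0.0
    for k, term in enumerate(terms, start=1):
        running += term
        if running >= tau * total_entropy:
            return k
    return len(terms)

# Two made-up spectra for illustration: one decays quickly, one is flat
fast_decay = [10.0, 1.0, 0.1]
flat = [1.0, 1.0, 1.0, 1.0]
for tau in (0.95, 0.9, 0.85, 0.8):
    print(f"tau={tau}: fast_decay -> {entropy_rank(fast_decay, tau)}, flat -> {entropy_rank(flat, tau)}")
```

Under this rule the chosen rank depends on each layer's spectrum, whereas a fixed rank applies the same cut to every layer.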

In [None]:
# Run complete pipeline: Training + Compression Experiments
# This will take time depending on epochs and number of experiments

# For testing (1 epoch):
!python run_complete_pipeline.py \
  --data_root /content/dataset/data \
  --epochs 1 \
  --batch_size 16 \
  --lr 1e-4 \
  --img_size 256 \
  --patience 20 \
  --lr_patience 5 \
  --svd_ranks "50,100,150" \
  --arsvd_taus "0.95,0.9,0.85,0.8" \
  --run_compression \
  --out_dir ./complete_results

# For actual training (50+ epochs):
# !python run_complete_pipeline.py \
#   --data_root /content/dataset/data \
#   --epochs 50 \
#   --batch_size 16 \
#   --lr 1e-4 \
#   --img_size 256 \
#   --patience 20 \
#   --lr_patience 5 \
#   --svd_ranks "50,100,150" \
#   --arsvd_taus "0.95,0.9,0.85,0.8" \
#   --run_compression \
#   --out_dir ./complete_results

---

## OPTION B: Step-by-Step (Separate Commands)

**Run training and compression separately**

### Recommended for:
- Testing and debugging
- Understanding each step
- Running compression on an existing trained model

### Steps:
1. Train the model first
2. Then run the compression experiments on the trained model

### Step B1: Train U-Net Model Only

In [None]:
# Train and evaluate U-Net model
# Change --epochs 1 to --epochs 50 for actual training

!python run_pipeline.py \
  --data_root /content/dataset/data \
  --out_dir ./artifacts \
  --epochs 1 \
  --batch_size 16 \
  --lr 1e-4 \
  --img_size 256 \
  --patience 20 \
  --lr_patience 5 \
  --seed 42

### Step B2: Run Compression Experiments

In [None]:
# Run ARSVD vs SVD compression experiments on the trained model
# Uses the model saved by the previous step (./artifacts/model.h5)

!python run_compression_pipeline.py \
  --data_root /content/dataset/data \
  --model_path ./artifacts/model.h5 \
  --svd_ranks "50,100,150" \
  --arsvd_taus "0.95,0.9,0.85,0.8" \
  --batch_size 16 \
  --img_size 256 \
  --out_dir ./compression_results
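
To get a feel for what the ranks passed to `--svd_ranks` mean for model size: a rank-k truncated SVD replaces an m × n weight matrix with two factors of shape m × k and k × n (the singular values can be folded into either factor), so the parameter count drops from m·n to k·(m + n). Which layers the repository compresses, and how it reshapes convolution kernels, is not shown in this notebook. The 512 × 4608 shape below is only an illustrative assumption (a 3 × 3 convolution with 512 input and 512 output channels, flattened), and the helper names are hypothetical.

```python
def dense_params(m, n):
    """Parameters in an uncompressed m x n weight matrix."""
    return m * n

def low_rank_params(m, n, k):
    """Parameters in a rank-k factorization: an m x k factor plus a k x n factor."""
    return k * (m + n)

# Illustrative shape only; not taken from the repository
m, n = 512, 4608
for k in (50, 100, 150):
    compressed = low_rank_params(m, n, k)
    ratio = dense_params(m, n) / compressed
    print(f"rank {k}: {compressed:,} parameters, {ratio:.2f}x smaller")
```

Factorization only saves parameters while k < m·n / (m + n), which is 460.8 for this shape, so all three ranks used here would shrink such a layer.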

---

## View Results

The following cells work for **both Option A and Option B**.

Results locations:
- Option A: `./complete_results/`
- Option B: `./artifacts/` (training) and `./compression_results/` (compression)

### Training Results

In [None]:
# Display training metrics
import pandas as pd
import matplotlib.pyplot as plt
import os

# Detect which option was run
if os.path.exists('./complete_results/training/log.csv'):
    log_path = './complete_results/training/log.csv'
    score_path = './complete_results/training/score.csv'
    results_dir = './complete_results/training/results'
    print('Using Option A results (complete pipeline)')
elif os.path.exists('./artifacts/log.csv'):
    log_path = './artifacts/log.csv'
    score_path = './artifacts/score.csv'
    results_dir = './artifacts/results'
    print('Using Option B results (step-by-step)')
else:
    print('No training results found! Please run training first.')

# Load and display training history
if 'log_path' in locals():
    log_df = pd.read_csv(log_path)
    print("Training History:")
    print(log_df.tail())
    
    # Plot training curves
    plt.figure(figsize=(12, 4))
    
    plt.subplot(1, 2, 1)
    plt.plot(log_df['epoch'], log_df['loss'], label='Train Loss')
    plt.plot(log_df['epoch'], log_df['val_loss'], label='Val Loss')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    
    plt.subplot(1, 2, 2)
    plt.plot(log_df['epoch'], log_df['dice_coef'], label='Train Dice Coef')
    plt.plot(log_df['epoch'], log_df['val_dice_coef'], label='Val Dice Coef')
    plt.title('Training and Validation Dice Coefficient')
    plt.xlabel('Epoch')
    plt.ylabel('Dice Coefficient')
    plt.legend()
    
    plt.tight_layout()
    plt.show()

In [None]:
# Display test metrics
if 'score_path' in locals():
    score_df = pd.read_csv(score_path)
    print("\nTest Results:")
    print(f"Average F1 Score: {score_df['F1'].mean():.4f}")
    print(f"Average Jaccard Index: {score_df['Jaccard'].mean():.4f}")
    print(f"Average Recall: {score_df['Recall'].mean():.4f}")
    print(f"Average Precision: {score_df['Precision'].mean():.4f}")
    
    print("\nPer-image results:")
    print(score_df.head(10))
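
The four columns averaged above can all be derived from per-pixel counts of true positives, false positives, and false negatives. How the repository computes them is not shown here, so the following is a minimal sketch under the assumption that both masks are binary and flattened to the same length; `mask_metrics` is a hypothetical helper name. For binary masks the F1 score is the same quantity as the Dice coefficient.

```python
def mask_metrics(y_true, y_pred):
    """Compute F1, Jaccard, Recall, and Precision for two flat binary masks."""
    if len(y_true) != len(y_pred):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Assumption: report 0.0 when the denominator is empty instead of dividing by zero
        return num / den if den else 0.0

    return {
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "Jaccard": ratio(tp, tp + fp + fn),
        "Recall": ratio(tp, tp + fn),
        "Precision": ratio(tp, tp + fp),
    }

# Toy 4-pixel example: one true positive, one false positive, one false negative
truth = [1, 1, 0, 0]
pred = [1, 0, 1, 0]
print(mask_metrics(truth, pred))
```

Jaccard is never larger than F1 for the same prediction, which is why the averaged Jaccard Index printed above should not exceed the averaged F1 score.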

In [None]:
# Display some prediction results
from IPython.display import Image, display

if 'results_dir' in locals() and os.path.exists(results_dir):
    result_images = sorted(os.listdir(results_dir))[:5]  # Display first 5 results
    
    for img_name in result_images:
        img_path = os.path.join(results_dir, img_name)
        print(f"\n{img_name}")
        display(Image(filename=img_path, width=800))

### Compression Results (ARSVD vs SVD)

In [None]:
# Load and display compression results
import os  # needed for the path checks if this cell runs in a fresh kernel
import pandas as pd

# Detect which option was run
if os.path.exists('./complete_results/compression/compression_summary.csv'):
    compression_csv = './complete_results/compression/compression_summary.csv'
    compression_plots = './complete_results/compression/plots'
    print('Using Option A compression results')
elif os.path.exists('./compression_results/compression_summary.csv'):
    compression_csv = './compression_results/compression_summary.csv'
    compression_plots = './compression_results/plots'
    print('Using Option B compression results')
else:
    print('No compression results found! Please run compression experiments first.')

# Display summary table
if 'compression_csv' in locals():
    summary_df = pd.read_csv(compression_csv)
    print("\n" + "="*80)
    print("COMPRESSION EXPERIMENTS SUMMARY")
    print("="*80)
    print(summary_df.to_string(index=False))
    print("="*80)

In [None]:
# Display all comparison plots
from IPython.display import Image, display

if 'compression_plots' in locals() and os.path.exists(compression_plots):
    plot_files = sorted([f for f in os.listdir(compression_plots) if f.endswith('.png')])
    
    for plot_file in plot_files:
        plot_path = os.path.join(compression_plots, plot_file)
        print(f"\n{'='*80}")
        # Strip the extension first: .title() would turn '.png' into '.Png' and defeat a later replace
        print(os.path.splitext(plot_file)[0].replace('_', ' ').title())
        print(f"{'='*80}")
        display(Image(filename=plot_path, width=1000))

### Analysis and Key Findings

In [None]:
# Compare baseline vs compressed models
if 'compression_csv' in locals():
    summary_df = pd.read_csv(compression_csv)
    
    baseline = summary_df[summary_df['method'] == 'none'].iloc[0]
    svd_results = summary_df[summary_df['method'] == 'svd']
    arsvd_results = summary_df[summary_df['method'] == 'arsvd']
    
    print("\n" + "="*80)
    print("KEY FINDINGS")
    print("="*80)
    
    print("\n📊 BASELINE MODEL (No Compression):")
    print(f"   Dice: {baseline['dice']:.4f}")
    print(f"   IoU: {baseline['iou']:.4f}")
    print(f"   F1: {baseline['f1']:.4f}")
    
    print("\n🔧 SVD COMPRESSION (Fixed Rank):")
    for _, row in svd_results.iterrows():
        dice_drop = baseline['dice'] - row['dice']
        print(f"   Rank {int(row['param_value'])}: Dice={row['dice']:.4f} (drop: {dice_drop:.4f})")
    
    print("\n🎯 ARSVD COMPRESSION (Adaptive Rank):")
    for _, row in arsvd_results.iterrows():
        dice_drop = baseline['dice'] - row['dice']
        print(f"   Tau {row['param_value']:.2f}: Dice={row['dice']:.4f} (drop: {dice_drop:.4f})")
    
    # Find best configurations; print the header whenever either method has results
    if len(svd_results) > 0 or len(arsvd_results) > 0:
        print("\n🏆 BEST COMPRESSED MODELS:")
    
    if len(svd_results) > 0:
        best_svd = svd_results.loc[svd_results['dice'].idxmax()]
        print(f"   SVD: Rank {int(best_svd['param_value'])} with Dice={best_svd['dice']:.4f}")
    
    if len(arsvd_results) > 0:
        best_arsvd = arsvd_results.loc[arsvd_results['dice'].idxmax()]
        print(f"   ARSVD: Tau={best_arsvd['param_value']:.2f} with Dice={best_arsvd['dice']:.4f}")
    
    if len(svd_results) > 0 and len(arsvd_results) > 0:
        # Compare the best configuration of each method, treating equal scores as a tie
        if best_arsvd['dice'] > best_svd['dice']:
            print(f"\n✅ ARSVD outperforms SVD by {best_arsvd['dice'] - best_svd['dice']:.4f} Dice points")
        elif best_svd['dice'] > best_arsvd['dice']:
            print(f"\n⚠️  SVD outperforms ARSVD by {best_svd['dice'] - best_arsvd['dice']:.4f} Dice points")
        else:
            print("\nARSVD and SVD reach the same best Dice score")
    
    print("="*80)

## Save Results to Google Drive

In [None]:
# Copy all results to Google Drive for permanent storage

if os.path.exists('./complete_results'):
    !cp -r ./complete_results /content/drive/MyDrive/brain_tumor_complete_results
    print("\n" + "="*80)
    print("✅ OPTION A RESULTS SAVED")
    print("="*80)
    print("Location: /content/drive/MyDrive/brain_tumor_complete_results")
    print("\nContents:")
    print("  - training/model.h5 (trained model)")
    print("  - training/score.csv (test metrics)")
    print("  - training/results/ (sample predictions)")
    print("  - compression/compression_summary.csv (all experiments)")
    print("  - compression/plots/ (comparison visualizations)")
    print("="*80)

# Option B training results (copied even when Option A results also exist)
if os.path.exists('./artifacts'):
    !cp -r ./artifacts /content/drive/MyDrive/brain_tumor_segmentation_results
    print("\n✅ Training results saved to: /content/drive/MyDrive/brain_tumor_segmentation_results")

if os.path.exists('./compression_results'):
    !cp -r ./compression_results /content/drive/MyDrive/brain_tumor_compression_results
    print("✅ Compression results saved to: /content/drive/MyDrive/brain_tumor_compression_results")