# 🏥 Medical Image Segmentation - UNet Training

This notebook trains **UNet** models on different data fractions (10%, 25%, 50%, 100%).

**Runtime:** ~3-4 hours on a Colab GPU (the sum of the per-step estimates)

---

## Step 1: Setup GPU Runtime

⚠️ **IMPORTANT:** Make sure you're using a GPU runtime!
- Go to: **Runtime → Change runtime type → GPU**

In [None]:
# Verify GPU is available
import torch
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠️ WARNING: No GPU detected! Training will be very slow.")

## Step 2: Clone Repository

In [None]:
# Clone the repository
!git clone https://github.com/Prabhat9801/Medical-Image-Segmentation.git
%cd Medical-Image-Segmentation

## Step 3: Install Dependencies

In [None]:
# Install required packages
!pip install -q timm albumentations opencv-python-headless tqdm
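
A quick import check can catch a failed install before training starts. The sketch below is only illustrative: the helper name `find_missing` and the module list are assumptions, not part of the repository. Note that `opencv-python-headless` is imported as `cv2`, so the import name differs from the pip name.

```python
import importlib.util

# Import names, not pip names: opencv-python-headless is imported as cv2.
# This list is an assumption based on the packages used in this notebook.
REQUIRED_MODULES = ["torch", "timm", "albumentations", "cv2", "tqdm", "pandas"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

If anything is reported as missing, rerunning the install cell is usually enough.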

## Step 4: Mount Google Drive & Extract Data

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import zipfile
import os
import shutil
import pandas as pd

# Paths
zip_path = "/content/drive/MyDrive/isic_processed_256.zip"
final_path = "/content/Medical-Image-Segmentation/data/processed/isic"

print("📦 Extracting data with Windows path fix...")

# Remove existing if present
if os.path.exists(final_path):
    shutil.rmtree(final_path)

# Create the directory structure
os.makedirs(final_path, exist_ok=True)
os.makedirs(os.path.join(final_path, "images"), exist_ok=True)
os.makedirs(os.path.join(final_path, "masks"), exist_ok=True)

# Extract and fix Windows paths
with zipfile.ZipFile(zip_path, 'r') as z:
    for file_info in z.filelist:
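        orig_name = file_info.filename
        fixed_name = orig_name.replace('\\', '/')

        # Skip directory entries: they have no content, and opening
        # a target path that ends in a folder name would raise an error
        if fixed_name.endswith('/'):
            continue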
        
        if fixed_name.startswith('isic/'):
            fixed_name = fixed_name[5:]
        
        if 'images/' in fixed_name or (fixed_name.endswith('.png') and 'mask' not in fixed_name.lower()):
            filename = os.path.basename(fixed_name)
            target_path = os.path.join(final_path, "images", filename)
        elif 'masks/' in fixed_name or '_mask.png' in fixed_name:
            filename = os.path.basename(fixed_name)
            target_path = os.path.join(final_path, "masks", filename)
        elif fixed_name.endswith('splits.csv'):
            target_path = os.path.join(final_path, "splits.csv")
        else:
            continue
        
        with z.open(file_info) as source, open(target_path, 'wb') as target:
            shutil.copyfileobj(source, target)

print("✅ Extraction complete!")

# Fix splits.csv paths
print("\n🔧 Fixing splits.csv paths...")
splits_path = os.path.join(final_path, "splits.csv")
if os.path.exists(splits_path):
    df = pd.read_csv(splits_path)
    
    # Fix paths to absolute Colab paths
    df['image_path'] = df['image_path'].apply(
        lambda x: os.path.join(final_path, "images", os.path.basename(x))
    )
    df['mask_path'] = df['mask_path'].apply(
        lambda x: os.path.join(final_path, "masks", os.path.basename(x))
    )
    
    df.to_csv(splits_path, index=False)
    print("✅ splits.csv paths fixed!")

# Verify
import glob
image_count = len(glob.glob(os.path.join(final_path, "images", "*.png")))
mask_count = len(glob.glob(os.path.join(final_path, "masks", "*.png")))

print(f"\n✅ Images: {image_count}")
print(f"✅ Masks: {mask_count}")
print(f"✅ splits.csv: {'Found' if os.path.exists(splits_path) else 'NOT FOUND'}")
print("\n🎉 Data extraction successful!")
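
The counts above do not confirm that every row of `splits.csv` points at a file that exists. A minimal sketch of such a check follows, assuming the `image_path` and `mask_path` columns rewritten in the cell above; the helper `find_missing_files` is hypothetical and not part of the repository.

```python
import csv
import os


def find_missing_files(splits_csv, columns=("image_path", "mask_path")):
    """Return (row_number, column, path) for every listed path that is not a file."""
    missing = []
    with open(splits_csv, newline="") as f:
        for row_number, row in enumerate(csv.DictReader(f), start=1):
            for column in columns:
                path = row[column]
                if not os.path.isfile(path):
                    missing.append((row_number, column, path))
    return missing


# Same location as final_path in the extraction cell
splits_csv = "/content/Medical-Image-Segmentation/data/processed/isic/splits.csv"
if os.path.exists(splits_csv):
    problems = find_missing_files(splits_csv)
    print(f"Entries pointing at missing files: {len(problems)}")
    for row_number, column, path in problems[:5]:
        print(f"  row {row_number}: {column} -> {path}")
```

An empty result means every image and mask referenced by the splits file was extracted.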

## Step 5: Train UNet - 10% Data

**Expected time:** ~15-20 minutes

In [None]:
!python -m src.train \
    --model unet \
    --epochs 50 \
    --batch_size 16 \
    --data_fraction 0.1 \
    --lr 1e-4

print("\n✅ UNet 10% training complete!")

## Step 6: Train UNet - 25% Data

**Expected time:** ~30-40 minutes

In [None]:
!python -m src.train \
    --model unet \
    --epochs 50 \
    --batch_size 16 \
    --data_fraction 0.25 \
    --lr 1e-4

print("\n✅ UNet 25% training complete!")

## Step 7: Train UNet - 50% Data

**Expected time:** ~50-60 minutes

In [None]:
!python -m src.train \
    --model unet \
    --epochs 50 \
    --batch_size 16 \
    --data_fraction 0.5 \
    --lr 1e-4

print("\n✅ UNet 50% training complete!")

## Step 8: Train UNet - 100% Data

**Expected time:** ~90-120 minutes

In [None]:
!python -m src.train \
    --model unet \
    --epochs 50 \
    --batch_size 16 \
    --data_fraction 1.0 \
    --lr 1e-4

print("\n✅ UNet 100% training complete!")
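
The four training cells differ only in `--data_fraction`, so they could also be driven from one loop. The sketch below is one possible way to do that, not the repository's own tooling: `build_train_command` and `run_all` are hypothetical helpers, and the flags are copied from the cells above. With `check=True`, a failed run raises an error and stops the loop, whereas a `!python` cell moves on to its "training complete" message regardless of the exit code.

```python
import subprocess
import sys


def build_train_command(data_fraction, model="unet", epochs=50, batch_size=16, lr="1e-4"):
    """Build the src.train invocation used in the cells above as an argument list."""
    return [
        sys.executable, "-m", "src.train",
        "--model", model,
        "--epochs", str(epochs),
        "--batch_size", str(batch_size),
        "--data_fraction", str(data_fraction),
        "--lr", lr,
    ]


def run_all(fractions=(0.1, 0.25, 0.5, 1.0), dry_run=True):
    """Print each command; run it too when dry_run is False."""
    for fraction in fractions:
        cmd = build_train_command(fraction)
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if training exits non-zero
            subprocess.run(cmd, check=True)


# Dry run: only prints the four commands. Use run_all(dry_run=False) from the
# repository root to actually train.
run_all(dry_run=True)
```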

## Step 9: Save Results to Google Drive

In [None]:
# Copy all experiments to Google Drive
!mkdir -p /content/drive/MyDrive/medical_segmentation_results
!cp -r experiments /content/drive/MyDrive/medical_segmentation_results/unet_experiments

print("\n✅ All UNet results saved to Google Drive!")
print("📁 Location: /content/drive/MyDrive/medical_segmentation_results/unet_experiments")
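
If this cell is rerun after `unet_experiments` already exists on Drive, `cp -r` places a nested `experiments` folder inside it rather than updating the existing copy. A rerun-safe alternative is sketched below using `shutil.copytree` with `dirs_exist_ok=True` (Python 3.8+); the helper `backup_experiments` is hypothetical, and the paths are the ones used above.

```python
import os
import shutil


def backup_experiments(src, dst):
    """Copy src into dst, merging into dst if it already exists.

    Returns the number of files present under dst after the copy.
    """
    if not os.path.isdir(src):
        raise FileNotFoundError(f"Source folder not found: {src}")
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sum(len(files) for _, _, files in os.walk(dst))


drive_root = "/content/drive/MyDrive"
# Only runs when Drive is mounted and the experiments folder exists
if os.path.isdir("experiments") and os.path.isdir(drive_root):
    target = os.path.join(drive_root, "medical_segmentation_results", "unet_experiments")
    print(f"Files in backup: {backup_experiments('experiments', target)}")
```

Files with the same name are overwritten on each run, so the Drive copy tracks the latest local state.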

## 🎉 UNet Training Complete!

**Next steps:**
1. Run `COLAB_TRAIN_UNETPP.ipynb` to train UNet++
2. Run `COLAB_TRAIN_TRANSUNET.ipynb` to train TransUNet
3. Run `COLAB_RESULTS.ipynb` to evaluate and compare all models