# 🏥 Medical Image Segmentation - TransUNet Training (OPTIMIZED)

This notebook trains **TransUNet** models with several speed optimizations:
- ✅ Mixed precision (FP16): roughly 2x faster training
- ✅ 20 epochs (down from 50), which is sufficient for convergence
- ✅ Automatic save to Google Drive after each training run
- ✅ Resume capability: configurations that were already trained are skipped

**Runtime:** ~2 hours total (previously 4+ hours)
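
As a rough illustration of the mixed-precision item above, the sketch below runs a single FP16 training step with PyTorch's `torch.autocast` and a gradient scaler. The tiny convolutional model, the random batch, and the `train_step` name are placeholders chosen for illustration; the repository's `src.train` is not shown here and may implement this differently.

```python
import torch
from torch import nn

# Placeholder setup: a tiny model and a random batch stand in for TransUNet and real data.
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # in this sketch, FP16 autocast is only enabled on a GPU
amp_dtype = torch.float16 if use_amp else torch.bfloat16

model = nn.Conv2d(1, 2, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# Newer PyTorch releases prefer torch.amp.GradScaler("cuda", ...); this form also works on older ones.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

images = torch.randn(4, 1, 32, 32, device=device)
masks = torch.randint(0, 2, (4, 32, 32), device=device)


def train_step(images, masks):
    """Run one optimizer step, using FP16 autocast when a GPU is available."""
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp):
        logits = model(images)            # forward pass in reduced precision
        loss = criterion(logits, masks)
    scaler.scale(loss).backward()         # scale the loss to avoid FP16 gradient underflow
    scaler.step(optimizer)                # unscales gradients, then calls optimizer.step()
    scaler.update()
    return loss.item()


print(f"loss: {train_step(images, masks):.4f}")
```

On a CPU-only runtime the scaler and autocast are disabled, so the step falls back to ordinary FP32 training.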

---

## Step 1: Set Up GPU Runtime

⚠️ **IMPORTANT:** Make sure you're using a GPU runtime!
- Go to: **Runtime → Change runtime type → GPU**

In [12]:
# Verify GPU is available
import torch
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠️ WARNING: No GPU detected! Training will be very slow.")

GPU Available: True
GPU Name: Tesla T4
GPU Memory: 15.83 GB


## Step 2: Clone Repository & Install Dependencies

In [13]:
# Clone the repository
!git clone https://github.com/Prabhat9801/Medical-Image-Segmentation.git
%cd Medical-Image-Segmentation

# Install required packages
!pip install -q timm albumentations opencv-python-headless tqdm

fatal: could not create work tree dir 'Medical-Image-Segmentation': No space left on device
[Errno 2] No such file or directory: 'Medical-Image-Segmentation'
/content/Medical-Image-Segmentation
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/pip/_internal/cli/base_command.py", line 179, in exc_logging_wrapper
    status = run_func(*args)
             ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/pip/_internal/commands/install.py", line 333, in run
    build_tracker = self.enter_context(get_build_tracker())
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/pip/_internal/cli/command_context.py", line 27, in enter_context
    return self._main_context.enter_context(context_provider)
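
The clone and the `pip install` above both failed with "No space left on device". A quick free-space check before cloning can catch this earlier. The sketch below uses only the standard library; the `free_gb` and `has_room` names and the 5 GB threshold are assumptions for illustration, not values taken from the repository.

```python
import shutil


def free_gb(path="/"):
    """Return the free space on the filesystem containing `path`, in GB (1e9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path="/", required_gb=5.0):
    """Return True if at least `required_gb` GB are free (the threshold is an assumed value)."""
    return free_gb(path) >= required_gb


# In Colab the working directory is typically /content; "." keeps the sketch portable.
if has_room(".", required_gb=5.0):
    print(f"OK: {free_gb('.'):.1f} GB free")
else:
    print(f"Low disk space: only {free_gb('.'):.1f} GB free; clean up before cloning")
```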


## Step 3: Mount Google Drive

In [14]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Step 4: Extract & Fix Data (Run Once)

In [15]:
# Run the optimized data extraction script
!python colab_extract_data.py

📦 Extracting data with Windows path fix...
Traceback (most recent call last):
  File "/usr/lib/python3.12/zipfile/__init__.py", line 1433, in _RealGetContents
    endrec = _EndRecData(fp)
             ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/zipfile/__init__.py", line 315, in _EndRecData
    data = fpin.read(sizeEndCentDir)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 28] No space left on device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/Medical-Image-Segmentation/colab_extract_data.py", line 28, in <module>
    with zipfile.ZipFile(zip_path, 'r') as z:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/zipfile/__init__.py", line 1370, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.12/zipfile/__init__.py", line 1435, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
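
The log mentions a "Windows path fix", but `colab_extract_data.py` itself is not shown in this notebook. The sketch below is one plausible way to do such a fix, assuming the archive stores entry names with backslashes: it validates the file with `zipfile.is_zipfile` first, then rewrites each entry name to forward slashes while extracting. The function name `extract_with_posix_paths` is hypothetical.

```python
import os
import shutil
import zipfile


def extract_with_posix_paths(zip_path, dest_dir):
    """Extract a zip whose entry names may use Windows-style backslashes."""
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"Not a valid zip file (missing, truncated, or corrupt): {zip_path}")

    dest_root = os.path.realpath(dest_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            rel_path = info.filename.replace("\\", "/")  # normalize Windows separators
            if rel_path.endswith("/"):
                continue  # directory entry; parent folders are created per file below
            target = os.path.realpath(os.path.join(dest_root, rel_path))
            # Refuse entries that would escape the destination folder.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"Unsafe path in archive: {info.filename}")
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted
```

Checking `is_zipfile` up front turns a half-written archive, such as one produced when the disk fills up, into a clear error message instead of a nested traceback.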


## Step 5: Helper Function - Save to Drive After Each Training

In [16]:
import os
import shutil
from datetime import datetime

def save_to_drive(model_name="transunet"):
    """Save all experiments to Google Drive"""
    drive_dir = f"/content/drive/MyDrive/medical_segmentation_results/{model_name}_experiments"
    local_exp_dir = "/content/Medical-Image-Segmentation/experiments"

    # Create drive directory
    os.makedirs(drive_dir, exist_ok=True)

    # Copy all experiments
    if os.path.exists(local_exp_dir):
        for exp in os.listdir(local_exp_dir):
            src = os.path.join(local_exp_dir, exp)
            dst = os.path.join(drive_dir, exp)

            if os.path.isdir(src):
                # Remove old version if exists
                if os.path.exists(dst):
                    shutil.rmtree(dst)
                # Copy new version
                shutil.copytree(src, dst)
                print(f"✅ Saved: {exp}")

    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print(f"\n🎉 All results saved to Drive at {timestamp}")
    print(f"📁 Location: {drive_dir}")

def check_already_trained(model_name, data_fraction):
    """Check if this configuration was already trained"""
    drive_dir = f"/content/drive/MyDrive/medical_segmentation_results/{model_name}_experiments"

    if not os.path.exists(drive_dir):
        return False

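    # Experiment folders are assumed to be named "<model>_<N>pct_<timestamp>",
    # as in the saved runs (e.g. transunet_10pct_20251206_180113).
    # round() avoids float truncation (0.29 * 100 -> 28.999...), and prefix
    # matching keeps "5pct" from matching a "25pct" run.
    frac_str = f"{round(data_fraction * 100)}pct"
    prefix = f"{model_name}_{frac_str}"
    for exp in os.listdir(drive_dir):
        if exp.startswith(prefix):
            print(f"⏭️  Skipping {model_name} {frac_str} - already trained!")
            print(f"   Found: {exp}")
            return True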

    return False

print("✅ Helper functions loaded!")

✅ Helper functions loaded!
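
`save_to_drive` deletes the old copy with `shutil.rmtree` before `shutil.copytree` starts, so an interrupted copy can leave Drive without either version. One possible variation, sketched below, copies to a temporary sibling folder first and only swaps it in once the copy has finished. The `replace_dir` name and the `.tmp` / `.old` suffixes are hypothetical, and the sketch assumes `os.rename` behaves normally on the mounted Drive.

```python
import os
import shutil


def replace_dir(src, dst):
    """Copy `src` to `dst`, keeping the previous `dst` until the new copy is complete."""
    tmp = dst + ".tmp"   # hypothetical staging folder
    old = dst + ".old"   # hypothetical holding folder for the previous version

    # Clear leftovers from an earlier interrupted run.
    for leftover in (tmp, old):
        if os.path.exists(leftover):
            shutil.rmtree(leftover)

    shutil.copytree(src, tmp)      # if this fails midway, dst is still intact
    if os.path.exists(dst):
        os.rename(dst, old)        # move the previous version aside
    os.rename(tmp, dst)            # swap in the completed copy
    if os.path.exists(old):
        shutil.rmtree(old)         # only now discard the previous version
```

Inside `save_to_drive`, a call such as `replace_dir(src, dst)` could stand in for the `rmtree` plus `copytree` pair.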


## Step 6: Train TransUNet - 10% Data

**Expected time:** ~12 minutes

In [17]:
if not check_already_trained("transunet", 0.1):
    !python -m src.train \
        --model transunet \
        --epochs 20 \
        --batch_size 12 \
        --data_fraction 0.1 \
        --lr 1e-4

    # Save immediately to Drive
    save_to_drive("transunet")
    print("\n✅ TransUNet 10% training complete and saved!")
else:
    print("✅ TransUNet 10% already completed!")

⏭️  Skipping transunet 10pct - already trained!
   Found: transunet_10pct_20251206_180113
✅ TransUNet 10% already completed!


## Step 7: Train TransUNet - 25% Data

**Expected time:** ~20 minutes

In [18]:
if not check_already_trained("transunet", 0.25):
    !python -m src.train \
        --model transunet \
        --epochs 20 \
        --batch_size 12 \
        --data_fraction 0.25 \
        --lr 1e-4

    # Save immediately to Drive
    save_to_drive("transunet")
    print("\n✅ TransUNet 25% training complete and saved!")
else:
    print("✅ TransUNet 25% already completed!")

⏭️  Skipping transunet 25pct - already trained!
   Found: transunet_25pct_20251206_181019
✅ TransUNet 25% already completed!


## Step 8: Train TransUNet - 50% Data

**Expected time:** ~35 minutes

In [19]:
if not check_already_trained("transunet", 0.5):
    !python -m src.train \
        --model transunet \
        --epochs 20 \
        --batch_size 12 \
        --data_fraction 0.5 \
        --lr 1e-4

    # Save immediately to Drive
    save_to_drive("transunet")
    print("\n✅ TransUNet 50% training complete and saved!")
else:
    print("✅ TransUNet 50% already completed!")

⏭️  Skipping transunet 50pct - already trained!
   Found: transunet_50pct_20251206_182411
✅ TransUNet 50% already completed!


## Step 9: Train TransUNet - 100% Data

**Expected time:** ~60 minutes

In [20]:
if not check_already_trained("transunet", 1.0):
    !python -m src.train \
        --model transunet \
        --epochs 20 \
        --batch_size 12 \
        --data_fraction 1.0 \
        --lr 1e-4

    # Save immediately to Drive
    save_to_drive("transunet")
    print("\n✅ TransUNet 100% training complete and saved!")
else:
    print("✅ TransUNet 100% already completed!")

⏭️  Skipping transunet 100pct - already trained!
   Found: transunet_100pct_20251206_185217
✅ TransUNet 100% already completed!
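
Steps 6 to 9 differ only in the `--data_fraction` value, so the four cells could be collapsed into one loop. The sketch below builds the same `python -m src.train` command line for each fraction, using the flags shown in the cells above. The `build_train_command` and `run_all` names, the callback parameters, and the `dry_run` switch are hypothetical helpers added for illustration.

```python
import subprocess
import sys

DATA_FRACTIONS = [0.1, 0.25, 0.5, 1.0]


def build_train_command(data_fraction, model="transunet", epochs=20, batch_size=12, lr="1e-4"):
    """Return the training command as an argument list (same flags as the cells above)."""
    return [
        sys.executable, "-m", "src.train",
        "--model", model,
        "--epochs", str(epochs),
        "--batch_size", str(batch_size),
        "--data_fraction", str(data_fraction),
        "--lr", lr,
    ]


def run_all(fractions=DATA_FRACTIONS, already_trained=None, on_done=None, dry_run=True):
    """Run (or, with dry_run=True, only collect) one training command per data fraction."""
    commands = []
    for fraction in fractions:
        if already_trained is not None and already_trained("transunet", fraction):
            continue  # resume: skip configurations that already have results
        command = build_train_command(fraction)
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)  # raises if training exits with an error
            if on_done is not None:
                on_done("transunet")             # e.g. save results right after each run
    return commands


for command in run_all():
    print(" ".join(command))
```

In the notebook, this could be called as `run_all(already_trained=check_already_trained, on_done=save_to_drive, dry_run=False)` from the repository root.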


## Step 10: Final Summary

In [21]:
import os

drive_dir = "/content/drive/MyDrive/medical_segmentation_results/transunet_experiments"

print("="*60)
print("🎉 TransUNet Training Complete!")
print("="*60)

if os.path.exists(drive_dir):
    experiments = [d for d in os.listdir(drive_dir) if os.path.isdir(os.path.join(drive_dir, d))]
    print(f"\n✅ Total experiments saved: {len(experiments)}")
    print("\n📁 Saved experiments:")
    for exp in sorted(experiments):
        print(f"   - {exp}")
    print(f"\n📍 Location: {drive_dir}")
else:
    print("\n⚠️ No experiments found in Drive")

print("\n" + "="*60)
print("Next step:")
print("Run COLAB_RESULTS.ipynb to evaluate all models")
print("="*60)

🎉 TransUNet Training Complete!

✅ Total experiments saved: 4

📁 Saved experiments:
   - transunet_100pct_20251206_185217
   - transunet_10pct_20251206_180113
   - transunet_25pct_20251206_181019
   - transunet_50pct_20251206_182411

📍 Location: /content/drive/MyDrive/medical_segmentation_results/transunet_experiments

Next step:
Run COLAB_RESULTS.ipynb to evaluate all models
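
The summary lists the 100% run before the 10% run because `sorted()` compares folder names as strings. If ordering by data fraction is preferred, a sort key can parse the percentage out of the name. The sketch below assumes the `<model>_<N>pct_<YYYYMMDD>_<HHMMSS>` naming seen in the output above; `EXPERIMENT_PATTERN` and `experiment_sort_key` are hypothetical names.

```python
import re

# Assumed naming scheme, inferred from the saved experiment folders listed above.
EXPERIMENT_PATTERN = re.compile(r"^(?P<model>.+)_(?P<pct>\d+)pct_(?P<stamp>\d{8}_\d{6})$")


def experiment_sort_key(name):
    """Sort by model, then numeric percentage, then timestamp; unparsed names go last."""
    match = EXPERIMENT_PATTERN.match(name)
    if match is None:
        return (1, name, 0, "")
    return (0, match["model"], int(match["pct"]), match["stamp"])


experiments = [
    "transunet_100pct_20251206_185217",
    "transunet_10pct_20251206_180113",
    "transunet_25pct_20251206_181019",
    "transunet_50pct_20251206_182411",
]

for exp in sorted(experiments, key=experiment_sort_key):
    print(f"   - {exp}")
# Prints the 10pct, 25pct, 50pct, and 100pct runs, in that order.
```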
