# IgT5 + ESM-2 Training - ULTRA SPEED v2.6 (SIMPLIFIED)

**No file writing - just direct execution!**

Expected: 12-20× faster than baseline (~3-4 hours for 50 epochs)

## Step 1: Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

import os
os.chdir('/content/drive/MyDrive/AbAg_Training')
print(f"Current directory: {os.getcwd()}")

# Check available storage
print("\n" + "="*60)
print("Storage Check:")
!df -h /content/drive/MyDrive | grep -v Filesystem
print("="*60)

## Step 2: Install Dependencies

In [None]:
print("Installing dependencies...")
!pip install -q transformers pandas scipy scikit-learn tqdm sentencepiece faesm bitsandbytes accelerate
print("✓ All dependencies installed")

## Step 3: Verify Installation

In [None]:
import torch

print("\n" + "="*60)
print("INSTALLATION VERIFICATION")
print("="*60)

print(f"\n✓ PyTorch: {torch.__version__}")
print(f"✓ CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"✓ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✓ BFloat16 supported: {torch.cuda.is_bf16_supported()}")

# Check FlashAttention
try:
    import faesm
    print("\n✓✓✓ FAESM: FlashAttention available!")
except ImportError:
    print("\n⚠ FAESM not installed - will use PyTorch SDPA")

# Check BitsAndBytes
try:
    import bitsandbytes
    print("✓✓✓ BitsAndBytes: INT8 quantization available!")
except ImportError:
    print("⚠ BitsAndBytes not installed - will use BFloat16 only")

print("\n" + "="*60)
print("Ready to train!")
print("="*60)
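
The two `try`/`except ImportError` checks above can also be written as a lookup that does not import the packages at all. The following is a minimal sketch, not part of the training script: the helper names `is_installed` and `report_optional` are made up for illustration, and the fallback messages are the ones printed above. Note that `importlib.util.find_spec` only confirms a package can be located; it will not catch a package that is installed but fails at import time.

```python
import importlib.util

# Optional accelerators and the fallback used when each is missing
# (fallback wording taken from the verification cell above).
OPTIONAL_PACKAGES = {
    "faesm": "will use PyTorch SDPA",
    "bitsandbytes": "will use BFloat16 only",
}


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialised packages.
        return False


def report_optional(packages=OPTIONAL_PACKAGES):
    """Build one status line per optional package."""
    lines = []
    for name, fallback in packages.items():
        if is_installed(name):
            lines.append(f"{name}: available")
        else:
            lines.append(f"{name}: not installed - {fallback}")
    return lines


for line in report_optional():
    print(line)
```

This is quicker than importing both packages, but the `import`-based check in the cell above remains the stronger test.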

## Step 4: Check that the training script exists (upload it if not)

In [None]:
import os

if os.path.exists('train_ultra_speed_v26.py'):
    print("✓ Training script found!")
    !ls -lh train_ultra_speed_v26.py
else:
    print("⚠ Training script not found!")
    print("\nPlease upload train_ultra_speed_v26.py to your Google Drive at:")
    print("/content/drive/MyDrive/AbAg_Training/")
    print("\nOr run this to upload from your computer:")
    print("from google.colab import files")
    print("uploaded = files.upload()")
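
The cell above only prints the upload instructions. If you would rather have the upload happen in the same step, something like the sketch below might work. It assumes the notebook is running in Colab, where `files.upload()` opens a file picker and saves the chosen files to the current working directory; the helper name `ensure_training_script` is hypothetical. Outside Colab it simply reports that no upload is possible.

```python
from pathlib import Path

SCRIPT_NAME = "train_ultra_speed_v26.py"


def ensure_training_script(name=SCRIPT_NAME):
    """Return True if the script is present, prompting for an upload in Colab."""
    if Path(name).exists():
        return True
    try:
        # Only importable inside a Colab runtime.
        from google.colab import files
    except ImportError:
        print(f"{name} not found and google.colab is unavailable; copy it manually.")
        return False
    # Opens the browser file picker; uploaded files land in the current directory.
    files.upload()
    return Path(name).exists()
```

Calling `ensure_training_script()` from the `AbAg_Training` directory set in Step 1 places the uploaded script on Drive, next to the checkpoints.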

## Step 5: Start Training! 🚀

**Expected**: ~3-4 hours for 50 epochs (12-20× faster than baseline)

The script auto-detects Colab and uses optimized settings.
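
How the script detects Colab is not shown in this notebook. One common approach, sketched below purely as an assumption, is to check whether the `google.colab` package can be located. Because the script runs in a separate process via `!python`, checking `sys.modules` would not work there, so the sketch uses `importlib.util.find_spec` instead. The settings dictionary holds placeholder values for illustration, not the script's real configuration; only `outputs_max_speed` comes from this notebook.

```python
import importlib.util


def running_in_colab():
    """Best-effort check: True if the google.colab package can be located."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent `google` package is missing or malformed.
        return False


def choose_settings(in_colab):
    """Pick a settings profile. All values here are illustrative placeholders."""
    if in_colab:
        return {"output_dir": "outputs_max_speed", "num_workers": 2}
    return {"output_dir": "outputs_local", "num_workers": 0}


settings = choose_settings(running_in_colab())
print(f"Colab detected: {running_in_colab()}; settings: {settings}")
```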

In [None]:
# Run training - auto-resumes from checkpoint if exists
!python train_ultra_speed_v26.py

## Step 6: Monitor Progress

In [None]:
import torch
from pathlib import Path
import time

checkpoint_path = 'outputs_max_speed/checkpoint_latest.pth'
if Path(checkpoint_path).exists():
    checkpoint = torch.load(checkpoint_path, map_location='cpu')
    print(f"Epoch: {checkpoint['epoch'] + 1}/50")
    print(f"Batch: {checkpoint['batch_idx'] + 1}")
    print(f"Best Spearman: {checkpoint['best_val_spearman']:.4f}")

    elapsed = time.time() - checkpoint['timestamp']
    print(f"\nLast saved: {elapsed/60:.1f} minutes ago")
else:
    print("No checkpoint found yet - training just started")
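
The monitoring cell above raises a `KeyError` if the checkpoint lacks any of the four keys it reads, which could happen with a checkpoint saved by a different version of the script. The sketch below is a more tolerant variant that works on the already-loaded dictionary. The function name `summarize_checkpoint` and the `total_epochs` and `now` parameters are illustrative additions; the key names are the ones used in the cell above.

```python
import time


def summarize_checkpoint(checkpoint, total_epochs=50, now=None):
    """Return printable status lines, skipping any key the checkpoint lacks."""
    now = time.time() if now is None else now
    lines = []

    epoch = checkpoint.get("epoch")
    if epoch is not None:
        # Stored value is zero-based, so add 1 for display.
        lines.append(f"Epoch: {epoch + 1}/{total_epochs}")

    batch_idx = checkpoint.get("batch_idx")
    if batch_idx is not None:
        lines.append(f"Batch: {batch_idx + 1}")

    best = checkpoint.get("best_val_spearman")
    if best is not None:
        lines.append(f"Best Spearman: {best:.4f}")

    saved_at = checkpoint.get("timestamp")
    if saved_at is not None:
        lines.append(f"Last saved: {(now - saved_at) / 60:.1f} minutes ago")

    return lines or ["Checkpoint has none of the expected keys"]
```

Pass it the dictionary returned by `torch.load(checkpoint_path, map_location='cpu')` and print each line.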

## Step 7: Check Disk Space

In [None]:
!df -h / | grep -v Filesystem

print("\nDisk usage breakdown:")
!du -sh outputs_max_speed 2>/dev/null || echo "No checkpoints yet"
!du -sh ~/.cache/huggingface 2>/dev/null || echo "No HF cache"

print("\n💡 v2.6 auto-cleans when disk > 150GB")
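
To see how close the runtime is to that 150GB threshold, the check can be done in Python rather than by reading `df` output. This is only a sketch: it assumes the threshold refers to used space on the root filesystem and counts in binary gigabytes, as `df -h` does. What v2.6 actually deletes when it cleans up is not covered here, and the helper names are hypothetical.

```python
import shutil

BYTES_PER_GB = 1024 ** 3  # binary gigabytes, matching `df -h`
CLEANUP_THRESHOLD_GB = 150.0  # threshold quoted in the message above


def used_gb(path="/"):
    """Return used space on the filesystem containing `path`, in GB."""
    return shutil.disk_usage(path).used / BYTES_PER_GB


def needs_cleanup(used, threshold_gb=CLEANUP_THRESHOLD_GB):
    """True when used space strictly exceeds the threshold."""
    return used > threshold_gb


current = used_gb("/")
print(f"Used: {current:.1f} GB; above cleanup threshold: {needs_cleanup(current)}")
```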