# Alignment Handbook - Development Setup for Colab

This notebook sets up your [forked Alignment Handbook](https://github.com/pbjornsson/alignment-handbook) for development on Google Colab.

## Development Workflow:
1. **Local (VS Code):** Make changes to your fork
2. **Local (Git):** Commit and push to GitHub
3. **Colab:** Pull changes and test with GPU

**Before running:**
1. Go to `Runtime` → `Change runtime type`
2. Select `T4 GPU` (free tier) or `A100 GPU` (Colab Pro)
3. Click `Save`

## Step 1: Verify GPU Access

In [None]:
!nvidia-smi
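
The same check can be scripted so that later cells fail early when no GPU is attached. This is a minimal sketch, assuming `nvidia-smi` is on the `PATH` whenever a GPU runtime is active; the helper name `gpu_summary` is hypothetical.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU name and total memory reported by nvidia-smi, or None."""
    # nvidia-smi is missing on CPU-only runtimes
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code usually means the driver cannot see a GPU
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected: check Runtime → Change runtime type")
```

Returning `None` instead of raising keeps the cell usable on CPU runtimes, where you may still want to browse the repository.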

## Step 2: Mount Google Drive (Recommended)

Mount Google Drive to save your trained models and outputs across Colab sessions.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
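
Once Drive is mounted, it helps to resolve a single output directory up front. The sketch below assumes the default mount layout (`/content/drive/MyDrive`) and reuses the `alignment-handbook/experiments` folder that appears later in this notebook; `resolve_output_dir` is a hypothetical helper, not part of the handbook.

```python
from pathlib import Path


def resolve_output_dir(drive_root="/content/drive/MyDrive",
                       fallback_root="/content",
                       subdir="alignment-handbook/experiments"):
    """Pick an output directory on Drive if it is mounted, else under fallback_root."""
    drive = Path(drive_root)
    # drive.mount() exposes MyDrive; if it is missing, Drive is not mounted
    base = drive if drive.is_dir() else Path(fallback_root)
    out = base / subdir
    out.mkdir(parents=True, exist_ok=True)
    return out
```

In Colab, calling `resolve_output_dir()` with no arguments should return the Drive-backed path after a successful mount, and a path under `/content` otherwise. Outputs written to the fallback path are lost when the runtime disconnects.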

## Step 3: Clone Your Fork

**Run this cell only once** (first time setting up). It clones your fork from GitHub.

In [None]:
# Clone your fork
!git clone https://github.com/pbjornsson/alignment-handbook.git
%cd alignment-handbook

# Configure git (optional, for commits from Colab)
!git config --global user.name "pbjornsson"
!git config --global user.email "your-email@example.com"  # Update with your email

!pwd
!git branch

## Step 3b: Pull Latest Changes (Development Workflow)

**Run this cell after pushing changes from VS Code** to sync your latest code to Colab.

**Skip Step 3** above if you've already cloned the repo and just need to pull updates.

In [None]:
%cd /content/alignment-handbook

# Pull latest changes from your fork
!git pull origin main  # Or replace 'main' with your branch name

# If you're working on a specific branch:
# !git fetch origin
# !git checkout your-branch-name
# !git pull origin your-branch-name

print("\n✓ Code synced!")
!git log -1 --oneline
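
Steps 3 and 3b can be folded into one idempotent cell that clones on the first run and pulls afterwards. The following is a sketch under the assumption that the fork lives at `/content/alignment-handbook`; `sync_commands` is a hypothetical helper that only builds the git commands, so you can inspect them before running anything.

```python
from pathlib import Path

REPO_URL = "https://github.com/pbjornsson/alignment-handbook.git"
REPO_DIR = "/content/alignment-handbook"


def sync_commands(repo_dir=REPO_DIR, repo_url=REPO_URL, branch="main"):
    """Return the git commands needed to bring repo_dir up to date."""
    if (Path(repo_dir) / ".git").is_dir():
        # Already cloned: fetch and fast-forward the chosen branch
        return [
            ["git", "-C", repo_dir, "fetch", "origin"],
            ["git", "-C", repo_dir, "checkout", branch],
            ["git", "-C", repo_dir, "pull", "--ff-only", "origin", branch],
        ]
    # First run: clone the fork and check out the chosen branch
    return [["git", "clone", "--branch", branch, repo_url, repo_dir]]


for cmd in sync_commands():
    print(" ".join(cmd))
```

To execute the commands, pass each list to `subprocess.run(cmd, check=True)`. The `--ff-only` flag makes the pull stop instead of creating a merge commit when Colab has local commits that diverge from GitHub.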

## Step 3c: Switch Branch (Optional)

If you're working on a feature branch, use this cell to switch branches:

In [None]:
# Set BRANCH_NAME to the branch you want to check out
BRANCH_NAME = "main"  # e.g. "feature-branch-name"

%cd /content/alignment-handbook
!git fetch origin
!git checkout {BRANCH_NAME}
!git pull origin {BRANCH_NAME}

print(f"\n✓ Switched to branch: {BRANCH_NAME}")
!git branch --show-current
!git log -1 --oneline

## Step 4: Install Dependencies

This will take several minutes. We'll install:
- PyTorch 2.6.0 with CUDA 12.6 support
- Alignment Handbook in **editable mode** (so code changes take effect immediately)
- Flash Attention 2 for faster training

In [None]:
# Install uv package manager (faster than pip)
!pip install -q uv

In [None]:
# Install PyTorch 2.6.0 + torchvision with CUDA 12.6
# IMPORTANT: Install torchvision with torch to avoid version conflicts
!uv pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu126 --system

In [None]:
# Install alignment-handbook in EDITABLE mode, together with its dependencies
# This allows you to modify code and see changes without reinstalling
%cd /content/alignment-handbook
!uv pip install -e . --system

### Alternative: Full Install (if editable mode has issues)

If the editable install above doesn't work, use this instead:

In [None]:
# Alternative: Standard installation
# Uncomment and run if editable mode fails
# %cd /content/alignment-handbook
# !uv pip install . --system

In [None]:
# Install Flash Attention 2 (this takes a few minutes to compile)
!uv pip install "flash-attn==2.7.4.post1" --no-build-isolation --system
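
After the install cells finish, a quick comparison of installed versions against the pins can catch a silently replaced package. This is a sketch that uses only the standard library; the pins are copied from the install cells above, and the helper names are hypothetical.

```python
from importlib import metadata

# Pins taken from the install cells above
PINS = {"torch": "2.6.0", "torchvision": "0.21.0", "flash-attn": "2.7.4.post1"}


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pins, installed):
    """Return {name: (expected, actual)} for packages that differ from the pin."""
    mismatches = {}
    for name, expected in pins.items():
        actual = installed.get(name)
        # Wheels from the PyTorch index carry a local tag such as "+cu126"
        base = actual.split("+")[0] if actual else None
        if base != expected:
            mismatches[name] = (expected, actual)
    return mismatches


print(find_mismatches(PINS, installed_versions(PINS)))
```

An empty dictionary means every pinned package is present at the expected version. Any entry shows the expected and actual versions side by side.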

## Step 4b: Reload Code After Changes (Development Workflow)

**After pulling new changes from Step 3b**, you may need to reload Python modules.

For most changes, this is **not needed** if you used editable install. But if changes don't appear:

In [None]:
# Option 1: Restart runtime (nuclear option)
# Runtime → Restart Runtime

# Option 2: Clear cached modules so the next import loads the updated code
import sys

# Match the alignment package and its submodules only
modules_to_clear = [m for m in sys.modules if m == 'alignment' or m.startswith('alignment.')]
for module in modules_to_clear:
    del sys.modules[module]

print(f"✓ Cleared {len(modules_to_clear)} cached modules; re-import them to pick up changes")

# Option 3: If you made major changes, reinstall
# !pip install -e . --no-deps --force-reinstall
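
The cache-clearing approach in Option 2 can be wrapped in a reusable helper. The sketch below demonstrates it on stand-in modules with hypothetical names, so it runs without the handbook installed. It matches the package name exactly or as a dotted prefix, so an unrelated package that merely starts with the same letters is left alone.

```python
import sys
import types


def purge_modules(package):
    """Drop `package` and its submodules from the import cache."""
    prefix = package + "."
    names = [m for m in sys.modules if m == package or m.startswith(prefix)]
    for name in names:
        del sys.modules[name]
    return sorted(names)


# Demonstration with stand-in modules (hypothetical names, not real packages)
for name in ("demo_pkg", "demo_pkg.utils", "demo_pkg_extra"):
    sys.modules[name] = types.ModuleType(name)

removed = purge_modules("demo_pkg")
print(removed)  # ['demo_pkg', 'demo_pkg.utils']
```

For this notebook the call would be `purge_modules("alignment")`. Names bound earlier with `from alignment import ...` still point at the old objects, so re-run those import lines after purging.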

## Step 5: Verify Installation

In [None]:
import torch
import transformers
import trl
import peft

print("="*50)
print("SYSTEM INFO")
print("="*50)
print(f"✓ PyTorch version: {torch.__version__}")
print(f"✓ CUDA available: {torch.cuda.is_available()}")
print(f"✓ CUDA version: {torch.version.cuda}")
print(f"✓ GPU device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")
print(f"✓ Transformers version: {transformers.__version__}")
print(f"✓ TRL version: {trl.__version__}")
print(f"✓ PEFT version: {peft.__version__}")

# Check Flash Attention
try:
    import flash_attn
    print(f"✓ Flash Attention 2 installed: {flash_attn.__version__}")
except ImportError:
    print("✗ Flash Attention 2 not installed")

# Check alignment-handbook installation
print("\n" + "="*50)
print("ALIGNMENT HANDBOOK")
print("="*50)
try:
    import alignment
    print(f"✓ Alignment module loaded from: {alignment.__file__}")
    print(f"  (Editable install: {'/content/alignment-handbook' in alignment.__file__})")
except ImportError as e:
    print(f"✗ Alignment module not found: {e}")

# Show current git status
print("\n" + "="*50)
print("GIT STATUS")
print("="*50)
%cd /content/alignment-handbook
!git branch --show-current
!git log -1 --oneline
print("\n✓ Setup complete!")
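
A successful `import flash_attn` does not guarantee that the kernels can run on the attached GPU. The sketch below assumes that FlashAttention-2 targets compute capability 8.0 or newer (Ampere and later), which would exclude the T4 at 7.5. The helper names are hypothetical, and `"sdpa"` is used as the fallback value for the `attn_implementation` model option.

```python
# Assumption: FlashAttention-2 kernels target compute capability 8.0 or newer
FA2_MIN_CAPABILITY = (8, 0)


def supports_flash_attn2(capability):
    """True if a (major, minor) compute capability meets the FA2 minimum."""
    return tuple(capability) >= FA2_MIN_CAPABILITY


def pick_attn_implementation(capability):
    """Choose a value for the `attn_implementation` model option."""
    return "flash_attention_2" if supports_flash_attn2(capability) else "sdpa"


# T4 reports (7, 5); A100 reports (8, 0)
for gpu, cap in {"T4": (7, 5), "A100": (8, 0)}.items():
    print(gpu, pick_attn_implementation(cap))
```

In a GPU runtime, `torch.cuda.get_device_capability(0)` returns the tuple to pass in. If the result is `"sdpa"`, set `attn_implementation` accordingly in the recipe config before launching training.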

## Step 6: Login to Hugging Face

You'll need a Hugging Face account and access token:
1. Go to [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Create a new token with `write` access
3. Copy the token and paste it below when prompted

In [None]:
from huggingface_hub import login
login()
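
If you prefer not to paste the token on every session, it can be read from the `HF_TOKEN` environment variable instead. This is a minimal sketch; `get_hf_token` is a hypothetical helper, and how the variable gets set (for example through Colab's secrets panel) is left as an assumption.

```python
import os


def get_hf_token(env=None):
    """Return a Hugging Face token from the environment, or None if unset."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


token = get_hf_token()
print("Token found" if token else "No HF_TOKEN set; fall back to login()")
```

When a token is found, `login(token=token)` skips the interactive prompt. Avoid printing the token itself or committing it to the repository.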

## You're Ready! 🎉

### Development Workflow Recap:

1. **Make changes locally in VS Code** (with Claude Code assistance)
2. **Commit and push:**
   ```bash
   git add .
   git commit -m "Your changes"
   git push origin main  # or your-branch-name
   ```
3. **In Colab, run Step 3b** to pull your changes
4. **Test your changes** in the cells below

### Quick Commands:

**Explore recipes:**

In [None]:
%cd /content/alignment-handbook
!ls -la recipes/

**Run a training example:**

In [None]:
# Example: Supervised Fine-Tuning (SFT)
# Uncomment to run:
# !accelerate launch scripts/run_sft.py recipes/zephyr-7b-beta/sft/config_full.yaml

**Monitor training with Weights & Biases (optional):**

In [None]:
# Optional: Login to wandb for experiment tracking
# !pip install -q wandb
# !wandb login

## Example: View a Training Config

In [None]:
%cd /content/alignment-handbook
!cat recipes/zephyr-7b-beta/sft/config_full.yaml

## Example: Create Custom Config for Testing

Modify a config to:
- Save outputs to Google Drive
- Use smaller batch sizes for T4 GPU
- Enable faster iteration for testing

In [None]:
import yaml

# Read the original config
with open('recipes/zephyr-7b-beta/sft/config_full.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Modify for Colab development
config['output_dir'] = '/content/drive/MyDrive/alignment-handbook/experiments/test-run'
config['per_device_train_batch_size'] = 1  # Reduce for T4 GPU
config['gradient_accumulation_steps'] = 4   # Effective batch size of 4 on a single GPU
config['max_steps'] = 100                   # Quick test run
config['save_steps'] = 50                   # Save checkpoint mid-run
config['logging_steps'] = 10                # Frequent logging

# Save modified config
config_path = 'config_colab_test.yaml'
with open(config_path, 'w') as f:
    yaml.dump(config, f)

print(f"✓ Modified config saved to: {config_path}")
print(f"  Output directory: {config['output_dir']}")
print(f"  Max steps: {config['max_steps']}")
print("\nTo run training:")
print(f"  !accelerate launch scripts/run_sft.py {config_path}")
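
When shrinking the per-device batch size, it is worth computing the effective batch size explicitly, since it drives how the learning rate behaves. The sketch below uses the standard product of per-device batch size, accumulation steps, and device count. The multi-GPU numbers are hypothetical and only there for comparison.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values from the Colab override above, on a single GPU
print(effective_batch_size(1, 4))  # 4
# Hypothetical multi-GPU setting for comparison: 16 per device on 8 GPUs
print(effective_batch_size(16, 1, num_devices=8))  # 128
```

If the original recipe was tuned for a much larger effective batch size, raise `gradient_accumulation_steps` or lower the learning rate for anything beyond a smoke test.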

## Testing Your Code Changes

Use these cells to test changes you made to the alignment-handbook code:

In [None]:
# Test your module imports
from alignment import model_utils  # Example - adjust to your changes

# Add your testing code here
print("Testing custom changes...")
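
Before testing a change, it can help to confirm that Python is importing the code from your clone and not a stale copy in `site-packages`. A minimal sketch, assuming the clone lives at `/content/alignment-handbook`; `is_loaded_from` is a hypothetical helper that compares path components rather than substrings.

```python
from pathlib import Path


def is_loaded_from(module_file, repo_root="/content/alignment-handbook"):
    """True if module_file lives inside repo_root (i.e. an editable install)."""
    module_path = Path(module_file).resolve()
    root = Path(repo_root).resolve()
    return root == module_path or root in module_path.parents


print(is_loaded_from("/content/alignment-handbook/src/alignment/__init__.py"))  # True
print(is_loaded_from("/content/alignment-handbook-old/src/alignment/__init__.py"))  # False
```

With the handbook imported, call `is_loaded_from(alignment.__file__)`. A `False` result suggests the standard (non-editable) install is shadowing your working copy.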

## Important Notes

### Runtime Management:
- **Colab free tier:** ~12 hours max, can disconnect earlier
- **Save checkpoints regularly** to Google Drive
- **Files in /content/** are deleted when runtime disconnects
- **Files in /content/drive/** persist across sessions
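
If a run writes to local disk for speed, checkpoints can be copied to Drive periodically. The sketch below assumes the trainer saves folders named `checkpoint-<step>`; `backup_checkpoints` is a hypothetical helper and the paths in the usage note are placeholders.

```python
import shutil
from pathlib import Path


def backup_checkpoints(src_dir, dst_dir):
    """Copy checkpoint-* folders from src_dir to dst_dir, skipping existing ones."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for ckpt in sorted(src.glob("checkpoint-*")):
        target = dst / ckpt.name
        # Only copy directories that are not already present at the destination
        if ckpt.is_dir() and not target.exists():
            shutil.copytree(ckpt, target)
            copied.append(ckpt.name)
    return copied
```

A call such as `backup_checkpoints("/content/outputs/test-run", "/content/drive/MyDrive/alignment-handbook/experiments/test-run")` returns the names of newly copied checkpoints. Run it between training stages and not while a checkpoint is being written, since a half-written folder would be copied as-is.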

### Memory Management (T4 GPU = 16GB VRAM):
- Use smaller batch sizes: `per_device_train_batch_size: 1`
- Increase gradient accumulation: set `gradient_accumulation_steps` to a value between 4 and 8
- Use LoRA/QLoRA for large models
- Enable gradient checkpointing in config
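
A back-of-the-envelope estimate shows why these measures matter on a T4. The sketch below assumes a common rule of thumb of about 16 bytes per parameter for mixed-precision training with Adam (2 for half-precision weights, 2 for gradients, 12 for fp32 master weights and the two optimizer moments), and it ignores activations entirely. Treat the result as a rough lower bound, not a measurement.

```python
def full_finetune_memory_gb(num_params, bytes_per_param=16):
    """Rough memory for weights, gradients and Adam state, excluding activations."""
    return num_params * bytes_per_param / 1e9


T4_VRAM_GB = 16
needed = full_finetune_memory_gb(7e9)
print(f"7B full fine-tune: ~{needed:.0f} GB vs {T4_VRAM_GB} GB on a T4")
```

Under these assumptions a 7B model needs roughly 112 GB before activations, far beyond a single T4. That is why parameter-efficient methods such as LoRA or QLoRA are the practical route on the free tier.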

### Development Tips:
- With an editable install, Python changes take effect on the next fresh import, with no reinstall
- Config file changes always take effect (no reinstall needed)
- If changes don't appear, use Step 4b to reload modules
- Test with small `max_steps` values first

### Troubleshooting:
- **Out of memory:** Reduce batch size, use gradient checkpointing
- **Code changes not appearing:** Run Step 4b to reload modules
- **Git conflicts:** Commit/stash local changes before pulling
- **Slow training:** Check Flash Attention is installed correctly (Flash Attention 2 needs an Ampere or newer GPU such as the A100; the T4 is not supported)