# Project Aether - Kaggle Notebook Setup

This notebook sets up and runs Project Aether on Kaggle with GPU support.

## Features
- Automatic GPU detection and setup
- Model downloads (Stable Diffusion 1.4 - less censored for research)
- All three phases: Probe Training, PPO Training, Evaluation
- **Empirical layer sensitivity measurement** (FID & SSR) for optimal intervention points
- Optimized for Kaggle's P100 GPU (16GB VRAM)
- **Nudity-focused** content filtering for clearer concept boundaries

## Important Notes
- **Model:** Uses `CompVis/stable-diffusion-v1-4` (less censored than SD 1.5)
- **Focus:** Nudity-only content (not gore/violence) for better probe training
- **Filtering:** Strict thresholds (≥50% nudity, ≥60% inappropriate, hard prompts only)
- **Kaggle:** roughly 30 hours of GPU quota per week, with a 9-hour GPU limit per session (limits change over time, so check your account's current quota)
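
The filtering thresholds listed above amount to a simple predicate over prompt records. The sketch below is illustrative only: the field names (`hard`, `nudity_percentage`, `inappropriate_percentage`) are assumed to match the I2P dataset columns, and `keep_prompt` and `filter_prompts` are hypothetical helpers, not the logic inside `scripts/collect_latents.py`.

```python
from typing import Dict, List

# Thresholds taken from the notes above; the field names are assumptions.
MIN_NUDITY_PCT = 50.0
MIN_INAPPROPRIATE_PCT = 60.0


def keep_prompt(record: Dict, hard_only: bool = True) -> bool:
    """Return True if a prompt record passes all three filters."""
    if hard_only and int(record.get("hard", 0)) != 1:
        return False
    if float(record.get("nudity_percentage", 0.0)) < MIN_NUDITY_PCT:
        return False
    return float(record.get("inappropriate_percentage", 0.0)) >= MIN_INAPPROPRIATE_PCT


def filter_prompts(records: List[Dict]) -> List[Dict]:
    """Keep only the records that satisfy keep_prompt."""
    return [r for r in records if keep_prompt(r)]


# Toy records for illustration only (not real dataset rows).
toy_records = [
    {"prompt": "a", "hard": 1, "nudity_percentage": 80.0, "inappropriate_percentage": 90.0},
    {"prompt": "b", "hard": 0, "nudity_percentage": 80.0, "inappropriate_percentage": 90.0},
    {"prompt": "c", "hard": 1, "nudity_percentage": 10.0, "inappropriate_percentage": 90.0},
    {"prompt": "d", "hard": 1, "nudity_percentage": 50.0, "inappropriate_percentage": 60.0},
]
kept = [r["prompt"] for r in filter_prompts(toy_records)]
print(kept)
```

Because the thresholds are inclusive, a record sitting exactly at 50% nudity and 60% inappropriate is kept, which matches the "≥" in the notes.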

## References
- **FID Metric:** Heusel et al. (2017). "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium." NeurIPS 2017.
- **Linear Probing:** Alain & Bengio (2016). "Understanding Intermediate Layers Using Linear Classifier Probes." arXiv:1610.01644.
- **PPO:** Schulman et al. (2017). "Proximal Policy Optimization Algorithms." arXiv:1707.06347.

## Step 1: Install Dependencies

In [None]:
# Install PyTorch with CUDA 12.1 wheels (Kaggle images ship CUDA 11.8 or 12.1; switch the index URL to cu118 if needed)
!pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install other dependencies
!pip install diffusers transformers accelerate safetensors
!pip install gymnasium numpy scikit-learn matplotlib tqdm
!pip install pyyaml pillow lpips
!pip install datasets  # For I2P dataset
!pip install pytorch-fid  # For FID metric (Heusel et al., 2017)

## Step 2: Setup Repository

In [None]:
# Clone the repository from GitHub and work from its root
!git clone https://github.com/Anastasia-Deniz/project-aether.git
%cd project-aether

## Step 3: Verify GPU and Setup

In [None]:
import torch
import sys
from pathlib import Path

# Verify GPU
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

# Add project to path
sys.path.insert(0, str(Path.cwd()))

# Create necessary directories
!mkdir -p data/latents checkpoints/probes outputs/ppo outputs/evaluation

## Step 4: Phase 1 - Collect Latents

In [None]:
# Collect latents for probe training
# Kaggle P100 has 16GB VRAM, similar to Colab T4
# Using SD 1.4 (20 steps) - less censored than SD 1.5, better for research
# Focus on nudity only with strict quality thresholds
!python scripts/collect_latents.py \
    --num_samples 100 \
    --num_steps 20 \
    --device cuda \
    --model_id CompVis/stable-diffusion-v1-4 \
    --focus_nudity \
    --hard_only \
    --min_inappropriate_pct 60.0 \
    --min_nudity_pct 50.0 \
    --save_images

## Step 5: Phase 1 - Train Probes

Train linear probes at each timestep to measure concept separability.
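
To make "concept separability" concrete, here is a minimal sketch of a per-timestep linear probe. It is not the code in `scripts/train_probes.py`: the latents are synthetic Gaussians whose class gap is made to grow with the timestep index, and `train_linear_probe` is a hypothetical NumPy-only logistic regression used as a stand-in for whatever probe the project trains.

```python
import numpy as np


def train_linear_probe(x, y, lr=0.1, epochs=200):
    """Fit logistic regression (a linear probe) with plain gradient descent."""
    n, d = x.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = probs - y  # per-sample gradient of the BCE loss w.r.t. the logits
        w -= lr * (x.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(w, b, x, y):
    """Fraction of samples on the correct side of the probe's hyperplane."""
    preds = (x @ w + b) > 0
    return float((preds == y.astype(bool)).mean())


rng = np.random.default_rng(0)
num_timesteps, n_per_class, dim = 5, 200, 16
direction = np.ones(dim) / np.sqrt(dim)  # synthetic unit "concept" direction

accuracies = []
for t in range(num_timesteps):
    gap = 0.5 * t  # assumption: classes drift apart as t increases
    x_safe = rng.normal(size=(n_per_class, dim)) - gap * direction
    x_unsafe = rng.normal(size=(n_per_class, dim)) + gap * direction
    x = np.vstack([x_safe, x_unsafe])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    # Random half/half split into training and held-out samples.
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2:]
    w, b = train_linear_probe(x[train], y[train])
    accuracies.append(probe_accuracy(w, b, x[test], y[test]))

for t, acc in enumerate(accuracies):
    print(f"timestep {t}: held-out probe accuracy = {acc:.2f}")
```

Held-out accuracy near 0.5 means the concept is not linearly readable at that timestep; accuracy near 1.0 means a single hyperplane separates the two classes well.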

In [None]:
# Train linear probes
import os
from pathlib import Path

latents_dirs = sorted(Path('data/latents').glob('run_*'), key=os.path.getmtime)
if latents_dirs:
    latest_latents = latents_dirs[-1]
    print(f"Using latents from: {latest_latents}")
    
    # Option A: Train with improved heuristics (faster, default)
    !python scripts/train_probes.py --latents_dir {latest_latents}
    
    # Option B: Train with empirical measurements (better accuracy, requires Step 5.5)
    # !python scripts/train_probes.py --latents_dir {latest_latents} --use_empirical
else:
    print("No latents found! Run Step 4 first.")

## Step 5.5: (Optional) Measure Empirical Layer Sensitivity ⭐ NEW

**Recommended for best results:** Measure FID and SSR empirically instead of using heuristics.

This step runs small steering experiments to measure:
- **Quality preservation**: FID between steered and unsteered images (Heusel et al., 2017)
- **Steering effectiveness**: SSR improvement from steering

**Note:** This takes additional time (~30-60 min) but provides more accurate sensitivity scores.
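
For intuition about the quality metric, the Fréchet distance underlying FID can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: real FID is computed on Inception-v3 features (Heusel et al., 2017), whereas the feature arrays below are random placeholders, and `frechet_distance` is a hypothetical helper rather than the routine used by `scripts/measure_layer_sensitivity.py`. SSR is not covered here.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    d^2 = ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2))
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((S_a S_b)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of S_a S_b, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = float(np.sum((mu_a - mu_b) ** 2))
    cov_term = float(np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
    return mean_term + cov_term


rng = np.random.default_rng(0)
unsteered = rng.normal(size=(2000, 8))  # placeholder "features"
steered = unsteered + 1.0               # same spread, mean shifted by 1 per dim

same = frechet_distance(unsteered, unsteered)
shifted = frechet_distance(unsteered, steered)
print(f"identical sets: {same:.6f}")
print(f"shifted sets:   {shifted:.6f}")
```

A distance near zero means steering left the feature distribution essentially unchanged; larger values indicate that steering moved the image distribution further from the unsteered baseline.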

In [None]:
# Measure empirical layer sensitivity (FID and SSR)
import os
from pathlib import Path

latents_dirs = sorted(Path('data/latents').glob('run_*'), key=os.path.getmtime)
probe_dirs = sorted(Path('checkpoints/probes').glob('run_*'), key=os.path.getmtime)

if latents_dirs:
    latest_latents = latents_dirs[-1]
    
    # Use probe from Step 5 if available
    probe_path = None
    if probe_dirs:
        latest_probe = probe_dirs[-1] / 'pytorch'
        if latest_probe.exists():
            probe_path = str(latest_probe)
            print(f"Using probe: {probe_path}")
    
    print(f"Measuring empirical sensitivity for: {latest_latents}")
    print("This may take 30-60 minutes...")
    
    if probe_path:
        !python scripts/measure_layer_sensitivity.py \
            --latents_dir {latest_latents} \
            --num_samples 20 \
            --device cuda \
            --probe_path {probe_path}
    else:
        !python scripts/measure_layer_sensitivity.py \
            --latents_dir {latest_latents} \
            --num_samples 20 \
            --device cuda
    
    print("\nNow re-run Step 5 with the --use_empirical flag (Option B) to use these measurements!")
else:
    print("No latents found! Run Step 4 first.")

## Step 6: Phase 2 - Train PPO Policy
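
For orientation, the clipped surrogate objective from Schulman et al. (2017) can be written in a few lines. This is a generic sketch of the PPO loss, not the implementation in `scripts/train_ppo.py`; the function name `ppo_clip_loss`, the `clip_eps=0.2` default, and the toy numbers are assumptions made for illustration.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples."""
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the elementwise minimum; negate it to obtain a loss.
    return float(-np.mean(np.minimum(unclipped, clipped)))


# Toy batch: probability ratios of 1.5, 0.5 and 1.0 against the old policy.
old_logp = np.zeros(3)
new_logp = np.log(np.array([1.5, 0.5, 1.0]))
advantages = np.array([1.0, -1.0, 2.0])

loss = ppo_clip_loss(new_logp, old_logp, advantages)
print(f"clipped PPO loss: {loss:.4f}")
```

Taking the minimum of the clipped and unclipped terms removes the incentive to push the probability ratio beyond the clip range in the direction the advantage favors, which keeps each policy update close to the data-collecting policy.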

In [None]:
# Train the PPO policy (reuses the Colab-optimized config, which targets a similar 16 GB GPU)
!python scripts/train_ppo.py --config configs/colab_optimized.yaml

## Step 7: Phase 3 - Evaluate Policy
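
Several cells in this notebook pick the most recent run directory by modification time. That pattern can be factored into a small helper, sketched below; `latest_run` is a hypothetical name and is not part of the repository. The demonstration uses a temporary directory and sets timestamps explicitly so it runs anywhere.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def latest_run(root: Path, pattern: str) -> Optional[Path]:
    """Return the most recently modified match of pattern under root, or None."""
    candidates = sorted(root.glob(pattern), key=os.path.getmtime)
    return candidates[-1] if candidates else None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for i, name in enumerate(["run_a", "run_b"]):
        (root / name).mkdir()
        # Set explicit modification times so the ordering is deterministic.
        os.utime(root / name, (1_000 + i, 1_000 + i))
    newest_name = latest_run(root, "run_*").name
    missing = latest_run(root, "aether_ppo_*")

print(newest_name, missing)
```

Returning `None` for an empty match lets the caller print a clear "run the earlier step first" message instead of failing with an `IndexError`.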

In [None]:
# Evaluate trained policy
import os
from pathlib import Path

ppo_dirs = sorted(Path('outputs/ppo').glob('aether_ppo_*'), key=os.path.getmtime)
probe_dirs = sorted(Path('checkpoints/probes').glob('run_*'), key=os.path.getmtime)

if ppo_dirs and probe_dirs:
    latest_policy = ppo_dirs[-1] / 'final_policy.pt'
    latest_probe = probe_dirs[-1] / 'pytorch'
    
    if latest_policy.exists() and latest_probe.exists():
        print(f"Evaluating: {latest_policy}")
        !python scripts/evaluate_ppo.py \
            --policy_path {latest_policy} \
            --probe_path {latest_probe} \
            --num_samples 50 \
            --device cuda
    else:
        print("Policy or probe not found!")
else:
    print("No training runs or probes found!")

## Step 8: Save Results to Kaggle Output

Kaggle keeps files written under `/kaggle/working/` as the notebook's output when a version is saved.
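
If a single downloadable file is more convenient than a directory tree, the results folder can also be zipped. The sketch below is an optional extra, not a step in the original workflow; `archive_results` is a hypothetical helper, and the demonstration runs in a temporary directory. On Kaggle, `src_dir` would presumably be `/kaggle/working/project-aether-results`.

```python
import shutil
import tempfile
from pathlib import Path


def archive_results(src_dir: Path, archive_base: Path) -> Path:
    """Zip the contents of src_dir into archive_base.zip and return its path."""
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(src_dir)))


# Demonstration in a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "project-aether-results"
    (results / "outputs").mkdir(parents=True)
    (results / "outputs" / "metrics.txt").write_text("placeholder\n")
    archive = archive_results(results, Path(tmp) / "project-aether-results")
    archive_name = archive.name
    archive_ok = archive.exists() and archive.stat().st_size > 0

print(archive_name, archive_ok)
```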

In [None]:
# Copy important outputs to /kaggle/working/ for automatic saving
import shutil
from pathlib import Path

output_dir = Path('/kaggle/working/project-aether-results')
output_dir.mkdir(parents=True, exist_ok=True)

# Copy all important outputs
if Path('outputs').exists():
    shutil.copytree('outputs', output_dir / 'outputs', dirs_exist_ok=True)
if Path('checkpoints').exists():
    shutil.copytree('checkpoints', output_dir / 'checkpoints', dirs_exist_ok=True)
if Path('data/latents').exists():
    shutil.copytree('data/latents', output_dir / 'data/latents', dirs_exist_ok=True)

print(f"✓ Results saved to: {output_dir}")