# ViZDoom Deep RL Ablation Study

**Master's RL Course Final Project**

This notebook runs the complete ViZDoom ablation study on Google Colab with automatic Drive backup.

## Workflow
1. **Setup** - Install dependencies, mount Drive
2. **Quick Test** - Verify environment works (2 min)
3. **Single Training Test** - Confirm learning works (30 min)
4. **Ablation Phase 1** - Algorithm comparison: DQN vs Deep SARSA
5. **Ablation Phase 2** - Learning rate ablation
6. **Ablation Phase 3** - Extensions: DDQN, Dueling, PER
7. **Analysis** - Generate report and plots

**Important**: Colab sessions time out after roughly 12 hours, so run the ablation phases in separate sessions if a full run will not fit in one.

---
## 1. Environment Setup

**First time?** Run all cells in Section 1.

**Resuming or updating?** Just run the cell below - it handles everything!

In [None]:
# ============================================
# ONE-CLICK SETUP / RESUME / UPDATE
# ============================================
# Run this cell to:
# - Install dependencies (first time)
# - Pull latest code (if repo updated)
# - Reconnect to Drive (if session restarted)
# - Resume from where you left off
# ============================================

import sys
import os

IN_COLAB = 'google.colab' in sys.modules
print(f"Running on Colab: {IN_COLAB}")

if IN_COLAB:
    # 1. Install system dependencies (only if needed)
    import shutil
    if not shutil.which('xvfb-run'):
        print("\n[1/5] Installing system dependencies...")
        os.system('apt-get update -qq')
        os.system('apt-get install -qq -y libboost-all-dev libsdl2-dev libopenal-dev xvfb python3-opengl')
    else:
        print("\n[1/5] System dependencies already installed ✓")

    # 2. Install Python packages (only if needed)
    try:
        import vizdoom
        print("[2/5] Python packages already installed ✓")
    except ImportError:
        print("[2/5] Installing Python packages...")
        os.system('pip install -q vizdoom==1.2.4 gymnasium==1.2.3')
        os.system('pip install -q torch torchvision')
        os.system('pip install -q wandb hydra-core omegaconf')
        os.system('pip install -q matplotlib opencv-python numpy')

    # 3. Setup virtual display (start Xvfb only if it is not already running)
    if os.system('pgrep -x Xvfb > /dev/null') != 0:
        os.system('Xvfb :1 -screen 0 1024x768x24 &')
    os.environ['DISPLAY'] = ':1'
    print("[3/5] Virtual display configured ✓")

    # 4. Mount Google Drive
    from google.colab import drive
    if not os.path.exists('/content/drive/MyDrive'):
        drive.mount('/content/drive')
    print("[4/5] Google Drive mounted ✓")

    # 5. Setup repository and results
    DRIVE_OUTPUT = '/content/drive/MyDrive/vizdoom-ablation-results'
    REPO_PATH = '/content/vizdoom-ablation'
    os.makedirs(DRIVE_OUTPUT, exist_ok=True)

    if os.path.exists(REPO_PATH):
        # Repo exists - pull latest changes
        print("[5/5] Pulling latest code...")
        os.system(f'cd {REPO_PATH} && git pull')
    else:
        # First time - clone repo
        print("[5/5] Cloning repository...")
        os.system(f'git clone https://github.com/lynxrafu/visdoom-ablation.git {REPO_PATH}')

    os.chdir(REPO_PATH)

    # Link results to Drive (recreate symlink)
    if os.path.islink('results'):
        os.unlink('results')  # remove any existing (possibly broken) link
    elif os.path.exists('results'):
        os.system('rm -rf results')  # replace the repo's local results directory
    os.symlink(DRIVE_OUTPUT, 'results')

    print(f"\n{'='*50}")
    print(f"✓ Ready! Working directory: {os.getcwd()}")
    print(f"✓ Results saved to: {DRIVE_OUTPUT}")

    # Show existing results
    from pathlib import Path
    runs = list(Path(DRIVE_OUTPUT).glob("**/metadata.json"))
    print(f"✓ Existing runs found: {len(runs)}")
    print(f"{'='*50}")

else:
    DRIVE_OUTPUT = 'results'
    os.makedirs(DRIVE_OUTPUT, exist_ok=True)
    print(f"Local mode - results saved to: {DRIVE_OUTPUT}")

In [None]:
# Skip this cell - handled by ONE-CLICK SETUP above

---
## 2. Quick Test (~2 minutes)

Verify that ViZDoom environment and agent work correctly before long training runs.

In [None]:
# Run quick 10-episode training test (WandB disabled for speed)
print("Running quick training test (10 episodes)...")
!python experiments/train.py \
    training.num_episodes=10 \
    logging.wandb_enabled=false \
    logging.csv_log=false

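# `!` does not raise on failure, so check the exit code IPython records
if _exit_code == 0:
    print("\nQuick test passed! Environment and training loop work correctly.")
else:
    print(f"\nQuick test FAILED (exit code {_exit_code}). Check the output above before starting long runs.")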

---
## 3. Configuration (Edit This Cell Only!)

**All training settings in ONE place.** Change values here, then run any training cell.

In [None]:
# ============================================
# CONFIGURATION - EDIT HERE ONLY!
# ============================================
# All training settings in one place.
# Change these values, then run any training/ablation cell.

# --- Training Duration ---
EPISODES = 500              # Episodes per training run (500=test, 2000=full)

# --- Scenarios to Test ---
SCENARIOS = [
    'VizdoomBasic-v0',      # Easy - basic shooting (recommended to start)
    # 'VizdoomTakeCover-v0',  # Medium - dodge fireballs
    # 'VizdoomDeathmatch-v0', # Hard - full combat
]

# --- Seeds for Statistical Significance ---
SEEDS = [1, 2, 3]           # Multiple seeds for reliable results

# --- Display Summary ---
print("=" * 50)
print("CURRENT CONFIGURATION")
print("=" * 50)
print(f"Episodes per run:  {EPISODES}")
print(f"Scenarios:         {SCENARIOS}")
print(f"Seeds:             {SEEDS}")
print(f"Runs per variant:  {len(SCENARIOS) * len(SEEDS)} (each phase multiplies this by its number of variants)")
print("=" * 50)
print("\nEdit this cell to change settings, then run training cells.")

In [None]:
# Single DQN Training Run (uses config from cell above)
scenario = SCENARIOS[0]  # First scenario from config
seed = SEEDS[0]          # First seed from config

print(f"Training DQN on {scenario} for {EPISODES} episodes (seed={seed})...")
print("Track progress at: https://wandb.ai/\n")

!python experiments/train.py \
    env.scenario={scenario} \
    training.num_episodes={EPISODES} \
    seed={seed}

print("\nTraining complete! Check WandB dashboard for detailed metrics.")

In [None]:
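# View the learning curve
import glob
import os

import matplotlib.pyplot as plt
from PIL import Image

# Find the most recent curve; search recursively because runs are saved in
# nested per-date/per-run directories (see Output Structure)
curve_files = glob.glob('results/**/*_curve.png', recursive=True)
if curve_files:
    latest = max(curve_files, key=os.path.getmtime)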
    img = Image.open(latest)
    plt.figure(figsize=(12, 6))
    plt.imshow(img)
    plt.axis('off')
    plt.title('Learning Curve')
    plt.show()
else:
    print("No learning curves found yet.")

---
## 4. Ablation Study

Run ablation phases using the configuration from Section 3.

| Phase | What it tests | Runs per scenario |
|-------|--------------|-------------------|
| `algorithms` | DQN vs Deep SARSA | 2 × seeds |
| `lr` | Learning rates (0.0001, 0.001, 0.01) | 3 × seeds |
| `extensions` | DDQN, Dueling, PER | 4 × seeds |
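
The variants compared in the `algorithms` and `extensions` phases differ mainly in how the one-step bootstrap target is formed. The sketch below writes out the standard textbook forms in plain Python so the differences are easy to see. It is not taken from the repository: the function names and the `GAMMA` value are illustrative assumptions, and the actual agents may differ in detail.

```python
# Textbook one-step targets; the repository's agents may differ in detail.
GAMMA = 0.99  # illustrative discount factor, not taken from the repo config


def dqn_target(reward, next_q_target, done, gamma=GAMMA):
    """Off-policy: bootstrap from the greedy action under the target network."""
    return reward + (0.0 if done else gamma * max(next_q_target))


def sarsa_target(reward, next_q, next_action, done, gamma=GAMMA):
    """On-policy: bootstrap from the action the agent actually takes next."""
    return reward + (0.0 if done else gamma * next_q[next_action])


def ddqn_target(reward, next_q_online, next_q_target, done, gamma=GAMMA):
    """Double DQN: the online net selects the action, the target net evaluates it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


def dueling_q(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]
```

Prioritized Experience Replay leaves the target unchanged and instead changes which transitions are sampled, favoring those with a large TD error.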

In [None]:
# Prepare command line arguments from configuration (Section 3)
seeds_str = ' '.join(map(str, SEEDS))
scenarios_str = ' '.join(SCENARIOS)

print("Using configuration from Section 3:")
print(f"  Episodes:  {EPISODES}")
print(f"  Scenarios: {scenarios_str}")
print(f"  Seeds:     {seeds_str}")

In [None]:
# ============================================
# PHASE 1: Algorithm Comparison
# Compares: DQN (off-policy) vs Deep SARSA (on-policy)
# ============================================
print("=" * 60)
print("PHASE 1: Algorithm Comparison (DQN vs Deep SARSA)")
print(f"Episodes: {EPISODES} | Scenarios: {len(SCENARIOS)} | Seeds: {len(SEEDS)}")
print("Track progress at: https://wandb.ai/")
print("=" * 60)

!python experiments/ablate.py \
    --phase algorithms \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {EPISODES}

print("\nPhase 1 complete! Results saved to Drive and WandB.")

In [None]:
# ============================================
# PHASE 2: Learning Rate Ablation
# Tests: lr = 0.0001, 0.001, 0.01
# ============================================
print("=" * 60)
print("PHASE 2: Learning Rate Ablation")
print(f"Episodes: {EPISODES} | Scenarios: {len(SCENARIOS)} | Seeds: {len(SEEDS)}")
print("Track progress at: https://wandb.ai/")
print("=" * 60)

!python experiments/ablate.py \
    --phase lr \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {EPISODES}

print("\nPhase 2 complete! Results saved to Drive and WandB.")

In [None]:
# ============================================
# PHASE 3: DQN Extensions
# Tests: DDQN, Dueling DQN, Prioritized Experience Replay
# ============================================
print("=" * 60)
print("PHASE 3: DQN Extensions (DDQN, Dueling, PER)")
print(f"Episodes: {EPISODES} | Scenarios: {len(SCENARIOS)} | Seeds: {len(SEEDS)}")
print("Track progress at: https://wandb.ai/")
print("=" * 60)

!python experiments/ablate.py \
    --phase extensions \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {EPISODES}

print("\nPhase 3 complete! Results saved to Drive and WandB.")

In [None]:
# Check results saved on Drive - New organized structure
import json
from pathlib import Path

print("=" * 60)
print("RESULTS SAVED TO DRIVE")
print("=" * 60)

results_path = Path('results')
if results_path.exists():
    # Find all run directories (contain metadata.json)
    run_dirs = list(results_path.glob("**/metadata.json"))
    print(f"\nTotal training runs: {len(run_dirs)}")

    # Show the five most recently modified runs
    print("\nRecent runs:")
    for meta_path in sorted(run_dirs, key=lambda x: x.stat().st_mtime, reverse=True)[:5]:
        run_dir = meta_path.parent
        with open(meta_path) as f:
            meta = json.load(f)
        print(f"  - {run_dir.relative_to(results_path)}")
        print(f"    Agent: {meta.get('agent_type')} | Scenario: {meta.get('scenario')} | Seed: {meta.get('seed')}")

    if IN_COLAB:
        print(f"\nAll results backed up to: {DRIVE_OUTPUT}")
else:
    print("No results yet. Run training first.")

---
## 5. Results Analysis

Analyze all ablation results and generate the IEEE report figures.

**Run this after completing ablation phases** (or after each phase to see intermediate results).
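
Comparing variants fairly means aggregating each one over its seeds before ranking. The following is a minimal sketch of that step using only the standard library. It is not the `ResultsAnalyzer` implementation: `aggregate_by_seed` and the `final_reward` key are hypothetical, and only `agent_type`, `scenario`, and `seed` are keys known to appear in `metadata.json`.

```python
from collections import defaultdict
from statistics import mean, stdev


def aggregate_by_seed(records, metric="final_reward"):
    """Group per-run records by (scenario, agent) and summarize over seeds."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["scenario"], rec["agent_type"])].append(rec[metric])
    return {
        key: {
            "n": len(values),
            "mean": mean(values),
            # Sample standard deviation needs at least two seeds.
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
        for key, values in groups.items()
    }


# Placeholder numbers for illustration only; these are not measured results.
records = [
    {"scenario": "VizdoomBasic-v0", "agent_type": "dqn", "seed": 1, "final_reward": 1.0},
    {"scenario": "VizdoomBasic-v0", "agent_type": "dqn", "seed": 2, "final_reward": 2.0},
    {"scenario": "VizdoomBasic-v0", "agent_type": "dqn", "seed": 3, "final_reward": 3.0},
]
for (scenario, agent), stats in aggregate_by_seed(records).items():
    print(f"{scenario} / {agent}: {stats['mean']:.2f} ± {stats['std']:.2f} (n={stats['n']})")
```

With only three seeds the standard deviation is a rough indicator of spread, so small differences between variants should be read with caution.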

In [None]:
# Use the ResultsAnalyzer for comprehensive analysis
from src.utils.analysis import ResultsAnalyzer, analyze_results

# Initialize analyzer
analyzer = ResultsAnalyzer("results/")
num_loaded = analyzer.load_all()
print(f"Loaded {num_loaded} experiment results")

In [None]:
# Display summary table
if num_loaded > 0:
    summary_df = analyzer.summary()
    print("Results Summary:")
    display(summary_df)
else:
    print("No results found. Run ablations first.")

In [None]:
# Compare algorithms and print rankings
if num_loaded > 0:
    analyzer.compare_algorithms()
    
    for scenario in analyzer.get_scenarios():
        analyzer.print_comparison(scenario)
        
        # Get best algorithm
        best_algo, best_result = analyzer.get_best_algorithm(scenario, "reward")
        print(f"\nBest algorithm for {scenario}: {best_algo} (reward: {best_result.reward_mean:.2f})")

In [None]:
# Generate complete report with all plots and exports
if num_loaded > 0:
    analyzer.generate_report(
        output_dir="results/report",
        include_plots=True,
        include_tables=True
    )
    
    # Display generated files
    import glob
    report_files = glob.glob("results/report/*")
    print(f"\nGenerated {len(report_files)} report files:")
    for f in report_files:
        print(f"  - {f}")

---
## 6. Export for IEEE Report

Export results in formats suitable for the IEEE report.
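
One possible export format is a LaTeX table that can be pasted into the report source. The sketch below builds one from a list of row dictionaries with the standard library only. The function `to_latex_table` and the column names `reward_mean` and `reward_std` are assumptions made for illustration, and the rows are placeholders rather than results.

```python
def to_latex_table(rows, columns, caption, label):
    """Render a list of dict rows as a LaTeX table environment."""

    def fmt(value):
        # Two decimals for floats; escape underscores so LaTeX accepts names.
        text = f"{value:.2f}" if isinstance(value, float) else str(value)
        return text.replace("_", r"\_")

    col_spec = "l" + "r" * (len(columns) - 1)
    lines = [
        r"\begin{table}[t]",
        r"\centering",
        rf"\caption{{{fmt(caption)}}}",
        rf"\label{{{label}}}",
        rf"\begin{{tabular}}{{{col_spec}}}",
        r"\hline",
        " & ".join(fmt(c) for c in columns) + r" \\",
        r"\hline",
    ]
    for row in rows:
        lines.append(" & ".join(fmt(row[c]) for c in columns) + r" \\")
    lines += [r"\hline", r"\end{tabular}", r"\end{table}"]
    return "\n".join(lines)


# Placeholder rows for illustration only; these are not measured results.
rows = [
    {"agent_type": "dqn", "reward_mean": 0.0, "reward_std": 0.0},
    {"agent_type": "deep_sarsa", "reward_mean": 0.0, "reward_std": 0.0},
]
tex = to_latex_table(
    rows,
    ["agent_type", "reward_mean", "reward_std"],
    caption="Algorithm comparison",
    label="tab:algorithms",
)
print(tex)
```

The returned string can be written to a `.tex` file and pulled into the report with `\input`.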

---
## Quick Reference

### Your Workflow

| Situation | What to do |
|-----------|------------|
| **First time** | Run cell 2 (ONE-CLICK SETUP) → Run Section 2 (Quick Test) to verify the environment → Continue |
| **Session timeout** | Run cell 2 (ONE-CLICK SETUP) → Continue where you left off |
| **Repo updated** | Run cell 2 (ONE-CLICK SETUP) - it auto-pulls changes |
| **Check progress** | Run Section 5 (Analysis) anytime |

### Session Management
- **ONE-CLICK SETUP (cell 2)** handles everything: install, mount, pull, resume
- **Results are on Drive** - never lost even if session dies
- **Just run cell 2** whenever you start/resume - it's smart enough to skip what's already done
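
To see what is left after a session dies, one option is to compare the planned grid against the `metadata.json` files already on Drive. The sketch below is a hypothetical helper and not part of the repository: `pending_runs` and the agent names are assumptions, only the `agent_type`, `scenario`, and `seed` keys come from the results cell in Section 4, and a `metadata.json` may also exist for a run that was interrupted.

```python
from itertools import product


def pending_runs(agents, scenarios, seeds, completed):
    """List (agent, scenario, seed) combinations with no recorded run.

    `completed` is an iterable of metadata dicts with the keys
    `agent_type`, `scenario`, and `seed`.
    """
    done = {(m["agent_type"], m["scenario"], m["seed"]) for m in completed}
    return [c for c in product(agents, scenarios, seeds) if c not in done]


# Hypothetical agent names; check metadata.json for the values the repo writes.
completed = [
    {"agent_type": "dqn", "scenario": "VizdoomBasic-v0", "seed": 1},
    {"agent_type": "dqn", "scenario": "VizdoomBasic-v0", "seed": 2},
]
todo = pending_runs(["dqn", "deep_sarsa"], ["VizdoomBasic-v0"], [1, 2, 3], completed)
for agent, scenario, seed in todo:
    print(f"missing: {agent} on {scenario} (seed={seed})")
```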

### Training Time Estimates
| Phase | Scenarios | Seeds | Episodes | Time |
|-------|-----------|-------|----------|------|
| Quick test | 1 | 1 | 10 | ~2 min |
| Single training | 1 | 1 | 500 | ~30 min |
| Algorithm comparison | 1 | 3 | 500 | ~1-2 hours |
| Learning rate ablation | 1 | 3 | 500 | ~1-2 hours |
| Extensions ablation | 1 | 3 | 500 | ~2-3 hours |
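
The Seeds column above does not show how many variants each phase trains. As a rough planning aid, the number of runs per phase can be computed from the variant counts in the Section 4 table (2, 3, and 4). The helper name `runs_per_phase` is an assumption for illustration.

```python
# Variant counts per phase, taken from the ablation table in Section 4.
VARIANTS_PER_PHASE = {"algorithms": 2, "lr": 3, "extensions": 4}


def runs_per_phase(num_scenarios, num_seeds):
    """Runs per phase = variants x scenarios x seeds."""
    return {
        phase: variants * num_scenarios * num_seeds
        for phase, variants in VARIANTS_PER_PHASE.items()
    }


counts = runs_per_phase(num_scenarios=1, num_seeds=3)
for phase, n in counts.items():
    print(f"{phase:<12} {n:>3} runs")
print(f"{'total':<12} {sum(counts.values()):>3} runs")
```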

### Output Structure
```
Google Drive/vizdoom-ablation-results/
└── 2025-12-25/
    └── 143052_dqn_VizdoomBasic_v0_lr0.0001_seed42/
        ├── metadata.json      # What was run
        ├── config.yaml        # Full config
        ├── training_log.csv   # All metrics
        ├── summary.json       # Final results
        ├── checkpoints/       # Model weights
        └── plots/             # Learning curves
```
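
The layout above can be walked by looking for `metadata.json`, which marks a run directory. The sketch below demonstrates this on a throwaway directory that mimics the structure. The helper `find_runs` is hypothetical, and the metadata written in the demo contains only the three keys the notebook reads (`agent_type`, `scenario`, `seed`).

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def find_runs(results_root):
    """Return (run_dir, metadata) pairs for every metadata.json under results_root."""
    runs = []
    for meta_path in sorted(Path(results_root).glob("**/metadata.json")):
        with open(meta_path) as f:
            runs.append((meta_path.parent, json.load(f)))
    return runs


# Demo on a temporary directory that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "2025-12-25" / "143052_dqn_VizdoomBasic_v0_lr0.0001_seed42"
    run_dir.mkdir(parents=True)
    demo_meta = {"agent_type": "dqn", "scenario": "VizdoomBasic-v0", "seed": 42}
    (run_dir / "metadata.json").write_text(json.dumps(demo_meta))

    runs = find_runs(tmp)
    per_agent = Counter(meta.get("agent_type") for _, meta in runs)
    print(f"Found {len(runs)} run(s): {dict(per_agent)}")
```

Pointing `find_runs` at `results/` instead of the temporary directory would list the real runs stored on Drive.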

### Tips
- Use **GPU runtime** for faster training
- Run **cell 2 first** every time you open the notebook
- Check **Section 5 (Analysis)** to see all your runs
- Each run gets a **unique timestamp** - nothing is ever overwritten