# ViZDoom Deep RL Ablation Study

**Master's RL Course Final Project**

This notebook runs the complete ViZDoom ablation study on Google Colab with automatic Drive backup.

## Workflow
1. **Setup** - Install dependencies, mount Drive
2. **Quick Test** - Verify environment works (2 min)
3. **Single Training Test** - Confirm learning works (30 min)
4. **Ablation Phase 1** - Algorithm comparison: DQN vs Deep SARSA
5. **Ablation Phase 2** - Learning rate ablation
6. **Ablation Phase 3** - Extensions: DDQN, Dueling, PER
7. **Analysis** - Generate report and plots

**Important**: Run the phases in separate sessions if needed; a Colab session is limited to about 12 hours.

---
## 1. Environment Setup

Install dependencies and set up a virtual display for headless rendering.

In [None]:
# Check if running on Colab
import sys
IN_COLAB = 'google.colab' in sys.modules
print(f"Running on Colab: {IN_COLAB}")

In [None]:
# Install system dependencies (Colab only)
if IN_COLAB:
    !apt-get update -qq
    !apt-get install -qq -y \
        libboost-all-dev libsdl2-dev libopenal-dev \
        xvfb python3-opengl
    print("System dependencies installed!")

In [None]:
# Install Python packages
if IN_COLAB:
    !pip install -q vizdoom==1.2.4 gymnasium==1.2.3
    !pip install -q torch torchvision
    !pip install -q wandb hydra-core omegaconf
    !pip install -q matplotlib opencv-python numpy
    print("Python packages installed!")

In [None]:
# Set up virtual display for headless rendering
import os

if IN_COLAB:
    os.system('Xvfb :1 -screen 0 1024x768x24 &')
    os.environ['DISPLAY'] = ':1'
    print("Virtual display configured!")

In [None]:
# Mount Google Drive and setup directories
if IN_COLAB:
    from google.colab import drive
    drive.mount('/content/drive')
    
    # Create output directory on Drive
    DRIVE_OUTPUT = '/content/drive/MyDrive/vizdoom-ablation-results'
    os.makedirs(DRIVE_OUTPUT, exist_ok=True)
    
    # Clone repo or use existing
    REPO_PATH = '/content/vizdoom-ablation'
    if not os.path.exists(REPO_PATH):
        !git clone https://github.com/lynxrafu/visdoom-ablation.git {REPO_PATH}
    
    %cd {REPO_PATH}
    
    # Link results folder to Drive for automatic backup
    !rm -rf results
    !ln -s {DRIVE_OUTPUT} results
    
    print(f"Working directory: {os.getcwd()}")
    print(f"Results will be saved to: {DRIVE_OUTPUT}")
else:
    DRIVE_OUTPUT = 'results'
    os.makedirs(DRIVE_OUTPUT, exist_ok=True)
    print(f"Local mode - results saved to: {DRIVE_OUTPUT}")
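The `rm -rf results` step above deletes anything already in `results` before the link is created. As a hedged alternative, the sketch below does the same linking in Python but first moves existing files into the target folder. The helper name `link_results` is hypothetical and not part of the project.

```python
import os
import shutil


def link_results(link_path: str, target_dir: str) -> None:
    """Make link_path a symlink to target_dir, keeping existing result files.

    Hypothetical helper for illustration; not part of the project code.
    """
    target_dir = os.path.abspath(target_dir)
    os.makedirs(target_dir, exist_ok=True)

    if os.path.islink(link_path):
        # A link from an earlier session: remove only the link, not its target.
        os.unlink(link_path)
    elif os.path.isdir(link_path):
        # A real directory: move its files to the target before removing it.
        # Files whose names already exist in the target are not moved and are
        # removed together with the directory.
        for name in os.listdir(link_path):
            dest = os.path.join(target_dir, name)
            if not os.path.exists(dest):
                shutil.move(os.path.join(link_path, name), dest)
        shutil.rmtree(link_path)

    os.symlink(target_dir, link_path)
```

In the setup cell this would be called as `link_results('results', DRIVE_OUTPUT)` after changing into the repository directory.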

In [None]:
# Add project to path
import sys
sys.path.insert(0, '.')

# Verify imports
import torch
import gymnasium
import vizdoom

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Gymnasium: {gymnasium.__version__}")
print(f"ViZDoom: {vizdoom.__version__}")

---
## 2. Quick Test (~2 minutes)

Verify that the ViZDoom environment and agent work correctly before starting long training runs.

In [None]:
# Test environment
from src.envs import make_vizdoom_env

env = make_vizdoom_env('VizdoomBasic-v0')
print(f"Observation space: {env.observation_space}")
print(f"Action space: {env.action_space}")

obs, info = env.reset()
print(f"Initial obs shape: {obs.shape}")

# Take a few random steps
for i in range(5):
    action = env.action_space.sample()
    obs, reward, term, trunc, info = env.step(action)
    print(f"Step {i+1}: action={action}, reward={reward:.2f}")

env.close()
print("\nEnvironment test passed!")

In [None]:
# Run quick 10-episode training test
print("Running quick training test (10 episodes)...")
!python experiments/train.py \
    training.num_episodes=10 \
    logging.wandb_enabled=false \
    logging.save_csv=false

print("\nQuick test finished. If no errors appear above, the environment and training loop work correctly.")

---
## 3. Single Training Test (~30 minutes)

Train one DQN agent to verify that learning happens before running the full ablation.

In [None]:
# Configuration
import os
os.environ['WANDB_MODE'] = 'disabled'  # Disable WandB (or set to 'online' if you want logging)

# Training parameters for test run
TEST_EPISODES = 500  # Enough to see learning
TEST_SCENARIO = 'VizdoomBasic-v0'  # Easiest scenario

In [None]:
# Train DQN on Basic scenario
print(f"Training DQN on {TEST_SCENARIO} for {TEST_EPISODES} episodes...")
print("This takes ~30 minutes. You should see rewards increasing over time.\n")

!python experiments/train.py \
    env.scenario={TEST_SCENARIO} \
    training.num_episodes={TEST_EPISODES} \
    logging.wandb_enabled=false \
    seed=42

print("\nTraining complete! Check the learning curve below.")

In [None]:
# View the learning curve
import glob
import os

import matplotlib.pyplot as plt
from PIL import Image

# Find the most recently modified curve
curve_files = glob.glob('results/*_curve.png')
if curve_files:
    latest = max(curve_files, key=os.path.getmtime)
    img = Image.open(latest)
    plt.figure(figsize=(12, 6))
    plt.imshow(img)
    plt.axis('off')
    plt.title('Learning Curve')
    plt.show()
else:
    print("No learning curves found yet.")
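Per-episode rewards are noisy, so an upward trend is easier to judge on a smoothed series. The sketch below is a plain moving average over a list of episode rewards. It assumes you have already loaded the rewards into a Python list; the column name and file layout of the project's CSV output are not specified here, so loading is left out.

```python
def moving_average(values, window):
    """Return the mean of each consecutive `window`-sized slice of `values`."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []

    # Keep a running sum so each step costs O(1) instead of O(window).
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages
```

If the smoothed values near the end of training are clearly higher than those near the start, the agent is probably learning. A window of 50 to 100 episodes is a reasonable starting point for a 500-episode run.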

---
## 4. Ablation Study

Run ablation phases one by one. Each phase saves results to Drive automatically.

| Phase | What it tests | Estimated time |
|-------|--------------|----------------|
| `algorithms` | DQN vs Deep SARSA | ~1-2 hours |
| `lr` | Learning rates (0.0001, 0.001, 0.01) | ~1-2 hours |
| `extensions` | DDQN, Dueling, PER | ~2-3 hours |

**Tip**: If Colab times out, just re-run from where you left off. Results are saved to Drive.
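To decide where to resume after a timeout, it can help to compare the runs you planned against the CSV files already on Drive. The sketch below does that by file name. The run-name format used in the usage note is purely an assumption; check the actual names that `experiments/ablate.py` writes to `results/` before relying on it.

```python
import glob
import os


def finished_runs(results_dir: str) -> set[str]:
    """Return the base names (without .csv) of all result files in results_dir."""
    paths = glob.glob(os.path.join(results_dir, "*.csv"))
    return {os.path.splitext(os.path.basename(p))[0] for p in paths}


def pending_runs(planned: list[str], results_dir: str) -> list[str]:
    """Return the planned run names that have no CSV file yet, in order."""
    done = finished_runs(results_dir)
    return [name for name in planned if name not in done]
```

For example, `pending_runs(["dqn_basic_seed1", "dqn_basic_seed2"], "results")` would list whichever of those two hypothetical run names has no matching CSV yet.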

In [None]:
# Ablation configuration
ABLATION_EPISODES = 500      # Episodes per run
ABLATION_SEEDS = [1, 2, 3]   # Multiple seeds to estimate run-to-run variance
ABLATION_SCENARIOS = ['VizdoomBasic-v0']  # Start with Basic, add more later

# Convert to string for command line
seeds_str = ' '.join(map(str, ABLATION_SEEDS))
scenarios_str = ' '.join(ABLATION_SCENARIOS)

print("Ablation settings:")
print(f"  Episodes: {ABLATION_EPISODES}")
print(f"  Seeds: {ABLATION_SEEDS}")
print(f"  Scenarios: {ABLATION_SCENARIOS}")
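The number of training runs grows multiplicatively with variants, scenarios, and seeds. The sketch below counts runs per phase using the variants listed in the phase table above. The variant labels are illustrative, and whether `experiments/ablate.py` adds a baseline run to any phase is not known, so treat the counts as a lower bound.

```python
from itertools import product

# Variants per phase, taken from the phase table; labels are illustrative only.
PHASE_VARIANTS = {
    "algorithms": ["dqn", "deep_sarsa"],
    "lr": [0.0001, 0.001, 0.01],
    "extensions": ["ddqn", "dueling", "per"],
}


def count_runs(phase, scenarios, seeds):
    """Number of (variant, scenario, seed) combinations in one phase."""
    return len(list(product(PHASE_VARIANTS[phase], scenarios, seeds)))


def total_episodes(phase, scenarios, seeds, episodes_per_run):
    """Total training episodes a phase would execute."""
    return count_runs(phase, scenarios, seeds) * episodes_per_run
```

With one scenario and three seeds this gives 6 runs for `algorithms` and 9 each for `lr` and `extensions`. Adding two more scenarios triples every count.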

In [None]:
# ============================================
# PHASE 1: Algorithm Comparison (~1-2 hours)
# Compares: DQN (off-policy) vs Deep SARSA (on-policy)
# ============================================
print("=" * 60)
print("PHASE 1: Algorithm Comparison (DQN vs Deep SARSA)")
print("=" * 60)

!python experiments/ablate.py \
    --phase algorithms \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {ABLATION_EPISODES}

print("\nPhase 1 complete! Results saved to Drive.")
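The two algorithms compared in Phase 1 differ in the bootstrap target used for the temporal-difference update. The sketch below shows both targets for a single transition using plain Python lists of Q-values. It illustrates the textbook definitions and is not taken from the project's `src` code, which may batch these computations in PyTorch.

```python
def dqn_target(reward, next_q_values, done, gamma=0.99):
    """Off-policy target: bootstrap from the greedy (max) next-state value."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, done, gamma=0.99):
    """On-policy target: bootstrap from the action the agent actually takes next."""
    if done:
        return reward
    return reward + gamma * next_q_values[next_action]
```

When the behavior policy explores (for example, epsilon-greedy picks a non-greedy `next_action`), the SARSA target is lower than or equal to the DQN target for the same transition. That is the practical difference this phase measures.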

In [None]:
# ============================================
# PHASE 2: Learning Rate Ablation (~1-2 hours)
# Tests: lr = 0.0001, 0.001, 0.01
# ============================================
print("=" * 60)
print("PHASE 2: Learning Rate Ablation")
print("=" * 60)

!python experiments/ablate.py \
    --phase lr \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {ABLATION_EPISODES}

print("\nPhase 2 complete! Results saved to Drive.")

In [None]:
# ============================================
# PHASE 3: DQN Extensions (~2-3 hours)
# Tests: DDQN, Dueling DQN, Prioritized Experience Replay
# ============================================
print("=" * 60)
print("PHASE 3: DQN Extensions (DDQN, Dueling, PER)")
print("=" * 60)

!python experiments/ablate.py \
    --phase extensions \
    --scenarios {scenarios_str} \
    --seeds {seeds_str} \
    --episodes {ABLATION_EPISODES}

print("\nPhase 3 complete! Results saved to Drive.")
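Each Phase 3 extension changes one part of the DQN recipe. The sketch below shows the core computation behind each one for a single transition or a small list, following the standard definitions. Function names and the default `alpha` are illustrative assumptions and are not taken from the project's implementation.

```python
def ddqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean of A(s, .)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]


def per_probabilities(priorities, alpha=0.6):
    """Prioritized replay: sampling probability proportional to priority**alpha.

    alpha=0.6 is an illustrative default, not a value from this project.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]
```

For instance, with online values `[1.0, 3.0]` and target values `[5.0, 2.0]`, Double DQN bootstraps from `2.0`, while a plain max over the target network would use `5.0`. This decoupling is what reduces overestimation.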

In [None]:
# Check results saved on Drive
import glob
import os

print("=" * 60)
print("RESULTS SAVED TO DRIVE")
print("=" * 60)

csv_files = sorted(glob.glob('results/*.csv'))
print(f"\nCSV result files: {len(csv_files)}")
for f in csv_files[-10:]:  # Show last 10
    size_kb = os.path.getsize(f) / 1024
    print(f"  - {os.path.basename(f)} ({size_kb:.1f} KB)")

png_files = glob.glob('results/*.png')
print(f"\nLearning curve plots: {len(png_files)}")

if IN_COLAB:
    print(f"\nAll results backed up to: {DRIVE_OUTPUT}")

---
## 5. Results Analysis

Analyze all ablation results and generate the IEEE report figures.

**Run this after completing ablation phases** (or after each phase to see intermediate results).

In [None]:
# Use the ResultsAnalyzer for comprehensive analysis
from src.utils.analysis import ResultsAnalyzer, analyze_results

# Initialize analyzer
analyzer = ResultsAnalyzer("results/")
num_loaded = analyzer.load_all()
print(f"Loaded {num_loaded} experiment results")

In [None]:
# Display summary table
if num_loaded > 0:
    summary_df = analyzer.summary()
    print("Results Summary:")
    display(summary_df)
else:
    print("No results found. Run ablations first.")

In [None]:
# Compare algorithms and print rankings
if num_loaded > 0:
    analyzer.compare_algorithms()
    
    for scenario in analyzer.get_scenarios():
        analyzer.print_comparison(scenario)
        
        # Get best algorithm
        best_algo, best_result = analyzer.get_best_algorithm(scenario, "reward")
        print(f"\nBest algorithm for {scenario}: {best_algo} (reward: {best_result.reward_mean:.2f})")

In [None]:
# Generate complete report with all plots and exports
if num_loaded > 0:
    analyzer.generate_report(
        output_dir="results/report",
        include_plots=True,
        include_tables=True
    )
    
    # Display generated files
    import glob
    report_files = glob.glob("results/report/*")
    print(f"\nGenerated {len(report_files)} report files:")
    for f in report_files:
        print(f"  - {f}")
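Comparing algorithms across seeds comes down to reducing each run to one number and then reporting the mean and spread over seeds. The sketch below shows one common way to do that, using the average reward over the last `last_n` episodes of each run. It is independent of `ResultsAnalyzer`, whose internal aggregation may differ, and the name `summarize_seeds` is hypothetical.

```python
from statistics import mean, stdev


def summarize_seeds(rewards_by_seed, last_n=100):
    """Return (mean, sample std) of final performance across seeds.

    rewards_by_seed maps a seed to that run's list of per-episode rewards.
    Final performance of a run is the mean reward over its last `last_n` episodes.
    """
    finals = [mean(rewards[-last_n:]) for rewards in rewards_by_seed.values()]
    # The sample standard deviation needs at least two runs.
    spread = stdev(finals) if len(finals) > 1 else 0.0
    return mean(finals), spread
```

With only three seeds the standard deviation is a rough estimate, so small differences between algorithms should be read with caution.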

---
## 6. Export for IEEE Report

Export results in formats suitable for the IEEE report.

In [None]:
# View generated LaTeX table for IEEE report
latex_path = "results/report/summary.tex"
if os.path.exists(latex_path):
    with open(latex_path, 'r') as f:
        latex_content = f.read()
    print("LaTeX Table (for IEEE report):")
    print(latex_content)
else:
    print("No LaTeX table yet. Generate report first.")

# View JSON report
json_path = "results/report/report.json"
if os.path.exists(json_path):
    import json
    with open(json_path, 'r') as f:
        report_data = json.load(f)
    print("\nReport contains:")
    print(f"  - {len(report_data.get('summary', []))} algorithm-scenario combinations")
    print(f"  - {len(report_data.get('comparisons', {}))} scenario comparisons")

---
## Notes

### Session Management
- **Colab free tier**: ~12 hour limit per session
- **If session times out**: Re-run the setup cells in Section 1, then continue from where you left off
- **Results are safe**: All outputs saved to Google Drive automatically

### Training Time Estimates
| Phase | Scenarios | Seeds | Episodes | Time |
|-------|-----------|-------|----------|------|
| Quick test | 1 | 1 | 10 | ~2 min |
| Single training | 1 | 1 | 500 | ~30 min |
| Algorithm comparison | 1 | 3 | 500 | ~1-2 hours |
| Learning rate | 1 | 3 | 500 | ~1-2 hours |
| Extensions | 1 | 3 | 500 | ~2-3 hours |

### For Full Ablation (All Scenarios)
To run on all 3 scenarios, update the config cell:
```python
ABLATION_SCENARIOS = ['VizdoomBasic-v0', 'VizdoomTakeCover-v0', 'VizdoomDeathmatch-v0']
```
This will take significantly longer (~10-15 hours total), so plan to split the phases across sessions to stay within the Colab session limit.

### Tips
- Use **GPU runtime** for faster training (Runtime > Change runtime type > GPU)
- Monitor progress via printed episode rewards
- Check Drive periodically to verify results are saving
- Run analysis after each phase to see intermediate results