# Notebook 03: Pre-Throw Coverage Analysis

**CRITICAL FRAMEWORK SHIFT:**
- Notebooks 01 and 02 analyzed the OUTPUT files (ball in the air, post-throw)
- THIS notebook analyzes the INPUT files (snap to throw, pre-throw)
- The pre-throw window is where CBs make their coverage DECISIONS

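**Goal:** Analyze how CB positioning changes BEFORE the ball is thrown

**Key Metric:** `growth_rate`: the change in CB-WR cushion (yards) between the first and last pre-throw frames, divided by the number of frames. Positive values mean the cushion grew (backpedal); negative values mean it shrank (press).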
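
As a rough illustration of the metric, the sketch below computes `growth_rate` for a single made-up pairing and labels it with the same thresholds the code cell in this notebook uses. The track coordinates and the helper names `cushion` and `classify_pattern` are hypothetical and exist only for this example; the notebook itself works on the tracking DataFrames.

```python
import math


def cushion(wr_xy, cb_xy):
    """Euclidean distance (yards) between a WR position and a CB position."""
    return math.hypot(wr_xy[0] - cb_xy[0], wr_xy[1] - cb_xy[1])


def classify_pattern(growth_rate):
    """Map a growth rate (yards per frame) to the notebook's pattern labels."""
    if growth_rate > 0.3:
        return "Aggressive Backpedal"
    elif growth_rate > 0.1:
        return "Moderate Backpedal"
    elif growth_rate > -0.1:
        return "Maintained"
    elif growth_rate > -0.3:
        return "Moderate Press"
    return "Aggressive Press"


# Hypothetical (x, y) positions per frame: first frame = snap, last = throw.
wr_track = [(30.0, 10.0), (30.5, 10.0), (31.5, 10.0), (33.0, 10.0), (35.0, 10.0)]
cb_track = [(36.0, 10.0), (36.8, 10.0), (38.0, 10.0), (39.8, 10.0), (42.0, 10.0)]

cushions = [cushion(w, c) for w, c in zip(wr_track, cb_track)]
cushion_growth = cushions[-1] - cushions[0]    # 7.0 - 6.0 yards
growth_rate = cushion_growth / len(cushions)   # divided by frame count, as in the notebook
pattern = classify_pattern(growth_rate)

print(f"{growth_rate:.2f} yd/frame -> {pattern}")  # 0.20 yd/frame -> Moderate Backpedal
```

Here the cushion grows from 6 to 7 yards over 5 frames, so the rate is 0.2 yards per frame. Dividing by the frame count mirrors the notebook; dividing by the number of frame-to-frame intervals (`len(cushions) - 1`) would be an alternative definition and would give slightly larger magnitudes.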

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

print("=" * 70)
print("NOTEBOOK 03: PRE-THROW COVERAGE ANALYSIS")
print("WHERE DEFENSIVE BACKS MAKE THEIR COVERAGE DECISIONS")
print("=" * 70)

# Load cushion data
cushion_df = pd.read_csv('../data/processed/cushion_analysis_data.csv')
print(f"\n✓ Loaded {len(cushion_df):,} CB-WR pairings")

# Load INPUT files
print("\nLoading INPUT files (snap → throw phase)...")
input_dfs = {}
for week in range(1, 19):
    filepath = f'../data/train/input_2023_w{week:02d}.csv'
    try:
        input_dfs[week] = pd.read_csv(filepath)
        print(f"  Week {week:2d}: ✓")
    except FileNotFoundError:
        print(f"  Week {week:2d}: ✗")

print(f"\n✓ Loaded {len(input_dfs)} weeks of INPUT data")

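# Define metric calculation function
def calculate_prethrow_metrics(pairing_row, input_dfs):
    """Return pre-throw cushion metrics for one CB-WR pairing, or None if
    the week is not loaded or fewer than two shared frames exist."""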
    week = int(pairing_row['week'])
    game_id = pairing_row['game_id']
    play_id = pairing_row['play_id']
    wr_id = pairing_row['wr_nfl_id']
    cb_id = pairing_row['cb_nfl_id']
    
    if week not in input_dfs:
        return None
    
    input_df = input_dfs[week]
    play_frames = input_df[
        (input_df['game_id'] == game_id) & 
        (input_df['play_id'] == play_id)
    ]
    
    wr_frames = play_frames[play_frames['nfl_id'] == wr_id].sort_values('frame_id')
    cb_frames = play_frames[play_frames['nfl_id'] == cb_id].sort_values('frame_id')
    
    # Pair WR and CB rows frame by frame; the inner join keeps only frames
    # in which both players are tracked (one row per player per frame).
    paired = wr_frames.drop_duplicates('frame_id').merge(
        cb_frames.drop_duplicates('frame_id'),
        on='frame_id', suffixes=('_wr', '_cb')
    )
    # Cushion = Euclidean WR-CB distance (yards) in each shared frame.
    cushions = np.hypot(
        paired['x_wr'] - paired['x_cb'],
        paired['y_wr'] - paired['y_cb']
    ).tolist()
    
    if len(cushions) < 2:
        return None
    
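    # The INPUT files span snap to throw, so the first shared frame is the
    # earliest tracked cushion and the last shared frame is the cushion at the throw.
    presnap_cushion = cushions[0]
    at_throw_cushion = cushions[-1]
    num_frames = len(cushions)
    cushion_growth = at_throw_cushion - presnap_cushion  # > 0: cushion grew
    # Average change in yards per frame (divides by the frame count,
    # not by the num_frames - 1 intervals between frames).
    growth_rate = cushion_growth / num_frames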
    
    if growth_rate > 0.3:
        pattern = 'Aggressive Backpedal'
    elif growth_rate > 0.1:
        pattern = 'Moderate Backpedal'
    elif growth_rate > -0.1:
        pattern = 'Maintained'
    elif growth_rate > -0.3:
        pattern = 'Moderate Press'
    else:
        pattern = 'Aggressive Press'
    
    return {
        'presnap_cushion': presnap_cushion,
        'at_throw_cushion': at_throw_cushion,
        'num_prethrow_frames': num_frames,
        'cushion_growth': cushion_growth,
        'growth_rate': growth_rate,
        'prethrow_pattern': pattern
    }

# Process all pairings
print("\nProcessing all pairings...")
results = []
for i, (_, row) in enumerate(cushion_df.iterrows(), start=1):
    result = calculate_prethrow_metrics(row, input_dfs)
    if result is not None:
        results.append({**row.to_dict(), **result})
    # Progress counter based on rows processed, independent of the index labels
    if i % 100 == 0:
        print(f"  Processed {i:,}/{len(cushion_df):,}...")

final_df = pd.DataFrame(results)
print(f"\n✓ Processing complete: {len(final_df):,} pairings")

# Save results
output_path = '../data/processed/prethrow_coverage_data.csv'
final_df.to_csv(output_path, index=False)
print(f"\n✓ Saved: {output_path}")

# Summary statistics
print("\n" + "=" * 70)
print("SUMMARY")
print("=" * 70)
print("\nGrowth Rate Distribution:")
print(final_df['growth_rate'].describe())
print("\nPattern Distribution:")
print(final_df['prethrow_pattern'].value_counts())
print("\n✓ Ready for Notebook 04!")

NOTEBOOK 03: PRE-THROW COVERAGE ANALYSIS
WHERE DEFENSIVE BACKS MAKE THEIR COVERAGE DECISIONS

✓ Loaded 1,439 CB-WR pairings

Loading INPUT files (snap → throw phase)...
  Week  1: ✓
  Week  2: ✓
  Week  3: ✓
  Week  4: ✓
  Week  5: ✓
  Week  6: ✓
  Week  7: ✓
  Week  8: ✓
  Week  9: ✓
  Week 10: ✓
  Week 11: ✓
  Week 12: ✓
  Week 13: ✓
  Week 14: ✓
  Week 15: ✓
  Week 16: ✓
  Week 17: ✓
  Week 18: ✓

✓ Loaded 18 weeks of INPUT data

Processing all pairings...
  Processed 100/1,439...
  Processed 200/1,439...
  Processed 300/1,439...
  Processed 400/1,439...
  Processed 500/1,439...
  Processed 600/1,439...
  Processed 700/1,439...
  Processed 800/1,439...
  Processed 900/1,439...
  Processed 1,000/1,439...
  Processed 1,100/1,439...
  Processed 1,200/1,439...
  Processed 1,300/1,439...
  Processed 1,400/1,439...

✓ Processing complete: 1,439 pairings

✓ Saved: ../data/processed/prethrow_coverage_data.csv

SUMMARY

Growth Rate Distribution:
count    1439.000000
mean       -0.107610
std 