# CFP Selection Simulator - Configuration and Setup

This notebook provides an interactive configuration interface for the College Football Playoff Selection Simulator.

## Purpose

- Configure season parameters (year, week)
- Understand ranking methodologies
- Set custom ranking weights
- Validate API connection
- Provide a quick-start guide to the analysis workflow

---

## Table of Contents

1. [Environment Setup](#1.-Environment-Setup)
2. [Season Configuration](#2.-Season-Configuration)
3. [Ranking Methodology Overview](#3.-Ranking-Methodology-Overview)
4. [Ranking Weights Configuration](#4.-Ranking-Weights-Configuration)
5. [API Validation](#5.-API-Validation)
6. [Analysis Workflow](#6.-Analysis-Workflow)

---

## 1. Environment Setup

Load required libraries and configuration.

In [None]:
import os
import sys
import json
from pathlib import Path
from datetime import datetime
import requests
import pandas as pd
from dotenv import load_dotenv
import ipywidgets as widgets
from IPython.display import display, HTML, Markdown

# Add the parent directory to the import path (so packages there, such as src, can be imported)
sys.path.insert(0, os.path.abspath('..'))

# Load environment variables
load_dotenv()

print("Environment loaded successfully")
print(f"Current working directory: {os.getcwd()}")
print(f"Python version: {sys.version}")

---

## 2. Season Configuration

Configure the season year and week for analysis.

### Important Notes:

- **Season Year**: College football seasons span two calendar years (e.g., 2025 = 2025-2026 season)
- **Week Range**: By default, analysis uses weeks 5-15; early-season weeks 0-4 are excluded because ratings built on so few games are unstable
- **Data Source**: CollegeFootballData.com API provides real-time and historical data
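
Because the start week and analysis week are chosen independently, it is possible to pick a start week later than the analysis week. A minimal sketch of a range check follows; `validate_week_range` is a hypothetical helper, not part of the simulator, and the 1-15 bounds are assumed from the slider limits in the cell below.

```python
def validate_week_range(start_week: int, analysis_week: int,
                        min_week: int = 1, max_week: int = 15) -> int:
    """Return the number of weeks analyzed, or raise ValueError for a bad range."""
    for name, week in (("start_week", start_week), ("analysis_week", analysis_week)):
        if not min_week <= week <= max_week:
            raise ValueError(f"{name}={week} is outside weeks {min_week}-{max_week}")
    if start_week > analysis_week:
        raise ValueError(
            f"start_week ({start_week}) must not exceed analysis_week ({analysis_week})"
        )
    # Both endpoints are included in the analysis window.
    return analysis_week - start_week + 1


print(validate_week_range(5, 15))  # 11 weeks with the default settings
```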

In [None]:
# Default configuration
DEFAULT_YEAR = 2025
DEFAULT_WEEK = 15
DEFAULT_START_WEEK = 5

# Interactive widgets for configuration
year_widget = widgets.IntText(
    value=DEFAULT_YEAR,
    description='Season Year:',
    disabled=False,
    style={'description_width': 'initial'}
)

week_widget = widgets.IntSlider(
    value=DEFAULT_WEEK,
    min=5,
    max=15,
    step=1,
    description='Analysis Week:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d',
    style={'description_width': 'initial'}
)

start_week_widget = widgets.IntSlider(
    value=DEFAULT_START_WEEK,
    min=1,
    max=10,
    step=1,
    description='Start Week:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d',
    style={'description_width': 'initial'}
)

# Display configuration interface
display(HTML("<h3>Season Configuration</h3>"))
display(year_widget)
display(week_widget)
display(start_week_widget)

# Configuration summary
def show_config_summary(change=None):
    summary = f"""
    <div style='background-color: #f0f0f0; padding: 15px; border-radius: 5px; margin-top: 10px;'>
        <h4>Current Configuration</h4>
        <ul>
            <li><strong>Season:</strong> {year_widget.value}-{year_widget.value + 1}</li>
            <li><strong>Analysis Through Week:</strong> {week_widget.value}</li>
            <li><strong>Data Includes Weeks:</strong> {start_week_widget.value} to {week_widget.value}</li>
            <li><strong>Total Weeks Analyzed:</strong> {week_widget.value - start_week_widget.value + 1}</li>
        </ul>
    </div>
    """
    return HTML(summary)

summary_output = widgets.Output()
with summary_output:
    display(show_config_summary())

# Update summary on widget change
def update_summary(change):
    summary_output.clear_output()
    with summary_output:
        display(show_config_summary())

year_widget.observe(update_summary, names='value')
week_widget.observe(update_summary, names='value')
start_week_widget.observe(update_summary, names='value')

display(summary_output)

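# Store configuration
# Note: these are snapshots taken when this cell runs. Re-run the cell after
# changing a widget, or read the widgets' .value attributes directly.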
SEASON_YEAR = year_widget.value
ANALYSIS_WEEK = week_widget.value
START_WEEK = start_week_widget.value

---

## 3. Ranking Methodology Overview

This simulator uses an **ensemble approach** that combines five ranking methodologies. Each algorithm evaluates teams from a different perspective: win-loss resume, scoring margin, game-by-game form, schedule difficulty, and raw win rate.

### Ranking Algorithms Explained

#### Colley Matrix Method (20% Weight)

**Philosophy:** Pure resume evaluation based solely on wins and losses.

**How It Works:**
- Uses linear algebra to solve a system of equations
- Each game creates constraints in the matrix
- Beating good teams raises your rating more than beating weak teams
- Margin of victory is completely ignored

**Formula:**
```
Rating = (1 + (Wins - Losses) / 2 + Σ Opponent Ratings) / (2 + Games Played)
```
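
Because each team's Colley rating depends on its opponents' ratings, the ratings are found by solving a linear system `C r = b`, where `C` has `2 + games played` on the diagonal and minus the number of head-to-head games off the diagonal, and `b_i = 1 + (wins_i - losses_i) / 2`. The sketch below illustrates that construction with NumPy; the function name and the `(winner, loser)` input format are assumptions for illustration, not the simulator's actual interface.

```python
import numpy as np


def colley_ratings(teams, games):
    """Solve the Colley system for a list of (winner, loser) results."""
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    C = 2.0 * np.eye(n)   # start from 2 on the diagonal
    b = np.ones(n)        # start from 1 for every team
    for winner, loser in games:
        w, l = index[winner], index[loser]
        C[w, w] += 1      # each game adds to both teams' games played
        C[l, l] += 1
        C[w, l] -= 1      # and links the two opponents
        C[l, w] -= 1
        b[w] += 0.5       # (wins - losses) / 2
        b[l] -= 0.5
    ratings = np.linalg.solve(C, b)
    return {team: float(r) for team, r in zip(teams, ratings)}


# Toy round robin: A beats B and C, B beats C.
example = colley_ratings(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")])
print({team: round(r, 3) for team, r in example.items()})  # {'A': 0.7, 'B': 0.5, 'C': 0.3}
```

Ratings always average 0.5, and margin of victory never enters the system.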

**Strengths:**
- Transparent and reproducible
- Rewards quality wins over quantity
- Served as one of the BCS computer rankings (a field-tested methodology)

**Limitations:**
- Ignores dominant performances
- Close wins treated same as blowouts

---

#### Massey Rating System (25% Weight)

**Philosophy:** Predictive power evaluation incorporating scoring margin.

**How It Works:**
- Least-squares optimization to minimize prediction error
- Accounts for margin of victory (capped at 28 points)
- Adjusts for home field advantage (3.75 points)
- Creates ratings that predict future game outcomes

**Formula:**
```
Rating difference = Point differential (adjusted)
Minimize: Σ(Actual Margin - Predicted Margin)²
```
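
The least-squares idea can be sketched with NumPy as follows. Each game contributes one equation `rating(home) - rating(away) ≈ adjusted margin`, and an extra row pins the ratings to sum to zero so the system has a unique solution. The function name, the input tuple format, and the order of adjustments (home-field subtraction first, then the 28-point cap) are assumptions for illustration; neutral-site games are not handled.

```python
import numpy as np

MOV_CAP = 28           # margin-of-victory cap from the description above
HOME_ADVANTAGE = 3.75  # home-field adjustment in points


def massey_ratings(teams, games):
    """games: list of (home, away, home_points, away_points)."""
    index = {team: i for i, team in enumerate(teams)}
    rows, margins = [], []
    for home, away, home_pts, away_pts in games:
        row = np.zeros(len(teams))
        row[index[home]] = 1.0
        row[index[away]] = -1.0
        margin = (home_pts - away_pts) - HOME_ADVANTAGE
        margin = max(-MOV_CAP, min(MOV_CAP, margin))  # cap blowouts
        rows.append(row)
        margins.append(margin)
    # Constraint row: ratings sum to zero (otherwise only differences are determined).
    rows.append(np.ones(len(teams)))
    margins.append(0.0)
    solution, *_ = np.linalg.lstsq(np.vstack(rows), np.array(margins), rcond=None)
    return {team: float(r) for team, r in zip(teams, solution)}


toy_games = [("A", "B", 35, 10), ("B", "C", 24, 21), ("C", "A", 14, 30)]
print(massey_ratings(["A", "B", "C"], toy_games))
```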

**Strengths:**
- Predictive accuracy for future games
- Recognizes dominant performances
- Mathematically rigorous

**Limitations:**
- Can overvalue blowout wins
- Sensitive to outlier games

---

#### Elo Rating System (20% Weight)

**Philosophy:** Dynamic game-by-game evaluation with temporal sensitivity.

**How It Works:**
- Ratings update after each game
- Upset wins gain more points than expected wins
- Season regression toward mean (1505 base rating)
- K-factor (32) controls rating volatility

**Formula:**
```
New Rating = Old Rating + K × (Actual - Expected)
Expected = 1 / (1 + 10^((Opponent Rating - Your Rating) / 400))
```
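
A minimal sketch of a single Elo update, using the two formulas above with K = 32. The function names are hypothetical, and margin-of-victory or home-field adjustments are deliberately left out.

```python
K_FACTOR = 32  # rating volatility, as listed above


def expected_score(rating: float, opponent_rating: float) -> float:
    """Win probability implied by the rating gap."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def update_elo(winner: float, loser: float, k: float = K_FACTOR) -> tuple[float, float]:
    """Return the (winner, loser) ratings after one game."""
    gain = k * (1.0 - expected_score(winner, loser))  # Actual = 1 for the winner
    return winner + gain, loser - gain                # the loser gives up the same amount


print(update_elo(1505, 1505))  # (1521.0, 1489.0)
print(update_elo(1400, 1600))  # an upset moves both ratings by more than 16 points
```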

**Strengths:**
- Rewards strength of schedule implicitly
- Captures momentum and improvement
- Well-tested in competitive ranking

**Limitations:**
- Early season volatility
- Slow to recognize drastic changes

---

#### Strength of Record (20% Weight)

**Philosophy:** Schedule difficulty and context-aware performance.

**How It Works:**
- Calculates probability of achieving team's record
- Compares against top-25 baseline performance
- Weights quality of opponents faced
- Intended to align with the CFP committee's emphasis on strength of schedule

**Formula:**
```
SOR = Probability(Top 25 team achieves this record vs. this schedule)
```
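
One way to read this formula is as the probability that a baseline top-25 team would win at least as many games against the same schedule. The sketch below computes that with a small dynamic program over per-game win probabilities. How those probabilities are estimated is outside this sketch, the "at least as many wins" interpretation is an assumption, and `strength_of_record` is a hypothetical name.

```python
def strength_of_record(baseline_win_probs, wins):
    """P(baseline team wins at least `wins` games), given its per-game win probabilities.

    A lower value means the record was harder to achieve, i.e. more impressive.
    """
    dist = [1.0]  # dist[k] = probability of exactly k wins so far
    for p in baseline_win_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1.0 - p)   # baseline team loses this game
            nxt[k + 1] += prob * p       # baseline team wins this game
        dist = nxt
    return sum(dist[wins:])


# Two-game schedule: an easy opponent (90%) and a toss-up (50%).
print(strength_of_record([0.9, 0.5], wins=2))
print(strength_of_record([0.9, 0.5], wins=1))
```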

**Strengths:**
- Rewards difficult schedules
- Context-aware (who did you play?)
- Matches committee thinking

**Limitations:**
- Complex calculation
- Requires opponent strength estimation

---

#### Win Percentage (15% Weight)

**Philosophy:** Raw performance baseline.

**How It Works:**
- Simple wins divided by total games
- Baseline sanity check for other algorithms
- Ensures undefeated teams get recognition

**Formula:**
```
Win % = Wins / (Wins + Losses)
```

**Strengths:**
- Transparent and simple
- Universal understanding

**Limitations:**
- Ignores schedule difficulty
- No context of opponent quality

---

### Ensemble Composite Score

The final composite score combines all five algorithms:

```
Composite = 0.20(Colley) + 0.25(Massey) + 0.20(Elo) + 0.20(SOR) + 0.15(Win%)
```
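
As a worked example of the weighted sum, the sketch below applies the default weights to one team's scores. It assumes each algorithm's score has already been rescaled to the 0-1 range; the function name and the sample scores are illustrative only.

```python
DEFAULT_WEIGHTS = {
    "colley": 0.20,
    "massey": 0.25,
    "elo": 0.20,
    "sor": 0.20,
    "win_pct": 0.15,
}


def composite_score(scores, weights=DEFAULT_WEIGHTS):
    """Weighted sum of normalized algorithm scores for one team."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


team_scores = {"colley": 0.8, "massey": 0.9, "elo": 0.7, "sor": 0.85, "win_pct": 1.0}
print(round(composite_score(team_scores), 3))  # 0.845
```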

**Why This Approach?**
- Balances resume (Colley, Win%) with power (Massey, Elo)
- Incorporates schedule difficulty (SOR)
- Reduces single-algorithm bias
- Creates stable, robust rankings

---

## 4. Ranking Weights Configuration

Customize the weight of each ranking algorithm in the composite score.

**Default Weights:**
- Colley Matrix: 20%
- Massey Ratings: 25%
- Elo System: 20%
- Strength of Record: 20%
- Win Percentage: 15%
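
If custom weights do not sum to 100%, one option is to rescale them proportionally rather than adjusting each value by hand. The helper below is a hypothetical sketch of that idea and is not used by the cell that follows.

```python
def normalize_weights(weights):
    """Rescale non-negative weights so they sum to 1.0, keeping their proportions."""
    if any(value < 0 for value in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in weights.items()}


custom = {"colley": 0.3, "massey": 0.3, "elo": 0.2, "sor": 0.2, "win_pct": 0.2}  # sums to 120%
print(normalize_weights(custom))
```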

In [None]:
# Default weights
DEFAULT_WEIGHTS = {
    'colley': 0.20,
    'massey': 0.25,
    'elo': 0.20,
    'sor': 0.20,
    'win_pct': 0.15
}

# Weight sliders
colley_weight = widgets.FloatSlider(
    value=DEFAULT_WEIGHTS['colley'],
    min=0.0,
    max=1.0,
    step=0.05,
    description='Colley Matrix:',
    readout_format='.0%',
    style={'description_width': 'initial'}
)

massey_weight = widgets.FloatSlider(
    value=DEFAULT_WEIGHTS['massey'],
    min=0.0,
    max=1.0,
    step=0.05,
    description='Massey Ratings:',
    readout_format='.0%',
    style={'description_width': 'initial'}
)

elo_weight = widgets.FloatSlider(
    value=DEFAULT_WEIGHTS['elo'],
    min=0.0,
    max=1.0,
    step=0.05,
    description='Elo System:',
    readout_format='.0%',
    style={'description_width': 'initial'}
)

sor_weight = widgets.FloatSlider(
    value=DEFAULT_WEIGHTS['sor'],
    min=0.0,
    max=1.0,
    step=0.05,
    description='Strength of Record:',
    readout_format='.0%',
    style={'description_width': 'initial'}
)

win_pct_weight = widgets.FloatSlider(
    value=DEFAULT_WEIGHTS['win_pct'],
    min=0.0,
    max=1.0,
    step=0.05,
    description='Win Percentage:',
    readout_format='.0%',
    style={'description_width': 'initial'}
)

# Display weights
display(HTML("<h3>Ranking Algorithm Weights</h3>"))
display(colley_weight)
display(massey_weight)
display(elo_weight)
display(sor_weight)
display(win_pct_weight)

# Weight validation
weight_validation = widgets.Output()

def validate_weights(change=None):
    total = (colley_weight.value + massey_weight.value + elo_weight.value + 
             sor_weight.value + win_pct_weight.value)
    
    weight_validation.clear_output()
    with weight_validation:
        if abs(total - 1.0) < 0.001:
            display(HTML(
                f"<div style='background-color: #d4edda; padding: 10px; border-radius: 5px;'>"
                f"<strong>Valid:</strong> Weights sum to {total:.1%}</div>"
            ))
        else:
            display(HTML(
                f"<div style='background-color: #f8d7da; padding: 10px; border-radius: 5px;'>"
                f"<strong>Warning:</strong> Weights sum to {total:.1%} (should be 100%)</div>"
            ))

# Attach validation to all sliders
for weight_slider in [colley_weight, massey_weight, elo_weight, sor_weight, win_pct_weight]:
    weight_slider.observe(validate_weights, names='value')

display(weight_validation)
validate_weights()

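# Store weights
# Note: this dict is a snapshot taken when this cell runs. Re-run the cell after
# moving a slider, or read the sliders' .value attributes directly.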
RANKING_WEIGHTS = {
    'colley': colley_weight.value,
    'massey': massey_weight.value,
    'elo': elo_weight.value,
    'sor': sor_weight.value,
    'win_pct': win_pct_weight.value
}

---

## 5. API Validation

Validate connection to CollegeFootballData.com API.

In [None]:
# Get API key
api_key = os.getenv('CFBD_API_KEY')

if not api_key:
    display(HTML(
        "<div style='background-color: #f8d7da; padding: 15px; border-radius: 5px;'>"
        "<h4>API Key Not Found</h4>"
        "<p>Please set your CFBD_API_KEY in the .env file.</p>"
        "<p>Get your free API key at: <a href='https://collegefootballdata.com/key' target='_blank'>https://collegefootballdata.com/key</a></p>"
        "</div>"
    ))
else:
    # Test API connection
    test_url = "https://api.collegefootballdata.com/teams/fbs"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "accept": "application/json"
    }
    params = {"year": year_widget.value}
    
    try:
        response = requests.get(test_url, headers=headers, params=params, timeout=10)
        
        if response.status_code == 200:
            teams_data = response.json()
            display(HTML(
                f"<div style='background-color: #d4edda; padding: 15px; border-radius: 5px;'>"
                f"<h4>API Connection Successful</h4>"
                f"<ul>"
                f"<li><strong>Status:</strong> Connected</li>"
                f"<li><strong>FBS Teams ({year_widget.value}):</strong> {len(teams_data)}</li>"
                f"<li><strong>API Endpoint:</strong> CollegeFootballData.com</li>"
                f"</ul>"
                f"</div>"
            ))
            
            # Show sample teams
            sample_teams = [team['school'] for team in teams_data[:10]]
            display(HTML(
                f"<div style='background-color: #f0f0f0; padding: 10px; border-radius: 5px; margin-top: 10px;'>"
                f"<strong>Sample Teams:</strong> {', '.join(sample_teams)}, ..."
                f"</div>"
            ))
        else:
            display(HTML(
                f"<div style='background-color: #f8d7da; padding: 15px; border-radius: 5px;'>"
                f"<h4>API Connection Failed</h4>"
                f"<p><strong>Status Code:</strong> {response.status_code}</p>"
                f"<p><strong>Error:</strong> {response.text}</p>"
                f"</div>"
            ))
    except Exception as e:
        display(HTML(
            f"<div style='background-color: #f8d7da; padding: 15px; border-radius: 5px;'>"
            f"<h4>API Connection Error</h4>"
            f"<p><strong>Error:</strong> {str(e)}</p>"
            f"</div>"
        ))

---

## 6. Analysis Workflow

Recommended sequence for running the complete analysis.

### Step-by-Step Guide

#### Step 1: Data Pipeline (`01_data_pipeline.ipynb`)
**Purpose:** Fetch and prepare FBS game data from CFBD API

**What It Does:**
- Retrieves FBS team list for selected season
- Fetches game results for specified week range
- Filters for FBS-only matchups
- Caches data locally for reproducibility

**Output:** `data/cache/fbs_games_{year}_week{week}.parquet`
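
The filtering step might look like the pandas sketch below. The column names (`week`, `home_team`, `away_team`), the function name, and the toy rows are assumptions for illustration; the actual pipeline notebook may structure its data differently.

```python
import pandas as pd


def filter_fbs_games(games: pd.DataFrame, fbs_teams: set,
                     start_week: int, end_week: int) -> pd.DataFrame:
    """Keep games where both teams are FBS and the week is within the range (inclusive)."""
    both_fbs = games["home_team"].isin(fbs_teams) & games["away_team"].isin(fbs_teams)
    in_range = games["week"].between(start_week, end_week)
    return games.loc[both_fbs & in_range].reset_index(drop=True)


games = pd.DataFrame({
    "week": [3, 6, 7],
    "home_team": ["FBS A", "FBS B", "FBS C"],
    "away_team": ["FBS B", "FBS A", "FCS X"],
})
fbs = {"FBS A", "FBS B", "FBS C"}
# Week 3 is too early and the week 7 game involves a non-FBS team, so one row remains.
print(filter_fbs_games(games, fbs, start_week=5, end_week=15))
```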

---

#### Step 2: Ranking Algorithms (`02_ranking_algorithms.ipynb`)
**Purpose:** Calculate individual ranking scores

**What It Does:**
- Implements Colley Matrix methodology
- Calculates Massey Ratings with MOV adjustments
- Computes Elo ratings game-by-game
- Determines Strength of Record
- Calculates basic win percentage

**Output:** Individual ranking DataFrames for each algorithm

---

#### Step 3: Composite Rankings (`03_composite_rankings.ipynb`)
**Purpose:** Combine algorithms into ensemble score

**What It Does:**
- Normalizes all ranking scores (0-1 scale)
- Applies configured weights
- Calculates weighted composite score
- Generates final team rankings

**Output:** Composite rankings with all algorithm scores
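
Normalization matters because the raw scores live on very different scales (Elo ratings near 1500, Colley ratings near 0.5). A minimal sketch using min-max scaling follows; the text above only specifies a 0-1 scale, so the choice of min-max is an assumption, as is the function name.

```python
def min_max_normalize(scores):
    """Rescale a {team: score} mapping so the best team gets 1.0 and the worst 0.0."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All teams are tied; place everyone at the midpoint.
        return {team: 0.5 for team in scores}
    return {team: (value - low) / (high - low) for team, value in scores.items()}


elo = {"A": 1600, "B": 1500, "C": 1400}
print(min_max_normalize(elo))  # {'A': 1.0, 'B': 0.5, 'C': 0.0}
```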

---

#### Step 4: Playoff Selection (`04_playoff_selection.ipynb`)
**Purpose:** Select 12-team playoff field per CFP protocol

**What It Does:**
- Identifies conference champions
- Awards automatic bids to the 5 highest-ranked conference champions
- Selects 7 at-large bids by composite ranking
- Seeds teams according to CFP bracket format
- Generates first-round matchups

**Output:** 12-team playoff bracket with seeding
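
The selection logic can be sketched as below: take the five highest-ranked conference champions, fill the remaining seven spots by composite rank, then pair seeds 5-12 for the first round while seeds 1-4 receive byes. Seeding purely by composite rank is a simplifying assumption here, and the function names are hypothetical.

```python
def select_playoff_field(ranked_teams, champions, n_auto_bids=5, field_size=12):
    """ranked_teams: teams ordered best-first; champions: set of conference champions."""
    auto_bids = [t for t in ranked_teams if t in champions][:n_auto_bids]
    at_large = [t for t in ranked_teams if t not in auto_bids][: field_size - len(auto_bids)]
    field = set(auto_bids) | set(at_large)
    # Simplification: seed the field in composite-rank order.
    return [t for t in ranked_teams if t in field]


def first_round_matchups(seeds):
    """Pair seed 5 vs 12, 6 vs 11, 7 vs 10, 8 vs 9 (seeds 1-4 have byes)."""
    return [(seeds[4 + i], seeds[11 - i]) for i in range(4)]


ranked = [f"T{i}" for i in range(1, 21)]
champs = {"T1", "T3", "T6", "T15", "T18", "T20"}
seeds = select_playoff_field(ranked, champs)
print(seeds)
print(first_round_matchups(seeds))
```

In this toy example T15 and T18 enter as champions ahead of the higher-ranked T11 and T12, while T20 misses out as the sixth-ranked champion.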

---

#### Step 5: Validation & Backtesting (`05_validation_backtesting.ipynb`)
**Purpose:** Compare simulator results to actual CFP selections

**What It Does:**
- Runs simulator on historical seasons (2014-2023, the four-team CFP era)
- Calculates selection accuracy
- Computes ranking correlation metrics
- Identifies systematic biases

**Output:** Validation metrics and comparison analysis
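
Two of these metrics can be sketched in plain Python: selection accuracy as the share of the actual field that the simulator also picked, and Spearman's rank correlation between two orderings of the same teams. The function names are hypothetical, and the Spearman formula used here assumes no ties.

```python
def selection_accuracy(predicted_field, actual_field):
    """Fraction of the actual playoff field that the simulator also selected."""
    actual = set(actual_field)
    return len(set(predicted_field) & actual) / len(actual)


def spearman_rho(order_a, order_b):
    """Spearman correlation between two best-first orderings of the same teams."""
    if set(order_a) != set(order_b) or len(order_a) != len(set(order_a)):
        raise ValueError("both orderings must contain the same unique teams")
    n = len(order_a)
    if n < 2:
        raise ValueError("need at least two teams")
    rank_b = {team: i for i, team in enumerate(order_b)}
    d_squared = sum((i - rank_b[team]) ** 2 for i, team in enumerate(order_a))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


print(selection_accuracy(["A", "B", "C", "D"], ["A", "B", "C", "E"]))  # 0.75
print(spearman_rho(["A", "B", "C", "D"], ["D", "C", "B", "A"]))        # -1.0
```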

---

#### Step 6: Quick Simulator (`06_quick_simulator.ipynb`)
**Purpose:** Streamlined end-to-end analysis

**What It Does:**
- Combines all steps into single workflow
- Minimal output, focus on final results
- Ideal for weekly updates during season

**Output:** Top 25 rankings and 12-team playoff bracket

---

### Quick Start Commands

Run the notebooks in JupyterLab in this order:
1. `00_configuration.ipynb` (this notebook)
2. `01_data_pipeline.ipynb`
3. `02_ranking_algorithms.ipynb`
4. `03_composite_rankings.ipynb`
5. `04_playoff_selection.ipynb`

Or use the quick simulator for streamlined analysis:
- `06_quick_simulator.ipynb`

---

## Configuration Export

Save current configuration for use in other notebooks.

In [None]:
# Update configuration from widgets
config = {
    'season_year': year_widget.value,
    'analysis_week': week_widget.value,
    'start_week': start_week_widget.value,
    'weights': {
        'colley': colley_weight.value,
        'massey': massey_weight.value,
        'elo': elo_weight.value,
        'sor': sor_weight.value,
        'win_pct': win_pct_weight.value
    },
    'timestamp': datetime.now().isoformat()
}

# Save to file
config_dir = Path('../data/cache')
config_dir.mkdir(parents=True, exist_ok=True)
config_file = config_dir / 'simulation_config.json'

with open(config_file, 'w') as f:
    json.dump(config, f, indent=2)

display(HTML(
    f"<div style='background-color: #d4edda; padding: 15px; border-radius: 5px;'>"
    f"<h4>Configuration Saved</h4>"
    f"<p><strong>Location:</strong> {config_file}</p>"
    f"<p>This configuration will be used by subsequent notebooks.</p>"
    f"</div>"
))

# Display configuration
print("\nCurrent Configuration:")
print(json.dumps(config, indent=2))

---

## Summary

Configuration complete. You can now proceed to the analysis notebooks.

### Next Steps:

1. Verify configuration settings above
2. Start with `01_data_pipeline.ipynb` to fetch game data
3. Continue through notebooks 02-04 for complete analysis
4. Use `06_quick_simulator.ipynb` for streamlined weekly updates
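
A later notebook could read the saved configuration back with something like the sketch below. The default path matches the export cell above; `load_config` itself and its weight check are assumptions, not an existing project function.

```python
import json
from pathlib import Path


def load_config(path="../data/cache/simulation_config.json"):
    """Load the exported configuration and sanity-check the weights."""
    config_path = Path(path)
    if not config_path.exists():
        raise FileNotFoundError(
            f"{config_path} not found; run the configuration export cell first"
        )
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    total = sum(config["weights"].values())
    if abs(total - 1.0) > 1e-3:
        raise ValueError(f"weights sum to {total:.3f}, expected 1.0")
    return config
```

Typical usage would be `config = load_config()` followed by `config["season_year"]` and `config["weights"]`.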

### Need Help?

- **Documentation:** See `docs/` directory for detailed guides
- **Methodology:** Review the Ranking Methodology Overview (Section 3) in this notebook
- **Issues:** Report problems in the project repository

---