# Project Beat Vegas Demo

This notebook walks through the end-to-end workflow for Project Beat Vegas: fetching NFL data, engineering features, training baseline models, and visualizing the outputs.

## Setup

Install dependencies (run in a terminal) and authenticate to any data sources if needed:

```bash
pip install -r requirements.txt
```

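Import the project modules, configure logging, and set the training seasons, the validation season, and the NFL data cache directory. Caching avoids repeated downloads on later runs.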

In [None]:
from pathlib import Path
import logging

import pandas as pd

from beat_vegas import data_load, features, pipeline, visuals

logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s: %(message)s')

SEASONS = [2021, 2022]
VALIDATION_SEASON = 2023
CACHE_DIR = Path('data/pbp')
CACHE_DIR.mkdir(parents=True, exist_ok=True)
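
The idea behind the cache directory can be sketched as a simple read-through file cache. This is a minimal illustration only: `load_cached`, the `key` naming scheme, and the pickle format are assumptions made for this example, and how `beat_vegas` actually stores its cached files is not shown in this notebook.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_cached(key: str, cache_dir: Path, fetch: Callable[[], Any]) -> Any:
    """Return the cached object for `key`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{key}.pkl"  # hypothetical naming scheme
    if cache_file.exists():
        # Cache hit: read the stored object instead of fetching again.
        with cache_file.open("rb") as fh:
            return pickle.load(fh)
    value = fetch()  # slow path, e.g. a network download
    with cache_file.open("wb") as fh:
        pickle.dump(value, fh)
    return value
```

The first call for a given `key` pays the download cost and writes a file under `cache_dir`; later calls with the same `key` read that file and never invoke `fetch`.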

In [None]:
# Load the training seasons plus the held-out validation season,
# so that rows for VALIDATION_SEASON exist when the models are evaluated.
datasets = pipeline.load_core_datasets(SEASONS + [VALIDATION_SEASON], cache_dir=CACHE_DIR)
feature_config = features.FeatureConfig()
model_df = pipeline.build_model_ready_frame(datasets['weekly'], datasets['schedule'], feature_config)
# Sort chronologically by season and week.
model_df = model_df.sort_values(['season', 'week'])
feature_cols = pipeline.default_feature_columns(model_df)

# Train the baseline models, holding out VALIDATION_SEASON.
results = pipeline.train_baseline_models(model_df, feature_cols, validation_seasons=[VALIDATION_SEASON])
moneyline_preds = results['moneyline'][0].predictions
total_preds = results['totals'][0].predictions

# Market totals for the validation season, used as the comparison baseline.
schedule = datasets['schedule']
market_totals = (
    schedule.loc[schedule['season'] == VALIDATION_SEASON, ['game_id', 'total_line']]
    .rename(columns={'total_line': 'market_total'})
)

visuals.plot_win_probability_distribution(moneyline_preds)
visuals.plot_total_points_distribution(total_preds)
visuals.plot_model_vs_market(total_preds, market_totals)
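
Beyond the plot, the model-versus-market comparison can also be inspected as a table. The sketch below joins predictions to market totals on `game_id` and computes the difference. The helper `totals_edge`, the `predicted_total` column name, and the toy numbers are assumptions for illustration; the actual columns of `total_preds` are not shown in this notebook and may differ.

```python
import pandas as pd


def totals_edge(preds: pd.DataFrame, market: pd.DataFrame,
                pred_col: str = "predicted_total") -> pd.DataFrame:
    """Join predictions to market totals on game_id and compute model minus market."""
    # validate="one_to_one" raises if either frame has duplicate game_id values.
    merged = preds.merge(market, on="game_id", how="inner", validate="one_to_one")
    merged["edge"] = merged[pred_col] - merged["market_total"]
    # Positive edge: the model expects more points than the market line.
    merged["lean"] = merged["edge"].apply(
        lambda e: "over" if e > 0 else "under" if e < 0 else "none"
    )
    return merged


# Made-up numbers for illustration only; not real predictions or lines.
toy_preds = pd.DataFrame({"game_id": ["g1", "g2", "g3"],
                          "predicted_total": [47.5, 41.0, 44.0]})
toy_market = pd.DataFrame({"game_id": ["g1", "g2", "g3"],
                           "market_total": [45.0, 43.5, 44.0]})
print(totals_edge(toy_preds, toy_market))
```

With the real frames, the call would look like `totals_edge(total_preds, market_totals, pred_col=...)`, passing whichever column holds the predicted total.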