# NFL Fantasy Football Projection Model Using XGBoost

This notebook walks through building an NFL fantasy football projection model with XGBoost. Data comes from three sources: Yahoo, Pro-Football Reference, and SportsDataIO. The goal is to generate per-player projections that support fantasy football analysis and lineup decisions.

This notebook will use weekly data from the 2024 NFL Regular Season. 

In [1]:
import pandas as pd
import numpy as np
import os
import sys

project_root = os.path.abspath(os.path.join(os.getcwd(), os.pardir))
print(f"Project Root: {project_root}")
print("Sys Path Before:", sys.path)
if project_root not in sys.path:
    print("Inserting project root to sys.path")
    sys.path.insert(0, project_root)

# Project-local modules (importable once project_root is on sys.path)
from data_api import SportsDataIO, Yahoo, PFR
from utils import describe_endpoint
from yahoo_helpers import get_all_players, get_player_details, get_player_stats

from dotenv import load_dotenv
load_dotenv()

Project Root: c:\Users\bengu\Documents\Sports Analysis Project\clairvoyent-raven-sports-analysis\src
Sys Path Before: ['C:\\Users\\bengu\\AppData\\Local\\Programs\\Python\\Python310\\python310.zip', 'C:\\Users\\bengu\\AppData\\Local\\Programs\\Python\\Python310\\DLLs', 'C:\\Users\\bengu\\AppData\\Local\\Programs\\Python\\Python310\\lib', 'C:\\Users\\bengu\\AppData\\Local\\Programs\\Python\\Python310', 'c:\\Users\\bengu\\.virtualenvs\\cfeproj-oIABPDjj', '', 'c:\\Users\\bengu\\.virtualenvs\\cfeproj-oIABPDjj\\lib\\site-packages', 'c:\\Users\\bengu\\.virtualenvs\\cfeproj-oIABPDjj\\lib\\site-packages\\win32', 'c:\\Users\\bengu\\.virtualenvs\\cfeproj-oIABPDjj\\lib\\site-packages\\win32\\lib', 'c:\\Users\\bengu\\.virtualenvs\\cfeproj-oIABPDjj\\lib\\site-packages\\Pythonwin']
Inserting project root to sys.path


True

In [2]:
# Initialize API wrappers
sdio_api = SportsDataIO(api_key=os.getenv("SPORTS_DATA_IO_API_KEY"))
yahoo_api = Yahoo(os.getenv("YAHOO_OAUTH_KEYS_PATH"))
pfr_api = PFR()

[2025-09-29 11:12:11,506 DEBUG] [yahoo_oauth.oauth.__init__] Checking 
[2025-09-29 11:12:11,516 DEBUG] [yahoo_oauth.oauth.token_is_valid] ELAPSED TIME : 764.3847229480743
[2025-09-29 11:12:11,517 DEBUG] [yahoo_oauth.oauth.token_is_valid] TOKEN IS STILL VALID


Getting league key
League key: 461.l.242497


In [None]:
"""
Define position groupings

Maps each stat category to the set of position abbreviations it covers.
QB appears under both "rushing_and_receiving" and "passing".
"""

position_dict = {
    "rushing_and_receiving": {"RB", "WR", "TE", "FB", "QB"},
    "kicking": {"K"},
    "passing": {"QB"},
    "defense": {"DE", "CB", "LB", "FS", "OLB", "S", "DT", "ILB", "DL", "DB", "SS", "NT"},
}
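
As a minimal sketch of how this mapping might be used, the snippet below inverts it into a position-to-categories lookup. The names `position_to_categories` and `categories_for` are hypothetical helpers introduced for illustration; they are not part of the notebook's modules.

```python
from collections import defaultdict

position_dict = {
    "rushing_and_receiving": {"RB", "WR", "TE", "FB", "QB"},
    "kicking": {"K"},
    "passing": {"QB"},
    "defense": {"DE", "CB", "LB", "FS", "OLB", "S", "DT", "ILB", "DL", "DB", "SS", "NT"},
}

# Invert the mapping: position abbreviation -> set of stat categories
position_to_categories = defaultdict(set)
for category, positions in position_dict.items():
    for pos in positions:
        position_to_categories[pos].add(category)


def categories_for(position):
    """Return the sorted stat categories for a position (empty list if unmapped)."""
    return sorted(position_to_categories.get(position, set()))


print(categories_for("QB"))  # ['passing', 'rushing_and_receiving']
print(categories_for("OT"))  # []
```

A lookup in this direction makes it easy to decide which stat columns apply to a given player row.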


## Extract and Clean Necessary Data

Required data:

* 2024 Regular Season Weekly Stats (SportsDataIO)
* 2023 Regular Season Yearly Stats (Yahoo Fantasy)


In [3]:
all_players = get_all_players()
print(f"Total players loaded: {len(all_players)}")

Total players loaded: 2165


In [12]:
player_game_stats_2024 = sdio_api.fantasy.get_player_game_stats(season="2024REG", week=1)
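
The call above fetches a single week. Since the notebook targets weekly data for the whole 2024 regular season (18 weeks), one possible way to collect every week is sketched below. It assumes the fetch function accepts `season` and `week` keyword arguments and returns a pandas DataFrame, as the week 1 call suggests. `load_weekly_stats`, `fake_fetch`, and the `RequestedWeek` column are hypothetical names used only for this illustration.

```python
import pandas as pd


def load_weekly_stats(fetch_week, season="2024REG", weeks=range(1, 19)):
    """Call fetch_week(season=..., week=...) for each week and stack the results."""
    frames = []
    for week in weeks:
        df = fetch_week(season=season, week=week)
        # Tag each row with the week that was requested; the real payload
        # may already carry its own week column.
        frames.append(df.assign(RequestedWeek=week))
    return pd.concat(frames, ignore_index=True)


# Stand-in fetcher so the sketch runs without API credentials
def fake_fetch(season, week):
    return pd.DataFrame({"Name": ["Player A", "Player B"], "Position": ["RB", "QB"]})


season_stats = load_weekly_stats(fake_fetch)
print(season_stats.shape)  # (36, 3)
```

With the notebook's wrapper, the equivalent call would presumably be `load_weekly_stats(sdio_api.fantasy.get_player_game_stats)`.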

In [13]:
player_game_stats_2024["Position"].unique()

array(['RB', 'QB', 'WR', 'K', 'DE', 'CB', 'LB', 'FS', 'TE', 'OLB', 'S',
       'DT', 'ILB', 'DL', 'DB', 'SS', 'NT', 'FB', 'OT', 'OL', 'LS', 'G',
       'C', 'P'], dtype=object)
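
The week 1 data contains positions that `position_dict` does not cover. The sketch below lists them and filters a frame down to mapped positions only. The `sample` DataFrame is a made-up stand-in for `player_game_stats_2024`; only the position codes are taken from the output above.

```python
import pandas as pd

# Position codes as reported by the week 1 data
observed_positions = [
    "RB", "QB", "WR", "K", "DE", "CB", "LB", "FS", "TE", "OLB", "S",
    "DT", "ILB", "DL", "DB", "SS", "NT", "FB", "OT", "OL", "LS", "G",
    "C", "P",
]

position_dict = {
    "rushing_and_receiving": {"RB", "WR", "TE", "FB", "QB"},
    "kicking": {"K"},
    "passing": {"QB"},
    "defense": {"DE", "CB", "LB", "FS", "OLB", "S", "DT", "ILB", "DL", "DB", "SS", "NT"},
}

# Union of every position that appears in at least one category
covered = set().union(*position_dict.values())
uncovered = sorted(set(observed_positions) - covered)
print(uncovered)  # ['C', 'G', 'LS', 'OL', 'OT', 'P']

# Stand-in frame (hypothetical rows) filtered to mapped positions only
sample = pd.DataFrame({"Name": ["A", "B", "C"], "Position": ["RB", "OT", "K"]})
mapped_only = sample[sample["Position"].isin(covered)].reset_index(drop=True)
print(mapped_only["Position"].tolist())  # ['RB', 'K']
```

The unmapped codes are offensive linemen, long snappers, and punters, so dropping them before modeling is one reasonable option.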

## Apply the XGBoost Model

In [None]:
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.datasets import fetch_california_housing