# NBA Fantasy Draft Tool - 2025-26 Season
## Yahoo! Fantasy Basketball - Advanced Analytics

This notebook provides:
1. Player rankings optimized for Yahoo scoring
2. Consistency analysis (favoring predictable performers)
3. Injury risk modeling
4. Real-time draft tracker

**League Settings:**
- Scoring: PTS(1), REB(1.2), AST(1.5), STL(3), BLK(3), TO(-1)
- Roster: PG, SG, G, SF, PF, F, C, C, UTIL, UTIL, BN, BN, BN
- Draft: Live Standard, 12 teams
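
As a quick sanity check on the scoring settings above, here is a minimal sketch of how one game's stat line converts to fantasy points. The function name, the dictionary keys, and the example stat line are illustrative assumptions; the notebook's own modules may compute this differently.

```python
# Scoring weights copied from the league settings above
SCORING = {"PTS": 1.0, "REB": 1.2, "AST": 1.5, "STL": 3.0, "BLK": 3.0, "TO": -1.0}

def fantasy_points(stat_line):
    """Return fantasy points for one game; stats missing from the line count as zero."""
    return sum(weight * stat_line.get(stat, 0) for stat, weight in SCORING.items())

# Hypothetical stat line, for illustration only
example_game = {"PTS": 25, "REB": 10, "AST": 8, "STL": 2, "BLK": 1, "TO": 3}
example_total = fantasy_points(example_game)
print(f"{example_total:.1f}")  # 25 + 12 + 12 + 6 + 3 - 3 = 55.0
```

Steals and blocks are each worth three points here, so defensive specialists gain more under these settings than their raw box score suggests.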

In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from ipywidgets import interact, widgets
from IPython.display import display, HTML, clear_output
import warnings
warnings.filterwarnings('ignore')

# Set style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✓ Libraries loaded successfully!")


## Step 1: Data Collection

**Note:** First run may take 15-30 minutes to scrape all data.
Data will be cached for subsequent runs.

In [None]:
from data_collector import NBADataCollector
import os

# Check if data already exists
data_dir = 'data'
if os.path.exists(f'{data_dir}/game_logs.parquet'):
    print("Loading cached data...")
    game_logs = pd.read_parquet(f'{data_dir}/game_logs.parquet')
    standings = pd.read_csv(f'{data_dir}/standings.csv') if os.path.exists(f'{data_dir}/standings.csv') else None
    injury_data = pd.read_csv(f'{data_dir}/injury_data.csv') if os.path.exists(f'{data_dir}/injury_data.csv') else None
    print(f"✓ Loaded {len(game_logs):,} game logs")
else:
    print("Collecting fresh data (this will take ~20 minutes)...")
    collector = NBADataCollector(
        seasons=[2022, 2023, 2024, 2025],
        weights={2025: 0.40, 2024: 0.30, 2023: 0.20, 2022: 0.10}
    )
    data = collector.collect_all_data()
    collector.save_data(data, output_dir=data_dir)
    
    game_logs = data['game_logs']
    standings = data['standings']
    injury_data = data['injury_data']
    print("✓ Data collection complete!")

print(f"\nData Summary:")
print(f"  - Total games: {len(game_logs):,}")
print(f"  - Unique players: {game_logs['player_name'].nunique()}")
print(f"  - Seasons: {sorted(game_logs['season_end_year'].unique())}")

## Step 2: Feature Engineering

Creating advanced features:
- **Consistency Score**: Statistical measure of reliability
- **Optimized Moving Averages**: ML-weighted recent performance
- **Injury Risk Score**: Based on games missed and playing time volatility
- **Age Adjustment**: Peak performance curves (25-29 years old)
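
The exact formulas live in `feature_engineering.py`, which is not shown here. As a rough sketch of what a consistency feature could look like, the example below derives floor, ceiling, and coefficient of variation from per-game fantasy points. The `summarize_player` helper and the mapping from coefficient of variation to a 0-100 score are assumptions for illustration, not the notebook's actual definition.

```python
import pandas as pd

def summarize_player(fp):
    """Summarize a Series of per-game fantasy points for one player."""
    mean_fp = fp.mean()
    std_fp = fp.std()
    coef_variation = std_fp / mean_fp if mean_fp else float("nan")
    return {
        "avg_fp": mean_fp,
        "floor": fp.quantile(0.10),    # 10th percentile game
        "ceiling": fp.quantile(0.90),  # 90th percentile game
        "coef_variation": coef_variation,
        # Assumed mapping: CV of 0 -> 100, CV of 1 or more -> 0
        "consistency_score": 100 * max(0.0, 1.0 - coef_variation),
    }

# Two hypothetical players with the same average but different volatility
steady = summarize_player(pd.Series([40, 42, 38, 41, 39], dtype=float))
streaky = summarize_player(pd.Series([20, 60, 25, 55, 40], dtype=float))
print(steady["consistency_score"] > streaky["consistency_score"])
```

Both toy players average 40 fantasy points, but the steadier one has a much higher floor, which is the property the draft score later rewards.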

In [None]:
from feature_engineering import FantasyFeatureEngineer

print("Engineering features...")
engineer = FantasyFeatureEngineer(
    game_logs_df=game_logs,
    injury_df=injury_data,
    standings_df=standings
)

player_features = engineer.create_all_features()

# Display sample
print("\nSample Player Features:")
display(player_features.head(10))

# Save features
player_features.to_csv('data/player_features.csv', index=False)
print("\n✓ Features saved to data/player_features.csv")

## Step 3: Machine Learning Model

Train an XGBoost model to project 2025-26 fantasy points per game

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Prepare training data
# Note: 'avg_fp' is the regression target below, so it is left out of the
# inputs; including it would let the model simply copy the answer.
feature_columns = [
    'median_fp', 'std_fp', 'floor', 'ceiling',
    'consistency_score', 'coef_variation', 'iqr_ratio',
    'optimized_ma_fp', 'optimized_ma_points', 'optimized_ma_rebounds', 'optimized_ma_assists',
    'injury_risk_score', 'age_adjustment', 'age', 'games_played_count'
]

# Filter to players with minimum games played
min_games = 30
model_data = player_features[player_features['games_played_count'] >= min_games].copy()

X = model_data[feature_columns].fillna(0)
y = model_data['avg_fp']  # Target: average fantasy points

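# Train-test split first (80% for training), so the scaler never sees test rows
X_train_raw, X_test_raw, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale features: fit on the training rows only, then apply everywhere
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train_raw)
X_test = scaler.transform(X_test_raw)
X_scaled = scaler.transform(X)  # all players, used for the final projections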

# Train XGBoost model with optimized parameters
print("Training XGBoost model...")
model = xgb.XGBRegressor(
    objective='reg:squarederror',
    learning_rate=0.05,
    max_depth=5,
    n_estimators=500,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.5,
    reg_lambda=1.0,
    random_state=42
)

model.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    verbose=False
)

# Evaluate
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"\nModel Performance:")
print(f"  MAE: {mae:.2f} fantasy points")
print(f"  RMSE: {rmse:.2f} fantasy points")

# Make predictions for all players
model_data['projected_fp'] = model.predict(X_scaled)

# Apply age and injury adjustments
model_data['adjusted_projection'] = (
    model_data['projected_fp'] * 
    model_data['age_adjustment'] * 
    (1 - model_data['injury_risk_score'] / 200)  # Reduce by up to 50% for high injury risk
)

print("\n✓ Projections complete!")
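
To make the adjustment arithmetic in the cell above concrete, here is a small worked sketch. The `adjust_projection` helper and the input values are hypothetical; only the formula itself is taken from the cell.

```python
def adjust_projection(projected_fp, age_adjustment, injury_risk_score):
    """Apply the age multiplier and the injury discount from the cell above."""
    # Risk 0 keeps the full projection; risk 100 halves it
    injury_multiplier = 1 - injury_risk_score / 200
    return projected_fp * age_adjustment * injury_multiplier

# Hypothetical inputs, for illustration only
healthy_prime = adjust_projection(45.0, 1.00, 0)    # 45.0 * 1.00 * 1.00
older_riskier = adjust_projection(45.0, 0.96, 40)   # 45.0 * 0.96 * 0.80
print(f"{healthy_prime:.2f} vs {older_riskier:.2f}")
```

With the same raw projection, a 0.96 age multiplier combined with an injury risk of 40 removes a little over ten fantasy points per game, so the two adjustments compound rather than add.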

## Step 4: Player Rankings & Analysis

Generate overall draft rankings (position eligibility is not in the dataset; see the Position Scarcity section)

In [None]:
# Calculate Value Over Replacement Player (VORP)
replacement_level = model_data['adjusted_projection'].quantile(0.40)
model_data['vorp'] = model_data['adjusted_projection'] - replacement_level

# Create composite draft score
# Weights: 50% projection, 30% consistency, 20% floor
model_data['draft_score'] = (
    0.50 * model_data['adjusted_projection'] +
    0.30 * model_data['consistency_score'] +
    0.20 * model_data['floor']
)

# Apply contender adjustment: +5% for players on playoff contenders, -5% for everyone else
model_data['draft_score'] = np.where(
    model_data['is_contender'],
    model_data['draft_score'] * 1.05,
    model_data['draft_score'] * 0.95
)

# Sort by draft score
rankings = model_data.sort_values('draft_score', ascending=False).reset_index(drop=True)
rankings['overall_rank'] = range(1, len(rankings) + 1)

# Display top 50
display_cols = [
    'overall_rank', 'player_name', 'team', 'age',
    'adjusted_projection', 'consistency_score', 'floor', 'ceiling',
    'injury_risk_score', 'is_contender'
]

print("\n=== TOP 50 DRAFT TARGETS ===")
print("\nLegend:")
print("  - Adjusted Projection: Expected fantasy points per game")
print("  - Consistency Score: Higher = more reliable")
print("  - Floor/Ceiling: 10th/90th percentile performance")
print("  - Injury Risk: 0-100 (lower is better)\n")

display(rankings[display_cols].head(50))

# Save rankings
rankings.to_csv('data/draft_rankings.csv', index=False)
print("\n✓ Rankings saved to data/draft_rankings.csv")

## Step 5: Visualization - Player Profiles

Visualize why top players are ranked highly

In [None]:
def plot_player_profile(player_name, rankings_df, game_logs_df):
    """Create comprehensive player analysis visualization"""
    
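    # Get player data (bail out early if the name is not in the rankings)
    matches = rankings_df[rankings_df['player_name'] == player_name]
    if matches.empty:
        print(f"{player_name} not found in rankings")
        return
    player_row = matches.iloc[0]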
    player_games = game_logs_df[
        (game_logs_df['player_name'] == player_name) &
        (game_logs_df['season_end_year'] == 2025)
    ].sort_values('date')
    
    if len(player_games) == 0:
        print(f"No 2024-25 games found for {player_name}")
        return
    
    # Create figure
    fig = plt.figure(figsize=(16, 10))
    gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)
    
    # Title
    fig.suptitle(
        f"{player_name} - Rank #{int(player_row['overall_rank'])} | {player_row['team']} | Age {int(player_row['age'])}",
        fontsize=18, fontweight='bold'
    )
    
    # 1. Fantasy Points Trend
    ax1 = fig.add_subplot(gs[0, :])
    ax1.plot(range(len(player_games)), player_games['fantasy_points'], 
             marker='o', linewidth=2, markersize=4, alpha=0.6, label='Actual')
    ax1.axhline(player_row['adjusted_projection'], color='red', 
                linestyle='--', linewidth=2, label=f"Projection: {player_row['adjusted_projection']:.1f}")
    ax1.axhline(player_row['floor'], color='orange', 
                linestyle=':', linewidth=1.5, label=f"Floor: {player_row['floor']:.1f}")
    ax1.axhline(player_row['ceiling'], color='green', 
                linestyle=':', linewidth=1.5, label=f"Ceiling: {player_row['ceiling']:.1f}")
    ax1.set_title('Fantasy Points Per Game (2024-25 Season)', fontsize=14, fontweight='bold')
    ax1.set_xlabel('Game Number')
    ax1.set_ylabel('Fantasy Points')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # 2. Distribution of Fantasy Points
    ax2 = fig.add_subplot(gs[1, 0])
    ax2.hist(player_games['fantasy_points'], bins=20, color='skyblue', edgecolor='black', alpha=0.7)
    ax2.axvline(player_row['avg_fp'], color='red', linestyle='--', linewidth=2, label='Average')
    ax2.set_title('Fantasy Points Distribution', fontweight='bold')
    ax2.set_xlabel('Fantasy Points')
    ax2.set_ylabel('Frequency')
    ax2.legend()
    
    # 3. Key Metrics (horizontal bars, each scaled to 0-100)
    ax3 = fig.add_subplot(gs[1, 1])
    metrics = ['Projection', 'Consistency', 'Floor', 'Health']
    values = [
        min(100, player_row['adjusted_projection'] / rankings_df['adjusted_projection'].max() * 100),
        min(100, player_row['consistency_score'] / rankings_df['consistency_score'].max() * 100),
        min(100, player_row['floor'] / rankings_df['floor'].max() * 100),
        100 - player_row['injury_risk_score']
    ]
    bars = ax3.barh(metrics, values, color=['#ff9999', '#66b3ff', '#99ff99', '#ffcc99'])
    ax3.set_xlim(0, 100)
    ax3.set_title('Key Metrics (Scaled 0-100)', fontweight='bold')
    ax3.set_xlabel('Score (0-100)')
    for i, bar in enumerate(bars):
        ax3.text(values[i] + 2, i, f"{values[i]:.0f}", va='center')
    
    # 4. Stats Breakdown
    ax4 = fig.add_subplot(gs[1, 2])
    stat_categories = ['Points', 'Rebounds', 'Assists', 'Stocks']
    stat_values = [
        player_games['points_scored'].mean() if 'points_scored' in player_games else 0,
        (player_games['offensive_rebounds'].mean() + player_games['defensive_rebounds'].mean()) if 'offensive_rebounds' in player_games else 0,
        player_games['assists'].mean() if 'assists' in player_games else 0,
        (player_games['steals'].mean() + player_games['blocks'].mean()) if 'steals' in player_games else 0
    ]
    bars = ax4.bar(stat_categories, stat_values, color='lightcoral', edgecolor='black')
    ax4.set_title('Average Stats Per Game', fontweight='bold')
    ax4.set_ylabel('Per Game Average')
    for bar in bars:
        height = bar.get_height()
        ax4.text(bar.get_x() + bar.get_width()/2., height,
                f"{height:.1f}", ha='center', va='bottom')
    
    # 5. Summary Stats Box
    ax5 = fig.add_subplot(gs[2, :])
    ax5.axis('off')
    
    summary_text = f"""
    PLAYER SUMMARY:
    
    Projected Fantasy Points/Game: {player_row['adjusted_projection']:.1f}
    Consistency Score: {player_row['consistency_score']:.1f} (Coef. of Var: {player_row['coef_variation']:.3f})
    Floor (10th %ile): {player_row['floor']:.1f} | Ceiling (90th %ile): {player_row['ceiling']:.1f}
    Injury Risk Score: {player_row['injury_risk_score']:.0f}/100
    Age Adjustment: {player_row['age_adjustment']:.2f}x
    Team: {player_row['team']} ({"Contender" if player_row['is_contender'] else "Non-Playoff"})
    Games Played (Recent): {player_row['games_played_count']:.0f}
    
    DRAFT RECOMMENDATION:
    {"Elite pick - High floor and consistency" if player_row['overall_rank'] <= 12
     else "Solid mid-round value" if player_row['overall_rank'] <= 50
     else "Late round upside play"}
    """
    
    ax5.text(0.1, 0.5, summary_text, fontsize=11, verticalalignment='center',
             fontfamily='monospace', bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
    
    plt.tight_layout()
    plt.show()

# Plot top 5 players
print("\nGenerating player profile visualizations...\n")
for i in range(min(5, len(rankings))):
    player = rankings.iloc[i]['player_name']
    print(f"Creating profile for: {player}")
    plot_player_profile(player, rankings, game_logs)

## Step 6: Real-Time Draft Tracker

Interactive tool to track picks and get recommendations during your live draft

In [None]:
class DraftTracker:
    def __init__(self, rankings_df, num_teams=12, your_pick=1):
        self.rankings = rankings_df.copy()
        self.num_teams = num_teams
        self.your_pick = your_pick
        self.drafted_players = []
        self.your_team = []
        self.current_round = 1
        self.available = rankings_df.copy()
        
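    def mark_player_drafted(self, player_name, by_you=False):
        """Mark a player as drafted and advance the round counter"""
        # Re-running the cell with the same name must not count as a new pick
        if player_name in self.drafted_players:
            print(f"{player_name} is already marked as drafted")
            return
        self.drafted_players.append(player_name)
        if by_you:
            self.your_team.append(player_name)
        self.available = self.available[self.available['player_name'] != player_name]
        # Round of the next pick (1-indexed)
        self.current_round = len(self.drafted_players) // self.num_teams + 1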
        
    def get_recommendations(self, n=10):
        """Get top N available players"""
        return self.available.head(n)[[
            'overall_rank', 'player_name', 'team', 'age',
            'adjusted_projection', 'consistency_score', 'floor', 'injury_risk_score'
        ]]
    
    def next_pick_number(self):
        """Return how many picks remain before your next turn (0 = you are on the clock)"""
        picks_made = len(self.drafted_players)
        current_round = (picks_made // self.num_teams) + 1
        pick_in_round = picks_made % self.num_teams  # 0-indexed slot of the next pick

        # Your 0-indexed slot in forward (odd) and reversed (even) rounds
        forward_slot = self.your_pick - 1
        reverse_slot = self.num_teams - self.your_pick
        if current_round % 2 == 1:  # Odd rounds go 1->N
            this_slot, next_slot = forward_slot, reverse_slot
        else:  # Even rounds go N->1
            this_slot, next_slot = reverse_slot, forward_slot

        if pick_in_round <= this_slot:
            # Your pick in this round is still to come (or is happening now)
            return this_slot - pick_in_round
        # You already picked this round: finish it, then count into the next one
        return (self.num_teams - pick_in_round) + next_slot
    
    def display_status(self):
        """Display current draft status"""
        clear_output(wait=True)
        
        print("="*80)
        print(f"  NBA FANTASY DRAFT TRACKER - Round {self.current_round}")
        print("="*80)
        print(f"Your Pick Position: {self.your_pick} of {self.num_teams}")
        print(f"Total Picks Made: {len(self.drafted_players)}")
        print(f"Picks Until Your Turn: {self.next_pick_number()}")
        print(f"\nYour Current Team ({len(self.your_team)} players):")
        if self.your_team:
            for i, p in enumerate(self.your_team, 1):
                print(f"  {i}. {p}")
        else:
            print("  (none yet)")
        
        print(f"\nTOP {min(15, len(self.available))} AVAILABLE PLAYERS:")
        display(self.get_recommendations(15))

# Initialize draft tracker
print("\n=== DRAFT TRACKER SETUP ===")
print("\nBefore starting your draft, update these settings:\n")

# User inputs
num_teams = 12  # Change this if needed
your_pick_position = 6  # UPDATE THIS when you learn your draft position!

tracker = DraftTracker(
    rankings_df=rankings,
    num_teams=num_teams,
    your_pick=your_pick_position
)

print(f"Draft tracker initialized for {num_teams}-team league")
print(f"Your draft position: #{your_pick_position}")
print("\nReady to start tracking!")
tracker.display_status()
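
The snake order that the tracker relies on can be checked independently of the class. The sketch below lists the overall pick numbers for one draft slot; `snake_pick_numbers` is a hypothetical helper, and the default of 13 rounds assumes one round per roster spot listed in the league settings.

```python
def snake_pick_numbers(slot, num_teams=12, rounds=13):
    """Return the 1-indexed overall pick numbers for a draft slot in a snake draft."""
    picks = []
    for rnd in range(1, rounds + 1):
        # Odd rounds run 1..num_teams; even rounds run in reverse
        position = slot if rnd % 2 == 1 else num_teams - slot + 1
        picks.append((rnd - 1) * num_teams + position)
    return picks

first_four = snake_pick_numbers(6, rounds=4)
print(first_four)  # [6, 19, 30, 43]
```

For slot 6 in a 12-team league there are 12 other picks between your first and second selections (picks 7 through 18), which is a handy number to compare against the tracker's "Picks Until Your Turn" line.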

### Draft Tracking - Mark Players as Drafted

Run this cell repeatedly during your draft to update availability

In [None]:
# Mark a player as drafted
# Change player_name to whoever just got picked
# Set by_you=True if YOU drafted them

player_name = "Nikola Jokic"  # CHANGE THIS
by_you = False  # Set to True if you picked them

tracker.mark_player_drafted(player_name, by_you=by_you)
tracker.display_status()

### Quick Search - Find Specific Player

Search for a specific player's ranking and stats

In [None]:
def search_player(player_name_search):
    """Search for player in rankings"""
    results = tracker.available[
        tracker.available['player_name'].str.contains(player_name_search, case=False, na=False, regex=False)
    ][[
        'overall_rank', 'player_name', 'team', 'age',
        'adjusted_projection', 'consistency_score', 'floor', 'ceiling',
        'injury_risk_score', 'is_contender'
    ]]
    
    if len(results) == 0:
        print(f"No available players found matching '{player_name_search}'")
    else:
        display(results)

# Example search
search_player("Curry")  # Change this to search for any player


## Step 7: Export Rankings for Reference

Export to Excel for easy reference during draft

In [None]:
# Export to Excel with multiple sheets

output_file = 'NBA_Fantasy_Draft_Guide_2025.xlsx'

with pd.ExcelWriter(output_file, engine='openpyxl') as writer:
    # Overall rankings
    rankings[[
        'overall_rank', 'player_name', 'team', 'age',
        'adjusted_projection', 'consistency_score', 'floor', 'ceiling',
        'injury_risk_score', 'vorp', 'is_contender'
    ]].to_excel(writer, sheet_name='Overall Rankings', index=False)
    
    # Top 100
    rankings.head(100)[[
        'overall_rank', 'player_name', 'team', 'adjusted_projection', 'floor'
    ]].to_excel(writer, sheet_name='Top 100', index=False)
    
    # High consistency players
    rankings.nlargest(50, 'consistency_score')[[
        'player_name', 'team', 'consistency_score', 'coef_variation', 'floor'
    ]].to_excel(writer, sheet_name='Most Consistent', index=False)
    
    # Low injury risk
    rankings.nsmallest(50, 'injury_risk_score')[[
        'player_name', 'team', 'injury_risk_score', 'games_played_count', 'adjusted_projection'
    ]].to_excel(writer, sheet_name='Healthy Players', index=False)
    
    # High upside (high ceiling)
    rankings.nlargest(50, 'ceiling')[[
        'player_name', 'team', 'ceiling', 'avg_fp', 'adjusted_projection'
    ]].to_excel(writer, sheet_name='High Ceiling', index=False)

print(f"✓ Draft guide exported to {output_file}")
print("\nYou can print this file or keep it open during your draft!")

## Summary Statistics

In [None]:
print("=" * 80)
print("  FINAL RANKINGS SUMMARY")
print("=" * 80)

print(f"\nTotal Players Ranked: {len(rankings)}")
print(f"Average Projected FP/G: {rankings['adjusted_projection'].mean():.2f}")
print(f"Median Projected FP/G: {rankings['adjusted_projection'].median():.2f}")

print("\nTop 5 Most Consistent Players:")
for i, row in rankings.nlargest(5, 'consistency_score').iterrows():
    print(f"  {row['player_name']:25s} - Consistency: {row['consistency_score']:.1f}, CV: {row['coef_variation']:.3f}")

print("\nTop 5 Highest Ceilings:")
for i, row in rankings.nlargest(5, 'ceiling').iterrows():
    print(f"  {row['player_name']:25s} - Ceiling: {row['ceiling']:.1f} FP/G")

print("\nTop 5 Lowest Injury Risk:")
for i, row in rankings.nsmallest(5, 'injury_risk_score').iterrows():
    print(f"  {row['player_name']:25s} - Risk: {row['injury_risk_score']:.0f}/100, Games: {row['games_played_count']:.0f}")

print("\n" + "=" * 80)
print("Draft Tool Ready! Good luck with your draft on Monday Oct 20!")
print("=" * 80)

## Additional Analysis - Compare Multiple Players

In [None]:
def compare_players(player_names_list):
    """Compare multiple players side by side"""
    comparison = rankings[rankings['player_name'].isin(player_names_list)][
        ['player_name', 'overall_rank', 'adjusted_projection', 'consistency_score',
         'floor', 'ceiling', 'injury_risk_score', 'age']
    ].sort_values('overall_rank')
    
    print("\nPLAYER COMPARISON:")
    print("=" * 100)
    display(comparison)
    
    # Visual comparison
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    
    # Projection comparison
    axes[0].bar(comparison['player_name'], comparison['adjusted_projection'], color='steelblue')
    axes[0].set_title('Projected Fantasy Points/Game', fontweight='bold')
    axes[0].set_ylabel('Fantasy Points')
    axes[0].tick_params(axis='x', rotation=45)
    
    # Consistency comparison
    axes[1].bar(comparison['player_name'], comparison['consistency_score'], color='forestgreen')
    axes[1].set_title('Consistency Score', fontweight='bold')
    axes[1].set_ylabel('Consistency')
    axes[1].tick_params(axis='x', rotation=45)
    
    # Injury risk comparison (lower is better)
    axes[2].bar(comparison['player_name'], comparison['injury_risk_score'], color='coral')
    axes[2].set_title('Injury Risk Score (Lower = Better)', fontweight='bold')
    axes[2].set_ylabel('Risk Score')
    axes[2].tick_params(axis='x', rotation=45)
    
    plt.tight_layout()
    plt.show()

# Example: Compare similar-ranked players
compare_players(['Stephen Curry', 'Luka Doncic', 'Kevin Durant'])

## Bonus: Position Scarcity Analysis

Analyze which positions have the most depth

In [None]:
# Note: This requires position data which we don't have in the current dataset
# You would need to manually add position eligibility
# or scrape it from another source (Yahoo, ESPN, etc.)

print("Position scarcity analysis requires position eligibility data.")
print("To add this feature, you would need to:")
print("  1. Scrape position data from Yahoo Fantasy Basketball")
print("  2. Merge with rankings DataFrame")
print("  3. Analyze top players by position")
print("\nFor now, manually verify positions on Yahoo's website.")
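
Once position eligibility has been collected, steps 2 and 3 listed above could look roughly like the sketch below. The toy tables, the `positions` column, and its comma-separated format are assumptions; real data would come from `rankings` and whatever position source you choose.

```python
import pandas as pd

# Toy stand-ins for the rankings and a separately collected position table
toy_rankings = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C", "Player D"],
    "adjusted_projection": [52.0, 48.5, 44.0, 41.2],
})
toy_positions = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C"],
    "positions": ["PG,SG", "C", "PF,C"],  # assumed comma-separated eligibility
})

# Step 2: merge, keeping every ranked player so gaps are visible
merged = toy_rankings.merge(toy_positions, on="player_name", how="left")
unmatched = merged[merged["positions"].isna()]["player_name"].tolist()

# Step 3: one row per (player, eligible position), then summarize depth
by_position = (
    merged.dropna(subset=["positions"])
          .assign(position=lambda df: df["positions"].str.split(","))
          .explode("position")
)
depth = by_position.groupby("position")["adjusted_projection"].agg(["count", "mean"])
print(depth)
print("Missing position data:", unmatched)
```

A left merge is used so that players without a match stay in the table and can be listed, rather than silently dropping out of the rankings.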

## Final Checklist Before Draft

**Pre-Draft (Day Before):**
- [ ] Rerun entire notebook for fresh data
- [ ] Review top 50 players
- [ ] Check injury news for your top targets
- [ ] Print or save Excel guide

**Draft Day (30 min before):**
- [ ] Update your draft position in Step 6
- [ ] Initialize draft tracker
- [ ] Have Excel guide open
- [ ] Test player search function

**During Draft:**
- [ ] Mark each pick as it happens
- [ ] Review top available after each pick
- [ ] Balance positions (don't draft 3 centers!)
- [ ] Trust the consistency scores for early rounds
- [ ] Take upside swings in late rounds

**Good luck! 🏀**