# Fantasy Football Database Setup with Sleeper API

This notebook demonstrates setting up a PostgreSQL database for fantasy football statistics and connecting to the Sleeper API to collect data. We'll build the foundation for a comprehensive fantasy football analytics platform.

## Project Overview
- **Database**: PostgreSQL for robust data storage
- **API**: Sleeper Fantasy Football platform
- **Analytics**: Foundation for future data science workflows

---

## 1. Install Required Packages

First, we'll install all the necessary Python packages for our fantasy football database project.

In [None]:
# Install required packages
!pip install sleeper-py psycopg2-binary sqlalchemy pandas numpy matplotlib seaborn python-dotenv requests

## 2. Import Libraries and Dependencies

Import all the libraries we'll need for database operations, API calls, and data analysis.

In [None]:
# Core libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import requests
import json
from datetime import datetime, timedelta
import os
from dotenv import load_dotenv

# Database libraries
import psycopg2
from sqlalchemy import create_engine, text
import sqlalchemy as sa

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
plt.style.use('seaborn-v0_8')

print("Libraries imported successfully!")

## 3. Database Setup and Connection

Configure PostgreSQL connection and create database engine. Make sure you have PostgreSQL installed and running.

In [None]:
# Load environment variables
load_dotenv()

# Database configuration
DB_CONFIG = {
    'host': os.getenv('DB_HOST', 'localhost'),
    'port': os.getenv('DB_PORT', '5432'),
    'database': os.getenv('DB_NAME', 'ffb_stats'),
    'user': os.getenv('DB_USER', 'postgres'),
    'password': os.getenv('DB_PASSWORD', 'your_password')
}

# Create connection string
connection_string = f"postgresql://{DB_CONFIG['user']}:{DB_CONFIG['password']}@{DB_CONFIG['host']}:{DB_CONFIG['port']}/{DB_CONFIG['database']}"

# Create SQLAlchemy engine
try:
    engine = create_engine(connection_string, echo=False)
    
    # Test connection
    with engine.connect() as conn:
        result = conn.execute(text("SELECT version()"))
        version = result.fetchone()[0]
        print(f"Connected to PostgreSQL: {version[:50]}...")
        
except Exception as e:
    print(f"Database connection failed: {e}")
    print("Please ensure PostgreSQL is running and credentials are correct.")
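
The f-string above places the password into the URL verbatim, so a password containing characters such as `@`, `:` or `/` would likely be misparsed. A minimal sketch of one way around this, assuming percent-encoding with the standard library is acceptable; the helper name `build_postgres_url` and the sample password are hypothetical and not part of the original notebook:

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, port, database):
    """Build a PostgreSQL URL, percent-encoding the user name and password."""
    # safe="" makes quote() encode every reserved character, including "/"
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials containing characters that are special in URLs
example_url = build_postgres_url(
    "postgres", "p@ss:w/rd", "localhost", "5432", "ffb_stats"
)
print(example_url)
```

The result could be passed to `create_engine` in place of `connection_string`. SQLAlchemy's `URL.create` constructor is an alternative that avoids string formatting altogether.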

## 4. Create Database Schema

Define and create the initial database schema for our fantasy football data.

In [None]:
# SQL schema definitions
schema_sql = {
    'users': """
    CREATE TABLE IF NOT EXISTS users (
        id VARCHAR PRIMARY KEY,
        username VARCHAR UNIQUE NOT NULL,
        display_name VARCHAR,
        avatar VARCHAR,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    
    'leagues': """
    CREATE TABLE IF NOT EXISTS leagues (
        id VARCHAR PRIMARY KEY,
        name VARCHAR NOT NULL,
        season VARCHAR NOT NULL,
        sport VARCHAR DEFAULT 'nfl',
        status VARCHAR,
        season_type VARCHAR,
        total_rosters INTEGER,
        scoring_settings JSONB,
        roster_positions JSONB,
        settings JSONB,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    
    'players': """
    CREATE TABLE IF NOT EXISTS players (
        id VARCHAR PRIMARY KEY,
        player_id VARCHAR UNIQUE,
        first_name VARCHAR,
        last_name VARCHAR,
        full_name VARCHAR,
        position VARCHAR,
        team VARCHAR,
        college VARCHAR,
        height VARCHAR,
        weight VARCHAR,
        age INTEGER,
        years_exp INTEGER,
        active BOOLEAN DEFAULT TRUE,
        injury_status VARCHAR,
        fantasy_data_id VARCHAR,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    
    'rosters': """
    CREATE TABLE IF NOT EXISTS rosters (
        id SERIAL PRIMARY KEY,
        roster_id INTEGER NOT NULL,
        league_id VARCHAR REFERENCES leagues(id),
        owner_id VARCHAR REFERENCES users(id),
        co_owners JSONB,
        wins INTEGER DEFAULT 0,
        losses INTEGER DEFAULT 0,
        ties INTEGER DEFAULT 0,
        waiver_position INTEGER,
        waiver_budget_used INTEGER DEFAULT 0,
        total_moves INTEGER DEFAULT 0,
        settings JSONB,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    
    'matchups': """
    CREATE TABLE IF NOT EXISTS matchups (
        id SERIAL PRIMARY KEY,
        matchup_id INTEGER,
        league_id VARCHAR REFERENCES leagues(id),
        roster_id INTEGER,
        week INTEGER NOT NULL,
        points DECIMAL,
        points_against DECIMAL,
        starters JSONB,
        starters_points JSONB,
        players JSONB,
        players_points JSONB,
        custom_points DECIMAL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
}

# Create tables
try:
    with engine.connect() as conn:
        for table_name, sql in schema_sql.items():
            conn.execute(text(sql))
            print(f"✓ Created table: {table_name}")
        conn.commit()
    print("\nDatabase schema created successfully!")
    
except Exception as e:
    print(f"Error creating schema: {e}")

## 5. Initialize Sleeper API Client

Set up the Sleeper API client and test connectivity.

In [None]:
class SleeperAPI:
    """Simple Sleeper API client."""
    
    def __init__(self):
        self.base_url = "https://api.sleeper.app/v1"
        
    def get(self, endpoint):
        """Make GET request to Sleeper API."""
        url = f"{self.base_url}/{endpoint}"
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"API request failed: {e}")
            return None
    
    def get_nfl_state(self):
        """Get current NFL season state."""
        return self.get("state/nfl")
    
    def get_user(self, username):
        """Get user by username."""
        return self.get(f"user/{username}")
    
    def get_user_leagues(self, user_id, sport="nfl", season="2024"):
        """Get user's leagues."""
        return self.get(f"user/{user_id}/leagues/{sport}/{season}")
    
    def get_players(self, sport="nfl"):
        """Get all players."""
        return self.get(f"players/{sport}")
    
    def get_league_rosters(self, league_id):
        """Get league rosters."""
        return self.get(f"league/{league_id}/rosters")
    
    def get_league_matchups(self, league_id, week):
        """Get league matchups for a week."""
        return self.get(f"league/{league_id}/matchups/{week}")

# Initialize API client
sleeper = SleeperAPI()

# Test API connectivity
print("Testing Sleeper API connectivity...")
nfl_state = sleeper.get_nfl_state()

if nfl_state:
    print("✓ Connected to Sleeper API")
    print(f"Current NFL Season: {nfl_state.get('season')}")
    print(f"Current Week: {nfl_state.get('week')}")
    print(f"Season Type: {nfl_state.get('season_type')}")
else:
    print("✗ Failed to connect to Sleeper API")
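
Since `SleeperAPI.get` returns `None` when a request fails, a transient network error currently ends the attempt. One possible way to add retries is sketched below, assuming a simple exponential backoff is good enough; the helper `call_with_retry` and its parameters are hypothetical and not part of the client above:

```python
import time


def call_with_retry(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or attempts run out.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    The sleep function is injectable so the helper can be tested without waiting.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        # Only wait if another attempt is still to come
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None
```

With the client above this might be used as `nfl_state = call_with_retry(sleeper.get_nfl_state)`. Calls that need arguments can be wrapped in a `lambda`, for example `call_with_retry(lambda: sleeper.get_user(sample_username))`.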

## 6. Fetch Sample Data from Sleeper API

Retrieve sample data from the Sleeper API to test our integration.

In [None]:
# For demo purposes, fetch some sample data.
# Replace 'example_user' below with any public Sleeper username to also
# fetch that user's profile and leagues; otherwise that step is skipped.
sample_username = "example_user"  # Replace with an actual username

print("Fetching sample data from Sleeper API...")

# 1. Get NFL players (the full list is downloaded; only the first 100 are kept for the demo)
print("\n1. Fetching NFL players...")
players_data = sleeper.get_players()
if players_data:
    print(f"Total NFL players available: {len(players_data)}")
    
    # Convert to DataFrame for easier analysis
    players_list = []
    for player_id, player_info in list(players_data.items())[:100]:  # First 100 for demo
        player_info['sleeper_id'] = player_id
        players_list.append(player_info)
    
    players_df = pd.DataFrame(players_list)
    print(f"Sample players DataFrame shape: {players_df.shape}")
    print("\nSample players:")
    print(players_df[['full_name', 'position', 'team']].head(10))
    
else:
    print("Failed to fetch players data")
    players_df = pd.DataFrame()

# 2. Try to get user data (if username is provided)
user_data = None
leagues_data = None

if sample_username != "example_user":
    print(f"\n2. Fetching user data for: {sample_username}")
    user_data = sleeper.get_user(sample_username)
    
    if user_data:
        print(f"User found: {user_data.get('display_name', user_data.get('username'))}")
        
        # Get user's leagues
        user_id = user_data['user_id']
        leagues_data = sleeper.get_user_leagues(user_id)
        
        if leagues_data:
            print(f"Found {len(leagues_data)} leagues for user")
            leagues_df = pd.DataFrame(leagues_data)
            print("\nUser's leagues:")
            print(leagues_df[['name', 'season', 'total_rosters', 'status']].head())
        else:
            print("No leagues found for user")
    else:
        print("User not found")
else:
    print("\n2. Skipping user data (replace 'example_user' with actual username)")

print("\nData fetching complete!")
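
The cell above downloads the complete player list on every run, even though only 100 rows are used. A minimal sketch of caching the response on disk, assuming a day-old copy is fresh enough for this project; the helper `load_cached_json`, the file name in the usage note, and the 24-hour default are hypothetical choices:

```python
import json
import os
import time


def load_cached_json(path, fetch, max_age_seconds=24 * 60 * 60):
    """Return JSON from path if the file is fresh enough, else fetch and cache it."""
    if os.path.exists(path):
        age = time.time() - os.path.getmtime(path)
        if age < max_age_seconds:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)

    data = fetch()
    if data is not None:  # do not cache failed requests
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f)
    return data
```

In this notebook the call could look like `players_data = load_cached_json("players_nfl.json", sleeper.get_players)`, so repeated runs read the local file instead of calling the API again.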

## 7. Data Processing and Validation

Process and clean the API response data before inserting into the database.

In [None]:
def clean_player_data(player_dict):
    """Clean and validate player data."""
    cleaned = {
        'id': player_dict.get('sleeper_id'),
        'player_id': player_dict.get('player_id'),
        'first_name': player_dict.get('first_name'),
        'last_name': player_dict.get('last_name'),
        'full_name': player_dict.get('full_name'),
        'position': player_dict.get('position'),
        'team': player_dict.get('team'),
        'college': player_dict.get('college'),
        'height': player_dict.get('height'),
        'weight': player_dict.get('weight'),
        'age': player_dict.get('age'),
        'years_exp': player_dict.get('years_exp'),
        'active': player_dict.get('active', True),
        'injury_status': player_dict.get('injury_status'),
        'fantasy_data_id': player_dict.get('fantasy_data_id')
    }
    
    # Convert numeric fields
    for field in ['age', 'years_exp']:
        if cleaned[field] is not None:
            try:
                cleaned[field] = int(cleaned[field])
            except (ValueError, TypeError):
                cleaned[field] = None
    
    return cleaned

# Process players data if we have it
if not players_df.empty:
    print("Processing player data...")
    
    # Clean the data
    processed_players = []
    for _, player in players_df.iterrows():
        cleaned_player = clean_player_data(player.to_dict())
        processed_players.append(cleaned_player)
    
    processed_players_df = pd.DataFrame(processed_players)
    
    print(f"Processed {len(processed_players_df)} players")
    print("\nData types:")
    print(processed_players_df.dtypes)
    
    print("\nSample processed data:")
    print(processed_players_df[['full_name', 'position', 'team', 'age', 'years_exp']].head())
    
    # Check for missing values
    print("\nMissing values:")
    missing_counts = processed_players_df.isnull().sum()
    print(missing_counts[missing_counts > 0])
    
else:
    print("No player data to process")
    processed_players_df = pd.DataFrame()
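
One detail worth checking in the step above: rows taken from a DataFrame via `player.to_dict()` can hold `NaN` rather than `None` for fields a player lacks, and because the key is present, defaults such as `player_dict.get('active', True)` are not applied. A small sketch of normalizing a record before cleaning it; the helper name `nan_to_none` is hypothetical:

```python
import math


def nan_to_none(record):
    """Return a copy of record with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }
```

It could be applied as `clean_player_data(nan_to_none(player.to_dict()))`, leaving the rest of the cleaning logic unchanged.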

## 8. Insert Data into Database

Insert the processed data into our PostgreSQL database.

In [None]:
def insert_players_batch(df, engine):
    """Insert players data into database using batch insert."""
    if df.empty:
        return 0
    
    try:
        # Use pandas to_sql for efficient batch insert
        rows_inserted = df.to_sql(
            'players', 
            engine, 
            if_exists='append', 
            index=False,
            method='multi'
        )
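        # Older pandas versions return None from to_sql; fall back to the row count
        return rows_inserted if rows_inserted is not None else len(df)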
    except Exception as e:
        print(f"Error inserting players: {e}")
        return 0

# Insert sample players data
if not processed_players_df.empty:
    print("Inserting players into database...")
    
    try:
        # First, let's check if any players already exist
        with engine.connect() as conn:
            existing_count = conn.execute(text("SELECT COUNT(*) FROM players")).fetchone()[0]
            print(f"Existing players in database: {existing_count}")
        
        # Insert only when the table is empty (a coarse guard against duplicate rows)
        if existing_count == 0:
            rows_inserted = insert_players_batch(processed_players_df, engine)
            print(f"Inserted {rows_inserted} players into database")
        else:
            print("Players already exist in database, skipping insert")
            
    except Exception as e:
        print(f"Database insert error: {e}")

# Insert user data if available
if user_data:
    print("\nInserting user data...")
    
    user_record = {
        'id': user_data['user_id'],
        'username': user_data.get('username'),
        'display_name': user_data.get('display_name'),
        'avatar': user_data.get('avatar')
    }
    
    try:
        user_df = pd.DataFrame([user_record])
        user_df.to_sql('users', engine, if_exists='append', index=False)
        print("User data inserted successfully")
    except Exception as e:
        print(f"Error inserting user data: {e}")

# Insert leagues data if available
if leagues_data:
    print("\nInserting leagues data...")
    
    try:
        # Prepare leagues DataFrame
        leagues_df = pd.DataFrame(leagues_data)
        leagues_df = leagues_df.rename(columns={'league_id': 'id'})
        
        # Select only the columns we need
        columns_to_keep = ['id', 'name', 'season', 'sport', 'status', 
                          'season_type', 'total_rosters']
        leagues_df = leagues_df[columns_to_keep]
        
        leagues_df.to_sql('leagues', engine, if_exists='append', index=False)
        print(f"Inserted {len(leagues_df)} leagues into database")
        
    except Exception as e:
        print(f"Error inserting leagues data: {e}")

print("\nData insertion complete!")
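
The inserts above use `if_exists='append'`, so re-running the cell against a populated table either skips the players or fails on the primary key for users and leagues. A minimal sketch of an upsert statement built for PostgreSQL's `INSERT ... ON CONFLICT`, assuming the table and column names come from trusted code rather than user input; the helper `build_upsert_sql` is hypothetical:

```python
def build_upsert_sql(table, columns, conflict_column="id"):
    """Build an INSERT ... ON CONFLICT DO UPDATE statement with named parameters.

    Assumes columns contains at least one column besides conflict_column,
    and that all identifiers are trusted (they are interpolated directly).
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join(f":{col}" for col in columns)
    updates = ", ".join(
        f"{col} = EXCLUDED.{col}" for col in columns if col != conflict_column
    )
    return (
        f"INSERT INTO {table} ({column_list}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT ({conflict_column}) DO UPDATE SET {updates}"
    )


print(build_upsert_sql("players", ["id", "full_name", "team"]))
```

The statement could then be run with SQLAlchemy inside a transaction, for example `conn.execute(text(sql), records)` within `with engine.begin() as conn:`, where `records` is a list of dicts such as `processed_players_df.to_dict("records")`.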

## 9. Verify Database Operations

Query the database to verify successful data insertion and explore our data.

In [None]:
# Verify data insertion by querying each table
print("Verifying database operations...\n")

tables_to_check = ['users', 'leagues', 'players', 'rosters', 'matchups']

for table in tables_to_check:
    try:
        # Get row count
        with engine.connect() as conn:
            count_result = conn.execute(text(f"SELECT COUNT(*) FROM {table}"))
            count = count_result.fetchone()[0]
            print(f"📊 {table.capitalize()}: {count} records")
            
            # Show sample data if records exist
            if count > 0:
                sample_query = f"SELECT * FROM {table} LIMIT 3"
                sample_df = pd.read_sql(sample_query, conn)
                print(f"Sample {table} data:")
                print(sample_df.to_string(index=False))
                print("\n" + "-"*80 + "\n")
                
    except Exception as e:
        print(f"Error checking {table}: {e}")

## Summary and Next Steps

🎉 **Congratulations!** You've successfully:

1. ✅ Set up a PostgreSQL database for fantasy football data
2. ✅ Connected to the Sleeper API
3. ✅ Created database tables for users, leagues, players, rosters, and matchups
4. ✅ Fetched and processed sample data from Sleeper
5. ✅ Inserted data into the database
6. ✅ Verified successful data operations

### Next Steps for Your Fantasy Football Analytics Platform:

1. **Data Collection Automation**
   - Set up scheduled jobs to regularly sync data from Sleeper
   - Implement incremental updates to avoid duplicates
   - Add error handling and logging

2. **Advanced Analytics**
   - Player performance trends
   - League competitiveness analysis
   - Waiver wire recommendations
   - Trade analysis and recommendations

3. **Data Science Features**
   - Machine learning models for player projections
   - Clustering analysis for player similarities
   - Time series forecasting for season outcomes

4. **Visualization Dashboard**
   - Interactive plots with Plotly/Dash
   - League standings and matchup visualizations
   - Player performance heatmaps

5. **API Extensions**
   - Integrate additional data sources (ESPN, Yahoo, etc.)
   - Add real-time game updates
   - Include injury reports and news

### Configuration Notes:
- Remember to update your `.env` file with actual database credentials
- Replace the sample username with your actual Sleeper username
- Consider implementing database migrations for schema changes
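
As a quick check for the first note, the sketch below lists which of the variables read in the database setup cell are unset or empty. The helper name `missing_env_vars` is hypothetical; the variable names are the ones used in `DB_CONFIG`:

```python
import os

REQUIRED_VARS = ["DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASSWORD"]


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in required that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` after `load_dotenv()` returns an empty list when the `.env` file supplies every value. Any names it does return are the ones that would otherwise fall back to the placeholder defaults.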

Happy analyzing! 🏈📊