# Load Kaggle Dataset
This notebook downloads the Kaggle dataset at https://www.kaggle.com/datasets/eoinamoore/historical-nba-data-and-player-box-scores.

Make sure your Kaggle API authentication key is available in your environment before running the download cell.
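
A quick pre-flight check can confirm that the key is discoverable before the download runs. The sketch below assumes the two locations the Kaggle API client conventionally reads: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file under `~/.kaggle` (or under `KAGGLE_CONFIG_DIR` if that variable is set). The helper name `kaggle_credentials_source` is hypothetical and is not part of the `kaggle` package.

```python
import os
from pathlib import Path


def kaggle_credentials_source(env=None, home=None):
    """Return a description of where Kaggle credentials were found, or None.

    Hypothetical helper: it only checks that credentials exist in the
    assumed locations, not that they are valid.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "environment variables"

    # Option 2: a kaggle.json file in the configuration directory.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR") or home / ".kaggle")
    key_file = config_dir / "kaggle.json"
    if key_file.is_file():
        return str(key_file)

    return None


source = kaggle_credentials_source()
print(f"Kaggle credentials: {source or 'not found'}")
```

If this prints `not found`, the download cell is likely to fail at authentication, so it is worth fixing the key location first.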

The Kaggle data will be saved to ../data/*.csv (relative to this notebook).

In [3]:
import kaggle

# Download the dataset
kaggle.api.dataset_download_files(
    'eoinamoore/historical-nba-data-and-player-box-scores',
    path='../data',  # where to save
    unzip=True      # automatically unzip
)

Dataset URL: https://www.kaggle.com/datasets/eoinamoore/historical-nba-data-and-player-box-scores


# Explore the Dataset
Now we will load the downloaded CSV files into pandas DataFrames so that we can easily view and explore them.

**Import libraries**

In [None]:
# Import libraries for data exploration
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

print("✅ Libraries imported successfully!")

**Load tables into pandas DataFrames**

In [7]:
# Load the main datasets
player_stats = pd.read_csv('../data/PlayerStatistics.csv')
games = pd.read_csv('../data/Games.csv')
players = pd.read_csv('../data/Players.csv')

print(f"📊 Player Statistics: {player_stats.shape}")
print(f"🏀 Games: {games.shape}")
print(f"👥 Players: {players.shape}")

📊 Player Statistics: (1633902, 35)
🏀 Games: (72097, 17)
👥 Players: (6678, 14)
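
With CSV files this large, pandas may infer column types chunk by chunk and emit a `DtypeWarning` when a column mixes types. A minimal sketch of one way to make the types explicit is shown below. It uses a tiny in-memory sample rather than the real files, and the choice of which columns to pin (`gameLabel` here) is an assumption for illustration only.

```python
import io

import pandas as pd

# Tiny stand-in for PlayerStatistics.csv (values taken from the preview rows).
sample_csv = io.StringIO(
    "personId,gameId,gameDate,gameLabel,points\n"
    "1627734,22500197,2025-11-09T21:00:00Z,,20.0\n"
    "1628370,22500197,2025-11-09T21:00:00Z,,2.0\n"
)

sample = pd.read_csv(
    sample_csv,
    dtype={"gameLabel": "string"},  # pin a mostly-empty text column to one type
    low_memory=False,               # infer types from the whole file, not per chunk
)

# Parse the ISO 8601 timestamps into timezone-aware datetimes.
sample["gameDate"] = pd.to_datetime(sample["gameDate"], utc=True)

print(sample.dtypes)
```

Note that `low_memory=False` trades memory for consistent type inference, which may matter on a table with more than a million rows.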



**Explore the player statistics table**

In [8]:
# Check the structure of player statistics
print("=== PLAYER STATISTICS COLUMNS ===")
print(player_stats.columns.tolist())
print("\n=== FIRST 3 ROWS ===")
player_stats.head(3)

=== PLAYER STATISTICS COLUMNS ===
['firstName', 'lastName', 'personId', 'gameId', 'gameDate', 'playerteamCity', 'playerteamName', 'opponentteamCity', 'opponentteamName', 'gameType', 'gameLabel', 'gameSubLabel', 'seriesGameNumber', 'win', 'home', 'numMinutes', 'points', 'assists', 'blocks', 'steals', 'fieldGoalsAttempted', 'fieldGoalsMade', 'fieldGoalsPercentage', 'threePointersAttempted', 'threePointersMade', 'threePointersPercentage', 'freeThrowsAttempted', 'freeThrowsMade', 'freeThrowsPercentage', 'reboundsDefensive', 'reboundsOffensive', 'reboundsTotal', 'foulsPersonal', 'turnovers', 'plusMinusPoints']

=== FIRST 3 ROWS ===


Unnamed: 0,firstName,lastName,personId,gameId,gameDate,playerteamCity,playerteamName,opponentteamCity,opponentteamName,gameType,gameLabel,gameSubLabel,seriesGameNumber,win,home,numMinutes,points,assists,blocks,steals,fieldGoalsAttempted,fieldGoalsMade,fieldGoalsPercentage,threePointersAttempted,threePointersMade,threePointersPercentage,freeThrowsAttempted,freeThrowsMade,freeThrowsPercentage,reboundsDefensive,reboundsOffensive,reboundsTotal,foulsPersonal,turnovers,plusMinusPoints
0,Domantas,Sabonis,1627734,22500197,2025-11-09T21:00:00Z,Sacramento,Kings,Minnesota,Timberwolves,,,,,0,1,29.5,20.0,3.0,0.0,1.0,17.0,5.0,0.294,2.0,0.0,0.0,12.0,10.0,0.833,8.0,5.0,13.0,4.0,3.0,-19.0
1,Malik,Monk,1628370,22500197,2025-11-09T21:00:00Z,Sacramento,Kings,Minnesota,Timberwolves,,,,,0,1,19.36,2.0,5.0,0.0,0.0,7.0,1.0,0.143,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,-10.0
2,Donte,DiVincenzo,1628978,22500197,2025-11-09T21:00:00Z,Minnesota,Timberwolves,Sacramento,Kings,,,,,1,0,28.21,8.0,4.0,1.0,2.0,7.0,2.0,0.286,6.0,1.0,0.167,3.0,3.0,1.0,1.0,3.0,4.0,3.0,0.0,15.0


**Define the ESPN fantasy scoring stats and check that our tables contain every required column**

In [10]:
# ESPN Fantasy Scoring Requirements:
# 3PM = 5 pts, 2PM = 3 pts, FTM = 1 pt, Missed shot = -1 pt
# REB = 1 pt, AST = 2 pts, STL = 4 pts, BLK = 4 pts, TOV = -2 pts

required_stats = {
  'fieldGoalsMade': 'For calculating made 2PT/3PT shots',
  'fieldGoalsAttempted': 'For calculating missed shots',
  'threePointersMade': 'For 3PT bonus (5 pts each)',
  'threePointersAttempted': 'For calculating missed 3PT shots',
  'freeThrowsMade': 'For FT points (1 pt each)',
  'freeThrowsAttempted': 'For calculating missed FT shots',
  'reboundsTotal': 'For rebounds (1 pt each)',
  'assists': 'For assists (2 pts each)',
  'steals': 'For steals (4 pts each)',
  'blocks': 'For blocks (4 pts each)',
  'turnovers': 'For turnovers (-2 pts each)'
}

print("📋 ESPN FANTASY SCORING REQUIREMENTS:")
all_available = True
for stat, description in required_stats.items():
    available = stat in player_stats.columns
    print(f"  {stat}: {'✅' if available else '❌'} - {description}")
    if not available:
        all_available = False

print(f"\n{'✅ ALL STATS AVAILABLE!' if all_available else '❌ MISSING STATS - Cannot calculate fantasy scores'}")

📋 ESPN FANTASY SCORING REQUIREMENTS:
  fieldGoalsMade: ✅ - For calculating made 2PT/3PT shots
  fieldGoalsAttempted: ✅ - For calculating missed shots
  threePointersMade: ✅ - For 3PT bonus (5 pts each)
  threePointersAttempted: ✅ - For calculating missed 3PT shots
  freeThrowsMade: ✅ - For FT points (1 pt each)
  freeThrowsAttempted: ✅ - For calculating missed FT shots
  reboundsTotal: ✅ - For rebounds (1 pt each)
  assists: ✅ - For assists (2 pts each)
  steals: ✅ - For steals (4 pts each)
  blocks: ✅ - For blocks (4 pts each)
  turnovers: ✅ - For turnovers (-2 pts each)

✅ ALL STATS AVAILABLE!
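
The net values in the scoring comments (5 for a made three, 3 for a made two, 1 for a made free throw, -1 for any miss) can be read as the combined effect of several per-stat weights. The sketch below shows one weighting that reproduces them. These per-stat weights are an assumption made for illustration and are not stated in this notebook, so check them against your own league settings.

```python
# ASSUMED per-stat weights (not from the notebook): points scored, the
# three-pointer bonus, and made/attempted field goals and free throws.
ASSUMED_WEIGHTS = {"PTS": 1, "3PM": 1, "FGM": 2, "FGA": -1, "FTM": 1, "FTA": -1}
w = ASSUMED_WEIGHTS

# A made three earns 3 points, the 3PM bonus, a made field goal, and an attempt.
made_three = 3 * w["PTS"] + w["3PM"] + w["FGM"] + w["FGA"]
# A made two earns 2 points, a made field goal, and an attempt.
made_two = 2 * w["PTS"] + w["FGM"] + w["FGA"]
# A made free throw earns 1 point, a made free throw, and an attempt.
made_ft = 1 * w["PTS"] + w["FTM"] + w["FTA"]
# A miss only counts as an attempt.
missed_fg = w["FGA"]
missed_ft = w["FTA"]

print(made_three, made_two, made_ft, missed_fg, missed_ft)  # 5 3 1 -1 -1
```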


**Calculate Fantasy Scores with current data**

In [11]:
def calculate_espn_fantasy_score(row):
    """
    Calculate ESPN Fantasy Basketball score for a player's game
    
    Scoring:
    - 3PM = 5 pts (includes 3PT bonus)
    - 2PM = 3 pts
    - FTM = 1 pt
    - Missed shot = -1 pt
    - REB = 1 pt, AST = 2 pts, STL = 4 pts, BLK = 4 pts, TOV = -2 pts
    """
    # Made shots
    threepointers_made = row['threePointersMade'] * 5
    twopointers_made = (row['fieldGoalsMade'] - row['threePointersMade']) * 3
    freethrows_made = row['freeThrowsMade'] * 1
    
    # Missed shots (-1 each)
    fg_missed = (row['fieldGoalsAttempted'] - row['fieldGoalsMade']) * -1
    ft_missed = (row['freeThrowsAttempted'] - row['freeThrowsMade']) * -1
    
    # Other stats
    rebounds = row['reboundsTotal'] * 1
    assists = row['assists'] * 2
    steals = row['steals'] * 4
    blocks = row['blocks'] * 4
    turnovers = row['turnovers'] * -2
    
    total_score = (threepointers_made + twopointers_made + freethrows_made +
                 fg_missed + ft_missed + rebounds + assists + steals + blocks + turnovers)
    
    return total_score
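
As a sanity check, the formula can be worked by hand on the first preview row (Domantas Sabonis, game 22500197). The sketch below restates the same scoring rules with a plain dictionary so that it runs without pandas. The names `NET_VALUES` and `score_row` are hypothetical helpers for this example only.

```python
# Net fantasy value per event, as listed in the scoring docstring.
NET_VALUES = {
    "three_made": 5, "two_made": 3, "ft_made": 1,
    "fg_missed": -1, "ft_missed": -1,
    "rebound": 1, "assist": 2, "steal": 4, "block": 4, "turnover": -2,
}


def score_row(row):
    """Hypothetical dict-based restatement of the scoring rules."""
    events = {
        "three_made": row["threePointersMade"],
        "two_made": row["fieldGoalsMade"] - row["threePointersMade"],
        "ft_made": row["freeThrowsMade"],
        "fg_missed": row["fieldGoalsAttempted"] - row["fieldGoalsMade"],
        "ft_missed": row["freeThrowsAttempted"] - row["freeThrowsMade"],
        "rebound": row["reboundsTotal"],
        "assist": row["assists"],
        "steal": row["steals"],
        "block": row["blocks"],
        "turnover": row["turnovers"],
    }
    return sum(NET_VALUES[name] * count for name, count in events.items())


# Values copied from the first preview row of player_stats.
sabonis = {
    "fieldGoalsMade": 5.0, "fieldGoalsAttempted": 17.0,
    "threePointersMade": 0.0,
    "freeThrowsMade": 10.0, "freeThrowsAttempted": 12.0,
    "reboundsTotal": 13.0, "assists": 3.0, "steals": 1.0,
    "blocks": 0.0, "turnovers": 3.0,
}

# 0 + 15 + 10 - 12 - 2 + 13 + 6 + 4 + 0 - 6 = 28
print(score_row(sabonis))  # 28.0
```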

In [12]:
# Calculate fantasy scores for all games
player_stats['espn_fantasy_score'] = player_stats.apply(calculate_espn_fantasy_score, axis=1)

print("✅ ESPN Fantasy scores calculated!")
print(f"📈 Average fantasy score: {player_stats['espn_fantasy_score'].mean():.2f}")
print(f"📊 Score range: {player_stats['espn_fantasy_score'].min():.1f} to {player_stats['espn_fantasy_score'].max():.1f}")

✅ ESPN Fantasy scores calculated!
📈 Average fantasy score: 18.13
📊 Score range: -35.0 to 134.0
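
Calling `apply` with `axis=1` runs a Python function once per row, which can be slow on a table of roughly 1.6 million rows. The same score can be written as column arithmetic. The sketch below is one possible vectorized form, checked against the three preview rows. The function name `espn_fantasy_score_vectorized` is hypothetical, and missing values are assumed to propagate as NaN just as they would row by row.

```python
import pandas as pd


def espn_fantasy_score_vectorized(df):
    """Column-wise version of the same scoring rules (hypothetical helper)."""
    twos_made = df["fieldGoalsMade"] - df["threePointersMade"]
    fg_missed = df["fieldGoalsAttempted"] - df["fieldGoalsMade"]
    ft_missed = df["freeThrowsAttempted"] - df["freeThrowsMade"]
    return (
        5 * df["threePointersMade"]
        + 3 * twos_made
        + df["freeThrowsMade"]
        - fg_missed
        - ft_missed
        + df["reboundsTotal"]
        + 2 * df["assists"]
        + 4 * df["steals"]
        + 4 * df["blocks"]
        - 2 * df["turnovers"]
    )


# The three preview rows: Sabonis, Monk, DiVincenzo.
sample = pd.DataFrame({
    "fieldGoalsMade": [5.0, 1.0, 2.0],
    "fieldGoalsAttempted": [17.0, 7.0, 7.0],
    "threePointersMade": [0.0, 0.0, 1.0],
    "freeThrowsMade": [10.0, 0.0, 3.0],
    "freeThrowsAttempted": [12.0, 0.0, 3.0],
    "reboundsTotal": [13.0, 0.0, 4.0],
    "assists": [3.0, 5.0, 4.0],
    "steals": [1.0, 0.0, 2.0],
    "blocks": [0.0, 0.0, 1.0],
    "turnovers": [3.0, 0.0, 0.0],
})

scores = espn_fantasy_score_vectorized(sample)
print(scores.tolist())  # [28.0, 7.0, 30.0]
```

On the full table this would be used as `player_stats['espn_fantasy_score'] = espn_fantasy_score_vectorized(player_stats)`.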
