# Assignment 5: Draft Prospect Report

## Objective
Using the 2024-2025 D1 Dataset, create a report on a **draft prospect** evaluating their strengths, weaknesses, projected output, and fit with an existing team.

## Instructions
- Choose one player from the 2024-2025 Division 1 Dataset
- Analyse Usage & Efficiency: Include key metrics (USG%, eFG%, TS%, turnover rate, etc.) and write 2–3 sentences interpreting what these numbers tell you about:
  - The player’s offensive role
  - Their scoring effectiveness
  - Context (team style, pace, lineup fit if relevant)
- Pull Key Metrics by Category: For each bucket below, choose relevant stats from the dataset and include 1–2 sentences of interpretation (what the numbers mean, not just what they are):
  - Shooting (e.g., 3P%, FT%, rim finishing, midrange volume, assisted vs. unassisted shots)
  - Creation (assist rate, turnover rate, pick-and-roll involvement, self-creation indicators)
  - Defense (steal%, block%, foul rate, defensive rebounding, matchup difficulty)
  - Rebounding (ORB%, DRB%, positional comparison if applicable)
- Position Group Comparison: Compare the prospect to peers at the same position (e.g., guards vs. guards). Include 1–2 sentences explaining:
  - Does this prospect stand out? Are they average? Does their skill profile fit their positional archetype?
- Strengths & Weaknesses Summary: Provide a concise final assessment summarising:
  - What this player does well
  - Main concerns or limitations
  - Any quick projection (e.g., role, developmental focus)
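
For the position group comparison, one possible approach is to rank the prospect on a stat among players at the same position. The sketch below is only an illustration: the table is made up (not real players), the column names `position` and `ts_pct` are assumptions, and `pct_rank_within_position` is a hypothetical helper. If the dataset already ships position percentiles, those may be computed differently.

```python
import pandas as pd

def pct_rank_within_position(df, stat, position_col="position"):
    """Percentile rank of `stat` among rows that share the same position."""
    return df.groupby(position_col)[stat].rank(pct=True)

# Illustrative data only; not real players.
peers = pd.DataFrame({
    "player": ["Guard A", "Guard B", "Guard C", "Guard D", "Forward A", "Forward B"],
    "position": ["G", "G", "G", "G", "F", "F"],
    "ts_pct": [0.52, 0.58, 0.61, 0.55, 0.60, 0.50],
})
peers["ts_pctile_pos"] = pct_rank_within_position(peers, "ts_pct")
print(peers)
```

A value near 1.0 means the player leads their position group on that stat, and a value near 0.5 means they are roughly average.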

### Submission
- Format & Length
  - 1 Page or 2 Slides Maximum
  - Include tables, shot charts, or bullet lists, if useful

### Suggested Structure

**Header: Player Name | Position | College | Class Year**

**Section 1: Usage & Efficiency**

**Section 2: Shooting**

**Section 3: Creation**

**Section 4: Defense**

**Section 5: Rebounding**

**Section 6: Position Group Comparison**

**Section 7: Strengths & Weaknesses Summary**

---

In [1]:
# Import Libraries
import pandas as pd
import numpy as np
import datetime as dt

In [2]:
# Load the data files
# Season-level stats for all college players (data from class lab)
data = pd.read_csv("mbb_player_season_2025.csv")

# Season-by-season stats for the selected college player
college_stats = pd.read_excel("college_player_data.xlsx")

# Create a copy of the data (plain assignment would only alias the same DataFrame)
player_game_data = data.copy()
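
Plain assignment in Python binds a second name to the same DataFrame, while `DataFrame.copy()` produces an independent object. A minimal sketch with a made-up frame (the values are illustrative, not from the dataset):

```python
import pandas as pd

original = pd.DataFrame({"player": ["A", "B"], "ppg": [15.0, 9.5]})

alias = original               # same object under a second name
independent = original.copy()  # separate object with its own data

# Adding a column through the alias also changes `original`.
alias["starter"] = [True, False]

print("starter" in original.columns)     # True
print("starter" in independent.columns)  # False
```

Working on a copy keeps the raw data available for later cells that need the unfiltered table.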

In [3]:
# Keep players who appeared in at least 30 games
player_game_data = player_game_data[player_game_data["games_played"] >= 30]

# Keep players who started every game they played
player_game_data = player_game_data[player_game_data["games_started"] == player_game_data["games_played"]]

# Keep players who averaged 15+ PPG
player_game_data = player_game_data[player_game_data["ppg"] >= 15]

# Select Chaz Lanier from the filtered players
chaz_lanier = player_game_data[player_game_data["player"] == "Chaz Lanier"]
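
The three filters above can also be expressed as one boolean mask. The sketch below uses a made-up frame with the same column names (the players and numbers are hypothetical). It also shows why checking `.empty` can be worthwhile: selecting a player from the filtered frame returns no rows if that player fails any filter.

```python
import pandas as pd

# Hypothetical players; only the column names mirror the dataset.
players = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C"],
    "games_played": [38, 31, 22],
    "games_started": [38, 20, 22],
    "ppg": [18.0, 16.2, 19.5],
})

# Combine the three conditions with & (each wrapped in parentheses).
mask = (
    (players["games_played"] >= 30)
    & (players["games_started"] == players["games_played"])
    & (players["ppg"] >= 15)
)
qualified = players[mask].copy()

selected = qualified[qualified["player"] == "Player B"]
if selected.empty:
    print("Player B did not pass the filters")
```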

In [4]:
player_game_data.columns

Index(['season', 'athlete_id', 'player', 'team', 'position', 'games_played',
       'games_started', 'minutes_total', 'mpg', 'pts_total', 'ppg',
       'pts_per_40', 'reb_total', 'rpg', 'reb_per_40', 'oreb_total', 'oreb_pg',
       'oreb_per_40', 'dreb_total', 'dreb_pg', 'dreb_per_40', 'ast_total',
       'apg', 'ast_per_40', 'stl_total', 'spg', 'stl_per_40', 'blk_total',
       'bpg', 'blk_per_40', 'tov_total', 'tovpg', 'tov_per_40', 'fg_pct',
       'fg3_pct', 'threepar', 'ft_pct', 'fta_rate', 'efg_pct', 'ts_pct',
       'usage', 'ast_pct', 'tov_pct', 'oreb_pct', 'dreb_pct',
       'usage_pctile_pos', 'ts_pctile_pos', 'efg_pctile_pos', 'ast_pctile_pos',
       'tov_pctile_pos', 'oreb_pctile_pos', 'dreb_pctile_pos',
       'fg3_pctile_pos', 'ft_pctile_pos', 'threepar_pctile_pos',
       'fta_rate_pctile_pos'],
      dtype='object')
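
One way to organise the report is to map each section to columns from the listing above. The grouping below is a suggestion rather than part of the assignment, and `REPORT_BUCKETS` and `bucket_columns` are hypothetical names. The listing shows no steal or block rate columns, so the per-40 counts stand in for them here.

```python
# Column names come from the listing above; the bucket assignments are a suggestion.
REPORT_BUCKETS = {
    "Usage & Efficiency": ["usage", "efg_pct", "ts_pct", "tov_pct"],
    "Shooting": ["fg3_pct", "threepar", "ft_pct", "fta_rate"],
    "Creation": ["ast_pct", "tov_pct", "ast_per_40", "tov_per_40"],
    "Defense": ["stl_per_40", "blk_per_40", "dreb_pct"],
    "Rebounding": ["oreb_pct", "dreb_pct", "reb_per_40"],
}

def bucket_columns(bucket, available_columns):
    """Return the bucket's columns that are actually present in the data."""
    return [col for col in REPORT_BUCKETS[bucket] if col in available_columns]

# With a DataFrame `df`, this could be used as: df[bucket_columns("Shooting", df.columns)]
available = ["usage", "ts_pct", "fg3_pct", "ft_pct", "threepar"]
print(bucket_columns("Shooting", available))  # ['fg3_pct', 'threepar', 'ft_pct']
```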

In [5]:
# Create a Dataframe for Player Physicals
headings = ["Player Name", "Position", "Height", "Weight", "School"]
chaz_lanier_physicals = ["Chaz Lanier", "Guard", "6'4", "175lb", "Tennessee"]

player_physicals = pd.DataFrame([chaz_lanier_physicals], columns=headings)

player_physicals

Unnamed: 0,Player Name,Position,Height,Weight,School
0,Chaz Lanier,Guard,6'4,175lb,Tennessee


In [6]:
chaz_lanier

Unnamed: 0,season,athlete_id,player,team,position,games_played,games_started,minutes_total,mpg,pts_total,...,ts_pctile_pos,efg_pctile_pos,ast_pctile_pos,tov_pctile_pos,oreb_pctile_pos,dreb_pctile_pos,fg3_pctile_pos,ft_pctile_pos,threepar_pctile_pos,fta_rate_pctile_pos
636,2025,4700852,Chaz Lanier,Tennessee,G,38,38,1194,31.421053,684,...,0.648992,0.765682,0.166454,0.029311,0.342713,0.669213,0.819711,0.443092,0.695504,0.176567
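
The `*_pctile_pos` values in the row above are fractions between 0 and 1. A small helper can turn them into report-friendly text; `ordinal` and `describe_percentile` are hypothetical names, not part of the dataset's tooling. Whether a low `tov_pctile_pos` is good or bad depends on how the dataset orients that column, which the output alone does not show, so it is left out here.

```python
def ordinal(n):
    """Return an integer with its English ordinal suffix, e.g. 3 -> '3rd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def describe_percentile(column, fraction):
    """Format a 0-1 percentile fraction as readable text."""
    return f"{column}: {ordinal(round(fraction * 100))} percentile among position peers"

# Values copied from the row shown above.
lanier_percentiles = {
    "ts_pctile_pos": 0.648992,
    "efg_pctile_pos": 0.765682,
    "ast_pctile_pos": 0.166454,
    "fg3_pctile_pos": 0.819711,
}
for column, fraction in lanier_percentiles.items():
    print(describe_percentile(column, fraction))
```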


In [7]:
college_stats

Unnamed: 0,Player,Season,Team,Conf,Class,Pos,G,GS,MP,FG,...,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,Awards
0,Chaz Lanier,2020-21,North Florida,A-Sun,FR,G,10,1,9.3,0.6,...,0.0,0.7,0.7,0.6,0.3,0.0,0.3,0.4,1.7,
1,Chaz Lanier,2021-22,North Florida,A-Sun,FR,G,31,8,21.0,1.4,...,0.8,1.9,2.7,1.1,0.7,0.2,0.5,1.0,4.5,
2,Chaz Lanier,2022-23,North Florida,A-Sun,SO,G,31,9,19.7,1.7,...,0.5,2.0,2.5,0.9,0.4,0.2,0.5,1.1,4.7,
3,Chaz Lanier,2023-24,North Florida,A-Sun,JR,G,32,31,33.4,6.7,...,0.7,4.1,4.8,1.8,0.9,0.3,1.6,1.5,19.7,
4,Chaz Lanier,2024-25,Tennessee,SEC,SR,G,38,38,31.4,6.4,...,0.5,3.4,3.9,1.1,0.9,0.1,1.1,1.6,18.0,"NABC-AA-3,SN-AA-3,W-AA-1"
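
Because minutes changed a lot across these seasons, scoring is easier to compare on a per-40-minute basis. The sketch below uses the MP and PTS values from the table above and assumes both are per-game averages (the 2024-25 figures match the per-game values in the season dataset). `per40` is a hypothetical helper.

```python
def per40(stat_per_game, minutes_per_game):
    """Scale a per-game stat to 40 minutes; returns None when minutes are zero."""
    if minutes_per_game == 0:
        return None
    return stat_per_game * 40 / minutes_per_game

# (season, team, minutes per game, points per game) from the table above.
seasons = [
    ("2020-21", "North Florida", 9.3, 1.7),
    ("2021-22", "North Florida", 21.0, 4.5),
    ("2022-23", "North Florida", 19.7, 4.7),
    ("2023-24", "North Florida", 33.4, 19.7),
    ("2024-25", "Tennessee", 31.4, 18.0),
]
for season, team, mpg, ppg in seasons:
    print(f"{season} {team}: {ppg} PPG, {per40(ppg, mpg):.1f} points per 40")
```

Comparing per-40 figures separates a genuine scoring jump from one that only reflects more playing time.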


In [None]:
# Using Functions from Assignment 1
# Define Estimated Advanced Metrics

# Shot Attempts
def shot_attempts(fga, fta):
    # Free throws are weighted by 0.44 to approximate possessions used
    return fga + 0.44 * fta

#########################
# Usage
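def usage(fga, fta, tov, tm_mp, mp, tm_fga, tm_fta, tm_tov, gp, tm_gp):
    # gp and tm_gp are accepted so existing callers keep working, but they are not used.
    # tm_mp is used as passed; the Basketball-Reference formula uses team minutes / 5,
    # so pass team minutes already divided by 5 if tm_mp is the team total.
    # If any data is missing, then we return NaN
    if any(pd.isna(x) for x in [fga, fta, tov, tm_mp, mp, tm_fga, tm_fta, tm_tov]):
        return np.nan

    num = (fga + 0.44 * fta + tov) * tm_mp
    denom = mp * (tm_fga + 0.44 * tm_fta + tm_tov)

    # Check for Divide-by-Zero
    if denom == 0:
        return np.nan

    # Calculate Usage
    usage_rate = num / denom
    return usage_rate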
    
#########################
# Effective Field Goal Percentage (eFG%)
def eFG_percentage(fgm, made_3s, fga):
    num = (fgm + 0.5 * made_3s)
    denom = fga

    # Check for Divide-by-Zero
    if fga == 0:
        return np.nan
    
    # Calculate Effective Field Goal Percentage
    efg = num / denom
    return efg

#########################
# True Shooting Percentage (TS%)
def TS_percentage(pts, fga, fta):
    num = pts
    denom = 2 * (fga + 0.44 * fta)

    # Check for Divide-by-Zero
    if denom == 0:
        return np.nan

    # Calculate True Shooting Percentage
    ts = num / denom
    return ts

#########################
# Per 36-Minute Stat Calculator
def per36(stat_per_game, minutes_per_game):
    # Check for Divide-by-Zero
    if minutes_per_game == 0:
        return np.nan

    return stat_per_game * (36 / minutes_per_game)

#########################
# Gather Player Statistics
def get_player_data(df, player_name):
    player_df = df[df['Player'] == player_name]
    
    return player_df.reset_index(drop = True)

# Gather Team Statistics
def get_team_data(df, team_name):    
    team_df = df[df['Team'] == team_name]
    return team_df.reset_index(drop = True)

#########################
# Define Player Analysis
def calculate_player_metrics(player_df, team_df):
    # Calculate the Player's Shot Attempts
    player_shot_attempts = shot_attempts(fga = player_df["FGA"][0],
                                         fta = player_df["FTA"][0]
                                        )

    # Calculate the Player's Usage
    player_usage = usage(fga = player_df["FGA"][0], 
                         fta = player_df["FTA"][0], 
                         tov = player_df["TOV"][0],
                         tm_mp = team_df["Min"][0],
                         mp = player_df["Min"][0],
                         tm_fga = team_df["FGA"][0],
                         tm_fta = team_df["FTA"][0],
                         tm_tov = team_df["TOV"][0],
                         gp = player_df["GP"][0],
                         tm_gp = team_df["GP"][0]
                        )
    
    # Calculate the Effective Field Goal Percentage (eFG%)
    player_efg = eFG_percentage(player_df["FGM"][0],
                                player_df["3PM"][0],
                                player_df["FGA"][0]
                               )

    # Calculate the True Shooting Percentage (TS%)
    player_ts = TS_percentage(player_df["PTS"][0],
                              player_df["FGA"][0],
                              player_df["FTA"][0]
                             )

    # Calculate the PTS per Game at Per 36
    pts36 = per36(player_df["PTS"][0],
                  player_df["Min"][0]
                 )
    
    print("Player Metrics for " + player_df["Player"][0])
    print("Team: " + team_df["Team"][0])
    print("GP: " + str(player_df["GP"][0]))
    print("Minutes Per Game (MPG): " + str(player_df["Min"][0]))
    print("Points Per Game (PPG): " + str(player_df["PTS"][0]))
    print("Shot Attempts: " + str(round(player_shot_attempts, 2)))
    print("Usage Rate: " + str(round(player_usage*100, 2)) + "%")
    print("eFG%: " + str(round(player_efg*100, 2)) + "%")
    print("True Shooting Percentage: " + str(round(player_ts*100, 2)) + "%")
    print("PTS per 36: " + str(round(pts36, 2)) + " PTS")
    
    return [player_shot_attempts, player_usage, player_efg, player_ts]
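
As a quick sanity check of the formulas above, the sketch below evaluates them on a made-up stat line (every number is hypothetical). The helper names are illustrative. The usage calculation here divides team minutes by 5, following the common Basketball-Reference convention; that is an assumption about how team minutes are defined and differs from the `usage` function above, where `tm_mp` is used as passed.

```python
def efg_pct(fgm, made_3s, fga):
    """Effective field goal percentage: made threes count 1.5x."""
    return (fgm + 0.5 * made_3s) / fga if fga else float("nan")

def ts_pct(pts, fga, fta):
    """True shooting percentage with the usual 0.44 free-throw weight."""
    denom = 2 * (fga + 0.44 * fta)
    return pts / denom if denom else float("nan")

def usage_rate(fga, fta, tov, mp, tm_fga, tm_fta, tm_tov, tm_mp_total):
    """Usage rate, assuming tm_mp_total is the team's total minutes (all five players)."""
    denom = mp * (tm_fga + 0.44 * tm_fta + tm_tov)
    if denom == 0:
        return float("nan")
    return (fga + 0.44 * fta + tov) * (tm_mp_total / 5) / denom

# Hypothetical per-game line: 6-of-12 shooting, 2 threes, 3-of-4 free throws.
pts = 2 * 6 + 2 + 3  # 17 points
print(round(efg_pct(6, 2, 12), 3))   # 0.583
print(round(ts_pct(pts, 12, 4), 3))  # 0.618
print(round(usage_rate(12, 4, 2, 30, 60, 20, 12, 200), 3))  # 0.26
```

TS% comes out higher than eFG% here because it also credits the points scored at the free-throw line.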

In [None]:
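# Question 1
# Note: cade_cunningham and detroit_pistons are not defined in this notebook;
# build them with get_player_data() and get_team_data() before running this cell.
cade_metrics = calculate_player_metrics(cade_cunningham, detroit_pistons)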