# Draft Combine Sheets

The NBA Draft Combine is an event that brings amateur athletes together to participate in drills, interview with teams, and showcase their skills prior to the NBA Draft. The athletes go through physical measurements, basketball drills, and medical testing. The [stats.nba.com](https://stats.nba.com) API contains the `draftcombinestats` endpoint that has much of the data recorded at the combine.

This notebook leverages the `Draft` class from the `py_ball` package to explore the `draftcombinestats` endpoint with the goal of producing player sheets that summarize the performance of athletes at the combine.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt

from py_ball import draft

HEADERS = {'Connection': 'close',
           'Host': 'stats.nba.com',
           'Origin': 'http://stats.nba.com',
           'Upgrade-Insecure-Requests': '1',
           # Adjacent string literals are joined automatically; note the
           # trailing space that separates each User-Agent token.
           'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/66.0.3359.117 Safari/537.36'}

The `league_id` and `season_year` are required parameters for the `draftcombinestats` endpoint. The NBA is the only league for which data are available, and the cell below pulls data for the most recent draft combine (2018-19 season).

In [7]:
league_id = '00'  # NBA
season_year = '2018-19'
draft_data = draft.Draft(headers=HEADERS,
                         endpoint='draftcombinestats',
                         league_id=league_id,
                         season_year=season_year)
draft_df = pd.DataFrame(draft_data.data['DraftCombineStats'])
draft_df.head(10)

| | BENCH_PRESS | BODY_FAT_PCT | FIRST_NAME | HAND_LENGTH | HAND_WIDTH | HEIGHT_WO_SHOES | HEIGHT_WO_SHOES_FT_IN | HEIGHT_W_SHOES | HEIGHT_W_SHOES_FT_IN | LANE_AGILITY_TIME | ... | SPOT_NBA_CORNER_LEFT | SPOT_NBA_CORNER_RIGHT | SPOT_NBA_TOP_KEY | STANDING_REACH | STANDING_REACH_FT_IN | STANDING_VERTICAL_LEAP | THREE_QUARTER_SPRINT | WEIGHT | WINGSPAN | WINGSPAN_FT_IN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10.0 | 8.9 | Rawle | 8.5 | 10.0 | 74.75 | 6' 2.75'' | 76.25 | 6' 4.25'' | 11.5 | ... | 1-5 | 2-5 | 3-5 | 99.0 | 8' 3'' | 32.5 | 3.15 | 217.4 | 80.75 | 6' 8.75'' |
| 1 | | 5.55 | Grayson | 8.25 | 10.0 | 75.0 | 6' 3'' | 76.5 | 6' 4.5'' | 10.31 | ... | | | | 97.0 | 8' 1'' | 32.5 | 3.15 | 198.0 | 79.25 | 6' 7.25'' |
| 2 | 0.0 | 5.0 | Kostas | 9.25 | 9.5 | 81.0 | 6' 9'' | 82.5 | 6' 10.5'' | 12.48 | ... | 2-5 | 4-5 | 3-5 | 110.0 | 9' 2'' | 29.5 | 3.21 | 194.8 | 86.25 | 7' 2.25'' |
| 3 | 11.0 | 7.95 | Udoka | 9.5 | 10.0 | 82.0 | 6' 10'' | 84.25 | 7' 0.25'' | 12.97 | ... | | | | 112.5 | 9' 4.5'' | 31.0 | 3.12 | 273.8 | 91.0 | 7' 7'' |
| 4 | | 6.2 | Mohamed | 9.75 | 10.25 | 83.25 | 6' 11.25'' | 84.75 | 7' 0.75'' | | ... | | | | 115.5 | 9' 7.5'' | | | 225.6 | 94.0 | 7' 10'' |
| 5 | 7.0 | 11.65 | Jaylen | 8.0 | 8.25 | 73.25 | 6' 1.25'' | 74.25 | 6' 2.25'' | 11.39 | ... | 4-5 | 3-5 | 1-5 | 96.5 | 8' 0.5'' | 27.0 | 3.31 | 207.6 | 75.5 | 6' 3.5'' |
| 6 | 11.0 | 5.35 | Keita | 9.0 | 8.5 | 79.25 | 6' 7.25'' | 80.5 | 6' 8.5'' | 11.2 | ... | 3-5 | 4-5 | 1-5 | 106.5 | 8' 10.5'' | 30.5 | 3.17 | 223.8 | 87.25 | 7' 3.25'' |
| 7 | 9.0 | 4.0 | Tyus | 8.5 | 9.25 | 77.0 | 6' 5'' | 78.75 | 6' 6.75'' | 11.04 | ... | 4-5 | 4-5 | 4-5 | 102.0 | 8' 6'' | 32.0 | 3.07 | 200.2 | 81.0 | 6' 9'' |
| 8 | 0.0 | 7.6 | Brian | 9.0 | 10.0 | 78.25 | 6' 6.25'' | 79.5 | 6' 7.5'' | 11.58 | ... | 3-5 | 2-5 | 2-5 | 103.5 | 8' 7.5'' | 30.5 | 3.28 | 202.0 | 82.25 | 6' 10.25'' |
| 9 | | 5.9 | Miles | 9.0 | 9.75 | 77.25 | 6' 5.25'' | 78.75 | 6' 6.75'' | | ... | | | | 103.5 | 8' 7.5'' | | | 220.4 | 81.5 | 6' 9.5'' |


The `draft_df` DataFrame has 47 columns, so the truncated view above hides some of the feature names. The cell below lists them all.

In [14]:
list(draft_df)

['BENCH_PRESS',
 'BODY_FAT_PCT',
 'FIRST_NAME',
 'HAND_LENGTH',
 'HAND_WIDTH',
 'HEIGHT_WO_SHOES',
 'HEIGHT_WO_SHOES_FT_IN',
 'HEIGHT_W_SHOES',
 'HEIGHT_W_SHOES_FT_IN',
 'LANE_AGILITY_TIME',
 'LAST_NAME',
 'MAX_VERTICAL_LEAP',
 'MODIFIED_LANE_AGILITY_TIME',
 'OFF_DRIB_COLLEGE_BREAK_LEFT',
 'OFF_DRIB_COLLEGE_BREAK_RIGHT',
 'OFF_DRIB_COLLEGE_TOP_KEY',
 'OFF_DRIB_FIFTEEN_BREAK_LEFT',
 'OFF_DRIB_FIFTEEN_BREAK_RIGHT',
 'OFF_DRIB_FIFTEEN_TOP_KEY',
 'ON_MOVE_COLLEGE',
 'ON_MOVE_FIFTEEN',
 'PLAYER_ID',
 'PLAYER_NAME',
 'POSITION',
 'SEASON',
 'SPOT_COLLEGE_BREAK_LEFT',
 'SPOT_COLLEGE_BREAK_RIGHT',
 'SPOT_COLLEGE_CORNER_LEFT',
 'SPOT_COLLEGE_CORNER_RIGHT',
 'SPOT_COLLEGE_TOP_KEY',
 'SPOT_FIFTEEN_BREAK_LEFT',
 'SPOT_FIFTEEN_BREAK_RIGHT',
 'SPOT_FIFTEEN_CORNER_LEFT',
 'SPOT_FIFTEEN_CORNER_RIGHT',
 'SPOT_FIFTEEN_TOP_KEY',
 'SPOT_NBA_BREAK_LEFT',
 'SPOT_NBA_BREAK_RIGHT',
 'SPOT_NBA_CORNER_LEFT',
 'SPOT_NBA_CORNER_RIGHT',
 'SPOT_NBA_TOP_KEY',
 'STANDING_REACH',
 'STANDING_REACH_FT_IN',
 'STANDING_VERTICAL_LEAP',
 'THREE_QUARTER_SPRINT',
 'WEIGHT',
 'WINGSPAN',
 'WINGSPAN_FT_IN']
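
As an alternative to listing the names, pandas can be told not to elide columns when it renders a frame. The sketch below is a minimal illustration on a stand-in frame (`wide_df` and its `COL_n` names are placeholders, not combine data); the same `option_context` call should apply to `draft_df`.

```python
import pandas as pd

# Stand-in frame with 47 columns; the values are placeholders, not combine data.
wide_df = pd.DataFrame([range(47)], columns=[f"COL_{i}" for i in range(47)])

# option_context restricts the display change to the `with` block,
# so the notebook's global display settings are left untouched.
with pd.option_context("display.max_columns", None):
    rendered = repr(wide_df)  # in a notebook, display(wide_df) works the same way

n_columns = len(wide_df.columns)
print(n_columns)
```

With `display.max_columns` set to `None`, the rendered text wraps across several blocks instead of replacing the middle columns with `...`.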

The list above suggests roughly four categories of features:
- Player metadata (`FIRST_NAME`, `LAST_NAME`, `POSITION`, etc.)
- Measurement data (`HAND_LENGTH`, `HAND_WIDTH`, `HEIGHT_WO_SHOES`, etc.)
- Drill data (`BENCH_PRESS`, `LANE_AGILITY_TIME`, `MAX_VERTICAL_LEAP`, etc.)
- Shooting data (`OFF_DRIB_COLLEGE_BREAK_LEFT`, `SPOT_FIFTEEN_BREAK_LEFT`, `SPOT_NBA_TOP_KEY`, etc.)
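
The grouping above can be sketched in code. This is one possible rule, not something the endpoint provides: shooting columns are assumed to be identifiable by their prefix, metadata and drill columns are listed by hand, and everything else is treated as a measurement. The `categorize` helper and the exact membership of each set are assumptions made for illustration.

```python
# Hypothetical rule set for sorting column names into the four categories.
SHOOTING_PREFIXES = ('OFF_DRIB_', 'ON_MOVE_', 'SPOT_')
METADATA = {'FIRST_NAME', 'LAST_NAME', 'PLAYER_ID', 'PLAYER_NAME',
            'POSITION', 'SEASON'}
DRILLS = {'BENCH_PRESS', 'LANE_AGILITY_TIME', 'MODIFIED_LANE_AGILITY_TIME',
          'MAX_VERTICAL_LEAP', 'STANDING_VERTICAL_LEAP', 'THREE_QUARTER_SPRINT'}


def categorize(column):
    """Return the assumed category for a single column name."""
    if column.startswith(SHOOTING_PREFIXES):
        return 'shooting'
    if column in METADATA:
        return 'metadata'
    if column in DRILLS:
        return 'drill'
    return 'measurement'  # fallback: sizes, weights, reach, and so on


# A subset of the column names from the listing, for brevity.
columns = ['BENCH_PRESS', 'BODY_FAT_PCT', 'FIRST_NAME', 'HAND_LENGTH',
           'LANE_AGILITY_TIME', 'OFF_DRIB_COLLEGE_BREAK_LEFT', 'POSITION',
           'SPOT_NBA_TOP_KEY', 'STANDING_REACH', 'WINGSPAN']

by_category = {}
for name in columns:
    by_category.setdefault(categorize(name), []).append(name)

print(by_category)
```

Passing `list(draft_df)` instead of the hand-written subset would group every column in the frame.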

All of this information seems relevant to include on a player summary sheet. The following sections will organize and manipulate the data so as to produce meaningful results for each data category.

## Player Metadata

Player metadata is the simplest category. The only change needed is a `FULL_NAME` field that combines `FIRST_NAME` and `LAST_NAME`.

In [8]:
draft_df['FULL_NAME'] = draft_df['FIRST_NAME'] + ' ' + draft_df['LAST_NAME']
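
One caveat with `+` concatenation is that a missing first or last name makes the whole result missing. Whether the combine data contain such rows is not checked here; the sketch below uses a made-up frame (`names_df` and its names are placeholders) to show the behavior and a possible alternative based on `Series.str.cat`.

```python
import pandas as pd

# Placeholder names, with one missing last name.
names_df = pd.DataFrame({'FIRST_NAME': ['Ada', 'Nene'],
                         'LAST_NAME': ['Lovelace', None]})

# Plain concatenation: any missing part makes the full name missing.
plain = names_df['FIRST_NAME'] + ' ' + names_df['LAST_NAME']

# str.cat with na_rep substitutes an empty string for missing parts;
# str.strip then removes the dangling separator.
robust = (names_df['FIRST_NAME']
          .str.cat(names_df['LAST_NAME'], sep=' ', na_rep='')
          .str.strip())

print(plain.isna().tolist())
print(robust.tolist())
```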

## Measurement Data

Measurement data provide a sense of an athlete's size and, potentially, fitness. Because NBA positions place different physical demands on players, effective players span a wide range of measurements. This motivates presenting not only the absolute measurements but also values normalized by position, which show how a player compares physically to others at the same position.
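
The text does not fix a particular normalization, so the sketch below assumes one common choice, a z-score within each position group. The frame `toy_df` and its values are invented for illustration; only the column names mirror the combine data.

```python
import pandas as pd

# Invented wingspans (inches) for two position groups.
toy_df = pd.DataFrame({
    'POSITION': ['PG', 'PG', 'PG', 'C', 'C', 'C'],
    'WINGSPAN': [76.0, 78.0, 80.0, 88.0, 90.0, 92.0],
})

# transform returns a value per row, aligned with the original index,
# so each player is compared against the mean and spread of their own group.
grouped = toy_df.groupby('POSITION')['WINGSPAN']
toy_df['WINGSPAN_Z'] = (
    (toy_df['WINGSPAN'] - grouped.transform('mean')) / grouped.transform('std')
)

print(toy_df)
```

A 78-inch wingspan is average among the toy point guards (z-score of 0), while the same value would sit far below the toy centers. Groups with a single player have an undefined standard deviation and would yield missing z-scores under this approach.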

First, the code below explores the measurement data.

In [16]:
measurement_df = draft_df[['PLAYER_ID', 'POSITION', 'BODY_FAT_PCT', 'HAND_LENGTH', 'HAND_WIDTH',
                           'HEIGHT_WO_SHOES', 'HEIGHT_WO_SHOES_FT_IN', 'HEIGHT_W_SHOES', 'HEIGHT_W_SHOES_FT_IN',
                           'STANDING_REACH', 'STANDING_REACH_FT_IN', 'WEIGHT', 'WINGSPAN', 'WINGSPAN_FT_IN']]
measurement_df.head(20)

| | PLAYER_ID | POSITION | BODY_FAT_PCT | HAND_LENGTH | HAND_WIDTH | HEIGHT_WO_SHOES | HEIGHT_WO_SHOES_FT_IN | HEIGHT_W_SHOES | HEIGHT_W_SHOES_FT_IN | STANDING_REACH | STANDING_REACH_FT_IN | WEIGHT | WINGSPAN | WINGSPAN_FT_IN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1628959 | SG | 8.9 | 8.5 | 10.0 | 74.75 | 6' 2.75'' | 76.25 | 6' 4.25'' | 99.0 | 8' 3'' | 217.4 | 80.75 | 6' 8.75'' |
| 1 | 1628960 | SG | 5.55 | 8.25 | 10.0 | 75.0 | 6' 3'' | 76.5 | 6' 4.5'' | 97.0 | 8' 1'' | 198.0 | 79.25 | 6' 7.25'' |
| 2 | 1628961 | PF | 5.0 | 9.25 | 9.5 | 81.0 | 6' 9'' | 82.5 | 6' 10.5'' | 110.0 | 9' 2'' | 194.8 | 86.25 | 7' 2.25'' |
| 3 | 1628962 | C | 7.95 | 9.5 | 10.0 | 82.0 | 6' 10'' | 84.25 | 7' 0.25'' | 112.5 | 9' 4.5'' | 273.8 | 91.0 | 7' 7'' |
| 4 | 1628964 | C | 6.2 | 9.75 | 10.25 | 83.25 | 6' 11.25'' | 84.75 | 7' 0.75'' | 115.5 | 9' 7.5'' | 225.6 | 94.0 | 7' 10'' |
| 5 | 1628965 | PG-SG | 11.65 | 8.0 | 8.25 | 73.25 | 6' 1.25'' | 74.25 | 6' 2.25'' | 96.5 | 8' 0.5'' | 207.6 | 75.5 | 6' 3.5'' |
| 6 | 1628966 | SG-SF | 5.35 | 9.0 | 8.5 | 79.25 | 6' 7.25'' | 80.5 | 6' 8.5'' | 106.5 | 8' 10.5'' | 223.8 | 87.25 | 7' 3.25'' |
| 7 | 1628967 | SG | 4.0 | 8.5 | 9.25 | 77.0 | 6' 5'' | 78.75 | 6' 6.75'' | 102.0 | 8' 6'' | 200.2 | 81.0 | 6' 9'' |
| 8 | 1628968 | SG-SF | 7.6 | 9.0 | 10.0 | 78.25 | 6' 6.25'' | 79.5 | 6' 7.5'' | 103.5 | 8' 7.5'' | 202.0 | 82.25 | 6' 10.25'' |
| 9 | 1628970 | SF | 5.9 | 9.0 | 9.75 | 77.25 | 6' 5.25'' | 78.75 | 6' 6.75'' | 103.5 | 8' 7.5'' | 220.4 | 81.5 | 6' 9.5'' |
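
The `_FT_IN` columns in the output above are strings such as `6' 2.75''`, paired with a numeric column in inches. The sketch below shows how the two forms relate, assuming every display string follows the `feet' inches''` pattern seen in these rows. The helper names `ft_in_to_inches` and `inches_to_ft_in` are hypothetical and not part of `py_ball`.

```python
import re

# Matches strings like "6' 2.75''": feet, a single quote,
# inches (optionally fractional), then two single quotes.
FT_IN_PATTERN = re.compile(r"^\s*(\d+)'\s*(\d+(?:\.\d+)?)''\s*$")


def ft_in_to_inches(text):
    """Convert a display string such as "6' 2.75''" to total inches."""
    match = FT_IN_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Unrecognized feet-inches string: {text!r}")
    feet, inches = match.groups()
    return int(feet) * 12 + float(inches)


def inches_to_ft_in(total_inches):
    """Convert total inches back to the display format."""
    feet, inches = divmod(total_inches, 12)
    # The :g format drops a trailing ".0" so 3.0 is shown as "3".
    return f"{int(feet)}' {inches:g}''"


print(ft_in_to_inches("6' 2.75''"))
print(inches_to_ft_in(99.0))
```

The two sample calls use values from the first row of the output: a `HEIGHT_WO_SHOES` of 74.75 with display string `6' 2.75''`, and a `STANDING_REACH` of 99.0 with display string `8' 3''`.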


Conveniently, many of the measurements come as both a value in inches and a display-friendly string. Also of note is the format of the `POSITION` field: players can have multiple positions listed, which slightly complicates the plan to normalize by `POSITION`. The following cell drills down into this field further.

In [18]:
measurement_df.groupby('POSITION')['PLAYER_ID'].nunique()

POSITION
C         7
C-PF      3
PF        7
PF-C      4
PG       13
PG-SG     1
SF        5
SF-PF     1
SF-SG     4
SG       14
SG-PG     4
SG-SF     6
Name: PLAYER_ID, dtype: int64
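
One way to simplify the twelve labels above is to keep only the first listed position. Treating the first position as the primary one is an assumption, not something the endpoint documents. The sketch below re-enters the counts from the output by hand and collapses them under that assumption; it also assumes each player appears under exactly one `POSITION` label, so the counts can simply be added.

```python
import pandas as pd

# Player counts per POSITION label, copied from the output above.
counts = pd.Series({'C': 7, 'C-PF': 3, 'PF': 7, 'PF-C': 4, 'PG': 13,
                    'PG-SG': 1, 'SF': 5, 'SF-PF': 1, 'SF-SG': 4,
                    'SG': 14, 'SG-PG': 4, 'SG-SF': 6})

# Keep the text before the first hyphen as the (assumed) primary position.
primary_labels = pd.Index([label.split('-')[0] for label in counts.index],
                          name='PRIMARY_POSITION')

# Sum the counts of all labels that share a primary position.
primary_counts = counts.groupby(primary_labels).sum()

print(primary_counts)
```

On the full frame, the equivalent step would be `draft_df['POSITION'].str.split('-').str[0]`, which yields five groups that are each larger than most of the original twelve.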