# NBA Injuries
***
## Goal: 
Build a model to predict the probability that a player misses a game due to injury within a given time frame.

## Approach:

### Part I: Data Preparation
Tasks:

1. Scrape injury history data from Pro Sports Transactions using Beautiful Soup
2. Scrape player statistics and information from NBA Stats using Beautiful Soup and Selenium and/or nba-api
3. Clean datasets
4. Merge the two datasets
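
As a rough sketch of steps 3 and 4, the cell below cleans a toy injury table and merges it with a toy bio table on player name. The helper name `clean_and_merge`, the `alias` and `injury_row` columns, and the placeholder `PLAYER_ID` of 1 are assumptions made for illustration. The real scraped columns may differ. Some injury records list several spellings of a name separated by `/`, so the sketch tries each spelling and keeps the one that matches.

```python
import pandas as pd

# Toy stand-ins for the two scraped datasets (values are illustrative only).
injuries = pd.DataFrame({
    "Date": ["2012-10-30", "2013-01-16"],
    "Player": [" Amare Stoudemire / Amar'e Stoudemire", " Stephen Curry"],
    "Injury": ["arthroscopic surgery on left knee", "sprained right ankle (DNP)"],
})
bios = pd.DataFrame({
    "PLAYER_ID": [1, 201939],  # 1 is a placeholder ID, not a real NBA Stats ID
    "PLAYER_NAME": ["Amar'e Stoudemire", "Stephen Curry"],
})


def clean_and_merge(injuries, bios):
    """Clean the injury table and attach bio columns by player name."""
    cleaned = injuries.copy()
    cleaned["Date"] = pd.to_datetime(cleaned["Date"])
    cleaned["Player"] = cleaned["Player"].str.strip()
    # Remember which injury record each row came from before exploding.
    cleaned["injury_row"] = cleaned.index
    # One row per spelling of the name ("A / B" becomes two rows).
    cleaned["alias"] = cleaned["Player"].str.split("/")
    cleaned = cleaned.explode("alias")
    cleaned["alias"] = cleaned["alias"].str.strip()
    merged = cleaned.merge(bios, how="left", left_on="alias", right_on="PLAYER_NAME")
    # Keep one row per injury record, preferring a spelling that matched a bio.
    merged = merged.sort_values("PLAYER_ID", na_position="last")
    merged = merged.drop_duplicates(subset="injury_row")
    return merged.sort_values("injury_row").reset_index(drop=True)


merged = clean_and_merge(injuries, bios)
```

Matching on names is fragile (suffixes, accents, nicknames), so unmatched rows, where `PLAYER_ID` is missing after the merge, are worth inspecting by hand.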


***

In [1]:
import numpy as np
import pandas as pd

In [14]:
bio1213 = pd.read_csv('data/bios2012-13.csv')
bio1213.head()

Unnamed: 0,PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,PLAYER_HEIGHT,PLAYER_HEIGHT_INCHES,PLAYER_WEIGHT,COLLEGE,COUNTRY,...,GP,PTS,REB,AST,NET_RATING,OREB_PCT,DREB_PCT,USG_PCT,TS_PCT,AST_PCT
0,203932,Aaron Gordon,1610612743,DEN,25.0,6-8,80,235,Arizona,USA,...,50,618,284,161,2.1,0.055,0.15,0.204,0.547,0.165
1,1628988,Aaron Holiday,1610612754,IND,24.0,6-0,72,185,UCLA,USA,...,66,475,89,123,-0.2,0.012,0.06,0.189,0.503,0.139
2,1630174,Aaron Nesmith,1610612738,BOS,21.0,6-5,77,215,Vanderbilt,USA,...,46,218,127,23,-0.5,0.041,0.146,0.133,0.573,0.047
3,1627846,Abdel Nader,1610612756,PHX,27.0,6-5,77,225,Iowa State,Egypt,...,24,160,62,19,5.0,0.02,0.151,0.183,0.605,0.078
4,1629690,Adam Mokoka,1610612741,CHI,22.0,6-4,76,190,,France,...,14,15,5,5,-7.1,0.017,0.077,0.171,0.386,0.179


In [22]:
steph = bio1213[bio1213['PLAYER_NAME'] == 'Stephen Curry']
steph

Unnamed: 0,PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,PLAYER_HEIGHT,PLAYER_HEIGHT_INCHES,PLAYER_WEIGHT,COLLEGE,COUNTRY,...,GP,PTS,REB,AST,NET_RATING,OREB_PCT,DREB_PCT,USG_PCT,TS_PCT,AST_PCT
470,201939,Stephen Curry,1610612744,GSW,33.0,6-3,75,185,Davidson,USA,...,63,2015,345,363,4.6,0.013,0.135,0.331,0.655,0.283


In [23]:
import nba_api.stats.static.players as players
from nba_api.stats import endpoints


In [45]:
gamelog = endpoints.LeagueGameLog().get_data_frames()[0]
gamelog.head()

Unnamed: 0,SEASON_ID,TEAM_ID,TEAM_ABBREVIATION,TEAM_NAME,GAME_ID,GAME_DATE,MATCHUP,WL,MIN,FGM,...,DREB,REB,AST,STL,BLK,TOV,PF,PTS,PLUS_MINUS,VIDEO_AVAILABLE
0,22020,1610612747,LAL,Los Angeles Lakers,22000002,2020-12-22,LAL vs. LAC,L,240,38,...,37,45,22,4,2,19,20,109,-7,1
1,22020,1610612746,LAC,LA Clippers,22000002,2020-12-22,LAC @ LAL,W,240,44,...,29,40,22,10,3,16,29,116,7,1
2,22020,1610612744,GSW,Golden State Warriors,22000001,2020-12-22,GSW @ BKN,L,240,37,...,34,47,26,6,6,18,24,99,-26,1
3,22020,1610612751,BKN,Brooklyn Nets,22000001,2020-12-22,BKN vs. GSW,W,240,42,...,44,57,24,11,7,20,22,125,26,1
4,22020,1610612764,WAS,Washington Wizards,22000013,2020-12-23,WAS @ PHI,L,240,39,...,35,40,28,7,4,20,26,107,-6,1


In [39]:
# Game 0022000002 is LAL vs. LAC in the game log above; 0022000001 is GSW @ BKN
lal_lac_id = '0022000002'
play_by_play = endpoints.PlayByPlay(game_id=lal_lac_id).get_data_frames()[0]

In [40]:
play_by_play

Unnamed: 0,GAME_ID,EVENTNUM,EVENTMSGTYPE,EVENTMSGACTIONTYPE,PERIOD,WCTIMESTRING,PCTIMESTRING,HOMEDESCRIPTION,NEUTRALDESCRIPTION,VISITORDESCRIPTION,SCORE,SCOREMARGIN
0,0022000002,2,12,0,1,10:08 PM,12:00,,Start of 1st Period (10:08 PM EST),,,
1,0022000002,4,10,0,1,10:08 PM,12:00,Jump Ball Davis vs. Ibaka: Tip to James,,,,
2,0022000002,7,2,80,1,10:08 PM,11:40,MISS Davis 15' Step Back Jump Shot,,,,
3,0022000002,8,4,0,1,10:08 PM,11:37,,,George REBOUND (Off:0 Def:1),,
4,0022000002,9,1,6,1,10:09 PM,11:28,,,Beverley 1' Driving Layup (2 PTS),2 - 0,-2
...,...,...,...,...,...,...,...,...,...,...,...,...
494,0022000002,711,8,0,4,12:34 AM,0:56,,,SUB: Patterson FOR Batum,,
495,0022000002,716,2,79,4,12:35 AM,0:35,,,MISS Mann 17' Pullup Jump Shot,,
496,0022000002,717,4,0,4,12:35 AM,0:35,Horton-Tucker REBOUND (Off:0 Def:1),,,,
497,0022000002,718,1,1,4,12:36 AM,0:19,Kuzma 17' Jump Shot (15 PTS) (Harrell 3 AST),,,116 - 109,-7


What if we took the injury dates, matched each one to a game, and looked at that game's play-by-play?
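
One possible way to do the matching, sketched on toy data: parse both date columns and use `pd.merge_asof` to find the last game played on or before each injury report. The helper name `last_game_before` and the injury row below are hypothetical. The game dates follow the `"DEC 25, 2020"` format that `PlayerGameLog` returns.

```python
import pandas as pd

# Toy game log using the PlayerGameLog column names and date format.
games = pd.DataFrame({
    "Game_ID": ["0022000006", "0022000038", "0022000047"],
    "GAME_DATE": ["DEC 25, 2020", "DEC 27, 2020", "DEC 29, 2020"],
})
# Hypothetical injury report dated the day after the second game.
injury_reports = pd.DataFrame({
    "Date": ["2020-12-28"],
    "Injury": ["sprained right ankle (DTD)"],
})


def last_game_before(injury_reports, games):
    """Attach to each injury report the last game played on or before its date."""
    g = games.copy()
    # "DEC 25, 2020" -> "Dec 25, 2020" so that %b parses it reliably.
    g["GAME_DATE"] = pd.to_datetime(g["GAME_DATE"].str.title(), format="%b %d, %Y")
    g = g.sort_values("GAME_DATE")
    r = injury_reports.copy()
    r["Date"] = pd.to_datetime(r["Date"])
    r = r.sort_values("Date")
    # direction="backward" picks the nearest game date that is <= the injury date.
    return pd.merge_asof(r, g, left_on="Date", right_on="GAME_DATE", direction="backward")


matched = last_game_before(injury_reports, games)
```

Each matched `Game_ID` could then be passed to `endpoints.PlayByPlay(game_id=...)`. The report date is when the injury was listed, which is not necessarily the game in which it happened, so this is only a first guess.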

In [41]:
injuries = pd.read_csv('data/injuries.csv')

In [42]:
injuries

Unnamed: 0,Date,Team,Player,Injury
0,2012-10-30,Bulls,Derrick Rose,recovering from surgery on left knee to repai...
1,2012-10-30,Celtics,Darko Milicic,back spasms (DTD)
2,2012-10-30,Clippers,Grant Hill,bone bruise in right knee (DTD)
3,2012-10-30,Knicks,Amare Stoudemire / Amar'e Stoudemire,arthroscopic surgery on left knee (out indefi...
4,2012-10-30,Knicks,Iman Shumpert,recovering from surgery on left knee to repai...
...,...,...,...,...
7035,2020-08-10,Pelicans,Brandon Ingram,sore right knee (out for season)
7036,2020-08-11,76ers,Al Horford,sore left knee (DTD)
7037,2020-08-11,76ers,Tobias Harris,sore right ankle (DTD)
7038,2020-08-11,Magic,Mohamed Bamba / Mo Bamba,migraine headache (DTD)


In [60]:
steph_games = endpoints.PlayerGameLog(player_id=201939).get_data_frames()[0]
# Strip whitespace before comparing: the raw 'Player' values carry a leading space
steph_injuries = injuries[injuries['Player'].str.strip() == 'Stephen Curry']

In [61]:
steph_games

Unnamed: 0,SEASON_ID,Player_ID,Game_ID,GAME_DATE,MATCHUP,WL,MIN,FGM,FGA,FG_PCT,...,DREB,REB,AST,STL,BLK,TOV,PF,PTS,PLUS_MINUS,VIDEO_AVAILABLE
0,22020,201939,0022001070,"MAY 16, 2021",GSW vs. MEM,W,40,16,36,0.444,...,6,7,9,1,1,7,2,46,14,1
1,22020,201939,0022001039,"MAY 11, 2021",GSW vs. PHX,W,37,7,22,0.318,...,3,3,6,1,0,3,2,21,6,1
2,22020,201939,0022001030,"MAY 10, 2021",GSW vs. UTA,W,37,11,25,0.440,...,3,4,6,2,0,3,1,36,4,1
3,22020,201939,0022001017,"MAY 08, 2021",GSW vs. OKC,W,29,14,26,0.538,...,5,5,2,1,0,1,2,49,31,1
4,22020,201939,0022001001,"MAY 06, 2021",GSW vs. OKC,W,31,11,21,0.524,...,4,4,7,1,0,3,2,34,24,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58,22020,201939,0022000078,"JAN 01, 2021",GSW vs. POR,L,34,9,20,0.450,...,8,8,5,0,0,1,1,26,-27,1
59,22020,201939,0022000047,"DEC 29, 2020",GSW @ DET,W,35,9,17,0.529,...,5,5,6,2,0,8,5,31,3,1
60,22020,201939,0022000038,"DEC 27, 2020",GSW @ CHI,W,36,11,25,0.440,...,2,2,6,2,2,4,1,36,3,1
61,22020,201939,0022000006,"DEC 25, 2020",GSW @ MIL,L,29,6,17,0.353,...,4,4,6,1,0,2,2,19,-24,1


In [62]:
steph_injuries

Unnamed: 0,Date,Team,Player,Injury
466,2013-01-16,Warriors,Stephen Curry,sprained right ankle (DNP)
535,2013-01-29,Warriors,Stephen Curry,sprained right ankle (DNP)
1317,2013-11-08,Warriors,Stephen Curry,bruised/sprained left ankle (DNP)
2962,2014-04-16,Warriors,Stephen Curry,rest (DNP)
3364,2015-02-22,Warriors,Stephen Curry,sprained right ankle / sore right foot (P) (DTD)
3446,2015-03-13,Warriors,Stephen Curry,rest (DTD)
3933,2015-12-30,Warriors,Stephen Curry,bruised lower left leg (DTD)
4186,2016-03-01,Warriors,Stephen Curry,left ankle injury (DTD)
4462,2016-04-16,Warriors,Stephen Curry,sprained right ankle (DTD)
4473,2016-04-25,Warriors,Stephen Curry,sprained MCL in right knee (DTD)


In [78]:
# Strip leading/trailing whitespace from the text columns (.str.strip() also tolerates NaN)
injuries['Player'] = injuries['Player'].str.strip()
injuries['Team'] = injuries['Team'].str.strip()
injuries['Injury'] = injuries['Injury'].str.strip()

In [80]:
injuries['Injury'][0]

'recovering from surgery on left knee to repair torn ACL (out indefinitely)'
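
The injury notes end with a parenthesised status such as `(DTD)` (presumably day-to-day), `(DNP)` (presumably did not play), or `(out indefinitely)`. As a possible later cleaning step, the sketch below splits that trailing status from the description. The helper name `split_status` is hypothetical, and the assumption that the status is always the last parenthesised group has not been checked against the full dataset.

```python
import re

# Matches the last parenthesised group at the very end of the note, e.g. "(DTD)".
_STATUS = re.compile(r"\(([^()]*)\)\s*$")


def split_status(note):
    """Return (description, status); status is None if there is no trailing parenthetical."""
    match = _STATUS.search(note)
    if match is None:
        return note.strip(), None
    return note[:match.start()].strip(), match.group(1).strip()


examples = [
    "sprained right ankle (DNP)",
    "sprained right ankle / sore right foot (P) (DTD)",
    "recovering from surgery on left knee to repair torn ACL (out indefinitely)",
]
parsed = [split_status(note) for note in examples]
```

On the full table this could be applied with `injuries['Injury'].apply(split_status)`.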

In [81]:
# Write the cleaned data back without the row index so it is not re-read as an extra column
injuries.to_csv('data/injuries.csv', index=False)