# NBA Players Injury Analysis

# Introduction
- NBA player injuries are trending up, costing owners money, costing coaches and players wins, and disappointing fans who end up watching their team's star player sit on the bench in street clothes. In the 2014-15 playoffs, injuries to two key players, Kevin Love and Kyrie Irving, created too big a hurdle for LeBron James and the Cavaliers to beat the Golden State Warriors in the NBA Finals. The Los Angeles Clippers organization has been plagued by the "Clipper curse" of untimely and unfortunate injuries. In the most recent Clipper injuries, Chris Paul fractured his left hand during the early rounds of the 2015-16 playoffs while Blake Griffin, another Clipper All-Star, was nursing a strained hamstring.

Shaun Livingston, drafted by the Clippers with the 4th pick in 2004, injured almost every part of his knee midseason in 2007: he tore the anterior cruciate ligament (ACL), the posterior cruciate ligament (PCL), and the lateral meniscus, badly sprained his medial collateral ligament (MCL), and dislocated his patella and his tibio-fibular joint.



# Goal
- Analyze NBA player injuries by focusing on the number of games each player played in a given season. Each NBA team plays 82 regular-season games. My goal is to build a model that predicts total games played in 2016-17 for each NBA player.



### Target

- Target: Total games played for each player (per season)

### Possible Attributes:
- PFD: Fouls Drawn (average fouls drawn)
- MIN: Minutes played (average minutes played)
- Shot Selection: Shooting Dataset
    - count per shot type or pct. per shot type
- Opponent Shot Selection: Opponent Shooting Dataset
- Schedule per season: number of back to backs + travel time (incorporate time zone)
- Initial height, weight
- Whether a decrease in body weight goes with an increase in games played (EDA): Player Bios dataset
- PIE: traditional advanced and clutch advanced dataset
- Points in the Paint 
- Unassisted % - traditional scoring and clutch scoring dataset
- Usage dataset for traditional and clutch
- Team
- Position
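
The schedule attribute above (number of back-to-backs) could be derived from a team's game dates. The following is a minimal sketch, not the project's actual code: the `games` table, its `TEAM` and `GAME_DATE` column names, and the `schedule_features` helper are all assumptions made for illustration, and the dates are made up.

```python
import pandas as pd

# Hypothetical game log for one team; column names and dates are
# assumptions, not the actual stats.nba.com schema or schedule.
games = pd.DataFrame({
    "TEAM": ["LAC"] * 6,
    "GAME_DATE": pd.to_datetime([
        "2015-11-01", "2015-11-02", "2015-11-04",
        "2015-11-05", "2015-11-07", "2015-11-10",
    ]),
})

def schedule_features(game_dates: pd.Series) -> dict:
    """Count dense stretches of the schedule for one team (hypothetical helper)."""
    dates = game_dates.sort_values().reset_index(drop=True)
    # Back-to-back: a game played exactly one day after the previous game.
    b2b = int((dates.diff().dt.days == 1).sum())
    # Three games in four days: the game two slots earlier was at most 3 days ago.
    three_in_four = int(((dates - dates.shift(2)).dt.days <= 3).sum())
    return {"B2B_COUNT": b2b, "3GMS_IN_4DAYS": three_in_four}

# One row of schedule features per team.
features = pd.DataFrame.from_dict(
    {team: schedule_features(g["GAME_DATE"]) for team, g in games.groupby("TEAM")},
    orient="index",
)
print(features)
```

Travel time and time zones are not covered here; they would need arena locations, which this sketch does not assume.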


### Data Gathering/Munging
1. Scrape regular-season NBA stats from http://stats.nba.com/
    - TRADITIONAL STATS
    - ADVANCED STATS
    - SHOOTING STATS 
    - OPPONENT SHOOTING STATS
    - PLAYER BIOS
    - TEAM SCHEDULE GAME LOGS
    
2. Merge individual player statistics per year
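
The merge step might look roughly like the sketch below. It assumes each scraped table has already been loaded into a pandas DataFrame keyed by a player identifier and a season; the `PLAYER_ID` and `SEASON` key names and all values are placeholders for illustration, not the real scraped data.

```python
import pandas as pd

# Hypothetical per-season tables; the real ones would come from the scrape.
traditional = pd.DataFrame({
    "PLAYER_ID": [1, 2],
    "SEASON": ["2014-15", "2014-15"],
    "GP": [69, 75],
    "MIN": [36.1, 35.7],
})
advanced = pd.DataFrame({
    "PLAYER_ID": [1, 2],
    "SEASON": ["2014-15", "2014-15"],
    "PIE": [0.18, 0.15],
    "USG_PCT": [0.32, 0.26],
})
bios = pd.DataFrame({
    "PLAYER_ID": [1],
    "SEASON": ["2014-15"],
    "PLAYER_HEIGHT_INCHES": [80],
    "PLAYER_WEIGHT": [250],
})

keys = ["PLAYER_ID", "SEASON"]  # assumed join keys
merged = traditional
for table in (advanced, bios):
    # Left join keeps every player-season from the traditional table and
    # leaves NaN where another table has no matching row.
    # validate raises if a table has duplicate player-season rows.
    merged = merged.merge(table, on=keys, how="left", validate="one_to_one")

print(merged)
```

A left join is used so that a player missing from one dataset is not silently dropped; whether to keep or drop such rows is a modeling decision.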




## Potential models

number of games played per season ~ 
    + average player stats
        field goals
        ('FGM', 'FGA', 'FG_PCT', 
         'FG3M', 'FG3A', 'FG3_PCT')
        free-throws
        ('FTM', 'FTA', 'FT_PCT')
        Rebounds
        ('OREB', 'DREB', 'REB')
        Assists/TOV 
        ('AST', 'TOV')
        Defense
        ('STL', 'BLK', 'BLKA')
        Fouls
        ('PF', 'PFD')
        Points
        ('PTS')
        fouls drawn 
        ('PFD')
        minutes played per game 
        ('MIN')
    + pct shots per distance 
        ('lessthan5ft_FGM', 'lessthan5ft_FGA', 'lessthan5ft_FG_PCT', '5_9ft_FGM',
         '5_9ft_FGA', '5_9ft_FG_PCT', '10_14ft_FGM', '10_14ft_FGA', '10_14ft_FG_PCT',
         '15_19ft_FGM', '15_19ft_FGA', '15_19ft_FG_PCT', '20_24ft_FGM', '20_24ft_FGA',
         '20_24ft_FG_PCT', '25_29ft_FGM', '25_29ft_FGA', '25_29ft_FG_PCT', '30_34ft_FGM',
         '30_34ft_FGA', '30_34ft_FG_PCT', '35_39ft_FGM') 
    + pct opponent shots per distance 
        ('opp_lessthan5ft_FGM', 'opp_lessthan5ft_FGA', 'opp_lessthan5ft_FG_PCT',
         'opp_5_9ft_FGM', 'opp_5_9ft_FGA', 'opp_5_9ft_FG_PCT', 'opp_10_14ft_FGM',
         'opp_10_14ft_FGA', 'opp_10_14ft_FG_PCT', 'opp_15_19ft_FGM', 'opp_15_19ft_FGA',
         'opp_15_19ft_FG_PCT', 'opp_20_24ft_FGM', 'opp_20_24ft_FGA', 'opp_20_24ft_FG_PCT',
         'opp_25_29ft_FGM', 'opp_25_29ft_FGA', 'opp_25_29ft_FG_PCT', 'opp_30_34ft_FGM',
         'opp_30_34ft_FGA', 'opp_30_34ft_FG_PCT') 
    + pct shot by type
        ('PCT_FGA_2PT', 'PCT_FGA_3PT', 'PCT_PTS_2PT', 'PCT_PTS_2PT_MR', 'PCT_PTS_3PT',
         'PCT_PTS_FB', 'PCT_PTS_FT', 'PCT_PTS_OFF_TOV', 'PCT_PTS_PAINT', 'PCT_AST_2PM')
    + schedule 
        ('B2B_COUNT', '3GMS_IN_4DAYS', '4GMS_IN_5DAYS') 
    + player_bios 
        ('PLAYER_HEIGHT_INCHES', 'PLAYER_WEIGHT') 
    + age 
        ('AGE') 
    + Player Impact Estimate that calculates a player's impact on each individual game they play 
        ('PIE')
    + pace
        ('PACE')
    + plus/minus
        ('PLUS_MINUS')
    + unassisted % 
        ('PCT_UAST_2PM', 'PCT_AST_3PM', 'PCT_UAST_3PM', 'PCT_AST_FGM', 'PCT_UAST_FGM')
    + usage metrics
        ('USG_PCT')
    + True shooting percentage
        ('TS_PCT')
    + team 
        (dummy coded columns)
    + years in league
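
The dummy-coded team columns mentioned above could be produced with `pandas.get_dummies`. This is only a sketch: the `TEAM_ABBREVIATION` column name and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical player rows; column names and values are placeholders.
players = pd.DataFrame({
    "PLAYER_ID": [1, 2, 3],
    "TEAM_ABBREVIATION": ["LAC", "GSW", "LAC"],
    "AGE": [30, 27, 26],
})

# One indicator column per team. drop_first removes one team so the
# indicators are not perfectly collinear with an intercept term.
team_dummies = pd.get_dummies(
    players["TEAM_ABBREVIATION"], prefix="TEAM", drop_first=True
)
X = pd.concat([players.drop(columns="TEAM_ABBREVIATION"), team_dummies], axis=1)
print(X)
```

Dropping the first category matters for linear models with an intercept; tree-based models can keep all team columns.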

    
Predictors from the initial goal that I chose to leave out:
    - travel time: difficult
    - points in paint: taken care of by 'PCT_PTS_PAINT' in pct shot by type
    - position (dummy coded columns: 'G', 'F', 'C'): difficult because some players play multiple positions 
    throughout the season. Draymond Green played multiple positions in a single game, including small forward, power 
    forward and center. Also, the Golden State Warriors' 'small ball' lineup, a lineup consisting of no 'true' 
    center, has caused a trend away from traditional lineups. Positions like shooting guard 
    and small forward are interchangeable, as are power forward and center. Paul George played 
    shooting guard, small forward and power forward last year.
    
    
## Target

Each row is a player in a particular year.
Target variable is number of games played in that year.
Columns represent these stats for years prior to this year (however far back you want to go).

For example, in one row:

y = LeBron's games played in 2006 
X = [LeBron's average fouls drawn 2005, avg mins played 2005, etc.]

Or, if y = games played 2006, X = [fouls drawn 2005, avg mins 2005, etc.]:

    Y       X
    [12,    [[21, 43],
     10,     [12, 23],
     9,      [20, 11],
     2,      [5, 17],
     16,     [14, 25],
     ...     ...
     
Each row is one player in one year.
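
The layout described above, with this year's games played as the target and last year's stats as predictors, could be built with a per-player shift. The sketch below assumes a long table with one row per player per year; the `PLAYER_ID`, `YEAR`, and `GP` column names and all numbers are illustrative, not real data.

```python
import pandas as pd

# Hypothetical player-season table; values are illustrative only.
stats = pd.DataFrame({
    "PLAYER_ID": [1, 1, 1, 2, 2],
    "YEAR": [2004, 2005, 2006, 2005, 2006],
    "GP": [79, 80, 79, 60, 72],
    "PFD": [5.0, 6.0, 7.0, 2.0, 3.0],
    "MIN": [39.0, 42.0, 40.0, 25.0, 30.0],
})

feature_cols = ["PFD", "MIN"]
stats = stats.sort_values(["PLAYER_ID", "YEAR"])

# Shift each player's stats down one row so that a season's row carries
# the previous recorded season's values as predictors.
# Assumption: each player's rows are consecutive seasons with no gaps.
lagged = stats.groupby("PLAYER_ID")[feature_cols].shift(1).add_suffix("_PREV")
design = pd.concat([stats[["PLAYER_ID", "YEAR", "GP"]], lagged], axis=1)

# A player's first recorded season has no prior year, so drop it.
design = design.dropna(subset=list(lagged.columns))

y = design["GP"]                  # games played in year t
X = design[list(lagged.columns)]  # stats from year t-1
print(design)
```

Going further back than one year would mean adding `shift(2)`, `shift(3)`, and so on as extra columns, at the cost of dropping more early-career rows.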
