# Motivation

### The NBA Most Valuable Player Award is the most prestigious individual award handed out for the regular season. It has large implications for a player's career, with MVP winners being viewed as the best in the game.
### However, there are no clear criteria for judging the winner, and fans and analysts alike fiercely debate both who is most deserving and which criteria should determine the MVP. Here, I attempt to identify the features that matter most for MVPs and develop a model to predict future winners.

# I. Extraction

### In this notebook, I use the nba_api client to access the stats.nba.com API and extract player and team information and stats.

### The nba_api repo is located at https://github.com/swar/nba_api

In [1]:
from nba_api.stats.endpoints.commonallplayers import CommonAllPlayers
from nba_api.stats.endpoints.teamyearbyyearstats import TeamYearByYearStats
from nba_api.stats.endpoints.playerawards import PlayerAwards
from nba_api.stats.endpoints.playercareerstats import PlayerCareerStats
from nba_api.stats.static import teams
import pandas as pd
import pickle

#Extract NBA player info and keep players whose last season (TO_YEAR) was 2000 or later.
all_players = CommonAllPlayers().get_data_frames()[0]
all_players['TO_YEAR'] = all_players['TO_YEAR'].astype('int64')
all_players = all_players[all_players['TO_YEAR']>=2000]

#Get player and team ids
player_id = all_players['PERSON_ID']
team_id = pd.DataFrame(teams.get_teams())['id']

#Save the player id/name lookup to disk, closing the file handle when done
player = all_players[['PERSON_ID', 'DISPLAY_FIRST_LAST']]
with open('player.pkl', 'wb') as f:
    pickle.dump(player, f)

In [None]:
#Import custom functions
from mvp_functions.functions import *

#Initialize dataframes
player_career = pd.DataFrame()
team_stats = pd.DataFrame()
player_awards = pd.DataFrame()

#Custom function that extracts stats and dumps the output to .pkl file
extract_stats(team_stats, TeamYearByYearStats, team_id)
extract_stats(player_career, PlayerCareerStats, player_id)
extract_stats(player_awards, PlayerAwards, player_id)
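
### The body of extract_stats lives in mvp_functions and is not shown in this notebook. As a rough illustration only, the sketch below shows one way such a helper could work: call an endpoint once per id, stack the first result table from each call, and pickle the combined frame. The name extract_stats_sketch and the parameters out_path, pause, and retries are hypothetical and are not taken from the original function; the sketch also assumes the endpoint class accepts the id as its first positional argument.

```python
import time

import pandas as pd


def extract_stats_sketch(endpoint, ids, out_path, pause=0.6, retries=3):
    """Hypothetical stand-in for extract_stats (not the author's code).

    Calls `endpoint` once per id, stacks the first result table from each
    call, writes the combined frame to `out_path`, and returns it.
    """
    frames = []
    for entity_id in ids:
        for attempt in range(1, retries + 1):
            try:
                # Assumption: the endpoint takes the id as its first argument
                frames.append(endpoint(entity_id).get_data_frames()[0])
                break
            except Exception:
                if attempt == retries:
                    raise  # give up after the last allowed attempt
                time.sleep(pause * attempt)  # wait longer before each retry
        time.sleep(pause)  # throttle requests between ids

    result = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    result.to_pickle(out_path)
    return result
```

### Under these assumptions, a call such as extract_stats_sketch(PlayerCareerStats, player_id, 'player_career.pkl') would play the same role as the extract_stats calls above (the file name is illustrative). Pausing between requests is a design choice for a loop that makes one request per player or team; the right delay is a guess and may need tuning.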