## Analysis notebook: factors impacting an NBA player's 2023-2024 guaranteed contract value
**What data are we exploring?**<br>
I am interested in sports analytics, and my favorite sport is basketball, so I used player performance stats from the last five NBA seasons. I wrote custom functions that use player names and IDs to pull stats from the NBA_API. After running the functions and merging the returned dataframes, the dataset contains 476 rows and 21 columns. The columns represent standard NBA performance metrics such as:
- total points
- total steals
- total assists
- etc.<br>

This notebook aims to inform players and their management firms about how player performance impacts contract value.


### 1. Import libraries required for the analysis

Below is a list of libraries used to collect, manipulate, and analyze the data.

In [1]:
import altair as alt
from data_collection import *  # custom functions written for this project
import matplotlib as plt
#import nba_api
import numpy as np
import pandas as pd
import os
import seaborn as sns
import statsmodels as sm
import sys
import warnings

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
warnings.simplefilter(action='ignore', category=FutureWarning)

Check the installed versions to ensure they match the requirements.txt file.

In [2]:
print("Altair:", alt.__version__)
print("Matplotlib:", plt.__version__)
# print("NBA_API:", nba_api.__version__)
print("Numpy:", np.__version__)
print("Pandas:", pd.__version__)
print("Python Version:", sys.version)
print("Seaborn:", sns.__version__)
print("Statsmodels:", sm.__version__)

Altair: 5.0.1
Matplotlib: 3.8.0
Numpy: 1.26.3
Pandas: 2.1.4
Python Version: 3.12.1 | packaged by Anaconda, Inc. | (main, Jan 19 2024, 15:44:08) [MSC v.1916 64 bit (AMD64)]
Seaborn: 0.12.2
Statsmodels: 0.14.0
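
The manual comparison above can also be automated. The following is a minimal sketch, assuming requirements.txt uses exact `name==version` pins. The helper name `find_version_mismatches` is hypothetical and is not part of this notebook's `data_collection` module.

```python
from importlib import metadata


def find_version_mismatches(requirement_lines, get_version=metadata.version):
    """Return {package: (pinned, installed)} for pins that do not match.

    Only exact pins of the form 'name==version' are checked; blank lines,
    comments, and other specifiers are skipped.
    """
    mismatches = {}
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if "==" not in line:
            continue
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Passing the open requirements.txt file object (or a list of its lines) as `requirement_lines` returns an empty dictionary when every pinned package matches the installed version.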


### 2. Collecting data

Use the custom functions to get contract data, player IDs, and stats from the last 5 years.<br>
Use help(function) to print the docstring associated with a function.<br>
- e.g., help(get_contract_data) prints the following:<br>

> Help on function get_contract_data in module data_collection:<br>
<br>
get_contract_data( )<br>
&emsp;Reads the raw 'player_contract_data' file and executes the following<br>
&emsp;manipulations:<br>
&emsp;&emsp;- Converts the 'Player' column to lowercase<br>
&emsp;&emsp;- Replaces accented characters with their english versions.<br>
&emsp;&emsp;- Splits the 'Player' column into 'First_Name' and 'Last_Name' columns.<br>
&emsp;&emsp;- Drops the original 'Player' column since we no longer need it.<br>
&emsp;&emsp;- Rename the '2023-24' column to something more representative<br>
&emsp;&emsp;- Drop the extra Reggie Bullock row<br>
<br>
&emsp;Args:<br>
&emsp;&emsp;None<br>
<br>
&emsp;Returns:<br>
&emsp;&emsp;Dataframe: Player with their associated contract values.
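
The name clean-up steps listed in the docstring (lowercasing, replacing accented characters, splitting into first and last name) could look roughly like the sketch below. This is not the actual `get_contract_data` implementation, which is not shown here. The function name `normalize_player_name` is hypothetical, and splitting on the first space is an assumption about how multi-word last names and suffixes are handled.

```python
import unicodedata


def normalize_player_name(player):
    """Lowercase a name, strip accents, and split it into (first, last)."""
    # NFKD splits an accented letter into a base letter plus combining marks,
    # which are then dropped.
    decomposed = unicodedata.normalize("NFKD", player.lower())
    ascii_name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Split on the first space only, so multi-word last names stay together.
    first, _, last = ascii_name.strip().partition(" ")
    return first, last
```

Letters that have no base-plus-accent decomposition (for example "ø") pass through this approach unchanged, so a real pipeline may need an explicit replacement table for them.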

In [10]:
contract_data = get_contract_data()
contract_data.head()

Unnamed: 0,Current_Contract,First_Name,Last_Name
0,51915615.0,stephen,curry
1,47649433.0,kevin,durant
2,47607350.0,nikola,jokic
3,47607350.0,joel,embiid
4,47607350.0,lebron,james


In [14]:
id_data = get_players_with_ids()
id_data.head()

Unnamed: 0,First_Name,Last_Name,PlayerID
0,alaa,abdelnaby,76001
1,zaid,abdul-aziz,76002
2,kareem,abdul-jabbar,76003
3,mahmoud,abdul-rauf,51
4,tariq,abdul-wahad,1505


In [15]:
# Left join to add player IDs to the contracts dataframe
contract_with_ids = pd.merge(contract_data, id_data,
                             on=['First_Name', 'Last_Name'], how='left')

contract_with_ids.head()

Unnamed: 0,Current_Contract,First_Name,Last_Name,PlayerID
0,51915615.0,stephen,curry,201939
1,47649433.0,kevin,durant,201142
2,47607350.0,nikola,jokic,203999
3,47607350.0,joel,embiid,203954
4,47607350.0,lebron,james,2544
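
Because the join key is the player's name and the ID table includes historical players, a left join can leave `PlayerID` missing (no name match) or duplicate a contract row (two players sharing a name). The sketch below shows one way to check for both cases using the `indicator` option of `merge`. The toy frames and their values are made up for illustration and are not taken from the notebook's data.

```python
import pandas as pd

# Toy frames standing in for contract_data and id_data (values are made up).
contracts = pd.DataFrame({
    "First_Name": ["alex", "sam", "chris"],
    "Last_Name": ["stone", "rivers", "lake"],
    "Current_Contract": [3_000_000.0, 2_000_000.0, 1_000_000.0],
})
ids = pd.DataFrame({
    "First_Name": ["alex", "chris", "chris"],
    "Last_Name": ["stone", "lake", "lake"],
    "PlayerID": [1, 2, 3],
})

keys = ["First_Name", "Last_Name"]
# indicator=True adds a "_merge" column recording where each row came from.
merged = contracts.merge(ids, on=keys, how="left", indicator=True)

# Contract rows with no matching name in the ID table.
unmatched = merged.loc[merged["_merge"] == "left_only", keys]

# Names that matched more than one PlayerID and were therefore duplicated.
ambiguous = merged.loc[merged.duplicated(subset=keys, keep=False), keys].drop_duplicates()
```

In this toy data, `unmatched` holds the one name with no ID and `ambiguous` holds the one name that maps to two IDs. Running the same checks on `contract_data` and `id_data` would show whether any real rows need manual attention.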


In [17]:
# Get career stats for each player's last five seasons, in batches of 50
stats_df = get_analysis_df(50, contract_with_ids)
stats_df.head()
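
The internals of `get_analysis_df` are not shown in this notebook. As one plausible reading of "in batches of 50", the sketch below splits a list of player IDs into fixed-size batches and pauses between them to avoid overloading an API. The names `iter_batches`, `fetch_in_batches`, and `fetch_one` are hypothetical, and `fetch_one` stands in for a single API request.

```python
import time


def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fetch_in_batches(player_ids, batch_size, fetch_one, pause_seconds=0.0):
    """Call `fetch_one` for every ID, pausing between batches.

    `fetch_one` is a hypothetical callable standing in for a single API request.
    """
    results = []
    for batch_number, batch in enumerate(iter_batches(player_ids, batch_size)):
        if batch_number > 0:
            time.sleep(pause_seconds)  # give the API a break between batches
        results.extend(fetch_one(player_id) for player_id in batch)
    return results
```

The pause length is a tuning choice that depends on the API's rate limits, which this notebook does not state.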

In [None]:
# Save the file in case the API has issues
# id_data.to_csv('data/player_ids.csv', index=False)
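
Saving a copy is most useful when paired with a way to reuse it. The following is a minimal sketch of a "load the saved CSV if present, otherwise fetch and save" helper. The name `load_or_fetch` is hypothetical and is not part of `data_collection`.

```python
from pathlib import Path

import pandas as pd


def load_or_fetch(csv_path, fetch):
    """Return the cached CSV if it exists; otherwise call `fetch` and cache the result."""
    path = Path(csv_path)
    if path.exists():
        return pd.read_csv(path)
    frame = fetch()  # hypothetical callable that returns a DataFrame
    path.parent.mkdir(parents=True, exist_ok=True)
    frame.to_csv(path, index=False)
    return frame
```

With this helper, `id_data = load_or_fetch('data/player_ids.csv', get_players_with_ids)` would only call the API when no saved file exists. Note that a CSV round trip can change column dtypes, so the cached frame may not be identical to a freshly fetched one.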