# Projected Expected Average Statistics 
## Simple Projection System

The end goal of this project is to determine how players shoot when they are over- or under-performing, as measured by their holistic performance on the court. <br>
We measure a player's holistic performance by comparing their actual statistics to their expected statistics.<br>
This raises the question: "How can we accurately predict expected averages?"<br>
<br>
This project uses expected averages produced by the simplest method available, the Simple Projection System (SPS). The SPS was chosen because its values feed into the calculations for a subsequent predictive model, and I wasn't comfortable building a novel predictive model accurate or precise enough to withstand compounding error analysis. Such a model could be the focus of its own research (I can't envision Vegas or Macao releasing their predictive models for academic research), and as such, falls outside the scope of this paper. 
<br>
The SPS data was taken from basketball-reference.com. The full SPS explanation can be found at https://www.basketball-reference.com/about/projections.html, but I'll give a brief outline of the process below.
<br>
1. The model takes the last three years of data into account, with the most recent year given a weight of 6, the second a weight of 3, and the third a weight of 1. <br>
2. The model then computes the weighted sum of minutes played. <br>
3. It also computes the weighted sum of the target statistic. <br>
4. Then, the model calculates the weighted sum of a league-average player scaled to the target player's minutes played, then multiplied by 1000. <br>
5. Calculating the player's expected statistic for 36 minutes is done as follows: <br>
<br>
(step 3 value + step 4 value) / (step 2 value + 1000) * 36 
<br> <br>
6. Finally, an age adjustment is made, which reflects that most players peak around 28 years of age. <br>
If the player is younger than 28, the adjustment value is (28 - age) * 0.004. If they are older, it's (28 - age) * 0.002. The adjustment is therefore positive for younger players and negative for older ones. <br> <br>
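
The outline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Basketball Reference's code: the function names, the `league_rate_per_min` input (league-average production per minute in each season), and the reading of step 4 as "per-minute league rate times 1000 minutes" are mine. The age adjustment uses the values published on the Basketball Reference page, (28 - age) * 0.004 for players younger than 28 and (28 - age) * 0.002 for older players; how the adjustment is then applied to the per-36 value is not covered here.

```python
WEIGHTS = (6, 3, 1)  # step 1: most recent season first


def sps_per36(minutes, stat, league_rate_per_min, prior_minutes=1000):
    """Hypothetical helper: per-36 projection before the age adjustment.

    minutes, stat and league_rate_per_min each hold three values,
    ordered from the most recent season to the oldest.
    """
    weighted_min = sum(w * m for w, m in zip(WEIGHTS, minutes))   # step 2
    weighted_stat = sum(w * s for w, s in zip(WEIGHTS, stat))     # step 3
    if weighted_min == 0:
        raise ValueError("player has no minutes in the last three seasons")
    # step 4 (assumed reading): what a league-average player would have
    # produced in the same minutes, turned into a per-minute rate and
    # scaled to 1000 minutes
    weighted_league = sum(
        w * r * m for w, r, m in zip(WEIGHTS, league_rate_per_min, minutes)
    )
    league_prior = weighted_league / weighted_min * prior_minutes
    # step 5
    return (weighted_stat + league_prior) / (weighted_min + prior_minutes) * 36


def age_adjustment(age, reverse_sign=False):
    """Step 6: positive before age 28, negative after it.

    reverse_sign=True covers categories where lower is better
    (missed shots, turnovers, personal fouls).
    """
    rate = 0.004 if age < 28 else 0.002
    adjustment = (28 - age) * rate
    return -adjustment if reverse_sign else adjustment


# made-up inputs: 2000 minutes and 1000 points in each of three seasons,
# with a league-average rate of 0.4 points per minute
per36 = sps_per36((2000, 2000, 2000), (1000, 1000, 1000), (0.4, 0.4, 0.4))
print(round(per36, 2), age_adjustment(24), age_adjustment(32))
```

The 1000 extra minutes act as a prior: a player with few minutes is pulled toward the league average, while a heavy-minutes player is projected mostly from their own record.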

Some additional notes regarding the SPS, straight from BR:<br>

For the following categories the sign of the age adjustment is reversed: field goals missed, 3-point field goals missed, free throws missed, turnovers, and personal fouls.<br>
For the shooting categories, shots missed are projected rather than shots attempted. Projected shots attempted are then computed by adding projected shots made and projected shots missed.<br>
Projected points are computed using projected field goals made, projected 3-point field goals made, and projected free throws made.<br>
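
The last two notes amount to simple arithmetic, sketched below. The function names are hypothetical, and the points formula assumes the usual box-score convention that field goals made already include made 3-pointers, which is consistent with the data used later in this notebook (for example, a row with 36.0 FG, no 3P and no FT has 72.0 PTS).

```python
def projected_attempts(made, missed):
    """Attempts are not projected directly: they are made + missed."""
    return made + missed


def projected_points(fg, fg3, ft):
    """Points from made shots.

    fg counts every made field goal (2 points each); each made
    3-pointer adds one extra point; each free throw adds one point.
    """
    return 2 * fg + fg3 + ft


print(projected_points(36.0, 0.0, 0.0))
print(projected_attempts(4.0, 6.0))
```

Because the published values are rounded to one decimal place, recomputing points from the rounded FG, 3P and FT columns can differ from the listed PTS by a tenth or two.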

### SPS projections

In [55]:
# importing necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn.metrics import mean_squared_error
from math import sqrt

In [56]:
marcels_1718 = pd.read_csv('/Users/seandiehl/Downloads/1718marcels.csv')
marcels_1819 = pd.read_csv('/Users/seandiehl/Downloads/marcels1819.csv')
pd.set_option('display.max_columns', None)

In [57]:
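# preview the data; dropna() returns a new frame and is not assigned,
# so marcels_1819 itself still contains the rows with missing values
marcels_1819.head()
marcels_1819.dropna()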

Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
0,1,Álex Abrines\abrinal01,Projected,4.2,10.2,2.8,7.1,1.5,1.7,0.8,3.7,1.3,1.2,0.3,1.0,3.8,12.5,0.407,0.385,0.852
1,1,Álex Abrines\abrinal01,Actual,3.4,9.6,2.5,7.8,0.7,0.8,0.3,2.9,1.2,1.0,0.4,0.9,3.2,10.1,0.357,0.323,0.923
2,2,Quincy Acy\acyqu01,Projected,3.9,10.1,2.4,6.7,1.7,2.1,1.2,6.9,1.6,0.9,0.8,1.6,3.9,11.9,0.389,0.358,0.786
3,2,Quincy Acy\acyqu01,Actual,1.2,5.3,0.6,4.4,2.0,2.9,0.9,7.3,2.3,0.3,1.2,1.2,7.0,5.0,0.222,0.133,0.700
4,3,Steven Adams\adamsst01,Projected,6.1,10.1,0.1,0.2,2.3,4.0,4.9,9.7,1.4,1.3,1.2,1.9,3.1,14.7,0.609,0.327,0.588
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
756,385,Cody Zeller\zelleco01,Actual,5.5,10.0,0.2,0.6,3.2,4.1,3.2,9.6,3.0,1.1,1.2,1.8,4.7,14.4,0.551,0.273,0.787
757,386,Tyler Zeller\zellety01,Projected,6.0,11.3,0.4,1.0,2.0,2.9,3.2,9.3,1.9,0.6,1.1,1.6,4.1,14.4,0.530,0.355,0.714
758,386,Tyler Zeller\zellety01,Actual,6.2,11.6,0.0,0.4,5.4,7.0,4.3,9.3,1.5,0.4,1.2,1.5,7.7,17.8,0.533,0.000,0.778
759,387,Ante Žižić\zizican01,Projected,7.4,12.0,0.7,1.9,3.1,4.2,3.0,8.7,2.0,0.7,1.6,1.9,4.0,18.6,0.615,0.374,0.748


In [58]:
# keep only the projected (SPS) rows
projections = marcels_1819.loc[marcels_1819['Type']=='Projected']

projections.head()
projections.dropna()



Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
0,1,Álex Abrines\abrinal01,Projected,4.2,10.2,2.8,7.1,1.5,1.7,0.8,3.7,1.3,1.2,0.3,1.0,3.8,12.5,0.407,0.385,0.852
2,2,Quincy Acy\acyqu01,Projected,3.9,10.1,2.4,6.7,1.7,2.1,1.2,6.9,1.6,0.9,0.8,1.6,3.9,11.9,0.389,0.358,0.786
4,3,Steven Adams\adamsst01,Projected,6.1,10.1,0.1,0.2,2.3,4.0,4.9,9.7,1.4,1.3,1.2,1.9,3.1,14.7,0.609,0.327,0.588
6,4,Bam Adebayo\adebaba01,Projected,4.9,9.4,0.2,0.6,3.4,4.6,3.0,9.9,2.8,0.9,1.1,1.7,3.5,13.3,0.518,0.282,0.735
8,5,LaMarcus Aldridge\aldrila01,Projected,8.9,18.0,0.4,1.2,4.2,5.1,3.1,8.7,2.2,0.7,1.3,1.6,2.4,22.4,0.495,0.317,0.829
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
753,384,Thaddeus Young\youngth01,Projected,5.9,12.0,0.7,2.2,0.9,1.5,2.4,7.3,2.1,1.8,0.5,1.6,2.4,13.5,0.497,0.334,0.600
755,385,Cody Zeller\zelleco01,Projected,5.1,9.4,0.2,0.4,3.1,4.4,3.1,9.0,2.0,1.1,1.2,1.6,4.1,13.6,0.548,0.377,0.714
757,386,Tyler Zeller\zellety01,Projected,6.0,11.3,0.4,1.0,2.0,2.9,3.2,9.3,1.9,0.6,1.1,1.6,4.1,14.4,0.530,0.355,0.714
759,387,Ante Žižić\zizican01,Projected,7.4,12.0,0.7,1.9,3.1,4.2,3.0,8.7,2.0,0.7,1.6,1.9,4.0,18.6,0.615,0.374,0.748


In [59]:
# keep only the rows with the statistics the players actually recorded
actual = marcels_1819.loc[marcels_1819['Type']=='Actual']
actual.head()


Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
1,1,Álex Abrines\abrinal01,Actual,3.4,9.6,2.5,7.8,0.7,0.8,0.3,2.9,1.2,1.0,0.4,0.9,3.2,10.1,0.357,0.323,0.923
3,2,Quincy Acy\acyqu01,Actual,1.2,5.3,0.6,4.4,2.0,2.9,0.9,7.3,2.3,0.3,1.2,1.2,7.0,5.0,0.222,0.133,0.7
5,3,Steven Adams\adamsst01,Actual,6.5,10.9,0.0,0.0,2.0,3.9,5.3,10.3,1.7,1.6,1.0,1.8,2.8,14.9,0.595,,0.5
7,4,Bam Adebayo\adebaba01,Actual,5.3,9.1,0.1,0.3,3.1,4.3,3.1,11.2,3.5,1.3,1.2,2.3,3.8,13.7,0.576,0.2,0.735
9,5,LaMarcus Aldridge\aldrila01,Actual,9.2,17.7,0.1,0.6,4.7,5.5,3.4,10.0,2.6,0.6,1.4,1.9,2.4,23.1,0.519,0.238,0.847


In [60]:
# look up the row with PTS == 72, an extreme per-36-minute value
actual.loc[(actual['PTS']==72)]

Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
605,308,Zhou Qi\qizh01,Actual,36.0,36.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,72.0,1.0,,


In [61]:
# keep only the projected players who also have an actual row
projections = projections.loc[ (projections['Player'].isin(actual.Player))]
projections

Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
0,1,Álex Abrines\abrinal01,Projected,4.2,10.2,2.8,7.1,1.5,1.7,0.8,3.7,1.3,1.2,0.3,1.0,3.8,12.5,0.407,0.385,0.852
2,2,Quincy Acy\acyqu01,Projected,3.9,10.1,2.4,6.7,1.7,2.1,1.2,6.9,1.6,0.9,0.8,1.6,3.9,11.9,0.389,0.358,0.786
4,3,Steven Adams\adamsst01,Projected,6.1,10.1,0.1,0.2,2.3,4.0,4.9,9.7,1.4,1.3,1.2,1.9,3.1,14.7,0.609,0.327,0.588
6,4,Bam Adebayo\adebaba01,Projected,4.9,9.4,0.2,0.6,3.4,4.6,3.0,9.9,2.8,0.9,1.1,1.7,3.5,13.3,0.518,0.282,0.735
8,5,LaMarcus Aldridge\aldrila01,Projected,8.9,18.0,0.4,1.2,4.2,5.1,3.1,8.7,2.2,0.7,1.3,1.6,2.4,22.4,0.495,0.317,0.829
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
753,384,Thaddeus Young\youngth01,Projected,5.9,12.0,0.7,2.2,0.9,1.5,2.4,7.3,2.1,1.8,0.5,1.6,2.4,13.5,0.497,0.334,0.600
755,385,Cody Zeller\zelleco01,Projected,5.1,9.4,0.2,0.4,3.1,4.4,3.1,9.0,2.0,1.1,1.2,1.6,4.1,13.6,0.548,0.377,0.714
757,386,Tyler Zeller\zellety01,Projected,6.0,11.3,0.4,1.0,2.0,2.9,3.2,9.3,1.9,0.6,1.1,1.6,4.1,14.4,0.530,0.355,0.714
759,387,Ante Žižić\zizican01,Projected,7.4,12.0,0.7,1.9,3.1,4.2,3.0,8.7,2.0,0.7,1.6,1.9,4.0,18.6,0.615,0.374,0.748


In [62]:
actual

Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%
1,1,Álex Abrines\abrinal01,Actual,3.4,9.6,2.5,7.8,0.7,0.8,0.3,2.9,1.2,1.0,0.4,0.9,3.2,10.1,0.357,0.323,0.923
3,2,Quincy Acy\acyqu01,Actual,1.2,5.3,0.6,4.4,2.0,2.9,0.9,7.3,2.3,0.3,1.2,1.2,7.0,5.0,0.222,0.133,0.700
5,3,Steven Adams\adamsst01,Actual,6.5,10.9,0.0,0.0,2.0,3.9,5.3,10.3,1.7,1.6,1.0,1.8,2.8,14.9,0.595,,0.500
7,4,Bam Adebayo\adebaba01,Actual,5.3,9.1,0.1,0.3,3.1,4.3,3.1,11.2,3.5,1.3,1.2,2.3,3.8,13.7,0.576,0.200,0.735
9,5,LaMarcus Aldridge\aldrila01,Actual,9.2,17.7,0.1,0.6,4.7,5.5,3.4,10.0,2.6,0.6,1.4,1.9,2.4,23.1,0.519,0.238,0.847
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
754,384,Thaddeus Young\youngth01,Actual,6.4,12.2,0.7,2.1,1.3,2.0,2.8,7.6,3.0,1.8,0.5,1.8,2.8,14.8,0.527,0.349,0.644
756,385,Cody Zeller\zelleco01,Actual,5.5,10.0,0.2,0.6,3.2,4.1,3.2,9.6,3.0,1.1,1.2,1.8,4.7,14.4,0.551,0.273,0.787
758,386,Tyler Zeller\zellety01,Actual,6.2,11.6,0.0,0.4,5.4,7.0,4.3,9.3,1.5,0.4,1.2,1.5,7.7,17.8,0.533,0.000,0.778
760,387,Ante Žižić\zizican01,Actual,6.1,11.0,0.0,0.0,3.1,4.4,3.6,10.6,1.8,0.4,0.7,2.0,3.8,15.3,0.553,,0.705


In [80]:
# "heat score" for each player: PTS + AST + BLK + STL + TRB (actual values)
actual_heat_list = []

for i, row in actual.iterrows():
    heat_score = 0
    heat_score += row['PTS']
    heat_score += row['AST']
    heat_score += row['BLK']
    heat_score += row['STL']
    heat_score += row['TRB']

    actual_heat_list.append(heat_score)

In [82]:
# the same heat score, computed from the projected values
projected_heat_list = []

for i, row in projections.iterrows():
    heat_score = 0
    heat_score += row['PTS']
    heat_score += row['AST']
    heat_score += row['BLK']
    heat_score += row['STL']
    heat_score += row['TRB']

    projected_heat_list.append(heat_score)

In [83]:
print(len(projected_heat_list))

375


In [81]:
print(len(actual_heat_list))

375


In [104]:
# RMSE between actual and projected heat scores (the two lists are paired by position)
heat_rmse = sqrt(mean_squared_error(actual_heat_list, projected_heat_list))
print(max(actual_heat_list))
print(min(actual_heat_list))
print(np.mean(actual_heat_list))
print(np.median(actual_heat_list))
print(np.std(actual_heat_list))

heat_rmse

72.0
0.0
27.981333333333332
27.200000000000003
7.959847458058198


5.2570587467391565
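
The RMSE above pairs the two lists by position, which works as long as the projected and actual rows stay in the same player order. A minimal sketch of a more defensive variant, pairing rows on the `Player` column instead, is shown below on made-up toy data. The helper name `heat_rmse_by_player` and the toy values are assumptions for illustration; this is an alternative, not what the notebook does.

```python
import numpy as np
import pandas as pd

HEAT_COLS = ['PTS', 'AST', 'BLK', 'STL', 'TRB']


def heat_rmse_by_player(df):
    """RMSE of the heat score, pairing rows on the Player key."""
    proj = df.loc[df['Type'] == 'Projected', ['Player'] + HEAT_COLS]
    act = df.loc[df['Type'] == 'Actual', ['Player'] + HEAT_COLS]
    # inner merge keeps only players present in both sets;
    # validate raises if a player appears more than once on either side
    merged = proj.merge(act, on='Player', suffixes=('_proj', '_act'),
                        validate='one_to_one')
    proj_heat = merged[[c + '_proj' for c in HEAT_COLS]].sum(axis=1)
    act_heat = merged[[c + '_act' for c in HEAT_COLS]].sum(axis=1)
    return float(np.sqrt(np.mean((act_heat - proj_heat) ** 2)))


# toy data: players A and B have both rows, C only has a projection
toy = pd.DataFrame({
    'Player': ['A', 'A', 'B', 'B', 'C'],
    'Type': ['Projected', 'Actual', 'Projected', 'Actual', 'Projected'],
    'PTS': [10.0, 12.0, 20.0, 18.0, 15.0],
    'AST': [2.0, 2.0, 5.0, 5.0, 3.0],
    'BLK': [1.0, 1.0, 0.0, 0.0, 1.0],
    'STL': [1.0, 1.0, 2.0, 2.0, 1.0],
    'TRB': [5.0, 6.0, 4.0, 3.0, 7.0],
})
print(heat_rmse_by_player(toy))
```

In the toy data the heat scores differ by 3 for both matched players, so the RMSE is 3; player C is dropped by the inner merge because there is no actual row to compare against.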

In [89]:
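# per-player percentage error of the heat score; for a single value,
# sqrt(mean(square(x))) reduces to abs(x)
EPSILON = 1e-10  # guards against division by zero when the actual heat score is 0
rmspe_list = []
# note: the range starts at 1, so the player at position 0 is not included
for i in range(1, 375):
    rmspe = (np.sqrt(np.mean(np.square((actual_heat_list[i] - projected_heat_list[i]) / (actual_heat_list[i] + EPSILON))))) * 100
    rmspe_list.append(rmspe)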

In [106]:
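# drop extreme values (actual heat score at or near 0); building a new list
# avoids skipping elements, which can happen when removing during iteration
rmspe_list = [v for v in rmspe_list if v <= 1000]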

print(np.mean(rmspe_list))
print(np.std(rmspe_list))
print(np.median(rmspe_list))

11.436471048960527
16.875079722991075
8.208955223849964


In [63]:
actual_pts_list = actual['PTS'].to_list()
projected_pts_list = projections['PTS'].to_list()

In [64]:
from sklearn.metrics import mean_squared_error

In [65]:
pts_rmse = sqrt(mean_squared_error(actual_pts_list, projected_pts_list))
print(min(actual_pts_list))
print(max(actual_pts_list))
print(np.mean(actual_pts_list))
pts_rmse

0.0
72.0
15.687466666666667


4.574978324174516

In [71]:
actual_ast_list = actual['AST'].to_list()
projected_ast_list = projections['AST'].to_list()

ast_rmse = sqrt(mean_squared_error(actual_ast_list, projected_ast_list))
print(max(actual_ast_list))
print(min(actual_ast_list))
print(np.mean(actual_ast_list))
ast_rmse

10.7
0.0
3.4066666666666667


1.117628441537407

In [70]:
actual_stl_list = actual['STL'].to_list()
projected_stl_list = projections['STL'].to_list()

stl_rmse = sqrt(mean_squared_error(actual_stl_list, projected_stl_list))
print(max(actual_stl_list))
print(min(actual_stl_list))
print(np.mean(actual_stl_list))
stl_rmse

6.8
0.0
1.1205333333333334


0.4836527680061388

In [72]:
actual_blk_list = actual['BLK'].to_list()
projected_blk_list = projections['BLK'].to_list()

blk_rmse = sqrt(mean_squared_error(actual_blk_list, projected_blk_list))
print(max(actual_blk_list))
print(min(actual_blk_list))
print(np.mean(actual_blk_list))
blk_rmse

6.0
0.0
0.7645333333333333


0.5271558909215882

In [107]:
actual_trb_list = actual['TRB'].to_list()
projected_trb_list = projections['TRB'].to_list()

trb_rmse = sqrt(mean_squared_error(actual_trb_list, projected_trb_list))
print(max(actual_trb_list))
print(min(actual_trb_list))
print(np.mean(actual_trb_list))
trb_rmse

26.3
0.0
7.002133333333334


1.7730275425572686

In [None]:
# take an explicit copy so the new columns are not assigned to a slice of marcels_1819
projections = projections.copy()
# 'Player' is stored as "Name\PlayerID"; split on the backslash
projections['PlayerID'] = projections['Player'].str.split('\\').str[1]
projections['Name'] = projections['Player'].str.split('\\').str[0]
projections.head()


Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%,PlayerID,Name
0,1,Álex Abrines\abrinal01,Projected,4.2,10.2,2.8,7.1,1.5,1.7,0.8,3.7,1.3,1.2,0.3,1.0,3.8,12.5,0.407,0.385,0.852,abrinal01,Álex Abrines
2,2,Quincy Acy\acyqu01,Projected,3.9,10.1,2.4,6.7,1.7,2.1,1.2,6.9,1.6,0.9,0.8,1.6,3.9,11.9,0.389,0.358,0.786,acyqu01,Quincy Acy
4,3,Steven Adams\adamsst01,Projected,6.1,10.1,0.1,0.2,2.3,4.0,4.9,9.7,1.4,1.3,1.2,1.9,3.1,14.7,0.609,0.327,0.588,adamsst01,Steven Adams
6,4,Bam Adebayo\adebaba01,Projected,4.9,9.4,0.2,0.6,3.4,4.6,3.0,9.9,2.8,0.9,1.1,1.7,3.5,13.3,0.518,0.282,0.735,adebaba01,Bam Adebayo
8,5,LaMarcus Aldridge\aldrila01,Projected,8.9,18.0,0.4,1.2,4.2,5.1,3.1,8.7,2.2,0.7,1.3,1.6,2.4,22.4,0.495,0.317,0.829,aldrila01,LaMarcus Aldridge


In [None]:
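# the counting stats are per 36 minutes; divide by 2160 (36 minutes * 60 seconds)
# to get per-second rates, leaving percentages and identifier columns unchanged
projections_per_sec = projections.apply(lambda x: (x / 2160) if x.name in ['FG', 'FGA', '3P', '3PA', 'FT', 'FTA', 'ORB', 'TRB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS'] else x)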

projections_per_sec.head()

Unnamed: 0,Rk,Player,Type,FG,FGA,3P,3PA,FT,FTA,ORB,TRB,AST,STL,BLK,TOV,PF,PTS,FG%,3P%,FT%,PlayerID,Name
0,1,Álex Abrines\abrinal01,Projected,0.001944,0.004722,0.001296,0.003287,0.000694,0.000787,0.00037,0.001713,0.000602,0.000556,0.000139,0.000463,0.001759,0.005787,0.407,0.385,0.852,abrinal01,Álex Abrines
2,2,Quincy Acy\acyqu01,Projected,0.001806,0.004676,0.001111,0.003102,0.000787,0.000972,0.000556,0.003194,0.000741,0.000417,0.00037,0.000741,0.001806,0.005509,0.389,0.358,0.786,acyqu01,Quincy Acy
4,3,Steven Adams\adamsst01,Projected,0.002824,0.004676,4.6e-05,9.3e-05,0.001065,0.001852,0.002269,0.004491,0.000648,0.000602,0.000556,0.00088,0.001435,0.006806,0.609,0.327,0.588,adamsst01,Steven Adams
6,4,Bam Adebayo\adebaba01,Projected,0.002269,0.004352,9.3e-05,0.000278,0.001574,0.00213,0.001389,0.004583,0.001296,0.000417,0.000509,0.000787,0.00162,0.006157,0.518,0.282,0.735,adebaba01,Bam Adebayo
8,5,LaMarcus Aldridge\aldrila01,Projected,0.00412,0.008333,0.000185,0.000556,0.001944,0.002361,0.001435,0.004028,0.001019,0.000324,0.000602,0.000741,0.001111,0.01037,0.495,0.317,0.829,aldrila01,LaMarcus Aldridge


In [None]:
projections_per_sec.to_csv('/Users/seandiehl/Downloads/projections_per_sec.csv')