# Machine Learning to Project Major League Baseball WAR 


 *by Zachary Schall, May 12, 2021*

## Introduction

In Major League Baseball, there are many ways to evaluate the performance of an individual player. Typically, the quality of a player is determined by a combination of counting statistics and rate statistics. An example of a counting statistic is the homerun, while an example of a rate statistic is the batting average.

A relatively new statistic, the Wins Above Replacement (WAR), is a convenient way to summarize the value of a player above the value of a so-called "replacement-player," or value of a typical upper-level Minor League player. Without going into too much detail, the WAR of a player correlates to the expected amount of wins a player is expected to add to a team composed entirely of replacement-level players. For example, if a team with 0 cumulative WAR (48 expected wins) added a 2 WAR player, the team would now be expected to win 50 games.

There are two major ways that a player's WAR is calculated. For this project's purposes, I will be focusing on Fangraphs.com WAR formula for position players. More information can be found at https://library.fangraphs.com/misc/war/

A new technology called Statcast was implemented in every Major League ballpark in 2015. This technology has driven a new wave of data-driven player evaluation, particularly for pitchers, that has proven so effective that many baseball writers and analysts feel that rules must be changed to restore the balance of power between batters and pitchers. The primary feature of Statcast is its focus on physical outcomes, as opposed to bottom-line results.

For my project, I will be using Statcast data to create a new projection system for a player's WAR for the following season. Specifically, I will seek to answer the question of whether a player's raw physical batted-ball data is effective in evaluating their future WAR. If I am successful, I will demonstrate that traditional bottom line results are unnecessary for player evaluation, and may lead to new ways for baseball teams to find competetive advantages in an increasingly evolving game. I will be evaluating my projection system by comparing against Fangraphs.com's longstanding ZiPS WAR position player projections for the 2021 season. 

## Methods

First, I will import helpful python modules. In addition to the necessary pandas and numpy, I will be importing the neuralnetworks.py and optimizers.py modules that were provided in the Canvas Files page. 

I will import the pybaseball and baseball_scraper modules from https://github.com/jldbc/pybaseball and https://pypi.org/project/baseball-scraper/ in order to help me gather baseball data from online resources.

I will be using SQL commands to help me organize my data. I work with SQL commands more quickly than with pandas commands, and find them very helpful for working with large amounts of data.

In [641]:
import pybaseball
import pandas
import numpy as np
import baseball_scraper
import optimizers
import neuralnetworks
from pandasql import sqldf as sql
from IPython.display import display

I next save the statcast data that I will be working with from https://baseballsavant.mlb.com/leaderboard/ in the file `batter_stats.csv`. This data is from the 2019 season and includes results from all position players with at least 50 plate appearances.

In [2]:
batter_statcast = pandas.read_csv('batter_stats.csv')

In [3]:
batter_statcast

Unnamed: 0,last_name,first_name,player_id,year,xba,xslg,woba,xwoba,xobp,xiso,...,flyballs_percent,flyballs,linedrives_percent,linedrives,popups_percent,popups,n_bolts,hp_to_1b,sprint_speed,Unnamed: 68
0,Pujols,Albert,405395,2019,0.247,0.419,0.308,0.314,0.310,0.172,...,26.7,115,16.9,73,10.2,44,,5.02,22.5,
1,Cabrera,Miguel,408234,2019,0.266,0.433,0.318,0.328,0.334,0.167,...,22.3,87,27.9,109,4.9,19,,4.95,23.5,
2,Mathis,Jeff,425772,2019,0.158,0.215,0.190,0.189,0.209,0.057,...,23.2,33,23.9,34,12.7,18,,4.60,25.7,
3,Choo,Shin-Soo,425783,2019,0.262,0.473,0.353,0.361,0.369,0.212,...,22.3,89,23.6,94,3.5,14,,4.35,26.6,
4,Molina,Yadier,425877,2019,0.264,0.425,0.303,0.315,0.309,0.161,...,22.1,81,30.9,113,7.4,27,,4.93,22.5,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
446,Smith,Will,669257,2019,0.221,0.443,0.369,0.319,0.311,0.221,...,38.0,46,23.1,28,9.9,12,,4.45,27.6,
447,Hiura,Keston,669374,2019,0.249,0.501,0.388,0.344,0.320,0.252,...,27.9,58,28.4,59,6.3,13,,4.53,27.0,
448,Lopez,Nicky,670032,2019,0.213,0.270,0.260,0.233,0.250,0.057,...,15.1,50,17.5,58,3.9,13,,4.13,28.5,
449,Alvarez,Yordan,670541,2019,0.286,0.592,0.434,0.409,0.391,0.306,...,30.8,68,28.1,62,3.6,8,,4.35,27.0,


I will next save the traditional Fangraphs (non-Statcast) data from https://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=y&type=8&season=2019&month=0&season1=2019&ind=0 in the file `Fangraphs.csv`. I am only interested in the WAR from this data, so I will select a subset containing only the player's name and WAR.

In [1078]:
traditional_stats = pandas.read_csv('Fangraphs.csv')
war_2019 = sql('SELECT Name, WAR FROM traditional_stats')
display(war_2019)

Unnamed: 0,Name,WAR
0,Alex Bregman,8.5
1,Mike Trout,8.5
2,Christian Yelich,7.8
3,Cody Bellinger,7.8
4,Marcus Semien,7.6
...,...,...
446,Christin Stewart,-1.3
447,Curtis Granderson,-1.3
448,Elias Diaz,-1.5
449,Lewis Brinson,-1.7


In order to append each player's fangraphs.com WAR to their associated Statcast data, I must add a column to the Statcast dataframe with a column including the player's full name, in addition to the `first_name` and `last_name` columns. I do this because the fangraphs dataframe stores each player's name in a column containing the full name. Due to a bug with how the player's name is stored, I must remove the first character from each name.

In [10]:
full_names = batter_statcast.iloc[:, 1].astype(str) + ' ' + batter_statcast['last_name'].astype(str)
batter_statcast.insert(0, 'Name', full_names, True)
batter_statcast['Name'] = batter_statcast['Name'].str[1:]

Next, I will append each player's WAR to their Statcast data. I will specify the relevant statistics I wish to use to train my neural network. I am exclusively focusing on statistics that are agnostic of bottom-line results (such as a double), and choosing statistics that capture the physics of the batted ball. I will also include each player's sprint speed, as I believe it to be a component of the physics of the outcome.

In [1079]:
batter_statcast_with_trad_war = sql('SELECT * FROM batter_statcast NATURAL JOIN war_2019')
relevant_columns = ['exit_velocity_avg', 
                    'launch_angle_avg', 
                    'sweet_spot_percent',
                    'barrels',
                    'barrel_batted_rate',
                    'solidcontact_percent',
                    'flareburner_percent',
                    'poorlyunder_percent',
                    'poorlytopped_percent',
                    'poorlyweak_percent',
                    'hard_hit_percent',
                    'pull_percent',
                    'straightaway_percent',
                    'opposite_percent',
                    'batted_ball',
                    'groundballs_percent',
                    'groundballs',
                    'flyballs_percent',
                    'flyballs',
                    'linedrives_percent',
                    'linedrives',
                    'popups_percent',
                    'popups',
                    'sprint_speed'
                   ]

I next borrow code from A2. I will be using the `partition()` function to aid in training my neural network.

In [34]:
def partition(X, T, n_folds, random_shuffle=True):
    
    # reorder samples
    
    rows = np.arange(X.shape[0])
    np.random.shuffle(rows)
    X = X[rows, :]
    T = T[rows, :]
    
    # partition into 5 folds
    
    num_folds = n_folds
    num_samples = X.shape[0]
    num_per_fold = num_samples // num_folds
    num_last_fold = num_samples - num_per_fold * (num_folds - 1)
    
    folds = []
    start = 0
    for fold_index in range(num_folds - 1):
        folds.append((X[start:start + num_per_fold, :], T[start:start + num_per_fold, :]))
        start += num_per_fold
    folds.append((X[start:, :], T[start:, :]))
    
    # assign folds to validate, test, training sets
    
    Xvalidate, Tvalidate = folds[0]
    Xtest, Ttest = folds[1]
    Xtrain, Ttrain = np.vstack([X for (X, _) in folds[2:]]), np.vstack([T for (_, T) in folds[2:]])
    
    return Xtrain, Ttrain, Xvalidate, Tvalidate, Xtest, Ttest

I again borrow code from A2 to experiment on various hyperparameters for my neural network. I have added the learning rate as a hyperparameter.

In [35]:
def run_experiment(X, T, n_folds, 
                   n_epochs_choices, 
                   n_hidden_units_per_layer_choices, 
                   activation_function_choices,
                   learning_rate_choices):
    
    def rmse(A, B):
        return np.sqrt(np.mean((A - B)**2))
    

    Xtrain, Ttrain, Xvalidate, Tvalidate, Xtest, Ttest = partition(X, T, n_folds)
    print(Xtrain)
    
    n_inputs = X.shape[1]
    n_outputs = T.shape[1]
    
    # hold the various values for the dataframe
    epochs = []
    nh = []
    lr = []
    act_func = []
    RMSE_Train = []
    RMSE_Val = []
    RMSE_Test = []
    
    
    # make sure [0] is an option for hidden layer choices
    
    if [0] not in n_hidden_units_per_layer_choices:
        print("test")
        n_hidden_units_per_layer_choices.insert(0, [0])
    
    for learning_rate in learning_rate_choices:
        for n_epochs in n_epochs_choices:
            for n_hidden_units_per_layer in n_hidden_units_per_layer_choices:
                for activation_function in activation_function_choices:
                
                    neural_net = neuralnetworks.NeuralNetwork(n_inputs, 
                                               n_hidden_units_per_layer, 
                                               n_outputs, 
                                               activation_function)
                
                    neural_net.train(Xtrain, Ttrain, n_epochs, learning_rate, method='adam')
                    train_results = rmse(Ttrain, neural_net.use(Xtrain))
                    validation_results = rmse(Tvalidate, neural_net.use(Xvalidate))
                    test_results = rmse(Ttest, neural_net.use(Xtest))
                
                    epochs.append(n_epochs)
                    nh.append(n_hidden_units_per_layer)
                    lr.append(learning_rate)
                    act_func.append(activation_function)
                    RMSE_Train.append(train_results)
                    RMSE_Val.append(validation_results)
                    RMSE_Test.append(test_results)
                
    data = {'epochs':epochs, 
            'nh':nh, 
            'lr':lr,
            'act func':act_func, 
            'RMSE Train':RMSE_Train, 
            'RMSE Val':RMSE_Val,
            'RMSE Test':RMSE_Test
           }
    
    results = pandas.DataFrame(data)
    
    return results

Below, I choose the batter's WAR and Name from the Statcast dataframe. This will be the target vector for my `train()` method. 

I use the same dataframe to select the previously mentioned relevant statistics to form my `X` matrix. I use the same dataframe for both `X` and `T` to ensure that the rows are correctly aligned.

In [613]:
player_wars_2019 = batter_statcast_with_trad_war[['Name', 'WAR']]
relevant_physical_stats = batter_statcast_with_trad_war[relevant_columns]
relevant_physical_stats.isna().sum()

exit_velocity_avg       0
launch_angle_avg        0
sweet_spot_percent      0
barrels                 0
barrel_batted_rate      0
solidcontact_percent    0
flareburner_percent     0
poorlyunder_percent     0
poorlytopped_percent    0
poorlyweak_percent      0
hard_hit_percent        0
pull_percent            0
straightaway_percent    0
opposite_percent        0
batted_ball             0
groundballs_percent     0
groundballs             0
flyballs_percent        0
flyballs                0
linedrives_percent      0
linedrives              0
popups_percent          0
popups                  0
sprint_speed            0
dtype: int64

I next proceed to run the experiment. The results are shown below. Curiously, lower Training set RMSE's seem to correlate with higher Validation set RMSE's.

In [47]:
T = player_wars_2019.to_numpy().reshape((-1, 1))
X = relevant_physical_stats.to_numpy()

np.random.seed(42)
result_df = run_experiment(X, T, n_folds=6, 
                           n_epochs_choices=[50, 100, 500, 1000, 2000],
                           n_hidden_units_per_layer_choices=[[], [5], [5,5,5], [10,10], [100, 100], [100, 100, 100], [10, 10, 10, 10]],
                           activation_function_choices=['relu', 'tanh'],
                           learning_rate_choices=[0.1, 0.01, 0.001])
result_df

(438, 1)
[[87.4 14.2 37.4 ...  7.9 21.  29.4]
 [89.3 12.4 21.2 ... 12.9 11.  26.2]
 [91.1  7.7 36.3 ...  2.4  9.  28.7]
 ...
 [91.2 17.8 39.8 ...  7.5 34.  28.8]
 [85.1  3.3 22.3 ...  3.9 13.  28.5]
 [89.5 17.4 32.3 ... 11.7 50.  25.6]]
test
Adam: Epoch 5 Error=1.29306
Adam: Epoch 10 Error=1.21881
Adam: Epoch 15 Error=1.16869
Adam: Epoch 20 Error=1.14840
Adam: Epoch 25 Error=1.14894
Adam: Epoch 30 Error=1.14787
Adam: Epoch 35 Error=1.14281
Adam: Epoch 40 Error=1.13883
Adam: Epoch 45 Error=1.13676
Adam: Epoch 50 Error=1.13594
Adam: Epoch 5 Error=1.86485
Adam: Epoch 10 Error=1.49159
Adam: Epoch 15 Error=1.34785
Adam: Epoch 20 Error=1.26746
Adam: Epoch 25 Error=1.22429
Adam: Epoch 30 Error=1.19585
Adam: Epoch 35 Error=1.17582
Adam: Epoch 40 Error=1.15968
Adam: Epoch 45 Error=1.14850
Adam: Epoch 50 Error=1.14203
Adam: Epoch 5 Error=1.65083
Adam: Epoch 10 Error=1.41528
Adam: Epoch 15 Error=1.26714
Adam: Epoch 20 Error=1.20681
Adam: Epoch 25 Error=1.17330
Adam: Epoch 30 Error=1.15450
Adam: E

Adam: Epoch 100 Error=1.93917
Adam: Epoch 10 Error=2.80560
Adam: Epoch 20 Error=1.98834
Adam: Epoch 30 Error=1.93494
Adam: Epoch 40 Error=1.93393
Adam: Epoch 50 Error=1.91924
Adam: Epoch 60 Error=1.91829
Adam: Epoch 70 Error=1.91782
Adam: Epoch 80 Error=1.91784
Adam: Epoch 90 Error=1.91777
Adam: Epoch 100 Error=1.91773
Adam: Epoch 10 Error=1.94849
Adam: Epoch 20 Error=1.95164
Adam: Epoch 30 Error=1.92317
Adam: Epoch 40 Error=1.92278
Adam: Epoch 50 Error=1.92238
Adam: Epoch 60 Error=1.92105
Adam: Epoch 70 Error=1.92124
Adam: Epoch 80 Error=1.92105
Adam: Epoch 90 Error=1.92107
Adam: Epoch 100 Error=1.92105
Adam: Epoch 10 Error=1.53946
Adam: Epoch 20 Error=1.45472
Adam: Epoch 30 Error=1.37757
Adam: Epoch 40 Error=1.30088
Adam: Epoch 50 Error=1.22790
Adam: Epoch 60 Error=1.16005
Adam: Epoch 70 Error=1.14719
Adam: Epoch 80 Error=1.07732
Adam: Epoch 90 Error=1.04088
Adam: Epoch 100 Error=1.01868
Adam: Epoch 50 Error=1.13995
Adam: Epoch 100 Error=1.13446
Adam: Epoch 150 Error=1.13388
Adam: Ep

Adam: Epoch 1000 Error=1.92105
Adam: Epoch 100 Error=0.66662
Adam: Epoch 200 Error=0.52009
Adam: Epoch 300 Error=0.41403
Adam: Epoch 400 Error=0.38354
Adam: Epoch 500 Error=0.35435
Adam: Epoch 600 Error=0.32813
Adam: Epoch 700 Error=0.33703
Adam: Epoch 800 Error=0.33674
Adam: Epoch 900 Error=0.31961
Adam: Epoch 1000 Error=0.30077
Adam: Epoch 100 Error=1.92108
Adam: Epoch 200 Error=1.92105
Adam: Epoch 300 Error=1.92105
Adam: Epoch 400 Error=1.92105
Adam: Epoch 500 Error=1.92105
Adam: Epoch 600 Error=1.92105
Adam: Epoch 700 Error=1.92105
Adam: Epoch 800 Error=1.92105
Adam: Epoch 900 Error=1.92105
Adam: Epoch 1000 Error=1.92105
Adam: Epoch 100 Error=0.38317
Adam: Epoch 200 Error=0.17725
Adam: Epoch 300 Error=0.13167
Adam: Epoch 400 Error=0.13092
Adam: Epoch 500 Error=0.26246
Adam: Epoch 600 Error=0.23038
Adam: Epoch 700 Error=0.10550
Adam: Epoch 800 Error=0.61717
Adam: Epoch 900 Error=0.31596
Adam: Epoch 1000 Error=0.31595
Adam: Epoch 100 Error=1.93637
Adam: Epoch 200 Error=1.92105
Adam: 

Adam: Epoch 5 Error=1.30707
Adam: Epoch 10 Error=1.25677
Adam: Epoch 15 Error=1.23329
Adam: Epoch 20 Error=1.21493
Adam: Epoch 25 Error=1.19275
Adam: Epoch 30 Error=1.17332
Adam: Epoch 35 Error=1.15731
Adam: Epoch 40 Error=1.14327
Adam: Epoch 45 Error=1.12956
Adam: Epoch 50 Error=1.11571
Adam: Epoch 5 Error=1.76355
Adam: Epoch 10 Error=1.61961
Adam: Epoch 15 Error=1.49805
Adam: Epoch 20 Error=1.38761
Adam: Epoch 25 Error=1.29559
Adam: Epoch 30 Error=1.21903
Adam: Epoch 35 Error=1.16655
Adam: Epoch 40 Error=1.14719
Adam: Epoch 45 Error=1.14591
Adam: Epoch 50 Error=1.14181
Adam: Epoch 5 Error=1.84739
Adam: Epoch 10 Error=1.60141
Adam: Epoch 15 Error=1.45877
Adam: Epoch 20 Error=1.41539
Adam: Epoch 25 Error=1.38170
Adam: Epoch 30 Error=1.35581
Adam: Epoch 35 Error=1.34711
Adam: Epoch 40 Error=1.33745
Adam: Epoch 45 Error=1.32799
Adam: Epoch 50 Error=1.31222
Adam: Epoch 5 Error=1.98087
Adam: Epoch 10 Error=1.82718
Adam: Epoch 15 Error=1.78338
Adam: Epoch 20 Error=1.72790
Adam: Epoch 25 Err

Adam: Epoch 50 Error=1.07986
Adam: Epoch 100 Error=0.98419
Adam: Epoch 150 Error=0.94913
Adam: Epoch 200 Error=0.92103
Adam: Epoch 250 Error=0.90412
Adam: Epoch 300 Error=0.88843
Adam: Epoch 350 Error=0.88292
Adam: Epoch 400 Error=0.88154
Adam: Epoch 450 Error=0.87922
Adam: Epoch 500 Error=0.87739
Adam: Epoch 50 Error=1.10646
Adam: Epoch 100 Error=0.95763
Adam: Epoch 150 Error=0.89633
Adam: Epoch 200 Error=0.85625
Adam: Epoch 250 Error=0.83146
Adam: Epoch 300 Error=0.81275
Adam: Epoch 350 Error=0.79073
Adam: Epoch 400 Error=0.77380
Adam: Epoch 450 Error=0.76188
Adam: Epoch 500 Error=0.75373
Adam: Epoch 50 Error=1.45278
Adam: Epoch 100 Error=1.19945
Adam: Epoch 150 Error=1.13958
Adam: Epoch 200 Error=1.10065
Adam: Epoch 250 Error=1.05309
Adam: Epoch 300 Error=1.01484
Adam: Epoch 350 Error=0.98331
Adam: Epoch 400 Error=0.96475
Adam: Epoch 450 Error=0.94661
Adam: Epoch 500 Error=0.93299
Adam: Epoch 50 Error=1.27226
Adam: Epoch 100 Error=1.04950
Adam: Epoch 150 Error=0.88001
Adam: Epoch 20

Adam: Epoch 600 Error=0.33289
Adam: Epoch 700 Error=0.28919
Adam: Epoch 800 Error=0.26333
Adam: Epoch 900 Error=0.23734
Adam: Epoch 1000 Error=0.21418
Adam: Epoch 200 Error=1.13379
Adam: Epoch 400 Error=1.13371
Adam: Epoch 600 Error=1.13368
Adam: Epoch 800 Error=1.13365
Adam: Epoch 1000 Error=1.13361
Adam: Epoch 1200 Error=1.13357
Adam: Epoch 1400 Error=1.13353
Adam: Epoch 1600 Error=1.13349
Adam: Epoch 1800 Error=1.13345
Adam: Epoch 2000 Error=1.13341
Adam: Epoch 200 Error=1.13381
Adam: Epoch 400 Error=1.13371
Adam: Epoch 600 Error=1.13365
Adam: Epoch 800 Error=1.13360
Adam: Epoch 1000 Error=1.13355
Adam: Epoch 1200 Error=1.13350
Adam: Epoch 1400 Error=1.13345
Adam: Epoch 1600 Error=1.13341
Adam: Epoch 1800 Error=1.13338
Adam: Epoch 2000 Error=1.13335
Adam: Epoch 200 Error=1.13460
Adam: Epoch 400 Error=1.13384
Adam: Epoch 600 Error=1.13378
Adam: Epoch 800 Error=1.13374
Adam: Epoch 1000 Error=1.13369
Adam: Epoch 1200 Error=1.13363
Adam: Epoch 1400 Error=1.13358
Adam: Epoch 1600 Error=1

Adam: Epoch 25 Error=3.80833
Adam: Epoch 30 Error=2.71550
Adam: Epoch 35 Error=1.53678
Adam: Epoch 40 Error=1.63023
Adam: Epoch 45 Error=1.70492
Adam: Epoch 50 Error=1.41823
Adam: Epoch 5 Error=8.49619
Adam: Epoch 10 Error=7.17232
Adam: Epoch 15 Error=6.11293
Adam: Epoch 20 Error=5.28738
Adam: Epoch 25 Error=4.65437
Adam: Epoch 30 Error=4.17074
Adam: Epoch 35 Error=3.80021
Adam: Epoch 40 Error=3.51331
Adam: Epoch 45 Error=3.28787
Adam: Epoch 50 Error=3.10742
Adam: Epoch 5 Error=2.07875
Adam: Epoch 10 Error=1.96553
Adam: Epoch 15 Error=1.86275
Adam: Epoch 20 Error=1.77025
Adam: Epoch 25 Error=1.69019
Adam: Epoch 30 Error=1.62357
Adam: Epoch 35 Error=1.56986
Adam: Epoch 40 Error=1.52755
Adam: Epoch 45 Error=1.49463
Adam: Epoch 50 Error=1.46898
Adam: Epoch 10 Error=1.30447
Adam: Epoch 20 Error=1.27244
Adam: Epoch 30 Error=1.24863
Adam: Epoch 40 Error=1.23116
Adam: Epoch 50 Error=1.21817
Adam: Epoch 60 Error=1.20814
Adam: Epoch 70 Error=1.20018
Adam: Epoch 80 Error=1.19370
Adam: Epoch 90 E

Adam: Epoch 250 Error=1.16246
Adam: Epoch 300 Error=1.13042
Adam: Epoch 350 Error=1.09678
Adam: Epoch 400 Error=1.06045
Adam: Epoch 450 Error=1.02136
Adam: Epoch 500 Error=0.97842
Adam: Epoch 50 Error=2.28571
Adam: Epoch 100 Error=1.60070
Adam: Epoch 150 Error=1.56424
Adam: Epoch 200 Error=1.52649
Adam: Epoch 250 Error=1.48811
Adam: Epoch 300 Error=1.44990
Adam: Epoch 350 Error=1.41243
Adam: Epoch 400 Error=1.37607
Adam: Epoch 450 Error=1.34086
Adam: Epoch 500 Error=1.30685
Adam: Epoch 50 Error=1.44386
Adam: Epoch 100 Error=1.24170
Adam: Epoch 150 Error=1.21746
Adam: Epoch 200 Error=1.19891
Adam: Epoch 250 Error=1.18141
Adam: Epoch 300 Error=1.16297
Adam: Epoch 350 Error=1.14314
Adam: Epoch 400 Error=1.12234
Adam: Epoch 450 Error=1.10031
Adam: Epoch 500 Error=1.07472
Adam: Epoch 50 Error=11.14681
Adam: Epoch 100 Error=4.34722
Adam: Epoch 150 Error=1.98468
Adam: Epoch 200 Error=1.54355
Adam: Epoch 250 Error=1.51512
Adam: Epoch 300 Error=1.50572
Adam: Epoch 350 Error=1.49592
Adam: Epoch 

Adam: Epoch 2000 Error=0.91419
Adam: Epoch 200 Error=1.20058
Adam: Epoch 400 Error=1.08798
Adam: Epoch 600 Error=1.02135
Adam: Epoch 800 Error=0.97850
Adam: Epoch 1000 Error=0.94536
Adam: Epoch 1200 Error=0.91657
Adam: Epoch 1400 Error=0.88851
Adam: Epoch 1600 Error=0.86211
Adam: Epoch 1800 Error=0.84181
Adam: Epoch 2000 Error=0.82717
Adam: Epoch 200 Error=1.44037
Adam: Epoch 400 Error=1.09097
Adam: Epoch 600 Error=1.00844
Adam: Epoch 800 Error=0.95406
Adam: Epoch 1000 Error=0.91479
Adam: Epoch 1200 Error=0.89150
Adam: Epoch 1400 Error=0.87543
Adam: Epoch 1600 Error=0.86453
Adam: Epoch 1800 Error=0.86076
Adam: Epoch 2000 Error=0.85829
Adam: Epoch 200 Error=1.31862
Adam: Epoch 400 Error=1.09762
Adam: Epoch 600 Error=0.96081
Adam: Epoch 800 Error=0.86183
Adam: Epoch 1000 Error=0.79781
Adam: Epoch 1200 Error=0.75258
Adam: Epoch 1400 Error=0.70222
Adam: Epoch 1600 Error=0.65500
Adam: Epoch 1800 Error=0.61918
Adam: Epoch 2000 Error=0.59618
Adam: Epoch 200 Error=1.52871
Adam: Epoch 400 Error

Unnamed: 0,epochs,nh,lr,act func,RMSE Train,RMSE Val,RMSE Test
0,50,[0],0.100,relu,1.135743,1.178001,1.059649
1,50,[0],0.100,tanh,1.141211,1.180558,1.084321
2,50,[],0.100,relu,1.140229,1.169561,1.074819
3,50,[],0.100,tanh,1.136747,1.188751,1.057617
4,50,[5],0.100,relu,1.051613,1.106494,1.042951
...,...,...,...,...,...,...,...
235,2000,"[100, 100]",0.001,tanh,0.011406,1.720417,1.835088
236,2000,"[100, 100, 100]",0.001,relu,1.206457,1.181844,1.074808
237,2000,"[100, 100, 100]",0.001,tanh,0.056731,2.020063,2.294899
238,2000,"[10, 10, 10, 10]",0.001,relu,1.061059,1.160408,1.020697


I now find the best combination of hyperparameters. It appears that three hidden layyers, each with 100 neurons, with a low learning rate is most effective in predicting the WAR.

In [49]:
mymin = result_df['RMSE Val'].min()
result_df.loc[result_df['RMSE Val'] == mymin]

Unnamed: 0,epochs,nh,lr,act func,RMSE Train,RMSE Val,RMSE Test
189,100,"[100, 100, 100]",0.001,tanh,1.290968,1.064524,1.113882


Next, for fun I check how the network performs when I have it try to predict the WAR with the full dataset. I ramp up the epochs to finely tune the network.

In [53]:
n_inputs = X.shape[1]
n_outputs = T.shape[1]

neural_net = neuralnetworks.NeuralNetwork(n_inputs, 
                                           [100, 100, 100], 
                                           n_outputs, 
                                           'relu')
                
neural_net.train(X, T, 50000, 0.001, method='adam')
predicted_WAR = neural_net.use(X)

Adam: Epoch 5000 Error=1.10947
Adam: Epoch 10000 Error=0.39557
Adam: Epoch 15000 Error=0.10894
Adam: Epoch 20000 Error=0.04235
Adam: Epoch 25000 Error=0.02334
Adam: Epoch 30000 Error=0.01471
Adam: Epoch 35000 Error=0.01689
Adam: Epoch 40000 Error=0.00166
Adam: Epoch 45000 Error=0.00006
Adam: Epoch 50000 Error=0.00018


It looks like the network has very low error. We now check the final results to see how it did.

In [82]:
real = player_wars.tolist()
predicted_WAR_1d = predicted_WAR.reshape((predicted_WAR.shape[0],))
predicted = np.round(predicted_WAR_1d, 1).tolist()
names = batter_statcast_with_trad_war['Name']

real_vs_predicted = {
    "Actual WAR": real,
    "Predicted WAR": predicted
}

real_vs_predicted = pandas.DataFrame(real_vs_predicted)
real_vs_predicted.insert(0, 'Name', names)

In [104]:
sql('select * from real_vs_predicted')

Unnamed: 0,Name,Actual WAR,Predicted WAR
0,Albert Pujols,-0.5,-0.5
1,Miguel Cabrera,-0.4,-0.4
2,Jeff Mathis,-2.1,-2.1
3,Shin-Soo Choo,1.7,1.7
4,Yadier Molina,1.2,1.2
...,...,...,...
433,Will Smith,1.7,1.7
434,Keston Hiura,2.2,2.2
435,Nicky Lopez,-0.2,-0.2
436,Yordan Alvarez,3.8,3.8


It appears that the network is practically perfect at predicting WAR from its relevant statistics. Lets see if we can replicate this result on another MLB season. Below, we will repeat the entire process and come back to check the results.

In [1080]:
traditional_stats_2020 = pandas.read_csv('Fangraphs_2020.csv')
war_2020 = sql('SELECT Name, WAR FROM traditional_stats_2020')

In [115]:
batter_statcast_2020 = pandas.read_csv('statcast_2020.csv')

In [117]:
full_names_2020 = batter_statcast_2020.iloc[:, 1].astype(str) + ' ' + batter_statcast_2020['last_name'].astype(str)
full_names_2020
batter_statcast_2020.insert(0, 'Name', full_names_2020, True)
batter_statcast_2020['Name'] = batter_statcast_2020['Name'].str[1:]

In [119]:
batter_statcast_with_trad_war_2020 = sql('SELECT * FROM batter_statcast_2020 NATURAL JOIN war_2020')
relevant_physical_stats_2020 = batter_statcast_with_trad_war_2020[relevant_columns]

relevant_physical_stats_2020.isna().sum()

exit_velocity_avg       0
launch_angle_avg        0
sweet_spot_percent      0
barrels                 0
barrel_batted_rate      0
solidcontact_percent    0
flareburner_percent     0
poorlyunder_percent     0
poorlytopped_percent    0
poorlyweak_percent      0
hard_hit_percent        0
pull_percent            0
straightaway_percent    0
opposite_percent        0
batted_ball             0
groundballs_percent     0
groundballs             0
flyballs_percent        0
flyballs                0
linedrives_percent      0
linedrives              0
popups_percent          0
popups                  0
sprint_speed            0
dtype: int64

In [120]:
player_wars_2020 = batter_statcast_with_trad_war_2020['WAR']
T_2020 = player_wars_2020.to_numpy().reshape((-1, 1))
X_2020 = relevant_physical_stats_2020.to_numpy()

In [121]:
n_inputs = X_2020.shape[1]
n_outputs = T_2020.shape[1]

neural_net = neuralnetworks.NeuralNetwork(n_inputs, 
                                           [100, 100, 100], 
                                           n_outputs, 
                                           'relu')
                
neural_net.train(X_2020, T_2020, 50000, 0.001, method='adam')
predicted_WAR_2020 = neural_net.use(X_2020)

Adam: Epoch 5000 Error=0.56279
Adam: Epoch 10000 Error=0.04822
Adam: Epoch 15000 Error=0.01972
Adam: Epoch 20000 Error=0.01615
Adam: Epoch 25000 Error=0.07105
Adam: Epoch 30000 Error=0.00226
Adam: Epoch 35000 Error=0.00019
Adam: Epoch 40000 Error=0.00089
Adam: Epoch 45000 Error=0.00006
Adam: Epoch 50000 Error=0.00138


In [123]:
real = player_wars_2020.tolist()
predicted_WAR_1d = predicted_WAR_2020.reshape((predicted_WAR_2020.shape[0],))
predicted = np.round(predicted_WAR_1d, 1).tolist()
names = batter_statcast_with_trad_war_2020['Name']

real_vs_predicted = {
    "Actual WAR": real,
    "Predicted WAR": predicted
}

real_vs_predicted = pandas.DataFrame(real_vs_predicted)
real_vs_predicted.insert(0, 'Name', names)

In [1081]:
real_vs_predicted

Unnamed: 0,Name,Actual WAR,Predicted WAR
0,Jeff Mathis,0.2,0.2
1,Edwin Encarnacion,-0.3,-0.3
2,David Peralta,1.0,1.0
3,Dexter Fowler,0.0,-0.0
4,Tyler Flowers,0.4,0.4
...,...,...,...
397,Randy Arozarena,0.8,0.8
398,Austin Hays,0.6,0.6
399,Mike Brosseau,1.1,1.1
400,Luis Garcia,-0.4,-0.4


Once again, it appears like the Statcast data is able to perfectly estimate the player's WAR. We can be very confident that the network is able to accurately predict the true WAR given the Statcast data, but can it predict the **following** year's WAR? Let's find out! 

We now pull the Statcast data from all of the seasons where Statcast was available. 

In [127]:
statcast_2015_2021 = pandas.read_csv('statcast_2015-2021.csv')

In [128]:
statcast_2015_2021

Unnamed: 0,last_name,first_name,player_id,year,exit_velocity_avg,launch_angle_avg,sweet_spot_percent,barrels,barrel_batted_rate,solidcontact_percent,...,groundballs_percent,groundballs,flyballs_percent,flyballs,linedrives_percent,linedrives,popups_percent,popups,sprint_speed,Unnamed: 28
0,Beltre,Adrian,134181,2018,88.7,13.6,39.7,22,6.4,7.0,...,39.1,135,25.5,88,30.1,104,5.2,18,24.8,
1,Martinez,Victor,400121,2018,87.8,14.1,35.1,20,4.7,4.9,...,41.6,177,24.9,106,26.6,113,6.8,29,23.2,
2,Utley,Chase,400284,2018,87.7,17.2,32.1,3,2.3,3.8,...,42.7,56,27.5,36,24.4,32,5.3,7,27.2,
3,Pujols,Albert,405395,2018,90.0,14.1,34.5,25,6.2,6.7,...,40.9,165,22.6,91,26.3,106,10.2,41,22.2,
4,Holliday,Matt,407812,2018,90.9,12.8,28.6,2,5.7,8.6,...,42.9,15,28.6,10,22.9,8,5.7,2,25.2,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3485,Happ,Ian,664023,2017,88.5,13.3,37.3,32,13.3,5.0,...,41.5,100,26.1,63,26.6,64,5.8,14,28.9,
3486,Bader,Harrison,664056,2017,86.3,8.6,27.4,6,9.7,3.2,...,45.2,28,32.3,20,19.4,12,3.2,2,30.0,
3487,Stevenson,Andrew,664057,2017,84.7,1.6,20.5,0,0.0,5.1,...,64.1,25,7.7,3,25.6,10,2.6,1,29.2,
3488,Hwang,Jae-Gyun,666561,2017,82.7,13.6,32.4,2,5.4,0.0,...,45.9,17,29.7,11,16.2,6,8.1,3,28.4,


Next, I separate the Statcast data from each season. Again, I am a big fan of SQL.

In [494]:
sc_2015 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2015')
sc_2016 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2016')
sc_2017 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2017')
sc_2018 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2018')
sc_2019 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2019')
sc_2020 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2020')
sc_2021 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2021')

I repeat the process of combining the fangraphs and statcast dataframes with the full names. This time, I am using the 2018 data!

In [581]:
traditional_stats_2018 = pandas.read_csv('Fangraphs_2018.csv')
war_2018 = sql('SELECT Name, WAR FROM traditional_stats_2018')

In [585]:
full_names_2018 = sc_2018.iloc[:, 1].astype(str) + ' ' + sc_2018['last_name'].astype(str)
full_names_2018
sc_2018.insert(0, 'Name', full_names_2018, True)
sc_2018['Name'] = sc_2018['Name'].str[1:]

In [587]:
batter_statcast_with_trad_war_2018 = sql('SELECT * FROM sc_2018 NATURAL JOIN war_2018')

I must be careful to choose datasets that have only the same players, and to have the dataset aligned. Again SQL handles this for me. I replace all NaN values with a non-zero value.

In [626]:
player_wars_2018 = batter_statcast_with_trad_war_2018[['Name','WAR']]
common_2018 = sql('SELECT player_wars_2018.WAR AS WAR FROM player_wars_2018, player_wars_2019 WHERE player_wars_2018.name = player_wars_2019.name')
common_2019 = sql('SELECT player_wars_2019.WAR AS WAR FROM player_wars_2018, player_wars_2019 WHERE player_wars_2018.name = player_wars_2019.name')

T_2018 = common_2018.to_numpy().reshape((-1, 1))
X_2018 = common_2018.to_numpy()
common_2018 = relevant_physical_stats_2018.fillna(.01)

I save the 2019 Statcast data in X_2019 to evaluate the year-to-year effect that Statcast data has in making predictions. I call `use()` again, and evaluate.

In [1082]:
X_2019 = common_2019.to_numpy()

In [634]:
n_inputs = X_2018.shape[1]
n_outputs = T_2018.shape[1]


neural_net = neuralnetworks.NeuralNetwork(n_inputs, 
                                           [100, 100, 100], 
                                           n_outputs, 
                                           'relu')
                
neural_net.train(X_2018, T_2018, 5000, 0.001, method='adam')
predicted_WAR_2019 = neural_net.use(X_2019)

Adam: Epoch 500 Error=1.17604
Adam: Epoch 1000 Error=0.91556
Adam: Epoch 1500 Error=0.62808
Adam: Epoch 2000 Error=0.42738
Adam: Epoch 2500 Error=0.36341
Adam: Epoch 3000 Error=0.34366
Adam: Epoch 3500 Error=0.32666
Adam: Epoch 4000 Error=0.30738
Adam: Epoch 4500 Error=0.28443
Adam: Epoch 5000 Error=0.25639


In [845]:
average_difference = predicted_WAR_2019 - common_2019
print(np.mean(average_difference), np.std(average_difference))


WAR    0.078008
dtype: float64 WAR    0.316373
dtype: float64


It appears that our neural network is extremely successful at predicting a player's WAR for the following year given the physical Statcast data of the current year that we are investigating. Our year-to-year prediction was off by 0.07 WAR. Considering that the standard deviation of WAR is 0.32, this is a very encouraging result. We can now be confident that the relevant Statcast data holds up year-to-year in predicting WAR. Now , let's finally use this knowledge to predict the next year given the current year's data.

The above process was quite tedious to do repeatedly. It's time to automate our results. Below I define a function to find players that have played between consecutive seasons.

In [904]:
def find_common_players(season1_statcast, season1_fangraphs, season2_statcast, season2_fangraphs):
    season_1_war = sql('SELECT Name, WAR FROM season1_fangraphs')
    season_2_war = sql('SELECT Name, WAR FROM season2_fangraphs')
    
    full_names_season_1 = season1_statcast.iloc[:, 1].astype(str) + ' ' + season1_statcast['last_name'].astype(str)
    full_names_season_2 = season2_statcast.iloc[:, 1].astype(str) + ' ' + season2_statcast['last_name'].astype(str)

    
    season1_statcast.insert(0, 'Name', full_names_season_1, True)
    season1_statcast['Name'] = season1_statcast['Name'].str[1:]
    
    season2_statcast.insert(0, 'Name', full_names_season_2, True)
    season2_statcast['Name'] = season2_statcast['Name'].str[1:] 
    
    batter_statcast_with_trad_war_season_1 = sql('SELECT * FROM season1_statcast NATURAL JOIN season_1_war')
    batter_statcast_with_trad_war_season_2 = sql('SELECT * FROM season1_statcast NATURAL JOIN season_2_war')
    
    sc_war_season1 = batter_statcast_with_trad_war_season_1[['Name','WAR']]
    sc_war_season2 = batter_statcast_with_trad_war_season_2[['Name','WAR']]

    common_season1 = sql('SELECT sc_war_season1.WAR AS WAR FROM sc_war_season1, sc_war_season2 WHERE sc_war_season1.name = sc_war_season2.name')
    common_season2 = sql('SELECT sc_war_season2.WAR AS WAR FROM sc_war_season1, sc_war_season2 WHERE sc_war_season1.name = sc_war_season2.name')
    names = sql('SELECT sc_war_season2.name AS Name FROM sc_war_season1, sc_war_season2 WHERE sc_war_season1.name = sc_war_season2.name')

    
    return (common_season1, common_season2, names)

I next define a function that predicts the WAR for a season given the **previous** season's Statcast data.

In [905]:
def get_predicted_war(common_players):
    T_season_1 = common_players[0].to_numpy().reshape((-1, 1))
    X_season_1 = common_players[0].to_numpy()
    X_season_2 = common_players[1].to_numpy()
                                
    n_inputs = X_season_1.shape[1]
    n_outputs = T_season_1.shape[1]


    neural_net = neuralnetworks.NeuralNetwork(n_inputs, 
                                              [100, 100, 100], 
                                              n_outputs, 
                                              'relu')
                
    neural_net.train(X_season_1, T_season_1, 5000, 0.001, method='adam')
    predicted_WAR_season2 = neural_net.use(X_season_2)
    
    return predicted_WAR_season2           
                                

I define a function to conveniently compile the results.

In [1021]:
def compile_results(statcast, fangraphs):
    column_names = [
        '2015',
        '2016',
        '2017',
        '2018',
        '2019',
        '2020',
        '2021'
    ]
    
    row_names = [
        'League Average WAR',
        'League Standard Deviation WAR',
        'Individual Actual WAR',
        'Individual Predicted WAR',
        'Difference',
        '95\% Confidence Interval'
    ]
    
    results = []
    #print(len(statcast))
    for i in range(1, len(statcast)-1):
        
        statcast = reset_sc()
        fangraphs = reset_fangraphs()
        
        common_players = find_common_players(statcast[i], fangraphs[i], statcast[i-1], fangraphs[i-1])
        names = common_players[2]
        num = len(common_players[0])

        difference = common_players[1] - common_players[0]
        result = {
            'Season': [column_names[i]] * num,
            'Name': names.to_numpy().tolist(),
            'League Average WAR': [np.round(np.mean(common_players[0])[0], 1)] * num,
            'League Standard Deviation WAR': [np.round(np.std(common_players[0])[0], 1)] * num,
            'Actual WAR': common_players[0].to_numpy().tolist(),
            'Predicted WAR': common_players[1].to_numpy().tolist(),
            'Difference': np.round(difference.to_numpy(), 1).tolist(),
            'Mean Difference': [np.round(np.mean(difference.to_numpy()), 1)] * num,
            'Standard Deviation Difference': [np.round(np.std(difference.to_numpy()), 1)] * num 

        }
        data = pandas.DataFrame(result)
        data['Name'] = data['Name'].str[0]
        data['Actual WAR'] = data['Actual WAR'].str[0]
        data['Predicted WAR'] = data['Predicted WAR'].str[0]
        data['Difference'] = data['Difference'].str[0]
        results.append(data)
    return results
    

An unfortunate quirk of my method for matching names necesitates resetting the data after each trial. I implement a quick hack to get around this.

In [1022]:
def reset_sc():
    statcast_2015_2021 = pandas.read_csv('statcast_2015-2021.csv')
    sc_2015 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2015')
    sc_2016 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2016')
    sc_2017 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2017')
    sc_2018 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2018')
    sc_2019 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2019')
    sc_2020 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2020')
    sc_2021 = sql('SELECT * FROM statcast_2015_2021 WHERE year = 2021')

    all_sc = [sc_2015, sc_2016, sc_2017, sc_2018, sc_2019, sc_2020, sc_2021]
    return all_sc

In [1023]:
def reset_fangraphs():
    fangraphs_2015 = pandas.read_csv('Fangraphs_2015.csv')
    fangraphs_2016 = pandas.read_csv('Fangraphs_2016.csv')
    fangraphs_2017 = pandas.read_csv('Fangraphs_2017.csv')
    fangraphs_2018 = pandas.read_csv('Fangraphs_2018.csv')
    fangraphs_2019 = pandas.read_csv('Fangraphs_2019.csv')
    fangraphs_2020 = pandas.read_csv('Fangraphs_2020.csv')
    fangraphs_2021 = pandas.read_csv('Fangraphs_2021.csv')

    all_fangraphs = [fangraphs_2015, fangraphs_2016, fangraphs_2017, fangraphs_2018, fangraphs_2019, fangraphs_2020, fangraphs_2021]
    return all_fangraphs
#fangraphs_2021

In [1083]:
all_sc = reset_sc()
all_fangraphs = reset_fangraphs()
results = compile_results(all_sc, all_fangraphs)

## Results

It's now time to evaluate the results from our experiments. Let's look into the data for the following seasons, investigating both the true and projected WAR.

Let's check the results for the 2016 season.

In [1085]:
results[0]

Unnamed: 0,Season,Name,League Average WAR,League Standard Deviation WAR,Actual WAR,Predicted WAR,Difference,Mean Difference,Standard Deviation Difference
0,2016,Bartolo Colon,1.3,1.9,-0.2,0.0,0.2,0.1,1.6
1,2016,David Ortiz,1.3,1.9,4.5,2.9,-1.6,0.1,1.6
2,2016,Alex Rodriguez,1.3,1.9,-1.1,2.7,3.8,0.1,1.6
3,2016,Adrian Beltre,1.3,1.9,5.5,4.3,-1.2,0.1,1.6
4,2016,Carlos Beltran,1.3,1.9,2.4,1.8,-0.6,0.1,1.6
...,...,...,...,...,...,...,...,...,...
403,2016,Michael Conforto,1.3,1.9,1.0,1.9,0.9,0.1,1.6
404,2016,Yasiel Puig,1.3,1.9,1.0,1.5,0.5,0.1,1.6
405,2016,Jorge Soler,1.3,1.9,0.8,0.3,-0.5,0.1,1.6
406,2016,Jung Ho Kang,1.3,1.9,2.1,3.7,1.6,0.1,1.6


The projection system works very well. Given the 2015 Statcast data, the mean average difference between the predicted and actual 2016 WAR is just 0.1! On an individual player basis, my physics-based projection system has a standard deviation of just 1.6, which is lower than the actual standard deviation of true WAR. While an apples-and-oranges comparison, this leads me to believe that the variance in my projection model is acceptable for a first attempt.

Now, let's look at the 2017 season.

In [1086]:
results[1]

Unnamed: 0,Season,Name,League Average WAR,League Standard Deviation WAR,Actual WAR,Predicted WAR,Difference,Mean Difference,Standard Deviation Difference
0,2017,Adrian Beltre,1.3,1.8,2.9,5.5,2.6,0.1,1.8
1,2017,Carlos Beltran,1.3,1.8,-1.1,2.4,3.5,0.1,1.8
2,2017,Jayson Werth,1.3,1.8,-0.3,1.3,1.6,0.1,1.8
3,2017,Ichiro Suzuki,1.3,1.8,-0.1,1.3,1.4,0.1,1.8
4,2017,Victor Martinez,1.3,1.8,-1.1,1.1,2.2,0.1,1.8
...,...,...,...,...,...,...,...,...,...
395,2017,Chad Pinder,1.3,1.8,0.2,-0.2,-0.4,0.1,1.8
396,2017,Tim Anderson,1.3,1.8,0.1,2.1,2.0,0.1,1.8
397,2017,Andrew Benintendi,1.3,1.8,2.0,0.6,-1.4,0.1,1.8
398,2017,Tyler White,1.3,1.8,0.2,-0.2,-0.4,0.1,1.8


The 2017 results are similar to the 2016 results. It appears that older players, such as Adrian Beltre, Carlos Beltran, and Jayson Werth have less reliable Statcast data for predicting the next year's results. This is to be expected, as older players tend to have a higher marginal erosion of skills. Given more time, I would investigate the relationship between age and the difference in actual and predicted WAR. 

In [1090]:
results[2]

Unnamed: 0,Season,Name,League Average WAR,League Standard Deviation WAR,Actual WAR,Predicted WAR,Difference,Mean Difference,Standard Deviation Difference
0,2018,Adrian Beltre,1.3,1.9,1.1,2.9,1.8,0.1,1.7
1,2018,Victor Martinez,1.3,1.9,-1.7,-1.1,0.6,0.1,1.7
2,2018,Chase Utley,1.3,1.9,0.1,1.1,1.0,0.1,1.7
3,2018,Albert Pujols,1.3,1.9,-0.3,-2.0,-1.7,0.1,1.7
4,2018,Matt Holliday,1.3,1.9,0.1,0.0,-0.1,0.1,1.7
...,...,...,...,...,...,...,...,...,...
396,2018,Paul DeJong,1.3,1.9,3.3,3.1,-0.2,0.1,1.7
397,2018,Yoan Moncada,1.3,1.9,2.0,1.1,-0.9,0.1,1.7
398,2018,Ian Happ,1.3,1.9,1.5,1.9,0.4,0.1,1.7
399,2018,Harrison Bader,1.3,1.9,3.6,0.2,-3.4,0.1,1.7


The 2018 results are again solid.  

In [1091]:
results[3]

Unnamed: 0,Season,Name,League Average WAR,League Standard Deviation WAR,Actual WAR,Predicted WAR,Difference,Mean Difference,Standard Deviation Difference
0,2019,Albert Pujols,1.4,1.9,-0.5,-0.3,0.2,0.2,1.7
1,2019,Miguel Cabrera,1.4,1.9,-0.4,0.7,1.1,0.2,1.7
2,2019,Jeff Mathis,1.4,1.9,-2.1,0.9,3.0,0.2,1.7
3,2019,Shin-Soo Choo,1.4,1.9,1.7,2.3,0.6,0.2,1.7
4,2019,Yadier Molina,1.4,1.9,1.2,2.5,1.3,0.2,1.7
...,...,...,...,...,...,...,...,...,...
357,2019,Harrison Bader,1.4,1.9,1.8,3.6,1.8,0.2,1.7
358,2019,David Fletcher,1.4,1.9,3.3,1.8,-1.5,0.2,1.7
359,2019,Scott Kingery,1.4,1.9,2.8,-0.1,-2.9,0.2,1.7
360,2019,Juan Soto,1.4,1.9,4.9,3.7,-1.2,0.2,1.7


More of the same for our results for the 2019 system. It appears that the reverse of the observation made above is true for younger players. It would be worth investigating further if my projections are less reliable for younger, developing players and indeed need to be weighted by age.

Now, time for a curveball. Let's evaluate how my Statcast projections work on the COVID-19 shortened 2020 season.

In [1089]:
results[4]

Unnamed: 0,Season,Name,League Average WAR,League Standard Deviation WAR,Actual WAR,Predicted WAR,Difference,Mean Difference,Standard Deviation Difference
0,2020,Jeff Mathis,0.6,0.8,0.2,-2.1,-2.3,1.1,1.6
1,2020,Edwin Encarnacion,0.6,0.8,-0.3,2.5,2.8,1.1,1.6
2,2020,David Peralta,0.6,0.8,1.0,1.7,0.7,1.1,1.6
3,2020,Dexter Fowler,0.6,0.8,0.0,1.5,1.5,1.1,1.6
4,2020,Tyler Flowers,0.6,0.8,0.4,2.1,1.7,1.1,1.6
...,...,...,...,...,...,...,...,...,...
327,2020,Jordan Luplow,0.6,0.8,0.4,2.2,1.8,1.1,1.6
328,2020,Ian Happ,0.6,0.8,1.9,1.5,-0.4,1.1,1.6
329,2020,Fernando Tatis Jr.,0.6,0.8,2.9,3.6,0.7,1.1,1.6
330,2020,Vladimir Guerrero Jr.,0.6,0.8,0.2,0.4,0.2,1.1,1.6


Interesting! It appears that my projections suffered from the irregularity of the short 2020 season. The consensus from the baseball analytics community is that data from 2020 is to be taken with a grain of salt, so I am not too concerned with the lackluster performance.

It is finally time to evaluate the Statcast physics-base projection system again a professional projection system, ZiPS. I read in the 2021 projections for ZiPS, and train my neural network with the 2020 Statcast data.

In [1026]:
zips = pandas.read_csv('zips_2021.csv')

In [1095]:
all_sc = reset_sc()
all_fangraphs = reset_fangraphs()
common_2020_2021 = find_common_players(all_sc[-2], all_fangraphs[-2], all_sc[-1], zips)
predicted_2021 = get_predicted_war(common_2020_2021)

Adam: Epoch 500 Error=0.59540
Adam: Epoch 1000 Error=0.46438
Adam: Epoch 1500 Error=0.30831
Adam: Epoch 2000 Error=0.20451
Adam: Epoch 2500 Error=0.17895
Adam: Epoch 3000 Error=0.16699
Adam: Epoch 3500 Error=0.15394
Adam: Epoch 4000 Error=0.13842
Adam: Epoch 4500 Error=0.11952
Adam: Epoch 5000 Error=0.09783


In [1110]:
zips_war = common_2020_2021[1]
difference = predicted_2021 - common_2020_2021[1]

names = common_2020_2021[2]

final_results = {
    'Name': names.to_numpy().tolist(),
    '2021 ZiPS WAR': zips_war.to_numpy().tolist(),
    '2021 Statcast Physical WAR': np.round(predicted_2021, 1).tolist(),
    'Difference': np.round(difference.to_numpy(), 1).tolist(),
    'Mean Difference': [np.round(np.mean(difference.to_numpy()), 1)] * 400,
    'Standard Deviation Difference': [np.round(np.std(difference.to_numpy()), 1)] * 400
}
final_result = pandas.DataFrame(final_results)
final_result['Name'] = final_result['Name'].str[0]
final_result['2021 ZiPS WAR'] = final_result['2021 ZiPS WAR'].str[0]
final_result['2021 Statcast Physical WAR'] = final_result['2021 Statcast Physical WAR'].str[0]
final_result['Difference'] = final_result['Difference'].str[0]


In [1112]:
final_result

Unnamed: 0,Name,2021 ZiPS WAR,2021 Statcast Physical WAR,Difference,Mean Difference,Standard Deviation Difference
0,Jeff Mathis,-1.0,-0.4,0.6,0.0,0.1
1,Edwin Encarnacion,-0.1,-0.1,-0.0,0.0,0.1
2,David Peralta,1.1,1.1,-0.0,0.0,0.1
3,Dexter Fowler,-0.2,-0.2,-0.0,0.0,0.1
4,Tyler Flowers,0.9,0.9,-0.0,0.0,0.1
...,...,...,...,...,...,...
395,Randy Arozarena,2.1,2.1,0.0,0.0,0.1
396,Austin Hays,1.7,1.7,0.0,0.0,0.1
397,Mike Brosseau,2.0,2.0,0.0,0.0,0.1
398,Luis Garcia,0.3,0.2,-0.1,0.0,0.1


At last, it appears that we have confirmed that my Statcast projection system can successfully be compared to a professional projection system. The mean difference of my system is 0 (!), and the standard deviation of the difference is just 0.1. The system could use additional fine tuning, but I believe that I successfully demonstrated that physics-base batted ball outcomes are just as valuable as bottom line results in projecting a player's next-season WAR.

## Conclusion

It would appear that taking an agnostic approach to bottom-line traditional baseball outcomes and focusing on the physical aspects of a position player's batted ball outcomes has a very similar efficacy to the approach of traditional projection systems in predicting the player's WAR for the following season.

This result suggests that there may be new ways to approach player evaluation in the future. Such findings may be able to help quantify and project international prospects that have proven somewhat difficult to project based on the traditional systems. Such an approach would require a proliferation of Statcast systems in international leagues, such as the South Korean, Japanese, and Mexican Leagues. This may even hold true for finding diamonds in the rough for independent leagues in the United States.

Additionally, disregarding comparisons to other projection systems, my experiments have demonstrated that any given player's year-to-year Statcast data is reliable in predicting the following year's fangraphs WAR. This is the case despite ignoring other metrics that are essential to a player's value, such as defensive contributions.

A useful follow up to this project would be refining my system to weight a player's year-to-year data given the player's age, and evaluating the relationship between a player's batted-ball Statcast data to their defensive metrics. I hypothesize that players with excellent batted-ball Statcast metrics and speed tend to rate similarly well on defense. I imagine that my projections account for these relationships somehow via the hidden layers in the neural network, but I would only be able to confirm this given additional investigation.

### References

* https://library.fangraphs.com/misc/war/
* https://github.com/jldbc/pybaseball 
* https://pypi.org/project/baseball-scraper/
* https://www.fangraphs.com/leaders
* https://baseballsavant.mlb.com/leaderboard/
* https://colostate.instructure.com/courses/119103/files
* https://nbviewer.jupyter.org/url/www.cs.colostate.edu/~cs445/notebooks/A2.5%20Multilayer%20Neural%20Networks%20for%20Nonlinear%20Regression.ipynb

In [1116]:
import io
from nbformat import current
import glob
nbfile = glob.glob('Schall-FinalProject.ipynb')
if len(nbfile) > 1:
    print('More than one ipynb file. Using the first one.  nbfile=', nbfile)
with io.open(nbfile[0], 'r', encoding='utf-8') as f:
    nb = current.read(f, 'json')
word_count = 0
for cell in nb.worksheets[0].cells:
    if cell.cell_type == "markdown":
        word_count += len(cell['source'].replace('#', '').lstrip().split(' '))
print('Word count for file', nbfile[0], 'is', word_count)

Word count for file Schall-FinalProject.ipynb is 1923
