# atBatModel.ipynb

This Jupyter notebook demonstrates how to use atBatModel.py. Required packages:
* numpy
* pandas
* pybaseball
* scikit-learn (imported as sklearn)

In [1]:
import atBatModel as abm
import numpy as np

Initialize player objects using either a first and last name or an MLBAM ID (key_mlbam). Each player's name and ID record are stored on the object as playerName and playerID.

In [2]:
pitcher = abm.player(name_first="Gerrit", name_last="Cole")
batter = abm.player(key_mlbam=545361)  # Mike Trout

print("Player name: ", batter.playerName)
print(batter.playerID)

Player name:  Mike Trout
name_last               trout
name_first               mike
key_mlbam              545361
key_retro            troum001
key_bbref           troutmi01
key_fangraphs           10155
mlb_played_first       2011.0
mlb_played_last        2021.0
Name: 0, dtype: object


Download each player's Statcast data. Following pybaseball, the data is represented as a pandas DataFrame. getStatcastData both returns the DataFrame and saves it on the player object. By default, the date range runs from January 1 of the previous year through today.

In [3]:
pitcher.getStatcastData(playerType="pitcher", verbose=True)
batter.getStatcastData(playerType="batter", verbose=True, dateRange=["2020-01-01","2022-01-01"])

Gathering Player Data
Pitcher: Gerrit Cole
Found 3504 observations from 2021-01-01 to 2022-04-10.

Gathering Player Data
Batter: Mike Trout
Found 1859 observations from 2020-01-01 to 2022-01-01.



Unnamed: 0,pitch_type,game_date,release_speed,release_pos_x,release_pos_z,player_name,batter,pitcher,events,description,...,fld_score,post_away_score,post_home_score,post_bat_score,post_fld_score,if_fielding_alignment,of_fielding_alignment,spin_axis,delta_home_win_exp,delta_run_exp
0,FF,2021-05-17,95.1,1.85,6.60,"Trout, Mike",545361,656529,walk,ball,...,1,1,0,0,1,Standard,Standard,136.0,0.022,0.187
1,FF,2021-05-17,93.6,1.58,6.66,"Trout, Mike",545361,656529,,ball,...,1,1,0,0,1,Standard,Standard,146.0,0.016,0.137
2,CU,2021-05-17,80.2,1.64,6.77,"Trout, Mike",545361,656529,,called_strike,...,1,1,0,0,1,Standard,Standard,321.0,0.000,-0.061
3,CU,2021-05-17,79.7,1.70,6.64,"Trout, Mike",545361,656529,,ball,...,1,1,0,0,1,Standard,Standard,321.0,0.000,0.070
4,FF,2021-05-17,94.6,1.79,6.66,"Trout, Mike",545361,656529,,ball,...,1,1,0,0,1,Standard,Standard,144.0,0.000,0.047
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1854,,2020-02-25,,,,"Trout, Mike",545361,675923,field_out,hit_into_play,...,0,0,3,3,0,,,,-0.017,
1855,,2020-02-25,,,,"Trout, Mike",545361,592254,walk,ball,...,0,0,0,0,0,,,,0.024,
1856,,2020-02-25,,,,"Trout, Mike",545361,592254,,ball,...,0,0,0,0,0,,,,0.000,
1857,,2020-02-25,,,,"Trout, Mike",545361,592254,,ball,...,0,0,0,0,0,,,,0.000,
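
The default window described above (January 1 of the previous year through today) can be sketched as follows. This only illustrates the date arithmetic; the helper name default_date_range is hypothetical and is not part of atBatModel.py.

```python
from datetime import date


def default_date_range(today=None):
    """Return [start, end] as ISO date strings.

    Hypothetical helper: start is January 1 of the year before `today`,
    end is `today` itself.
    """
    if today is None:
        today = date.today()
    start = date(today.year - 1, 1, 1)
    return [start.isoformat(), today.isoformat()]


# Reference date taken from the pitcher output shown above.
print(default_date_range(date(2022, 4, 10)))
```

With a reference date of 2022-04-10 this gives a range of 2021-01-01 to 2022-04-10, matching the range reported for Gerrit Cole.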


To construct the Markov model, initialize an instance of the markovModel class, passing the pitcher and batter objects as arguments. The Markov matrix is constructed and the logistic model is fitted automatically. The sanitized data used to fit the model is also saved on the model object.

In [4]:
model = abm.markovModel(pitcher=pitcher, batter=batter)

np.round(model.markovMatrix, 3)

array([[0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.429, 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.425, 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.   , 0.417, 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.451, 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.444, 0.   , 0.   , 0.431, 0.   , 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.   , 0.447, 0.   , 0.   , 0.428, 0.   , 0.   , 0.   ,
        0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ],

To obtain the raw outcome vector, call the method simulatePitches. Its arguments are the number of iterations of the matrix equation x_{n+1} = Ax_n and the starting count, here 100 iterations from a 0-0 count. Each iteration corresponds to one pitch in the at-bat.

In [5]:
outcomeVector = model.simulatePitches(100, (0,0))

np.round(outcomeVector, 3)

array([0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   , 0.   ,
       0.   , 0.   , 0.   , 0.188, 0.   , 0.104, 0.04 , 0.066, 0.601])
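
The iteration x_{n+1} = Ax_n can be illustrated on a toy chain. The sketch below is not the model built by atBatModel.py: it assumes a made-up three-state chain (one in-progress state and two absorbing outcomes) with a column-stochastic matrix, where entry A[i, j] is the probability of moving from state j to state i. The function name simulate and all probabilities are hypothetical.

```python
import numpy as np

# Hypothetical 3-state chain:
#   0 = at-bat continues, 1 = batter reaches base, 2 = batter is out.
# Each column sums to 1; states 1 and 2 are absorbing (1.0 on the diagonal).
A = np.array([
    [0.5, 0.0, 0.0],
    [0.2, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])


def simulate(A, start_state, n_iter):
    """Apply x_{n+1} = A x_n for n_iter steps from a one-hot start vector."""
    x = np.zeros(A.shape[0])
    x[start_state] = 1.0
    for _ in range(n_iter):
        x = A @ x
    return x


toy_outcome = simulate(A, start_state=0, n_iter=100)
print(np.round(toy_outcome, 3))
```

After 100 iterations essentially all of the probability has moved into the two absorbing states (about 0.4 and 0.6 here), much as the notebook's outcome vector is nonzero only in its last six entries.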

The method outcomeStats summarizes the outcome probabilities together with derived statistics such as x_AVG, x_OBP, x_SLG, x_OPS, and x_wOBA.

In [6]:
stats = model.outcomeStats()
stats

pitcher_name    Gerrit Cole
batter_name      Mike Trout
x_pBB                 0.188
x_pHBP                  0.0
x_p1B                 0.104
x_p2B                  0.04
x_pHR                 0.066
x_pOut                0.601
x_AVG                  0.21
x_OBP                 0.399
x_SLG                 0.448
x_OPS                 0.847
x_wOBA                0.412
dtype: object
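
As a rough cross-check, several of the reported statistics can be approximated from the rounded outcome probabilities. The sketch assumes per-plate-appearance definitions (hits, times on base, and total bases each taken as probabilities per plate appearance). That convention is inferred from the displayed numbers, not taken from atBatModel.py, and the function name rate_stats is hypothetical. There is no triple term because the output lists none, and x_wOBA is left out because its linear weights are not shown in this notebook.

```python
def rate_stats(p_bb, p_hbp, p_1b, p_2b, p_hr):
    """Approximate rate stats from per-plate-appearance outcome probabilities.

    Assumption: every rate is expressed per plate appearance.
    """
    avg = p_1b + p_2b + p_hr             # probability of any hit
    obp = avg + p_bb + p_hbp             # probability of reaching base
    slg = p_1b + 2 * p_2b + 4 * p_hr     # expected total bases
    return {"AVG": avg, "OBP": obp, "SLG": slg, "OPS": obp + slg}


# Rounded probabilities copied from the outcomeStats output above.
approx = rate_stats(p_bb=0.188, p_hbp=0.0, p_1b=0.104, p_2b=0.04, p_hr=0.066)
print({name: round(value, 3) for name, value in approx.items()})
```

Because the inputs are already rounded to three decimals, the results can differ from the reported values in the last digit.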