# When to Pull a Pitcher

#### James Fraser, Nick Korompilas and Luis Rosales

CBE 40455: Final Project  
December 14, 2017

## Introduction

In baseball, a manager entering the 9th inning with a lead must decide whether to stick with the current pitcher or pull that pitcher in favor of the closer. Many factors go into this decision. This notebook implements some of these factors in a multistage decision model to determine when a team should pull its current pitcher in favor of its closer.

This model is adapted from: 

Kantor, Jeffrey. “Points After Touchdown Decision.” Https://Github.com/Jckantor/CBE40455/Blob/Master/Notebooks/Points%20after%20Touchdown%20Decision.Ipynb.

Some sources used were:

Albert, Jim. “Calculation of In-Game Win Probabilities.” Exploring Baseball Data with R, 29 Dec. 2014, baseballwithr.wordpress.com/2014/12/29/calculation-of-in-game-win-probabilities/.

Haechrel, Matt. “Matchup Probabilities in Major League Baseball.” Matchup Probabilities in Major League Baseball | Society for American Baseball Research, sabr.org/research/matchup-probabilities-major-league-baseball.

Hirotsu, Nobuyoshi, and Mike Wright. “Modelling a Baseball Game to Optimize Pitcher Substitution Strategies Incorporating Handedness of Players.” IMA Journal of Management Mathematics, 2005, pp. 179–194.

Nichols, David. “Expected Runs/Chance of Scoring Table.” Nichols's Expected Runs Table, www.nssl.noaa.gov/users/brooks/public_html/feda/datasets/expectedruns.html.

Sokol, Joel S. “An Intuitive Markov Chain Lesson from Baseball.” INFORMS Transactions on Education, 2004.

# Initializations

In [82]:
%matplotlib inline

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import csv 
import pandas as pd
import math


# Inputting Data

Using the probabilities of a single, double, triple, etc. gathered from (Haechrel), a 24 x 24 situation matrix was built. Each row corresponds to a starting situation, say man on first with no one out, and gives the probability of moving from that initial state to every other possible state. For example, the probability of going from this initial state to men on second and third with no one out is 0.04. Conversely, going from this initial state to bases loaded with no one out is not possible, so that entry of the matrix is 0.0.
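
To make the layout concrete, here is a minimal sketch of how the 24 states could be indexed, with one toy row filled in. The ordering (8 base situations within each out count) follows the scenario numbering used in the decision chart of this notebook. The names `BASES`, `state_labels` and `state_index` are hypothetical helpers, and every probability except the 0.04 quoted above is made up for illustration rather than taken from the data files.

```python
import numpy as np

BASES = ['Empty', 'First', 'Second', 'Third',
         'First/Second', 'Corners', 'Second/Third', 'Loaded']

# Hypothetical labelling: state k is (outs, bases), with outs varying slowest.
state_labels = [(o, b) for o in range(3) for b in BASES]

def state_index(outs, bases):
    """Return the row/column index of a state in the 24 x 24 matrix."""
    return 8 * outs + BASES.index(bases)

# Toy row for the starting state "man on first, no one out".
# Only the 0.04 entry comes from the text; the others are illustrative.
row = np.zeros(24)
row[state_index(0, 'Second/Third')] = 0.04   # value quoted in the text
row[state_index(0, 'Loaded')] = 0.0          # impossible from this state
row[state_index(0, 'First/Second')] = 0.36   # illustrative
row[state_index(1, 'First')] = 0.60          # illustrative

# This toy row is a complete probability distribution, so it sums to 1.
assert np.isclose(row.sum(), 1.0)
print(state_index(0, 'Second/Third'), state_labels[state_index(0, 'Second/Third')])
```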

An expected runs matrix was also built using data from (Nichols), based on the average runs scored in each situation over the last 20 years of professional baseball. It was then used to calculate the team's win probability, which is modeled as a function of the lead.
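
The dependence of win probability on the lead can be sketched as a logistic function. The coefficients 0.06 and 0.8 are the ones that appear in this notebook's decision code. The function name `win_probability` and its keyword arguments are hypothetical and not part of the original notebook.

```python
import math

def win_probability(lead, intercept=0.06, slope=0.8):
    """Logistic win probability for the team that is ahead by `lead` runs."""
    z = intercept + slope * lead
    return math.exp(z) / (1.0 + math.exp(z))

# Win probability for a tie game and for leads of 1 to 4 runs
for lead in range(5):
    print(lead, round(win_probability(lead), 3))
```

Because the function is increasing in the lead, every run the home team is expected to score lowers the away team's win probability.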

An outs matrix was also made. This is a 24 x 24 matrix which shows the number of outs that occur from the transition from state A to state B.
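
One way to combine a transition matrix with an outs matrix, sketched here on a made-up 3-state chain, is to weight the outs recorded on each transition by that transition's probability. The arrays `P` and `outs_toy` are illustrative assumptions and do not come from the data files.

```python
import numpy as np

# Toy 3-state chain (illustrative values only).
# P[a, b] is the probability of moving from state a to state b.
P = np.array([[0.3, 0.7, 0.0],
              [0.0, 0.4, 0.6],
              [0.0, 0.0, 1.0]])

# outs_toy[a, b] is the number of outs recorded on the move a -> b.
outs_toy = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])

# Expected outs on the next at-bat from each starting state:
# sum over destinations b of P[a, b] * outs_toy[a, b].
expected_outs = (P * outs_toy).sum(axis=1)
print(expected_outs)
```

This is an elementwise product followed by a row sum, which is a different operation from a matrix product of the two arrays.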

In [83]:
# Imports data set
f = open('situation matrix(5).csv','r')
matrix = []
for line in f:
    line = line.rstrip()
    x=line.split(',')
    x = x[0:]
    for m in range(24):
        if x[m]=='':
            x[m]=-1
        else:
            x[m] = float(x[m])
    matrix.append(x) 
    
g = open('expected runs(2).csv','r')
runsscored = []
for line in g:
    line = line.rstrip()
    y=line.split(',')
    y = y[0:]
    for m in range(1):
        if y[m]=='':
            y[m]=-1
        else:
            y[m] = float(y[m])
    runsscored.append(y)    
    
h = open('outs.csv','r')
outs = []
for line in h:
    line = line.rstrip()
    z=line.split(',')
    z = z[0:]
    for m in range(24):
        if z[m]=='':
            z[m]=-1
        else:
            z[m] = float(z[m])
    outs.append(z)     

# Multi-Stage Model of Decision Process

Each stage in the multistage decision process corresponds to an at-bat. The state of the game is monitored after each at-bat, and a new win percentage is calculated. Each state is modeled by 3 parameters:

- Outcome of the prior at-bat (S): Bases Empty ('Empty'), Man on First ('First'), Man on Second ('Second'), Man on Third ('Third'), Men on First and Second ('First/Second'), Men on First and Third ('Corners'), Men on Second and Third ('Second/Third'), Bases Loaded ('Loaded')

- Lead prior to the at-bat (L)

- Outs in the inning (O)

The cell below enumerates values for each parameter. The set of possible states of the game is formed as the product of these parameters, which will be used as an index for the following calculations.


In [84]:
S=['Empty', 'First', 'Second', 'Third', 'First/Second', 'Corners', 'Second/Third', 'Loaded']
L=range(0,10) # lead: 0 through 9; assume the closer is only brought in when leading or tied
O=[0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2] # outs (0, 1, or 2) for each of the 24 scenarios in the 9th inning
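
As a minimal sketch of the "product of parameters" idea, the full state set can be enumerated with `itertools.product`. The names `leads`, `outs_values`, `states` and `state_index` are hypothetical, and the lead is restricted to 0 through 4 on the assumption that only the five leads simulated in this notebook are of interest.

```python
from itertools import product

S = ['Empty', 'First', 'Second', 'Third',
     'First/Second', 'Corners', 'Second/Third', 'Loaded']
leads = range(0, 5)        # assumption: the five leads (0-4) that are simulated
outs_values = [0, 1, 2]

# Every (lead, outs, bases) combination, with bases varying fastest.
states = list(product(leads, outs_values, S))

# Lookup table from a state tuple to its position in the enumeration.
state_index = {state: k for k, state in enumerate(states)}

print(len(states))
print(state_index[(0, 1, 'Empty')])
```

Within a single lead, positions 0 to 23 follow the same order as scenarios 1 to 24 in the decision chart.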



# Decision Prior to the At-Bat

Suppose the away team is going into the 9th inning with a lead or in a tie game. The home team has the last opportunity to bat in the game, and the away team's manager has a decision to make: keep the current pitcher or bring in the closer to finish out the game. The closer is a pitcher who specializes in relieving in the ninth inning and is the best reliever on the team. The decision to bring him in is a tricky one, because using the closer in a game that a regular reliever would have won wastes him: he will not be available the next day.

The model simulates 5 different scenarios in the ninth inning: a tie game and an away team lead of 1 to 4 runs. For each lead, the model runs through each of the 24 scenarios in the inning. It first calculates a win probability for the current state before the at-bat, which is a function of the lead and the probability of the home team scoring from the current scenario. The parameters for the win probability were obtained from (Albert). The model then calculates the probability of landing in each subsequent scenario from the current scenario. It finds the expected runs scored in the at-bat by multiplying the probability of landing in a new scenario by the runs scored to get to that scenario. The expected number of outs for the at-bat is found the same way, by multiplying the probability of landing in a new scenario by the outs needed to get to that scenario. The model then continues to simulate the inning by multiplying by the scenario matrix probabilities to find the probabilities of landing in later scenarios. It does this until the number of outs is greater than 2.5, which rounds to 3 outs and means the inning is over. Finally, the model calculates the new probability of winning the game as a function of the new lead, which is the initial lead minus the expected runs scored.
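
The propagation described above can be illustrated on a deliberately tiny chain. This is a sketch under strong assumptions, not the notebook's model: the states track outs only, every at-bat is either an out (probability 0.7) or a solo home run (probability 0.3), and the names `P`, `runs_per_at_bat`, `outs_per_at_bat` and `simulate_inning` are hypothetical.

```python
import numpy as np

# Toy chain: states are 0, 1, 2 outs plus an absorbing "inning over" state.
P = np.array([[0.3, 0.7, 0.0, 0.0],
              [0.0, 0.3, 0.7, 0.0],
              [0.0, 0.0, 0.3, 0.7],
              [0.0, 0.0, 0.0, 1.0]])
runs_per_at_bat = np.array([0.3, 0.3, 0.3, 0.0])  # expected runs from each state
outs_per_at_bat = np.array([0.7, 0.7, 0.7, 0.0])  # expected outs from each state

def simulate_inning(start_state, lead, max_at_bats=100):
    """Propagate the state distribution until expected outs reach 2.5.

    Returns the projected lead and the accumulated expected outs.
    """
    x = np.zeros(4)
    x[start_state] = 1.0                 # all probability on the start state
    expected_runs = 0.0
    expected_outs = float(start_state)   # outs already recorded
    for _ in range(max_at_bats):
        if expected_outs >= 2.5:         # rounds to 3 outs: inning over
            break
        expected_runs += float(x @ runs_per_at_bat)
        expected_outs += float(x @ outs_per_at_bat)
        x = x @ P                        # advance the distribution one at-bat
    return lead - expected_runs, expected_outs

final_lead, total_outs = simulate_inning(start_state=0, lead=2)
print(final_lead, total_outs)
```

The state vector `x` is advanced on every pass, so each at-bat uses the distribution left by the previous one.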

The model then tracks the difference between the starting win probability and the win probability at the end of the simulation. If the win probability drops by 3 percentage points or more, the model advises the manager to bring in the closer.
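
The decision rule itself reduces to a threshold test. The sketch below uses the 0.03 threshold from the text. The function name `bring_in_closer` and the example win probabilities are hypothetical.

```python
THRESHOLD = 0.03  # 3 percentage points, as stated in the text

def bring_in_closer(win_prob_before, win_prob_after, threshold=THRESHOLD):
    """Return True when the projected drop in win probability reaches the threshold."""
    return (win_prob_after - win_prob_before) <= -threshold

# Made-up win probabilities for illustration
print(bring_in_closer(0.90, 0.85))   # projected drop of about 5 points
print(bring_in_closer(0.90, 0.89))   # projected drop of about 1 point
```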

In [85]:
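# Matrix product of the outs matrix and the situation matrix (24 x 24)
OUTS=np.matmul(outs,matrix)
# decision_matrix[i,j]: change in win probability for lead i, starting scenario j
decision_matrix = np.zeros((5,24))
for i in range(0,5):
    Q=L[i]  # away team's lead before the at-bat
    for j in range(0,24):
        # Row vector with all probability on starting scenario j
        x=np.zeros((1,24))
        x[0,j]=1
        Outs=O[j]  # outs already recorded in scenario j
        # Logistic win probability before the at-bat
        WinProbabilitycurrent=math.exp(.06+.8*Q+runsscored[j][0])/(1+math.exp(.06+.8*Q+runsscored[j][0]))
        # Repeat until the accumulated outs round to 3 (inning over).
        # x is not updated inside the loop, so each pass reuses the same
        # one-step distribution from scenario j.
        while (Outs<2.5):
            newscenario=np.matmul(x,matrix)  # probabilities of the next scenarios
            # Scale those probabilities by the expected-runs value for scenario j
            ExpectedRunsNewScenario=np.matmul(runsscored[j],newscenario)
            NewRuns=float(ExpectedRunsNewScenario[j])
            FinalRuns=Q-NewRuns  # projected lead after the expected runs
            WinProbability=math.exp(.06+.8*FinalRuns)/(1+math.exp(.06+.8*FinalRuns))
            # Outs added on this pass
            NEwOuts=np.matmul(OUTS[j,:],newscenario.transpose())
            Outs=NEwOuts +Outs
        decision_matrix[i,j] = WinProbability - WinProbabilitycurrent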


# Decision Chart

Below is the decision chart for the scenario described above.

Lead: the away team's lead in the bottom of the 9th.

Scenarios: scenarios 1-8 are the 8 on-base situations with 0 outs, in the order they are listed in S in the code above; 9-16 are the same situations with 1 out, and 17-24 with 2 outs.

A marker of 1 indicates that the manager should bring in the closer, while a * indicates that the manager should leave in the reliever.

As seen below, the manager is not advised to bring in the closer in any scenario with a lead greater than 3. With a 3-run lead, the manager is advised to bring in the closer only in messy situations (men on second and third, or bases loaded).


In [86]:
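# Header rows: axis labels, then scenario numbers 1-24
print("      Scenarios->")
print("Lead")
print("    ",end="")

for t in range(1,25):
    print("{0:3.0f} ".format(t),end="")
print("")

# One row per lead: 1 = bring in the closer, * = keep the current pitcher
for d in range(0,5):
    print("{0:3.0f} ".format(d),end="")
    for t in range(0,24):
        # Flag a drop in win probability of 3 percentage points or more
        if decision_matrix[d,t] <= -.03:
            print("  1 ",end="")
        else:
            print("  * ",end="")
    print("")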


      Scenarios->
Lead
      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24 
  0   *   1   1   1   1   1   1   1   *   1   1   1   1   1   1   1   *   1   1   1   1   1   1   1 
  1   *   1   1   1   1   1   1   1   *   1   1   1   1   1   1   1   *   1   1   1   1   1   1   1 
  2   *   *   1   1   1   1   1   1   *   *   1   1   1   1   1   1   *   *   1   1   1   1   1   1 
  3   *   *   *   *   *   *   1   1   *   *   *   *   *   *   1   1   *   *   *   *   *   *   1   1 
  4   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   * 
