# The Best and Worst Pitchers in MLB History

In [1]:
import pandas as pd
pitching = pd.read_csv("../baseballdatabank-2023.1/core/Pitching.csv")

Data courtesy of [Sean Lahman's Baseball Database](http://www.seanlahman.com/download-baseball-database)

In [2]:
import scipy.stats as stats
import matplotlib.pyplot as plt
import numpy as np

In [3]:
pitching.head()

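|   | playerID | yearID | stint | teamID | lgID | W | L | G | GS | CG | ... | IBB | WP | HBP | BK | BFP | GF | R | SH | SF | GIDP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | bechtge01 | 1871 | 1 | PH1 |  | 1 | 2 | 3 | 3 | 2 | ... |  | 7 |  | 0 | 146.0 | 0 | 42 |  |  |  |
| 1 | brainas01 | 1871 | 1 | WS3 |  | 12 | 15 | 30 | 30 | 30 | ... |  | 7 |  | 0 | 1291.0 | 0 | 292 |  |  |  |
| 2 | fergubo01 | 1871 | 1 | NY2 |  | 0 | 0 | 1 | 0 | 0 | ... |  | 2 |  | 0 | 14.0 | 0 | 9 |  |  |  |
| 3 | fishech01 | 1871 | 1 | RC1 |  | 4 | 16 | 24 | 24 | 22 | ... |  | 20 |  | 0 | 1080.0 | 1 | 257 |  |  |  |
| 4 | fleetfr01 | 1871 | 1 | NY2 |  | 0 | 1 | 1 | 1 | 1 | ... |  | 0 |  | 0 | 57.0 | 0 | 21 |  |  |  |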


In [None]:
pitching.columns

## Calculating Pitcher Totals

We'll group by `playerID` to be able to work with career statistics *per pitcher*.

In [None]:
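# Sum only the numeric columns; text columns such as teamID have no meaningful total.
pitchers = pitching.groupby('playerID').sum(numeric_only=True)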

In [None]:
pitchers.sample(8)

## Restoring Year

Summing has ruined our `yearID` column: each pitcher's value is now the total of all the years in which he pitched. We'll want a sense of *when* our pitchers pitched later on (see below), so let's add a column that records each pitcher's *first year* in the Major Leagues.

In [None]:
pitchers['firstYear'] = pitching.groupby('playerID')['yearID'].min()
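
To see why the sum distorts `yearID` and how the minimum recovers the debut year, here is a minimal sketch on a made-up table. The player IDs and numbers in `toy` are invented for illustration and are not taken from the Lahman data.

```python
import pandas as pd

# Hypothetical stint-level rows: one pitcher with two seasons, one with a single season.
toy = pd.DataFrame({
    "playerID": ["aaa01", "aaa01", "bbb01"],
    "yearID": [1901, 1902, 1950],
    "IPouts": [30, 60, 9],
})

# Summing adds the years together, which is meaningless as a date.
toy_totals = toy.groupby("playerID").sum(numeric_only=True)

# Taking the minimum year per player recovers the debut season.
toy_totals["firstYear"] = toy.groupby("playerID")["yearID"].min()

print(toy_totals)
```

The assignment lines up on the `playerID` index, so each pitcher's totals row receives his own first year.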

In [None]:
pitchers.sample(8)

## Adding a WHIP Column

Now then: How shall we measure pitching prowess? There are of course many statistics we might use, but a good one is **WHIP**: **W**alks plus **H**its divided by **I**nnings **P**itched.

We'll have to create this as a new column since it doesn't exist per se in our data. But we can calculate it. Note that our data includes `IPouts` rather than `IP`, where `IP` * 3 = `IPouts`.

In [None]:
pitchers['WHIP'] = 3 * (pitchers['BB'] + pitchers['IBB'] + pitchers['H']) / pitchers['IPouts']
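
As a quick sanity check on the outs-to-innings conversion, here is a small sketch with a hypothetical helper, `whip_from_outs`, and invented season totals. It follows the textbook definition (walks plus hits per inning) and is not part of the notebook's pipeline.

```python
def whip_from_outs(walks, hits, ipouts):
    """WHIP from counts, with innings expressed as outs (IP = IPouts / 3)."""
    return 3 * (walks + hits) / ipouts

# Hypothetical season: 50 walks and 150 hits over 200 innings (600 outs).
example_whip = whip_from_outs(walks=50, hits=150, ipouts=600)
print(example_whip)
```

Multiplying by 3 in the numerator is the same as dividing `IPouts` by 3 first, so 200 baserunners in 200 innings gives a WHIP of 1.0.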

In [None]:
pitchers['WHIP'].sample(10)

## Plotting the WHIP Distribution

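We'll need to drop the infinite values (from those pitchers who appeared in games but never got an out!), along with any undefined values from pitchers who recorded no outs and allowed no walks or hits.

In [None]:
finite_whips = pitchers[np.isfinite(pitchers['WHIP'])]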
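
A minimal sketch of why zero outs causes trouble, using an invented three-pitcher table (`toy` and its IDs are assumptions for illustration). Dividing by zero outs yields `inf` when the pitcher allowed baserunners and `NaN` when he did not, and `np.isfinite` screens out both.

```python
import numpy as np
import pandas as pd

# Hypothetical career totals: the second pitcher allowed baserunners without
# recording an out; the third recorded no outs and allowed no walks or hits.
toy = pd.DataFrame(
    {"BB_plus_H": [12, 3, 0], "IPouts": [30, 0, 0]},
    index=["aaa01", "bbb01", "ccc01"],
)
toy["WHIP"] = 3 * toy["BB_plus_H"] / toy["IPouts"]  # finite, inf, NaN

# np.isfinite is False for both inf and NaN, so one mask removes both.
toy_finite = toy[np.isfinite(toy["WHIP"])]
print(toy_finite)
```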

In [None]:
finite_whips['WHIP'].hist(bins=20);

In [None]:
finite_whips[finite_whips['WHIP'] < 10]['WHIP'].hist();

In [None]:
finite_whips[finite_whips['WHIP'] < 6]['WHIP'].hist(bins=15);

This distribution looks much like a **Poisson** distribution, which is appropriate, since in effect we're counting events (walks or hits) in a fixed unit of play (an inning). Strictly speaking, the Poisson model applies to those counts, and a pitcher's WHIP plays the role of its rate parameter. The conjugate prior for the rate of a Poisson distribution is a **Gamma** distribution.

## Typical WHIP

In [None]:
# Average over every pitcher's career

finite_whips['WHIP'].mean()

Let's go with 1.67 as our typical WHIP. This will serve as the center of our Bayesian prior, although we'll still need to choose how many innings (and walks + hits) to use as our baseline.

## Going Bayesian: Adding a `MAP_WHIP` Column

We can't just look directly at WHIPs, since some pitchers' stats will be misleadingly low or misleadingly high: If someone pitched two innings over their whole career and didn't give up a hit or a walk, that pitcher would have a career WHIP of 0, but that doesn't make him the greatest pitcher of all time.

In [None]:
pitchers.sort_values(['WHIP', 'IPouts'], ascending=[True, False]).head()

The man at the top here is [Al Braithwood](https://www.baseball-reference.com/players/b/braital01.shtml). Ever heard of him? No, because, even though he has a career WHIP of 0, he only ever pitched three innings!

So we need some kind of baseline to compare pitchers against. This is why we calculated an average WHIP. This baseline will serve as our Bayesian prior, and, thanks to conjugacy, calculating a posterior is just a matter of adding the baseline's walks + hits and innings to each pitcher's actual totals.
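
The conjugacy argument can be written out as follows. This is a sketch under a simplifying assumption that the notebook does not state explicitly: a pitcher's total walks plus hits $x$ over $n$ innings is Poisson with mean $n\lambda$, where $\lambda$ is his true WHIP, and the prior on $\lambda$ is Gamma with shape $\alpha$ and rate $\beta$. The symbols $x$, $n$, $\lambda$, $\alpha$ and $\beta$ are introduced here for illustration.

```latex
\begin{aligned}
x \mid \lambda &\sim \mathrm{Poisson}(n\lambda), \qquad \lambda \sim \mathrm{Gamma}(\alpha, \beta) \\
p(\lambda \mid x) &\propto \lambda^{x} e^{-n\lambda} \cdot \lambda^{\alpha - 1} e^{-\beta\lambda}
  = \lambda^{\alpha + x - 1} e^{-(\beta + n)\lambda} \\
\lambda \mid x &\sim \mathrm{Gamma}(\alpha + x,\ \beta + n), \qquad
\mathbb{E}[\lambda \mid x] = \frac{\alpha + x}{\beta + n}
\end{aligned}
```

Choosing $\alpha / \beta = 1.67$ centers the prior on the average WHIP, and the posterior mean is then "prior walks + hits plus actual walks + hits, over prior innings plus actual innings". Strictly, the posterior mode (the MAP estimate) is $(\alpha + x - 1)/(\beta + n)$, which differs from the mean only slightly once the counts are in the hundreds.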

Let's see what we get if we use 100 innings (300 outs) and 167 walks + hits as our baseline, and look for the best adjusted WHIPs of all time.
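
To get a feel for how much a baseline of this size moves different pitchers, here is a small worked sketch. The helper `shrunk_whip` and both pitchers are hypothetical; the default prior of 167 walks + hits over 300 outs matches the 1.67 average WHIP.

```python
def shrunk_whip(walks_plus_hits, ipouts, prior_wh=167, prior_ipouts=300):
    """Blend a pitcher's record with a prior worth `prior_ipouts` outs."""
    return 3 * (walks_plus_hits + prior_wh) / (ipouts + prior_ipouts)

# Hypothetical pitcher A: 3 innings (9 outs), no walks or hits, so a raw WHIP of 0.
short_career = shrunk_whip(0, 9)

# Hypothetical pitcher B: 2000 innings (6000 outs), 2000 walks + hits, a raw WHIP of 1.0.
long_career = shrunk_whip(2000, 6000)

print(round(short_career, 3), round(long_career, 3))
```

The three-inning pitcher is pulled almost all the way back to the average (about 1.62), while the long career barely moves (about 1.03), so the long career now ranks ahead.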

In [None]:
pitchers['MAP_WHIP'] = 3 * (pitchers['BB'] + pitchers['IBB'] + pitchers['H'] + 167) / (pitchers['IPouts'] + 300)

In [None]:
pitchers.sort_values('MAP_WHIP', ascending=True).head()

At the top there is [Addie Joss](https://www.baseball-reference.com/players/j/jossad01.shtml), a real legend of the early game.

## Trying Different Starting Numbers

In [None]:
def whip_prior(wh, ipouts, ascending=False, number=10, data=pitchers):
    """
    Take a number of walks plus hits (`wh`) and a number of outs (`ipouts`)
    to use as prior values for the Bayesian MAP method, and return the top
    `number` pitchers according to the adjusted WHIP. The ratio of walks
    plus hits to outs should be (near) 167:300, i.e. 1.67 per inning.
    By default it finds the *worst* pitchers. To find the best, set the
    `ascending` parameter to True. Note that this adds or overwrites the
    `MAP_WHIP` column of `data` in place.
    """
    data['MAP_WHIP'] = 3 * (data['BB'] + data['IBB'] + data['H'] + wh) / (data['IPouts'] + ipouts)
    return data.sort_values('MAP_WHIP', ascending=ascending).head(number)

Let's try this function out!

In [None]:
whip_prior(16.7, 30)

## Proceeding Systematically

We'll grab the worst pitcher for lots of different baseline sizes, from 10 outs up to 10,000 outs in steps of 10, keeping the baseline WHIP fixed at 1.67.

In [None]:
worsts = {}
for ipouts in range(10, 10001, 10):
    worsts[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1).iloc[0].name

In [None]:
set(worsts.values())

For different baseline sizes we get different pitchers with the highest adjusted WHIP. Let's record the smallest baseline at which each of them first comes out on top.

In [None]:
pitcher_with_ip = {}
for ipout_num in worsts:
    if worsts[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[worsts[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip
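
The sweep and the first-appearance bookkeeping can be folded into one function. The sketch below is a hypothetical helper, `first_prior_per_leader`, not something defined in the notebook: it picks the leader with `idxmax`/`idxmin` instead of calling `whip_prior`, it does not modify its input, and the two-pitcher `toy` table is invented for illustration.

```python
import pandas as pd

def first_prior_per_leader(data, ascending=False,
                           prior_outs=range(10, 10001, 10), avg_whip=1.67):
    """
    For each prior size (in outs), find the pitcher with the most extreme
    adjusted WHIP, and record the smallest prior at which each one leads.
    """
    wh_total = data["BB"] + data["IBB"] + data["H"]
    first_seen = {}
    for ipouts in prior_outs:
        prior_wh = ipouts * avg_whip / 3  # walks + hits implied by the prior
        adjusted = 3 * (wh_total + prior_wh) / (data["IPouts"] + ipouts)
        leader = adjusted.idxmin() if ascending else adjusted.idxmax()
        first_seen.setdefault(leader, ipouts)  # keep only the first appearance
    return first_seen

# Hypothetical data: a terrible 3-inning career and a bad 100-inning career.
toy = pd.DataFrame(
    {"BB": [5, 60], "IBB": [0, 0], "H": [10, 140], "IPouts": [9, 300]},
    index=["short01", "long01"],
)
toy_first = first_prior_per_leader(toy)
print(toy_first)
```

With a weak prior the tiny career looks worst; once the prior is strong enough to pull it back toward the average, the longer bad career takes over.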

Presumably, a baseline of only 10 or even 80 outs is too small for us to feel confident that we are dealing with the worst pitcher. Once the baseline gets up to 180 outs, we meet [William Stecher](https://www.baseball-reference.com/players/s/stechch01.shtml).

In [None]:
pitching[pitching['playerID'] == 'stechch01'].loc[:, ['playerID', 'yearID', 'H', 'BB', 'IBB', 'IPouts']]

In [None]:
pitchers[pitchers.index == 'stechch01']

At 770 outs, we encounter [John McMullin](https://www.baseball-reference.com/players/m/mcmuljo01.shtml), who died all the way back in 1881 (!).

In [None]:
pitching[pitching['playerID'] == 'mcmuljo01'].loc[:, ['playerID', 'yearID', 'H', 'BB', 'IBB', 'IPouts']]

In [None]:
pitchers[pitchers.index == 'mcmuljo01']

## Looking for Pitchers after 1900

There were some rule changes before 1900 that changed the game quite a lot. In the 19th century there were restrictions on pitching that we wouldn't recognize today -- restrictions that made it quite difficult for pitchers to excel. So we might bracket pitchers whose earliest years were in the 19th century. Let's see what happens if we do that.

In [None]:
# Filter first, then copy, so that later column assignments act on an independent frame.
pitchers20th = pitchers[pitchers['firstYear'] > 1900].copy()

In [None]:
worsts = {}
for ipouts in range(10, 10001, 10):
    worsts[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1, data=pitchers20th).iloc[0].name

In [None]:
set(worsts.values())

In [None]:
pitcher_with_ip = {}
for ipout_num in worsts:
    if worsts[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[worsts[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip

In this way we find [Stu Flythe](https://www.baseball-reference.com/players/f/flythst01.shtml) and [Dick Weik](https://www.baseball-reference.com/players/w/weikdi01.shtml), both of whom are in the running for the worst Major League pitcher of all time.

## Distinguishing Between Starters and Relievers

Another distinction we might like to draw is between starting pitchers and relief pitchers. This has been an important part of the game for nearly a century.

Probably this will be especially relevant when we look for the *best* pitchers, since managers tend to tolerate higher WHIPs for their starters than for their relievers.

In [None]:
pitchers.columns

How shall we determine whether a pitcher was a reliever or not? Probably the easiest way is to take advantage of the `GS` (**G**ames **S**tarted) statistic. What we can do is to compare that statistic with the `G` (**G**ames) statistic. If a sufficiently low percentage of games in which a pitcher appeared are games that that pitcher started, then we can feel safe in classifying that pitcher as a reliever.

So: How low is sufficiently low? Let's try a ratio of 1/4. If less than one quarter of the games in which a pitcher appeared are games that the pitcher started, then we'll call that pitcher a reliever.
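
Here is a minimal sketch of that rule on invented career totals (the `toy` table and its IDs are assumptions for illustration), including the boundary case where exactly one quarter of a pitcher's appearances were starts.

```python
import pandas as pd

# Hypothetical career totals (games and games started) for three pitchers.
toy = pd.DataFrame(
    {"G": [400, 300, 40], "GS": [10, 290, 10]},
    index=["closer01", "ace01", "swing01"],
)

# A pitcher counts as a reliever when fewer than a quarter of appearances were starts.
toy["start_share"] = toy["GS"] / toy["G"]
toy["reliever"] = toy["start_share"] < 0.25

print(toy)
```

Because the comparison is strict, a pitcher at exactly 0.25 (like `swing01` here) is treated as a starter.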

(There are some subtleties here, such as pitchers transitioning from starter to reliever or the reverse, or pitchers with very few games in the first place, but I think we can safely ignore these complications for now.)

In [None]:
pitchers20th['reliever'] = pitchers20th['GS'] / pitchers20th['G'] < 0.25

In [None]:
pitchers20th.head()

In [None]:
# Filter on the boolean column, then copy, so each subset can be modified independently.
starters = pitchers20th[~pitchers20th['reliever']].copy()

relievers = pitchers20th[pitchers20th['reliever']].copy()

In [None]:
worst_starters = {}
for ipouts in range(10, 10001, 10):
    worst_starters[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1, data=starters).iloc[0].name

In [None]:
set(worst_starters.values())

In [None]:
pitcher_with_ip = {}
for ipout_num in worst_starters:
    if worst_starters[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[worst_starters[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip

We see Dick Weik again, along with a name that didn't show up before: [Hayden Penn](https://www.baseball-reference.com/players/p/pennha01.shtml). But Weik is the worst starter for baselines from 260 `IPouts` all the way up to 2670 `IPouts`, so he's my pick for the worst starter in the history of the Major Leagues.
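
A range like this can be read off a sweep dictionary with a few lines of code. The sketch below uses a hypothetical helper, `prior_ranges`, and an invented `toy_leaders` dictionary in the same shape as `worst_starters` (baseline size in outs mapped to a playerID); it is not the notebook's actual output.

```python
def prior_ranges(leaders):
    """
    Given a dict mapping prior size (in outs) to the leading playerID, return
    {playerID: (smallest prior, largest prior)} at which that player led.
    If a player leads in separate stretches, only the outer bounds are kept.
    """
    ranges = {}
    for ipouts in sorted(leaders):
        player = leaders[ipouts]
        low, _ = ranges.get(player, (ipouts, ipouts))
        ranges[player] = (low, ipouts)
    return ranges

# Hypothetical sweep results, not the notebook's real output.
toy_leaders = {10: "aaa01", 20: "aaa01", 30: "bbb01", 40: "bbb01", 50: "bbb01"}
toy_ranges = prior_ranges(toy_leaders)
print(toy_ranges)
```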

![img](https://www.baseball-reference.com/req/202303230/images/headshots/a/af48bbb6_davis.jpg)

Image from baseball-reference.com

In [None]:
pitching[pitching['playerID'] == 'weikdi01'].loc[:, ['playerID', 'yearID', 'H', 'BB', 'IBB', 'IPouts']]

In [None]:
pitchers[pitchers.index == 'weikdi01']

In [None]:
worst_relievers = {}
for ipouts in range(10, 10001, 10):
    worst_relievers[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1, data=relievers).iloc[0].name

In [None]:
set(worst_relievers.values())

In [None]:
pitcher_with_ip = {}
for ipout_num in worst_relievers:
    if worst_relievers[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[worst_relievers[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip

Surely, 820 `IPouts` is more than enough for a good baseline. And so we see once again [Stu Flythe](https://www.baseball-reference.com/players/f/flythst01.shtml), who is my pick for the worst reliever in the history of the Major Leagues.

![img](https://www.baseball-reference.com/req/202303230/images/headshots/e/eef0c0d3_davis.jpg)

Image from baseball-reference.com

In [None]:
pitching[pitching['playerID'] == 'flythst01'].loc[:, ['playerID', 'yearID', 'H', 'BB', 'IBB', 'IPouts']]

In [None]:
pitchers[pitchers.index == 'flythst01']

## Best Starters and Relievers

In [None]:
best_starters = {}
for ipouts in range(10, 10001, 10):
    best_starters[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1, ascending=True, data=starters).iloc[0].name

In [None]:
set(best_starters.values())

In [None]:
pitcher_with_ip = {}
for ipout_num in best_starters:
    if best_starters[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[best_starters[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip

The best starter of all time, then, is arguably [Addie Joss](https://baseball-reference.com/players/j/jossad01.shtml).

![img](https://www.baseball-reference.com/req/202303230/images/headshots/5/5e51b2e7_sabr.jpg)

Image from baseball-reference.com

In [None]:
pitching[pitching['playerID'] == 'jossad01'].loc[:, ['playerID', 'yearID', 'H', 'BB', 'IBB', 'IPouts']]

In [None]:
pitchers[pitchers.index == 'jossad01']

In [None]:
best_relievers = {}
for ipouts in range(10, 10001, 10):
    best_relievers[ipouts] = whip_prior(ipouts*167/300, ipouts, number=1, ascending=True, data=relievers).iloc[0].name

In [None]:
set(best_relievers.values())

In [None]:
pitcher_with_ip = {}
for ipout_num in best_relievers:
    if best_relievers[ipout_num] not in pitcher_with_ip:
        pitcher_with_ip[best_relievers[ipout_num]] = ipout_num

In [None]:
pitcher_with_ip

Here the decision is a bit more difficult. But arguably a baseline of fewer than 300 `IPouts` is sufficient for us to be confident in our choice, in which case our pick for the greatest reliever of all time is [Koji Uehara](https://www.baseball-reference.com/players/u/ueharko01.shtml).

![img](https://www.baseball-reference.com/req/202303230/images/headshots/e/e109d804_mlbam.jpg)

Image from baseball-reference.com

In [None]:
pitching[pitching['playerID'] == 'ueharko01']

In [None]:
pitchers[pitchers.index == 'ueharko01']