# Extra data

The point of this notebook is to start collecting the following pitching data:
- Pitcher data from last year, plus the previous starts in the current season if possible
- Stats (ERA, WHIP) from each pitcher's last 1-3 starts
- Average home attendance from the previous year
- Team ranking from the previous year (could come from their W-L record, or Statcast has a rank column)
- The month the game was played (this should already be in the data; we just need to not remove it)
- A feature to measure errors (either team fielding percentage or errors/9 would work)

In [196]:
import pybaseball as pyb

import pandas as pd
import numpy as np
import requests
import time

from bs4 import BeautifulSoup

## Pitcher summary

A summary of all pitchers

In [46]:
yearly_pitching_df = pyb.pitching_stats(2001, 2019)

In [59]:
yearly_pitching_df.head()

Unnamed: 0,IDfg,Season,Name,Team,Age,W,L,WAR,ERA,G,...,Med%+,Hard%+,EV,LA,Barrels,Barrel%,maxEV,HardHit,HardHit%,Events
49,60,2001,Randy Johnson,Diamondbacks,37,21,6,10.4,2.49,35,...,,,,,,,,,,0
68,60,2004,Randy Johnson,Diamondbacks,40,16,14,9.6,2.6,35,...,98.0,108.0,,,,,,,,0
299,73,2002,Curt Schilling,Diamondbacks,35,23,7,9.3,3.23,36,...,106.0,102.0,,,,,,,,0
1,10954,2018,Jacob deGrom,Mets,30,10,9,9.0,1.7,32,...,104.0,76.0,86.3,11.1,20.0,0.039,112.9,148.0,0.287,515
29,1303,2011,Roy Halladay,Phillies,34,19,6,8.7,2.35,32,...,96.0,72.0,,,0.0,,,0.0,,0


In [49]:
sum(yearly_pitching_df.groupby(['Name'])['IDfg'].nunique() > 1)

0

No name maps to more than one IDfg, so the names are unique in this dataset (e.g. no two pitchers are both named "John Smith").
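
To make the check above concrete, here is a minimal sketch on made-up rows (the names and IDs are hypothetical, not taken from the FanGraphs data). It shows what the same `groupby(...).nunique()` test would flag if two different players did share a name.

```python
import pandas as pd

# Hypothetical rows: two different players (IDfg 1 and 2) share one name.
toy = pd.DataFrame({
    'Name': ['John Smith', 'John Smith', 'Jane Doe', 'Jane Doe'],
    'IDfg': [1, 2, 3, 3],
    'Season': [2001, 2002, 2001, 2002],
})

# Count distinct IDs per name; any count above 1 means the name is ambiguous.
ids_per_name = toy.groupby('Name')['IDfg'].nunique()
shared_names = ids_per_name[ids_per_name > 1].index.tolist()
print(shared_names)
```

An empty list here corresponds to the `0` returned by the cell above.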

In [65]:
yearly_pitching_df.groupby('Name')['Team'].unique()

Name
A.J. Burnett    [Blue Jays, Marlins, Pirates, Yankees, Phillies]
A.J. Griffin                                         [Athletics]
Aaron Cook                                             [Rockies]
Aaron Harang           [Reds, Braves, Dodgers, Padres, Phillies]
Aaron Nola                                            [Phillies]
                                      ...                       
Zach Duke                                              [Pirates]
Zach Eflin                                            [Phillies]
Zack Godley                                       [Diamondbacks]
Zack Greinke     [Royals, - - -, Dodgers, Diamondbacks, Brewers]
Zack Wheeler                                              [Mets]
Name: Team, Length: 447, dtype: object

In [277]:
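# Use string aggregation names; recent pandas versions warn when builtins like min/max/sum are passed
pitchers_summary_df = yearly_pitching_df.groupby('Name').agg({'Season': ['min', 'max'], 
                                                              'Age': ['min', 'max'],
                                                              'G': 'sum'})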
pitchers_summary_df['teams'] = yearly_pitching_df.groupby('Name')['Team'].unique()
pitchers_summary_df['num_teams'] = yearly_pitching_df.groupby('Name')['Team'].nunique()
pitchers_summary_df.head()

Unnamed: 0_level_0,Season,Season,Age,Age,G,teams,num_teams
Unnamed: 0_level_1,min,max,min,max,sum,Unnamed: 6_level_1,Unnamed: 7_level_1
Name,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2
A.J. Burnett,2001,2015,24,38,370,"[Blue Jays, Marlins, Pirates, Yankees, Phillies]",5
A.J. Griffin,2013,2013,25,25,32,[Athletics],1
Aaron Cook,2006,2008,27,29,89,[Rockies],1
Aaron Harang,2005,2015,27,37,279,"[Reds, Braves, Dodgers, Padres, Phillies]",5
Aaron Nola,2017,2019,24,26,94,[Phillies],1


In [278]:
pitchers_summary_df.columns = ['first_season', 'last_season', 'start_age', 'end_age', 'games_played', 'teams', 'num_teams']

In [279]:
pitchers_summary_df = pitchers_summary_df.reset_index()

In [280]:
pitchers_summary_df.head()

Unnamed: 0,Name,first_season,last_season,start_age,end_age,games_played,teams,num_teams
0,A.J. Burnett,2001,2015,24,38,370,"[Blue Jays, Marlins, Pirates, Yankees, Phillies]",5
1,A.J. Griffin,2013,2013,25,25,32,[Athletics],1
2,Aaron Cook,2006,2008,27,29,89,[Rockies],1
3,Aaron Harang,2005,2015,27,37,279,"[Reds, Braves, Dodgers, Padres, Phillies]",5
4,Aaron Nola,2017,2019,24,26,94,[Phillies],1


In [211]:
pitcher_keys = []

for name in pitchers_summary_df['Name'].unique():
    # Try and get their first and last name to search for. If this splits into more than
    # just two parts, record it and move on
    try:
        first, last = name.split(' ')
    except Exception as e:
        row = [name] + [None]*4
        pitcher_keys.append(row)
        continue
        
    # If you get a first and last name, look them up. If this returns more than one player,
    # record it and move on. If not, record their keys.
    pitcher_data = pyb.playerid_lookup(last, first)
    if pitcher_data.shape[0] > 1:
        row = [name] + [None]*4
        pitcher_keys.append(row)
        continue
    else:
        try:
            row = [name] + list(pitcher_data[['key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs']].values[0])
        except Exception as e:
            row = [name] + [None]*4
        pitcher_keys.append(row)
        
    # Sleep for one second to avoid rate limiting
    time.sleep(1)

Gathering player lookup table. This may take a moment.

In [281]:
pitchers_keys_df = pd.DataFrame(pitcher_keys, columns=['Name', 'key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs'])
pitchers_keys_df.head()

Unnamed: 0,Name,key_mlbam,key_retro,key_bbref,key_fangraphs
0,A.J. Burnett,,,,
1,A.J. Griffin,,,,
2,Aaron Cook,346871.0,cooka002,cookaa01,1571.0
3,Aaron Harang,421685.0,haraa001,haranaa01,1451.0
4,Aaron Nola,605400.0,nolaa001,nolaaa01,16149.0


In [282]:
pitchers_summary_df = pitchers_summary_df.merge(pitchers_keys_df, how='left', on='Name')

In [283]:
pitchers_summary_df.head()

Unnamed: 0,Name,first_season,last_season,start_age,end_age,games_played,teams,num_teams,key_mlbam,key_retro,key_bbref,key_fangraphs
0,A.J. Burnett,2001,2015,24,38,370,"[Blue Jays, Marlins, Pirates, Yankees, Phillies]",5,,,,
1,A.J. Griffin,2013,2013,25,25,32,[Athletics],1,,,,
2,Aaron Cook,2006,2008,27,29,89,[Rockies],1,346871.0,cooka002,cookaa01,1571.0
3,Aaron Harang,2005,2015,27,37,279,"[Reds, Braves, Dodgers, Padres, Phillies]",5,421685.0,haraa001,haranaa01,1451.0
4,Aaron Nola,2017,2019,24,26,94,[Phillies],1,605400.0,nolaa001,nolaaa01,16149.0


In [284]:
pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()].shape[0]

46

46 of the 447 pitchers couldn't be matched, so about 10%.

In [285]:
pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()].head()

Unnamed: 0,Name,first_season,last_season,start_age,end_age,games_played,teams,num_teams,key_mlbam,key_retro,key_bbref,key_fangraphs
0,A.J. Burnett,2001,2015,24,38,370,"[Blue Jays, Marlins, Pirates, Yankees, Phillies]",5,,,,
1,A.J. Griffin,2013,2013,25,25,32,[Athletics],1,,,,
7,Adam Eaton,2003,2004,25,26,64,[Padres],1,,,,
27,Bobby Jones,2001,2001,31,31,33,[Padres],1,,,,
45,Brian Anderson,2003,2004,31,32,67,"[- - -, Royals]",2,,,,


In [266]:
pyb.playerid_lookup('Griffin', 'A. J.')

Gathering player lookup table. This may take a moment.


Unnamed: 0,name_last,name_first,key_mlbam,key_retro,key_bbref,key_fangraphs,mlb_played_first,mlb_played_last
0,griffin,a. j.,456167,grifa002,griffaj01,11132,2012.0,2017.0


It looks like at least one issue is that players whose first names are initials are stored with a space between the initials (e.g. "a. j.").
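
As a small sketch of that normalization (the helper name `space_out_initials` is made up for illustration and is not part of pybaseball), the function below turns `A.J.` into `A. J.` and leaves names without periods alone. Unlike a plain `replace('.', '. ')`, it also leaves an already spaced `A. J.` unchanged.

```python
def space_out_initials(first):
    """Put a single space between initials: 'A.J.' -> 'A. J.'."""
    if '.' not in first:
        # Not an initials-style name, nothing to do
        return first
    # Split on periods, drop empty pieces, then rejoin with '. ' between initials
    parts = [p.strip() for p in first.split('.') if p.strip()]
    return '. '.join(parts) + '.'

for raw in ['A.J.', 'R.A.', 'A. J.', 'Aaron']:
    print(raw, '->', space_out_initials(raw))
```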

In [249]:
pitcher_keys_initials = []

for name in pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()]['Name'].unique():
    try:
        first, last = name.split(' ')
    except Exception as e:
        row = [name] + [None]*4
        pitcher_keys_initials.append(row)
        continue
    # Only retry names whose first name is made of initials, e.g. 'A.J.' -> 'A. J.'
    if '.' in first:
        first = first.replace('.', '. ')
        first = first.rstrip(' ')
    else:
        continue

    # If you get a first and last name, look them up. If this returns more than one player,
    # record it and move on. If not, record their keys.
    pitcher_data = pyb.playerid_lookup(last, first)
    if pitcher_data.shape[0] > 1:
        row = [name] + [None]*4
        pitcher_keys_initials.append(row)
        continue
    else:
        try:
            row = [name] + list(pitcher_data[['key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs']].values[0])
        except Exception as e:
            row = [name] + [None]*4
        pitcher_keys_initials.append(row)

    # Sleep for one second to avoid rate limiting
    time.sleep(1)

Gathering player lookup table. This may take a moment.


In [286]:
pitcher_keys_initials

[['A.J. Burnett', 150359, 'burna001', 'burnea.01', 512],
 ['A.J. Griffin', 456167, 'grifa002', 'griffaj01', 11132],
 ['C.J. Wilson', 450351, 'wilsc004', 'wilsocj01', 3580],
 ['Chan Ho Park', None, None, None, None],
 ['J.A. Happ', 457918, 'happj001', 'happja01', 7410],
 ['Jorge De La Rosa', None, None, None, None],
 ['R.A. Dickey', 285079, 'dickr001', 'dicker.01', 1245],
 ['Rubby de la Rosa', None, None, None, None],
 ['Tony Armas Jr.', None, None, None, None]]

In [287]:
pitchers_keys_df = pd.DataFrame(pitcher_keys_initials, columns=['Name', 'key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs'])
pitchers_keys_df = pitchers_keys_df.set_index('Name')
pitchers_summary_df = pitchers_summary_df.set_index('Name')
pitchers_summary_df.update(pitchers_keys_df)

In [295]:
pitchers_keys_df = pitchers_keys_df.reset_index()
pitchers_summary_df = pitchers_summary_df.reset_index()

In [296]:
pitchers_summary_df.head()

Unnamed: 0,Name,first_season,last_season,start_age,end_age,games_played,teams,num_teams,key_mlbam,key_retro,key_bbref,key_fangraphs
0,A.J. Burnett,2001,2015,24,38,370,"[Blue Jays, Marlins, Pirates, Yankees, Phillies]",5,150359.0,burna001,burnea.01,512.0
1,A.J. Griffin,2013,2013,25,25,32,[Athletics],1,456167.0,grifa002,griffaj01,11132.0
2,Aaron Cook,2006,2008,27,29,89,[Rockies],1,346871.0,cooka002,cookaa01,1571.0
3,Aaron Harang,2005,2015,27,37,279,"[Reds, Braves, Dodgers, Padres, Phillies]",5,421685.0,haraa001,haranaa01,1451.0
4,Aaron Nola,2017,2019,24,26,94,[Phillies],1,605400.0,nolaa001,nolaaa01,16149.0


In [289]:
pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()].shape[0]

41

Only fixed five pitchers.

In [290]:
pyb.playerid_lookup('De La Rosa')

Gathering player lookup table. This may take a moment.


Unnamed: 0,name_last,name_first,key_mlbam,key_retro,key_bbref,key_fangraphs,mlb_played_first,mlb_played_last
0,de la rosa,dane,451773,delad001,delarda01,10095,2011.0,2014.0
1,de la rosa,eury,545001,delae002,delareu01,4055,2013.0,2014.0
2,de la rosa,francisco,113227,delaf001,dela_fr01,1003165,1991.0,1991.0
3,de la rosa,jesus,113228,delaj101,dela_je01,1003166,1975.0,1975.0
4,de la rosa,jorge,407822,delaj001,rosajo01,2047,2004.0,2018.0
5,de la rosa,rubby,523989,delar003,delarru01,3862,2011.0,2017.0
6,de la rosa,tomas,150401,delat001,delarto01,7227,2000.0,2006.0


Next we'll do pitchers with multiple spaces in their name. It looks like usually the first word after splitting is the first name, and everything else should be grouped into the last name.
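
A quick sketch of that splitting rule, using names that appear in the output above (the helper name `split_first_rest` is made up for illustration). It also shows where the rule is likely to fall short: for "Chan Ho Park" the last name becomes "Ho Park", and for "Tony Armas Jr." it becomes "Armas Jr.". Whether the lookup table stores those names that way is not checked here.

```python
def split_first_rest(name):
    """Split on the first space only: first word is the first name, the rest is the last name."""
    first, _, last = name.partition(' ')
    return first, last

for full_name in ['Jorge De La Rosa', 'Chan Ho Park', 'Tony Armas Jr.']:
    print(full_name, '->', split_first_rest(full_name))
```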

In [308]:
pitcher_keys_spaces = []

for name in pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()]['Name'].unique():
    try:
        first, last = name.split(' ', 1)
    except Exception as e:
        row = [name] + [None]*4
        pitcher_keys_spaces.append(row)
        continue

    # If you get a first and last name, look them up. If this returns more than one player,
    # record it and move on. If not, record their keys.
    pitcher_data = pyb.playerid_lookup(last, first)
    if pitcher_data.shape[0] > 1:
        row = [name] + [None]*4
        pitcher_keys_spaces.append(row)
        continue
    else:
        try:
            row = [name] + list(pitcher_data[['key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs']].values[0])
        except Exception as e:
            row = [name] + [None]*4
        pitcher_keys_spaces.append(row)

    # Sleep for one second to avoid rate limiting
    time.sleep(1)

Gathering player lookup table. This may take a moment.

In [309]:
pitcher_keys_spaces

[['Adam Eaton', None, None, None, None],
 ['Bobby Jones', None, None, None, None],
 ['Brian Anderson', None, None, None, None],
 ['Carlos Martinez', None, None, None, None],
 ['Chan Ho Park', None, None, None, None],
 ['Charlie Morton', None, None, None, None],
 ['Chris Carpenter', None, None, None, None],
 ['Chris Young', None, None, None, None],
 ['Cliff Lee', None, None, None, None],
 ['Danys Baez', None, None, None, None],
 ['Doug Davis', None, None, None, None],
 ['Eduardo Rodriguez', None, None, None, None],
 ['Erasmo Ramirez', None, None, None, None],
 ['Freddy Garcia', None, None, None, None],
 ['Greg Smith', None, None, None, None],
 ['Hyun-Jin Ryu', None, None, None, None],
 ['Jae Seo', None, None, None, None],
 ['James McDonald', None, None, None, None],
 ['Jason Johnson', None, None, None, None],
 ["Jeff D'Amico", None, None, None, None],
 ['John Patterson', None, None, None, None],
 ['Jon Niese', None, None, None, None],
 ['Jorge De La Rosa', 407822, 'delaj001', 'rosajo01'

In [310]:
pitchers_keys_df = pd.DataFrame(pitcher_keys_spaces, columns=['Name', 'key_mlbam', 'key_retro', 'key_bbref', 'key_fangraphs'])
pitchers_keys_df = pitchers_keys_df.set_index('Name')
pitchers_summary_df = pitchers_summary_df.set_index('Name')
pitchers_summary_df.update(pitchers_keys_df)

In [311]:
pitchers_keys_df = pitchers_keys_df.reset_index()
pitchers_summary_df = pitchers_summary_df.reset_index()

In [312]:
pitchers_summary_df[pitchers_summary_df['key_mlbam'].isna()].shape[0]

39

Fixed two more. I'm guessing most of the remaining ones are just people who appear more than once, or (maybe) don't appear at all (is that possible? I assume BR should have everyone).
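
If the remaining misses really are names shared by several players, one possible way to narrow them down (not something this notebook does) is to keep only the lookup rows whose `mlb_played_first` / `mlb_played_last` range covers the seasons in `pitchers_summary_df`. The sketch below reuses the rows from the `playerid_lookup('De La Rosa')` output earlier in this notebook; the 2008-2010 season window and the helper name `filter_by_career_window` are assumptions made for illustration.

```python
import pandas as pd

# Rows copied from the playerid_lookup('De La Rosa') output shown earlier.
candidates = pd.DataFrame({
    'name_first': ['dane', 'eury', 'francisco', 'jesus', 'jorge', 'rubby', 'tomas'],
    'key_bbref': ['delarda01', 'delareu01', 'dela_fr01', 'dela_je01',
                  'rosajo01', 'delarru01', 'delarto01'],
    'mlb_played_first': [2011.0, 2013.0, 1991.0, 1975.0, 2004.0, 2011.0, 2000.0],
    'mlb_played_last': [2014.0, 2014.0, 1991.0, 1975.0, 2018.0, 2017.0, 2006.0],
})

def filter_by_career_window(candidates, first_season, last_season):
    """Keep candidates whose MLB career spans the whole [first_season, last_season] range."""
    active = (
        (candidates['mlb_played_first'] <= first_season)
        & (candidates['mlb_played_last'] >= last_season)
    )
    return candidates[active]

# Hypothetical window: a pitcher seen in our data from 2008 through 2010.
match = filter_by_career_window(candidates, first_season=2008, last_season=2010)
print(match[['name_first', 'key_bbref']])
```

A match is only safe to use when exactly one row survives the filter; two same-named players with overlapping careers would still need a manual check.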

In [313]:
pitchers_summary_df.to_csv('pitchers_summary.csv', index=False)

## Fetch pitchers game-by-game data from BR

pybaseball doesn't seem to give you access to game-by-game stats for pitchers, which I need, so I'm taking their code and modifying it to pull directly from Baseball-Reference (BR). Note that this _does_ include ERA, but _doesn't_ include WHIP.
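
Since WHIP is (walks + hits) per inning pitched, it can be derived from the `BB`, `H`, and `IP` game-log columns. The sketch below is one way to get WHIP over a pitcher's previous few appearances. It assumes `IP` uses the usual box-score notation where `.1` and `.2` mean one and two outs (thirds of an inning), and it runs on made-up rows rather than real scraped data; the helper names are hypothetical.

```python
import pandas as pd

def ip_to_outs(ip):
    """Convert box-score innings such as '6.1' (6 and 1/3 innings) to outs recorded."""
    whole, _, frac = str(ip).partition('.')
    return int(whole) * 3 + int(frac or 0)

def rolling_whip(games, window=3):
    """WHIP over the previous `window` games, excluding the current game."""
    outs = games['IP'].map(ip_to_outs)
    baserunners = games['BB'].astype(int) + games['H'].astype(int)
    # Sum over the trailing window, then shift so each row only sees earlier games
    outs_sum = outs.rolling(window, min_periods=1).sum().shift(1)
    runners_sum = baserunners.rolling(window, min_periods=1).sum().shift(1)
    return runners_sum / (outs_sum / 3)

# Made-up game log rows, stored as strings like the scraped columns.
toy_log = pd.DataFrame({
    'IP': ['6.0', '5.1', '7.0', '6.2'],
    'H': ['5', '7', '4', '6'],
    'BB': ['2', '3', '1', '2'],
})
whip = rolling_whip(toy_log, window=3)
print(whip)
```

The first row is NaN because there are no earlier games to look back on. The shift matters for modelling: a feature for a given start should only use information available before that start.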

In [168]:
def pitcher_bref(br_id, season):
    """
    Get game-by-game pitching statistics for a specific pitcher (from Baseball-Reference)
    ARGUMENTS:
    br_id : str : The BR unique identifier. You can get this from playerid_lookup in the key_bbref column
    season : int : season you want data for (data is returned on a game-by-game basis)
    """

    url = f"https://www.baseball-reference.com/players/gl.fcgi?id={br_id}&t=p&year={season}"

    data = []
    headings = None
    stats_url = url
    response = requests.get(stats_url)
    soup = BeautifulSoup(response.content, 'html.parser')

    table = soup.find_all('table', {'id': 'pitching_gamelogs'})[0]

    if headings is None:
        headings = [row.text.strip() for row in table.find_all('th')[1:50]]

    rows = table.find_all('tr')
    # Skip the last row, as this is a footer with only yearly summary data
    for row in rows[:-1]:
        cols = row.find_all('td')
        cols = [ele.text.strip() for ele in cols]
        cols = [col.replace('*', '').replace('#', '') for col in cols]  # Removes '*' and '#' from some names
        cols = [col for col in cols if 'Totals' not in col and 'NL teams' not in col and 'AL teams' not in col]  # Removes Team Totals and other rows
        cols.insert(2, season)
        data.append([ele for ele in cols[0:]])

    headings.insert(2, "Year")
    data = pd.DataFrame(data=data, columns=headings) # [:-5]  # -5 to remove Team Totals and other rows (didn't work in multi-year queries)
    data.columns = [x if x != '' else 'at' for x in data.columns]
    data = data.dropna()  # Removes Row of All Nones
    data.reset_index(drop=True, inplace=True)  # Fixes index issue (Index was named 'W" for some reason)
    
    return data

In [169]:
test_df = pitcher_bref('cookaa01', 2002)

In [170]:
test_df.head()

Unnamed: 0,Gcar,Gtm,Year,Date,Tm,at,Opp,Rslt,Inngs,Dec,...,GDP,SF,ROE,aLI,WPA,acLI,cWPA,RE24,Entered,Exited
0,1,116,2002.0,Aug 10,COL,,CHC,"L,1-15",6-7,,...,0,0,0,0.0,0.0,0.0,0.00%,0.14,6t --- 0 out d13,7t 3 out d14
1,2,121,2002.0,Aug 16,COL,@,ATL,"L,1-4",8-GF(8),,...,0,0,0,0.14,0.007,0.0,0.00%,0.51,8b --- 0 out d3,8b end d 3
2,3,125,2002.0,Aug 20,COL,,MON,"W,8-6",6-7,,...,0,0,0,0.51,-0.094,0.0,0.00%,-2.12,6t --- 0 out a5,7t 1-- 2 out a2
3,4,127,2002.0,Aug 22,COL,,MON,"W,14-6",6-6,H(1),...,0,0,0,1.17,0.054,0.01,0.00%,0.57,6t --- 0 out a3,6t 3 out a3
4,5,131,2002.0,Aug 26,COL,,SFG,"L,3-4",GS-7,,...,4,0,0,0.8,-0.098,0.0,0.00%,-0.6,1t start tie,7t 12- 0 out d2


Examples of this data can be seen [here](https://www.baseball-reference.com/players/gl.fcgi?id=cookaa01&t=p&year=2002).

Descriptions of columns:
- Gcar -- Career Game Number for Player
- Gtm -- Season Game Number for Team. Number in parentheses indicates number of team games the player did not play in from one appearance to next.
- Date -- A number in parentheses indicates which game of a doubleheader.
- Rslt -- Game Result for Team. W - Win, L - Loss, T - Tie (for a suspended game)
- Inngs -- Innings Played by Player
    - CG - Complete Game started and finished
    - GS-# - Game Started to what inning
    - #-GF - Inning entered to end of game
    - #-# - Inning Entered to Inning Left
    - (#) Game did not go 9 innings (only shown when player finished the game).
    - For pitchers, an SHO means they shut out the opposition. A zero for the innings means the innings played is unknown.
- Dec -- Decision, Save, or Hold
    - W - Win (pitcher record after game)
    - L - Loss (pitcher record after game)
    - BW - Blown Save and Win (pitcher record after game)
    - BL - Blown Save and Loss (pitcher record after game)
    - S - Save (pitcher saves thus far)
    - BSv - Blown Save (pitcher blown saves thus far)
    - H - Hold (pitcher holds thus far)
- DR -- Days Rest. Number of days since their previous appearance. 99 if start of season or 99 or more days (may include demotions). -1 if pitching both games of double-header.
- IP -- Innings Pitched
- H -- Hits/Hits Allowed
- R -- Runs Scored/Allowed
- ER -- Earned Runs Allowed
- BB -- Bases on Balls/Walks
- SO -- Strikeouts
- HR -- Home Runs Hit/Allowed
- HBP -- Times Hit by a Pitch.
- ERA -- 9 * ER / IP. For recent years, leaders need 1 IP per team game played.
- BF -- Batters Faced
- Pit -- Number of pitches in the PA.
- Str -- Strikes. Includes both pitches in the zone and those swung at out of the zone.
- StL -- Strikes Looking. Strikes called by the umpire.
- StS -- Strikes Swinging. Strikes due to a swing and a miss.
- GB -- Ground Balls. Includes bunts and all other ground balls.
- FB -- Fly Balls. Includes Fly Balls, Line Drives, and Pop-Ups.
- LD -- Line Drives. These are double-counted in Fly Balls as well.
- PU -- Pop Ups. Generally, high fly balls that land within the infield circle. These are double-counted in Fly Balls as well.
- Unk -- Unknown batted ball type. A ball in play for which we don’t know the type.
- GSc -- Game Score. Developed by Bill James
    1. Start with 50 points.
    2. Add 1 point for each out recorded, so 3 points for every complete inning pitched.
    3. Add 2 points for each inning completed after the 4th.
    4. Add 1 point for each strikeout.
    5. Subtract 2 points for each hit allowed.
    6. Subtract 4 points for each earned run allowed.
    7. Subtract 2 points for each unearned run allowed.
    8. Subtract 1 point for each walk.
- IR -- Inherited Runners. Number of runners on base when pitcher entered the game.
- IS -- Inherited Score. Number or percentage of runners on base when pitcher entered the game who subsequently scored. These runners show up in the previous pitcher’s ERA.
- SB -- Stolen Bases
- CS -- Caught Stealing
- PO -- Pickoffs. Runner picked off a base. May include cases they were safe on an error. Also includes Pickoff Caught Stealing plays.
- AB -- At Bats
- 2B -- Doubles Hit/Allowed
- 3B -- Triples Hit/Allowed
- IBB -- Intentional Bases on Balls
- GDP -- Double Plays Grounded Into. Only includes standard 6-4-3, 4-3, etc. double plays. For gamelogs only in seasons we have play-by-play, we include triple plays as well. All official seasonal totals do not include GITP's.
- SF -- Sacrifice Flies
- ROE -- Reached On Error. Times a batter reached due to an error. DOES NOT include a fielder’s choice where no out was recorded.
- aLI -- Average Leverage Index. The average pressure the pitcher or batter saw in this game or season. 1.0 is average pressure, below 1.0 is low pressure and above 1.0 is high pressure.
- WPA -- Win Probability Added by Pitcher. Given average teams, this is the change in probability. A change of +/- 1 would indicate one win added or lost.
- acLI -- Average Championship Leverage Index. The average pressure the pitcher or batter saw in this game or season. 1.0 is average pressure, below 1.0 is low pressure and above 1.0 is high pressure.
- cWPA -- Championship Win Probability Added by Pitcher. Given average teams, this is the change in probability, displayed in percentage points. A change of +/- 100% would indicate one world series win added or lost.
- RE24 -- Base-Out Runs Saved. Given the bases occupied/out situation, how many runs did the pitcher save in the resulting play. Compared to average, so 0 is average, and above 0 is better than average
- Entered -- The situation when pitcher entered game. 
    - Inning top or bottom: 8b (bottom of 8th) 
    - bases occupied or start of inning: ’---’ (bases empty) 
    - score from pitching team’s perspective 
        - ahead/down and runs or tie: a4 (ahead by 4 runs) 
- Exited -- The situation when pitcher exited game
    - Inning top or bottom: 4t (top of 4th)
    - bases occupied, 3 outs, or end of game: ’123’ (bases loaded)
    - score from pitching team’s perspective
        - ahead/down and runs or tie: d2 (down by 2 runs)
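
As a worked example of the Game Score (GSc) steps listed above, here is a minimal sketch that applies the eight steps in order. The stat line is made up, and unearned runs are taken as `R - ER`.

```python
def game_score(outs, strikeouts, hits, earned_runs, runs, walks):
    """Game Score computed from the eight steps in the GSc description."""
    innings_completed = outs // 3
    score = 50                                   # 1. start with 50 points
    score += outs                                # 2. +1 per out recorded
    score += 2 * max(innings_completed - 4, 0)   # 3. +2 per inning completed after the 4th
    score += strikeouts                          # 4. +1 per strikeout
    score -= 2 * hits                            # 5. -2 per hit allowed
    score -= 4 * earned_runs                     # 6. -4 per earned run
    score -= 2 * (runs - earned_runs)            # 7. -2 per unearned run
    score -= walks                               # 8. -1 per walk
    return score

# Hypothetical start: 7.0 IP (21 outs), 8 SO, 4 H, 1 R (earned), 2 BB
example = game_score(outs=21, strikeouts=8, hits=4, earned_runs=1, runs=1, walks=2)
print(example)  # 71
```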