# P2 Project Submission
### By Garrett Busch 
### Mar 2017

## Introduction 

Baseball has long been considered America's pastime. It's hard to argue with that, considering its continued success, worldwide intrigue, coliseum-like ballparks and the oh-so-unforgettable comfort food. Look past these attractions and you're left with a game, just a game. This game has, for the most part, been played almost exactly the same way for 100+ years. One of the major reasons baseball has remained the same (aside from some mound changes, strike-zone variation and equipment technology) is the prevalence of statistics.

In today's game, we have what are called sabermetrics, which compare anything from offensive prowess to defensive efficiency to pitcher accuracy and pitch selection. In this project we will engineer our own metric for offensive ability, working with a dataset geared toward traditional baseball data points (Hit, At Bat, Home Run, etc.). The purpose of this project is to provide a logical data analysis that ultimately helps answer the questions I pose.

Let's get started.

http://www.billjamesonline.com/article785/

http://pandas.pydata.org/pandas-docs/stable/merging.html

## Questions to be answered

In terms of individual and team performance, what were some of the greatest disparities between an individual's season and the next best on the team? (an unequivocal MVP)

Which players, over the course of their careers, consistently faced the situation described above?
[See answer](#question1)

Did the above scenarios show any interesting trends when it comes to individual end-of-season awards?

How did teams fare that matched this particular condition? Can a player truly "carry" the whole team?

Aside from comparing individual vs. team, how did players born in "warm" weather states fare in comparison to those that weren't?

## Verifying and checking data

Generally, we'd like to take an initial dive into the data to identify holes, anomalies, etc.

We'll need to import some initial libraries.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
plt.style.use('ggplot')
%matplotlib inline

Let's also include the links to the associated data.

In [2]:
GHreposit = "https://raw.githubusercontent.com/garrettbusch15/P2-Baseball-Analysis-Project/"
subfolder = "master/Data/"

Now read the data in.

In [3]:
# Shift-Tab to see parameters/help for function
fileMaster = pd.read_csv(GHreposit + subfolder + 'Master.csv')
fileBatting = pd.read_csv(GHreposit + subfolder + 'Batting.csv')
fileAppearance = pd.read_csv(GHreposit + subfolder + 'Appearances.csv')
fileSalaries = pd.read_csv(GHreposit + subfolder + 'Salaries.csv')
fileHOF = pd.read_csv(GHreposit + subfolder + 'HallOfFame.csv')
fileAwards = pd.read_csv(GHreposit + subfolder + 'AwardsPlayers.csv')

In [4]:
fileMaster.head(5)

Unnamed: 0,playerID,birthYear,birthMonth,birthDay,birthCountry,birthState,birthCity,deathYear,deathMonth,deathDay,...,nameLast,nameGiven,weight,height,bats,throws,debut,finalGame,retroID,bbrefID
0,aardsda01,1981.0,12.0,27.0,USA,CO,Denver,,,,...,Aardsma,David Allan,205.0,75.0,R,R,4/6/2004,9/28/2013,aardd001,aardsda01
1,aaronha01,1934.0,2.0,5.0,USA,AL,Mobile,,,,...,Aaron,Henry Louis,180.0,72.0,R,R,4/13/1954,10/3/1976,aaroh101,aaronha01
2,aaronto01,1939.0,8.0,5.0,USA,AL,Mobile,1984.0,8.0,16.0,...,Aaron,Tommie Lee,190.0,75.0,R,R,4/10/1962,9/26/1971,aarot101,aaronto01
3,aasedo01,1954.0,9.0,8.0,USA,CA,Orange,,,,...,Aase,Donald William,190.0,75.0,R,R,7/26/1977,10/3/1990,aased001,aasedo01
4,abadan01,1972.0,8.0,25.0,USA,FL,Palm Beach,,,,...,Abad,Fausto Andres,184.0,73.0,L,L,9/10/2001,4/13/2006,abada001,abadan01


The master seems like the best place to start. We have a few different field types in the Master dataframe, including text, dates and floating-point numbers. There are a few fields worth filling out even if we may not necessarily use them directly in this analysis.

In [5]:
# 1/24/15 is the date of this data's publishing
fileMaster['finalGame'].fillna('1/24/2015', inplace=True)
fileMaster['debut'] = pd.to_datetime(fileMaster['debut'])
fileMaster['finalGame'] = pd.to_datetime(fileMaster['finalGame'])

We will also add a few fields that can be calculated from what we have available and that may be useful down the road.

In [6]:
fileMaster['careerLength'] = fileMaster['finalGame'] - fileMaster['debut']
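
`careerLength` comes out as a pandas timedelta (shown in days when printed). If a figure in years is easier to read, it can be derived from the day count. The sketch below is only an illustration on a toy frame, using the Aardsma debut and final-game dates from the Master preview above; `careerYears` is a hypothetical column name, and 365.25 days per year is an approximation.

```python
import pandas as pd

# Toy frame with one player's dates (David Aardsma, from the Master preview above)
toy = pd.DataFrame({'debut': ['4/6/2004'], 'finalGame': ['9/28/2013']})
toy['debut'] = pd.to_datetime(toy['debut'], format='%m/%d/%Y')
toy['finalGame'] = pd.to_datetime(toy['finalGame'], format='%m/%d/%Y')

# Subtracting two datetime columns gives a timedelta column
toy['careerLength'] = toy['finalGame'] - toy['debut']

# careerYears is a hypothetical helper column; 365.25 approximates leap years
toy['careerYears'] = toy['careerLength'].dt.days / 365.25
```

For this row the difference is 3462 days, or roughly 9.5 years.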

One of the questions outlined above compares the performance of 'warm-weather born players' with all the others. The 'warm-weather' states are listed below and are generally the southern half of the US.

In [7]:
fair_weather_stats = ['CA','TX','FL','AZ','NV','NM', 'GA', 'LA', 'AL', 'MS']

In [8]:
fileMaster['StateWeather'] = fileMaster['birthState'].map(lambda x: True if x in fair_weather_stats else False)

At this point, below is what the master dataframe looks like:

In [9]:
fileMaster.head(1)

Unnamed: 0,playerID,birthYear,birthMonth,birthDay,birthCountry,birthState,birthCity,deathYear,deathMonth,deathDay,...,weight,height,bats,throws,debut,finalGame,retroID,bbrefID,careerLength,StateWeather
0,aardsda01,1981.0,12.0,27.0,USA,CO,Denver,,,,...,205.0,75.0,R,R,2004-04-06,2013-09-28,aardd001,aardsda01,3462 days,False


We can now round out the batting dataframe.
For this there are a number of common statistics, i.e. batting average, on-base percentage, slugging and OPS, that are not included in the data. (The column is named `statOPS+` below, but it holds plain OPS, the sum of on-base percentage and slugging, rather than the league- and park-adjusted OPS+.) We will also add one measurement, the product of ABs and OPS, which attempts to model the offensive effort a player contributed in a season. It is this measurement that we will use to gauge the value added by a player's contribution. A more comprehensive dataset would support a fuller value metric such as "WAR", or wins above replacement.

*Generally, these statistics are rounded to 3 decimal places for presentation.

**The below resulting dataframe essentially gives us performance by player by season.
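
For reference, the statistics computed in the following cells can be written out as below, where $1B = H - 2B - 3B - HR$ is the number of singles. This is a restatement of the `f_Avg`, `f_Obp` and `f_Slug` functions and of the two derived columns, not an additional definition; note that the `IBB` argument is accepted by `f_Obp` but does not enter the formula.

$$
\begin{aligned}
\text{AVG} &= \frac{H}{AB} \\
\text{OBP} &= \frac{H + BB + HBP}{AB + BB + HBP + SF} \\
\text{SLG} &= \frac{1B + 2\cdot 2B + 3\cdot 3B + 4\cdot HR}{AB} \\
\text{OPS} &= \text{OBP} + \text{SLG} \\
\text{OpsWght} &= \text{OPS} \times AB
\end{aligned}
$$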

In [10]:
def c_num(s):
    try:
        return float(s)
    except Exception:
        return 0
def f_Avg(AB, H):
    return round(H / AB,3)
def f_Obp(H,BB,IBB,HBP,SF,AB):
    n = (H + BB + HBP)
    d = (AB + BB + HBP + SF)
    return round(n / d,3)
def f_Slug(H, Dbl, Trpl, HR, AB):
    return round(((H - Dbl - Trpl - HR) + (Dbl * 2) + (Trpl * 3) + (HR * 4)) / AB,3)

These functions will now be used to add the calculated columns to the batting dataframe.

In [11]:
# Insert common hitting statistics into batting dataframe
fileBatting.fillna(0, inplace=True)
fileBatting['statBA'] = fileBatting.apply(lambda row: f_Avg(row['AB'],row['H']) if row['AB'] != 0 else 0, axis=1)
fileBatting['statOBP'] = fileBatting.apply(lambda row: f_Obp(row['H'],row['BB'],row['IBB'],row['HBP'],row['SF'],row['AB']) if (row['AB']+ row['BB'] + row['HBP'] + row['SF']) != 0 else 0, axis=1)
fileBatting['statSLUG'] = fileBatting.apply(lambda row: f_Slug(row['H'],row['2B'],row['3B'],row['HR'],row['AB']) if row['AB'] != 0 else 0, axis=1)
fileBatting['statOPS+'] = fileBatting['statOBP'] + fileBatting['statSLUG']
fileBatting['statOpsWght'] = fileBatting.apply(lambda row: (row['statOPS+'] * row['AB']), axis=1)
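
The row-wise `apply` calls above work, but on a table the size of Batting.csv they evaluate a Python lambda once per row. A column-wise version is one possible alternative. The sketch below uses a made-up two-row frame (`toy`) rather than the real data, and replaces zero denominators with NaN so that rows without at-bats end up as 0, as in the cell above.

```python
import numpy as np
import pandas as pd

# Toy batting rows (made-up numbers, not taken from Batting.csv)
toy = pd.DataFrame({
    'AB':  [500.0, 0.0],
    'H':   [150.0, 0.0],
    '2B':  [30.0, 0.0],
    '3B':  [5.0, 0.0],
    'HR':  [20.0, 0.0],
    'BB':  [50.0, 0.0],
    'HBP': [5.0, 0.0],
    'SF':  [5.0, 0.0],
})

# Zero denominators become NaN so the division gives NaN, which is then set to 0
ab = toy['AB'].replace(0, np.nan)
pa = (toy['AB'] + toy['BB'] + toy['HBP'] + toy['SF']).replace(0, np.nan)

# Singles + 2*doubles + 3*triples + 4*HR simplifies to H + 2B + 2*3B + 3*HR
total_bases = toy['H'] + toy['2B'] + 2 * toy['3B'] + 3 * toy['HR']

toy['statBA'] = (toy['H'] / ab).round(3).fillna(0)
toy['statOBP'] = ((toy['H'] + toy['BB'] + toy['HBP']) / pa).round(3).fillna(0)
toy['statSLUG'] = (total_bases / ab).round(3).fillna(0)
toy['statOpsWght'] = (toy['statOBP'] + toy['statSLUG']) * toy['AB']
```

The first toy row comes out at .300 / .366 / .500, and the row with no at-bats is all zeros.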

We will also add a True/False column for the MVP award.

What different types of awards are in this dataset? (fileAwards)

In [12]:
fileAwards.awardID.unique()

array(['Pitching Triple Crown', 'Triple Crown',
       'Baseball Magazine All-Star', 'Most Valuable Player',
       'TSN All-Star', 'TSN Guide MVP',
       'TSN Major League Player of the Year', 'TSN Pitcher of the Year',
       'TSN Player of the Year', 'Rookie of the Year', 'Babe Ruth Award',
       'Lou Gehrig Memorial Award', 'World Series MVP', 'Cy Young Award',
       'Gold Glove', 'TSN Fireman of the Year', 'All-Star Game MVP',
       'Hutch Award', 'Roberto Clemente Award', 'Rolaids Relief Man Award',
       'NLCS MVP', 'ALCS MVP', 'Silver Slugger', 'Branch Rickey Award',
       'Hank Aaron Award', 'TSN Reliever of the Year',
       'Comeback Player of the Year'], dtype=object)

Since we want "Most Valuable Player" (in each respective league), we can code this as follows.

In [13]:
sub_batID = fileBatting[['playerID', 'yearID']]
sub_awdID = fileAwards.query('awardID=="Most Valuable Player"')[['playerID', 'yearID']]
fileBatting['MVP'] = sub_batID.isin(sub_awdID).all(axis=1)
fileBatting.query('playerID == "musiast01"')

Unnamed: 0,playerID,yearID,stint,teamID,lgID,G,AB,R,H,2B,...,HBP,SH,SF,GIDP,statBA,statOBP,statSLUG,statOPS+,statOpsWght,MVP
29609,musiast01,1941,1,SLN,NL,12,47.0,8.0,20.0,4.0,...,0.0,0.0,0.0,0.0,0.426,0.449,0.574,1.023,48.081,False
30164,musiast01,1942,1,SLN,NL,140,467.0,87.0,147.0,32.0,...,2.0,5.0,0.0,3.0,0.315,0.397,0.49,0.887,414.229,False
30707,musiast01,1943,1,SLN,NL,157,617.0,108.0,220.0,48.0,...,2.0,10.0,0.0,17.0,0.357,0.425,0.562,0.987,608.979,False
31263,musiast01,1944,1,SLN,NL,146,568.0,112.0,197.0,51.0,...,5.0,4.0,0.0,7.0,0.347,0.44,0.549,0.989,561.752,False
32497,musiast01,1946,1,SLN,NL,156,624.0,124.0,228.0,50.0,...,3.0,2.0,0.0,7.0,0.365,0.434,0.587,1.021,637.104,False
33100,musiast01,1947,1,SLN,NL,149,587.0,113.0,183.0,30.0,...,4.0,6.0,0.0,18.0,0.312,0.398,0.504,0.902,529.474,False
33669,musiast01,1948,1,SLN,NL,155,611.0,135.0,230.0,46.0,...,3.0,1.0,0.0,18.0,0.376,0.45,0.702,1.152,703.872,False
34240,musiast01,1949,1,SLN,NL,157,612.0,128.0,207.0,41.0,...,2.0,0.0,0.0,12.0,0.338,0.438,0.624,1.062,649.944,False
34819,musiast01,1950,1,SLN,NL,146,555.0,105.0,192.0,41.0,...,3.0,0.0,0.0,11.0,0.346,0.437,0.596,1.033,573.315,False
35424,musiast01,1951,1,SLN,NL,152,578.0,124.0,205.0,30.0,...,1.0,1.0,0.0,6.0,0.355,0.449,0.614,1.063,614.414,False


In [14]:
sub_awdID.query('playerID == "musiast01"')

Unnamed: 0,playerID,yearID
1589,musiast01,1943
1748,musiast01,1946
1844,musiast01,1948
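
Note that the `MVP` column in the batting output is False for 1943, 1946 and 1948, even though the award table lists exactly those seasons for Musial. When `DataFrame.isin` is given another DataFrame, it compares cells only where both the index label and the column label match, so batting rows are not matched against award rows by value. One possible alternative, sketched here on toy frames built from the rows shown above, tests each (playerID, yearID) pair against a set of award keys. The names `bat`, `mvp` and `mvp_keys` are chosen for this example only.

```python
import pandas as pd

# Toy stand-ins for fileBatting and the MVP subset of fileAwards
# (index labels and seasons copied from the Musial rows shown above)
bat = pd.DataFrame({
    'playerID': ['musiast01'] * 4,
    'yearID': [1943, 1944, 1946, 1948],
}, index=[30707, 31263, 32497, 33669])
mvp = pd.DataFrame({
    'playerID': ['musiast01'] * 3,
    'yearID': [1943, 1946, 1948],
}, index=[1589, 1748, 1844])

# Build a set of (playerID, yearID) keys and test each batting row against it
mvp_keys = set(zip(mvp['playerID'], mvp['yearID']))
bat['MVP'] = [key in mvp_keys for key in zip(bat['playerID'], bat['yearID'])]

# For comparison: the label-aligned isin finds no matches on these frames
label_aligned = bat[['playerID', 'yearID']].isin(mvp).all(axis=1)
```

With a key-based match, a player with several stints in an MVP season would be flagged on every stint row, which may or may not be what is wanted.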


Now we'd like to begin performing some useful analysis/comparisons. In order to do so we will need to summarize this data across multiple seasons, teams & stats.

First, we will outline the stats we will summarize.

In [15]:
stats_to_summarize = {
    # 'G':0,
    # 'AB':0,
    'R':0,
    'H':0,
    '2B':0,
    '3B':0,
    'HR':0,
    'BB':0,
    'RBI':0,
    'SO':0,
    'IBB':0,
    'HBP':0,
    'SH':0,
    'SF':0
    # 'lgAvg':[]
    # 'lgObp':[]
    # 'lgSlug':[]
    # 'lgOps+':[]
    # 'lgOpsWght':[]
}

Next, we need a procedure that will pull a unique list of  years (from batting data) and summarize across the league how players performed in that particular season.

In [16]:
def season_stat_compiliation():
    dicBat = {}
    for year in pd.unique(fileBatting.yearID.ravel()):
        dicBat[year] = season_offense_summary(year)
    return pd.DataFrame(dicBat)

In [17]:
def season_offense_summary(year):
    dicSum = {}
    dicSum['Players'] = len(fileBatting.loc[fileBatting['yearID'] == year])
    dicSum['AB'] = fileBatting.loc[(fileBatting['yearID'] == year), 'AB'].sum()
    dicSum['G'] = fileBatting.loc[(fileBatting['yearID'] == year), 'G'].sum()
    for stat in stats_to_summarize.keys():
        dicSum[stat] = fileBatting.loc[(fileBatting['yearID'] == year), stat].sum()
        dicSum[stat + '_G'] = round(fileBatting.loc[(fileBatting['yearID'] == year), stat].sum() / dicSum['G'],4)
        dicSum[stat + '_AB'] = round(fileBatting.loc[(fileBatting['yearID'] == year), stat].sum() / dicSum['AB'],4)
    dicSum['lgAvg'] = f_Avg(dicSum['AB'],dicSum['H'])
    dicSum['lgObp'] = f_Obp(dicSum['H'],dicSum['BB'],dicSum['IBB'], dicSum['HBP'],dicSum['SF'],dicSum['AB'])
    dicSum['lgSlug'] = f_Slug(dicSum['H'],dicSum['2B'],dicSum['3B'],dicSum['HR'],dicSum['AB'])
    dicSum['lgOps+'] = dicSum['lgObp'] + dicSum['lgSlug']
    dicSum['lgOpsWght'] = dicSum['AB'] * dicSum['lgOps+']
    
    return dicSum
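
The summary above filters `fileBatting` by year once per statistic. As a hedged alternative, the same totals can be produced with a single `groupby`. The sketch below runs on a made-up frame with only a few of the columns; it yields seasons as rows, whereas `season_stat_compiliation` returns them as columns (`.T` would transpose).

```python
import pandas as pd

# Toy batting rows (made-up numbers) standing in for fileBatting
toy = pd.DataFrame({
    'yearID': [1871, 1871, 1872],
    'G':  [10, 20, 30],
    'AB': [40.0, 60.0, 100.0],
    'H':  [10.0, 20.0, 25.0],
    'HR': [1.0, 0.0, 2.0],
})

# One pass over the data: sum every counting stat per season
season_totals = toy.groupby('yearID')[['G', 'AB', 'H', 'HR']].sum()

# Number of batting rows per season (the 'Players' entry above)
season_totals['Players'] = toy.groupby('yearID').size()

# Per-game and per-at-bat rates, mirroring the '_G' and '_AB' keys
for stat in ['H', 'HR']:
    season_totals[stat + '_G'] = (season_totals[stat] / season_totals['G']).round(4)
    season_totals[stat + '_AB'] = (season_totals[stat] / season_totals['AB']).round(4)

# League batting average from the season totals
season_totals['lgAvg'] = (season_totals['H'] / season_totals['AB']).round(3)
```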

We can then run the above.

In [18]:
league_stat_summary = season_stat_compiliation()

In [19]:
league_stat_summary.head(1)

Unnamed: 0,1871,1872,1873,1874,1875,1876,1877,1878,1879,1880,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
2B,434.0,567.0,556.0,633.0,839.0,633.0,431.0,481.0,958.0,980.0,...,8863.0,9135.0,9197.0,9014.0,8737.0,8486.0,8399.0,8261.0,8222.0,8137.0


Now let's combine the master and batting data into a single dataframe. Merging on the 'playerID' field will give us the desired result.

In [20]:
master_player_season = pd.merge(fileMaster, fileBatting, on='playerID', how='outer')
master_player_season.head(1)

Unnamed: 0,playerID,birthYear,birthMonth,birthDay,birthCountry,birthState,birthCity,deathYear,deathMonth,deathDay,...,HBP,SH,SF,GIDP,statBA,statOBP,statSLUG,statOPS+,statOpsWght,MVP
0,aardsda01,1981.0,12.0,27.0,USA,CO,Denver,,,,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,False


You can see from the above that the columns at the end are the ones we calculated in the batting dataframe; they, along with the other batting fields, have been layered onto the master data.
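
Because the merge uses `how='outer'`, players that appear in only one of the two tables are kept, with NaN in the columns coming from the other table. One way to see how many rows fall into each case is the `indicator` option of `pd.merge`. A minimal sketch on toy frames (the player IDs and names are made up):

```python
import pandas as pd

# Toy stand-ins for fileMaster and fileBatting
master = pd.DataFrame({'playerID': ['p1', 'p2'], 'nameLast': ['One', 'Two']})
batting = pd.DataFrame({'playerID': ['p1', 'p1', 'p3'], 'yearID': [2000, 2001, 2000]})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'
merged = pd.merge(master, batting, on='playerID', how='outer', indicator=True)
counts = merged['_merge'].value_counts()
```

Here `p2` has no batting rows (`left_only`) and `p3` has no master record (`right_only`); rows like these are the ones that would carry NaN through the later steps.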

Next, we can begin summarizing at the team level.

*We define new functions for avg/obp/slug here, as we are looking up values based on two variables (year and team).

In [21]:
def f_Parse_AVG(yr, team):
    tmp = fileBatting[(fileBatting.teamID == team) & (fileBatting.yearID == yr)].sum()
    H = tmp['H']
    AB = tmp['AB']
    return f_Avg(AB, H)
def f_Parse_OBP(yr, team):
    tmp = fileBatting[(fileBatting.teamID == team) & (fileBatting.yearID == yr)].sum()
    H = tmp['H']
    BB = tmp['BB']
    IBB = tmp['IBB']
    HBP = tmp['HBP']
    SF = tmp['SF']
    AB = tmp['AB']
    return f_Obp(H,BB,IBB,HBP,SF,AB)
def f_Parse_SLUG(yr, team):
    tmp = fileBatting[(fileBatting.teamID == team) & (fileBatting.yearID == yr)].sum()
    H = tmp['H']
    AB = tmp['AB']
    Dbl = tmp['2B']
    Trpl = tmp['3B']
    HR = tmp['HR']
    return f_Slug(H,Dbl,Trpl,HR,AB)


In [22]:
def unique_team_season():
    tms = fileBatting[['yearID', 'teamID']].copy()
    tms.drop_duplicates(inplace=True)   
    tms['team_BA'] = tms.apply(lambda w: f_Parse_AVG(w['yearID'],w['teamID']), axis=1)
    tms['team_OBP'] = tms.apply(lambda w: f_Parse_OBP(w['yearID'],w['teamID']), axis=1)
    tms['team_SLUG'] = tms.apply(lambda w: f_Parse_SLUG(w['yearID'],w['teamID']), axis=1)
    tms['team_OPS+'] = tms['team_OBP'] + tms['team_SLUG']
    tms['team_OpsWght'] = tms.apply(lambda w: (w['team_OPS+'] * fileBatting[(fileBatting.teamID == w['teamID']) & (fileBatting.yearID == w['yearID'])].sum()['AB']), axis=1)
    return tms

In [23]:
team_stat_summary = unique_team_season()
team_stat_summary.head(1)

Unnamed: 0,yearID,teamID,team_BA,team_OBP,team_SLUG,team_OPS+,team_OpsWght
0,1871,TRO,0.308,0.334,0.417,0.751,937.248


From the finished result above we have the relevant metrics each team produced by season.

This is critical for our next step, where we compare an individual's performance with that of his teammates.

In [24]:
master_player_team = pd.merge(master_player_season, team_stat_summary, on=['yearID','teamID'], how='outer')
master_player_team['%_team_OPS+'] = master_player_team['statOpsWght'] / master_player_team['team_OpsWght']

In [25]:
master_player_team.head(1)

Unnamed: 0,playerID,birthYear,birthMonth,birthDay,birthCountry,birthState,birthCity,deathYear,deathMonth,deathDay,...,statSLUG,statOPS+,statOpsWght,MVP,team_BA,team_OBP,team_SLUG,team_OPS+,team_OpsWght,%_team_OPS+
0,aardsda01,1981.0,12.0,27.0,USA,CO,Denver,,,,...,0.0,0.0,0.0,False,0.27,0.357,0.438,0.795,4409.07,0.0


In the above we've now created a 'master' relational dataframe from individual and team data. From this dataframe we've calculated a % column giving the player's OPS-weighted value as a share of the overall team's output.
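
The team-level steps above (team totals, the team weight, then the player's share) can also be sketched with one `groupby` and one merge instead of a filtered lookup per team. The example below is an illustration on made-up numbers, not the notebook's actual pipeline: `ops_weight`, `TB` (total bases) and `share` are names introduced here.

```python
import pandas as pd

# Toy player-season rows (made-up numbers) standing in for fileBatting
toy = pd.DataFrame({
    'playerID': ['a', 'b', 'c'],
    'yearID': [1948, 1948, 1948],
    'teamID': ['SLN', 'SLN', 'BRO'],
    'AB': [600.0, 400.0, 500.0],
    'H': [210.0, 100.0, 150.0],
    'BB': [60.0, 40.0, 50.0],
    'HBP': [0.0, 0.0, 0.0],
    'SF': [0.0, 0.0, 0.0],
    'TB': [390.0, 150.0, 250.0],  # total bases, precomputed to keep the toy short
})

def ops_weight(df):
    """OPS (OBP + SLG, each rounded to 3 places) multiplied by at-bats."""
    obp = ((df['H'] + df['BB'] + df['HBP']) /
           (df['AB'] + df['BB'] + df['HBP'] + df['SF'])).round(3)
    slug = (df['TB'] / df['AB']).round(3)
    return (obp + slug) * df['AB']

keys = ['yearID', 'teamID']
counting = ['AB', 'H', 'BB', 'HBP', 'SF', 'TB']

# Team totals in a single groupby, then the same weight formula on the totals
team = toy.groupby(keys)[counting].sum()
team['team_OpsWght'] = ops_weight(team)

# Player weights, then attach the team weight and take the ratio
toy['statOpsWght'] = ops_weight(toy)
toy = toy.merge(team[['team_OpsWght']].reset_index(), on=keys, how='left')
toy['share'] = toy['statOpsWght'] / toy['team_OpsWght']
```

Because the team weight is computed from team totals rather than by adding up player weights, the shares on a team need not sum to exactly 1 (here the two SLN players add up to about 0.9996).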

Below are at least some initial insights that are quite interesting.

In [26]:
def df_lookup(ind):
    # .loc replaces the deprecated .ix indexer for label-based lookups
    return '%s %s @ %s in %s' % (master_player_team.loc[ind, 'nameFirst'],
                                 master_player_team.loc[ind, 'nameLast'],
                                 round(master_player_team.loc[ind, '%_team_OPS+'], 3),
                                 master_player_team.loc[ind, 'yearID'])

In [27]:
time_filter = 1970
placement = master_player_team.loc[(master_player_team['yearID']>time_filter),'%_team_OPS+'].idxmax()
print(df_lookup(placement))

Sammy Sosa @ 0.164 in 2001.0


<a id='question1'></a>
Now we can grab some answers to our questions.

In [28]:
master_player_team[(master_player_team.yearID > 1900)].sort_values('%_team_OPS+', ascending=False).head(5)[['playerID','nameFirst', 'nameLast', '%_team_OPS+', 'yearID', 'MVP']]

Unnamed: 0,playerID,nameFirst,nameLast,%_team_OPS+,yearID,MVP
72023,musiast01,Stan,Musial,0.182107,1948.0,False
93237,cobbty01,Ty,Cobb,0.17421,1917.0,False
13890,lajoina01,Nap,Lajoie,0.173805,1910.0,False
100028,stonege01,George,Stone,0.172135,1906.0,False
80988,kleinch01,Chuck,Klein,0.169715,1933.0,False


In [33]:
master_player_team.query('MVP == True')

Unnamed: 0,playerID,birthYear,birthMonth,birthDay,birthCountry,birthState,birthCity,deathYear,deathMonth,deathDay,...,statSLUG,statOPS+,statOpsWght,MVP,team_BA,team_OBP,team_SLUG,team_OPS+,team_OpsWght,%_team_OPS+


If we roll this out to seasons after 1900, which can generally be taken as the point when baseball began forming into the game we know today, we see Stan Musial in 1948 with the greatest % of his team's offensive output. If you take a look at the 1948 Cardinals season, you will see that Stan's MVP season truly stood out from the rest.

http://www.baseball-reference.com/teams/STL/1948.shtml
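
The first question asks about the gap between a team's best hitter and the next best. The ranking above shows only each player's own share, so one possible way to get at the gap is to rank players within each team-season and subtract the runner-up's share from the leader's. A minimal sketch on made-up shares (the `rank`, `leader` and `gap` columns are names introduced here):

```python
import pandas as pd

# Toy shares of team output (made-up numbers); column names follow master_player_team
toy = pd.DataFrame({
    'playerID': ['a', 'b', 'c', 'd', 'e'],
    'yearID': [1948, 1948, 1948, 1949, 1949],
    'teamID': ['SLN', 'SLN', 'SLN', 'SLN', 'SLN'],
    '%_team_OPS+': [0.18, 0.12, 0.10, 0.15, 0.14],
})

keys = ['yearID', 'teamID']

# Rank players within each team-season, 1 = largest share
toy['rank'] = toy.groupby(keys)['%_team_OPS+'].rank(method='first', ascending=False)

best = toy[toy['rank'] == 1].set_index(keys)
second = toy[toy['rank'] == 2].set_index(keys)['%_team_OPS+']

# Leader per team-season and his margin over the runner-up
# (NaN when a team-season has only one player)
gaps = best[['playerID']].rename(columns={'playerID': 'leader'})
gaps['gap'] = best['%_team_OPS+'] - second
```

Sorting `gaps` by `gap` in descending order would then list the most lopsided team-seasons first.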

There are many things to discover in this data. Having played baseball and having a general awareness of many baseball statistics, I find there are certain "random" facts that make for fun trivia. For instance, it may be interesting to know who was the best "teammate" of all time, or how players' "peak" years have evolved over time.

Some highlighted details that Udacity would like me to look at are the relationships between different metrics, analyzing independent (3) and dependent (1) variables, and the characteristics of players with the highest salaries.

Let's begin.

## A few supporting metrics

In [29]:
# Career OBP of players born in fair-weather states vs. not
# f_Obp(H,BB,IBB,HBP,SF,AB)
obp_cols = ['H', 'BB', 'IBB', 'HBP', 'SF', 'AB']  # same order as the f_Obp arguments
fair = master_player_season.loc[master_player_season['StateWeather'] == True, obp_cols].sum()
other = master_player_season.loc[master_player_season['StateWeather'] == False, obp_cols].sum()
"Fair weather: " + str(f_Obp(*fair)) + \
" Not so Fair weather: " + str(f_Obp(*other))

Number of people who have played MLB (from dataset).. 18,589

In [None]:
totplayers = fileMaster.shape[0]

How many different countries have been represented in the MLB?.. 52

In [None]:
fileMaster.groupby(['birthCountry']).size()

So..

In [None]:
len(fileMaster.groupby(['birthCountry']).size())

It seems there may be some holes in our dataset, i.e. players without a birth country.

In [None]:
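# Label missing birth countries so they appear in the counts below ('Unknown' is an assumed placeholder)
fileMaster['birthCountry'] = fileMaster['birthCountry'].fillna('Unknown')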

In [None]:
fileMaster.groupby(['birthCountry']).size()

Of the USA players how many states have been represented?.. 51 (All + DC included)

In [None]:
fileMaster[(fileMaster['birthCountry'] == 'USA')].groupby(['birthState']).size()

In [None]:
len(fileMaster[(fileMaster['birthCountry'] == 'USA')].groupby(['birthState']).size())

What % of players are deceased?.. ~50%

In [None]:
# Count players with a recorded death year; float() avoids integer division under Python 2
fileMaster['deathYear'].notnull().sum() / float(totplayers)

Heaviest/Lightest/Tallest/Shortest reported player ever?

In [None]:
fileMaster.loc[fileMaster['weight'].idxmax(),'nameGiven'] + " " + fileMaster.loc[fileMaster['weight'].idxmax(),'nameLast'] + " @ " + str(fileMaster['weight'].max())

In [None]:
fileMaster.loc[fileMaster['weight'].idxmin(),'nameGiven'] + " " + fileMaster.loc[fileMaster['weight'].idxmin(),'nameLast'] + " @ " + str(fileMaster['weight'].min())

In [None]:
fileMaster.loc[fileMaster['height'].idxmax(),'nameGiven'] + " " + fileMaster.loc[fileMaster['height'].idxmax(),'nameLast'] + " @ " + str(fileMaster['height'].max())

In [None]:
fileMaster.loc[fileMaster['height'].idxmin(),'nameGiven'] + " " + fileMaster.loc[fileMaster['height'].idxmin(),'nameLast'] + " @ " + str(fileMaster['height'].min())

In [None]:
fileMaster.plot(x='height', y='weight', style='ro')

Now let's get to some of these baseball stats.

Max HRs per year

In [None]:
#fileBatting.groupby('yearID')['HR'].max()

Interesting, and how about something arbitrary, ABs?

In [None]:
#fileBatting.groupby('yearID')['AB'].max()

Let's flip to the defensive side of the ball.

By year, by position, average

In [None]:
#fileFielding.groupby('yearID').groups

Creating a teammate relations table

Example:

    playerID: {
        yearID: {
            teammates' playerIDs
        }
    }
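
The nested structure outlined above could be built in two passes over (playerID, yearID, teamID) rows: first collect each team-season's roster, then give every player the rest of his roster. The following is a minimal sketch in plain Python under that assumption. The rows are toy data (`player_b` and `player_c` are made-up IDs); in the notebook they would come from the batting data.

```python
from collections import defaultdict

# Toy (playerID, yearID, teamID) rows; real rows would come from the batting data
rows = [
    ('musiast01', 1948, 'SLN'),
    ('player_b', 1948, 'SLN'),
    ('player_c', 1948, 'BRO'),
    ('musiast01', 1949, 'SLN'),
]

# First pass: roster for every (year, team)
rosters = defaultdict(set)
for player, year, team in rows:
    rosters[(year, team)].add(player)

# Second pass: playerID -> yearID -> set of teammates' playerIDs
teammates = defaultdict(lambda: defaultdict(set))
for player, year, team in rows:
    # |= merges rosters when a player had stints with several teams in one year
    teammates[player][year] |= rosters[(year, team)] - {player}
```

Sets are used so that a teammate is listed once per season even if the two players shared more than one stint.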