# Investigating Baseball Salaries

It is easy to form reasonable expectations about how a player's salary relates to other measures. For example, one might expect successful players to receive higher salaries. But do these expectations always hold? In this analysis, we look for an answer to the broad question: What is the relationship between salary and success?

**Before we begin**: since our salary data starts in 1985, we exclude all data from earlier years.

## "Success"

In [6]:
import pandas as pd

project_dir = '/Users/carrier/Documents/Projects/Udacity/Data Analyst Nanodegree/P2/'

Success can be measured in several ways, depending, for example, on how broad a view one takes. In this analysis, we study success at two levels: team and player. At each level, the question of success can be approached from many angles; we devote our attention to just a few.

First, we need to modify the salary entries a bit. Currently, each salary is listed in nominal terms for the year in which it was recorded. We need to adjust all salaries for inflation so that each is expressed in 2015 dollars. The data in `inflation.txt` was scraped from the table provided [here](http://www.usinflationcalculator.com/inflation/historical-inflation-rates/). All such values come from the [monthly CPI publication](https://www.bls.gov/cpi/home.htm) by the Bureau of Labor Statistics.

In [7]:
ave_infl = pd.read_table(project_dir + 'inflation.txt', usecols=['YEAR', 'AVE'])
ave_infl = ave_infl.set_index(ave_infl['YEAR'].astype(int))
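# .ix has been removed from pandas; .loc does the same label-based slice on the integer year index
ave_infl = ave_infl.loc[1986:, 'AVE']/100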

def year_to_rate(year):
    ret_rate = 1
    
    if year != 2015:
        for rate in ave_infl.loc[(year + 1):]:
            ret_rate *= (1 + rate)
    
    return ret_rate

salaries = pd.read_csv(project_dir + 'baseball_data/Salaries.csv')
salaries['salary'] = (salaries['yearID'].apply(year_to_rate) * salaries['salary']).round(2)
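
The adjustment compounds each later year's inflation rate: a salary recorded in year *y* is multiplied by the product of (1 + rate) over every year after *y*. As a rough sketch of the same idea, the factors for all years can be computed at once with a reverse cumulative product. The rates and salaries below are made up for illustration and are not BLS figures; `factors_to_last_year` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

# Hypothetical annual inflation rates (as fractions), indexed by year.
# These values are invented for illustration only.
rates = pd.Series({2013: 0.02, 2014: 0.01, 2015: 0.03})

def factors_to_last_year(rates):
    """For each year, return the factor that converts that year's dollars
    into dollars of the last year present in `rates`."""
    growth = 1 + rates.sort_index()
    # Reverse cumulative product: product of growth from each year onward.
    suffix = growth[::-1].cumprod()[::-1]
    # Drop the year's own growth so only later years are compounded.
    return suffix / growth

factors = factors_to_last_year(rates)

# Toy salary table using the same column names as the notebook.
toy = pd.DataFrame({'yearID': [2013, 2015], 'salary': [100.0, 100.0]})
toy['salary_adj'] = (toy['yearID'].map(factors) * toy['salary']).round(2)
```

Mapping precomputed factors onto `yearID` avoids recomputing the product for every row, which is the main difference from calling a per-row function.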

### Team-level success
#### Is there a relationship between a team's expenditure on salary and its success?
For this question, we define a team's success by total wins for the season and by entry into, and performance during, the postseason.

In order to begin our analysis, we need to compile the relevant information.

Notes about this data:

* We're missing SeriesPost data for the year 1994.
* We need to adjust the salaries for inflation.
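
One way to compile the team-level information might look like the sketch below: sum salaries per team and season to get a payroll, then join that to each team's win total. The tables here are toy stand-ins with made-up values, and the column names `teamID` and `W` are assumptions about the team data rather than something shown above; only `yearID` and `salary` appear in the notebook so far.

```python
import pandas as pd

# Toy stand-ins for the salary and team tables; every value is made up.
toy_salaries = pd.DataFrame({
    'yearID': [2014, 2014, 2014, 2015],
    'teamID': ['AAA', 'AAA', 'BBB', 'AAA'],   # assumed column name
    'salary': [1.0e6, 2.0e6, 5.0e5, 3.0e6],
})
toy_teams = pd.DataFrame({
    'yearID': [2014, 2014, 2015],
    'teamID': ['AAA', 'BBB', 'AAA'],
    'W': [90, 70, 85],                        # assumed column for season wins
})

# Total payroll for each team in each season.
payroll = (toy_salaries
           .groupby(['yearID', 'teamID'], as_index=False)['salary']
           .sum()
           .rename(columns={'salary': 'payroll'}))

# One row per team-season, with payroll next to wins.
team_year = payroll.merge(toy_teams, on=['yearID', 'teamID'], how='inner')
```

An inner join keeps only team-seasons present in both tables, which also drops any seasons that predate the salary data. Postseason results would need a further join and are not covered by this sketch.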