# Former Teams of Players on Playoff Teams

Which playoff team should fans of a non-playoff team root for? One answer: the playoff team whose roster includes the most former members of your favorite team.

Data from: https://www.fangraphs.com/leaders/splits-leaderboards?splitArr=&strgroup=career&statgroup=1&startDate=2014-01-01&endDate=2019-09-27&filter=&position=B&statType=player&autoPt=true&sort=22,1&pg=0&players=&splitArrPitch=&splitTeams=false

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', 1000)
pd.set_option('display.width', 1000)

## Data - Past Seasons

Player seasons from 2009 through 2019, split by team. 

TODO: get data going back to 2001. (CC Sabathia made his debut in 2001, the earliest debut of any player on a playoff team in 2019) 

Batting: 
https://www.fangraphs.com/leaders/splits-leaderboards?splitArr=&strgroup=career&statgroup=1&startDate=2001-01-01&endDate=2019-09-30&filter=&position=B&statType=player&autoPt=false&sort=22,1&pg=0&players=&splitArrPitch=&splitTeams=true

Pitching:

In [2]:
batting = pd.read_csv('fangraphs_batting.csv')
pitching = pd.read_csv('fangraphs_pitching.csv')

Combine plate appearances (`PA`) by batters and total batters faced (`TBF`) by pitchers into one metric, `appearances`.

In [3]:
batting['appearances'] = batting['PA']
pitching['appearances'] = pitching['TBF']

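Union batting and pitching data into one de-duplicated dataframe with one row per player-season-team. A player who both batted and pitched for a team in a season gets a single row with the two appearance counts summed.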

In [4]:
columns_to_union = ['playerId', 'Season', 'Name', 'Tm', 'appearances']
players_with_duplicates = pd.concat([pitching[columns_to_union], batting[columns_to_union]])
players = players_with_duplicates.groupby(['playerId', 'Season', 'Name', 'Tm']).appearances.sum().reset_index()
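As a minimal sketch of what the union-and-sum step does, the toy frames below stand in for the FanGraphs exports. All player names and numbers here are made up for illustration; only the column names match the notebook.

```python
import pandas as pd

# Hypothetical toy rows; the real data comes from the FanGraphs CSV exports.
toy_batting = pd.DataFrame({
    'playerId': [1, 2],
    'Season': [2019, 2019],
    'Name': ['Two-Way Player', 'Position Player'],
    'Tm': ['LAA', 'COL'],
    'appearances': [400, 600],
})
toy_pitching = pd.DataFrame({
    'playerId': [1, 3],
    'Season': [2019, 2019],
    'Name': ['Two-Way Player', 'Pitcher Only'],
    'Tm': ['LAA', 'COL'],
    'appearances': [200, 700],
})

# Stack both frames, then collapse to one row per player-season-team.
toy_players = (
    pd.concat([toy_pitching, toy_batting])
    .groupby(['playerId', 'Season', 'Name', 'Tm'])
    .appearances.sum()
    .reset_index()
)
print(toy_players)
```

The two-way player appears in both inputs but ends up with one row whose `appearances` is the sum of the batting and pitching counts.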

## Data - Current Playoff Players

A "playoff player" is defined as a player whose most recent appearance in September 2019 was for one of the 10 teams in the playoffs.

Data sources: https://www.fangraphs.com/leaders/splits-leaderboards?splitArr=&strgroup=career&statgroup=1&startDate=2019-09-01&endDate=2019-09-28&filter=&position=B&statType=player&autoPt=false&sort=22,1&pg=0&players=&splitArrPitch=&splitTeams=true

In [5]:
batting_sept = pd.read_csv('fangraphs_batting_sept_2019.csv')
pitching_sept = pd.read_csv('fangraphs_pitching_sept_2019.csv')

cols_sept = ['playerId', 'Name', 'Date', 'Tm']
sept_games = pd.concat([batting_sept[cols_sept], pitching_sept[cols_sept]]).drop_duplicates()

In [6]:
teams = sorted(batting_sept.Tm.unique())

playoff_teams = [
    'NYY', 'TBR', 'MIN', 'HOU', 'OAK'
    , 'ATL', 'WSN', 'STL', 'MIL', 'LAD'
]

not_playoff_teams = [t for t in teams if t not in playoff_teams]

In [7]:
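# Latest September game date for each player.
most_recent = sept_games.groupby('playerId').Date.max().reset_index()
# Join back to the game rows to recover the team played for on that date.
most_recent_team = pd.merge(most_recent, sept_games, how='inner', on=['playerId', 'Date'])

# Keep players whose latest team is a playoff team; drop the old index instead of keeping it as a column.
playoff_players = most_recent_team[most_recent_team.Tm.isin(playoff_teams)][['playerId', 'Name', 'Tm']].reset_index(drop=True)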
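The "most recent team" logic can be checked on a toy example. This is a sketch with made-up rows, and it assumes `Date` is either a datetime or an ISO-formatted string (`YYYY-MM-DD`), so that `max()` picks the latest date; the actual format in the FanGraphs export is not shown here.

```python
import pandas as pd

# Hypothetical game log: one player changes teams mid-September.
toy_games = pd.DataFrame({
    'playerId': [10, 10, 11],
    'Name': ['Traded Player', 'Traded Player', 'Other Player'],
    'Date': ['2019-09-02', '2019-09-20', '2019-09-15'],  # ISO strings sort chronologically
    'Tm': ['COL', 'NYY', 'SFG'],
})
toy_playoff_teams = ['NYY']

# Latest date per player, joined back to find the team on that date.
latest = toy_games.groupby('playerId').Date.max().reset_index()
latest_team = pd.merge(latest, toy_games, how='inner', on=['playerId', 'Date'])

toy_playoff_players = (
    latest_team[latest_team.Tm.isin(toy_playoff_teams)][['playerId', 'Name', 'Tm']]
    .reset_index(drop=True)
)
print(toy_playoff_players)
```

Only the traded player is kept, attributed to the team from the later game.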

In [8]:
player_teams = players.groupby('playerId')['Tm'].apply(list).reset_index(name='team_list')

# TODO: add first and last appearance year to this dataframe
player_team_appearances = players.groupby(['playerId', 'Tm']).appearances.sum().reset_index()
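To illustrate the shape of these two lookups, here is a small sketch on made-up data. Note that `team_list` has one entry per player-season-team row, so a team repeats when a player spent several seasons there.

```python
import pandas as pd

# Hypothetical player-season-team rows.
toy_players = pd.DataFrame({
    'playerId': [1, 1, 1, 2],
    'Season': [2017, 2018, 2019, 2019],
    'Tm': ['COL', 'COL', 'NYY', 'STL'],
    'appearances': [500, 600, 650, 300],
})

# One row per player with the list of teams they appeared for.
toy_player_teams = (
    toy_players.groupby('playerId')['Tm'].apply(list).reset_index(name='team_list')
)

# Career appearances per player-team pair, summed across seasons.
toy_team_appearances = (
    toy_players.groupby(['playerId', 'Tm']).appearances.sum().reset_index()
)

print(toy_player_teams)
print(toy_team_appearances)
```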

In [9]:
playoff_player_teams = pd.merge(playoff_players, player_teams, how='inner', on='playerId')

In [10]:
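def former_players_from_team(team, playoff_player_teams, player_team_appearances):
    """Return playoff players who have appeared for `team`, with their appearances for it."""
    # Collect every playoff player whose team history includes `team`.
    playoff_from_team_list = []
    for idx, row in playoff_player_teams.iterrows():
        if team in row.team_list:
            playoff_from_team_list.append({
                'playerId': row.playerId
                , 'Name': row.Name
                , 'playoff_team': row.Tm
            })

    if not playoff_from_team_list:
        # No playoff player has appeared for `team`: return an empty frame with the expected columns.
        return pd.DataFrame(columns=['playerId', 'Tm', 'appearances', 'Name', 'playoff_team'])

    playoff_from_team = pd.DataFrame(playoff_from_team_list)
    # Keep only those players' appearances for `team`.
    playoff_from_team_app = player_team_appearances[
        (player_team_appearances['playerId'].isin(playoff_from_team.playerId))
        & (player_team_appearances['Tm'] == team)
    ]

    return pd.merge(playoff_from_team_app, playoff_from_team, how='inner', on='playerId')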
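The same lookup can also be written without the explicit row loop. The sketch below is a hypothetical alternative, not the function this notebook uses; `former_players_vectorized` and the toy frames are made up for illustration, and it assumes the same column names as above.

```python
import pandas as pd

def former_players_vectorized(team, playoff_player_teams, player_team_appearances):
    # Hypothetical alternative to the loop-based function.
    # Boolean mask: does this player's team history include `team`?
    mask = playoff_player_teams['team_list'].apply(lambda teams: team in teams)
    from_team = (
        playoff_player_teams.loc[mask, ['playerId', 'Name', 'Tm']]
        .rename(columns={'Tm': 'playoff_team'})
    )
    # Appearances made for `team` only.
    app = player_team_appearances[player_team_appearances['Tm'] == team]
    return pd.merge(app, from_team, how='inner', on='playerId')

# Made-up inputs with the same columns as the notebook's frames.
toy_playoff_player_teams = pd.DataFrame({
    'playerId': [1, 2],
    'Name': ['Player One', 'Player Two'],
    'Tm': ['NYY', 'STL'],
    'team_list': [['COL', 'COL', 'NYY'], ['STL']],
})
toy_team_appearances = pd.DataFrame({
    'playerId': [1, 1, 2],
    'Tm': ['COL', 'NYY', 'STL'],
    'appearances': [1100, 650, 300],
})

result = former_players_vectorized('COL', toy_playoff_player_teams, toy_team_appearances)
empty = former_players_vectorized('SEA', toy_playoff_player_teams, toy_team_appearances)
print(result)
```

Because the filtered frames keep their columns and dtypes even when no rows match, a team with no former players on playoff rosters yields an empty result rather than an error.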

## Iterate through non-playoff teams

In [11]:
# for t in not_playoff_teams:
for t in ['COL']:
    print('======================================================')
    print('Team not in playoffs: {}\n'.format(t))
    df = former_players_from_team(t, playoff_player_teams, player_team_appearances)
    
    df_app = pd.DataFrame(
        df.groupby('playoff_team').appearances.sum().reset_index(name='player_appearances')
        ).sort_values(by='player_appearances', ascending=False)
    
    for idx, row in df_app.iterrows(): 
        print('Appearances that current {} players made for {}: {}\n'.format(
            row.playoff_team, t, row.player_appearances))
        print(df[df.playoff_team == row.playoff_team][['Tm', 'Name', 'appearances']]
              .sort_values(by='appearances', ascending=False).to_string(index=False))
        print('\n')
    print('\n\n')

Team not in playoffs: COL

Appearances that current NYY players made for COL: 5911

Tm           Name  appearances
COL    DJ LeMahieu         3737
COL  Adam Ottavino         1661
COL   Tommy Kahnle          444
COL  Mike Tauchman           69


Appearances that current STL players made for COL: 2608

Tm           Name  appearances
COL  Dexter Fowler         2608


Appearances that current MIL players made for COL: 1984

Tm           Name  appearances
COL   Jordan Lyles         1326
COL  Drew Pomeranz          658


Appearances that current WSN players made for COL: 1249

Tm           Name  appearances
COL  Gerardo Parra         1249


Appearances that current ATL players made for COL: 412

Tm               Name  appearances
COL  Charlie Culberson          337
COL       Chris Martin           69
COL      Rafael Ortega            6


Appearances that current OAK players made for COL: 195

Tm            Name  appearances
COL  Brett Anderson          195


Appearances that current HOU play