# Fantasy Football Analysis

The purpose of this notebook is to provide some simple analyses of each year's fantasy football league. The input data is each team's weekly point total for the regular season (13 weeks). At this point, I don't consider finer-grained data such as points per position per week.

In [3]:
import pandas as pd

# Import weekly scores
data_file = 'data_2019.csv'
data = pd.read_csv(data_file)
data = data.set_index('Team')
NUM_TEAMS = data.shape[0]
actual_wins = data['Actual Wins']
del data['Actual Wins']

# Import the schedules
schedule_file = 'schedule_2019.csv'
schedule = pd.read_csv(schedule_file)
schedule = schedule.set_index('Team')

In [7]:
# Quality check the schedule: for every week, the opponents column
# must contain all 12 team names
teams = set(schedule.index)
for week in range(1, 14):
    assert 12 == len(set(schedule['Week {}'.format(week)]).intersection(teams))

print('Schedule seems valid')
    

Schedule seems valid


### Basic Analysis
Here, we calculate means and standard deviations for each team.

In [2]:
data_mean = data.mean(axis=1)
data_std = data.std(axis=1)
basic_stats = pd.DataFrame({'Weekly Mean':data_mean, 'Weekly Standard Dev': data_std})
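
One detail worth keeping in mind when reading the standard deviations: pandas' `std` defaults to the sample standard deviation (`ddof=1`), while NumPy's `np.std` defaults to the population version (`ddof=0`). A tiny sketch with made-up scores (not league data) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical weekly scores for one team (illustrative numbers only)
weekly = pd.Series([90.0, 100.0, 110.0])

sample_std = weekly.std()            # pandas default: ddof=1
population_std = weekly.std(ddof=0)  # same convention as np.std's default

print(sample_std, population_std, np.std(weekly.to_numpy()))
```

With only 13 weeks per team the two conventions differ by a few percent, so it only matters when comparing against numbers computed elsewhere.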

In [3]:
basic_stats

Team,Weekly Mean,Weekly Standard Dev
DJ Purple,84.534,16.073686
Eat 4 Dicks Asendorf,91.492,18.485327
Aah Fuckin’ Shitdick,106.666,21.966251
Pubic Faith,79.336,22.936637
Peter’s Team,84.408,24.313768
Jbone!,89.338,22.11468
Zach Attack,99.304,20.257177
THROW IT TO SANDERS,90.462,19.645151
Lambeau Leapers,115.038,12.617615
Frank The Tank,93.17,27.517389


##### Sorted by Highest Weekly Average

In [4]:
basic_stats.sort_values(by=['Weekly Mean'], ascending=False)

Team,Weekly Mean,Weekly Standard Dev
Lambeau Leapers,115.038,12.617615
Aah Fuckin’ Shitdick,106.666,21.966251
Zach Attack,99.304,20.257177
Chubby Winners,97.462,23.460426
Rudy Was Offsides,93.358,20.119223
Frank The Tank,93.17,27.517389
Eat 4 Dicks Asendorf,91.492,18.485327
THROW IT TO SANDERS,90.462,19.645151
Jbone!,89.338,22.11468
DJ Purple,84.534,16.073686


##### Sorted by Highest Weekly Standard Deviation

In [5]:
basic_stats.sort_values(by=['Weekly Standard Dev'], ascending=False)

Team,Weekly Mean,Weekly Standard Dev
Frank The Tank,93.17,27.517389
Peter’s Team,84.408,24.313768
Chubby Winners,97.462,23.460426
Pubic Faith,79.336,22.936637
Jbone!,89.338,22.11468
Aah Fuckin’ Shitdick,106.666,21.966251
Zach Attack,99.304,20.257177
Rudy Was Offsides,93.358,20.119223
THROW IT TO SANDERS,90.462,19.645151
Eat 4 Dicks Asendorf,91.492,18.485327


### Schedule Luck Analysis
Here, I calculate how lucky each team was. This is computed on a weekly basis and asks the question "What if my opponent was different this week?" The way our league works, you are matched up with one opponent each week and play head-to-head against them. If you score a lot of points but end up playing the team that scored the most, that's pretty unlucky. If you score hardly any points but end up playing the team that scored the fewest, that's pretty lucky.

To compute a luck metric, I compute the expected value of your wins for each week by comparing your weekly points to every other team's points that week. 
 - If you scored the most points, you would beat every team that you could play and your expected win value is 1. 
 - If you scored the least points, you would lose to every team that you could play and your expected win value is 0.
 - If your score would beat 4 teams but lose to 7 teams, then your expected win value is 4/11.
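
As a quick illustration of the three cases above, here is a minimal sketch for a single week. The scores and team names are made up, and `expected_win` is a hypothetical helper, not a function from this notebook. Like the notebook's calculation, it counts losses and subtracts, so a tie is not counted as a loss.

```python
import pandas as pd

# Hypothetical scores for one week in a 12-team league (made-up numbers)
week_scores = pd.Series(
    [120.0, 115.5, 110.2, 104.8, 101.3, 99.9, 95.4, 92.1, 88.7, 84.0, 79.6, 70.2],
    index=["Team {}".format(i) for i in range(1, 13)],
)

def expected_win(team, scores):
    """Share of possible opponents that `team` would not have lost to this week."""
    others = scores.drop(team)
    losses = (others > scores[team]).sum()
    return 1 - losses / len(others)

best = expected_win("Team 1", week_scores)    # top score: beats all 11 opponents
worst = expected_win("Team 12", week_scores)  # bottom score: loses to all 11
mid = expected_win("Team 8", week_scores)     # beats 4 teams, loses to 7

print(best, worst, mid)
```

Summing this value over every week of the season gives a team's expected wins.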

In [6]:
def calc_team_expected_wins(team, df):
    # type: (str, pd.DataFrame) -> float
    """
    Calculate the expected number of wins for a given team,
    assuming they played every other team each week.

    Args:
        team: name of the team to analyze
        df: a DataFrame whose rows are team names and columns are weekly scores

    Returns:
        The team's expected win value summed over all weeks
    """
    
    # We subtract the team's scores from the entire matrix
    # positive values represent an opponent scored more (loss)
    # negative values represent that team scored more (win)
    temp = df - df.loc[team].values.squeeze()

    # True values now represent a Loss
    temp_bool = temp > 0

    # Summing gives the number of opponents we'd lose to each week.
    # Dividing by (number of teams - 1) turns that into a loss fraction,
    # and 1 minus that fraction is the expected win value for the week
    exp_weekly_win = 1 - temp_bool.sum(axis=0)/(NUM_TEAMS-1)

    return sum(exp_weekly_win)

In [8]:
exp_wins = []

for team in data.index:
    exp_wins.append(calc_team_expected_wins(team, data))
    
exp_wins = pd.Series(exp_wins, index=data.index)

In [11]:
weekly_luck = pd.DataFrame({'Actual Wins':actual_wins, 'Expected Wins': exp_wins, 'Difference': actual_wins-exp_wins})

##### Luckiest Teams
I sort this by highest difference. A positive difference means you were lucky and won more games than expected. A negative difference means you were unlucky and won fewer games than expected.

In [13]:
weekly_luck.sort_values(by=['Difference'], ascending=False)

Team,Actual Wins,Expected Wins,Difference
Peter’s Team,6,4.090909,1.909091
Eat 4 Dicks Asendorf,6,4.545455,1.454545
Jbone!,6,4.545455,1.454545
Rudy Was Offsides,6,4.727273,1.272727
Chubby Winners,7,5.909091,1.090909
Lambeau Leapers,9,8.272727,0.727273
Aah Fuckin’ Shitdick,7,6.909091,0.090909
Pubic Faith,2,2.636364,-0.636364
THROW IT TO SANDERS,3,4.181818,-1.181818
Frank The Tank,3,4.454545,-1.454545


##### Sorted Luck by Expected Wins
We want to see what the standings would look like without luck.

In [14]:
weekly_luck.sort_values(by=['Expected Wins'], ascending=False)

Team,Actual Wins,Expected Wins,Difference
Lambeau Leapers,9,8.272727,0.727273
Aah Fuckin’ Shitdick,7,6.909091,0.090909
Zach Attack,3,6.090909,-3.090909
Chubby Winners,7,5.909091,1.090909
Rudy Was Offsides,6,4.727273,1.272727
Eat 4 Dicks Asendorf,6,4.545455,1.454545
Jbone!,6,4.545455,1.454545
Frank The Tank,3,4.454545,-1.454545
THROW IT TO SANDERS,3,4.181818,-1.181818
Peter’s Team,6,4.090909,1.909091


### Team Excitement Rankings

Here I want to evaluate how exciting your season was. This will look at your average point differential, win or lose. Essentially if you won and lost games by 5 points every week, that would be way more intense & exciting than if every week was a blowout and you won or lost by 30 points. 

I'm going to use the median absolute difference across each week's game, so that big blowouts won't affect it as much as they would if we used the mean.
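
To see why the median is less sensitive to blowouts, here is a small sketch with made-up score differences (four close games and one blowout, not league data):

```python
import numpy as np

# Hypothetical absolute score differences for one team (illustrative only)
score_diff = [5.0, 6.0, 4.0, 7.0, 60.0]

mean_diff = np.mean(score_diff)      # pulled up by the single 60-point blowout
median_diff = np.median(score_diff)  # stays at the typical close-game margin

print(mean_diff, median_diff)
```

The one blowout moves the mean to about 16 points, while the median stays at 6.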

In [21]:
import numpy as np

In [31]:
# There is probably a slick pandas way to do this; for now, brute force

exc_index = []
med_exc = []
mean_exc = []
num_exc = []
num_blow = []

for team in data.index:
    score_diff = []
    # Note: range(1, 13) covers weeks 1-12
    for week in range(1, 13):

        # Your score
        your_score = data['Week {}'.format(week)][team]

        # Opponent's score
        opp_score = data['Week {}'.format(week)][schedule['Week {}'.format(week)][team]]
        
        score_diff.append(abs(your_score - opp_score))
    
    # Logging
    exc_index.append(team)
    mean_exc.append(np.mean(score_diff))
    med_exc.append(np.median(score_diff))
    num_exc.append(sum(np.array(score_diff) < 10))
    num_blow.append(sum(np.array(score_diff) > 25))
    
exc_df = pd.DataFrame({'Mean Score Diff':mean_exc, 'Median Score Diff':med_exc, 'Number Close Games':num_exc, 'Number Blowouts':num_blow}, index = exc_index)
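
For reference, one possible vectorized version of the brute-force loop above is sketched here. It is an untested alternative, not the code that produced the tables in this notebook. It assumes `data` and `schedule` share the same `Week N` columns and team index, and it uses every weekly column present rather than a fixed week range. The tiny 4-team league below is made up so the sketch runs on its own.

```python
import pandas as pd

# Hypothetical 4-team, 2-week league with the same layout as `data` (scores)
# and `schedule` (opponent names), both indexed by team
teams = ["A", "B", "C", "D"]
data = pd.DataFrame(
    {"Week 1": [100.0, 92.0, 80.0, 110.0], "Week 2": [95.0, 120.0, 101.0, 70.0]},
    index=teams,
)
schedule = pd.DataFrame(
    {"Week 1": ["B", "A", "D", "C"], "Week 2": ["C", "D", "A", "B"]},
    index=teams,
)

weeks = list(data.columns)

# Map each opponent name to that opponent's score for the same week
opp_scores = pd.DataFrame({week: schedule[week].map(data[week]) for week in weeks})

# Absolute margin of every game, one row per team and one column per week
score_diff = (data[weeks] - opp_scores).abs()

exc_df = pd.DataFrame({
    "Mean Score Diff": score_diff.mean(axis=1),
    "Median Score Diff": score_diff.median(axis=1),
    "Number Close Games": (score_diff < 10).sum(axis=1),
    "Number Blowouts": (score_diff > 25).sum(axis=1),
})

print(exc_df)
```

`Series.map` with another Series as its argument looks up each opponent name in that Series' index, which replaces the inner loop over weeks.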

##### Sort by Median Score Differential

A lower differential means your games were relatively close each week.

In [34]:
exc_df.sort_values(by=['Median Score Diff'], ascending=True)

Team,Mean Score Diff,Median Score Diff,Number Close Games,Number Blowouts
Eat 4 Dicks Asendorf,18.055,15.15,5,4
Rudy Was Offsides,20.115,16.86,5,5
DJ Purple,19.33,18.27,5,3
Chubby Winners,21.326667,20.7,4,5
Frank The Tank,29.685,22.64,3,5
Zach Attack,27.523333,22.92,1,4
Aah Fuckin’ Shitdick,28.333333,23.24,0,4
Pubic Faith,26.855,23.67,0,5
THROW IT TO SANDERS,25.051667,24.18,2,6
Jbone!,27.93,29.58,4,6


##### Sort by Number of Close Games

I defined these as games where the final margin was less than 10 points. To me, that's a margin that swapping in a bench player who played better that week could have covered.

In [36]:
exc_df.sort_values(by=['Number Close Games'], ascending=False)

Team,Mean Score Diff,Median Score Diff,Number Close Games,Number Blowouts
DJ Purple,19.33,18.27,5,3
Eat 4 Dicks Asendorf,18.055,15.15,5,4
Rudy Was Offsides,20.115,16.86,5,5
Jbone!,27.93,29.58,4,6
Chubby Winners,21.326667,20.7,4,5
Peter’s Team,26.508333,29.79,3,7
Frank The Tank,29.685,22.64,3,5
THROW IT TO SANDERS,25.051667,24.18,2,6
Zach Attack,27.523333,22.92,1,4
Aah Fuckin’ Shitdick,28.333333,23.24,0,4


##### Sort by Number of Blowouts

I defined these as games where the final scores were more than 25 points apart. To me, this probably meant that no bench/substitution mistake mattered and that, no matter what, the outcome was a foregone conclusion pretty quickly.

In [38]:
exc_df.sort_values(by=['Number Blowouts'], ascending=False)

Team,Mean Score Diff,Median Score Diff,Number Close Games,Number Blowouts
Lambeau Leapers,32.283333,31.29,0,8
Peter’s Team,26.508333,29.79,3,7
Jbone!,27.93,29.58,4,6
THROW IT TO SANDERS,25.051667,24.18,2,6
Pubic Faith,26.855,23.67,0,5
Frank The Tank,29.685,22.64,3,5
Chubby Winners,21.326667,20.7,4,5
Rudy Was Offsides,20.115,16.86,5,5
Eat 4 Dicks Asendorf,18.055,15.15,5,4
Aah Fuckin’ Shitdick,28.333333,23.24,0,4


### Team Luck Analysis
Here I want to evaluate a team's week-to-week variance. This is a slightly different luck analysis than the schedule analysis above. Instead of assuming that your score for the week is fixed and looking at who you played, this analysis keeps your opponent the same but looks at your scores from other weeks to see if you would have won. This asks the question "What if my players' schedule was different?"
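
One way this could be computed is sketched below. This is only a rough idea under my own assumptions, not a finished analysis: the function name `calc_team_timing_expected_wins` is hypothetical, ties are treated as not losing (as in the schedule luck calculation), and the 2-team, 3-week league is made up so the sketch runs on its own.

```python
import pandas as pd

def calc_team_timing_expected_wins(team, scores, sched):
    """Expected wins if `team`'s weekly scores landed in different weeks
    while each week's opponent (and that opponent's score) stayed fixed."""
    team_scores = scores.loc[team]
    total = 0.0
    for week in scores.columns:
        # The opponent's actual score in this week
        opp_score = scores.loc[sched.loc[team, week], week]
        # Share of the team's weekly scores that would not lose to it
        total += (team_scores >= opp_score).mean()
    return total

# Hypothetical league: A and B play each other every week (made-up numbers)
scores = pd.DataFrame(
    {"Week 1": [100.0, 90.0], "Week 2": [80.0, 110.0], "Week 3": [120.0, 95.0]},
    index=["A", "B"],
)
sched = pd.DataFrame(
    {"Week 1": ["B", "A"], "Week 2": ["B", "A"], "Week 3": ["B", "A"]},
    index=["A", "B"],
)

a_exp = calc_team_timing_expected_wins("A", scores, sched)
b_exp = calc_team_timing_expected_wins("B", scores, sched)

print(a_exp, b_exp)
```

In this toy example team A actually won 2 games but would expect about 1.67, so A's good weeks happened to line up with B's bad ones. The same Actual Wins minus Expected Wins difference used in the schedule analysis could then be applied here.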


In [None]:
# TODO