*Updated 1/17/2020*

This is a tool for calculating playoff probabilities in a fantasy sports league that is (1) points-based and (2) head-to-head. It requires the following data:

1. A **scores** frame which contains each team's final score for each week of the league.
2. A **remaining schedule** frame.
3. A **current standings** frame containing current standings as well as any necessary information for a tiebreaker.

If your scoring system makes ties likely, or your league uses a complex tiebreaker, you may want to extend this code. In my case, I assume that the current order of teams by total points will stay approximately the same. Total points is the key tiebreaker, so I expect that one team passing another in total points would greatly affect the odds calculations.

For reference, the code below displays my version of each of these files.
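To make the expected layout concrete, here is a toy version of the three frames. The column names follow the ones the code below reads (`Team` plus one column per week for scores; `Week`, `Team`, `Opponent` for the schedule; `Points` and `W` for standings). The team names, values, and `toy_` variable names are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the three CSV files; team names and numbers are invented.
toy_scores = pd.DataFrame(
    {"1": [1000.0, 900.0, 950.0, 870.0], "2": [1100.0, 1200.0, 800.0, 990.0]},
    index=pd.Index(["A", "B", "C", "D"], name="Team"),
)

# One row per remaining game.
toy_remaining = pd.DataFrame(
    {"Week": [3, 3], "Team": ["A", "C"], "Opponent": ["B", "D"]}
)

# Total points (the tiebreaker) and wins so far; the W values are invented.
toy_standings = pd.DataFrame(
    {"Points": toy_scores.sum(axis=1), "W": [2, 1, 1, 0]}
)
```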

In [33]:
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('fivethirtyeight') #i don't wind up using this

import pandas as pd
import collections

In [8]:
scores = pd.read_csv('scores_through_wk12.csv', encoding='latin-1', index_col='Team') #updated dataframe of scores for week
team_list = list(scores.index) #list of teams
wk_count = len(scores.columns) #number of weeks to-date
wk_list = list(range(1, wk_count + 1)) #list of completed weeks as integers

In [9]:
scores #example of the scores file

Team,1,2,3,4,5,6,7,8,9,10,11,12
Board Man Gets Paid,1036.79,1353.36,1376.31,1013.1,1242.36,1053.71,1189.66,1442.08,1348.92,1158.55,1091.27,1323.05
NYXL,1034.9,1030.71,1297.8,1206.47,1267.95,1199.28,1076.31,1049.74,1361.05,984.67,1282.78,1227.4
Spicy O G,1038.5,1116.45,1298.49,1142.79,984.32,1229.73,1209.19,1299.33,1152.04,1082.58,1036.42,1019.6
Net Browser,1043.93,1300.4,1449.26,1120.58,1019.16,1155.19,1218.41,1090.02,995.62,1049.73,1292.38,1211.63
JokicUCantScratch,825.41,965.57,1156.94,1277.01,1199.67,1264.71,1140.13,1313.67,1116.18,1151.72,1005.2,1211.8
Wire City FC,822.98,1069.23,1090.42,1159.01,1214.51,1089.24,1060.08,1165.5,1160.19,990.47,1107.5,1112.67
Going Back Zubac,929.38,928.04,915.85,1109.49,1076.12,1027.26,1140.34,1149.42,1337.62,1153.24,959.29,1039.81
LSDB,848.48,987.6,1099.21,982.0,1260.03,1395.58,1175.86,1301.51,1190.32,1133.1,1100.07,770.74
Im still a fun guy!,981.53,1324.56,948.13,1033.69,1036.69,1268.43,1265.21,1143.89,1303.07,1059.35,769.12,1017.52
Butterfly Crushers,857.14,1202.5,1096.95,1151.38,1107.6,1127.18,1235.89,1213.96,830.55,908.66,1060.66,1188.42


In [10]:
#initialize the win probability frame (columns only; the rows are computed in the next cell)

columns = ['Team', 'Opponent', 'WinProb']
win_probs = pd.DataFrame(columns=columns)
win_probs

,Team,Opponent,WinProb


In [11]:
#estimates how likely each team is to beat each other team in a given week: the share of completed weeks
#in which it outscored that opponent (a tied week counts as a win for neither side)

denom = wk_count
rows = []

for team in team_list:
    opponent_list = [item for item in team_list if item != team] #compare names by value, not identity
    for opponent in opponent_list:
        num = 0
        for week in wk_list:
            if scores.at[team, str(week)] > scores.at[opponent, str(week)]:
                num = num + 1
        rows.append({'Team': team, 'Opponent': opponent, 'WinProb': num / denom})

#DataFrame.append was removed in pandas 2.0, so collect the rows and build the frame in one step
win_probs = pd.DataFrame(rows, columns=['Team', 'Opponent', 'WinProb'])
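The same head-to-head estimate can also be computed without the nested loops. This is only a sketch of an alternative, not what the notebook runs: `pairwise_win_probs` is a name I made up, and it assumes the scores frame has teams as the index and one numeric column per week.

```python
import numpy as np
import pandas as pd

def pairwise_win_probs(scores: pd.DataFrame) -> pd.DataFrame:
    """Share of weeks in which each team outscored each other team (ties count for neither)."""
    values = scores.to_numpy(dtype=float)             # shape: (teams, weeks)
    # beats[i, j, w] is True when team i outscored team j in week w
    beats = values[:, None, :] > values[None, :, :]
    probs = beats.mean(axis=2)                         # fraction of weeks won
    rows = [
        {"Team": team, "Opponent": opponent, "WinProb": probs[i, j]}
        for i, team in enumerate(scores.index)
        for j, opponent in enumerate(scores.index)
        if i != j
    ]
    return pd.DataFrame(rows, columns=["Team", "Opponent", "WinProb"])

# Toy data: A outscores B in both weeks; B and C split their two weeks.
toy = pd.DataFrame({"1": [10.0, 8.0, 9.0], "2": [13.0, 12.0, 9.0]}, index=["A", "B", "C"])
toy_probs = pairwise_win_probs(toy)
```

Like the loop version, this produces one row per ordered (Team, Opponent) pair, so each matchup shows up once from each side.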

In [12]:
#win_probs[win_probs['Team'] == 'Wire City FC']
win_probs = win_probs.drop_duplicates() #each matchup appears twice, once from each team's side; those rows differ, so this only removes exact repeats
win_probs.sort_values('WinProb')

,Team,Opponent,WinProb
121,Josh's Team,Board Man Gets Paid,0.000000
126,Josh's Team,Wire City FC,0.000000
123,Josh's Team,Spicy O G,0.083333
110,DuttaRightThing,Board Man Gets Paid,0.083333
122,Josh's Team,NYXL,0.083333
...,...,...,...
32,Spicy O G,Josh's Team,0.916667
109,Butterfly Crushers,Josh's Team,0.916667
43,Net Browser,Josh's Team,0.916667
10,Board Man Gets Paid,Josh's Team,1.000000


In [13]:
remaining_sched = pd.read_csv('wcfc_remaining.csv', encoding='latin-1', index_col=None)
remaining_sched.head()

,Week,Team,Opponent
0,13,Wire City FC,Net Browser
1,13,NYXL,JokicUCantScratch
2,13,Spicy O G,LSDB
3,13,Butterfly Crushers,Going Back Zubac
4,13,Josh's Team,Im still a fun guy!


In [14]:
schedule_probs = pd.merge(remaining_sched, win_probs, on=['Team', 'Opponent'], how='inner')
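Because this is an inner merge, a scheduled game whose team names do not exactly match a row in `win_probs` (a stray space, a renamed team) is dropped without any warning. One way to check for that, sketched here on toy data with made-up names, is to run a left merge with pandas' `indicator=True` option and look for rows marked `left_only`.

```python
import pandas as pd

# Toy frames; the second schedule row has a trailing space in the opponent name.
toy_sched = pd.DataFrame({"Week": [13, 13], "Team": ["A", "C"], "Opponent": ["B", "D "]})
toy_probs = pd.DataFrame({"Team": ["A", "C"], "Opponent": ["B", "D"], "WinProb": [0.75, 0.25]})

# how="left" keeps every scheduled game; the _merge column records whether a match was found.
checked = pd.merge(toy_sched, toy_probs, on=["Team", "Opponent"], how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]
```

If `unmatched` is empty, the inner merge kept every remaining game.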

In [17]:
#the next few code blocks represent a single run of the actual simulation loop

#for each win probability, randomly guess a winner of the matchup with odds equal to the win probability

win_predicts = []
for value in schedule_probs['WinProb']:
    win_predicts.append(np.random.choice([1,0], p=[value, (1-value)]))
    
schedule_probs['team_win'] = win_predicts
schedule_probs['oppo_win'] = [1 - item for item in win_predicts] #the opponent wins exactly when the team does not
schedule_probs.head()

,Week,Team,Opponent,WinProb,team_win,oppo_win
0,13,Wire City FC,Net Browser,0.333333,0,1
1,13,NYXL,JokicUCantScratch,0.583333,1,0
2,13,Spicy O G,LSDB,0.5,0,1
3,13,Butterfly Crushers,Going Back Zubac,0.75,0,1
4,13,Josh's Team,Im still a fun guy!,0.166667,0,1
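The per-game draw can also be done for all games at once. A minimal sketch, assuming NumPy's `default_rng` generator rather than the `np.random.choice` call used above; the seed and the `toy_` name are only there to make the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only so the example is repeatable
toy_win_probs = np.array([0.333333, 0.583333, 0.5, 0.75, 0.166667])  # WinProb values from the table above

# A uniform draw on [0, 1) falls below p with probability p, so this is one Bernoulli trial per game.
team_win = (rng.random(toy_win_probs.size) < toy_win_probs).astype(int)
oppo_win = 1 - team_win
```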


In [18]:
#simplify that frame down to the number of wins each team gains in that simmed run of the league

wins1 = schedule_probs[['Team', 'team_win']]
wins2 = schedule_probs[['Opponent', 'oppo_win']]
wins1 = wins1.groupby('Team').sum()
wins2 = wins2.groupby('Opponent').sum()
wins2 = wins2.rename(columns={'oppo_win':'team_win'})

wins3 = pd.concat([wins1, wins2])
wins3['Team'] = wins3.index
simmed_wins = wins3.groupby('Team').sum()
simmed_wins

Team,team_win
Board Man Gets Paid,5
Butterfly Crushers,3
DuttaRightThing,0
Going Back Zubac,5
Im still a fun guy!,5
JokicUCantScratch,5
Josh's Team,0
LSDB,5
NYXL,7
Net Browser,6
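The tally above can be written a little more directly by stacking both sides of every game into one long frame before grouping. This is a sketch on invented games, with `toy_` names and a `win` column that are mine rather than the notebook's.

```python
import pandas as pd

# Three invented games: A beats B, D beats C, C beats A.
toy_games = pd.DataFrame({
    "Team": ["A", "C", "A"],
    "Opponent": ["B", "D", "C"],
    "team_win": [1, 0, 0],
    "oppo_win": [0, 1, 1],
})

# Rename both sides to the same (Team, win) columns, stack them, then sum per team.
home = toy_games[["Team", "team_win"]].rename(columns={"team_win": "win"})
away = toy_games[["Opponent", "oppo_win"]].rename(columns={"Opponent": "Team", "oppo_win": "win"})
toy_simmed_wins = pd.concat([home, away]).groupby("Team")["win"].sum()
```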


In [21]:
#read in the current standings frame

current_standings = pd.read_csv('wcfc_current_standings.csv', encoding='latin-1', index_col='Team')
current_standings = current_standings[['Points', 'W']]
new_standings = current_standings #note: this is a second name for the same frame, not a copy
current_standings

Team,Points,W
Board Man Gets Paid,14629.16,10
NYXL,14019.06,8
Spicy O G,13609.44,8
Net Browser,13946.31,7
JokicUCantScratch,13628.01,7
Wire City FC,13041.8,7
Going Back Zubac,12765.86,7
LSDB,13244.5,5
Im still a fun guy!,13151.19,5
Butterfly Crushers,12980.89,5


In [22]:
#add the simmed wins to the current wins to get a simmed "outcome" of the league

for team in team_list:
    new_standings.at[team, 'W'] = new_standings.at[team, 'W'] + simmed_wins.at[team, 'team_win']
    
new_standings

Team,Points,W
Board Man Gets Paid,14629.16,15
NYXL,14019.06,15
Spicy O G,13609.44,10
Net Browser,13946.31,13
JokicUCantScratch,13628.01,12
Wire City FC,13041.8,12
Going Back Zubac,12765.86,12
LSDB,13244.5,10
Im still a fun guy!,13151.19,10
Butterfly Crushers,12980.89,8


In [23]:
#rearrange the simulated outcome frame by wins and then points (the tiebreaker metric) to determine the "place" of each team
#in the simmed league

simmed_standings = new_standings.sort_values(by=['W', 'Points'], ascending=False)
simmed_standings['place'] = np.arange(1, len(simmed_standings) + 1) #places 1..n without hard-coding the league size
simmed_standings

Team,Points,W,place
Board Man Gets Paid,14629.16,15,1
NYXL,14019.06,15,2
Net Browser,13946.31,13,3
JokicUCantScratch,13628.01,12,4
Wire City FC,13041.8,12,5
Going Back Zubac,12765.86,12,6
Spicy O G,13609.44,10,7
LSDB,13244.5,10,8
Im still a fun guy!,13151.19,10,9
Butterfly Crushers,12980.89,8,10


In [45]:
#this is the equivalent of running the previous five code blocks n times to get a distribution of possible outcomes
#for each team

all_predictions = pd.DataFrame(columns=['Points', 'W', 'place'])

i = 0

number_sims = 10000 #takes about 36 seconds per thousand sims on my machine
                               
while i < number_sims: 
    win_predicts = []
    for value in schedule_probs['WinProb']:
        win_predicts.append(np.random.choice([1,0], p=[value, (1-value)]))

    schedule_probs['team_win'] = win_predicts
    schedule_probs['oppo_win'] = [1 - item for item in win_predicts]
    
    wins1 = schedule_probs[['Team', 'team_win']]
    wins2 = schedule_probs[['Opponent', 'oppo_win']]
    wins1 = wins1.groupby('Team').sum()
    wins2 = wins2.groupby('Opponent').sum()
    wins2 = wins2.rename(columns={'oppo_win':'team_win'})

    wins3 = pd.concat([wins1, wins2])
    wins3['Team'] = wins3.index
    simmed_wins = wins3.groupby('Team').sum()
    
    current_standings = pd.read_csv('wcfc_current_standings.csv', encoding='latin-1', index_col='Team')
    current_standings = current_standings[['Points', 'W']]
    new_standings = current_standings
    
    for team in team_list:
        new_standings.at[team, 'W'] = new_standings.at[team, 'W'] + simmed_wins.at[team, 'team_win']
        
    simmed_standings = new_standings.sort_values(by=['W', 'Points'], ascending=False)
    simmed_standings['place'] = np.arange(1, len(simmed_standings) + 1)
    all_predictions = pd.concat([all_predictions, simmed_standings])
    i = i + 1
    
    if (i % 10 == 0):
        print(i) #totally unnecessary, just gives a sense of progress

10
20
30
...
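Most of the loop's runtime goes to rebuilding DataFrames on every pass. As a rough sketch of a faster alternative (not the code that produced the results below), all simulations can be drawn at once as NumPy arrays. `simulate_places` is a hypothetical helper; it assumes every team in the schedule also appears in the standings index, and that the tiebreaker is current total points held fixed, as in the rest of this notebook. Teams with exactly equal points would be ordered arbitrarily.

```python
import numpy as np
import pandas as pd

def simulate_places(schedule_probs, standings, n_sims, seed=None):
    """Return an (n_sims, n_teams) array of finishing places, columns ordered like standings.index."""
    rng = np.random.default_rng(seed)
    teams = list(standings.index)
    pos = {team: k for k, team in enumerate(teams)}
    home = schedule_probs["Team"].map(pos).to_numpy().astype(int)
    away = schedule_probs["Opponent"].map(pos).to_numpy().astype(int)
    p = schedule_probs["WinProb"].to_numpy(dtype=float)

    # One Bernoulli draw per (simulation, game): True means the "Team" side won.
    home_won = rng.random((n_sims, p.size)) < p
    wins = np.tile(standings["W"].to_numpy(dtype=float), (n_sims, 1))
    for g in range(p.size):
        wins[:, home[g]] += home_won[:, g]
        wins[:, away[g]] += ~home_won[:, g]

    # Sort key: wins first, then a points-based fraction below 1 that only breaks ties.
    points = standings["Points"].to_numpy(dtype=float)
    tiebreak = points.argsort().argsort() / len(teams)
    key = wins + tiebreak
    # argsort of argsort turns the descending order into a 1-based place per team.
    return np.argsort(np.argsort(-key, axis=1), axis=1) + 1

# Toy check with certain outcomes: A beats B and C beats B in every simulation.
toy_standings = pd.DataFrame({"Points": [100.0, 90.0, 95.0], "W": [1, 1, 0]},
                             index=pd.Index(["A", "B", "C"], name="Team"))
toy_sched = pd.DataFrame({"Team": ["A", "C"], "Opponent": ["B", "B"], "WinProb": [1.0, 1.0]})
places = simulate_places(toy_sched, toy_standings, n_sims=5, seed=0)
top_two_prob = (places <= 2).mean(axis=0)  # share of sims in which each team finishes in the top two
```

In the toy league, B and C both end on one win, and C takes second place on points.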

In [46]:
#table of the distribution of each team's simmed outcomes

all_predictions['team'] = all_predictions.index
outcome_dist = pd.pivot_table(all_predictions, values = ['place'], index = ['team'], columns=all_predictions.place.values, aggfunc='count', fill_value=0)
outcome_dist

team,1,2,3,4,5,6,7,8,9,10,11,12
Board Man Gets Paid,9590,337,58,13,1,1,0,0,0,0,0,0
Butterfly Crushers,0,8,47,127,339,624,1046,1720,2483,3514,92,0
DuttaRightThing,0,0,0,0,0,1,1,4,23,203,9763,5
Going Back Zubac,0,64,253,625,1120,1560,1707,1821,1576,1269,5,0
Im still a fun guy!,0,7,37,173,464,972,1520,1901,2327,2500,99,0
JokicUCantScratch,10,506,1370,2195,2334,1646,1014,548,282,95,0,0
Josh's Team,0,0,0,0,0,0,0,0,0,0,5,9995
LSDB,0,10,86,314,909,1589,1913,1865,1723,1560,31,0
NYXL,310,6135,2091,804,389,153,75,33,8,2,0,0
Net Browser,22,1343,3033,2516,1344,844,506,231,119,42,0,0


In [63]:
#in this league, the top six teams make the playoffs and the top two teams receive a first-round bye (FRB)
#I am calculating each team's probability of reaching each goal by summing the relevant place columns and dividing by the number of sims

outcome_dist['FRB'] = (outcome_dist.place[1] + outcome_dist.place[2]) / number_sims
outcome_dist['Playoffs'] = (outcome_dist.place[1] + outcome_dist.place[2] 
                            + outcome_dist.place[3] + outcome_dist.place[4]
                            + outcome_dist.place[5] + outcome_dist.place[6]) / number_sims

outcome_dist.sort_values('Playoffs', ascending=False)

team,1,2,3,4,5,6,7,8,9,10,11,12,FRB,Playoffs
Board Man Gets Paid,9590,337,58,13,1,1,0,0,0,0,0,0,0.9927,1.0
NYXL,310,6135,2091,804,389,153,75,33,8,2,0,0,0.6445,0.9882
Spicy O G,65,1477,2706,2477,1810,833,392,163,58,19,0,0,0.1542,0.9368
Net Browser,22,1343,3033,2516,1344,844,506,231,119,42,0,0,0.1365,0.9102
JokicUCantScratch,10,506,1370,2195,2334,1646,1014,548,282,95,0,0,0.0516,0.8061
Wire City FC,3,113,319,756,1290,1777,1826,1714,1401,796,5,0,0.0116,0.4258
Going Back Zubac,0,64,253,625,1120,1560,1707,1821,1576,1269,5,0,0.0064,0.3622
LSDB,0,10,86,314,909,1589,1913,1865,1723,1560,31,0,0.001,0.2908
Im still a fun guy!,0,7,37,173,464,972,1520,1901,2327,2500,99,0,0.0007,0.1653
Butterfly Crushers,0,8,47,127,339,624,1046,1720,2483,3514,92,0,0.0008,0.1145
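Since each probability is a share of simulated seasons, it carries some sampling noise of its own. A rough way to size that noise is the usual standard error of a proportion, sqrt(p(1-p)/n). The helper name below is mine, and this only measures simulation noise; it says nothing about how good the week-by-week win probabilities are.

```python
import math

def mc_standard_error(p_hat, n_sims):
    """Standard error of a proportion estimated from n_sims independent simulations."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n_sims)

# Wire City FC's Playoffs estimate from the table above, with number_sims = 10000
se = mc_standard_error(0.4258, 10000)
low, high = 0.4258 - 1.96 * se, 0.4258 + 1.96 * se  # rough 95% interval
```

For that row the standard error works out to about half a percentage point, so 10,000 sims pins the estimate down to roughly plus or minus one point.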
