This notebook is a Monte Carlo simulation of a LearnedLeague season. 

It ingests the following information:

1. The recent stats of all players in the league, defined as average TCA (total correct answers) and DE (defensive efficiency) in the three most recent seasons they have played out of the previous five
2. The league's remaining schedule

Remaining TCA is projected from a weighted average of recent history and to-date performance that season. The weight on recent history, which acts as an assumption of regression to the mean, diminishes as the season continues.
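
As a rough illustration of that blend, the sketch below reuses the `MD_weight` table defined later in this notebook and assumes a 25-match-day season with no forfeits. The function name `projected_remaining_tca` and its parameters are hypothetical and not part of the notebook.

```python
# Weight given to this season's pace after each completed match day
# (values copied from the MD_weight list used later in the notebook).
MD_WEIGHT = [0, 0, 0, 0, 0, 0, 0.03, 0.07, 0.11, 0.17,
             0.24, 0.32, 0.4, 0.5, 0.6, 0.66, 0.76,
             0.83, 0.89, 0.93, 0.97, 1, 1, 1, 1, 1]


def projected_remaining_tca(hist_tca, tca_to_date, completed, season_length=25):
    """Blend historical season TCA with this season's pace, then subtract
    the correct answers already banked (assumes no forfeits)."""
    if completed == 0:
        return round(hist_tca)  # nothing played yet: history only
    per_day = tca_to_date / completed
    pace = per_day * season_length  # full-season TCA at the current rate
    w = MD_WEIGHT[completed]
    full_season = round(hist_tca * (1 - w) + pace * w)
    return round(full_season - per_day * completed)


projected_remaining_tca(120, 30, 5)   # 90: early on, history carries all the weight
projected_remaining_tca(120, 78, 13)  # 57: at MD13 history and pace are weighted equally
```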

Individual matchups are simulated in three steps: projecting a sequence of integers from 0 to 6 (one per remaining match day) that sum to the estimated remaining TCA, converting each day's number into a random point score weighted by the opponent's defensive efficiency, and comparing that score to the opponent's score.
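
The first of those steps can be sketched with the standard library alone. This is a simplified take on the notebook's `set_of_results` helper, which uses NumPy; the name `random_daily_scores` and the `sigma` default are assumptions made for illustration.

```python
import random


def random_daily_scores(total, days, sigma=1.0):
    """Rejection-sample `days` integers in 0..6 that sum exactly to `total`."""
    if days <= 0 or not 0 <= total <= 6 * days:
        # Without this guard the loop below could never terminate.
        raise ValueError("total must be reachable with daily scores of 0 to 6")
    while True:
        draws = [min(6, max(0, round(random.gauss(total / days, sigma))))
                 for _ in range(days)]
        if sum(draws) == total:
            return draws


scores = random_daily_scores(57, 12)  # e.g. 57 correct answers over 12 match days
```

Rejection sampling is wasteful but simple: most draws are thrown away, yet every accepted sequence is guaranteed to hit the projected total.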

The final output is a DataFrame sorted by median final placement, with promotion and relegation probabilities added for public leagues.
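
To make the summary columns concrete, here is a minimal sketch that derives them from one player's place counts. The function `summarize_places` is hypothetical; the counts in the usage line are BurgessD's row from the sample output in this notebook (54 simulations, 3 promotion and 4 relegation places).

```python
from statistics import median


def summarize_places(place_counts, promotion, relegation):
    """place_counts[i] is the number of sims in which the player finished in place i + 1."""
    sims = sum(place_counts)
    # Expand the counts back into one finishing place per simulation.
    finishes = [place for place, n in enumerate(place_counts, start=1) for _ in range(n)]
    return {
        "median_pos": median(finishes),
        "promoted": sum(place_counts[:promotion]) / sims,
        "relegated": sum(place_counts[len(place_counts) - relegation:]) / sims,
    }


burgess = summarize_places([45, 7, 2] + [0] * 25, promotion=3, relegation=4)
```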

*Model Limits*

1. Newer players are likely disadvantaged by this system, as it is to be expected that they will outperform their previous seasons as they learn to "read" the questions.
2. Players who have significantly improved themselves between seasons will not be recognized as such until midseason or later. Additionally, because the model does not give special weight to the most recent season, recent improvement is not assumed to be permanent until repeated in multiple seasons.
3. Individual matchups are blind to the category stats of the individual players; it is my belief that a matchup between players with very different relative strengths is more likely to yield an upset than two players of equivalent relative ability.
4. The "defensive table" and weighting curve used here are not based on meaningful research, and are merely my attempts to make reasonable assumptions about proper inputs.

Example output below is based on a run of Tundra A as of MD5, LL100.

## User Inputs

In [151]:
league_type = 'public' #private #public

season = 100
league_name = 'Tundra' #capitalize
rundle = 'A' # blank for private
division = 0 #1, 2, 3; zero for non-divided rundles
players = 28

#pro/rel info for public leagues - will be ignored in private leagues
promotion = 3
relegation = 4

### Package Installs + Setup

In [152]:
import matplotlib.pyplot as plt
import numpy as np

import pandas as pd
import collections
from functools import reduce

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

from webdriver_manager.chrome import ChromeDriverManager

import time
import random
from statistics import median

import os
import os.path

pd.set_option('display.max_columns', 30)

In [153]:
current_season = 'LL' + str(season)

def formatted_lg(rundle, string, division):
    if (league_type == 'public') & (division == 0):
        return(rundle + '_' + string.capitalize())
    elif (league_type == 'public') & (division != 0):
        return(rundle + '_' + string.capitalize() + '_Div_' + str(division))
    elif league_type == 'private':
        return(string.title().replace(' ', '_'))
    
formatted_league = formatted_lg(rundle, league_name, division)

formatted_league

'A_Tundra'

In [154]:
def league_string(league_type, season, league_name, rundle, division):
    '''
    function that sets up the url for the league
    '''
    
    league_name = formatted_lg(rundle, league_name, division)
    
    if (league_type == 'public'):
        final_string = str(season) + '&' + league_name
        return(final_string)
        #if (division == 0):
        #    return(final_string)
        #else:
        #    return(final_string + '_Div_' + str(division))
    elif (league_type == 'private'):
        final_string = str(season) + '&' + league_name
        return(final_string)
    
def standings_url(string):
    return('https://learnedleague.com/standings.php?' + string)

In [155]:
league = league_string(league_type, season, league_name, rundle, division)
sched_path = 'LL' + str(season) + '_' + league_name + '_' + formatted_league + '_rundle_sched.csv'
standings_url(league)

'https://learnedleague.com/standings.php?100&A_Tundra'

In [156]:
chromeOptions = webdriver.ChromeOptions()
prefs = {"download.default_directory" : os.path.dirname(os.path.realpath('__file__'))}
chromeOptions.add_experimental_option("prefs",prefs)
service = Service('chromedriver')
driver = webdriver.Chrome(options=chromeOptions)

driver.get('https://learnedleague.com/')

credentials = pd.read_json('credentials.json')

login = driver.find_element(By.NAME, 'username')
login.send_keys(credentials['learnedleague']['username'])

pw = driver.find_element(By.NAME, 'password')
pw.send_keys(credentials['learnedleague']['password'])

clickable = driver.find_element(By.NAME, 'login')
clickable.click()

time.sleep(1)

driver.get(standings_url(league))

## Getting Baseline Info

In [157]:
def profile_getter(players):
    
    driver.get(standings_url(league))
    urls = []
    for i in range(1, players + 1):
        y = driver.find_element(By.XPATH, '//*[@id="lft"]/div[1]/table/tbody/tr[' + str(i) + ']/td[3]/a')
        url = y.get_attribute('href')
        urls.append(url)
        
    return(urls)

def schedule_generator():
    #run once at start of season to get schedule
        
    if os.path.isfile(sched_path): #if you've already generated the schedule this skips a step
        return 'schedule file already exists'
    
    sched = pd.DataFrame(columns=['MD', 'P1', 'P2'])
    for k in range(1,26): #match days 1 through 25 of the regular season
        time.sleep(1)
        if league_type == 'private':
            driver.get('https://learnedleague.com/schedule.php?' + str(season) + '&' + str(k) + '&' + formatted_league)
        elif league_type == 'public':
            driver.get('https://learnedleague.com/schedule.php?' + str(season) + '&' + str(k) + '&' + formatted_league)
            
        row_count = int((players / 2) + 1) #stops after the last row of players
        
        for i in range(1,row_count):
                x1 = driver.find_element(By.XPATH, '//*[@id="main"]/div/div[1]/div[2]/table/tbody/tr[' + str(i) + ']/td[2]').text
                x2 = driver.find_element(By.XPATH, '//*[@id="main"]/div/div[1]/div[2]/table/tbody/tr[' + str(i) + ']/td[4]').text
                print(k, x1, x2)
                
                sched.loc[len(sched.index)] = [k, x1, x2]  
    
    sched.to_csv(sched_path) #same path that the isfile check above and the later read_csv calls use
    return sched

In [158]:
schedule_generator()

'schedule file already exists'

In [159]:
urls = profile_getter(players)

In [160]:
player_list = list(pd.read_csv(sched_path)[:(int(players/2))][['P1']]['P1']) + \
              list(pd.read_csv(sched_path)[:(int(players/2))][['P2']]['P2'])

In [162]:
def last_five_seasons(number):
    minus_one = 'LL' + str(number - 1)
    minus_two = 'LL' + str(number - 2)
    minus_three = 'LL' + str(number - 3)
    minus_four = 'LL' + str(number - 4)
    minus_five = 'LL' + str(number - 5)
    return([minus_one, minus_two, minus_three, minus_four, minus_five])

def get_season_summaries(number):
    for league in last_five_seasons(number):
        if os.path.isfile(league + '_Leaguewide_MD25.csv'): #if the season file is already on disk, skip the download
            print(league + ' file available')
        else:
            league_no = league.replace('LL', '')
            driver.get('https://learnedleague.com/lgwide.php?' + league_no)
            print(league + ' file downloaded')
            
def average_stats(player_list):
    '''
    revised version of the average_stats function that does not involve hitting the website 25 times
    '''
    get_season_summaries(season)
    
    stats = pd.DataFrame(columns=['Player', 'TCA', 'DE'])
    
    #reads in most recent five season files
    
    szn_1 = pd.read_csv(last_five_seasons(season)[0] + '_Leaguewide_MD25.csv', encoding='latin-1')
    szn_2 = pd.read_csv(last_five_seasons(season)[1] + '_Leaguewide_MD25.csv', encoding='latin-1')
    szn_3 = pd.read_csv(last_five_seasons(season)[2] + '_Leaguewide_MD25.csv', encoding='latin-1')
    szn_4 = pd.read_csv(last_five_seasons(season)[3] + '_Leaguewide_MD25.csv', encoding='latin-1')
    szn_5 = pd.read_csv(last_five_seasons(season)[4] + '_Leaguewide_MD25.csv', encoding='latin-1')
    
    frames = [szn_1, szn_2, szn_3, szn_4, szn_5]
    
    for player in player_list:
        player_frame = pd.DataFrame(columns=['Player', 'TCA', 'DE'])
        
        for szn in frames:
            # checks if player has a row in a historical LL season file, pulls in values if so
            row = szn[szn.Player == player][['TCA', 'DE']]
            try:
                player_frame.loc[len(player_frame.index)] = [player, row['TCA'].iloc[0], row['DE'].iloc[0]]
            except IndexError: #player has no row in this season's file
                pass

        if len(player_frame) > 0: 
            player_frame = player_frame[:3] #if they've played more than three of the five seasons, only take most recent three
            player_frame['DE'] = [float(i) for i in player_frame.DE]
        
            stats.loc[len(stats.index)] = [player, round(player_frame.TCA.mean()), round(player_frame.DE.mean(),3)]
        
        if len(player_frame) == 0:
            print(player + ' not found in past five seasons')
      
    return(stats)

'''def average_stats_v1(urls):
    
    # DEPRECATED - included for reference
    
    # pulls average TCA and DE for the most recent three seasons, with a tolerance of one missed season this year; 
    # you can mess with the "last four seasons" function above if you want to use a longer range of recent seasons
        
    stats = pd.DataFrame(columns=['Player', 'TCA', 'DE'])
    time.sleep(1)
    for url in urls:
        driver.get(url + "&2")
        time.sleep(1)
        
        try:
            name = driver.find_element(By.XPATH, '//*[@id="main"]/div/div[1]/div[3]/div[2]/h1').text
        except:
            name = driver.find_element(By.XPATH, '//*[@id="main"]/div/div[1]/div[2]/div[2]/h1').text
        
        df = pd.read_html(driver.page_source)[4]
        df = df[df.Season.str.contains('LL')]
        df = df[df.Season.isin(last_five_seasons(season))]
        
        if len(df) > 0: 
            df = df[:3] #if they've played all four seasons, only take most recent three
            df['DE'] = [float(i) for i in df.DE]
        
            stats.loc[len(stats.index)] = [name, round(df.TCA.mean()), round(df.DE.mean(),3)]  
            
        else: #rare edge case where player is lapsed by more than a year; takes most recent season
            df = pd.read_html(driver.page_source)[4]
            df = df[df.Season.str.contains('LL')]
            df = df[~df['Season'].isin([current_season])]
            df['DE'] = [float(i) for i in df.DE]
            df = df.reset_index()
            df = df.loc[0]
            
            stats.loc[len(stats.index)] = [name, round(df.TCA.mean()), round(df.DE.mean(),3)]  
            
    return(stats)'''


In [163]:
player_stats = average_stats(player_list)

player_stats.sort_values('TCA', ascending=False)

LL99 file available
LL98 file available
LL97 file available
LL96 file available
LL95 file available
MillerRA not found in past five seasons


|    | Player | TCA | DE |
|---:|:---|---:|---:|
| 0 | BurgessD | 139 | 0.71 |
| 9 | ReedN | 123 | 0.745 |
| 12 | ZuffranieriJ | 123 | 0.751 |
| 2 | CraneN | 122 | 0.731 |
| 14 | Evaskis-GarrettC | 121 | 0.707 |
| 1 | ChuHau | 119 | 0.72 |
| 8 | PetersonDE | 118 | 0.624 |
| 16 | FiestaR | 116 | 0.674 |
| 11 | ZimmermanP | 112 | 0.695 |
| 13 | CheungE | 112 | 0.765 |


In [165]:
# code for manually replacing stats for whatever reason

#player_stats.iloc[[17],[1]] = 0
#player_stats.iloc[[17],[2]] = .580

# code for manually adding a missing player

#player_stats.loc[len(player_stats.index)] = ['MillerRA', 0, .580]

### Matchup Sim Functions

In [167]:
def defensive_table(de):
    '''
    Just a little something I cooked up to make defense kinda-sorta matter
    and convert correct answers into point totals
    '''
    probs_table = [{0: 1, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9:0},
               {0: (2*.1667)*de, 1: (2*.3333)*de, 2: (2*.3333)*(1-de), 3: (2*.1667)*(1-de), 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9:0},
               {0: 0, 1: (2*.1111)*de, 2: (2*.2222)*de, 3: .3333, 4: (2*.2222)*(1-de), 5: (2*.1111)*(1-de), 6: 0, 7: 0, 8: 0, 9:0},
               {0: 0, 1: 0, 2: (2*.08333)*de, 3: (2*.1667)*de, 4: (2*.25)*de, 5: (2*.25)*(1-de), 6: (2*.1667)*(1-de), 7: (2*.08333)*(1-de), 8: 0, 9:0},
               {0: 0, 1: 0, 2: 0, 3: 0, 4: (2*.1111)*de, 5: (2*.2222)*de, 6: .3333, 7: (2*.2222)*(1-de), 8: (2*.1111)*(1-de), 9:0},
               {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: (2*.1667)*de, 7: (2*.3333)*(de), 8: (2*.3333)*(1-de), 9:(2*.1667)*(1-de)},
               {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9:1}]
    return pd.DataFrame(probs_table)

def heads_up_points(score, DE):
    '''
    random points total for one shot
    '''
    
    probs = list(defensive_table(DE).loc[score])
    points = random.choices([0,1,2,3,4,5,6,7,8,9], weights=probs, k=1)
    
    return(points[0])

def heads_up_points_noisy(score, DE):
    '''
    random points total for one match with a little bit of bullshit
    because sometimes nonsense happens and we gotta account for that
    '''
    
    noise = random.randint(1,100)
    points = heads_up_points(score, DE)
    
    if (noise < 6) and (points > 0) and (score != 6):
        return(points - 1)
    if (noise > 95) and (points < 9) and (score != 0):
        return(points + 1)
    else:
        return(points)
    
def set_of_results(m, n):
    '''
    Generates a list of (n) integers between 0 and 6, drawn from a rounded and
    clipped normal distribution, that sums to the total (m)
    
    To be used to produce a random set of results for the rest of the season
    '''
    flag = 0
    while flag == 0:
        s = np.random.normal(m/n, 1, n) #sigma = 1 is a number I made up as well that seemed to produce a coherent set of #s

        value_list = []

        for i in range(0,len(s)):
            j = round(s[i])
            if j > 6:
                j = 6
            if j < 0:
                j = 0
            value_list.append(j)

        if (sum(value_list) < (m + 1)) and (sum(value_list) > (m - 1)):
            flag = 1
        
    return(value_list)

def get_medians(df):
    '''median finishing place per player, computed from the table of place counts; the sort that i like the most for this'''
    median_col = []
    
    for player in df.index:
        sim_pos = list(df.loc[player])
        all_results = []
        
        for i in range(1,len(sim_pos)+1):
            rel_list = list([i] * sim_pos[i-1])
            all_results = all_results + rel_list

        med = median(all_results)
        median_col.append(med)
        
    return(median_col)

def get_list(df, col):
    return(list(df[col]))
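
To get a feel for how much DE matters in `defensive_table`, the sketch below copies the weights for three correct answers (row 3 of the table) and computes the expected point total at a few DE values. The helper names `three_correct_weights` and `expected_points` are hypothetical; only the weights come from the notebook.

```python
def three_correct_weights(de):
    """Weights for point totals 2..7 given 3 correct answers (row 3 of defensive_table)."""
    return {
        2: (2 * .08333) * de,
        3: (2 * .1667) * de,
        4: (2 * .25) * de,
        5: (2 * .25) * (1 - de),
        6: (2 * .1667) * (1 - de),
        7: (2 * .08333) * (1 - de),
    }


def expected_points(weights):
    """Weighted mean of the point totals; normalizes in case the weights do not sum to exactly 1."""
    total = sum(weights.values())
    return sum(points * w for points, w in weights.items()) / total


for de in (0.60, 0.70, 0.80):
    print(de, round(expected_points(three_correct_weights(de)), 2))
```

Because the low totals (2 to 4) are scaled by `de` and the high totals (5 to 7) by `1 - de`, the expected score falls as the opposing DE rises.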

### The Loop

The first cell below (marked "run ONCE") pulls the current standings and builds the to-date stats, which remain constant and unmodified in the following loops.

The i loop can be killed at any time to continue down the notebook. To continue appending after, run the "loop" cell again.
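
The stop-and-resume pattern can be sketched as below. This is an illustration rather than the notebook's own code: `run_sims` and `simulate_once` are hypothetical names, and the sketch relies on the fact that interrupting a running cell raises `KeyboardInterrupt`.

```python
def run_sims(simulate_once, n_sims, results):
    """Append one result per simulation; stop cleanly if the run is interrupted."""
    try:
        for _ in range(n_sims):
            results.append(simulate_once())
    except KeyboardInterrupt:
        pass  # keep every simulation that finished before the interrupt
    return len(results)
```

Since `results` lives outside the function, calling `run_sims` again with the same list keeps appending to it.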

In [172]:
#run ONCE initially to intake current league standings

driver.get(standings_url(league))

sched_main = pd.read_csv(sched_path)
sched_main = sched_main[['MD', 'P1', 'P2']]

all_sims = pd.DataFrame(columns=['win', 'lose', 'tie', 'points', 'TCA', 'score', 'differential','FL','place'])

#driver.get(standings_url(league))

to_date = pd.read_html(driver.page_source)[0][['Player', 'W', 'L', 'T', 'PTS', 'MPD', 'TMP', 'TCA', 'FL']]

## example of ad-hoc fix if player name is too long
#to_date = to_date.replace({'Evaskis-Garre.': 'Evaskis-GarrettC'})

for col in ['W', 'L', 'T', 'PTS', 'MPD', 'TMP', 'TCA', 'FL']:
    to_date[col] = [float(i) for i in to_date[col]]

completed = int(to_date['W'][0] + to_date['L'][0] + to_date['T'][0])

if completed > 0:
    to_date['PER_DAY'] = round((to_date['TCA'] / (to_date['W'] + to_date['L'] + to_date['T'] - to_date['FL'])), 2)
    to_date['PER_DAY_PACE'] = to_date['PER_DAY'] * 25
    
else:
    to_date['PER_DAY'] = 0
    to_date['PER_DAY_PACE'] = 0

MD_weight = [0,	0,	0,	0,	0,	0,	0.03,	0.07,	0.11,	0.17,\
             0.24,	0.32,	0.4,	0.5,	0.6,	0.66,	0.76,\
             0.83,	0.89,	0.93,	0.97,	1,	1,	1,	1,	1]


I do not recommend waiting to run all 10,000 sims: at roughly 4.5 seconds per sim, that is about 12.5 hours.

In [173]:
#begin generating sims; one sim takes about 4.5 seconds

session_sims = pd.DataFrame(columns=['win', 'lose', 'tie', 'points', 'TCA', 'score', 'differential','FL','place'])

player_list = []
win_list = []
lose_list = []
tie_list = []
points_list = []
TCA_list = []
score_list = []
differential_list = []
FL_list = []
place_list = []

for i in range(0, 10000): #you can lower this if you want, or pause this cell and continue running after
    seed = random.randint(0,5000000000)
    
    sim = player_stats
    sim = sim.merge(to_date, on='Player', how='left', suffixes=['','_todate'])
    
    sim['weighted_TCA'] = (sim['TCA'] * (1- MD_weight[completed])) + (sim['PER_DAY_PACE'] * (MD_weight[completed]))
    sim['weighted_TCA'] = [round(i, 0) for i in sim.weighted_TCA]        
    sim['TCA_reversion'] = [round(i - (j*completed),0) for i, j in zip(sim.weighted_TCA, sim.PER_DAY)] #round(i - (j*completed),0)
    sim['random_matches'] = [set_of_results(i, 25 - completed) for i in sim.TCA_reversion]

    sched = sched_main

    sched = sched_main[sched_main['MD'] > completed]
    sched = sched.reset_index()

    sched['P1_TCA'] = [sim[sim.Player == i]['random_matches'].iloc[0][j-completed-1] for i, j in zip(sched.P1, sched.MD)]
    sched['P2_TCA'] = [sim[sim.Player == i]['random_matches'].iloc[0][j-completed-1] for i, j in zip(sched.P2, sched.MD)]

    sched['P1_DE'] = [player_stats[player_stats.Player == i]['DE'].iloc[0] for i in sched.P1]
    sched['P2_DE'] = [player_stats[player_stats.Player == i]['DE'].iloc[0] for i in sched.P2]

    sched['P1_score'] = [heads_up_points_noisy(i,j) for i,j in zip(sched.P1_TCA, sched.P2_DE)]
    sched['P2_score'] = [heads_up_points_noisy(i,j) for i,j in zip(sched.P2_TCA, sched.P1_DE)]

    P1_win = []
    P2_win = []

    P1_tie = []
    P2_tie = []

    P1_lose = []
    P2_lose = []

    for i in range(0, len(sched)):
        if sched.P1_score[i] > sched.P2_score[i]:
            P1_win.append(1)
            P2_win.append(0)
            P1_tie.append(0)
            P2_tie.append(0)
            P1_lose.append(0)
            P2_lose.append(1)
        elif sched.P1_score[i] < sched.P2_score[i]:
            P1_win.append(0)
            P2_win.append(1)
            P1_tie.append(0)
            P2_tie.append(0)
            P1_lose.append(1)
            P2_lose.append(0)
        else:
            P1_win.append(0)
            P2_win.append(0)
            P1_tie.append(1)
            P2_tie.append(1)
            P1_lose.append(0)
            P2_lose.append(0)

    sched['P1_win'] = P1_win
    sched['P2_win'] = P2_win

    sched['P1_lose'] = P1_lose
    sched['P2_lose'] = P2_lose

    sched['P1_tie'] = P1_tie
    sched['P2_tie'] = P2_tie

    sched['differential'] = sched.P1_score - sched.P2_score

    long = pd.DataFrame(columns=['MD', 'Player', 'TCA', 'score', 'win', 'lose', 'tie', 'differential'])

    for i in range(0, len(sched)):
        long.loc[len(long.index)] = [sched.MD[i], sched.P1[i], sched.P1_TCA[i], sched.P1_score[i], sched.P1_win[i], \
                                     sched.P1_lose[i], sched.P1_tie[i],sched.differential[i]]
        
        long.loc[len(long.index)] = [sched.MD[i], sched.P2[i], sched.P2_TCA[i], sched.P2_score[i], sched.P2_win[i], \
                                     sched.P2_lose[i], sched.P2_tie[i],(sched.differential[i] * -1)]
        


    simmed_standings = long.groupby(by="Player").sum()

    simmed_standings['points'] = (simmed_standings['win'] * 2) + simmed_standings['tie']
    simmed_standings = simmed_standings[['win', 'lose', 'tie', 'points', 'TCA', 'score', 'differential']]

    simmed_standings = simmed_standings.sort_values(['points', 'differential', 'score', 'TCA'], ascending=False)
    simmed_standings['place'] = range(1,len(simmed_standings)+1)

    simmed_standings.merge(to_date, on='Player', how='left', suffixes=['','_todate'])

    spt = simmed_standings.merge(to_date, on='Player', how='left', suffixes=['','_todate']) #simmed plus todate

    simmed_latest = pd.DataFrame(columns=['Player', 'W', 'L', 'T', 'Pts', 'TCA', 'TMP', 'Diff', 'FL'])

    #simmed_standings['place'] = range(1,len(simmed_standings)+1)

    simmed_latest['Player'] = spt['Player']
    simmed_latest['W'] = spt['win'] + spt['W']
    simmed_latest['L'] = spt['lose'] + spt['L']
    simmed_latest['T'] = spt['tie'] + spt['T']
    simmed_latest['Pts'] = spt['points'] + spt['PTS']
    simmed_latest['TCA'] = spt['TCA'] + spt['TCA_todate']
    simmed_latest['TMP'] = spt['score'] + spt['TMP']
    simmed_latest['Diff'] = spt['differential'] + spt['MPD']
    simmed_latest['FL'] = spt['FL']

    simmed_latest = simmed_latest.sort_values(['Pts', 'Diff', 'TMP', 'TCA'], ascending=False)
    simmed_latest['place'] = range(1,len(simmed_latest)+1)

    simmed_latest = simmed_latest.set_index(['Player'])

    print(simmed_latest[['place']])
    
    player_list.extend(list(simmed_latest.index))
    
    win_list.extend(get_list(simmed_latest, 'W'))
    lose_list.extend(get_list(simmed_latest, 'L'))
    tie_list.extend(get_list(simmed_latest, 'T'))
    points_list.extend(get_list(simmed_latest, 'Pts'))
    TCA_list.extend(get_list(simmed_latest, 'TCA'))
    score_list.extend(get_list(simmed_latest, 'TMP'))
    differential_list.extend(get_list(simmed_latest, 'Diff'))
    FL_list.extend(get_list(simmed_latest, 'FL'))
    place_list.extend(get_list(simmed_latest, 'place'))

    #session_sims = pd.concat([session_sims, simmed_latest])

                  place
Player                 
BurgessD              1
Evaskis-GarrettC      2
CraneN                3
ChuHau                4
SmithC14              5
ReedN                 6
FiestaR               7
ZuffranieriJ          8
GrubbsA               9
PetersonDE           10
GregoryB             11
TemkinE              12
CheungE              13
RogersCP             14
ZimmermanP           15
BerkeA               16
KligisJ              17
RobinsonDrusoe       18
EnsonR               19
BonneauJ             20
EgelmanS             21
FrithP               22
SteeleM              23
IngramB              24
HovelandR            25
JapingaM             26
Gozlan Z             27
MillerRA             28

                  place
Player                 
BurgessD              1
ChuHau                2
CraneN                3
Evaskis-GarrettC      4
EgelmanS              5
GrubbsA               6
ZimmermanP            7
BonneauJ              8
ReedN                 9
FiestaR              10
SteeleM              11
BerkeA               12
ZuffranieriJ         13
RogersCP             14
PetersonDE           15
GregoryB             16
IngramB              17
Gozlan Z             18
EnsonR               19
KligisJ              20
TemkinE              21
SmithC14             22
CheungE              23
RobinsonDrusoe       24
FrithP               25
JapingaM             26
HovelandR            27
MillerRA             28

                  place
Player                 
BurgessD              1
Evaskis-GarrettC      2
SmithC14              3
BonneauJ              4
ZuffranieriJ          5
ChuHau                6
CraneN                7
ZimmermanP            8
GrubbsA               9
PetersonDE           10
EgelmanS             11
BerkeA               12
FiestaR              13
GregoryB             14
RogersCP             15
TemkinE              16
Gozlan Z             17
KligisJ              18
ReedN                19
CheungE              20
FrithP               21
IngramB              22
HovelandR            23
RobinsonDrusoe       24
SteeleM              25
JapingaM             26
EnsonR               27
MillerRA             28

                  place
Player                 
BurgessD              1
CraneN                2
RogersCP              3
Evaskis-GarrettC      4
PetersonDE            5
BonneauJ              6
ZuffranieriJ          7
EgelmanS              8
ChuHau                9
BerkeA               10
ReedN                11
GregoryB             12
FiestaR              13
GrubbsA              14
SmithC14             15
HovelandR            16
TemkinE              17
CheungE              18
Gozlan Z             19
RobinsonDrusoe       20
FrithP               21
EnsonR               22
JapingaM             23
SteeleM              24
ZimmermanP           25
IngramB              26
KligisJ              27
MillerRA             28

                  place
Player                 
ZuffranieriJ          1
ChuHau                2
BurgessD              3
PetersonDE            4
ReedN                 5
Evaskis-GarrettC      6
CraneN                7
GregoryB              8
EgelmanS              9
SmithC14             10
CheungE              11
ZimmermanP           12
TemkinE              13
RobinsonDrusoe       14
BonneauJ             15
BerkeA               16
RogersCP             17
SteeleM              18
GrubbsA              19
FiestaR              20
IngramB              21
EnsonR               22
Gozlan Z             23
KligisJ              24
FrithP               25
HovelandR            26
JapingaM             27
MillerRA             28

KeyboardInterrupt: 

When you halt the above cell, run the below cell **once** to log all simulations created to the all_sims object. (If you run this cell multiple times without re-running the above cell, it will add the same batch of simulations multiple times.)
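
One way to make the logging step safe to re-run is to empty the session buffer as soon as its rows are logged. This is not what the notebook does; `log_session` and both argument names are hypothetical.

```python
def log_session(all_rows, session_rows):
    """Move the session's rows into the running total, then empty the session buffer."""
    all_rows.extend(session_rows)
    session_rows.clear()  # a second call now has nothing left to add
    return len(all_rows)


all_rows, session_rows = [], ["sim 1", "sim 2", "sim 3"]
log_session(all_rows, session_rows)  # 3
log_session(all_rows, session_rows)  # still 3: the batch is not added twice
```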

In [174]:
session_sims = pd.DataFrame(columns=['player', 'win', 'lose', 'tie', 'points', 'TCA', 'score', 'differential', 'FL', 'place'])

session_sims['player'] = player_list
session_sims['win'] = win_list
session_sims['lose'] = lose_list
session_sims['tie'] = tie_list
session_sims['points'] = points_list
session_sims['TCA'] = TCA_list
session_sims['score'] = score_list
session_sims['differential'] = differential_list
session_sims['FL'] = FL_list
session_sims['place'] = place_list

session_sims = session_sims.set_index('player')

all_sims = pd.concat([all_sims, session_sims])
sim_count = len(all_sims) / players #number of iterations run
sim_count

54.0

In [175]:
all_sims['player'] = all_sims.index
wide = pd.pivot_table(all_sims, values = ['place'], index = ['player'], columns=all_sims['place'].values, aggfunc='count', fill_value=0)

In [176]:
wide['median_pos'] = get_medians(wide)

if league_type == 'public':
    #number of sims finishing in a promotion place (top `promotion`) or a relegation place (bottom `relegation`)
    wide['promoted'] = wide['place'][list(range(1, promotion + 1))].sum(axis=1)
    wide['relegated'] = wide['place'][list(range(players - relegation + 1, players + 1))].sum(axis=1)

In [177]:
wide = wide.sort_values('median_pos', ascending=True)

wide[['median_pos', 'place']]

Columns 1 through 28 count the simulations in which the player finished in that place.

| player | median_pos | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| BurgessD | 1.0 | 45 | 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ChuHau | 3.0 | 4 | 18 | 7 | 7 | 3 | 3 | 6 | 1 | 3 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CraneN | 5.0 | 2 | 8 | 11 | 4 | 6 | 5 | 6 | 5 | 1 | 1 | 0 | 0 | 1 | 0 | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Evaskis-GarrettC | 5.0 | 2 | 11 | 8 | 5 | 5 | 7 | 3 | 2 | 3 | 2 | 1 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| BonneauJ | 6.0 | 0 | 3 | 7 | 10 | 4 | 6 | 1 | 2 | 4 | 1 | 5 | 2 | 0 | 2 | 3 | 0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ZuffranieriJ | 7.0 | 1 | 2 | 5 | 9 | 7 | 2 | 6 | 4 | 3 | 5 | 4 | 2 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| RogersCP | 9.0 | 0 | 2 | 4 | 2 | 4 | 3 | 1 | 10 | 4 | 3 | 0 | 3 | 2 | 5 | 4 | 0 | 3 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| ReedN | 9.0 | 0 | 1 | 1 | 4 | 3 | 4 | 5 | 8 | 4 | 3 | 4 | 4 | 1 | 2 | 3 | 3 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| PetersonDE | 9.5 | 0 | 2 | 4 | 5 | 3 | 4 | 6 | 2 | 1 | 2 | 6 | 4 | 2 | 4 | 4 | 2 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| BerkeA | 12.0 | 0 | 0 | 0 | 2 | 3 | 1 | 2 | 4 | 4 | 4 | 3 | 5 | 2 | 4 | 3 | 5 | 3 | 3 | 0 | 4 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |


In [178]:
wide = wide.apply(lambda x: x/sim_count)

In [179]:
if league_type == 'public':
    wide[['promoted', 'relegated', 'place']].to_csv('probs_LL' + league + '_md' + str(completed) + '.csv')
elif league_type == 'private':
    wide[['place']].to_csv('probs_LL' + str(season) +'_' + league + '_md' + str(completed) + '.csv')