# Elo ratings based on regular-season games

This notebook implements Elo ratings for NCAA regular-season games using the same formula as FiveThirtyEight's NBA Elo ratings. My resources for this were:

- https://en.wikipedia.org/wiki/Elo_rating_system
- https://fivethirtyeight.com/features/how-we-calculate-nba-elo-ratings/
- https://github.com/fivethirtyeight/nfl-elo-game/blob/master/forecast.py

(The last link is for 538's NFL Elo ratings rather than the NBA ones, but it was a useful code example of the approach.)

The goal is to produce another feature to use (alongside seeds, etc.) when predicting tournament games.

In [1]:
import numpy as np
import pandas as pd
from sklearn.metrics import log_loss

The parameter `K` controls how quickly the Elo adjusts to new information. Here I'm just using the value that 538 found most appropriate for the NBA; I haven't analysed whether it is also the best value for college basketball.

I also use the same home-court advantage as 538: the host team gets an extra 100 points added to their Elo.
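
To get a feel for what a 100-point bonus means, here is a minimal sketch that converts a rating difference into a win probability with the standard Elo logistic curve (400-point scale). The helper name `win_prob` is made up for this illustration; it is not part of the notebook.

```python
def win_prob(elo_diff):
    """Standard Elo logistic curve on a 400-point scale."""
    return 1.0 / (10.0 ** (-elo_diff / 400.0) + 1.0)

HOME_ADVANTAGE = 100.0

# Two otherwise equal teams: the host is treated as 100 points stronger
p_home = win_prob(HOME_ADVANTAGE)
print(round(p_home, 2))
```

Under these assumptions the host of a game between two equally rated teams is given roughly a 64% chance of winning.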

In [2]:
K = 20.
HOME_ADVANTAGE = 100.

In [3]:
rs = pd.read_csv("test/RegularSeasonDetailedResults_only2018.csv")
rs.head(3)

Unnamed: 0,Season,DayNum,WTeamID,WScore,LTeamID,LScore,WLoc,NumOT,WFGM,WFGA,...,LFGA3,LFTM,LFTA,LOR,LDR,LAst,LTO,LStl,LBlk,LPF
0,2018,11,1104,82,1272,70,N,0,26,57,...,17,22,36,15,22,7,17,7,3,22
1,2018,11,1107,69,1233,67,H,0,24,62,...,22,11,14,5,16,14,7,6,3,21
2,2018,11,1112,101,1319,67,H,0,34,57,...,13,17,30,9,10,11,11,3,1,24


In [4]:
team_ids = set(rs.WTeamID).union(set(rs.LTeamID))
len(team_ids)

351

I'm going to initialise all teams with a rating of 1500. There are two differences here with the 538 approach:

- New entrants (when and where there are any) start at the average rating of 1500, rather than at a lower rating that would probably suit a new team better.
- There is no reversion to the mean between seasons: each team's Elo starts exactly where it left off the previous season. My justification is that we only care about the end-of-season rating when making predictions for the NCAA tournament, so even if ratings are a little off at first, they have the entire regular season to converge to something more appropriate.
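
For comparison, this is roughly what a between-season reversion step could look like. It is only a sketch and is not used anywhere in this notebook. The constants (carry over 75% of the rating, revert toward 1505) are what I recall from 538's NBA write-up, so treat them as assumptions; `revert_to_mean` is a hypothetical helper name.

```python
# Assumed constants, recalled from 538's NBA description; not used in this notebook
CARRY_OVER = 0.75
REVERT_TO = 1505.0

def revert_to_mean(elo, carry=CARRY_OVER, mean=REVERT_TO):
    """Pull an end-of-season rating part of the way back toward the mean."""
    return carry * elo + (1.0 - carry) * mean

# A team that finished the previous season at 1700
start_of_season = revert_to_mean(1700.0)
print(start_of_season)
```

A strong team keeps most of its rating but starts the new season a little closer to average, which is the step this notebook skips.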

In [5]:
# This dictionary is a lookup for each team's current rating
# while the algorithm iterates through the games
elo_dict = {team_id: 1500. for team_id in team_ids}

In [6]:
# New columns to help us iteratively update elos
rs['margin'] = rs.WScore - rs.LScore
rs['w_elo'] = None
rs['l_elo'] = None

The three functions below contain the meat of the Elo calculation:

In [7]:
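def elo_pred(elo1, elo2):
    # Probability that the team rated elo1 beats the team rated elo2
    return 1. / (10. ** (-(elo1 - elo2) / 400.) + 1.)

def expected_margin(elo_diff):
    # Denominator of the margin-of-victory multiplier; it grows with the
    # winner's rating edge, which damps updates when favourites win
    return 7.5 + 0.006 * elo_diff

def elo_update(w_elo, l_elo, margin):
    # Returns the pre-game win probability of the eventual winner and
    # the number of points to move from the loser to the winner
    elo_diff = w_elo - l_elo
    pred = elo_pred(w_elo, l_elo)
    mult = ((margin + 3.) ** 0.8) / expected_margin(elo_diff)
    update = K * mult * (1 - pred)
    return pred, update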
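
As a quick sanity check of the update rule above, here is a self-contained worked example for a single game. The functions are condensed copies of the ones in this cell so that the snippet runs on its own; the two 1500-rated teams and the 10-point margin are made up.

```python
K = 20.0

def elo_pred(elo1, elo2):
    return 1.0 / (10.0 ** (-(elo1 - elo2) / 400.0) + 1.0)

def elo_update(w_elo, l_elo, margin):
    pred = elo_pred(w_elo, l_elo)
    # Margin-of-victory multiplier, same form as in the notebook
    mult = ((margin + 3.0) ** 0.8) / (7.5 + 0.006 * (w_elo - l_elo))
    return pred, K * mult * (1 - pred)

# Two equally rated teams on a neutral court, winner by 10 points
pred, update = elo_update(1500.0, 1500.0, 10)
print(pred, round(update, 1))
```

With equal ratings the pre-game probability is 0.5, and a 10-point win moves a little over 10 points from the loser to the winner. A larger margin increases the transfer, while a win by a heavy favourite shrinks it.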

In [8]:
# I'm going to iterate over the games dataframe using
# index numbers, so I want to check that nothing is out
# of order before I do that.
assert np.array_equal(rs.index.values, np.arange(rs.shape[0])), "Index is out of order."

In [9]:
preds = []

# Loop over all rows of the games dataframe
for i in range(rs.shape[0]):
    
    # Get key data from current row
    w = rs.at[i, 'WTeamID']
    l = rs.at[i, 'LTeamID']
    margin = rs.at[i, 'margin']
    wloc = rs.at[i, 'WLoc']
    
    # Does either team get a home-court advantage?
    w_ad, l_ad = 0., 0.
    if wloc == "H":
        w_ad += HOME_ADVANTAGE
    elif wloc == "A":
        l_ad += HOME_ADVANTAGE
    
    # Get elo updates as a result of the game
    pred, update = elo_update(elo_dict[w] + w_ad,
                              elo_dict[l] + l_ad, 
                              margin)
    elo_dict[w] += update
    elo_dict[l] -= update
    preds.append(pred)

    # Stores new elos in the games dataframe
    rs.loc[i, 'w_elo'] = elo_dict[w]
    rs.loc[i, 'l_elo'] = elo_dict[l]
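
One property of this loop that is easy to verify: every update adds to the winner exactly what it takes from the loser, so the average rating across all teams should stay at 1500. The sketch below demonstrates this on three hypothetical teams and made-up results, again with condensed copies of the update functions so that it runs on its own.

```python
K = 20.0

def elo_pred(elo1, elo2):
    return 1.0 / (10.0 ** (-(elo1 - elo2) / 400.0) + 1.0)

def elo_update(w_elo, l_elo, margin):
    pred = elo_pred(w_elo, l_elo)
    mult = ((margin + 3.0) ** 0.8) / (7.5 + 0.006 * (w_elo - l_elo))
    return pred, K * mult * (1 - pred)

# Hypothetical teams and results: (winner, loser, margin), all neutral court
ratings = {"A": 1500.0, "B": 1500.0, "C": 1500.0}
games = [("A", "B", 5), ("B", "C", 12), ("C", "A", 1)]

for w, l, margin in games:
    _, update = elo_update(ratings[w], ratings[l], margin)
    ratings[w] += update
    ratings[l] -= update

# Updates are zero-sum, so the mean is unchanged (up to float rounding)
mean_rating = sum(ratings.values()) / len(ratings)
print(mean_rating)
```

The same check on the real data would be the mean of `elo_dict.values()` after the loop, assuming every team started at 1500.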

Let's take a look at the last few games in the games dataframe to check that the Elo ratings look reasonable.

In [10]:
rs.tail(10)

Unnamed: 0,Season,DayNum,WTeamID,WScore,LTeamID,LScore,WLoc,NumOT,WFGM,WFGA,...,LOR,LDR,LAst,LTO,LStl,LBlk,LPF,margin,w_elo,l_elo
4713,2018,115,1430,70,1167,47,A,0,24,46,...,9,15,10,13,9,1,17,23,1584.95,1435.04
4714,2018,115,1431,74,1256,72,H,0,29,55,...,6,16,16,11,6,2,15,2,1400.16,1505.37
4715,2018,115,1442,82,1295,74,H,0,24,52,...,4,26,10,13,2,3,29,8,1416.49,1466.7
4716,2018,115,1443,93,1150,55,H,0,32,57,...,13,16,10,19,8,2,21,38,1622.34,1356.9
4717,2018,115,1447,64,1148,62,A,0,26,56,...,7,15,11,13,6,4,12,2,1603.59,1433.32
4718,2018,115,1450,78,1143,76,A,0,28,56,...,8,24,17,12,2,3,16,2,1428.62,1372.44
4719,2018,115,1453,96,1324,90,A,0,32,56,...,14,23,15,9,6,2,17,6,1415.43,1525.43
4720,2018,115,1454,72,1178,49,A,0,27,58,...,10,22,6,8,3,4,12,23,1479.78,1357.46
4721,2018,115,1456,96,1423,83,H,0,30,56,...,15,21,19,13,5,3,19,13,1508.5,1418.16
4722,2018,115,1458,70,1321,64,A,0,24,45,...,8,17,10,12,3,0,20,6,1499.73,1502.13


Looks OK. How well do the ratings predict games in general? Every prediction stored above is the probability assigned to the team that actually won, so the true outcome is always 1 and the log loss reduces to the mean of `-log(pred)` over all games in the dataframe:

In [11]:
np.mean(-np.log(preds))

0.57868575172327097

(This is a pretty rough measure, because this is looking only at regular-season games, which is not really what we're ultimately interested in predicting.)
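
For context on the number above, a model that always predicts 0.5 has a log loss of ln 2, about 0.693, so anything below that is doing better than a coin flip. A minimal sketch, with a hypothetical helper `mean_log_loss` and made-up probabilities:

```python
import math

def mean_log_loss(preds):
    """Mean of -log(p), where p is the probability given to the actual winner."""
    return sum(-math.log(p) for p in preds) / len(preds)

# Always predicting 50/50 gives ln 2 regardless of the results
coin_flip = mean_log_loss([0.5, 0.5, 0.5, 0.5])

# Made-up predictions that lean toward the eventual winners
leaning = mean_log_loss([0.8, 0.7, 0.9, 0.6])

print(round(coin_flip, 3), leaning < coin_flip)
```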

Final step: for each team, pull out the final Elo rating at the end of each regular season. This is a bit annoying because the team ID could be in either the winner or the loser column for the team's last game of the season.

In [12]:
def final_elo_per_season(df, team_id):
    # Keep only the games this team played, in chronological order
    d = df.loc[(df.WTeamID == team_id) | (df.LTeamID == team_id), :].copy()
    d = d.sort_values(['Season', 'DayNum'])
    # The last row per season is the team's final regular-season game
    d = d.drop_duplicates(['Season'], keep='last')
    # Take the rating from whichever side of that game the team was on
    w_mask = d.WTeamID == team_id
    l_mask = d.LTeamID == team_id
    d['season_elo'] = None
    d.loc[w_mask, 'season_elo'] = d.loc[w_mask, 'w_elo']
    d.loc[l_mask, 'season_elo'] = d.loc[l_mask, 'l_elo']
    out = pd.DataFrame({
        'team_id': team_id,
        'season': d.Season,
        'season_elo': d.season_elo
    })
    return out
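
An alternative to handling the winner and loser columns per team would be to stack both sides of every game into one long table and take the last row per season and team. The sketch below shows the idea on a tiny made-up games table that reuses this notebook's column names. It is not the method used here, and it assumes a team never plays twice on the same `DayNum`.

```python
import pandas as pd

# Toy games table with made-up values
games = pd.DataFrame({
    "Season":  [2018, 2018, 2018],
    "DayNum":  [10, 20, 30],
    "WTeamID": [1, 2, 1],
    "LTeamID": [2, 3, 3],
    "w_elo":   [1510.0, 1502.0, 1518.0],
    "l_elo":   [1490.0, 1488.0, 1480.0],
})

# One row per (game, team): the winner's side and the loser's side
winners = games[["Season", "DayNum", "WTeamID", "w_elo"]].rename(
    columns={"WTeamID": "team_id", "w_elo": "season_elo"})
losers = games[["Season", "DayNum", "LTeamID", "l_elo"]].rename(
    columns={"LTeamID": "team_id", "l_elo": "season_elo"})
long = pd.concat([winners, losers], ignore_index=True)

# Last game per season and team gives the end-of-season rating
final = (long.sort_values(["Season", "team_id", "DayNum"])
             .groupby(["Season", "team_id"], as_index=False)
             .last()
             .drop(columns="DayNum"))
print(final)
```

This avoids looping over teams, at the cost of building a table with twice as many rows as there are games.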

In [13]:
df_list = [final_elo_per_season(rs, i) for i in team_ids]
season_elos = pd.concat(df_list)

In [14]:
season_elos.sample(10)

Unnamed: 0,season,season_elo,team_id
4631,2018,1568.61,1393
4704,2018,1521.06,1449
4700,2018,1436.24,1357
4634,2018,1478.56,1309
4616,2018,1444.5,1206
4621,2018,1584.85,1243
4543,2018,1481.62,1219
4681,2018,1421.67,1365
4579,2018,1570.88,1460
4544,2018,1530.16,1396


In [15]:
season_elos.to_csv("2018_elos.csv", index=False)