# Measure performance with Elo & Brier Scores

The motivation behind this post is to discuss the concept of Elo, apply it in an NBA simulation, and then measure it with Brier scores, in an end-to-end workflow! Thank you to my teacher @MaxHumber for inspiring this post.

Resources: 
* https://en.wikipedia.org/wiki/Elo_rating_system
* https://projects.fivethirtyeight.com/complete-history-of-the-nba/#raptors
* https://fivethirtyeight.com/features/how-we-calculate-nba-elo-ratings/
* https://en.wikipedia.org/wiki/Brier_score

## Import Libraries

In [65]:
# data wrangling
import pandas as pd
import numpy as np

# plotting
#import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('fivethirtyeight')

# preprocessing & feature engineering
# from sklearn.preprocessing import StandardScaler, LabelBinarizer, PolynomialFeatures
# from sklearn_pandas import DataFrameMapper, CategoricalImputer, FunctionTransformer
# from sklearn.impute import SimpleImputer
# from sklearn.pipeline import make_pipeline
# from imblearn.over_sampling import SMOTE

# modelling & evaluation
# from sklearn.model_selection import train_test_split
# from sklearn.linear_model import LogisticRegression
# from sklearn.neighbors import KNeighborsClassifier
# from sklearn.model_selection import GridSearchCV, cross_val_score
# from sklearn.metrics import roc_auc_score, confusion_matrix, accuracy_score
from sklearn.metrics import accuracy_score

# scientific notation off
np.set_printoptions(suppress=True)
pd.options.display.float_format = '{:.2f}'.format

# suppress warnings
#from sklearn.exceptions import DataConversionWarning
#import warnings
#warnings.filterwarnings(action='ignore', category=DataConversionWarning)
# warnings.filterwarnings(action='ignore', category=FutureWarning)

# Elo

## Introduction

While I was familiar with the concept of Elo from FiveThirtyEight's use of it to measure a basketball team's strength relative to its peers, I was surprised to find out it was actually derived from chess!

> The Elo rating system is a method for calculating the relative skill levels of players in zero-sum games such as chess. It is named after its creator Arpad Elo, a Hungarian-American physics professor (Source: Wikipedia).

Over the years, you can see its extension and adoption to a wide range of use cases, including:
* Video games (CounterStrike, League of Legends)
* Sports (NBA, football, baseball)

Needless to say, it's a very popular framework! Essentially, the idea is that if you play a stronger opponent and win, you should be rewarded more than if you play a weaker opponent and win, and vice versa, in a zero-sum-game paradigm. This argument intuitively makes sense.

## Define Elo Model

(1) The expected probability of winning of team A follows the logistic curve.

![image.png](images/logistic_curve.png)

* Note the curve's maximum value is 1
* This is a common sigmoid curve (prob winning approaches 1 and 0, asymptotically)
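As a quick illustration of the curve, here is a minimal sketch of the expected win probability. The function name `expected_win_probability` is made up for this example, and `g` is the steepness parameter (400 in this post).

```python
def expected_win_probability(rating_a, rating_b, g=400):
    """Logistic expectation that team A beats team B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / g))

p_equal = expected_win_probability(1500, 1500)      # evenly matched teams
p_favourite = expected_win_probability(1600, 1500)  # A is 100 points ahead
p_underdog = expected_win_probability(1500, 1600)   # mirror image of the above

print(p_equal, round(p_favourite, 3), round(p_underdog, 3))
```

The two probabilities for a given matchup always sum to 1, which is what makes the system zero-sum.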

(2) The parameters in this model:
* Ra - rating of team A
* Rb - rating of team B
* g - the logistic growth rate / steepness of curve

Interpretation of g: 
Given g = 400, every 400 rating points of advantage over the opponent makes a team's expected score 10x the opponent's expected score (the formula uses base 10). Said another way, a higher g requires a larger Elo gap between the two teams to move the probability by the same amount. For the NBA, we will leave this at 400, the same as chess.
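The 10x claim can be checked with a short derivation. This is a sketch that writes Ea and Eb for the expected scores of teams A and B and assumes the standard logistic form of the Elo expectation:

```latex
E_a = \frac{1}{1 + 10^{(R_b - R_a)/g}}, \qquad
E_b = \frac{1}{1 + 10^{(R_a - R_b)/g}}

\frac{E_a}{E_b}
  = \frac{1 + 10^{(R_a - R_b)/g}}{1 + 10^{-(R_a - R_b)/g}}
  = 10^{(R_a - R_b)/g}

% With R_a - R_b = g = 400:
\frac{E_a}{E_b} = 10^{1} = 10, \qquad E_a = \tfrac{10}{11} \approx 0.909
```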

* k-factor - the maximum adjustment per game, used to penalize under-performance and reward over-performance. There needs to be a level of stickiness: a team shouldn't lose all of its Elo points in one game, so the score retains some memory. 

    * According to FiveThirtyEight's empirical testing, the optimal K for the NBA is 20. It's in the same range as the K used for NFL and international soccer Elo ratings, even though the NBA plays far more games than those sports, and it's much higher than the optimal K for baseball. This suggests that you should give relatively high weight to an NBA team's recent performance. For reference, K is typically 16 for master chess players.

* Baseline Elo score - we will use 1500 for the NBA. This is the long-term average Elo score.

(3) Equation to update a team's elo score after each game: 

    * ![image.png](images/update_score.png)
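    * New score = previous score + k-factor * (Sa - Ea), where Sa is the actual result (1 for a win, 0 for a loss) and Ea is the expected probability of winning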
    
* Create a table to summarize our parameters
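To make the update rule concrete, here is a small worked example using the parameters above. The names `PARAMS`, `expected` and `updated_rating` are hypothetical helpers for this sketch only, and no rounding is applied.

```python
# Parameters used in this post
PARAMS = {"baseline": 1500, "g": 400, "k": 20}

def expected(rating_a, rating_b, g=PARAMS["g"]):
    """Expected probability that A beats B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / g))

def updated_rating(rating, opponent, won, k=PARAMS["k"]):
    """Previous rating + k * (actual result - expected result)."""
    actual = 1 if won else 0
    return rating + k * (actual - expected(rating, opponent))

# A 1600-rated favourite plays a 1500-rated team
fav_win = updated_rating(1600, 1500, won=True)    # small reward for an expected win
fav_loss = updated_rating(1600, 1500, won=False)  # larger penalty for an upset loss

print(round(fav_win, 1), round(fav_loss, 1))
```

The gap between the two outcomes is always exactly k points, so k caps how far a single game can move a rating.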

In [53]:
class EloTracker:
    # start baseline elo at 1500
    # EloTracker takes data in the form of a dictionary or a list and shapes it into a dictionary 
    # This is how we're tracking our elo scores
    def __init__(self, data, start=1500):
        if isinstance(data, dict):
            self.data = data
        elif isinstance(data, list):
            self.data = {i: start for i in data}
        else:
            raise TypeError('data must be a dict or a list of team names')
    
    # Expected probability of winning
    # Pr(A) = 1 / (10^(-ELODIFF/400) + 1)
    def prob(self, winner, loser):
        winner_elo, loser_elo = self.data[winner], self.data[loser]
        # home court advantage is not modelled here yet
        expected_winner = 1 / ( 1 + 10**((loser_elo - winner_elo)/400) )
        expected_loser = 1 - expected_winner
        return expected_winner, expected_loser

    # set k=20 for NBA
    # remember, winner has probability of 1 and loser has probability of 0
    def update(self, winner, loser, k=20):
        expected_winner, expected_loser = self.prob(winner, loser)
        self.data[winner] = round(self.data[winner] + k*(1 - expected_winner))
        self.data[loser] = round(self.data[loser] + k*(0 - expected_loser))
        return self.data

    def __repr__(self):
        return f'EloTracker({self.data})'

In [54]:
# instantiate all teams at 1500 elo
teams = ['Toronto', 'Boston', 'Cleveland']
elo = EloTracker(teams)
elo.data

{'Toronto': 1500, 'Boston': 1500, 'Cleveland': 1500}

In [55]:
# By definition, teams with the same elo are equally likely to win
# (home court advantage is not modelled yet, so neither side is favoured)
elo.prob('Toronto', 'Boston')

(0.5, 0.5)

In [56]:
# Toronto beats Boston
# Update elo; with k=20 and a 0.5 expectation, Toronto gained 10 elo and Boston lost 10
elo.update('Toronto', 'Boston')

{'Toronto': 1510, 'Boston': 1490, 'Cleveland': 1500}

In [57]:
# end of season, let's assume these elo scores
teams = {'Toronto': 1750, 'Boston': 1500, 'Cleveland': 1250}
elo = EloTracker(teams)
elo.data
toronto_elo_before = elo.data["Toronto"]
elo.update("Toronto", "Cleveland")
toronto_elo_after = elo.data["Toronto"]

print(f' Toronto gained {toronto_elo_after - toronto_elo_before} elo points')
# Because Toronto won against a worse team, they only gained 1 elo point (k=20).

 Toronto gained 1 elo points


In [58]:
# what if Toronto lost?
teams = {'Toronto': 1750, 'Boston': 1500, 'Cleveland': 1250}
elo = EloTracker(teams)
elo.data
toronto_elo_before = elo.data["Toronto"]
elo.update("Cleveland","Toronto")
toronto_elo_after = elo.data["Toronto"]

print(f' Toronto gained {toronto_elo_after - toronto_elo_before} elo points')
# Because Toronto lost to a worse team, they lost 19 elo points (k=20)! They were expected to win!

 Toronto gained -19 elo points


# Wrangle Game Data to Season Data

In [86]:
# for each game, loop through and accumulate the points for each team
# full=False, means only return points, without dates
def reproduce_points(df, full=False):
    teams = sorted(list(df.home.unique()))
    points = {team: 0 for team in teams}
    points_and_dates = {}
    for i, row in df.iterrows():
        if row.extra_time and row.goals_home > row.goals_visitor:
            points[row.home] += 2
            points[row.visitor] += 1
        elif row.extra_time and row.goals_home < row.goals_visitor:
            points[row.home] += 1
            points[row.visitor] += 2
        elif row.goals_home > row.goals_visitor:
            points[row.home] += 2
        else:
            points[row.visitor] += 2
        # keep a running total of the points at each given row.date
        points_and_dates[row.date] = points.copy()
    if full:
        return points_and_dates
    else:
        return points

def points_to_dataframe(points):
    # points is a {date: {team: points}} dictionary, so pd.DataFrame gives
    # one row per team and one column per date; reset_index then moves the
    # team names out of the index into a 'team' column
    # every date gets a value, so points repeat on days a team does not play
    df = (
        pd.DataFrame(points)
        .reset_index()
        .rename(columns={'index':'team'}))
    
    start = pd.Timestamp(df.columns[1]) - pd.Timedelta(days=1)
    df[start.strftime('%Y-%m-%d')] = 0
    
    df = pd.melt(
        pd.DataFrame(df),
        id_vars=['team'],
        var_name='date',
        value_name='points')
    
    df['date'] = df['date'].apply(pd.to_datetime)
    
    df = df.sort_values(['date', 'team'])
    
    df['points_before'] = df.groupby('team')['points'].shift(1)
    df['points_before'] = df['points_before'].fillna(0).astype(int)
    df['date'] = df['date'].dt.strftime('%Y-%m-%d')
    return df


def games_to_points(df):
    # use the dataframe passed in, not the global `games`
    points = reproduce_points(df, full=True)
    points = points_to_dataframe(points)
    return points

# retrieve points at a given point in time
def retrieve_points(team, date):
    return points.query(
        f'team == "{team}" & date == "{date}"'
    )['points_before'].values[0]

def predict_based_on_points(home, away, date):
    points_home = retrieve_points(home, date)
    points_away = retrieve_points(away, date)
    # ties go to the home team
    if points_home >= points_away:
        return home
    else:
        return away

In [73]:
# games = download_data('2017')
games = pd.read_csv('./data/nhl_2018-2019.csv')
games.head()

Unnamed: 0,date,home,visitor,goals_home,goals_visitor,extra_time
0,2018-10-03,San Jose Sharks,Anaheim Ducks,2.0,5.0,0
1,2018-10-03,Toronto Maple Leafs,Montreal Canadiens,3.0,2.0,1
2,2018-10-03,Vancouver Canucks,Calgary Flames,5.0,2.0,0
3,2018-10-03,Washington Capitals,Boston Bruins,7.0,0.0,0
4,2018-10-04,Buffalo Sabres,Boston Bruins,0.0,4.0,0


In [126]:
points = games_to_points(games)
points.tail()

Unnamed: 0,team,date,points,points_before
5358,Toronto Maple Leafs,2019-04-02,99,99
5359,Vancouver Canucks,2019-04-02,80,78
5360,Vegas Golden Knights,2019-04-02,93,93
5361,Washington Capitals,2019-04-02,102,102
5362,Winnipeg Jets,2019-04-02,96,96


In [91]:
# establish baselines
df = games.copy()
df['winner'] = np.where(df.goals_home > df.goals_visitor, df.home, df.visitor)

In [96]:
# baseline model 1
y = df['winner'] # true
y_hat = df['home'] # predict winner based on home team winning
accuracy_score(y, y_hat)

0.537156704361874

In [100]:
# baseline model 2
df['points'] = df.apply(lambda row:
    predict_based_on_points(row.home, row.visitor, row.date), axis=1)
y = df['winner'] # true 
y_hat = df['points'] # predict winner based on higher points
accuracy_score(y, y_hat)

0.5379644588045234

In [127]:
# Are we able to create a model with elo that is more predictive?
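One way to answer this fairly is to predict each game from the ratings as they stood before the game, and only then update them. Below is a minimal, self-contained sketch of that walk-forward loop. The function `walk_forward` and the toy games are hypothetical and stand in for the `EloTracker` and the NHL data used in this post; ratings are left unrounded.

```python
def expected(rating_a, rating_b, g=400):
    """Expected probability that A beats B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / g))

def walk_forward(games, k=20, start=1500):
    """games: (home, visitor, winner) tuples in chronological order.
    Returns the pre-game predictions and the final ratings."""
    ratings = {}
    predictions = []
    for home, visitor, winner in games:
        r_home = ratings.setdefault(home, start)
        r_vis = ratings.setdefault(visitor, start)
        p_home = expected(r_home, r_vis)
        # predict before the result is known; ties go to the home team
        predictions.append(home if p_home >= 0.5 else visitor)
        home_won = 1 if winner == home else 0
        ratings[home] = r_home + k * (home_won - p_home)
        ratings[visitor] = r_vis + k * ((1 - home_won) - (1 - p_home))
    return predictions, ratings

# toy schedule (made-up teams and results)
toy_games = [
    ("A", "B", "A"),
    ("B", "C", "C"),
    ("C", "A", "A"),
    ("A", "B", "B"),
]
preds, final = walk_forward(toy_games)
accuracy = sum(p == game[2] for p, game in zip(preds, toy_games)) / len(toy_games)
print(preds, accuracy)
```

The resulting accuracy can be compared directly with the two baselines above, since all three are scored against the same `winner` column.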

# Elo Model

In [152]:
games.head()

Unnamed: 0,date,home,visitor,goals_home,goals_visitor,extra_time
0,2018-10-03,San Jose Sharks,Anaheim Ducks,2.0,5.0,0
1,2018-10-03,Toronto Maple Leafs,Montreal Canadiens,3.0,2.0,1
2,2018-10-03,Vancouver Canucks,Calgary Flames,5.0,2.0,0
3,2018-10-03,Washington Capitals,Boston Bruins,7.0,0.0,0
4,2018-10-04,Buffalo Sabres,Boston Bruins,0.0,4.0,0


In [157]:
games['winner'] = np.where(games.goals_home > games.goals_visitor, games.home, games.visitor)
df = games[['date', 'home', 'visitor', 'winner']]
df.head()

Unnamed: 0,date,home,visitor,winner
0,2018-10-03,San Jose Sharks,Anaheim Ducks,Anaheim Ducks
1,2018-10-03,Toronto Maple Leafs,Montreal Canadiens,Toronto Maple Leafs
2,2018-10-03,Vancouver Canucks,Calgary Flames,Vancouver Canucks
3,2018-10-03,Washington Capitals,Boston Bruins,Washington Capitals
4,2018-10-04,Buffalo Sabres,Boston Bruins,Boston Bruins


In [162]:
teams = df['winner'].unique().tolist()
elo = EloTracker(sorted(teams))

In [161]:
elo.data

{'Anaheim Ducks': 1500,
 'Arizona Coyotes': 1500,
 'Boston Bruins': 1500,
 'Buffalo Sabres': 1500,
 'Calgary Flames': 1500,
 'Carolina Hurricanes': 1500,
 'Chicago Blackhawks': 1500,
 'Colorado Avalanche': 1500,
 'Columbus Blue Jackets': 1500,
 'Dallas Stars': 1500,
 'Detroit Red Wings': 1500,
 'Edmonton Oilers': 1500,
 'Florida Panthers': 1500,
 'Los Angeles Kings': 1500,
 'Minnesota Wild': 1500,
 'Montreal Canadiens': 1500,
 'Nashville Predators': 1500,
 'New Jersey Devils': 1500,
 'New York Islanders': 1500,
 'New York Rangers': 1500,
 'Ottawa Senators': 1500,
 'Philadelphia Flyers': 1500,
 'Pittsburgh Penguins': 1500,
 'San Jose Sharks': 1500,
 'St. Louis Blues': 1500,
 'Tampa Bay Lightning': 1500,
 'Toronto Maple Leafs': 1500,
 'Vancouver Canucks': 1500,
 'Vegas Golden Knights': 1500,
 'Washington Capitals': 1500,
 'Winnipeg Jets': 1500}

In [163]:
elo_by_date = {}
for i, row in df.iterrows():
    if row.winner == row.home:
        elo.update(row.home, row.visitor, k=20)
    else:
        elo.update(row.visitor, row.home, k=20)
    elo_by_date[row.date] = elo.data.copy()

elo_by_date

{'2018-10-03': {'Anaheim Ducks': 1510,
  'Arizona Coyotes': 1500,
  'Boston Bruins': 1490,
  'Buffalo Sabres': 1500,
  'Calgary Flames': 1490,
  'Carolina Hurricanes': 1500,
  'Chicago Blackhawks': 1500,
  'Colorado Avalanche': 1500,
  'Columbus Blue Jackets': 1500,
  'Dallas Stars': 1500,
  'Detroit Red Wings': 1500,
  'Edmonton Oilers': 1500,
  'Florida Panthers': 1500,
  'Los Angeles Kings': 1500,
  'Minnesota Wild': 1500,
  'Montreal Canadiens': 1490,
  'Nashville Predators': 1500,
  'New Jersey Devils': 1500,
  'New York Islanders': 1500,
  'New York Rangers': 1500,
  'Ottawa Senators': 1500,
  'Philadelphia Flyers': 1500,
  'Pittsburgh Penguins': 1500,
  'San Jose Sharks': 1490,
  'St. Louis Blues': 1500,
  'Tampa Bay Lightning': 1500,
  'Toronto Maple Leafs': 1510,
  'Vancouver Canucks': 1510,
  'Vegas Golden Knights': 1500,
  'Washington Capitals': 1510,
  'Winnipeg Jets': 1500},
 '2018-10-04': {'Anaheim Ducks': 1510,
  'Arizona Coyotes': 1490,
  'Boston Bruins': 1500,
  ...

In [179]:
# put into a data frame

season = (
    pd.DataFrame(elo_by_date)
    .reset_index()
    .rename(columns={'index':'team'})
)

season.head()

Unnamed: 0,team,2018-10-03,2018-10-04,2018-10-05,2018-10-06,2018-10-07,2018-10-08,2018-10-09,2018-10-10,2018-10-11,...,2019-03-24,2019-03-25,2019-03-26,2019-03-27,2019-03-28,2019-03-29,2019-03-30,2019-03-31,2019-04-01,2019-04-02
0,Anaheim Ducks,1510,1510,1510,1519,1519,1528,1528,1517,1517,...,1433,1433,1443,1443,1443,1436,1446,1446,1446,1446
1,Arizona Coyotes,1500,1490,1490,1481,1481,1481,1481,1492,1492,...,1487,1487,1497,1497,1497,1486,1486,1495,1495,1483
2,Boston Bruins,1490,1500,1500,1500,1500,1510,1510,1510,1519,...,1608,1600,1600,1605,1605,1605,1591,1577,1577,1586
3,Buffalo Sabres,1500,1490,1490,1500,1500,1510,1510,1510,1500,...,1404,1394,1384,1384,1375,1375,1370,1365,1365,1359
4,Calgary Flames,1490,1490,1490,1501,1501,1501,1512,1512,1501,...,1581,1567,1567,1555,1555,1562,1562,1571,1577,1577


In [180]:
season = pd.melt(
     pd.DataFrame(season),
     id_vars=['team'],
     var_name='date',
     value_name='elo'
)

season.head()
# Great, now we have the elo for each team on each day

Unnamed: 0,team,date,elo
0,Anaheim Ducks,2018-10-03,1510
1,Arizona Coyotes,2018-10-03,1500
2,Boston Bruins,2018-10-03,1490
3,Buffalo Sabres,2018-10-03,1500
4,Calgary Flames,2018-10-03,1490


In [176]:
# season['date'] = season['date'].apply(pd.to_datetime)
# before the first game every team sits at the 1500 baseline
season['elo_before'] = season.groupby('team')['elo'].shift(1).fillna(1500)
season

Unnamed: 0,team,date,elo,elo_before
0,Anaheim Ducks,2018-10-03,1510,1500.0
1,Arizona Coyotes,2018-10-03,1500,1500.0
2,Boston Bruins,2018-10-03,1490,1500.0
3,Buffalo Sabres,2018-10-03,1500,1500.0
4,Calgary Flames,2018-10-03,1490,1500.0
5,Carolina Hurricanes,2018-10-03,1500,1500.0
6,Chicago Blackhawks,2018-10-03,1500,1500.0
7,Colorado Avalanche,2018-10-03,1500,1500.0
8,Columbus Blue Jackets,2018-10-03,1500,1500.0
9,Dallas Stars,2018-10-03,1500,1500.0


In [186]:
df.head(1)

Unnamed: 0,team,date,elo
0,Anaheim Ducks,2018-10-03,1510


In [187]:
season.head(1)

Unnamed: 0,team,date,elo
0,Anaheim Ducks,2018-10-03,1510


# Add Later

In [None]:
# home court advantage
# on average an NBA team is favoured by about 3.5 points at home, if teams were evenly matched
# FiveThirtyEight: this is approximately 100 elo (roughly 28 elo per point of spread)
# when the winner is the home team:
# expected_winner = 1 / ( 1 + 10**((loser_elo - (winner_elo+100))/400))
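A minimal sketch of how the home court bonus could enter the probability is shown here. The function name `home_win_probability` and the `hca` parameter are made up for this example. The 100-point bonus is the NBA figure quoted above, so a different value would likely be needed for the NHL data.

```python
def home_win_probability(home_elo, visitor_elo, hca=100, g=400):
    """Expected probability that the home team wins, with a flat Elo bonus for playing at home."""
    return 1 / (1 + 10 ** ((visitor_elo - (home_elo + hca)) / g))

p_even = home_win_probability(1500, 1500)            # evenly matched, home bonus applied
p_neutral = home_win_probability(1500, 1500, hca=0)  # same matchup on a neutral court

print(round(p_even, 3), p_neutral)
```

With the bonus applied, two evenly rated teams are no longer a coin flip: the home side is favoured at roughly 64%.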

In [149]:
# graph points over time for both teams; TODO: do the same for elo
import altair as alt
alt.renderers.enable('default')

points.query('team == "Toronto Maple Leafs"')
points['date'] = pd.to_datetime(points['date'])
# get the observations for Toronto and Boston
leafs = points[points.team.isin(['Toronto Maple Leafs', 'Boston Bruins'])]

(alt.Chart(leafs)
    .encode(x='date', y='points', color='team')
    .mark_line()
    .properties(background='white', width=700, height=400))

[Chart: points by date for the Toronto Maple Leafs and Boston Bruins]


Elo strikes a nice balance between ratings systems that account for margin of victory and those that don’t. While teams always gain Elo points after wins and lose Elo points after losses, they gain or lose more with larger margins of victory.

This works by assigning a multiplier to each game based on the final score and dividing it by a team’s projected margin of victory conditional upon having won the game. For instance, the Warriors’ 4-point margin over the Rockets in Game 1 of this year’s Western Conference finals was lower than Elo would expect for a Warriors win. So the Warriors gain Elo points, but not as many as if they’d won by a larger margin. The formula accounts for diminishing returns; going from a 5-point win to a 10-point win matters more than going from a 25-point win to a 30-point win. For the exact formula, see the footnotes.

The multiplier also accounts for the fact that favourites tend to win by larger margins than they lose by. This helps keep Elo ratings stable around 1500 and prevents autocorrelation from creeping into the rating system (FiveThirtyEight).
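The diminishing returns and the favourite correction are easy to see numerically. The sketch below uses the same constants as the multiplier cell in this notebook; the function name `mov_multiplier` is hypothetical, and `elo_diff` is assumed to be the winner's rating minus the loser's, with any home bonus already included.

```python
def mov_multiplier(mov, elo_diff):
    """Margin-of-victory multiplier; elo_diff is winner rating minus loser rating."""
    return (mov + 3) ** 0.8 / (7.5 + 0.006 * elo_diff)

# diminishing returns between evenly rated teams
gain_small = mov_multiplier(10, 0) - mov_multiplier(5, 0)    # 5-point win -> 10-point win
gain_large = mov_multiplier(30, 0) - mov_multiplier(25, 0)   # 25-point win -> 30-point win

# the same 10-point win is worth less to a favourite than to an underdog
fav = mov_multiplier(10, 200)
dog = mov_multiplier(10, -200)

print(round(gain_small, 3), round(gain_large, 3), round(fav, 3), round(dog, 3))
```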

In [65]:
# margin of victory multiplier
def mult(winner_elo, loser_elo, winner_score, loser_score, winner_is_home, hca=100):
    # margin of victory: winning score minus losing score
    mov = winner_score - loser_score
    # add 100 to the home team's elo before taking the difference
    elo_diff = winner_elo - loser_elo + (hca if winner_is_home else -hca)
    return (mov + 3)**0.8/(7.5+0.006*elo_diff)

# add 100 to home elo when calculating prob, and margin of victory
# new_winner_elo = winner_elo + k*(1 - expected_winner)*mult(...)

In [None]:
# year to year carry over
# Add elo graphs
# How to add data science? 
* https://en.wikipedia.org/wiki/Brier_score
* Simulate final four with elo
* Calculate the brier score 
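As a starting point for the Brier score item, here is a minimal sketch of the calculation. The forecast probabilities and results are made up for illustration and are not output from the Elo model in this post.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

# hypothetical home-win probabilities and results (1 = home team won)
forecast = [0.64, 0.55, 0.70, 0.40]
results = [1, 0, 1, 0]

model_score = brier_score(forecast, results)
coin_flip = brier_score([0.5] * len(results), results)  # always predicting 50/50

print(round(model_score, 4), coin_flip)
```

Lower is better: 0 is a perfect forecast, and always predicting 0.5 scores 0.25, which makes a natural baseline for the Elo probabilities.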

In [None]:
# data["Toronto"][0]+20*(1-prob("Toronto","Boston")[0])