# Elo basics

The margin-dependent Elo model is based on the classical Elo model for win/loss games. The easiest way to understand the classical Elo algorithm is by looking at a minimal working example.

Example problem
===============

Consider two teams, `team_A` and `team_B`, each described by an intrinsic number, `rating_A` and `rating_B` respectively, which describes its true strength. Let's initially suppose the two teams are of equal strength and initialize their ratings as follows:

In [1]:
rating_A = 0
rating_B = 0

The Elo model says that the predicted (or prior) probability for `team_A` beating `team_B` is obtained by applying a certain cumulative distribution function (CDF) to the rating _difference_ of `team_A` and `team_B`.

In [2]:
from scipy.stats import norm

# Predicted probability that team_A beats team_B
pred_win_AvsB = norm.cdf(rating_A - rating_B)

print('Probability team A beats team B: {:.2f}'.format(pred_win_AvsB))

Probability team A beats team B: 0.50


Assume now that team_A beats team_B in the game of interest. We should therefore increase team_A's rating relative to team_B's to account for this new information. The Elo model does this by transferring some of team_B's rating to team_A. The size of the rating change is given by the following formula:

In [3]:
# Observed "probability" that team_A beats team_B.
obs_win_AvsB = 1

# Formula for computing the magnitude of rating transfer.
kfactor = 0.2
rating_change = kfactor * (obs_win_AvsB - pred_win_AvsB)

print('rating change = {:.2f}'.format(rating_change))

rating change = 0.10


Notice that this rating change scales with the magnitude of the difference between the game's predicted and observed outcome probabilities. The `kfactor` constant is a free parameter which determines how rapidly the ratings should respond to each game outcome.
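
To make the scaling concrete, here is a minimal sketch of the same formula applied to an uneven matchup. The ratings `rating_fav` and `rating_dog` are hypothetical values chosen only for illustration; they do not appear elsewhere in this walkthrough.

```python
from scipy.stats import norm

kfactor = 0.2

# Hypothetical ratings (illustration only): the favorite is rated 1.0 higher.
rating_fav, rating_dog = 1.0, 0.0

# Predicted probability that the favorite wins.
pred_fav_wins = norm.cdf(rating_fav - rating_dog)

# Rating change for the favorite in each possible outcome.
change_if_fav_wins = kfactor * (1 - pred_fav_wins)
change_if_fav_loses = kfactor * (0 - pred_fav_wins)

print('favorite wins:  {:+.3f}'.format(change_if_fav_wins))   # +0.032
print('favorite loses: {:+.3f}'.format(change_if_fav_loses))  # -0.168
```

An expected win barely moves the ratings, while an upset moves them about five times as far in the opposite direction.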

Once the rating change is calculated, the ratings are updated as follows:

In [4]:
rating_A += rating_change
rating_B -= rating_change

Now that we've updated the ratings for each team, let's revisit the predicted probability that team_A beats team_B.

In [5]:
# Predicted probability that team_A beats team_B.
pred_win_AvsB = norm.cdf(rating_A - rating_B)

print('Probability team A beats team B: {:.2f}'.format(pred_win_AvsB))

Probability team A beats team B: 0.58


Now we see that team_A is predicted to beat team_B with _greater_ than 50% probability. This reflects the fact that we just observed team_A beating team_B in the previous matchup. What happens if team_A beats team_B again?

In [6]:
# Suppose team_A beats team_B again.
obs_win_AvsB = 1

# Formula for computing the magnitude of rating transfer.
rating_change = kfactor * (obs_win_AvsB - pred_win_AvsB)

# Update each team's ratings
rating_A += rating_change
rating_B -= rating_change

# Predicted probability that teamA beats teamB.
pred_win_AvsB = norm.cdf(rating_A - rating_B)

print('Probability team A beats team B: {:.2f}'.format(pred_win_AvsB))

Probability team A beats team B: 0.64


The predicted probability that team_A beats team_B is now even higher! Notice, however, that the probability only increased by .06 for the second update, whereas it increased by .08 after the first update. Each win is less surprising than the last, so the rating change shrinks, and this diminishing return keeps the win probability from ever exceeding 100%.
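
The diminishing return can be checked over a longer winning streak. The sketch below wraps the update steps from the cells above in a helper, `elo_update`, which is a hypothetical name introduced here for convenience and not part of any library.

```python
from scipy.stats import norm

def elo_update(rating_a, rating_b, outcome, kfactor=0.2):
    """Return updated (rating_a, rating_b) after one game.

    outcome is 1 if team A won and 0 if team A lost.
    """
    pred = norm.cdf(rating_a - rating_b)
    change = kfactor * (outcome - pred)
    return rating_a + change, rating_b - change

rating_a, rating_b = 0.0, 0.0
probs = [norm.cdf(rating_a - rating_b)]  # starts at 0.5

# Team A wins five games in a row.
for _ in range(5):
    rating_a, rating_b = elo_update(rating_a, rating_b, outcome=1)
    probs.append(norm.cdf(rating_a - rating_b))

# Increase in win probability produced by each successive win.
gains = [after - before for before, after in zip(probs[:-1], probs[1:])]

print('win probabilities:', ['{:.2f}'.format(p) for p in probs])
print('gain per win:     ', ['{:.3f}'.format(g) for g in gains])
```

Each gain is smaller than the one before it, and because the update only transfers rating from one team to the other, the sum of the two ratings stays at zero.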

Accounting for margin of victory
================================

Everything up to this point has been for a single win/loss game. What if we want to generate ratings and predictions for games with integer or real-valued outcomes?

Imagine now a game which produces two scores, score_A and score_B, which we can use to construct a point spread S = score_A - score_B. Consider four possible comparison lines = [-1.5, -0.5, 0.5, 1.5] for the point spread S (here I've omitted ties for simplicity, but they can be incorporated as well).

A team might be very good at finishing with S > 0.5, covering the first three lines [-1.5, -0.5, 0.5], yet very _bad_ at finishing with S > 1.5 (maybe they always win, but just barely). The general premise of the margin-dependent Elo model is that we can train a separate Elo model for each comparison line of the point spread S. Hence if we choose four comparison lines for S, we need four rating numbers for each team. Let's initialize every rating value to zero, which gives each line a predicted probability of 0.5.

In [7]:
import numpy as np

# vector of initial ratings
rating_A = np.array([0., 0., 0., 0.])
rating_B = np.array([0., 0., 0., 0.])

Let's start now by computing the predicted probability that team_A beats team_B by each possible margin of victory line. We'll follow the same procedure as before with one important difference.

For margin of victory Elo, if team_A becomes more likely to _win_ by p or more points, then team_B becomes more likely to _lose_ by p or more points. Hence the rating transfer should occur from team_B's rating at -p points to team_A's rating at +p points. The predicted win probability formula thus involves an additional reflection:

In [8]:
# Predicted probability that team_A beats team_B by each point spread outcome S = score_A - score_B:
pred_win_AvsB = norm.cdf(rating_A - rating_B[::-1])

print('Possible outcomes: [S > -1.5, S > -0.5, S > 0.5, S > 1.5]')
print('Probability of each possible outcome:', pred_win_AvsB)

Possible outcomes: [S > -1.5, S > -0.5, S > 0.5, S > 1.5]
Probability of each possible outcome: [0.5 0.5 0.5 0.5]
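
As a sanity check on the reflection, the sketch below verifies that the two teams' predictions are mutually consistent: the probability that team_B covers the line -p should equal one minus the probability that team_A covers the line +p. The ratings used here (`demo_rating_A`, `demo_rating_B`) are arbitrary, hypothetical values, and the check assumes the comparison lines are symmetric about zero, as they are in this example.

```python
import numpy as np
from scipy.stats import norm

# Comparison lines, symmetric about zero.
lines = np.array([-1.5, -0.5, 0.5, 1.5])

# Hypothetical, arbitrary ratings chosen only for illustration.
demo_rating_A = np.array([0.9, 0.4, -0.1, -0.7])
demo_rating_B = np.array([0.6, 0.2, -0.3, -0.8])

# P(score_A - score_B > line) from team_A's perspective.
pred_AvsB = norm.cdf(demo_rating_A - demo_rating_B[::-1])

# P(score_B - score_A > line) from team_B's perspective.
pred_BvsA = norm.cdf(demo_rating_B - demo_rating_A[::-1])

# team_B covering -p is the complement of team_A covering +p.
print(np.allclose(pred_BvsA, 1 - pred_AvsB[::-1]))  # True
```

Without the `[::-1]` reflection this identity would not hold in general, and the two teams' predictions would contradict each other.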


Assume now that team_A beats team_B by 1 point, so S = 1 clears the first three lines but not the fourth. The rating transfer is given by:

In [9]:
# Vector of point spread outcomes
obs_win_AvsB = np.array([1, 1, 1, 0])

# Formula for computing the magnitude of rating transfer.
rating_change = kfactor * (obs_win_AvsB - pred_win_AvsB)

print('rating change = ', rating_change)

rating change =  [ 0.1  0.1  0.1 -0.1]
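
The observation vector above was written out by hand. A minimal sketch of building it from an observed spread follows; the helper name `observed_outcomes` is hypothetical and introduced only for this illustration.

```python
import numpy as np

# Comparison lines for the point spread S = score_A - score_B.
lines = np.array([-1.5, -0.5, 0.5, 1.5])

def observed_outcomes(spread, lines):
    """Return 1.0 for every line the spread exceeds and 0.0 otherwise."""
    return (spread > lines).astype(float)

print(observed_outcomes(1, lines))   # [1. 1. 1. 0.]
print(observed_outcomes(-2, lines))  # [0. 0. 0. 0.]
```

Because the lines sit at half-integers, an integer spread can never land exactly on a line, so the comparison is never ambiguous.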


We can then apply the same formula to update the ratings, remembering that the rating transfer occurs with the reflection described above.

In [10]:
# Update each team's ratings
rating_A += rating_change
rating_B -= rating_change[::-1]

# Predicted probability that teamA beats teamB.
pred_win_AvsB = norm.cdf(rating_A - rating_B[::-1])

print('Probability team A beats team B:\n', pred_win_AvsB)

Probability team A beats team B:
 [0.57925971 0.57925971 0.57925971 0.42074029]


Notice that after one update the predicted probability increased for the three lines below S = 1 and decreased for the line above S = 1. Let's repeat this same game outcome a few times and see how the ratings respond.

In [12]:
for n in range(100):
    # Vector of point spread outcomes for S = 1: the lines
    # S > -1.5, S > -0.5 and S > 0.5 are covered (one), S > 1.5 is not (zero).
    obs_win_AvsB = np.array([1, 1, 1, 0])

    # Formula for computing the magnitude of rating transfer.
    rating_change = kfactor * (obs_win_AvsB - pred_win_AvsB)
    
    # Update each team's ratings
    rating_A += rating_change
    rating_B -= rating_change[::-1]

    # Predicted probability that teamA beats teamB.
    pred_win_AvsB = norm.cdf(rating_A - rating_B[::-1])

    print('Probability team A beats team B at each line:\n', pred_win_AvsB)

Probability team A beats team B at each line:
 [0.98938948 0.98938948 0.98938948 0.01061052]
Probability team A beats team B at each line:
 [0.98950801 0.98950801 0.98950801 0.01049199]
Probability team A beats team B at each line:
 [0.98962409 0.98962409 0.98962409 0.01037591]
Probability team A beats team B at each line:
 [0.98973778 0.98973778 0.98973778 0.01026222]
Probability team A beats team B at each line:
 [0.98984915 0.98984915 0.98984915 0.01015085]
Probability team A beats team B at each line:
 [0.98995827 0.98995827 0.98995827 0.01004173]
Probability team A beats team B at each line:
 [0.99006522 0.99006522 0.99006522 0.00993478]
Probability team A beats team B at each line:
 [0.99017004 0.99017004 0.99017004 0.00982996]
Probability team A beats team B at each line:
 [0.99027281 0.99027281 0.99027281 0.00972719]
Probability team A beats team B at each line:
 [0.99037357 0.99037357 0.99037357 0.00962643]
Probability team A beats team B at each line:
 [0.9904724 0.9904724 0.

After many iterations, we see that the predicted probabilities converge toward the distribution implied by our observations. Our Elo-estimated CDF (strictly, the complementary CDF P(S > line)) becomes a step function: for lines below S = 1 the predicted probability is ~1, and for lines above S = 1 it is ~0. The probability density function (PDF) can be estimated by taking the first derivative with respect to the line and flipping the sign, since P(S > line) decreases as the line increases. This produces a delta function located at S = 1. In other words, our model will predict a point spread of S = 1 if it has repeatedly observed a point spread of S = 1.
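
For discrete lines, the derivative reduces to differences between neighboring line probabilities. The sketch below uses probabilities rounded from the output above and assumes the tails are closed off with P = 1 below the lowest line and P = 0 above the highest; the variable names are illustrative.

```python
import numpy as np

lines = np.array([-1.5, -0.5, 0.5, 1.5])

# P(S > line), rounded from the converged output above.
pred = np.array([0.99, 0.99, 0.99, 0.01])

# Pad so the differences also cover the two open-ended tails.
padded = np.concatenate(([1.0], pred, [0.0]))

# Probability that S falls in each bin between neighboring lines.
bin_probs = padded[:-1] - padded[1:]
bin_labels = ['S < -1.5', '-1.5 < S < -0.5', '-0.5 < S < 0.5',
              '0.5 < S < 1.5', 'S > 1.5']

for label, prob in zip(bin_labels, bin_probs):
    print('{:>16}: {:.2f}'.format(label, prob))
```

Nearly all of the probability lands in the bin containing S = 1. Note that each line's ratings are updated independently in this toy example, so nothing forces the line probabilities to decrease monotonically; with noisier data some bin estimates could come out slightly negative.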

In practice, one can estimate the point spread CDF by computing Elo ratings for each value of the point spread within some reasonable range. For example, if NFL point spreads fall within |S| < 60, then one might compute Elo ratings at each line = [-60.5, -59.5, ..., 59.5, 60.5].
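
A minimal sketch of that setup is shown below, combining the steps from the earlier cells into one update function. The names `update`, `rating_home`, and `rating_away`, and the example spread of 7, are assumptions made for illustration rather than part of any established implementation.

```python
import numpy as np
from scipy.stats import norm

# Half-integer lines from -60.5 to 60.5, symmetric about zero.
lines = np.arange(-60.5, 61.5, 1.0)

# One rating per line for each team, all starting at zero.
rating_home = np.zeros_like(lines)
rating_away = np.zeros_like(lines)

def update(rating_a, rating_b, spread, lines, kfactor=0.2):
    """Apply one margin-dependent Elo update for an observed spread."""
    pred = norm.cdf(rating_a - rating_b[::-1])
    obs = (spread > lines).astype(float)
    change = kfactor * (obs - pred)
    # The transfer to team B is reflected, as in the four-line example.
    return rating_a + change, rating_b - change[::-1]

# Hypothetical result: the home team wins by 7.
rating_home, rating_away = update(rating_home, rating_away, spread=7, lines=lines)

pred = norm.cdf(rating_home - rating_away[::-1])
print('number of lines:', len(lines))
print('P(S > 6.5) = {:.2f}'.format(pred[lines == 6.5][0]))
print('P(S > 7.5) = {:.2f}'.format(pred[lines == 7.5][0]))
```

After this single game, every line below 7 is predicted with probability above 0.5 and every line above 7 with probability below 0.5, mirroring the four-line example.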