In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
plt.style.use('ggplot')

One of my favourite models that I have come across is the [Elo rating system](https://en.wikipedia.org/wiki/Elo_rating_system). In short, it is a system that can be applied to pairwise matchups between teams or players. It is popular across domains such as chess, education, online gaming, and many more. I like it because it was the first model where I tried adjusting a formula using my own knowledge of the topic I wanted to focus on. Elo updates after every single game a player or team participates in. Because a team in sport can only play the teams on its schedule, Elo allows us to reduce some of the uncertainty about how good a particular team is. Another nice feature of Elo is that it rewards teams more for beating a good team, and punishes good teams more for losing to bad teams[^1].

## Overview of Elo

Suppose $Team_{A}$ and $Team_{B}$ start a match with ratings $R_{A}$ and $R_{B}$. The result of the game is coded as $0$ for a loss, $0.5$ for a draw and $1$ for a win. Before the match, the priors can be expressed using:

$$P_{A} = \frac{1}{1 + 10^{(R_{B} - R_{A})/400}} \quad P_{B} = 1 - P_{A}$$

::: {.column-margin}
Where $R$ is short for rating.
:::

where $P_{i}$ is the prior probability that team $i$ wins the match. After each match, ratings are updated as follows:

$$R_{A}^{new} = R_{A} + K(S_{A} - P_{A}) \quad R_{B}^{new} = R_{B} + K(S_{B} - P_{B})$$

where $S_{i}$ is the score of team $i$ (0/0.5/1) and $K$ is an update weight (commonly called the `k-factor`).

:::{.callout-note}
The K-factor is sometimes simply called $K$ or the update factor.

  - A larger $K$ makes ratings more volatile, since each result moves them further, whereas a smaller $K$ takes longer to adjust to new information. 538 uses a $K$ of $25$ in their Elo model.
:::

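Because $S_{A} + S_{B} = 1$ and $P_{A} + P_{B} = 1$, and both teams share the same $K$, the two updates cancel out. As such, it is a zero-sum game: for every Elo point one team gains, the other team loses the same amount.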
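
To make the update rule concrete, here is a minimal sketch of the two formulas above. The function names `expected_score` and `update_ratings` are my own, and the ratings of 1500 and 1600 are made-up values for illustration. It assumes both teams share the same $K$ and that the two scores sum to $1$.

```python
def expected_score(r_a, r_b, scale=400):
    """Prior probability that team A beats team B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / scale))


def update_ratings(r_a, r_b, s_a, k=25, scale=400):
    """Return the new ratings of A and B after one match.

    s_a is team A's score: 0 for a loss, 0.5 for a draw, 1 for a win.
    """
    p_a = expected_score(r_a, r_b, scale)
    p_b = 1 - p_a
    s_b = 1 - s_a  # assumes the two scores always sum to 1
    return r_a + k * (s_a - p_a), r_b + k * (s_b - p_b)


# Made-up example: a 1500-rated underdog beats a 1600-rated favourite
new_a, new_b = update_ratings(1500, 1600, s_a=1)
print(round(new_a, 1), round(new_b, 1))

# Zero sum: the total number of rating points is unchanged
print(round(new_a + new_b, 6))
```

The underdog gains roughly 16 points here and the favourite loses the same amount. Had the favourite won instead, only about 9 points would have changed hands, which is the "rewards upsets more" behaviour described at the start.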

## Scaling Factor

A scaling factor of $400$ means that a difference of $400$ Elo points gives the favoured team roughly a $91\%$ ($10/11$) chance to win (see the plot below). A smaller scaling factor would shrink the range of rating values. This choice does not matter too much; however, I like to keep it at $400$ so that comparing different Elo methods is apples to apples.
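
A quick way to see what the scaling factor does is to hold the rating gap fixed and vary the scale. This is only a sketch: `win_prob` is a hypothetical helper, and the scales of $200$ and $800$ are arbitrary comparison values.

```python
def win_prob(elo_diff, scale=400):
    """Win probability for a team that is `elo_diff` points ahead."""
    return 1 / (1 + 10 ** (-elo_diff / scale))


# Same 400-point gap under three different scaling factors
for scale in (200, 400, 800):
    print(scale, round(win_prob(400, scale), 3))
```

With a scale of $400$ the favourite wins with probability $10/11$. Halving the scale makes the same gap count for more, and doubling it makes the gap count for less.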


In [None]:
# Elo difference from team i's point of view: R_i - R_j
elo_diff = np.arange(-1000, 1100, 100)

df = pd.DataFrame({
    'elo_diff': elo_diff,
    # negating the difference makes -1000 the underdog and +1000 the favourite
    'prob': 1 / (1 + 10 ** (-elo_diff / 400))
})

fig = plt.figure(figsize = (8, 4))

plt.plot(df['elo_diff'], df['prob'], color = '#2ca25f', linewidth = 2)
plt.yticks(np.arange(0, 1.1, 0.1))
plt.xlabel('Elo Difference')
plt.ylabel('Win Probability')
plt.title('Probability that team $i$ beats team $j$')
plt.grid(True)
plt.show()

## Next

[^1]: Over a season, teams or players probably do not get better or worse; rather, our estimate of their skill improves as the uncertainty decreases.