# Creating an Elo Rating System for Soccer

## What is an Elo Rating and why is it useful for soccer?

I asked ChatGPT to answer this question:

An Elo Rating, or simply Elo, is a rating system used to measure the relative skill levels of players or teams in two-player or multiplayer competitive games and sports. It was originally developed for chess by Arpad Elo but has been widely adopted in various sports, including soccer. The Elo rating system provides a numerical representation of a player's or team's skill, making it easier to compare and rank them. 

In soccer, Elo ratings can be applied to both club teams and national teams, offering a standardized way to evaluate and rank their performances. While other rating systems exist, Elo remains a popular choice due to its simplicity and effectiveness in capturing the competitive nature of the sport. It has been used to rank and compare soccer teams, track performance over seasons, and aid in the scheduling of matches and tournaments to ensure fair play.

## How is it calculated? And can you explain the variables R, s and k in this context?

The Elo rating system calculates a player's or team's new rating based on the outcome of a game or match. The formula used for the calculation is as follows:

R' = R + K * (S - E)

Where:

- R': The new rating after the game.
- R: The current rating before the game.
- K: The K-factor, which determines the impact of a single game on the rating.
- S: The outcome of the game (1 for a win, 0.5 for a draw, 0 for a loss).
- E: The expected outcome, which is calculated using the logistic function.

Now, let's break down the variables in the context of soccer:

R (Rating): This is the current rating of the player or team before the game. It represents their skill level, and it's a numerical value assigned based on their previous performance. For soccer teams, the R rating is typically used to assess their skill and competitiveness.

K-Factor (K): The K-factor is a constant that determines the weight or impact of a single game on the rating. In the context of soccer, the K-factor is used to reflect the degree of uncertainty in a team's skill level. Higher K-factors are applied to less experienced teams or teams with limited performance history, making their ratings more sensitive to individual games. More experienced and established teams have lower K-factors, meaning that their ratings change less with each game.

S (Outcome): In soccer, the outcome (S) takes one of three values:

- 1 for a win: A team gets a full point if they win the game.
- 0.5 for a draw: Both teams get a half point in the case of a draw.
- 0 for a loss: A team gets no points for a loss.

E (Expected Outcome): The expected outcome (E) is calculated using a logistic function of the difference between the Elo ratings of the two competing teams. In the context of soccer, it is an expected score between 0 and 1 on the same scale as S, rather than separate probabilities for a win, a draw, and a loss. The expected outcomes of the two teams add up to 1.
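
To make the logistic function behind E concrete, here is a minimal sketch. The function name `expected_score` and the ratings 1600 and 1500 are illustrative assumptions; the scale of 400 is the conventional choice and is also the one used in the notebook code further down.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of team A against team B on the 0-to-1 scale of S."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))


# Hypothetical ratings, chosen only for illustration
e_a = expected_score(1600, 1500)
e_b = expected_score(1500, 1600)

print(round(e_a, 3))  # 0.64
print(round(e_b, 3))  # 0.36
```

A 100-point rating advantage therefore corresponds to an expected score of roughly 0.64, and the two expected scores always sum to 1.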

The Elo system then adjusts the team's or player's rating based on the difference between the actual outcome (S) and the expected outcome (E). If a team with a lower Elo rating beats a team with a higher Elo rating, the winning team's rating will increase significantly, while the losing team's rating will decrease.

This adjustment ensures that over time, the ratings accurately reflect the true skill levels of teams or players, making the Elo system a valuable tool for ranking and comparing soccer teams.
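
As a worked example of the update rule R' = R + K * (S - E), the sketch below applies it to an upset. The helper names, the ratings, and K = 20 are assumptions made for illustration only; they are not the values used later in this notebook.

```python
def expected_score(rating_a, rating_b):
    """Logistic expected score of A against B (scale 400)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / 400.0))


def update(rating, opponent, score, k=20.0):
    """Apply R' = R + K * (S - E) for one team."""
    return rating + k * (score - expected_score(rating, opponent))


underdog, favorite = 1500.0, 1600.0

# The underdog wins: S = 1 for the underdog, S = 0 for the favorite
new_underdog = update(underdog, favorite, score=1.0)
new_favorite = update(favorite, underdog, score=0.0)
print(round(new_underdog, 1), round(new_favorite, 1))  # 1512.8 1587.2

# For comparison: the gain if the favorite had won instead
print(round(update(favorite, underdog, score=1.0) - favorite, 1))  # 7.2
```

The upset moves both ratings by about 12.8 points, whereas the expected result would only have moved them by about 7.2. With S limited to 1, 0.5, and 0, the points one team gains are exactly the points the other loses.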

#### Loading libraries

In [3]:
from statsbombpy import sb
import pandas as pd
import numpy as np
import plotly.graph_objects as go


#### Get the data

In [4]:
#To get the data I used statsbombpy
sb.competitions()




| | competition_id | season_id | country_name | competition_name | competition_gender | competition_youth | competition_international | season_name | match_updated | match_updated_360 | match_available_360 | match_available |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9 | 27 | Germany | 1. Bundesliga | male | False | False | 2015/2016 | 2023-08-17T23:51:11.837478 | | | 2023-08-17T23:51:11.837478 |
| 1 | 16 | 4 | Europe | Champions League | male | False | False | 2018/2019 | 2023-03-07T12:20:48.118250 | 2021-06-13T16:17:31.694 | | 2023-03-07T12:20:48.118250 |
| 2 | 16 | 1 | Europe | Champions League | male | False | False | 2017/2018 | 2021-08-27T11:26:39.802832 | 2021-06-13T16:17:31.694 | | 2021-01-23T21:55:30.425330 |
| 3 | 16 | 2 | Europe | Champions League | male | False | False | 2016/2017 | 2021-08-27T11:26:39.802832 | 2021-06-13T16:17:31.694 | | 2020-07-29T05:00 |
| 4 | 16 | 27 | Europe | Champions League | male | False | False | 2015/2016 | 2021-08-27T11:26:39.802832 | 2021-06-13T16:17:31.694 | | 2020-07-29T05:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 62 | 55 | 43 | Europe | UEFA Euro | male | False | True | 2020 | 2023-02-24T21:26:47.128979 | 2023-04-27T22:38:34.970148 | 2023-04-27T22:38:34.970148 | 2023-02-24T21:26:47.128979 |
| 63 | 35 | 75 | Europe | UEFA Europa League | male | False | False | 1988/1989 | 2023-06-18T19:28:39.443883 | 2021-06-13T16:17:31.694 | | 2023-06-18T19:28:39.443883 |
| 64 | 53 | 106 | Europe | UEFA Women's Euro | female | False | True | 2022 | 2023-07-17T21:19:03.032991 | 2023-07-17T21:21:56.497106 | 2023-07-17T21:21:56.497106 | 2023-07-17T21:19:03.032991 |
| 65 | 72 | 107 | International | Women's World Cup | female | False | True | 2023 | 2023-09-01T12:34:19.705316 | 2023-09-01T12:35:45.762196 | 2023-09-01T12:35:45.762196 | 2023-09-01T12:34:19.705316 |


In [5]:
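# Getting the first Bundesliga season from the StatsBomb data set as an example
df = sb.matches(competition_id=9, season_id=27).sort_values('match_week')

# .copy() makes df_use an independent frame, so adding the Elo columns later
# does not trigger pandas' SettingWithCopyWarning
df_use = df[['home_team', 'away_team', 'home_score', 'away_score', 'match_week']].copy()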
df_use



| | home_team | away_team | home_score | away_score | match_week |
|---|---|---|---|---|---|
| 305 | Bayern Munich | Hamburger SV | 5 | 0 | 1 |
| 297 | Wolfsburg | Eintracht Frankfurt | 2 | 1 | 1 |
| 298 | VfB Stuttgart | FC Köln | 1 | 3 | 1 |
| 300 | Werder Bremen | Schalke 04 | 0 | 3 | 1 |
| 299 | Augsburg | Hertha Berlin | 0 | 1 | 1 |
| ... | ... | ... | ... | ... | ... |
| 18 | Darmstadt 98 | Borussia Mönchengladbach | 0 | 2 | 34 |
| 17 | Borussia Dortmund | FC Köln | 2 | 2 | 34 |
| 16 | Bayer Leverkusen | Ingolstadt | 3 | 2 | 34 |
| 15 | Augsburg | Hamburger SV | 1 | 3 | 34 |


#### Writing functions for the Elo rating system

In [6]:
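# Expected result function: logistic curve on the rating difference (scale 400)
def expected_result(home, away):
    dr = home - away
    we = 1 / (10 ** (-dr / 400) + 1)
    # Returns [expected home result, expected away result]; the two sum to 1
    return [np.round(we, 3), 1 - np.round(we, 3)]

# Actual result function --> s can be adjusted here
# Note: the textbook formula uses s = 1 (win 1, draw 0.5, loss 0). With s = 15
# the actual result is scaled up, so a win moves a rating far more than a loss.
def actual_result(home, away, s=15):
    if home > away:
        wl, wa = s, 0        # home win
    elif home < away:
        wl, wa = 0, s        # away win
    else:
        wl = wa = s / 2      # draw
    return [wl, wa]

# Calculate the Elo rating --> k can be adjusted here
def calculate_elo(elo_home, elo_away, home_goals, away_goals, k=15):
    # Actual (wl, wv) and expected (wel, wev) results for home and away
    wl, wv = actual_result(home_goals, away_goals)
    wel, wev = expected_result(elo_home, elo_away)

    elo_home_new = elo_home + k * (wl - wel)
    elo_away_new = elo_away + k * (wv - wev)

    return elo_home_new, elo_away_new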
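
To see what a single update does with these settings, the sketch below restates the three functions above in one compact helper and compares s = 15 with the textbook s = 1. The name `one_update` is hypothetical and exists only for this illustration.

```python
def one_update(elo_home, elo_away, home_goals, away_goals, s=15, k=15):
    """Compact restatement of the notebook's update (hypothetical helper)."""
    expected_home = round(1 / (10 ** (-(elo_home - elo_away) / 400) + 1), 3)
    expected_away = 1 - expected_home
    if home_goals > away_goals:
        actual_home, actual_away = s, 0
    elif home_goals < away_goals:
        actual_home, actual_away = 0, s
    else:
        actual_home = actual_away = s / 2
    return (elo_home + k * (actual_home - expected_home),
            elo_away + k * (actual_away - expected_away))


# Two teams starting at 100, home side wins 5-0
print(one_update(100, 100, 5, 0))       # (317.5, 92.5)
print(one_update(100, 100, 5, 0, s=1))  # (107.5, 92.5)
```

With s = 15 the winner gains 217.5 points while the loser drops only 7.5, so ratings mostly accumulate over a season. That is consistent with the final ratings in the thousands shown further down, and it means the update is no longer zero-sum as it is with s = 1.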

### Apply the rating system to 1. Bundesliga 2015/2016 season

In [7]:
# Calculating Elo ratings for the season via looping --> R can be adjusted here
current_elo={}
for idx,row in df_use.iterrows():
    
    home=row['home_team']
    away=row['away_team']
    home_goals=row['home_score']
    away_goals=row['away_score']    

    # Every team enters the season with the starting rating R = 100
    if home not in current_elo:
        current_elo[home]=100
    
    if away not in current_elo:
        current_elo[away]=100
    
    elo_home=current_elo[home]
    elo_away=current_elo[away]
    elo_home_new,elo_away_new=calculate_elo(elo_home,elo_away,home_goals,away_goals)

    current_elo[home]=elo_home_new
    current_elo[away]=elo_away_new
    
    df_use.loc[idx,'Elo_home_after']=elo_home_new
    df_use.loc[idx,'Elo_away_after']=elo_away_new
    df_use.loc[idx,'Elo_home_before']=elo_home
    df_use.loc[idx,'Elo_away_before']=elo_away
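
Writing the ratings back cell by cell with `.loc` works, but it can trigger pandas' SettingWithCopyWarning when the target frame was created as a slice of another frame. One alternative, sketched here on a toy fixture list, is to collect the ratings in a plain list and attach them in a single step. The teams, the scores, and the `update` helper (a standard Elo update with S in {1, 0.5, 0}, standing in for `calculate_elo`) are assumptions for illustration.

```python
import pandas as pd

# Toy fixture list with hypothetical teams and scores
matches = pd.DataFrame({
    "home_team": ["A", "C", "B"],
    "away_team": ["B", "A", "C"],
    "home_score": [2, 1, 0],
    "away_score": [0, 1, 3],
})


def update(elo_home, elo_away, home_goals, away_goals, k=15):
    """Standard Elo update (stand-in for calculate_elo, not the notebook's s=15 rule)."""
    expected_home = 1 / (1 + 10 ** (-(elo_home - elo_away) / 400))
    score_home = 1.0 if home_goals > away_goals else 0.0 if home_goals < away_goals else 0.5
    delta = k * (score_home - expected_home)
    return elo_home + delta, elo_away - delta


current_elo = {}
rows = []
for match in matches.itertuples(index=False):
    # setdefault assigns the starting rating the first time a team appears
    elo_home = current_elo.setdefault(match.home_team, 100.0)
    elo_away = current_elo.setdefault(match.away_team, 100.0)
    new_home, new_away = update(elo_home, elo_away, match.home_score, match.away_score)
    current_elo[match.home_team], current_elo[match.away_team] = new_home, new_away
    rows.append((elo_home, elo_away, new_home, new_away))

# Build the four Elo columns once and join them on the match index
elo_cols = pd.DataFrame(rows, index=matches.index,
                        columns=["Elo_home_before", "Elo_away_before",
                                 "Elo_home_after", "Elo_away_after"])
result = matches.join(elo_cols)
print(result)
```

Because `result` is a new frame rather than a slice, no chained-assignment warning is involved, and the loop body stays free of DataFrame writes.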



### Checking the results of the Elo rating at the end of the season

In [8]:
# Filtering for the last week (this assumes every team plays in match week 34)
df_elo = df_use.loc[df_use.loc[:,'match_week'] == 34]

# Extracting the Elos; both melts stack the home rows first and the away rows second,
# so the two results line up row by row
df_elo_teams = df_elo[['home_team', 'away_team']].melt(value_name= 'Team', var_name='location1')
df_elo_elos =  df_elo[['Elo_home_after', 'Elo_away_after']].melt(value_name= 'Elo', var_name='location2')

# Putting everything together and calculating a relative Elo in relation to the highest value set to 100%
df_elo_final = pd.concat([df_elo_teams, df_elo_elos], axis=1).drop(columns=['location1', 'location2'])
df_elo_final['Relative_Elo'] = df_elo_final['Elo'] / df_elo_final['Elo'].max() * 100
df_elo_final['Rank_Elo'] = df_elo_final['Elo'].rank(method='min', ascending=False)
df_elo_final.sort_values('Elo', ascending= False)







| | Team | Elo | Relative_Elo | Rank_Elo |
|---|---|---|---|---|
| 3 | Bayern Munich | 6370.03 | 100.0 | 1.0 |
| 5 | Borussia Dortmund | 5717.605 | 89.757898 | 2.0 |
| 6 | Bayer Leverkusen | 4474.675 | 70.245745 | 3.0 |
| 13 | Borussia Mönchengladbach | 4088.995 | 64.191142 | 4.0 |
| 17 | Schalke 04 | 3921.565 | 61.56274 | 5.0 |
| 2 | FSV Mainz 05 | 3847.36 | 60.397832 | 6.0 |
| 11 | Hertha Berlin | 3800.935 | 59.669028 | 7.0 |
| 14 | FC Köln | 3529.015 | 55.400289 | 8.0 |
| 0 | Wolfsburg | 3490.72 | 54.799114 | 9.0 |
| 16 | Hamburger SV | 3248.905 | 51.002978 | 10.0 |
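
Since the loop already keeps each team's latest rating in the `current_elo` dictionary, the final table could also be built from that dictionary directly, which does not depend on every team playing in the last match week. A minimal sketch, using made-up ratings for three hypothetical teams in place of the real dictionary:

```python
import pandas as pd

# Hypothetical end-of-season ratings, standing in for the current_elo dict
current_elo = {"A": 250.0, "B": 400.0, "C": 100.0}

elo_table = (
    pd.Series(current_elo, name="Elo")
    .rename_axis("Team")
    .reset_index()
)
# Same derived columns as in the cell above
elo_table["Relative_Elo"] = elo_table["Elo"] / elo_table["Elo"].max() * 100
elo_table["Rank_Elo"] = elo_table["Elo"].rank(method="min", ascending=False)
elo_table = elo_table.sort_values("Elo", ascending=False, ignore_index=True)
print(elo_table)
```

The column names match the ones used above, so the result could be merged with the standings table in the same way.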


#### Adding the actual final standings for 2015/16 for comparison

In [9]:
# Work from the full match table loaded above (it is only read, never modified)
df_season = df

# Define the point system
points_win = 3
points_draw = 1
points_loss = 0

# Create an empty table for the final standings
teams = set(df_season['home_team'].unique())
table = pd.DataFrame({'Team': list(teams), 'Points': 0, 'Goal Differential': 0})

# Update the table based on match results
for _, match in df_season.iterrows():
    home_team = match['home_team']
    away_team = match['away_team']
    home_goals = match['home_score']
    away_goals = match['away_score']
    
    # Update points and goal differential
    if home_goals > away_goals:
        table.loc[table['Team'] == home_team, 'Points'] += points_win
        table.loc[table['Team'] == home_team, 'Goal Differential'] += home_goals - away_goals
        table.loc[table['Team'] == away_team, 'Goal Differential'] += away_goals - home_goals
    elif home_goals < away_goals:
        table.loc[table['Team'] == away_team, 'Points'] += points_win
        table.loc[table['Team'] == away_team, 'Goal Differential'] += away_goals - home_goals
        table.loc[table['Team'] == home_team, 'Goal Differential'] += home_goals - away_goals
    else:
        table.loc[table['Team'] == home_team, 'Points'] += points_draw
        table.loc[table['Team'] == away_team, 'Points'] += points_draw

# Sort the table by points and goal differential
table = table.sort_values(by=['Points', 'Goal Differential'], ascending=False)

# Add the rank column
table['Rank'] = range(1, len(table) + 1)

table

| | Team | Points | Goal Differential | Rank |
|---|---|---|---|---|
| 16 | Bayern Munich | 88 | 63 | 1 |
| 4 | Borussia Dortmund | 78 | 48 | 2 |
| 13 | Bayer Leverkusen | 60 | 16 | 3 |
| 17 | Borussia Mönchengladbach | 55 | 17 | 4 |
| 15 | Schalke 04 | 52 | 2 | 5 |
| 14 | FSV Mainz 05 | 50 | 4 | 6 |
| 11 | Hertha Berlin | 50 | 0 | 7 |
| 0 | Wolfsburg | 45 | -2 | 8 |
| 12 | FC Köln | 43 | -4 | 9 |
| 10 | Hamburger SV | 41 | -6 | 10 |
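
The row-by-row loop above is easy to follow. As an alternative, the same standings can be computed without a Python loop by viewing every match once from each team's perspective and grouping by team. The sketch below uses toy results with hypothetical teams; the helper `side` and the extra sort key on goals scored (`GF`) are additions that are not part of the original cell. The extra key makes the order deterministic when two teams are level on points and goal differential.

```python
import pandas as pd

# Toy results with hypothetical teams
matches = pd.DataFrame({
    "home_team": ["A", "B", "C", "A"],
    "away_team": ["B", "C", "A", "C"],
    "home_score": [2, 1, 0, 1],
    "away_score": [0, 1, 3, 1],
})


def side(df, team, goals_for, goals_against):
    """One row per match, seen from one side's perspective."""
    out = pd.DataFrame({"Team": df[team], "GF": df[goals_for], "GA": df[goals_against]})
    # 3 points for a win, 1 for a draw, 0 for a loss
    out["Points"] = 3 * (out["GF"] > out["GA"]) + 1 * (out["GF"] == out["GA"])
    return out


per_team = pd.concat([
    side(matches, "home_team", "home_score", "away_score"),
    side(matches, "away_team", "away_score", "home_score"),
])
table = per_team.groupby("Team", as_index=False)[["Points", "GF", "GA"]].sum()
table["Goal Differential"] = table["GF"] - table["GA"]
table = table.sort_values(["Points", "Goal Differential", "GF"],
                          ascending=False, ignore_index=True)
table["Rank"] = range(1, len(table) + 1)
print(table)
```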


In [10]:
# Adding relative points with the highest amount set to 100% to be able to compare points and elo rating
df_bundesliga_1516 = pd.DataFrame(table, columns =['Team', 'Goal Differential', 'Points', 'Rank'])
df_bundesliga_1516['Relative_Points'] = df_bundesliga_1516['Points'] / max(df_bundesliga_1516['Points']) *100
df_bundesliga_1516


| | Team | Goal Differential | Points | Rank | Relative_Points |
|---|---|---|---|---|---|
| 16 | Bayern Munich | 63 | 88 | 1 | 100.0 |
| 4 | Borussia Dortmund | 48 | 78 | 2 | 88.636364 |
| 13 | Bayer Leverkusen | 16 | 60 | 3 | 68.181818 |
| 17 | Borussia Mönchengladbach | 17 | 55 | 4 | 62.5 |
| 15 | Schalke 04 | 2 | 52 | 5 | 59.090909 |
| 14 | FSV Mainz 05 | 4 | 50 | 6 | 56.818182 |
| 11 | Hertha Berlin | 0 | 50 | 7 | 56.818182 |
| 0 | Wolfsburg | -2 | 45 | 8 | 51.136364 |
| 12 | FC Köln | -4 | 43 | 9 | 48.863636 |
| 10 | Hamburger SV | -6 | 41 | 10 | 46.590909 |


#### Merging and comparing Elo and actual results

In [11]:
# Merging datasets and adding the absolute rank difference and the relative difference
df_merged = pd.merge(df_bundesliga_1516, df_elo_final, on='Team', how='inner')
df_merged['Rank_Difference'] = abs(df_merged['Rank'] - df_merged['Rank_Elo'])
df_merged['Realtive_Difference'] = abs(df_merged['Relative_Points'] - df_merged['Relative_Elo'])
df_merged.aggregate({'Rank_Difference' : 'sum', 'Realtive_Difference' : 'sum'})

Rank_Difference         4.000000
Realtive_Difference    74.543782
dtype: float64

Simply adding up the differences gives the sum of absolute errors; dividing that sum by the number of teams would give the Mean Absolute Error. Either is a decent measure for evaluating the performance of this setup. It is simple but viable, since no extreme values are to be expected that could distort it. These two totals are what we would optimize for.
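
The relationship between the summed differences and a Mean Absolute Error can be shown in a few lines. The rank differences below are made-up numbers for six hypothetical teams, not values from this season.

```python
# Hypothetical rank differences for six teams, for illustration only
rank_differences = [0, 0, 1, 1, 2, 0]

total_absolute_error = sum(abs(d) for d in rank_differences)
mean_absolute_error = total_absolute_error / len(rank_differences)

print(total_absolute_error)           # 4
print(round(mean_absolute_error, 3))  # 0.667
```

As long as the number of teams stays the same, minimizing the sum and minimizing the mean lead to the same choice of parameters.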

The initial setup with R = 100, s = 15 and K = 15 gives us these values:

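Rank_Difference: 6.000000    Realtive_Difference: 74.543782

Note that the cell output above shows a Rank_Difference of 4.0 for the same Realtive_Difference, so the two numbers appear to come from different runs. One possible cause is that teams level on both points and goal differential end up in an arbitrary order in the standings table, because the team list is built from a `set`.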

I experimented with the values for R, s and K a little bit. While there is some room to improve the relative difference by increasing s to, let's say, 25 and reducing K to 10 (Realtive_Difference: 71.232318), it looks like no improvement in the absolute rank difference is possible.
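
Trying parameter combinations by hand can be replaced by a small grid search. The sketch below is one way to set that up, under several assumptions: the toy season, the grid values, and the helper names `season_elo` and `rank_error` are all hypothetical, and the update rule mirrors the notebook's (actual result scaled by s) but skips the rounding of the expected result.

```python
from itertools import product

# Toy season: (home, away, home_goals, away_goals); hypothetical data
MATCHES = [("A", "B", 2, 0), ("B", "C", 1, 1), ("C", "A", 0, 3),
           ("A", "C", 1, 1), ("B", "A", 0, 1), ("C", "B", 2, 1)]
# Final table of the toy season by points: A 10, C 5, B 1
ACTUAL_RANK = {"A": 1, "C": 2, "B": 3}


def season_elo(matches, r0, s, k):
    """Replay all matches in order and return the final rating per team."""
    elo = {}
    for home, away, home_goals, away_goals in matches:
        elo_home = elo.setdefault(home, r0)
        elo_away = elo.setdefault(away, r0)
        expected_home = 1 / (1 + 10 ** (-(elo_home - elo_away) / 400))
        actual_home = s if home_goals > away_goals else 0 if home_goals < away_goals else s / 2
        elo[home] = elo_home + k * (actual_home - expected_home)
        elo[away] = elo_away + k * ((s - actual_home) - (1 - expected_home))
    return elo


def rank_error(elo, actual_rank):
    """Sum of absolute differences between Elo rank and actual rank."""
    ordered = sorted(elo, key=elo.get, reverse=True)
    return sum(abs(pos - actual_rank[team]) for pos, team in enumerate(ordered, start=1))


results = []
for r0, s, k in product([100, 1000], [1, 15, 25], [10, 15, 20]):
    elo = season_elo(MATCHES, float(r0), s, k)
    results.append((rank_error(elo, ACTUAL_RANK), r0, s, k))

best = min(results)  # lowest rank error; ties resolved by the smallest r0, s, k
print(best)
```

On a real season, `MATCHES` and `ACTUAL_RANK` would come from `df_use` and the standings table, and the relative difference could be added as a second score in the same loop.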

#### Plotting Elo Rank vs Actual Table Ranks

In [25]:
# Plotting rank comparisons
fig = go.Figure()
fig.add_trace(go.Bar(x=df_merged['Team'], y= df_merged['Rank'],
                name='Actual Final Rank'
                ))
fig.add_trace(go.Bar(x=df_merged['Team'], y= df_merged['Rank_Elo'],
                name='Final Elo Rank'
                ))

fig.update_layout(
   title='<b>Comparison - Final Elo Rank vs Actual Final Table</b>',
   template = "plotly_dark",
   xaxis_tickfont_size=14,
   width = 1500,
   height = 750,
   yaxis=dict(
        # title as a dict instead of the deprecated titlefont_size shortcut
        title=dict(text='Rank #', font=dict(size=16)),
        tickfont_size=14))

fig.add_annotation(text='Data Source: <a href="https://github.com/statsbomb/open-data">StatsBomb</a> <br>Viz: <a href="https://twitter.com/_prospecttheory">@_prospecttheory</a> <br> <br> <br> <br> <br>', 
                    align='left',
                    showarrow=False,
                    xref='paper',
                    yref='paper',
                    x= -0.05,
                    y= -0.65)


fig.show()

Only Köln and Wolfsburg are swapped, and the trio of Bremen, Hoffenheim and Darmstadt is not in the right order. Since no team's rank is off by more than two places, I would rate the performance of the system as decent. Optimizing the variables R, s and k has some value, but I would rate the gains as negligible.