# Elo Ratings

Useful links: a description of the Elo rating system from [World Football Elo Ratings](https://eloratings.net/about), a worked R example from [Robert Hickman](https://www.robert-hickman.eu/post/guardian_knowledge_june/) and the [dataset](https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017) he uses. See also this post on using Elo to [assess aerial performance](https://webcache.googleusercontent.com/search?q=cache:9-oen_lyqf4J:https://www.optasportspro.com/news-analysis/blog-a-new-way-to-assess-aerial-performance/+&cd=1&hl=en&ct=clnk&gl=ch) and [this](https://twitter.com/petermckeever/status/1283420492289003520) series of tweets.

In [14]:
import numpy as np
import pandas as pd

In [6]:
df = pd.read_csv('data/international/results.csv')

In [7]:
df.head()

,date,home_team,away_team,home_score,away_score,tournament,city,country,neutral
0,1872-11-30,Scotland,England,0,0,Friendly,Glasgow,Scotland,False
1,1873-03-08,England,Scotland,4,2,Friendly,London,England,False
2,1874-03-07,Scotland,England,2,1,Friendly,Glasgow,Scotland,False
3,1875-03-06,England,Scotland,2,2,Friendly,London,England,False
4,1876-03-04,Scotland,England,3,0,Friendly,Glasgow,Scotland,False


In [43]:
df['date'] = pd.to_datetime(df['date'])  # store the parsed dates instead of discarding them
df['date']

0       1872-11-30
1       1873-03-08
2       1874-03-07
3       1875-03-06
4       1876-03-04
           ...    
41581   2020-01-10
41582   2020-01-12
41583   2020-01-15
41584   2020-01-19
41585   2020-02-01
Name: date, Length: 41586, dtype: datetime64[ns]

Apart from the tournament name, the dataset gives no further context for each match, so I will simply use K = 40 for every match.

In [37]:
df.drop(['tournament','city','country'],axis=1,inplace=True)
df['K'] = 40
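
If the `tournament` column were kept, K could instead vary by match type, as the World Football Elo Ratings description does: 60 for World Cup finals, 50 for continental championships and major intercontinental tournaments, 40 for World Cup and continental qualifiers and major tournaments, 30 for all other tournaments, and 20 for friendlies. A rough sketch, assuming the tournament labels below match those in the dataset; `k_for_tournament` and its label matching are hypothetical and cover only a few cases. The cell above drops `tournament`, so the sketch uses its own small frame.

```python
import pandas as pd

# Hypothetical helper: the weights follow the eloratings.net description,
# but the label matching is an assumption about how tournaments are named.
def k_for_tournament(name):
    name = name.lower()
    if name == 'friendly':
        return 20
    if 'qualification' in name:
        return 40
    if name == 'fifa world cup':
        return 60
    if name in ('uefa euro', 'copa américa', 'african cup of nations', 'afc asian cup'):
        return 50
    return 30  # every other tournament

sample = pd.DataFrame({'tournament': ['Friendly', 'FIFA World Cup',
                                      'FIFA World Cup qualification', 'UEFA Euro']})
sample['K'] = sample['tournament'].map(k_for_tournament)
```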

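G depends only on the goal difference: 1 for a draw or a one-goal win, 1.5 for a two-goal win, and 1.75 plus 1/8 for every goal beyond three for larger wins.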

In [93]:
# First pass: the absolute goal difference (overwritten by calc_G below)
df['G'] = np.abs(df['home_score']-df['away_score'])

In [99]:
def calc_G(row):
    # Goal-difference multiplier: bigger wins move the ratings more
    diff = abs(row['home_score'] - row['away_score'])
    if diff < 2:
        return 1.0
    elif diff < 3:
        return 1.5
    else:
        return 1.75 + (diff - 3)/8

In [108]:
df['G'] = df.apply(calc_G,axis=1)
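
The row-wise `apply` above can be slow on roughly 41,000 rows. A vectorised sketch of the same rule using `np.select`, run here on a small made-up frame; `goal_index` is a hypothetical helper name, not something from the notebook.

```python
import numpy as np
import pandas as pd

def goal_index(home_score, away_score):
    """Return G for each match from the absolute goal difference."""
    diff = (home_score - away_score).abs()
    return np.select(
        [diff < 2, diff == 2],          # draw or one-goal win; two-goal win
        [1.0, 1.5],
        default=1.75 + (diff - 3) / 8,  # three or more goals
    )

demo = pd.DataFrame({'home_score': [0, 4, 2, 3, 0],
                     'away_score': [0, 2, 1, 0, 4]})
demo['G'] = goal_index(demo['home_score'], demo['away_score'])
# G is 1.0, 1.5, 1.0, 1.75 and 1.875 for these five scorelines
```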

In [117]:
df.head()

,date,home_team,away_team,home_score,away_score,neutral,K,G
0,1872-11-30,Scotland,England,0,0,False,40,1.0
1,1873-03-08,England,Scotland,4,2,False,40,1.5
2,1874-03-07,Scotland,England,2,1,False,40,1.0
3,1875-03-06,England,Scotland,2,2,False,40,1.0
4,1876-03-04,Scotland,England,3,0,False,40,1.75


Finally, encode the result from the home team's point of view: 1 for a win, 0.5 for a draw and 0 for a loss.

In [110]:
def calc_result(row):
    if row['home_score'] > row['away_score']:
        result = 1.0
    elif row['home_score'] < row['away_score']:
        result = 0.0
    elif row['home_score'] == row['away_score']:
        result = 0.5
    return result

In [118]:
df['result'] = df.apply(calc_result,axis=1)
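
The same encoding can be computed without `apply`. A small sketch on made-up scores, relying on `np.sign` returning +1, 0 or -1:

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({'home_score': [0, 4, 1], 'away_score': [0, 2, 4]})
# np.sign gives +1, 0 or -1 for a home win, draw or away win,
# which maps onto 1.0, 0.5 and 0.0 respectively
demo['result'] = 0.5 + 0.5 * np.sign(demo['home_score'] - demo['away_score'])
```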

In [120]:
df

,date,home_team,away_team,home_score,away_score,neutral,K,G,result
0,1872-11-30,Scotland,England,0,0,False,40,1.00,0.5
1,1873-03-08,England,Scotland,4,2,False,40,1.50,1.0
2,1874-03-07,Scotland,England,2,1,False,40,1.00,1.0
3,1875-03-06,England,Scotland,2,2,False,40,1.00,0.5
4,1876-03-04,Scotland,England,3,0,False,40,1.75,1.0
...,...,...,...,...,...,...,...,...,...
41581,2020-01-10,Barbados,Canada,1,4,True,40,1.75,0.0
41582,2020-01-12,Kosovo,Sweden,0,1,True,40,1.00,0.0
41583,2020-01-15,Canada,Iceland,0,1,True,40,1.00,0.0
41584,2020-01-19,El Salvador,Iceland,0,1,True,40,1.00,0.0


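H is the home advantage: a fixed number of rating points added to the home team's side of the rating difference whenever the match is not played on neutral ground.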

In [122]:
H = 100
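
For reference, this is how H enters the expected result in `calc_ELO` further down. The symbols $R_h$, $R_a$ and $W_e$ are my own notation for the home rating, away rating and expected home result:

$$
dr = R_h - R_a + H, \qquad W_e = \frac{1}{10^{-dr/400} + 1}
$$

With equal ratings and $H = 100$, $dr = 100$ and $W_e = 1/(10^{-0.25} + 1) \approx 0.640$, so the home side is expected to take about 64% of the result.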

Initialise each team with a rating of 1200.

In [32]:
teams = np.unique(np.concatenate((df['home_team'],df['away_team'])))
team_ratings = pd.Series(1200,index=teams)

In [208]:
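# Rebuild team_ratings as a DataFrame (this replaces the Series created above)
team_ratings = (pd.melt(df, id_vars=['date'], value_vars=['home_team', 'away_team'], value_name='team')
                .drop_duplicates(subset=['team']).sort_values(by=['date']).drop('variable', axis=1))
team_ratings['rating'] = 1200.0  # float, so fractional updates keep the column dtype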

In [209]:
team_ratings = team_ratings.set_index('team').drop('date',axis=1)

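The function below updates both teams' ratings after a match. It reads the current ratings from `team_ratings`, writes the new ones back and returns them, so the rows must be processed in date order.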

In [211]:
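def calc_ELO(row):
    # Current ratings (scalars) for both teams
    hr = team_ratings.loc[row['home_team'], 'rating']
    ar = team_ratings.loc[row['away_team'], 'rating']

    # Rating difference, with home advantage H unless on neutral ground
    dr = hr - ar if row['neutral'] else hr - ar + H

    # Expected result for the home team
    e_result = 1/(10**(-dr/400) + 1)

    new_hr = hr + (row['K'] * row['G'] * (row['result'] - e_result))
    new_ar = ar + (row['K'] * row['G'] * ((1-row['result']) - (1-e_result)))

    # Store the updated ratings so later matches see them
    team_ratings.loc[row['home_team'], 'rating'] = new_hr
    team_ratings.loc[row['away_team'], 'rating'] = new_ar

    return [float(new_hr), float(new_ar)]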
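
Because `calc_ELO` reads and writes the global `team_ratings`, it is awkward to check in isolation. A minimal sketch of the same update as a pure function, assuming the formula used above; `elo_update` and its parameter names are hypothetical and not part of the notebook.

```python
def elo_update(home_rating, away_rating, result, K=40, G=1.0, H=100, neutral=False):
    """Return the (home, away) ratings after one match.

    result is 1.0 for a home win, 0.5 for a draw, 0.0 for an away win.
    """
    dr = home_rating - away_rating + (0 if neutral else H)
    expected = 1 / (10 ** (-dr / 400) + 1)  # expected result for the home team
    change = K * G * (result - expected)    # points gained by the home team
    return home_rating + change, away_rating - change

# First match in the data: Scotland 0-0 England in Glasgow, both starting at 1200
home, away = elo_update(1200, 1200, result=0.5)
print(round(home, 2), round(away, 2))  # 1194.4 1205.6
```

The update is zero-sum: whatever the home team gains, the away team loses.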

In [203]:
# Sanity check: expected home result with equal ratings and H = 100
1/(10**(-(100)/400)+1)

0.6400649998028851

In [195]:
# ...and the corresponding expected result for the away team
1-(1/(10**(-100/400)+1))

0.3599350001971149

In [204]:
# Away team's new rating after a draw between two 1200-rated teams (the first match)
1200 + (40*1.00*((1-0.5) - (1-(1/(10**(-100/400) + 1)))))

1205.6025999921153

In [212]:
df['new_rating'] = df.apply(calc_ELO,axis=1)
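
`df.apply` works here only because the rows are already in date order and `calc_ELO` updates `team_ratings` as a side effect. A sketch of a more explicit alternative, assuming the same formula, that keeps the ratings in a plain dictionary and loops over the matches. The `run_elo` name is hypothetical, and the two-row sample copies the first two matches from the table above.

```python
import pandas as pd

def run_elo(matches, start=1200.0, H=100):
    """Process matches in order; return (final ratings, list of post-match pairs)."""
    ratings, history = {}, []
    for m in matches.itertuples(index=False):
        hr = ratings.get(m.home_team, start)
        ar = ratings.get(m.away_team, start)
        dr = hr - ar + (0 if m.neutral else H)
        expected = 1 / (10 ** (-dr / 400) + 1)      # expected home result
        change = m.K * m.G * (m.result - expected)  # points gained by the home team
        ratings[m.home_team], ratings[m.away_team] = hr + change, ar - change
        history.append((hr + change, ar - change))
    return ratings, history

sample = pd.DataFrame({
    'home_team': ['Scotland', 'England'],
    'away_team': ['England', 'Scotland'],
    'neutral': [False, False],
    'K': [40, 40],
    'G': [1.0, 1.5],
    'result': [0.5, 1.0],
})
ratings, history = run_elo(sample)
```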

In [213]:
df

,date,home_team,away_team,home_score,away_score,neutral,K,G,result,new_rating
0,1872-11-30,Scotland,England,0,0,False,40,1.00,0.5,"[1194.3974000078847, 1205.6025999921153]"
1,1873-03-08,England,Scotland,4,2,False,40,1.50,1.0,"[1226.3153770723034, 1173.6846229276966]"
2,1874-03-07,Scotland,England,2,1,False,40,1.00,1.0,"[1190.9746009192363, 1209.0253990807637]"
3,1875-03-06,England,Scotland,2,2,False,40,1.00,0.5,"[1202.479824069611, 1197.520175930389]"
4,1876-03-04,Scotland,England,3,0,False,40,1.75,1.0,"[1223.1778614597756, 1176.8221385402244]"
...,...,...,...,...,...,...,...,...,...,...
41581,2020-01-10,Barbados,Canada,1,4,True,40,1.75,0.0,"[970.8590689756546, 1433.3147011214617]"
41582,2020-01-12,Kosovo,Sweden,0,1,True,40,1.00,0.0,"[1380.3713756859802, 1610.4515195363688]"
41583,2020-01-15,Canada,Iceland,0,1,True,40,1.00,0.0,"[1413.2826290054743, 1452.7896230400183]"
41584,2020-01-19,El Salvador,Iceland,0,1,True,40,1.00,0.0,"[1361.5380327763637, 1468.508859688179]"
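
Storing both ratings as a list in one column makes them hard to filter or plot. One option is to expand the lists into two columns. A sketch on a two-row stand-in for `df`, with values rounded from the output above; the `home_rating_after` and `away_rating_after` column names are my own.

```python
import pandas as pd

# Stand-in for df: ratings rounded from the first two rows of the output
demo = pd.DataFrame({
    'home_team': ['Scotland', 'England'],
    'away_team': ['England', 'Scotland'],
    'new_rating': [[1194.3974, 1205.6026], [1226.3154, 1173.6846]],
})
# Expand each [home, away] list into its own pair of columns
demo[['home_rating_after', 'away_rating_after']] = pd.DataFrame(
    demo['new_rating'].tolist(), index=demo.index
)
demo = demo.drop(columns='new_rating')
```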


In [214]:
team_ratings.sort_values(by='rating',ascending=False)

team,rating
Brazil,1870.385066
Belgium,1833.370951
Spain,1823.958371
Netherlands,1795.890052
France,1781.256512
...,...
British Virgin Islands,587.880655
Anguilla,580.151117
Northern Mariana Islands,571.368106
East Timor,551.663116
