In this notebook, we optimize the parameters of the Glicko-2 rating system.

## Imports

In [2]:
from scraping import ufcstats
from glicko2 import Fighter, FighterManager
import pandas as pd
from datetime import datetime

## Data

In [2]:
fights = ufcstats.get_completed_fights(latest_n_events=None)

In [3]:
fights_df = pd.DataFrame(fights).sort_values('date')
fights_df.head()

| | event | date | fighter | opponent | weight_class | outcome | method |
|---|---|---|---|---|---|---|---|
| 8179 | UFC 2: No Way Out | 1994-03-11 | Scott Morris | Sean Daugherty | Open Weight | win | SUB |
| 8165 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Patrick Smith | Open Weight | win | KO/TKO |
| 8166 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Remco Pardoel | Open Weight | win | SUB |
| 8167 | UFC 2: No Way Out | 1994-03-11 | Patrick Smith | Johnny Rhodes | Open Weight | win | SUB |
| 8168 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Jason DeLucia | Open Weight | win | SUB |


According to [Wikipedia](https://en.wikipedia.org/wiki/Ultimate_Fighting_Championship), UFC 28 (November 17, 2000) was the first UFC event held under the "Unified Rules of MMA". With some amendments, this is still the ruleset the UFC uses today (2025). For this project, we therefore consider only fights that took place on or after November 17, 2000.

In [4]:
fights_df = fights_df[fights_df['date'] >= datetime(2000, 11, 17)]
fights_df.head()

| | event | date | fighter | opponent | weight_class | outcome | method |
|---|---|---|---|---|---|---|---|
| 7926 | UFC 28: High Stakes | 2000-11-17 | Ben Earwood | Chris Lytle | Welterweight | win | U-DEC |
| 7925 | UFC 28: High Stakes | 2000-11-17 | Mark Hughes | Alex Stiebling | Middleweight | win | U-DEC |
| 7924 | UFC 28: High Stakes | 2000-11-17 | Jens Pulver | John Lewis | Lightweight | win | KO/TKO |
| 7923 | UFC 28: High Stakes | 2000-11-17 | Andrei Arlovski | Aaron Brink | Heavyweight | win | SUB |
| 7922 | UFC 28: High Stakes | 2000-11-17 | Josh Barnett | Gan McGee | Super Heavyweight | win | KO/TKO |


Further, we treat fights ruled a "No Contest" as if they never occurred, and therefore exclude them.

In [5]:
fights_df = fights_df[fights_df['outcome'] != 'nc']

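The following cell duplicates the entire dataframe, swapping the 'fighter' and 'opponent' values and changing 'win' to 'loss' in 'outcome', so that each fight is represented by two rows, one from each fighter's perspective (i.e. one winner and one loser). Any other 'outcome' value is carried over unchanged.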

In [6]:
print('Shape before:', fights_df.shape)

copy_df = fights_df.copy()
copy_df[['fighter', 'opponent']] = copy_df[['opponent', 'fighter']]
copy_df['outcome'] = copy_df['outcome'].replace('win', 'loss')
fights_df = pd.concat([fights_df, copy_df], ignore_index=True)

print('Shape after:', fights_df.shape)

Shape before: (7842, 7)
Shape after: (15684, 7)
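
As a quick sanity check of the mirroring step, the sketch below applies the same three operations to a hypothetical two-fight frame (the fighter names and the 'draw' row are made up for illustration and are not taken from the scraped data). It suggests that only 'win' is flipped, so a non-win outcome such as a draw is copied as-is into the mirrored row.

```python
import pandas as pd

# Hypothetical toy data: one decisive fight and one draw
toy = pd.DataFrame({
    'fighter':  ['A', 'C'],
    'opponent': ['B', 'D'],
    'outcome':  ['win', 'draw'],
})

# Same steps as the cell above: copy, swap the two name columns, flip 'win'
mirrored = toy.copy()
mirrored[['fighter', 'opponent']] = mirrored[['opponent', 'fighter']]
mirrored['outcome'] = mirrored['outcome'].replace('win', 'loss')

# Stack the original and mirrored rows: two rows per fight
both = pd.concat([toy, mirrored], ignore_index=True)
print(both)
```

After this step, every fighter appears exactly once per fight in the 'fighter' column, which is what lets a per-fighter update read results from a single column.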


In [7]:
fights_df.to_csv('fights.csv', index=False)

## Optimizing Algorithm Parameters

In [3]:
fights_df = pd.read_csv('fights.csv', parse_dates=['date'])

In [4]:
# Bucket fights into calendar weeks; each weekly group is processed as one batch below
weekly_grouped = fights_df.groupby(pd.Grouper(key='date', freq='W'))
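
To make the weekly bucketing concrete, here is a minimal sketch on a hypothetical three-row frame (the rows are invented for illustration). In pandas, `freq='W'` is an alias for `'W-SUN'`, so each bin ends on a Sunday and is labelled with that Sunday's date. Treating each bin as one rating period is an assumption about how `FighterManager.update_fighters` is meant to be used.

```python
import pandas as pd

# Hypothetical rows: two fights on Friday 2000-11-17, one on Friday 2000-11-24
toy = pd.DataFrame({
    'date': pd.to_datetime(['2000-11-17', '2000-11-17', '2000-11-24']),
    'fighter': ['A', 'B', 'C'],
})

# Count rows per weekly bin; the index holds the Sunday that closes each bin
sizes = toy.groupby(pd.Grouper(key='date', freq='W')).size()
print(sizes)
```

Both fights from the same card fall into the bin ending on Sunday 2000-11-19, and the following Friday's fight falls into the next bin.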

In [None]:
# Start from tau=0.01 and replay the fight history one week at a time
manager = FighterManager(tau=0.01)

for week, group in weekly_grouped:
    manager.update_fighters(group)
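
The loop above fixes `tau` at a single value. One possible way to compare candidate values, sketched here as an assumption rather than the method used in this project, is to score each pre-fight prediction with log loss. The expected-score formula is the standard Glicko-2 one; the helper names, the 1500 baseline, and the way ratings and rating deviations would be read from `FighterManager` are not shown in this notebook and are hypothetical.

```python
import math

GLICKO2_SCALE = 173.7178  # converts rating points to the internal Glicko-2 scale


def g(phi):
    """Glicko-2 weighting that shrinks rating gaps when the opponent's RD is large."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score of a fighter against an opponent (hypothetical helper)."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


def log_loss(predictions, outcomes, eps=1e-12):
    """Mean binary log loss; outcomes are 1.0 for a win and 0.0 for a loss."""
    total = 0.0
    for p, y in zip(predictions, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(outcomes)


# Example: a 1700-rated fighter facing a 1500-rated opponent with RD 50
p_win = expected_score(1700, 1500, 50)
print(p_win, log_loss([p_win], [1.0]))
```

A lower average log loss over the replayed history would suggest a better-calibrated `tau`. Predictions would need to be recorded before each week's update so that a fight never informs its own forecast.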

In [6]:
fighters_df = pd.DataFrame({'name': name,
                            'current_rating': fighter.rating,
                            'peak_rating': fighter.peak_rating,
                            'current_streak': fighter.streak,
                            'best_streak': fighter.best_streak}
                            for name, fighter in manager.items())

fighters_df.sort_values('peak_rating', ascending=False)

| | name | current_rating | peak_rating | current_streak | best_streak |
|---|---|---|---|---|---|
| 411 | Jon Jones | 2688.989204 | 2688.989204 | 19 | 19 |
| 1250 | Islam Makhachev | 2575.400773 | 2575.400773 | 15 | 15 |
| 233 | Anderson Silva | 1813.429173 | 2567.806775 | -3 | 16 |
| 894 | Daniel Cormier | 2381.572290 | 2552.493424 | -2 | 7 |
| 1268 | Kamaru Usman | 2308.233734 | 2550.243143 | -3 | 15 |
| ... | ... | ... | ... | ... | ... |
| 1340 | Kelly Faszholz | 1112.516347 | 1143.595158 | -2 | 0 |
| 1391 | Chris Avila | 1080.699434 | 1135.447783 | -2 | 0 |
| 2397 | Daniel Frunza | 1128.868266 | 1128.868266 | -1 | 0 |
| 1744 | Sung Bin Jo | 1128.394578 | 1128.394578 | -1 | 0 |
