In this notebook, we will optimize the parameters of the rating system.

## Imports

In [1]:
from scraping import ufcstats
from glicko2 import Fighter, FighterManager
import pandas as pd
from datetime import datetime

## Data

In [34]:
fights = ufcstats.get_completed_fights(latest_n_events=None)

In [35]:
fights_df = pd.DataFrame(fights).sort_values('date')
fights_df.head()

Unnamed: 0,event,date,fighter,opponent,weight_class,outcome,method
8057,UFC 2: No Way Out,1994-03-11,Scott Morris,Sean Daugherty,Open Weight,win,SUB
8043,UFC 2: No Way Out,1994-03-11,Royce Gracie,Patrick Smith,Open Weight,win,KO/TKO
8044,UFC 2: No Way Out,1994-03-11,Royce Gracie,Remco Pardoel,Open Weight,win,SUB
8045,UFC 2: No Way Out,1994-03-11,Patrick Smith,Johnny Rhodes,Open Weight,win,SUB
8046,UFC 2: No Way Out,1994-03-11,Royce Gracie,Jason DeLucia,Open Weight,win,SUB


According to [Wikipedia](https://en.wikipedia.org/wiki/Ultimate_Fighting_Championship), UFC 28 (November 17, 2000) was the first UFC event held under the "Unified Rules of MMA". With some amendments, this is still the ruleset the UFC uses today (2025). For the purposes of this project, we will therefore only consider fights that took place on or after November 17, 2000.

In [36]:
fights_df = fights_df[fights_df['date'] >= datetime(2000, 11, 17)]
fights_df.head()

Unnamed: 0,event,date,fighter,opponent,weight_class,outcome,method
7798,UFC 28: High Stakes,2000-11-17,Randy Couture,Kevin Randleman,Heavyweight,win,KO/TKO
7800,UFC 28: High Stakes,2000-11-17,Josh Barnett,Gan McGee,Super Heavyweight,win,KO/TKO
7799,UFC 28: High Stakes,2000-11-17,Renato Sobral,Maurice Smith,Heavyweight,win,M-DEC
7802,UFC 28: High Stakes,2000-11-17,Jens Pulver,John Lewis,Lightweight,win,KO/TKO
7804,UFC 28: High Stakes,2000-11-17,Ben Earwood,Chris Lytle,Welterweight,win,U-DEC


Further, we will treat fights ruled a "No Contest" as if they never occurred, and thus exclude them.

In [37]:
fights_df = fights_df[fights_df['outcome'] != 'nc']

The following duplicates the entire dataframe, swapping the 'fighter' and 'opponent' values and flipping the 'outcome', so that each fight is represented by two rows (one from the winner's side and one from the loser's).

In [38]:
print('Shape before:', fights_df.shape)

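# Mirror every fight so that each bout also has a row from the opponent's side.
copy_df = fights_df.copy()
# .to_numpy() drops the column labels, so the swap is purely positional.
copy_df[['fighter', 'opponent']] = fights_df[['opponent', 'fighter']].to_numpy()
# Flip the result for the mirrored rows; any other outcome value is left unchanged.
copy_df['outcome'] = copy_df['outcome'].replace({'win': 'loss', 'loss': 'win'})
fights_df = pd.concat([fights_df, copy_df], ignore_index=True)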

print('Shape after:', fights_df.shape)

Shape before: (7720, 7)
Shape after: (15440, 7)
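
The doubling of the row count is what we expect, but it does not by itself show that the names and outcomes were flipped correctly. A minimal sketch of the same mirroring step on a toy frame (the fighters `A` to `D` and the variable names are made up for illustration) makes the result easy to check by eye:

```python
import pandas as pd

# Toy data: two fights, each recorded once from the winner's side.
toy = pd.DataFrame({
    'fighter':  ['A', 'C'],
    'opponent': ['B', 'D'],
    'outcome':  ['win', 'win'],
})

mirror = toy.copy()
# Swap the name columns positionally and flip the result.
mirror[['fighter', 'opponent']] = toy[['opponent', 'fighter']].to_numpy()
mirror['outcome'] = mirror['outcome'].replace({'win': 'loss', 'loss': 'win'})

both = pd.concat([toy, mirror], ignore_index=True)

# Wins and losses should balance, and every name should appear
# exactly once in the 'fighter' column.
outcome_counts = both['outcome'].value_counts().to_dict()
print(both)
print(outcome_counts)
```

The same two checks (balanced win/loss counts, and each fight appearing once per participant) could be run against `fights_df` itself before it is written to disk.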


In [39]:
fights_df.to_csv('fights.csv', index=False)

## Optimizing Algorithm Parameters

In [44]:
fights_df = pd.read_csv('fights.csv', parse_dates=['date'])

In [46]:
weekly_grouped = fights_df.groupby(pd.Grouper(key='date', freq='W'))
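
Grouping with `freq='W'` buckets rows into calendar weeks ending on Sunday, and it also produces a bucket for every week in between, even when no event took place. The sketch below uses a small toy frame (the dates and names are illustrative) to show this. Whether empty weeks matter depends on how `FighterManager.update_fighters` treats a period without fights, which is not shown in this notebook.

```python
import pandas as pd

# Toy data: two rows on one Friday, one row four weeks later.
toy = pd.DataFrame({
    'date': pd.to_datetime(['2000-11-17', '2000-11-17', '2000-12-16']),
    'fighter': ['A', 'B', 'C'],
})

# 'W' is an alias for weeks ending on Sunday; each bucket is labelled
# with the Sunday that closes it.
sizes = toy.groupby(pd.Grouper(key='date', freq='W')).size()
print(sizes)
```

Weeks without any rows show up with a size of 0, so a loop over the grouped object may receive empty groups as well.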

In [47]:
manager = FighterManager()

for week, group in weekly_grouped:
    manager.update_fighters(group)
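
The update logic lives in the local `glicko2` module, which is not shown here. For orientation, the sketch below implements the expected-score formula from Glickman's published Glicko-2 description. It is independent of `Fighter` and `FighterManager`; the function names are illustrative, and the module used in this notebook may differ in its details.

```python
import math

# Conversion factor between the familiar rating scale and the
# internal Glicko-2 scale, as given in Glickman's description.
GLICKO2_SCALE = 173.7178


def g(phi):
    """Dampen the rating difference according to the opponent's uncertainty."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score (between 0 and 1) against a single opponent."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


# Equal ratings give an even expectation; a higher rating gives more than 0.5.
print(expected_score(1500, 1500, 350))
print(expected_score(1700, 1500, 50) > 0.5)
```

A larger opponent rating deviation pulls the expected score toward 0.5, which is how the system expresses uncertainty about fighters with few recorded fights.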

In [62]:
# One row per fighter, summarizing current and peak performance.
fighters_df = pd.DataFrame([
    {'name': name,
     'current_rating': fighter.rating,
     'peak_rating': fighter.peak_rating,
     'current_streak': fighter.streak,
     'best_streak': fighter.best_streak}
    for name, fighter in manager.items()
])

fighters_df.sort_values('peak_rating', ascending=False)

Unnamed: 0,name,current_rating,peak_rating,current_streak,best_streak
411,Jon Jones,2688.224146,2688.224146,19,19
1250,Islam Makhachev,2575.235694,2575.235694,15,15
233,Anderson Silva,1812.809997,2565.092927,-3,16
894,Daniel Cormier,2381.210010,2551.882260,-2,7
1268,Kamaru Usman,2308.189107,2550.191223,-3,15
...,...,...,...,...,...
1468,Rashad Coulter,1150.744736,1150.744736,1,1
1340,Kelly Faszholz,1112.516347,1143.595158,-2,0
1391,Chris Avila,1080.712039,1135.460146,-2,0
1743,Sung Bin Jo,1128.387826,1128.387826,-1,0
