In this notebook we will optimize the parameters of the rating system.

## Imports

In [1]:
from rating.glicko2_ufc import FighterManager
import pandas as pd
from datetime import datetime
import numpy as np
from scipy.optimize import minimize

## Data

First, load the fight data and sort it chronologically.

In [2]:
fights_df = pd.read_json('data/fights.json').sort_values('date')
fights_df.head()

| | event | date | fighter | opponent | weight_class | outcome | method |
|---|---|---|---|---|---|---|---|
| 8249 | UFC 2: No Way Out | 1994-03-11 | Scott Morris | Sean Daugherty | Open Weight | win | SUB |
| 8235 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Patrick Smith | Open Weight | win | KO/TKO |
| 8236 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Remco Pardoel | Open Weight | win | SUB |
| 8237 | UFC 2: No Way Out | 1994-03-11 | Patrick Smith | Johnny Rhodes | Open Weight | win | SUB |
| 8238 | UFC 2: No Way Out | 1994-03-11 | Royce Gracie | Jason DeLucia | Open Weight | win | SUB |


According to [Wikipedia](https://en.wikipedia.org/wiki/Ultimate_Fighting_Championship), UFC 28 (November 17, 2000) was the first UFC event held under the "Unified Rules of MMA". With minor changes, this is still the ruleset the UFC uses today (2025). For the purposes of this project, we will only consider fights taking place on or after November 17, 2000.

In [3]:
fights_df = fights_df[fights_df['date'] >= datetime(2000, 11, 17)]
fights_df.head()

| | event | date | fighter | opponent | weight_class | outcome | method |
|---|---|---|---|---|---|---|---|
| 7990 | UFC 28: High Stakes | 2000-11-17 | Randy Couture | Kevin Randleman | Heavyweight | win | KO/TKO |
| 7991 | UFC 28: High Stakes | 2000-11-17 | Renato Sobral | Maurice Smith | Heavyweight | win | M-DEC |
| 7992 | UFC 28: High Stakes | 2000-11-17 | Josh Barnett | Gan McGee | Super Heavyweight | win | KO/TKO |
| 7993 | UFC 28: High Stakes | 2000-11-17 | Andrei Arlovski | Aaron Brink | Heavyweight | win | SUB |
| 7995 | UFC 28: High Stakes | 2000-11-17 | Mark Hughes | Alex Stiebling | Middleweight | win | U-DEC |


Further, we will treat fights graded "No Contest" as if they never occurred, and thus exclude them.

In [4]:
fights_df = fights_df[fights_df['outcome'] != 'nc']

## Optimizing Algorithm Parameters

First we group the data into ten-week rating periods. The choice of ten weeks is motivated by two considerations:
1. Rating periods (for Glicko-2) should be long enough that a fighter's ability can reasonably change from one period to the next.
2. An average UFC fighter's fight camp lasts 8-10 weeks.

We therefore assume that ten weeks is enough time for fighters to improve their ability, and that the rating algorithm will capture those changes.

In [5]:
fights_grouped = fights_df.groupby(pd.Grouper(key='date', freq='10W'))
grouped_list = list(fights_grouped)
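
To make the grouping step concrete, here is a minimal sketch on made-up data (the `toy` frame and its fighter names are hypothetical, not part of the dataset). It uses the same `pd.Grouper` call as above and prints the label pandas assigns to each ten-week bin along with the fighters that fall into it. Bins without fights may show up as empty groups, which is presumably why the objective further down skips groups of length zero.

```python
import pandas as pd

# Hypothetical toy data: four fight dates spread over roughly half a year.
toy = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2000-11-17", "2000-12-16", "2001-02-23", "2001-05-04"]
        ),
        "fighter": ["A", "B", "C", "D"],
    }
)

# Same grouping call as in the notebook: each bin spans ten weeks.
toy_groups = list(toy.groupby(pd.Grouper(key="date", freq="10W")))

for label, group in toy_groups:
    # `label` is the timestamp pandas assigns to the bin; the group may be empty.
    print(label.date(), list(group["fighter"]))
```

Every row lands in exactly one bin, so iterating over the list visits each fight once, in chronological order of the bins.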

We want to choose the parameters $\tau$ and $\sigma_0$ to minimize the predictive discrepancy of the rating algorithm. The parameters are:
- $\tau$: scales the change in volatility over time (smaller $\tau$ corresponds to smaller changes in volatility)
- $\sigma_0$: initial volatility value assigned to new fighters in the system

The predictive discrepancy objective $f(\tau, \sigma_0)$ is computed as follows:
- Define $S = \{x_1, \ldots, x_m\}$, the set of all $m$ fighters.
- Define $S_t = \{s_1=(x_i, x_j), s_2=(x_k, x_l), \ldots\}$, the set of pairs of fighters that fought during the $t$-th rating period.
- Define $\hat{p}_t:S_{t+1} \rightarrow (0,1)$, the predicted outcome of the fight between two fighters after the $t$-th rating update. So $\hat{p}_5(x_1, x_2) = 0.7$ means that after the $5$-th rating update, the predicted probability of fighter 1 beating fighter 2 is 70%. Correspondingly, $p(s) \in \{0, 1\}$ is the true binary outcome.
- $f(\tau, \sigma_0) = \sum_{t=n}^{T-1} \sum_{s\in S_{t+1}} L(\hat{p}_t(s),p(s))$

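Here $T$ is the total number of rating periods and $L$ is a loss function, for which we use cross-entropy. The lower limit $n$ is effectively a burn-in for the rating algorithm: the first $n$ periods update the ratings but do not contribute to the loss. We set $n$ to 75% of $T$ (rounded down), so only the remaining 25% of rating periods are scored. Draws are not used in the loss computation.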
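
The predictions $\hat{p}$ come from `FighterManager.get_matchups_matrix`, whose implementation is not shown in this notebook. As a rough illustration only, the sketch below assumes the prediction follows the standard Glicko-2 expected-score formula; the function names `g` and `expected_score` and the example ratings are hypothetical and may differ from what `rating.glicko2_ufc` actually does.

```python
import math

GLICKO2_SCALE = 173.7178  # conversion factor between the Glicko and Glicko-2 scales


def g(phi):
    """Glicko-2 weight that shrinks the rating gap when the opponent's deviation is large."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score against an opponent, with inputs on the Glicko (1500-centred) scale."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_opp = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_opp = opp_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_opp) * (mu - mu_opp)))


# A 1700-rated fighter against a 1500-rated opponent with a large rating deviation.
phat = expected_score(rating=1700.0, opp_rating=1500.0, opp_rd=350.0)
print(round(phat, 3))
```

Under this assumption, equal ratings give a probability of 0.5, and a larger opponent deviation pulls the prediction towards 0.5.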

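Additionally, backtesting accuracy can be computed as the fraction of evaluated fights in which the predicted favourite won: $\frac{1}{N}\sum_{t=n}^{T-1} \sum_{s\in S_{t+1}} 1\{1\{\hat{p}_t(s) > 0.5\} = p(s)\}$, where $N$ is the total number of fights in the evaluated rating periods.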
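
As a small worked example of the two metrics, the sketch below scores three made-up predictions (the values of `phat` and `p` are hypothetical) with the summed cross-entropy loss and with the thresholded accuracy.

```python
import numpy as np

# Hypothetical predictions for three fights and their outcomes (1 = first fighter won).
phat = np.array([0.7, 0.4, 0.3])
p = np.array([1, 0, 1])

# Summed cross-entropy: -(ln 0.7 + ln 0.6 + ln 0.3)
loss = -(p * np.log(phat) + (1 - p) * np.log(1 - phat)).sum()

# A prediction counts as correct when the favoured side (phat > 0.5) matches the outcome.
acc = ((phat > 0.5) == p.astype(bool)).mean()

print(f"loss = {loss:.4f}, accuracy = {acc:.3f}")  # loss = 2.0715, accuracy = 0.667
```

The third fight is the only miss, and it also dominates the loss, because cross-entropy penalizes confident wrong predictions more heavily than accuracy does.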

In [None]:
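# loss function: summed binary cross-entropy between predictions and outcomes
def cross_entropy_loss(phat, p):
    # cast so that boolean outcomes behave as 0/1 in the arithmetic below
    p = np.asarray(p, dtype=float)
    return -(p*np.log(phat) + (1-p)*np.log(1-phat)).sum()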


# objective implementation
def f(tau, sigma0, burnin=0.75, loss=cross_entropy_loss):
    manager = FighterManager(tau=tau, volatility=sigma0)
    # burn-in rating updates
    T = len(grouped_list)
    n = int(T*burnin)
    for period, group in grouped_list[:n]:
        timestamp = period.strftime('%Y-%m-%d')
        manager.update_fighters(timestamp, group)
    # loss calculation
    total_loss = 0
    for period, group in grouped_list[n:]:
        if len(group) == 0:
            continue
        non_draw = group[group['outcome'] != 'draw']
        p = non_draw['outcome'] == 'win'
        fighters = non_draw['fighter']
        opponents = non_draw['opponent']
        competitors = pd.concat([fighters, opponents]).unique()
        manager.add_fighters(competitors)
        matchups_matrix = manager.get_matchups_matrix(competitors)
        phat = np.array(
            [
                matchups_matrix.loc[fighter, opponent]
                for fighter, opponent in zip(fighters, opponents)
            ]
        )
        total_loss += loss(phat, p)
        timestamp = period.strftime('%Y-%m-%d')
        manager.update_fighters(timestamp, group)
    return total_loss


# accuracy implementation
def accuracy(tau, sigma0, burnin=0.75):
    manager = FighterManager(tau=tau, volatility=sigma0)
    # burn-in rating updates
    T = len(grouped_list)
    n = int(T*burnin)
    for period, group in grouped_list[:n]:
        timestamp = period.strftime('%Y-%m-%d')
        manager.update_fighters(timestamp, group)
    # accuracy calculation
    total_correct = total_eval = 0
    for period, group in grouped_list[n:]:
        if len(group) == 0:
            continue
        non_draw = group[group['outcome'] != 'draw']
        p = non_draw['outcome'] == 'win'
        fighters = non_draw['fighter']
        opponents = non_draw['opponent']
        competitors = pd.concat([fighters, opponents]).unique()
        manager.add_fighters(competitors)
        matchups_matrix = manager.get_matchups_matrix(competitors)
        phat = np.array(
            [
                matchups_matrix.loc[fighter, opponent]
                for fighter, opponent in zip(fighters, opponents)
            ]
        )
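        # a prediction is correct when the fighter favoured by phat actually won
        total_correct += ((phat > 0.5) == p.to_numpy()).sum()
        # NOTE: len(group) also counts draws, so draws are treated as misses
        total_eval += len(group)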
        timestamp = period.strftime('%Y-%m-%d')
        manager.update_fighters(timestamp, group)
    return total_correct/total_eval
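
Both functions above follow the same walk-forward pattern: burn in on the early periods, then for each later period predict first and update afterwards. The sketch below isolates that pattern with a deliberately simple Elo-style model. `ToyElo`, `walk_forward_loss` and the toy periods are hypothetical stand-ins for illustration, not the notebook's `FighterManager` or its Glicko-2 updates.

```python
import math


class ToyElo:
    """Hypothetical stand-in for a rating model; not the notebook's FighterManager."""

    def __init__(self, k=32.0):
        self.k = k
        self.ratings = {}

    def predict(self, a, b):
        ra = self.ratings.get(a, 1500.0)
        rb = self.ratings.get(b, 1500.0)
        return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

    def update(self, fights):
        for a, b, a_won in fights:
            pa = self.predict(a, b)
            delta = self.k * ((1.0 if a_won else 0.0) - pa)
            self.ratings[a] = self.ratings.get(a, 1500.0) + delta
            self.ratings[b] = self.ratings.get(b, 1500.0) - delta


def walk_forward_loss(periods, model, burnin=0.75):
    """Burn in on the first periods, then score each later period before updating on it."""
    n = int(len(periods) * burnin)
    for fights in periods[:n]:
        model.update(fights)
    total = 0.0
    for fights in periods[n:]:
        for a, b, a_won in fights:
            phat = model.predict(a, b)  # uses only information from earlier periods
            total -= math.log(phat if a_won else 1.0 - phat)
        model.update(fights)  # only now does the model see this period's results
    return total


# Four toy periods of (fighter, opponent, fighter_won) tuples.
periods = [
    [("A", "B", True)],
    [("A", "C", True)],
    [("B", "C", False)],
    [("A", "B", True)],
]
toy_loss = walk_forward_loss(periods, ToyElo())
print(toy_loss)
```

Predicting before updating is what keeps the evaluation honest: no fight is scored with ratings that already include its own result.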

Our objective is quite expensive to evaluate and has no closed-form gradient, so we will use the derivative-free Nelder-Mead method to optimize it.

In [7]:
minimize(
    fun=lambda x: f(tau=x[0], sigma0=x[1]),
    x0=[1, 0.25],
    method='Nelder-Mead',
    bounds=[(0.001, 2), (0.001, 0.5)]
)

       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 2137.691467989795
             x: [ 1.311e+00  2.956e-01]
           nit: 39
          nfev: 81
 final_simplex: (array([[ 1.311e+00,  2.956e-01],
                       [ 1.311e+00,  2.956e-01],
                       [ 1.311e+00,  2.956e-01]]), array([ 2.138e+03,  2.138e+03,  2.138e+03]))
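
Because every evaluation of `f` replays the full rating history, it can help to try out the optimizer settings on a cheap surrogate first. The sketch below runs the same bounded Nelder-Mead call on a hypothetical quadratic, `toy_objective`, whose minimum is placed at (1.3, 0.3) purely for illustration. The `xatol`, `fatol` and `maxfev` values are example choices, not tuned recommendations.

```python
import numpy as np
from scipy.optimize import minimize

calls = {"count": 0}


def toy_objective(x):
    """Hypothetical cheap stand-in for f(tau, sigma0), with its minimum at (1.3, 0.3)."""
    calls["count"] += 1
    tau, sigma0 = x
    return (tau - 1.3) ** 2 + 10.0 * (sigma0 - 0.3) ** 2


result = minimize(
    fun=toy_objective,
    x0=[1.0, 0.25],
    method="Nelder-Mead",
    bounds=[(0.001, 2), (0.001, 0.5)],
    # Looser tolerances and an evaluation cap limit how often the objective is called.
    options={"xatol": 1e-3, "fatol": 1e-3, "maxfev": 200},
)

print(result.x, result.nfev, calls["count"])
```

Nelder-Mead only finds a local minimum, so rerunning from a few different `x0` values is a cheap way to check that the reported parameters are stable.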

In [10]:
accuracy(1.311, 0.2956)

0.5552362707535121

Nelder-Mead converges to $\tau \approx 1.311$ and $\sigma_0 \approx 0.2956$, which give a backtesting accuracy of about 55.5% on the evaluated rating periods. Finally, we can take a look at the ratings!

In [12]:
manager = FighterManager(volatility=0.2956, tau=1.311)
for period, group in fights_grouped:
    timestamp = period.strftime('%Y-%m-%d')
    manager.update_fighters(timestamp, group)

# one row per fighter known to the manager
ratings_df = pd.DataFrame(
    {
        'name': name,
        'weight_class': fighter.weight_class,
        'current_rating': fighter.rating,
        'peak_rating': fighter.peak_rating,
        'current_streak': fighter.streak,
        'best_streak': fighter.best_streak,
    }
    for name, fighter in manager.items()
)

ratings_df.sort_values('peak_rating', ascending=False)

| | name | weight_class | current_rating | peak_rating | current_streak | best_streak |
|---|---|---|---|---|---|---|
| 416 | Jon Jones | Heavyweight | 2845.416300 | 2845.416300 | 19 | 19 |
| 1245 | Islam Makhachev | Lightweight | 2753.456801 | 2753.456801 | 15 | 15 |
| 750 | Khabib Nurmagomedov | Lightweight | 2691.468920 | 2691.468920 | 13 | 13 |
| 1171 | Leon Edwards | Welterweight | 2320.905044 | 2690.433173 | -2 | 12 |
| 697 | Stipe Miocic | Heavyweight | 2414.368432 | 2690.282192 | -2 | 6 |
| ... | ... | ... | ... | ... | ... | ... |
| 1949 | Gloria de Paula | Women's Strawweight | 1140.437373 | 1152.740438 | -1 | 1 |
| 547 | Ronys Torres | Lightweight | 1150.258032 | 1150.258032 | -2 | 0 |
| 1392 | Chris Avila | Featherweight | 1079.050385 | 1124.208036 | -2 | 0 |
| 1744 | Sung Bin Jo | Featherweight | 1107.148102 | 1107.148102 | -1 | 0 |
