# 🔹 UFC Fight Predictor Model Training

<div style="text-align: center;">
  🔹 <img src="../img/ufc_logo.png" width="50" /> 🔹
</div>

# Import Libraries and Set Up Environment

In [1]:
# Import necessary libraries
import os
import sys
import pandas as pd
from autogluon.tabular import TabularDataset, TabularPredictor
# Show up to 200 characters per cell when displaying DataFrames
pd.set_option('display.max_colwidth', 200)

# Get the current working directory
current_dir = os.getcwd()

# Navigate to the project root
project_root = os.path.abspath(os.path.join(current_dir, '..'))

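# Make the project root importable, then import the project helpers from /src
# (load_data and logger, used in the cells below, come from these wildcard imports)
sys.path.append(project_root)
from src.io_model import *
from src.config import *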

<div style="text-align: center;">
  🔹 <img src="../img/ufc_logo.png" width="50" /> 🔹
</div>

# Load Data

In [2]:
# Load the UFCData object
try:
    ufc_data = load_data(name='ufc_data')
    logger.info("✅ UFCData object loaded successfully.")
except Exception as e:
    logger.error(f"❌ Error loading UFC data: {e}")
    raise  # stop here: every later cell needs ufc_data

In [3]:
ufc_data

📊 UFC Dataset Summary
----------------------------------------
🧪 Total samples      : 5548
🧪 Train/Test split  : 4438 / 1110
🧪 Total features     : 227

🔢 Numerical features : 224
🔠 Categorical features: 3
    - Binary          : 0
    - Multiclass      : 3

🏷 Label distribution (raw):
   - Class 0: 3363 (60.6%)
   - Class 1: 2185 (39.4%)

✅ No missing values detected

📈 Feature summary statistics (train set):
                               mean      std      min      max
title_fight                   0.058    0.234    0.000    1.000
r_height                    178.277    9.133  152.400  210.820
r_reach                     183.063   11.056  147.320  213.360
b_height                    178.279    8.915  152.400  210.820
b_reach                     182.886   10.722  147.320  213.360
r_wins                        5.289    4.336    0.000   27.000
r_losses                      2.789    2.654    0.000   18.000
r_total_fights                8.140    6.488    1.000   44.000
r_current_win_strea
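
The label distribution above is imbalanced (60.6% vs 39.4%), so a model that always predicts the majority class already scores about 61% accuracy. The sketch below recomputes that baseline and the split ratio from the counts printed in the summary. The numbers are copied by hand from the output above, not read from `ufc_data`, and the summary does not say which corner each class corresponds to.

```python
# Counts copied from the dataset summary above (not read from ufc_data)
label_counts = {0: 3363, 1: 2185}
n_train, n_test = 4438, 1110

total = sum(label_counts.values())

# Accuracy of a trivial model that always predicts the most frequent class
majority_class = max(label_counts, key=label_counts.get)
baseline_accuracy = label_counts[majority_class] / total

# Share of all samples held out for testing
test_fraction = n_test / (n_train + n_test)

print(f"majority class   : {majority_class}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"test fraction    : {test_fraction:.3f}")
```

Any accuracy reported later in the notebook is best read against this baseline rather than against 50%.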

# 🔧 Hyperparameter Tuning 🔧

<div style="text-align: center;">
  🔹 <img src="../img/ufc_logo.png" width="50" /> 🔹
</div>

# 🔹 UFC Machine Learning Training

### 🚀 AutoGluon Training 

In [4]:
train_data = ufc_data.get_df_train()
train_data

index,division,title_fight,r_height,r_reach,r_stance,b_height,b_reach,b_stance,r_wins,r_losses,...,height_reach_ratio_dif,r_finish_rate,r_win_ratio,r_ko_per_fight,r_sub_per_fight,b_finish_rate,b_win_ratio,b_ko_per_fight,b_sub_per_fight,label
1318,heavyweight,0,190.50,200.66,Southpaw,193.04,203.20,Orthodox,14,8,...,0.001,0.786,0.636,0.136,0.364,1.000,0.333,0.333,0.000,1
4437,middleweight,0,190.50,180.34,Orthodox,185.42,193.04,Orthodox,1,1,...,-0.095,0.000,0.500,0.000,0.000,0.000,0.000,0.000,0.000,0
428,lightweight,0,172.72,177.80,Southpaw,175.26,177.80,Orthodox,0,2,...,0.015,0.000,0.000,0.000,0.000,0.500,0.667,0.333,0.000,0
1758,featherweight,0,172.72,175.26,Orthodox,172.72,172.72,Orthodox,2,2,...,0.014,0.000,0.500,0.000,0.000,0.000,1.000,0.000,0.000,0
5423,bantamweight,0,170.18,172.72,Switch,167.64,167.64,Orthodox,3,2,...,0.015,0.333,0.600,0.200,0.000,0.000,0.000,0.000,0.000,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3145,women,0,175.26,172.72,Orthodox,167.64,165.10,Orthodox,4,2,...,0.000,0.000,0.667,0.000,0.000,0.400,0.625,0.125,0.125,0
3633,light heavyweight,0,187.96,187.96,Orthodox,190.50,198.12,Switch,3,1,...,-0.038,1.000,0.750,0.250,0.500,1.000,1.000,1.000,0.000,0
5081,light heavyweight,0,193.04,195.58,Southpaw,190.50,193.04,Orthodox,6,4,...,0.000,0.667,0.600,0.300,0.100,0.571,0.538,0.308,0.000,0
4688,welterweight,0,190.50,203.20,Orthodox,190.50,203.20,Orthodox,20,9,...,0.000,0.350,0.690,0.172,0.069,1.000,0.750,0.750,0.000,0


In [5]:
# Train with AutoGluon defaults; models are saved under ../models
predictor = TabularPredictor(label='label', path='../models').fit(train_data)

Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.3.1
Python Version:     3.10.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jul 17 17:57:16 UTC 2025
CPU Count:          24
Memory Avail:       9.01 GB / 15.51 GB (58.1%)
Disk Space Avail:   191.58 GB / 236.89 GB (80.9%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with 
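
The log above notes that no preset was specified and that AutoGluon fell back to `'medium'`. A minimal sketch of passing a preset and a time budget to `fit` might look like the following. The helper names `build_fit_kwargs` and `train`, the choice of `'best'`, and the 600-second limit are illustrative assumptions, not settings used in this notebook.

```python
def build_fit_kwargs(presets="best", time_limit=600):
    """Collect keyword arguments for TabularPredictor.fit.

    Both defaults are illustrative: 'best' is one of the presets listed in
    the log above, and time_limit is a training budget in seconds.
    """
    return {"presets": presets, "time_limit": time_limit}


def train(train_data, label="label", path="../models", **fit_kwargs):
    """Fit a TabularPredictor with the given fit options (hypothetical helper)."""
    # Imported lazily so build_fit_kwargs works without AutoGluon installed
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label=label, path=path)
    return predictor.fit(train_data, **fit_kwargs)


fit_kwargs = build_fit_kwargs()
print(fit_kwargs)
# With AutoGluon installed and train_data loaded, one would then call:
# predictor = train(train_data, **fit_kwargs)
```

Higher-quality presets usually train for much longer, so a time limit is a reasonable companion setting.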

In [6]:
test_data = ufc_data.get_df_test()
test_data

index,division,title_fight,r_height,r_reach,r_stance,b_height,b_reach,b_stance,r_wins,r_losses,...,height_reach_ratio_dif,r_finish_rate,r_win_ratio,r_ko_per_fight,r_sub_per_fight,b_finish_rate,b_win_ratio,b_ko_per_fight,b_sub_per_fight,label
119,heavyweight,0,190.50,198.12,Orthodox,190.50,200.66,Southpaw,3,0,...,-0.013,1.000,1.000,0.667,0.333,0.714,0.778,0.111,0.444,0
869,lightweight,0,177.80,182.88,Orthodox,177.80,180.34,Orthodox,1,1,...,0.014,0.000,0.500,0.000,0.000,1.000,0.727,0.182,0.545,0
2626,lightweight,0,182.88,190.50,Switch,177.80,185.42,Orthodox,4,3,...,-0.001,1.000,0.571,0.429,0.143,0.667,0.750,0.500,0.000,0
4406,middleweight,0,190.50,203.20,Orthodox,190.50,190.50,Orthodox,2,0,...,0.062,1.000,1.000,1.000,0.000,0.667,0.750,0.500,0.000,1
3080,bantamweight,0,167.64,165.10,Orthodox,167.64,165.10,Orthodox,3,1,...,0.000,0.667,0.750,0.250,0.250,0.500,0.667,0.000,0.333,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3057,women,0,162.56,157.48,Orthodox,170.18,165.10,Orthodox,0,1,...,-0.001,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0
508,welterweight,0,182.88,193.04,Southpaw,182.88,180.34,Orthodox,6,3,...,0.067,0.833,0.667,0.000,0.556,1.000,0.500,0.500,0.000,0
4032,light heavyweight,0,187.96,193.04,Orthodox,182.88,187.96,Southpaw,1,0,...,-0.001,0.000,1.000,0.000,0.000,0.667,0.600,0.400,0.000,1
4809,featherweight,0,175.26,182.88,Orthodox,175.26,175.26,Switch,7,3,...,0.042,0.429,0.700,0.200,0.100,0.600,0.455,0.182,0.091,1


In [10]:
predictor.evaluate(test_data)

{'accuracy': 0.6594594594594595,
 'balanced_accuracy': 0.6120414415455915,
 'mcc': 0.2517830609116573,
 'roc_auc': 0.664319400478067,
 'f1': 0.4735376044568245,
 'precision': 0.604982206405694,
 'recall': 0.3890160183066362}
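
The metrics above are all functions of one confusion matrix, so the matrix can be recovered from the accuracy, precision, recall, and the test set size (1110). The sketch below does that algebraically and then recomputes F1 and balanced accuracy as a consistency check. It assumes class 1 is the positive class; `confusion_from_metrics` is a helper written for this illustration, not part of AutoGluon.

```python
def confusion_from_metrics(n, accuracy, precision, recall):
    """Recover (tp, fp, fn, tn) from n samples and three binary metrics."""
    # FP = TP * (1/precision - 1) and FN = TP * (1/recall - 1).
    # The misclassified count is FP + FN = n * (1 - accuracy), so:
    tp = n * (1 - accuracy) / (1 / precision + 1 / recall - 2)
    fp = tp * (1 / precision - 1)
    fn = tp * (1 / recall - 1)
    tn = accuracy * n - tp
    # Counts are integers; rounding removes floating-point noise
    return tuple(round(x) for x in (tp, fp, fn, tn))


# Values copied from the predictor.evaluate(test_data) output above
tp, fp, fn, tn = confusion_from_metrics(
    n=1110,
    accuracy=0.6594594594594595,
    precision=0.604982206405694,
    recall=0.3890160183066362,
)

# Recompute two of the other reported metrics from the recovered counts
f1 = 2 * tp / (2 * tp + fp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (tp / (tp + fn) + specificity) / 2

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"f1={f1:.4f} balanced_accuracy={balanced_accuracy:.4f}")
```

The recovered counts show that the model misses most class 1 fights (recall of about 0.39) while rarely mislabeling class 0, which is why accuracy sits only a few points above the share of class 0 in the test set.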

In [9]:
pd.set_option('display.max_rows', None)
predictor.feature_importance(test_data)

Computing feature importance via permutation shuffling for 227 features using 1110 rows with 5 shuffle sets...
	27.62s	= Expected runtime (5.52s per shuffle set)
	8.31s	= Actual runtime (Completed 5 of 5 shuffle sets)


feature,importance,stddev,p_value,n,p99_high,p99_low
r_age,0.02432432,0.007007,0.000742,5,0.038753,0.009896
b_age,0.02396396,0.004707,0.00017,5,0.033656,0.014272
b_SApM_last_3,0.01189189,0.004742,0.002483,5,0.021655,0.002129
b_SApM_last_5,0.00990991,0.002919,0.000808,5,0.015921,0.003899
b_leg_acc_last_3,0.009189189,0.003211,0.00153,5,0.0158,0.002579
b_landed_head_per_last_5,0.008108108,0.004413,0.007379,5,0.017196,-0.000979
r_SApM_career,0.008108108,0.002548,0.001031,5,0.013355,0.002861
r_Str_Def_career,0.007927928,0.001733,0.000257,5,0.011496,0.00436
r_clinch_landed_last_5,0.007747748,0.003524,0.003975,5,0.015003,0.000492
r_TD_Avg_career,0.007747748,0.004877,0.011871,5,0.017789,-0.002293
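
The log above says the importances come from permutation shuffling with 5 shuffle sets. As a rough illustration of the idea, the sketch below shuffles one column at a time and measures how much accuracy drops. It is a simplified standard-library version on toy data, not AutoGluon's implementation; the toy model and all function names are assumptions made for this example.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_shuffles=5, seed=0):
    """Mean accuracy drop per column after shuffling that column's values."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_shuffles):
            # Shuffle only this column, breaking its link to the labels
            column = [r[col] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, column)]
            drops.append(base - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_shuffles)
    return importances


# Toy data: the label depends only on the first feature
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(x0 > 0.5) for x0, _ in rows]


def toy_model(row):
    """Stand-in for a trained model; it ignores the second feature."""
    return int(row[0] > 0.5)


imp = permutation_importance(toy_model, rows, labels)
print(imp)
```

Read this way, each `importance` value in the table is a drop in the evaluation score when that feature is shuffled, and rows whose `p99_low` is negative (here `b_landed_head_per_last_5` and `r_TD_Avg_career`) have an interval that includes zero.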


<div style="text-align: center;">
     <img src="../img/ufc_logo.png" width="800" /> 
</div>