UFC-Prediction

Predicting the Fight Winner with Machine Learning and AI

This project consists of a number of Jupyter notebooks that cover the following:

  1. Data Preprocessing and Exploratory Data Analysis (EDA):

    • Used libraries: pandas, NumPy, missingno
  2. Visualisations:

    • Plotly for interactive plots and Seaborn/Matplotlib for regular charts
  3. Construction of a DNN with hyperparameter tuning:

    • Keras/keras-tuner and TensorFlow
  4. Ensemble Method (Combining different ML models):

    • TensorFlow/Keras and Scikit-Learn
  5. App Development and Deployment


Introduction

The aim is to take a data-driven decision-making (DDDM) approach: discovering recurring patterns in the data in order to predict the outcome of future sporting events.

🥊 The Ultimate Fighting Championship (UFC) is currently one of the fastest-growing sports in the world (Telegraph, 2017) and organises events weekly.

Dataset

The original dataset, data.csv, found on Kaggle, contains all UFC fights from 1993 to 2019. Each row holds information on the match, the two fighters (blue and red), and the winner, e.g. demographics, body attributes, each fighter's current form, and match details.

  • Dimensions: 5144 rows x 145 columns
  • 9 categorical, 136 numerical features
  • Target (categorical: Blue/Red) specifies the winner
  • High dimensionality
  • Baseline - 67% (Similar Features Considered)

Data Preprocessing

Tasks performed:

  • Feature Selection
  • Replacing empty strings with NA
  • Removing 'Draw' matches (Binary classification)
  • Distinguishing numeric & symbolic fields
  • Removing constant columns (due to no variation in them)
  • Formatting data to 3 decimal places
  • 1-hot-encoding categorical fields
  • Dimensionality Reduction with PCA (Principal Component Analysis)
  • Missing Values Treatment
    • Replacing missing values with the median
    • Predicting missing values via Linear Regression
    • Dropping remaining missing values
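
Several of the steps above can be sketched in a few lines of pandas and scikit-learn. This is only an illustrative sketch, not the notebooks' actual code: the toy frame stands in for data.csv, the column names (`Winner`, `weight_class`, `R_age`, and so on) and the `preprocess` helper are assumptions, the order of the steps is a guess, and the regression-based imputation is left out for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy stand-in for data.csv; all column names are illustrative assumptions.
raw = pd.DataFrame({
    "Winner": ["Red", "Blue", "Draw", "Red", "Blue", "Red"],
    "weight_class": ["Lightweight", "Welterweight", "Lightweight",
                     "", "Welterweight", "Lightweight"],
    "R_age": [29.0, 31.0, 27.0, np.nan, 35.0, 30.0],
    "B_age": [33.0, 28.0, 30.0, 26.0, np.nan, 24.0],
    "R_reach": [180.0, 185.0, 178.0, 190.0, 183.0, 175.0],
    "no_of_rounds": [3, 3, 3, 3, 3, 3],  # constant, carries no information
})


def preprocess(df, target="Winner"):
    """Hypothetical helper that applies a subset of the listed steps."""
    df = df.replace("", np.nan)                    # empty strings -> NA
    df = df[df[target].isin(["Red", "Blue"])]      # drop 'Draw' rows

    # Remove columns that hold a single distinct value.
    constant = [c for c in df.columns
                if c != target and df[c].nunique(dropna=True) <= 1]
    df = df.drop(columns=constant)

    y = (df[target] == "Red").astype(int)          # 1 = Red wins, 0 = Blue wins
    X = df.drop(columns=[target])

    # Distinguish numeric and symbolic fields.
    numeric_cols = X.select_dtypes(include="number").columns
    symbolic_cols = X.columns.difference(numeric_cols)

    # Median imputation for numeric fields, then round to 3 decimal places.
    X[numeric_cols] = X[numeric_cols].fillna(X[numeric_cols].median()).round(3)

    # Drop rows that still contain missing (symbolic) values.
    X = X.dropna()

    # One-hot encode the symbolic fields.
    X = pd.get_dummies(X, columns=list(symbolic_cols), dtype=float)
    return X, y.loc[X.index]


X, y = preprocess(raw)

# PCA on standardised features; the number of components is arbitrary here.
components = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
```

On the real dataset the number of principal components would normally be chosen from the explained variance rather than fixed at 2.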

ML Models and Ensemble Method

Multiple models were trained separately and then combined into a single ensemble model to improve performance:

  • Deep Neural Network (DNN)
  • Support Vector Machine (SVM)
  • Decision Tree (DT)
  • AdaBoost
  • Random Forest (RF)
  • ExtraTrees
  • GradientBoosting
  • Multi-Layer Perceptron (MLP)
  • K-Nearest-Neighbours (KNN)
  • Logistic Regression
  • Linear Discriminant Analysis (LDA)
  • XGBoost (XGB)
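
The README does not say how the individual models are combined, so the following is only one plausible sketch: soft voting with scikit-learn's `VotingClassifier` over a handful of the listed model families. The synthetic data from `make_classification`, the choice of four base models, and all hyperparameters are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the fight features.
X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# A few of the listed model families; scale-sensitive models get a scaler.
base_models = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7))),
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Soft voting averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(estimators=base_models, voting="soft")
ensemble.fit(X_train, y_train)

# Compare each base model's test accuracy with the ensemble's.
scores = {name: model.fit(X_train, y_train).score(X_test, y_test)
          for name, model in base_models}
scores["ensemble"] = ensemble.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name:>8}: {acc:.3f}")
```

A Keras DNN could take part in the same scheme by averaging its predicted probabilities with those of the scikit-learn models; stacking is another common way to combine them.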

Backend Data API and Development

The latest fighter details are generated and the trained models are used to predict upcoming matches. The app is deployed on Heroku and available at https://ai-predicts-ufc.herokuapp.com

© TheDeepestLearners