# UFC Modeling 

This is my scratchpad for my UFC modeling project.

MVP: Baseline Accuracy is 49% and my best Random Forest model's accuracy is 64%.

What I would add to make it more accurate: 

The win/loss record for each fighter at the time of the fight. 
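A rough sketch of how a record-at-fight-time feature could be built. It assumes a `date` column, which does not appear in the data shown in this notebook, and uses a toy frame rather than the real fights. The `fighter1`, `fighter2`, and `outcome` names mirror `final_df`; `record_before_each_fight` and the `wins_*` / `losses_*` columns are hypothetical.

```python
import pandas as pd

# Toy data: `date` is an assumed column; the other names mirror final_df.
fights = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-06-01", "2021-01-01"]),
    "fighter1": ["A", "A", "B"],
    "fighter2": ["B", "C", "C"],
    "outcome": ["fighter1", "fighter2", "fighter1"],
})

def record_before_each_fight(fights: pd.DataFrame) -> pd.DataFrame:
    """Add each fighter's win/loss totals from strictly earlier bouts."""
    fights = fights.sort_values("date").reset_index(drop=True)
    wins, losses = {}, {}
    rows = []
    for row in fights.itertuples(index=False):
        f1, f2 = row.fighter1, row.fighter2
        # Store the totals before updating them, so the current bout
        # never leaks into its own features.
        rows.append({
            "wins_f1": wins.get(f1, 0), "losses_f1": losses.get(f1, 0),
            "wins_f2": wins.get(f2, 0), "losses_f2": losses.get(f2, 0),
        })
        if row.outcome == "fighter1":
            wins[f1] = wins.get(f1, 0) + 1
            losses[f2] = losses.get(f2, 0) + 1
        elif row.outcome == "fighter2":
            wins[f2] = wins.get(f2, 0) + 1
            losses[f1] = losses.get(f1, 0) + 1
        # draws and no contests leave both records unchanged
    return pd.concat([fights, pd.DataFrame(rows)], axis=1)

with_records = record_before_each_fight(fights)
```

Sorting by date before accumulating is what keeps the feature honest: each row only sees results from earlier bouts.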

### Imports

In [2]:
# imports
import numpy as np
import pandas as pd

# Classification Modeling
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Visualizing
import matplotlib.pyplot as plt
import seaborn as sns

# My files
from Wrangle_UFC import *

### Acquire and Prepare

In [3]:
# Remove Limits On Viewing Dataframes
pd.set_option('display.max_columns', None)

In [4]:
final_df, fighter_stat_diff = ufc_stats_difference()

In [5]:
final_df.head(1)

Unnamed: 0.1,Unnamed: 0,event_name,fighter1,fighter2,outcome,win,loss,draw,no_contest,weight_f1,reach_f1,stance_f1,strikes_f1,strike_acc_f1,strikes_absorbed_f1,strike_defense_f1,takedowns_f1,takedown_acc_f1,takedown_def_f1,sub_attempt_f1,age_days_f1,age_f1,outcome_f1,height_in_f1,stance_Orthodox_f1,stance_Southpaw_f1,stance_Switch_f1,weight_f2,reach_f2,stance_f2,strikes_f2,strike_acc_f2,strikes_absorbed_f2,strike_defense_f2,takedowns_f2,takedown_acc_f2,takedown_def_f2,sub_attempt_f2,age_days_f2,age_f2,outcome_f2,height_in_f2,stance_Orthodox_f2,stance_Southpaw_f2,stance_Switch_f2,weight_diff,reach_diff,strike_diff,strike_acc_diff,strikes_absorbed_diff,strikes_defense_diff,takedown_attempts_diff,takedown_acc_diff,takedown_defense_diff,submission_attempt_diff,age_diff,height_diff
0,0,UFC 259: Blachowicz vs. Adesanya,Aalon Cruz,Uros Medic,fighter2,0,1,0,0,145.0,78.0,Switch,7.58,39,8.88,58,0.0,0,0,0.0,11490,31.0,fighter2,72,0,0,1,155.0,71.0,Southpaw,19.91,77,0.52,86,0.0,0,100,0.0,10177,27.0,fighter1,73,0,1,0,-10.0,7.0,-12.33,-38.0,8.36,-28.0,0.0,0.0,-100.0,0.0,1313.0,-1.0


In [6]:
final_df.outcome

0       fighter2
1       fighter2
2       fighter2
3       fighter2
4       fighter2
          ...   
9507    fighter1
9508    fighter2
9509        draw
9510    fighter1
9511    fighter2
Name: outcome, Length: 9512, dtype: object

### Split and Scale

In [7]:
train, validate, test = train_validate_test_split(fighter_stat_diff)

In [8]:
train.head(1)

Unnamed: 0,event_name,fighter1,fighter2,outcome,stance_f1,stance_f2,weight_diff,reach_diff,strike_diff,strike_acc_diff,strikes_absorbed_diff,strikes_defense_diff,takedown_attempts_diff,takedown_acc_diff,takedown_defense_diff,submission_attempt_diff,age_diff,height_diff
8625,UFC 112: Invincible,Terry Etim,Rafael Dos Anjos,fighter2,Orthodox,Southpaw,0.0,3.0,-1.33,-11.0,-0.69,6.0,-1.5,-7.0,-19.0,0.8,-442.0,5.0


In [9]:
train, validate, test, X_train, y_train, X_validate, y_validate, X_test, y_test = split_tvt_into_variables(train, validate, test, target='outcome')

In [10]:
X_train.head(1)

Unnamed: 0,weight_diff,reach_diff,strike_diff,strike_acc_diff,strikes_absorbed_diff,strikes_defense_diff,takedown_attempts_diff,takedown_acc_diff,takedown_defense_diff,submission_attempt_diff,age_diff,height_diff
8625,0.0,3.0,-1.33,-11.0,-0.69,6.0,-1.5,-7.0,-19.0,0.8,-442.0,5.0


In [11]:
scaler, X_train_scaled, X_validate_scaled, X_test_scaled = Min_Max_Scaler(X_train, X_validate, X_test)

In [12]:
X_train_scaled.head(1)

Unnamed: 0,weight_diff,reach_diff,strike_diff,strike_acc_diff,strikes_absorbed_diff,strikes_defense_diff,takedown_attempts_diff,takedown_acc_diff,takedown_defense_diff,submission_attempt_diff,age_diff,height_diff
8625,0.516129,0.615385,0.464444,0.363636,0.481659,0.566667,0.430939,0.465,0.405,0.554795,0.464216,0.692308


### MVP

##### Baseline

In [13]:
# baseline prediction = most common value
baseline = y_train.mode()
baseline

0    fighter1
1    fighter2
dtype: object

In [14]:
match_bsl_prediction = y_train == 'fighter1'

In [15]:
baseline_accuracy = match_bsl_prediction.mean()

In [16]:
baseline_accuracy

0.492114156965828

In [17]:
# baseline accuracy = 49%
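`y_train.mode()` returns two labels here because `fighter1` and `fighter2` tie at 2,621 rows each, so the baseline has to pick one of them. A small sketch of that tie handling, using toy labels rather than the real `y_train`:

```python
import pandas as pd

# Toy labels standing in for y_train; the real series has 5,326 rows.
y = pd.Series(["fighter1", "fighter2", "fighter1", "fighter2", "draw"])

modes = y.mode()                 # every label tied for most frequent, sorted
baseline_label = modes.iloc[0]   # pick the first tied label explicitly
baseline_accuracy = (y == baseline_label).mean()
```

Either tied label gives the same baseline accuracy, so the choice only matters for reproducibility.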

#### Decision Tree

In [18]:
# Create
tree1_clf = DecisionTreeClassifier(max_depth=3, random_state=123)

In [19]:
# fit
tree1_clf = tree1_clf.fit(X_train, y_train)

In [20]:
# visualize the decision tree
import graphviz

dot_data = export_graphviz(tree1_clf, feature_names= X_train.columns, rounded=True, filled=True, out_file=None)
graph = graphviz.Source(dot_data) 

graph.render('ufc_decision_tree', view=True)

'ufc_decision_tree.pdf'

In [21]:
y_pred = tree1_clf.predict(X_train)
y_pred[0:5]

array(['fighter2', 'fighter2', 'fighter1', 'fighter1', 'fighter2'],
      dtype=object)

In [22]:
y_pred_proba = tree1_clf.predict_proba(X_train)
y_pred_proba[0:5]

array([[0.00722394, 0.41795666, 0.56449948, 0.01031992],
       [0.00645995, 0.45090439, 0.53359173, 0.00904393],
       [0.00502513, 0.54648241, 0.44095477, 0.00753769],
       [0.00502513, 0.54648241, 0.44095477, 0.00753769],
       [0.00481928, 0.34939759, 0.63855422, 0.00722892]])

In [23]:
# accuracy:
print('Accuracy of Decision Tree 1 classifier on training set: {:.2f}'
      .format(tree1_clf.score(X_train, y_train)))

Accuracy of Decision Tree 1 classifier on training set: 0.61


In [24]:
y_train.value_counts()

fighter2      2621
fighter1      2621
no_contest      52
draw            32
Name: outcome, dtype: int64

In [25]:
# classification report: 
print(classification_report(y_train, y_pred))

              precision    recall  f1-score   support

        draw       0.00      0.00      0.00        32
    fighter1       0.61      0.62      0.61      2621
    fighter2       0.61      0.62      0.62      2621
  no_contest       0.00      0.00      0.00        52

    accuracy                           0.61      5326
   macro avg       0.31      0.31      0.31      5326
weighted avg       0.60      0.61      0.61      5326



In [26]:
# make classification report prettier in a df
class_report = classification_report(y_train, y_pred, output_dict=True)
print("Tree1 depth")
pd.DataFrame(class_report)

Tree1 depth


Unnamed: 0,draw,fighter1,fighter2,no_contest,accuracy,macro avg,weighted avg
precision,0.0,0.611237,0.610696,0.0,0.610965,0.305483,0.60133
recall,0.0,0.618466,0.623045,0.0,0.610965,0.310378,0.610965
f1-score,0.0,0.61483,0.616808,0.0,0.610965,0.30791,0.606107
support,32.0,2621.0,2621.0,52.0,0.610965,5326.0,5326.0


In [27]:
for i in range(2, 11):
    # Make the model
    tree = DecisionTreeClassifier(max_depth=i, random_state=123)

    # Fit the model (on train and only train)
    tree = tree.fit(X_train, y_train)

    # Use the model
    # We'll evaluate the model's performance on train, first
    y_pred = tree.predict(X_train)

    # Produce the classification report on the actual y values and this model's predicted y values
    report = classification_report(y_train, y_pred, output_dict=True)
    print(f"Tree with max depth of {i}")
    print(pd.DataFrame(report))
    print()

Tree with max depth of 2
           draw     fighter1     fighter2  no_contest  accuracy    macro avg  \
precision   0.0     0.583650     0.582715         0.0  0.583177     0.291591   
recall      0.0     0.585654     0.599390         0.0  0.583177     0.296261   
f1-score    0.0     0.584651     0.590935         0.0  0.583177     0.293896   
support    32.0  2621.000000  2621.000000        52.0  0.583177  5326.000000   

           weighted avg  
precision      0.573985  
recall         0.583177  
f1-score       0.578522  
support     5326.000000  

Tree with max depth of 3
           draw     fighter1     fighter2  no_contest  accuracy    macro avg  \
precision   0.0     0.611237     0.610696         0.0  0.610965     0.305483   
recall      0.0     0.618466     0.623045         0.0  0.610965     0.310378   
f1-score    0.0     0.614830     0.616808         0.0  0.610965     0.307910   
support    32.0  2621.000000  2621.000000        52.0  0.610965  5326.000000   

           weight

In [28]:
metrics = []  

In [29]:
for i in range(1, 11):
    tree = DecisionTreeClassifier(max_depth=i, random_state=123)
    
    #run the model on train and only TRAIN data 
    tree = tree.fit(X_train, y_train)
    
    # evaluate the model's accuracy on train, then on validate
    in_sample_accuracy = tree.score(X_train, y_train)
    out_sample_accuracy = tree.score(X_validate, y_validate)
    
    output = {'max_depth': i, 'train_accuracy': in_sample_accuracy, 'validate_accuracy': out_sample_accuracy}
    
    metrics.append(output)
    
tree_df = pd.DataFrame(metrics)
tree_df["difference"] = tree_df.train_accuracy - tree_df.validate_accuracy

tree_df

Unnamed: 0,max_depth,train_accuracy,validate_accuracy,difference
0,1,0.583177,0.592641,-0.009464
1,2,0.583177,0.592641,-0.009464
2,3,0.610965,0.598336,0.01263
3,4,0.618663,0.607096,0.011567
4,5,0.640631,0.616294,0.024337
5,6,0.656027,0.614542,0.041485
6,7,0.686819,0.607534,0.079285
7,8,0.720804,0.605782,0.115022
8,9,0.75122,0.597021,0.154199
9,10,0.784641,0.593079,0.191562


In [30]:
# to avoid over-fitting, set a threshold on the train/validate accuracy difference

threshold = 0.10  #threshold set for amount of overfit that is tolerated

models = []
metrics = []

for i in range(1, 11):
    tree = DecisionTreeClassifier(max_depth=i, random_state=123)
    #^^^ creates the model
    
    tree = tree.fit(X_train, y_train)   #fit model to train data and only TRAIN data
    
    in_sample_accuracy = tree.score(X_train, y_train)
    out_sample_accuracy = tree.score(X_validate, y_validate)
    #^^^evaluates the model's performance on train, then on validate
    
    difference = in_sample_accuracy - out_sample_accuracy
    #^^calculates the difference in accuracy
    
    if difference > threshold:
        break
    #^^adds conditions to check the accuracy vs the threshold
    
    output = {
        'max_depth': i,
        'train_accuracy': in_sample_accuracy,
        'validate_accuracy': out_sample_accuracy,
        'difference': difference}
    #^^^formats the output for each model's performance on train and validate
    
    metrics.append(output)
    
    models.append(output)
    
# 'difference' is already stored in each output dict, so it is not recomputed here
model_df = pd.DataFrame(metrics)


model_df.head()

Unnamed: 0,max_depth,train_accuracy,validate_accuracy,difference
0,1,0.583177,0.592641,-0.009464
1,2,0.583177,0.592641,-0.009464
2,3,0.610965,0.598336,0.01263
3,4,0.618663,0.607096,0.011567
4,5,0.640631,0.616294,0.024337


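Decision tree: about 61% accurate (train accuracy at max_depth=3; validate accuracy peaks at about 62% at max_depth=5).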
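The threshold loop stops at the first depth whose train/validate gap exceeds 0.10, but it does not pick a depth among the survivors. A minimal sketch of one possible rule, assuming "highest validate accuracy among depths within the threshold" (that rule is an assumption, not something decided in this notebook). The accuracies are copied from the max_depth table above.

```python
import pandas as pd

# Train/validate accuracies copied from the max_depth table in this notebook.
tree_df = pd.DataFrame({
    "max_depth": range(1, 11),
    "train_accuracy": [0.583177, 0.583177, 0.610965, 0.618663, 0.640631,
                       0.656027, 0.686819, 0.720804, 0.751220, 0.784641],
    "validate_accuracy": [0.592641, 0.592641, 0.598336, 0.607096, 0.616294,
                          0.614542, 0.607534, 0.605782, 0.597021, 0.593079],
})
tree_df["difference"] = tree_df.train_accuracy - tree_df.validate_accuracy

threshold = 0.10
# Keep depths whose train/validate gap is within the threshold,
# then take the row with the highest validate accuracy.
candidates = tree_df[tree_df.difference <= threshold]
best = candidates.loc[candidates.validate_accuracy.idxmax()]
```

Under this rule the selected row is max_depth=5, which matches the depth where validate accuracy peaks in the table.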

#### Random Forest Model

In [31]:
# create model
rf1_clf = RandomForestClassifier(max_depth=10, min_samples_leaf=1, random_state=123)  

In [32]:
# fit model 
rf1_clf.fit(X_train, y_train)

RandomForestClassifier(max_depth=10, random_state=123)

In [33]:
# show feature importance
print(rf1_clf.feature_importances_)

[0.03798569 0.04949213 0.14051246 0.07447359 0.1358705  0.08221113
 0.09118034 0.07426964 0.10032851 0.07082072 0.10060133 0.04225396]
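The raw array is hard to read without feature names. A small sketch that pairs the printed importances with the `X_train` columns, assuming the column order shown by `X_train.head(1)` (scikit-learn reports importances in the order of the fitted columns). The values are copied from the output above.

```python
import pandas as pd

# Column names as shown by X_train.head(1); values copied from the printed array.
feature_names = [
    "weight_diff", "reach_diff", "strike_diff", "strike_acc_diff",
    "strikes_absorbed_diff", "strikes_defense_diff", "takedown_attempts_diff",
    "takedown_acc_diff", "takedown_defense_diff", "submission_attempt_diff",
    "age_diff", "height_diff",
]
importances = [
    0.03798569, 0.04949213, 0.14051246, 0.07447359, 0.1358705, 0.08221113,
    0.09118034, 0.07426964, 0.10032851, 0.07082072, 0.10060133, 0.04225396,
]

# Label each importance and sort from most to least important.
ranked = pd.Series(importances, index=feature_names).sort_values(ascending=False)
```

In the notebook itself the same result should come from `pd.Series(rf1_clf.feature_importances_, index=X_train.columns)`. On these numbers, `strike_diff` and `strikes_absorbed_diff` rank highest.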


In [34]:
# predict target
y_pred = rf1_clf.predict(X_train)
y_pred

array(['fighter2', 'fighter2', 'fighter2', ..., 'fighter1', 'fighter1',
       'fighter1'], dtype=object)

In [35]:
y_pred_proba = rf1_clf.predict_proba(X_train)

In [36]:
print('Accuracy of random forest classifier on training set: {:.2f}'
     .format(rf1_clf.score(X_train, y_train)))

Accuracy of random forest classifier on training set: 0.88


In [37]:
print(classification_report(y_train, y_pred))

              precision    recall  f1-score   support

        draw       0.00      0.00      0.00        32
    fighter1       0.87      0.90      0.89      2621
    fighter2       0.88      0.89      0.88      2621
  no_contest       1.00      0.10      0.18        52

    accuracy                           0.88      5326
   macro avg       0.69      0.47      0.49      5326
weighted avg       0.87      0.88      0.87      5326



In [38]:
class_report = classification_report(y_train, y_pred, output_dict=True)
print("Random forest 1 (max_depth=10)")
pd.DataFrame(class_report)

Random forest 1 (max_depth=10)


Unnamed: 0,draw,fighter1,fighter2,no_contest,accuracy,macro avg,weighted avg
precision,0.0,0.874907,0.881594,1.0,0.878333,0.689125,0.874162
recall,0.0,0.896604,0.886303,0.096154,0.878333,0.469765,0.878333
f1-score,0.0,0.885623,0.883942,0.175439,0.878333,0.486251,0.872541
support,32.0,2621.0,2621.0,52.0,0.878333,5326.0,5326.0


In [39]:
max_depth = 16

for i in range(1, max_depth):
    # Create Model
    depth = max_depth - i
    n = i
    forest = RandomForestClassifier(max_depth=depth, min_samples_leaf=n, random_state=123)

    # Fit the model (on train and only train)
    forest = forest.fit(X_train, y_train)

    # Use the model
    # We'll evaluate the model's performance on train, first
    y_pred = forest.predict(X_train)

    # Produce the classification report on the actual y values and this model's predicted y values
    report = classification_report(y_train, y_pred, output_dict=True)
    print(f"Forest with max_depth={depth}, min_samples_leaf={n}")
    print(pd.DataFrame(report))
    print()

Forest with max_depth=15, min_samples_leaf=1
                draw     fighter1     fighter2  no_contest  accuracy  \
precision   1.000000     0.994683     0.997332    1.000000  0.996057   
recall      0.812500     0.999237     0.998474    0.826923  0.996057   
f1-score    0.896552     0.996955     0.997903    0.905263  0.996057   
support    32.000000  2621.000000  2621.000000   52.000000  0.996057   

             macro avg  weighted avg  
precision     0.998004      0.996071  
recall        0.909283      0.996057  
f1-score      0.949168      0.995923  
support    5326.000000   5326.000000  

Forest with max_depth=14, min_samples_leaf=2
           draw     fighter1     fighter2  no_contest  accuracy    macro avg  \
precision   0.0     0.967898     0.973177         0.0  0.970522     0.485269   
recall      0.0     0.989317     0.982831         0.0  0.970522     0.493037   
f1-score    0.0     0.978491     0.977980         0.0  0.970522     0.489118   
support    32.0  2621.000000  2621.000000        52.0  0.970522  5326.000

In [40]:
metrics = []
max_depth = 16

for i in range(1, max_depth):
    # Create model
    depth = max_depth - i
    n = i
    forest = RandomForestClassifier(max_depth=depth, min_samples_leaf=n, random_state=123)

    # Fit the model (on train and only train)
    forest = forest.fit(X_train, y_train)

    # Use the model
    # We'll evaluate the model's performance on train, first
    in_sample_accuracy = forest.score(X_train, y_train)
    
    out_of_sample_accuracy = forest.score(X_validate, y_validate)

    output = {
        "min_samples_per_leaf": n,
        "max_depth": depth,
        "train_accuracy": in_sample_accuracy,
        "validate_accuracy": out_of_sample_accuracy
    }
    
    metrics.append(output)
    
df = pd.DataFrame(metrics)
df["difference"] = df.train_accuracy - df.validate_accuracy
df


Unnamed: 0,min_samples_per_leaf,max_depth,train_accuracy,validate_accuracy,difference
0,1,15,0.996057,0.626369,0.369688
1,2,14,0.970522,0.632501,0.338021
2,3,13,0.936725,0.645204,0.291522
3,4,12,0.895794,0.639947,0.255847
4,5,11,0.854675,0.635129,0.219546
5,6,10,0.81656,0.633377,0.183183
6,7,9,0.778633,0.637757,0.140876
7,8,8,0.742959,0.635567,0.107392
8,9,7,0.714608,0.633815,0.080792
9,10,6,0.687007,0.639509,0.047498


Best random forest model: min_samples_per_leaf=10, max_depth=6 (train accuracy 0.687, validate accuracy 0.640, difference 0.047).

64% accuracy on validate.

#### KNN

In [41]:
# create knn
knn = KNeighborsClassifier(n_neighbors=5, weights='uniform')

In [42]:
# fit
knn.fit(X_train, y_train)

KNeighborsClassifier()

In [43]:
# predict
y_pred = knn.predict(X_train)

In [44]:
#y_pred prob
y_pred_proba = knn.predict_proba(X_train)

In [45]:
# Accuracy
print('Accuracy of KNN classifier on training set: {:.2f}'
     .format(knn.score(X_train, y_train)))

Accuracy of KNN classifier on training set: 0.70


In [46]:
# confusion matrix
pd.DataFrame(confusion_matrix(y_train, y_pred))

Unnamed: 0,0,1,2,3
0,2,20,10,0
1,1,1914,706,0
2,2,792,1827,0
3,0,37,15,0
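The matrix above is indexed 0 to 3, which hides which class is which. `confusion_matrix` orders classes by sorted label, so here the rows and columns should be draw, fighter1, fighter2, no_contest (row 1 sums to 2,621, matching the fighter1 support). A sketch with toy labels showing one way to attach names; the `actual_` / `pred_` prefixes are just an illustration.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Toy labels, not the notebook's data.
y_true = ["fighter1", "fighter2", "fighter1", "draw", "fighter2"]
y_pred = ["fighter1", "fighter1", "fighter1", "fighter2", "fighter2"]

# Pass the label order explicitly so rows and columns are unambiguous.
labels = sorted(set(y_true) | set(y_pred))
cm = pd.DataFrame(
    confusion_matrix(y_true, y_pred, labels=labels),
    index=[f"actual_{label}" for label in labels],
    columns=[f"pred_{label}" for label in labels],
)
```

Rows are actual classes and columns are predicted classes, which is the scikit-learn convention.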


In [47]:
# classification report
print(classification_report(y_train, y_pred))

              precision    recall  f1-score   support

        draw       0.40      0.06      0.11        32
    fighter1       0.69      0.73      0.71      2621
    fighter2       0.71      0.70      0.71      2621
  no_contest       0.00      0.00      0.00        52

    accuracy                           0.70      5326
   macro avg       0.45      0.37      0.38      5326
weighted avg       0.69      0.70      0.70      5326



In [48]:
pd.DataFrame(classification_report(y_train, y_pred, output_dict=True))

Unnamed: 0,draw,fighter1,fighter2,no_contest,accuracy,macro avg,weighted avg
precision,0.4,0.692725,0.71423,0.0,0.702779,0.451739,0.694786
recall,0.0625,0.730256,0.697062,0.0,0.702779,0.372454,0.702779
f1-score,0.108108,0.710996,0.705542,0.0,0.702779,0.381161,0.697748
support,32.0,2621.0,2621.0,52.0,0.702779,5326.0,5326.0


In [50]:
for k in range(1, 21):
            
    # define the thing
    knn = KNeighborsClassifier(n_neighbors=k)
    
    # fit the thing (remember: only fit on training data)
    knn = knn.fit(X_train, y_train)
    
    # predict on train
    y_pred = knn.predict(X_train)

    # Produce the classification report on the actual y values and this model's predicted y values
    report = classification_report(y_train, y_pred, output_dict=True)
    print(f"KNN with k value of {k}")
    print(pd.DataFrame(report))
    print()

KNN with k value of 1
           draw  fighter1  fighter2  no_contest  accuracy  macro avg  \
precision   1.0       1.0       1.0         1.0       1.0        1.0   
recall      1.0       1.0       1.0         1.0       1.0        1.0   
f1-score    1.0       1.0       1.0         1.0       1.0        1.0   
support    32.0    2621.0    2621.0        52.0       1.0     5326.0   

           weighted avg  
precision           1.0  
recall              1.0  
f1-score            1.0  
support          5326.0  

KNN with k value of 2
                draw     fighter1     fighter2  no_contest  accuracy  \
precision   0.524590     0.679064     0.981690         0.0   0.75798   
recall      1.000000     0.996185     0.531858         0.0   0.75798   
f1-score    0.688172     0.807609     0.689928         0.0   0.75798   
support    32.000000  2621.000000  2621.000000        52.0   0.75798   

             macro avg  weighted avg  
precision     0.546336      0.820432  
recall        0.632011   

KNN with k value of 16
           draw     fighter1     fighter2  no_contest  accuracy    macro avg  \
precision   0.0     0.614295     0.651092         0.0  0.630116     0.316347   
recall      0.0     0.711560     0.568867         0.0  0.630116     0.320107   
f1-score    0.0     0.659360     0.607208         0.0  0.630116     0.316642   
support    32.0  2621.000000  2621.000000        52.0  0.630116  5326.000000   

           weighted avg  
precision      0.622715  
recall         0.630116  
f1-score       0.623296  
support     5326.000000  

KNN with k value of 17
           draw     fighter1     fighter2  no_contest  accuracy    macro avg  \
precision   0.0     0.620158     0.633176         0.0  0.626361     0.313333   
recall      0.0     0.659672     0.613125         0.0  0.626361     0.318199   
f1-score    0.0     0.639305     0.622989         0.0  0.626361     0.315573   
support    32.0  2621.000000  2621.000000        52.0  0.626361  5326.000000   

           weighted a

In [51]:
metrics = []

# loop through different values of k
for k in range(1, 21):
            
    # define the thing
    knn = KNeighborsClassifier(n_neighbors=k)
    
    # fit the thing (remember: only fit on training data)
    knn.fit(X_train, y_train)
    
    # use the thing (calculate accuracy)
    train_accuracy = knn.score(X_train, y_train)
    validate_accuracy = knn.score(X_validate, y_validate)
    
    output = {
        "k": k,
        "train_accuracy": train_accuracy,
        "validate_accuracy": validate_accuracy
    }
    
    metrics.append(output)


df = pd.DataFrame(metrics)
df["difference"] = df.train_accuracy - df.validate_accuracy
df

Unnamed: 0,k,train_accuracy,validate_accuracy,difference
0,1,1.0,0.530442,0.469558
1,2,0.75798,0.52869,0.229289
2,3,0.757604,0.545335,0.212269
3,4,0.705407,0.553219,0.152188
4,5,0.702779,0.548401,0.154378
5,6,0.675742,0.551467,0.124274
6,7,0.673864,0.555848,0.118016
7,8,0.658468,0.549715,0.108753
8,9,0.65828,0.552781,0.105499
9,10,0.642884,0.545773,0.097111


KNN: about 55% accuracy on validate (best among the k values shown is 0.556 at k=7).
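The KNN cells fit on `X_train` rather than `X_train_scaled`, and KNN is distance-based, so large-scale features such as `age_diff` (measured in days) can dominate the neighbor search. A sketch of the effect on synthetic data, with one informative small-scale feature and one noisy large-scale feature. Whether scaling helps on the UFC data is untested here; the data and variable names below are made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(123)

# Synthetic stand-in: `signal` decides the label, `noise` is irrelevant
# but has a much larger scale.
n = 400
signal = rng.normal(size=n)
noise = rng.normal(scale=500.0, size=n)
X = np.column_stack([signal, noise])
y = np.where(signal > 0, "fighter1", "fighter2")

X_tr, X_val = X[:300], X[300:]
y_tr, y_val = y[:300], y[300:]

# Same classifier, with and without min-max scaling in front of it.
raw_knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
scaled_knn = make_pipeline(
    MinMaxScaler(), KNeighborsClassifier(n_neighbors=5)
).fit(X_tr, y_tr)

raw_acc = raw_knn.score(X_val, y_val)
scaled_acc = scaled_knn.score(X_val, y_val)
```

Wrapping the scaler in a pipeline means it is fit on the training rows only, the same discipline the notebook applies when fitting models.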

Baseline Accuracy is 49% and my best Random Forest model's accuracy is 64%.

Best model on test dataset:

In [56]:
# predict target
y_pred = rf1_clf.predict(X_test)
y_pred

array(['fighter1', 'fighter1', 'fighter2', ..., 'fighter1', 'fighter2',
       'fighter2'], dtype=object)

In [57]:
# fit model 
rf1_clf.fit(X_train, y_train)

RandomForestClassifier(max_depth=10, random_state=123)

In [58]:
y_pred = knn.predict(X_test)
y_pred

array(['fighter1', 'fighter1', 'fighter1', ..., 'fighter1', 'fighter2',
       'fighter2'], dtype=object)

In [59]:
y_pred_proba = rf1_clf.predict_proba(X_test)

In [60]:
print('Accuracy of random forest classifier on test set: {:.2f}'
     .format(rf1_clf.score(X_test, y_test)))

Accuracy of random forest classifier on test set: 0.64


In [61]:
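# Note: y_pred was last assigned by knn.predict(X_test) in In [58], so this
# report describes the KNN model (accuracy 0.557), not the random forest.
pd.DataFrame(classification_report(y_test, y_pred, output_dict=True))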

Unnamed: 0,draw,fighter1,fighter2,no_contest,accuracy,macro avg,weighted avg
precision,0.0,0.549074,0.567436,0.0,0.557015,0.279128,0.54945
recall,0.0,0.632871,0.498932,0.0,0.557015,0.282951,0.557015
f1-score,0.0,0.588002,0.530984,0.0,0.557015,0.279746,0.550688
support,11.0,937.0,936.0,19.0,0.557015,1903.0,1903.0


The random forest evaluated on test data (rf1_clf, max_depth=10) is 64% accurate.

An increase from 49% to 64% is a gain of 15 percentage points, a relative improvement of 30.61% over baseline.
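The arithmetic behind the 30.61% figure, written out with the rounded accuracies:

```python
baseline = 0.49   # baseline accuracy (rounded)
model = 0.64      # random forest accuracy on test (rounded)

absolute_gain = model - baseline          # gain in accuracy points
relative_gain = absolute_gain / baseline  # gain relative to the baseline

print(f"{absolute_gain * 100:.0f} percentage points, {relative_gain * 100:.2f}% relative")
```

The two numbers answer different questions: the absolute gain is how many more fights out of 100 are called correctly, while the relative gain compares that to where the baseline started.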

Conclusion: 



To do with more time:

- Predict on fight card and add predictions to CSV.
- Scrape for most recent data. 
- Add win/loss record as a feature.
- Figure out how to add stance as a feature. 
- Once accuracy has increased by a satisfactory amount, create a front-end app that takes in two fighters and returns the predicted outcome.
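For the stance item, one option is one-hot encoding the `stance_f1` and `stance_f2` columns that `train` already carries. A minimal sketch on toy rows; the `stance_mismatch` column is a hypothetical extra feature, not something from the wrangle file.

```python
import pandas as pd

# Toy rows using the stance_f1 / stance_f2 columns that appear in train.
df = pd.DataFrame({
    "stance_f1": ["Orthodox", "Southpaw", "Switch"],
    "stance_f2": ["Southpaw", "Orthodox", "Switch"],
    "reach_diff": [3.0, -1.0, 7.0],
})

# One-hot encode both stance columns; dtype=int gives 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["stance_f1", "stance_f2"], dtype=int)

# Hypothetical extra feature: 1 when the two fighters use different stances.
encoded["stance_mismatch"] = (df["stance_f1"] != df["stance_f2"]).astype(int)
```

Encoding before the train/validate/test split would keep the dummy columns identical across all three sets, which the classifiers need.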
