# UFC 264 | Poirier vs McGregor III | Data Analysis 

*Sam Park*

*8 July 2021*

*NOTE*

This notebook is being posted after UFC 264, which took place on July 10th. However, the analysis is written from the perspective of before the fight. I decided to post the notebook anyway since the fight ended inconclusively (TKO due to doctor stoppage) and a fourth fight remains possible.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
data_264 = pd.read_csv('../input/ultimate-ufc-dataset/ufc-master.csv')
data_264.head(5)

# Extract Individual Fighter Data

In [None]:
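# .copy() makes independent frames, so adding a column below does not
# trigger pandas' SettingWithCopyWarning.
R_mac = data_264.loc[data_264["R_fighter"] == "Conor McGregor"].copy()
B_mac = data_264.loc[data_264["B_fighter"] == "Conor McGregor"].copy()

# 1 if McGregor won from that corner, 0 otherwise.
R_mac["Win/Loss"] = np.where(R_mac["Winner"] == "Red", 1, 0)
B_mac["Win/Loss"] = np.where(B_mac["Winner"] == "Blue", 1, 0)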

In [None]:
data_mac = pd.concat([R_mac, B_mac])
data_mac.head(5)

In [None]:
data_mac = data_mac[["R_fighter", "B_fighter", "date", "weight_class", "no_of_rounds",
                     "finish", "finish_details", "finish_round",
                     "total_fight_time_secs", "Win/Loss"]]

In [None]:
# .copy() makes independent frames, so adding a column below does not
# trigger pandas' SettingWithCopyWarning.
R_por = data_264.loc[data_264["R_fighter"] == "Dustin Poirier"].copy()
B_por = data_264.loc[data_264["B_fighter"] == "Dustin Poirier"].copy()

# 1 if Poirier won from that corner, 0 otherwise.
R_por["Win/Loss"] = np.where(R_por["Winner"] == "Red", 1, 0)
B_por["Win/Loss"] = np.where(B_por["Winner"] == "Blue", 1, 0)

In [None]:
data_por = pd.concat([R_por, B_por])
data_por.head(5)

In [None]:
data_por = data_por[["R_fighter", "B_fighter", "date", "weight_class", "no_of_rounds",
                     "finish", "finish_details", "finish_round",
                     "total_fight_time_secs", "Win/Loss"]]

*NOTE*

The code for this notebook was written before UFC 264 took place. To recreate the analysis, you will therefore need to remove the UFC 264 data before continuing. There are also some null values and inconsistencies in finish_details that you will need to fill in. The code above produces the general framework of the data. To keep things simple, I will continue by loading the data I saved from my original analysis, with the necessary edits already applied.
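The clean-up described in the note could look roughly like the sketch below. It runs on a few made-up rows rather than the real dataset, assumes the date column can be parsed by `pd.to_datetime`, and uses "Unknown" purely as a placeholder, since the real missing values have to be looked up by hand.

```python
import pandas as pd

# Made-up rows that mimic the columns kept above; not real fight records.
fights = pd.DataFrame({
    "date": ["2021-07-10", "2020-06-27", "2019-09-07"],
    "finish": ["KO/TKO", "U-DEC", "SUB"],
    "finish_details": [None, None, "Rear Naked Choke"],
    "Win/Loss": [1, 1, 0],
})

# Parse the dates, then keep only bouts that happened before UFC 264.
fights["date"] = pd.to_datetime(fights["date"])
ufc_264_date = pd.Timestamp("2021-07-10")
pre_264 = fights.loc[fights["date"] < ufc_264_date].copy()

# Placeholder fill for missing finish details (an assumption, not the
# values used in the saved CSV files).
pre_264["finish_details"] = pre_264["finish_details"].fillna("Unknown")
```

Filtering on the parsed date rather than on an event name keeps the step independent of how the dataset labels the card.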

In [None]:
data_por = pd.read_csv('../input/por-vs-mac/data_por.csv')
data_mac = pd.read_csv('../input/por-vs-mac/data_mac.csv')

# Create Tables With Summary Statistics From UFCStats.com

In [None]:
tale_of_tape_data = {"Dustin Poirier":["27-6-0 (1 NC)", "19-5", "5'9\"", "72\"",
                                       "Southpaw", "32", "1"], 
                     "Conor McGregor":["22-5-0", "10-3", "5'9\"", "74\"", "Southpaw",
                                       "32", "1"]}

tale_of_the_tape = pd.DataFrame(tale_of_tape_data, index = ["Professional Record",
                                                            "UFC Record", "Height",
                                                            "Reach", "Stance", "Age",
                                                            "Head to Head Wins"])

str_ave_data = {"Dustin Poirier":["5.59", "4.17", "50%", "54%"],
                "Conor McGregor":["5.32", "4.54", "49%", "54%"]}

str_ave = pd.DataFrame(str_ave_data, index = ["Strikes Landed per Min", "Strikes Absorbed per Min",
                                              "Striking Accuracy", "Striking Defense"])

grap_ave_data = {"Dustin Poirier":["1.47", "1.3", "36%", "61%"],
                 "Conor McGregor":["0.70", "0.0", "55%", "67%"]}

grap_ave = pd.DataFrame(grap_ave_data, index = ["Average Takedowns per 15 Min",
                                                "Average Submissions per 15 Min",
                                                "Takedown Accuracy", "Takedown Defense"])

### Create Stylers to Highlight Key Stats in Tables

In [None]:
def tale(val):
    if '19-5' in val:
        color = 'green'
    elif '10-3' in val:
        color = 'red'
    elif '72"' in val:
        color = 'red'
    elif '74"' in val:
        color = 'green'
    else:
        color = 'black'
    return 'color: %s' % color

t = tale_of_the_tape.style.applymap(tale)

In [None]:
def str_(val):
    if '5.59' in val:
        color = 'green'
    elif '4.17' in val:
        color = 'green'
    elif '5.32' in val:
        color = 'red'
    elif '4.54' in val:
        color = 'red'
    else:
        color = 'black'
    return 'color: %s' % color

s = str_ave.style.applymap(str_)

In [None]:
def grap(val):
    if '1.47' in val:
        color = 'green'
    elif '1.3' in val:
        color = 'green'
    elif '0.70' in val:
        color = 'red'
    elif '0.0' in val:
        color = 'red'
    elif '55%' in val:
        color = 'green'
    elif '67%' in val:
        color = 'green'
    elif '36%' in val:
        color = 'red'
    elif '61%' in val:
        color = 'red'
    else:
        color = 'black'
    return 'color: %s' % color

g = grap_ave.style.applymap(grap)
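The stylers above match hard-coded strings, so they need editing whenever a number changes. As an alternative sketch (not the approach used in this notebook), the values could be stored as floats and compared row by row. The helper name `highlight_better` and the `LOWER_IS_BETTER` set are hypothetical names introduced for illustration.

```python
import pandas as pd

# Same striking numbers as str_ave, stored as floats instead of strings.
str_num = pd.DataFrame(
    {"Dustin Poirier": [5.59, 4.17], "Conor McGregor": [5.32, 4.54]},
    index=["Strikes Landed per Min", "Strikes Absorbed per Min"],
)

# Hypothetical set of rows where a lower value is the better one.
LOWER_IS_BETTER = {"Strikes Absorbed per Min"}

def highlight_better(row):
    """Return one CSS string per cell: green for the better value, red for the worse."""
    best = row.min() if row.name in LOWER_IS_BETTER else row.max()
    if (row == best).all():
        # A tie gets no highlight.
        return ["color: black"] * len(row)
    return ["color: green" if v == best else "color: red" for v in row]

# Styles for each row, computed without rendering anything.
landed_css = highlight_better(str_num.loc["Strikes Landed per Min"])
absorbed_css = highlight_better(str_num.loc["Strikes Absorbed per Min"])
```

In a notebook, `str_num.style.apply(highlight_better, axis=1)` would apply the same function to every row of the table.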

In [None]:
t

Both fighters have very similar static data. Overall records, height, stance, age, and head to head wins are basically even. The first advantage I see is that Poirier has much more UFC experience. Even though McGregor has achieved more championship success in the UFC, he has only 13 total UFC fights compared to 24 for Poirier. This is partially because Poirier started fighting in the UFC before McGregor, and partially because of McGregor's lack of activity over the past few years. The second significant advantage is a two inch reach advantage for McGregor. This could be integral in the boxing exchanges, especially since McGregor has a deadly straight left hand out of the southpaw stance.

In [None]:
g

In the grappling department we can see advantages for both fighters. Average takedowns per 15 minutes and average submissions per 15 minutes easily side with Poirier. This means that Poirier is a much more active wrestler and grappler. However, even though McGregor is much less active as a wrestler, he has a higher takedown accuracy and takedown defense. Many believe that Dustin Poirier's key to victory is to take the fight to the ground, since he was able to take Conor McGregor down in the second fight and McGregor possesses a dangerous left hand on the feet. This data leads me to believe that it might be harder for Poirier to take McGregor down than many experts are making it seem.

In [None]:
s

Striking accuracy and striking defense are essentially even between these two fighters. Both are regarded as proficient strikers, so this could lead to an interesting battle on the feet. It should also be noted that both fighters have head to head victories via KO/TKO due to punches from distance. However, Poirier has the edge when it comes to strikes landed and strikes absorbed per minute. This leads me to believe that Poirier could pull away in strikes landed if the fight progresses into the later rounds.
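To put a rough number on that edge, the per-minute figures from the table can be turned into a net striking differential. This is only a back-of-the-envelope sketch: the 25 minute projection assumes the career averages hold for a full five rounds of five minutes, which a real fight will not follow exactly.

```python
# Per-minute striking figures quoted in the table above.
landed = {"Dustin Poirier": 5.59, "Conor McGregor": 5.32}
absorbed = {"Dustin Poirier": 4.17, "Conor McGregor": 4.54}

# Net strikes per minute: strikes landed minus strikes absorbed.
differential = {name: round(landed[name] - absorbed[name], 2) for name in landed}

# Rough projection over a full 25 minute fight, assuming constant averages.
projected_25_min = {name: round(diff * 25, 1) for name, diff in differential.items()}

for name in landed:
    print(name, differential[name], projected_25_min[name])
```

On these numbers Poirier nets 1.42 strikes per minute against 0.78 for McGregor, a gap that would widen the longer the fight lasts.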

# Visualize Fighter Comparisons

In [None]:
rds_por = pd.pivot_table(data_por, index = ['no_of_rounds'], columns = ['finish_round'], 
                            values = ['Win/Loss'], aggfunc = np.sum)

rds_mac = pd.pivot_table(data_mac, index = ['no_of_rounds'], columns = ['finish_round'],
                            values = ['Win/Loss'], aggfunc = np.sum, fill_value = 0)

In [None]:
three_rds_por = rds_por.loc[3]
three_rds_por = pd.DataFrame(three_rds_por)
three_rds_por = three_rds_por.dropna()

five_rds_por = rds_por.loc[5]
five_rds_por = pd.DataFrame(five_rds_por)

In [None]:
three_rds_mac = rds_mac.loc[3]
three_rds_mac = pd.DataFrame(three_rds_mac)
three_rds_mac = three_rds_mac.dropna()

five_rds_mac = rds_mac.loc[5]
five_rds_mac = pd.DataFrame(five_rds_mac)

In [None]:
three_rds_por['label'] = np.array(['Round 1 Finish', 'Round 2 Finish', 'Round 3 Finish'])
three_rds_por['y'] = np.array([6, 2, 5])

five_rds_por['label'] = np.array(['Round 1 Finish', 'Round 2 Finish', 'Round 3 Finish',
                                  'Round 4 Finish', 'Round 5 Finish'])
five_rds_por['y'] = np.array([0, 1, 1, 1, 2])

In [None]:
three_rds_mac['index'] = np.array([0, 1, 2, 3, 4])
three_rds_mac = three_rds_mac[three_rds_mac['index'] < 3]
three_rds_mac = three_rds_mac.drop(['index'], axis = 1)
three_rds_mac['label'] = np.array(['Round 1 Finish', 'Round 2 Finish', 'Round 3 Finish'])
three_rds_mac['y'] = np.array([2, 0, 1])

five_rds_mac['label'] = np.array(['Round 1 Finish', 'Round 2 Finish', 'Round 3 Finish',
                                  'Round 4 Finish', 'Round 5 Finish'])
five_rds_mac['y'] = np.array([3, 3, 0, 0, 1])

In [None]:
def autopct_format(values):
    def my_format(pct):
        total = sum(values)
        val = int(round(pct*total/100.0))
        return '{v:d}'.format(v=val)
    return my_format

In [None]:
explode = np.array([0.1, 0.1, 0.1])
colm = np.array(['#FFA500', '#FFFFF0', '#15b01A'])
colp = np.array(['#E50000', '#FFFFF0', '#0000FF'])

fig, (ax1, ax2) = plt.subplots(1,2, figsize = (10,5))
fig.suptitle('3 Round Fight Comparison')

ax1.pie(three_rds_por['y'], labels = three_rds_por['label'], explode = explode,
        autopct = autopct_format(three_rds_por['y']), colors = colp, shadow=True)
ax1.set_title('Dustin Poirier 3 Round Fights')

ax2.pie(three_rds_mac['y'], labels = three_rds_mac['label'], explode = explode,
        autopct = autopct_format(three_rds_mac['y']), colors = colm, shadow=True)
ax2.set_title('Conor McGregor 3 Round Fights')

These charts are built from the Win/Loss pivot tables, so they count wins only. We can see that Poirier has as many wins in 3 round bouts (13) as McGregor has total bouts in the UFC. Poirier has gone the full 3 rounds 5 times in those bouts, while McGregor has gone the distance only once. Both fighters also have first round finishes in 3 round bouts.
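The label and y columns above were typed in by hand from the pivot tables. A sketch of how the same counts could be derived directly from the data is shown below. It runs on made-up rows, and `wins_by_round` is a hypothetical helper name; it also assumes finish_round holds whole round numbers.

```python
import pandas as pd

# Made-up rows with the same columns as data_por / data_mac.
toy = pd.DataFrame({
    "no_of_rounds": [3, 3, 3, 5, 5],
    "finish_round": [1, 3, 3, 2, 5],
    "Win/Loss":     [1, 1, 0, 1, 1],
})

def wins_by_round(df, scheduled_rounds):
    """Count wins per finishing round for bouts scheduled for `scheduled_rounds` rounds."""
    bouts = df[df["no_of_rounds"] == scheduled_rounds]
    wins = bouts.groupby("finish_round")["Win/Loss"].sum()
    # Reindex so rounds without a win show up as 0 instead of going missing.
    wins = wins.reindex(range(1, scheduled_rounds + 1), fill_value=0)
    return pd.DataFrame({
        "label": [f"Round {r} Finish" for r in wins.index],
        "y": wins.to_numpy(),
    })

three_rds_toy = wins_by_round(toy, 3)
five_rds_toy = wins_by_round(toy, 5)
```

The resulting frames have the same label and y columns that the pie charts read, so they could be passed to `ax.pie` in the same way.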

In [None]:
colm = np.array(['#FFA500', '#FFFFF0', '#15b01A', '#FFD700', '#AAFF32'])
colp = np.array(['#FFFF14', '#FFFFF0', '#0000FF', '#E50000', '#00FFFF'])
explode = np.array([0.1, 0.1, 0.1, 0.1, 0.1])

fig2, (ax1, ax2) = plt.subplots(1,2, figsize = (10,5))
fig2.suptitle('5 Round Fight Comparison')

ax1.pie(five_rds_por['y'], labels = five_rds_por['label'], explode = explode,
        autopct = autopct_format(five_rds_por['y']), colors = colp, shadow=True)
ax1.set_title('Dustin Poirier 5 Round Fights')

ax2.pie(five_rds_mac['y'], labels = five_rds_mac['label'], explode = explode,
        autopct = autopct_format(five_rds_mac['y']), colors = colm, shadow=True)
ax2.set_title('Conor McGregor 5 Round Fights')

Since the upcoming bout is a 5 rounder, the 5 round fight comparison is more important to analyze in this case. Intriguingly, McGregor has fought in more 5 round bouts than Poirier even though Poirier has more overall UFC experience. Another popular narrative among UFC experts is that Poirier will have an edge in rounds 4 and 5 due to McGregor's lack of cardio. However, according to the data McGregor actually has more experience training for full 5 round bouts in the UFC. I believe McGregor hasn't been able to properly showcase his cardio since he tends to finish his fights in earlier rounds, and it seems reasonable to assume that McGregor should have decent cardio since he has prepared for so many 5 round bouts in the past. It will be interesting to see if Poirier will have the advantage if the fight reaches the championship rounds.

In [None]:
data_por = pd.read_csv('../input/por-vs-mac/data_por.csv')
data_mac = pd.read_csv('../input/por-vs-mac/data_mac.csv')

In [None]:
finish_por = pd.pivot_table(data_por, index = ['finish'], columns = ['finish_details'],
                            values = ['Win/Loss'], aggfunc = np.sum, fill_value = 0)

finish_mac = pd.pivot_table(data_mac, index = ['finish'], columns = ['finish_details'],
                            values = ['Win/Loss'], aggfunc = np.sum, fill_value = 0)

In [None]:
finish_mac['y'] = np.array([8, 1, 0, 1])
finish_mac['label'] = np.array(['KO/TKO Punches', 'M-DEC', 'SUB', 'U-DEC'])

In [None]:
finish_por = finish_por.stack()
finish_por['index'] = np.array([0,0,0,3,3,0,0,0,3,0,0,0,3,3,0,0,0,0,0,0,3,0,0,0])
finish_por = finish_por[finish_por['index']>2]
finish_por = finish_por.drop(['index'], axis = 1)
finish_por['y'] = np.array([1,8,1,1,2,6])
finish_por['label'] = np.array(['KO/TKO Injury', 'KO/TKO Punches', 'M-DEC', 'SUB Armbar',
                              "SUB D'Arce Choke", 'U-DEC'])

In [None]:
explodep = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
explodem = np.array([0.1, 0.1, 0.1, 0.1])
colp = np.array(['#FFFF14', '#FFFFF0', '#0000FF', '#E50000', '#00FFFF', '#FF81C0'])
colm = np.array(['#FFA500', '#FFFFF0', '#FFD700', '#15B01A'])

fig3, (ax1, ax2) = plt.subplots(1,2, figsize = (10,5))
fig3.suptitle('Finish Comparison')

ax1.pie(finish_por['y'], labels = finish_por['label'], explode = explodep,
        autopct = autopct_format(finish_por['y']), colors = colp, shadow=True)
ax1.set_title('Dustin Poirier Finishes')

ax2.pie(finish_mac['y'], labels = finish_mac['label'], explode = explodem,
        autopct = autopct_format(finish_mac['y']), colors = colm, shadow=True)
ax2.set_title('Conor McGregor Finishes')

Here we see the different types of finishes for each fighter's UFC career wins. The difference between these two fighters is clear. McGregor has won almost all of his fights via KO/TKO due to punches, while Poirier has a mixture of finishes including multiple knockout, decision, and submission victories. In their second head to head bout, many criticized McGregor for applying a boxing heavy approach and praised Poirier for showcasing a plethora of mixed martial arts disciplines. The data appears to support this narrative, and it will be interesting to see if McGregor approaches the trilogy fight with a more well-rounded approach.
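The finish counts above were also filled in by hand after stacking the pivot table. As a sketch of an alternative, the wins could be grouped by finish and finish_details directly. The rows below are made up, `finish_counts` is a hypothetical helper name, and the code assumes that decisions carry no finish detail.

```python
import pandas as pd

# Made-up rows with the same columns as data_por / data_mac.
toy = pd.DataFrame({
    "finish": ["KO/TKO", "KO/TKO", "SUB", "U-DEC", "KO/TKO"],
    "finish_details": ["Punches", "Punches", "Armbar", None, "Punches"],
    "Win/Loss": [1, 1, 1, 1, 0],
})

def finish_counts(df):
    """Count wins per (finish, finish_details) pair and build pie chart labels."""
    wins = df[df["Win/Loss"] == 1].copy()
    # Missing details become an empty string so groupby does not drop the row.
    wins["finish_details"] = wins["finish_details"].fillna("")
    counts = wins.groupby(["finish", "finish_details"]).size().reset_index(name="y")
    counts["label"] = (counts["finish"] + " " + counts["finish_details"]).str.strip()
    return counts[["label", "y"]]

finish_toy = finish_counts(toy)
```

Only combinations that actually occur among the wins appear in the result, which removes the need for the manual index filter.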

# Conclusion

Dustin Poirier vs Conor McGregor III is likely to be one of the biggest UFC fights of all time, and many believe the fighters are evenly matched. This analysis has revealed potential advantages for both fighters, and the data suggests the bout will be closely contested. Poirier clearly has many avenues to victory, while McGregor has had a very high rate of success with strikes from distance. I believe the fight will come down to whether McGregor can maintain his output in the later rounds and whether Poirier can get McGregor to the ground. Both fighters have a relatively high likelihood of winning by knockout or decision, but Poirier could also get it done via submission.

*P.S.*

Do you believe there will be a fourth fight between Dustin Poirier and Conor McGregor, given that this one ended prematurely due to an injury? If so, comment if you would like me to do another analysis before their next fight or any other upcoming UFC fights.