# UFC Fight Analysis: Exploring Striking Trends Across Weight Classes

## Context and Purpose
The Ultimate Fighting Championship (UFC) has evolved substantially since its inception and is now a leading platform for mixed martial arts (MMA). In this project, I examine striking patterns in UFC fights to see how fighting styles and strategies vary across weight classes and over time. The analysis is organized around four questions:

1. Which weight class strikes the most on average, and what is the variability per weight class?
2. Which fighters get struck the most on average?
3. How many strikes does it take to knock fighters out, on average per weight class?
4. Do fighters strike more now than they did in the past?

## Methodology
Using a dataset of UFC fights, I analyzed the striking data fight by fight. The dataset includes detailed fight records, such as the number of significant strikes landed, weight class, fight outcome, and fighter statistics. My approach involved data wrangling, exploratory data analysis, and visualization using Python libraries such as pandas, NumPy, and Plotly. Each question is addressed with a specific aggregation and chart that illustrates the corresponding striking trend.

## Key Findings
A summary of the main findings:

- **Striking Frequency by Weight Class**: Lighter weight classes tend to land more significant strikes per fight than heavier weight classes, indicating a negative relationship between striking volume and weight. This suggests a more agile and fast-paced fighting style in the lighter divisions.
- **Strikes and Fighter Records**: Fighters with weaker win-loss records tend to be struck more often, implying that more skilled or elite fighters typically absorb fewer strikes. This trend highlights the importance of defense and evasiveness at higher competition levels.
- **Knockouts and Weight Classes**: It generally takes fewer strikes to knock out heavier fighters, likely due to the increased power of strikes in these weight classes. This aligns with the understanding that heavier fighters possess greater knockout power.
- **Striking Trends Over Time**: There has been a noticeable increase in the number of strikes landed per UFC fight over the years. This could be attributed to various factors, including the introduction of female fighters, the addition of smaller weight classes, and the evolution of rules and safety measures in the sport.

Together, these results give an overview of striking trends in the UFC that may interest fans, analysts, and practitioners of MMA.
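The negative relationship between weight and striking volume described in the findings can be checked with a rank correlation. The sketch below is a minimal illustration, not the analysis used in this project: the per-class averages are made-up numbers, and `weight_rank` is a hypothetical lightest-to-heaviest ordering introduced only for the example.

```python
import pandas as pd

# Toy per-class averages (made-up numbers, NOT taken from ufc.csv).
toy = pd.DataFrame({
    "Weight_Class": ["Flyweight", "Lightweight", "Middleweight", "Heavyweight"],
    "avg_strikes": [80.0, 70.0, 60.0, 50.0],
})

# Hypothetical ordering from lightest (0) to heaviest (3).
weight_rank = {"Flyweight": 0, "Lightweight": 1, "Middleweight": 2, "Heavyweight": 3}
toy["weight_rank"] = toy["Weight_Class"].map(weight_rank)

# Spearman's rho is the Pearson correlation computed on the ranks.
rho = toy["weight_rank"].rank().corr(toy["avg_strikes"].rank())
print(rho)
```

A value near -1 means the averages fall steadily as the classes get heavier; a value near 0 would mean there is no monotonic trend.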

In [56]:
#Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import seaborn as sns
import warnings
import colorsys
import plotly.express as px

#Read in data and check shape
df = pd.read_csv('ufc.csv')
print(df.shape)
display(df.head())

#pd.set_option('display.max_columns', None)
#pd.set_option('display.max_rows', None)

(7417, 18)


| | Location | Fighter 1 | Fighter 2 | Fighter_1_KD | Fighter_2_KD | Fighter_1_STR | Fighter_2_STR | Fighter_1_TD | Fighter_2_TD | Fighter_1_SUB | Fighter_2_SUB | Weight_Class | Method | Round | Time | Event Name | Date | Winner |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Austin, Texas, USA | Arman Tsarukyan | Beneil Dariush | 1.0 | 0.0 | 8.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | Lightweight | KO/TKO Punch | 1.0 | 1:04 | UFC Fight Night | 2-Dec-23 | Arman Tsarukyan |
| 1 | Austin, Texas, USA | Jalin Turner | Bobby Green | 1.0 | 0.0 | 33.0 | 15.0 | 0.0 | 0.0 | 0.0 | 0.0 | Lightweight | KO/TKO Punches | 1.0 | 2:49 | UFC Fight Night | 2-Dec-23 | Jalin Turner |
| 2 | Austin, Texas, USA | Deiveson Figueiredo | Rob Font | 0.0 | 0.0 | 45.0 | 46.0 | 4.0 | 0.0 | 0.0 | 0.0 | Bantamweight | U-DEC | 3.0 | 5:00 | UFC Fight Night | 2-Dec-23 | Deiveson Figueiredo |
| 3 | Austin, Texas, USA | Sean Brady | Kelvin Gastelum | 0.0 | 0.0 | 14.0 | 18.0 | 5.0 | 0.0 | 3.0 | 0.0 | Welterweight | SUB Kimura | 3.0 | 1:43 | UFC Fight Night | 2-Dec-23 | Sean Brady |
| 4 | Austin, Texas, USA | Joaquim Silva | Clay Guida | 0.0 | 0.0 | 46.0 | 43.0 | 2.0 | 2.0 | 2.0 | 0.0 | Lightweight | U-DEC | 3.0 | 5:00 | UFC Fight Night | 2-Dec-23 | Joaquim Silva |


In [57]:
# Drop rows that are missing the second fighter or the event date
df.dropna(subset=['Fighter 2', 'Date'], inplace=True)

#Check for null values, data types, and duplicates
print(df.isnull().sum())
print()
print(df.dtypes)
print()
print(df.describe())
print()
print(df.duplicated().sum())

# Find and display duplicates
duplicates = df[df.duplicated()]
display(duplicates)

Location         0
Fighter 1        0
Fighter 2        0
Fighter_1_KD     0
Fighter_2_KD     0
Fighter_1_STR    0
Fighter_2_STR    0
Fighter_1_TD     0
Fighter_2_TD     0
Fighter_1_SUB    0
Fighter_2_SUB    0
Weight_Class     0
Method           0
Round            0
Time             0
Event Name       0
Date             0
Winner           0
dtype: int64

Location          object
Fighter 1         object
Fighter 2         object
Fighter_1_KD     float64
Fighter_2_KD     float64
Fighter_1_STR    float64
Fighter_2_STR    float64
Fighter_1_TD     float64
Fighter_2_TD     float64
Fighter_1_SUB    float64
Fighter_2_SUB    float64
Weight_Class      object
Method            object
Round            float64
Time              object
Event Name        object
Date              object
Winner            object
dtype: object

       Fighter_1_KD  Fighter_2_KD  Fighter_1_STR  Fighter_2_STR  Fighter_1_TD  \
count   7412.000000   7412.000000    7412.000000    7412.000000   7412.000000   
mean       0.3669

Unnamed: 0,Location,Fighter 1,Fighter 2,Fighter_1_KD,Fighter_2_KD,Fighter_1_STR,Fighter_2_STR,Fighter_1_TD,Fighter_2_TD,Fighter_1_SUB,Fighter_2_SUB,Weight_Class,Method,Round,Time,Event Name,Date,Winner


## 1. Which weight class strikes the most on average, and what is the variability per weight class?

In [88]:
def get_gradient_colors(dataframe, value_column, color_start=(1, 0, 0), color_end=None):
    """
    Generate gradient colors for a dataframe column.
    
    :param dataframe: Pandas DataFrame containing the data.
    :param value_column: The name of the column to base the gradient on.
    :param color_start: The starting color for the gradient as an HSV tuple.
    :param color_end: The ending color for the gradient as an HSV tuple. If None, a single-color gradient is used.
    :return: List of gradient colors.
    """
    min_value = dataframe[value_column].min()
    max_value = dataframe[value_column].max()

    def interpolate_color(value, color1, color2=None):
        """ Interpolate between two colors in HSV space """
        if color2 is None:
            color2 = color1
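        # Guard against a zero range (all values equal), which would divide by zero
        value_range = max_value - min_value
        normalized = (value - min_value) / value_range if value_range else 0.0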
        interpolated_color = [color1[i] + (color2[i] - color1[i]) * normalized for i in range(3)]
        return colorsys.hsv_to_rgb(*interpolated_color)
    
    gradient_colors = [
        interpolate_color(value, color_start, color_end) for value in dataframe[value_column]
    ]
    return ['rgb({:.0f}, {:.0f}, {:.0f})'.format(r * 255, g * 255, b * 255) for r, g, b in gradient_colors]

# Combining the strikes data for each match
df['Total_STR'] = df['Fighter_1_STR'] + df['Fighter_2_STR']

# Grouping by weight class and calculating the average significant strikes
average_strikes_per_class = df.groupby('Weight_Class')['Total_STR'].mean().sort_values(ascending=True).reset_index()

# Adding a column with the gradient colors using the get_gradient_colors function
average_strikes_per_class['Color'] = get_gradient_colors(average_strikes_per_class, 'Total_STR', color_start=(0, 0.5, 1), color_end=(0, 1, 1))

# Creating a bar plot with Plotly, setting the colors for each bar
fig = px.bar(average_strikes_per_class,
             x='Total_STR',
             y='Weight_Class',
             orientation='h',
             color='Color',
             color_discrete_map="identity",
             labels={'Total_STR': 'Average Significant Strikes', 'Weight_Class': 'Weight Class'},
             title='Average Significant Strikes Landed per Weight Class')


# Customize the layout
fig.update_layout(showlegend=False, template='plotly_dark', title_x=0.5, title_font_size=20, xaxis=dict(title='Average Significant Strikes'),
    yaxis=dict(title='Weight Class'))

# Show the plot
fig.show()

# Display the dataframe
average_strikes_per_class[['Weight_Class','Total_STR']].sort_values(by='Total_STR', ascending=False)





| | Weight_Class | Total_STR |
|---|---|---|
| 14 | Women's Flyweight | 103.051887 |
| 13 | Women's Strawweight | 102.189474 |
| 12 | Women's Bantamweight | 91.608247 |
| 11 | Women's Featherweight | 89.758621 |
| 10 | Catch Weight | 88.169231 |
| 9 | Featherweight | 84.83662 |
| 8 | Bantamweight | 80.477093 |
| 7 | Flyweight | 74.75 |
| 6 | Welterweight | 70.123496 |
| 5 | Lightweight | 69.129658 |


!['Average Significant Strikes'](average_strikes.png)
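Total strikes per fight depends on how long the fight lasted, so a per-minute rate is one way to compare pace across classes. The sketch below derives elapsed time from the `Round` and `Time` columns, using three rows copied from the `df.head()` output above. It assumes `Time` is the clock within the final round and that every completed round lasted five minutes, which may not hold for every event in the dataset. `fight_minutes`, `Minutes`, and `STR_per_min` are names introduced here for illustration and are not part of the original analysis.

```python
import pandas as pd

def fight_minutes(round_number, clock, round_length=5.0):
    """Elapsed fight time in minutes, assuming each completed round lasted round_length minutes."""
    minutes, seconds = clock.split(":")
    return (float(round_number) - 1) * round_length + int(minutes) + int(seconds) / 60

# Three rows copied from the df.head() output shown earlier.
sample = pd.DataFrame({
    "Weight_Class": ["Lightweight", "Bantamweight", "Welterweight"],
    "Fighter_1_STR": [8.0, 45.0, 14.0],
    "Fighter_2_STR": [2.0, 46.0, 18.0],
    "Round": [1.0, 3.0, 3.0],
    "Time": ["1:04", "5:00", "1:43"],
})

# Elapsed minutes per fight, then combined strikes landed per minute.
sample["Minutes"] = [fight_minutes(r, t) for r, t in zip(sample["Round"], sample["Time"])]
sample["STR_per_min"] = (sample["Fighter_1_STR"] + sample["Fighter_2_STR"]) / sample["Minutes"]
print(sample[["Weight_Class", "Minutes", "STR_per_min"]])
```

Averaging `STR_per_min` by `Weight_Class` instead of `Total_STR` would separate pace from fight length, since early finishes naturally produce low totals.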


In [86]:
# Calculating the standard deviation of significant strikes within each weight class
std_strikes_per_class = df.groupby('Weight_Class')['Total_STR'].std().sort_values(ascending=True).reset_index().dropna()

#Creating a color column using the get_gradient_colors function
std_strikes_per_class['Color'] = get_gradient_colors(std_strikes_per_class, 'Total_STR', color_start=(0.14, 0.5, 1), color_end=(0.14, 1, 1))

# Creating a bar plot with Plotly, setting the colors for each bar
fig = px.bar(std_strikes_per_class,
             x='Total_STR',
             y='Weight_Class',
             orientation='h',
             color='Color',
             color_discrete_map="identity",
             labels={'Total_STR': 'Standard Deviation of Significant Strikes', 'Weight_Class': 'Weight Class'},
             title='Standard Deviation of Significant Strikes Landed per Weight Class')


# Customize the layout
fig.update_layout(showlegend=False, template='plotly_dark', title_x=0.5, title_font_size=20, xaxis=dict(title='Standard Deviation of Significant Strikes'),
    yaxis=dict(title='Weight Class'))

# Show the plot
fig.show()

# Display the dataframe
std_strikes_per_class[['Weight_Class','Total_STR']].sort_values(by='Total_STR', ascending=False)





| | Weight_Class | Total_STR |
|---|---|---|
| 13 | Women's Strawweight | 66.353879 |
| 12 | Women's Flyweight | 66.246646 |
| 11 | Catch Weight | 65.994453 |
| 10 | Featherweight | 64.803348 |
| 9 | Women's Featherweight | 63.560801 |
| 8 | Women's Bantamweight | 61.041404 |
| 7 | Bantamweight | 58.979909 |
| 6 | Welterweight | 57.310431 |
| 5 | Lightweight | 53.440447 |
| 4 | Flyweight | 52.302561 |


Note: Super Heavyweight has a standard deviation of NaN (not enough data for a reliable measure), so `dropna()` removes it from the chart and table.

!['Standard Deviation of Significant Strikes'](std_plot.png)
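Standard deviation tends to grow with the mean, so classes that land more strikes will often look more variable as well. One way to compare variability on an equal footing is the coefficient of variation (standard deviation divided by mean). The sketch below uses made-up numbers and two hypothetical classes; the `cv` column is introduced here and is not part of the original analysis.

```python
import pandas as pd

# Toy fights (made-up numbers) for two hypothetical classes with different typical volume.
toy = pd.DataFrame({
    "Weight_Class": ["Class A", "Class A", "Class A", "Class B", "Class B", "Class B"],
    "Total_STR": [90.0, 100.0, 110.0, 45.0, 50.0, 55.0],
})

# Mean, sample standard deviation, and number of fights per class.
summary = toy.groupby("Weight_Class")["Total_STR"].agg(["mean", "std", "count"])

# Coefficient of variation: spread relative to the class average.
summary["cv"] = summary["std"] / summary["mean"]
print(summary)
```

In this toy data Class A has twice the standard deviation of Class B but the same relative spread. The `count` column also shows which classes have too few fights for the standard deviation to be meaningful.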


## 2. Which fighters get struck the most on average?

In [133]:
# For each fighter, calculate the strikes they absorbed from their opponent
fighter_1_absorbed = df[['Fighter 1', 'Fighter_2_STR']].rename(columns={'Fighter 1': 'Fighter', 'Fighter_2_STR': 'Absorbed_STR'})
fighter_2_absorbed = df[['Fighter 2', 'Fighter_1_STR']].rename(columns={'Fighter 2': 'Fighter', 'Fighter_1_STR': 'Absorbed_STR'})
combined_absorbed_data = pd.concat([fighter_1_absorbed, fighter_2_absorbed], ignore_index=True)

# Calculating the average significant strikes absorbed for each fighter
average_absorbed = combined_absorbed_data.groupby('Fighter')['Absorbed_STR'].mean()

# Creating columns for the fight record (Wins-Losses-Draws)
df['Fighter 1 Wins'] = df['Winner'] == df['Fighter 1']
df['Fighter 2 Wins'] = df['Winner'] == df['Fighter 2']
df['Draw'] = df['Method'] == 'Draw'

# Aggregating per corner: 'sum' of the fighter's own win flag gives wins, while 'count'
# of the opponent's win flag gives total fights in that corner (stored as 'Losses' for now)
fighter_1_record = df.groupby('Fighter 1').agg({'Fighter 1 Wins': 'sum', 'Fighter 2 Wins': 'count', 'Draw': 'sum'}).rename(columns={'Fighter 1 Wins': 'Wins', 'Fighter 2 Wins': 'Losses', 'Draw': 'Draws'})
fighter_2_record = df.groupby('Fighter 2').agg({'Fighter 2 Wins': 'sum', 'Fighter 1 Wins': 'count', 'Draw': 'sum'}).rename(columns={'Fighter 2 Wins': 'Wins', 'Fighter 1 Wins': 'Losses', 'Draw': 'Draws'})
fighter_record = fighter_1_record.add(fighter_2_record, fill_value=0).astype(int)

# Converting total fights into losses by subtracting wins and draws
# (a fight with no winner whose Method is not exactly 'Draw' is counted as a loss for both fighters)
fighter_record['Losses'] = fighter_record['Losses'] - fighter_record['Wins'] - fighter_record['Draws']

# Merging each fighter's average absorbed strikes with their win-loss-draw record
fighter_details = average_absorbed.reset_index().merge(fighter_record, left_on='Fighter', right_index=True)

# Sorting fighters from most to fewest strikes absorbed on average
fighter_details = fighter_details.sort_values(by='Absorbed_STR', ascending=False)

# Gathering the top 20 fighters with the most significant strikes absorbed (for the viz)
top_20_fighters_details = fighter_details.head(20).sort_values(by='Absorbed_STR', ascending=True)

top_20_fighters_details['Color'] = get_gradient_colors(top_20_fighters_details, 'Absorbed_STR', color_start=(0, 1, 0), color_end=(0, 1, 1))

# Creating a bar plot with Plotly, setting the colors for each bar
fig = px.bar(top_20_fighters_details,
             x='Absorbed_STR',
             y='Fighter',
             orientation='h',
             color='Color',
             color_discrete_map="identity",
             labels={'Absorbed_STR': 'Average Significant Strikes Absorbed', 'Fighter': 'Fighter'},
             title='Top 20 UFC Fighters with the Most Significant Strikes Absorbed on Average')


# Customize the layout
fig.update_layout(showlegend=False, template='plotly_white', title_x=0.5, title_font_size=20, xaxis=dict(title='Average Significant Strikes Absorbed'),
    yaxis=dict(title='Fighter'), height=700)

# Show the plot
fig.show()

fighter_details





| | Fighter | Absorbed_STR | Wins | Losses | Draws |
|---|---|---|---|---|---|
| 1396 | Landon Quinones | 171.0 | 0 | 1 | 0 |
| 1340 | Kevin Borjas | 156.0 | 0 | 1 | 0 |
| 1371 | Kris Moutinho | 129.0 | 0 | 2 | 0 |
| 2001 | Rosi Sexton | 125.0 | 0 | 2 | 0 |
| 1832 | Peggy Morgan | 123.0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... |
| 153 | Anthony Macias | 0.0 | 0 | 3 | 0 |
| 1733 | Neil Grove | 0.0 | 0 | 1 | 0 |
| 734 | Ernie Verdicia | 0.0 | 0 | 1 | 0 |
| 1520 | Marcus Bossett | 0.0 | 0 | 1 | 0 |


![Top 20 UFC Fighters with the Most Significant Strikes Absorbed](top20.png)
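The top of the ranking above is made up of fighters with only one or two recorded fights, where a single long fight sets the whole average. One option is to require a minimum number of fights before ranking. The sketch below uses made-up data in the same long format as `combined_absorbed_data`; the fighter names, `MIN_FIGHTS`, `Avg_Absorbed`, and `Fights` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Toy long-format data (made-up): one row per fighter per fight.
toy = pd.DataFrame({
    "Fighter": ["Fighter A", "Fighter B", "Fighter B", "Fighter B",
                "Fighter C", "Fighter C", "Fighter C"],
    "Absorbed_STR": [171.0, 60.0, 80.0, 70.0, 30.0, 40.0, 50.0],
})

MIN_FIGHTS = 3  # hypothetical threshold; the right value is a judgment call

# Average strikes absorbed and number of fights per fighter.
per_fighter = toy.groupby("Fighter")["Absorbed_STR"].agg(Avg_Absorbed="mean", Fights="count")

# Keep only fighters with enough fights, then rank by average absorbed.
qualified = (
    per_fighter[per_fighter["Fights"] >= MIN_FIGHTS]
    .sort_values("Avg_Absorbed", ascending=False)
)
print(qualified)
```

Here the one-fight outlier drops out and the ranking reflects fighters with a longer record, which makes the comparison with win-loss records less sensitive to debut losses.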

## 3. How many strikes does it take to knock fighters out, on average per weight class?

In [132]:
# Ensuring that the weight classes are in the correct order
weight_class_order = [
    "Women's Strawweight",
    "Women's Flyweight", 
    "Flyweight",
    "Women's Bantamweight",  
    "Bantamweight",
    "Women's Featherweight", 
    "Featherweight", 
    "Lightweight", 
    "Welterweight", 
    "Middleweight", 
    "Light Heavyweight", 
    "Heavyweight", 
    "Super Heavyweight",
    "Catch Weight",
    "Open Weight"
]

# Filtering dataset for fights that ended in a knockout
ko_fights = df[df['Method'].str.contains('KO/TKO', na=False)].copy()

# For knockouts, use the strikes landed by the winner up to the stoppage
ko_fights['Total_Strikes_Landed'] = np.where(ko_fights['Winner'] == ko_fights['Fighter 1'], ko_fights['Fighter_1_STR'], ko_fights['Fighter_2_STR'])

# Create a numerical column for weight classes
weight_class_mapping = {wc: i for i, wc in enumerate(weight_class_order)}
ko_fights['Weight_Class_Num'] = ko_fights['Weight_Class'].map(weight_class_mapping)

# Generate gradient colors based on the weight classes
ko_fights['Color'] = get_gradient_colors(ko_fights, 'Weight_Class_Num', color_start=(0, 0, 1), color_end=(0, 1, 1))  # White to red gradient (HSV)

# Creating a box plot with Plotly
fig = px.box(ko_fights, 
             x='Weight_Class', 
             y='Total_Strikes_Landed', 
             color='Color',
             color_discrete_map="identity",  # Use the colors as they are in the gradient
             category_orders={'Weight_Class': weight_class_order})  # Ordering the weight classes

# Customizing the layout
fig.update_layout(title='Distribution of Strikes Leading to Knockouts in Each Weight Class',
                  xaxis_title='Weight Class',
                  yaxis_title='Total Strikes Landed (for KO/TKO)', template='plotly_dark', title_x=0.5, height=700,
                  #boxmode='group',  # Group boxes of the same location
                  boxgap=0.00001  # Smaller gap between boxes for thicker appearance
)

# Rotating the x-axis labels for better readability
fig.update_xaxes(tickangle=45)
fig.update_traces(width=0.5)

# Show the plot
fig.show()





![Distribution of Strikes Leading to Knockouts in Each Weight Class](weight_class_knockouts.png)
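The box plot shows the distributions, but the question asks for an average per weight class. A small summary table can complement it. The sketch below uses made-up numbers with the same column names as `ko_fights`; it reports the median alongside the mean because a few long fights can pull the mean upward.

```python
import pandas as pd

# Toy KO/TKO fights (made-up numbers, NOT taken from ufc.csv).
toy_ko = pd.DataFrame({
    "Weight_Class": ["Flyweight", "Flyweight", "Flyweight",
                     "Heavyweight", "Heavyweight", "Heavyweight"],
    "Total_Strikes_Landed": [30.0, 40.0, 80.0, 10.0, 20.0, 60.0],
})

# Number of knockouts, mean, and median winner strikes per class.
ko_summary = (
    toy_ko.groupby("Weight_Class")["Total_Strikes_Landed"]
    .agg(["count", "mean", "median"])
)
print(ko_summary)
```

Applying the same aggregation to `ko_fights` would give one row per weight class, and the `count` column indicates how many knockouts each average rests on.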


## 4. Do fighters strike more now than they did in the past?

In [27]:
# Convert 'Date' to datetime and extract the year
df['Date'] = pd.to_datetime(df['Date'])
df['Year'] = df['Date'].dt.year
df['Total_Strikes_Landed'] = df['Fighter_1_STR'] + df['Fighter_2_STR']

# Calculate the average total strikes per fight for each year
average_strikes = df.groupby(['Year']).agg({'Total_Strikes_Landed': 'mean'}).reset_index()

# Create the line plot
fig = px.line(average_strikes, x='Year', y='Total_Strikes_Landed',
              title='Average Strikes per Fight by Year',
              labels={'Total_Strikes_Landed': 'Average Strikes'})

# Customize with UFC colors
fig.update_layout(height=1000, width=1200, template='plotly_dark')
fig.update_traces(line=dict(color='#FF0000', width=2))

# Show the plot
fig.show()


![Strikes Over Time](strikes_over_time.png)
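An overall yearly average can rise simply because higher-volume divisions were added, even if no individual division changed. Grouping by both year and weight class helps separate the two effects. The sketch below uses made-up numbers built so that the overall average rises while the one class present in both years stays flat; it illustrates the check and says nothing about the real data.

```python
import pandas as pd

# Toy fights (made-up): a high-volume class appears only in the later year.
toy = pd.DataFrame({
    "Year": [2010, 2010, 2020, 2020, 2020, 2020],
    "Weight_Class": ["Heavyweight", "Heavyweight", "Heavyweight", "Heavyweight",
                     "Women's Strawweight", "Women's Strawweight"],
    "Total_Strikes_Landed": [50.0, 60.0, 50.0, 60.0, 100.0, 110.0],
})

# Overall average per year (what the line plot above shows).
overall = toy.groupby("Year")["Total_Strikes_Landed"].mean()

# Average per year within each weight class, one column per class.
by_class = (
    toy.groupby(["Year", "Weight_Class"])["Total_Strikes_Landed"]
    .mean()
    .unstack("Weight_Class")
)
print(overall)
print(by_class)
```

If the per-class lines in the real data are flat while the overall line rises, the increase is mostly a change in the mix of divisions; if the per-class lines rise too, fighters within a division are landing more.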


Thanks for reading!