In [12]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os

In [13]:
master_dataframe = pd.read_csv('cleaned_master_data.csv')
print(master_dataframe.columns.tolist())


['date', 'result', 'fighter', 'opponent', 'division', 'stance', 'dob', 'method', 'total_comp_time', 'round', 'time', 'referee', 'time_format', 'reach', 'height', 'age', 'knockdowns', 'sub_attempts', 'reversals', 'control', 'takedowns_landed', 'takedowns_attempts', 'sig_strikes_landed', 'sig_strikes_attempts', 'total_strikes_landed', 'total_strikes_attempts', 'head_strikes_landed', 'head_strikes_attempts', 'body_strikes_landed', 'body_strikes_attempts', 'leg_strikes_landed', 'leg_strikes_attempts', 'distance_strikes_landed', 'distance_strikes_attempts', 'clinch_strikes_landed', 'clinch_strikes_attempts', 'ground_strikes_landed', 'ground_strikes_attempts']


## 1. Analyze High-Risk Outcomes (KO/TKO and Knockdowns)

In [14]:
# Select rows whose method is a KO or TKO finish ('KO' alone also matches 'TKO')
ko_tko = master_dataframe[master_dataframe['method'].str.contains('KO|TKO', na=False)]

# Count KO/TKO rows and total rows (a row may be one fighter's record of a bout, not a whole bout)
ko_tko_count = ko_tko['method'].count()
total_fights = master_dataframe.shape[0]

print(f"KO/TKO Outcomes: {ko_tko_count}")
print(f"Percentage of Fights Ending in KO/TKO: {ko_tko_count / total_fights * 100:.2f}%")

# Analyze average knockdowns per fight
avg_knockdowns = master_dataframe['knockdowns'].mean()
print(f"Average Knockdowns Per Fight: {avg_knockdowns:.2f}")

KO/TKO Outcomes: 4377
Percentage of Fights Ending in KO/TKO: 32.92%
Average Knockdowns Per Fight: 0.22


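KO/TKO finishes account for 32.92% of the rows in the dataset, with a mean of 0.22 knockdowns per row. The `fighter`, `opponent`, and `result` columns suggest that each row is one fighter's record of a bout, so the 4,377 count and the knockdown average are likely per fighter appearance rather than per bout; the percentage is unaffected as long as both fighters' rows carry the same method. Nearly one-third of fights ending by strikes is a substantial share, but these figures describe how fights end, not how often fighters are injured: the dataset contains no injury or medical information.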
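If the file does hold one row per fighter per bout, per-bout figures need the rows collapsed first. The following is a minimal sketch under that assumption; the toy rows, the `per_bout` helper, and the date-plus-names bout key are all illustrative and not part of the original notebook.

```python
import pandas as pd

# Made-up stand-in for master_dataframe: two rows per bout, one per fighter.
toy = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01"],
    "fighter": ["A", "B", "C", "D"],
    "opponent": ["B", "A", "D", "C"],
    "method": ["KO/TKO", "KO/TKO", "U-DEC", "U-DEC"],
    "knockdowns": [1, 0, 0, 0],
})

def per_bout(df):
    """Collapse fighter-level rows into one row per bout (hypothetical helper)."""
    # Order-independent key: the same two names on the same date form one bout.
    pair = df[["fighter", "opponent"]].apply(lambda r: " vs ".join(sorted(r)), axis=1)
    keyed = df.assign(bout=df["date"].astype(str) + " | " + pair)
    return keyed.groupby("bout").agg(
        method=("method", "first"),        # both rows share the bout's method
        knockdowns=("knockdowns", "sum"),  # knockdowns scored by either fighter
    )

bouts = per_bout(toy)
ko_share = bouts["method"].str.contains("KO", na=False).mean() * 100
print(f"Bouts: {len(bouts)}, KO/TKO share: {ko_share:.1f}%")
print(f"Knockdowns per bout: {bouts['knockdowns'].mean():.2f}")
```

The key would mis-merge two fighters who met twice on the same date, which should not happen in practice but is worth checking on the real data.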

## 2. Correlate Head Strikes with KO/TKO Outcomes

In [15]:
# Share of each row's landed strikes that hit the head (NaN when no strikes were landed)
master_dataframe['head_strike_ratio'] = master_dataframe['head_strikes_landed'] / master_dataframe['total_strikes_landed']

# Mean head-strike ratio among KO/TKO rows (a group mean, not a correlation)
head_strike_ko = master_dataframe[master_dataframe['method'].str.contains('KO|TKO', na=False)]
avg_head_strike_ratio_ko = head_strike_ko['head_strike_ratio'].mean()

print(f"Average Head Strike Ratio in KO/TKO Fights: {avg_head_strike_ratio_ko:.2f}")


Average Head Strike Ratio in KO/TKO Fights: 0.51


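The head strike ratio averages 0.51 across KO/TKO rows, so in a typical KO/TKO fight about half of the strikes a fighter lands are to the head. This is a mean of per-row ratios, not the pooled share of all strikes landed. On its own the figure does not show that head-heavy fights are more likely to end in a KO/TKO: that would require comparing it with the ratio in fights that ended by submission or decision, which this cell does not compute.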
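The cell above reports only the KO/TKO group. A comparison against the other outcomes, plus a simple correlation with a 0/1 KO indicator, could look like the sketch below. The toy numbers are made up and `head_ratio_by_outcome` is a hypothetical helper; only the column names come from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows; the last SUB row lands no strikes at all.
toy = pd.DataFrame({
    "method": ["KO/TKO", "KO/TKO", "U-DEC", "SUB", "U-DEC"],
    "head_strikes_landed": [30, 12, 40, 0, 25],
    "total_strikes_landed": [40, 20, 100, 0, 50],
})

def head_ratio_by_outcome(df):
    """Mean head-strike ratio for KO vs non-KO rows, plus their correlation."""
    out = df.copy()
    # Treat zero landed strikes as missing so 0/0 never enters the averages.
    total = out["total_strikes_landed"].replace(0, np.nan)
    out["head_strike_ratio"] = out["head_strikes_landed"] / total
    out["is_ko"] = out["method"].str.contains("KO", na=False)
    mean_by_group = out.groupby("is_ko")["head_strike_ratio"].mean()
    # Pearson correlation with a 0/1 indicator (the point-biserial correlation).
    corr = out["head_strike_ratio"].corr(out["is_ko"].astype(float))
    return mean_by_group, corr

mean_by_group, corr = head_ratio_by_outcome(toy)
print(mean_by_group)
print(f"Correlation with KO/TKO indicator: {corr:.2f}")
```

A positive correlation on the real data would support the head-strike claim; a value near zero would suggest the 0.51 ratio is simply typical of all fights.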

## 3. Analyze Fight Duration and Outcomes

In [16]:
# Group fights by duration and outcome
avg_fight_duration = master_dataframe.groupby('method')['total_comp_time'].mean().sort_values(ascending=False)

print("Average Fight Duration by Method:")
print(avg_fight_duration)

Average Fight Duration by Method:
method
U-DEC     942.083244
S-DEC     935.950479
M-DEC     935.083333
DRAW      669.982759
SUB       375.900645
KO/TKO    350.874572
DQ        350.450000
Name: total_comp_time, dtype: float64


Fights ending in decisions (U-DEC, S-DEC, M-DEC) have the longest average durations (about 935 to 942 seconds), while KO/TKO, submission, and DQ finishes are much shorter (about 350 to 376 seconds). This is expected by construction: a decision is only rendered when a fight goes its scheduled distance, whereas a finish stops the clock early. Duration is therefore a consequence of the outcome, not evidence that longer fights are safer.
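A mean alone hides how many rows each method covers and how skewed the times are. As a small sketch on made-up rows (the values below are illustrative, not from the dataset), the same `groupby` can report count and median alongside the mean:

```python
import pandas as pd

# Made-up durations in seconds.
toy = pd.DataFrame({
    "method": ["U-DEC", "U-DEC", "KO/TKO", "KO/TKO", "KO/TKO", "SUB"],
    "total_comp_time": [900, 1500, 60, 300, 840, 400],
})

# Count, mean, and median per method, with the mean also shown in minutes.
summary = toy.groupby("method")["total_comp_time"].agg(["count", "mean", "median"])
summary["mean_minutes"] = summary["mean"] / 60
print(summary.sort_values("mean", ascending=False))
```

A median well below the mean, as for the KO/TKO rows here, points to many early finishes with a few late ones.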

## 4. Compare Safety by Division

In [17]:
# Group by division and analyze KO/TKO rates
division_safety = master_dataframe[master_dataframe['method'].str.contains('KO|TKO', na=False)]
division_ko_rate = division_safety.groupby('division')['method'].count() / master_dataframe.groupby('division').size() * 100

print("KO/TKO Rate by Division:")
print(division_ko_rate.sort_values(ascending=False))

KO/TKO Rate by Division:
division
Super Heavyweight        100.000000
Heavyweight               53.913043
Open Weight               46.073298
Light Heavyweight         43.570844
Middleweight              37.039106
Welterweight              32.458028
Women's Featherweight     30.434783
Catch Weight              28.888889
Bantamweight              28.415301
Lightweight               27.770701
Featherweight             26.357827
Women's Bantamweight      24.852071
Flyweight                 22.519084
Women's Flyweight         18.787879
Women's Strawweight       11.914894
dtype: float64


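KO/TKO rates vary widely across divisions and broadly rise with weight: Heavyweight (53.91%) and Light Heavyweight (43.57%) are near the top, while Women’s Strawweight (11.91%) is lowest. The 100% rate for Super Heavyweight should be read with caution, because the output does not show how many rows each division contains and a rate of exactly 100% usually signals a very small sample. The pattern is consistent with heavier fighters striking with more force, although the data does not measure impact directly.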
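Reporting the row count next to each rate makes small-sample divisions easy to spot. The sketch below assumes nothing beyond the `division` and `method` columns; the toy rows, the `ko_rate_table` helper, and the `min_rows` threshold are illustrative choices.

```python
import pandas as pd

# Made-up rows: one division has a single fight.
toy = pd.DataFrame({
    "division": ["Heavyweight"] * 4 + ["Flyweight"] * 4 + ["Super Heavyweight"],
    "method": ["KO/TKO", "KO/TKO", "U-DEC", "SUB",
               "U-DEC", "U-DEC", "SUB", "S-DEC",
               "KO/TKO"],
})

def ko_rate_table(df, by, min_rows=1):
    """KO/TKO rate (%) and row count per group, dropping groups below min_rows."""
    is_ko = df["method"].str.contains("KO", na=False)
    # The mean of a boolean column is the share of True values, so groups
    # with no KO/TKO rows get 0 rather than NaN.
    table = is_ko.groupby(df[by]).agg(rows="size", ko_rate="mean")
    table["ko_rate"] *= 100
    return table[table["rows"] >= min_rows].sort_values("ko_rate", ascending=False)

print(ko_rate_table(toy, "division"))              # includes the 1-row division
print(ko_rate_table(toy, "division", min_rows=2))  # drops it
```

The same helper applies unchanged to any other grouping column, such as `referee`.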

## 5. Investigate Referee Impact

In [18]:
# Average fight time by referee
referee_duration = master_dataframe.groupby('referee')['total_comp_time'].mean().sort_values(ascending=False)

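# KO/TKO rates by referee (division_safety is the KO/TKO subset built in the previous cell)
# Referees with no KO/TKO rows are absent from the numerator, so they show as NaN rather than 0
referee_ko_rate = division_safety.groupby('referee')['method'].count() / master_dataframe.groupby('referee').size() * 100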

print("Average Fight Duration by Referee:")
print(referee_duration)

print("KO/TKO Rate by Referee:")
print(referee_ko_rate.sort_values(ascending=False))

Average Fight Duration by Referee:
referee
Nick Berens          900.0
Shawn Gregory        900.0
Ricky Parker         900.0
Bo Nesslein          900.0
Rick McCoy           900.0
                     ...  
Graham Bettes         97.0
Rick Fike             68.0
Tomaz Bendy           59.0
Taimak Guarriello     31.0
Lonnie Foster         21.0
Name: total_comp_time, Length: 209, dtype: float64
KO/TKO Rate by Referee:
referee
Gabe Barahona           100.0
Steven Davis            100.0
Sean Brockmole          100.0
Rick Fike               100.0
Brandon Pfannenstiel    100.0
                        ...  
Tomasz Bronder            NaN
Tomaz Bendy               NaN
Vyacheslav  Kiselev       NaN
Wayne Spinola             NaN
Will Fisher               NaN
Length: 209, dtype: float64


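Referees such as Gabe Barahona and Steven Davis show 100% KO/TKO rates, and others like Nick Berens average exactly 900 seconds, but these extremes most likely reflect referees with only one or a few recorded fights: with 209 referees in the data, the top and bottom of each ranking are dominated by small samples. Rick Fike, for example, appears with both a 100% KO/TKO rate and a 68-second average duration. The NaN entries mark referees with no KO/TKO rows, which amounts to a 0% rate. Referee stoppage timing may well influence safety, but this table cannot show it without per-referee fight counts and a minimum-sample filter.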
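One way to see how little a rate from a handful of fights says is to attach a confidence interval to it. The sketch below uses the Wilson score interval for a binomial proportion; the `wilson_interval` name and the example counts are illustrative and do not come from the dataset.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)

# A referee with 1 KO/TKO in 1 fight versus one with 50 in 100 fights.
one_fight = wilson_interval(1, 1)
many_fights = wilson_interval(50, 100)
print(f"1 of 1:    {one_fight[0]:.2f} to {one_fight[1]:.2f}")
print(f"50 of 100: {many_fights[0]:.2f} to {many_fights[1]:.2f}")
```

The single-fight interval spans most of the range from 0 to 1, so a 100% rate from one fight is compatible with almost any underlying rate.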

## 6. Fighter Age and Safety

In [19]:
# Compare KO/TKO outcomes for older and younger fighters
older_fighters = master_dataframe[master_dataframe['age'] > 35]
younger_fighters = master_dataframe[master_dataframe['age'] <= 35]

older_ko_rate = older_fighters[older_fighters['method'].str.contains('KO|TKO', na=False)]['method'].count() / older_fighters.shape[0] * 100
younger_ko_rate = younger_fighters[younger_fighters['method'].str.contains('KO|TKO', na=False)]['method'].count() / younger_fighters.shape[0] * 100

print(f"KO/TKO Rate for Older Fighters (>35): {older_ko_rate:.2f}%")
print(f"KO/TKO Rate for Younger Fighters (<=35): {younger_ko_rate:.2f}%")

KO/TKO Rate for Older Fighters (>35): 40.27%
KO/TKO Rate for Younger Fighters (<=35): 32.26%


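Rows for fighters older than 35 show a higher KO/TKO rate (40.27%) than rows for fighters aged 35 or younger (32.26%). Because `method` records how the fight ended rather than who was finished, this figure counts an older fighter's KO/TKO wins together with KO/TKO losses. The gap shows that fights involving older fighters end by KO/TKO more often; whether older fighters are the ones being stopped, possibly due to diminished reflexes, endurance, or accumulated damage, would require splitting the rows by `result`.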
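To separate being stopped from stopping the opponent, the rate can be restricted to KO/TKO losses. This is a sketch only: it assumes the `result` column marks losses with the label "L", which the notebook never shows, so `loss_label` must be adjusted to the real values. The toy rows and the `ko_loss_rate_by_age` helper are likewise illustrative.

```python
import pandas as pd

# Made-up rows; "W"/"L" labels in `result` are an assumption.
toy = pd.DataFrame({
    "age": [38, 37, 36, 28, 30, 25, 40, 27],
    "result": ["L", "L", "W", "W", "L", "W", "W", "L"],
    "method": ["KO/TKO", "KO/TKO", "KO/TKO", "KO/TKO",
               "SUB", "U-DEC", "SUB", "KO/TKO"],
})

def ko_loss_rate_by_age(df, cutoff=35, loss_label="L"):
    """Share of rows (%) where the fighter lost by KO/TKO, split at an age cutoff."""
    known = df.dropna(subset=["age"])  # rows without an age fit neither group
    ko_loss = known["method"].str.contains("KO", na=False) & (known["result"] == loss_label)
    group = (known["age"] > cutoff).map({True: f">{cutoff}", False: f"<={cutoff}"})
    return ko_loss.groupby(group).mean() * 100

rates = ko_loss_rate_by_age(toy)
print(rates)
```

Comparing this table with the same calculation for KO/TKO wins would show whether older fighters are more often on the receiving end.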

## 7. Submission Trends

In [20]:
# Count submission outcomes
submission_fights = master_dataframe[master_dataframe['method'].str.contains('SUB', na=False)]
submission_rate = submission_fights.shape[0] / total_fights * 100

print(f"Submission Rate: {submission_rate:.2f}%")

Submission Rate: 19.83%


Submissions account for 19.83% of all outcomes, well below the 32.92% for KO/TKO. Submissions are often regarded as less damaging than knockouts, but this dataset contains no injury information, so that remains an assumption here. The lower rate may reflect a preference for striking-based strategies in modern MMA, though the data cannot confirm the cause.
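Putting all outcome types in one table makes the submission share easier to compare. The sketch below groups the `method` codes seen earlier into families; the toy values and the `outcome_family` mapping are illustrative choices, not part of the original analysis.

```python
import pandas as pd

# Made-up method codes using the labels that appear in the dataset.
toy = pd.Series(
    ["KO/TKO", "SUB", "U-DEC", "S-DEC", "KO/TKO",
     "U-DEC", "SUB", "U-DEC", "M-DEC", "DQ"],
    name="method",
)

def outcome_family(method):
    """Map a method code to a broad outcome family (hypothetical grouping)."""
    if pd.isna(method):
        return "Unknown"
    if "KO" in method:
        return "KO/TKO"
    if "SUB" in method:
        return "Submission"
    if method.endswith("DEC"):
        return "Decision"
    return "Other"

# Percentage of rows in each family.
shares = toy.map(outcome_family).value_counts(normalize=True) * 100
print(shares)
```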

## Full Analysis and Key Safety Trends


This analysis surfaces several safety-relevant patterns in MMA outcomes. KO/TKO finishes are common, accounting for nearly a third of outcomes, and about half of the strikes landed in those fights are to the head; because no comparison with non-KO fights was made, a correlation with head strikes is not yet established. KO/TKO rates rise with weight class, from 11.91% at Women’s Strawweight to 53.91% at Heavyweight, while the 100% Super Heavyweight figure likely rests on very few fights. Decisions are the longest fights by construction, so duration reflects the outcome rather than driving it. Referee rankings show large differences in duration and KO/TKO rate, but with 209 referees many of the extremes probably come from tiny samples, so no conclusion about stoppage judgment can be drawn yet.

Fights involving fighters over 35 end in KO/TKO more often (40.27% versus 32.26%), which is consistent with reduced reflexes or physical resilience but does not by itself show that the older fighter was the one stopped. Submissions (19.83%) are less common than KO/TKO finishes; whether they are safer cannot be determined from this dataset, which records outcomes rather than injuries. Overall, weight class shows the clearest association with KO/TKO rate, while the referee and age findings need sample-size checks and a win/loss split before they can inform regulations or strategies aimed at reducing high-risk outcomes in MMA.