# Purpose

The purpose of this notebook is to gain insights from the statistics previously calculated.

The main table we will be using is a fighter round performance table which holds a record for every 5-minute round in the UFC. Each round has two rows, one for each fighter. The statistics are as follows:
Fighter Round Performance:
 - SSA - Significant Strike Attempts
 - SSS - Significant Strike Successes
 - SS_AC - Significant Strike Accuracy
 - SS_DI - Significant Strike Differential
 - SS_DE - Significant Strike Defense
 - SSA_P1M - Significant Strike Attempts Per 1 Minute
 - SSS_P1M - Significant Strike Successes Per 1 Minute
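The derived columns can be reproduced from the raw counts. The block below is a minimal sketch, not the code that built the table. It assumes that the `_1` suffix (`ssa_1`, `sss_1`) marks the opponent's numbers for the same round and that defense is one minus the opponent's accuracy. Both assumptions agree with the grouped output shown later in this notebook, but the source does not confirm them. The `derive_stats` helper is a hypothetical name.

```python
import pandas as pd

def derive_stats(rounds: pd.DataFrame) -> pd.DataFrame:
    """Add derived striking columns (hypothetical helper, assumptions in the text above)."""
    out = rounds.copy()
    out["ss_ac"] = out["sss"] / out["ssa"]            # accuracy: landed / attempted
    out["ss_de"] = 1 - out["sss_1"] / out["ssa_1"]    # defense: 1 - opponent accuracy (assumed)
    out["sss_di"] = out["sss"] - out["sss_1"]         # differential in strikes landed
    out["ssa_di"] = out["ssa"] - out["ssa_1"]         # differential in strikes attempted
    out["ssa_p1m"] = out["ssa"] / out["minutes"]      # attempts per minute
    out["sss_p1m"] = out["sss"] / out["minutes"]      # successes per minute
    return out

# Raw counts taken from the Aalon Cruz row of the grouped output
example = pd.DataFrame(
    {"minutes": [1.416667], "ssa": [12], "sss": [2], "ssa_1": [29], "sss_1": [20]}
)
stats = derive_stats(example)
print(stats.round(6).to_dict("records")[0])
```

A round with zero attempts gives a missing (`NaN`) accuracy in this sketch, since the ratio is undefined.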

In [1]:
%load_ext autoreload
%autoreload 2

import os
import sys
module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

import pandas as pd
from sqlalchemy import create_engine
from src import local
from src import functions
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import ttest_ind

In [None]:
data = pd.read_csv('../../data/ufcstats_data/fighter_round_performance.csv')
data.head()

In [None]:
data['date'] = pd.to_datetime(data.date)

## To get the fighter names, we'll pull the fighters table from the SQL database

In [None]:
# Credentials (kept out of the notebook in src/local.py)
USER = local.user
PASS = local.password
HOST = local.host
PORT = local.port

# Create the SQLAlchemy engine for the match_finder database
engine = create_engine(f'postgresql://{USER}:{PASS}@{HOST}:{PORT}/match_finder')

In [None]:
query = """
SELECT name, link
FROM fighters
"""

fighters = pd.read_sql(query, engine)

In [None]:
fighters

In [None]:
data = data.join(fighters.set_index('link'), on='fighter_link')
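Because this is a left join, any `fighter_link` without a match in `fighters` ends up with a missing `name` and silently drops out of later `groupby('name')` calls. The block below is a minimal sketch of a sanity check on made-up rows. The toy frames and the `unmatched` variable are illustrative only, not the notebook's real data.

```python
import pandas as pd

# Toy stand-ins for the round table and the fighters table
rounds = pd.DataFrame({"fighter_link": ["a", "b", "c"], "ssa": [10, 20, 30]})
fighters = pd.DataFrame({"name": ["Fighter A", "Fighter B"], "link": ["a", "b"]})

# Duplicate links in the lookup table would duplicate round rows after the join
assert fighters["link"].is_unique

joined = rounds.join(fighters.set_index("link"), on="fighter_link")

# Links that found no name in the lookup table
unmatched = joined.loc[joined["name"].isna(), "fighter_link"].unique()
print(list(unmatched))
```

An empty `unmatched` array and an unchanged row count suggest the join is safe to group on.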

### Fighter Career Statistics
I only want to look at fighters who have at least 9 recorded rounds and have fought in the last year and a half (after January 1, 2019).
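The two criteria can be expressed as boolean masks over per-fighter aggregates. The block below is a minimal sketch on made-up rows. It assumes "the last year and a half" is measured back from the most recent date in the data, whereas the notebook's own cells use a fixed cutoff date. `MIN_ROUNDS`, `cutoff`, and the toy frame are illustrative names only.

```python
import pandas as pd

MIN_ROUNDS = 9  # minimum number of recorded rounds

# Toy data: one row per fighter per round
toy = pd.DataFrame({
    "name": ["A"] * 9 + ["B"] * 9 + ["C"] * 3,
    "date": pd.to_datetime(
        ["2020-06-01"] * 9 + ["2017-01-01"] * 9 + ["2020-06-01"] * 3
    ),
})

# Eighteen months before the latest round in the data (an assumption)
cutoff = toy["date"].max() - pd.DateOffset(months=18)

# Last fight date and round count per fighter
per_fighter = toy.groupby("name")["date"].agg(last_fight="max", rounds="count")

eligible = per_fighter[
    (per_fighter["last_fight"] > cutoff) & (per_fighter["rounds"] >= MIN_ROUNDS)
]
print(eligible.index.tolist())
```

Counting rows of `date` works as a round count here because the table holds exactly one row per fighter per round.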

## Top 10 active fighters with the highest Significant Strike Attempts Per Minute

In [16]:
# Career averages per fighter; numeric_only skips the date and link columns
data.groupby('name').mean(numeric_only=True)

name,round,minutes,ssa,sss,ss_ac,ss_de,sss_di,ssa_di,ssa_p1m,sss_p1m,ssa_1,sss_1,ss_ac_1
Aalon Cruz,1.000000,1.416667,12.000000,2.000000,0.166667,0.310345,-18.000000,-17.000000,8.470588,1.411765,29.000000,20.000000,0.689655
Aaron Brink,1.000000,0.916667,5.000000,0.000000,0.000000,0.500000,-2.000000,1.000000,5.454545,0.000000,4.000000,2.000000,0.500000
Aaron Phillips,1.875000,4.685417,13.625000,7.750000,0.553545,0.379311,-8.375000,-13.000000,2.902349,1.626007,26.625000,16.125000,0.620689
Aaron Riley,1.800000,4.768333,45.200000,15.700000,0.349931,0.615215,-2.950000,-2.500000,9.593708,3.380641,47.700000,18.650000,0.384785
Aaron Rosa,1.857143,4.080952,41.428571,18.857143,0.412043,0.476898,0.428571,4.285714,10.097259,4.198007,37.142857,18.428571,0.523102
...,...,...,...,...,...,...,...,...,...,...,...,...,...
Zarah Fairn,1.000000,3.783333,16.500000,7.500000,0.432143,0.409722,-17.500000,-24.500000,4.503879,2.050400,41.000000,25.000000,0.590278
Zarrukh Adashev,1.000000,0.533333,6.000000,2.000000,0.333333,0.333333,0.000000,3.000000,11.250000,3.750000,3.000000,2.000000,0.666667
Zelim Imadaev,1.800000,4.980000,44.200000,18.400000,0.421417,0.430252,1.600000,13.600000,8.863673,3.690612,30.600000,16.800000,0.569748
Zhalgas Zhumagulov,2.000000,5.000000,39.666667,22.000000,0.555095,0.570672,4.666667,-0.666667,7.933333,4.400000,40.333333,17.333333,0.429328


In [9]:
fighter_groups = data.groupby('name')

In [20]:
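# Fighters whose most recent round came after January 1, 2019
recent_fighters = fighter_groups.date.max() > pd.to_datetime('2019-01-01')
# Fighters with at least 9 recorded rounds (one row per round)
experienced = fighter_groups.date.count() >= 9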

In [21]:
# Combine both masks in one step so pandas does not have to realign the second mask
fighter_groups.ssa_p1m.mean()[recent_fighters & experienced].sort_values(ascending=False).head(10)

name
Weili Zhang         16.833255
Paulo Costa         16.801537
Marco Polo Reyes    16.691553
Xiaonan Yan         15.946667
Irene Aldana        15.873973
Chan Sung Jung      15.485653
Ion Cutelaba        15.471439
Cory Sandhagen      14.784528
Max Holloway        14.608947
Shane Burgos        14.454724
Name: ssa_p1m, dtype: float64

## Top 10 active fighters with the highest Significant Strike Differential

In [23]:
# Combine both masks in one step so pandas does not have to realign the second mask
fighter_groups.sss_di.mean()[recent_fighters & experienced].sort_values(ascending=False).head(10)

name
Cristiane Justino     19.500000
Sabina Mazo           15.888889
Joanna Jedrzejczyk    15.872727
Tatiana Suarez        15.636364
Cain Velasquez        15.620690
Petr Yan              14.200000
Xiaonan Yan           14.200000
Hakeem Dawodu         13.692308
Amanda Ribas          13.333333
Aljamain Sterling     13.333333
Name: sss_di, dtype: float64
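Both rankings follow the same pattern, so they could be wrapped in a single helper. The block below is a sketch only: `top_fighters` and its parameters are hypothetical names and not part of `src.functions`. The defaults mirror the cutoff date and the round minimum used in the cells above.

```python
import pandas as pd

def top_fighters(data, stat, since="2019-01-01", min_rounds=9, n=10):
    """Rank recent, experienced fighters by their career mean of `stat` (hypothetical helper)."""
    groups = data.groupby("name")
    recent = groups["date"].max() > pd.Timestamp(since)   # fought after the cutoff
    experienced = groups["date"].count() >= min_rounds    # enough recorded rounds
    means = groups[stat].mean()
    return means[recent & experienced].sort_values(ascending=False).head(n)

# Made-up rows: X and Y qualify, Z is inactive, W has too few rounds
toy = pd.DataFrame({
    "name": ["X"] * 9 + ["Y"] * 9 + ["Z"] * 9 + ["W"] * 2,
    "date": pd.to_datetime(
        ["2020-01-01"] * 9 + ["2020-01-01"] * 9 + ["2018-01-01"] * 9 + ["2020-01-01"] * 2
    ),
    "ssa_p1m": [10.0] * 9 + [12.0] * 9 + [20.0] * 9 + [30.0] * 2,
})
ranking = top_fighters(toy, "ssa_p1m")
print(ranking)
```

With the real table, `top_fighters(data, 'ssa_p1m')` and `top_fighters(data, 'sss_di')` would correspond to the two rankings above.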