## 1. Classifying the fighting style of UFC fighters

The goal of this project is to use historical data on MMA athletes who have competed in the UFC to determine each fighter's fighting style. We consider three types:

- **Striker**: Fights predominantly on the feet and has a large number of wins by KO or technical knockout (TKO).
- **Grappler**: Relies mainly on clinches, takedowns, and submission techniques.
- **Tactician** (`Tactico` in the code): Mainly seeks to win on points, whether by unanimous, majority, or split decision.

Importing the libraries:

In [None]:


import pandas as pd  # Data manipulation
import numpy as np  # Numerical operations
import matplotlib.pyplot as plt  # Plotting
import seaborn as sns  # Statistical visualization

from sklearn.model_selection import train_test_split  # Train/test split
from sklearn.tree import DecisionTreeClassifier  # Decision tree classifier
from sklearn.metrics import classification_report, confusion_matrix  # Evaluation metrics

Loading the dataset and displaying it:

In [16]:
data = pd.read_csv("../data/ufc-fighters-statistics.csv")
data

Unnamed: 0,name,nickname,wins,losses,draws,height_cm,weight_in_kg,reach_in_cm,stance,date_of_birth,significant_strikes_landed_per_minute,significant_striking_accuracy,significant_strikes_absorbed_per_minute,significant_strike_defence,average_takedowns_landed_per_15_minutes,takedown_accuracy,takedown_defense,average_submissions_attempted_per_15_minutes
0,Robert Drysdale,,7,0,0,190.50,92.99,,Orthodox,1981-10-05,0.00,0.0,0.00,0.0,7.32,100.0,0.0,21.9
1,Daniel McWilliams,The Animal,15,37,0,185.42,83.91,,,,3.36,77.0,0.00,0.0,0.00,0.0,100.0,21.6
2,Dan Molina,,13,9,0,177.80,97.98,,,,0.00,0.0,5.58,60.0,0.00,0.0,0.0,20.9
3,Paul Ruiz,,7,4,0,167.64,61.23,,,,1.40,33.0,1.40,75.0,0.00,0.0,100.0,20.9
4,Collin Huckbody,All In,8,2,0,190.50,83.91,193.04,Orthodox,1994-09-29,2.05,60.0,2.73,42.0,10.23,100.0,0.0,20.4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4106,John Campetella,,0,1,0,175.26,106.59,,Orthodox,,0.00,0.0,0.00,0.0,0.00,0.0,0.0,0.0
4107,Andre Pederneiras,,1,1,2,172.72,70.31,,Orthodox,1967-03-22,0.00,0.0,0.00,0.0,0.00,0.0,0.0,0.0
4108,Bryson Kamaka,,12,20,1,180.34,77.11,,Orthodox,,9.47,60.0,12.63,0.0,0.00,0.0,100.0,0.0
4109,Matej Penaz,Money,6,1,0,190.50,83.91,210.82,Southpaw,1996-10-14,1.28,33.0,2.55,33.0,0.00,0.0,0.0,0.0


Next, we look at the columns of our dataset:

In [17]:
data.columns

Index(['name', 'nickname', 'wins', 'losses', 'draws', 'height_cm',
       'weight_in_kg', 'reach_in_cm', 'stance', 'date_of_birth',
       'significant_strikes_landed_per_minute',
       'significant_striking_accuracy',
       'significant_strikes_absorbed_per_minute', 'significant_strike_defence',
       'average_takedowns_landed_per_15_minutes', 'takedown_accuracy',
       'takedown_defense', 'average_submissions_attempted_per_15_minutes'],
      dtype='object')

In [18]:
# Drop identifier and non-numeric columns (name, nickname, stance, date of birth)
data = data.drop(columns=['name', 'nickname', 'stance', 'date_of_birth'])
data

Unnamed: 0,wins,losses,draws,height_cm,weight_in_kg,reach_in_cm,significant_strikes_landed_per_minute,significant_striking_accuracy,significant_strikes_absorbed_per_minute,significant_strike_defence,average_takedowns_landed_per_15_minutes,takedown_accuracy,takedown_defense,average_submissions_attempted_per_15_minutes
0,7,0,0,190.50,92.99,,0.00,0.0,0.00,0.0,7.32,100.0,0.0,21.9
1,15,37,0,185.42,83.91,,3.36,77.0,0.00,0.0,0.00,0.0,100.0,21.6
2,13,9,0,177.80,97.98,,0.00,0.0,5.58,60.0,0.00,0.0,0.0,20.9
3,7,4,0,167.64,61.23,,1.40,33.0,1.40,75.0,0.00,0.0,100.0,20.9
4,8,2,0,190.50,83.91,193.04,2.05,60.0,2.73,42.0,10.23,100.0,0.0,20.4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4106,0,1,0,175.26,106.59,,0.00,0.0,0.00,0.0,0.00,0.0,0.0,0.0
4107,1,1,2,172.72,70.31,,0.00,0.0,0.00,0.0,0.00,0.0,0.0,0.0
4108,12,20,1,180.34,77.11,,9.47,60.0,12.63,0.0,0.00,0.0,100.0,0.0
4109,6,1,0,190.50,83.91,210.82,1.28,33.0,2.55,33.0,0.00,0.0,0.0,0.0


In [19]:
# Count missing values per column
data.isna().sum()

wins                                               0
losses                                             0
draws                                              0
height_cm                                        298
weight_in_kg                                      87
reach_in_cm                                     1927
significant_strikes_landed_per_minute              0
significant_striking_accuracy                      0
significant_strikes_absorbed_per_minute            0
significant_strike_defence                         0
average_takedowns_landed_per_15_minutes            0
takedown_accuracy                                  0
takedown_defense                                   0
average_submissions_attempted_per_15_minutes       0
dtype: int64
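
Because `reach_in_cm` accounts for most of the missing values, it can help to look at the share of missing values per column rather than only the raw counts. The following is a minimal sketch, not part of the original notebook: `missing_share` is a hypothetical helper, and `toy` is a made-up frame standing in for `data`.

```python
import numpy as np
import pandas as pd


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Return the percentage of missing values per column, largest first."""
    return (df.isna().mean() * 100).sort_values(ascending=False)


# Toy frame standing in for `data` (illustrative values, not from the dataset)
toy = pd.DataFrame({
    "height_cm": [190.5, np.nan, 177.8, 167.64],
    "reach_in_cm": [np.nan, np.nan, np.nan, 193.04],
    "wins": [7, 15, 13, 8],
})

share = missing_share(toy)
print(share)
```

Applied to the notebook's `data`, the counts above (1927 missing values out of the 4110 rows implied by the 0 to 4109 index) would put `reach_in_cm` at roughly 47% missing.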

In [20]:
# Drop every row that has at least one missing value
data = data.dropna()
data

Unnamed: 0,wins,losses,draws,height_cm,weight_in_kg,reach_in_cm,significant_strikes_landed_per_minute,significant_striking_accuracy,significant_strikes_absorbed_per_minute,significant_strike_defence,average_takedowns_landed_per_15_minutes,takedown_accuracy,takedown_defense,average_submissions_attempted_per_15_minutes
4,8,2,0,190.50,83.91,193.04,2.05,60.0,2.73,42.0,10.23,100.0,0.0,20.4
8,9,3,0,177.80,70.31,175.26,1.91,42.0,6.22,33.0,0.00,0.0,0.0,14.3
11,5,0,0,185.42,83.91,193.04,2.76,62.0,0.55,75.0,11.04,50.0,0.0,13.8
17,16,8,0,185.42,65.77,185.42,1.38,50.0,7.59,26.0,0.00,0.0,0.0,10.3
22,6,2,0,167.64,56.70,167.64,0.87,50.0,0.65,66.0,3.27,100.0,0.0,9.8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4096,6,1,0,162.56,61.23,162.56,0.00,0.0,0.00,100.0,15.52,100.0,0.0,0.0
4099,7,5,0,182.88,92.99,190.50,1.78,80.0,3.11,65.0,0.00,0.0,0.0,0.0
4102,4,5,0,167.64,56.70,167.64,4.44,34.0,4.31,64.0,0.00,0.0,0.0,0.0
4109,6,1,0,190.50,83.91,210.82,1.28,33.0,2.55,33.0,0.00,0.0,0.0,0.0
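
Dropping every incomplete row is simple, but it can discard a large part of the data. One way to make that cost visible is to compare row counts before and after the drop. This is a sketch on a made-up `toy` frame, not the notebook's own code; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for `data` (illustrative values, not from the dataset)
toy = pd.DataFrame({
    "height_cm": [190.5, np.nan, 177.8, 167.64],
    "reach_in_cm": [np.nan, 185.0, 175.26, 193.04],
    "wins": [7, 15, 13, 8],
})

before = len(toy)
# .copy() makes the cleaned frame independent of the original
clean = toy.dropna().copy()
after = len(clean)

print(f"kept {after} of {before} rows ({after / before:.0%})")
```

In the notebook itself, the class counts at the end of this section sum to 2,183 fighters, which is about 53% of the original 4,110 rows.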


## Creating the target variable: fighting style

We will build a categorical variable called `estilo_lucha` that classifies fighters into three groups:

- **Striker**: stands out for offensive striking.
- **Grappler**: focuses on submissions and takedowns.
- **Tactician** (`Tactico`): has a more balanced or defensive approach.

The assignment is based on simple threshold rules applied to each fighter's offensive statistics: significant strikes landed per minute, takedowns landed per 15 minutes, and submissions attempted per 15 minutes.

In [22]:
# Create the 'estilo_lucha' variable from offensive statistics
def definir_estilo(row):
    golpes = row['significant_strikes_landed_per_minute']
    derribos = row['average_takedowns_landed_per_15_minutes']
    sumisiones = row['average_submissions_attempted_per_15_minutes']

    # Striker: high striking volume with few takedowns and submission attempts
    if golpes >= 4.0 and derribos < 1.5 and sumisiones < 0.5:
        return 'Striker'
    # Grappler: frequent submission attempts or takedowns
    elif sumisiones >= 0.5 or derribos >= 2.0:
        return 'Grappler'
    # Anything that meets neither condition is labelled tactical
    else:
        return 'Tactico'

# Work on an explicit copy so the assignment below does not trigger
# pandas' SettingWithCopyWarning on the result of dropna()
data = data.copy()

# Apply the function row by row
data['estilo_lucha'] = data.apply(definir_estilo, axis=1)

# Show the class distribution
data['estilo_lucha'].value_counts()


estilo_lucha
Grappler    1206
Tactico      619
Striker      358
Name: count, dtype: int64
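
The row-wise rules above can also be written in vectorized form with `np.select`, which evaluates the conditions in order and takes the first match, mirroring the `if`/`elif`/`else` chain. The sketch below is an alternative formulation rather than the notebook's own code: `definir_estilo_vectorizado` is a hypothetical name, and `toy` contains made-up rows.

```python
import numpy as np
import pandas as pd


def definir_estilo_vectorizado(df: pd.DataFrame) -> pd.Series:
    """Label each row using the same thresholds as definir_estilo."""
    golpes = df["significant_strikes_landed_per_minute"]
    derribos = df["average_takedowns_landed_per_15_minutes"]
    sumisiones = df["average_submissions_attempted_per_15_minutes"]

    # Conditions are checked in order; the first one that holds wins
    condiciones = [
        (golpes >= 4.0) & (derribos < 1.5) & (sumisiones < 0.5),
        (sumisiones >= 0.5) | (derribos >= 2.0),
    ]
    etiquetas = ["Striker", "Grappler"]
    estilos = np.select(condiciones, etiquetas, default="Tactico")
    return pd.Series(estilos, index=df.index, name="estilo_lucha")


# Toy rows (illustrative values, not taken from the dataset)
toy = pd.DataFrame({
    "significant_strikes_landed_per_minute": [5.0, 2.05, 1.28, 4.5],
    "average_takedowns_landed_per_15_minutes": [0.0, 10.23, 0.0, 1.7],
    "average_submissions_attempted_per_15_minutes": [0.0, 20.4, 0.0, 0.0],
})

estilos = definir_estilo_vectorizado(toy)
print(estilos.tolist())
```

The last toy row shows a consequence of the thresholds as written: a high-volume striker with between 1.5 and 2.0 takedowns per 15 minutes and few submission attempts satisfies neither the Striker nor the Grappler condition, so it falls through to `Tactico`.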