Generate the datasets for each team to be analyzed

In [1]:
import pandas as pd
import numpy as np

df_raw = pd.read_csv("combined_odds_cleaned.csv")
df_raw['Date'] = pd.to_datetime(df_raw['Date'], dayfirst=True)




# Summary
Analysis of the features in the datasets
### Explanation of the columns in the final datasets

General match information
- **`season`** → Season in which the match was played.
- **`date`** → Date on which the match was played.
- **`team`** → Name of the team being analyzed.
- **`rival_team`** → Name of the team that the analyzed team plays against.
- **`home_adv`** → Whether the analyzed team has home advantage (`1`) or plays away (`0`).
- **`last_season_team`** → Final league position of the analyzed team in the previous season (`21` if the team is not in that season's standings).
- **`last_season_rival`** → Final league position of the rival team in the previous season (`21` if the team is not in that season's standings).

Performance metrics of the analyzed team (last 10 matches)
- **`pct_wins`** → Share of matches won (from 0 to 1) over the last 10 matches of that season.
- **`avg_goals_scored`** → Average goals scored over the last 10 matches of that season.
- **`avg_goals_received`** → Average goals conceded over the last 10 matches of that season.
- **`goal_difference`** → Goals scored minus goals conceded over the previous 10 matches of that season.
  - If the team scored 8 goals in total and conceded 6, the difference is 2.
  - If the team scored 4 goals in total and conceded 8, the difference is -4.

All four metrics are 0 when the team has no earlier match in that season.
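
As a minimal sketch of how these four metrics relate to each other, the snippet below computes them for a made-up window of four matches. The goal counts are illustrative only and are not taken from the data.

```python
import numpy as np

# Hypothetical window of previous matches (illustrative numbers only)
goals_scored = np.array([3, 2, 1, 2])    # total: 8
goals_received = np.array([1, 2, 2, 1])  # total: 6

# A match counts as a win when the team scored more than it conceded
pct_wins = float(np.mean(goals_scored > goals_received))
avg_goals_scored = float(goals_scored.mean())
avg_goals_received = float(goals_received.mean())
goal_difference = int(goals_scored.sum() - goals_received.sum())

print(pct_wins, avg_goals_scored, avg_goals_received, goal_difference)
# 0.5 2.0 1.5 2
```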

Information about the rival team
- **`pct_wins_rival`** → Share of matches won (from 0 to 1) by the rival over its last 10 matches of that season.
- **`avg_goals_scored_rival`** → Average goals scored by the rival over its last 10 matches of that season.
- **`avg_goals_received_rival`** → Average goals conceded by the rival over its last 10 matches of that season.
- **`goal_difference_rival`** → Goals scored minus goals conceded by the rival over its previous 10 matches of that season.
  - If the rival scored 8 goals in total and conceded 6, the difference is 2.
  - If the rival scored 4 goals in total and conceded 8, the difference is -4.


Head-to-head information
- **`pct_wins_vs_rival`** → Share of wins (from 0 to 1) against the rival in the last 5 meetings, regardless of season.
- **`avg_goals_scored_vs_rival`** → Total goals scored against the rival in the last 5 meetings (despite the `avg_` prefix, this is a total, not a mean).
- **`avg_goals_received_vs_rival`** → Total goals conceded to the rival in the last 5 meetings (also a total, not a mean).
- **`goal_difference_vs_rival`** → Goal difference against the rival in the last 5 meetings.
  - If `team` scored 8 goals in total and `rival_team` scored 6, the difference is 2.
  - If `team` scored 4 goals in total and `rival_team` scored 8, the difference is -4.

Betting odds
- **`AvgWin`** → Average odds for a win by the analyzed team (`AvgH` in the source data when it plays at home, `AvgA` when it plays away).
- **`AvgLoss`** → Average odds for a win by the rival (`AvgA` when the analyzed team plays at home, `AvgH` when it plays away).
- **`AvgDraw`** → Average odds for a draw (`AvgD` in the source data).
- **`AvgAHWin`** → Average Asian-handicap odds for the analyzed team, i.e. it wins after the goal adjustment (`AvgAHH` at home, `AvgAHA` away).
- **`AvgAHLoss`** → Average Asian-handicap odds for the rival, i.e. the rival wins after the goal adjustment (`AvgAHA` when the analyzed team plays at home, `AvgAHH` when it plays away).
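
The switch from home/away odds to the analyzed team's perspective can be sketched with `np.where`. The fixtures and odds below are made up for illustration; only the column names come from the notebook.

```python
import numpy as np
import pandas as pd

# Toy matches with made-up odds (not real data)
matches = pd.DataFrame({
    "HomeTeam": ["Real Madrid", "Valencia"],
    "AwayTeam": ["Betis", "Real Madrid"],
    "AvgH": [1.40, 2.80],
    "AvgA": [7.00, 2.30],
})

team = "Real Madrid"
is_home = matches["HomeTeam"] == team

# Home odds belong to the team when it plays at home, away odds otherwise
matches["AvgWin"] = np.where(is_home, matches["AvgH"], matches["AvgA"])
matches["AvgLoss"] = np.where(is_home, matches["AvgA"], matches["AvgH"])

print(matches[["AvgWin", "AvgLoss"]].values.tolist())
# [[1.4, 7.0], [2.3, 2.8]]
```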

Prediction targets
- **`goals_team`** → Number of goals scored by the analyzed team.
- **`goals_rival`** → Number of goals scored by the rival team.
- **`result`** → Match result from the perspective of the analyzed team:
  - `1` → Win.
  - `0` → Draw.
  - `-1` → Loss.
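
The `1` / `0` / `-1` encoding is the sign of the goal difference. A small vectorized sketch, using made-up scorelines, shows the idea:

```python
import numpy as np

# Illustrative scorelines from the analyzed team's perspective
goals_team = np.array([2, 1, 0])
goals_rival = np.array([1, 1, 2])

# sign(difference) is 1 for a win, 0 for a draw, -1 for a loss
result = np.sign(goals_team - goals_rival)
print(result.tolist())  # [1, 0, -1]
```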



In [2]:
def get_result_for_team(row, team_name):
    """
    Given a match record (row) and a team name, return the result
    from that team's perspective:
      1  -> Win
      0  -> Draw
     -1  -> Loss
    """
    # Home and away goals
    home_goals = row['FTHG']
    away_goals = row['FTAG']
    
    # Check whether the team is the home or the away side in this match
    if row['HomeTeam'] == team_name:
        # Compare home_goals with away_goals
        if home_goals > away_goals:
            return 1
        elif home_goals < away_goals:
            return -1
        else:
            return 0
    else:
        # The team is the away side
        if away_goals > home_goals:
            return 1
        elif away_goals < home_goals:
            return -1
        else:
            return 0


In [3]:
def get_goals_for_team(row, team_name):
    """
    Return the goals scored and conceded in this match
    from the perspective of 'team_name'.
    """
    if row['HomeTeam'] == team_name:
        return row['FTHG'], row['FTAG']
    else:
        return row['FTAG'], row['FTHG']

In [4]:

def calculate_past_n_matches_stats(df_team, team_name, current_date, current_season, n=10):
    """
    Compute the metrics over the last 'n' matches of the same season
    played before 'current_date' by the team 'team_name'.
    
    Returns:
        pct_wins, avg_goals_scored, avg_goals_received, goal_difference
    """
    # Keep matches from the same season with an earlier date
    df_past = df_team[
        (df_team['Season'] == current_season) &
        (df_team['Date'] < current_date)
    ].sort_values('Date', ascending=False)
    
    # Take the last n matches (or fewer if there are not enough)
    df_past = df_past.head(n)
    
    if len(df_past) == 0:
        # No previous matches: return zeros
        return 0.0, 0.0, 0.0, 0.0
    
    # Wins (1), draws (0), losses (-1)
    results = df_past['result']  # computed beforehand
    
    wins = np.sum(results == 1)
    total = len(df_past)
    pct_wins = wins / total  # between 0 and 1
    
    # Goals scored / conceded
    goals_scored = df_past['goals_team'].sum()
    goals_received = df_past['goals_rival'].sum()
    
    avg_goals_scored = goals_scored / total
    avg_goals_received = goals_received / total
    
    goal_difference = goals_scored - goals_received
    
    return pct_wins, avg_goals_scored, avg_goals_received, goal_difference
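
The same "last n matches strictly before the current one" window can also be expressed with `shift` and `rolling`. This is only a sketch of an alternative, not the notebook's method: the `win` column and the helper `prior_win_rate` are hypothetical, and the sketch assumes the rows are sorted by date with at most one match per team per date.

```python
import pandas as pd

# Toy results for one team, already sorted by date (illustrative values)
toy = pd.DataFrame({
    "Season": ["2003-04", "2003-04", "2003-04", "2004-05", "2004-05"],
    "win":    [1, 0, 1, 0, 1],
})

def prior_win_rate(wins: pd.Series, n: int = 10) -> pd.Series:
    # shift(1) removes the current match from its own window;
    # min_periods=1 allows windows shorter than n early in the season
    return wins.shift(1).rolling(n, min_periods=1).mean()

toy["pct_wins"] = (
    toy.groupby("Season")["win"]
       .transform(prior_win_rate)
       .fillna(0.0)  # no earlier match in the season -> 0.0
)
print(toy["pct_wins"].tolist())
# [0.0, 1.0, 0.5, 0.0, 0.0]
```

Grouping by `Season` restarts the window at the first match of each season, which mirrors the zero default returned when no previous match exists.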

In [5]:
def calculate_rival_stats(df_full, rival_name, current_date, current_season, n=10):
    """
    Compute the metrics for the rival over its last 'n' matches
    of the SAME season played before 'current_date'.
    """
    # Keep every match in which the rival is the home or the away side
    df_rival = df_full[
        ( (df_full['HomeTeam'] == rival_name) | (df_full['AwayTeam'] == rival_name) ) &
        (df_full['Season'] == current_season) &
        (df_full['Date'] < current_date)
    ].sort_values('Date', ascending=False)
    
    # Work on a copy so the helper columns below do not touch df_full
    df_rival = df_rival.head(n).copy()
    
    if len(df_rival) == 0:
        return 0.0, 0.0, 0.0, 0.0
    
    # Result of each match from the rival's perspective
    def get_rival_result(row):
        home_goals = row['FTHG']
        away_goals = row['FTAG']
        # Is the rival playing at home?
        if row['HomeTeam'] == rival_name:
            if home_goals > away_goals:
                return 1
            elif home_goals < away_goals:
                return -1
            else:
                return 0
        else:
            if away_goals > home_goals:
                return 1
            elif away_goals < home_goals:
                return -1
            else:
                return 0
    
    df_rival['rival_result'] = df_rival.apply(get_rival_result, axis=1)
    wins_rival = np.sum(df_rival['rival_result'] == 1)
    total_rival = len(df_rival)
    
    pct_wins_rival = wins_rival / total_rival
    
    # Goals scored and conceded by the rival in each match
    def get_goals_for_rival(row):
        if row['HomeTeam'] == rival_name:
            return row['FTHG'], row['FTAG']
        else:
            return row['FTAG'], row['FTHG']
    
    df_rival['g_rival_scored'], df_rival['g_rival_received'] = zip(*df_rival.apply(get_goals_for_rival, axis=1))
    
    total_goals_scored_rival = df_rival['g_rival_scored'].sum()
    total_goals_received_rival = df_rival['g_rival_received'].sum()
    
    avg_goals_scored_rival = total_goals_scored_rival / total_rival
    avg_goals_received_rival = total_goals_received_rival / total_rival
    
    goal_difference_rival = total_goals_scored_rival - total_goals_received_rival
    
    # Drop the helper columns
    df_rival.drop(['rival_result','g_rival_scored','g_rival_received'], axis=1, inplace=True)
    
    return pct_wins_rival, avg_goals_scored_rival, avg_goals_received_rival, goal_difference_rival
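
The home/away branching for the rival repeats the logic already used for the analyzed team. One possible way to avoid this, sketched here under the assumption that a reshaped table is acceptable, is to build a "long" table with one row per match and team. The helper `to_long` and the columns `team`, `opponent`, `scored` and `received` are hypothetical names, and the scores are made up.

```python
import pandas as pd

# Toy fixtures using the notebook's column names (scores are made up)
matches = pd.DataFrame({
    "HomeTeam": ["Betis", "Valencia"],
    "AwayTeam": ["Valencia", "Malaga"],
    "FTHG": [1, 3],
    "FTAG": [2, 0],
})

def to_long(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (match, team) with goals from that team's side."""
    home = df.rename(columns={"HomeTeam": "team", "AwayTeam": "opponent",
                              "FTHG": "scored", "FTAG": "received"})
    away = df.rename(columns={"AwayTeam": "team", "HomeTeam": "opponent",
                              "FTAG": "scored", "FTHG": "received"})
    return pd.concat([home, away], ignore_index=True)

long_df = to_long(matches)
valencia = long_df[long_df["team"] == "Valencia"]
print(valencia["scored"].sum(), valencia["received"].sum())  # 5 1
```

With such a table, the same aggregation code would serve both the analyzed team and any rival.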


In [7]:

def calculate_vs_rival_stats(df_team, team_name, rival_name, current_date, n=5):
    """
    Compute the metrics of the previous head-to-head matches (last n)
    between 'team_name' and 'rival_name', regardless of season,
    played before 'current_date'.
    
    Returns:
        pct_wins_vs_rival, avg_goals_scored_vs_rival, avg_goals_received_vs_rival, goal_difference_vs_rival
    """
    # Keep every past match between team_name and rival_name
    df_direct = df_team[
        (
            (df_team['HomeTeam'] == team_name) & (df_team['AwayTeam'] == rival_name)
        ) |
        (
            (df_team['HomeTeam'] == rival_name) & (df_team['AwayTeam'] == team_name)
        )
    ]
    df_direct = df_direct[df_direct['Date'] < current_date].sort_values('Date', ascending=False)
    
    # Last n meetings; copy so the helper columns do not touch the input
    df_direct = df_direct.head(n).copy()
    
    if len(df_direct) == 0:
        return 0.0, 0.0, 0.0, 0.0
    
    # Result of each match from the perspective of 'team_name'
    df_direct['direct_result'] = df_direct.apply(lambda row: get_result_for_team(row, team_name), axis=1)
    
    wins_vs_rival = np.sum(df_direct['direct_result'] == 1)
    total_vs_rival = len(df_direct)
    
    pct_wins_vs_rival = wins_vs_rival / total_vs_rival
    
    # Goals in those matches (for 'team_name')
    df_direct['g_scored'], df_direct['g_received'] = zip(*df_direct.apply(lambda row: get_goals_for_team(row, team_name), axis=1))
    
    sum_goals_scored = df_direct['g_scored'].sum()
    sum_goals_received = df_direct['g_received'].sum()
    
    goal_difference_vs_rival = sum_goals_scored - sum_goals_received
    
    # Although the variable is named 'avg_goals_scored_vs_rival',
    # the specification asks for the total goals (not the average).
    # We keep it as the specification requires.
    avg_goals_scored_vs_rival = sum_goals_scored
    avg_goals_received_vs_rival = sum_goals_received
    
    # Drop the helper columns
    df_direct.drop(['direct_result','g_scored','g_received'], axis=1, inplace=True)
    
    return pct_wins_vs_rival, avg_goals_scored_vs_rival, avg_goals_received_vs_rival, goal_difference_vs_rival
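
To make the head-to-head filter concrete, the sketch below applies the same kind of mask to four made-up fixtures and sums the goals from team `A`'s side. The helper `head_to_head` and the team labels are hypothetical.

```python
import pandas as pd

# Toy fixtures (made-up scores); only A-vs-B matches should be selected
matches = pd.DataFrame({
    "HomeTeam": ["A", "B", "A", "A"],
    "AwayTeam": ["B", "A", "C", "B"],
    "FTHG": [2, 1, 4, 0],
    "FTAG": [0, 1, 0, 3],
})

def head_to_head(df: pd.DataFrame, team: str, rival: str) -> pd.DataFrame:
    # Keep fixtures between the two teams regardless of who hosted
    mask = (
        ((df["HomeTeam"] == team) & (df["AwayTeam"] == rival))
        | ((df["HomeTeam"] == rival) & (df["AwayTeam"] == team))
    )
    return df[mask]

h2h = head_to_head(matches, "A", "B")
home = h2h["HomeTeam"] == "A"
# Take home goals when A hosted, away goals otherwise (and the reverse)
scored = h2h["FTHG"].where(home, h2h["FTAG"]).sum()
received = h2h["FTAG"].where(home, h2h["FTHG"]).sum()
print(len(h2h), scored, received)  # 3 3 4
```

The match against `C` is excluded, and the totals (3 scored, 4 conceded) are sums rather than averages, as in the `avg_*_vs_rival` columns.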


In [40]:
# Load the final standings file
clasificaciones_df = pd.read_csv('clasificacion_final/clasificaciones.csv')

# Return the team's final position in the previous season
def get_last_season_position(team_name, season):
    # Build the previous season's label, e.g. '2009-10' -> '2008-09'.
    # The second part is zero-padded to keep the 'YYYY-YY' format
    # (assumes clasificaciones.csv labels seasons the same way as `season`).
    start_year, end_year = season.split('-')
    last_season = f"{int(start_year) - 1}-{(int(end_year) - 1) % 100:02d}"
    
    # Look up the team's position in the previous season
    last_season_data = clasificaciones_df[clasificaciones_df['season'] == last_season]
    if team_name in last_season_data['team'].values:
        position = last_season_data[last_season_data['team'] == team_name]['position'].values[0]
        return position
    return 21  # Team not found in the previous season's standings
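
The previous-season label is easy to get wrong when the two-digit part has a leading zero: plain integer subtraction turns `'2009-10'` into `'2008-9'`. The sketch below shows a zero-padded version in isolation, assuming labels follow the `YYYY-YY` pattern seen in the `season` column. The helper name `previous_season` is hypothetical.

```python
def previous_season(season: str) -> str:
    """Return the label of the season before `season` ('YYYY-YY' format)."""
    start, end = season.split("-")
    prev_start = int(start) - 1
    prev_end = (int(end) - 1) % 100  # wraps 00 -> 99
    return f"{prev_start}-{prev_end:02d}"

print(previous_season("2023-24"))  # 2022-23
print(previous_season("2009-10"))  # 2008-09
print(previous_season("2000-01"))  # 1999-00
```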

In [52]:
import numpy as np

def build_team_dataset(df_full, team_name):
    """
    Build a dataset with the requested columns for 'team_name',
    making sure the odds always reflect the perspective of the analyzed team.

    Returns a DataFrame containing:
      - season, date, team, rival_team, home_adv, last_season_team, last_season_rival
      - pct_wins, avg_goals_scored, avg_goals_received, goal_difference
      - pct_wins_rival, avg_goals_scored_rival, avg_goals_received_rival, goal_difference_rival
      - pct_wins_vs_rival, avg_goals_scored_vs_rival, avg_goals_received_vs_rival, goal_difference_vs_rival
      - goals_team, goals_rival, result
      - AvgWin, AvgLoss, AvgDraw, AvgAHWin, AvgAHLoss
    """
    # 1. Keep only the matches in which the team appears
    df_team = df_full[
        (df_full['HomeTeam'] == team_name) | (df_full['AwayTeam'] == team_name)
    ].copy()
    
    # 2. Add the base columns
    df_team['season'] = df_team['Season']
    df_team['date'] = df_team['Date']
    df_team['team'] = team_name
    
    # Determine the rival and whether the team plays at home or away
    df_team['rival_team'] = np.where(df_team['HomeTeam'] == team_name, df_team['AwayTeam'], df_team['HomeTeam'])
    df_team['home_adv'] = np.where(df_team['HomeTeam'] == team_name, 1, 0)

    # League position in the previous season
    df_team['last_season_team'] = df_team.apply(lambda row: get_last_season_position(row['team'], row['season']), axis=1)
    df_team['last_season_rival'] = df_team.apply(lambda row: get_last_season_position(row['rival_team'], row['season']), axis=1)

    # Goals and result from the team's perspective
    df_team['goals_team'], df_team['goals_rival'] = zip(*df_team.apply(lambda row: get_goals_for_team(row, team_name), axis=1))
    df_team['result'] = df_team.apply(lambda row: get_result_for_team(row, team_name), axis=1)
    
    # Sort by date
    df_team.sort_values('date', inplace=True)

    # 3. Initialize the statistics columns
    stats_cols = [
        'pct_wins', 'avg_goals_scored', 'avg_goals_received', 'goal_difference',
        'pct_wins_rival', 'avg_goals_scored_rival', 'avg_goals_received_rival', 'goal_difference_rival',
        'pct_wins_vs_rival', 'avg_goals_scored_vs_rival', 'avg_goals_received_vs_rival', 'goal_difference_vs_rival'
    ]
    for col in stats_cols:
        df_team[col] = np.nan

    # 4. Map the odds to the analyzed team's perspective
    df_team['AvgWin'] = np.where(df_team['home_adv'] == 1, df_team['AvgH'], df_team['AvgA'])
    df_team['AvgLoss'] = np.where(df_team['home_adv'] == 1, df_team['AvgA'], df_team['AvgH'])
    df_team['AvgDraw'] = df_team['AvgD']
    df_team['AvgAHWin'] = np.where(df_team['home_adv'] == 1, df_team['AvgAHH'], df_team['AvgAHA'])
    df_team['AvgAHLoss'] = np.where(df_team['home_adv'] == 1, df_team['AvgAHA'], df_team['AvgAHH'])

    # 5. Compute the statistics for each match
    df_team = df_team.reset_index(drop=True)

    for i in range(len(df_team)):
        match_date = df_team.loc[i, 'date']
        match_season = df_team.loc[i, 'season']
        rival = df_team.loc[i, 'rival_team']

        # Team statistics over its last 10 matches
        pct_w, avg_gs, avg_gr, gd = calculate_past_n_matches_stats(
            df_team, team_name, match_date, match_season, n=10
        )
        df_team.loc[i, ['pct_wins', 'avg_goals_scored', 'avg_goals_received', 'goal_difference']] = pct_w, avg_gs, avg_gr, gd

        # Rival statistics over its last 10 matches
        pct_wr, avg_gs_r, avg_gr_r, gd_r = calculate_rival_stats(
            df_full, rival, match_date, match_season, n=10
        )
        df_team.loc[i, ['pct_wins_rival', 'avg_goals_scored_rival', 'avg_goals_received_rival', 'goal_difference_rival']] = pct_wr, avg_gs_r, avg_gr_r, gd_r

        # Head-to-head statistics (last 5 meetings)
        pct_w_vs, sum_gs_vs, sum_gr_vs, gd_vs = calculate_vs_rival_stats(
            df_full, team_name, rival, match_date, n=5
        )
        df_team.loc[i, ['pct_wins_vs_rival', 'avg_goals_scored_vs_rival', 'avg_goals_received_vs_rival', 'goal_difference_vs_rival']] = pct_w_vs, sum_gs_vs, sum_gr_vs, gd_vs

    # 6. Select the final columns
    final_cols = [
        'season', 'date', 'team', 'rival_team', 'home_adv', 'last_season_team', 'last_season_rival',
        'pct_wins', 'avg_goals_scored', 'avg_goals_received', 'goal_difference',
        'pct_wins_rival', 'avg_goals_scored_rival', 'avg_goals_received_rival', 'goal_difference_rival',
        'pct_wins_vs_rival', 'avg_goals_scored_vs_rival', 'avg_goals_received_vs_rival', 'goal_difference_vs_rival',
        'goals_team', 'goals_rival', 'result',
        'AvgWin', 'AvgLoss', 'AvgDraw', 'AvgAHWin', 'AvgAHLoss'
    ]
    df_team_final = df_team[final_cols].copy()

    # Round numeric values to 2 decimals
    df_team_final = df_team_final.round(2)

    return df_team_final


Saving the datasets

In [53]:
equipos = ["Real Madrid", "Barcelona", "Valencia", "Ath Bilbao"]

datasets = {}

# Asian-handicap odds columns in which missing values are imputed
cols_to_impute = ['AvgAHWin', 'AvgAHLoss']

for eq in equipos:
    # Build the team's dataset
    df_eq = build_team_dataset(df_raw, eq)
    
    # Fill NaN in these odds columns with 1.0, the value used here as neutral
    df_eq[cols_to_impute] = df_eq[cols_to_impute].fillna(1.0)

    # Save the CSV with floats formatted to 2 decimals
    filename = f"datasets_equipos/{eq.lower().replace(' ', '_')}.csv"
    df_eq.to_csv(filename, index=False, encoding="utf-8", float_format='%.2f')
    print(f"Dataset for {eq} saved to: {filename}")
    
    # Also keep the dataset in the 'datasets' dictionary
    datasets[eq] = df_eq


Dataset for Real Madrid saved to: datasets_equipos/real_madrid.csv
Dataset for Barcelona saved to: datasets_equipos/barcelona.csv
Dataset for Valencia saved to: datasets_equipos/valencia.csv
Dataset for Ath Bilbao saved to: datasets_equipos/ath_bilbao.csv
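
`to_csv` does not create missing folders, so the loop above relies on `datasets_equipos/` already existing. A minimal sketch of creating the folder first is shown below. It writes a one-row placeholder frame to a temporary directory, and the helper `team_csv_path` is a hypothetical name.

```python
import tempfile
from pathlib import Path

import pandas as pd

def team_csv_path(out_dir: Path, team: str) -> Path:
    # Same naming rule as above: lower case, spaces replaced by underscores
    return out_dir / f"{team.lower().replace(' ', '_')}.csv"

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "datasets_equipos"
    out_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists

    path = team_csv_path(out_dir, "Ath Bilbao")
    # Placeholder frame standing in for a real team dataset
    pd.DataFrame({"pct_wins": [0.5]}).to_csv(path, index=False, float_format="%.2f")
    saved_name = path.name
    saved_text = path.read_text()

print(saved_name)  # ath_bilbao.csv
```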


In [54]:
datasets["Real Madrid"]

Unnamed: 0,season,date,team,rival_team,home_adv,last_season_team,last_season_rival,pct_wins,avg_goals_scored,avg_goals_received,...,avg_goals_received_vs_rival,goal_difference_vs_rival,goals_team,goals_rival,result,AvgWin,AvgLoss,AvgDraw,AvgAHWin,AvgAHLoss
0,2003-04,2003-08-30,Real Madrid,Betis,1,21,21,0.00,0.00,0.00,...,0.0,0.0,2,1,1,1.38,7.18,4.00,1.94,1.91
1,2003-04,2003-09-02,Real Madrid,Villarreal,0,21,21,1.00,2.00,1.00,...,0.0,0.0,1,1,0,1.80,3.99,3.28,1.96,1.88
2,2003-04,2003-09-13,Real Madrid,Valladolid,1,21,21,0.50,1.50,1.00,...,0.0,0.0,7,2,1,1.29,9.10,4.39,1.82,2.02
3,2003-04,2003-09-21,Real Madrid,Malaga,0,21,21,0.67,3.33,1.33,...,0.0,0.0,3,1,1,1.62,4.90,3.40,1.92,1.93
4,2003-04,2003-09-27,Real Madrid,Valencia,0,21,21,0.75,3.25,1.25,...,0.0,0.0,0,2,-1,2.27,2.79,3.13,1.95,1.90
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
793,2023-24,2024-05-04,Real Madrid,Cadiz,1,2,16,0.80,2.30,0.70,...,2.0,6.0,3,0,1,7.09,6.00,2.13,2.02,1.88
794,2023-24,2024-05-11,Real Madrid,Granada,0,2,21,0.80,2.20,0.70,...,2.0,11.0,4,0,1,3.96,2.52,5.59,1.86,1.99
795,2023-24,2024-05-14,Real Madrid,Alaves,1,2,21,0.90,2.50,0.60,...,4.0,9.0,5,0,1,11.32,8.03,2.46,2.08,1.84
796,2023-24,2024-05-19,Real Madrid,Villarreal,0,2,5,0.90,2.90,0.60,...,6.0,1.0,4,4,0,3.61,2.34,3.35,1.84,2.02


In [55]:
# Final columns (same list as in build_team_dataset)
final_cols = [
    'season', 'date', 'team', 'rival_team', 'home_adv', 'last_season_team', 'last_season_rival',
    'pct_wins', 'avg_goals_scored', 'avg_goals_received', 'goal_difference',
    'pct_wins_rival', 'avg_goals_scored_rival', 'avg_goals_received_rival', 'goal_difference_rival',
    'pct_wins_vs_rival', 'avg_goals_scored_vs_rival', 'avg_goals_received_vs_rival', 'goal_difference_vs_rival',
    'goals_team', 'goals_rival', 'result',
    'AvgWin', 'AvgLoss', 'AvgDraw', 'AvgAHWin', 'AvgAHLoss'
]

In [56]:
df_eq_1 = datasets.get('Real Madrid')
nan_eq_1 = df_eq_1[final_cols].isna().sum()  # Count NaN values in the final columns
print("NaN in team 'Real Madrid':")
print(nan_eq_1)
print("-------------")


NaN in team 'Real Madrid':
season                         0
date                           0
team                           0
rival_team                     0
home_adv                       0
last_season_team               0
last_season_rival              0
pct_wins                       0
avg_goals_scored               0
avg_goals_received             0
goal_difference                0
pct_wins_rival                 0
avg_goals_scored_rival         0
avg_goals_received_rival       0
goal_difference_rival          0
pct_wins_vs_rival              0
avg_goals_scored_vs_rival      0
avg_goals_received_vs_rival    0
goal_difference_vs_rival       0
goals_team                     0
goals_rival                    0
result                         0
AvgWin                         0
AvgLoss                        0
AvgDraw                        0
AvgAHWin                       0
AvgAHLoss                      0
dtype: int64
-------------


In [57]:
df_eq_1 = datasets.get('Barcelona')
nan_eq_1 = df_eq_1[final_cols].isna().sum()  # Count NaN values in the final columns
print("NaN in team 'Barcelona':")
print(nan_eq_1)
print("-------------")


NaN in team 'Barcelona':
season                         0
date                           0
team                           0
rival_team                     0
home_adv                       0
last_season_team               0
last_season_rival              0
pct_wins                       0
avg_goals_scored               0
avg_goals_received             0
goal_difference                0
pct_wins_rival                 0
avg_goals_scored_rival         0
avg_goals_received_rival       0
goal_difference_rival          0
pct_wins_vs_rival              0
avg_goals_scored_vs_rival      0
avg_goals_received_vs_rival    0
goal_difference_vs_rival       0
goals_team                     0
goals_rival                    0
result                         0
AvgWin                         0
AvgLoss                        0
AvgDraw                        0
AvgAHWin                       0
AvgAHLoss                      0
dtype: int64
-------------


In [58]:
df_eq_1 = datasets.get('Valencia')
nan_eq_1 = df_eq_1[final_cols].isna().sum()  # Count NaN values in the final columns
print("NaN in team 'Valencia':")
print(nan_eq_1)
print("-------------")


NaN in team 'Valencia':
season                         0
date                           0
team                           0
rival_team                     0
home_adv                       0
last_season_team               0
last_season_rival              0
pct_wins                       0
avg_goals_scored               0
avg_goals_received             0
goal_difference                0
pct_wins_rival                 0
avg_goals_scored_rival         0
avg_goals_received_rival       0
goal_difference_rival          0
pct_wins_vs_rival              0
avg_goals_scored_vs_rival      0
avg_goals_received_vs_rival    0
goal_difference_vs_rival       0
goals_team                     0
goals_rival                    0
result                         0
AvgWin                         0
AvgLoss                        0
AvgDraw                        0
AvgAHWin                       0
AvgAHLoss                      0
dtype: int64
-------------


In [59]:
df_eq_1 = datasets.get('Ath Bilbao')
nan_eq_1 = df_eq_1[final_cols].isna().sum()  # Count NaN values in the final columns
print("NaN in team 'Ath Bilbao':")
print(nan_eq_1)
print("-------------")


NaN in team 'Ath Bilbao':
season                         0
date                           0
team                           0
rival_team                     0
home_adv                       0
last_season_team               0
last_season_rival              0
pct_wins                       0
avg_goals_scored               0
avg_goals_received             0
goal_difference                0
pct_wins_rival                 0
avg_goals_scored_rival         0
avg_goals_received_rival       0
goal_difference_rival          0
pct_wins_vs_rival              0
avg_goals_scored_vs_rival      0
avg_goals_received_vs_rival    0
goal_difference_vs_rival       0
goals_team                     0
goals_rival                    0
result                         0
AvgWin                         0
AvgLoss                        0
AvgDraw                        0
AvgAHWin                       0
AvgAHLoss                      0
dtype: int64
-------------
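
The four per-team checks follow the same pattern, so they could be collapsed into a single report. The sketch below uses a toy stand-in for the `datasets` dictionary with made-up values; with the real dictionary the same comprehension would apply.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the `datasets` dictionary (values are made up)
toy_datasets = {
    "Team A": pd.DataFrame({"pct_wins": [0.5, 0.8], "AvgWin": [1.9, np.nan]}),
    "Team B": pd.DataFrame({"pct_wins": [0.1, 0.2], "AvgWin": [2.4, 3.1]}),
}

# One row per team, one column per feature, each cell a NaN count
nan_report = pd.DataFrame(
    {name: df.isna().sum() for name, df in toy_datasets.items()}
).T
print(nan_report)

total_nan = int(nan_report.to_numpy().sum())
print(total_nan)  # 1
```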
