In [1]:
# Add more information:
    # How many times the home team beat the away team
    # How many times the home team lost to the away team

📊 Recent performance features
These help the model understand how each team has been playing lately:
- `home_team_last_5_win_rate`: share of wins in the home team's last 5 matches.
- `away_team_last_5_win_rate`: same for the away team.
- `home_team_goal_avg_last_5`: average goals scored by the home team in its last 5 matches.
- `away_team_goal_avg_last_5`: same for the away team.
- `home_team_goals_conceded_last_5`: average goals conceded by the home team in its last 5 matches.
- `home_team_form`: points earned in the last 5 matches (3 per win, 1 per draw, 0 per loss).
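
A minimal sketch of how rolling features like these could be computed with pandas. It assumes a long-format table with one row per team per match; the column names `team`, `date`, `goals_for`, `goals_against` and the helper `add_recent_form` are hypothetical and not taken from the project code. The `shift(1)` keeps the current match out of its own window, so each feature only uses information available before kickoff.

```python
import pandas as pd


def add_recent_form(team_matches: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Add pre-match rolling features per team (hypothetical helper)."""
    out = team_matches.sort_values(["team", "date"]).reset_index(drop=True)
    out["win"] = (out["goals_for"] > out["goals_against"]).astype(float)
    draw = (out["goals_for"] == out["goals_against"]).astype(float)
    out["points"] = 3 * out["win"] + draw  # 3 per win, 1 per draw, 0 per loss

    def prior_rolling(col: str, how: str) -> pd.Series:
        # shift(1) removes the current match; rolling() then covers up to
        # `window` earlier matches of the same team.
        return out.groupby("team")[col].transform(
            lambda s: getattr(s.shift(1).rolling(window, min_periods=1), how)()
        )

    out[f"last_{window}_win_rate"] = prior_rolling("win", "mean")
    out[f"goal_avg_last_{window}"] = prior_rolling("goals_for", "mean")
    out[f"goals_conceded_last_{window}"] = prior_rolling("goals_against", "mean")
    out["form"] = prior_rolling("points", "sum")
    return out
```

A team's first match has no history, so its features come out as NaN here; whether to fill them with 0 or leave them missing is a modelling choice.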


🏠📦 Home/away performance features
Separating a team's performance at home from its performance away often gives the model useful extra signal:
- `home_team_home_win_rate`: win rate when playing at home.
- `away_team_away_win_rate`: win rate when playing away.
- `home_team_home_goal_avg`: average goals scored at home.
- `away_team_away_goal_avg`: average goals scored away from home.
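
One possible way to derive these venue-specific rates, sketched under assumptions: the input is a match-level table with `datetime`, `home_team`, `away_team`, `home_score` and `away_score`, and the rates are taken over all earlier matches at that venue (the project may well use a fixed window instead). The helper `add_venue_rates` is hypothetical.

```python
import pandas as pd


def add_venue_rates(matches: pd.DataFrame) -> pd.DataFrame:
    """Pre-match home/away rates over all earlier matches (hypothetical helper)."""
    out = matches.sort_values("datetime").reset_index(drop=True)
    home_win = (out["home_score"] > out["away_score"]).astype(float)
    away_win = (out["away_score"] > out["home_score"]).astype(float)

    def prior_mean(values: pd.Series, team: pd.Series) -> pd.Series:
        # Expanding mean over the team's earlier matches in that role;
        # shift(1) excludes the match being described.
        return values.groupby(team).transform(lambda s: s.shift(1).expanding().mean())

    out["home_team_home_win_rate"] = prior_mean(home_win, out["home_team"])
    out["away_team_away_win_rate"] = prior_mean(away_win, out["away_team"])
    out["home_team_home_goal_avg"] = prior_mean(out["home_score"].astype(float), out["home_team"])
    out["away_team_away_goal_avg"] = prior_mean(out["away_score"].astype(float), out["away_team"])
    return out
```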


📈 Historical matchup features
Also known as head-to-head:
- `head_to_head_win_home_team`: how many times the home team has beaten the away team historically.
- `head_to_head_draws`: number of draws in the last X matches between the two teams.
- `head_to_head_goal_diff`: goal difference over the recent matches between them.
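
A sketch of head-to-head counters built in a single chronological pass. It makes two simplifying assumptions that may differ from the project's implementation: meetings at either venue are counted, and the whole history is used rather than only the last X matches. The helper name `add_head_to_head` is hypothetical.

```python
from collections import defaultdict

import pandas as pd


def add_head_to_head(matches: pd.DataFrame) -> pd.DataFrame:
    """Pre-match head-to-head counters (hypothetical helper)."""
    out = matches.sort_values("datetime").reset_index(drop=True)
    wins = defaultdict(int)       # (winner, loser) -> past wins
    draws = defaultdict(int)      # frozenset({a, b}) -> past draws
    goal_diff = defaultdict(int)  # (a, b) -> goals by a minus goals by b

    h2h_wins, h2h_draws, h2h_diff = [], [], []
    for row in out.itertuples(index=False):
        home, away = row.home_team, row.away_team
        pair = frozenset((home, away))
        # Read the running totals first so the current match is not counted.
        h2h_wins.append(wins[(home, away)])
        h2h_draws.append(draws[pair])
        h2h_diff.append(goal_diff[(home, away)])
        # Then update the totals with the current result.
        if row.home_score > row.away_score:
            wins[(home, away)] += 1
        elif row.home_score < row.away_score:
            wins[(away, home)] += 1
        else:
            draws[pair] += 1
        diff = row.home_score - row.away_score
        goal_diff[(home, away)] += diff
        goal_diff[(away, home)] -= diff

    out["head_to_head_win_home_team"] = h2h_wins
    out["head_to_head_draws"] = h2h_draws
    out["head_to_head_goal_diff"] = h2h_diff
    return out
```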


🧠 Context features
Based on date, time, and venue:
- `is_weekend`: was the match played on a weekend?
- `month`: month of the match (to check for seasonality).
- `days_since_last_match_home`: days since the home team's last match.
- `match_importance`: can be derived if you have the round or the league-table position (e.g. a "decisive match").
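
The date-based features can be sketched as follows, assuming a match-level table with `datetime`, `home_team` and `away_team`. Here the rest days count the home team's previous match in either role, which is an assumption; the helper `add_context_features` is hypothetical.

```python
import pandas as pd


def add_context_features(matches: pd.DataFrame) -> pd.DataFrame:
    """Calendar and rest-day features (hypothetical helper)."""
    out = matches.copy()
    out["datetime"] = pd.to_datetime(out["datetime"])
    out = out.sort_values("datetime").reset_index(drop=True)

    out["is_weekend"] = out["datetime"].dt.dayofweek >= 5  # Saturday=5, Sunday=6
    out["month"] = out["datetime"].dt.month

    last_seen = {}  # team -> timestamp of its most recent match
    days = []
    for row in out.itertuples(index=False):
        prev = last_seen.get(row.home_team)
        days.append((row.datetime - prev).days if prev is not None else float("nan"))
        last_seen[row.home_team] = row.datetime
        last_seen[row.away_team] = row.datetime
    out["days_since_last_match_home"] = days
    return out
```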


⚽️ Difference features between the two teams
You can create columns holding the difference between the two teams' statistics:
- `goal_avg_diff`: difference between the two teams' average goals scored.
- `win_rate_diff`: difference between their win rates.
- `ranking_diff`: if you have a ranking or points total.
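
A short sketch of the difference columns, computed as home minus away from the per-team columns listed earlier. Using the last-5 columns as inputs is an assumption, and the helper `add_diff_features` is hypothetical. The ranking difference is only added when both ranking columns are present.

```python
import pandas as pd


def add_diff_features(df: pd.DataFrame) -> pd.DataFrame:
    """Home-minus-away difference columns (hypothetical helper)."""
    out = df.copy()
    out["goal_avg_diff"] = out["home_team_goal_avg_last_5"] - out["away_team_goal_avg_last_5"]
    out["win_rate_diff"] = out["home_team_last_5_win_rate"] - out["away_team_last_5_win_rate"]
    # Skip ranking_diff when the ranking columns are not available.
    if {"home_team_ranking", "away_team_ranking"} <= set(out.columns):
        out["ranking_diff"] = out["home_team_ranking"] - out["away_team_ranking"]
    return out
```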

🔥 Momentum and recent form features
These capture each team's momentum:
- `home_team_streak`: number of consecutive wins for the home team.
- `away_team_streak`: same for the away team.
- `home_team_last_match_result`: result of the last match (can be coded as +1 win, 0 draw, -1 loss).
- `home_team_points_last_3`: points earned in the last 3 matches (win = 3, draw = 1).
- `home_team_score_diff_last_5`: goal difference over the last 5 matches.
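
A win streak can be computed without an explicit loop. The sketch below assumes one team's results in chronological order, coded +1 / 0 / -1 as suggested above; the function name `current_win_streak` is hypothetical.

```python
import pandas as pd


def current_win_streak(results: pd.Series) -> pd.Series:
    """Consecutive wins immediately before each match (hypothetical helper)."""
    is_win = results.eq(1)
    # Every non-win opens a new block; the cumulative count of wins inside
    # a block is the streak after each match.
    block = (~is_win).cumsum()
    streak_after = is_win.astype(int).groupby(block).cumsum()
    # Shift by one so each match only sees the streak built before it.
    return streak_after.shift(1, fill_value=0)
```

For a table with several teams, this would be applied per team, for example through `groupby("team")["result"].transform(current_win_streak)`.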

🧮 Cumulative statistical features
Accumulated per season:
- `home_team_total_goals_scored_season`
- `home_team_total_goals_conceded_season`
- `away_team_total_goals_scored_season`
- `away_team_total_goals_conceded_season`
- `home_team_matches_played_season`: total matches played in the league so far.
- `home_team_goal_diff_season`: goal difference for the season.
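
A sketch of the season-to-date totals, again assuming a hypothetical long-format table with one row per team per match (`season`, `team`, `date`, `goals_for`, `goals_against`). Subtracting the current row from the cumulative sum yields totals as they stood before the match.

```python
import pandas as pd


def add_season_totals(team_matches: pd.DataFrame) -> pd.DataFrame:
    """Pre-match season totals per team (hypothetical helper)."""
    out = team_matches.sort_values(["season", "team", "date"]).reset_index(drop=True)
    grouped = out.groupby(["season", "team"])
    # cumsum() includes the current match, so subtract it again.
    out["total_goals_scored_season"] = grouped["goals_for"].cumsum() - out["goals_for"]
    out["total_goals_conceded_season"] = grouped["goals_against"].cumsum() - out["goals_against"]
    out["matches_played_season"] = grouped.cumcount()  # matches already played
    out["goal_diff_season"] = (
        out["total_goals_scored_season"] - out["total_goals_conceded_season"]
    )
    return out
```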

🗓️ Temporal features
- `match_day_of_week`: day of the week of the match (Monday, Tuesday, etc.).
- `season_phase`: split the season into blocks such as start/middle/end (may matter for decisive matches).
- `round_number`: if you can determine which round of the league it is.
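
A sketch of the temporal columns, assuming a `round_number` column is available and the season has a known number of rounds (38 is used as a default here, which fits a 20-team double round-robin). The phase labels `start` and `end` and the helper name are hypothetical.

```python
import pandas as pd


def add_temporal_features(matches: pd.DataFrame, total_rounds: int = 38) -> pd.DataFrame:
    """Day-of-week and season-phase columns (hypothetical helper)."""
    out = matches.copy()
    out["match_day_of_week"] = pd.to_datetime(out["datetime"]).dt.day_name()
    # Split the season into three equal-width blocks by round number.
    out["season_phase"] = pd.cut(
        out["round_number"],
        bins=[0, total_rounds / 3, 2 * total_rounds / 3, total_rounds],
        labels=["start", "mid", "end"],
    )
    return out
```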

💹 Odds-based features (if available)
If you have bookmaker odds:
- `home_win_odds`, `draw_odds`, `away_win_odds`
- `implied_prob_home_win`: convert the odds into an implied probability: `1 / odds`.
- `odds_margin`: the bookmaker's profit margin (the sum of the three implied probabilities minus 1); it may indicate market uncertainty.
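
A sketch of the odds conversion for decimal odds. It assumes that the `psch`, `pscd` and `psca` columns of the dataset hold the home, draw and away odds respectively; the extra `fair_prob_home_win` column and the helper name are hypothetical additions.

```python
import pandas as pd


def add_odds_features(
    df: pd.DataFrame,
    home_col: str = "psch",
    draw_col: str = "pscd",
    away_col: str = "psca",
) -> pd.DataFrame:
    """Implied probability and bookmaker margin (hypothetical helper)."""
    out = df.copy()
    inverse = 1.0 / out[[home_col, draw_col, away_col]]
    total = inverse.sum(axis=1)
    out["implied_prob_home_win"] = inverse[home_col]
    # Overround: how far the implied probabilities exceed 1.
    out["odds_margin"] = total - 1.0
    # Rescaled so the three outcome probabilities sum to 1.
    out["fair_prob_home_win"] = inverse[home_col] / total
    return out
```

For the first row of the dataset (odds 1.75, 3.86, 5.25) this gives a margin of roughly 2%.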

🧠 Advanced encoding (for team names)
- `home_team_embedding`: use embeddings to represent teams (e.g. count wins, goals, matches... and turn them into vectors).
- `team_strength_score`: build a "strength score" based on results and goals scored/conceded.
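
One common way to build a strength score is an Elo-style rating. The sketch below is that swapped-in technique, not something the project code is known to use, and it only looks at the match result, ignoring the goal margin. The constants (`k=20`, scale 400) and function names are illustrative assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for team A against team B on a 0..1 scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(home: float, away: float, home_score: int, away_score: int,
               k: float = 20.0) -> tuple:
    """Return the updated (home, away) ratings after one match."""
    if home_score > away_score:
        actual = 1.0
    elif home_score == away_score:
        actual = 0.5
    else:
        actual = 0.0
    delta = k * (actual - expected_score(home, away))
    # Zero-sum update: what one team gains, the other loses.
    return home + delta, away - delta
```

Ratings would be updated match by match in chronological order, with the pre-match rating of each team stored as the feature.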

💡 Some bonus ideas
- **Expected result**: expected goal difference based on recent averages.
- **Simultaneous matches**: how many matches take place on the same day (this sometimes affects performance).
- **Recent meetings between the two teams**: time since the last head-to-head match.
- **First match of the season?**: can be a boolean (True / False).
- **New coach?**: if you have this data, a "new coach" feature can help.



In [1]:
import os
os.chdir('..')  # go up one directory
os.chdir('..')  # go up one more directory


In [2]:
import pandas as pd
from pathlib import Path
pd.set_option('display.max_columns', 500)

In [3]:
FT_DIR = Path("database", "features")
df = pd.read_csv(os.path.join(FT_DIR, 'ft_df.csv'))

In [None]:
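# Columns to exclude later (identifiers, raw odds, and the target)
drop_columns = ['id', 'country', 'league', 'season', 'home_team', 'away_team',
                'result', 'psch', 'pscd', 'psca', 'maxch', 'maxcd', 'maxca',
                'avgch', 'avgcd', 'avgca', 'bfech', 'bfecd', 'datetime']
df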

Unnamed: 0,id,country,league,season,home_team,away_team,home_score,away_score,result,psch,pscd,psca,maxch,maxcd,maxca,avgch,avgcd,avgca,bfech,bfecd,datetime,hash,last_updated,home_team_last_5_win_rate,away_team_last_5_win_rate,home_team_goal_avg_last_5,away_team_goal_avg_last_5,home_team_goals_conceded_last_5,away_team_goals_conceded_last_5,home_team_form,away_team_form,home_team_home_win_rate,away_team_away_win_rate,home_team_home_goal_avg,away_team_away_goal_avg,head_to_head_win_home_team,head_to_head_draws,head_to_head_goal_diff,is_weekend,month,days_since_last_match_home,match_importance,home_team_ranking,away_team_ranking,goal_avg_diff,win_rate_diff,ranking_diff,away_team_streak,away_team_last_match_result,away_team_points_last_3,away_team_score_diff_last_5,home_team_total_goals_scored_season,home_team_total_goals_conceded_season,home_team_matches_played_season,home_team_goal_diff_season,away_team_total_goals_scored_season,away_team_total_goals_conceded_season,match_day_of_week,season_phase,home_team_encoder,away_team_encoder,winner
0,5415,Brazil,Serie A,2012,Palmeiras,Portuguesa,1,1,D,1.75,3.86,5.25,1.76,3.87,5.31,1.69,3.50,4.90,,,2012-05-19 22:30:00,10444097902145517897,2024-12-18 22:32:46,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,27,30,1
1,5416,Brazil,Serie A,2012,Sport Recife,Flamengo RJ,1,1,D,2.83,3.39,2.68,2.83,3.42,2.70,2.59,3.23,2.58,,,2012-05-19 22:30:00,7876314183501917566,2024-12-18 22:32:46,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,34,17,1
2,5417,Brazil,Serie A,2012,Figueirense,Nautico,2,1,H,1.60,4.04,6.72,1.67,4.05,7.22,1.59,3.67,5.64,,,2012-05-20 01:00:00,9296066046964045682,2024-12-18 22:32:46,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,16,26,2
3,5418,Brazil,Serie A,2012,Botafogo RJ,Sao Paulo,4,2,H,2.49,3.35,3.15,2.49,3.39,3.15,2.35,3.26,2.84,,,2012-05-20 20:00:00,3618841616446699339,2024-12-18 22:32:46,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,6,33,2
4,5419,Brazil,Serie A,2012,Corinthians,Fluminense,0,1,A,1.96,3.53,4.41,1.96,3.53,4.41,1.89,3.33,3.89,,,2012-05-20 20:00:00,11994628649421207242,2024-12-18 22:32:46,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,0,3,0.0,0.0,-3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,11,18,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4954,59823,Brazil,Serie A,2025,Sport Recife,Palmeiras,1,2,A,3.20,3.00,2.59,3.40,3.10,2.60,3.15,2.98,2.49,3.40,3.10,2025-04-06 22:30:00,6697492086880154176,2025-04-13 01:24:10,0.2,0.4,0.6,0.8,0.8,1.0,6,7,0.4,0.8,1.4,2.0,0,1,-4,True,4,1213.0,0,1,4,-0.2,-0.2,-3,3,1,9,3,0,0,1,0,0,0,Sunday,mid,34,27,0
4955,59824,Brazil,Serie A,2025,Vitoria,Flamengo RJ,1,2,A,5.52,3.36,1.79,5.54,3.45,1.84,5.15,3.32,1.79,5.90,3.45,2025-04-06 22:30:00,3456916064123279528,2025-04-13 01:24:10,0.2,0.4,1.2,1.8,1.2,1.0,6,9,0.6,0.6,1.4,1.4,0,2,-3,True,4,122.0,0,0,4,-0.6,-0.2,-4,1,1,7,3,0,2,1,-2,1,1,Sunday,mid,36,17,0
4956,59821,Brazil,Serie A,2025,Internacional,Cruzeiro,3,0,H,1.62,3.77,6.53,1.67,3.77,6.58,1.62,3.66,6.14,1.67,3.90,2025-04-06 22:30:00,12860026645276073376,2025-04-13 01:24:10,0.2,0.4,1.4,1.2,1.8,1.0,4,8,0.6,0.2,1.8,0.6,2,3,2,True,4,122.0,0,4,3,0.2,-0.2,1,1,1,4,-2,1,1,1,0,2,1,Sunday,mid,22,14,2
4957,59822,Brazil,Serie A,2025,Mirassol,Fortaleza,1,1,D,2.36,3.04,3.58,2.45,3.10,3.58,2.34,2.99,3.43,2.46,3.10,2025-04-06 22:30:00,7380770139153214873,2025-04-13 01:24:10,0.0,0.4,1.0,1.2,2.0,1.0,0,7,0.0,0.2,0.0,1.6,0,0,0,True,4,,0,1,4,-0.2,-0.4,-3,0,-1,1,-1,1,2,1,-1,2,0,Sunday,mid,25,19,1


In [4]:
import json
import os
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

from modelagem.feature_eng.match_analysis import get_storage_ranks, create_main_cols
from modelagem.utils.get_data import get_soccer_data
from modelagem.utils.logs import logger
# Team encoding and preprocessing helpers
from modelagem.utils.preprocessing.engine import encode_teams, calculate_match_points, get_feature_columns

# Base directories
MODEL_DIR = os.path.join('database', 'models')
FT_DIR = Path("features")
LOG_DIR = Path("logs")

df = get_soccer_data()
df = df.copy()



# # Creating the initial columns
# df['match_name'] = df['home_team'] + ' - ' + df['away_team']
# df['datetime'] = pd.to_datetime(df['datetime'])

# # Selecting the important columns
# df = df[["season", "datetime", "home_team", "away_team", 
#             "home_score", "away_score", "result", "match_name"]]

# # Converting columns to integer
# to_int = ['season', 'home_score', 'away_score']
# df[to_int] = df[to_int].apply(pd.to_numeric, errors='coerce').fillna(0).astype(int)

# # Encoding the teams
# df = encode_teams(df,
#                     path_team_mapping=os.path.join(MODEL_DIR, "team_mapping.json"))

# # Computing match points and results
# logger.debug("Starting the calculation of match points and results.")
# df = calculate_match_points(df)

# # Reordering columns
# cols_order = ['season','datetime', 'result', 'match_name', 'home_team', 'away_team',
#                 'home_team_encoder', 'away_team_encoder', 'winner', 'home_score', 
#                 'away_score', 'h_match_points', 'a_match_points']
# df = df[cols_order]

# # Feature engineering
# logger.debug("Starting feature engineering.")
# df_storage_ranks = get_storage_ranks(df)

# logger.debug("Starting feature engineering for the home team.")
# ht_cols = [f'ht{col}' for col in get_feature_columns()]
# at_cols = [f'at{col}' for col in get_feature_columns()]

# logger.debug("Running create_main_cols")
# df[ht_cols] = df.apply(lambda x: create_main_cols(x, x.home_team, df, df_storage_ranks), axis=1, result_type='expand')
# df[at_cols] = df.apply(lambda x: create_main_cols(x, x.away_team, df, df_storage_ranks), axis=1, result_type='expand')

# # Dropping unnecessary columns
# df.drop(columns=['match_name', 'datetime', 'home_team', 'away_team', 
#                     'home_score', 'away_score', 'h_match_points', 'a_match_points'], inplace=True)

# df.fillna(-33, inplace=True)  # Filling missing values

2025-04-13 21:44:09,341 | main | INFO | get_data | get_soccer_data | Dados carregados com sucesso para o país: Brazil
2025-04-13 21:44:09,342 | main | DEBUG | get_data | get_soccer_data | Conexão com o banco de dados fechada.


Conexão SQLite estabelecida.


In [5]:
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
y = encoder.fit_transform(df['result'])  # classes are sorted alphabetically: 'A' → 0, 'D' → 1, 'H' → 2

df["map_result"] = y
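
Since `LabelEncoder` assigns codes in sorted class order, the mapping is easiest to confirm by reading it back from `classes_` rather than assuming it. A small self-contained check on toy labels (the variable names are illustrative):

```python
from sklearn.preprocessing import LabelEncoder

toy_encoder = LabelEncoder()
codes = toy_encoder.fit_transform(["H", "D", "A", "H"])

# classes_ is sorted, and each label's position is its integer code.
mapping = {label: code for code, label in enumerate(toy_encoder.classes_)}
print(mapping)        # {'A': 0, 'D': 1, 'H': 2}
print(list(codes))    # [2, 1, 0, 2]
```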



In [6]:
from modelagem.utils.feature.implementing_features import main

path_team_mapping = os.path.join(MODEL_DIR, "team_mapping.json")
df_with_features = main(df, path_team_mapping)

ℹ️ Colunas de ranking não encontradas (pulando ranking_diff)


In [7]:
df_with_features

Unnamed: 0,id,country,league,season,home_team,away_team,home_score,away_score,result,psch,pscd,psca,maxch,maxcd,maxca,avgch,avgcd,avgca,bfech,bfecd,datetime,hash,last_updated,map_result,home_team_last_5_win_rate,away_team_last_5_win_rate,home_team_goal_avg_last_5,away_team_goal_avg_last_5,home_team_goals_conceded_last_5,away_team_goals_conceded_last_5,home_team_form,away_team_form,home_team_home_win_rate,away_team_away_win_rate,home_team_home_goal_avg,away_team_away_goal_avg,head_to_head_win_home_team,head_to_head_draws,head_to_head_goal_diff,is_weekend,month,days_since_last_match_home,match_importance,home_team_ranking,away_team_ranking,goal_avg_diff,win_rate_diff,ranking_diff,away_team_streak,away_team_last_match_result,away_team_points_last_3,away_team_score_diff_last_5,home_team_total_goals_scored_season,home_team_total_goals_conceded_season,home_team_matches_played_season,home_team_goal_diff_season,away_team_total_goals_scored_season,away_team_total_goals_conceded_season,match_day_of_week,season_phase,home_team_encoder,away_team_encoder
0,5415,Brazil,Serie A,2012,Palmeiras,Portuguesa,1,1,1,1.75,3.86,5.25,1.76,3.87,5.31,1.69,3.50,4.90,,,2012-05-19 22:30:00,10444097902145517897,2024-12-18 22:32:46,1,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,27,30
1,5416,Brazil,Serie A,2012,Sport Recife,Flamengo RJ,1,1,1,2.83,3.39,2.68,2.83,3.42,2.70,2.59,3.23,2.58,,,2012-05-19 22:30:00,7876314183501917566,2024-12-18 22:32:46,1,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,34,17
2,5417,Brazil,Serie A,2012,Figueirense,Nautico,2,1,2,1.60,4.04,6.72,1.67,4.05,7.22,1.59,3.67,5.64,,,2012-05-20 01:00:00,9296066046964045682,2024-12-18 22:32:46,2,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,16,26
3,5418,Brazil,Serie A,2012,Botafogo RJ,Sao Paulo,4,2,2,2.49,3.35,3.15,2.49,3.39,3.15,2.35,3.26,2.84,,,2012-05-20 20:00:00,3618841616446699339,2024-12-18 22:32:46,2,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,6,33
4,5419,Brazil,Serie A,2012,Corinthians,Fluminense,0,1,0,1.96,3.53,4.41,1.96,3.53,4.41,1.89,3.33,3.89,,,2012-05-20 20:00:00,11994628649421207242,2024-12-18 22:32:46,0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,0,3,0.0,0.0,-3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,11,18
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4956,59823,Brazil,Serie A,2025,Sport Recife,Palmeiras,1,2,0,3.20,3.00,2.59,3.40,3.10,2.60,3.15,2.98,2.49,3.40,3.10,2025-04-06 22:30:00,6697492086880154176,2025-04-13 01:24:10,0,0.2,0.4,0.6,0.8,0.8,1.0,6,7,0.4,0.8,1.4,2.0,0,1,-4,True,4,1213.0,0,1,4,-0.2,-0.2,-3,3,1,9,3,0,0,1,0,0,0,Sunday,mid,34,27
4957,59824,Brazil,Serie A,2025,Vitoria,Flamengo RJ,1,2,0,5.52,3.36,1.79,5.54,3.45,1.84,5.15,3.32,1.79,5.90,3.45,2025-04-06 22:30:00,3456916064123279528,2025-04-13 01:24:10,0,0.2,0.4,1.2,1.8,1.2,1.0,6,9,0.6,0.6,1.4,1.4,0,2,-3,True,4,122.0,0,0,4,-0.6,-0.2,-4,1,1,7,3,0,2,1,-2,1,1,Sunday,mid,36,17
4954,59821,Brazil,Serie A,2025,Internacional,Cruzeiro,3,0,2,1.62,3.77,6.53,1.67,3.77,6.58,1.62,3.66,6.14,1.67,3.90,2025-04-06 22:30:00,12860026645276073376,2025-04-13 01:24:10,2,0.2,0.4,1.4,1.2,1.8,1.0,4,8,0.6,0.2,1.8,0.6,2,3,2,True,4,122.0,0,4,3,0.2,-0.2,1,1,1,4,-2,1,1,1,0,2,1,Sunday,mid,22,14
4955,59822,Brazil,Serie A,2025,Mirassol,Fortaleza,1,1,1,2.36,3.04,3.58,2.45,3.10,3.58,2.34,2.99,3.43,2.46,3.10,2025-04-06 22:30:00,7380770139153214873,2025-04-13 01:24:10,1,0.0,0.4,1.0,1.2,2.0,1.0,0,7,0.0,0.2,0.0,1.6,0,0,0,True,4,,0,1,4,-0.2,-0.4,-3,0,-1,1,-1,1,2,1,-1,2,0,Sunday,mid,25,19


In [8]:
df_with_features

# match_day_of_week, season_phase, result, home_team, away_team

Unnamed: 0,id,country,league,season,home_team,away_team,home_score,away_score,result,psch,pscd,psca,maxch,maxcd,maxca,avgch,avgcd,avgca,bfech,bfecd,datetime,hash,last_updated,map_result,home_team_last_5_win_rate,away_team_last_5_win_rate,home_team_goal_avg_last_5,away_team_goal_avg_last_5,home_team_goals_conceded_last_5,away_team_goals_conceded_last_5,home_team_form,away_team_form,home_team_home_win_rate,away_team_away_win_rate,home_team_home_goal_avg,away_team_away_goal_avg,head_to_head_win_home_team,head_to_head_draws,head_to_head_goal_diff,is_weekend,month,days_since_last_match_home,match_importance,home_team_ranking,away_team_ranking,goal_avg_diff,win_rate_diff,ranking_diff,away_team_streak,away_team_last_match_result,away_team_points_last_3,away_team_score_diff_last_5,home_team_total_goals_scored_season,home_team_total_goals_conceded_season,home_team_matches_played_season,home_team_goal_diff_season,away_team_total_goals_scored_season,away_team_total_goals_conceded_season,match_day_of_week,season_phase,home_team_encoder,away_team_encoder
0,5415,Brazil,Serie A,2012,Palmeiras,Portuguesa,1,1,1,1.75,3.86,5.25,1.76,3.87,5.31,1.69,3.50,4.90,,,2012-05-19 22:30:00,10444097902145517897,2024-12-18 22:32:46,1,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,27,30
1,5416,Brazil,Serie A,2012,Sport Recife,Flamengo RJ,1,1,1,2.83,3.39,2.68,2.83,3.42,2.70,2.59,3.23,2.58,,,2012-05-19 22:30:00,7876314183501917566,2024-12-18 22:32:46,1,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,1,1,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,Saturday,mid,34,17
2,5417,Brazil,Serie A,2012,Figueirense,Nautico,2,1,2,1.60,4.04,6.72,1.67,4.05,7.22,1.59,3.67,5.64,,,2012-05-20 01:00:00,9296066046964045682,2024-12-18 22:32:46,2,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,16,26
3,5418,Brazil,Serie A,2012,Botafogo RJ,Sao Paulo,4,2,2,2.49,3.35,3.15,2.49,3.39,3.15,2.35,3.26,2.84,,,2012-05-20 20:00:00,3618841616446699339,2024-12-18 22:32:46,2,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,3,0,0.0,0.0,3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,6,33
4,5419,Brazil,Serie A,2012,Corinthians,Fluminense,0,1,0,1.96,3.53,4.41,1.96,3.53,4.41,1.89,3.33,3.89,,,2012-05-20 20:00:00,11994628649421207242,2024-12-18 22:32:46,0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0.0,0.0,0,0,0,True,5,,0,0,3,0.0,0.0,-3,0,0,0,0,0,0,0,0,0,0,Sunday,mid,11,18
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4956,59823,Brazil,Serie A,2025,Sport Recife,Palmeiras,1,2,0,3.20,3.00,2.59,3.40,3.10,2.60,3.15,2.98,2.49,3.40,3.10,2025-04-06 22:30:00,6697492086880154176,2025-04-13 01:24:10,0,0.2,0.4,0.6,0.8,0.8,1.0,6,7,0.4,0.8,1.4,2.0,0,1,-4,True,4,1213.0,0,1,4,-0.2,-0.2,-3,3,1,9,3,0,0,1,0,0,0,Sunday,mid,34,27
4957,59824,Brazil,Serie A,2025,Vitoria,Flamengo RJ,1,2,0,5.52,3.36,1.79,5.54,3.45,1.84,5.15,3.32,1.79,5.90,3.45,2025-04-06 22:30:00,3456916064123279528,2025-04-13 01:24:10,0,0.2,0.4,1.2,1.8,1.2,1.0,6,9,0.6,0.6,1.4,1.4,0,2,-3,True,4,122.0,0,0,4,-0.6,-0.2,-4,1,1,7,3,0,2,1,-2,1,1,Sunday,mid,36,17
4954,59821,Brazil,Serie A,2025,Internacional,Cruzeiro,3,0,2,1.62,3.77,6.53,1.67,3.77,6.58,1.62,3.66,6.14,1.67,3.90,2025-04-06 22:30:00,12860026645276073376,2025-04-13 01:24:10,2,0.2,0.4,1.4,1.2,1.8,1.0,4,8,0.6,0.2,1.8,0.6,2,3,2,True,4,122.0,0,4,3,0.2,-0.2,1,1,1,4,-2,1,1,1,0,2,1,Sunday,mid,22,14
4955,59822,Brazil,Serie A,2025,Mirassol,Fortaleza,1,1,1,2.36,3.04,3.58,2.45,3.10,3.58,2.34,2.99,3.43,2.46,3.10,2025-04-06 22:30:00,7380770139153214873,2025-04-13 01:24:10,1,0.0,0.4,1.0,1.2,2.0,1.0,0,7,0.0,0.2,0.0,1.6,0,0,0,True,4,,0,1,4,-0.2,-0.4,-3,0,-1,1,-1,1,2,1,-1,2,0,Sunday,mid,25,19


In [9]:
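# Drop non-numeric (object-dtype) columns before computing correlations
drop_columns = df_with_features.select_dtypes(include=['object']).columns

# Pairwise correlation between the remaining columns
result = df_with_features.drop(columns=drop_columns).corr()

# Render the correlation matrix as a heatmap
result.style.background_gradient(cmap='coolwarm', axis=None)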

Unnamed: 0,id,home_score,away_score,result,psch,pscd,psca,maxch,maxcd,maxca,avgch,avgcd,avgca,bfech,bfecd,datetime,map_result,home_team_last_5_win_rate,away_team_last_5_win_rate,home_team_goal_avg_last_5,away_team_goal_avg_last_5,home_team_goals_conceded_last_5,away_team_goals_conceded_last_5,home_team_form,away_team_form,home_team_home_win_rate,away_team_away_win_rate,home_team_home_goal_avg,away_team_away_goal_avg,head_to_head_win_home_team,head_to_head_draws,head_to_head_goal_diff,is_weekend,month,days_since_last_match_home,match_importance,home_team_ranking,away_team_ranking,goal_avg_diff,win_rate_diff,ranking_diff,away_team_streak,away_team_last_match_result,away_team_points_last_3,away_team_score_diff_last_5,home_team_total_goals_scored_season,home_team_total_goals_conceded_season,home_team_matches_played_season,home_team_goal_diff_season,away_team_total_goals_scored_season,away_team_total_goals_conceded_season,home_team_encoder,away_team_encoder
id,1.0,-0.002614,7e-06,-0.002653,0.031396,-0.055341,-0.006118,0.028584,-0.04462,-0.01139,0.040867,-0.00882,0.010167,-0.020107,-0.067608,0.516611,-0.002653,-0.004704,0.004035,0.021857,0.007639,0.016567,0.00432,0.005877,0.012988,-0.018355,0.053836,-0.011577,0.046459,0.135711,0.126788,0.004617,-0.001592,-0.117231,0.153103,-0.009569,-0.056455,-0.055337,0.009952,-0.006231,-0.001743,0.061982,0.039292,0.051999,0.042452,-0.056394,-0.057219,-0.061667,0.000199,-0.056136,-0.056535,-0.037878,-0.037716
home_score,-0.002614,1.0,0.025397,0.609414,-0.232961,0.235511,0.256054,-0.220776,0.2353,0.257057,-0.231233,0.235109,0.261189,-0.266536,0.286409,-0.023398,0.609414,0.092198,-0.073993,0.087944,-0.047097,-0.068903,0.081288,0.094924,-0.080264,0.125543,-0.064735,0.12182,-0.035392,0.065518,0.028933,0.110073,-0.016357,0.002417,-0.001609,0.007255,0.134555,-0.082067,0.095755,0.118479,0.311054,-0.011823,-0.013706,-0.050587,-0.066548,0.076772,-0.039525,0.013482,0.166188,-0.022523,0.062084,0.025799,-0.000579
away_score,7e-06,0.025397,1.0,-0.609773,0.198674,-0.070369,-0.166601,0.188646,-0.075544,-0.161868,0.196461,-0.073454,-0.167302,0.24613,-0.110323,0.028566,-0.609773,-0.063289,0.071093,-0.061507,0.100544,0.052098,-0.028183,-0.068688,0.076748,-0.070593,0.068574,-0.050455,0.101818,-0.042686,0.005746,-0.068821,-0.025271,-0.009822,0.003856,0.01152,-0.080259,0.08918,-0.115362,-0.095854,-0.243241,0.040659,0.051419,0.071695,0.090283,-0.022313,0.031141,-0.006682,-0.076162,0.054927,-0.022961,0.00054,0.004099
result,-0.002653,0.609414,-0.609773,1.0,-0.286189,0.17703,0.263707,-0.271175,0.181207,0.261635,-0.283339,0.179798,0.268827,-0.382518,0.190515,-0.031357,1.0,0.090883,-0.089956,0.095944,-0.092903,-0.071242,0.060791,0.098467,-0.095834,0.11729,-0.082534,0.102154,-0.071727,0.067418,0.014171,0.117286,0.003463,-0.007166,-0.009389,-0.01037,0.147744,-0.128224,0.134187,0.128965,0.396211,-0.032456,-0.032154,-0.074204,-0.087848,0.056747,-0.056336,0.002446,0.161286,-0.055355,0.035968,0.022161,-0.012386
psch,0.031396,-0.232961,0.198674,-0.286189,1.0,-0.335221,-0.665574,0.986611,-0.332403,-0.648985,0.995318,-0.329141,-0.675885,0.994974,-0.36795,0.087178,-0.286189,-0.254334,0.264718,-0.251303,0.262234,0.192163,-0.190393,-0.270467,0.289201,-0.287385,0.254875,-0.290571,0.221802,-0.19352,-0.01623,-0.347376,-0.015914,-0.007748,0.039719,0.012698,-0.203773,0.207502,-0.364998,-0.370192,-0.590434,0.141292,0.146992,0.246167,0.295232,-0.15761,0.140877,0.003867,-0.425855,0.164765,-0.132949,-0.046923,0.027272
pscd,-0.055341,0.235511,-0.070369,0.17703,-0.335221,1.0,0.852203,-0.298895,0.986309,0.85998,-0.335368,0.986738,0.851781,-0.304291,0.991884,-0.086363,0.17703,0.22836,-0.181767,0.248776,-0.129165,-0.098752,0.216403,0.231334,-0.208534,0.225173,-0.155419,0.261884,-0.09605,0.15063,-0.072555,0.247501,-0.00268,0.070448,-0.035008,0.085146,0.257283,-0.061909,0.267963,0.292377,0.458483,-0.084136,-0.108931,-0.156739,-0.18414,0.253991,0.014045,0.102098,0.344526,-0.000369,0.236037,0.040088,-0.001757
psca,-0.006118,0.256054,-0.166601,0.263707,-0.665574,0.852203,1.0,-0.631038,0.85844,0.988082,-0.662871,0.86048,0.991868,-0.69022,0.835067,-0.020757,0.263707,0.277564,-0.275438,0.250272,-0.276889,-0.20809,0.216324,0.293696,-0.302217,0.294157,-0.253205,0.28242,-0.231188,0.213944,-0.021388,0.34643,0.002436,0.032944,-0.028501,0.028822,0.246801,-0.187496,0.37476,0.394374,0.623562,-0.134578,-0.155067,-0.244053,-0.293176,0.198729,-0.114137,0.035152,0.446948,-0.145938,0.182449,0.040857,-0.026402
maxch,0.028584,-0.220776,0.188646,-0.271175,0.986611,-0.298895,-0.631038,1.0,-0.293009,-0.617085,0.993762,-0.292325,-0.642737,0.996706,-0.342665,0.082654,-0.271175,-0.245484,0.259927,-0.244218,0.257609,0.18663,-0.183824,-0.261161,0.283522,-0.279963,0.248648,-0.283645,0.217003,-0.188234,-0.017215,-0.334378,-0.014144,-0.004038,0.037458,0.014149,-0.196165,0.205777,-0.356689,-0.360473,-0.577027,0.137424,0.143046,0.24009,0.28799,-0.152393,0.139856,0.006874,-0.416923,0.16435,-0.12718,-0.042572,0.028548
maxcd,-0.04462,0.2353,-0.075544,0.181207,-0.332403,0.986309,0.85844,-0.293009,1.0,0.869016,-0.33192,0.988177,0.859872,-0.321188,0.984854,-0.062014,0.181207,0.236398,-0.188143,0.254953,-0.135473,-0.107248,0.218458,0.241068,-0.21395,0.233334,-0.162532,0.269473,-0.102645,0.158895,-0.068611,0.25309,-0.00965,0.068154,-0.029892,0.082497,0.257391,-0.072542,0.276836,0.302653,0.473891,-0.088486,-0.111878,-0.161948,-0.18991,0.251172,0.001115,0.094803,0.358811,-0.011347,0.229511,0.035532,-0.007822
maxca,-0.01139,0.257057,-0.161868,0.261635,-0.648985,0.85998,0.988082,-0.617085,0.869016,1.0,-0.648396,0.870625,0.991228,-0.687113,0.830415,-0.025444,0.261635,0.283174,-0.274273,0.254913,-0.272412,-0.207219,0.215011,0.298901,-0.300019,0.299367,-0.252199,0.290464,-0.231656,0.210926,-0.019985,0.342133,0.000946,0.035952,-0.028539,0.029738,0.250773,-0.185145,0.37483,0.397528,0.625898,-0.135039,-0.155996,-0.243556,-0.290665,0.202593,-0.113578,0.036253,0.4517,-0.143623,0.182373,0.040958,-0.0282


In [10]:
result["map_result"].reset_index()

Unnamed: 0,index,map_result
0,id,-0.002653
1,home_score,0.609414
2,away_score,-0.609773
3,result,1.0
4,psch,-0.286189
5,pscd,0.17703
6,psca,0.263707
7,maxch,-0.271175
8,maxcd,0.181207
9,maxca,0.261635
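
In the output above, `result` correlates perfectly with `map_result`, and `home_score` / `away_score` directly determine the outcome, so these columns would leak the target if used as predictors. A sketch of ranking the remaining columns by absolute correlation follows; the helper `rank_by_abs_corr` and its `leaky` parameter are hypothetical.

```python
import pandas as pd


def rank_by_abs_corr(corr: pd.DataFrame, target: str, leaky: tuple = ()) -> pd.Series:
    """Correlations with `target`, strongest first, without leaky columns."""
    series = corr[target].drop(labels=[target, *leaky], errors="ignore")
    order = series.abs().sort_values(ascending=False).index
    return series.reindex(order)
```

With the notebook's variables this could be called as `rank_by_abs_corr(result, "map_result", leaky=("result", "home_score", "away_score"))`.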
