# PSG Complete Data Analysis - FootballDecoded

### Analysis Objective

This notebook performs a **comprehensive, organized extraction** of all available data on **Paris Saint-Germain** for the **2023/24 and 2024/25** seasons, using the full set of wrappers developed in FootballDecoded.

**Data to extract:**
- Team statistics (full season, by competition)
- Individual statistics (every player in the squad)
- Match events (goals, passes, shots, etc.)
- Spatial data (coordinates, heat maps, passing networks)
- Advanced metrics (xG, xA, PPDA, build-up chains)
- Dedicated analysis of the PSG vs Auxerre match (01/09/2024)

**Data sources:**
- **FBref**: Full statistics and events
- **Understat**: Advanced metrics (xGChain, PPDA, etc.)
- **WhoScored**: Spatial data and coordinates

In [1]:
# Base dependencies
import pandas as pd
import numpy as np
from typing import Dict, List, Optional, Union
from datetime import datetime
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

import sys
import os
    
# Add the directory where the package is installed to the import path
data_dir = "/home/oriol/FD/Data"
sys.path.insert(0, data_dir)
    
from scrappers import FBref, Understat, WhoScored
from wrappers import *

In [2]:
# PSG 2023-24
psg_23_24 = fbref_extract_league_players(
    league="FRA-Ligue 1",
    season="2023-24", 
    team_filter="Paris S-G",
    verbose=True
)

# PSG 2024-25  
psg_24_25 = fbref_extract_league_players(
    league="FRA-Ligue 1",
    season="2024-25",
    team_filter="Paris S-G", 
    verbose=True
)

🔍 Extracting player list from FRA-Ligue 1 2023-24
   Team filter: Paris S-G


   🔍 Team filter: 29/540 players
   ✅ Found 29 players
🔍 Extracting player list from FRA-Ligue 1 2024-25
   Team filter: Paris S-G


   🔍 Team filter: 28/553 players
   ✅ Found 28 players
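The two player lists can be compared to see squad turnover between seasons. The following is a minimal sketch, assuming each result exposes a `player` column (as the later cells in this notebook use). The helper name `squad_changes` and the three-row sample frames are illustrative and not part of FootballDecoded; the names are a small subset of the squads printed in this notebook.

```python
import pandas as pd


def squad_changes(previous: pd.DataFrame, current: pd.DataFrame) -> dict:
    """Split players into departed, arrived, and retained using the `player` column."""
    prev_names = set(previous["player"])
    curr_names = set(current["player"])
    return {
        "departed": sorted(prev_names - curr_names),
        "arrived": sorted(curr_names - prev_names),
        "retained": sorted(prev_names & curr_names),
    }


# Illustrative subsets only, not the full 29- and 28-player lists
sample_23_24 = pd.DataFrame({"player": ["Achraf Hakimi", "Carlos Soler", "Vitinha"]})
sample_24_25 = pd.DataFrame({"player": ["Achraf Hakimi", "Vitinha", "Willian Pacho"]})

changes = squad_changes(sample_23_24, sample_24_25)
print(changes)
```

With the real data, the call would presumably be `squad_changes(psg_23_24, psg_24_25)`, run before those names are rebound to the merged tables.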


In [3]:
# Extract data for 2023-24 players
psg_data_23_24 = fbref_extract_multiple_players(
    psg_23_24['player'].tolist(),
    "FRA-Ligue 1", 
    "2023-24",
    verbose=True
)

print("\n" + "="*50 + "\n")

# Extract data for 2024-25 players  
psg_data_24_25 = fbref_extract_multiple_players(
    psg_24_25['player'].tolist(),
    "FRA-Ligue 1", 
    "2024-25",
    verbose=True
)

🔍 Extracting season data for 29 players
[1/29] Achraf Hakimi
[2/29] Arnau Tenas
[3/29] Bradley Barcola
[4/29] Carlos Soler
[5/29] Cher Ndour
[6/29] Danilo Pereira
[7/29] Ethan Mbappé
[8/29] Fabián Ruiz Peña
[9/29] Gianluigi Donnarumma
[10/29] Gonçalo Ramos
[11/29] Hugo Ekitike
[12/29] Keylor Navas
[13/29] Kylian Mbappé
[14/29] Layvin Kurzawa
[15/29] Lee Kang-in
[16/29] Lucas Beraldo
[17/29] Lucas Hernández
[18/29] Manuel Ugarte Ribeiro
[19/29] Marco Asensio
[20/29] Marquinhos
[21/29] Milan Škriniar
[22/29] Nordi Mukiele
[23/29] Nuno Mendes
[24/29] Ousmane Dembélé
[25/29] Randal Kolo Muani
[26/29] Senny Mayulu
[27/29] Vitinha
[28/29] Warren Zaïre-Emery
[29/29] Yoram Zague

✅ Extracted 29/29 players successfully


🔍 Extracting season data for 28 players
[1/28] Achraf Hakimi
[2/28] Arnau Tenas
[3/28] Axel Tapé
[4/28] Bradley Barcola
[5/28] Désiré Doué
[6/28] Fabián Ruiz Peña
[7/28] Gianluigi Donnarumma
[8/28] Gonçalo Ramos
[9/28] Ibrahim Mbaye
[10/28] João Neves
[11/28] Khvicha Kvaratskhelia
[12/28] Lee Kang-in
[13/28] Lucas Beraldo
[14/28] Lucas Hernández
[15/28] Marco Asensio
[16/28] Marquinhos
[17/28] Matvei Safonov
[18/28] Milan Škriniar
[19/28] Noham Kamara
[20/28] Nuno Mendes
[21/28] Ousmane Dembélé
[22/28] Presnel Kimpembe
[23/28] Randal Kolo Muani
[24/28] Senny Mayulu
[25/28] Vitinha
[26/28] Warren Zaïre-Emery
[27/28] Willian Pacho
[28/28] Yoram Zague

✅ Extracted 28/28 players successfully


In [4]:
# Use existing player lists from FBref
players_23_24 = psg_23_24['player'].tolist()
players_24_25 = psg_24_25['player'].tolist()

# Extract Understat data for 2023-24 players
psg_understat_23_24 = understat_extract_multiple_players(
    players_23_24,
    "FRA-Ligue 1", 
    "2023-24",
    verbose=True
)

print("\n" + "="*50 + "\n")

# Extract Understat data for 2024-25 players  
psg_understat_24_25 = understat_extract_multiple_players(
    players_24_25,
    "FRA-Ligue 1", 
    "2024-25",
    verbose=True
)

🎯 Extracting Understat data for 29 players
[1/29] Achraf Hakimi
[2/29] Arnau Tenas
[3/29] Bradley Barcola
[4/29] Carlos Soler
[5/29] Cher Ndour
[6/29] Danilo Pereira
[7/29] Ethan Mbappé
[8/29] Fabián Ruiz Peña
[9/29] Gianluigi Donnarumma
[10/29] Gonçalo Ramos
[11/29] Hugo Ekitike
[12/29] Keylor Navas
[13/29] Kylian Mbappé
[14/29] Layvin Kurzawa
[15/29] Lee Kang-in
[16/29] Lucas Beraldo
[17/29] Lucas Hernández
[18/29] Manuel Ugarte Ribeiro
[19/29] Marco Asensio
[20/29] Marquinhos
[21/29] Milan Škriniar
[22/29] Nordi Mukiele
[23/29] Nuno Mendes
[24/29] Ousmane Dembélé
[25/29] Randal Kolo Muani
[26/29] Senny Mayulu
[27/29] Vitinha
[28/29] Warren Zaïre-Emery
[29/29] Yoram Zague

✅ Successfully extracted 29/29 players


🎯 Extracting Understat data for 28 players
[1/28] Achraf Hakimi
[2/28] Arnau Tenas
[3/28] Axel Tapé
[4/28] Bradley Barcola
[5/28] Désiré Doué
[6/28] Fabián Ruiz Peña
[7/28] Gianluigi Donnarumma
[8/28] Gonçalo Ramos
[9/28] Ibrahim Mbaye
[10/28] João Neves
[11/28] Khvicha Kvaratskhelia
[12/28] Lee Kang-in
[13/28] Lucas Beraldo
[14/28] Lucas Hernández
[15/28] Marco Asensio
[16/28] Marquinhos
[17/28] Matvei Safonov
[18/28] Milan Škriniar
[19/28] Noham Kamara
[20/28] Nuno Mendes
[21/28] Ousmane Dembélé
[22/28] Presnel Kimpembe
[23/28] Randal Kolo Muani
[24/28] Senny Mayulu
[25/28] Vitinha
[26/28] Warren Zaïre-Emery
[27/28] Willian Pacho
[28/28] Yoram Zague

✅ Successfully extracted 28/28 players


In [9]:
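# Merge 2023-24 data: FBref season stats left-joined with Understat metrics.
# Note: this rebinds psg_23_24, which until now held the FBref player list.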
psg_23_24 = pd.merge(
    psg_data_23_24, 
    psg_understat_23_24,
    on=['player_name', 'league'],
    how='left',
    suffixes=('', '_dup')
)

# Remove duplicate columns
dup_cols = [col for col in psg_23_24.columns if col.endswith('_dup')]
psg_23_24 = psg_23_24.drop(columns=dup_cols)

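# Merge 2024-25 data: FBref season stats left-joined with Understat metrics.
# Note: this rebinds psg_24_25, which until now held the FBref player list.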
psg_24_25 = pd.merge(
    psg_data_24_25, 
    psg_understat_24_25,
    on=['player_name', 'league'],
    how='left',
    suffixes=('', '_dup')
)

# Remove duplicate columns
dup_cols = [col for col in psg_24_25.columns if col.endswith('_dup')]
psg_24_25 = psg_24_25.drop(columns=dup_cols)
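Because both merges are left joins, FBref players without a matching Understat row are kept with empty Understat columns. One way to surface them is pandas' `indicator=True` option. The sketch below runs on toy frames: the merge keys match the cell above, while the player labels and the `minutes` and `xg_chain` columns are hypothetical stand-ins rather than the wrappers' real schema.

```python
import pandas as pd

# Toy stand-in for the FBref table (hypothetical columns apart from the merge keys)
fbref_toy = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C"],
    "league": ["FRA-Ligue 1"] * 3,
    "minutes": [900, 450, 90],
})

# Toy stand-in for the Understat table; "minutes" overlaps and gets the "_dup" suffix
understat_toy = pd.DataFrame({
    "player_name": ["Player A", "Player B"],
    "league": ["FRA-Ligue 1"] * 2,
    "minutes": [900, 450],
    "xg_chain": [5.0, 2.5],
})

merged = pd.merge(
    fbref_toy,
    understat_toy,
    on=["player_name", "league"],
    how="left",
    suffixes=("", "_dup"),
    indicator=True,  # adds a "_merge" column: "both" or "left_only" for a left join
)

# Players present in the left table with no match on the right
unmatched = merged.loc[merged["_merge"] == "left_only", "player_name"].tolist()

# Drop the duplicated columns and the indicator once the check is done
dup_cols = [col for col in merged.columns if col.endswith("_dup")]
merged = merged.drop(columns=dup_cols + ["_merge"])

print(unmatched)
print(list(merged.columns))
```

An empty `unmatched` list would suggest that player names line up across the two sources; a non-empty one usually points to spelling or transliteration differences worth checking by hand.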

In [17]:
# The Understat match_id is 28351 (taken from the URL)
match_id = 28351
league = "FRA-Ligue 1"
season = "2024-25"

# Extract all shot events with full analysis
shot_events = understat_extract_shot_events(
    match_id=match_id,
    league=league,
    season=season,
    verbose=True
)

# Reset the index to work with the DataFrame normally
df = shot_events.reset_index()

# Export a clean CSV
filename = f"understat_match_{match_id}_complete"
df.to_csv(f"{filename}.csv", index=False)
print(f"\n💾 Exported: {filename}.csv")
print(f"   Rows: {len(df)} | Columns: {len(df.columns)}")

🎯 Extracting complete shot events from match 28351

   📊 Raw data: 28 shots extracted
   ✅ SUCCESS: 28 shot events with complete analytics
   📊 Goals: 4 | Average xG: 0.165

💾 Exported: understat_match_28351_complete.csv
   Rows: 28 | Columns: 34
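The goals and average-xG summary printed by the wrapper can be re-derived from a shot table as a sanity check. This is a rough sketch on toy data: the column names `xg` and `is_goal` and the helper `summarize_shots` are assumptions for illustration, and the exported CSV may use different names.

```python
import pandas as pd


def summarize_shots(shots: pd.DataFrame, xg_col: str = "xg", goal_col: str = "is_goal") -> dict:
    """Count shots and goals and compute the mean xG per shot."""
    return {
        "shots": len(shots),
        "goals": int(shots[goal_col].sum()),
        "avg_xg": round(float(shots[xg_col].mean()), 3),
    }


# Toy shots, not taken from match 28351
toy_shots = pd.DataFrame({
    "xg": [0.10, 0.50, 0.30, 0.10],
    "is_goal": [False, True, False, False],
})

summary = summarize_shots(toy_shots)
print(summary)
```

Passing the real column names through `xg_col` and `goal_col` lets the same helper run on the exported DataFrame.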


In [16]:
whoscored_match_id = 1824012
league = "FRA-Ligue 1"
season = "2024-25"

# Extract all spatial events for the match
match_events = whoscored_extract_match_events(
    match_id=whoscored_match_id,
    league=league,
    season=season,
    verbose=True
)

# Reset the index if needed
df_events = match_events.reset_index() if hasattr(match_events, 'index') else match_events

# Export CSV
filename = f"whoscored_match_{whoscored_match_id}_complete"
df_events.to_csv(f"{filename}.csv", index=False)
print(f"\n💾 Exported: {filename}.csv")
print(f"   Rows: {len(df_events)} | Columns: {len(df_events.columns)}")

🎯 Extracting complete event data from match 1824012

   📊 Reading event stream...

   📊 Raw data: 1526 events extracted
   ✅ SUCCESS: 1526 events with spatial data
   📊 Players: 32 | Event types: 31

💾 Exported: whoscored_match_1824012_complete.csv
   Rows: 1526 | Columns: 37
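Counts such as the number of distinct players and event types can be reproduced from an event table with `nunique` and `value_counts`. The sketch below uses a toy frame; the column names `player`, `type`, `x`, and `y` are assumptions for illustration and may not match the 37 columns in the exported file.

```python
import pandas as pd

# Toy event stream, not taken from match 1824012
toy_events = pd.DataFrame({
    "player": ["Player A", "Player B", "Player A", "Player C"],
    "type": ["Pass", "Pass", "Shot", "Tackle"],
    "x": [50.0, 62.5, 88.0, 30.0],
    "y": [40.0, 20.0, 50.0, 70.0],
})

n_players = toy_events["player"].nunique()        # distinct players involved
n_types = toy_events["type"].nunique()            # distinct event types
type_counts = toy_events["type"].value_counts()   # events per type, most frequent first

print(f"Players: {n_players} | Event types: {n_types}")
print(type_counts)
```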
