# Opta Team Performance metrics from FotMob: scraping data for UEFA's Top 8 Leagues
##### Notebook to scrape team performance data by StatsPerform from [FotMob](https://www.fotmob.com/), using [soccerdata](https://github.com/probberechts/soccerdata) by [Pieter Robberechts](https://x.com/p_robberechts)

### By [Vítor Neves](https://github.com/Vitor-Neves48)
Notebook first written: 16th April 2025
Notebook last updated: 16th April 2025

![StatsPerform](https://github.com/Vitor-Neves48/portfolio-football-analytics/blob/main/Analysis%20Tools/media/StatsPerform_Logo_Primary_01.jpg)

![Opta](https://github.com/Vitor-Neves48/portfolio-football-analytics/blob/main/Analysis%20Tools/media/opta_logo.jpeg)

![FotMob](https://github.com/Vitor-Neves48/portfolio-football-analytics/blob/main/Analysis%20Tools/media/fotmob_logo.PNG)

___

<a id='sectionintro'></a>

## <a id='import_libraries'>Introduction</a>
This notebook scrapes team performance data from [FotMob](https://www.fotmob.com/) using the [soccerdata](https://github.com/probberechts/soccerdata) library by [Pieter Robberechts](https://x.com/p_robberechts). It relies on [pandas](http://pandas.pydata.org/) for data manipulation through DataFrames, and on [Selenium](https://www.selenium.dev/) and [Beautiful Soup](https://pypi.org/project/beautifulsoup4/) for web scraping.

For more information about this notebook and the author, I'm available through all the following channels:
* (ADD LINKS)


In [None]:
# imports
import locale
import os
import sys
import time

import numpy as np
import pandas as pd
import soccerdata as sd
from scipy.stats import poisson
from tqdm import tqdm

sys.path.append(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Analysis Tools"
) # Change as needed
import function_town as ft

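## Data Scraping

The cells below load the saved per-league, per-season Parquet files and combine them into the UEFA Top 8 datasets.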

In [2]:
Eredivisie_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_NLD-Eredivisie_2020-2021.parquet"
) # Change as needed
Eredivisie_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_NLD-Eredivisie_2021-2022.parquet"
) # Change as needed
Eredivisie_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_NLD-Eredivisie_2022-2023.parquet"
) # Change as needed
Eredivisie_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_NLD-Eredivisie_2023-2024.parquet"
) # Change as needed
Eredivisie_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_NLD-Eredivisie_2024-2025.parquet"
) # Change as needed

In [4]:
Premier_League_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ENG-Premier League_2020-2021.parquet"
) # Change as needed
Premier_League_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ENG-Premier League_2021-2022.parquet"
) # Change as needed
Premier_League_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ENG-Premier League_2022-2023.parquet"
) # Change as needed
Premier_League_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ENG-Premier League_2023-2024.parquet"
) # Change as needed
Premier_League_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ENG-Premier League_2024-2025.parquet"
) # Change as needed

In [5]:
La_Liga_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ESP-La Liga_2020-2021.parquet"
) # Change as needed
La_Liga_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ESP-La Liga_2021-2022.parquet"
) # Change as needed
La_Liga_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ESP-La Liga_2022-2023.parquet"
) # Change as needed
La_Liga_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ESP-La Liga_2023-2024.parquet"
) # Change as needed
La_Liga_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ESP-La Liga_2024-2025.parquet"
) # Change as needed

In [6]:
Ligue_1_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_FRA-Ligue 1_2020-2021.parquet"
) # Change as needed
Ligue_1_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_FRA-Ligue 1_2021-2022.parquet"
) # Change as needed
Ligue_1_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_FRA-Ligue 1_2022-2023.parquet"
) # Change as needed
Ligue_1_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_FRA-Ligue 1_2023-2024.parquet"
) # Change as needed
Ligue_1_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_FRA-Ligue 1_2024-2025.parquet"
) # Change as needed

In [7]:
Bundesliga_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_GER-Bundesliga_2020-2021.parquet"
) # Change as needed
Bundesliga_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_GER-Bundesliga_2021-2022.parquet"
) # Change as needed
Bundesliga_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_GER-Bundesliga_2022-2023.parquet"
) # Change as needed
Bundesliga_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_GER-Bundesliga_2023-2024.parquet"
) # Change as needed
Bundesliga_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_GER-Bundesliga_2024-2025.parquet"
) # Change as needed

In [8]:
Serie_A_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ITA-Serie A_2020-2021.parquet"
) # Change as needed
Serie_A_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ITA-Serie A_2021-2022.parquet"
) # Change as needed
Serie_A_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ITA-Serie A_2022-2023.parquet"
) # Change as needed
Serie_A_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ITA-Serie A_2023-2024.parquet"
) # Change as needed
Serie_A_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_ITA-Serie A_2024-2025.parquet"
) # Change as needed

In [9]:
Liga_Portugal_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_PRT-Liga Portugal_2020-2021.parquet"
) # Change as needed
Liga_Portugal_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_PRT-Liga Portugal_2021-2022.parquet"
) # Change as needed
Liga_Portugal_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_PRT-Liga Portugal_2022-2023.parquet"
) # Change as needed
Liga_Portugal_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_PRT-Liga Portugal_2023-2024.parquet"
) # Change as needed
Liga_Portugal_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_PRT-Liga Portugal_2024-2025.parquet"
) # Change as needed

In [10]:
Jupiler_Pro_League_2020_2021 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_BEL-Jupiler Pro League_2020-2021.parquet"
) # Change as needed
Jupiler_Pro_League_2021_2022 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_BEL-Jupiler Pro League_2021-2022.parquet"
) # Change as needed
Jupiler_Pro_League_2022_2023 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_BEL-Jupiler Pro League_2022-2023.parquet"
) # Change as needed
Jupiler_Pro_League_2023_2024 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_BEL-Jupiler Pro League_2023-2024.parquet"
) # Change as needed
Jupiler_Pro_League_2024_2025 = pd.read_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_BEL-Jupiler Pro League_2024-2025.parquet"
) # Change as needed
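
The loading cells above repeat the same `pd.read_parquet` call for every league and season. As a minimal sketch, the file names could instead be generated from two lists that follow the naming pattern of the paths above. `DATA_DIR`, `parquet_path` and `load_all` are hypothetical names introduced only for this illustration, and the folder is a placeholder to change as needed.

```python
from pathlib import Path

# Hypothetical base folder; point it at your own raw_all_team_stats directory.
DATA_DIR = Path("data") / "raw_all_team_stats"

# League and season keys exactly as they appear in the file names above.
LEAGUES = [
    "NLD-Eredivisie",
    "BEL-Jupiler Pro League",
    "PRT-Liga Portugal",
    "ITA-Serie A",
    "FRA-Ligue 1",
    "ENG-Premier League",
    "ESP-La Liga",
    "GER-Bundesliga",
]
SEASONS = ["2020-2021", "2021-2022", "2022-2023", "2023-2024", "2024-2025"]


def parquet_path(league: str, season: str, data_dir: Path = DATA_DIR) -> Path:
    """Build the file path following the naming pattern used above."""
    return data_dir / f"FotMob_all_team_stats_{league}_{season}.parquet"


def load_all(data_dir: Path = DATA_DIR) -> dict:
    """Read every league/season file into a dict keyed by (league, season)."""
    import pandas as pd  # imported here so the path helper works without pandas

    return {
        (league, season): pd.read_parquet(parquet_path(league, season, data_dir))
        for league in LEAGUES
        for season in SEASONS
    }
```

With this layout, `frames = load_all()` followed by `frames[("NLD-Eredivisie", "2020-2021")]` would play the role of `Eredivisie_2020_2021`.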

In [13]:
# %% Join everything into a UEFA Top 8 Leagues dataset
UEFA_Top8_Last5_Seasons = [
    Eredivisie_2020_2021,
    Eredivisie_2021_2022,
    Eredivisie_2022_2023,
    Eredivisie_2023_2024,
    Eredivisie_2024_2025,
    Jupiler_Pro_League_2020_2021,
    Jupiler_Pro_League_2021_2022,
    Jupiler_Pro_League_2022_2023,
    Jupiler_Pro_League_2023_2024,
    Jupiler_Pro_League_2024_2025,
    Liga_Portugal_2020_2021,
    Liga_Portugal_2021_2022,
    Liga_Portugal_2022_2023,
    Liga_Portugal_2023_2024,
    Liga_Portugal_2024_2025,
    Serie_A_2020_2021,
    Serie_A_2021_2022,
    Serie_A_2022_2023,
    Serie_A_2023_2024,
    Serie_A_2024_2025,
    Ligue_1_2020_2021,
    Ligue_1_2021_2022,
    Ligue_1_2022_2023,
    Ligue_1_2023_2024,
    Ligue_1_2024_2025,
    Premier_League_2020_2021,
    Premier_League_2021_2022,
    Premier_League_2022_2023,
    Premier_League_2023_2024,
    Premier_League_2024_2025,
    La_Liga_2020_2021,
    La_Liga_2021_2022,
    La_Liga_2022_2023,
    La_Liga_2023_2024,
    La_Liga_2024_2025,
    Bundesliga_2020_2021,
    Bundesliga_2021_2022,
    Bundesliga_2022_2023,
    Bundesliga_2023_2024,
    Bundesliga_2024_2025,
]

UEFA_Top8_Current_Season = [
    Eredivisie_2024_2025,
    Jupiler_Pro_League_2024_2025,
    Liga_Portugal_2024_2025,
    Serie_A_2024_2025,
    Ligue_1_2024_2025,
    Premier_League_2024_2025,
    La_Liga_2024_2025,
    Bundesliga_2024_2025,
]

UEFA_Top8_Last4_Seasons_Without_Current = [
    Eredivisie_2020_2021,
    Eredivisie_2021_2022,
    Eredivisie_2022_2023,
    Eredivisie_2023_2024,
    Jupiler_Pro_League_2020_2021,
    Jupiler_Pro_League_2021_2022,
    Jupiler_Pro_League_2022_2023,
    Jupiler_Pro_League_2023_2024,
    Liga_Portugal_2020_2021,
    Liga_Portugal_2021_2022,
    Liga_Portugal_2022_2023,
    Liga_Portugal_2023_2024,
    Serie_A_2020_2021,
    Serie_A_2021_2022,
    Serie_A_2022_2023,
    Serie_A_2023_2024,
    Ligue_1_2020_2021,
    Ligue_1_2021_2022,
    Ligue_1_2022_2023,
    Ligue_1_2023_2024,
    Premier_League_2020_2021,
    Premier_League_2021_2022,
    Premier_League_2022_2023,
    Premier_League_2023_2024,
    La_Liga_2020_2021,
    La_Liga_2021_2022,
    La_Liga_2022_2023,
    La_Liga_2023_2024,
    Bundesliga_2020_2021,
    Bundesliga_2021_2022,
    Bundesliga_2022_2023,
    Bundesliga_2023_2024,
]

UEFA_Top8_Last5_Seasons_df = pd.concat(UEFA_Top8_Last5_Seasons, ignore_index=True)

UEFA_Top8_Last5_Seasons_df.to_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_UEFA_Top8_Last5_Seasons.parquet",
    index=False,
) # Change as needed
#---
UEFA_Top8_Current_Season_df = pd.concat(UEFA_Top8_Current_Season, ignore_index=True)

UEFA_Top8_Current_Season_df.to_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_UEFA_Top8_Current_Season.parquet",
    index=False,
) # Change as needed
#---
UEFA_Top8_Last4_Seasons_Without_Current_df = pd.concat(UEFA_Top8_Last4_Seasons_Without_Current, ignore_index=True)

UEFA_Top8_Last4_Seasons_Without_Current_df.to_parquet(
    r"C:\Users\Vitor\Desktop\Football Data Analytics\My_Projects\Main level\data\raw_all_team_stats\FotMob_all_team_stats_UEFA_Top8_Last4_Seasons_Without_Current.parquet",
    index=False,
) # Change as needed
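
The three lists in the cell above differ only in which seasons they include. As a rough sketch, and assuming the DataFrames are kept in a dict keyed by `(league, season)` rather than in separate variables, the subsets could be derived instead of typed out by hand. `select_frames` and `split_by_current` are hypothetical helper names, not part of the notebook.

```python
def select_frames(frames: dict, seasons: set) -> list:
    """Return the values whose (league, season) key has a season in `seasons`."""
    return [frame for (_league, season), frame in frames.items() if season in seasons]


def split_by_current(frames: dict, current_season: str) -> tuple:
    """Split into (all seasons, current season only, all seasons but current)."""
    all_seasons = {season for _league, season in frames}
    everything = select_frames(frames, all_seasons)
    current = select_frames(frames, {current_season})
    previous = select_frames(frames, all_seasons - {current_season})
    return everything, current, previous
```

Each returned list can then be passed to `pd.concat(..., ignore_index=True)` exactly as in the cell above.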



In [12]:
UEFA_Top8_Last5_Seasons_df

| | league | season | match date | matchup | team | Opponent | Blocked shots | Hit woodwork | Shots inside box | Shots off target | ... | Aerial duels won (%) | Ground duels won (%) | Successful dribbles (%) | Red cards | Yellow cards | Big chances missed | Corners | Ball possession | Fouls committed | Big chances |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NLD-Eredivisie | 2021 | 2020-09-13 | Sparta Rotterdam-Ajax | Ajax | Sparta Rotterdam | 1.0 | 0.0 | 4.0 | 4.0 | ... | 0.49 | 0.50 | 0.75 | 1.0 | 3.0 | 1 | 1 | 48 | 13 | 1 |
| 1 | NLD-Eredivisie | 2021 | 2020-09-20 | Ajax-RKC Waalwijk | Ajax | RKC Waalwijk | 5.0 | 0.0 | 14.0 | 8.0 | ... | 0.43 | 0.57 | 0.53 | 0.0 | 1.0 | 3 | 4 | 63 | 8 | 4 |
| 2 | NLD-Eredivisie | 2021 | 2020-09-26 | Ajax-Vitesse | Ajax | Vitesse | 4.0 | 1.0 | 11.0 | 10.0 | ... | 0.62 | 0.53 | 0.75 | 1.0 | 3.0 | 3 | 7 | 48 | 13 | 4 |
| 3 | NLD-Eredivisie | 2021 | 2020-10-04 | FC Groningen-Ajax | Ajax | FC Groningen | 5.0 | 2.0 | 11.0 | 11.0 | ... | 0.53 | 0.55 | 0.79 | 0.0 | 1.0 | 1 | 11 | 69 | 9 | 1 |
| 4 | NLD-Eredivisie | 2021 | 2020-10-18 | Ajax-SC Heerenveen | Ajax | SC Heerenveen | 4.0 | 0.0 | 14.0 | 7.0 | ... | 0.35 | 0.46 | 0.77 | 0.0 | 1.0 | 1 | 5 | 69 | 10 | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 42117 | GER-Bundesliga | 2425 | 2025-03-29 | Holstein Kiel-Werder Bremen | Werder Bremen | Holstein Kiel | 3.0 | 0.0 | 9.0 | 4.0 | ... | 0.82 | 0.71 | 0.56 | 0.0 | 1.0 | 1 | 4 | 45 | 8 | 2 |
| 42118 | GER-Bundesliga | 2425 | 2025-04-05 | Mainz 05-Holstein Kiel | Holstein Kiel | Mainz 05 | 2.0 | 1.0 | 8.0 | 3.0 | ... | 0.37 | 0.57 | 0.44 | 0.0 | 4.0 | 3 | 4 | 43 | 8 | 3 |
| 42119 | GER-Bundesliga | 2425 | 2025-04-05 | Mainz 05-Holstein Kiel | Mainz 05 | Holstein Kiel | 1.0 | 2.0 | 10.0 | 9.0 | ... | 0.63 | 0.43 | 0.30 | 0.0 | 3.0 | 2 | 4 | 57 | 12 | 3 |
| 42120 | GER-Bundesliga | 2425 | 2025-04-12 | Holstein Kiel-St. Pauli | Holstein Kiel | St. Pauli | 1.0 | 1.0 | 3.0 | 3.0 | ... | 0.55 | 0.46 | 0.25 | 0.0 | 3.0 | 1 | 3 | 52 | 15 | 1 |
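
In the output above, the `season` column holds compact four-digit codes: `2021` appears on matches played in autumn 2020 and `2425` on matches played in spring 2025, which lines up with the `2020-2021` and `2024-2025` file names. A small helper could expand these codes into the longer labels. `season_label` is a hypothetical name, and the sketch assumes both years fall in the 2000s.

```python
def season_label(code) -> str:
    """Convert a four-digit season code such as 2021 into '2020-2021'.

    Assumes both years fall in the 2000s, which holds for the data shown here.
    """
    text = str(code).zfill(4)
    if len(text) != 4 or not text.isdigit():
        raise ValueError(f"Unexpected season code: {code!r}")
    start, end = int(text[:2]), int(text[2:])
    return f"{2000 + start}-{2000 + end}"
```

Applied to the combined DataFrame, this would look like `UEFA_Top8_Last5_Seasons_df["season"].map(season_label)`.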
