# Data Analysis

## LineUp Data 2022/23

Data Loading
- lineups_2022_2023_1.csv
- lineups_2022_2023_2.csv

In [4]:
import pandas as pd

# Load the datasets
lineups_1 = pd.read_csv('./data/lineups_2022_2023_1.csv')
lineups_2 = pd.read_csv('./data/lineups_2022_2023_2.csv')

lineups_1.head(), lineups_2.head()


(      Position               Player  Age Market Value            Club  \
 0   Goalkeeper   David von Ballmoos   27       €2.50m  BSC Young Boys   
 1  Centre-Back       Cédric Zesiger   24       €3.20m  BSC Young Boys   
 2  Centre-Back  Fabian Lustenberger   34        €400k  BSC Young Boys   
 3    Left-Back       Ulisses Garcia   26       €2.00m  BSC Young Boys   
 4   Right-Back           Lewin Blum   20        €750k  BSC Young Boys   
 
                                    Gameday   H/A    Status  Match ID  
 0  1. Matchday | Sat, 7/16/22   |  6:00 PM  Home  Starting   3840895  
 1  1. Matchday | Sat, 7/16/22   |  6:00 PM  Home  Starting   3840895  
 2  1. Matchday | Sat, 7/16/22   |  6:00 PM  Home  Starting   3840895  
 3  1. Matchday | Sat, 7/16/22   |  6:00 PM  Home  Starting   3840895  
 4  1. Matchday | Sat, 7/16/22   |  6:00 PM  Home  Starting   3840895  ,
              Position         Player  Age Market Value           Club  \
 0          Goalkeeper    Marwin Hitz   35     

In [6]:
# Concatenate the datasets
lineups = pd.concat([lineups_1, lineups_2], axis=0)

lineups.head()

| | Position | Player | Age | Market Value | Club | Gameday | H/A | Status | Match ID |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Goalkeeper | David von Ballmoos | 27 | €2.50m | BSC Young Boys | 1. Matchday \| Sat, 7/16/22 \| 6:00 PM | Home | Starting | 3840895 |
| 1 | Centre-Back | Cédric Zesiger | 24 | €3.20m | BSC Young Boys | 1. Matchday \| Sat, 7/16/22 \| 6:00 PM | Home | Starting | 3840895 |
| 2 | Centre-Back | Fabian Lustenberger | 34 | €400k | BSC Young Boys | 1. Matchday \| Sat, 7/16/22 \| 6:00 PM | Home | Starting | 3840895 |
| 3 | Left-Back | Ulisses Garcia | 26 | €2.00m | BSC Young Boys | 1. Matchday \| Sat, 7/16/22 \| 6:00 PM | Home | Starting | 3840895 |
| 4 | Right-Back | Lewin Blum | 20 | €750k | BSC Young Boys | 1. Matchday \| Sat, 7/16/22 \| 6:00 PM | Home | Starting | 3840895 |
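
One detail worth keeping in mind when concatenating: `pd.concat` keeps the row labels of each input by default, so the combined frame can contain repeated index values. The sketch below uses two invented toy frames (`part_1`, `part_2` are hypothetical names, not the notebook's data) to show the difference `ignore_index=True` makes.

```python
import pandas as pd

# Invented toy frames standing in for the two CSV parts (illustration only).
part_1 = pd.DataFrame({"Player": ["A", "B"]})
part_2 = pd.DataFrame({"Player": ["C", "D"]})

# Default behaviour: the original row labels of each part are kept.
kept = pd.concat([part_1, part_2], axis=0)

# With ignore_index=True the result gets a fresh 0..n-1 index.
reset = pd.concat([part_1, part_2], axis=0, ignore_index=True)

print(kept.index.tolist())   # [0, 1, 0, 1]
print(reset.index.tolist())  # [0, 1, 2, 3]
```

Whether a fresh index is needed depends on the later steps; it mainly matters for label-based selection with `.loc`.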


Check N/A's

In [7]:
# check for missing values
missing_values = lineups.isnull().sum()

missing_values

Position           0
Player             0
Age                0
Market Value    2779
Club               0
Gameday            0
H/A                0
Status             0
Match ID           0
dtype: int64

For the lineups of each match in season 2022/23, the only column with missing values is `Market Value` (2779 entries), which is not surprising.
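
To see where the missing market values are concentrated, one option is to compute the share of missing entries per club. This is a minimal sketch on an invented toy frame (`toy_lineups` and its rows are assumptions for illustration); on the real data the same expression could be applied to `lineups`.

```python
import pandas as pd

# Invented stand-in for `lineups`; club names and values are made up.
toy_lineups = pd.DataFrame({
    "Club": ["Club A", "Club A", "Club B", "Club B"],
    "Market Value": ["€2.50m", None, "€400k", "€750k"],
})

# Boolean mask of missing values, averaged per club = share of missing rows.
missing_share = (
    toy_lineups["Market Value"]
    .isna()
    .groupby(toy_lineups["Club"])
    .mean()
)

print(missing_share)
```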

In [8]:
# look at different values in the 'Club' column
lineups['Club'].value_counts()

Club
FC St. Gallen 1879              718
FC Basel 1893                   717
Servette FC                     717
BSC Young Boys                  716
FC Zürich                       716
FC Sion                         716
FC Winterthur                   713
Grasshopper Club Zurich         712
FC Luzern                       711
FC Lugano                       697
CF Barcelona                    176
Nîmes Olympique                 118
FC Girondins Bordeaux           117
Angers SCO                      117
Red Star FC                     117
AC Ajaccio                      116
Stade Rennais FC                116
Olympique Marseille             116
FC Sochaux-Montbéliard          116
Sporting Étoile Club Bastia     116
Paris FC                        116
AS Saint-Étienne                115
US Valenciennes-Anzin           115
CS Sedan-Ardennes               115
FC Nantes                       114
RC Strasbourg Alsace            114
Stade Reims                     113
FC Metz                

As a first step, we look only at Swiss clubs, so we need to filter the dataset accordingly.

In [10]:
# dictionary of the values in 'Club' we want to keep (only its keys are used for filtering)
clubs_to_keep = {
    'FC St. Gallen 1879': 'FC St. Gallen 1879',
    'FC Basel 1893': 'FC Basel 1893',
    'Servette FC': 'Servette FC',
    'BSC Young Boys': 'BSC Young Boys',
    'FC Zürich': 'FC Zürich',
    'FC Sion': 'FC Sion',
    'FC Winterthur': 'FC Winterthur',
    'Grasshopper Club Zurich': 'Grasshopper Club Zurich',
    'FC Luzern': 'FC Luzern',
    'FC Lugano': 'FC Lugano'
}

# filter the dataset
lineups_SL = lineups[lineups['Club'].isin(clubs_to_keep.keys())]

lineups_SL['Club'].value_counts()

Club
FC St. Gallen 1879         718
FC Basel 1893              717
Servette FC                717
BSC Young Boys             716
FC Zürich                  716
FC Sion                    716
FC Winterthur              713
Grasshopper Club Zurich    712
FC Luzern                  711
FC Lugano                  697
Name: count, dtype: int64

Open question: why does FC Lugano have so few entries (697) compared with the other clubs?
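
One way to approach this question is to compare, per club, the number of lineup rows with the number of distinct matches. A lower row count could come either from fewer matches in the data or from fewer listed players per match. The sketch below runs on an invented toy frame (`toy_lineups_SL` is a hypothetical stand-in); on the real data the same aggregation could be applied to `lineups_SL`.

```python
import pandas as pd

# Invented stand-in for `lineups_SL`: one row per listed player per match.
toy_lineups_SL = pd.DataFrame({
    "Club": ["Club A"] * 4 + ["Club B"] * 3,
    "Match ID": [1, 1, 2, 2, 1, 1, 2],
})

# Rows and distinct matches per club, via named aggregation.
per_club = toy_lineups_SL.groupby("Club").agg(
    entries=("Match ID", "count"),
    matches=("Match ID", "nunique"),
)

# Average number of lineup rows per match for each club.
per_club["entries_per_match"] = per_club["entries"] / per_club["matches"]

print(per_club)
```

If `matches` is the same for all clubs, the difference has to come from the number of players listed per match.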

## Match Event Data 2022/23

Data Loading
- match_events_2022_2023_1.csv
- match_events_2022_2023_2.csv

In [11]:
# Load the datasets
events_1 = pd.read_csv('./data/match_events_2022_2023_1.csv')
events_2 = pd.read_csv('./data/match_events_2022_2023_2.csv')

events_1.head(), events_2.head()

(             Club   H/A Timestamp         Event         Player Event  \
 0  BSC Young Boys  Home       62'          Goal  Christian Fassnacht   
 1  BSC Young Boys  Home       77'          Goal         Cedric Itten   
 2  BSC Young Boys  Home       81'          Goal        Fabian Rieder   
 3  BSC Young Boys  Home       85'          Goal       Wilfried Kanga   
 4  BSC Young Boys  Home       63'  Substitution         Cedric Itten   
 
         Remark Event        Player Assist     Player Out  Match ID  
 0             Header       Ulisses Garcia            NaN   3840895  
 1  Right-footed shot        Cheikh Niasse            NaN   3840895  
 2             Header       Wilfried Kanga            NaN   3840895  
 3  Right-footed shot  Christian Fassnacht            NaN   3840895  
 4           Tactical                  NaN  Meschack Elia   3840895  ,
             Club   H/A Timestamp         Event         Player Event  \
 0      FC Zürich  Away       90'          Goal           Roko Simi

In [13]:
# Concatenate the datasets
events = pd.concat([events_1, events_2], axis=0)

events.head()

| | Club | H/A | Timestamp | Event | Player Event | Remark Event | Player Assist | Player Out | Match ID |
|---|---|---|---|---|---|---|---|---|---|
| 0 | BSC Young Boys | Home | 62' | Goal | Christian Fassnacht | Header | Ulisses Garcia | NaN | 3840895 |
| 1 | BSC Young Boys | Home | 77' | Goal | Cedric Itten | Right-footed shot | Cheikh Niasse | NaN | 3840895 |
| 2 | BSC Young Boys | Home | 81' | Goal | Fabian Rieder | Header | Wilfried Kanga | NaN | 3840895 |
| 3 | BSC Young Boys | Home | 85' | Goal | Wilfried Kanga | Right-footed shot | Christian Fassnacht | NaN | 3840895 |
| 4 | BSC Young Boys | Home | 63' | Substitution | Cedric Itten | Tactical | NaN | Meschack Elia | 3840895 |


Check N/A's

In [14]:
# check for missing values
missing_values_e = events.isnull().sum()

missing_values_e

Club                0
H/A                 0
Timestamp           0
Event               0
Player Event        1
Remark Event        0
Player Assist    3110
Player Out       1836
Match ID            0
dtype: int64

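In the match event data, missing values are concentrated in `Player Assist` (3110) and `Player Out` (1836); `Player Event` has a single missing entry.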
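
To check whether these gaps simply follow the event type (the sample rows above show `Player Out` filled for a substitution and `Player Assist` filled for goals), the missing counts can be broken down by `Event`. This is a sketch on an invented toy frame; `toy_events`, the player names, and the `Card` event type are assumptions for illustration and may not match the real data.

```python
import pandas as pd

# Invented stand-in for `events`; "Card" is a hypothetical event type.
toy_events = pd.DataFrame({
    "Event": ["Goal", "Goal", "Substitution", "Card"],
    "Player Assist": ["Player X", None, None, None],
    "Player Out": [None, None, "Player Y", None],
})

# Count missing values in both columns, grouped by event type.
missing_by_event = (
    toy_events[["Player Assist", "Player Out"]]
    .isna()
    .groupby(toy_events["Event"])
    .sum()
)

print(missing_by_event)
```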

In [15]:
# look at different values in the 'Club' column
events['Club'].value_counts()


Club
BSC Young Boys                  327
FC St. Gallen 1879              327
FC Lugano                       321
Grasshopper Club Zurich         320
FC Zürich                       306
FC Sion                         302
FC Winterthur                   300
Servette FC                     297
FC Basel 1893                   296
FC Luzern                       296
CF Barcelona                     29
Olympique Marseille              28
FC Nantes                        27
AS Saint-Étienne                 25
Sporting Étoile Club Bastia      25
Nîmes Olympique                  22
OGC Nice                         22
FC Girondins Bordeaux            22
Paris FC                         20
US Valenciennes-Anzin            20
AS Nancy-Lorraine                19
CS Sedan-Ardennes                18
RC Strasbourg Alsace             18
Stade Rennais FC                 17
Stade Reims                      17
Angers SCO                       17
Olympique Lyon                   16
Red Star FC            

In [16]:
# filter the dataset
events_SL = events[events['Club'].isin(clubs_to_keep.keys())]

events_SL['Club'].value_counts()

Club
BSC Young Boys             327
FC St. Gallen 1879         327
FC Lugano                  321
Grasshopper Club Zurich    320
FC Zürich                  306
FC Sion                    302
FC Winterthur              300
Servette FC                297
FC Basel 1893              296
FC Luzern                  296
Name: count, dtype: int64