<a id='top'></a>

# Data Parsing of StatsBomb Data
##### Notebook to parse JSON data available from the [StatsBomb Open Data GitHub repository](https://github.com/statsbomb/open-data)

### By [Edd Webster](https://www.twitter.com/eddwebster)
Last updated: 10/11/2020

![title](../../img/stats-bomb-logo.png)

Click [here](#section5) to jump straight to the Exploratory Data Analysis section and skip the [Project Brief](#section2), [Data Sources](#section3), and [Data Engineering](#section4) sections. Or click [here](#section6) to jump straight to the Summary.

___

<a id='sectionintro'></a>

## <a id='import_libraries'>Introduction</a>
This notebook parses JSON data from [StatsBomb](https://statsbomb.com/) using [pandas](http://pandas.pydata.org/) for data manipulation through DataFrames.

For more information about this notebook and the author, I'm available through all the following channels:
*    [eddwebster.com](https://www.eddwebster.com/),
*    edd.j.webster@gmail.com,
*    [@eddwebster](https://www.twitter.com/eddwebster),
*    [LinkedIn.com/in/eddwebster](https://www.linkedin.com/in/eddwebster/),
*    [GitHub/eddwebster](https://github.com/eddwebster/),
*    [Kaggle.com/eddwebster](https://www.kaggle.com/eddwebster), and
*    [HackerRank.com/eddwebster](https://www.hackerrank.com/eddwebster).

The accompanying GitHub repository for this notebook can be found [here](https://github.com/eddwebster/fifa-league) and a static version of this notebook can be found [here](https://nbviewer.jupyter.org/github/eddwebster/fifa-league/blob/master/FIFA%2020%20Fantasy%20Football%20League%20using%20TransferMarkt%20Player%20Valuations.ipynb).

___

<a id='sectioncontents'></a>

## <a id='notebook_contents'>Notebook Contents</a>
1.    [Notebook Dependencies](#section1)<br>
2.    [Project Brief](#section2)<br>
3.    [Data Sources](#section3)<br>
      1.    [Introduction](#section3.1)<br>
      2.    [Data Dictionary](#section3.2)<br>
      3.    [Creating the DataFrame](#section3.3)<br>
      4.    [Initial Data Handling](#section3.4)<br>
      5.    [Export the Raw DataFrame](#section3.5)<br>         
4.    [Data Engineering](#section4)<br>
      1.    [Introduction](#section4.1)<br>
      2.    [Columns of Interest](#section4.2)<br>
      3.    [String Cleaning](#section4.3)<br>
      4.    [Converting Data Types](#section4.4)<br>
      5.    [Export the Engineered DataFrame](#section4.5)<br>
5.    [Exploratory Data Analysis (EDA)](#section5)<br>
      1.    [...](#section5.1)<br>
      2.    [...](#section5.2)<br>
      3.    [...](#section5.3)<br>
6.    [Summary](#section6)<br>
7.    [Next Steps](#section7)<br>
8.    [Bibliography](#section8)<br>

___

<a id='section1'></a>

## <a id='#section1'>1. Notebook Dependencies</a>

This notebook was written using [Python 3](https://docs.python.org/3.7/) and requires the following libraries:
*    [`Jupyter notebooks`](https://jupyter.org/) for the notebook environment in which this project is presented;
*    [`NumPy`](http://www.numpy.org/) for multidimensional array computing;
*    [`pandas`](http://pandas.pydata.org/) for data analysis and manipulation;
*    `tqdm` for a clean progress bar; and
*    [`matplotlib`](https://matplotlib.org/contents.html?v=20200411155018) for data visualisations.

Most of the packages used in this notebook can be obtained by downloading and installing the [Conda](https://anaconda.org/anaconda/conda) distribution, available on all platforms (Windows, Linux and macOS). Step-by-step guides on how to install Anaconda can be found for Windows [here](https://medium.com/@GalarnykMichael/install-python-on-windows-anaconda-c63c7c3d1444) and Mac [here](https://medium.com/@GalarnykMichael/install-python-on-mac-anaconda-ccd9f2014072), as well as in the Anaconda documentation itself [here](https://docs.anaconda.com/anaconda/install/).

### Import Libraries and Modules

In [1]:
# Python ≥3.5 (ideally)
import platform
import sys, getopt
assert sys.version_info >= (3, 5)
import csv

# Import Dependencies
%matplotlib inline

# Math Operations
import numpy as np
from math import pi

# Datetime
import datetime
from datetime import date
import time

# Data Preprocessing
import pandas as pd
import os    #  used to read the csv filenames
import re
import random
from io import BytesIO
from pathlib import Path

# Reading directories
import glob

# Working with JSON
import json
import codecs
from pandas import json_normalize    # pandas >= 1.0; the pandas.io.json import path is deprecated

# Football Libraries
from FCPython import createPitch

# Data Visualisation
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn-whitegrid')
import missingno as msno    # visually display missing data

# Progress Bar
from tqdm import tqdm    # a clean progress bar library

# Display in Jupyter
from IPython.display import Image, Video, YouTubeVideo
from IPython.core.display import HTML

# Ignore Warnings
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

print('Setup Complete')

Setup Complete


In [2]:
# Python / module versions used here for reference
print('Python: {}'.format(platform.python_version()))
print('NumPy: {}'.format(np.__version__))
print('pandas: {}'.format(pd.__version__))
print('matplotlib: {}'.format(mpl.__version__))
print('Seaborn: {}'.format(sns.__version__))

Python: 3.7.6
NumPy: 1.18.1
pandas: 1.0.1
matplotlib: 3.1.3
Seaborn: 0.10.0


### Defined Variables

In [3]:
# Define today's date
today = datetime.datetime.now().strftime('%d%m%Y')

### Defined Filepaths

In [4]:
# Set up initial paths to subfolders
base_dir = os.path.join('..', '..')
data_dir = os.path.join(base_dir, 'data')
data_dir_fbref = os.path.join(base_dir, 'data', 'fbref')
data_dir_tm = os.path.join(base_dir, 'data', 'tm')
data_dir_sb = os.path.join(base_dir, 'data', 'sb')
data_dir_understat = os.path.join(base_dir, 'data', 'understat')
img_dir = os.path.join(base_dir, 'img')
fig_dir = os.path.join(base_dir, 'img', 'fig')
video_dir = os.path.join(base_dir, 'video')

### Notebook Settings

In [5]:
pd.set_option('display.max_columns', None)

---

<a id='section2'></a>

## <a id='#section2'>2. Project Brief</a>
This Jupyter notebook explores how to parse publicly available Event data from [StatsBomb](https://statsbomb.com/) using [pandas](http://pandas.pydata.org/) for data manipulation through DataFrames.

In this analysis, we're looking specifically at the [FA WSL](https://womenscompetitions.thefa.com/) for the 18/19 and 19/20 seasons.

The combined event data produced in this notebook is exported to CSV. This data can be further analysed in Python, joined to other datasets, or explored using Tableau, PowerBI, or Microsoft Excel.

---

<a id='section3'></a>

## <a id='#section3'>3. Data Sources</a>

### <a id='#section3.1'>3.1. Introduction</a>
[StatsBomb](https://statsbomb.com/) are a football analytics and data company.

Before conducting our EDA, the data needs to be imported as a DataFrame in the Data Sources section ([Section 3](#section3)) and cleaned in the Data Engineering section ([Section 4](#section4)).

We'll be using the [pandas](http://pandas.pydata.org/) library to import our data to this workbook as a DataFrame.

### <a id='#section3.2'>3.2. Read in JSON files</a>

#### <a id='#section3.3.1.'>3.3.1. Competitions</a>

##### Data Dictionary

##### Import data

In [7]:
# Load the StatsBomb Competition JSON file
with open(data_dir_sb + '/competitions/raw/json/competitions.json') as f:
    json_sb_competitions_data = json.load(f)

In [8]:
# Display the StatsBomb Competition JSON file
json_sb_competitions_data

[{'competition_id': 16,
  'season_id': 4,
  'country_name': 'Europe',
  'competition_name': 'Champions League',
  'competition_gender': 'male',
  'season_name': '2018/2019',
  'match_updated': '2020-07-29T05:00',
  'match_available': '2020-07-29T05:00'},
 {'competition_id': 16,
  'season_id': 1,
  'country_name': 'Europe',
  'competition_name': 'Champions League',
  'competition_gender': 'male',
  'season_name': '2017/2018',
  'match_updated': '2020-07-29T05:00',
  'match_available': '2020-07-29T05:00'},
 {'competition_id': 16,
  'season_id': 2,
  'country_name': 'Europe',
  'competition_name': 'Champions League',
  'competition_gender': 'male',
  'season_name': '2016/2017',
  'match_updated': '2020-08-26T12:33:15.869622',
  'match_available': '2020-07-29T05:00'},
 {'competition_id': 16,
  'season_id': 27,
  'country_name': 'Europe',
  'competition_name': 'Champions League',
  'competition_gender': 'male',
  'season_name': '2015/2016',
  'match_updated': '2020-08-26T12:33:15.869622',
 

##### Flatten Competitions data

In [9]:
# Flatten the StatsBomb JSON Competition data and export the DataFrame as a CSV file
df_sb_competitions_data_flat = json_normalize(json_sb_competitions_data)
df_sb_competitions_data_flat.to_csv(data_dir_sb + '/competitions/raw/csv/competitions.csv', index=None, header=True)

  


In [10]:
df_sb_competitions_data_flat

Unnamed: 0,competition_id,season_id,country_name,competition_name,competition_gender,season_name,match_updated,match_available
0,16,4,Europe,Champions League,male,2018/2019,2020-07-29T05:00,2020-07-29T05:00
1,16,1,Europe,Champions League,male,2017/2018,2020-07-29T05:00,2020-07-29T05:00
2,16,2,Europe,Champions League,male,2016/2017,2020-08-26T12:33:15.869622,2020-07-29T05:00
3,16,27,Europe,Champions League,male,2015/2016,2020-08-26T12:33:15.869622,2020-07-29T05:00
4,16,26,Europe,Champions League,male,2014/2015,2020-08-26T12:33:15.869622,2020-07-29T05:00
5,16,25,Europe,Champions League,male,2013/2014,2020-08-26T12:33:15.869622,2020-07-29T05:00
6,16,24,Europe,Champions League,male,2012/2013,2020-08-26T12:33:15.869622,2020-07-29T05:00
7,16,23,Europe,Champions League,male,2011/2012,2020-08-26T12:33:15.869622,2020-07-29T05:00
8,16,22,Europe,Champions League,male,2010/2011,2020-07-29T05:00,2020-07-29T05:00
9,16,21,Europe,Champions League,male,2009/2010,2020-07-29T05:00,2020-07-29T05:00


##### Streamline the DataFrame

In [11]:
df_sb_competitions_data_flat.columns

Index(['competition_id', 'season_id', 'country_name', 'competition_name',
       'competition_gender', 'season_name', 'match_updated',
       'match_available'],
      dtype='object')

In [12]:
# Select columns of interest
cols_competitions = ['competition_id', 'season_id', 'country_name', 'competition_name', 'competition_gender', 'season_name']
                     
# Create more concise DataFrame using only columns of interest
df_sb_competitions_data_flat_select = df_sb_competitions_data_flat[cols_competitions]

# Export DataFrame
df_sb_competitions_data_flat_select.to_csv(data_dir_sb + '/competitions/raw/csv/competitions_select.csv', index=None, header=True)

In [13]:
df_sb_competitions_data_flat_select

Unnamed: 0,competition_id,season_id,country_name,competition_name,competition_gender,season_name
0,16,4,Europe,Champions League,male,2018/2019
1,16,1,Europe,Champions League,male,2017/2018
2,16,2,Europe,Champions League,male,2016/2017
3,16,27,Europe,Champions League,male,2015/2016
4,16,26,Europe,Champions League,male,2014/2015
5,16,25,Europe,Champions League,male,2013/2014
6,16,24,Europe,Champions League,male,2012/2013
7,16,23,Europe,Champions League,male,2011/2012
8,16,22,Europe,Champions League,male,2010/2011
9,16,21,Europe,Champions League,male,2009/2010


##### Identify FA Women's Super League Competitions

In [14]:
df_sb_competitions_data_flat_wsl = df_sb_competitions_data_flat_select.loc[df_sb_competitions_data_flat_select['competition_name'] == "FA Women's Super League"]

In [15]:
df_sb_competitions_data_flat_wsl

Unnamed: 0,competition_id,season_id,country_name,competition_name,competition_gender,season_name
15,37,42,England,FA Women's Super League,female,2019/2020
16,37,4,England,FA Women's Super League,female,2018/2019


##### Identify Competitions of Interest by ID

In [16]:
# FA Women's Super League has competition ID 37
competition_id = 37

For our analysis, we want just the FA Women's Super League (`competition_id`: 37), which has the following season IDs:
*    2018/2019 - `season_id`: 4
*    2019/2020 - `season_id`: 42
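
The season IDs listed above can also be looked up from the competitions JSON rather than read off the table by eye. The following is a minimal sketch, assuming records shaped like those in `competitions.json`; the helper name `season_ids_for` is hypothetical and the sample records are abridged from the tables above.

```python
def season_ids_for(competitions, competition_name):
    """Map season_name -> season_id for one competition (hypothetical helper)."""
    return {
        comp['season_name']: comp['season_id']
        for comp in competitions
        if comp['competition_name'] == competition_name
    }


# Three records abridged from the competitions tables above
sample_competitions = [
    {'competition_id': 37, 'season_id': 42,
     'competition_name': "FA Women's Super League", 'season_name': '2019/2020'},
    {'competition_id': 37, 'season_id': 4,
     'competition_name': "FA Women's Super League", 'season_name': '2018/2019'},
    {'competition_id': 16, 'season_id': 4,
     'competition_name': 'Champions League', 'season_name': '2018/2019'},
]

wsl_seasons = season_ids_for(sample_competitions, "FA Women's Super League")
print(wsl_seasons)    # {'2019/2020': 42, '2018/2019': 4}
```

Note that `season_id` 4 appears under both the Champions League and the WSL, so a season ID is only meaningful together with its `competition_id`.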

#### <a id='#section3.3.2.'>3.3.2. Matches</a>

##### Data Dictionary

##### Import Data

In [17]:
# Show files in directory
print(glob.glob(data_dir_sb + '/matches/raw/json/' + str(competition_id) + '/*.json'))

['../../data/sb/matches/raw/json/37/90.json', '../../data/sb/matches/raw/json/37/4.json', '../../data/sb/matches/raw/json/37/42.json']


In [18]:
# REWRITE THE STEP BELOW TO APPEND ALL JSON FILES TO ONE PANDAS DATAFRAME

In [19]:
# Import all StatsBomb JSON Match data for the WSL

## WSL - 18/19
with open(data_dir_sb + '/matches/raw/json/' + str(competition_id) + '/4.json') as f:
    json_sb_match_data_wsl_1819 = json.load(f)
          
## WSL - 19/20
with open(data_dir_sb + '/matches/raw/json/' + str(competition_id) + '/42.json') as f:
    json_sb_match_data_wsl_1920 = json.load(f)
    
## WSL - 20/21
with open(data_dir_sb + '/matches/raw/json/' + str(competition_id) + '/90.json') as f:
    json_sb_match_data_wsl_2021 = json.load(f)
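
The note in cell 18 asks for a single step that reads every season file instead of one `with open(...)` block per season. A minimal sketch of one way to do that, assuming each file in the competition folder is named `<season_id>.json` and holds a list of match dicts (as the `glob` output above suggests); `load_all_matches` is a hypothetical helper, not part of the original notebook.

```python
import glob
import json
import os


def load_all_matches(matches_dir):
    """Read every .json file in matches_dir into one list of match dicts."""
    all_matches = []
    # sorted() only makes the read order reproducible; it is not required
    for filepath in sorted(glob.glob(os.path.join(matches_dir, '*.json'))):
        with open(filepath, encoding='utf-8') as f:
            all_matches.extend(json.load(f))
    return all_matches
```

With the paths defined earlier, this might be called as `load_all_matches(os.path.join(data_dir_sb, 'matches', 'raw', 'json', str(competition_id)))`. The combined list could then be passed to `json_normalize` once, which would give a single DataFrame covering every season in the folder.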

##### Inspect the season JSON

In [20]:
json_sb_match_data_wsl_1819

[{'match_id': 19743,
  'match_date': '2018-10-21',
  'kick_off': '13:30:00.000',
  'competition': {'competition_id': 37,
   'country_name': 'England',
   'competition_name': "FA Women's Super League"},
  'season': {'season_id': 4, 'season_name': '2018/2019'},
  'home_team': {'home_team_id': 969,
   'home_team_name': 'Birmingham City WFC',
   'home_team_gender': 'female',
   'home_team_group': None,
   'country': {'id': 68, 'name': 'England'},
   'managers': [{'id': 128,
     'name': 'Marc Skinner',
     'nickname': None,
     'dob': None,
     'country': {'id': 68, 'name': 'England'}}]},
  'away_team': {'away_team_id': 971,
   'away_team_name': 'Chelsea FCW',
   'away_team_gender': 'female',
   'away_team_group': None,
   'country': {'id': 68, 'name': 'England'},
   'managers': [{'id': 152,
     'name': 'Emma Hayes',
     'nickname': None,
     'dob': None,
     'country': {'id': 68, 'name': 'England'}}]},
  'home_score': 0,
  'away_score': 0,
  'match_status': 'available',
  'last_upd

In [21]:
json_sb_match_data_wsl_1920 

[{'match_id': 2275054,
  'match_date': '2020-01-05',
  'kick_off': '15:00:00.000',
  'competition': {'competition_id': 37,
   'country_name': 'England',
   'competition_name': "FA Women's Super League"},
  'season': {'season_id': 42, 'season_name': '2019/2020'},
  'home_team': {'home_team_id': 965,
   'home_team_name': 'Brighton & Hove Albion WFC',
   'home_team_gender': 'female',
   'home_team_group': None,
   'country': {'id': 68, 'name': 'England'}},
  'away_team': {'away_team_id': 966,
   'away_team_name': 'Liverpool WFC',
   'away_team_gender': 'female',
   'away_team_group': None,
   'country': {'id': 68, 'name': 'England'}},
  'home_score': 1,
  'away_score': 0,
  'match_status': 'available',
  'last_updated': '2020-07-29T05:00',
  'metadata': {'data_version': '1.1.0',
   'shot_fidelity_version': '2',
   'xy_fidelity_version': '2'},
  'match_week': 11,
  'competition_stage': {'id': 1, 'name': 'Regular Season'}},
 {'match_id': 2275072,
  'match_date': '2020-01-05',
  'kick_off': 

In [22]:
json_sb_match_data_wsl_2021

[{'match_id': 3764234,
  'match_date': '2020-09-05',
  'kick_off': '15:30:00.000',
  'competition': {'competition_id': 37,
   'country_name': 'England',
   'competition_name': "FA Women's Super League"},
  'season': {'season_id': 90, 'season_name': '2020/2021'},
  'home_team': {'home_team_id': 2647,
   'home_team_name': 'Aston Villa',
   'home_team_gender': 'female',
   'home_team_group': None,
   'country': {'id': 68, 'name': 'England'}},
  'away_team': {'away_team_id': 746,
   'away_team_name': 'Manchester City WFC',
   'away_team_gender': 'female',
   'away_team_group': None,
   'country': {'id': 68, 'name': 'England'}},
  'home_score': 0,
  'away_score': 2,
  'match_status': 'available',
  'last_updated': '2020-09-06T19:03:37.632953',
  'metadata': {'data_version': '1.1.0',
   'shot_fidelity_version': '2',
   'xy_fidelity_version': '2'},
  'match_week': 1,
  'competition_stage': {'id': 1, 'name': 'Regular Season'},
  'stadium': {'id': 211,
   'name': 'Villa Park',
   'country': {'i

Inspect matches

In [23]:
# See the first match in the dataset
json_sb_match_data_wsl_1920[0]

{'match_id': 2275054,
 'match_date': '2020-01-05',
 'kick_off': '15:00:00.000',
 'competition': {'competition_id': 37,
  'country_name': 'England',
  'competition_name': "FA Women's Super League"},
 'season': {'season_id': 42, 'season_name': '2019/2020'},
 'home_team': {'home_team_id': 965,
  'home_team_name': 'Brighton & Hove Albion WFC',
  'home_team_gender': 'female',
  'home_team_group': None,
  'country': {'id': 68, 'name': 'England'}},
 'away_team': {'away_team_id': 966,
  'away_team_name': 'Liverpool WFC',
  'away_team_gender': 'female',
  'away_team_group': None,
  'country': {'id': 68, 'name': 'England'}},
 'home_score': 1,
 'away_score': 0,
 'match_status': 'available',
 'last_updated': '2020-07-29T05:00',
 'metadata': {'data_version': '1.1.0',
  'shot_fidelity_version': '2',
  'xy_fidelity_version': '2'},
 'match_week': 11,
 'competition_stage': {'id': 1, 'name': 'Regular Season'}}

In [24]:
# See the away team for the first match in the dataset
json_sb_match_data_wsl_1920[0]['away_team']

{'away_team_id': 966,
 'away_team_name': 'Liverpool WFC',
 'away_team_gender': 'female',
 'away_team_group': None,
 'country': {'id': 68, 'name': 'England'}}

In [25]:
# See the away team name for the first match in the dataset
json_sb_match_data_wsl_1920[0]['away_team']['away_team_name']

'Liverpool WFC'

Print out the result list for the FA Women's Super League

In [26]:
# Print all the match results
for match in json_sb_match_data_wsl_1920:
    home_team_name = match['home_team']['home_team_name']
    away_team_name = match['away_team']['away_team_name']
    home_score = match['home_score']
    away_score = match['away_score']
    describe_text = f"The match between {home_team_name} and {away_team_name}"
    result_text = f" finished {home_score} : {away_score}"
    print(describe_text + result_text)

The match between Brighton & Hove Albion WFC and Liverpool WFC finished 1 : 0
The match between Chelsea FCW and Reading WFC finished 3 : 1
The match between Tottenham Hotspur Women and Manchester City WFC finished 1 : 4
The match between West Ham United LFC and Brighton & Hove Albion WFC finished 2 : 1
The match between Manchester United and Bristol City WFC finished 0 : 1
The match between Liverpool WFC and Reading WFC finished 0 : 1
The match between Chelsea FCW and West Ham United LFC finished 8 : 0
The match between Reading WFC and Tottenham Hotspur Women finished 3 : 1
The match between Birmingham City WFC and Manchester City WFC finished 0 : 2
The match between Tottenham Hotspur Women and West Ham United LFC finished 2 : 1
The match between Everton LFC and Reading WFC finished 3 : 1
The match between Everton LFC and Arsenal WFC finished 1 : 3
The match between Brighton & Hove Albion WFC and West Ham United LFC finished 1 : 3
The match between Liverpool WFC and Chelsea FCW finishe

Code to show just Manchester City WFC's results in the 19/20 FA Women's Super League

In [27]:
# Print match results involving Manchester City WFC
for match in json_sb_match_data_wsl_1920:
    home_team_name = match['home_team']['home_team_name']
    away_team_name = match['away_team']['away_team_name']
    if home_team_name == 'Manchester City WFC' or away_team_name == 'Manchester City WFC':
        home_score = match['home_score']
        away_score = match['away_score']
        describe_text = 'The match between ' + home_team_name + ' and ' + away_team_name
        result_text = ' finished ' + str(home_score) +  ' : ' + str(away_score)
        print(describe_text + result_text)

The match between Tottenham Hotspur Women and Manchester City WFC finished 1 : 4
The match between Birmingham City WFC and Manchester City WFC finished 0 : 2
The match between Manchester City WFC and Brighton & Hove Albion WFC finished 5 : 0
The match between Manchester City WFC and Liverpool WFC finished 1 : 0
The match between Manchester City WFC and Arsenal WFC finished 2 : 1
The match between Manchester City WFC and Everton LFC finished 3 : 1
The match between Manchester City WFC and Birmingham City WFC finished 3 : 0
The match between Manchester City WFC and West Ham United LFC finished 5 : 0
The match between Manchester City WFC and Manchester United finished 1 : 0
The match between Reading WFC and Manchester City WFC finished 0 : 2
The match between Chelsea FCW and Manchester City WFC finished 2 : 1
The match between Bristol City WFC and Manchester City WFC finished 0 : 5
The match between Arsenal WFC and Manchester City WFC finished 1 : 0
The match between Manchester City WFC a

Find the match ID for the game we are interested in - Manchester City WFC vs. Manchester United

In [28]:
# Now let's find the match we are interested in - Manchester City WFC vs. Manchester United
home_team_required = 'Manchester City WFC'
away_team_required = 'Manchester United'

In [29]:
# Find ID for the match we are interested in - Manchester City WFC vs. Manchester United
match_id_required = None    # stays None if the fixture is not in the data
for match in json_sb_match_data_wsl_1920:
    home_team_name = match['home_team']['home_team_name']
    away_team_name = match['away_team']['away_team_name']
    if (home_team_name == home_team_required) and (away_team_name == away_team_required):
        match_id_required = match['match_id']
print(home_team_required + ' vs ' + away_team_required + ' has id: ' + str(match_id_required))

Manchester City WFC vs Manchester United has id: 2275136
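
The loop above keeps a single ID for one hard-coded fixture. As a hedged alternative, the sketch below wraps the same comparison in a reusable function that returns every matching ID, so a fixture that is not in the data gives an empty list. `find_match_ids` is a hypothetical helper, and the sample records are abridged from the 19/20 output shown above.

```python
def find_match_ids(matches, home_team, away_team):
    """Return the IDs of all matches with the given home and away team names."""
    return [
        match['match_id']
        for match in matches
        if match['home_team']['home_team_name'] == home_team
        and match['away_team']['away_team_name'] == away_team
    ]


# Records abridged from the 19/20 match data shown above
sample_matches = [
    {'match_id': 2275054,
     'home_team': {'home_team_name': 'Brighton & Hove Albion WFC'},
     'away_team': {'away_team_name': 'Liverpool WFC'}},
    {'match_id': 2275136,
     'home_team': {'home_team_name': 'Manchester City WFC'},
     'away_team': {'away_team_name': 'Manchester United'}},
]

print(find_match_ids(sample_matches, 'Manchester City WFC', 'Manchester United'))    # [2275136]
print(find_match_ids(sample_matches, 'Manchester United', 'Manchester City WFC'))    # []
```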


##### Flatten Matches data

In [30]:
# Flatten the JSON Matches data for each WSL season and export each DataFrame as a CSV file

## WSL - 18/19
df_sb_match_data_wsl_1819_flat = json_normalize(json_sb_match_data_wsl_1819)
df_sb_match_data_wsl_1819_flat.to_csv(data_dir_sb + '/matches/raw/csv/wsl/df_sb_match_data_wsl_1819_flat.csv', index=None, header=True)
    
## WSL - 19/20
df_sb_match_data_wsl_1920_flat = json_normalize(json_sb_match_data_wsl_1920)
df_sb_match_data_wsl_1920_flat.to_csv(data_dir_sb + '/matches/raw/csv/wsl/df_sb_match_data_wsl_1920_flat.csv', index=None, header=True)

## WSL - 20/21
df_sb_match_data_wsl_2021_flat = json_normalize(json_sb_match_data_wsl_2021)
df_sb_match_data_wsl_2021_flat.to_csv(data_dir_sb + '/matches/raw/csv/wsl/df_sb_match_data_wsl_2021_flat.csv', index=None, header=True)



In [31]:
df_sb_match_data_wsl_1819_flat

Unnamed: 0,match_id,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version
0,19743,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,
1,19740,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,
2,19716,2018-09-09,15:00:00.000,4,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,4,2018/2019,974,Reading WFC,female,,68,England,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",970,Yeovil Town LFC,female,,68,England,"[{'id': 147, 'name': 'Lee Burch', 'nickname': ...",1.0.3,1,Regular Season,577.0,Adams Park,68.0,England,567.0,H. Conley,68.0,England,
3,19800,2019-03-14,20:30:00.000,4,0,available,2020-08-24T14:34:34.401523,18,37,England,FA Women's Super League,4,2018/2019,968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",973,Bristol City WFC,female,,68,England,"[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",1.1.0,1,Regular Season,456.0,Meadow Park,68.0,England,915.0,R. Whitton,,,
4,19739,2018-10-21,15:00:00.000,0,6,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,965,Brighton & Hove Albion WFC,female,,68,England,"[{'id': 149, 'name': 'Hope Patricia Powell', '...",746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1.0.3,1,Regular Season,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
102,19756,2019-04-17,20:00:00.000,1,3,available,2020-07-29T05:00,16,37,England,FA Women's Super League,4,2018/2019,967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,111.0,Haig Avenue,68.0,England,978.0,P. Clarke,,,2
103,19754,2019-03-24,13:30:00.000,1,5,available,2020-07-29T05:00,16,37,England,FA Women's Super League,4,2018/2019,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",1.1.0,1,Regular Season,579.0,Prenton Park,68.0,England,567.0,H. Conley,68.0,England,
104,19724,2018-09-23,15:00:00.000,4,3,available,2020-07-29T05:00,3,37,England,FA Women's Super League,4,2018/2019,968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",1.0.3,1,Regular Season,456.0,Meadow Park,68.0,England,897.0,P. Howard,,,
105,19814,2019-04-28,16:00:00.000,1,2,available,2020-07-29T05:00,21,37,England,FA Women's Super League,4,2018/2019,973,Bristol City WFC,female,,68,England,"[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",1.1.0,1,Regular Season,4055.0,Stoke Gifford Stadium,68.0,England,842.0,R. Hulme,,,2


In [32]:
df_sb_match_data_wsl_1920_flat

Unnamed: 0,match_id,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,metadata.data_version,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_stage.id,competition_stage.name,home_team.managers,away_team.managers,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name
0,2275054,2020-01-05,15:00:00.000,1,0,available,2020-07-29T05:00,11,37,England,FA Women's Super League,42,2019/2020,965,Brighton & Hove Albion WFC,female,,68,England,966,Liverpool WFC,female,,68,England,1.1.0,2,2,1,Regular Season,,,,,,,,,,
1,2275072,2020-01-05,13:30:00.000,3,1,available,2020-07-29T05:00,11,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,974,Reading WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...","[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",4279.0,The Cherry Red Records Stadium,68.0,England,893.0,S. Pearson,,
2,2275085,2020-01-05,15:00:00.000,1,4,available,2020-07-29T05:00,11,37,England,FA Women's Super League,42,2019/2020,749,Tottenham Hotspur Women,female,,68,England,746,Manchester City WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...","[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",122.0,The Hive Stadium,68.0,England,567.0,H. Conley,68.0,England
3,2275113,2020-01-19,16:00:00.000,2,1,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,972,West Ham United LFC,female,,68,England,965,Brighton & Hove Albion WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...","[{'id': 149, 'name': 'Hope Patricia Powell', '...",4062.0,The Rush Green Stadium,68.0,England,912.0,Ryan Atkin,,
4,2275142,2020-01-05,13:00:00.000,0,1,available,2020-07-29T05:00,11,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,973,Bristol City WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...","[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",4979.0,Leigh Sports Village Stadium,255.0,International,894.0,L. Oliver,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
82,2275083,2020-02-23,15:00:00.000,0,1,available,2020-07-29T05:00,17,37,England,FA Women's Super League,42,2019/2020,969,Birmingham City WFC,female,,68,England,973,Bristol City WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...","[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",5332.0,SportNation.bet Stadium,255.0,International,893.0,S. Pearson,,
83,2275086,2019-12-08,15:30:00.000,0,3,available,2020-07-29T05:00,9,37,England,FA Women's Super League,42,2019/2020,974,Reading WFC,female,,68,England,968,Arsenal WFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...","[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",577.0,Adams Park,68.0,England,842.0,R. Hulme,,
84,2275137,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,749,Tottenham Hotspur Women,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...","[{'id': 791, 'name': 'Karen Hills', 'nickname'...",4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,
85,2275056,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,967,Everton LFC,female,,68,England,1.1.0,2,2,1,Regular Season,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...","[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",6.0,Anfield,68.0,England,898.0,A. Fearn,,


In [33]:
df_sb_match_data_wsl_2021_flat

Unnamed: 0,match_id,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,metadata.data_version,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name
0,3764234,2020-09-05,15:30:00.000,0,2,available,2020-09-06T19:03:37.632953,1,37,England,FA Women's Super League,90,2020/2021,2647,Aston Villa,female,,68,England,746,Manchester City WFC,female,,68,England,1.1.0,2,2,1,Regular Season,211,Villa Park,68,England,937,A. Bryne


##### Concatenate the flattened Matches data for the WSL

In [34]:
# Concatenate the flattened Matches data for the three WSL seasons

## List of the per-season WSL Matches DataFrames
lst_match_dataframes_wsl = [df_sb_match_data_wsl_1819_flat, df_sb_match_data_wsl_1920_flat, df_sb_match_data_wsl_2021_flat]

## Concatenate the individual season DataFrames to one unified DataFrame
df_sb_match_data_wsl_flat = pd.concat(lst_match_dataframes_wsl)

## Export unified DataFrame
df_sb_match_data_wsl_flat.to_csv(data_dir_sb + '/matches/raw/csv/wsl/df_sb_match_data_wsl.csv', index=False, header=True)

In [35]:
df_sb_match_data_wsl_flat

Unnamed: 0,match_id,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version
0,19743,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,
1,19740,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,,
2,19716,2018-09-09,15:00:00.000,4,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,4,2018/2019,974,Reading WFC,female,,68,England,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",970,Yeovil Town LFC,female,,68,England,"[{'id': 147, 'name': 'Lee Burch', 'nickname': ...",1.0.3,1,Regular Season,577.0,Adams Park,68.0,England,567.0,H. Conley,68.0,England,,
3,19800,2019-03-14,20:30:00.000,4,0,available,2020-08-24T14:34:34.401523,18,37,England,FA Women's Super League,4,2018/2019,968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",973,Bristol City WFC,female,,68,England,"[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",1.1.0,1,Regular Season,456.0,Meadow Park,68.0,England,915.0,R. Whitton,,,,
4,19739,2018-10-21,15:00:00.000,0,6,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,965,Brighton & Hove Albion WFC,female,,68,England,"[{'id': 149, 'name': 'Hope Patricia Powell', '...",746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1.0.3,1,Regular Season,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83,2275086,2019-12-08,15:30:00.000,0,3,available,2020-07-29T05:00,9,37,England,FA Women's Super League,42,2019/2020,974,Reading WFC,female,,68,England,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",1.1.0,1,Regular Season,577.0,Adams Park,68.0,England,842.0,R. Hulme,,,2,2
84,2275137,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",749,Tottenham Hotspur Women,female,,68,England,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...",1.1.0,1,Regular Season,4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,,2,2
85,2275056,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",1.1.0,1,Regular Season,6.0,Anfield,68.0,England,898.0,A. Fearn,,,2,2
86,2275074,2020-02-12,20:00:00.000,2,0,available,2020-07-29T05:00,16,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,4279.0,The Cherry Red Records Stadium,68.0,England,915.0,R. Whitton,,,2,2


In [36]:
df_sb_match_data_wsl_flat.shape

(195, 40)
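
Because `pd.concat` keeps each season's own row labels by default, the index values in the unified DataFrame above repeat (0, 1, 2, ... restarts for every season). The sketch below uses two small hand-made DataFrames (the names `df_season_a` and `df_season_b` are hypothetical and the values illustrative) to show the difference that `ignore_index=True` makes. It is an optional variation, not what the notebook does.

```python
import pandas as pd

# Toy stand-ins for two per-season Matches DataFrames (illustrative values only)
df_season_a = pd.DataFrame({'match_id': [19743, 19740], 'season.season_name': ['2018/2019'] * 2})
df_season_b = pd.DataFrame({'match_id': [2275086, 2275137], 'season.season_name': ['2019/2020'] * 2})

# Default behaviour: each frame keeps its own 0..n-1 labels, so labels repeat
df_repeated = pd.concat([df_season_a, df_season_b])

# ignore_index=True assigns a fresh 0..n-1 index to the combined frame
df_unique = pd.concat([df_season_a, df_season_b], ignore_index=True)

print(df_repeated.index.tolist())
print(df_unique.index.tolist())

# A quick sanity check that no match appears twice after concatenation
print(df_unique['match_id'].is_unique)
```

A unique index is mainly useful if rows are later selected by label with `.loc`. The merges further down join on `match_id`, so they are unaffected either way.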

##### Convert `match_id` column to list

In [37]:
lst_sb_wsl_match_id = df_sb_match_data_wsl_flat['match_id'].tolist()

In [38]:
lst_sb_wsl_match_id

[19743,
 19740,
 19716,
 19800,
 19739,
 19734,
 19748,
 19822,
 19766,
 19785,
 19749,
 19751,
 19764,
 19773,
 19783,
 19736,
 19747,
 19787,
 19771,
 19769,
 19770,
 19765,
 19757,
 19742,
 19777,
 19758,
 19802,
 19803,
 19746,
 19733,
 19811,
 19805,
 19745,
 19752,
 19772,
 19775,
 19760,
 19792,
 19732,
 19744,
 19799,
 19730,
 19753,
 19735,
 19717,
 19718,
 19720,
 19719,
 19715,
 19723,
 19722,
 19727,
 19731,
 19714,
 19728,
 19726,
 19725,
 19738,
 19759,
 19750,
 19761,
 19763,
 19762,
 19768,
 19767,
 19776,
 19794,
 19820,
 19737,
 19790,
 19789,
 19793,
 19791,
 19796,
 19795,
 19797,
 19780,
 19782,
 19781,
 19779,
 19798,
 19784,
 19786,
 19774,
 19755,
 19806,
 19807,
 19788,
 19808,
 19804,
 19813,
 19816,
 19809,
 19810,
 19815,
 19817,
 19818,
 19819,
 19821,
 19801,
 19741,
 19778,
 19756,
 19754,
 19724,
 19814,
 19729,
 2275054,
 2275072,
 2275085,
 2275113,
 2275142,
 2275099,
 2275057,
 2275075,
 2275092,
 2275077,
 2275036,
 2275117,
 2275040,
 2275045,
 227

In [39]:
len(lst_sb_wsl_match_id)

195

#### <a id='#section3.3.3.'>3.3.3. Events</a>

##### Data Dictionary

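The combined WSL Events DataFrame imported later in this section has 148 columns, so the list here is not exhaustive. The table below covers one hundred and fourteen of the features (columns) in the [StatsBomb](https://statsbomb.com/) Events data, with their data types: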

| Feature     | Data type    |
|------|-----|
| `id`    | `object`
| `index`    | `object`
| `period`    | `object`
| `timestamp`    | `object`
| `minute`    | `object`
| `second`    | `object`
| `possession`    | `object`
| `duration`    | `object`
| `type.id`    | `object`
| `type.name`    | `object`
| `possession_team.id`    | `object`
| `possession_team.name`    | `object`
| `play_pattern.id`    | `object`
| `play_pattern.name`    | `object`
| `team.id`    | `object`
| `team.name`    | `object`
| `tactics.formation`    | `object`
| `tactics.lineup`    | `object`
| `related_events`    | `object`
| `location`    | `object`
| `player.id`    | `object`
| `player.name`    | `object`
| `position.id`    | `object`
| `position.name`    | `object`
| `pass.recipient.id`    | `object`
| `pass.recipient.name`    | `object`
| `pass.length`    | `object`
| `pass.angle`    | `object`
| `pass.height.id`    | `object`
| `pass.height.name`    | `object`
| `pass.end_location`    | `object`
| `pass.type.id`    | `object`
| `pass.type.name`    | `object`
| `pass.body_part.id`    | `object`
| `pass.body_part.name`    | `object`
| `carry.end_location`    | `object`
| `under_pressure`    | `object`
| `duel.type.id`    | `object`
| `duel.type.name`    | `object`
| `out`    | `object`
| `miscontrol.aerial_won`    | `object`
| `pass.outcome.id`    | `object`
| `pass.outcome.name`    | `object`
| `ball_receipt.outcome.id`    | `object`
| `ball_receipt.outcome.name`    | `object`
| `pass.aerial_won`    | `object`
| `counterpress`    | `object`
| `off_camera`    | `object`
| `dribble.outcome.id`    | `object`
| `dribble.outcome.name`    | `object`
| `dribble.overrun`    | `object`
| `ball_recovery.offensive`    | `object`
| `shot.statsbomb_xg`    | `object`
| `shot.end_location`    | `object`
| `shot.outcome.id`    | `object`
| `shot.outcome.name`    | `object`
| `shot.type.id`    | `object`
| `shot.type.name`    | `object`
| `shot.body_part.id`    | `object`
| `shot.body_part.name`    | `object`
| `shot.technique.id`    | `object`
| `shot.technique.name`    | `object`
| `shot.freeze_frame`    | `object`
| `goalkeeper.end_location`    | `object`
| `goalkeeper.type.id`    | `object`
| `goalkeeper.type.name`    | `object`
| `goalkeeper.position.id`    | `object`
| `goalkeeper.position.name`    | `object`
| `pass.straight`    | `object`
| `pass.technique.id`    | `object`
| `pass.technique.name`    | `object`
| `clearance.head`    | `object`
| `clearance.body_part.id`    | `object`
| `clearance.body_part.name`    | `object`
| `pass.switch`    | `object`
| `duel.outcome.id`    | `object`
| `duel.outcome.name`    | `object`
| `foul_committed.advantage`    | `object`
| `foul_won.advantage`    | `object`
| `pass.cross`    | `object`
| `pass.assisted_shot_id`    | `object`
| `pass.shot_assist`    | `object`
| `shot.one_on_one`    | `object`
| `shot.key_pass_id`    | `object`
| `goalkeeper.body_part.id`    | `object`
| `goalkeeper.body_part.name`    | `object`
| `goalkeeper.technique.id`    | `object`
| `goalkeeper.technique.name`    | `object`
| `goalkeeper.outcome.id`    | `object`
| `goalkeeper.outcome.name`    | `object`
| `clearance.aerial_won`    | `object`
| `foul_committed.card.id`    | `object`
| `foul_committed.card.name`    | `object`
| `foul_won.defensive`    | `object`
| `clearance.right_foot`    | `object`
| `shot.first_time`    | `object`
| `pass.through_ball`    | `object`
| `interception.outcome.id`    | `object`
| `interception.outcome.name`    | `object`
| `clearance.left_foot`    | `object`
| `ball_recovery.recovery_failure`    | `object`
| `shot.aerial_won`    | `object`
| `pass.goal_assist`    | `object`
| `pass.cut_back`    | `object`
| `pass.deflected`    | `object`
| `clearance.other`    | `object`
| `pass.outswinging`    | `object`
| `substitution.outcome.id`    | `object`
| `substitution.outcome.name`    | `object`
| `substitution.replacement.id`    | `object`
| `substitution.replacement.name`    | `object`
| `block.deflection`    | `object`
| `block.offensive`    | `object`
| `injury_stoppage.in_chain`    | `object`

For a full list of definitions, see the official documentation [[link](https://statsbomb.com/stat-definitions/)].

##### Import Data

In [40]:
# Show the JSON files in the Events directory
print(glob.glob(data_dir_sb + '/events/raw/json/*.json'))

['../../data/sb/events/raw/json/2275050.json', '../../data/sb/events/raw/json/19795.json', '../../data/sb/events/raw/json/7298.json', '../../data/sb/events/raw/json/265958.json', '../../data/sb/events/raw/json/69182.json', '../../data/sb/events/raw/json/18242.json', '../../data/sb/events/raw/json/69301.json', '../../data/sb/events/raw/json/303696.json', '../../data/sb/events/raw/json/69244.json', '../../data/sb/events/raw/json/2275142.json', '../../data/sb/events/raw/json/266620.json', '../../data/sb/events/raw/json/7559.json', '../../data/sb/events/raw/json/69213.json', '../../data/sb/events/raw/json/2275154.json', '../../data/sb/events/raw/json/69340.json', '../../data/sb/events/raw/json/69205.json', '../../data/sb/events/raw/json/19804.json', '../../data/sb/events/raw/json/8655.json', '../../data/sb/events/raw/json/266724.json', '../../data/sb/events/raw/json/19783.json', '../../data/sb/events/raw/json/22980.json', '../../data/sb/events/raw/json/2275103.json', '../../data/sb/events/

In [41]:
# Import all StatsBomb JSON Events data files for the WSL and combine them into one pandas DataFrame

## Collect one flattened DataFrame per match
lst_events = []

## Loop through the WSL match IDs: load each JSON, flatten it, and tag it with its match_id
for match_id in lst_sb_wsl_match_id:
    with open(data_dir_sb + '/events/raw/json/' + str(match_id) + '.json', encoding='utf-8') as f:
        event = json.load(f)
    event_flat = pd.json_normalize(event)
    event_flat['match_id'] = match_id
    lst_events.append(event_flat)

## Concatenate once after the loop; DataFrame.append was removed in pandas 2.0
df_events = pd.concat(lst_events)



In [42]:
df_events.columns

Index(['id', 'index', 'period', 'timestamp', 'minute', 'second', 'possession',
       'duration', 'type.id', 'type.name',
       ...
       'player_off.permanent', 'shot.saved_to_post',
       'goalkeeper.shot_saved_to_post', 'goalkeeper.lost_in_play',
       'goalkeeper.success_out', 'shot.follows_dribble',
       'half_start.late_video_start', 'goalkeeper.success_in_play',
       'half_end.early_video_end', 'goalkeeper.saved_to_post'],
      dtype='object', length=148)
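
The dotted column names above (`type.name`, `pass.recipient.id`, and so on) come from flattening the nested Events JSON. As a minimal sketch, the hand-written record below imitates the shape of one event. Its values are illustrative, not read from a data file.

```python
import pandas as pd

# A hand-written event in the nested style of the Events JSON (illustrative values)
event = [{
    'id': 'ac80414e',
    'type': {'id': 30, 'name': 'Pass'},
    'pass': {'length': 9.8, 'recipient': {'id': 15549, 'name': 'Sophie Ingle'}},
    'location': [61.0, 41.0],
}]

# Nested dictionaries become dotted column names; lists are left as single cells
event_flat = pd.json_normalize(event)

print(sorted(event_flat.columns))
print(event_flat.loc[0, 'location'])
```

Note that list-valued fields such as `location` are not expanded, which is why columns like `location`, `related_events`, and `shot.freeze_frame` still hold lists after the import.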

In [43]:
# Export unified Events DataFrame
df_events.to_csv(data_dir_sb + '/events/raw/csv/wsl/df_sb_event_data_wsl.csv', index=False, header=True)

In [44]:
lst_formation = df_events['tactics.formation'].unique().tolist()

In [45]:
lst_formation

[4231.0,
 42211.0,
 nan,
 442.0,
 4321.0,
 4222.0,
 41221.0,
 433.0,
 4411.0,
 4141.0,
 5221.0,
 42121.0,
 3232.0,
 451.0,
 31222.0,
 4132.0,
 343.0,
 541.0,
 32221.0,
 3421.0,
 41212.0,
 3142.0,
 3511.0,
 3412.0,
 41131.0,
 4312.0,
 352.0]
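
The formations are stored as floats (for example `4231.0`), with `nan` for events that carry no tactics. A small helper can turn them into the familiar hyphenated form. The function name `format_formation` is hypothetical, and the sketch assumes that each digit of the code is one line of the formation.

```python
import math

def format_formation(value):
    """Turn a formation code such as 4231.0 into '4-2-3-1'; return None when missing."""
    # Events without tactics carry NaN, which is a float
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    # Drop the decimal part, then put a hyphen between the digits
    return '-'.join(str(int(value)))

print(format_formation(4231.0))
print(format_formation(41221.0))
print(format_formation(float('nan')))
```

Applied to the DataFrame, this could be used as `df_events['tactics.formation'].apply(format_formation)`.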

#### <a id='#section3.3.4.'>3.3.4. Lineups</a>

##### Data Dictionary

##### Import Data

In [46]:
# Show the JSON files in the Lineups directory
print(glob.glob(data_dir_sb + '/lineups/raw/json/*.json'))

['../../data/sb/lineups/raw/json/2275050.json', '../../data/sb/lineups/raw/json/19795.json', '../../data/sb/lineups/raw/json/7298.json', '../../data/sb/lineups/raw/json/265958.json', '../../data/sb/lineups/raw/json/69182.json', '../../data/sb/lineups/raw/json/18242.json', '../../data/sb/lineups/raw/json/69301.json', '../../data/sb/lineups/raw/json/303696.json', '../../data/sb/lineups/raw/json/69244.json', '../../data/sb/lineups/raw/json/2275142.json', '../../data/sb/lineups/raw/json/266620.json', '../../data/sb/lineups/raw/json/7559.json', '../../data/sb/lineups/raw/json/69213.json', '../../data/sb/lineups/raw/json/2275154.json', '../../data/sb/lineups/raw/json/69340.json', '../../data/sb/lineups/raw/json/69205.json', '../../data/sb/lineups/raw/json/19804.json', '../../data/sb/lineups/raw/json/8655.json', '../../data/sb/lineups/raw/json/266724.json', '../../data/sb/lineups/raw/json/19783.json', '../../data/sb/lineups/raw/json/22980.json', '../../data/sb/lineups/raw/json/2275103.json', 

In [47]:
# Import all StatsBomb JSON Lineups data files for the WSL and combine them into one pandas DataFrame

## Collect one flattened DataFrame per match
lst_lineups = []

## Loop through the WSL match IDs: load each lineup JSON, flatten it, and collect it
for lineup_id in lst_sb_wsl_match_id:
    with open(data_dir_sb + '/lineups/raw/json/' + str(lineup_id) + '.json', encoding='utf-8') as f:
        lineup = json.load(f)
    lineup_flat = pd.json_normalize(lineup)
    lst_lineups.append(lineup_flat)

## Concatenate once after the loop; DataFrame.append was removed in pandas 2.0
df_lineups = pd.concat(lst_lineups)



In [48]:
df_lineups

Unnamed: 0,team_id,team_name,lineup
0,971,Chelsea FCW,"[{'player_id': 4633, 'player_name': 'Magdalena..."
1,969,Birmingham City WFC,"[{'player_id': 10193, 'player_name': 'Chloe Ar..."
0,972,West Ham United LFC,"[{'player_id': 4653, 'player_name': 'Jane Ross..."
1,966,Liverpool WFC,"[{'player_id': 15218, 'player_name': 'Jessica ..."
0,974,Reading WFC,"[{'player_id': 10198, 'player_name': 'Josanne ..."
...,...,...,...
1,966,Liverpool WFC,"[{'player_id': 15547, 'player_name': 'Melissa ..."
0,971,Chelsea FCW,"[{'player_id': 4633, 'player_name': 'Magdalena..."
1,969,Birmingham City WFC,"[{'player_id': 10193, 'player_name': 'Chloe Ar..."
0,746,Manchester City WFC,"[{'player_id': 4637, 'player_name': 'Ellie Roe..."
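
Each `lineup` cell above still holds a list of player dictionaries, and the rows carry no `match_id`. One possible way to get one row per player is `pd.json_normalize` with `record_path` and `meta`. In this sketch the team names and the `player_id`/`player_name` keys follow the output above, while the player values are placeholders and the `match_id` column is added by hand.

```python
import pandas as pd

# Hand-written records shaped like one lineup file (placeholder player values)
lineup = [
    {'team_id': 971, 'team_name': 'Chelsea FCW',
     'lineup': [{'player_id': 1, 'player_name': 'Player A'},
                {'player_id': 2, 'player_name': 'Player B'}]},
    {'team_id': 969, 'team_name': 'Birmingham City WFC',
     'lineup': [{'player_id': 3, 'player_name': 'Player C'}]},
]

# record_path expands the nested list; meta repeats the team fields on every player row
df_players = pd.json_normalize(lineup, record_path='lineup', meta=['team_id', 'team_name'])

# Tag the rows with the match they belong to, as the Events loop does
df_players['match_id'] = 19743

print(df_players)
```

In the import loop this would replace the plain `json_normalize(lineup)` call, with `lineup_id` supplying the `match_id` value.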


In [49]:
# Export Lineups DataFrame
df_lineups.to_csv(data_dir_sb + '/lineups/raw/csv/wsl/df_sb_lineup_data_wsl.csv', index=False, header=True)

---

## <a id='#section4'>4. Data Engineering</a>
Before conducting an [Exploratory Data Analysis (EDA)](#section5) of the data, we first need to clean and wrangle the datasets into a form that meets our needs.

### <a id='#section4.1'>4.1. Join Datasets</a>
Next, we join the `Competitions` DataFrame to the `Matches` DataFrame, and then join the result to the `Events` DataFrame. The `Events` data is the base DataFrame, to which the other tables are joined via `competition_id`, `season_id`, and `match_id`.

##### Join Competitions Data to Match Data

In [50]:
# Join the Competitions DataFrame to the Matches DataFrame
df_sb_match_competitions = pd.merge(df_sb_match_data_wsl_flat, df_sb_competitions_data_flat_select, left_on=['competition.competition_id', 'season.season_id'], right_on=['competition_id', 'season_id'])

In [51]:
df_sb_match_competitions.head()

Unnamed: 0,match_id,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name
0,19743,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019
1,19740,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019
2,19716,2018-09-09,15:00:00.000,4,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,4,2018/2019,974,Reading WFC,female,,68,England,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",970,Yeovil Town LFC,female,,68,England,"[{'id': 147, 'name': 'Lee Burch', 'nickname': ...",1.0.3,1,Regular Season,577.0,Adams Park,68.0,England,567.0,H. Conley,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019
3,19800,2019-03-14,20:30:00.000,4,0,available,2020-08-24T14:34:34.401523,18,37,England,FA Women's Super League,4,2018/2019,968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",973,Bristol City WFC,female,,68,England,"[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",1.1.0,1,Regular Season,456.0,Meadow Park,68.0,England,915.0,R. Whitton,,,,,37,4,England,FA Women's Super League,female,2018/2019
4,19739,2018-10-21,15:00:00.000,0,6,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,965,Brighton & Hove Albion WFC,female,,68,England,"[{'id': 149, 'name': 'Hope Patricia Powell', '...",746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1.0.3,1,Regular Season,,,,,,,,,,,37,4,England,FA Women's Super League,female,2018/2019


##### Join Events Data to Match Data

In [52]:
# Join the Events DataFrame to the Matches-Competitions DataFrame
df_sb_events_match_competitions = pd.merge(df_events, df_sb_match_competitions, left_on=['match_id'], right_on=['match_id'])

In [53]:
df_sb_events_match_competitions.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019
1,489dd844-2b2e-4b0c-90ef-7bbab5ab4bd7,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019
2,c036ad64-e323-4c8d-b770-3b6e26e7d882,3,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,,,[48ca911e-66d7-4ebc-8514-6728f94df8d2],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019
3,48ca911e-66d7-4ebc-8514-6728f94df8d2,4,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,,,[c036ad64-e323-4c8d-b770-3b6e26e7d882],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"[61.0, 41.0]",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"[52.0, 45.0]",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019


In [54]:
df_sb_events_match_competitions.shape

(648877, 193)
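
When joining many event rows to one match row each, it can be worth checking that the merge neither duplicates nor drops events. The sketch below uses two toy DataFrames (hypothetical names, illustrative values) together with the `validate` and `indicator` options of `pd.merge`. It is an optional check, not part of the notebook's pipeline.

```python
import pandas as pd

# Toy stand-ins: three events across two matches, and one row per match
df_events_toy = pd.DataFrame({'id': ['a', 'b', 'c'], 'match_id': [19743, 19743, 19740]})
df_matches_toy = pd.DataFrame({'match_id': [19743, 19740], 'home_score': [0, 0]})

# validate raises MergeError if match_id is not unique on the right-hand side;
# indicator adds a _merge column showing where each row was found
df_joined = pd.merge(df_events_toy, df_matches_toy, on='match_id', how='left',
                     validate='many_to_one', indicator=True)

# Events with no matching row in the Matches data would not be labelled 'both'
n_unmatched = (df_joined['_merge'] != 'both').sum()

print(len(df_joined), n_unmatched)
```

With a left join and a many-to-one relationship, the joined DataFrame should have exactly as many rows as the Events DataFrame.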

### <a id='#section4.2'>4.2. Create Engineered Attributes</a>

#### <a id='#section4.2.1'>4.2.1. Create `Team` and `Opponent` Attributes</a>

In [55]:
df_sb_events_match_competitions['Team'] = np.where(df_sb_events_match_competitions['team.name'] == df_sb_events_match_competitions['home_team.home_team_name'], df_sb_events_match_competitions['home_team.home_team_name'], df_sb_events_match_competitions['away_team.away_team_name'])
df_sb_events_match_competitions['Opponent'] = np.where(df_sb_events_match_competitions['team.name'] == df_sb_events_match_competitions['away_team.away_team_name'], df_sb_events_match_competitions['home_team.home_team_name'], df_sb_events_match_competitions['away_team.away_team_name'])
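
To see what the two `np.where` calls produce, here is the same logic on a two-row toy DataFrame (`df_toy` is a hypothetical name). It has one event by the home side and one by the away side of the same fixture.

```python
import numpy as np
import pandas as pd

# One event by each side of the same fixture
df_toy = pd.DataFrame({
    'team.name': ['Birmingham City WFC', 'Chelsea FCW'],
    'home_team.home_team_name': ['Birmingham City WFC', 'Birmingham City WFC'],
    'away_team.away_team_name': ['Chelsea FCW', 'Chelsea FCW'],
})

# Team: the home name when the event's team is the home side, otherwise the away name
df_toy['Team'] = np.where(df_toy['team.name'] == df_toy['home_team.home_team_name'],
                          df_toy['home_team.home_team_name'],
                          df_toy['away_team.away_team_name'])

# Opponent: the home name when the event's team is the away side, otherwise the away name
df_toy['Opponent'] = np.where(df_toy['team.name'] == df_toy['away_team.away_team_name'],
                              df_toy['home_team.home_team_name'],
                              df_toy['away_team.away_team_name'])

print(df_toy[['Team', 'Opponent']])
```

`Team` therefore mirrors `team.name`, and `Opponent` is always the other side of the fixture.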

#### <a id='#section4.2.2'>4.2.2. Create `Full_Fixture_Date` Attribute</a>

In [56]:
df_sb_events_match_competitions['Full_Fixture_Date'] = df_sb_events_match_competitions['match_date'].astype(str) + ' ' + df_sb_events_match_competitions['home_team.home_team_name'].astype(str)  + ' ' + df_sb_events_match_competitions['home_score'].astype(str) + ' ' + ' vs. ' + ' ' + df_sb_events_match_competitions['away_score'].astype(str) + ' ' + df_sb_events_match_competitions['away_team.away_team_name'].astype(str)

In [57]:
df_sb_events_match_competitions.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
1,489dd844-2b2e-4b0c-90ef-7bbab5ab4bd7,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
2,c036ad64-e323-4c8d-b770-3b6e26e7d882,3,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,,,[48ca911e-66d7-4ebc-8514-6728f94df8d2],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
3,48ca911e-66d7-4ebc-8514-6728f94df8d2,4,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,,,[c036ad64-e323-4c8d-b770-3b6e26e7d882],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"[61.0, 41.0]",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"[52.0, 45.0]",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...


### <a id='#section4.3'>4.3. Export Raw DataFrame</a>

In [58]:
# Export the raw combined DataFrame to CSV (index omitted)
df_sb_events_match_competitions.to_csv(data_dir_sb + '/combined/raw/csv/wsl/df_sb_combined_data_wsl.csv', index=False, header=True)
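
Building the export path by string concatenation makes it easy to double up or drop a separator. The following is a minimal sketch of the same kind of export using `os.path.join`, creating the target folder first if it does not exist. The DataFrame contents and the temporary `data_dir_sb` value are placeholders for illustration only, not the notebook's real data or directory.

```python
import os
import tempfile

import pandas as pd

# Placeholder base directory and data (assumptions for illustration only)
data_dir_sb = tempfile.mkdtemp()
df_example = pd.DataFrame({'id': ['a', 'b'], 'type.name': ['Pass', 'Shot']})

# Build the export path from its parts; os.path.join inserts separators as needed
export_dir = os.path.join(data_dir_sb, 'combined', 'raw', 'csv', 'wsl')
os.makedirs(export_dir, exist_ok=True)
export_path = os.path.join(export_dir, 'df_sb_combined_data_wsl.csv')

# Write without the index so no 'Unnamed: 0' column appears on re-import
df_example.to_csv(export_path, index=False, header=True)

# Read the file back to confirm the round trip
df_check = pd.read_csv(export_path)
```

Reading the file back is optional, but it is a cheap way to confirm that the columns survived the export unchanged.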

## <a id='#section5'>5. Data Engineering</a>

### <a id='#section5.1'>5.1. Assign Raw DataFrame to Engineered DataFrame</a>

In [59]:
# Assign Raw DataFrame to Engineered DataFrame
## .copy() keeps the raw DataFrame untouched by later engineering steps
df_sb = df_sb_events_match_competitions.copy()
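
In pandas, plain assignment (`df_b = df_a`) binds a second name to the same DataFrame, whereas `.copy()` creates an independent one. The sketch below uses made-up data and hypothetical variable names to show the difference.

```python
import pandas as pd

# Made-up stand-in for the raw DataFrame
df_raw = pd.DataFrame({'type.name': ['Starting XI', 'Pass']})

# Plain assignment binds a second name to the same object
df_alias = df_raw

# .copy() creates an independent DataFrame
df_copy = df_raw.copy()

# Adding a column through the alias also changes df_raw, but not df_copy
df_alias['flag'] = True

same_object = df_alias is df_raw
raw_has_flag = 'flag' in df_raw.columns
copy_has_flag = 'flag' in df_copy.columns
```

Whether the extra memory of a copy is worthwhile depends on whether the raw DataFrame is needed again later in the notebook.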

### <a id='#section5.2'>5.2. Extract Lineups from DataFrame</a>

In [60]:
# List unique values in the df_sb['type.name'] column
df_sb['type.name'].unique()

array(['Starting XI', 'Half Start', 'Pass', 'Ball Receipt*', 'Carry',
       'Pressure', 'Ball Recovery', 'Block', 'Duel', 'Interception',
       'Dribbled Past', 'Dribble', 'Shot', 'Goal Keeper',
       'Foul Committed', 'Foul Won', 'Dispossessed', 'Clearance',
       'Miscontrol', '50/50', 'Injury Stoppage', 'Player Off',
       'Player On', 'Substitution', 'Shield', 'Tactical Shift',
       'Half End', 'Error', 'Referee Ball-Drop', 'Offside',
       'Own Goal Against', 'Own Goal For', 'Bad Behaviour'], dtype=object)

The starting XI players and formation can be found in the rows where `type.name` is 'Starting XI'.

In [61]:
df_lineup = df_sb[df_sb['type.name'] == 'Starting XI']
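
The filter above is a boolean mask: the comparison yields one True/False value per row, and indexing with it keeps only the True rows. A small sketch on a made-up events table (team names and rows are invented for illustration):

```python
import pandas as pd

# Toy events table (made-up rows) mimicking the 'type.name' column
df_events = pd.DataFrame({
    'type.name': ['Starting XI', 'Starting XI', 'Half Start', 'Pass'],
    'team.name': ['Team A', 'Team B', 'Team A', 'Team B'],
})

# Boolean mask: True where the event is a 'Starting XI' row
mask = df_events['type.name'] == 'Starting XI'

# Keep only the matching rows; .copy() makes the result independent of df_events
df_lineup_example = df_events[mask].copy()

# Count of events per type, a quick sanity check on the filter
type_counts = df_events['type.name'].value_counts()
```

In real match data one would expect two 'Starting XI' rows per match, one for each team, which `value_counts()` makes easy to check.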

In [62]:
df_lineup

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
1,489dd844-2b2e-4b0c-90ef-7bbab5ab4bd7,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
3527,58bb5658-80a5-4b17-86a0-20111ab91353,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,972,West Ham United LFC,1,Regular Play,972,West Ham United LFC,442.0,"[{'player': {'id': 18158, 'name': 'Rebecca Lei...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19740,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019,West Ham United LFC,Liverpool WFC,2018-10-21 West Ham United LFC 0 vs. 1 Liver...
3528,bbad0c35-aef8-40bf-b213-226e86c5f2f7,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,972,West Ham United LFC,1,Regular Play,966,Liverpool WFC,4321.0,"[{'player': {'id': 19778, 'name': 'Frances Kit...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19740,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019,Liverpool WFC,West Ham United LFC,2018-10-21 West Ham United LFC 0 vs. 1 Liver...
6565,04c3b8ac-d7a9-490e-a167-334fb353c82a,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,974,Reading WFC,1,Regular Play,974,Reading WFC,4222.0,"[{'player': {'id': 15719, 'name': 'Grace Molon...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19716,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-09-09,15:00:00.000,4,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,4,2018/2019,974,Reading WFC,female,,68,England,"[{'id': 144, 'name': 'Kelly Chambers', 'nickna...",970,Yeovil Town LFC,female,,68,England,"[{'id': 147, 'name': 'Lee Burch', 'nickname': ...",1.0.3,1,Regular Season,577.0,Adams Park,68.0,England,567.0,H. Conley,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019,Reading WFC,Yeovil Town LFC,2018-09-09 Reading WFC 4 vs. 0 Yeovil Town LFC
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
638307,36de8f79-c32a-4163-8ad7-bf2da3db4f9e,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,1475,Manchester United,1,Regular Play,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275137,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",749,Tottenham Hotspur Women,female,,68,England,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...",1.1.0,1,Regular Season,4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Tottenham Hotspur Women,Manchester United,2020-01-19 Manchester United 3 vs. 0 Tottenh...
641636,32898ea7-108b-4f9b-86ec-847f35aa2974,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,966,Liverpool WFC,1,Regular Play,966,Liverpool WFC,4231.0,"[{'player': {'id': 15626, 'name': 'Anke Preuß'...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275056,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",1.1.0,1,Regular Season,6.0,Anfield,68.0,England,898.0,A. Fearn,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Liverpool WFC,Everton LFC,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC
641637,33bbf774-d931-4e8e-bbaa-6ba618878f25,2,1,00:00:00.000,0,0,1,0.0,35,Starting XI,966,Liverpool WFC,1,Regular Play,967,Everton LFC,433.0,"[{'player': {'id': 13857, 'name': 'Tinja-Riikk...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275056,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",1.1.0,1,Regular Season,6.0,Anfield,68.0,England,898.0,A. Fearn,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Everton LFC,Liverpool WFC,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC
645124,d7b811ef-4156-4e3c-a357-aed72b7c53b7,1,1,00:00:00.000,0,0,1,0.0,35,Starting XI,971,Chelsea FCW,1,Regular Play,971,Chelsea FCW,442.0,"[{'player': {'id': 19421, 'name': 'Carly Mitch...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275074,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-02-12,20:00:00.000,2,0,available,2020-07-29T05:00,16,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,4279.0,The Cherry Red Records Stadium,68.0,England,915.0,R. Whitton,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Chelsea FCW,Birmingham City WFC,2020-02-12 Chelsea FCW 2 vs. 0 Birmingham Ci...


In [63]:
# Streamline DataFrame to include just the columns of interest

## Define columns
cols = ['id', 'type.name', 'match_date', 'kick_off', 'Full_Fixture_Date', 'team.id', 'team.name', 'tactics.formation', 'tactics.lineup', 'competition_name', 'season_name', 'home_team.home_team_name', 'away_team.away_team_name', 'Team', 'Opponent', 'home_score', 'away_score']

## Select only columns of interest
df_lineup_select = df_lineup[cols]

In [64]:
df_lineup_select

Unnamed: 0,id,type.name,match_date,kick_off,Full_Fixture_Date,team.id,team.name,tactics.formation,tactics.lineup,competition_name,season_name,home_team.home_team_name,away_team.away_team_name,Team,Opponent,home_score,away_score
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ...",FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0
1,489dd844-2b2e-4b0c-90ef-7bbab5ab4bd7,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Chelsea FCW,Birmingham City WFC,0,0
3527,58bb5658-80a5-4b17-86a0-20111ab91353,Starting XI,2018-10-21,16:00:00.000,2018-10-21 West Ham United LFC 0 vs. 1 Liver...,972,West Ham United LFC,442.0,"[{'player': {'id': 18158, 'name': 'Rebecca Lei...",FA Women's Super League,2018/2019,West Ham United LFC,Liverpool WFC,West Ham United LFC,Liverpool WFC,0,1
3528,bbad0c35-aef8-40bf-b213-226e86c5f2f7,Starting XI,2018-10-21,16:00:00.000,2018-10-21 West Ham United LFC 0 vs. 1 Liver...,966,Liverpool WFC,4321.0,"[{'player': {'id': 19778, 'name': 'Frances Kit...",FA Women's Super League,2018/2019,West Ham United LFC,Liverpool WFC,Liverpool WFC,West Ham United LFC,0,1
6565,04c3b8ac-d7a9-490e-a167-334fb353c82a,Starting XI,2018-09-09,15:00:00.000,2018-09-09 Reading WFC 4 vs. 0 Yeovil Town LFC,974,Reading WFC,4222.0,"[{'player': {'id': 15719, 'name': 'Grace Molon...",FA Women's Super League,2018/2019,Reading WFC,Yeovil Town LFC,Reading WFC,Yeovil Town LFC,4,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
638307,36de8f79-c32a-4163-8ad7-bf2da3db4f9e,Starting XI,2020-01-19,13:00:00.000,2020-01-19 Manchester United 3 vs. 0 Tottenh...,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga...",FA Women's Super League,2019/2020,Manchester United,Tottenham Hotspur Women,Tottenham Hotspur Women,Manchester United,3,0
641636,32898ea7-108b-4f9b-86ec-847f35aa2974,Starting XI,2019-11-17,16:00:00.000,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC,966,Liverpool WFC,4231.0,"[{'player': {'id': 15626, 'name': 'Anke Preuß'...",FA Women's Super League,2019/2020,Liverpool WFC,Everton LFC,Liverpool WFC,Everton LFC,0,1
641637,33bbf774-d931-4e8e-bbaa-6ba618878f25,Starting XI,2019-11-17,16:00:00.000,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC,967,Everton LFC,433.0,"[{'player': {'id': 13857, 'name': 'Tinja-Riikk...",FA Women's Super League,2019/2020,Liverpool WFC,Everton LFC,Everton LFC,Liverpool WFC,0,1
645124,d7b811ef-4156-4e3c-a357-aed72b7c53b7,Starting XI,2020-02-12,20:00:00.000,2020-02-12 Chelsea FCW 2 vs. 0 Birmingham Ci...,971,Chelsea FCW,442.0,"[{'player': {'id': 19421, 'name': 'Carly Mitch...",FA Women's Super League,2019/2020,Chelsea FCW,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,2,0


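The extracted lineup data so far has one row per team per match, with all of that team's players packed into a single `tactics.lineup` cell as a list of dictionaries. To get the starting XI players as individual rows, we need to break down the `tactics.lineup` attribute.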

In [65]:
# Normalize tactics.lineup - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame

## explode all columns with lists of dicts
df_lineup_select_normalize = df_lineup_select.apply(lambda x: x.explode()).reset_index(drop=True)

## list of columns with dicts
cols_to_normalize = ['tactics.lineup']

## if any keys (which will become column names) overlap with existing column names, prefix them with the source column name
normalized = list()

for col in cols_to_normalize:
    d = pd.json_normalize(df_lineup_select_normalize[col], sep='_')
    d.columns = [f'{col}_{v}' for v in d.columns]
    normalized.append(d.copy())

## combine df with the normalized columns
df_lineup_select_normalize = pd.concat([df_lineup_select_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)

## display the first 30 rows of the normalized DataFrame
df_lineup_select_normalize.head(30)

Unnamed: 0,id,type.name,match_date,kick_off,Full_Fixture_Date,team.id,team.name,tactics.formation,competition_name,season_name,home_team.home_team_name,away_team.away_team_name,Team,Opponent,home_score,away_score,tactics.lineup_jersey_number,tactics.lineup_player_id,tactics.lineup_player_name,tactics.lineup_position_id,tactics.lineup_position_name
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,30.0,15560,Ann-Katrin Berger,1,Goalkeeper
1,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,7.0,10193,Chloe Arthur,2,Right Back
2,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,3.0,19502,Meaghan Sargeant,3,Right Center Back
3,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,25.0,19503,Aoife Mannion,5,Left Center Back
4,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,6.0,15569,Kerys Harrop,6,Left Back
5,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,13.0,15565,Marisa Ewers,9,Right Defensive Midfield
6,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4.0,19501,Hayley Ladd,11,Left Defensive Midfield
7,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,15.0,15563,Charlie Wellings,17,Right Wing
8,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,37.0,15562,Lucy Staniforth,19,Center Attacking Midfield
9,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,Starting XI,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,969,Birmingham City WFC,4231.0,FA Women's Super League,2018/2019,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,8.0,19500,Sarah Emma Mayling,21,Left Wing
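
The normalization step can be easier to follow on a tiny example. The sketch below applies the same two ideas, `explode` to get one row per list element and `pd.json_normalize` to flatten each dictionary, to a single made-up 'Starting XI' row with two invented players. It names the column to explode explicitly rather than applying `explode` to every column, which is an alternative to the cell above and not the notebook's own code.

```python
import pandas as pd

# Toy 'Starting XI' row: one team with a two-player lineup (made-up values)
df_toy = pd.DataFrame({
    'team.name': ['Team A'],
    'tactics.lineup': [[
        {'player': {'id': 1, 'name': 'Player One'},
         'position': {'id': 1, 'name': 'Goalkeeper'},
         'jersey_number': 1},
        {'player': {'id': 2, 'name': 'Player Two'},
         'position': {'id': 2, 'name': 'Right Back'},
         'jersey_number': 2},
    ]],
})

# Step 1: one row per list element (each cell is now a single dict)
df_long = df_toy.explode('tactics.lineup').reset_index(drop=True)

# Step 2: flatten each dict into columns; nested keys are joined with '_'
df_flat = pd.json_normalize(df_long['tactics.lineup'].tolist(), sep='_')
df_flat.columns = [f'tactics.lineup_{c}' for c in df_flat.columns]

# Step 3: attach the flattened columns and drop the original nested column
df_players = pd.concat([df_long.drop(columns='tactics.lineup'), df_flat], axis=1)
```

Resetting the index after `explode` matters here: `pd.concat(..., axis=1)` aligns on the index, so both pieces need the same 0..n-1 row labels.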


In [66]:
df_lineup_engineered = df_lineup_select_normalize

In [67]:
# Streamline DataFrame to include just the columns of interest

## Define columns
cols = ['id', 'match_date', 'kick_off', 'Full_Fixture_Date', 'type.name', 'season_name', 'competition_name', 'home_team.home_team_name', 'away_team.away_team_name', 'Team', 'Opponent', 'home_score', 'away_score', 'tactics.formation', 'tactics.lineup_jersey_number', 'tactics.lineup_position_id', 'tactics.lineup_player_name', 'tactics.lineup_position_name']

## Select only columns of interest
df_lineup_engineered_select = df_lineup_engineered[cols]
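
Selecting a subset of columns and then assigning to the result can make pandas warn that a value is being set on a copy of a slice. One way to sidestep this, sketched below on made-up data with hypothetical variable names, is to take an explicit `.copy()` of the subset and convert dtypes with `astype`, which returns a new DataFrame. The nullable `Int64` dtype is used because a plain `int64` column cannot hold missing values.

```python
import numpy as np
import pandas as pd

# Toy data (made-up): formation stored as float, one jersey number missing
df_source = pd.DataFrame({
    'tactics.formation': [4231.0, 442.0],
    'tactics.lineup_jersey_number': [30.0, np.nan],
    'team.name': ['Team A', 'Team B'],
})

# Take an explicit copy of the column subset so later edits cannot touch df_source
cols_example = ['tactics.formation', 'tactics.lineup_jersey_number']
df_subset = df_source[cols_example].copy()

# astype with a dict converts several columns at once and returns a new DataFrame;
# 'Int64' (capital I) is pandas' nullable integer dtype, so NaN becomes <NA>
df_subset = df_subset.astype({
    'tactics.formation': 'Int64',
    'tactics.lineup_jersey_number': 'Int64',
})
```

The source DataFrame keeps its original float columns, so the conversion can be re-run or adjusted without reloading the data.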

In [68]:
# Convert formation and jersey number from float to nullable integer (Int64)
## astype returns a new DataFrame, which avoids assigning to a slice of df_lineup_engineered
df_lineup_engineered_select = df_lineup_engineered_select.astype({'tactics.formation': 'Int64', 'tactics.lineup_jersey_number': 'Int64'})


In [69]:
df_lineup_engineered_select.head(5)

Unnamed: 0,id,match_date,kick_off,Full_Fixture_Date,type.name,season_name,competition_name,home_team.home_team_name,away_team.away_team_name,Team,Opponent,home_score,away_score,tactics.formation,tactics.lineup_jersey_number,tactics.lineup_position_id,tactics.lineup_player_name,tactics.lineup_position_name
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,30,1,Ann-Katrin Berger,Goalkeeper
1,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,7,2,Chloe Arthur,Right Back
2,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,3,3,Meaghan Sargeant,Right Center Back
3,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,25,5,Aoife Mannion,Left Center Back
4,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,6,6,Kerys Harrop,Left Back


In [70]:
df_lineup_engineered_select.columns

Index(['id', 'match_date', 'kick_off', 'Full_Fixture_Date', 'type.name',
       'season_name', 'competition_name', 'home_team.home_team_name',
       'away_team.away_team_name', 'Team', 'Opponent', 'home_score',
       'away_score', 'tactics.formation', 'tactics.lineup_jersey_number',
       'tactics.lineup_position_id', 'tactics.lineup_player_name',
       'tactics.lineup_position_name'],
      dtype='object')

In [71]:
## Rename columns
df_lineup_engineered_select = df_lineup_engineered_select.rename(columns={
    'id': 'Match_Id',
    'match_date': 'Match_Date',
    'kick_off': 'Kick_Off',
    'type.name': 'Type_Name',
    'season_name': 'Season',
    'competition_name': 'Competition',
    'home_team.home_team_name': 'Home_Team',
    'away_team.away_team_name': 'Away_Team',
    'home_score': 'Home_Score',
    'away_score': 'Away_Score',
    'tactics.formation': 'Formation',
    'tactics.lineup_jersey_number': 'Shirt_Number',
    'tactics.lineup_position_id': 'Position_Number',
    'tactics.lineup_player_name': 'Player_Name',
    'tactics.lineup_position_name': 'Position_Name',
})

## Display DataFrame
df_lineup_engineered_select.head()

Unnamed: 0,Match_Id,Match_Date,Kick_Off,Full_Fixture_Date,Type_Name,Season,Competition,Home_Team,Away_Team,Team,Opponent,Home_Score,Away_Score,Formation,Shirt_Number,Position_Number,Player_Name,Position_Name
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,30,1,Ann-Katrin Berger,Goalkeeper
1,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,7,2,Chloe Arthur,Right Back
2,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,3,3,Meaghan Sargeant,Right Center Back
3,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,25,5,Aoife Mannion,Left Center Back
4,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4231,6,6,Kerys Harrop,Left Back


In [72]:
# Convert Match_Date from string to datetime64[ns]
df_lineup_engineered_select['Match_Date'] = pd.to_datetime(df_lineup_engineered_select['Match_Date'])

In [73]:
# Kick_Off is left as a string for now: the earlier attempt at converting it did not work.
# Most likely cause: format='%H:%M' does not match values such as '13:30:00.000', and errors='ignore'
# then returns the unparsed strings, so the .dt accessor on the following line fails.
# A format that matches the data is '%H:%M:%S.%f'. The conversion is kept below, commented out.

# Convert Kick_Off from string to datetime.time
#df_lineup_engineered_select['Kick_Off'] = pd.to_datetime(df_lineup_engineered_select['Kick_Off'], format='%H:%M:%S.%f')
#df_lineup_engineered_select['Kick_Off'] = df_lineup_engineered_select['Kick_Off'].dt.time

In [74]:
df_lineup_engineered_select.dtypes

Match_Id                     object
Match_Date           datetime64[ns]
Kick_Off                     object
Full_Fixture_Date            object
Type_Name                    object
Season                       object
Competition                  object
Home_Team                    object
Away_Team                    object
Team                         object
Opponent                     object
Home_Score                    int64
Away_Score                    int64
Formation                     Int64
Shirt_Number                  Int64
Position_Number               int64
Player_Name                  object
Position_Name                object
dtype: object
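
`Kick_Off` is still an `object` (string) column in the listing above. The following is a minimal sketch of parsing kick-off strings of the form shown in the earlier table (`13:30:00.000`). The sample values and the variable names `kick_off` and `kick_off_time` are illustrative assumptions, not part of the notebook's data pipeline.

```python
import pandas as pd

# Illustrative kick-off strings in the same 'HH:MM:SS.mmm' layout as the Kick_Off column
kick_off = pd.Series(['13:30:00.000', '16:00:00.000'])

# Parse with a format that includes seconds and fractional seconds,
# then keep only the time-of-day component
kick_off_time = pd.to_datetime(kick_off, format='%H:%M:%S.%f').dt.time

print(kick_off_time.tolist())
```

Note that `.dt.time` yields Python `datetime.time` objects, so the resulting column would still be reported with dtype `object`.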

In [75]:
# Put hyphens between the digits of the Formation attribute, e.g. 4231 -> 4-2-3-1

## Convert Formation attribute from Integer to String
df_lineup_engineered_select['Formation'] = df_lineup_engineered_select['Formation'].astype(str)

## Define custom function to add a hyphen between characters, adapted from StackOverflow: https://stackoverflow.com/questions/29382285/python-making-a-function-that-would-add-between-letters
def add_hyphens(s):
    return '-'.join(s)

## Apply custom function
df_lineup_engineered_select['Formation'] = df_lineup_engineered_select['Formation'].apply(add_hyphens)
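
The formation formatting above assumes every value is a clean integer string. As a hedged sketch only, the helper below (a hypothetical `format_formation`, not part of the notebook) shows one way to also cope with float-encoded formations such as `4231.0` and with missing values. The sample Series is made up for illustration.

```python
import pandas as pd

def format_formation(value):
    """Return e.g. '4-2-3-1' for 4231 (or 4231.0); return None when the value is missing."""
    if pd.isna(value):
        return None
    # int() strips a trailing '.0' from float-encoded formations before splitting into digits
    return '-'.join(str(int(value)))

# Illustrative values only: two formations and one missing entry
formations = pd.Series([4231, 442, pd.NA], dtype='Int64')
formatted = formations.map(format_formation)

print(formatted.tolist())
```

Checking for missing values before converting to string avoids hyphenating the text representation of a missing value.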

In [76]:
lst_formation = df_lineup_engineered_select['Formation'].unique().tolist()

In [77]:
lst_formation

['4-2-3-1',
 '4-2-2-1-1',
 '4-4-2',
 '4-3-2-1',
 '4-2-2-2',
 '4-1-2-2-1',
 '4-3-3',
 '4-4-1-1',
 '4-1-4-1',
 '5-2-2-1',
 '4-2-1-2-1',
 '3-2-3-2',
 '3-4-2-1',
 '3-2-2-2-1',
 '4-1-2-1-2',
 '3-1-4-2',
 '3-5-1-1',
 '3-4-1-2',
 '4-5-1',
 '4-3-1-2',
 '3-4-3',
 '3-5-2']

##### Add Position Coordinates

In [78]:
df_formations_coords = pd.read_csv(data_dir_sb + '/sb_formation_coordinates.csv')

In [79]:
#df_formations_coords['Id'] = df_formations_coords['Id'].astype('Int8')
#df_formations_coords['Player_Number'] = df_formations_coords['Player_Number'].astype('Int8')

In [80]:
df_lineup_engineered_select = pd.merge(df_lineup_engineered_select, df_formations_coords, how='left', left_on=['Formation', 'Position_Number'], right_on=['Formation', 'Player_Number'])
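
A left merge like the one above silently leaves `NaN` coordinates for any formation and position pair that is missing from the coordinates file. The sketch below uses small made-up tables (the names `df_players` and `df_coords` and all values are assumptions) to show how the `indicator` argument of `pd.merge` can surface such unmatched rows.

```python
import pandas as pd

# Toy stand-ins for the lineup and coordinate tables (values are illustrative only)
df_players = pd.DataFrame({
    'Formation': ['4-4-2', '4-4-2', '9-9-9'],
    'Position_Number': [1, 2, 1],
    'Player_Name': ['Player A', 'Player B', 'Player C'],
})
df_coords = pd.DataFrame({
    'Formation': ['4-4-2', '4-4-2'],
    'Player_Number': [1, 2],
    'X': [25.0, 42.0],
    'Y': [5.0, 16.0],
})

# indicator=True adds a '_merge' column recording where each row came from
df_merged = pd.merge(df_players, df_coords, how='left',
                     left_on=['Formation', 'Position_Number'],
                     right_on=['Formation', 'Player_Number'],
                     indicator=True)

# Rows flagged 'left_only' had no matching formation/position pair
df_unmatched = df_merged[df_merged['_merge'] == 'left_only']
print(df_unmatched[['Formation', 'Position_Number', 'Player_Name']])
```

Unmatched rows would also be one possible explanation for an integer key such as `Player_Number` appearing as a float after the merge, since `NaN` forces a float dtype.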

In [81]:
#df_lineup_engineered_select = df_lineup_engineered_select.drop(['Player_Number'], axis=1)
df_lineup_engineered_select = df_lineup_engineered_select.drop(columns=['Id', 'Player_Position'])

In [82]:
df_lineup_engineered_select.head()

Unnamed: 0,Match_Id,Match_Date,Kick_Off,Full_Fixture_Date,Type_Name,Season,Competition,Home_Team,Away_Team,Team,Opponent,Home_Score,Away_Score,Formation,Shirt_Number,Position_Number,Player_Name,Position_Name,Player_Number,Player_Position_Abv,X,Y
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,30,1,Ann-Katrin Berger,Goalkeeper,1.0,GK,25.0,5.0
1,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,7,2,Chloe Arthur,Right Back,2.0,RB,42.0,16.0
2,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,3,3,Meaghan Sargeant,Right Center Back,3.0,RCB,30.0,12.0
3,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,25,5,Aoife Mannion,Left Center Back,5.0,LCB,20.0,12.0
4,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,6,6,Kerys Harrop,Left Back,6.0,LB,8.0,16.0


##### Add Opponent Data to Each Row

In [83]:
# Select columns of interest

## Define columns
cols = ['Match_Date',
        'Competition',
        'Full_Fixture_Date',
        'Team',
        'Formation'
       ]

## Select columns
df_lineup_opponent = df_lineup_engineered_select[cols]

## Drop duplicates to leave one row per team per fixture
df_lineup_opponent = df_lineup_opponent.drop_duplicates()

## Display DataFrame
df_lineup_opponent.head()

Unnamed: 0,Match_Date,Competition,Full_Fixture_Date,Team,Formation
0,2018-10-21,FA Women's Super League,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Birmingham City WFC,4-2-3-1
11,2018-10-21,FA Women's Super League,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Chelsea FCW,4-2-2-1-1
22,2018-10-21,FA Women's Super League,2018-10-21 West Ham United LFC 0 vs. 1 Liver...,West Ham United LFC,4-4-2
33,2018-10-21,FA Women's Super League,2018-10-21 West Ham United LFC 0 vs. 1 Liver...,Liverpool WFC,4-3-2-1
44,2018-09-09,FA Women's Super League,2018-09-09 Reading WFC 4 vs. 0 Yeovil Town LFC,Reading WFC,4-2-2-2


In [84]:
# Join the lineup DataFrame to the per-team formation lookup on 'Match_Date', 'Competition', 'Full_Fixture_Date', and 'Opponent'/'Team', to attach each opponent's formation to every row
df_lineup_engineered_opponent_select = pd.merge(df_lineup_engineered_select, df_lineup_opponent,  how='left', left_on=['Match_Date', 'Competition', 'Full_Fixture_Date', 'Opponent'], right_on = ['Match_Date', 'Competition', 'Full_Fixture_Date', 'Team'])
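
The opponent join above can be hard to picture, so here is a minimal sketch of the same pattern on a made-up two-team fixture. The table contents and the simplified `Fixture` key are assumptions for illustration. The `validate` argument is an optional extra that raises an error if the lookup table has more than one row per key.

```python
import pandas as pd

# Toy player-level rows: two players per team in one fixture (illustrative values)
df_rows = pd.DataFrame({
    'Fixture': ['A vs. B'] * 4,
    'Team': ['A', 'A', 'B', 'B'],
    'Opponent': ['B', 'B', 'A', 'A'],
    'Formation': ['4-4-2', '4-4-2', '4-3-3', '4-3-3'],
})

# One row per team per fixture, used as the lookup for the opponent's formation
df_team_formation = df_rows[['Fixture', 'Team', 'Formation']].drop_duplicates()

# Match each row's Opponent against the lookup's Team
df_joined = pd.merge(df_rows, df_team_formation, how='left',
                     left_on=['Fixture', 'Opponent'],
                     right_on=['Fixture', 'Team'],
                     suffixes=('', '_Opponent'),
                     validate='many_to_one')

# Tidy up: the lookup's Team column duplicates Opponent, so drop it
df_joined = (df_joined
             .drop(columns=['Team_Opponent'])
             .rename(columns={'Formation_Opponent': 'Opponent_Formation'}))
print(df_joined)
```

Passing `suffixes=('', '_Opponent')` keeps the left-hand column names unchanged, which removes the need to rename `Team_x` and `Formation_x` afterwards.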

In [85]:
# Clean Data

## Drop columns
df_lineup_engineered_opponent_select = df_lineup_engineered_opponent_select.drop(columns=['Team_y'])


## Rename columns
df_lineup_engineered_opponent_select = df_lineup_engineered_opponent_select.rename(columns={'Team_x': 'Team',
                                                                                            'Formation_x': 'Formation',
                                                                                            'Formation_y': 'Opponent_Formation'
                                                                                           }
                                                                                      )

## Display DataFrame
df_lineup_engineered_opponent_select.head()

Unnamed: 0,Match_Id,Match_Date,Kick_Off,Full_Fixture_Date,Type_Name,Season,Competition,Home_Team,Away_Team,Team,Opponent,Home_Score,Away_Score,Formation,Shirt_Number,Position_Number,Player_Name,Position_Name,Player_Number,Player_Position_Abv,X,Y,Opponent_Formation
0,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,30,1,Ann-Katrin Berger,Goalkeeper,1.0,GK,25.0,5.0,4-2-2-1-1
1,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,7,2,Chloe Arthur,Right Back,2.0,RB,42.0,16.0,4-2-2-1-1
2,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,3,3,Meaghan Sargeant,Right Center Back,3.0,RCB,30.0,12.0,4-2-2-1-1
3,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,25,5,Aoife Mannion,Left Center Back,5.0,LCB,20.0,12.0,4-2-2-1-1
4,1c6261d9-f0ae-4430-a087-30b4d1ef6e12,2018-10-21,13:30:00.000,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,Starting XI,2018/2019,FA Women's Super League,Birmingham City WFC,Chelsea FCW,Birmingham City WFC,Chelsea FCW,0,0,4-2-3-1,6,6,Kerys Harrop,Left Back,6.0,LB,8.0,16.0,4-2-2-1-1


##### Export DataFrame

In [86]:
# Export the engineered lineup DataFrame to the StatsBomb data directory
df_lineup_engineered_opponent_select.to_csv(data_dir_sb + '/lineups/engineered/sb_lineups_1819_2021_wsl.csv', index=False, header=True)

In [87]:
# Export a copy of the engineered lineup DataFrame to the general export directory
df_lineup_engineered_opponent_select.to_csv(data_dir + '/export/sb_wsl_lineups.csv', index=False, header=True)

### <a id='#section5.3'>5.3. Tactical Shifts</a>

In [88]:
df_tactics = df_sb[df_sb['type.name'] == 'Tactical Shift']

In [89]:
df_tactics

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date
952,b671a141-5cf4-42f6-ae53-2a13bea26ed3,953,1,00:25:13.108,25,13,46,0.0,36,Tactical Shift,971,Chelsea FCW,1,Regular Play,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
2648,7afb9d97-1f6f-4114-a6a5-ea9953192c44,2649,2,00:19:41.594,64,41,141,0.0,36,Tactical Shift,969,Birmingham City WFC,4,From Throw In,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
3329,2cbab546-69ac-4022-bc03-d03c0a6f4dcb,3330,2,00:44:44.734,89,44,185,0.0,36,Tactical Shift,971,Chelsea FCW,4,From Throw In,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
11607,b93193ee-2953-4715-a63a-ee928145f2e1,1853,2,00:02:24.550,47,24,81,0.0,36,Tactical Shift,973,Bristol City WFC,1,Regular Play,973,Bristol City WFC,4411.0,"[{'player': {'id': 16376, 'name': 'Sophie Bagg...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19800,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-03-14,20:30:00.000,4,0,available,2020-08-24T14:34:34.401523,18,37,England,FA Women's Super League,4,2018/2019,968,Arsenal WFC,female,,68,England,"[{'id': 31, 'name': 'Joseph Montemurro', 'nick...",973,Bristol City WFC,female,,68,England,"[{'id': 143, 'name': 'Tanya Oxtoby', 'nickname...",1.1.0,1,Regular Season,456.0,Meadow Park,68.0,England,915.0,R. Whitton,,,,,37,4,England,FA Women's Super League,female,2018/2019,Bristol City WFC,Arsenal WFC,2019-03-14 Arsenal WFC 4 vs. 0 Bristol City WFC
13768,e478f069-d318-4b53-8bc9-54c9b93538eb,532,1,00:13:46.960,13,46,27,0.0,36,Tactical Shift,965,Brighton & Hove Albion WFC,4,From Throw In,965,Brighton & Hove Albion WFC,4141.0,"[{'player': {'id': 19419, 'name': 'Marie Houri...",[f64c4a50-57d1-49b1-8a04-f9b24845aff3],,,,,,,,,,,,,,,,,,True,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19739,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,15:00:00.000,0,6,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,965,Brighton & Hove Albion WFC,female,,68,England,"[{'id': 149, 'name': 'Hope Patricia Powell', '...",746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1.0.3,1,Regular Season,,,,,,,,,,,37,4,England,FA Women's Super League,female,2018/2019,Brighton & Hove Albion WFC,Manchester City WFC,2018-10-21 Brighton & Hove Albion WFC 0 vs. ...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
640677,3e69faac-7e19-490a-b1c8-846365e91be5,2372,2,00:18:03.430,63,3,142,0.0,36,Tactical Shift,749,Tottenham Hotspur Women,1,Regular Play,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275137,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",749,Tottenham Hotspur Women,female,,68,England,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...",1.1.0,1,Regular Season,4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Tottenham Hotspur Women,Manchester United,2020-01-19 Manchester United 3 vs. 0 Tottenh...
641466,13badfbc-4c5a-4398-a8f1-54b68ba3188c,3161,2,00:42:43.468,87,43,186,0.0,36,Tactical Shift,1475,Manchester United,1,Regular Play,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275137,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",749,Tottenham Hotspur Women,female,,68,England,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...",1.1.0,1,Regular Season,4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Tottenham Hotspur Women,Manchester United,2020-01-19 Manchester United 3 vs. 0 Tottenh...
641480,0ffd3c35-d7cb-4109-a8e4-798213ba0b38,3175,2,00:43:28.282,88,28,187,0.0,36,Tactical Shift,749,Tottenham Hotspur Women,7,From Goal Kick,1475,Manchester United,4231.0,"[{'player': {'id': 31538, 'name': 'Mary Alexan...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275137,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-01-19,13:00:00.000,3,0,available,2020-07-29T05:00,13,37,England,FA Women's Super League,42,2019/2020,1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",749,Tottenham Hotspur Women,female,,68,England,"[{'id': 791, 'name': 'Karen Hills', 'nickname'...",1.1.0,1,Regular Season,4979.0,Leigh Sports Village Stadium,255.0,International,1721.0,E. Duckworth,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester United,Tottenham Hotspur Women,2020-01-19 Manchester United 3 vs. 0 Tottenh...
642500,7c55b875-11a1-4db4-a0cf-b0156e178b5e,865,1,00:20:00.853,20,0,52,0.0,36,Tactical Shift,966,Liverpool WFC,4,From Throw In,967,Everton LFC,433.0,"[{'player': {'id': 13857, 'name': 'Tinja-Riikk...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275056,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",1.1.0,1,Regular Season,6.0,Anfield,68.0,England,898.0,A. Fearn,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Everton LFC,Liverpool WFC,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC


In [90]:
# Select columns of interest

## Define columns
cols = ['id', 'type.name', 'team.id', 'team.name', 'tactics.formation', 'tactics.lineup']

## Select columns
df_tactics_select = df_tactics[cols]

In [91]:
df_tactics_select

Unnamed: 0,id,type.name,team.id,team.name,tactics.formation,tactics.lineup
952,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,"[{'player': {'id': 15560, 'name': 'Ann-Katrin ..."
2648,7afb9d97-1f6f-4114-a6a5-ea9953192c44,Tactical Shift,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L..."
3329,2cbab546-69ac-4022-bc03-d03c0a6f4dcb,Tactical Shift,971,Chelsea FCW,42211.0,"[{'player': {'id': 4640, 'name': 'Rut Hedvig L..."
11607,b93193ee-2953-4715-a63a-ee928145f2e1,Tactical Shift,973,Bristol City WFC,4411.0,"[{'player': {'id': 16376, 'name': 'Sophie Bagg..."
13768,e478f069-d318-4b53-8bc9-54c9b93538eb,Tactical Shift,965,Brighton & Hove Albion WFC,4141.0,"[{'player': {'id': 19419, 'name': 'Marie Houri..."
...,...,...,...,...,...,...
640677,3e69faac-7e19-490a-b1c8-846365e91be5,Tactical Shift,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga..."
641466,13badfbc-4c5a-4398-a8f1-54b68ba3188c,Tactical Shift,749,Tottenham Hotspur Women,442.0,"[{'player': {'id': 33349, 'name': 'Chloe Morga..."
641480,0ffd3c35-d7cb-4109-a8e4-798213ba0b38,Tactical Shift,1475,Manchester United,4231.0,"[{'player': {'id': 31538, 'name': 'Mary Alexan..."
642500,7c55b875-11a1-4db4-a0cf-b0156e178b5e,Tactical Shift,967,Everton LFC,433.0,"[{'player': {'id': 13857, 'name': 'Tinja-Riikk..."


In [92]:
# Normalize tactics.lineup - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame

## Explode all columns containing lists of dicts, so that each player in a lineup gets their own row
df_tactics_select_normalize = df_tactics_select.apply(lambda x: x.explode()).reset_index(drop=True)

## List of columns containing dicts
cols_to_normalize = ['tactics.lineup']

## Flatten each dict column; prefix the new column names with the source column name so they cannot clash with existing column names
normalized = []
for col in cols_to_normalize:
    d = pd.json_normalize(df_tactics_select_normalize[col], sep='_')
    d.columns = [f'{col}_{v}' for v in d.columns]
    normalized.append(d.copy())

## Combine the DataFrame with the normalized columns and drop the original dict columns
df_tactics_select_normalize = pd.concat([df_tactics_select_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)

## Display DataFrame
df_tactics_select_normalize.head(10)

Unnamed: 0,id,type.name,team.id,team.name,tactics.formation,tactics.lineup_jersey_number,tactics.lineup_player_id,tactics.lineup_player_name,tactics.lineup_position_id,tactics.lineup_position_name
0,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,30,15560,Ann-Katrin Berger,1,Goalkeeper
1,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,23,19592,Harriet Scott,2,Right Back
2,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,3,19502,Meaghan Sargeant,3,Right Center Back
3,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,25,19503,Aoife Mannion,5,Left Center Back
4,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,6,15569,Kerys Harrop,6,Left Back
5,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,7,10193,Chloe Arthur,9,Right Defensive Midfield
6,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,4,19501,Hayley Ladd,11,Left Defensive Midfield
7,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,15,15563,Charlie Wellings,17,Right Wing
8,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,37,15562,Lucy Staniforth,19,Center Attacking Midfield
9,b671a141-5cf4-42f6-ae53-2a13bea26ed3,Tactical Shift,969,Birmingham City WFC,4231.0,8,19500,Sarah Emma Mayling,21,Left Wing
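
To make the explode-then-normalize step above easier to follow, here is a small self-contained sketch on a single made-up event. The column name `lineup`, the event id, and the player values are assumptions chosen to mirror the shape of `tactics.lineup`. They are not taken from the StatsBomb data.

```python
import pandas as pd

# One event whose 'lineup' cell holds a list of nested dicts (illustrative values)
df_events = pd.DataFrame({
    'id': ['event-1'],
    'lineup': [[
        {'player': {'id': 1, 'name': 'Player A'},
         'position': {'id': 1, 'name': 'Goalkeeper'},
         'jersey_number': 30},
        {'player': {'id': 2, 'name': 'Player B'},
         'position': {'id': 2, 'name': 'Right Back'},
         'jersey_number': 7},
    ]],
})

# Step 1: one row per list element; the other columns are repeated
df_long = df_events.explode('lineup').reset_index(drop=True)

# Step 2: flatten each nested dict into columns such as 'lineup_player_name'
df_flat = pd.json_normalize(df_long['lineup'].tolist(), sep='_').add_prefix('lineup_')

# Step 3: attach the flattened columns and drop the original dict column
df_result = pd.concat([df_long.drop(columns=['lineup']), df_flat], axis=1)
print(df_result)
```

Resetting the index after `explode` matters here: `pd.concat(..., axis=1)` aligns on the index, and `json_normalize` returns a fresh 0-based index.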


### <a id='#section5.4'>5.4. Halves</a>

In [93]:
df_half = df_sb[df_sb['type.name'] == 'Half Start']

In [94]:
df_half

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date
2,c036ad64-e323-4c8d-b770-3b6e26e7d882,3,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,971,Chelsea FCW,,,[48ca911e-66d7-4ebc-8514-6728f94df8d2],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
3,48ca911e-66d7-4ebc-8514-6728f94df8d2,4,1,00:00:00.000,0,0,1,0.0,18,Half Start,969,Birmingham City WFC,1,Regular Play,969,Birmingham City WFC,,,[c036ad64-e323-4c8d-b770-3b6e26e7d882],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
1929,c30d35de-f34d-4e02-a58b-e86f981e1f1f,1930,2,00:00:00.000,45,0,91,0.0,18,Half Start,971,Chelsea FCW,4,From Throw In,971,Chelsea FCW,,,[7005c1ff-bd6a-4d47-9897-eb79b926d479],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
1930,7005c1ff-bd6a-4d47-9897-eb79b926d479,1931,2,00:00:00.000,45,0,91,0.0,18,Half Start,971,Chelsea FCW,4,From Throw In,969,Birmingham City WFC,,,[c30d35de-f34d-4e02-a58b-e86f981e1f1f],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Birmingham City WFC,Chelsea FCW,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...
3529,74511f98-3f74-4691-bcc9-6bc7100bc0aa,3,1,00:00:00.000,0,0,1,0.0,18,Half Start,972,West Ham United LFC,1,Regular Play,966,Liverpool WFC,,,[604d7a26-34c1-4542-91b6-1ba1daf9b619],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19740,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,972,West Ham United LFC,female,,68,England,"[{'id': 139, 'name': 'Matt Beard', 'nickname':...",966,Liverpool WFC,female,,68,England,"[{'id': 153, 'name': 'Chris Kirkland', 'nickna...",1.0.3,1,Regular Season,4062.0,The Rush Green Stadium,68.0,England,568.0,J. Packman,68.0,England,,,37,4,England,FA Women's Super League,female,2018/2019,Liverpool WFC,West Ham United LFC,2018-10-21 West Ham United LFC 0 vs. 1 Liver...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
643559,e7c7ac72-7ad8-4910-8c40-44ba1377ddec,1924,2,00:00:00.000,45,0,107,0.0,18,Half Start,966,Liverpool WFC,9,From Kick Off,967,Everton LFC,,,[72890b42-968b-4d79-a568-1037fbaf6697],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275056,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-11-17,16:00:00.000,0,1,available,2020-07-29T05:00,6,37,England,FA Women's Super League,42,2019/2020,966,Liverpool WFC,female,,68,England,"[{'id': 623, 'name': 'Victoria Jepson', 'nickn...",967,Everton LFC,female,,68,England,"[{'id': 639, 'name': 'Willie Kirk', 'nickname'...",1.1.0,1,Regular Season,6.0,Anfield,68.0,England,898.0,A. Fearn,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Everton LFC,Liverpool WFC,2019-11-17 Liverpool WFC 0 vs. 1 Everton LFC
645126,2d821933-278c-454d-8ad4-3c18ed37f9a6,3,1,00:00:00.000,0,0,1,0.0,18,Half Start,971,Chelsea FCW,1,Regular Play,969,Birmingham City WFC,,,[8ee9248c-24df-4ba2-879e-550998582f44],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275074,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-02-12,20:00:00.000,2,0,available,2020-07-29T05:00,16,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,4279.0,The Cherry Red Records Stadium,68.0,England,915.0,R. Whitton,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Birmingham City WFC,Chelsea FCW,2020-02-12 Chelsea FCW 2 vs. 0 Birmingham Ci...
645127,8ee9248c-24df-4ba2-879e-550998582f44,4,1,00:00:00.000,0,0,1,0.0,18,Half Start,971,Chelsea FCW,1,Regular Play,971,Chelsea FCW,,,[2d821933-278c-454d-8ad4-3c18ed37f9a6],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275074,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-02-12,20:00:00.000,2,0,available,2020-07-29T05:00,16,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,4279.0,The Cherry Red Records Stadium,68.0,England,915.0,R. Whitton,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Chelsea FCW,Birmingham City WFC,2020-02-12 Chelsea FCW 2 vs. 0 Birmingham Ci...
646975,540daf54-e771-4379-bcee-2cc9667f2546,1852,2,00:00:00.000,45,0,95,0.0,18,Half Start,971,Chelsea FCW,1,Regular Play,969,Birmingham City WFC,,,[9c2195ac-53fc-4ae8-8bfc-7bcd2838db49],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275074,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020-02-12,20:00:00.000,2,0,available,2020-07-29T05:00,16,37,England,FA Women's Super League,42,2019/2020,971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",969,Birmingham City WFC,female,,68,England,"[{'id': 1817, 'name': 'Marta Tejedor', 'nickna...",1.1.0,1,Regular Season,4279.0,The Cherry Red Records Stadium,68.0,England,915.0,R. Whitton,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Birmingham City WFC,Chelsea FCW,2020-02-12 Chelsea FCW 2 vs. 0 Birmingham Ci...


### <a id='#section5.5'>5.5. Isolate In-Play Events</a>
Create a DataFrame containing only in-play events, i.e. removing rows such as starting line-ups and half starts.

#### <a id='#section5.5.1'>5.5.1. Remove Non-Event rows</a>

In [95]:
# List unique values in the df_sb['type.name'] column
df_sb['type.name'].unique()

array(['Starting XI', 'Half Start', 'Pass', 'Ball Receipt*', 'Carry',
       'Pressure', 'Ball Recovery', 'Block', 'Duel', 'Interception',
       'Dribbled Past', 'Dribble', 'Shot', 'Goal Keeper',
       'Foul Committed', 'Foul Won', 'Dispossessed', 'Clearance',
       'Miscontrol', '50/50', 'Injury Stoppage', 'Player Off',
       'Player On', 'Substitution', 'Shield', 'Tactical Shift',
       'Half End', 'Error', 'Referee Ball-Drop', 'Offside',
       'Own Goal Against', 'Own Goal For', 'Bad Behaviour'], dtype=object)

In [96]:
# Event types to keep; any type not listed here (e.g. 'Starting XI', 'Half Start') is dropped
lst_events = ['Pass', 'Ball Receipt*', 'Carry', 'Duel', 'Miscontrol', 'Pressure', 'Ball Recovery', 'Dribbled Past', 'Dribble', 'Shot', 'Block', 'Goal Keeper', 'Clearance', 'Dispossessed', 'Foul Committed', 'Foul Won', 'Interception', 'Shield', 'Half End', 'Substitution', 'Tactical Shift', 'Injury Stoppage', 'Player Off', 'Player On', 'Offside', 'Referee Ball-Drop', 'Error']

In [97]:
df_sb_events = df_sb[df_sb['type.name'].isin(lst_events)]

In [98]:
df_sb_events.shape

(647281, 196)
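The filtering step above can be sketched on a toy DataFrame. This is a minimal illustration, not the notebook's own code: `df_demo`, `lst_keep` and `is_shot` are hypothetical names. It also shows one common way to avoid pandas' `SettingWithCopyWarning` when new columns are later added to the filtered result, namely taking an explicit `.copy()` of the boolean-indexed subset.

```python
import pandas as pd

# Toy stand-in for df_sb (hypothetical data, for illustration only)
df_demo = pd.DataFrame({
    'type.name': ['Starting XI', 'Pass', 'Half Start', 'Shot'],
    'minute': [0, 0, 0, 12],
})

# Event types to keep (hypothetical, much shorter than lst_events)
lst_keep = ['Pass', 'Shot']

# Filter rows, then take an explicit copy so that later column
# assignments act on an independent DataFrame rather than a slice
df_demo_events = df_demo[df_demo['type.name'].isin(lst_keep)].copy()

# Adding a column to the copy leaves the original DataFrame untouched
df_demo_events['is_shot'] = df_demo_events['type.name'] == 'Shot'

print(df_demo_events)
```

Whether the extra memory used by the copy is acceptable depends on the size of the data; for a DataFrame of this size it is usually a reasonable trade-off.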

#### <a id='#section5.5.2'>5.5.2. Break down all `location` attributes into separate attributes for X, Y (and sometimes Z) coordinates</a>

In [99]:
# Display all location columns
for col in df_sb_events.columns:
    if 'location' in col:
        print(col)

location
pass.end_location
carry.end_location
shot.end_location
goalkeeper.end_location


There are the following five 'location' attributes:
- `location`
- `pass.end_location`
- `carry.end_location`
- `shot.end_location`
- `goalkeeper.end_location`

According to the official documentation ([link](https://statsbomb.com/stat-definitions/)), the five attributes have the following dimensionality:
- `location` [x, y]
- `pass.end_location` [x, y]
- `carry.end_location` [x, y]
- `shot.end_location` [x, y, z]
- `goalkeeper.end_location` [x, y]
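As a rough sketch of how these dimensionalities translate into columns, the example below expands list-valued location attributes into one numeric column per coordinate. It assumes the values are still Python lists (as they would be straight after parsing the JSON), not strings read back from a CSV. The helper `split_location` and the toy DataFrame `df_demo` are hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd


def split_location(df, col, dims=('x', 'y')):
    """Return a copy of df with list-valued `col` expanded into one float column per dimension."""
    out = df.copy()
    for i, dim in enumerate(dims):
        # Missing values and lists shorter than expected become NaN
        out[f'{col}_{dim}'] = out[col].apply(
            lambda v, i=i: float(v[i]) if isinstance(v, (list, tuple)) and len(v) > i else np.nan
        )
    return out


# Toy data (hypothetical): a 2D location and a shot end location that is sometimes 3D
df_demo = pd.DataFrame({
    'location': [[61.0, 41.0], [52.0, 45.0], np.nan],
    'shot.end_location': [np.nan, [120.0, 39.5, 1.2], [118.0, 42.0]],
})

df_demo = split_location(df_demo, 'location')
df_demo = split_location(df_demo, 'shot.end_location', dims=('x', 'y', 'z'))

print(df_demo.filter(like='_'))
```

Working on the lists directly avoids the string conversion and bracket stripping, and yields float columns rather than strings.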

In [100]:
"""
# CURRENTLY NOT WORKING, NEED TO FIX

# Normalize 'shot.freeze_frame' attribute - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame

## explode all columns with lists of dicts
df_sb_events_normalize = df_sb_events.apply(lambda x: x.explode()).reset_index(drop=True)

## list of columns with dicts
cols_to_normalize = ['shot.freeze_frame']

## if keys that will become column names overlap with existing column names, add the current column name as a prefix
normalized = list()

for col in cols_to_normalize:
    d = pd.json_normalize(df_sb_events_normalize[col], sep='_')
    d.columns = [f'{col}_{v}' for v in d.columns]
    normalized.append(d.copy())

## combine df with the normalized columns
df_sb_events_normalize = pd.concat([df_sb_events_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)

## display(df_lineup_select_normalize)
df_sb_events_normalize.head(30)
"""

In [101]:
# Split each location attribute into separate coordinate columns

## Location columns to process ('shot.freeze_frame' is left out: it holds a list of dicts, not a coordinate list)
lst_location_cols = ['location', 'pass.end_location', 'carry.end_location', 'shot.end_location', 'goalkeeper.end_location']

for col in lst_location_cols:

    ### Convert the location to a string and strip the square brackets
    df_sb_events[col] = (df_sb_events[col].astype(str)
                                          .str.replace('[', '', regex=False)
                                          .str.replace(']', '', regex=False)
                        )

    ### Break down the location attribute into x and y (and z for shots); values remain strings
    parts = df_sb_events[col].str.split(',')
    df_sb_events[col + '_x'] = parts.str[0]
    df_sb_events[col + '_y'] = parts.str[1]
    if col == 'shot.end_location':
        df_sb_events[col + '_z'] = parts.str[2]


## Display DataFrame
df_sb_events.head(10)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  after removing the cwd from sys.path.

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"61.0, 41.0",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"52.0, 45.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,61.0,41.0,52.0,45.0,,,,,,,
5,89cf3d24-ba04-4269-9071-1dfabf468cd1,6,1,00:00:02.553,0,2,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[ac80414e-cec3-4c56-8e57-ac04149efbe2],"52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,,,,,,,
6,59972a9e-3362-43e1-96fd-883b2e3fbba4,7,1,00:00:02.553,0,2,2,0.835305,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[88d1ce50-3a00-4c88-90ba-a25e658c0deb, 89cf3d2...","52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,"53.0, 45.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,53.0,45.0,,,,,
7,88d1ce50-3a00-4c88-90ba-a25e658c0deb,8,1,00:00:03.388,0,3,2,1.693583,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[9f0af77c-93cc-49d5-8d16-b7a640a0b59f],"53.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,4633.0,Magdalena Ericsson,21.213203,-2.356194,1.0,Ground Pass,"38.0, 30.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,53.0,45.0,38.0,30.0,,,,,,,
8,9f0af77c-93cc-49d5-8d16-b7a640a0b59f,9,1,00:00:05.082,0,5,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[88d1ce50-3a00-4c88-90ba-a25e658c0deb],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,,,,,,,,,
9,0dcfde6b-5909-4af4-bf4e-2e0f240c5d8c,10,1,00:00:05.082,0,5,2,0.040017,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[68374d6d-a178-4ffc-b4c7-236736c61ab0, 9f0af77...","38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,,,,,,,,,,,,"38.0, 30.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,,,38.0,30.0,,,,,
10,68374d6d-a178-4ffc-b4c7-236736c61ab0,11,1,00:00:05.122,0,5,2,1.257417,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[c54c86a3-4bdd-47e1-87b7-67101a95ecc3],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,4642.0,Millie Bright,22.561028,1.794273,1.0,Ground Pass,"33.0, 52.0",,,38.0,Left Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,33.0,52.0,,,,,,,
11,c54c86a3-4bdd-47e1-87b7-67101a95ecc3,12,1,00:00:06.379,0,6,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[68374d6d-a178-4ffc-b4c7-236736c61ab0],"33.0, 52.0",4642.0,Millie Bright,3.0,Right Center Back,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,33.0,52.0,,,,,,,,,
12,cc54dc47-7b90-4f99-b619-bb3986a0be0d,13,1,00:00:06.379,0,6,2,2.829083,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[62187430-8a2c-4a84-a0e0-5b84a19b6d6d, c54c86a...","33.0, 52.0",4642.0,Millie Bright,3.0,Right Center Back,,,,,,,,,,,,"36.0, 57.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,33.0,52.0,,,36.0,57.0,,,,,
13,62187430-8a2c-4a84-a0e0-5b84a19b6d6d,14,1,00:00:09.208,0,9,2,1.58506,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[bed1875f-5f95-426e-a9d2-d97073ae7184],"36.0, 57.0",4642.0,Millie Bright,3.0,Right Center Back,19422.0,Jessica Carter,21.540659,1.19029,1.0,Ground Pass,"44.0, 77.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,36.0,57.0,44.0,77.0,,,,,,,


In [102]:
df_sb_events.shape

(647281, 207)
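The 11 extra columns (207 versus 196) are the new coordinate columns. Because they come from string splitting, they hold strings rather than numbers. A minimal sketch of one way to convert such columns to floats is shown below; the toy DataFrame `df_demo` and the list `lst_coord_cols` are hypothetical and only mirror the naming used above.

```python
import pandas as pd

# Toy coordinate columns as strings, including a stray leading space and a 'nan' placeholder
df_demo = pd.DataFrame({
    'location_x': ['61.0', '52.0', 'nan'],
    'location_y': [' 41.0', ' 45.0', None],
})

# Columns to convert (hypothetical subset of the coordinate columns)
lst_coord_cols = ['location_x', 'location_y']

for col in lst_coord_cols:
    # Strip whitespace, then coerce anything unparsable to NaN
    df_demo[col] = pd.to_numeric(df_demo[col].str.strip(), errors='coerce')

print(df_demo.dtypes)
```

Using `errors='coerce'` silently turns unparsable values into `NaN`, so it is worth checking the number of missing values before and after the conversion.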

##### Export Dataset

In [103]:
# Export the engineered events DataFrame to the StatsBomb data directory
df_sb_events.to_csv(data_dir_sb + '/events/engineered/sb_events_1819_2021_wsl.csv', index=False, header=True)

# Export a copy to the general export directory
df_sb_events.to_csv(data_dir + '/export/sb_wsl_events.csv', index=False, header=True)
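As an alternative to building paths by string concatenation, the export can be sketched with `pathlib`, which handles separators and lets the target folder be created if it does not exist. The directory `data_dir_demo`, the file name and the toy DataFrame below are hypothetical; a temporary directory stands in for the notebook's `data_dir`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy DataFrame standing in for df_sb_events (hypothetical data)
df_demo = pd.DataFrame({'id': ['a', 'b'], 'location_x': [61.0, 52.0]})

# Temporary directory used here in place of the notebook's data_dir
data_dir_demo = Path(tempfile.mkdtemp())

# Build the export path and make sure the parent folder exists
export_path = data_dir_demo / 'export' / 'sb_wsl_events_demo.csv'
export_path.parent.mkdir(parents=True, exist_ok=True)

# Write without the index, then read back to confirm the round trip
df_demo.to_csv(export_path, index=False)
df_check = pd.read_csv(export_path)

print(df_check.shape)
```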

#### <a id='#section5.5.3'>5.5.3. Create Passing Matrix Data</a>

In [104]:
df1 = df_sb_events.copy()

In [105]:
df1['df_name'] = 'df1'

In [106]:
df1.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y,df_name
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"61.0, 41.0",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"52.0, 45.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,61.0,41.0,52.0,45.0,,,,,,,,df1
5,89cf3d24-ba04-4269-9071-1dfabf468cd1,6,1,00:00:02.553,0,2,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[ac80414e-cec3-4c56-8e57-ac04149efbe2],"52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,,,,,,,,df1
6,59972a9e-3362-43e1-96fd-883b2e3fbba4,7,1,00:00:02.553,0,2,2,0.835305,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[88d1ce50-3a00-4c88-90ba-a25e658c0deb, 89cf3d2...","52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,"53.0, 45.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,53.0,45.0,,,,,,df1
7,88d1ce50-3a00-4c88-90ba-a25e658c0deb,8,1,00:00:03.388,0,3,2,1.693583,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[9f0af77c-93cc-49d5-8d16-b7a640a0b59f],"53.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,4633.0,Magdalena Ericsson,21.213203,-2.356194,1.0,Ground Pass,"38.0, 30.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,53.0,45.0,38.0,30.0,,,,,,,,df1
8,9f0af77c-93cc-49d5-8d16-b7a640a0b59f,9,1,00:00:05.082,0,5,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[88d1ce50-3a00-4c88-90ba-a25e658c0deb],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,,,,,,,,,,df1


In [107]:
df2 = df_sb_events.copy()

In [108]:
df2['df_name'] = 'df2'

In [109]:
df2.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.n
ame,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y,df_name
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"61.0, 41.0",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"52.0, 45.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,61.0,41.0,52.0,45.0,,,,,,,,df2
5,89cf3d24-ba04-4269-9071-1dfabf468cd1,6,1,00:00:02.553,0,2,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[ac80414e-cec3-4c56-8e57-ac04149efbe2],"52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,,,,,,,,df2
6,59972a9e-3362-43e1-96fd-883b2e3fbba4,7,1,00:00:02.553,0,2,2,0.835305,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[88d1ce50-3a00-4c88-90ba-a25e658c0deb, 89cf3d2...","52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,"53.0, 45.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,53.0,45.0,,,,,,df2
7,88d1ce50-3a00-4c88-90ba-a25e658c0deb,8,1,00:00:03.388,0,3,2,1.693583,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[9f0af77c-93cc-49d5-8d16-b7a640a0b59f],"53.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,4633.0,Magdalena Ericsson,21.213203,-2.356194,1.0,Ground Pass,"38.0, 30.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,53.0,45.0,38.0,30.0,,,,,,,,df2
8,9f0af77c-93cc-49d5-8d16-b7a640a0b59f,9,1,00:00:05.082,0,5,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[88d1ce50-3a00-4c88-90ba-a25e658c0deb],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,,,,,,,,,,df2


##### Concatenate DataFrames

In [111]:
df_sb_events_passing = pd.concat([df1, df2])

In [112]:
df_sb_events_passing.shape

(1294562, 208)
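The row count doubles here because `df1` and `df2` are two labelled copies of the same events. A minimal sketch of the same pattern on toy data follows; the coordinate values are made up for illustration, and `ignore_index=True` is an optional addition (not used in the cell above) that gives the stacked result a fresh, unique index.

```python
import pandas as pd

# Toy stand-in for df_sb_events (values are illustrative only)
df_events = pd.DataFrame({
    'location_x': [61.0, 53.0],
    'pass.end_location_x': [52.0, 38.0],
})

# Two labelled copies of the same events
df1 = df_events.copy()
df1['df_name'] = 'df1'
df2 = df_events.copy()
df2['df_name'] = 'df2'

# Stack the copies; ignore_index=True renumbers the rows 0..n-1
df_stacked = pd.concat([df1, df2], ignore_index=True)

print(df_stacked.shape)
```

Without `ignore_index=True`, each original index label appears twice in the result, which matters if rows are later selected by label with `.loc`.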

##### ...

In [113]:
# Rows labelled 'df1' take the event start location; rows labelled 'df2' take the pass end location
is_start = df_sb_events_passing['df_name'] == 'df1'
df_sb_events_passing['Pass_X'] = np.where(is_start, df_sb_events_passing['location_x'], df_sb_events_passing['pass.end_location_x'])
df_sb_events_passing['Pass_Y'] = np.where(is_start, df_sb_events_passing['location_y'], df_sb_events_passing['pass.end_location_y'])
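To make the effect of `np.where` concrete, here is a small sketch on a two-row toy DataFrame representing the same pass once as a `df1` row and once as a `df2` row. The frame and the name `is_start` are illustrative assumptions; the column names match those used above.

```python
import numpy as np
import pandas as pd

# One pass, duplicated: once labelled 'df1', once labelled 'df2'
df = pd.DataFrame({
    'df_name': ['df1', 'df2'],
    'location_x': [61.0, 61.0],
    'location_y': [41.0, 41.0],
    'pass.end_location_x': [52.0, 52.0],
    'pass.end_location_y': [45.0, 45.0],
})

# Boolean mask: True where the row should use the start location
is_start = df['df_name'] == 'df1'

# Element-wise choice between the two coordinate columns
df['Pass_X'] = np.where(is_start, df['location_x'], df['pass.end_location_x'])
df['Pass_Y'] = np.where(is_start, df['location_y'], df['pass.end_location_y'])

print(df[['df_name', 'Pass_X', 'Pass_Y']])
```

Note that for `df2` rows that are not passes (for example `Ball Receipt*` or `Carry`), `pass.end_location_x` and `pass.end_location_y` are empty, so `Pass_X` and `Pass_Y` will be `NaN` on those rows.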

In [114]:
df_sb_events_passing.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.n
ame,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y,df_name,Pass_X,Pass_Y
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"61.0, 41.0",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"52.0, 45.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,61.0,41.0,52.0,45.0,,,,,,,,df1,61.0,41.0
5,89cf3d24-ba04-4269-9071-1dfabf468cd1,6,1,00:00:02.553,0,2,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[ac80414e-cec3-4c56-8e57-ac04149efbe2],"52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,,,,,,,,df1,52.0,45.0
6,59972a9e-3362-43e1-96fd-883b2e3fbba4,7,1,00:00:02.553,0,2,2,0.835305,43,Carry,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[88d1ce50-3a00-4c88-90ba-a25e658c0deb, 89cf3d2...","52.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,,,,,,,,,,,,"53.0, 45.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,52.0,45.0,,,53.0,45.0,,,,,,df1,52.0,45.0
7,88d1ce50-3a00-4c88-90ba-a25e658c0deb,8,1,00:00:03.388,0,3,2,1.693583,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[9f0af77c-93cc-49d5-8d16-b7a640a0b59f],"53.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,4633.0,Magdalena Ericsson,21.213203,-2.356194,1.0,Ground Pass,"38.0, 30.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,53.0,45.0,38.0,30.0,,,,,,,,df1,53.0,45.0
8,9f0af77c-93cc-49d5-8d16-b7a640a0b59f,9,1,00:00:05.082,0,5,2,,42,Ball Receipt*,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[88d1ce50-3a00-4c88-90ba-a25e658c0deb],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,,,,,,,,,,df1,38.0,30.0


In [115]:
sorted(df_sb_events_passing.columns)

['50_50.outcome.id',
 '50_50.outcome.name',
 'Full_Fixture_Date',
 'Opponent',
 'Pass_X',
 'Pass_Y',
 'Team',
 'away_score',
 'away_team.away_team_gender',
 'away_team.away_team_group',
 'away_team.away_team_id',
 'away_team.away_team_name',
 'away_team.country.id',
 'away_team.country.name',
 'away_team.managers',
 'bad_behaviour.card.id',
 'bad_behaviour.card.name',
 'ball_receipt.outcome.id',
 'ball_receipt.outcome.name',
 'ball_recovery.offensive',
 'ball_recovery.recovery_failure',
 'block.deflection',
 'block.offensive',
 'block.save_block',
 'carry.end_location',
 'carry.end_location_x',
 'carry.end_location_y',
 'clearance.aerial_won',
 'clearance.body_part.id',
 'clearance.body_part.name',
 'clearance.head',
 'clearance.left_foot',
 'clearance.other',
 'clearance.right_foot',
 'competition.competition_id',
 'competition.competition_name',
 'competition.country_name',
 'competition_gender',
 'competition_id',
 'competition_name',
 'competition_stage.id',
 'competition_stage.nam

##### Export Dataset

In [116]:
# Alternative export path (commented out)
#df_sb_events_passing.to_csv(data_dir_sb + '/events/engineered/' + '/sb_events_passing_matrix_1819_2021_wsl.csv', index=None, header=True)

# Export the passing matrix to CSV without writing the DataFrame index
df_sb_events_passing.to_csv(data_dir + '/export/sb_wsl_events_passing_matrix.csv', index=False, header=True)
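As an alternative to concatenating path strings, the export path can be built with `pathlib`. The sketch below is an assumption-laden illustration rather than the notebook's method: `data_dir` is replaced by a temporary directory so the example runs anywhere, and the small DataFrame is a stand-in for `df_sb_events_passing`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for df_sb_events_passing (illustrative values)
df = pd.DataFrame({'Pass_X': [61.0, 52.0], 'Pass_Y': [41.0, 45.0]})

# Hypothetical base directory; in the notebook this would be data_dir
data_dir = Path(tempfile.mkdtemp())

# Create the export folder if it does not exist yet
export_dir = data_dir / 'export'
export_dir.mkdir(parents=True, exist_ok=True)

# Write the CSV without the index column
out_path = export_dir / 'sb_wsl_events_passing_matrix.csv'
df.to_csv(out_path, index=False)

# Read it back to confirm the round trip
df_check = pd.read_csv(out_path)
print(df_check.shape)
```

Building the path with `/` on `Path` objects avoids doubled separators and creates the target folder up front, so `to_csv` does not fail on a missing directory.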

In [184]:
df_sb_events_passing_city_utd = df_sb_events_passing[df_sb_events_passing['Full_Fixture_Date'] == '2019-09-07 Manchester City WFC 1  vs.  0 Manchester United']

In [185]:
df_sb_events_passing_city_utd.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y,df_name,Pass_X,Pass_Y
515710,ab619e0b-f5f2-4e1a-af5b-8762f47caa5d,5,1,00:00:01.041,0,1,2,0.766714,30,Pass,1475,Manchester United,9,From Kick Off,1475,Manchester United,,,[aed540cb-c9a9-4887-ace7-cc1b7bf00671],"60.0, 40.0",4653.0,Jane Ross,23.0,Center Forward,31540.0,Katie Zelem,8.7,3.141593,1.0,Ground Pass,"51.3, 40.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275136,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-09-07,16:00:00.000,1,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,42,2019/2020,746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",1.1.0,1,Regular Season,4715.0,Etihad Stadium,68.0,England,977.0,R. Welch,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester United,Manchester City WFC,2019-09-07 Manchester City WFC 1 vs. 0 Manch...,60.0,40.0,51.3,40.0,,,,,,,,df1,60.0,40.0
515711,aed540cb-c9a9-4887-ace7-cc1b7bf00671,6,1,00:00:01.808,0,1,2,,42,Ball Receipt*,1475,Manchester United,9,From Kick Off,1475,Manchester United,,,[ab619e0b-f5f2-4e1a-af5b-8762f47caa5d],"51.3, 40.0",31540.0,Katie Zelem,10.0,Center Defensive Midfield,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275136,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-09-07,16:00:00.000,1,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,42,2019/2020,746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",1.1.0,1,Regular Season,4715.0,Etihad Stadium,68.0,England,977.0,R. Welch,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester United,Manchester City WFC,2019-09-07 Manchester City WFC 1 vs. 0 Manch...,51.3,40.0,,,,,,,,,,df1,51.3,40.0
515712,a3cb5d58-4271-41ef-8127-744db21b2a15,7,1,00:00:01.808,0,1,2,0.771229,43,Carry,1475,Manchester United,9,From Kick Off,1475,Manchester United,,,"[3b086c48-160e-4585-aab0-9a7be6405fdb, 5245e6e...","51.3, 40.0",31540.0,Katie Zelem,10.0,Center Defensive Midfield,,,,,,,,,,,,"53.9, 37.8",True,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275136,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-09-07,16:00:00.000,1,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,42,2019/2020,746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",1.1.0,1,Regular Season,4715.0,Etihad Stadium,68.0,England,977.0,R. Welch,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester United,Manchester City WFC,2019-09-07 Manchester City WFC 1 vs. 0 Manch...,51.3,40.0,,,53.9,37.8,,,,,,df1,51.3,40.0
515713,3b086c48-160e-4585-aab0-9a7be6405fdb,8,1,00:00:02.326,0,2,2,0.552478,17,Pressure,1475,Manchester United,9,From Kick Off,746,Manchester City WFC,,,"[5245e6e0-cde6-4041-b1a1-328c73277ebf, a3cb5d5...","64.3, 44.0",4992.0,Janine Beckie,23.0,Center Forward,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275136,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-09-07,16:00:00.000,1,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,42,2019/2020,746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",1.1.0,1,Regular Season,4715.0,Etihad Stadium,68.0,England,977.0,R. Welch,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester City WFC,Manchester United,2019-09-07 Manchester City WFC 1 vs. 0 Manch...,64.3,44.0,,,,,,,,,,df1,64.3,44.0
515714,5245e6e0-cde6-4041-b1a1-328c73277ebf,9,1,00:00:02.579,0,2,2,1.774316,30,Pass,1475,Manchester United,9,From Kick Off,1475,Manchester United,,,"[3b086c48-160e-4585-aab0-9a7be6405fdb, 49a2740...","53.9, 37.8",31540.0,Katie Zelem,10.0,Center Defensive Midfield,31539.0,Leah Galton,25.881653,-0.77447,3.0,High Pass,"72.4, 19.7",,,40.0,Right Foot,,True,9.0,Incomplete,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2275136,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-09-07,16:00:00.000,1,0,available,2020-07-29T05:00,1,37,England,FA Women's Super League,42,2019/2020,746,Manchester City WFC,female,,68,England,"[{'id': 30, 'name': 'Nick Cushing', 'nickname'...",1475,Manchester United,female,,68,England,"[{'id': 2926, 'name': 'Casey Stoney', 'nicknam...",1.1.0,1,Regular Season,4715.0,Etihad Stadium,68.0,England,977.0,R. Welch,,,2,2,37,42,England,FA Women's Super League,female,2019/2020,Manchester United,Manchester City WFC,2019-09-07 Manchester City WFC 1 vs. 0 Manch...,53.9,37.8,72.4,19.7,,,,,,,,df1,53.9,37.8


In [186]:
df_sb_events_passing_city_utd.shape

(7192, 210)

In [333]:
# Write the single-match test extract to CSV without the DataFrame index
df_sb_events_passing_city_utd.to_csv(data_dir + '/passing_network_test.csv', index=False, header=True)

#### <a id='#section5.5.4'>5.5.4. Create Passing Network Data</a>

See: https://community.tableau.com/s/question/0D54T00000C6YbE/football-passing-network

In [300]:
df_sb_pass_network = df_sb_events_passing.copy()

In [301]:
df_sb_pass_network = df_sb_pass_network[df_sb_pass_network['type.name'] == 'Pass']

In [302]:
df_sb_pass_network['player_recipient'] = np.where(df_sb_pass_network['df_name'] == 'df1', df_sb_pass_network['player.name'], df_sb_pass_network['pass.recipient.name'])
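The `np.where` call above uses `df_name` to tell the two copies of each pass apart. Judging by the outputs in this section, `df1` rows hold the pass origin in `Pass_X`/`Pass_Y` and `df2` rows hold the pass end point. A minimal sketch on a toy DataFrame (the player names and coordinates are made up for illustration) of how that label maps each row to the player at that end of the pass:

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_sb_pass_network: one pass stored twice
# (df1 = pass origin, df2 = pass end point); all values are hypothetical
toy = pd.DataFrame({
    'df_name': ['df1', 'df2'],
    'player.name': ['Passer A', 'Passer A'],
    'pass.recipient.name': ['Receiver B', 'Receiver B'],
    'Pass_X': [60.0, 48.0],
    'Pass_Y': [40.0, 39.0],
})

# df1 rows are attributed to the passer, every other row to the recipient
toy['player_recipient'] = np.where(toy['df_name'] == 'df1',
                                   toy['player.name'],
                                   toy['pass.recipient.name'])

print(toy[['df_name', 'player_recipient', 'Pass_X', 'Pass_Y']])
```

Each pass therefore contributes one row per endpoint, which is what lets a single column (`player_recipient`) act as the node label when the network is drawn.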

In [306]:
df_sb_pass_network.head()

Unnamed: 0,id,index,period,timestamp,minute,second,possession,duration,type.id,type.name,possession_team.id,possession_team.name,play_pattern.id,play_pattern.name,team.id,team.name,tactics.formation,tactics.lineup,related_events,location,player.id,player.name,position.id,position.name,pass.recipient.id,pass.recipient.name,pass.length,pass.angle,pass.height.id,pass.height.name,pass.end_location,pass.type.id,pass.type.name,pass.body_part.id,pass.body_part.name,carry.end_location,under_pressure,pass.outcome.id,pass.outcome.name,ball_receipt.outcome.id,ball_receipt.outcome.name,counterpress,duel.type.id,duel.type.name,pass.aerial_won,interception.outcome.id,interception.outcome.name,dribble.outcome.id,dribble.outcome.name,pass.assisted_shot_id,pass.shot_assist,shot.statsbomb_xg,shot.end_location,shot.key_pass_id,shot.body_part.id,shot.body_part.name,shot.type.id,shot.type.name,shot.outcome.id,shot.outcome.name,shot.technique.id,shot.technique.name,shot.freeze_frame,goalkeeper.end_location,goalkeeper.position.id,goalkeeper.position.name,goalkeeper.type.id,goalkeeper.type.name,off_camera,duel.outcome.id,duel.outcome.name,pass.switch,ball_recovery.recovery_failure,50_50.outcome.id,50_50.outcome.name,foul_committed.card.id,foul_committed.card.name,shot.one_on_one,shot.aerial_won,pass.through_ball,pass.technique.id,pass.technique.name,goalkeeper.outcome.id,goalkeeper.outcome.name,goalkeeper.technique.id,goalkeeper.technique.name,goalkeeper.body_part.id,goalkeeper.body_part.name,substitution.outcome.id,substitution.outcome.name,substitution.replacement.id,substitution.replacement.name,foul_won.defensive,clearance.aerial_won,pass.backheel,pass.cross,foul_committed.offensive,foul_committed.advantage,foul_won.advantage,dribble.overrun,foul_committed.penalty,foul_won.penalty,injury_stoppage.in_chain,miscontrol.aerial_won,block.offensive,match_id,shot.open_goal,shot.first_time,dribble.nutmeg,pass.cut_back,pass.deflected,pass.goal_assist,foul_committed.type.id,foul_committed.type.name,pass.miscommunication,ball_recovery.offensive,block.save_block,block.deflection,clearance.head,clearance.body_part.id,clearance.body_part.name,out,clearance.left_foot,clearance.right_foot,pass.inswinging,pass.straight,clearance.other,pass.outswinging,shot.redirect,shot.deflected,bad_behaviour.card.id,bad_behaviour.card.name,pass.no_touch,dribble.no_touch,shot.saved_off_target,goalkeeper.shot_saved_off_target,goalkeeper.lost_out,goalkeeper.punched_out,player_off.permanent,shot.saved_to_post,goalkeeper.shot_saved_to_post,goalkeeper.lost_in_play,goalkeeper.success_out,shot.follows_dribble,half_start.late_video_start,goalkeeper.success_in_play,half_end.early_video_end,goalkeeper.saved_to_post,match_date,kick_off,home_score,away_score,match_status,last_updated,match_week,competition.competition_id,competition.country_name,competition.competition_name,season.season_id,season.season_name,home_team.home_team_id,home_team.home_team_name,home_team.home_team_gender,home_team.home_team_group,home_team.country.id,home_team.country.name,home_team.managers,away_team.away_team_id,away_team.away_team_name,away_team.away_team_gender,away_team.away_team_group,away_team.country.id,away_team.country.name,away_team.managers,metadata.data_version,competition_stage.id,competition_stage.name,stadium.id,stadium.name,stadium.country.id,stadium.country.name,referee.id,referee.name,referee.country.id,referee.country.name,metadata.shot_fidelity_version,metadata.xy_fidelity_version,competition_id,season_id,country_name,competition_name,competition_gender,season_name,Team,Opponent,Full_Fixture_Date,location_x,location_y,pass.end_location_x,pass.end_location_y,carry.end_location_x,carry.end_location_y,shot.end_location_x,shot.end_location_y,shot.end_location_z,goalkeeper.end_location_x,goalkeeper.end_location_y,df_name,Pass_X,Pass_Y,player_recipient
4,ac80414e-cec3-4c56-8e57-ac04149efbe2,5,1,00:00:01.324,0,1,2,1.228695,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[89cf3d24-ba04-4269-9071-1dfabf468cd1],"61.0, 41.0",4641.0,Francesca Kirby,23.0,Center Forward,15549.0,Sophie Ingle,9.848858,2.723368,1.0,Ground Pass,"52.0, 45.0",65.0,Kick Off,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,61.0,41.0,52.0,45.0,,,,,,,,df1,61.0,41.0,Francesca Kirby
7,88d1ce50-3a00-4c88-90ba-a25e658c0deb,8,1,00:00:03.388,0,3,2,1.693583,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[9f0af77c-93cc-49d5-8d16-b7a640a0b59f],"53.0, 45.0",15549.0,Sophie Ingle,9.0,Right Defensive Midfield,4633.0,Magdalena Ericsson,21.213203,-2.356194,1.0,Ground Pass,"38.0, 30.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,53.0,45.0,38.0,30.0,,,,,,,,df1,53.0,45.0,Sophie Ingle
10,68374d6d-a178-4ffc-b4c7-236736c61ab0,11,1,00:00:05.122,0,5,2,1.257417,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[c54c86a3-4bdd-47e1-87b7-67101a95ecc3],"38.0, 30.0",4633.0,Magdalena Ericsson,5.0,Left Center Back,4642.0,Millie Bright,22.561028,1.794273,1.0,Ground Pass,"33.0, 52.0",,,38.0,Left Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,30.0,33.0,52.0,,,,,,,,df1,38.0,30.0,Magdalena Ericsson
13,62187430-8a2c-4a84-a0e0-5b84a19b6d6d,14,1,00:00:09.208,0,9,2,1.58506,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,[bed1875f-5f95-426e-a9d2-d97073ae7184],"36.0, 57.0",4642.0,Millie Bright,3.0,Right Center Back,19422.0,Jessica Carter,21.540659,1.19029,1.0,Ground Pass,"44.0, 77.0",,,40.0,Right Foot,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,36.0,57.0,44.0,77.0,,,,,,,,df1,36.0,57.0,Millie Bright
17,c8ffae01-8549-4ca5-8e42-6037f2e1f944,18,1,00:00:12.945,0,12,2,2.457301,30,Pass,971,Chelsea FCW,9,From Kick Off,971,Chelsea FCW,,,"[14dae490-08c9-4239-be0a-4715f99f9e2d, c9d2527...","38.0, 74.0",19422.0,Jessica Carter,2.0,Right Back,4640.0,Rut Hedvig Lindahl,36.05551,-2.55359,1.0,Ground Pass,"8.0, 54.0",,,40.0,Right Foot,,True,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,19743,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2018-10-21,13:30:00.000,0,0,available,2020-07-29T05:00,6,37,England,FA Women's Super League,4,2018/2019,969,Birmingham City WFC,female,,68,England,"[{'id': 128, 'name': 'Marc Skinner', 'nickname...",971,Chelsea FCW,female,,68,England,"[{'id': 152, 'name': 'Emma Hayes', 'nickname':...",1.0.3,1,Regular Season,5332.0,SportNation.bet Stadium,255.0,International,898.0,A. Fearn,,,,,37,4,England,FA Women's Super League,female,2018/2019,Chelsea FCW,Birmingham City WFC,2018-10-21 Birmingham City WFC 0 vs. 0 Chels...,38.0,74.0,8.0,54.0,,,,,,,,df1,38.0,74.0,Jessica Carter


In [307]:
sorted(df_sb_pass_network.columns)

['50_50.outcome.id',
 '50_50.outcome.name',
 'Full_Fixture_Date',
 'Opponent',
 'Pass_X',
 'Pass_Y',
 'Team',
 'away_score',
 'away_team.away_team_gender',
 'away_team.away_team_group',
 'away_team.away_team_id',
 'away_team.away_team_name',
 'away_team.country.id',
 'away_team.country.name',
 'away_team.managers',
 'bad_behaviour.card.id',
 'bad_behaviour.card.name',
 'ball_receipt.outcome.id',
 'ball_receipt.outcome.name',
 'ball_recovery.offensive',
 'ball_recovery.recovery_failure',
 'block.deflection',
 'block.offensive',
 'block.save_block',
 'carry.end_location',
 'carry.end_location_x',
 'carry.end_location_y',
 'clearance.aerial_won',
 'clearance.body_part.id',
 'clearance.body_part.name',
 'clearance.head',
 'clearance.left_foot',
 'clearance.other',
 'clearance.right_foot',
 'competition.competition_id',
 'competition.competition_name',
 'competition.country_name',
 'competition_gender',
 'competition_id',
 'competition_name',
 'competition_stage.id',
 'competition_stage.nam

In [303]:
df_sb_pass_network.shape

(352108, 211)

In [308]:
# Select columns of interest

## Define columns
cols = ['df_name',
        'id',
        'index',
        'competition_name',
        'season_name',
        'match_date',
        'kick_off',
        'Full_Fixture_Date',
        'Team',
        'Opponent',
        'home_team.home_team_name',
        'away_team.away_team_name',
        'home_score',
        'away_score',
        'player_recipient',
        'player.name',
        'pass.recipient.name',
        'position.id',
        'position.name',
        'type.name',
        'pass.type.name',
        'pass.outcome.name',
        'location_x',
        'location_y', 
        'pass.end_location_x',
        'pass.end_location_y',
        'Pass_X',
        'Pass_Y'
       ]

## Select the columns; .copy() returns an independent DataFrame, so adding columns later does not trigger SettingWithCopyWarning
df_sb_pass_network_select = df_sb_pass_network[cols].copy()

In [309]:
# Label each pass with its passer and recipient ('passer - recipient')
df_sb_pass_network_select['pass.to.from'] = df_sb_pass_network_select['player.name'] + ' - ' + df_sb_pass_network_select['pass.recipient.name']


In [310]:
# List unique values in the df_sb_pass_network_select['pass.outcome.name'] column
df_sb_pass_network_select['pass.outcome.name'].unique()

array([nan, 'Incomplete', 'Out', 'Unknown', 'Injury Clearance',
       'Pass Offside'], dtype=object)

In [311]:
# Keep completed passes only: StatsBomb records an outcome only when a pass is not completed
df_sb_pass_network_select = df_sb_pass_network_select[df_sb_pass_network_select['pass.outcome.name'].isnull()]

In [312]:
df_sb_pass_network_select.shape

(256868, 29)
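The filter above keeps 256,868 of the 352,108 pass rows, roughly 73%, because a null `pass.outcome.name` marks a completed pass. A minimal sketch of the same idea on toy data (the players and outcomes are hypothetical), which also shows how the completion rate falls out of the same boolean mask:

```python
import numpy as np
import pandas as pd

# Hypothetical passes: a missing outcome means the pass was completed
toy = pd.DataFrame({
    'player.name': ['A', 'A', 'B', 'C'],
    'pass.outcome.name': [np.nan, 'Incomplete', np.nan, 'Out'],
})

# Boolean mask of completed passes
is_completed = toy['pass.outcome.name'].isnull()

# Rows kept for the passing network
completed = toy[is_completed]

# The mean of a boolean mask is the share of True values
completion_rate = is_completed.mean()

print(len(completed), completion_rate)
```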

In [313]:
# Sort chronologically, then by event order within each match (ascending is the default)
df_sb_pass_network_select = df_sb_pass_network_select.sort_values(['season_name', 'match_date', 'kick_off', 'Full_Fixture_Date', 'index', 'id', 'df_name'])

In [314]:
# Cast every coordinate column to float in one step
coord_cols = ['Pass_X',
              'Pass_Y',
              'location_x',
              'location_y',
              'pass.end_location_x',
              'pass.end_location_y'
             ]
df_sb_pass_network_select[coord_cols] = df_sb_pass_network_select[coord_cols].astype(float)
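The casts in the cell above raise a `ValueError` if any coordinate holds text that cannot be parsed as a number. If that is a concern, one option is `pd.to_numeric` with `errors='coerce'`, which turns unparseable values into `NaN` instead. A minimal sketch on toy data (the malformed values are invented for illustration and are not taken from the StatsBomb files):

```python
import pandas as pd

# Hypothetical coordinates stored as text, including values that are not numbers
toy = pd.DataFrame({
    'Pass_X': ['60.0', '48.0', 'nan', ''],
    'Pass_Y': ['40.0', '39.0', '41.5', 'n/a'],
})

coord_cols = ['Pass_X', 'Pass_Y']

# Convert column by column; anything unparseable becomes NaN rather than raising
toy[coord_cols] = toy[coord_cols].apply(pd.to_numeric, errors='coerce')

print(toy.dtypes)
print(toy.isna().sum())
```

Coercion hides bad input rather than reporting it, so counting the resulting `NaN` values, as in the last line, is a sensible follow-up check.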

In [315]:
df_sb_pass_network_select.dtypes

df_name                      object
id                           object
index                         int64
competition_name             object
season_name                  object
match_date                   object
kick_off                     object
Full_Fixture_Date            object
Team                         object
Opponent                     object
home_team.home_team_name     object
away_team.away_team_name     object
home_score                    int64
away_score                    int64
player_recipient             object
player.name                  object
pass.recipient.name          object
position.id                 float64
position.name                object
type.name                    object
pass.type.name               object
pass.outcome.name            object
location_x                  float64
location_y                  float64
pass.end_location_x         float64
pass.end_location_y         float64
Pass_X                      float64
Pass_Y                      

In [316]:
df_sb_pass_network_select.head()

Unnamed: 0,df_name,id,index,competition_name,season_name,match_date,kick_off,Full_Fixture_Date,Team,Opponent,home_team.home_team_name,away_team.away_team_name,home_score,away_score,player_recipient,player.name,pass.recipient.name,position.id,position.name,type.name,pass.type.name,pass.outcome.name,location_x,location_y,pass.end_location_x,pass.end_location_y,Pass_X,Pass_Y,pass.to.from
148316,df1,1c8ebf3c-09af-4571-aeae-6afc0e5e6bf1,5,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Vivianne Miedema,Vivianne Miedema,Lia Wälti,23.0,Center Forward,Pass,Kick Off,,60.0,40.0,48.0,39.0,60.0,40.0,Vivianne Miedema - Lia Wälti
148316,df2,1c8ebf3c-09af-4571-aeae-6afc0e5e6bf1,5,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Lia Wälti,Vivianne Miedema,Lia Wälti,23.0,Center Forward,Pass,Kick Off,,60.0,40.0,48.0,39.0,48.0,39.0,Vivianne Miedema - Lia Wälti
148320,df1,7c77c968-d48d-41d3-b8ff-e78803ddb9d0,9,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Lia Wälti,Lia Wälti,Lisa Evans,14.0,Center Midfield,Pass,,,44.0,42.0,45.0,70.0,44.0,42.0,Lia Wälti - Lisa Evans
148320,df2,7c77c968-d48d-41d3-b8ff-e78803ddb9d0,9,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Lisa Evans,Lia Wälti,Lisa Evans,14.0,Center Midfield,Pass,,,44.0,42.0,45.0,70.0,45.0,70.0,Lia Wälti - Lisa Evans
148324,df1,0e3a80ea-6af5-40e5-af40-90f2cead6c3b,13,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Lisa Evans,Lisa Evans,Danielle van de Donk,2.0,Right Back,Pass,,,43.0,69.0,69.0,76.0,43.0,69.0,Lisa Evans - Danielle van de Donk


In [323]:
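# Count the completed passes for each passer-recipient pair in each match

## Group by match details, passing pair and the player at this end of the pass, then count the rows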
df_sb_pass_network_grouped = (df_sb_pass_network_select
                                  .groupby(['competition_name',
                                            'season_name',
                                            'match_date',
                                            'kick_off',
                                            'Full_Fixture_Date',
                                            'Team',
                                            'Opponent',
                                            'home_team.home_team_name',
                                            'away_team.away_team_name',
                                            'home_score',
                                            'away_score',
                                            'pass.to.from',
                                            'player.name',
                                            'pass.recipient.name',
                                            'player_recipient'
                                           ])
                                  .agg({'pass.to.from': ['count']
                                       })
                             )

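## Drop the top level of the column MultiIndex created by .agg()
df_sb_pass_network_grouped.columns = df_sb_pass_network_grouped.columns.droplevel(level=0)

## Move the grouping keys from the index back into columns
df_sb_pass_network_grouped = df_sb_pass_network_grouped.reset_index()

## Rename the columns to snake_case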
df_sb_pass_network_grouped.columns = ['competition_name',
                                      'season_name',
                                      'match_date',
                                      'kick_off',
                                      'full_fixture_date',
                                      'team',
                                      'opponent',
                                      'home_team_name',
                                      'away_team_name',
                                      'home_score',
                                      'away_score',
                                      'pass_to_from',
                                      'player_name',
                                      'pass_recipient_name',
                                      'player_recipient',
                                      'count_passes',
                                     ]

## Halving is not needed: player_recipient is a grouping key, so the df1 and df2 copies of a pass are counted in separate groups
#df_sb_pass_network_grouped['count_passes'] = df_sb_pass_network_grouped['count_passes'] / 2

## Sort chronologically, then by team and passing pair (ascending is the default)
df_sb_pass_network_grouped = df_sb_pass_network_grouped.sort_values(['season_name', 'match_date', 'kick_off', 'full_fixture_date', 'team', 'opponent', 'pass_to_from'])

## Preview the result
df_sb_pass_network_grouped.head()

Unnamed: 0,competition_name,season_name,match_date,kick_off,full_fixture_date,team,opponent,home_team_name,away_team_name,home_score,away_score,pass_to_from,player_name,pass_recipient_name,player_recipient,count_passes
0,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Danielle van de Donk,Ava Kuyken,Danielle van de Donk,Ava Kuyken,2
1,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Danielle van de Donk,Ava Kuyken,Danielle van de Donk,Danielle van de Donk,2
2,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Vivianne Miedema,Ava Kuyken,Vivianne Miedema,Ava Kuyken,1
3,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Vivianne Miedema,Ava Kuyken,Vivianne Miedema,Vivianne Miedema,1
4,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Bethany Mead - Danielle van de Donk,Bethany Mead,Danielle van de Donk,Bethany Mead,1


In [324]:
df_sb_pass_network_grouped.shape

(76920, 16)
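The counting step can also be written with `groupby(...).size()`, which returns a flat result and so avoids the `droplevel` and column-renaming steps that follow `.agg({'pass.to.from': ['count']})`. A sketch on toy data with a reduced set of grouping keys (the fixture and player labels are hypothetical):

```python
import pandas as pd

# Hypothetical completed passes, two rows per pass (one for each endpoint)
toy = pd.DataFrame({
    'Full_Fixture_Date': ['match 1'] * 6,
    'pass.to.from': ['A - B', 'A - B', 'A - B', 'A - B', 'B - A', 'B - A'],
    'player_recipient': ['A', 'B', 'A', 'B', 'B', 'A'],
})

# size() counts the rows in each group; reset_index(name=...) names the count column
counts = (toy
          .groupby(['Full_Fixture_Date', 'pass.to.from', 'player_recipient'])
          .size()
          .reset_index(name='count_passes'))

print(counts)
```

Both endpoint rows of the 'A - B' pair report two passes, which matches the pattern in the `head()` output above, where each pair appears twice with the same `count_passes`.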

In [326]:
# Select columns of interest

## Define columns
cols = ['Full_Fixture_Date',
        'player.name',
        'position.id',
        'position.name',
        'Pass_X',
        'Pass_Y'
       ]

## Keep only the columns needed to compute average pass locations
df_sb_pass_network_avg_pass = df_sb_pass_network_select[cols]

In [327]:
df_sb_pass_network_avg_pass 

Unnamed: 0,Full_Fixture_Date,player.name,position.id,position.name,Pass_X,Pass_Y
148316,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Vivianne Miedema,23.0,Center Forward,60.0,40.0
148316,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Vivianne Miedema,23.0,Center Forward,48.0,39.0
148320,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Lia Wälti,14.0,Center Midfield,44.0,42.0
148320,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Lia Wälti,14.0,Center Midfield,45.0,70.0
148324,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Lisa Evans,2.0,Right Back,43.0,69.0
...,...,...,...,...,...,...
631681,2020-02-23 West Ham United LFC 4 vs. 2 Liver...,Anke Preuß,1.0,Goalkeeper,45.4,47.1
631697,2020-02-23 West Ham United LFC 4 vs. 2 Liver...,Christie Murray,9.0,Right Defensive Midfield,72.4,72.0
631697,2020-02-23 West Ham United LFC 4 vs. 2 Liver...,Christie Murray,9.0,Right Defensive Midfield,84.1,59.1
631701,2020-02-23 West Ham United LFC 4 vs. 2 Liver...,Rhiannon Roberts,11.0,Left Defensive Midfield,75.8,51.2


In [328]:
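# Average pass location for each player in each match

## Group by match, player and position, then take the mean of the pass coordinates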
df_sb_pass_network_avg_pass_grouped = (df_sb_pass_network_avg_pass 
                                          .groupby(['Full_Fixture_Date',
                                                    'player.name',
                                                    'position.id',
                                                    'position.name',
                                                   ])
                                          .agg({'Pass_X': ['mean'],
                                                'Pass_Y': ['mean']
                                               })
                                     )

## Drop the top level of the column MultiIndex created by .agg()
df_sb_pass_network_avg_pass_grouped.columns = df_sb_pass_network_avg_pass_grouped.columns.droplevel(level=0)

## Move the grouping keys from the index back into columns
df_sb_pass_network_avg_pass_grouped = df_sb_pass_network_avg_pass_grouped.reset_index()

## Rename the columns to snake_case
df_sb_pass_network_avg_pass_grouped.columns = ['full_fixture_date',
                                               'player_name',
                                               'position_id',
                                               'position_name',
                                               'avg_location_pass_x',
                                               'avg_location_pass_y'
                                              ]

## Round the average coordinates to one decimal place
df_sb_pass_network_avg_pass_grouped['avg_location_pass_x'] = df_sb_pass_network_avg_pass_grouped['avg_location_pass_x'].round(decimals=1)
df_sb_pass_network_avg_pass_grouped['avg_location_pass_y'] = df_sb_pass_network_avg_pass_grouped['avg_location_pass_y'].round(decimals=1)

## Sort by fixture, then player (ascending is the default)
df_sb_pass_network_avg_pass_grouped = df_sb_pass_network_avg_pass_grouped.sort_values(['full_fixture_date', 'player_name'])

## Preview the result
df_sb_pass_network_avg_pass_grouped.head()

Unnamed: 0,full_fixture_date,player_name,position_id,position_name,avg_location_pass_x,avg_location_pass_y
0,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Anke Preuß,1.0,Goalkeeper,20.4,47.9
1,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Ava Kuyken,9.0,Right Defensive Midfield,55.8,26.8
2,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Ava Kuyken,21.0,Left Wing,35.5,13.0
3,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Bethany Mead,17.0,Right Wing,111.0,65.5
4,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Bethany Mead,21.0,Left Wing,76.9,46.2
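In the cell above, `player.name` is the passer on both the `df1` and the `df2` copy of a pass, so each average blends where a player's passes started with where they ended. If the intent is instead the average location of a player's touches (passes made plus passes received), one option is to group on `player_recipient`. This is an assumption about intent, not the notebook's method, and `player_recipient` would first need adding to the selected columns. A sketch on toy data (hypothetical players and coordinates) showing how the two groupings differ:

```python
import pandas as pd

# Two hypothetical passes (A to B, then B to A), each stored as a df1 and a df2 row
toy = pd.DataFrame({
    'df_name': ['df1', 'df2', 'df1', 'df2'],
    'player.name': ['A', 'A', 'B', 'B'],        # always the passer
    'player_recipient': ['A', 'B', 'B', 'A'],   # the player at this end of the pass
    'Pass_X': [60.0, 48.0, 44.0, 45.0],
    'Pass_Y': [40.0, 39.0, 42.0, 70.0],
})

# Grouping used in the notebook: start and end points of each player's own passes
by_passer = toy.groupby('player.name')[['Pass_X', 'Pass_Y']].mean().round(1)

# Alternative: locations where each player made or received a pass
by_touch = toy.groupby('player_recipient')[['Pass_X', 'Pass_Y']].mean().round(1)

print(by_passer)
print(by_touch)
```

Which grouping is appropriate depends on what the nodes of the network are meant to represent, so the choice is left open here.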


In [334]:
# Join the pass counts to the average pass locations, matching on fixture and player
df_sb_pass_network_final = pd.merge(df_sb_pass_network_grouped, df_sb_pass_network_avg_pass_grouped, left_on=['full_fixture_date', 'player_recipient'], right_on=['full_fixture_date', 'player_name'])

In [338]:
## Rename columns: restore the passer's column name after the merge (player_name_y repeats player_recipient, the join key)
df_sb_pass_network_final = df_sb_pass_network_final.rename(columns={'player_name_x': 'player_name'})

In [339]:
df_sb_pass_network_final.head()

Unnamed: 0,competition_name,season_name,match_date,kick_off,full_fixture_date,team,opponent,home_team_name,away_team_name,home_score,away_score,pass_to_from,player_name,pass_recipient_name,player_recipient,count_passes,player_name_y,position_id,position_name,avg_location_pass_x,avg_location_pass_y
0,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Danielle van de Donk,Ava Kuyken,Danielle van de Donk,Ava Kuyken,2,Ava Kuyken,9.0,Right Defensive Midfield,55.8,26.8
1,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Danielle van de Donk,Ava Kuyken,Danielle van de Donk,Ava Kuyken,2,Ava Kuyken,21.0,Left Wing,35.5,13.0
2,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Vivianne Miedema,Ava Kuyken,Vivianne Miedema,Ava Kuyken,1,Ava Kuyken,9.0,Right Defensive Midfield,55.8,26.8
3,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Ava Kuyken - Vivianne Miedema,Ava Kuyken,Vivianne Miedema,Ava Kuyken,1,Ava Kuyken,21.0,Left Wing,35.5,13.0
4,FA Women's Super League,2018/2019,2018-09-09,13:30:00.000,2018-09-09 Arsenal WFC 5 vs. 0 Liverpool WFC,Arsenal WFC,Liverpool WFC,Arsenal WFC,Liverpool WFC,5,0,Danielle van de Donk - Ava Kuyken,Danielle van de Donk,Ava Kuyken,Ava Kuyken,1,Ava Kuyken,9.0,Right Defensive Midfield,55.8,26.8


In [340]:
df_sb_pass_network_final.shape

(87862, 21)
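
In the preview of the joined DataFrame, each pass pair appears once for every position the matched player occupied (Ava Kuyken is listed as both Right Defensive Midfield and Left Wing), so the join produces more rows than there are pass pairs. The sketch below reproduces that effect on made-up toy data and shows one possible way to detect it, using the `validate` argument of `pd.merge`. The DataFrame names and values are hypothetical, and the toy join uses a single shared `player_name` key rather than the notebook's `left_on`/`right_on` pair.

```python
import pandas as pd

# Toy pass-pair counts: one row per passer/recipient pair (hypothetical values)
df_pairs = pd.DataFrame({
    'full_fixture_date': ['2018-09-09 A vs. B'] * 2,
    'player_name': ['Player One', 'Player Two'],
    'pass_recipient_name': ['Player Two', 'Player One'],
    'count_passes': [3, 1],
})

# Toy average locations: Player Two played two positions, so has two rows
df_avg = pd.DataFrame({
    'full_fixture_date': ['2018-09-09 A vs. B'] * 3,
    'player_name': ['Player One', 'Player Two', 'Player Two'],
    'position_name': ['Goalkeeper', 'Left Wing', 'Right Wing'],
    'avg_location_pass_x': [20.4, 35.5, 76.9],
})

keys = ['full_fixture_date', 'player_name']

# A plain inner merge repeats the Player Two pair once per position
df_merged = pd.merge(df_pairs, df_avg, on=keys)
print(len(df_pairs), len(df_merged))

# validate='many_to_one' raises MergeError when the right-hand keys are not unique
try:
    pd.merge(df_pairs, df_avg, on=keys, validate='many_to_one')
    right_keys_unique = True
except pd.errors.MergeError:
    right_keys_unique = False

print(right_keys_unique)
```

Whether the repeated rows are wanted depends on the downstream visualisation. If a single location per player per match is needed, the average locations would have to be aggregated to one row per player before the join.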

##### Export Dataset

In [341]:
# Export to the StatsBomb engineered events directory
df_sb_pass_network_final.to_csv(data_dir_sb + '/events/engineered/sb_events_passing_network_1819_2021_wsl.csv', index=False, header=True)

# Export a copy to the general export directory
df_sb_pass_network_final.to_csv(data_dir + '/export/sb_wsl_events_passing_network.csv', index=False, header=True)
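
As an alternative to concatenating path strings, the export can be written with `os.path.join`, which avoids doubled or missing separators. The sketch below is illustrative only: a temporary directory stands in for `data_dir`, and the DataFrame contents are made up. It also reads the file back as a quick check that the export round-trips.

```python
import os
import tempfile

import pandas as pd

# Toy DataFrame standing in for df_sb_pass_network_final (hypothetical values)
df_export = pd.DataFrame({'player_name': ['Player One'], 'count_passes': [3]})

# Temporary directory used here in place of the notebook's data_dir
data_dir_demo = tempfile.mkdtemp()
export_dir = os.path.join(data_dir_demo, 'export')

# Create the target directory if it does not already exist
os.makedirs(export_dir, exist_ok=True)

# Build the full path and write the CSV without the index column
export_path = os.path.join(export_dir, 'sb_wsl_events_passing_network.csv')
df_export.to_csv(export_path, index=False, header=True)

# Read the file back to confirm the columns and row count survived the export
df_check = pd.read_csv(export_path)
print(df_check.shape)
```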

## <a id='#section6'>6. Summary</a>
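This notebook parses JSON data from the [StatsBomb](https://statsbomb.com/) Open Data repository and engineers it with [pandas](http://pandas.pydata.org/) into a passing network dataset for the FA Women's Super League, which is exported as CSV.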

## <a id='#section7'>7. Next Steps</a>
The next stage is to visualise the exported passing network data in Tableau.

## <a id='#section8'>8. References</a>

#### Data
*    [StatsBomb](https://statsbomb.com/) data
*    [StatsBomb](https://github.com/statsbomb/open-data/tree/master/data) open data GitHub repository

---

***Visit my website [EddWebster.com](https://www.eddwebster.com) or my [GitHub Repository](https://github.com/eddwebster) for more projects. If you'd like to get in contact, my Twitter handle is [@eddwebster](http://www.twitter.com/eddwebster) and my email is: edd.j.webster@gmail.com.***

[Back to the top](#top)