# Data Exploration and first metrics computation
From Laurie Shaw:

The first step is simply to get the example script to run. It should produce a bunch of plots; you'll have to adjust the file paths to wherever you saved the tracking data.

Once you've got it to run, try to understand what each line in the example script is doing. 

- The data is mostly stored in 'frames_tb', which is a list of individual frames, as defined by the class with the same name. Each frame instance contains the positions and velocities of the players and ball at a given instant in time. The data is sampled at 25Hz, so there are 25 frames/second and about 140,000 frames for the match.
- The example code gives you some idea of how to extract positions and velocities over some range of frames. 
- The Tracab.py module describes how the data is organized: take a look at the 'tracab_frame' class to see the structure.
- Tracking_Visuals contains plotting routines, and Tracking_Velocities contains the code that calculates player and ball velocities from the positions (which could probably be done better).
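
As a point of comparison for the velocity calculation mentioned above, here is a minimal sketch of a finite-difference estimate at 25Hz. It is not the code in Tracking_Velocities: the function name `estimate_velocity` and the synthetic trajectory are made up for illustration, and positions are assumed to already be in metres.

```python
import numpy as np

FPS = 25  # sampling rate stated above


def estimate_velocity(x, y, fps=FPS):
    """Finite-difference velocity (m/s) from positions assumed to be in metres."""
    dt = 1.0 / fps
    # np.gradient uses central differences inside the array
    # and one-sided differences at the two ends.
    vx = np.gradient(np.asarray(x, dtype=float), dt)
    vy = np.gradient(np.asarray(y, dtype=float), dt)
    speed = np.hypot(vx, vy)
    return vx, vy, speed


# Hypothetical player moving at a constant 5 m/s along x for one second.
t = np.arange(FPS + 1) / FPS
x = 5.0 * t
y = np.zeros_like(t)
vx, vy, speed = estimate_velocity(x, y)
print(speed.round(3))
```

Real tracking positions are noisy, so a raw difference like this usually needs smoothing before the speeds are usable.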

In [85]:
%load_ext autoreload
%autoreload 2

import os
import Tracab as tracab
import Tracking_Visuals as vis
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# import importlib
# import foo #import the module here, so that it can be reloaded.
# importlib.reload(foo)


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [86]:
# config
current_dir = os.path.dirname(os.getcwd())
fpath = os.path.join(current_dir, 'TrackingSample') # path to directory of Tracab data
LEAGUE = 'DSL'

# Read Tracking data

We read the data:
* frames_tb is a list of the individual match snapshots (positions, velocities)
* match_tb contains some metadata (pitch dimensions, etc.)
* team1_players is a dictionary of the home team players (containing arrays of their positions/velocities over the match)
* team0_players is a dictionary of the away team players (containing arrays of their positions/velocities over the match)

In [87]:
# data
fname = '984628'

# read frames, match meta data, and data for individual players
frames_tb, match_tb, team1_players, team0_players = tracab.read_tracab_match_data(LEAGUE, fpath, fname, verbose=True)

Reading match metadata
Reading match tracking data
Timestamping frames
Measuring velocities
home goalkeeper(s):  [1]
away goalkeeper(s):  [73]
0 67615
67616 139808


In [88]:
print('there are {} frames'.format(len(frames_tb)))

there are 139810 frames
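
The frame counts can be turned into clock time using the 25Hz sampling rate. The sketch below assumes that the two lines `0 67615` and `67616 139808` printed while reading the match are the first and last frame indices of each half; that reading is an inference from the output, not something the reader function documents here. The helper name `frames_to_clock` is hypothetical.

```python
FPS = 25  # frames per second


def frames_to_clock(n_frames, fps=FPS):
    """Convert a number of frames to a 'mm:ss' string, rounded to the nearest second."""
    total_seconds = round(n_frames / fps)
    minutes, seconds = divmod(total_seconds, 60)
    return "{:d}:{:02d}".format(minutes, seconds)


# Assumed (start, end) frame indices of each half, taken from the reader output.
first_half_frames = 67615 - 0 + 1
second_half_frames = 139808 - 67616 + 1

print(frames_to_clock(first_half_frames))   # 45:05
print(frames_to_clock(second_half_frames))  # 48:08
print(frames_to_clock(139810))              # 93:12
```

Under that assumption the half lengths agree with the 'Game time' row of the split summary (45:05 and 48:08), and the full frame count agrees with the 93:12 in the 'Minutes' column of the player table.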


In [90]:
match_tb

<Tracab.tracab_match at 0x1c75901c50>

# Read Split data

In [34]:
split_players = pd.read_csv(os.path.join(fpath, '984628_Physical_Summary_1_clean_players.csv'))
split_agg = pd.read_csv(os.path.join(fpath, '984628_Physical_Summary_1_clean_agg.csv'), index_col=0)

In [35]:
split_players.head()

Unnamed: 0,ID,team_id,Player,Minutes,Distance,Standing,Walking,Jogging,Running,High Speed Running,...,Sprint Distance TIP,No. of High Intensity Runs TIP,Distance OTIP,HSR Distance OTIP,Sprint Distance OTIP,No. of High Intensity Runs OTIP,Distance BOP,HSR Distance BOP,Sprint Distance BOP,No. of High Intensity Runs BOP
0,182413,1,Jacob Rinne,93:12:00,4528.57,20.3,3361.22,1029.15,96.67,21.23,...,0.0,2,1641.41,13.59,0.0,2,1536.91,0.0,0.0,0
1,155453,1,Kasper Pedersen,93:12:00,9532.75,9.14,3549.7,4441.56,1081.65,322.31,...,1.02,5,4216.26,271.79,127.37,34,2271.82,38.31,0.0,3
2,80502,1,Jores Okore,93:12:00,9691.12,7.04,3179.93,4849.27,1262.01,327.06,...,0.0,8,4065.08,266.05,65.81,29,2413.78,6.52,0.0,4
3,180169,1,Philipp Ochs,93:12:00,10420.81,5.36,3619.81,4643.77,1455.0,563.26,...,59.4,15,4308.79,392.9,74.21,36,2569.26,18.36,0.0,1
4,48601,1,Patrick Kristensen,93:12:00,10907.62,8.13,3216.4,4905.82,2093.52,569.47,...,64.62,25,4381.58,376.43,49.66,33,2546.22,19.84,0.0,4
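
A quick sanity check on these rows, useful before trying to reproduce the split metrics: the speed-band distances should add up to the total distance. The sprint column is hidden behind the `...` in the output, so the sketch below assumes that total sprint distance is the sum of the three visible 'Sprint Distance' columns (TIP, OTIP, BOP). The values are copied from the five rows shown above.

```python
# (player, Distance, [Standing, Walking, Jogging, Running, High Speed Running],
#  [Sprint Distance TIP, Sprint Distance OTIP, Sprint Distance BOP])
rows = [
    ("Jacob Rinne", 4528.57, [20.3, 3361.22, 1029.15, 96.67, 21.23], [0.0, 0.0, 0.0]),
    ("Kasper Pedersen", 9532.75, [9.14, 3549.7, 4441.56, 1081.65, 322.31], [1.02, 127.37, 0.0]),
    ("Jores Okore", 9691.12, [7.04, 3179.93, 4849.27, 1262.01, 327.06], [0.0, 65.81, 0.0]),
    ("Philipp Ochs", 10420.81, [5.36, 3619.81, 4643.77, 1455.0, 563.26], [59.4, 74.21, 0.0]),
    ("Patrick Kristensen", 10907.62, [8.13, 3216.4, 4905.82, 2093.52, 569.47], [64.62, 49.66, 0.0]),
]

TOLERANCE = 0.01  # metres; the summary is rounded to two decimals

residuals = {}
for player, distance, bands, sprint_parts in rows:
    # Assumption: the hidden sprint column equals the sum of its TIP/OTIP/BOP parts.
    reconstructed = sum(bands) + sum(sprint_parts)
    residuals[player] = distance - reconstructed

all_consistent = all(abs(r) < TOLERANCE for r in residuals.values())
print(all_consistent)  # True
```

For these five players the bands plus the assumed sprint total match 'Distance' to within rounding, which suggests the bands partition the total distance.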


In [84]:
split_agg

Unnamed: 0,Total,First,Second
Game time,93:13:00,45:05:00,48:08:00
Ball in play,57:51:00,27:12:00,30:39:00
Home TIP,27:23:00,12:36,14:47
Away TIP,29:37:00,13:57,15:40
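
The durations in this table are strings in two different layouts ('93:13:00' in the Total column, '12:36' in some half columns). The sketch below assumes both mean minutes:seconds and that the trailing ':00' is an export artifact; the fact that each Total equals First plus Second under that reading supports it, but it remains an assumption. The helper `clock_to_seconds` is hypothetical.

```python
def clock_to_seconds(value):
    """Parse 'mm:ss' or 'mm:ss:00' into seconds (assumes the third field is an artifact)."""
    parts = [int(p) for p in value.strip().split(":")]
    if len(parts) not in (2, 3):
        raise ValueError("unexpected duration format: {!r}".format(value))
    minutes, seconds = parts[0], parts[1]
    return minutes * 60 + seconds


# (Total, First, Second) copied from the split_agg output above.
split_rows = {
    "Game time": ("93:13:00", "45:05:00", "48:08:00"),
    "Ball in play": ("57:51:00", "27:12:00", "30:39:00"),
    "Home TIP": ("27:23:00", "12:36", "14:47"),
    "Away TIP": ("29:37:00", "13:57", "15:40"),
}

# Check that Total == First + Second for every row.
halves_add_up = {
    name: clock_to_seconds(total) == clock_to_seconds(first) + clock_to_seconds(second)
    for name, (total, first, second) in split_rows.items()
}
print(halves_add_up)

# Share of the match during which the ball was in play (roughly 62%).
in_play_fraction = (
    clock_to_seconds(split_rows["Ball in play"][0])
    / clock_to_seconds(split_rows["Game time"][0])
)
print(round(in_play_fraction, 3))
```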


In [38]:
team1_players

{32: <Tracab.tracab_player at 0x1c0b6a8080>,
 1: <Tracab.tracab_player at 0x1c0b6a8198>,
 2: <Tracab.tracab_player at 0x1c0b6a82b0>,
 5: <Tracab.tracab_player at 0x1c0b6a80f0>,
 7: <Tracab.tracab_player at 0x1c0b6a80b8>,
 8: <Tracab.tracab_player at 0x1c0b6a8160>,
 9: <Tracab.tracab_player at 0x1c0b6a83c8>,
 10: <Tracab.tracab_player at 0x1c0b6a8208>,
 11: <Tracab.tracab_player at 0x1c0b6a81d0>,
 17: <Tracab.tracab_player at 0x1c0b6a8278>,
 18: <Tracab.tracab_player at 0x1c0b6a8128>,
 21: <Tracab.tracab_player at 0x1c0b6a84e0>,
 25: <Tracab.tracab_player at 0x1c0b6a8320>}

## Understanding the code

In [75]:
fmetadata, fdata = tracab.get_tracabdata_paths(fpath, fname, league=LEAGUE)


In [83]:
match = tracab.read_tracab_match(fmetadata)


In [82]:
match.match_attributes

{'iId': '984628',
 'dtDate': '2019-03-17 17:00:00',
 'iFrameRateFps': '25',
 'fPitchXSizeMeters': '105.00',
 'fPitchYSizeMeters': '68.00',
 'fTrackingAreaXSizeMeters': '111.00',
 'fTrackingAreaYSizeMeters': '88.00'}
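
All the attribute values come back as strings, so they need casting before they can be used in calculations. The sketch below assumes that the name prefixes encode the type ('i' for integer, 'f' for float, 'dt' for datetime); that convention is inferred from the keys, and `parse_attributes` is a hypothetical helper, not part of the Tracab module.

```python
from datetime import datetime

# Values copied from the match.match_attributes output above (all strings).
raw_attributes = {
    "iId": "984628",
    "dtDate": "2019-03-17 17:00:00",
    "iFrameRateFps": "25",
    "fPitchXSizeMeters": "105.00",
    "fPitchYSizeMeters": "68.00",
    "fTrackingAreaXSizeMeters": "111.00",
    "fTrackingAreaYSizeMeters": "88.00",
}


def parse_attributes(raw):
    """Cast values by key prefix: 'dt' -> datetime, 'i' -> int, 'f' -> float."""
    parsed = {}
    for key, value in raw.items():
        if key.startswith("dt"):
            parsed[key] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        elif key.startswith("i"):
            parsed[key] = int(value)
        elif key.startswith("f"):
            parsed[key] = float(value)
        else:
            parsed[key] = value  # unknown prefix: leave untouched
    return parsed


attrs = parse_attributes(raw_attributes)

# Width of the tracked area beyond the pitch on each side, in metres.
margin_x = (attrs["fTrackingAreaXSizeMeters"] - attrs["fPitchXSizeMeters"]) / 2
margin_y = (attrs["fTrackingAreaYSizeMeters"] - attrs["fPitchYSizeMeters"]) / 2
print(margin_x, margin_y)  # 3.0 10.0
```

The tracking area is larger than the pitch, so positions slightly outside the touchlines (for example during throw-ins) are still valid data rather than errors.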

# Reproducing split metrics 

In [None]:
# p