### Welcome to Emaly Vatne's submission to the PySport x SkillCorner Analytics Cup!

I sincerely hope you find this project as useful and interesting as I do. I also hope that one day it can be an open-source resource for analysts and sport scientists alike.

#### 1. Set-Up  
Start by importing the libraries needed for the project.

In [13]:
# ----- imports and set up -----
import requests, json, os, glob, time, datetime, warnings
from datetime import date, timedelta
from tqdm.auto import tqdm
import numpy as np
import pandas as pd
from pandas import json_normalize
from scipy.stats import zscore
from pathlib import Path
import matplotlib.pyplot as plt
from pandas.plotting import table as pd_table
import plotly.graph_objects as go
import ipywidgets as widgets
from IPython.display import display, clear_output

# setting some configurations
warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", None)

In [2]:
# imports helper functions
from src.utils import * 

# imports functions needed to calculate the worst-case scenario demands/peak running intensities from tracking data
from src.wcs_calcs import * 

# imports functions to visualize peak running intensities for a selected athlete and match
from src.wcs_movement_sequences import *

The SkillCorner open data will be saved to a "data" folder, the worst-case scenario (WCS) results to a "wcs" folder, and summary tables and other figures to a "figs" folder.

In [3]:
# ----- where raw data will be saved -----
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

# ----- where worst-case scenario demands results will be saved -----
wcs_dir = Path("wcs")
wcs_dir.mkdir(exist_ok=True)

# ----- where results and other figures will be saved -----
figs_dir = Path("figs")
figs_dir.mkdir(exist_ok=True)

#### 2. Load and Preprocess Data  
Next, we'll specify which SkillCorner match-ID we would like to load and analyze. **For the purpose of this report, I'm only going to analyze one random match**. However, this notebook could analyze any match with event and tracking data. 

I think the best workflow would be to run this notebook immediately after each match, save the worst-case scenario demands/peak running intensity results, and then move on from processing that match.
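One way to support that run-after-each-match workflow is to skip matches whose results already exist on disk. The sketch below assumes results are written as `{match_id}_WCS.csv` inside the "wcs" folder, which is the naming used when results are saved later in this notebook. `unprocessed_match_ids` is a hypothetical helper for illustration and is not part of `src/utils.py`.

```python
from pathlib import Path


def unprocessed_match_ids(match_ids, wcs_dir="wcs"):
    """Return the match IDs that do not yet have a saved WCS results file.

    Assumes one CSV per match named '<match_id>_WCS.csv' (hypothetical helper).
    """
    wcs_dir = Path(wcs_dir)
    return [mid for mid in match_ids if not (wcs_dir / f"{mid}_WCS.csv").exists()]
```

Passing the returned list to the loading loop instead of the full `match_ids` list would avoid reprocessing matches that were already analyzed.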

In [4]:
# ----- SkillCorner match IDs -----
match_ids = [
    1925299
]

Now, using `src/utils.py` and the `SkillCorner()` class, we'll load and preprocess the metadata (e.g., player and match details), event, and tracking data.

In [5]:
# ----- initialize object for loading and parsing the data later -----
skillcorner = SkillCorner()

In [6]:
# ----- load data for matches specified in match_ids ----- 
# initialize objects
tracking_list = []
metadata_list = []
event_list = []

# iterate over matches in match_ids and load
for match_id in match_ids:
    metadata_data, tracking_data, event_data = skillcorner.load(match_id)

    tracking_list.append(tracking_data)
    event_list.append(event_data)
    metadata_list.append(metadata_data)

# concatenate each list into a single pandas DataFrame
tracking_df = pd.concat(tracking_list, ignore_index=True)
event_df = pd.concat(event_list, ignore_index=True)
metadata_df = pd.concat(metadata_list, ignore_index=True)

Brisbane Roar FC vs Perth Glory Football Club on 2024-12-21 parsed...


#### 3. Calculate the Worst-Case Scenario (WCS) Demands/Peak Running Intensities
Next, we will apply the code that processes the tracking data into the worst-case scenario (WCS) demands/peak running intensities. By default, the speed thresholds for high-speed running and sprint distances are 5.28 m/s and 6.39 m/s, respectively. The commented-out code shows how a user could adjust the threshold bands to match their environment.
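To make the role of the thresholds concrete, here is a minimal sketch of how distance could be split into speed bands from a per-frame speed signal. It is not the implementation in `src/wcs_calcs.py`. It assumes speed in m/s sampled at a fixed frame rate, and it assumes the high-speed running band stops where the sprint band starts, which the notebook does not state. `band_distances` is a hypothetical name.

```python
import numpy as np


def band_distances(speed_ms, fps=10.0, hsr=5.28, spr=6.39):
    """Split distance covered (m) into total, high-speed running, and sprint.

    Assumptions (not confirmed by the notebook): speed is in m/s at a fixed
    frame rate, and the HSR band is hsr <= v < spr so bands do not overlap.
    """
    speed = np.asarray(speed_ms, dtype=float)
    step = speed / fps  # metres covered during each frame
    return {
        "total": float(step.sum()),
        "hsr": float(step[(speed >= hsr) & (speed < spr)].sum()),
        "sprint": float(step[speed >= spr].sum()),
    }
```

Raising `hsr` and `spr` shifts distance out of the faster bands, which is why the thresholds should reflect the population being monitored.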

In [7]:
# ----- apply the function to calculate the WCS demands/peak running intensities -----
peak_intensity = compute_peak_intensities_from_tracking(tracking_df)

# ----- example of using different/custom speed thresholds -----
# peak_intensity_men = compute_peak_intensities_from_tracking(
#     tracking_df,
#     thr=4.0,
#     hsr=6.0,
#     spr=7.0
# )

# ----- for the match_ids in peak_intensity, save the WCS demands to /wcs -----
for mid in peak_intensity["match_id"].unique():
    subset = peak_intensity[peak_intensity["match_id"] == mid]

    # create file name using match_id
    out_path = os.path.join("wcs/", f"{mid}_WCS.csv")

    subset.to_csv(out_path, index=False, encoding="utf-8")
    print(f"Saved: {out_path}")

# ----- preview peak_intensity df -----
peak_intensity.head(10)

Saved: wcs/1925299_WCS.csv


Unnamed: 0,match_id,player_id,team_name,Total Distance (m),High Speed Running Distance (m),Sprint Distance (m),Peak m/min 60s,Peak m/min 60s_FrameStart,Peak m/min 60s_TimeStart,Peak m/min 120s,Peak m/min 120s_FrameStart,Peak m/min 120s_TimeStart,Peak m/min 180s,Peak m/min 180s_FrameStart,Peak m/min 180s_TimeStart,Peak m/min 240s,Peak m/min 240s_FrameStart,Peak m/min 240s_TimeStart,Peak m/min 300s,Peak m/min 300s_FrameStart,Peak m/min 300s_TimeStart
0,1925299,11897,Perth Glory Football Club,8955.646347,452.398149,125.845636,207.703015,44161,2025-12-25 01:08:11.100,201.75334,40,2025-12-25 00:00:03.000,201.75334,40,2025-12-25 00:00:03.000,201.75334,40,2025-12-25 00:00:03.000,201.75334,40,2025-12-25 00:00:03.000
1,1925299,50982,Perth Glory Football Club,7929.443111,610.941476,234.821292,201.316075,1293,2025-12-25 00:02:08.300,180.994959,33765,2025-12-25 00:50:51.500,169.952099,980,2025-12-25 00:01:37.000,169.16021,40,2025-12-25 00:00:03.000,169.16021,40,2025-12-25 00:00:03.000
2,1925299,50999,Brisbane Roar FC,3877.751792,13.175264,0.0,98.199207,51785,2025-12-25 01:20:53.500,78.676564,50143,2025-12-25 01:18:09.300,73.160537,49366,2025-12-25 01:16:51.600,67.484011,17951,2025-12-25 00:29:54.100,65.172164,40,2025-12-25 00:00:03.000
3,1925299,51002,Brisbane Roar FC,10583.173961,753.155103,200.263812,228.881419,34878,2025-12-25 00:52:42.800,213.774749,40,2025-12-25 00:00:03.000,213.774749,40,2025-12-25 00:00:03.000,213.774749,40,2025-12-25 00:00:03.000,213.774749,40,2025-12-25 00:00:03.000
4,1925299,51043,Brisbane Roar FC,7184.406427,549.388259,163.494481,246.37925,40,2025-12-25 00:00:03.000,246.37925,40,2025-12-25 00:00:03.000,246.37925,40,2025-12-25 00:00:03.000,246.37925,40,2025-12-25 00:00:03.000,246.37925,40,2025-12-25 00:00:03.000
5,1925299,51050,Brisbane Roar FC,3064.128982,173.814489,38.793516,190.191408,2742,2025-12-25 00:04:33.200,188.481837,40,2025-12-25 00:00:03.000,188.481837,40,2025-12-25 00:00:03.000,188.481837,40,2025-12-25 00:00:03.000,188.481837,40,2025-12-25 00:00:03.000
6,1925299,51694,Perth Glory Football Club,11055.900515,595.964258,153.774575,228.859245,979,2025-12-25 00:01:36.900,197.904028,977,2025-12-25 00:01:36.700,193.875736,40,2025-12-25 00:00:03.000,193.875736,40,2025-12-25 00:00:03.000,193.875736,40,2025-12-25 00:00:03.000
7,1925299,51722,Brisbane Roar FC,2989.073324,347.73745,112.745715,212.073405,48007,2025-12-25 01:14:35.700,186.469386,55549,2025-12-25 01:27:09.900,180.669951,48024,2025-12-25 01:14:37.400,170.470232,44156,2025-12-25 01:08:10.600,168.637581,45966,2025-12-25 01:11:11.600
8,1925299,69111,Brisbane Roar FC,6125.272957,290.96875,83.249203,206.514096,20278,2025-12-25 00:33:46.800,206.514096,20278,2025-12-25 00:33:46.800,206.514096,20278,2025-12-25 00:33:46.800,206.514096,20278,2025-12-25 00:33:46.800,206.514096,20278,2025-12-25 00:33:46.800
9,1925299,287934,Brisbane Roar FC,11261.918432,796.643696,262.634966,220.645961,18495,2025-12-25 00:30:48.500,209.869317,40,2025-12-25 00:00:03.000,209.869317,40,2025-12-25 00:00:03.000,209.869317,40,2025-12-25 00:00:03.000,209.869317,40,2025-12-25 00:00:03.000
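For readers unfamiliar with the `Peak m/min` columns, the general idea is a rolling window: sum the distance covered in every window of the chosen duration, keep the largest sum, and express it in metres per minute. The sketch below illustrates that idea under stated assumptions (a gap-free speed signal in m/s at a fixed frame rate). It is not the code in `src/wcs_calcs.py`, and `peak_m_per_min` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def peak_m_per_min(speed_ms, window_seconds=60, fps=10.0):
    """Return (peak metres per minute, position of the window's first frame).

    Assumes a continuous speed signal in m/s sampled at `fps` frames per second.
    """
    n = int(round(window_seconds * fps))  # frames per window
    dist = pd.Series(np.asarray(speed_ms, dtype=float) / fps)  # metres per frame
    rolling = dist.rolling(window=n, min_periods=n).sum()  # metres per window
    if rolling.isna().all():
        # the signal is shorter than one full window
        return float("nan"), None
    end = int(rolling.idxmax())  # last frame of the most demanding window
    peak = rolling.iloc[end] * 60.0 / window_seconds  # scale to metres per minute
    return float(peak), end - n + 1
```

Requiring a full window (`min_periods=n`) means players with less playing time than the window length get no value for that duration rather than a value computed from a partial window.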


In [8]:
# ----- Present a clean summary of WCS demands by period duration -----
team_wcs_summary = summarize_team_wcs(peak_intensity)

# ----- Save the summary table -----
# note: match_id is the loop variable left over from the loading cell (the last ID in match_ids)
save_table_as_image(team_wcs_summary, filename=f"{match_id}_WCS.png")

# ----- Print the summary table -----
team_wcs_summary

Saved table image to: figs/1925299_WCS.png


Unnamed: 0,Team,Peak m/min 60s,Peak m/min 120s,Peak m/min 180s,Peak m/min 240s,Peak m/min 300s
0,Brisbane Roar FC,220.3 (98.2 – 269.6),212.1 (78.7 – 269.6),210.4 (73.2 – 269.6),209.4 (67.5 – 269.6),209.2 (65.2 – 269.6)
1,Perth Glory Football Club,225.1 (114.2 – 375.5),212.7 (97.9 – 375.5),210.7 (89.6 – 375.5),208.9 (81.6 – 375.5),208.5 (81.6 – 375.5)


#### 4. Merge 60-Second WCS Demands with Preceding Technical/Tactical Event  
Before visualizing or characterizing the movement patterns that define each player’s WCS running demands, it is important to understand what is happening in the game during those moments. We therefore merge each identified 60-second WCS period with the closest corresponding technical or tactical event recorded for the same player. This notebook uses the 60-second WCS period, but passing a different duration (in seconds) to `window_seconds` selects a different WCS period.

Overall, this step allows for contextualizing peak physical efforts by linking them to specific in-game actions such as off-ball runs, pressing, or possession involvement that precede the efforts. The output provides a unified dataset that connects WCS running intensity with the actual behaviours driving those demanding phases of play.
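As an illustration of the matching step, a nearest-preceding-event join can be expressed with `pandas.merge_asof`. This is a sketch under assumptions, not the body of `merge_wcs_peaks_with_events`: the column names `frame_start`, `frame`, and `event_type` are hypothetical, and it looks backward only, so an event is matched when it occurs at or before the start of the WCS window and within the frame tolerance.

```python
import pandas as pd


def match_preceding_event(peaks, events, tolerance_frames=100):
    """Attach to each WCS window the latest event of the same player that
    starts at or before the window and within `tolerance_frames`.

    Hypothetical column names: peaks[player_id, frame_start],
    events[player_id, frame, event_type].
    """
    # merge_asof requires both frames to be sorted by their join keys
    peaks = peaks.sort_values("frame_start")
    events = events.sort_values("frame")
    return pd.merge_asof(
        peaks,
        events,
        left_on="frame_start",
        right_on="frame",
        by="player_id",            # only match events from the same player
        direction="backward",      # only events at or before the window start
        tolerance=tolerance_frames,
    )
```

WCS windows with no event inside the tolerance keep missing values in the event columns, so unmatched peaks remain visible instead of being dropped silently.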

In [9]:
# ----- Merge the 60-second WCS demands with events (can be repeated for other windows) -----
merged_WCS_events_60 = merge_wcs_peaks_with_events(
    peak_intensity=peak_intensity,
    event_df=event_df,
    window_seconds=60,
    player_col="player_id",
    tolerance_frames=100  # optional; sets the matching window to 10 seconds (100 frames at 10 Hz)
)

#### 5. Summarize the Events that Precede the WCS Demands/Peak Running Intensities

Since the WCS demands are the most physically challenging moments players experience within a match, understanding what actions lead to these demanding periods is essential for effective training design and performance interpretation.  

To provide this context, this code identifies the closest on-ball or off-ball event that preceded each player’s WCS demand period for the selected duration and summarizes the distribution of event types and sub-types by team for the match being analyzed. This table helps sport scientists and coaches quickly determine which tactical actions most often precede high-intensity locomotor demands (e.g., supporting runs, pressing actions, or ball engagements).

**Importantly, sport scientists can use this information to understand if these actions align with the game model.**
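The summary itself amounts to a count of event types per team plus each count's share of that team's matched WCS periods. A minimal sketch of that aggregation follows. The column names `team_name` and `event_type` and the helper name are assumptions, and the output columns are simplified relative to the table the notebook produces.

```python
import pandas as pd


def summarize_preceding_events(merged, team_col="team_name", event_col="event_type"):
    """Count preceding event types per team and express each as a percentage
    of that team's matched WCS periods (hypothetical helper and column names).
    """
    counts = merged.groupby([team_col, event_col]).size().reset_index(name="count")
    team_totals = counts.groupby(team_col)["count"].transform("sum")
    counts["pct_of_team"] = (100 * counts["count"] / team_totals).round(2)
    # most frequent event type first within each team
    return counts.sort_values(
        [team_col, "count"], ascending=[True, False]
    ).reset_index(drop=True)
```

Note that `groupby` drops rows with a missing event type, so in this sketch the percentages are relative to matched periods only.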

In [10]:
# ----- merge the 60-second WCS demands with preceding events -----
merged_WCS_events_60, event_summary_60 = merge_wcs_peaks_with_events(
    peak_intensity=peak_intensity,
    event_df=event_df,
    window_seconds=60,
    player_col="player_id", 
    tolerance_frames=50, # equates to 5-second buffer (10 Hz) when matching event and WCS for each player
    return_summary=True
)

# ----- save the results to /figs -----
save_table_as_image(event_summary_60, filename=f"{match_id}_WCS_PrecedingEvents_60s.png")

# ----- Print the summary of preceding events by team -----
event_summary_60

Saved table image to: figs/1925299_WCS_PrecedingEvents_60s.png


Unnamed: 0,Team,Event Type,EventType_Count,% of Events that Precede WCS,Event Sub-Type,Subtype_Count
0,Brisbane Roar FC,passing_option,2,28.57,No Sub-Type,2
1,Brisbane Roar FC,off_ball_run,2,28.57,run_ahead_of_the_ball,1
2,Brisbane Roar FC,off_ball_run,2,28.57,support,1
3,Brisbane Roar FC,on_ball_engagement,2,28.57,pressure,1
4,Brisbane Roar FC,on_ball_engagement,2,28.57,recovery_press,1
5,Brisbane Roar FC,player_possession,1,14.29,No Sub-Type,1
6,Perth Glory Football Club,off_ball_run,3,37.5,support,2
7,Perth Glory Football Club,off_ball_run,3,37.5,run_ahead_of_the_ball,1
8,Perth Glory Football Club,passing_option,2,25.0,No Sub-Type,2
9,Perth Glory Football Club,player_possession,2,25.0,No Sub-Type,2


For example, identifying the events that most frequently precede WCS periods enables sport scientists to evaluate whether these demanding moments align with the coaching staff’s tactical intentions, or the team’s *game model*. In the case of Perth Glory FC against Brisbane Roar, 37.5% of the events that preceded Perth Glory's 60-second WCS demands/peak running intensity periods were off-ball runs. This may reflect a deliberate tactical strategy (e.g., vertical stretching, aggressive support play), or it could suggest opportunities for tactical refinement.

High-intensity off-ball efforts can result from delayed support, poor starting positions, or a lack of tactical recognition, which means the physical demand is a symptom of a decision-making lapse rather than a desired action. In this scenario, the optimal intervention may involve a strategy that improves players' understanding of when and how to create or support attacking solutions efficiently.

If this off-ball running does align with the coaching staff's *game model*, then the tracking data can also support the design of position-specific and individualized running-based WCS drills. See the next section for how tracking data can provide support in this area.

#### 6. Movement Sequences of WCS Demands/Peak Running Intensity Periods  
The code below now visualizes how players actually move during these intense periods in a match using an interactive animated pitch view. Users can select any player involved in the match and choose a WCS duration window (e.g., 60 seconds), and the animation will display their movement trail coloured by instantaneous speed. This provides a dynamic representation of how physical loads, tactical role, and spatial behaviour interact in real match settings. This tool supports applied use in cases such as:  
- Constructing drills that replicate the tactical triggers of WCS moments. 
    - Importantly, the visualization demonstrates that **WCS demands/peak running intensities are neither linear nor performed at a constant speed**. Therefore, conditioning drills built on linear, constant-speed running may not prepare athletes for the demands of football.
- Evaluating positional demands and high-pressing strategies. 
- Monitoring return-to-play readiness based on ability to sustain game-speed actions. 
- Scouting player physical-tactical fit in recruitment processes.
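To inspect a single WCS period outside the widget, the frames for one player and window can be sliced from the tracking data and described numerically. The sketch below is illustrative only and is not part of `src/wcs_movement_sequences.py`. The column names `frame`, `x`, and `y` are assumptions about the tracking DataFrame, and "straightness" (net displacement divided by distance travelled) is a simple descriptor chosen here to show how non-linear a movement sequence is.

```python
import numpy as np
import pandas as pd


def wcs_window(tracking, player_id, frame_start, window_seconds=60, fps=10.0):
    """Return one player's tracking rows for a WCS window, ordered by frame.

    Hypothetical column names: player_id, frame.
    """
    n_frames = int(round(window_seconds * fps))
    mask = (
        (tracking["player_id"] == player_id)
        & (tracking["frame"] >= frame_start)
        & (tracking["frame"] < frame_start + n_frames)
    )
    return tracking.loc[mask].sort_values("frame")


def path_descriptors(window):
    """Distance travelled (m) and straightness (1.0 means a straight line)."""
    if len(window) < 2:
        return {"distance_m": float("nan"), "straightness": float("nan")}
    x = window["x"].to_numpy(dtype=float)
    y = window["y"].to_numpy(dtype=float)
    travelled = np.hypot(np.diff(x), np.diff(y)).sum()  # sum of frame-to-frame steps
    net = np.hypot(x[-1] - x[0], y[-1] - y[0])          # start-to-end displacement
    straightness = net / travelled if travelled > 0 else float("nan")
    return {"distance_m": float(travelled), "straightness": float(straightness)}
```

A straightness value well below 1.0 for a WCS window would be consistent with the point above that these periods involve changes of direction rather than straight-line running.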

In [None]:
# ----- Render the select widgets and movement sequences plot -----
build_wcs_widget(tracking_df, peak_intensity, match_id=match_id, fps=10.0)

VBox(children=(Dropdown(description='Player ID:', options=(np.int64(11897), np.int64(50982), np.int64(50999), …

#### Together, these tools transform event data and raw tracking data into actionable performance insight for *sport scientists and coaches alike* by linking what happened physically with why it occurred tactically and translating it into a practically useful tool.