# Off-Ball Run Analysis: Quantifying Space Creation in Football

**Author:** Ivo Steinke  
**Competition:** PySport X SkillCorner Analytics Cup

---

## Introduction

Off-ball movement is crucial to modern football tactics, yet traditional analysis focuses primarily on players in possession. This work presents an approach for quantifying how much exploitable space high-speed off-ball runs create for the ball carrier.

**Research Question:** *How much space do coordinated off-ball movements generate during team possession?*

The methodology analyzes 10 A-League matches using SkillCorner's broadcast tracking data, combining velocity-based run detection with Voronoi tessellation to measure spatial impact. This transparency-first approach provides interpretable metrics for tactical analysis without relying on black-box machine learning.

In [4]:
# Imports
from src.data_loader import load_matches_info, load_match_data
from src.space_analysis import analyze_all_matches_normalized
import pandas as pd

# Configuration
MATCH_IDS = ['2017461', '1996435', '1886347', '1899585', '1925299',
             '1953632', '2006229', '2011166', '2013725', '2015213']
VELOCITY_THRESHOLD = 5.0  # m/s (18 km/h)

print(f"Analyzing {len(MATCH_IDS)} A-League matches")
print(f"Velocity threshold: {VELOCITY_THRESHOLD} m/s")

Analyzing 10 A-League matches
Velocity threshold: 5.0 m/s


## Methodology

**Data:** 10 A-League matches, 688,000+ frames at 10 Hz

**Run Detection Criteria:**
- Velocity ≥5.0 m/s (18 km/h)
- Duration ≥3 seconds
- Own team in possession, runner without ball
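
The criteria above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the detection code in `src.space_analysis` is not shown in this notebook and may work differently. The function names, the per-frame boolean flags `in_possession` and `has_ball`, and the way speed is derived from positions are all assumptions made for the example.

```python
from itertools import groupby
from math import hypot

FPS = 10               # tracking frame rate in Hz, as stated in the Data line
SPEED_THRESHOLD = 5.0  # m/s
MIN_DURATION_S = 3.0   # seconds


def speeds_from_positions(xs, ys, fps=FPS):
    """Frame-to-frame speed in m/s; frame 0 reuses the value of frame 1."""
    raw = [hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]) * fps
           for i in range(1, len(xs))]
    return raw[:1] + raw


def detect_runs(speeds, in_possession, has_ball, fps=FPS,
                threshold=SPEED_THRESHOLD, min_duration_s=MIN_DURATION_S):
    """Return (start_frame, end_frame) pairs, end exclusive, for qualifying runs."""
    min_frames = int(round(min_duration_s * fps))
    # A frame qualifies if the player is fast enough, the team has the ball,
    # and the player is not the ball carrier (hypothetical per-frame flags).
    qualifies = [
        s >= threshold and poss and not ball
        for s, poss, ball in zip(speeds, in_possession, has_ball)
    ]
    runs, frame = [], 0
    for flag, group in groupby(qualifies):
        length = len(list(group))
        if flag and length >= min_frames:
            runs.append((frame, frame + length))
        frame += length
    return runs


# Synthetic player: jog, 3.5 s sprint, jog, 2 s sprint (too short to count).
steps = [0.2] * 20 + [0.6] * 35 + [0.2] * 10 + [0.6] * 20  # metres per frame
xs = [0.0]
for step in steps:
    xs.append(xs[-1] + step)
ys = [0.0] * len(xs)

speeds = speeds_from_positions(xs, ys)
runs = detect_runs(speeds, [True] * len(xs), [False] * len(xs))
print(runs)  # [(21, 56)]
```

Only the 35-frame sprint is kept; the 20-frame sprint at the end is dropped because it lasts less than 3 seconds.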

**Space Measurement:** Voronoi tessellation at run start (t₀) and end (t₀+3s):
```
Space_Created = Voronoi_Area(ball_carrier, t₀+3s) - Voronoi_Area(ball_carrier, t₀)
```
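
As a rough illustration of the formula above, a single player's Voronoi cell can be computed without any geometry library by starting from the pitch rectangle and cutting it with the perpendicular bisector to every other player. This is a minimal sketch under stated assumptions, not the implementation in `src.space_analysis`: it assumes coordinates in metres with the origin at the centre spot on a 105 × 68 m pitch, and all function names are hypothetical.

```python
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # metres (assumed pitch size)


def clip_to_half_plane(polygon, a, b, c):
    """Keep the part of a convex polygon where a*x + b*y <= c."""
    clipped = []
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        d1 = a * x1 + b * y1 - c
        d2 = a * x2 + b * y2 - c
        if d1 <= 0:
            clipped.append((x1, y1))
        if (d1 < 0 < d2) or (d2 < 0 < d1):
            # The edge crosses the boundary line: add the intersection point.
            t = d1 / (d1 - d2)
            clipped.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return clipped


def polygon_area(polygon):
    """Shoelace formula; returns 0.0 for degenerate polygons."""
    if len(polygon) < 3:
        return 0.0
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1])
    )
    return abs(twice_area) / 2.0


def voronoi_cell_area(target, others, length=PITCH_LENGTH, width=PITCH_WIDTH):
    """Area of the pitch that is closer to `target` than to any point in `others`."""
    half_l, half_w = length / 2.0, width / 2.0
    cell = [(-half_l, -half_w), (half_l, -half_w), (half_l, half_w), (-half_l, half_w)]
    px, py = target
    for qx, qy in others:
        # Points x closer to p than to q satisfy 2(q - p)·x <= |q|² - |p|².
        a, b = 2.0 * (qx - px), 2.0 * (qy - py)
        c = qx * qx + qy * qy - px * px - py * py
        cell = clip_to_half_plane(cell, a, b, c)
    return polygon_area(cell)


# Toy example: one defender retreats from 10 m to 20 m away from the carrier.
carrier = (0.0, 0.0)
area_start = voronoi_cell_area(carrier, [(10.0, 0.0)])  # positions at t0
area_end = voronoi_cell_area(carrier, [(20.0, 0.0)])    # positions at t0 + 3 s
space_created = area_end - area_start
print(f"Space created: {space_created:.0f} m²")
```

In a real frame, `others` would hold every other tracked player, teammates and opponents alike, and the carrier's own position would generally differ between the two timestamps.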

**Normalization:** All runs normalized to left-to-right attack direction.
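
One simple way to perform this normalization is sketched below. It assumes a centre-origin coordinate system and a known attacking direction per team and period; both are assumptions for illustration, and the function names are hypothetical. Rotating by 180 degrees (negating both x and y) rather than mirroring x alone keeps each run on the same flank relative to the direction of attack.

```python
def normalize_point(x, y, attacks_left_to_right):
    """Return (x, y) unchanged, or rotated 180 degrees about the centre spot."""
    if attacks_left_to_right:
        return (x, y)
    return (-x, -y)


def normalize_trajectory(points, attacks_left_to_right):
    """Apply the same normalization to every (x, y) point of one run."""
    return [normalize_point(x, y, attacks_left_to_right) for x, y in points]


# A run made while attacking right-to-left (x decreasing over time)...
raw_run = [(20.0, 15.0), (10.0, 18.0), (5.0, 20.0)]
# ...reads as a left-to-right run (x increasing) after normalization.
normalized_run = normalize_trajectory(raw_run, attacks_left_to_right=False)
print(normalized_run)  # [(-20.0, -15.0), (-10.0, -18.0), (-5.0, -20.0)]
```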

In [5]:
# Load and analyze all matches
matches = load_matches_info(MATCH_IDS)
trajectories = analyze_all_matches_normalized(matches, load_match_data, 
                                              velocity_threshold=VELOCITY_THRESHOLD)

print(f"\n{'='*80}")
print("ANALYSIS RESULTS")
print(f"{'='*80}")

# Overall statistics
total_runs = len(trajectories)
avg_runs_per_match = total_runs / len(MATCH_IDS)
total_space = trajectories['total_space_created'].sum()
avg_space_per_run = trajectories['total_space_created'].mean()
pitch_area = 105 * 68  # 7,140 m²
pct_of_pitch = (avg_space_per_run / pitch_area) * 100
avg_velocity = trajectories['max_velocity'].mean()
avg_velocity_kmh = avg_velocity * 3.6
avg_duration_seconds = trajectories['duration_frames'].mean() / 10  # frames -> seconds at 10 Hz

print(f"\nTotal runs detected: {total_runs}")
print(f"Average runs per match: {avg_runs_per_match:.1f}")
print(f"Total space created: {total_space:,.0f} m²")
print(f"Average space per run: {avg_space_per_run:,.0f} m² ({pct_of_pitch:.1f}% of pitch)")

# Calculate Positive Runs statistics
positive_runs = trajectories[trajectories['total_space_created'] > 0]
positive_total_space = positive_runs['total_space_created'].sum()
positive_avg_space = positive_runs['total_space_created'].mean()
positive_count = len(positive_runs)
positive_pct = (positive_count / total_runs) * 100

print(f"\n--- POSITIVE SPACE RUNS ({positive_count} runs, {positive_pct:.1f}%) ---")
print(f"Average space change: {positive_avg_space:,.0f} m²")
print(f"Net space effect: {positive_total_space:,.0f} m²")


=== ANALYZING ALL MATCHES (threshold: 5.0 m/s) ===



✓ Detected 12,920 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 12,920 run frames (team possession filter)...


✓ Analyzed 199 off-ball runs (space for ball carrier only)
✓ Detected 12,068 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 30 unique players

Analyzing 12,068 run frames (team possession filter)...


✓ Analyzed 362 off-ball runs (space for ball carrier only)
✓ Detected 11,520 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 11,520 run frames (team possession filter)...


✓ Analyzed 293 off-ball runs (space for ball carrier only)
✓ Detected 12,030 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 28 unique players

Analyzing 12,030 run frames (team possession filter)...


✓ Analyzed 129 off-ball runs (space for ball carrier only)
✓ Detected 13,048 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 13,048 run frames (team possession filter)...


✓ Analyzed 308 off-ball runs (space for ball carrier only)
✓ Detected 12,393 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 27 unique players

Analyzing 12,393 run frames (team possession filter)...


✓ Analyzed 250 off-ball runs (space for ball carrier only)
✓ Detected 10,335 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 21 unique players

Analyzing 10,335 run frames (team possession filter)...


✓ Analyzed 97 off-ball runs (space for ball carrier only)
✓ Detected 12,230 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 26 unique players

Analyzing 12,230 run frames (team possession filter)...


✓ Analyzed 310 off-ball runs (space for ball carrier only)
✓ Detected 9,544 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 21 unique players

Analyzing 9,544 run frames (team possession filter)...


✓ Analyzed 256 off-ball runs (space for ball carrier only)
✓ Detected 6,680 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 28 unique players

Analyzing 6,680 run frames (team possession filter)...


✓ Analyzed 195 off-ball runs (space for ball carrier only)
✓ Detected 8,340 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 22 unique players

Analyzing 8,340 run frames (team possession filter)...


✓ Analyzed 63 off-ball runs (space for ball carrier only)
✓ Detected 8,292 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 26 unique players

Analyzing 8,292 run frames (team possession filter)...


✓ Analyzed 176 off-ball runs (space for ball carrier only)
✓ Detected 8,364 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 8,364 run frames (team possession filter)...


✓ Analyzed 211 off-ball runs (space for ball carrier only)
✓ Detected 9,318 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 29 unique players

Analyzing 9,318 run frames (team possession filter)...


✓ Analyzed 335 off-ball runs (space for ball carrier only)
✓ Detected 13,459 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 13,459 run frames (team possession filter)...


✓ Analyzed 401 off-ball runs (space for ball carrier only)
✓ Detected 13,513 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 30 unique players

Analyzing 13,513 run frames (team possession filter)...


✓ Analyzed 265 off-ball runs (space for ball carrier only)


✓ Detected 12,759 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 12,759 run frames (team possession filter)...


✓ Analyzed 205 off-ball runs (space for ball carrier only)
✓ Detected 14,530 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 28 unique players

Analyzing 14,530 run frames (team possession filter)...


✓ Analyzed 340 off-ball runs (space for ball carrier only)
✓ Detected 13,953 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 20 unique players

Analyzing 13,953 run frames (team possession filter)...


✓ Analyzed 220 off-ball runs (space for ball carrier only)
✓ Detected 15,069 frames with velocity >= 5.0 m/s and duration >= 3.0s
✓ From 25 unique players

Analyzing 15,069 run frames (team possession filter)...


✓ Analyzed 336 off-ball runs (space for ball carrier only)

✓ Total runs: 855
✓ Avg velocity: 6.31 m/s
✓ Avg space created: -127 m²

ANALYSIS RESULTS

Total runs detected: 855
Average runs per match: 85.5
Total space created: -108,823 m²
Average space per run: -127 m² (-1.8% of pitch)

--- POSITIVE SPACE RUNS (412 runs, 48.2%) ---
Average space change: 434 m²
Net space effect: 178,639 m²


## Results

### Overall Findings
Across 10 matches, **855 distinct off-ball runs** were detected (average 85.5 runs/match).

**Overall Metrics:**
- Average space change: **-127 m²** per run
- Net space effect: **-108,823 m²** across all runs

**Positive Space Runs (412 runs, 48.2%):**
- Average space change: **434 m²** per run
- Total space gained: **178,639 m²**

**Interpretation:** On average, the ball carrier's Voronoi area shrinks by 127 m² over the course of an off-ball run. A plausible explanation, not tested here, is that many runs occur in congested areas where defensive pressure increases at the same time. Still, 48.2% of runs coincide with a gain in space (434 m² on average), which suggests that effective space creation depends on coordinated timing and positioning that manipulate the defensive structure, not on running volume alone.
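
The figures for the remaining runs are not printed, but they follow from the reported totals. The snippet below is plain arithmetic on the output above, not additional analysis. The inputs are rounded to the nearest m², so the derived average is approximate, and "non-positive" here also covers any runs with exactly zero change.

```python
total_runs = 855
total_space = -108_823    # m², all runs
positive_runs = 412
positive_space = 178_639  # m², runs with a positive change

non_positive_runs = total_runs - positive_runs
non_positive_space = total_space - positive_space
non_positive_avg = non_positive_space / non_positive_runs

print(f"Non-positive runs: {non_positive_runs} ({non_positive_runs / total_runs:.1%})")
print(f"Total space change: {non_positive_space:,} m²")
print(f"Average space change: {non_positive_avg:,.0f} m²")
# Non-positive runs: 443 (51.8%)
# Total space change: -287,462 m²
# Average space change: -649 m²
```

On these numbers, the average loss on a non-positive run (about 649 m²) is roughly one and a half times the average gain on a positive run (434 m²), which is why the overall mean is negative despite the near-even split.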

In [6]:
# Extract player names and calculate statistics
all_player_names = {}
for match_id in trajectories['match_id'].unique():
    data = load_match_data(match_id)
    home_team = data['metadata']['home_team']
    away_team = data['metadata']['away_team']
    for p in data['metadata']['players']:
        # Resolve the player's team name from the home/away metadata
        team_name = home_team['name'] if p['team_id'] == home_team['id'] else away_team['name']
        all_player_names[p['id']] = {
            'name': f"{p.get('first_name', '')} {p.get('last_name', '')}".strip(),
            'team': team_name
        }

player_stats = trajectories.groupby('player_id').agg({
    'total_space_created': ['count', 'sum', 'mean'],
    'match_id': 'nunique'
}).reset_index()
player_stats.columns = ['player_id', 'num_runs', 'total_space', 'avg_space_per_run', 'matches']
player_stats['runs_per_match'] = player_stats['num_runs'] / player_stats['matches']
player_stats['space_per_match'] = player_stats['total_space'] / player_stats['matches']

print(f"\n{'='*80}\nTOP PERFORMERS\n{'='*80}")

for i, (metric, label) in enumerate([
    ('runs_per_match', 'Most Runs/Match'),
    ('space_per_match', 'Most Space/Match'),
], 1):
    top = player_stats.nlargest(1, metric).iloc[0]
    pinfo = all_player_names.get(int(top['player_id']), {'name': f"Player {int(top['player_id'])}", 'team': 'Unknown'})
    print(f"\n{i}. {label}: {pinfo['name']} ({pinfo['team']})")
    print(f"   {top[metric]:,.1f} {metric.split('_')[0]}/match ({int(top['matches'])} matches)")

# Most efficient (min 5 runs)
efficient = player_stats[player_stats['num_runs'] >= 5].nlargest(1, 'avg_space_per_run').iloc[0]
pinfo = all_player_names.get(int(efficient['player_id']), {'name': f"Player {int(efficient['player_id'])}", 'team': 'Unknown'})
print(f"\n3. Most Efficient (min 5 runs): {pinfo['name']} ({pinfo['team']})")
print(f"   {efficient['avg_space_per_run']:,.0f} m²/run ({int(efficient['matches'])} matches)")


TOP PERFORMERS

1. Most Runs/Match: Ivan Vujica (Macarthur FC)
   11.0 runs/match (1 matches)

2. Most Space/Match: Joshua Rawlins (Melbourne Victory Football Club)
   9,445.3 space/match (1 matches)

3. Most Efficient (min 5 runs): Corban Piper (Wellington Phoenix FC)
   1,094 m²/run (1 matches)


## Visualizations

### Figure 1: Voronoi Tessellation Methodology
![Voronoi Example](figs/voronoi_example.png)

Voronoi tessellation showing controlled space per player. The ball carrier (gold square) gains exploitable space when teammates make off-ball runs, pulling defenders away.

### Figure 2: Run Trajectory Patterns
![Trajectory Visualization](figs/trajectory_visualization.png)

All off-ball runs by Melbourne Victory, shown as normalized trajectories. Green markers indicate the start of each run and red markers the end. The arrows reveal coordinated movement patterns that create space in the attacking third.

## Conclusion

This transparent, Voronoi-based approach quantifies off-ball run effectiveness using broadcast tracking data. High-speed off-ball runs are frequent (85.5 per match), but only 48.2% of them increase the space available to the ball carrier.

### Key Finding
The nearly even split between positive and negative space runs (48.2% vs 51.8%) suggests that running volume alone does not guarantee tactical effectiveness. It points instead to the importance of coordinated timing and positioning that manipulate the defensive structure.

### Limitations
The methodology uses strict detection criteria (≥5.0 m/s, i.e. 18 km/h, sustained for at least 3 seconds), which may exclude lower-intensity runs that still create tactical value. The 3-second measurement window may also be too short to capture delayed defensive reactions. Future work should test multiple velocity thresholds and time windows to identify optimal detection parameters.
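
A sensitivity check of this kind could start from a simple grid sweep over the detection parameters. The sketch below uses a synthetic speed series and hypothetical helper names. It only varies the detection side (threshold and minimum duration); the Voronoi measurement window would need to be varied separately inside the actual pipeline.

```python
from itertools import product


def count_runs(speeds, threshold, min_frames):
    """Count stretches of at least `min_frames` consecutive frames at or above `threshold`."""
    count, streak = 0, 0
    for speed in list(speeds) + [float("-inf")]:  # sentinel closes a trailing streak
        if speed >= threshold:
            streak += 1
        else:
            if streak > 0 and streak >= min_frames:
                count += 1
            streak = 0
    return count


def sweep(speeds, thresholds, durations_s, fps=10):
    """Map each (threshold, duration) pair to the number of runs it would detect."""
    return {
        (thr, dur): count_runs(speeds, thr, int(round(dur * fps)))
        for thr, dur in product(thresholds, durations_s)
    }


# Synthetic speeds (m/s) at 10 Hz: a 4 s run at 4.5 m/s and a 2.5 s sprint at 6 m/s.
speeds = [2.0] * 20 + [4.5] * 40 + [2.0] * 10 + [6.0] * 25 + [2.0] * 5
results = sweep(speeds, thresholds=(4.0, 5.0), durations_s=(2.0, 3.0))
for (thr, dur), n in sorted(results.items()):
    print(f"threshold={thr} m/s, min duration={dur} s -> {n} run(s)")
```

In this toy series the criteria used in this notebook (5.0 m/s for 3 s) detect no runs at all, while relaxing either parameter recovers one or both efforts, which illustrates how sensitive run counts can be to these choices.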

### Impact
This interpretable framework enables coaches to distinguish between high-volume running and tactically effective movements that actually create exploitable space. The methodology provides actionable insights for tactical preparation and player development without requiring complex machine learning infrastructure.

### Key Contributions
- Transparency-first methodology: No black-box ML, fully interpretable metrics
- Counterintuitive finding: Nearly half of runs create positive space despite a negative overall average
- Actionable insights: Quantifies previously unmeasured tactical elements

---

**Author:** Ivo Steinke  
I am an Analytics Engineer and Data Science instructor with an M.Sc. in Applied Data Science from Georg-August-University Göttingen. I combine eight years of Python experience with a competitive football background, specializing in sports analytics and machine learning applications in tactical analysis.