# What is Pressure in Football?

According to [Wikipedia](https://en.wikipedia.org/wiki/Glossary_of_association_football_terms#P), **pressing** is a tactic of **defending players moving forward towards the ball, rather than remaining in position** near their goal. They may pressure the player who has the ball or get close to other opponents in order to remove passing options. A successful press recovers the ball quickly and further up the pitch, or forces the opponents into an inaccurate long kick. However, if the opponents manage to pass the ball forward, fewer defending players are left protecting the goal, which makes pressing a high-risk, high-reward strategy.

According to [Soccer Pilot](https://www.soccerpilot.com/tactic/articles/soccer-tactic-types-of-pressing.html), there are three types of pressing: midfield, high and low.
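
As a rough illustration of the three types, the sketch below labels a pressure event by the third of the pitch in which it occurs. It assumes a 120-unit pitch length with the pressing team attacking towards x = 120, which is what the half-way threshold of 60 used later in this notebook implies. The cut-offs at 40 and 80 and the helper name `pressing_zone` are illustrative assumptions, not definitions taken from Soccer Pilot.

```python
import pandas as pd


def pressing_zone(x: pd.Series) -> pd.Series:
    """Label each x-coordinate as a low, midfield or high press.

    Assumes x runs from 0 (own goal line) to 120 (opponent goal line).
    The thirds at 40 and 80 are illustrative cut-offs, not official ones.
    """
    return pd.cut(x, bins=[0, 40, 80, 120],
                  labels=["Low", "Midfield", "High"],
                  include_lowest=True)


# Made-up x-coordinates of four pressure events:
toy_x = pd.Series([12.0, 55.5, 97.3, 80.0])
zones = pressing_zone(toy_x)
print(zones.tolist())
```

A value that falls exactly on a cut-off (80.0 here) is assigned to the lower band, because `pd.cut` uses right-closed intervals by default.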

In [None]:
from IPython.display import Image

In [None]:
Image("../pics/soccer-midfield-pressing.jpg")  # PC: Soccer Pilot

In [None]:
Image("../pics/high-pressure-soccer-2.jpg")  # PC: Soccer Pilot

In [None]:
Image("../pics/low-pressure-soccer.jpg")  # PC: Soccer Pilot

# Import Required Libraries

In [None]:
# Data Manipulation libraries:
import numpy as np
import pandas as pd
from copy import deepcopy

# Plotting libraries
import mplsoccer
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import matplotlib.patheffects as path_effects

from matplotlib.patches import Arc
from plotly.subplots import make_subplots
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib.projections import get_projection_class

# Load the Data

In [None]:
eventDataLL1920 = pd.read_csv("../data/matchwise_events_data_updated.csv",
                              low_memory=False)

In [None]:
pd.set_option("display.max_columns", 50)
pd.set_option("display.max_rows", 100)

# Data Preparation

In [None]:
eventDataLL1920.columns[eventDataLL1920.columns.str.contains("pressure")]

In [None]:
eventDataLL1920[["type.id", "type.name"]].drop_duplicates()

In [None]:
eventDataLL1920[eventDataLL1920["type.id"] == 17]

In [None]:
eventDataLL1920[eventDataLL1920["type.id"] == 17].iloc[0].values

In [None]:
"""
Separate out the X and Y coordinates for pressure location
"""
# Start location for any action:
eventDataLL1920["startX"] = eventDataLL1920["location"]\
    .str.split(", ", expand=True)[0].str[1:].apply(pd.to_numeric)
eventDataLL1920["startY"] = eventDataLL1920["location"]\
    .str.split(", ", expand=True)[1].str[:-1].apply(pd.to_numeric)
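
The parsing step can be checked on a couple of made-up rows. The sketch below assumes that `location` is stored as a string of the form `"[x, y]"`, which is what the bracket handling in this cell suggests. The toy values and the names `toy` and `xy` are hypothetical.

```python
import pandas as pd

# Made-up events; the "[x, y]" string format is an assumption.
toy = pd.DataFrame({"location": ["[61.0, 40.1]", "[12.5, 79.0]"]})

# Strip the surrounding brackets, then split into two string columns.
xy = toy["location"].str.strip("[]").str.split(", ", expand=True)

# Convert each column from strings to numbers.
toy["startX"] = pd.to_numeric(xy[0])
toy["startY"] = pd.to_numeric(xy[1])
print(toy)
```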

In [None]:
pressureDataLL1920 = deepcopy(eventDataLL1920[eventDataLL1920["type.id"] == 17])

# Generating Pressure Maps

In [None]:
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#383838', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, ax = pitch.draw(figsize=(4, 6))

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, ax = pitch.draw(figsize=(4, 6))

# Calculate the share of pressure events in each positional zone
"""
The 'pitch.bin_statistic_positional' function
calculates binned statistics for the Juego de posición (position game) concept.
It uses scipy.stats.binned_statistic_2d.
"""
bin_statistic = pitch.bin_statistic_positional(x=pressureDataLL1920["startX"],
                                               y=pressureDataLL1920["startY"],
                                               statistic='count',
                                               positional='horizontal',
                                               normalize=True)


In [None]:
bin_statistic
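
To make `statistic='count'` with `normalize=True` more concrete, the sketch below computes the same kind of quantity, the share of events per zone, on a plain regular grid with NumPy. The 3 x 2 grid and the toy coordinates are assumptions for illustration. mplsoccer's positional zones are irregular, so this is only an analogy and not what the library does internally.

```python
import numpy as np

# Made-up pressure locations on a 120 x 80 pitch.
x = np.array([10.0, 30.0, 50.0, 70.0, 90.0, 110.0, 100.0, 95.0])
y = np.array([10.0, 70.0, 20.0, 60.0, 30.0, 50.0, 10.0, 75.0])

# Count events in a regular 3 x 2 grid (thirds along x, halves along y).
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=[3, 2], range=[[0, 120], [0, 80]])

# Normalise so that the cells sum to 1, i.e. each cell holds a share.
shares = counts / counts.sum()
print(shares)
```

Because the shares sum to 1, they can be formatted as percentages, which is how the heatmap labels later in this notebook are displayed.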

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, ax = pitch.draw(figsize=(4, 6))

# Calculate the share of pressure events in each positional zone
"""
The 'pitch.bin_statistic_positional' function
calculates binned statistics for the Juego de posición (position game) concept.
It uses scipy.stats.binned_statistic_2d.
"""
bin_statistic = pitch.bin_statistic_positional(pressureDataLL1920["startX"],
                                               pressureDataLL1920["startY"],
                                               statistic='count',
                                               positional='horizontal',
                                               normalize=True)
# Plot the Heatmap according to the positions selected above
pitch.heatmap_positional(bin_statistic,
                         ax=ax,
                         cmap='coolwarm',
                         edgecolors='#22312b')

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, ax = pitch.draw(figsize=(4, 6))

# Calculate the share of pressure events in each positional zone
"""
The 'pitch.bin_statistic_positional' function
calculates binned statistics for the Juego de posición (position game) concept.
It uses scipy.stats.binned_statistic_2d.
"""
bin_statistic = pitch.bin_statistic_positional(pressureDataLL1920["startX"],
                                               pressureDataLL1920["startY"],
                                               statistic='count',
                                               positional='vertical',
                                               normalize=True)
# Plot the Heatmap according to the positions selected above
pitch.heatmap_positional(bin_statistic,
                         ax=ax,
                         cmap='coolwarm',
                         edgecolors='#22312b')
# Plot the points at the exact location of where the pressure was applied:
pitch.scatter(pressureDataLL1920["startX"],
              pressureDataLL1920["startY"],
              c='white', s=1, ax=ax, alpha=0.5)

In [None]:
path_eff = [path_effects.Stroke(linewidth=3, foreground='black'),
            path_effects.Normal()]

In [None]:
path_eff

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, ax = pitch.draw(figsize=(4, 6))

# Calculate the share of pressure events in each positional zone
"""
The 'pitch.bin_statistic_positional' function
calculates binned statistics for the Juego de posición (position game) concept.
It uses scipy.stats.binned_statistic_2d.
"""
bin_statistic = pitch.bin_statistic_positional(pressureDataLL1920["startX"],
                                               pressureDataLL1920["startY"],
                                               statistic='count',
                                               positional='vertical',
                                               normalize=True)
# Plot the Heatmap according to the positions selected above
pitch.heatmap_positional(bin_statistic,
                         ax=ax,
                         cmap='coolwarm',
                         edgecolors='#22312b')
# Plot the points at the exact location of where the pressure was applied:
pitch.scatter(pressureDataLL1920["startX"],
              pressureDataLL1920["startY"],
              c='white', s=2, ax=ax, alpha=0.5)
# Add the Distribution count for each section of the pitch:
labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                             ax=ax, ha='center', va='center',
                             str_format='{:.0%}', path_effects=path_eff)

NOTE: For more on path effects, see the [Matplotlib Path Effects guide](https://matplotlib.org/stable/tutorials/advanced/patheffects_guide.html).

## Different types of Positional Sections

In [None]:
"""
Distribution of Pressure on a Pitch Map 
(Horizontal, Vertical and Full)
"""

# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010', line_zorder=2, line_color='#ffffff')
# Draw the pitch according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=1, ncols=3, title_height=0.08,
                     axis=False)

pitchPos = ["horizontal", "vertical", "full"]
for idx, ax in enumerate(axs["pitch"]):
    pos = pitchPos[idx]
    # Calculate the share of pressure events in each positional zone
    bin_statistic = pitch.bin_statistic_positional(pressureDataLL1920["startX"],
                                                   pressureDataLL1920["startY"],
                                                   statistic='count',
                                                   positional=pos,
                                                   normalize=True)
    # Plot the Heatmap according to the positions selected above
    pitch.heatmap_positional(bin_statistic,
                             ax=ax,
                             cmap='coolwarm',
                             edgecolors='#22312b')
    # Plot the points at the exact location of where the pressure was applied:
    pitch.scatter(pressureDataLL1920["startX"],
                  pressureDataLL1920["startY"],
                  c='white', s=1, ax=ax, alpha=0.3)
    # Add the Distribution count for each section of the pitch:
    labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                 ax=ax, ha='center', va='center',
                                 str_format='{:.0%}', path_effects=path_eff)
    ax.set_title(pos.capitalize())

# Add the figure title once, outside the loop, so it is not drawn three times:
axs['title'].text(0.5, 0.5, "Positional Pressure Maps", color='#dee6ea',
                  va='center', ha='center', path_effects=path_eff,
                  fontsize=25)

In [None]:
pitch = mplsoccer.VerticalPitch(line_color='#000009',
                                pitch_color='white',
                                line_zorder=2)
fig, ax = pitch.draw(figsize=(4, 6))
hexmap = pitch.hexbin(pressureDataLL1920["startX"],
                      pressureDataLL1920["startY"],
                      ax=ax, edgecolors='#f4f4f4',
                      gridsize=(5, 5), cmap="Reds")

# Analysing Pressure Maps

## Team-wise

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010',
                                line_color='#ffffff',
                                line_zorder=2)
# Draw the pitch grid according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=5, ncols=4,
                      axis=False, figheight=40,
                     space=0.1, grid_height=0.98, grid_width=0.9,
                     title_height=0, endnote_height=0)

teamIDs = pressureDataLL1920["team.id"].unique()
for idx, ax in enumerate(axs["pitch"].flat):
    # Get the data for the current team in the loop:
    teamData = pressureDataLL1920[pressureDataLL1920["team.id"] == teamIDs[idx]]

    # Calculate the share of pressure events in each positional zone:
    bin_statistic = pitch.bin_statistic_positional(teamData["startX"],
                                                   teamData["startY"],
                                                   statistic='count',
                                                   positional='full',
                                                   normalize=True)
    # Plot the Heatmap according to the positions selected above
    pitch.heatmap_positional(bin_statistic,
                             ax=ax,
                             cmap='Reds',
                             edgecolors='#22312b')
    # Plot the points at the exact location of where the pressure was applied:
    pitch.scatter(teamData["startX"],
                  teamData["startY"],
                  c='white', s=2, ax=ax, alpha=0.5)
    # Add the Distribution count for each section of the pitch:
    labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                 ax=ax, ha='center', va='center',
                                 str_format='{:.0%}', path_effects=path_eff)
    teamName = teamData["team.name"].unique().item()
    totPressure = len(teamData)
    ax.set_title(teamName + "\n Pressure Count: " + str(totPressure),
                 fontsize=25)


In [None]:
eventDataLL1920[(eventDataLL1920["match_id"] == 303707)
               & (eventDataLL1920["shot.outcome.name"] == "Goal")]

## Player-Wise

In [None]:
barcaPressureData = pressureDataLL1920[pressureDataLL1920["team.id"] == 217]

In [None]:
barcaPressureData["player.id"].nunique()

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010',
                                line_color='#ffffff',
                                line_zorder=2)
# Draw the pitch grid according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=7, ncols=4,
                      axis=False, figheight=40,
                     space=0.2, grid_height=0.98, grid_width=0.9,
                     title_height=0, endnote_height=0)

playerIDs = barcaPressureData["player.id"].unique()
for idx, ax in enumerate(axs["pitch"].flat):
    if idx < len(playerIDs):
        # Get the data for the current player in the loop:
        playerData = barcaPressureData[barcaPressureData["player.id"] == playerIDs[idx]]

        # Calculate the share of pressure events in each positional zone:
        bin_statistic = pitch.bin_statistic_positional(playerData["startX"],
                                                       playerData["startY"],
                                                       statistic='count',
                                                       positional='full',
                                                       normalize=True)
        # Plot the Heatmap according to the positions selected above
        pitch.heatmap_positional(bin_statistic,
                                 ax=ax,
                                 cmap='Reds',
                                 edgecolors='#22312b')
        # Plot the points at the exact location of where the pressure was applied:
        pitch.scatter(playerData["startX"],
                      playerData["startY"],
                      c='white', s=3, ax=ax)
        # Add the Distribution count for each section of the pitch:
        labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                     ax=ax, ha='center', va='center',
                                     str_format='{:.0%}', path_effects=path_eff)
        playerName = playerData["player.name"].unique().item()
        totPressure = len(playerData)
        positionName = playerData["position.name"].unique()[0]
        ax.set_title(playerName + "\n" + positionName + "\n Pressure Count: " + str(totPressure),
                     fontsize=15)

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010',
                                line_color='#ffffff',
                                line_zorder=2)
# Draw the pitch grid according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=7, ncols=4,
                      axis=False, figheight=40,
                     space=0.2, grid_height=0.98, grid_width=0.9,
                     title_height=0, endnote_height=0)

playerIDs = barcaPressureData["player.id"].unique()
for idx, ax in enumerate(axs["pitch"].flat):
    if idx < len(playerIDs):
        # Get the data for the current player in the loop:
        playerData = barcaPressureData[barcaPressureData["player.id"] == playerIDs[idx]]

        # Calculate the share of pressure events in each positional zone:
        bin_statistic = pitch.bin_statistic_positional(playerData["startX"],
                                                       playerData["startY"],
                                                       statistic='count',
                                                       positional='full',
                                                       normalize=True)
        # Plot the Heatmap according to the positions selected above
        pitch.heatmap_positional(bin_statistic,
                                 ax=ax,
                                 cmap='Reds',
                                 edgecolors='#22312b')
        # Plot the points at the exact location of where the pressure was applied:
        pitch.scatter(playerData["startX"],
                      playerData["startY"],
                      c='white', s=3, ax=ax)
        # Add the Distribution count for each section of the pitch:
        labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                     ax=ax, ha='center', va='center',
                                     str_format='{:.0%}', path_effects=path_eff)
        playerName = playerData["player.name"].unique().item()
        totPressure = len(playerData)
        positionName = playerData["position.name"].unique()[0]
        playerMatchMins =\
            playerData.dropna(subset=["minsPlayed"])\
                .drop_duplicates(subset=["match_id"])["minsPlayed"].sum().astype(int).item()
        ax.set_title(playerName + "\n" + positionName
                     + "\n Total: " + str(totPressure)
                     + " | Mins: " + str(playerMatchMins),
                     fontsize=15)

## El Clásico

In [None]:
elclassicoMatchIds = [303596, 303470]

In [None]:
pressureDataLL1920[pressureDataLL1920["match_id"].isin(elclassicoMatchIds)]

In [None]:
ecPressureData = deepcopy(pressureDataLL1920[pressureDataLL1920["match_id"].isin(elclassicoMatchIds)])

In [None]:
ecPressureData[["match_id", "team.id"]].drop_duplicates()

In [None]:
matchIDList = ecPressureData[["match_id", "team.id"]].drop_duplicates()["match_id"].tolist()
teamIDList = ecPressureData[["match_id", "team.id"]].drop_duplicates()["team.id"].tolist()

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010',
                                line_color='#ffffff',
                                line_zorder=2)
# Draw the pitch grid according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=2, ncols=2,
                      axis=False, figheight=15,
                     space=0.1, grid_height=0.98, grid_width=0.9,
                     title_height=0, endnote_height=0)

for idx, ax in enumerate(axs["pitch"].flat):
    # Get the data for the current team in the loop:
    teamData = ecPressureData[(ecPressureData["match_id"] == matchIDList[idx])
                              & (ecPressureData["team.id"] == teamIDList[idx])]

    # Calculate the share of pressure events in each positional zone:
    bin_statistic = pitch.bin_statistic_positional(teamData["startX"],
                                                   teamData["startY"],
                                                   statistic='count',
                                                   positional='full',
                                                   normalize=True)
    # Plot the Heatmap according to the positions selected above
    pitch.heatmap_positional(bin_statistic,
                             ax=ax,
                             cmap='coolwarm',
                             edgecolors='white')
    # Plot the points at the exact location of where the pressure was applied:
    pitch.scatter(teamData["startX"],
                  teamData["startY"],
                  c='white', s=3, ax=ax)
    # Add the Distribution count for each section of the pitch:
    labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                 ax=ax, ha='center', va='center',
                                 str_format='{:.0%}', path_effects=path_eff)
    teamName = teamData["team.name"].unique().item()
    totPressure = len(teamData)
    ax.set_title(str(matchIDList[idx]) + " | " + teamName
                 + "\n Pressure Count: " + str(totPressure),
                 fontsize=25)

In [None]:
eventDataLL1920[(eventDataLL1920["match_id"] == 303596)
               & (eventDataLL1920["shot.outcome.name"] == "Goal")]

In [None]:
eventDataLL1920[(eventDataLL1920["match_id"] == 303470)
               & (eventDataLL1920["shot.outcome.name"] == "Goal")]

In [None]:
resultsDict = {303470: "RMA 2-0 BAR", 303596: "BAR 0-0 RMA"}

In [None]:
""" Distribution of Pressure on a Pitch Map """
# Set the Pitch Parameters:
pitch = mplsoccer.VerticalPitch(pitch_color='#101010',
                                line_color='#ffffff',
                                line_zorder=2)
# Draw the pitch grid according to the set Pitch Parameters:
fig, axs = pitch.grid(nrows=2, ncols=2,
                      axis=False, figheight=15,
                     space=0.1, grid_height=0.98, grid_width=0.9,
                     title_height=0, endnote_height=0)

for idx, ax in enumerate(axs["pitch"].flat):
    # Get the data for the current team in the loop:
    teamData = ecPressureData[(ecPressureData["match_id"] == matchIDList[idx])
                              & (ecPressureData["team.id"] == teamIDList[idx])]

    # Calculate the share of pressure events in each positional zone:
    bin_statistic = pitch.bin_statistic_positional(teamData["startX"],
                                                   teamData["startY"],
                                                   statistic='count',
                                                   positional='full',
                                                   normalize=True)
    # Plot the Heatmap according to the positions selected above
    pitch.heatmap_positional(bin_statistic,
                             ax=ax,
                             cmap='coolwarm',
                             edgecolors='white')
    # Plot the points at the exact location of where the pressure was applied:
    pitch.scatter(teamData["startX"],
                  teamData["startY"],
                  c='white', s=3, ax=ax)
    # Add the Distribution count for each section of the pitch:
    labels = pitch.label_heatmap(bin_statistic, color='white', fontsize=20,
                                 ax=ax, ha='center', va='center',
                                 str_format='{:.0%}', path_effects=path_eff)
    teamName = teamData["team.name"].unique().item()
    totPressure = len(teamData)
    result = resultsDict[matchIDList[idx]]
    ax.set_title(str(matchIDList[idx]) + " | " + teamName
                 + "\n" + result
                 + "\n Pressure Count: " + str(totPressure),
                 fontsize=15)

# Passes per Defensive Action (PpDA)

In [None]:
pressureDataLL1920.columns

In [None]:
pressureDataLL1920[["team.id", "home_team.home_team_id", "away_team.away_team_id"]]

In [None]:
eventDataLL1920["opponent_team.id"] =\
    np.where(eventDataLL1920["team.id"] == eventDataLL1920["home_team.home_team_id"],
         eventDataLL1920["away_team.away_team_id"],
         eventDataLL1920["home_team.home_team_id"])

In [None]:
eventDataLL1920["opponent_team.name"] =\
    np.where(eventDataLL1920["team.name"] == eventDataLL1920["home_team.home_team_name"],
         eventDataLL1920["away_team.away_team_name"],
         eventDataLL1920["home_team.home_team_name"])
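
The opponent lookup can be sanity-checked on a few made-up rows. The sketch below reuses the column names from this notebook, but the team ids 1 and 2 and the frame `toy` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up events from one match between team 1 (home) and team 2 (away).
toy = pd.DataFrame({
    "team.id": [1, 2, 1],
    "home_team.home_team_id": [1, 1, 1],
    "away_team.away_team_id": [2, 2, 2],
})

# If the acting team is the home team, the opponent is the away team;
# otherwise the opponent is the home team.
toy["opponent_team.id"] = np.where(
    toy["team.id"] == toy["home_team.home_team_id"],
    toy["away_team.away_team_id"],
    toy["home_team.home_team_id"])
print(toy["opponent_team.id"].tolist())
```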

In [None]:
passCond = eventDataLL1920["type.id"] == 30

In [None]:
# Passes made in the passing team's own half (x < 60 on the 120-unit pitch):
oppoHalfCond = eventDataLL1920["startX"] < 60

In [None]:
eventDataLL1920[passCond & oppoHalfCond].groupby(["opponent_team.id"]).agg({"opponent_team.name": "first",
                                                                            "type.name": "count"})

In [None]:
oppoPassData =\
    eventDataLL1920[passCond & oppoHalfCond].groupby(["opponent_team.id"]).agg({"opponent_team.name": "first",
                                                                                "type.name": "count"})

In [None]:
eventDataLL1920[["type.id", "type.name"]].drop_duplicates()

In [None]:
eventDataLL1920[eventDataLL1920["type.id"].isin([4, 10, 22])]

In [None]:
# Defensive actions: Duel (4), Interception (10) and Foul Committed (22)
defCond = eventDataLL1920["type.id"].isin([4, 10, 22])

In [None]:
# Defensive actions made in the opposition half (x > 60):
oppoHalfCondDef = eventDataLL1920["startX"] > 60

In [None]:
eventDataLL1920[defCond & oppoHalfCondDef].groupby(["team.id"])["type.name"].count()

In [None]:
defData =\
    eventDataLL1920[defCond & oppoHalfCondDef].groupby(["team.id"])["type.name"].count()

In [None]:
pd.concat([oppoPassData, defData], axis=1)

In [None]:
ppdaData = pd.concat([oppoPassData, defData], axis=1)

In [None]:
ppdaData.columns = ["teamName", "oppoPasses", "defActions"]

In [None]:
ppdaData

In [None]:
ppdaData["PpDA"] = ppdaData["oppoPasses"].divide(ppdaData["defActions"]).round(1)
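
PpDA divides the number of passes a team allows its opponent by the number of defensive actions the team makes, so a lower value indicates more intense pressing. The sketch below works through that ratio on made-up events. The zone thresholds (opponent passes with x < 60, defensive actions with x > 60, each in the acting team's own attacking direction) are stated here as assumptions, and the frame `events` is hypothetical.

```python
import pandas as pd

# Made-up events: type.id 30 = pass, 4 / 10 / 22 = defensive actions
# (the same ids used in this notebook). Team 1 plays team 2.
events = pd.DataFrame({
    "team.id":          [2, 2, 2, 2, 2, 2, 1, 1, 1, 2],
    "opponent_team.id": [1, 1, 1, 1, 1, 1, 2, 2, 2, 1],
    "type.id":          [30, 30, 30, 30, 30, 30, 4, 10, 22, 4],
    "startX":           [20, 35, 50, 10, 55, 90, 70, 95, 30, 80],
})

# Assumed zones: passes made within the passer's own 60 units of the
# pitch, and defensive actions made beyond x = 60 by the defending team.
is_pass = (events["type.id"] == 30) & (events["startX"] < 60)
is_def = events["type.id"].isin([4, 10, 22]) & (events["startX"] > 60)

# Passes are credited to the team that faced them (the opponent id);
# defensive actions are credited to the team that made them.
oppo_passes = events[is_pass].groupby("opponent_team.id").size()
def_actions = events[is_def].groupby("team.id").size()

# The division aligns on team id. Team 2 faced no qualifying passes in
# this toy set, so its PpDA is NaN.
ppda = (oppo_passes / def_actions).round(1)
print(ppda)
```

In this toy set team 1 allows five qualifying passes and makes two qualifying defensive actions, giving a PpDA of 2.5.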

In [None]:
ppdaData

In [None]:
px.bar(data_frame=ppdaData, x="teamName", y="PpDA")

In [None]:
# Styler.set_precision was removed in pandas 2.0; format(precision=...) replaces it
ppdaData.style.bar(subset=["PpDA"], color="maroon").format(precision=1)

# PpDA (with Pressure)

In [None]:
eventDataLL1920[["type.id", "type.name"]].drop_duplicates()

In [None]:
# Defensive actions plus Pressure (17): Duel (4), Interception (10), Foul Committed (22)
defCond = eventDataLL1920["type.id"].isin([4, 10, 17, 22])

In [None]:
eventDataLL1920[defCond & oppoHalfCondDef].groupby(["team.id"])["type.name"].count()

In [None]:
defPData =\
    eventDataLL1920[defCond & oppoHalfCondDef].groupby(["team.id"])["type.name"].count()

In [None]:
pd.concat([ppdaData, defPData], axis=1)

In [None]:
ppdaWithPressure = pd.concat([ppdaData, defPData], axis=1)

In [None]:
ppdaWithPressure.rename(columns={"type.name": "defPActions"},
                        inplace=True)

In [None]:
ppdaWithPressure

In [None]:
ppdaWithPressure["PpPA"] = ppdaWithPressure["oppoPasses"].divide(ppdaWithPressure["defPActions"]).round(1)

In [None]:
# Styler.set_precision was removed in pandas 2.0; format(precision=...) replaces it
ppdaWithPressure.style.bar(subset=["PpDA", "PpPA"], color="maroon").format(precision=1)