# Extraction of Kos Angle from Event Data
This Jupyter notebook explores the data processing carried out in the scientific paper titled:
The Kos Angle, an optimizing parameter for football expected goals (xG).


Open event data provided by StatsBomb contains structured information about the ball events and actions that occur during a football match. Here is a brief overview of the data usually included:

-  **Event type:** such as passes, shots, tackles, fouls, interceptions, and more.

-  **Location coordinates:** X and Y coordinates of where the event occurred or ended.

-  **Timestamps:** Timing information, including when each event occurred during the match.

-  **Match and team information:** Details about the match, teams involved, and additional metadata.

-  **Outcome and result:** Outcome of events, such as goals scored, successful passes, or unsuccessful tackles.

For the purposes of this notebook, it is essential to know that we mainly operate on shot freeze frames. A freeze frame is attached to every shot and reveals the location of all players within the camera frame, attacking or defending, including the goalkeeper.
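To make the freeze-frame structure concrete, here is a minimal sketch that splits one frame into teammates, opponents, and the goalkeeper. The two entries are copied from the freeze-frame output shown later in this notebook; the helper `split_freeze_frame` is a hypothetical name introduced only for illustration.

```python
# Two entries copied from the freeze-frame output shown later in this notebook.
freeze_frame = [
    {"location": [97.0, 48.0],
     "player": {"id": 17275, "name": "Hannah Jayne Blundell"},
     "position": {"id": 12, "name": "Right Midfield"},
     "teammate": True},
    {"location": [120.0, 26.0],
     "player": {"id": 4637, "name": "Ellie Roebuck"},
     "position": {"id": 1, "name": "Goalkeeper"},
     "teammate": False},
]


def split_freeze_frame(frame):
    """Hypothetical helper: group freeze-frame locations by role.

    Teammates of the shooter are checked first, then the opposing
    goalkeeper; every other entry is an outfield opponent.
    """
    groups = {"teammates": [], "opponents": [], "goalkeeper": []}
    for entry in frame:
        xy = tuple(entry["location"])
        if entry["teammate"]:
            groups["teammates"].append(xy)
        elif entry["position"]["name"] == "Goalkeeper":
            groups["goalkeeper"].append(xy)
        else:
            groups["opponents"].append(xy)
    return groups


groups = split_freeze_frame(freeze_frame)
print(groups)
```

The same three-way split (teammate, opposing goalkeeper, other opponent) is what the notebook later uses to assign a width to each player.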

In [1]:
import json 
import math 
import numpy as np 
import pandas as pd 
import geopandas as gpd 

from mplsoccer import Pitch
from shapely.geometry import MultiPoint, Polygon, Point 

## *1. Loading and visualizing shot freeze frames*

In [2]:
# load the StatsBomb events of a single match
with open('data/events/7298.json') as file:
    data = json.load(file)

# keep only the shot events that carry a freeze frame
shots_list = []
for item in data:
    shot = item.get('shot')
    location = item.get('location')
    id_ = item.get('id')
    if shot and 'freeze_frame' in shot:
        shots_list.append((id_, shot['freeze_frame'], location))

shots_df = pd.DataFrame(data=shots_list, columns=['id', 'freeze_frame', 'location'])

In [3]:
# extract the shot location coordinates into separate X and Y columns
shots_df[['X','Y']] = shots_df['location'].apply(lambda x: pd.Series(x, index=['X', 'Y']))
shots_df.head()

| | id | freeze_frame | location | X | Y |
|---|---|---|---|---|---|
| 0 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | [{'location': [97.0, 48.0], 'player': {'id': 1... | [115.0, 25.0] | 115.0 | 25.0 |
| 1 | 25dace9c-6bf8-4ada-8a4f-bad0485141c9 | [{'location': [106.0, 42.0], 'player': {'id': ... | [109.0, 51.0] | 109.0 | 51.0 |
| 2 | 5e58cab7-75c2-47f8-903c-2874de6ed5b0 | [{'location': [111.0, 44.0], 'player': {'id': ... | [99.0, 52.0] | 99.0 | 52.0 |
| 3 | 624a8c1d-b775-4a4f-85e8-a516aed3f3a5 | [{'location': [105.0, 35.0], 'player': {'id': ... | [107.0, 40.0] | 107.0 | 40.0 |
| 4 | 3f0fc8e9-a09f-480a-9396-132e1ca05ec5 | [{'location': [103.0, 36.0], 'player': {'id': ... | [108.0, 32.0] | 108.0 | 32.0 |


In [21]:
shots_df.freeze_frame[0]

[{'location': [97.0, 48.0],
  'player': {'id': 17275, 'name': 'Hannah Jayne Blundell'},
  'position': {'id': 12, 'name': 'Right Midfield'},
  'teammate': True},
 {'location': [113.0, 38.0],
  'player': {'id': 4638, 'name': 'Drew Spence'},
  'position': {'id': 15, 'name': 'Left Center Midfield'},
  'teammate': True},
 {'location': [112.0, 28.0],
  'player': {'id': 4649, 'name': 'Esme Beth Morgan'},
  'position': {'id': 2, 'name': 'Right Back'},
  'teammate': False},
 {'location': [103.0, 50.0],
  'player': {'id': 4635, 'name': 'Julia Spetsmark'},
  'position': {'id': 21, 'name': 'Left Wing'},
  'teammate': False},
 {'location': [120.0, 26.0],
  'player': {'id': 4637, 'name': 'Ellie Roebuck'},
  'position': {'id': 1, 'name': 'Goalkeeper'},
  'teammate': False},
 {'location': [109.0, 39.0],
  'player': {'id': 4645, 'name': 'Isobel Mary Christiansen'},
  'position': {'id': 15, 'name': 'Left Center Midfield'},
  'teammate': False},
 {'location': [117.0, 31.0],
  'player': {'id': 4648, 'name

## *2. Identifying players who interfere with the shot angle*

We construct shot angle polygons from player and goal post coordinates, then assess player interference by checking for intersections with these polygons. The result is a boolean value indicating whether each player interferes.
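As an illustration of the interference test, the sketch below checks a single point against a single triangle without geopandas or shapely. It assumes the triangle's apex is the shot location and swaps the polygon intersection for a plain signed-area test. The function names and the example coordinates are hypothetical; the goal post coordinates are the ones used later in this notebook.

```python
# Goal posts in StatsBomb pitch coordinates, as used later in this notebook.
POST_A = (120.0, 36.0)
POST_B = (120.0, 44.0)


def _cross(o, a, b):
    """Twice the signed area of the triangle o-a-b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def is_in_shot_triangle(shooter, player):
    """Return True if `player` lies inside or on the shooter-post-post triangle.

    The point is inside when it sits on the same side of all three edges,
    i.e. the three signed areas never have opposite signs.
    """
    d1 = _cross(shooter, POST_A, player)
    d2 = _cross(POST_A, POST_B, player)
    d3 = _cross(POST_B, shooter, player)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)


# Illustrative coordinates (assumed, not taken from the match data).
shooter = (108.0, 40.0)
central_defender = (114.0, 40.0)   # directly between shooter and goal
wide_defender = (112.0, 28.0)      # well outside the shooting cone

print(is_in_shot_triangle(shooter, central_defender))
print(is_in_shot_triangle(shooter, wide_defender))
```

Points lying exactly on an edge count as interfering here, which matches the behaviour of an `intersects` check on a closed polygon.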

In [19]:
events_id= list(shots_df.id)
shot_frames= list(shots_df['freeze_frame'])

rows=[]
for event_id, frame in zip(events_id, shot_frames):
    for player in frame:
        rows.append((event_id, player['location'][0], player['location'][1],
                     player['position']['name'],player['teammate']))
df_shot_frame= pd.DataFrame(data= rows, columns=["id", "x", "y", "position", "teammate"])

statsbomb_pitch = Pitch()
vertices = np.zeros((len(df_shot_frame), 3, 2))
vertices[:, 1:, :] = statsbomb_pitch.goal_right
vertices[:, 0, :] = df_shot_frame[['x','y']].values
vertices = gpd.GeoSeries([Polygon(vert) for vert in vertices])
vertices = gpd.GeoDataFrame({'id': df_shot_frame['id'], 'shot_polygon': gpd.GeoSeries(vertices)})

player_positions = gpd.GeoSeries.from_xy(df_shot_frame['x'], df_shot_frame['y'])
player_positions = gpd.GeoDataFrame({'id': df_shot_frame['id'], 'position': player_positions,
                                    'Tactical position':df_shot_frame["position"], 'teammate':df_shot_frame["teammate"],
                                     'X_':df_shot_frame["x"], 'Y_':df_shot_frame["y"]
                                    })

player_positions = gpd.GeoDataFrame(player_positions.merge(vertices, on='id'))

# detect whether each player intersects the shot angle
player_positions['does player interfer with the goal angle?'] = player_positions['position'].intersects(player_positions['shot_polygon'])

# drop players who don't interfere with the shot angle
player_positions= player_positions[player_positions['does player interfer with the goal angle?']]
player_positions.head()

| | id | position | Tactical position | teammate | X_ | Y_ | shot_polygon | does player interfer with the goal angle? |
|---|---|---|---|---|---|---|---|---|
| 0 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | POINT (97.00000 48.00000) | Right Midfield | True | 97.0 | 48.0 | POLYGON ((97.00000 48.00000, 120.00000 44.0000... | True |
| 11 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | POINT (113.00000 38.00000) | Left Center Midfield | True | 113.0 | 38.0 | POLYGON ((113.00000 38.00000, 120.00000 44.000... | True |
| 15 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | POINT (113.00000 38.00000) | Left Center Midfield | True | 113.0 | 38.0 | POLYGON ((109.00000 39.00000, 120.00000 44.000... | True |
| 22 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | POINT (112.00000 28.00000) | Right Back | False | 112.0 | 28.0 | POLYGON ((112.00000 28.00000, 120.00000 44.000... | True |
| 33 | 9b82eaa3-2048-4157-aa9a-eabeb4fa0ebe | POINT (103.00000 50.00000) | Left Wing | False | 103.0 | 50.0 | POLYGON ((103.00000 50.00000, 120.00000 44.000... | True |


## *3. Angle calculation: Kos angle, shot angle, and angles occupied by players (excluding the shooter)*

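The shot angle $\alpha$ is the angle subtended by the two goal posts, $(LP, y_b)$ and $(LP, y_c)$, at the shot location $(x, y)$. The angle $\theta$ occupied by player $i$ is the angle subtended at the shot location by a segment of width $L_i$ centred on the player's location $(x_i, y_i)$. Both follow from the law of cosines: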

$$\alpha = \arccos\left(\frac{{2 \times (x - LP)^2 + (y - y_b)^2 + (y - y_c)^2 - (y_b - y_c)^2}}{{2 \times \sqrt{{((x - LP)^2 + (y - y_b)^2) \times ((x - LP)^2 + (y - y_c)^2)}}}}\right)$$


$$\theta = \arccos\left(\frac{(x - x_i)^2 + (y - y_i)^2 - \left(\frac{L_i}{2}\right)^2}{\sqrt{((x - x_i)^2 + (y - y_i - (L_i/2))^2) \times ((x - x_i)^2 + (y - y_i + (L_i/2))^2)}}\right)$$
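As a quick sanity check of the $\alpha$ formula, the sketch below evaluates it for one illustrative shot location and compares the result with an independent computation based on `atan2`. The function names and the chosen coordinates are assumptions for illustration; the goal line and post coordinates are the ones used in this notebook.

```python
import math

LP = 120.0             # x-coordinate of the goal line
Y_B, Y_C = 44.0, 36.0  # y-coordinates of the two goal posts


def shot_angle_law_of_cosines(x, y):
    """Shot angle in degrees, written as cos(alpha) = (a^2 + b^2 - c^2) / (2ab)."""
    a2 = (x - LP) ** 2 + (y - Y_B) ** 2   # squared distance to one post
    b2 = (x - LP) ** 2 + (y - Y_C) ** 2   # squared distance to the other post
    c2 = (Y_B - Y_C) ** 2                 # squared goal width
    return math.degrees(math.acos((a2 + b2 - c2) / (2 * math.sqrt(a2 * b2))))


def shot_angle_atan2(x, y):
    """Same angle from the difference of the two post bearings (valid for x < LP)."""
    to_b = math.atan2(Y_B - y, LP - x)
    to_c = math.atan2(Y_C - y, LP - x)
    return math.degrees(abs(to_b - to_c))


# A central shot 12 units from the goal line: cos(alpha) = 256 / 320 = 0.8.
alpha = shot_angle_law_of_cosines(108.0, 40.0)
print(round(alpha, 2))                           # about 36.87 degrees
print(round(shot_angle_atan2(108.0, 40.0), 2))   # same value
```

For this location the numerator is $2 \times 144 + 16 + 16 - 64 = 256$ and the denominator is $2 \times \sqrt{160 \times 160} = 320$, so $\alpha = \arccos(0.8)$.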


In [5]:
xb, yb = (120, 44)
xc, yc = (120, 36)
LP = 120

def calculate_shot_angle(x, y):
    numerator = 2 * (x - LP) ** 2 + (y - yb) ** 2 + (y - yc) ** 2 - (yb - yc) ** 2
    denominator = 2 * math.sqrt(((x - LP) ** 2 + (y - yb) ** 2) * ((x - LP) ** 2 + (y - yc) ** 2))
    shot_angle_radians = math.acos(numerator / denominator)
    angle = math.degrees(shot_angle_radians)
    return angle

def calculate_angle(xy, xi_yi_Li):
    x, y = xy
    xi, yi, Li = xi_yi_Li
    numerator = (x - xi) ** 2 + (y - yi) ** 2 - (Li / 2) ** 2
    denominator1 = (x - xi) ** 2 + (y - yi - (Li / 2)) ** 2
    denominator2 = (x - xi) ** 2 + (y - yi + (Li / 2)) ** 2
    try:
        angle_radians = math.acos(numerator / math.sqrt(denominator1 * denominator2))
        angle = math.degrees(angle_radians)
    except ValueError:
        # acos argument fell outside [-1, 1]; treat the player as occupying no angle
        angle = 0.0

    return angle

def determine_L(row):
    if row['teammate']:
        return 0.4
    elif row['Tactical position'] == 'Goalkeeper':
        return 1.4
    else:
        return 0.5
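To give a feel for the magnitudes involved, here is a small standalone sketch of the $\theta$ formula applied to the widths returned by `determine_L`. It assumes each player is modelled as a segment parallel to the goal line, which is what the formula implies. The function name `occupied_angle` and the coordinates are hypothetical.

```python
import math


def occupied_angle(shooter, player, width):
    """Angle in degrees subtended at `shooter` by a segment of `width`
    centred on `player` and aligned with the y-axis."""
    x, y = shooter
    xi, yi = player
    half = width / 2
    numerator = (x - xi) ** 2 + (y - yi) ** 2 - half ** 2
    denominator = math.sqrt(
        ((x - xi) ** 2 + (y - yi - half) ** 2)
        * ((x - xi) ** 2 + (y - yi + half) ** 2)
    )
    return math.degrees(math.acos(numerator / denominator))


# Illustrative coordinates (assumed, not taken from the match data).
shooter = (108.0, 40.0)
defender = occupied_angle(shooter, (114.0, 40.0), 0.5)   # opponent width
keeper = occupied_angle(shooter, (119.0, 40.0), 1.4)     # goalkeeper width

print(round(defender, 2))   # about 4.77 degrees
print(round(keeper, 2))     # about 7.28 degrees
```

When the player stands directly between the shooter and the goal, the result reduces to $2\arctan\left(\frac{L_i/2}{d}\right)$ with $d$ the distance to the player, which is a convenient way to check the formula by hand.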

In [6]:
# create a column for each player's width, based on position and whether the player is a teammate of the shooter
player_positions['L'] = player_positions.apply(determine_L, axis=1)

# merge the shot-level columns (freeze_frame, location, X, Y) back onto each player row
player_positions = gpd.GeoDataFrame(player_positions.merge(shots_df, on='id'))

In [7]:
def calculate_angle_for_row(row):
    xy = (row['X'], row['Y'])
    xi_yi_Li = (row['X_'], row['Y_'], row['L'])
    return calculate_angle(xy, xi_yi_Li)

# calculate the angle occupied by each player, as seen from the shot location
player_positions['angle'] = player_positions.apply(calculate_angle_for_row, axis=1)

In [8]:
columns_to_drop= ['position', 'Tactical position', 'teammate', 'shot_polygon', 
                  'does player interfer with the goal angle?', 'L', 'location' , 'X_' ,'Y_']
player_positions.drop(columns= columns_to_drop, inplace= True)

In [9]:
grouped_df = player_positions.groupby('id').agg({
    'freeze_frame': 'first',
    'X': 'first',  
    'Y': 'first',  
    'angle': 'sum'
}).reset_index()

In [22]:
grouped_df.head()

| | id | freeze_frame | X | Y | angle |
|---|---|---|---|---|---|
| 0 | 12092a46-bc36-4f00-91f6-767ef8601ae1 | [{'location': [109.0, 42.0], 'player': {'id': ... | 98.0 | 46.0 | 103.103861 |
| 1 | 13933d30-56e3-4900-b942-0ee01af8ed1f | [{'location': [115.0, 41.0], 'player': {'id': ... | 108.0 | 36.0 | 197.152094 |
| 2 | 23d74ba8-6e5d-4855-8b3a-81b310e9d35a | [{'location': [115.0, 36.0], 'player': {'id': ... | 102.0 | 50.0 | 95.84498 |
| 3 | 25dace9c-6bf8-4ada-8a4f-bad0485141c9 | [{'location': [106.0, 42.0], 'player': {'id': ... | 109.0 | 51.0 | 114.91452 |
| 4 | 28c07afb-53b7-497f-9bb0-1c586c27e2de | [{'location': [109.0, 31.0], 'player': {'id': ... | 109.0 | 39.0 | 157.702841 |
