# SkillCorner Open Data + kloppy

## Model for Corner Kick Identification
#### by Ana De Souza

As mentioned in the "Exploratory Data Analysis" notebook, we will build a model that attempts to identify a corner kick numerically.

#### **Importing Libraries**

In [1]:
from kloppy import skillcorner
import pandas as pd

#### **Loading the Dataset**

In [2]:
dataset = skillcorner.load_open_data(
    match_id=2068,
    sample_rate=1/10,
    limit=100,
    coordinates="skillcorner",
    include_empty_frames=False
)

df = dataset.to_pandas()



#### **Converting timestamps to mm:ss format**

In [3]:
def convert_to_mmss(seconds):
    minutes = int(seconds // 60)
    remaining_seconds = int(seconds % 60)
    return '{:02d}:{:02d}'.format(minutes, remaining_seconds)

df['timestamp'] = df['timestamp'].apply(convert_to_mmss)

#### **Filtering for certain conditions to identify a corner kick using ball coordinates**

In [4]:
filtered_ball_df = df[(df['ball_x'] < -52) | (df['ball_x'] > 52) |
        (df['ball_y'] < -33) | (df['ball_y'] > 33)
    ]
filtered_ball_df.sort_values('ball_x', ascending=False)

Unnamed: 0,period_id,timestamp,frame_id,ball_state,ball_owning_team_id,ball_x,ball_y,ball_z,home_37_x,home_37_y,...,home_anon_322_d,home_anon_322_s,away_anon_322_x,away_anon_322_y,away_anon_322_d,away_anon_322_s,home_anon_327_x,home_anon_327_y,home_anon_327_d,home_anon_327_s
88,1,02:03,3063,,145.0,11.392073,-34.899497,,35.518485,-5.243791,...,,,35.06367,1.722366,,,,,,


- We will conduct a quest across several matches to see whether this technique can numerically identify a corner kick.

**Quest Results**

- 2068 - no corners found, only throw-ins
- 2269 - "local variable referenced before assignment" error
- 2417 - no corners found, only throw-ins
- 2440 - no corners found, only goal kicks
- 2841 - no corners found
- 3442 - no corners found
- 3518 - no corners found
- 3749 - no corners found, only throw-ins
- 4039 - no corners found
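
The per-match checks listed above could be repeated with a small loop. The sketch below is only an illustration: `scan_matches`, `fake_loader` and `fake_finder` are hypothetical names, and the loader and finder are stand-ins rather than the real kloppy call or the filter used in this notebook. Catching the exception per match lets a failure like the one seen for 2269 be recorded without stopping the run.

```python
def scan_matches(match_ids, load_frames, find_candidates):
    """Run a candidate finder on each match and record a count or the error raised."""
    summary = {}
    for match_id in match_ids:
        try:
            frames = load_frames(match_id)
            summary[match_id] = len(find_candidates(frames))
        except Exception as error:
            # Record the failure and continue with the next match
            summary[match_id] = f"{type(error).__name__}: {error}"
    return summary


# Stand-in loader: returns (ball_x, ball_y) pairs, and fails for one match ID
def fake_loader(match_id):
    if match_id == 2269:
        raise UnboundLocalError("simulated loading failure")
    return [(52.4, 33.8), (0.0, 0.0)]


# Stand-in finder: keeps positions close to a corner of an assumed 105 x 68 pitch
def fake_finder(frames):
    return [(x, y) for x, y in frames if abs(x) > 51 and abs(y) > 32.5]


summary = scan_matches([2068, 2269], fake_loader, fake_finder)
```

In the notebook, `load_frames` would wrap the `skillcorner.load_open_data(...)` and `to_pandas()` calls from the loading cell.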

Unfortunately, we were unable to find a corner kick using ball data. To move the project forward, we will plot the pitch and events to check frame by frame for corner kicks, and we will create a model to see whether player data can identify a corner kick numerically.
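
The ball filter joins its four conditions with `|`, so it keeps any frame where the ball is near any boundary line. That would be consistent with throw-ins and goal kicks showing up in the results. A stricter variant, sketched below, requires the ball to be close to a goal line and a touchline at the same time. It assumes a 105 x 68 m pitch centred on the origin; `near_corner` and the 1.5 m `tolerance` are hypothetical choices, not part of the original analysis.

```python
import pandas as pd

# Assumed pitch size in metres (105 x 68), centred on the origin
HALF_LENGTH = 52.5
HALF_WIDTH = 34.0


def near_corner(frame_df, tolerance=1.5):
    """Return the rows where the ball is within `tolerance` metres of both
    a goal line and a touchline, i.e. close to one of the four corners."""
    near_goal_line = frame_df['ball_x'].abs() >= HALF_LENGTH - tolerance
    near_touchline = frame_df['ball_y'].abs() >= HALF_WIDTH - tolerance
    # Rows with missing ball coordinates compare as False and are dropped
    return frame_df[near_goal_line & near_touchline]


# Small made-up example: a throw-in position, a corner position,
# a ball on the goal line near the centre, and a frame with no ball data
example = pd.DataFrame({
    'ball_x': [11.39, 52.1, -52.3, None],
    'ball_y': [-34.9, -33.5, 0.0, None],
})
corner_candidates = near_corner(example)
```

Only the second row of `example` is kept. Even with this filter, a ball near the corner flag is not necessarily a corner kick, so the candidates would still need to be checked visually.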

#### **Data Cleaning and Prepping for Model and Viz**

In [5]:
row = df.loc[88]
non_empty_columns = row.notna()
non_empty_df = pd.DataFrame({'Column': non_empty_columns.index, 'Non-Empty': non_empty_columns.values})
non_empty_df

Unnamed: 0,Column,Non-Empty
0,period_id,True
1,timestamp,True
2,frame_id,True
3,ball_state,False
4,ball_owning_team_id,True
...,...,...
155,away_anon_322_s,False
156,home_anon_327_x,False
157,home_anon_327_y,False
158,home_anon_327_d,False


In [6]:
# Keep only the columns that hold a value in row 88
columns = non_empty_df.loc[non_empty_df['Non-Empty'], 'Column'].tolist()

In [7]:
home = [column for column in columns if 'home' in column]
away = [column for column in columns if 'away' in column]
players = home + away

**Observations**
- I filtered this row into its own df so that the frame and its x, y values are much easier to inspect, since the original df has a lot of columns
- It looks like we have 10 home players and 9 away players on the pitch in this frame
- I've also put the column names into lists so that they are easier to plot on a pitch and to use for filtering
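
The player count in the observations above can also be obtained programmatically. The sketch below is a minimal illustration, assuming player columns follow the `<team>_<id>_x` naming seen in the output; `tracked_players` and the example values are hypothetical.

```python
import pandas as pd


def tracked_players(row, team_prefix):
    """Return the sorted player prefixes of one team ('home' or 'away')
    that have a non-null x coordinate in the given row."""
    players = []
    for column, value in row.items():
        is_team_x = column.startswith(team_prefix + '_') and column.endswith('_x')
        if is_team_x and pd.notna(value):
            players.append(column[:-2])  # strip the trailing '_x'
    return sorted(players)


# Made-up row with one tracked home player, one untracked home player,
# and one tracked away player
example_row = pd.Series({
    'ball_x': 11.4,
    'home_37_x': 35.5,
    'home_37_y': -5.2,
    'home_2_x': None,
    'away_anon_322_x': 35.1,
    'away_anon_322_y': 1.7,
})
home_players = tracked_players(example_row, 'home')
away_players = tracked_players(example_row, 'away')
```

Calling `len(tracked_players(df.loc[88], 'home'))` on the real data would give the number of home players with coordinates in that row.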

#### **Filtering for certain conditions to identify a corner kick using player coordinates**

In [8]:
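# Keep frames where home_2 is near either goal line (|x| > 40)
# and within the width of the penalty area (-20 < y < 20)
filtered_players_df = df[((df['home_2_x'] < -40) | (df['home_2_x'] > 40)) &
        (df['home_2_y'] > -20) & (df['home_2_y'] < 20)
    ]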

filtered_players_df.sort_values('home_2_x', ascending=False)

Unnamed: 0,period_id,timestamp,frame_id,ball_state,ball_owning_team_id,ball_x,ball_y,ball_z,home_37_x,home_37_y,...,home_anon_322_d,home_anon_322_s,away_anon_322_x,away_anon_322_y,away_anon_322_d,away_anon_322_s,home_anon_327_x,home_anon_327_y,home_anon_327_d,home_anon_327_s
85,1,02:00,3033,,145.0,29.472164,-23.814743,3.15236,46.058268,-2.753479,...,,,,,,,,,,
84,1,01:59,3023,,145.0,34.925227,-21.540771,5.147768,46.793338,-2.199739,...,,,,,,,,,,
86,1,02:01,3043,,145.0,22.481727,-27.852126,0.887964,42.988223,-3.064134,...,,,,,,,,,,
83,1,01:58,3013,,145.0,36.730271,-22.898476,0.225993,43.401831,-1.457707,...,,,,,,,,,,
87,1,02:02,3053,,145.0,16.178981,-32.21437,0.246884,39.374126,-2.088705,...,,,38.06179,2.898468,,,,,,
82,1,01:57,3003,,145.0,,,,43.136971,0.85407,...,,,,,,,,,,
21,1,00:24,2073,,139.0,-33.107187,18.690752,2.82107,-35.842192,20.1284,...,,,,,,,,,,
22,1,00:25,2083,,139.0,-37.854674,5.93361,2.046888,-36.967743,18.818364,...,,,,,,,,,,
8,1,00:08,1913,,139.0,,,,-39.093412,-1.815034,...,,,,,,,,,,
9,1,00:09,1923,,139.0,,,,-43.700808,-1.790126,...,,,,,,,,,,


**Observations**
- I used y between -20 and 20 because I wanted to see the players prepping for the corner kick in the penalty area
- When I checked home_1, a lot of frames met this criteria, but that is because #1 is the goalkeeper
- Home Team : It seems that home_10, home_23, home_37, home_6, home_2, home_77, home_9 are all present in the penalty area for a while between rows 80-88 of the df
- Away Team : It seems that away_12, away_14, away_19, away_16, away_10, away_5, away_6 are all present in the penalty area for a while between rows 80-88 of the df
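
Instead of checking one player at a time, the same thresholds could be applied to every player column at once to count how many players are in one penalty area per frame. This is a rough sketch under assumptions: `players_in_box` is a hypothetical helper, the `_x`/`_y` column naming is taken from the output above, and the defaults reuse the 40 and 20 values from the filter.

```python
import pandas as pd


def players_in_box(frame_df, side=1, x_min=40.0, y_max=20.0):
    """Count, per row, the players near one goal line and inside the box width.

    side=1 checks the end with positive x, side=-1 the end with negative x.
    """
    x_cols = [c for c in frame_df.columns
              if c.startswith(('home_', 'away_')) and c.endswith('_x')]
    counts = pd.Series(0, index=frame_df.index)
    for x_col in x_cols:
        y_col = x_col[:-2] + '_y'
        if y_col not in frame_df.columns:
            continue
        # Missing coordinates compare as False, so untracked players are not counted
        in_box = (side * frame_df[x_col] > x_min) & (frame_df[y_col].abs() < y_max)
        counts += in_box.astype(int)
    return counts


# Made-up example with two players and two frames
example = pd.DataFrame({
    'home_2_x': [46.0, -10.0],
    'home_2_y': [-2.7, 5.0],
    'away_5_x': [43.0, 41.0],
    'away_5_y': [3.0, 25.0],
    'ball_x': [50.0, 0.0],
})
box_counts = players_in_box(example, side=1)
```

Frames where the count is high at one end for several consecutive rows would be candidates to inspect visually. A crowded box can also occur for free kicks or open play, so this is a screening step rather than a corner-kick detector.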

Next steps

Use the app that we will create to visualize rows 80-88 of the df and confirm whether they show a corner kick.