# FEATURES - DEFENSE <a class="anchor" id="up"></a>

Every function returns a DataFrame of the form


| teamId | feature |
| --- | --- |
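
The contract above (one row per `teamId`, one feature column) is what makes the repeated left merge onto `feats` safe. A minimal sketch of that pattern on toy data follows; `toy_feature` and `add_feature` are hypothetical helpers introduced only for illustration, they are not part of the notebook.

```python
import pandas as pd

# Toy stand-ins for the notebook's `feats` and `df` (all values are made up).
feats = pd.DataFrame({"teamId": [1, 2, 3], "teamName": ["A", "B", "C"]})
events = pd.DataFrame({
    "teamId": [1, 1, 2],
    "matchId": [10, 11, 10],
    "x0": [30.0, 40.0, 50.0],
})

def toy_feature(df):
    """Hypothetical feature: mean x0 per team, returned as teamId | feature."""
    out = df.groupby("teamId", as_index=False)["x0"].mean()
    return out.rename(columns={"x0": "toy_feature"})

def add_feature(feats, df, func):
    """Left-merge one feature onto feats, keeping one row per team."""
    feature = func(df)
    # A duplicated teamId would silently duplicate rows in the merge.
    assert feature["teamId"].is_unique, "feature must have one row per teamId"
    return feats.merge(feature, on="teamId", how="left")

feats = add_feature(feats, events, toy_feature)
print(feats)
```

With `how="left"`, a team that has no qualifying events (team 3 here) keeps its row and gets `NaN` for the feature.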


The features are

* [Defensive line height](#defense_line)
* [Forwards in defense](#defending_attackers)
* [Defensive line variance](#defense_variance)
* [Defensive fouls: centre or flanks](#defensive_fouls)
* [Ball recoveries in attack](#attacking_tackles)


In [1]:
import numpy as np
import pandas as pd
from scipy.spatial import distance
import pickle
import re
import warnings
warnings.filterwarnings('ignore')

In [2]:
df = pd.read_csv('clean/events_no_champions.csv')
if 'Unnamed: 0' in df.columns:
    del df['Unnamed: 0']
    
feats = pd.read_csv('clean/feats.csv')
    
display(df.head(2))
display(feats.head(2))

Unnamed: 0,eventId,subEventName,tags,playerId,matchId,eventName,teamId,matchPeriod,eventSec,subEventId,id,League,x0,y0,x1,y1,teamName,playerName,playerRole
0,8,Simple pass,1801,25413,2499719,Pass,1609,1H,2.758649,85.0,177959171,England,49,49,31.0,78.0,Arsenal,A. Lacazette,Forward
1,8,High pass,1801,370224,2499719,Pass,1609,1H,4.94685,83.0,177959172,England,31,78,51.0,75.0,Arsenal,R. Holding,Defender


Unnamed: 0,teamId,teamName
0,1609,Arsenal
1,1631,Leicester City


***
***

### Average height of the defensive line <a class="anchor" id="defense_line"></a>[up](#up)

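Average height of the defensive line, inferred from the mean `x0` of sub-events such as `Ground loose ball duel` or `Ground defending duel` (`subEventId` 12 and 13) performed by defenders. The mean is computed per match first and then across each team's matches.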
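
The function for this feature averages `x0` per match and then averages those match means per team. The sketch below, on made-up data, shows how this differs from a single pooled mean: the two-stage version gives every match equal weight regardless of how many duels it contains.

```python
import pandas as pd

# Toy events for one team: three duels in match 10, one duel in match 11.
events = pd.DataFrame({
    "teamId":  [1, 1, 1, 1],
    "matchId": [10, 10, 10, 11],
    "x0":      [20.0, 30.0, 40.0, 60.0],
})

# Pooled mean: every event has equal weight.
pooled = events.groupby("teamId")["x0"].mean()

# Two-stage mean: average within each match, then across matches.
per_match = events.groupby(["teamId", "matchId"])["x0"].mean()
two_stage = per_match.groupby(level="teamId").mean()

print(pooled.loc[1])     # 37.5
print(two_stage.loc[1])  # 45.0
```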

In [3]:
def defense_line(df):
    x = df.loc[df['playerRole'] == 'Defender',]
    x = x.loc[x['subEventId'].isin([12,13])]
    
    x = x[['teamId','matchId','x0']].groupby(['teamId','matchId']).mean().reset_index()
    x = x[['teamId','x0']].groupby(['teamId']).mean().reset_index()
    x.columns = ['teamId', 'defense_line']
    
    return x

a = defense_line(df)
feats = pd.merge(feats, a, on = 'teamId', how = 'left')
display(feats.head(2))

Unnamed: 0,teamId,teamName,defense_line
0,1609,Arsenal,35.920419
1,1631,Leicester City,33.49487


### Forwards' touches in defense <a class="anchor" id="defending_attackers"></a>[up](#up)

Average number of events per match produced by players with role `Forward` in their own half (i.e. `x0 < 50`).
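
One subtlety of counting after filtering: a match in which a team's forwards produce no event in their own half disappears from the group-by, so it does not pull the average down. The sketch below, on made-up data, contrasts that behaviour with a variant that sums a boolean flag over all events so that such matches count as zero. The variant is an illustrative alternative, not the notebook's method.

```python
import pandas as pd

# Toy events: match 11 has no Forward event in the own half.
events = pd.DataFrame({
    "teamId":     [1, 1, 1, 1],
    "matchId":    [10, 10, 11, 11],
    "playerRole": ["Forward", "Forward", "Forward", "Defender"],
    "x0":         [20, 40, 70, 30],
})

in_own_half = (events["playerRole"] == "Forward") & (events["x0"] < 50)

# Filter first, then count: match 11 is absent from the result.
counts = events[in_own_half].groupby(["teamId", "matchId"]).size()
mean_filtered = counts.groupby(level="teamId").mean()

# Variant: sum the flag over every event, so match 11 contributes a 0.
counts_all = (
    events.assign(flag=in_own_half)
          .groupby(["teamId", "matchId"])["flag"]
          .sum()
)
mean_all_matches = counts_all.groupby(level="teamId").mean()

print(mean_filtered.loc[1])     # 2.0
print(mean_all_matches.loc[1])  # 1.0
```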

In [4]:
def defending_attackers(df):
    x = df.loc[((df['playerRole'] == 'Forward') & (df['x0'] < 50)),]
    
    x = x[['teamId','matchId','x0']].groupby(['teamId','matchId']).count().reset_index()
    x = x[['teamId','x0']].groupby(['teamId']).mean().reset_index()
    x.columns = ['teamId', 'defending_attackers']
    
    return x

a = defending_attackers(df)
feats = pd.merge(feats, a, on = 'teamId', how = 'left')
display(feats.head(5))

Unnamed: 0,teamId,teamName,defense_line,defending_attackers
0,1609,Arsenal,35.920419,23.947368
1,1631,Leicester City,33.49487,22.315789
2,1625,Manchester City,38.114127,20.342105
3,1651,Brighton & Hove Albion,30.903564,29.026316
4,1646,Burnley,32.499457,19.105263


### Variance of the defensive line height <a class="anchor" id="defense_variance"></a>[up](#up)

Variance of the defensive line height: the variance of `x0` over `Ground loose ball duel` and `Ground defending duel` sub-events performed by defenders is computed within each match and then averaged across each team's matches.
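
pandas' group-by `.var()` computes the sample variance (`ddof=1`) by default, so a match with a single qualifying duel yields `NaN`, which the subsequent `.mean()` skips. The sketch below illustrates this on made-up data and shows the population variance (`ddof=0`) for comparison.

```python
import pandas as pd

# Toy events: three duels in match 10, a single duel in match 11.
events = pd.DataFrame({
    "teamId":  [1, 1, 1, 1],
    "matchId": [10, 10, 10, 11],
    "x0":      [20.0, 30.0, 40.0, 50.0],
})

grouped = events.groupby(["teamId", "matchId"])["x0"]

# Sample variance (default ddof=1): undefined for a single observation.
sample_var = grouped.var()
team_sample = sample_var.groupby(level="teamId").mean()  # NaN is skipped

# Population variance (ddof=0): a single observation has variance 0.
pop_var = grouped.var(ddof=0)
team_pop = pop_var.groupby(level="teamId").mean()

print(sample_var)
print(team_sample.loc[1])  # 100.0
print(team_pop.loc[1])
```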

In [5]:
def defense_variance(df):
    x = df.loc[df['playerRole'] == 'Defender',]
    x = x.loc[x['subEventId'].isin([12,13])]
    
    x = x[['teamId','matchId','x0']].groupby(['teamId','matchId']).var().reset_index()
    x = x[['teamId','x0']].groupby(['teamId']).mean().reset_index()
    x.columns = ['teamId', 'defense_variance']
    
    return x

a = defense_variance(df)
feats = pd.merge(feats, a, on = 'teamId', how = 'left')
display(feats.head(2))

Unnamed: 0,teamId,teamName,defense_line,defending_attackers,defense_variance
0,1609,Arsenal,35.920419,23.947368,481.803278
1,1631,Leicester City,33.49487,22.315789,496.225946


### Defensive fouls: centre vs flanks <a class="anchor" id="defensive_fouls"></a>[up](#up)

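Events of type `Foul` (`subEventId == 20`): share of fouls committed on the flanks (`y0 < 33` or `y0 > 66`) rather than in the centre (`33 <= y0 <= 66`). Note that the function as written does not filter on `x0`, so fouls anywhere on the pitch are included, not only those in the team's own half.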
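
The flank share is simply the mean of a 0/1 flag. As a minimal sketch on made-up data, the same flag can be built with `Series.between`, whose bounds are inclusive by default, so its negation matches the condition `(y0 < 33) | (y0 > 66)`.

```python
import pandas as pd

# Toy fouls with y0 values around the 33 and 66 boundaries.
fouls = pd.DataFrame({
    "teamId": [1, 1, 1, 1, 2, 2, 2, 2],
    "y0":     [10, 33, 50, 90, 66, 67, 70, 80],
})

# Centre is 33 <= y0 <= 66; everything else is a flank.
on_flank = ~fouls["y0"].between(33, 66)

# Mean of a boolean flag per team = share of fouls on the flanks.
share = fouls.assign(on_flank=on_flank).groupby("teamId")["on_flank"].mean()

print(share.loc[1])  # 0.5
print(share.loc[2])  # 0.75
```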

In [6]:
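def fouls_side_center(df):
    # Keep only fouls; .copy() avoids assigning into a slice of df
    x = df.loc[df['subEventId'] == 20,].copy()
    # Flag is 1 for fouls on the flanks, 0 for fouls in the centre
    x['fouls_side_center'] = 0
    x.loc[((x['y0'] < 33) | (x['y0'] > 66)), 'fouls_side_center'] = 1

    # Mean of the flag = share of fouls committed on the flanks
    x = x[['teamId','fouls_side_center']].groupby('teamId').mean().reset_index()
    x.columns = ['teamId', 'fouls_side_center']
    return x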

a = fouls_side_center(df)
feats = pd.merge(feats, a, on = 'teamId', how = 'left')
feats.head()

Unnamed: 0,teamId,teamName,defense_line,defending_attackers,defense_variance,fouls_side_center
0,1609,Arsenal,35.920419,23.947368,481.803278,0.739011
1,1631,Leicester City,33.49487,22.315789,496.225946,0.730659
2,1625,Manchester City,38.114127,20.342105,508.981138,0.734375
3,1651,Brighton & Hove Albion,30.903564,29.026316,424.139252,0.703046
4,1646,Burnley,32.499457,19.105263,458.670014,0.675926


### Ball recoveries in attack <a class="anchor" id="attacking_tackles"></a>[up](#up)

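Events of type `Ground loose ball duel` or `Ground defending duel` occurring in the opponent's half (`x0 > 50`). As written, the function returns the mean `x0` of these events (averaged per match, then per team), not their count.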
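
For duels in the opponent's half, the mean `x0` and the per-match count measure different things: how high the duels take place versus how often they happen. The sketch below computes both on made-up data; the count column is shown only for comparison and is not a feature produced by the notebook.

```python
import pandas as pd

# Toy events: subEventId 12/13 are duels, 20 is a foul and is excluded.
events = pd.DataFrame({
    "teamId":     [1, 1, 1, 1],
    "matchId":    [10, 10, 11, 11],
    "subEventId": [12, 13, 12, 20],
    "x0":         [60.0, 80.0, 55.0, 90.0],
})

high_duels = events[events["subEventId"].isin([12, 13]) & (events["x0"] > 50)]

# Per match: mean position and number of duels in the opponent's half.
per_match = high_duels.groupby(["teamId", "matchId"])["x0"].agg(["mean", "count"])

# Per team: average both quantities across matches.
per_team = per_match.groupby(level="teamId").mean()

print(per_team)
```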

In [7]:
def attacking_tackles(df):
    x = df.loc[((df['subEventId'].isin([12,13])) & (df['x0'] > 50)),]
    x = x[['teamId','matchId','x0']].groupby(['teamId', 'matchId']).mean().reset_index()
    x = x[['teamId','x0']].groupby('teamId').mean().reset_index()
    x.columns = ['teamId', 'attacking_tackles']
    return x

a = attacking_tackles(df)
feats = pd.merge(feats, a, on = 'teamId', how = 'left')
feats.head()

Unnamed: 0,teamId,teamName,defense_line,defending_attackers,defense_variance,fouls_side_center,attacking_tackles
0,1609,Arsenal,35.920419,23.947368,481.803278,0.739011,69.973069
1,1631,Leicester City,33.49487,22.315789,496.225946,0.730659,69.441156
2,1625,Manchester City,38.114127,20.342105,508.981138,0.734375,70.504875
3,1651,Brighton & Hove Albion,30.903564,29.026316,424.139252,0.703046,69.136218
4,1646,Burnley,32.499457,19.105263,458.670014,0.675926,70.386003


In [8]:
feats.to_csv('clean/feats_difesa.csv', index = False)