# Notebook Model v.2

## Content:
1. [Imports](#Imports)
2. [Read data](#Read_data)
3. [Inputs](#Inputs)
4. [Modelling](#Modelling)
5. [Results](#Results)


## Summary
Given a football player $u$ with match event data and team possession data, we want to detect the player's playing style $y(pos_{u}) \in \{Offensive, Defensive\}$ for the detected playing position $pos_{u}$. This can be formulated as a binary classification problem in which every available player in the data, $\mathbf{u}$, is assigned to one of the two groups $\{Offensive, Defensive\}$.
The classification is position dependent, so each position $pos \in \{ST, CM, OW, FB, CB\}$ has its own classifier (the goalkeeper group is excluded, as explained in the Limitations section of the report).

## 1. Imports <a class="anchor" id="Imports"></a>

In [14]:
# Basics
import pandas as pd

# Project module
import modules.validation_lib as validate
from modules.models_lib import model_off_def

## 2. Read data <a class="anchor" id="Read_data"></a>

The model takes as input the processed and filtered off/def action KPI data found in the data directory.

The model KPI data (mainly used for Model v.2) is also read here, solely to obtain the detected position of each player, which is stored in those .xlsx files.

In [15]:
# Read model KPI dataframe from PL
df_model_KPI_PL = pd.read_excel('../data/model_kpis_PL21-22.xlsx')

# Read model KPI dataframe from Allsvenskan, Sweden
df_model_KPI_Swe = pd.read_excel('../data/model_kpis_Swe21.xlsx')

# Read off/def KPI dataframe from PL
df_KPI_off_def_PL = pd.read_excel('../data/off_def_kpis_PL21-22.xlsx')

# Read off/def KPI dataframe from Allsvenskan, Sweden
df_KPI_off_def_Swe = pd.read_excel('../data/off_def_kpis_Swe21.xlsx')

# Read validation data (this model is validated against validation data set v.1, as in the report)
df_validation = pd.read_excel('../data/validation_data_v1.xlsx')
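
The model KPI files are read only for their detected positions. As a minimal sketch of how such a lookup could be attached to the off/def KPI table: the column names `name` and `position` are assumed to exist in both raw files (they appear later in the model result, but not necessarily in the inputs), `off_def_score` is a hypothetical KPI column, and the rows are invented for illustration. How `model_off_def` actually combines the two tables is not shown in this notebook.

```python
import pandas as pd

# Toy stand-ins for the two Excel files (invented rows, hypothetical KPI column)
df_model_kpi_toy = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "position": ["ST", "CB", "CM"],
})
df_off_def_toy = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "off_def_score": [0.8, 0.2, 0.5],
})

# Left join keeps every off/def row and adds the detected position
df_with_pos = df_off_def_toy.merge(
    df_model_kpi_toy[["name", "position"]],
    on="name",
    how="left",
    validate="many_to_one",  # fail loudly if a player has two detected positions
)
print(df_with_pos)
```

The `validate` argument is a cheap guard against duplicated player names, which becomes more likely when both leagues are concatenated.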


## 3. Inputs <a class="anchor" id="Inputs"></a>

In [16]:
# Choose league to validate
# league = 'PL'
# league = 'Swe'
league = 'both'


# parameter settings for Model v.2
position_quantile_mapper = {
    'ST': 0.25,
    'CM': 0.25,
    'OW': 0.3,
    'FB': 0.4,
    'CB': 0.7
    }


Handle inputs

In [17]:
# Select the input dataframes for the chosen league
if league == 'PL':
    df_KPI_off_def = df_KPI_off_def_PL
    df_model_KPI = df_model_KPI_PL
elif league == 'Swe':
    df_KPI_off_def = df_KPI_off_def_Swe
    df_model_KPI = df_model_KPI_Swe
elif league == 'both':
    df_KPI_off_def = pd.concat([df_KPI_off_def_PL, df_KPI_off_def_Swe])
    df_model_KPI = pd.concat([df_model_KPI_PL, df_model_KPI_Swe])
else:
    # Stop here instead of continuing with empty dataframes
    raise ValueError(f"Unknown league '{league}', expected 'PL', 'Swe' or 'both'")

## 4. Modelling <a class="anchor" id="Modelling"></a>
Each player is classified as either Offensive or Defensive by quantile classification, where the quantile argument $q$ depends on the position, i.e. $q = q(pos)$. The classifier can therefore be tuned independently for each position through the parameter $q(pos)$.

In [18]:
df_model_result = model_off_def(df_KPI_off_def, df_model_KPI, position_quantile_mapper)
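
The internals of `model_off_def` live in the project module and are not shown here. The sketch below illustrates position-dependent quantile classification in general terms under two assumptions: each player has a single score column (the hypothetical `off_def_score`) where higher means more offensive, and players below the $q(pos)$-quantile of their position group are labelled Defensive. The function name `quantile_off_def` and the toy data are invented for illustration.

```python
import pandas as pd

def quantile_off_def(df, quantile_by_position, score_col="off_def_score"):
    """Label players within each position group using a per-position quantile.

    Assumption: a higher score means a more offensive player, and players
    below the q(pos)-quantile of their position group are 'Defensive'.
    """
    parts = []
    for pos, q in quantile_by_position.items():
        group = df[df["position"] == pos].copy()
        if group.empty:
            continue
        # Threshold is computed within the position group only
        threshold = group[score_col].quantile(q)
        group["playing style"] = "Offensive"
        group.loc[group[score_col] < threshold, "playing style"] = "Defensive"
        parts.append(group)
    if not parts:
        return df.iloc[0:0]
    return pd.concat(parts)

# Toy data: five strikers and five centre-backs with identical scores
df_toy = pd.DataFrame({
    "name": [f"P{i}" for i in range(10)],
    "position": ["ST"] * 5 + ["CB"] * 5,
    "off_def_score": [1.0, 2.0, 3.0, 4.0, 5.0] * 2,
})
result = quantile_off_def(df_toy, {"ST": 0.25, "CB": 0.7})
print(result)
```

Under these assumptions a larger $q(pos)$ moves more players of that position into the Defensive class, which would be consistent with the mapper using 0.7 for CB and 0.25 for ST.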


## 5. Results <a class="anchor" id="Results"></a>
See the confusion matrix and class metrics for each position, validated against validation data set v.1.

In [20]:
for position in position_quantile_mapper:
    
    print(f"============== Result position {position} ==============\n")

    # Filter the model result by the current position
    df_model_result_pos_i = df_model_result[df_model_result['position'] == position]
    
    # get players from validation data in this position
    list_validation_pos_i_players = df_validation[df_validation['Position'] == position]['Player_name'].tolist()
    
    # Keep only model results for validation players in this position
    df_model_result_pos_i = df_model_result_pos_i[df_model_result_pos_i['name'].isin(list_validation_pos_i_players)]
    
    # Compare detected playing styles to validation data
    dict_validation_results_pos = validate.create_validation_dataframes(
        df_model_result_pos_i, "Player_name", "name",
        'Playing-style_primary',
        'playing style',
        position=position,
        binary_playing_style=True)  # TODO: needs to be reviewed
    
    # Find the resulting dataframe from the dictionary
    df_result = dict_validation_results_pos['df_result']
    df_correct = dict_validation_results_pos['df_correct']
    df_incorrect = dict_validation_results_pos['df_incorrect']
    
    # Compute and show the confusion matrix with accuracy
    print("Confusion matrix result Off/Def classification: \n")
    df_conf = validate.confusion_matrix(df_result, ['Offensive', 'Defensive'], 'predicted_class', 'actual_class', show_results=True)
    
    # Compute confusion matrix metrics
    print("Confusion matrix class metrics for Off/Def classification: \n")
    df_class_metrics_pos = validate.confusion_matrix_class_metrics(df_conf, ['Offensive', 'Defensive'], show_results=True)
    
    print("==============  ==============\n\n")


Confusion matrix result Off/Def classification: 

+------------+-------------+-------------+-----------+
|            |   Offensive |   Defensive |   #actual |
+============+=============+=============+===========+
| Offensive  |          49 |          18 |        67 |
+------------+-------------+-------------+-----------+
| Defensive  |          10 |           8 |        18 |
+------------+-------------+-------------+-----------+
| #predicted |          59 |          26 |        85 |
+------------+-------------+-------------+-----------+
Confusion matrix class metrics for Off/Def classification: 

Total accuracy: 0.67
+-----------+-------------+----------+---------------+------------+
|           |   precision |   recall |   specificity |   F1-score |
+===========+=============+==========+===============+============+
| Offensive |        0.73 |     0.83 |          0.31 |       0.78 |
+-----------+-------------+----------+---------------+------------+
| Defensive |        0.44 |     0.31 |          0.83 |       0.36 |
+-----------+-------------+----------+---------------+------------+


