# Analyzing Player Roles Using Touch Data & XGBoost

Using data that segments player touches by pitch zone, we can compare players' offensive roles. In addition to touch data, progressive passing and final-third passing data can help identify the creative outlets in a team's offensive play. XGBoost is an appropriate tool for this classification problem, and it consistently outperformed other similar methods.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import metrics, model_selection
from xgboost.sklearn import XGBClassifier

In [None]:
df_possession = pd.read_csv('possession-stats.csv').dropna()
df_passing = pd.read_csv('passing-stats.csv').dropna()

# Join passing stats onto possession stats using the shared player-identity columns.
df_full = pd.merge(
    df_possession, df_passing,
    on=['player', 'position', 'team', 'league', 'age', 'year_of_birth', 'mins_90s'],
    how='left')

In [None]:
for col in df_possession.columns:
    print(col)

In [None]:
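# Touch counts per pitch zone, ordered from the defensive to the attacking penalty area.
touches = df_possession[['touches_def_pen', 'touches_def_third',
                         'touches_mid_third', 'touches_att_third', 'touches_att_pen']]
X = touches.to_numpy()  # feature matrix, one row per player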

## Defining Roles

If players can be grouped by the role they perform in a team, we can define those roles and pick out the quintessential players for each. However, touch data alone cannot distinguish goalscorers from chance creators, and it says nothing about defensive duties. Instead, the split reflects where on the pitch each player carries out their on-ball work.

This means the roles identified relate to a player's part in the buildup of an offensive play. For example, a player who touches the ball most often in the defensive third is likely to be a defender, but their role in the offense is probably ball retention, acting as a safety valve.
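
One way to make this concrete is to convert raw touch counts into per-zone shares, so that players with very different touch totals become comparable. The sketch below is only an illustration: `touch_shares` is a hypothetical helper, the three sample rows are made-up numbers rather than real statistics, and it treats the five zone columns as separate features (if the penalty-area counts are already included in the third counts in your data, the shares will not form a true partition of a player's touches).

```python
import pandas as pd

ZONES = ['touches_def_pen', 'touches_def_third', 'touches_mid_third',
         'touches_att_third', 'touches_att_pen']

def touch_shares(df):
    """Return each player's touches per zone as a fraction of their row total."""
    counts = df[ZONES]
    return counts.div(counts.sum(axis=1), axis=0)

# Hypothetical touch counts, for illustration only.
sample = pd.DataFrame(
    [[120, 1100, 900, 300, 20],
     [5, 300, 1000, 600, 40],
     [0, 40, 300, 500, 160]],
    columns=ZONES,
    index=['defender', 'midfielder', 'striker'])

shares = touch_shares(sample)
dominant_zone = shares.idxmax(axis=1)  # zone holding the largest share per player
```

Here `dominant_zone` gives a crude first guess at where each player does most of their on-ball work, which is the signal the classifier is expected to pick up on.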

### Roles & Players That Represent Those Roles

1. Ball Retention/Possession - Jorginho, Hummels, Van Dijk
2. Transition/Play Creation - Witsel, JWP, Kovacic
3. Chance Creation - Sancho, Muller, TAA
4. Offensive Lead - Lewandowski, Suarez, Haaland, Aguero
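
The representative players above could serve as labeled examples for a classifier. The following is a minimal sketch of one way to encode them, not necessarily how the labels in this notebook were built. `ROLE_EXAMPLES`, `label_roles`, and the integer role codes are hypothetical names introduced for illustration, and the sketch assumes the `player` column spells names exactly as in the list (real data will likely use full names, so the keys would need adjusting).

```python
import pandas as pd

# Role code -> representative players, copied from the list above.
ROLE_EXAMPLES = {
    0: ['Jorginho', 'Hummels', 'Van Dijk'],            # Ball Retention/Possession
    1: ['Witsel', 'JWP', 'Kovacic'],                    # Transition/Play Creation
    2: ['Sancho', 'Muller', 'TAA'],                     # Chance Creation
    3: ['Lewandowski', 'Suarez', 'Haaland', 'Aguero'],  # Offensive Lead
}

def label_roles(players):
    """Map player names to role codes; names not in the list map to NaN."""
    lookup = {name: role
              for role, names in ROLE_EXAMPLES.items()
              for name in names}
    return players.map(lookup)

# Toy input: three listed players and one placeholder name.
players = pd.Series(['Hummels', 'Sancho', 'Haaland', 'Unlisted Player'])
roles = label_roles(players)
```

Rows whose role comes back as NaN are unlabeled and would need to be dropped or labeled by another rule before training.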

In [None]:
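# y (the role label for each row of X) is assumed to be defined elsewhere in the notebook.
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.25, random_state=0)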