![](https://i.ytimg.com/vi/xehD6lmwm2s/maxresdefault.jpg)

### Part I: This kernel builds scout-level and performance-level KPIs for esports from the FIFA player datasets covering 2015 to 2020. Dataset source: [FIFA 20](https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset)

The following KPIs are useful for esports:
1. Pick Top 5 players per position based on value constraint (Buy/Loan/Youth).
2. Recommend alternate playing position for player.
3. Player Recommendation for team.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory
import plotly.graph_objects as go
import plotly.express as px
import plotly
import os

for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print("Path:", os.path.join(dirname, filename))
     

In [None]:
# on_bad_lines="skip" replaces error_bad_lines=False, which was deprecated in pandas 1.3 and removed in 2.0
df_16 = pd.read_csv("../input/fifa-20-complete-player-dataset/players_16.csv", on_bad_lines="skip")
df_17 = pd.read_csv("../input/fifa-20-complete-player-dataset/players_17.csv", on_bad_lines="skip")
df_18 = pd.read_csv("../input/fifa-20-complete-player-dataset/players_18.csv", on_bad_lines="skip")
df_19 = pd.read_csv("../input/fifa-20-complete-player-dataset/players_19.csv", on_bad_lines="skip")
df_20 = pd.read_csv("../input/fifa-20-complete-player-dataset/players_20.csv", on_bad_lines="skip")

# Data Preprocessing & Feature Engineering

In [None]:
# Drop unnecessary columns (the same list applies to every year's dataset)
drop_cols = ['sofifa_id', 'player_url', 'long_name', 'body_type', 'real_face', 'loaned_from', 'nation_position', 'nation_jersey_number']
df_20 = df_20.drop(drop_cols, axis=1)
df_19 = df_19.drop(drop_cols, axis=1)
df_18 = df_18.drop(drop_cols, axis=1)
df_17 = df_17.drop(drop_cols, axis=1)
df_16 = df_16.drop(drop_cols, axis=1)

### 1: Position Columns
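Clean and process the columns listed below, then write the parsed values back to them. Each value is split on "+" and only the part before it is kept; missing values become 0. These columns are later used to identify a player's best alternate playing position based on ratings: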
<br>
'ls', 'st', 'rs', 'lw', 'lf', 'cf', 'rf', 'rw', 'lam', 'cam', 'ram', 'lm', 'lcm', 'cm', 'rcm', 'rm', 'lwb', 'ldm', 'cdm', 'rdm', 'rwb', 'lb', 'lcb', 'cb', 'rcb', 'rb'

In [None]:
stats = ['ls', 'st', 'rs', 'lw', 'lf', 'cf', 'rf', 'rw', 'lam', 'cam', 'ram',
       'lm', 'lcm', 'cm', 'rcm', 'rm', 'lwb', 'ldm', 'cdm', 'rdm', 'rwb', 'lb',
       'lcb', 'cb', 'rcb', 'rb']
for col in stats:
    new = df_20[col].str.split("+", n = 1, expand = True)
    df_20[col] = new[0]
# Replace NaN with 0
df_20[stats] = df_20[stats].fillna(0)
df_20[stats] = df_20[stats].astype(int)
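
To make the parsing step concrete, here is a minimal sketch on a toy Series. The sample strings are made up for illustration and assume the same `base+modifier` layout that the loop above splits on; missing values are filled with the string `"0"` before casting so the sketch also works with string-typed columns.

```python
import pandas as pd

# Hypothetical raw ratings in "base+modifier" form; None stands for a missing rating
raw = pd.Series(["89+2", "74+3", None, "66+0"])

# Keep the part before the first "+", treat missing values as 0, then cast to int
base = raw.str.split("+", n=1, expand=True)[0].fillna("0").astype(int)
print(base.tolist())
```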

### 2: Player's Work Rate
Convert the categorical values in the work_rate column into one-hot indicator columns. Each value describes a player's work rate in offense and defence; one-hot encoding lets us use these values as features in the later analysis and recommendations.

In [None]:
# Create dummy variables and append to dataframe
df_20 = pd.concat([df_20, pd.get_dummies(df_20['work_rate'])], axis=1)
# Drop original work_rate column
df_20 = df_20.drop(['work_rate'], axis=1)
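
As a small illustration of what this encoding produces, the sketch below applies the same `pd.get_dummies` call to a toy frame. The player names and work-rate values are hypothetical.

```python
import pandas as pd

# Toy frame with made-up players and work rates
toy = pd.DataFrame({
    "short_name": ["A", "B", "C", "D"],
    "work_rate": ["High/Medium", "Low/High", "High/Medium", "Medium/Medium"],
})

# One indicator column per distinct work_rate value
dummies = pd.get_dummies(toy["work_rate"])
toy = pd.concat([toy.drop(columns="work_rate"), dummies], axis=1)
print(toy.columns.tolist())
```

Each row has exactly one indicator set, because every player has a single work-rate value.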

### 3: Player's DOB
Extract the month from the dob column for use in the analysis.

In [None]:
# Split the dob column to fetch month
new = df_20["dob"].str.split("-", n = 2, expand = True)
df_20["birth_month"] = new[1].astype(int)
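
An alternative sketch, assuming the `dob` strings are in `YYYY-MM-DD` form (which is what splitting on "-" and taking the second part implies): parse the column as dates and read the month from the datetime accessor. The dates below are made up.

```python
import pandas as pd

# Hypothetical dates of birth in YYYY-MM-DD form
dob = pd.Series(["1987-06-24", "1992-02-05"])

# Parse to datetime, then read the month as an integer
birth_month = pd.to_datetime(dob, format="%Y-%m-%d").dt.month
print(birth_month.tolist())
```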

### 4: Player's Position
Convert the comma-separated values in the player_positions column into one-hot indicator columns, one per position. A player can have a single position or several, so a row may have more than one indicator set. Encoding them this way lets us use these features in the later analysis and recommendations.

In [None]:
df_20 = pd.concat([df_20, df_20['player_positions'].str.get_dummies(sep=', ').add_prefix('Position_')], axis=1) 
# Drop original player_positions column
df_20 = df_20.drop(['player_positions'], axis=1)
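
The sketch below shows the multi-label behaviour of `str.get_dummies` on a toy Series; the position strings are illustrative only.

```python
import pandas as pd

# Hypothetical player_positions values: one, two or three positions per player
positions = pd.Series(["RW, CF, ST", "CB", "CM, CDM"])

# One Position_<X> column per distinct position; a row can have several 1s
multi_hot = positions.str.get_dummies(sep=", ").add_prefix("Position_")
print(multi_hot.columns.tolist())
print(multi_hot.sum(axis=1).tolist())
```

The row sums equal the number of positions listed for each player, which is how single-position and multi-position players can be told apart.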

### 5. BMI: New feature creation
Create a BMI feature from weight_kg and height_cm (weight in kilograms divided by the square of height in metres) and use it in place of the dropped 'body_type' feature.

In [None]:
df_20['bmi'] = df_20['weight_kg'] / (df_20['height_cm']/100)**2
df_19['bmi'] = df_19['weight_kg'] / (df_19['height_cm']/100)**2
df_18['bmi'] = df_18['weight_kg'] / (df_18['height_cm']/100)**2
df_17['bmi'] = df_17['weight_kg'] / (df_17['height_cm']/100)**2
df_16['bmi'] = df_16['weight_kg'] / (df_16['height_cm']/100)**2
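
A quick worked example of the same formula, using a hypothetical player of 72 kg and 170 cm:

```python
weight_kg = 72    # hypothetical player weight in kilograms
height_cm = 170   # hypothetical player height in centimetres

# Convert height to metres, square it, and divide the weight by the result
bmi = weight_kg / (height_cm / 100) ** 2
print(round(bmi, 2))
```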

### 6: Missing Value Estimation
Count of missing elements in the columns

In [None]:
# Check the missing values in the column
missing_data = df_20.isnull().sum().sort_values(ascending=False)
missing_data = missing_data.reset_index(drop=False)
missing_data = missing_data.rename(columns={"index": "Columns", 0: "Value"})
missing_data['Proportion'] = (missing_data['Value']/len(df_20))*100
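
The same percentages can be computed more compactly, since the mean of a boolean mask is the fraction of `True` values. A minimal sketch on a toy frame with made-up values:

```python
import numpy as np
import pandas as pd

# Toy frame: "pace" is missing for half of the rows, "overall" is complete
toy = pd.DataFrame({
    "pace": [90.0, np.nan, 70.0, np.nan],
    "overall": [94, 93, 92, 91],
})

# Fraction of missing values per column, expressed as a percentage
proportion = toy.isnull().mean().mul(100).sort_values(ascending=False)
print(proportion)
```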

In [None]:
sample = missing_data[missing_data['Proportion']>10]
fig = px.pie(sample, names='Columns', values='Proportion',
             color_discrete_sequence=px.colors.sequential.Viridis_r,
             title='Percentage of Missing values in Columns')
fig.update_traces(textposition='inside', textinfo='label')
fig.update_layout(paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'))
fig.show()

### 7: Fill Missing Values
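The columns "dribbling", "defending", "physic", "passing", "shooting" and "pace" cannot meaningfully have 0 as a minimum value, so their missing values are filled with the column median instead.
Missing values in the position attributes and the player's position were already handled in the steps above; any remaining gaps are filled with 0.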

In [None]:
# Fill missing values of these columns by median
cols = ["dribbling", "defending", "physic", "passing", "shooting", "pace"]
for col in cols:
    df_20[col] = df_20[col].fillna(df_20[col].median())
df_20 = df_20.fillna(0)
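
The two-stage fill can be seen on a toy frame. The values are hypothetical; `gk_diving` stands in for any column where 0 is an acceptable placeholder.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "pace": [80.0, np.nan, 60.0, 70.0],            # median of known values is 70
    "gk_diving": [np.nan, 85.0, np.nan, np.nan],   # left for the final fillna(0)
})

# Stage 1: median fill for the column that should never be 0
toy["pace"] = toy["pace"].fillna(toy["pace"].median())
# Stage 2: everything still missing becomes 0
toy = toy.fillna(0)
print(toy)
```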

# Exploratory Data Analysis

### 1: Scatter Plot (colored by Age) year 2020 - Overall Rating vs Value in Euros

In [None]:
fig = go.Figure(data=go.Scatter(
    x = df_20['overall'],
    y = df_20['value_eur'],
    mode='markers',
    marker=dict(
        size=16,
        color=df_20['age'], #set color equal to a variable
        colorscale='Plasma', # one of plotly colorscales
        showscale=True
    ),
    text= df_20['short_name'],
))

fig.update_layout(title='Styled Scatter Plot (colored by Age) year 2020 - Overall Rating vs Value in Euros',
                  xaxis_title='Overall Rating',
                  yaxis_title='Value in Euros',
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'))
fig.show()

### 2: Scatterpolar - Player's Growth with Time

In [None]:
import re
# Compare a player's attributes across the 2016-2020 datasets
def playergrowth(x):
    radar_cols = ['pace', 'shooting', 'passing', 'dribbling', 'defending', 'physic', 'overall']
    radar_labels = ['Pace', 'Shooting', 'Passing', 'Dribbling', 'Defending', 'Physic', 'Overall']
    yearly = [('2020', df_20), ('2019', df_19), ('2018', df_18), ('2017', df_17), ('2016', df_16)]

    data = []
    for year, df in yearly:
        # First player whose short_name starts with x (raises IndexError if there is no match that year)
        player = df[df.short_name.str.startswith(x)].iloc[0]
        data.append(go.Scatterpolar(
            r = [player[c] for c in radar_cols],
            theta = radar_labels,
            fill = 'toself',
            name = year
        ))

    # Name shown in the title comes from the 2020 dataset
    name_2020 = df_20[df_20.short_name.str.startswith(x)].short_name.values[0]

    layout = go.Layout(
      polar = dict(
        radialaxis = dict(
          visible = True,
          range = [0, 100]
        )
      ),
      template="plotly_white",
      showlegend = True,
      font=dict(family='Cambria, monospace', size=12, color='#000000'),
      title = "Stats: {} from 2016 to 2020".format(name_2020)
    )
    fig = go.Figure(data=data, layout=layout)
    plotly.offline.iplot(fig, filename = "Player stats")

In [None]:
# Comparing over year growth
playergrowth("Neymar")

### 3: Scatter Plot - Nationality vs Overall

In [None]:
sample = df_20.sort_values(by='nationality')
fig = go.Figure(data=go.Scatter(
    x = sample['nationality'],
    y = sample['overall'],
    mode='markers',
    marker=dict(
        size=14,
        color=sample['overall'], #set color equal to a variable
        colorscale='Viridis', # one of plotly colorscales
        showscale=True
    ),
    text= sample['short_name']
))

fig.update_layout(title='Styled Scatter Plot - Nationality vs Overall',
                  xaxis_title='Nationality',
                  yaxis_title='Overall Rating',
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000')
                 )
fig.show()

### 4: Box Plot (with Suspected Outliers) - Overall Rating vs BMI

In [None]:
fig = go.Figure()
sample = df_20.sort_values(by='overall')

fig.add_trace(go.Box(
    x = sample['overall'],
    y = sample['bmi'],
    name="Suspected Outliers",
    boxpoints='suspectedoutliers', # only suspected outliers
    marker=dict(
        size=12,
        color='rgb(251, 158, 58)',
        outliercolor='rgba(216, 87, 107, 0.6)',
        line=dict(
            outliercolor='rgba(216, 87, 107, 0.6)',
            outlierwidth=2)),
    line_color='rgb(73, 3, 159)',
    text= sample['short_name']
))

fig.update_layout(title='Styled Box Plot (with Suspected Outliers) - Overall Rating vs BMI',
                  xaxis_title='Overall Rating',
                  yaxis_title='BMI',
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'),
                  xaxis_rangeslider_visible=True)
fig.show()

### 5: Box Plot (with Suspected Outliers) - Nationality vs BMI

In [None]:
fig = go.Figure()
sample = df_20.sort_values(by='nationality')

fig.add_trace(go.Box(
    x = sample['nationality'],
    y = sample['bmi'],
    name="Suspected Outliers",
    boxpoints='suspectedoutliers', # only suspected outliers
    marker=dict(
        size=12,
        color='rgb(180, 222, 43)',
        outliercolor='rgba(31, 158, 137, 0.6)',
        line=dict(
            outliercolor='rgba(31, 158, 137, 0.6)',
            outlierwidth=2)),
    line_color='rgb(72, 40, 120)',
    text= sample['short_name']
))

fig.update_layout(title='Styled Box Plot (with Suspected Outliers) - Nationality vs BMI',
                  xaxis_title='Nationality',
                  yaxis_title='BMI',
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'),
                  xaxis_rangeslider_visible=True)
fig.show()

### 6: Proportion of Players per Position

In [None]:
attack = ['RW', 'LW', 'ST', 'CF', 'LS', 'RS', 'RF', 'LF']
sample = df_20.query('team_position in @attack')    
fig = px.pie(sample, names='team_position',
             color_discrete_sequence=px.colors.sequential.Plasma_r,
             title='Percentage of players in Attacker Role')
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.update_layout(paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'))
fig.show()

In [None]:
mid = ['CAM', 'RCM', 'CDM', 'LDM', 'RM', 'LCM', 'LM', 'RDM', 'RAM','CM', 'LAM']
sample = df_20.query('team_position in @mid')    
fig = px.pie(sample, names='team_position',
             color_discrete_sequence=px.colors.sequential.Viridis_r,
             title='Percentage of players in Midfielder Role')
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.update_layout(paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'))
fig.show()

In [None]:
defence = ['LCB', 'RCB', 'LB', 'RB', 'CB', 'RWB', 'LWB']
sample = df_20.query('team_position in @defence')    
fig = px.pie(sample, names='team_position',
             color_discrete_sequence=px.colors.sequential.Magma_r,
             title='Percentage of players in Defender Role')
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.update_layout(paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(family='Cambria, monospace', size=12, color='#000000'))
fig.show()

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MiniBatchKMeans, KMeans 
from sklearn.metrics.pairwise import pairwise_distances_argmin 

In [None]:
cols = ['age', 'height_cm', 'weight_kg', 'weak_foot', 'skill_moves', 'pace', 'shooting', 'passing', 'dribbling', 'defending', 'physic',
        'gk_diving', 'gk_handling', 'gk_kicking', 'gk_reflexes', 'gk_speed', 'gk_positioning', 'attacking_crossing', 'attacking_finishing',
        'attacking_heading_accuracy', 'attacking_short_passing', 'attacking_volleys', 'skill_dribbling', 'skill_curve', 'skill_fk_accuracy',
        'skill_long_passing', 'skill_ball_control', 'movement_acceleration', 'movement_sprint_speed', 'movement_agility', 'movement_reactions',
        'movement_balance', 'power_shot_power', 'power_jumping', 'power_stamina', 'power_strength', 'power_long_shots', 'mentality_aggression',
        'mentality_interceptions', 'mentality_positioning', 'mentality_vision', 'mentality_penalties', 'mentality_composure',  'defending_marking',
        'defending_standing_tackle', 'defending_sliding_tackle', 'goalkeeping_diving', 'goalkeeping_handling', 'goalkeeping_kicking', 'goalkeeping_positioning',
        'goalkeeping_reflexes', 'High/High','High/Low', 'High/Medium', 'Low/High', 'Low/Low', 'Low/Medium', 'Medium/High', 'Medium/Low', 'Medium/Medium', 'birth_month','bmi']

In [None]:
# Standardize features
scaler = StandardScaler()
X_std = scaler.fit_transform(df_20[cols])

In [None]:
batch_size = 200
mbk = MiniBatchKMeans(init ='k-means++', n_clusters = 5, 
                      batch_size = batch_size, n_init = 10, 
                      max_no_improvement = 10, verbose = 0) 
  
mbk.fit(X_std)
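# Use the fitted centers as they are: sorting them with np.sort(axis=0) would reorder
# each feature column independently and produce centers that no longer match the fit
mbk_means_cluster_centers = mbk.cluster_centers_
mbk_means_labels = pairwise_distances_argmin(X_std, mbk_means_cluster_centers)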

In [None]:
df_20['cluster_labels'] = mbk_means_labels
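
One way to sanity-check cluster labels is to look at cluster sizes and per-cluster feature means. The sketch below does this on synthetic data, assuming scikit-learn is available; the two groups, the column names and `random_state=0` are illustrative choices, not values from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import MiniBatchKMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two well-separated synthetic groups standing in for player attributes
values = np.vstack([rng.normal(30, 2, size=(50, 2)),
                    rng.normal(80, 2, size=(50, 2))])
toy = pd.DataFrame(values, columns=["pace", "defending"])

# Standardize, cluster, and attach the labels from the fitted model
X = StandardScaler().fit_transform(toy)
km = MiniBatchKMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
toy["cluster_labels"] = km.labels_

# Cluster sizes and per-cluster means give a quick picture of what each cluster holds
print(toy["cluster_labels"].value_counts().to_dict())
print(toy.groupby("cluster_labels").mean().round(1))
```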

# Pick Top 5 Players per Position
Available Positions: CAM, CB, CDM, CF, CM, GK, LB, LM, LW, LWB, RB, RM, RW, RWB, ST

In [None]:
# Compare up to 5 players on a radar chart
def top5(ls_name, column, value):
    radar_cols = ['pace', 'shooting', 'passing', 'dribbling', 'defending', 'physic', 'overall']
    radar_labels = ['Pace', 'Shooting', 'Passing', 'Dribbling', 'Defending', 'Physic', 'Overall']

    # Build one trace per name, so fewer than 5 matches no longer raises an IndexError
    data = []
    for name in ls_name:
        # First row matching the name; short_name is not guaranteed to be unique
        player = df_20[df_20["short_name"] == name].iloc[0]
        data.append(go.Scatterpolar(
            r = [player[c] for c in radar_cols],
            theta = radar_labels,
            fill = 'toself',
            name = name
        ))

    layout = go.Layout(
      polar = dict(
        radialaxis = dict(
          visible = True,
          range = [0, 100]
        )
      ),
      template="plotly_white",
      showlegend = True,
      font=dict(family='Cambria, monospace', size=12, color='#000000'),
      title = "{} stats comparison for top 5 under €{} -: {}".format(column, value, " vs ".join(ls_name))
    )
    fig = go.Figure(data=data, layout=layout)
    plotly.offline.iplot(fig, filename = "Player stats")
    return None

# Pick the top 5 players for a position within a value budget
def position(pos, value):
    column = 'Position_' + pos.upper()
    print("You've Entered Position: ",column)
    print("You've Entered Value €: ", value)
    eligible = df_20[(df_20[column]==1) & (df_20['value_eur']<=value)]
    # Rank by overall rating explicitly instead of relying on the row order of the data
    ls_name = eligible.sort_values('overall', ascending=False, kind='mergesort')['short_name'].head(5).values
    print(ls_name)
    top5(ls_name, column, value)
    return None
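
The selection logic can be checked in isolation on a toy frame. In this sketch `top_n` is a hypothetical helper (not part of the notebook) and the players and values are made up; it filters by position flag and budget, then ranks by overall rating.

```python
import pandas as pd

toy = pd.DataFrame({
    "short_name": ["A", "B", "C", "D"],
    "overall": [85, 88, 80, 90],
    "value_eur": [30_000_000, 50_000_000, 10_000_000, 20_000_000],
    "Position_LB": [1, 1, 0, 1],
})

def top_n(df, pos_col, max_value, n=5):
    """Names of the n best-rated players for a position within a budget."""
    eligible = df[(df[pos_col] == 1) & (df["value_eur"] <= max_value)]
    return eligible.sort_values("overall", ascending=False).head(n)["short_name"].tolist()

# B is over budget and C does not play LB, so only D and A remain
print(top_n(toy, "Position_LB", 34_000_000))
```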

### Test 1: 'LB'

In [None]:
position('lb', 34000000)

### Test 2: 'CDM'

In [None]:
position('cdm', 44000000)

# Recommend Alternate Playing Position per Player
Display the positions in which a player is rated highest but which are not among the player's current playing positions. <br>
*For goalkeepers, the alternate playing positions show a rating of 0, because their missing position ratings were filled with 0.*

In [None]:
def alternate_position(player, df_20):
    # Get player's index label (first match)
    idx = df_20[df_20['short_name']==player].index[0]
    # Find the 5 highest-rated positions of the player; rating columns are selected
    # by name (the `stats` list) rather than by hard-coded column offsets
    ls = df_20[stats].loc[idx].nlargest(5).index.tolist()
    # Identify the player's current playing positions from the one-hot Position_ columns
    sample_2 = df_20.filter(like='Position_').loc[idx]
    ls2 = sample_2[sample_2==1].index.tolist()
    # Lower-case the names and strip the prefix so they match the rating column names
    ls2 = [re.sub(r'position_', '', i.lower()) for i in ls2]
    # Keep the highest-rated positions that are not current playing positions
    alt_pos = set(ls)-(set(ls2))
    print("Alternate Playing Positions for {} are".format(player))
    for i in alt_pos:
        print("Position: {}, Rating: {}".format(str.upper(i), df_20.loc[idx, i]))
    return None
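
The core of the function is "top-rated positions minus current positions". A minimal sketch of that step with hypothetical ratings follows; unlike a plain set difference, filtering the Series keeps the result ordered by rating.

```python
import pandas as pd

# Hypothetical position ratings for one player
ratings = pd.Series({"cm": 86, "cam": 85, "cdm": 82, "st": 78, "cb": 74, "lb": 70})
# Hypothetical current playing positions
current = {"cm", "cdm"}

# Five best-rated positions, then drop the ones the player already plays
best = ratings.nlargest(5)
alternates = best[~best.index.isin(current)]
print(alternates.to_dict())
```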

### Test 1: P. Pogba

In [None]:
alternate_position("P. Pogba", df_20)

### Test 2: M. Salah

In [None]:
alternate_position('M. Salah', df_20)

# Player Recommendation

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import NearestNeighbors
from sklearn.decomposition import PCA

### 1: Fetch numeric columns

In [None]:
sample = df_20.select_dtypes(include='number')
print(sample.head())

### 2: Correlation Matrix

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(12,12))

# Compute the correlation matrix
corr = sample.corr()

# Generate a mask for the upper triangle
mask = np.zeros_like(corr, dtype=bool)  # np.bool was removed in NumPy 1.24; use the builtin bool
mask[np.triu_indices_from(mask)] = True

# Draw the heatmap with the mask and correct aspect ratio
sns.heatmap(corr, mask=mask, cmap="GnBu", vmax=.3, center=0,
            square=True, linewidths=.7, cbar_kws={"shrink": .7})

From the correlation chart above, many goalkeeper attributes are negatively correlated with the attributes typical of forwards, midfielders and defenders.
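
For readers unfamiliar with the masking step in the heatmap code, this small sketch shows what an upper-triangle mask looks like for a 3x3 matrix. Cells marked 1 are hidden in the heatmap, which leaves only the lower triangle below the diagonal visible.

```python
import numpy as np

# True on and above the diagonal; these cells are hidden by sns.heatmap(mask=...)
mask = np.triu(np.ones((3, 3), dtype=bool))
print(mask.astype(int))
```

Because a correlation matrix is symmetric, hiding the upper triangle removes only duplicated information.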

### 3: Standardize features and use NearestNeighbors to find 5 similar players

In [None]:
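scaled = StandardScaler()
X = scaled.fit_transform(sample)
# Ask for 6 neighbors: a player's nearest neighbor is normally the player itself,
# which leaves 5 recommendations after skipping the first result
recommendations = NearestNeighbors(n_neighbors=6,algorithm='kd_tree')
recommendations.fit(X)
# kneighbors returns (distances, indices); keep only the indices
player_index = recommendations.kneighbors(X)[1]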


In [None]:
# Define a function to get Player's Index
def get_index(x):
    return df_20[df_20['short_name']==x].index.tolist()[0]

# Fetch 5 indexes of similar players
def recommend_similar(player):
    print("These are 5 players similar to {} : ".format(player))
    index=  get_index(player)
    for i in player_index[index][1:]:
        print("Name: {}\nOverall: {}\nMarket Value: {}\nAge: {}\n".format(df_20.iloc[i]['short_name'],df_20.iloc[i]['overall'], df_20.iloc[i]['value_eur'], df_20.iloc[i]['age']))
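
The lookup pattern above (standardize, fit `NearestNeighbors`, skip the first neighbor) can be tried on a tiny synthetic example. The names and feature values are made up, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

names = ["A", "B", "C", "D"]
# Two features per player; A and B are alike, and so are C and D
X = np.array([[90.0, 30.0], [88.0, 32.0], [40.0, 85.0], [42.0, 83.0]])

X_std = StandardScaler().fit_transform(X)
# n_neighbors=2: the query point itself plus its single closest other player
nn = NearestNeighbors(n_neighbors=2).fit(X_std)
_, idx = nn.kneighbors(X_std)

# Column 0 is the player itself, column 1 the most similar other player
similar = {names[i]: names[j] for i, j in enumerate(idx[:, 1])}
print(similar)
```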

### Test 1: Eden Hazard

In [None]:
recommend_similar('E. Hazard')

### Test 2: J. Gomez

In [None]:
recommend_similar('J. Gomez')