# FIFA 19 Cluster Analysis

## Problem Definition
Individually scouting the thousands of players in the FIFA 19 dataset to identify transfer targets would be a challenge. Factors to consider in a player acquisition include the team's positional needs, the transfer fee, and the player's wage; teams must balance these to operate efficiently and maximize return on investment. A player's ability develops and then plateaus, so timing the acquisition within a player's career is critical to maximizing value.

To improve the scouting process, players are clustered. The dataset's features allow similar players to be grouped together. The goal of the clustering analysis is to group players by the value they can offer a team. The most valuable players are those with the potential to develop into highly rated players whose market value is currently relatively low but could rise substantially. This undervalued cluster can serve as a list of suggested transfer targets.

In [1]:
# import libraries
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import collections

from sklearn.metrics.pairwise import euclidean_distances
from sklearn.cluster import DBSCAN

sns.set_style('whitegrid')

In [2]:
df = pd.read_csv('cleaned data/fifa19data_clean.csv')
print(df.columns)

Index(['Name', 'Age', 'Nationality', 'Overall', 'Potential', 'Club', 'Value',
       'Wage', 'Special', 'Preferred Foot', 'International Reputation',
       'Weak Foot', 'Skill Moves', 'Work Rate', 'Position', 'Height', 'Weight',
       'LS', 'ST', 'RS', 'LW', 'LF', 'CF', 'RF', 'RW', 'LAM', 'CAM', 'RAM',
       'LM', 'LCM', 'CM', 'RCM', 'RM', 'LWB', 'LDM', 'CDM', 'RDM', 'RWB', 'LB',
       'LCB', 'CB', 'RCB', 'RB', 'Crossing', 'Finishing', 'HeadingAccuracy',
       'ShortPassing', 'Volleys', 'Dribbling', 'Curve', 'FKAccuracy',
       'LongPassing', 'BallControl', 'Acceleration', 'SprintSpeed', 'Agility',
       'Reactions', 'Balance', 'ShotPower', 'Jumping', 'Stamina', 'Strength',
       'LongShots', 'Aggression', 'Interceptions', 'Positioning', 'Vision',
       'Penalties', 'Composure', 'Marking', 'StandingTackle', 'SlidingTackle',
       'GKDiving', 'GKHandling', 'GKKicking', 'GKPositioning', 'GKReflexes'],
      dtype='object')


## Feature Engineering

In [3]:
# split Work Rate into 2 separate features
df['Attack Work Rate'] = df.apply(lambda row: row['Work Rate'].split("/ ")[0], axis=1)
df['Defensive Work Rate'] = df.apply(lambda row: row['Work Rate'].split("/ ")[1], axis=1)
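
The same split can also be done without a row-wise `apply`. The following is a minimal sketch on a made-up three-row frame (`toy` and its values are illustrative, not taken from the dataset); it splits on `/` and strips whitespace, so it tolerates both `High/ Medium` and `High/Medium`.

```python
import pandas as pd

# Toy frame standing in for the real data; the values are made up.
toy = pd.DataFrame({'Work Rate': ['High/ Medium', 'Medium/ Low', 'Low/ High']})

# Split once on '/', then strip whitespace from each half.
parts = toy['Work Rate'].str.split('/', n=1, expand=True)
toy['Attack Work Rate'] = parts[0].str.strip()
toy['Defensive Work Rate'] = parts[1].str.strip()

print(toy)
```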

In [4]:
# group similar positions together and make new feature 'Position Category'
def groupPosition(position):
    forward = ['RS', 'LS', 'RF', 'LF', 'CF', 'ST']
    attack_mid = ['RAM', 'LAM', 'CAM']
    wings = ['RM', 'RW', 'LM', 'LW']
    central_mid = ['CM', 'LCM', 'RCM']
    defensive_mid = ['CDM', 'LDM', 'RDM']
    fullback = ['RB', 'RWB', 'LB', 'LWB']
    cb_def = ['CB', 'LCB', 'RCB']

    if position == 'GK':
        return 'GK'
    elif position in forward:
        return 'FW'
    elif position in attack_mid:
        return 'AM'
    elif position in wings:
        return 'W'
    elif position in central_mid:
        return 'CM'
    elif position in defensive_mid:
        return 'DM'
    elif position in fullback:
        return 'FB'
    elif position in cb_def:
        return 'CB'

df['Position Category'] = df['Position'].apply(groupPosition)
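
Note that `groupPosition` returns `None` for any position it does not recognise. As a hedged alternative, the grouping can be written as a lookup table, which makes unmapped positions easy to list. The names `POSITION_GROUPS`, `POSITION_TO_GROUP`, and the toy series below are illustrative; `'XX'` is a made-up position used only to show the unmapped case.

```python
import pandas as pd

# Same grouping as groupPosition, expressed as data rather than branches.
POSITION_GROUPS = {
    'GK': ['GK'],
    'FW': ['RS', 'LS', 'RF', 'LF', 'CF', 'ST'],
    'AM': ['RAM', 'LAM', 'CAM'],
    'W': ['RM', 'RW', 'LM', 'LW'],
    'CM': ['CM', 'LCM', 'RCM'],
    'DM': ['CDM', 'LDM', 'RDM'],
    'FB': ['RB', 'RWB', 'LB', 'LWB'],
    'CB': ['CB', 'LCB', 'RCB'],
}
# Invert to a flat position -> group lookup.
POSITION_TO_GROUP = {
    pos: group for group, members in POSITION_GROUPS.items() for pos in members
}

# Toy input; 'XX' is a made-up position that has no group.
toy_positions = pd.Series(['ST', 'RWB', 'GK', 'CDM', 'XX'])
toy_groups = toy_positions.map(POSITION_TO_GROUP)

# Positions that did not map to any group show up as NaN.
unmapped = toy_positions[toy_groups.isna()].unique()
print(toy_groups.tolist(), list(unmapped))
```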

In [5]:
# one-hot encode categorical features
categorical = ['Preferred Foot', 'Attack Work Rate', 'Defensive Work Rate', 'Position Category']
dummy_prefix = ['Foot', 'AWR', 'DWR', 'Pos']

for column, prefix in zip(categorical, dummy_prefix):
    df = pd.concat([df, pd.get_dummies(df[column], prefix=prefix)], axis=1)

## Model Training

In [9]:
# select features to cluster data by
X_columns = ['Age', 'Overall', 'Potential', 'Value', 'Wage', 'Pos_FW', 'Pos_AM', 'Pos_W', 'Pos_CM', 
             'Pos_DM', 'Pos_FB', 'Pos_CB', 'Pos_GK']

In [10]:
model = DBSCAN(eps=2, min_samples=15)
model.fit(df[X_columns])

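cluster_labels = model.labels_
# DBSCAN labels noise points -1, so exclude that label from the cluster count
n_clusters = len(set(cluster_labels)) - (1 if -1 in cluster_labels else 0)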
print(collections.Counter(cluster_labels))

df['cluster'] = cluster_labels

Counter({1: 12861, -1: 5218, 4: 16, 0: 15, 3: 15, 2: 14, 6: 12, 5: 8})
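
To read these counts as shares, the arithmetic can be sketched as follows. The counts are copied from the output above; the variable names are illustrative.

```python
from collections import Counter

# Label counts copied from the output above; -1 is DBSCAN's noise label.
label_counts = Counter({1: 12861, -1: 5218, 4: 16, 0: 15, 3: 15, 2: 14, 6: 12, 5: 8})

total = sum(label_counts.values())
n_real_clusters = sum(1 for label in label_counts if label != -1)
noise_share = label_counts[-1] / total
largest_share = label_counts[1] / total
small_cluster_players = total - label_counts[-1] - label_counts[1]

print(total, n_real_clusters)
print(f'noise: {noise_share:.1%}, largest cluster: {largest_share:.1%}, '
      f'other clusters: {small_cluster_players} players')
```

Of the 18,159 players, the six small clusters together hold only 80.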


## Model Evaluation

In [11]:
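# Inter-Cluster
# Caveat: reshape(-1, 1) turns each centroid into a column of 13 one-feature samples, so
# [0][0] is only the absolute difference in the first feature (Age), not the full distance.
# The loop also includes the noise label (-1) and each centroid's zero distance to itself.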
centroids = []
for cluster in sorted(set(model.labels_)):
    centroids.append(df[df['cluster']==cluster][X_columns].mean().values)
distances = []
for c1 in centroids:
    for c2 in centroids:
        distances.append(euclidean_distances(c1.reshape(-1, 1), c2.reshape(-1, 1))[0][0])
print('Inter Cluster distance', np.mean(distances))

# Intra-Cluster
# Same caveat: with reshape(-1, 1), [0][0] is the absolute Age difference to the centroid.
distances = []
for cluster in sorted(set(model.labels_)):
    df_filter = df[df['cluster']==cluster]
    centroid = df_filter[X_columns].mean().values
    for k, v in df_filter[X_columns].iterrows():
        distances.append(euclidean_distances(centroid.reshape(-1, 1), v.values.reshape(-1, 1))[0][0])
print('Intra Cluster distance', np.mean(distances))

# Inertia (squared distance over all features; noise points, label -1, are included)
distances = []
for cluster in sorted(set(model.labels_)):
    df_filter = df[df['cluster']==cluster]
    centroid = df_filter[X_columns].mean().values
    for k, v in df_filter[X_columns].iterrows():
        distances.append(euclidean_distances(centroid.reshape(1, -1), v.values.reshape(1, -1), squared=True)[0][0])
print('Inertia', np.sum(distances))

Inter Cluster distance 4.232891472112151
Intra Cluster distance 3.82047634653173
Inertia 8763042.84007172
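
The inter- and intra-cluster figures above come from `reshape(-1, 1)`, which compares only the first feature (Age), and all three figures treat the noise label as a cluster. A minimal sketch of one way to compute the same three quantities over all features while skipping noise is given below. The helper `cluster_distance_summary` and the toy frame are hypothetical and not part of the original notebook; no results for the FIFA data are implied.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import euclidean_distances

def cluster_distance_summary(frame, feature_cols, label_col='cluster'):
    """Return (mean inter-centroid distance, mean intra-cluster distance,
    inertia), ignoring rows labeled as noise (-1)."""
    clustered = frame[frame[label_col] != -1]
    centroids = clustered.groupby(label_col)[feature_cols].mean()

    # Pairwise centroid distances over all features; keep each pair once
    # and skip the zero diagonal.
    pairwise = euclidean_distances(centroids.values)
    upper = np.triu_indices(len(centroids), k=1)
    inter = pairwise[upper].mean() if len(upper[0]) else float('nan')

    # Distance from each clustered point to its own centroid.
    own_centroid = centroids.loc[clustered[label_col]].values
    point_dist = np.linalg.norm(clustered[feature_cols].values - own_centroid, axis=1)
    return inter, point_dist.mean(), float((point_dist ** 2).sum())

# Toy data (made up): two clusters of two points each, plus one noise point.
toy = pd.DataFrame({
    'x': [0.0, 0.0, 4.0, 4.0, 100.0],
    'y': [0.0, 2.0, 0.0, 2.0, 100.0],
    'cluster': [0, 0, 1, 1, -1],
})
inter, intra, inertia = cluster_distance_summary(toy, ['x', 'y'])
print(inter, intra, inertia)
```

On the notebook's data the call would be `cluster_distance_summary(df, X_columns)`.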


## Cluster Descriptions

Attempts to cluster with DBSCAN using various epsilon and minimum-sample values yielded similar results: one cluster held the majority of players (12,861 of 18,159, about 71%, in the run above). A sizeable share (5,218 players, about 29%) could not be assigned to any cluster under these settings and were labeled as noise (-1).
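
One possible contributor, offered as an assumption rather than a finding, is that the clustering features are on different scales (ages, 0 to 100 ratings, values, wages, and 0/1 position flags), so a single `eps` in raw units treats them unevenly. The sketch below uses made-up toy data to show how standardizing features before DBSCAN can change the outcome. The `eps=0.5` value is illustrative, and `eps` would need to be re-tuned after scaling; this is not the method used in this notebook.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Toy data (made up): two groups of 20 points; the second column spans a far wider range.
steps = np.arange(20)
group_a = np.column_stack([0.0 + 0.01 * steps, 0.0 + 1.0 * steps])
group_b = np.column_stack([10.0 + 0.01 * steps, 1000.0 + 1.0 * steps])
X_toy = np.vstack([group_a, group_b])

# Raw units: the wide column dominates the distance, so eps=2 reaches only adjacent
# points and no point gathers the 5 neighbours needed to start a cluster.
raw_labels = DBSCAN(eps=2, min_samples=5).fit_predict(X_toy)

# Standardized: each feature has zero mean and unit variance before distances are computed.
X_scaled = StandardScaler().fit_transform(X_toy)
scaled_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)

print('raw labels:', sorted(set(raw_labels)))
print('scaled labels:', sorted(set(scaled_labels)))
```

In this toy case the raw run marks every point as noise, while the scaled run recovers the two groups.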

In [26]:
print(df['Overall'].mean())
print(df['Potential'].mean())
print(df['Value'].mean())

66.24990362905446
71.31912550250564
2.4161313949005554


In [22]:
df[df['cluster']==1].sort_values(by='Potential', ascending=False).head()

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
11916,J. Carbonero,18,Colombia,64,82,Once Caldas,0.925,1.0,1522,Right,...,1,0,0,0,0,0,0,0,1,1
13568,A. Almendra,18,Argentina,62,81,Boca Juniors,0.625,2.0,1630,Right,...,1,0,0,1,0,0,0,0,0,1
9685,M. Coulibaly,19,Senegal,66,81,Udinese,1.4,4.0,1763,Right,...,1,0,0,1,0,0,0,0,0,1
12791,Y. Dhanda,19,England,63,81,Swansea City,0.8,3.0,1655,Right,...,1,1,0,0,0,0,0,0,0,1
10113,J. Pérez,20,United States,65,81,Los Angeles FC,1.2,2.0,1623,Left,...,1,0,0,0,0,0,0,0,1,1


In [23]:
df[df['cluster']==1].sort_values(by='Value', ascending=False).head()

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
2210,Míchel,29,Spain,74,74,Real Valladolid CF,5.5,18.0,1891,Right,...,1,0,0,1,0,0,0,0,0,1
2219,J. Kembo-Ekoko,30,DR Congo,74,74,Bursaspor,5.5,18.0,1873,Right,...,0,0,0,0,0,0,0,0,1,1
2301,Diogo Figueiras,27,Portugal,74,75,SC Braga,5.0,12.0,1964,Right,...,1,0,0,0,0,1,0,0,0,1
2432,J. Kana-Biyik,28,Cameroon,74,75,Kayserispor,5.0,13.0,1609,Right,...,1,0,1,0,0,0,0,0,0,1
3158,Y. Namli,24,Denmark,73,76,PEC Zwolle,5.0,8.0,1811,Left,...,1,0,0,0,0,0,0,0,1,1


The majority cluster varies broadly in age, position group, overall rating, potential, value, and wage. It contains older players whose potential is nearly or fully realized and younger players with large unrealized potential and a reasonable market value. The latter group is worth viewing as possible transfer targets. The top world-class players are not in this cluster.
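
Because the majority cluster is so broad, the younger, high-upside group has to be pulled out with an explicit filter. A minimal sketch follows, using five cluster-1 rows copied from the tables above. The thresholds `MAX_AGE`, `MIN_GROWTH`, and `MAX_VALUE` are hypothetical screening criteria, not values from the analysis.

```python
import pandas as pd

# A few cluster-1 rows copied from the tables above (Value in the dataset's units).
sample = pd.DataFrame({
    'Name': ['J. Carbonero', 'A. Almendra', 'Míchel', 'J. Kembo-Ekoko', 'Y. Namli'],
    'Age': [18, 18, 29, 30, 24],
    'Overall': [64, 62, 74, 74, 73],
    'Potential': [82, 81, 74, 74, 76],
    'Value': [0.925, 0.625, 5.5, 5.5, 5.0],
})

# Hypothetical screening thresholds; tune them to the club's own criteria.
MAX_AGE = 21
MIN_GROWTH = 10        # Potential minus Overall
MAX_VALUE = 2.4161     # dataset mean Value printed above (rounded)

sample['Growth'] = sample['Potential'] - sample['Overall']
targets = sample[
    (sample['Age'] <= MAX_AGE)
    & (sample['Growth'] >= MIN_GROWTH)
    & (sample['Value'] <= MAX_VALUE)
].sort_values('Growth', ascending=False)

print(targets[['Name', 'Age', 'Growth', 'Value']])
```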

In [24]:
df[df['cluster']==4]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
3168,Hervías,25,Spain,73,76,SD Eibar,5.0,17.0,1783,Right,...,1,0,0,0,0,0,0,0,1,4
3189,A. Solari,26,Argentina,73,74,Racing Club,4.7,17.0,1956,Right,...,0,0,0,0,0,0,0,0,1,4
3222,Rober Ibáñez,25,Spain,72,75,Getafe CF,3.9,15.0,1836,Right,...,1,0,0,0,0,0,0,0,1,4
3265,K. Karaman,24,Turkey,72,76,Fortuna Düsseldorf,4.1,18.0,1726,Right,...,1,0,0,0,0,0,0,0,1,4
3403,T. Pledl,24,Germany,72,75,FC Ingolstadt 04,3.9,16.0,1872,Right,...,1,0,0,0,0,0,0,0,1,4
3523,João Schmidt,25,Brazil,72,76,Rio Ave FC,3.9,18.0,1977,Left,...,1,0,0,1,0,0,0,0,0,4
3651,R. Gómez,25,Argentina,72,75,Unión de Santa Fe,3.9,17.0,1903,Right,...,0,0,0,0,0,0,0,0,1,4
3664,R. Matos,25,Brazil,72,75,Hellas Verona,3.9,19.0,1814,Right,...,1,0,0,0,0,0,0,0,1,4
3743,A. Biyogo Poko,25,Gabon,72,76,Göztepe SK,3.9,17.0,1941,Right,...,0,0,0,0,1,0,0,0,0,4
3802,O. Rivero,26,Uruguay,72,75,Club Atlas,3.9,17.0,1683,Right,...,1,0,0,0,0,0,1,0,0,4


In [29]:
df[df['cluster']==3]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
2700,S. Thioub,23,France,73,78,Nîmes Olympique,5.5,14.0,1785,Left,...,1,0,0,0,0,0,0,0,1,3
2930,T. Murg,23,Austria,73,77,SK Rapid Wien,5.0,14.0,1912,Left,...,1,0,0,0,0,0,0,0,1,3
3255,P. Gerkens,23,Belgium,72,77,RSC Anderlecht,4.2,13.0,1909,Right,...,1,0,0,0,0,0,1,0,0,3
3469,A. Castro,23,Argentina,72,78,San Lorenzo de Almagro,4.3,13.0,1918,Left,...,1,0,0,0,0,0,0,0,1,3
3575,A. Barboza,23,Argentina,72,77,Defensa y Justicia,3.5,13.0,1449,Left,...,1,0,1,0,0,0,0,0,0,3
3669,M. Møller Dæhli,23,Norway,72,77,FC St. Pauli,4.2,12.0,1619,Right,...,0,0,0,0,0,0,0,0,1,3
3703,D. Bouanga,23,Gabon,72,77,Nîmes Olympique,4.2,12.0,1778,Right,...,1,0,0,0,0,0,0,0,1,3
3742,J. Otero,23,Colombia,72,79,Amiens SC,4.4,12.0,1847,Right,...,1,0,0,0,0,0,0,0,1,3
3787,M. Rodríguez,23,Chile,72,78,U.N.A.M.,4.3,14.0,1816,Right,...,1,0,0,0,0,0,0,0,1,3
3864,L. Phiri,23,South Africa,72,78,En Avant de Guingamp,4.2,14.0,1962,Right,...,0,0,0,1,0,0,0,0,0,3


The two clusters above contain players in their early and mid-20s who have not reached their full potential. Wingers are the most represented position group in both. Every player shown has an overall rating and a potential rating above the respective dataset means, suggesting they are decent players worth considering as transfer targets. However, every one of them also has a value above the dataset mean.

In [28]:
df[df['cluster']==0]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
977,O. Kıvrak,30,Turkey,77,77,Trabzonspor,5.5,24.0,1289,Right,...,1,0,0,0,0,0,0,1,0,0
1167,N. Pallois,30,France,77,77,FC Nantes,6.5,25.0,1754,Left,...,0,0,1,0,0,0,0,0,0,0
1220,S. Langkamp,30,Germany,76,76,SV Werder Bremen,5.5,24.0,1488,Right,...,1,0,1,0,0,0,0,0,0,0
1225,M. Esser,30,Germany,76,76,Hannover 96,4.9,24.0,1193,Right,...,1,0,0,0,0,0,0,1,0,0
1282,Alexo Baia,30,Brazil,76,76,Cruzeiro,5.5,25.0,1920,Right,...,1,0,0,0,0,1,0,0,0,0
1392,F. Lustenberger,30,Switzerland,76,76,Hertha BSC,5.5,24.0,1758,Right,...,0,0,1,0,0,0,0,0,0,0
1433,P. Aguilar,31,Paraguay,76,76,Cruz Azul,5.0,23.0,1781,Right,...,0,0,1,0,0,0,0,0,0,0
1455,Júnior Caiçara,29,Brazil,76,76,Medipol Başakşehir FK,6.0,24.0,2059,Right,...,1,0,0,0,0,1,0,0,0,0
1486,Andeson Trigo,30,Brazil,76,76,Fluminense,5.5,25.0,2105,Left,...,1,0,0,0,0,1,0,0,0,0
1513,Raúl Navas,30,Spain,76,76,Real Sociedad,5.5,24.0,1688,Right,...,1,0,1,0,0,0,0,0,0,0


This cluster has players who are around 30 years of age and have reached their peak potential. Their market values are also above the dataset mean. Acquiring a player from this cluster is unlikely to yield a return on investment in market value.

In [32]:
df[df['cluster']==5]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
4112,P. Škuletić,28,Serbia,71,71,Montpellier HSC,2.4,18.0,1565,Left,...,0,0,0,0,0,0,1,0,0,5
4179,N. Ghilas,28,Algeria,71,71,Göztepe SK,2.4,18.0,1745,Right,...,1,0,0,0,0,0,1,0,0,5
4295,D. Kaiser,29,Germany,71,71,Brøndby IF,2.3,18.0,1970,Right,...,1,1,0,0,0,0,0,0,0,5
4515,F. Navarro,29,Mexico,71,71,Club León,1.8,18.0,1928,Right,...,1,0,0,0,0,1,0,0,0,5
4623,R. Herrera,29,Uruguay,71,71,Pachuca,1.8,18.0,1549,Right,...,1,0,1,0,0,0,0,0,0,5
4731,Cristian López,29,Spain,71,71,Angers SCO,2.4,17.0,1703,Right,...,1,0,0,0,0,0,1,0,0,5
4970,D. Blum,27,Germany,70,71,UD Las Palmas,2.1,19.0,1727,Left,...,0,0,0,0,0,0,1,0,0,5
5455,L. Jutkiewicz,29,England,70,70,Birmingham City,1.8,18.0,1761,Left,...,1,0,0,0,0,0,1,0,0,5


This cluster resembles the previous one: its players are approaching 30 and have reached their potential. The difference is that every player shown has a market value below the dataset mean. Because their potential ratings are also all below the dataset mean, these players are unlikely to improve a team and should not be targeted.

In [30]:
df[df['cluster']==2]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
3193,C. Roldan,23,United States,73,79,Seattle Sounders FC,4.7,6.0,1987,Right,...,0,0,0,0,0,0,0,0,1,2
3247,Bruno Tabata,21,Brazil,72,81,Portimonense SC,4.8,6.0,1827,Left,...,0,0,0,0,0,0,0,0,1,2
3253,H. Diallo,23,Senegal,72,80,FC Metz,4.6,6.0,1643,Right,...,1,0,0,0,0,0,1,0,0,2
3272,M. Murillo,22,Panama,72,80,New York Red Bulls,3.9,5.0,1913,Right,...,0,0,0,0,0,1,0,0,0,2
3294,Pedro Nuno,23,Portugal,72,79,Moreirense FC,4.4,5.0,1830,Right,...,1,0,0,0,0,0,0,0,1,2
3381,Rafa Soares,23,Portugal,72,80,Vitória Guimarães,3.9,6.0,1945,Left,...,1,0,0,0,0,1,0,0,0,2
3391,S. Adegbenro,22,Nigeria,72,80,Rosenborg BK,4.6,6.0,1873,Right,...,1,0,0,0,0,0,0,0,1,2
3480,S. Mosquera,23,Colombia,72,80,FC Dallas,4.6,6.0,1734,Right,...,0,0,0,0,0,0,0,0,1,2
3656,L. Agbenyenu,21,Ghana,72,80,Sporting CP,3.9,6.0,1802,Left,...,1,0,0,0,0,1,0,0,0,2
3849,K. Acosta,22,United States,72,80,Colorado Rapids,4.5,6.0,2032,Right,...,0,0,0,1,0,0,0,0,0,2


This cluster resembles the two clusters of players in their 20s with unrealized potential, but its players have slightly higher potential and lower wages. No wage in this cluster exceeds 7K, whereas no wage in those clusters is below 12K. The lower wages make these players better value, and given their age and potential they should be treated as transfer targets. As in those clusters, wingers are well represented.
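
Comparisons like this one can be made with a per-cluster summary instead of reading the tables by eye. The sketch below uses two rows from each of clusters 2, 3, and 4, copied from the tables above; the output column names (`age_mean` and so on) are illustrative.

```python
import pandas as pd

# Two rows per cluster, copied from the tables above.
rows = pd.DataFrame({
    'Name': ['C. Roldan', 'Bruno Tabata', 'S. Thioub', 'T. Murg', 'Hervías', 'A. Solari'],
    'Age': [23, 21, 23, 23, 25, 26],
    'Potential': [79, 81, 78, 77, 76, 74],
    'Wage': [6.0, 6.0, 14.0, 14.0, 17.0, 17.0],
    'cluster': [2, 2, 3, 3, 4, 4],
})

# Named aggregation: one summary row per cluster.
summary = rows.groupby('cluster').agg(
    age_mean=('Age', 'mean'),
    potential_mean=('Potential', 'mean'),
    wage_min=('Wage', 'min'),
    wage_max=('Wage', 'max'),
)
print(summary)
```

On the full data, `df[df['cluster'] != -1].groupby('cluster')` would be the starting point.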

In [31]:
df[df['cluster']==6]

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
14815,A. Kay,35,England,60,60,Port Vale,0.05,2.0,1579,Right,...,1,0,0,0,1,0,0,0,0,6
15342,T. Enomoto,35,Japan,59,59,Urawa Red Diamonds,0.04,1.0,841,Right,...,1,0,0,0,0,0,0,1,0,6
15431,M. Sawa,35,Japan,59,59,Kashiwa Reysol,0.07,1.0,1476,Right,...,0,0,0,0,0,0,1,0,0,6
15476,L. Kryger,35,Denmark,59,59,AC Horsens,0.07,1.0,1620,Right,...,0,0,0,0,0,0,0,0,1,6
15624,Ahn Seong Nam,34,Korea Republic,59,59,Gyeongnam FC,0.05,1.0,1757,Right,...,0,0,0,0,1,0,0,0,0,6
15644,S. Russell,35,England,59,59,Grimsby Town,0.04,1.0,1171,Right,...,1,0,0,0,0,0,0,1,0,6
15720,K. Tokushige,34,Japan,59,59,V-Varen Nagasaki,0.06,1.0,899,Right,...,1,0,0,0,0,0,0,1,0,6
15808,P. Cherrie,34,Scotland,58,58,Derry City,0.05,1.0,977,Right,...,1,0,0,0,0,0,0,1,0,6
15844,B. Williams,35,England,58,58,Bolton Wanderers,0.03,1.0,1123,Right,...,1,0,0,0,0,0,0,1,0,6
16023,S. Farelli,35,Italy,58,58,Pescara,0.03,1.0,1033,Right,...,1,0,0,0,0,0,0,1,0,6


This cluster has players whose overall rating is below the dataset mean and who are at an age when most players are past their prime and approaching retirement. They should not be considered for acquisition.

In [35]:
df[df['cluster']==-1].head()

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
0,L. Messi,31,Argentina,94,94,FC Barcelona,110.5,565.0,2202,Left,...,1,0,0,0,0,0,1,0,0,-1
1,Cristiano Ronaldo,33,Portugal,94,94,Juventus,77.0,405.0,2228,Right,...,0,0,0,0,0,0,1,0,0,-1
2,Neymar Jr,26,Brazil,92,93,Paris Saint-Germain,118.5,290.0,2143,Right,...,1,0,0,0,0,0,0,0,1,-1
3,De Gea,27,Spain,91,93,Manchester United,72.0,260.0,1471,Right,...,1,0,0,0,0,0,0,1,0,-1
4,K. De Bruyne,27,Belgium,91,92,Manchester City,102.0,355.0,2281,Right,...,0,0,0,1,0,0,0,0,0,-1


In [34]:
df[df['cluster']==-1].tail(20)

Unnamed: 0,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Special,Preferred Foot,...,DWR_Medium,Pos_AM,Pos_CB,Pos_CM,Pos_DM,Pos_FB,Pos_FW,Pos_GK,Pos_W,cluster
18127,E. Clarke,19,England,48,59,Fleetwood Town,0.04,1.0,1225,Left,...,1,0,0,0,0,1,0,0,0,-1
18128,T. Hillman,17,Wales,48,57,Newport County,0.04,1.0,1218,Right,...,1,0,0,0,0,0,0,0,1,-1
18129,R. Roache,18,Republic of Ireland,48,69,Blackpool,0.07,1.0,1178,Right,...,1,0,0,0,0,0,1,0,0,-1
18132,M. Hurst,22,Scotland,48,58,St. Johnstone FC,0.04,1.0,987,Right,...,1,0,0,0,0,0,0,1,0,-1
18135,K. Pilkington,44,England,48,48,Cambridge United,0.0,1.0,774,Right,...,1,0,0,0,0,0,0,1,0,-1
18136,D. Horton,18,England,48,55,Lincoln City,0.04,1.0,1368,Right,...,1,0,0,1,0,0,0,0,0,-1
18137,E. Tweed,19,Republic of Ireland,48,59,Derry City,0.05,1.0,1315,Right,...,1,0,0,1,0,0,0,0,0,-1
18138,Zhang Yufeng,20,China PR,47,64,Beijing Renhe FC,0.06,1.0,1389,Right,...,1,0,0,1,0,0,0,0,0,-1
18139,C. Ehlich,19,Germany,47,59,SpVgg Unterhaching,0.04,1.0,1366,Right,...,1,0,0,0,0,1,0,0,0,-1
18140,L. Collins,17,Wales,47,62,Newport County,0.06,1.0,1297,Right,...,1,0,0,1,0,0,0,0,0,-1


The players who could not be clustered include the top world-class players and players whose potential ratings are below the dataset mean. Some are young players with room to grow, but even their full potential rating is below the dataset mean, so they would not be quality additions. None of these unclustered players should be pursued: either the cost of acquiring them is too high for a substantial return on investment, or their quality would not make them an asset to a team.