# **This notebook provides a quick analysis of the Fantasy EPL Assets for the upcoming season (2020-21)**

The analysis is based purely on fantasy stats from the FPL dataset. More sophisticated methods involve supplementing the dataset here with xG and xA stats, which are available through paid data streams. 

A fair bit of the starting block of the notebook is an exploratory piece into the dataset for the benefit of beginners and people who want to analyze it further. 

In [None]:
import numpy as np
import pandas as pd 
import json


The data files are JSON and need to be read from disk and converted into dictionaries using the code below. 

In [None]:
# Load the 2019-20 season stats; the context manager closes the file once it is parsed
with open('../input/fantasy-epl-new-season-research-2020-2021/FPL_2019_20_season_stats.jscsrc') as fpl2020_file:
    fpl2020 = json.load(fpl2020_file)
# Load the 2020-21 player list the same way
with open('../input/fantasy-epl-new-season-research-2020-2021/FPL_2020_21_player_list.jscsrc') as fpl2021_file:
    fpl2021 = json.load(fpl2021_file)

Let's take a look at the keys in the dataset.

In [None]:
fpl2020.keys()

The key called 'elements' contains the player list and the key called 'teams' contains the team-level stats.
Let's take a look at the team stats for 2019-20. It is good practice to check the datatype of the data in each key.

In [None]:
for key in fpl2020.keys():
    print('Data type: %s for Key: %s' %(type(fpl2020[key]),key))

In [None]:
print('Understanding the data structure for Teams')
print(fpl2020['teams'][0].keys())
print('Understanding the data structure for Elements')
print(fpl2020['elements'][0].keys())

We can see that both data structures are lists of dictionaries. Further manipulation can be done using Pandas. 

In [None]:
teams2020 = pd.DataFrame(fpl2020['teams'])
players2020 = pd.DataFrame(fpl2020['elements'])
teams2020.head()

In [None]:
players2020.head()

It is interesting to check how the clubs rank on total fantasy points earned. 

In [None]:
fpoints_table = players2020.groupby('team_code')['total_points'].sum()
fpoints_table = pd.DataFrame(fpoints_table)
fpoints_table['code'] = fpoints_table.index
fpoints_table['fpoints_rank']=fpoints_table['total_points'].rank(ascending = False)
leaguetable_2020 = teams2020[['short_name','code','win','draw','loss','points','position','strength','strength_overall_home','strength_overall_away','strength_attack_home','strength_attack_away','strength_defence_home','strength_defence_away','pulse_id']]
league_fpoints = leaguetable_2020.join(fpoints_table, on='code', how='left',lsuffix = 'lt')
table_comparison = league_fpoints[['short_name','position','points','fpoints_rank','total_points','code']]
table_comparison.sort_values('position')

It is evident from the table that league success does not always imply fantasy success and vice versa. 
Let's look at this relationship in a visual manner. We will plot a linear regression between league points and fantasy points. 

A critical consideration is to understand which variable is dependent and which is independent. With only two variables, the strength of the linear relationship is the same either way, although the fitted least-squares line itself depends on which variable is regressed on the other. We typically look at league points to establish whether a team did well in a season, so we will keep that as the independent variable. This is purely a convention-based approach. 
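
One nuance about the choice of variables is easier to see in numbers. The sketch below uses synthetic data (the arrays `league_pts` and `fantasy_pts` are made-up values, not taken from the FPL dataset) and fits the line both ways with `numpy.polyfit`. The correlation does not depend on the direction, but the two fitted slopes are exact inverses of each other only when the fit is perfect.

```python
import numpy as np

# Synthetic, made-up numbers purely for illustration (not from the FPL data)
league_pts = np.array([99.0, 81.0, 66.0, 59.0, 54.0, 44.0, 35.0, 21.0])
fantasy_pts = np.array([2100.0, 2250.0, 1700.0, 1650.0, 1500.0, 1550.0, 1400.0, 1200.0])

# Regress fantasy points on league points (the convention used in this notebook)
slope_yx, intercept_yx = np.polyfit(league_pts, fantasy_pts, 1)

# Regress league points on fantasy points (the reverse choice)
slope_xy, intercept_xy = np.polyfit(fantasy_pts, league_pts, 1)

# Correlation is symmetric: it does not care which variable is "independent"
r = np.corrcoef(league_pts, fantasy_pts)[0, 1]

# The product of the two slopes equals r squared,
# so the slopes are exact inverses only when |r| = 1
print(round(slope_yx * slope_xy, 4), round(r ** 2, 4))
```

The two printed numbers should match, which is why the choice of independent variable here is a matter of convention rather than of substance.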

In [None]:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
X = table_comparison['points'].values.reshape(-1,1)
y = table_comparison['total_points'].values.reshape(-1,1)
names = table_comparison['short_name'].values.reshape(-1,1)
model.fit(X,y)
y_pred = model.predict(X)
import matplotlib.pyplot as plt
plt.figure(figsize=(15,10))
plt.scatter(X,y,color='black')
plt.plot(X,y_pred, color='blue', linewidth=3)

for a,b,l in zip(X.ravel(),y.ravel(),names.ravel()):  # flatten so each label is a plain string, not a 1-element array
    plt.annotate(l,(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.show()


We can see that Man City, and to some extent Leicester, outperformed in terms of fantasy points. Newcastle, West Ham and Villa also showed a similar trend. 

Conversely, Liverpool, Spurs, Southampton and Palace showed a negative deviation. 

Let's dive into understanding how these clubs scored those points on a position and a player level. 

In [None]:
fpoints_by_pos = players2020.groupby(by = ['team_code','element_type'],as_index=False)['total_points'].sum()
fpoints_pos = fpoints_by_pos.pivot(index='team_code', columns='element_type', values='total_points')
fpoints_pos.columns = ['GK','Def','Mid','Fwd']
fpoints_pos['code']=fpoints_pos.index
fp_pos = table_comparison.join(fpoints_pos, on='code', how='left',lsuffix = 'lt')
for each in ['GK','Def','Mid','Fwd']:
    fp_pos['perc_'+str(each)]=(fp_pos[each]/fp_pos['total_points']*100).round(2)
fp_pos['perc_Defense'] = fp_pos['perc_GK'] + fp_pos['perc_Def']
fp_pos[['short_name', 'position','total_points', 'perc_GK', 'perc_Def',
       'perc_Mid', 'perc_Fwd','code','perc_Defense']].sort_values('total_points',ascending = False)

The table is starting to give some insights. City's defensive fantasy assets had a terrible fantasy season, with just 31.15% of their total points coming from goalkeepers and defenders. That's the worst in the league, which is astonishing for a team of their calibre. 

In contrast, Sheffield scored 61.48% of their points from their strong 5 man defense and the goal scoring exploits of the misclassified Lundstram. 

Along with Sheffield, Newcastle and Wolves make a case for investment in their defensive assets, but the change in position for Matt Ritchie may put a damper on the Magpies. 

City had the best midfield points ratio, led by the massive 251 point haul from Kevin De Bruyne. West Ham's 55.06% can be attributed to the misclassified Antonio hitting form after the lockdown. Villa have the next best midfield ratio, mostly attributed to Grealish, who was 10th in the midfielder ranking. Spurs have the benefit of Son's exploits. Other notable mentions include Liverpool with Salah and Mane, Chelsea with none of their midfielders in the top 10, and Norwich, who were not let down by their midfield. 

The teams depending mostly on their front line are Southampton with Ings, Arsenal with Aubameyang (now a midfielder), Burnley, and the now relegated Watford and Bournemouth.

We have only considered aggregate team level stats, and it's time to dive into player level analysis. Before we do that it is important to address that player positions and prices have changed in the new season. We will factor those in our calculations. 

In [None]:
players2021 = pd.DataFrame(fpl2021['elements'])
combined = players2020.merge(players2021[[ 'now_cost','code','element_type','team_code']], on = 'code',how='outer',suffixes = ['','_21'])
teams2020['team_code'] = teams2020['code'].astype('Int64')
combined['team_code'] = combined['team_code'].astype('Int64')
combined = combined.merge(teams2020, on = 'team_code', how = 'left',suffixes = ('','_team'))

The first thing we are going to do is replicate the analysis above after accounting for positional changes in the new season. It is not enough to reclassify the player directly; we need to recount the total score as a result of the player's change of position. We can do that by defining a function for points calculation and applying it to the dataframe row-wise. 

A key point to note is that we are not working with a gameweek-level dataset. Hence we are unable to know if a player got 2 points in a gameweek for playing 60 minutes, or if points were deducted from a defensive player for every 2 goals conceded in a gameweek. 

In order to approximate this, we will calculate the points based on the FPL scoring model and then subtract these calculated points from the total points. The remainder is the sum over the season of the gameweek-dependent points (appearances, saves, goals conceded) that we cannot recompute, and it is carried over unchanged.
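
As a worked example of this residual approach, the sketch below uses a made-up forward and shows how reclassifying him as a midfielder changes the total while the residual is carried over. The stat line is hypothetical, the helper name `event_points` is invented for illustration, and cards, own goals and penalties are left out to keep the arithmetic short.

```python
# Points per goal and per clean sheet by position code (1=GK, 2=Def, 3=Mid, 4=Fwd),
# matching the values used in this notebook's scoring function
GOAL_PTS = {1: 6, 2: 6, 3: 5, 4: 4}
CLEAN_SHEET_PTS = {1: 4, 2: 4, 3: 1, 4: 0}

def event_points(stats, position):
    """Points from goals, clean sheets, assists and bonus only (a simplification)."""
    return (GOAL_PTS[position] * stats['goals_scored']
            + CLEAN_SHEET_PTS[position] * stats['clean_sheets']
            + 3 * stats['assists']
            + stats['bonus'])

# Hypothetical stat line, invented for illustration
player = {'goals_scored': 20, 'clean_sheets': 10, 'assists': 5,
          'bonus': 25, 'total_points': 200}

as_forward = event_points(player, 4)             # 4*20 + 0*10 + 3*5 + 25 = 120
residual = player['total_points'] - as_forward   # 200 - 120 = 80
as_midfielder = event_points(player, 3)          # 5*20 + 1*10 + 3*5 + 25 = 150
projected_total = as_midfielder + residual       # 150 + 80 = 230

print(as_forward, residual, as_midfielder, projected_total)
```

In this toy case the move from forward to midfielder is worth 30 points: one extra point per goal plus one point per clean sheet.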

In [None]:
def points_calc(row,label='element_type'):
    points = 0
    if row[label] == 1:
        points = points + 6*row['goals_scored'] + 4*row['clean_sheets'] + 5*row['penalties_saved'] # + save_points - 2goals concededpts
    elif row[label] == 2:
        points = points + 6*row['goals_scored'] + 4*row['clean_sheets'] #  - 2goals concededpts
    elif row[label] == 3:
        points = points + 5*row['goals_scored'] + 1*row['clean_sheets']
    else:
        points = points + 4*row['goals_scored']
    
    points = points + 3*row['assists'] - 1*row['yellow_cards'] - 2*row['red_cards'] - 2*row['own_goals'] -2*row['penalties_missed'] + row['bonus']
    return points

In [None]:
players2020['projected_points'] = players2020.apply(points_calc,axis = 1)
players2020[['projected_points','total_points']]
players2020['gw_points'] = players2020['total_points'] - players2020['projected_points']
combined = players2020.merge(players2021[[ 'now_cost','code','element_type','team_code']], on = 'code',how='outer',suffixes = ['','_21'])
combined['projected_points_21'] = combined.apply(points_calc,axis = 1,args = ['element_type_21'] )
combined['proj_tot_points_21'] = combined['projected_points_21'] + combined['gw_points']
combined['team_code'] = combined['team_code'].astype('Int64')
combined['team_code_21'] = combined['team_code_21'].astype('Int64')
combined = combined.merge(teams2020, on = 'team_code', how = 'left',suffixes = ('','_team'))
combined = combined.merge(teams2020, left_on = 'team_code_21',right_on='team_code', how = 'left',suffixes = ('','_team_21'))

We have created a combined dataframe which has all 2019-20 metrics alongside the 2020-21 player positions, teams and prices. Let's have another look at our points table. 

In [None]:
fpoints_table_21 = combined.groupby('team_code')['proj_tot_points_21'].sum()
fpoints_table_21 = pd.DataFrame(fpoints_table_21)
fpoints_table_21['code'] = fpoints_table_21.index
fpoints_table_21['fpoints_rank']=fpoints_table_21['proj_tot_points_21'].rank(ascending = False)
leaguetable_2021 = teams2020[['short_name','code','win','draw','loss','points','position','strength','strength_overall_home','strength_overall_away','strength_attack_home','strength_attack_away','strength_defence_home','strength_defence_away','pulse_id']]
league_fpoints_21 = leaguetable_2021.join(fpoints_table_21, on='code', how='left',lsuffix = 'lt')
table_comparison_21 = league_fpoints_21[['short_name','position','points','fpoints_rank','proj_tot_points_21','code']]
table_comparison_21 = table_comparison_21.merge(table_comparison[['code','total_points']], on='code', how='left',suffixes = ['','_20'])
table_comparison_21 = table_comparison_21.sort_values('position')
table_comparison_21[:-3]

While most of the values show very small changes in total points, there are some notable items in the table. Arsenal see a 25 point bump due to Aubameyang's position change to midfielder. Sheffield, Newcastle and West Ham, the previous outperformers, all see points docked due to positional updates. 
If we take a look at the split of projected points by position, we can identify each team's area of investment more accurately. 

In [None]:
fpoints_by_pos_21 = combined.groupby(by = ['team_code','element_type_21'],as_index=False)['proj_tot_points_21'].sum()
fpoints_pos_21 = fpoints_by_pos_21.pivot(index='team_code', columns='element_type_21', values='proj_tot_points_21')
fpoints_pos_21.columns = ['GK','Def','Mid','Fwd']
fpoints_pos_21['code']=fpoints_pos_21.index
fp_pos_21 = table_comparison_21.join(fpoints_pos_21, on='code', how='left',lsuffix = 'lt')
for each in ['GK','Def','Mid','Fwd']:
    fp_pos_21['proj_perc_'+str(each)]=(fp_pos_21[each]/fp_pos_21['proj_tot_points_21']*100).round(2)
fp_pos_21['proj_perc_Defense'] = fp_pos_21['proj_perc_GK'] + fp_pos_21['proj_perc_Def']
fp_pos_21 = fp_pos_21[['short_name', 'position','proj_tot_points_21', 'proj_perc_GK', 'proj_perc_Def',
       'proj_perc_Mid', 'proj_perc_Fwd','code','proj_perc_Defense']].sort_values('proj_tot_points_21',ascending = False)
fp_pos_21[:-3] 

We can see the confirmation of some of our earlier notions and some interesting observations. 
City still have the worst defense, which they hope to fix with the £40 mn signing of Nathan Ake and the fitness of Laporte, who was on a lengthy injury lay-off last season. 

Sheffield still have the highest defensive points even with Lundstram as a midfielder. However, the prices of the previously cut-price Sheffield assets have gone up considerably and we will look at that shortly. 

Matt Ritchie as a midfielder puts Burnley and Wolves at a higher level than Newcastle. 

United turn out to be the champions in midfield and not City. Villa are the second best by a slim margin over a host of other clubs. The midfield competition gets tighter, with City, Liverpool, Chelsea, Arsenal, Spurs and West Ham all between 45 and 48 percent of points. 

Southampton, and now Everton (due to Richarlison as a forward), depend on their front lines to rake in the fantasy points. 

The stage is set to dive into player-level analysis.

In [None]:
# Basic filtering of the player list before the player-level analysis.
# Remove players that do not have a team code for 2020-21.
targets = combined.dropna(axis=0, how='any', subset=['team_code_21'])
# Remove players that have no projected points score, or a score of 0.
targets = targets.dropna(axis=0, how='any', subset=['proj_tot_points_21'])
targets = targets[targets['proj_tot_points_21'] != 0]
# Remove players that have 0 minutes played (copy so a column can be added without warnings).
targets = targets[targets['minutes'] != 0].copy()
# Games played is approximated as total points / points per game, rounded to a whole game.
games_played = round(targets['total_points'] / targets['points_per_game'].astype(float), 0)
# Remove players with fewer than 2 projected points per game.
targets['proj_ppg_21'] = targets['proj_tot_points_21'] / games_played
targets = targets[targets['proj_ppg_21'] >= 2].copy()
targets.shape[0]

We have more than halved our player list from 747 to 296. But even 296 is a really large list, and we are yet to consider new arrivals to the EPL this year, both players and promoted clubs. 
But let's take a look at how many players there are per club in this list.

In [None]:
players_per_club = targets['team_code_21'].value_counts()
targets.pivot_table(values ='team_code_21',index =  'short_name_team_21',aggfunc = 'count')

Most clubs are around the 17-20 mark, since this type of filtering has removed the inconsequential 93rd-minute substitutes getting 1 point per game and the holding midfielders who hardly earn any points. 

The next step is to create a metric to compare these players. It would be too easy if there was a single metric to identify the best players. We are going to build a few metrics with their logic explained. To the hardened FPL player these should be familiar terms. 
* Points per 90 mins - Gives the expected value of points in a gameweek
* Points per million per 90 mins - Gives the expected ROI of points in a gameweek 
* Player points as a percent of club points - Gives the relative importance of the player to the club
* Bonus points percent of total points scored - Compares the ability of the player to earn bonus points
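
As a quick worked example of the four metrics listed above, here is a sketch with made-up numbers for a single hypothetical player. The variable names are chosen for illustration. The only thing taken from the dataset is the convention that `now_cost` is stored in tenths of a million, which is why it is divided by 10.

```python
# Hypothetical player, numbers invented for illustration
points = 180        # projected season points
minutes = 2700      # minutes played over the season
now_cost = 95       # price in tenths of a million, so 95 means 9.5
club_points = 1800  # projected points summed over the player's club
bonus = 27          # bonus points earned

# Expected points for a full 90 minutes
points_per90 = round(points / minutes * 90, 2)
# Expected points for a full 90 minutes, per million spent
points_permn_per90 = round(points / (now_cost / 10) / minutes * 90, 3)
# Share of the club's points contributed by this player
club_points_percent = round(points / club_points * 100, 2)
# Share of the player's own points that came from bonus
bonus_points_percent = round(bonus / points * 100, 2)

print(points_per90, points_permn_per90, club_points_percent, bonus_points_percent)
```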

In [None]:
targets['proj_points_per90'] = round(targets['proj_tot_points_21'] / targets['minutes']*90,2)
targets['proj_points_permn_per90'] = round(targets['proj_tot_points_21'] / (targets['now_cost_21']/10)/ targets['minutes']*90,3)
club_proj_scores = targets.groupby(by = 'team_code_21')['proj_tot_points_21'].sum()
targets['club_points_21'] =  targets['team_code_21'].apply(lambda x: club_proj_scores[x])
targets['proj_club_points_percent'] = round(targets['proj_tot_points_21'] / targets['club_points_21']*100,2)
targets['proj_bonus_points_perc'] = round(targets['bonus'] / targets['proj_tot_points_21']*100,2)

targets[['proj_points_per90','proj_points_permn_per90','proj_club_points_percent','proj_bonus_points_perc']].corr()

As with any created features, it's important to note their correlation. We can observe that points per 90 and points per million per 90 are certainly correlated. We will dig into this further to make sense of the relationship. For now we will take it to mean that players who score more points also naturally give more ROI. 

The other correlated pair is club points percent and bonus points percent, which is expected. Players who outperform at their club are the usual suspects for bonus points. 

Let's plot some scatter charts...


In [None]:
# Plotting Points per 90 vs Total Points for the first 50 players by total points
data = targets[['proj_points_per90','proj_tot_points_21','web_name']].nlargest(50,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(data['proj_tot_points_21'],data['proj_points_per90'],color='black')
for a,b,l in zip(data['proj_tot_points_21'],data['proj_points_per90'],data['web_name']):
    plt.annotate(l,(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per 90')

plt.show()


In terms of points scoring potential in a gameweek, the City assets Aguero, Mahrez and De Bruyne are head and shoulders above anyone else. The Liverpool pair of Salah and Mane are a distant second. 

It would make a better read to look at this chart normalized by price. This is the reason we have constructed the per million per 90 minutes ROI value.  

In [None]:
data = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','code_team_21']].nlargest(50,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(data['proj_tot_points_21'],data['proj_points_permn_per90'],c=data['code_team_21'],cmap = 'tab20')
for a,b,l in zip(data['proj_tot_points_21'],data['proj_points_permn_per90'],data['web_name']):
    plt.annotate(l,(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()

Before we jump into reading this chart, let's split out the chart by position of the player. 
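
Splitting by position amounts to filtering on the position code and keeping the top scorers, which could be wrapped in a small helper. This is a minimal sketch, assuming a frame with the same column names as `targets`; the function name `top_by_position` and the toy data are made up for illustration.

```python
import pandas as pd

def top_by_position(players, position, n):
    """Return the n highest scorers for one position code (1=GK, 2=Def, 3=Mid, 4=Fwd)."""
    subset = players[players['element_type_21'] == position]
    return subset.nlargest(n, 'proj_tot_points_21')

# Tiny made-up frame that mimics the column names used in this notebook
toy = pd.DataFrame({
    'web_name': ['A', 'B', 'C', 'D'],
    'element_type_21': [1, 2, 2, 2],
    'proj_tot_points_21': [120, 90, 150, 110],
})

# Top two defenders in the toy frame, highest scorer first
top_defenders = top_by_position(toy, 2, 2)
print(top_defenders['web_name'].tolist())
```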


In [None]:
gkdata = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

gkdata = gkdata[gkdata['element_type_21']==1].nlargest(25,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(gkdata['proj_tot_points_21'],gkdata['proj_points_permn_per90'],c=gkdata['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(gkdata['proj_tot_points_21'],gkdata['proj_points_permn_per90'],gkdata['web_name'],gkdata['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()

The chart above shows the ROI analysis for goalkeepers. In the premium segment, neither Ederson nor Alisson makes a case for investment, due to City's poor defensive record last season and Alisson's injury, which also to an extent impacted the Liverpool defense. 

A keeper priced at 5.5 is the new go-to choice if the intention is to gain a lift from the goalkeepers. Choices include Pope, Schmeichel, Patricio and De Gea, with Henderson's move to United adding an interesting dynamic. However, the availability of Ryan at 4.5 and Guaita and Dubravka both at 5.0 is a tempting prospect.   

Ryan looks like the best option to start with. A wildcard can be used to realign the strategy should a higher priced keeper start providing consistent returns. The incremental lift in points from Ryan to anyone else is at best around 25 for an extra 1.0 in price. The case for investing this 1.0 elsewhere in the team is much stronger thanks to the various options further up the pitch. 
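
The trade-off in the paragraph above can be framed as points gained per extra million spent. The sketch below is a rough illustration with made-up totals; the names `budget_keeper` and `premium_keeper`, the function name and the numbers are all hypothetical and are not read from the dataset.

```python
def points_per_extra_million(cheap, premium):
    """Extra season points bought by each extra million spent on the upgrade."""
    extra_cost = premium['price'] - cheap['price']
    extra_points = premium['points'] - cheap['points']
    return extra_points / extra_cost

# Hypothetical season totals and prices (in millions), invented for illustration
budget_keeper = {'price': 4.5, 'points': 135}
premium_keeper = {'price': 5.5, 'points': 160}

upgrade_value = points_per_extra_million(budget_keeper, premium_keeper)
print(upgrade_value)  # 25.0
```

The same ratio can be computed for an upgrade in any other position, which makes it possible to compare where the extra million buys the most points.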

Let's look at defenders.

In [None]:
defchrt = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

defchrt = defchrt[defchrt['element_type_21']==2].nlargest(50,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(defchrt['proj_tot_points_21'],defchrt['proj_points_permn_per90'],c=defchrt['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(defchrt['proj_tot_points_21'],defchrt['proj_points_permn_per90'],defchrt['web_name'],defchrt['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()

TAA, as we all know, is an absolute monster, providing a fantastic ROI combined with the highest number of points. Alonso was a source of consistent points and has the potential to make a mockery of the 6.0 price tag if he can nail down a consistent starting place. 

Doherty still makes a strong case for investment over the remaining Liverpool defenders, provided Wolves can keep up their defensive record and still allow him freedom to attack. 

The rest of the chart is almost illegible and I am going to redo it below after removing the premium defenders. 

In [None]:
defchrt = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

defchrt = defchrt[defchrt['element_type_21']==2].nlargest(50,'proj_tot_points_21')[4:]
defchrt = defchrt[defchrt['proj_points_permn_per90']<1]
plt.figure(figsize=(15,10))
plt.scatter(defchrt['proj_tot_points_21'],defchrt['proj_points_permn_per90'],c=defchrt['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(defchrt['proj_tot_points_21'],defchrt['proj_points_permn_per90'],defchrt['web_name'],defchrt['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()

If Willy Boly 5.5 is fit, the case for a double investment in the Wolves defense is strong if the fixtures align well. 

Also in the 5.5 bracket, Stevens, Tarkowski, Baldock, Wan-Bissaka, Jonny, Aurier and van Aanholt are in a tight race. The first three are the best consistent bets, while Aurier could be a strong differential if Spurs are able to keep things tight at the back.

Egan provides the best balance between ROI and total points at 5.0. Dunk is a close second. 

The best 4.5 player is Webster, just shy of 90 points. He, along with the rest of the 4.5 players, does not inspire a lot of confidence. It is wise to spend the 1.0 saved on the goalkeeper in defense, to go from 4.5 to 5.0 or 5.5. 

In [None]:
midchrt = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

midchrt = midchrt[midchrt['element_type_21']==3].nlargest(50,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(midchrt['proj_tot_points_21'],midchrt['proj_points_permn_per90'],c=midchrt['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(midchrt['proj_tot_points_21'],midchrt['proj_points_permn_per90'],midchrt['web_name'],midchrt['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()

In midfield, De Bruyne at 11.5 provides better value than Salah, Mane and Aubameyang, all at 12.0, and Sterling, also at 11.5. 

The next group of players comprises Rashford at 9.5, Willian at 8.0 and Son at 9.0. 

Mahrez and Greenwood are absolute value picks if they can manage to retain a regular place and keep up their form from last season. 

In a similar fashion, I am going to clear the clutter (hopefully) by removing these players. 

In [None]:
midchrt = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

midchrt = midchrt[midchrt['element_type_21']==3].nlargest(50,'proj_tot_points_21')[9:]

midchrt = midchrt[midchrt['proj_points_permn_per90']<1]
plt.figure(figsize=(15,10))
plt.scatter(midchrt['proj_tot_points_21'],midchrt['proj_points_permn_per90'],c=midchrt['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(midchrt['proj_tot_points_21'],midchrt['proj_points_permn_per90'],midchrt['web_name'],midchrt['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.title('Points per million per 90 vs Total Points')
plt.xlabel('Total Points')
plt.ylabel('Points per million per 90')
plt.show()



The notable mention here goes to Fernandes at 10.5 (his label is overlapped by Henderson's on the chart), who played just half a season and racked up 115 points. 

Pulisic can also offer a massive differential if he can cement his starting position under Lampard. 

Perez, now classified as a midfielder, offers great value at 6.5 and will be a star if he can put minutes on the board rather than on the bench. 

Most other assets seem to be overpriced, and it is up to us to find the new Grealish and Mount rather than buy them at their increased prices. Armstrong has the potential to deliver in this area and will need an eye-test and a good run of fixtures to merit investment. 

In [None]:
fwdchrt = targets[['proj_points_permn_per90','proj_tot_points_21','web_name','element_type_21','now_cost_21','code_team_21']]

fwdchrt = fwdchrt[fwdchrt['element_type_21']==4].nlargest(35,'proj_tot_points_21')
plt.figure(figsize=(15,10))
plt.scatter(fwdchrt['proj_tot_points_21'],fwdchrt['proj_points_permn_per90'],c=fwdchrt['code_team_21'],cmap = 'tab20')
for a,b,l,c in zip(fwdchrt['proj_tot_points_21'],fwdchrt['proj_points_permn_per90'],fwdchrt['web_name'],fwdchrt['now_cost_21']):
    plt.annotate(l+ ' '+ str(c/10),(a,b), textcoords="offset points", xytext=(0,7), # distance from text to points (x,y)
                 ha='center') # horizontal alignment can be left, right or center
plt.show()

Aguero leads the line in terms of ROI in the premium bracket, with Ings a close second. Jesus, Martial, Jimenez and Vardy form the next group of potential candidates to displace those two. 

Kane, Firmino and Richarlison are overpriced in the current scenario and it would be best to stay away from them till their form improves. 

The 6.0 quartet of Adams, Gayle, Mousset and Iheanacho is the rotating shopper's paradise, with Che Adams showing particular promise towards the end of last season. 

Any injury to the Liverpool frontline immediately puts Origi in contention, with his ROI making a mockery of the 5.5 tag. 

Giroud and Abraham, although they had strong ROIs, will most likely play second fiddle to the incoming Werner. 

Chris Wood, Antonio and Maupay, all at 6.5, and Rodriguez can prove to be great differentials during a strong run of fixtures. 



**And that's all folks. Hope you found this interesting and enlightening. Feel free to share any questions and have a great FPL season. 
Good Luck!**