# Selecting the best team of Fantasy Premier League Players


The task is to select 15 players from approximately 550 English Premier League players for a fantasy team. Every player has a cost and a points total.
The goal is to maximise the total points of the fantasy team.
The constraints are:
- total budget (say, 1000, in the API's cost units of £0.1m);
- number of team members: 15;
- no more than 3 players from one team;
- limits by position: three forwards, five midfielders, five defenders, two goalkeepers.
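
The task can also be written as a 0-1 integer programme. The notation below is introduced here for illustration only: $x_i$ is 1 if player $i$ is selected, $p_i$ and $c_i$ are that player's points and cost, $B$ is the budget, $T_t$ is the set of players at club $t$, $P_k$ is the set of players in position $k$, and $m_k$ is the position limit used in the model code (2 goalkeepers, 5 defenders, 5 midfielders, 3 forwards).

$$
\begin{aligned}
\max_{x}\quad & \sum_{i=1}^{n} p_i x_i \\
\text{s.t.}\quad & \sum_{i=1}^{n} c_i x_i \le B, \\
& \sum_{i=1}^{n} x_i = 15, \\
& \sum_{i \in T_t} x_i \le 3 \quad \text{for every club } t, \\
& \sum_{i \in P_k} x_i \le m_k \quad \text{for every position } k, \\
& x_i \in \{0, 1\}, \quad i = 1, \dots, n.
\end{aligned}
$$

Because the position limits add up to exactly 15, the squad-size equality forces every position limit to hold with equality.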

We will use the PuLP library with its bundled CBC solver (`PULP_CBC_CMD`).

In [279]:
#importing libraries
import urllib.request, json,pprint
import pandas as pd
import pulp  #the cells below also call pulp.<name> directly
from pulp import *
import time

In [132]:
#searching for solvers with pulp
pulp.pulpTestAll()

	 Testing zero subtraction
	 Testing inconsistant lp solution
	 Testing continuous LP solution
	 Testing maximize continuous LP solution
	 Testing unbounded continuous LP solution
	 Testing Long Names
	 Testing repeated Names
	 Testing zero constraint
	 Testing zero objective
	 Testing LpVariable (not LpAffineExpression) objective
	 Testing Long lines in LP
	 Testing LpAffineExpression divide
	 Testing MIP solution
	 Testing MIP solution with floats in objective
	 Testing MIP relaxation
	 Testing feasibility problem (no objective)
	 Testing an infeasible problem
	 Testing an integer infeasible problem
	 Testing column based modelling
	 Testing dual variables and slacks reporting
	 Testing fractional constraints
	 Testing elastic constraints (no change)
	 Testing elastic constraints (freebound)
	 Testing elastic constraints (penalty unchanged)
	 Testing elastic constraints (penalty unbounded)
* Solver <class 'pulp.solvers.PULP_CBC_CMD'> passed.
Solver <class 'pulp.solvers.CPLEX_DLL'> un

PulpError: Tests Failed
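
Running the full test suite is a heavy way to find out which solvers are usable. As a lighter sketch, assuming a PuLP 2.x installation, `pulp.listSolvers` returns the solver names directly:

```python
import pulp

# Names of the solvers PuLP can actually run in this environment.
available = pulp.listSolvers(onlyAvailable=True)
print(available)
```

If `PULP_CBC_CMD` appears in the list, the bundled CBC solver used in this notebook is ready to use.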

In [245]:
#getting data from Fantasy Premier League site API
url="https://fantasy.premierleague.com/drf/bootstrap-static"
response = urllib.request.urlopen(url)
data = json.loads(response.read())
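
The request above fails with an unhandled exception if the site is unreachable or the endpoint has moved. Below is a hedged sketch of a more defensive fetch using only the standard library. The helper names `parse_bootstrap` and `fetch_bootstrap` are hypothetical, and the second URL is an assumption about where the endpoint moved in later seasons.

```python
import json
import urllib.request

# First URL is the one used in this notebook; the /api/ path is an assumption.
URLS = [
    "https://fantasy.premierleague.com/drf/bootstrap-static",
    "https://fantasy.premierleague.com/api/bootstrap-static/",
]

def parse_bootstrap(raw):
    """Decode the JSON payload and check the keys this notebook relies on."""
    data = json.loads(raw)
    missing = [key for key in ("teams", "elements") if key not in data]
    if missing:
        raise ValueError(f"payload is missing keys: {missing}")
    return data

def fetch_bootstrap(urls=URLS, timeout=10):
    """Try each URL in turn and return the first payload that parses."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return parse_bootstrap(response.read())
        except (OSError, ValueError) as err:
            # OSError covers URLError, HTTPError and timeouts;
            # ValueError covers malformed JSON and missing keys.
            last_error = err
    raise RuntimeError(f"no endpoint returned usable data: {last_error}")
```

With these helpers, `data = fetch_bootstrap()` would replace the three lines of the cell above.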

In [246]:
#looking at data
pd.DataFrame.from_dict(data["teams"]).head()

Unnamed: 0,code,current_event_fixture,draw,form,id,link_url,loss,name,next_event_fixture,played,...,strength,strength_attack_away,strength_attack_home,strength_defence_away,strength_defence_home,strength_overall_away,strength_overall_home,team_division,unavailable,win
0,3,"[{'is_home': False, 'day': 5, 'event_day': 2, ...",0,,1,,0,Arsenal,"[{'is_home': True, 'day': 8, 'event_day': 1, '...",0,...,4,1260,1240,1320,1270,1320,1290,1,False,0
1,91,"[{'is_home': True, 'day': 4, 'event_day': 1, '...",0,,2,,0,Bournemouth,"[{'is_home': True, 'day': 8, 'event_day': 1, '...",0,...,3,1120,1090,1200,1150,1180,1150,1,False,0
2,36,"[{'is_home': True, 'day': 4, 'event_day': 1, '...",0,,3,,0,Brighton,"[{'is_home': False, 'day': 8, 'event_day': 1, ...",0,...,3,1080,1040,1120,1030,1110,1040,1,False,0
3,90,"[{'is_home': True, 'day': 5, 'event_day': 2, '...",0,,4,,0,Burnley,"[{'is_home': True, 'day': 8, 'event_day': 1, '...",0,...,2,1150,1040,1070,1040,1070,1050,1,False,0
4,97,"[{'is_home': False, 'day': 4, 'event_day': 1, ...",0,,5,,0,Cardiff,"[{'is_home': True, 'day': 8, 'event_day': 1, '...",0,...,2,1060,1020,1030,1010,1050,1020,1,False,0


In [247]:
#getting dataframe with teams
df_teams=pd.DataFrame.from_dict(data["teams"])[["id","name"]]

In [248]:
df_teams.head()

Unnamed: 0,id,name
0,1,Arsenal
1,2,Bournemouth
2,3,Brighton
3,4,Burnley
4,5,Cardiff


In [249]:
#getting dataframe with players
df_players=pd.DataFrame.from_dict(data["elements"])

In [250]:
df_players.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 558 entries, 0 to 557
Data columns (total 58 columns):
assists                         558 non-null int64
bonus                           558 non-null int64
bps                             558 non-null int64
chance_of_playing_next_round    363 non-null float64
chance_of_playing_this_round    359 non-null float64
clean_sheets                    558 non-null int64
code                            558 non-null int64
cost_change_event               558 non-null int64
cost_change_event_fall          558 non-null int64
cost_change_start               558 non-null int64
cost_change_start_fall          558 non-null int64
creativity                      558 non-null object
dreamteam_count                 558 non-null int64
ea_index                        558 non-null int64
element_type                    558 non-null int64
ep_next                         558 non-null object
ep_this                         558 non-null object
event_points         

In [251]:
#reducing dataframe, keeping only the columns needed
df=df_players[["web_name","total_points","now_cost","team","element_type"]]

In [252]:
#preparing dictionary of teams
dict_teams = df_teams.set_index("id")["name"].to_dict()

In [253]:
dict_teams

{1: 'Arsenal', 2: 'Bournemouth', 3: 'Brighton', 4: 'Burnley', 5: 'Cardiff', 6: 'Chelsea', 7: 'Crystal Palace', 8: 'Everton', 9: 'Fulham', 10: 'Huddersfield', 11: 'Leicester', 12: 'Liverpool', 13: 'Man City', 14: 'Man Utd', 15: 'Newcastle', 16: 'Southampton', 17: 'Spurs', 18: 'Watford', 19: 'West Ham', 20: 'Wolves'}

In [254]:
#preparing dictionary of positions
dict_pos = {1:"Goalkeeper",2:"Defender",3:"Midfielder",4:"Forward"}

In [255]:
#replacing team numbers with real names of teams (e.g. Arsenal, Man City etc.)
#assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
df=df.assign(team=df["team"].map(dict_teams))


In [256]:
#replacing position numbers with real names of positions (e.g. forward etc.)
#assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
df=df.assign(element_type=df["element_type"].map(dict_pos))


In [257]:
#getting list of teams from our dataframe with players
list_teams=list(df["team"].unique())

In [258]:
#getting list of players' names from our dataframe with players
list_players=list(df["web_name"].unique())

In [259]:
#getting list of positions from our dataframe with players
list_pos = list(df["element_type"].unique())

In [260]:
#one-hot-encoding for team names
dfPl = pd.concat([df,pd.get_dummies(df["team"])],axis=1)

In [261]:
#one-hot-encoding for positions
dfPl = pd.concat([dfPl,pd.get_dummies(dfPl["element_type"])],axis=1)

In [262]:
#ensuring integers for points
dfPl["total_points"] = dfPl["total_points"].astype(int)

In [282]:
#looking at our dataframe
dfPl.head()

Unnamed: 0,web_name,total_points,now_cost,team,element_type,Arsenal,Bournemouth,Brighton,Burnley,Cardiff,...,Newcastle,Southampton,Spurs,Watford,West Ham,Wolves,Defender,Forward,Goalkeeper,Midfielder
0,Cech,24,49,Arsenal,Goalkeeper,1,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
1,Leno,22,48,Arsenal,Goalkeeper,1,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
2,Koscielny,0,54,Arsenal,Defender,1,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
3,Bellerín,42,54,Arsenal,Defender,1,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,Monreal,30,54,Arsenal,Defender,1,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0


In [283]:
start = time.time()

# Create the 'prob' variable to contain the problem data
prob = LpProblem("FPL_players_selection",LpMaximize)

#one row per player name for fast lookups; web_name may repeat across clubs,
#so only the first match is kept (same effect as filtering and taking .iloc[0])
rec = dfPl.drop_duplicates("web_name").set_index("web_name")

#setting variable: 1 if the player is selected, 0 otherwise
var_players = LpVariable.dicts("p",list_players,cat='Binary')

#setting objective function: total points of the selected players
prob+=lpSum(var_players[p]*int(rec.at[p,"total_points"]) for p in list_players)

#setting constraints on number of players from one PL team
for t in list_teams:
    prob+=lpSum(var_players[p]*int(rec.at[p,t]) for p in list_players)<=3

#setting constraints on number of players for every position
limits_pos = {"Goalkeeper":2,"Defender":5,"Midfielder":5,"Forward":3}
for pos,limit in limits_pos.items():
    prob+=lpSum(var_players[p]*int(rec.at[p,pos]) for p in list_players)<=limit

#setting budget constraint
budget = 1000
prob+=lpSum(var_players[p]*int(rec.at[p,"now_cost"]) for p in list_players)<=budget

#setting team count constraint
prob+=lpSum(var_players[p] for p in list_players)==15

#solving
prob.solve()

end = time.time()
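
A smaller, self-contained sketch of the same kind of model makes it easier to see how each constraint shapes the answer. Everything in it is hypothetical toy data: six invented players, a squad of three, a budget of 210, at most two players per club and at most one goalkeeper. It assumes PuLP with its bundled CBC solver is installed.

```python
import pulp

# Hypothetical toy data: (name, position, club, cost, points)
players = [
    ("A", "GK", "X", 50, 20),
    ("B", "GK", "Y", 45, 15),
    ("C", "DEF", "X", 60, 30),
    ("D", "DEF", "Y", 55, 26),
    ("E", "FWD", "X", 100, 50),
    ("F", "FWD", "Y", 70, 35),
]
BUDGET, SQUAD_SIZE, MAX_PER_CLUB = 210, 3, 2
POS_LIMITS = {"GK": 1, "DEF": 2, "FWD": 2}

prob = pulp.LpProblem("toy_fpl_selection", pulp.LpMaximize)

# One binary variable per player, keyed by row index so names need not be unique.
x = pulp.LpVariable.dicts("pick", range(len(players)), cat="Binary")

# Objective: total points of the selected players.
prob += pulp.lpSum(x[i] * p[4] for i, p in enumerate(players))

# Budget and squad-size constraints.
prob += pulp.lpSum(x[i] * p[3] for i, p in enumerate(players)) <= BUDGET
prob += pulp.lpSum(x.values()) == SQUAD_SIZE

# Per-club and per-position caps.
for club in sorted({p[2] for p in players}):
    prob += pulp.lpSum(x[i] for i, p in enumerate(players) if p[2] == club) <= MAX_PER_CLUB
for pos, limit in POS_LIMITS.items():
    prob += pulp.lpSum(x[i] for i, p in enumerate(players) if p[1] == pos) <= limit

prob.solve(pulp.PULP_CBC_CMD(msg=False))

status = pulp.LpStatus[prob.status]
chosen = sorted(p[0] for i, p in enumerate(players) if x[i].value() > 0.5)
total_points = sum(p[4] for i, p in enumerate(players) if x[i].value() > 0.5)
print(status, chosen, total_points)
```

In this toy instance the three players with the best affordable total (A, C and E, 100 points for 210) all play for club X, so the per-club cap pushes the solver to A, D and E instead, for 96 points.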

In [284]:
#printing results
#prob.status is an integer code; LpStatus maps it to a string such as "Optimal"
if LpStatus[prob.status]=="Optimal":
    print("Fantasy team players for budget of ", budget)
    print("--------------------------------------------")
    t_cost = 0
    t_points = 0
    columns = ["player","team","position","cost","points"]
    df_out = pd.DataFrame(columns=columns)
    row=0
    for p in list_players:
        #solver values come back as floats, so compare against 0.5 rather than 1
        if var_players[p].value()>0.5:
            rec_p = dfPl[dfPl["web_name"]==p].iloc[0]
            team = rec_p["team"]
            pos = rec_p["element_type"]
            cost = rec_p["now_cost"]
            points = rec_p["total_points"]
            t_cost+=cost
            t_points+=points
            df_out.loc[row] = [p,team,pos,cost,points]
            row+=1
    print(df_out)
    print("total cost:",t_cost)
    print("total points in the table:",t_points)
    if t_points == value(prob.objective):
        print("equal to objective function value:",value(prob.objective))
    else:
        print("not equal to objective function value:",value(prob.objective))
    print ("time elapsed in sec.:", end - start)
else:
    print("Solver failed! Status:", LpStatus[prob.status])
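
Independently of the solver, a selected squad can be checked against the rules with plain Python. The sketch below is illustrative: `squad_violations` is a hypothetical helper, and it assumes each player is a dict with `team`, `position` and `cost` keys, matching the column names of `df_out`.

```python
from collections import Counter

# Limits mirror the position constraints in the model cell.
POSITION_LIMITS = {"Goalkeeper": 2, "Defender": 5, "Midfielder": 5, "Forward": 3}

def squad_violations(squad, budget=1000, squad_size=15, max_per_team=3):
    """Return a list of rule violations; an empty list means the squad is valid."""
    problems = []
    if len(squad) != squad_size:
        problems.append(f"squad has {len(squad)} players, expected {squad_size}")
    total_cost = sum(p["cost"] for p in squad)
    if total_cost > budget:
        problems.append(f"total cost {total_cost} exceeds budget {budget}")
    for team, count in Counter(p["team"] for p in squad).items():
        if count > max_per_team:
            problems.append(f"{count} players from {team}, limit is {max_per_team}")
    for pos, count in Counter(p["position"] for p in squad).items():
        limit = POSITION_LIMITS.get(pos, 0)
        if count > limit:
            problems.append(f"{count} players at {pos}, limit is {limit}")
    return problems
```

For the table built in the results cell, `squad_violations(df_out.to_dict("records"))` should return an empty list whenever the solver reports an optimal solution.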
