# Fantasy Premier League (FPL) Advisor

# Purpose
The purpose of this Jupyter notebook is to help with the selection of team members for the [Fantasy Premier League](https://fantasy.premierleague.com/) (FPL) by forecasting how many points players will earn. It uses the expected points and other stats generated by [fpl-data](https://github.com/177arc/fpl-data). It provides:
- a visual tool for analysing the performance of each player and understanding their potential to earn points
- an optimiser that recommends the team with the maximum expected points to improve the performance of your current team
- tools for selecting the best game weeks to play your chips
- visual tools to understand the reliability of the data

If you are not familiar with the Fantasy Premier League, you can watch this introduction:

<a href="http://www.youtube.com/watch?v=SV_F-cL8fC0" target="_blank"><img src="http://img.youtube.com/vi/SV_F-cL8fC0/0.jpg" 
alt="How to play FPL" width="600" height="400"/></a>

# Installation
To get started, uncomment and run the following command to install all required dependencies.

In [None]:
#!pip install -q -r ./requirements.txt

# Import requirements
Here we import all external and local modules.

In [None]:
import pandas as pd
import os, sys

# Load local modules
sys.path.append(os.getcwd())
from data import get_df, get_next_gw_counts

pd.set_option('display.max_columns', 100)

# Define type aliases
DF = pd.DataFrame
S = pd.Series

# Set variables
This section sets all important global variables.

In [None]:
CREDS_FILE = 'fpl_credentials.csv' # Location of file holding the FPL user name and password. These are only required for the personalised recommendations in the second half of this notebook.
DATA_URL = 'https://s3.eu-west-2.amazonaws.com/fpl-test.177arc.net/v1/latest/'
LAST_SEASON = '2019-20'
CURRENT_SEASON = '2020-21'
FIXTURES_LOOK_BACK = 38  # Limit of how many fixtures to look back for calculating rolling team stats
PLAYER_FIXTURES_LOOK_BACK = 12 # Limit of how many fixtures to look back for calculating rolling player stats

# Load pre-processed data
This section loads data sets generated by the [fpl-data](https://github.com/177arc/fpl-data) lambda function and made available via the S3 bucket specified in the `DATA_URL` variable.

In [None]:
gws = get_df(url=f'{DATA_URL}gws.csv', index='GW ID')
teams = get_df(url=f'{DATA_URL}teams.csv', index='Team Code')
players_ext = get_df(url=f'{DATA_URL}players_ext.csv', index='Player Code')
player_teams = get_df(url=f'{DATA_URL}player_teams.csv', index='Player Code')
players_gw_team_eps_ext = get_df(url=f'{DATA_URL}players_gw_team_eps_ext.csv', index=['Player Code', 'Season', 'Game Week'])
player_gw_next_eps_ext = get_df(url=f'{DATA_URL}player_gw_next_eps_ext.csv', index=['Player Code'])
team_fixture_strength_ext = get_df(url=f'{DATA_URL}team_fixture_stats_ext.csv', index='Fixture Code')
dd = get_df(url=f'{DATA_URL}data_dictionary.csv')
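
The `get_df` helper lives in the local `data` module and its implementation is not shown in this notebook. As a rough sketch of what such a loader might do, assuming it simply reads a CSV and sets the given column(s) as the index, the hypothetical `load_indexed_csv` below does this with `pandas.read_csv` on an in-memory CSV. The sample rows are made up.

```python
import io
import pandas as pd

def load_indexed_csv(source, index=None) -> pd.DataFrame:
    """Hypothetical stand-in for get_df: read a CSV and optionally set its index."""
    df = pd.read_csv(source)
    # index may be a single column name or a list of names (multi-index).
    return df if index is None else df.set_index(index)

# Made-up sample data; the real files are served from DATA_URL.
sample_csv = io.StringIO(
    "Player Code,Season,Game Week,Expected Points\n"
    "101,2020-21,1,4.5\n"
    "101,2020-21,2,3.9\n"
    "102,2020-21,1,2.1\n"
)
sample = load_indexed_csv(sample_csv, index=['Player Code', 'Season', 'Game Week'])
print(sample.index.names)
```

Since `pandas.read_csv` also accepts a URL, the same sketch would work against a remote file.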

# Configure context
In this section, we configure important settings for this notebook, including the data dictionary. For each field, the data dictionary contains a description, a default format and a mapping from the API field name to a more readable one, as well as the default ordering of fields. It is used to show data in a more user-friendly way.

In [None]:
from common import Context
from datadict.jupyter import DataDict

ctx = Context()
ctx.fixtures_look_back = FIXTURES_LOOK_BACK
ctx.player_fixtures_look_back = PLAYER_FIXTURES_LOOK_BACK
ctx.last_season = LAST_SEASON
ctx.current_season = CURRENT_SEASON
ctx.dd = DataDict(data_dict=dd)
ctx.total_gws = gws.shape[0]
ctx.next_gw = gws[lambda df: df['Is Next GW?']].index.values[0]
ctx.def_next_gws = 'Next 8 GWs'
ctx.next_gw_counts = get_next_gw_counts(ctx)
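
The next game week is looked up by passing a callable to the indexing operator: pandas calls it with the data frame and uses the returned boolean mask to filter rows. A minimal sketch of that pattern on a made-up game week table (the names ending in `_demo` are illustrative only):

```python
import pandas as pd

# Made-up game week table; the real one is loaded from gws.csv.
gws_demo = pd.DataFrame(
    {'Is Next GW?': [False, False, True, False]},
    index=pd.Index([1, 2, 3, 4], name='GW ID'),
)

# A callable inside [] receives the frame and returns a boolean mask.
next_gw_demo = gws_demo[lambda df: df['Is Next GW?']].index.values[0]
total_gws_demo = gws_demo.shape[0]
print(next_gw_demo, total_gws_demo)
```

Note that `.index.values[0]` raises an `IndexError` if no row is flagged as the next game week.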

## Visualise players' cost vs their expected points
The chart below shows expected points and cost for each player. The expected points are calculated hourly using the [fpl-data](https://github.com/177arc/fpl-data) lambda function. Use filters to focus on a particular segment and click on a dot to view more details about the player.

In [None]:
from jupyter import show_eps_vs_cost


show_eps_vs_cost(player_gw_next_eps_ext, players_gw_team_eps_ext, teams, ctx)

# Get best team for wildcard or season start
You can use the code below to get the best team for a wildcard or at the start of the season. It uses the [PuLP linear optimiser](https://pythonhosted.org/PuLP/) to find the team combination with the highest total expected points over the next game weeks that fits within the money currently available.

In [None]:
from optimiser import get_optimal_squad
from jupyter import display_team

team_budget = total_budget if 'total_budget' in globals() else 100.0

player_team_optimal = (get_optimal_squad(player_gw_next_eps_ext, 
                                        optimise_team_on='Expected Points Next 8 GWs', # Name of the column to use for optimising the whole team.
                                        optimise_sel_on='Expected Points Next GW', # Name of the column to use for optimising the selection of the team.
                                        formation='2-5-5-3', # Formation of the team in format GKP-DEF-MID-FWD.
                                        budget=team_budget, # Maximum budget available for optimising the team.
                                        include=[], # List of player names that must be in the team, e.g. ['De Bruyne', 'Mané']
                                        exclude=[], # List of player names that must NOT be in the team.
                                        risk=0.2) # The amount of risk to take when evaluating the column to optimise on. This number has to be between 0 and 1.
    .sort_values(['Field Position Code'])
    .pipe(ctx.dd.reorder))

display_team(player_team_optimal, ctx)
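
To make the optimisation problem concrete, here is a toy sketch that picks the squad with the highest total expected points subject to a budget and a per-position quota. It enumerates all combinations by brute force with `itertools`, which only works for a tiny player pool; it is not how `get_optimal_squad` or PuLP solve the problem. The players, costs and points are invented, and the sketch ignores the `risk` parameter and other FPL rules such as the per-club limit.

```python
from itertools import combinations, product

# Invented player pool: (name, position, cost, expected points).
pool = [
    ('GK A', 'GKP', 4.5, 30.0), ('GK B', 'GKP', 5.5, 34.0),
    ('DEF A', 'DEF', 4.0, 25.0), ('DEF B', 'DEF', 6.0, 38.0), ('DEF C', 'DEF', 5.0, 31.0),
    ('MID A', 'MID', 7.0, 45.0), ('MID B', 'MID', 12.0, 60.0), ('MID C', 'MID', 5.5, 33.0),
    ('FWD A', 'FWD', 6.5, 36.0), ('FWD B', 'FWD', 11.0, 52.0),
]

def best_squad(pool, quota, budget):
    """Return (total points, squad) maximising points within budget and quota."""
    # All ways of filling each position's quota independently.
    per_position = [
        list(combinations([p for p in pool if p[1] == pos], n))
        for pos, n in quota.items()
    ]
    best = (float('-inf'), None)
    for parts in product(*per_position):
        squad = [p for part in parts for p in part]
        cost = sum(p[2] for p in squad)
        points = sum(p[3] for p in squad)
        if cost <= budget and points > best[0]:
            best = (points, squad)
    return best

points, squad = best_squad(pool, {'GKP': 1, 'DEF': 2, 'MID': 2, 'FWD': 1}, budget=40.0)
print(points, [p[0] for p in squad])
```

With a real player pool of several hundred players the number of combinations explodes, which is why a linear programming solver is the practical choice.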

# Load user team data
This section loads the data of the user's team. 

**Note this requires your user credentials to be saved in fpl_credentials.csv in the same directory as this notebook. Use fpl_credentials_template.csv as template.** Alternatively, you can set the fpl_email and fpl_password variables below.

In [None]:
from fplpandas import FPLPandas

# Enter your FPL credentials here.
fpl_email = ''
fpl_password = ''

if not os.path.exists(CREDS_FILE):
    fpl_cred = {'email': fpl_email, 'password': fpl_password}
else:
    fpl_cred = pd.read_csv(CREDS_FILE).iloc[0].to_dict()
    
assert len(fpl_cred['email']) > 0 and len(fpl_cred['password']) > 0, 'FPL credentials not set. Please provide your email and password.'

fpl = FPLPandas(**fpl_cred)
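
The cell above reads the first row of the credentials file into a dict and then looks up the keys `email` and `password`, so the CSV presumably has two columns with those names. A minimal sketch of that parsing step, using an in-memory CSV with placeholder values rather than a real file (the exact layout of `fpl_credentials_template.csv` is an assumption):

```python
import io
import pandas as pd

# Placeholder content; assumes the template has 'email' and 'password' columns.
creds_csv = io.StringIO("email,password\nme@example.com,not-a-real-password\n")

# First data row -> {'email': ..., 'password': ...}
creds = pd.read_csv(creds_csv).iloc[0].to_dict()

# Same sanity check as in the notebook cell.
assert len(creds['email']) > 0 and len(creds['password']) > 0
print(sorted(creds))
```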

In [None]:
import aiohttp

try:
    user_team_raw, _, user_trans_info_raw = fpl.get_user_team()
except aiohttp.ClientResponseError as e:
    if e.status == 404:
        print('Your team cannot be found. Have you created it? You can only optimise your team once you have created it.')
    else:
        print(e)

In [None]:
from data import get_players_id_code_map

players_id_code_map = (players_ext
                       [lambda df: df['Season'] == ctx.current_season]
                       .pipe(get_players_id_code_map))

user_team = (user_team_raw
    .pipe(ctx.dd.remap, data_set='player')
    .assign(**{'In Team?': True})
    .assign(**{'Selling Price': lambda df: df['Selling Price']/10})
    .assign(**{'Purchase Price': lambda df: df['Purchase Price']/10})
    .assign(**{'Selected?': lambda df: df['Team Position'].map(lambda x: x <= 11)}) 
    .rename_axis('Player ID')
    .reset_index()
    .merge(players_id_code_map, left_on='Player ID', right_index=True, suffixes=(None, None))
    .drop(columns='Player ID')
    .set_index('Player Code')
    )

user_trans_info = user_trans_info_raw.loc[0]

## Current team

In [None]:
player_user_team = user_team.merge(player_gw_next_eps_ext, left_on='Player Code', right_on='Player Code', how='left', suffixes=(None, None))
display_team(player_user_team, ctx)

In [None]:
total_budget = (user_trans_info['bank']/10+player_user_team['Selling Price'].sum())
total_budget
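
As a worked example of the budget calculation: the division by 10 in the cells above suggests that the FPL API reports money in tenths of a million, so a `bank` value of 15 would correspond to £1.5m. With invented selling prices that are already converted to millions, the total budget is the bank plus the sum of the selling prices:

```python
# Invented values for illustration only.
bank_raw = 15                           # as returned by the API, in tenths of a million
selling_prices = [4.5, 5.0, 6.5, 12.0]  # already divided by 10, in millions

total_budget_demo = bank_raw / 10 + sum(selling_prices)
print(total_budget_demo)
```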

# Recommend selection for next GW and transfers for next 8 GWs
Use this section to get a recommendation on which players to select to optimise the expected points of your team and which transfers to make to improve it. You need to have provided your FPL credentials for this to work.

It uses the PuLP linear optimiser to find the team combination within the currently available budget that has the highest total expected points over the next eight game weeks, while taking your current team into account for a user-defined number of transfers. Note that on the FPL website, 4 points will be deducted from your points total for every transfer beyond your free transfers.

It uses the same PuLP linear optimiser to find the selection with the highest expected points for the next game week.
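
When deciding how many transfers to ask the optimiser for, it can help to weigh the expected gain against the points deduction mentioned above. The helpers below are a hypothetical illustration, not part of this notebook's modules, and assume that each transfer beyond the free ones costs 4 points:

```python
POINTS_PER_EXTRA_TRANSFER = 4  # assumed deduction per transfer beyond the free ones

def transfer_points_hit(transfers: int, free_transfers: int = 1) -> int:
    """Points deducted for making `transfers` transfers in one game week."""
    extra = max(0, transfers - free_transfers)
    return extra * POINTS_PER_EXTRA_TRANSFER

def net_gain(expected_gain: float, transfers: int, free_transfers: int = 1) -> float:
    """Expected points gained by the transfers after subtracting the hit."""
    return expected_gain - transfer_points_hit(transfers, free_transfers)

print(transfer_points_hit(3), net_gain(10.5, 3))
```

A set of transfers is only worth making if the net gain stays positive.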

## Recommended team

In [None]:
# Gets the cost and player ID of the second goal keeper so that the optimiser does not recommend his replacement.
second_gk = player_user_team[player_user_team['Field Position'] == 'GK'].sort_values('Expected Points Next GW')[['Current Cost']].iloc[0]
second_gk_cost = second_gk.values[0]
second_gk_id = second_gk.name

player_team_eps_user = (user_team
    .merge(player_gw_next_eps_ext, left_on='Player Code', right_on='Player Code', how='right', suffixes=(None, None))
    .assign(**{'Current Cost': lambda df: df['Selling Price'].fillna(df['Current Cost'])}))

player_team_optimal = (get_optimal_squad(player_team_eps_user, 
                        optimise_team_on='Expected Points Next 8 GWs', # Name of the column to use for optimising the whole team.
                        optimise_sel_on='Expected Points Next GW', # Name of the column to use for optimising the selection of the team.
                        formation='1-5-5-3', # Formation of the team in format GKP-DEF-MID-FWD. Not 2-5-5-3 if we want to avoid the transfer of the second goal keeper recommended.
                        budget=total_budget-second_gk_cost, # Maximum budget available for optimising the team. Not just total_budget if we want to avoid the transfer of the second goal keeper recommended.
                        include=[], # List of player names that must be in the team, e.g. ['De Bruyne', 'Mané']
                        exclude=[], # List of player names that must NOT be in the team.
                        risk=0.2, # The amount of risk to take when evaluating the column to optimise on. This number has to be between 0 and 1.
                        recommend=0) # Number of transfers to recommend. If set to 0, the optimiser will still recommend a team selection that maximises the expected points.
    .sort_values(['Field Position Code']))
player_team_optimal = player_team_optimal.pipe(ctx.dd.reorder)
display_team(player_team_optimal, ctx, in_team=True)

In [None]:
player_team_removed = player_user_team[(player_user_team['In Team?'] == True)
                                       & ~player_user_team.index.isin(player_team_optimal.index.values)
                                       & ~player_user_team.index.isin([second_gk_id])]
ctx.dd.display(player_team_removed[['Name', 'Current Cost', 'Field Position', 'Captain?', 'Vice Captain?', 'Minutes Percent', 'News And Date', 'Expected Points Next GW', f'Expected Points {ctx.def_next_gws}']],
           index=False, footer=False, descriptions=False)
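
The cell above lists the players that are in your current team but not in the recommended one, leaving out the second goalkeeper because he was deliberately excluded from the optimisation. The same index-based filtering can be sketched on made-up data as follows (all names and player codes are invented):

```python
import pandas as pd

current = pd.DataFrame(
    {'Name': ['Player A', 'Player B', 'Player C', 'Backup GK']},
    index=pd.Index([11, 12, 13, 14], name='Player Code'),
)
recommended = pd.DataFrame(
    {'Name': ['Player A', 'Player C', 'Player D']},
    index=pd.Index([11, 13, 15], name='Player Code'),
)
backup_gk_code = 14  # excluded from the optimisation, so never counts as removed

# Keep rows whose code is neither in the recommended team nor the backup keeper.
removed = current[
    ~current.index.isin(recommended.index)
    & ~current.index.isin([backup_gk_code])
]
print(removed['Name'].tolist())
```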

# Select a good week to play the free hit chip
The idea here is to use the expected points for each player to determine the expected points of the optimal team (selected players only) for each game week. The game week with the highest expected points is the best for a free hit. **Be aware that towards the end of the season, double game weeks get scheduled and therefore it is advisable to wait till early March.**

In [None]:
from jupyter import log_progress
from backtest import pred_free_hit_gw

if ctx.next_gw > 1:
    free_hist_eps = DF()
    for gw in log_progress(range(ctx.next_gw, ctx.total_gws+1), name='Game Week'):
        free_hist_eps = free_hist_eps.append(
            pred_free_hit_gw(players_gw_team_eps_ext, player_teams, team_budget, gw, ctx), 
            ignore_index=True)

    display(free_hist_eps
        .sort_values('Expected Points', ascending=False)
        .set_index('Game Week'))
else:
    print('This simulation relies on data that will only be available after game week 1 and will only become reliable later in the season.')
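
`DataFrame.append`, used in the loop above, was deprecated in pandas 1.4 and removed in pandas 2.0. On newer pandas versions the same accumulation can be written by collecting the per-game-week results in a list and concatenating once. The sketch below assumes each call returns a one-row `DataFrame`; `fake_pred` is a made-up stand-in for `pred_free_hit_gw`, whose actual return type is not shown here.

```python
import pandas as pd

def fake_pred(gw: int) -> pd.DataFrame:
    """Made-up stand-in returning one row of results per game week."""
    return pd.DataFrame({'Game Week': [gw], 'Expected Points': [50.0 + (gw % 3) * 5]})

# Collect the per-game-week frames, then concatenate once.
rows = [fake_pred(gw) for gw in range(30, 34)]
eps_by_gw = pd.concat(rows, ignore_index=True)

# Rank game weeks by expected points, best first.
best = (eps_by_gw
    .sort_values('Expected Points', ascending=False)
    .set_index('Game Week'))
print(best.index[0])
```

Concatenating once also avoids copying the growing frame on every iteration.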

# Select a good week for playing the bench boost chip
Here we use the expected points for each player to determine the expected points of the user team (incl. non-selected players) for each game week. The game week with the highest expected points is the best for a bench boost. **Be aware that towards the end of the season, double game weeks get scheduled and therefore it is advisable to wait till early March.**

In [None]:
from backtest import pred_bench_boost_gw

if ctx.next_gw > 1:
    player_user_team_eps = (user_team
        .merge(players_gw_team_eps_ext.reset_index(), left_on='Player Code', right_on='Player Code', how='right', suffixes=(None, None))
        .assign(**{'Current Cost': lambda df: df['Selling Price'].fillna(df['Current Cost'])}))

    bench_boost_eps = DF()

    for gw in log_progress(range(ctx.next_gw, ctx.total_gws+1), name='Game Week'):
        bench_boost_eps = bench_boost_eps.append(
            pred_bench_boost_gw(player_user_team_eps, player_teams, team_budget, gw, ctx), 
            ignore_index=True)

    display(bench_boost_eps
        .sort_values('Expected Points', ascending=False)
        .set_index('Game Week'))
else:
    print('This simulation relies on data that will only be available after game week 1 and will only become reliable later in the season.')
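
The difference between the two chip simulations comes down to which players are summed: the free hit looks at the selected players of an optimal team, whereas the bench boost counts every player in your squad, bench included. A toy sketch with invented per-game-week expected points for a four-player squad (two selected, two on the bench) shows how the best bench boost week is simply the one with the largest squad-wide total:

```python
# Invented expected points per game week: {gw: [(player, selected?, points), ...]}
squad_eps = {
    30: [('A', True, 6.0), ('B', True, 5.0), ('C', False, 1.0), ('D', False, 2.0)],
    31: [('A', True, 5.0), ('B', True, 4.5), ('C', False, 3.5), ('D', False, 3.0)],
    32: [('A', True, 7.0), ('B', True, 4.0), ('C', False, 0.5), ('D', False, 1.5)],
}

def total_eps(players, include_bench: bool) -> float:
    """Sum expected points, optionally counting benched players too."""
    return sum(pts for _, selected, pts in players if selected or include_bench)

selected_only = {gw: total_eps(p, include_bench=False) for gw, p in squad_eps.items()}
with_bench = {gw: total_eps(p, include_bench=True) for gw, p in squad_eps.items()}

best_bench_boost_gw = max(with_bench, key=with_bench.get)
print(selected_only, with_bench, best_bench_boost_gw)
```

In this toy data the week with the strongest bench differs from the weeks with the strongest selected players, which is why the bench boost gets its own simulation.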

In [None]:
from data import get_team_fixtures_by_gw

fdr_by_team_gw = get_team_fixtures_by_gw(team_fixture_strength_ext, 'Team FDR', ctx)
fdr_labels_by_team_gw = get_team_fixtures_by_gw(team_fixture_strength_ext, 'Label', ctx)

In [None]:
from jupyter import get_fdr_chart

get_fdr_chart(fdr_by_team_gw, fdr_labels_by_team_gw, 'FDR', True).show()

In [None]:
goal_strength_by_team_gw = get_team_fixtures_by_gw(
    team_fixture_strength_ext.assign(**{'MDR': lambda df: 1/df['Rel Def Fixture Strength']}), 'MDR', ctx)

get_fdr_chart(goal_strength_by_team_gw, fdr_labels_by_team_gw, 'Defensive difficulty').show()

In [None]:
goal_strength_by_team_gw = get_team_fixtures_by_gw(
    team_fixture_strength_ext.assign(**{'MDR': lambda df: 1/df['Rel Att Fixture Strength']}), 'MDR', ctx)

get_fdr_chart(goal_strength_by_team_gw, fdr_labels_by_team_gw, 'Attacking difficulty').show()

In [None]:
from backtest import get_gw_points_backtest

gw_points_backtest = get_gw_points_backtest(players_gw_team_eps_ext, ctx)

In [None]:
gw_points_backtest[['Error', 'Error Simple']].mean()

In [None]:
import plotly.express as px

px.line(gw_points_backtest, x='Season Game Week', y=['Avg Expected Points', 'Avg Fixture Total Points', 'Error']).show()

# Back test the expected points
The basic idea of testing the predictions is to look at each past game week, predict the expected points for the game week (both adjusted for relative team strengths and not adjusted), optimise the team based on the expected points and then calculate the total expected points for the optimised team (only for the selected players). For validation, we calculate the actual points of the players of the optimised team. We also calculate the points of the dream team, i.e. the total points of the team with the highest actual points for each game week.

In [None]:
from backtest import back_test_gw

if ctx.next_gw > 1:
    backtest_results = DF()
    for gw in log_progress(range(2, ctx.next_gw), name='Game Week'):
        backtest_results = backtest_results.append(back_test_gw(players_gw_team_eps_ext.reset_index(), gw, player_teams, ctx), ignore_index=True)
else:
    print('This simulation relies on data that will only be available after game week 1 and will only become reliable later in the season.')

In [None]:
if ctx.next_gw > 1:
    px.line(backtest_results, x='Game Week', y=['Actual Points Dream Team', 'Calc Actual Points', 'Calc Expected Points']).show()
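
The back-testing idea described above can be illustrated with a toy example in plain Python. For a single past game week it picks the players with the highest expected points, sums their expected and actual points, and compares that with the dream team, i.e. the players with the highest actual points. The data is invented, and the sketch ignores budget and formation constraints, which the real `back_test_gw` presumably handles through the optimiser.

```python
# Invented data for one game week: player -> (expected points, actual points).
gw_points = {
    'A': (6.0, 2.0), 'B': (5.5, 9.0), 'C': (4.0, 6.0),
    'D': (3.5, 12.0), 'E': (2.0, 1.0),
}
TEAM_SIZE = 3  # toy team size instead of 11 selected players

def top_n(points, key_index, n):
    """Names of the n players ranked highest by expected (0) or actual (1) points."""
    return sorted(points, key=lambda name: points[name][key_index], reverse=True)[:n]

predicted_team = top_n(gw_points, 0, TEAM_SIZE)
dream_team = top_n(gw_points, 1, TEAM_SIZE)

calc_expected = sum(gw_points[p][0] for p in predicted_team)
calc_actual = sum(gw_points[p][1] for p in predicted_team)
dream_actual = sum(gw_points[p][1] for p in dream_team)
print(calc_expected, calc_actual, dream_actual)
```

The gap between the actual points of the predicted team and those of the dream team gives a sense of how much was left on the table in that game week.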