# Fantasy Football AI project

## Team Selections - Full Season

This notebook contains the cleaned-up functions from the team_selections_season notebook.

**Problem 1 - see separate notebook**: What is the highest scoring team, budget permitting, that could have been set and left on gw1? i.e. given the current points totals for all players, which team, when selected in gw 1 would give you the highest possible return at the current date?

**Problem 2**: The solution to this problem will inform the final AI selector. What is the best possible manager result given the history of the current season's gameweeks? i.e. which sequence of teams, when selected within the rules of the game, would provide the maximum possible points haul at the current date?

**Problem 3**: Select the best gw squads as if you were choosing for the first time in each gameweek. So in gw1 you have a budget of £100M and a blank canvas: pick your squad. In gw2 you have a budget of £100M and a blank canvas: pick your squad... etc etc
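
All three problems share the same squad constraints. As a rough sketch (not code from this notebook), the check below assumes the 2/5/5/3 position quotas used later in `baseline_team` and a £100M budget, with costs held in tenths of £1M as in the `cost` column. The name `is_valid_squad` and the tuple layout are illustrative assumptions, and any other game rules are ignored.

```python
from collections import Counter

# Position quotas keyed by element_type: 1=gk, 2=def, 3=mid, 4=fwd
QUOTAS = {1: 2, 2: 5, 3: 5, 4: 3}
BUDGET = 1000  # £100M expressed in tenths of £1M, matching the cost column


def is_valid_squad(squad):
    """squad is a list of (player_id, element_type, cost) tuples."""
    ids = [pid for pid, _, _ in squad]
    if len(set(ids)) != len(ids):
        return False  # the same player cannot appear twice
    counts = Counter(etype for _, etype, _ in squad)
    if dict(counts) != QUOTAS:
        return False  # wrong number of players in some position
    return sum(cost for _, _, cost in squad) <= BUDGET


# Hypothetical squad: 15 players at £5.0M each
squad = [(i, etype, 50)
         for i, etype in enumerate([1] * 2 + [2] * 5 + [3] * 5 + [4] * 3)]
print(is_valid_squad(squad))  # True
```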

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")

pd.set_option("display.max_columns", None)

In [2]:
# Read in the appropriate datasets with selected features
players = pd.read_csv("players.csv")[["element", "round", "total_points", "value" ]]
plr_info = pd.read_csv("basic_player_info.csv")[["element_type", "id", "web_name", "short_name"]]
cumulative = pd.read_csv("cumulative_points.csv")[["element", "round", "cum_points", "cost"]]

In [3]:
players.head()

Unnamed: 0,element,round,total_points,value
0,1,1,0,70
1,1,2,0,69
2,1,3,0,69
3,1,4,0,68
4,1,5,0,68


In [4]:
plr_info.head()

Unnamed: 0,element_type,id,web_name,short_name
0,3,1,Özil,ARS
1,2,2,Sokratis,ARS
2,2,3,David Luiz,ARS
3,3,4,Aubameyang,ARS
4,2,5,Cédric,ARS


In [5]:
cumulative.head()

Unnamed: 0,element,round,cum_points,cost
0,1,1,0,70
1,1,2,0,69
2,1,3,0,69
3,1,4,0,68
4,1,5,0,68


In [6]:
# We need a data set that holds all player gameweeks including metrics from each gameweek
#Create a function that takes the data and merges into a useful set

def merge_data(players, plr_info, cumulative):
    """
    In this function we'll take the available datasets, filter for useful features and values and merge into one 
    master set
    """

    # join the players and info sets
    data = players.merge(plr_info, how="left", left_on="element", right_on="id").drop("element", axis=1)
    
    # join the data and cumulative points sets
    data = data.merge(cumulative, how="left", left_on=["id", "round"], 
                      right_on=["element", "round"]).drop("element", axis=1)


    data = data[["id", "web_name", "round", "total_points", "element_type", "short_name", "cum_points", "value"]]
    
    return data

In [7]:
data = merge_data(players, plr_info, cumulative)

This master data set holds all key player information we'll need for this problem.
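
Before relying on the master set it may be worth confirming that the joins neither duplicated nor dropped rows. The sketch below uses made-up toy frames rather than the real CSVs; `validate` and `indicator` are standard `DataFrame.merge` options, and everything else is an assumption for illustration.

```python
import pandas as pd

# Toy stand-ins for the players and basic_player_info tables (made-up rows)
players = pd.DataFrame({
    "element": [1, 1, 2],
    "round": [1, 2, 1],
    "total_points": [0, 3, 7],
})
plr_info = pd.DataFrame({
    "id": [1, 2],
    "web_name": ["A", "B"],
    "element_type": [3, 2],
})

# validate="many_to_one" raises MergeError if an id is duplicated in plr_info;
# indicator=True adds a _merge column showing which rows found a match
merged = players.merge(plr_info, how="left", left_on="element", right_on="id",
                       validate="many_to_one", indicator=True)

unmatched = merged[merged["_merge"] != "both"]
print(len(merged), len(unmatched))  # 3 0
```

The same two checks (row count unchanged, no unmatched rows) could be applied to each merge inside `merge_data`.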

In [8]:
data.head()

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value
0,1,Özil,1,0,3,ARS,0.0,70
1,1,Özil,2,0,3,ARS,0.0,69
2,1,Özil,3,0,3,ARS,0.0,69
3,1,Özil,4,0,3,ARS,0.0,68
4,1,Özil,5,0,3,ARS,0.0,68


## STEP 1
First we need the set of players available in each gameweek, so that we can select the best-performing players at that snapshot in time. We also need to select a set of target transfers for future gameweeks.
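
The target selection also needs to skip players who are already in the team. A minimal plain-Python sketch of that idea follows; `pick_targets`, the dict layout and the sample rows are all hypothetical, and the ranking (gameweek points descending, cost ascending) mirrors the sort used in this notebook.

```python
def pick_targets(players, squad_ids):
    """Return the best non-squad player per position.

    players: list of dicts with id, element_type, total_points and cost keys.
    squad_ids: set of ids already in the team.
    """
    targets = {}
    # Highest gameweek points first, cheapest first on ties
    ranked = sorted(players, key=lambda p: (-p["total_points"], p["cost"]))
    for p in ranked:
        if p["id"] in squad_ids:
            continue  # skip players we already own
        # setdefault keeps only the first (best) player seen per position
        targets.setdefault(p["element_type"], p)
    return targets


# Made-up rows: two midfielders (type 3) and two forwards (type 4)
players = [
    {"id": 1, "element_type": 3, "total_points": 24, "cost": 89},
    {"id": 2, "element_type": 3, "total_points": 9, "cost": 55},
    {"id": 3, "element_type": 4, "total_points": 21, "cost": 105},
    {"id": 4, "element_type": 4, "total_points": 21, "cost": 90},
]
targets = pick_targets(players, squad_ids={1})
print(targets[3]["id"], targets[4]["id"])  # 2 4
```

Player 1 is skipped because he is already owned, and the tie between the two forwards goes to the cheaper one.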

In [9]:
def round_players(full_set, gw_idx):
    
    """
    In this function we'll create a set of available, scoring players for the requested gameweek. From that set 
    we'll sort by points scored and generate a set of four potential transfers, 1 per position and store them 
    in a targets dataset.
    
    We return the available players set and the targets
    
    We need to add in some functionality to check that our targets are not already in the current team
    
    
    """
    
    # First we'll check which players were available at the stated gameweek
    gw_true = full_set[full_set["round"]==gw_idx]
    # Create a list of ids for these players
    gw_id_list = gw_true.id.tolist()
    # Now select all the available players in all rounds from our data
    available_players = full_set.loc[full_set["id"].isin(gw_id_list)]

    # Now we'll remove all players who have 0 points at the latest gw, first filter for latest gw
    latest_round = available_players[available_players["round"]==gw_idx]
    # Now filter in latest round for players scoring 1 or more points
    scorers = latest_round[latest_round.cum_points>0]

    # Get the costs for the requested gameweek from the cumulative df cost feature
    week_cost = cumulative[cumulative["round"]==gw_idx][["element", "cost"]]

    rd_player_set= scorers.merge(week_cost, how="left", left_on="id", right_on="element").drop("element", axis=1)

    # Finally we create a value feature (cumulative points per unit of cost) which will assist our team selection
    # Note that cost is stored in tenths of £1M, so this is points per £0.1M rather than per £1M
    dec = 1
    rd_player_set["value"] = rd_player_set.cum_points / rd_player_set.cost
    rd_player_set["value"] = rd_player_set["value"].round(dec) # Round the values to 1 decimal place
    
    #if team == None

    # Now generate a target per position
    gk2 = rd_player_set[rd_player_set.element_type==1].sort_values(by=["total_points", "cost"], ascending=[False, True])
    def2 = rd_player_set[rd_player_set.element_type==2].sort_values(by=["total_points", "cost"], ascending=[False, True])
    mid2 = rd_player_set[rd_player_set.element_type==3].sort_values(by=["total_points", "cost"], ascending=[False, True])
    att2 = rd_player_set[rd_player_set.element_type==4].sort_values(by=["total_points", "cost"], ascending=[False, True])
    
    # Set up an empty dataframe to store the next players in each position
    targets = pd.DataFrame(columns = rd_player_set.columns)

    # Initialise i to 0 although this will change as we iterate 
    #### If the initial 4 players are rejected we'll need to go to the next level down..
    ### We would expect to only go 3 or 4 levels deep in the worst case scenario?? More thinking required here though
    i=0
    next_gk = gk2.iloc[i].to_frame().T
    next_def = def2.iloc[i].to_frame().T
    next_mid = mid2.iloc[i].to_frame().T
    next_att = att2.iloc[i].to_frame().T

    # DataFrame.append was removed in pandas 2.0, so stack the four rows with pd.concat
    targets = pd.concat([next_gk, next_def, next_mid, next_att])

    
    return rd_player_set, targets

In [10]:
round_player_set, targets = round_players(data, 1)

In [11]:
round_player_set ### OK... GOOD!!

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
0,4,Aubameyang,1,7,3,ARS,7.0,0.1,120
1,6,Lacazette,1,7,4,ARS,7.0,0.1,85
2,8,Leno,1,7,1,ARS,7.0,0.1,50
3,9,Xhaka,1,3,3,ARS,3.0,0.1,55
4,11,Bellerín,1,5,2,ARS,5.0,0.1,50
...,...,...,...,...,...,...,...,...,...
208,513,Lewis,1,7,2,NEW,7.0,0.2,45
209,515,Vitinha,1,1,3,WOL,1.0,0.0,50
210,518,Mendy,1,3,3,LEI,3.0,0.1,45
211,525,Odoi,1,1,2,FUL,1.0,0.0,45


## STEP 2
Now we want to generate a baseline team for gw1
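
For comparison, the same "top scorers per position" selection can be written with a grouped rank. This is only a sketch on a made-up player pool, assuming the 2/5/5/3 quotas and the `element_type`, `cum_points` and `cost` column names; it is not the function used in this notebook.

```python
import pandas as pd

# Squad quotas keyed by element_type: 1=gk, 2=def, 3=mid, 4=fwd
QUOTAS = {1: 2, 2: 5, 3: 5, 4: 3}

# Made-up player pool: 4 keepers, 7 defenders, 7 midfielders, 5 forwards
types = [1] * 4 + [2] * 7 + [3] * 7 + [4] * 5
pool = pd.DataFrame({
    "id": range(1, len(types) + 1),
    "element_type": types,
    "cum_points": [(i * 7) % 23 for i in range(len(types))],
    "cost": 50,
})

ranked = pool.sort_values(["element_type", "cum_points"], ascending=[True, False])
# cumcount numbers the rows 0, 1, 2, ... within each position, best scorer first
rank_in_position = ranked.groupby("element_type").cumcount()
squad = ranked[rank_in_position < ranked["element_type"].map(QUOTAS)]
squad = squad.reset_index(drop=True)

team_cost = squad["cost"].sum() / 10  # cost is stored in tenths of £1M
print(len(squad), team_cost)  # 15 75.0
```

Like the baseline, this ignores cost when picking players, so the resulting team still has to be checked against the budget afterwards.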

In [12]:

# Create a function that builds our baseline team - i.e. a team consisting of highest points scorers
def baseline_team(rd_players):
    """
    In this function we'll build a baseline GW1 team that selects the highest scoring players regardless of costs
    """
    # Work from the set of available players passed in for this round
    df = rd_players
    
    # Create position sets sorted in descending order of points scored
    gk = df[df.element_type==1].sort_values(by="cum_points", ascending=False)
    dfn = df[df.element_type==2].sort_values(by="cum_points", ascending=False)
    mid = df[df.element_type==3].sort_values(by="cum_points", ascending=False)
    fwd = df[df.element_type==4].sort_values(by="cum_points", ascending=False)
    
    # Select the players for our squad based on the number of allowed players per position in the FPL game
    gk_selected = gk.iloc[0:2]
    def_selected = dfn.iloc[0:5]
    mid_selected = mid.iloc[0:5]
    fwd_selected = fwd.iloc[0:3]

    # Store the squad players' ids in lists. We use ids over names as some names are duplicated
    gk_ids = gk_selected.id.values.tolist()
    def_ids = def_selected.id.values.tolist()
    mid_ids = mid_selected.id.values.tolist()
    fwd_ids = fwd_selected.id.values.tolist()

    
    # Let's store the team in a data frame
    team_list = [gk_ids, def_ids, mid_ids, fwd_ids]
    team_frame = pd.DataFrame(team_list).T
    team_frame.rename(columns = {0: "gk", 1:"def", 2:"mid", 3:"fwd"}, inplace=True)

    # Now, select the team_frame data points from our prepared df to create the base team
    base_team = df[df.id.isin(team_frame.values.reshape((20)).tolist())].sort_values(by="element_type", 
                                                                                     ascending=True)
    
    
    # Finally we sort the players by element type and then in descending order of points scored
    base_gk = base_team[base_team.element_type==1].sort_values(by="cum_points", ascending=False)
    base_dfn = base_team[base_team.element_type==2].sort_values(by="cum_points", ascending=False)
    base_mid = base_team[base_team.element_type==3].sort_values(by="cum_points", ascending=False)
    base_fwd = base_team[base_team.element_type==4].sort_values(by="cum_points", ascending=False)

    # Stack the positions back into one dataframe with a fresh 0..n index
    # (DataFrame.append was removed in pandas 2.0, so we use pd.concat)
    base_team = pd.concat([base_gk, base_dfn, base_mid, base_fwd]).reset_index(drop=True)
    
    # We'll need to access the team cost for budgeting
    team_cost = base_team.cost.sum()/10
    
    return base_team, team_cost

In [13]:
gw1_team, team_cost = baseline_team(round_player_set)

In [14]:
gw1_team ## OK... GOOD!!!

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
0,128,Guaita,1,10,1,CRY,10.0,0.2,50
1,157,Pickford,1,8,1,EVE,8.0,0.2,50
2,461,Saïss,1,15,2,WOL,15.0,0.3,50
3,494,Gabriel,1,15,2,ARS,15.0,0.3,50
4,123,James,1,14,2,CHE,14.0,0.3,50
5,498,Castagne,1,14,2,LEI,14.0,0.3,55
6,155,Digne,1,12,2,EVE,12.0,0.2,60
7,254,Salah,1,20,3,LIV,20.0,0.2,120
8,478,Willian,1,14,3,ARS,14.0,0.2,80
9,485,Hendrick,1,14,3,NEW,14.0,0.3,50


In [15]:
team_cost

95.5

## STEP 3
Next we need the next gameweek's set of players and targets, so we copy and paste the round_players() function.

In [16]:
def round_players(full_set, gw_idx):
    
    """
    In this function we'll create a set of available, scoring players for the requested gameweek. From that set 
    we'll sort by points scored and generate a set of four potential transfers, 1 per position and store them 
    in a targets dataset.
    
    We return the available players set and the targets
    
    We need to add in some functionality to check that our targets are not already in the current team
    
    
    """
    
    # First we'll check which players were available at the stated gameweek
    gw_true = full_set[full_set["round"]==gw_idx]
    # Create a list of ids for these players
    gw_id_list = gw_true.id.tolist()
    # Now select all the available players in all rounds from our data
    available_players = full_set.loc[full_set["id"].isin(gw_id_list)]

    # Now we'll remove all players who have 0 points at the latest gw, first filter for latest gw
    latest_round = available_players[available_players["round"]==gw_idx]
    # Now filter in latest round for players scoring 1 or more points
    scorers = latest_round[latest_round.cum_points>0]

    # Get the costs for the requested gameweek from the cumulative df cost feature
    week_cost = cumulative[cumulative["round"]==gw_idx][["element", "cost"]]

    rd_player_set= scorers.merge(week_cost, how="left", left_on="id", right_on="element").drop("element", axis=1)

    # Finally we create a value feature (cumulative points per unit of cost) which will assist our team selection
    # Note that cost is stored in tenths of £1M, so this is points per £0.1M rather than per £1M
    dec = 1
    rd_player_set["value"] = rd_player_set.cum_points / rd_player_set.cost
    rd_player_set["value"] = rd_player_set["value"].round(dec) # Round the values to 1 decimal place
    
    #if team == None

    # Now generate a target per position
    gk2 = rd_player_set[rd_player_set.element_type==1].sort_values(by=["total_points", "cost"], ascending=[False, True])
    def2 = rd_player_set[rd_player_set.element_type==2].sort_values(by=["total_points", "cost"], ascending=[False, True])
    mid2 = rd_player_set[rd_player_set.element_type==3].sort_values(by=["total_points", "cost"], ascending=[False, True])
    att2 = rd_player_set[rd_player_set.element_type==4].sort_values(by=["total_points", "cost"], ascending=[False, True])
    
    # Set up an empty dataframe to store the next players in each position
    targets = pd.DataFrame(columns = rd_player_set.columns)

    # Initialise i to 0 although this will change as we iterate 
    #### If the initial 4 players are rejected we'll need to go to the next level down..
    ### We would expect to only go 3 or 4 levels deep in the worst case scenario?? More thinking required here though
    i=0
    next_gk = gk2.iloc[i].to_frame().T
    next_def = def2.iloc[i].to_frame().T
    next_mid = mid2.iloc[i].to_frame().T
    next_att = att2.iloc[i].to_frame().T

    # DataFrame.append was removed in pandas 2.0, so stack the four rows with pd.concat
    targets = pd.concat([next_gk, next_def, next_mid, next_att])

    
    return rd_player_set, targets

In [17]:
round_player_set, targets = round_players(data, 2)

In [18]:
round_player_set["round"].value_counts()
targets ## OK ... GOOD!!!

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
129,252,Alisson,2,14,1,LIV,15,0.2,60
20,46,Konsa,2,15,2,AVL,15,0.3,45
205,390,Son,2,24,3,TOT,26,0.3,89
204,388,Kane,2,21,4,TOT,23,0.2,105


## STEP 4 
We want to sort our old team so that the weakest players (in the latest gw) occupy certain index values, as follows:
gk=1, def=6, mid=11, att=14
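
These index values follow from stacking the squad as 2 goalkeepers, 5 defenders, 5 midfielders and 3 attackers, each block sorted best to worst. The short sketch below derives them, and it only holds when all 15 squad members are present in the sorted frame.

```python
from itertools import accumulate

# Number of squad slots per position, in the order the team is stacked
squad_slots = {"gk": 2, "def": 5, "mid": 5, "att": 3}

# The last row of each block is the running total of slots minus one
last_rows = [total - 1 for total in accumulate(squad_slots.values())]
weakest_index = dict(zip(squad_slots, last_rows))
print(weakest_index)  # {'gk': 1, 'def': 6, 'mid': 11, 'att': 14}
```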

In [19]:
def team_sorted(round_player_set, current_team): # pass in the existing team
    
    
    # First put our current team ids into a list 
    team_ids = current_team.id.values.tolist()
    
    # Now get these ids from the round_player_set
    current_team_next = round_player_set[round_player_set.id.isin(team_ids)].sort_values(by="element_type", 
                                                                                         ascending=True)
    
    
    # Sort by points, cost for each position
    team_gk = current_team_next[current_team_next.element_type==1].sort_values(by=["total_points", "cost"], 
                                                                           ascending=[False, True])
    team_def = current_team_next[current_team_next.element_type==2].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_mid = current_team_next[current_team_next.element_type==3].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_att = current_team_next[current_team_next.element_type==4].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    
    # Join the positions back into a dataframe with a fresh 0..n index
    # (DataFrame.append was removed in pandas 2.0, so we use pd.concat)
    team_sorted_for_transfer = pd.concat([team_gk, team_def, team_mid, team_att]).reset_index(drop=True)

    

    #Index for worst in each pos, gk=1, def = 6, mid = 11, att = 14
    
    return team_sorted_for_transfer

In [20]:
team_indexed_for_transfer = team_sorted(round_player_set, gw1_team)

In [21]:
team_indexed_for_transfer ## OK... GOOD!!

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
0,128,Guaita,2,3,1,CRY,13.0,0.3,50
1,157,Pickford,2,1,1,EVE,9.0,0.2,50
2,498,Castagne,2,9,2,LEI,23.0,0.4,55
3,494,Gabriel,2,2,2,ARS,17.0,0.3,50
4,123,James,2,1,2,CHE,15.0,0.3,50
5,461,Saïss,2,1,2,WOL,16.0,0.3,50
6,155,Digne,2,1,2,EVE,13.0,0.2,61
7,198,Klich,2,9,3,LEE,18.0,0.3,55
8,254,Salah,2,3,3,LIV,23.0,0.2,120
9,485,Hendrick,2,2,3,NEW,16.0,0.3,50


## STEP 5
The next step is to rank the targets by net points gain, which gives an order of preferred transfers by element type. Later we work through the targets in that order, best to worst, and transfer in the first one that falls within budget.
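
The ranking can be sketched in plain Python as below. `transfer_order` is a hypothetical helper and the squad points are made-up values; sorting the list positions rather than the loss values keeps all four element types in the result even when two losses tie.

```python
def transfer_order(squad_points, target_points):
    """Rank element types 1-4 by net loss (squad player minus target)."""
    losses = [s - t for s, t in zip(squad_points, target_points)]
    # Sort positions, not values, so tied losses still give four distinct entries
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return [i + 1 for i in order]  # shift to element types 1=gk ... 4=fwd


# Gameweek points of the weakest squad player per position (made-up values)
squad_points = [1, 1, 2, 0]
# Gameweek points of the gk, def, mid and fwd targets
target_points = [14, 15, 24, 21]

print(transfer_order(squad_points, target_points))  # [3, 4, 2, 1]
print(transfer_order([0, 0, 0, 0], [5, 5, 5, 5]))   # [1, 2, 3, 4]
```

The most negative loss comes first because it represents the largest points gain from making that swap.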

In [22]:
def net_loss(df, targets, g_idx, d_idx, m_idx, f_idx):
    """
    In this function we calculate the potential loss of points of each player to be considered for transfer into 
    the squad and save them to a list. We then organise loss values in ascending order and create a list of 
    indexes that the ordered values are found in the initial loss list
    df = team_sorted_for_transfer
    """
    
    # Row positions of the weakest squad player in each position
    gk, dfn, mid, fwd = g_idx, d_idx, m_idx, f_idx
    
    ### TODO: add a check that df.id and targets.id are not the same, i.e. that we're not trying to 
    ### transfer in a player we already own (Son out, Son in). Unlikely in practice but possible in theory.
    
    ### TODO: if the current team already holds the best scoring players in the league, 
    ### make sure that no transfers occur.
    
    # Calculate the potential loss each transfer would incur (squad player's points minus target's points)
    gk_loss = df.total_points.iloc[gk] - targets.total_points.iloc[0]
    def_loss = df.total_points.iloc[dfn] - targets.total_points.iloc[1]
    mid_loss = df.total_points.iloc[mid] - targets.total_points.iloc[2]
    fwd_loss = df.total_points.iloc[fwd] - targets.total_points.iloc[3]

    # Unordered list of net losses, indexed 0=gk, 1=def, 2=mid, 3=fwd
    losses = [gk_loss, def_loss, mid_loss, fwd_loss]

    # Order the list positions by loss, low to high (the most negative loss is the biggest points gain).
    # Sorting positions rather than values keeps all four positions distinct when two losses tie.
    # We add 1 to equate to the element types 1=gk, 2=def, 3=mid, 4=fwd
    ordered_idx = [i + 1 for i in sorted(range(len(losses)), key=lambda i: losses[i])]
    
    return ordered_idx

In [23]:
targets_indexed = net_loss(team_indexed_for_transfer, targets, 1, 6, 11, 14)

In [24]:
targets_indexed ### OK... GOOD!!!

[3, 4, 2, 1]

## STEP 6
Now we want to make the transfer using our targets and the preferred order in targets_indexed
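
The budget rule can be sketched on its own, without dataframes. `first_affordable` and the outgoing costs are hypothetical; costs are kept as integers in tenths of £1M so that the comparison avoids floating-point rounding.

```python
def first_affordable(order, squad_cost, out_costs, in_costs, budget=1000):
    """Return (element_type, new_squad_cost) for the first swap within budget.

    All costs are integers in tenths of £1M. Returns (None, squad_cost) when
    no swap fits, in which case the team is left unchanged.
    """
    for element_type in order:
        new_cost = squad_cost - out_costs[element_type] + in_costs[element_type]
        if new_cost <= budget:
            return element_type, new_cost
    return None, squad_cost


order = [3, 4, 2, 1]                           # preferred element types
out_costs = {1: 50, 2: 61, 3: 45, 4: 60}       # made-up outgoing costs
in_costs = {1: 60, 2: 45, 3: 89, 4: 105}       # incoming target costs

print(first_affordable(order, 955, out_costs, in_costs))  # (3, 999)
print(first_affordable(order, 990, out_costs, in_costs))  # (2, 974)
```

In the second call neither the midfielder nor the forward fits, so the search falls through to the defender.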

In [25]:
def make_transfers(order_index, team_sorted_for_transfer, team_cost, targets):
    """
    In this function we work through the element types in order of preference and make the first transfer 
    that keeps the team within the £100M budget. At most one transfer is made.
    """

    # Row of the weakest player per element type in the sorted team: gk=1, def=6, mid=11, att=14
    out_idx = {1: 1, 2: 6, 3: 11, 4: 14}

    best_team = team_sorted_for_transfer.copy()

    for n in order_index:
        idx = out_idx[n]
        # The targets set holds one row per element type
        target = targets[targets.element_type == n].iloc[0]

        # Costs are stored in tenths of £1M
        cost_out = team_sorted_for_transfer.iloc[idx].cost / 10
        cost_in = target["cost"] / 10
        new_team_cost = team_cost - cost_out + cost_in

        if new_team_cost <= 100:
            # Overwrite the outgoing player's row with the incoming target
            best_team.loc[idx] = target
            break

    team_cost = best_team.cost.sum() / 10

    # Review our new team
    return best_team, team_cost

In [26]:
updated_team, team_cost = make_transfers(targets_indexed, team_indexed_for_transfer, team_cost, targets)

In [27]:
updated_team

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
0,128,Guaita,2,3,1,CRY,13.0,0.3,50
1,157,Pickford,2,1,1,EVE,9.0,0.2,50
2,498,Castagne,2,9,2,LEI,23.0,0.4,55
3,494,Gabriel,2,2,2,ARS,17.0,0.3,50
4,123,James,2,1,2,CHE,15.0,0.3,50
5,461,Saïss,2,1,2,WOL,16.0,0.3,50
6,155,Digne,2,1,2,EVE,13.0,0.2,61
7,198,Klich,2,9,3,LEE,18.0,0.3,55
8,254,Salah,2,3,3,LIV,23.0,0.2,120
9,485,Hendrick,2,2,3,NEW,16.0,0.3,50


In [None]:
targets

In [30]:
team_sorted_for_transfer = team_indexed_for_transfer.copy()
best_team = team_indexed_for_transfer.copy()
team_cost = 95.5
g_idx, d_idx, m_idx, f_idx = 1, 6, 11, 14

for m, n in enumerate(targets_indexed):
        if n == 1: # i.e if we're looking to swap a goalkeeper
            print(n)
            cost_out = team_sorted_for_transfer.iloc[g_idx].cost/10
            cost_in = float(targets[targets.element_type==1].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=g_idx, axis=0)
                best_team.loc[g_idx] = targets.iloc[0]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 2: # i.e if we're looking to swap a defender
            print(n)
            cost_out = team_sorted_for_transfer.iloc[d_idx].cost/10
            cost_in = float(targets[targets.element_type==2].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=d_idx, axis=0)
                best_team.loc[d_idx] = targets.iloc[1]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 3: # i.e if we're looking to swap a midfielder
            print(n)
            cost_out = team_sorted_for_transfer.iloc[m_idx].cost/10
            cost_in = float(targets[targets.element_type==3].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=m_idx, axis=0)
                best_team.loc[m_idx] = targets.iloc[2]
                #best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 4: # i.e if we're looking to swap an attacker
            print(n)
            cost_out = team_sorted_for_transfer.iloc[f_idx].cost/10
            cost_in = float(targets[targets.element_type==4].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=f_idx, axis=0)
                best_team.loc[f_idx] = targets.iloc[3]
                #best_team = best_team.sort_index(inplace=True)
                break
            else:
                continue

3


In [31]:
best_team

Unnamed: 0,id,web_name,round,total_points,element_type,short_name,cum_points,value,cost
0,128,Guaita,2,3,1,CRY,13.0,0.3,50
1,157,Pickford,2,1,1,EVE,9.0,0.2,50
2,498,Castagne,2,9,2,LEI,23.0,0.4,55
3,494,Gabriel,2,2,2,ARS,17.0,0.3,50
4,123,James,2,1,2,CHE,15.0,0.3,50
5,461,Saïss,2,1,2,WOL,16.0,0.3,50
6,155,Digne,2,1,2,EVE,13.0,0.2,61
7,198,Klich,2,9,3,LEE,18.0,0.3,55
8,254,Salah,2,3,3,LIV,23.0,0.2,120
9,485,Hendrick,2,2,3,NEW,16.0,0.3,50


In [None]:
best_team.loc[11] = targets.iloc[2]

In [None]:
best_team

In [None]:
best_team = best_team.sort_index()

In [None]:
best_team

## STEP 7
Append the updated team to the old team
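
With current pandas this is done with `pd.concat`, since `DataFrame.append` was removed in pandas 2.0. The sketch below uses two cut-down frames and optional `keys` to label each block with its gameweek; in this notebook the `round` column already carries that information, so the keys are just one possible convenience.

```python
import pandas as pd

# Cut-down gw1 and gw2 teams (two goalkeepers only)
gw1_team = pd.DataFrame({"id": [128, 157], "total_points": [10, 8]})
gw2_team = pd.DataFrame({"id": [128, 157], "total_points": [3, 1]})

# keys adds an outer index level so each gameweek's team can be pulled back out
season = pd.concat([gw1_team, gw2_team], keys=[1, 2], names=["gw", "row"])
print(season.loc[2])
```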

In [None]:
def append_teams(gw1_team, updated_team):
    
    # DataFrame.append was removed in pandas 2.0, so we use pd.concat
    season = pd.concat([gw1_team, updated_team])
    
    return season

In [None]:
appended_teams = append_teams(gw1_team, updated_team)

In [None]:
appended_teams

# THAT'S ONE LOOP COMPLETE FOR GW1 AND GW2.

## STEP 8

Now we need to repeat steps 3 to 7 for the remaining gameweeks. This is where we encountered problems earlier, so we're going to run through them manually.
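
Once the manual run-through works, the repetition could be wrapped in a loop along these lines. This is only a structural sketch: `run_season` and `next_team` are hypothetical names, with `next_team` standing in for one pass through steps 3 to 6 for a single gameweek.

```python
def run_season(first_team, gameweeks, next_team):
    """Carry a team through the given gameweeks, keeping every version."""
    history = [first_team]
    for gw in gameweeks:
        # Each gameweek's team is built from the previous gameweek's team
        history.append(next_team(history[-1], gw))
    return history


# Toy stand-in for steps 3 to 6: just label the team with its gameweek
history = run_season("gw1", [2, 3], lambda team, gw: f"gw{gw}")
print(history)  # ['gw1', 'gw2', 'gw3']
```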

## REPEAT STEP 3

In [None]:
def round_players(full_set, gw_idx):
    
    """
    In this function we'll create a set of available, scoring players for the requested gameweek. From that set 
    we'll sort by points scored and generate a set of four potential transfers, 1 per position and store them 
    in a targets dataset.
    
    We return the available players set and the targets
    
    We need to add in some functionality to check that our targets are not already in the current team
    
    
    """
    
    # First we'll check which players were available at the stated gameweek
    gw_true = full_set[full_set["round"]==gw_idx]
    # Create a list of ids for these players
    gw_id_list = gw_true.id.tolist()
    # Now select all the available players in all rounds from our data
    available_players = full_set.loc[full_set["id"].isin(gw_id_list)]

    # Now we'll remove all players who have 0 points at the latest gw, first filter for latest gw
    latest_round = available_players[available_players["round"]==gw_idx]
    # Now filter in latest round for players scoring 1 or more points
    scorers = latest_round[latest_round.cum_points>0]

    # Get the costs for the requested gameweek from the cumulative df cost feature
    week_cost = cumulative[cumulative["round"]==gw_idx][["element", "cost"]]

    rd_player_set= scorers.merge(week_cost, how="left", left_on="id", right_on="element").drop("element", axis=1)

    # Finally we create a value feature (cumulative points per unit of cost) which will assist our team selection
    # Note that cost is stored in tenths of £1M, so this is points per £0.1M rather than per £1M
    dec = 1
    rd_player_set["value"] = rd_player_set.cum_points / rd_player_set.cost
    rd_player_set["value"] = rd_player_set["value"].round(dec) # Round the values to 1 decimal place
    
    #if team == None

    # Now generate a target per position
    gk2 = rd_player_set[rd_player_set.element_type==1].sort_values(by=["total_points", "cost"], ascending=[False, True])
    def2 = rd_player_set[rd_player_set.element_type==2].sort_values(by=["total_points", "cost"], ascending=[False, True])
    mid2 = rd_player_set[rd_player_set.element_type==3].sort_values(by=["total_points", "cost"], ascending=[False, True])
    att2 = rd_player_set[rd_player_set.element_type==4].sort_values(by=["total_points", "cost"], ascending=[False, True])
    
    # Set up an empty dataframe to store the next players in each position
    targets = pd.DataFrame(columns = rd_player_set.columns)

    # Initialise i to 0 although this will change as we iterate 
    #### If the initial 4 players are rejected we'll need to go to the next level down..
    ### We would expect to only go 3 or 4 levels deep in the worst case scenario?? More thinking required here though
    i=0
    next_gk = gk2.iloc[i].to_frame().T
    next_def = def2.iloc[i].to_frame().T
    next_mid = mid2.iloc[i].to_frame().T
    next_att = att2.iloc[i].to_frame().T

    # DataFrame.append was removed in pandas 2.0, so stack the four rows with pd.concat
    targets = pd.concat([next_gk, next_def, next_mid, next_att])

    
    return rd_player_set, targets

In [None]:
round_player_set, targets = round_players(data, 3)

In [None]:
round_player_set
targets

## REPEAT STEP 4

In [None]:
def team_sorted(round_player_set, current_team): # pass in the existing team
    
    
    # First put our current team ids into a list 
    team_ids = current_team.id.values.tolist()
    
    # Now get these ids from the round_player_set
    current_team_next = round_player_set[round_player_set.id.isin(team_ids)].sort_values(by="element_type", 
                                                                                         ascending=True)
    
    
    # Sort by points, cost for each position
    team_gk = current_team_next[current_team_next.element_type==1].sort_values(by=["total_points", "cost"], 
                                                                           ascending=[False, True])
    team_def = current_team_next[current_team_next.element_type==2].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_mid = current_team_next[current_team_next.element_type==3].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_att = current_team_next[current_team_next.element_type==4].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    
    # Join the positions back into a dataframe with a fresh 0..n index
    # (DataFrame.append was removed in pandas 2.0, so we use pd.concat)
    team_sorted_for_transfer = pd.concat([team_gk, team_def, team_mid, team_att]).reset_index(drop=True)

    

    #Index for worst in each pos, gk=1, def = 6, mid = 11, att = 14
    
    return team_sorted_for_transfer, team_gk, team_def, team_mid, team_att

In [None]:
team_indexed_for_transfer, g, d, m, a = team_sorted(round_player_set, updated_team)

In [None]:
team_indexed_for_transfer


## REPEAT STEP 5

In [None]:
def net_loss(df, targets, g_idx, d_idx, m_idx, f_idx):
    """
    In this function we calculate the potential loss of points of each player to be considered for transfer into 
    the squad and save them to a list. We then organise loss values in ascending order and create a list of 
    indexes that the ordered values are found in the initial loss list
    df = team_sorted_for_transfer
    """
    
    # Row positions of the weakest squad player in each position
    gk, dfn, mid, fwd = g_idx, d_idx, m_idx, f_idx
    
    ### TODO: add a check that df.id and targets.id are not the same, i.e. that we're not trying to 
    ### transfer in a player we already own (Son out, Son in). Unlikely in practice but possible in theory.
    
    ### TODO: if the current team already holds the best scoring players in the league, 
    ### make sure that no transfers occur.
    
    # Calculate the potential loss each transfer would incur (squad player's points minus target's points)
    gk_loss = df.total_points.iloc[gk] - targets.total_points.iloc[0]
    def_loss = df.total_points.iloc[dfn] - targets.total_points.iloc[1]
    mid_loss = df.total_points.iloc[mid] - targets.total_points.iloc[2]
    fwd_loss = df.total_points.iloc[fwd] - targets.total_points.iloc[3]

    # Unordered list of net losses, indexed 0=gk, 1=def, 2=mid, 3=fwd
    losses = [gk_loss, def_loss, mid_loss, fwd_loss]

    # Order the list positions by loss, low to high (the most negative loss is the biggest points gain).
    # Sorting positions rather than values keeps all four positions distinct when two losses tie.
    # We add 1 to equate to the element types 1=gk, 2=def, 3=mid, 4=fwd
    ordered_idx = [i + 1 for i in sorted(range(len(losses)), key=lambda i: losses[i])]
    
    return ordered_idx

In [None]:
targets_indexed = net_loss(team_indexed_for_transfer, targets, 1, 6, 11, 14)

In [None]:
team_indexed_for_transfer.total_points.iloc[14] - targets.total_points.iloc[3]

In [None]:
team_indexed_for_transfer

In [None]:
targets_indexed
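
The two open questions in `net_loss` (tied losses, and what to do when no target beats the current squad) can be handled in the ranking step itself. A minimal sketch, assuming we already have the points of the weakest squad player and of the target for each position. `rank_transfers` and the toy dictionaries are hypothetical and use plain dicts rather than the notebook's dataframes:

```python
def rank_transfers(team_points, target_points):
    """Rank positions by the points gained from swapping in the target.

    team_points / target_points: dicts keyed by element_type (1..4) holding
    the points of the weakest squad player and of the target in that position.
    Returns element types ordered best gain first, omitting swaps with no gain.
    """
    gains = {pos: target_points[pos] - team_points[pos] for pos in team_points}
    # Only keep positions where the target actually outscores our player
    improving = [pos for pos, gain in gains.items() if gain > 0]
    # sorted() is stable, so tied positions keep their gk/def/mid/fwd order
    return sorted(improving, key=lambda pos: -gains[pos])

# Toy numbers: gk and def tie on a gain of 4, mid gains nothing, fwd gains 11
weakest = {1: 2, 2: 1, 3: 6, 4: 2}
target_pts = {1: 6, 2: 5, 3: 6, 4: 13}
order = rank_transfers(weakest, target_pts)
print(order)
```

An empty list here would mean "make no transfer this gameweek", which covers the case where the current team already holds the best scorers.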

## REPEAT STEP 6

In [None]:
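def make_transfers(order_index, team_sorted_for_transfer, team_cost, targets):
    """
    Work through the positions in order_index (best transfer first) and make the first swap that keeps 
    the squad cost at or under £100m. At most one transfer is made per call.
    
    Returns the updated team and its cost in £m.
    """

    # Row index of the weakest player in each position of the sorted team
    g_idx, d_idx, m_idx, f_idx = 1, 6, 11, 14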

    #team_cost = base_team_gw1.cost.sum()/10
    #cost_out = team_sorted_for_transfer.iloc[11].cost/10

    #cost_in = float(targets[targets.element_type==3].cost/10)

    #new_budget = team_cost - cost_out
    #new_team_cost = new_budget + cost_in
    new_team_cost = 0

    best_team = team_sorted_for_transfer.copy()

    for m, n in enumerate(order_index):
        if n == 1: # i.e if we're looking to swap a goalkeeper
            cost_out = team_sorted_for_transfer.iloc[g_idx].cost/10
            cost_in = float(targets[targets.element_type==1].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=g_idx, axis=0)
                best_team.loc[g_idx] = targets.iloc[0]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 2: # i.e if we're looking to swap a defender
            cost_out = team_sorted_for_transfer.iloc[d_idx].cost/10
            cost_in = float(targets[targets.element_type==2].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=d_idx, axis=0)
                best_team.loc[d_idx] = targets.iloc[1]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 3: # i.e if we're looking to swap a midfielder
            cost_out = team_sorted_for_transfer.iloc[m_idx].cost/10
            cost_in = float(targets[targets.element_type==3].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=m_idx, axis=0)
                best_team.loc[m_idx] = targets.iloc[2]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 4: # i.e if we're looking to swap an attacker
            cost_out = team_sorted_for_transfer.iloc[f_idx].cost/10
            cost_in = float(targets[targets.element_type==4].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=f_idx, axis=0)
                best_team.loc[f_idx] = targets.iloc[3]
                best_team = best_team.sort_index()  # sort_index(inplace=True) returns None, so don't assign it
                break
            else:
                continue
    
    team_cost = best_team.cost.sum()/10


    # Review our new team            
    return best_team, team_cost

In [None]:
updated_team, team_cost = make_transfers(targets_indexed, team_indexed_for_transfer, team_cost, targets)
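
The four branches in `make_transfers` repeat the same budget check with a different index and target. A minimal sketch of that check as one position-agnostic helper, assuming `cost` is stored in tenths of £m as in the data above. `try_transfer`, `toy` and `incoming` are hypothetical names for illustration only:

```python
import pandas as pd

def try_transfer(team, out_idx, target, team_cost, budget=100.0):
    """Swap the row at `out_idx` for `target` if the squad stays within budget.

    team: DataFrame with a `cost` column in tenths of £m.
    target: Series with the same labels as the columns of `team`.
    Returns (team, team_cost, made_transfer); the input team is not modified.
    """
    cost_out = team.loc[out_idx, "cost"] / 10
    cost_in = target["cost"] / 10
    new_cost = team_cost - cost_out + cost_in
    if new_cost > budget:
        return team, team_cost, False
    new_team = team.copy()
    new_team.loc[out_idx] = target  # overwrite the row, so index order is kept
    return new_team, new_cost, True

# Toy two-player "squad" costing £9.5m, with a deliberately small budget
toy = pd.DataFrame({"web_name": ["A", "B"], "cost": [50, 45], "total_points": [3, 1]})
incoming = pd.Series({"web_name": "C", "cost": 60, "total_points": 9})
new_team, new_cost, ok = try_transfer(toy, 1, incoming, team_cost=9.5, budget=11.0)
print(ok, new_cost)
```

With a helper like this, the loop over `order_index` could map each element type to its row index and target, then stop at the first call that reports a transfer was made.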

## REPEAT STEP 7

In [None]:
def append_teams(gw1_team, updated_team):
    
    # DataFrame.append was removed in pandas 2.0; pd.concat stacks the two frames the same way
    season = pd.concat([gw1_team, updated_team])
    
    return season

In [None]:
appended_teams = append_teams(appended_teams, updated_team)

In [None]:
appended_teams

In [None]:
team_cost

In [None]:
updated_team.cost.sum()/10

# That's the first generalised loop completed; let's do one more

## Repeat STEP 3 again

In [None]:
def round_players(full_set, gw_idx):
    
    """
    Create the set of available, scoring players for the requested gameweek. From that set we sort by 
    points scored and pick four potential transfers, one per position, which we store in a targets dataset.
    
    Returns the available players set and the targets.
    
    TODO: add a check that our targets are not already in the current team.
    
    """
    
    # First we'll check which players were available at the stated gameweek
    gw_true = full_set[full_set["round"]==gw_idx]
    # Create a list of ids for these players
    gw_id_list = gw_true.id.tolist()
    # Now select all the available players in all rounds from our data
    available_players = full_set.loc[full_set["id"].isin(gw_id_list)]

    # Now we'll remove all players who have 0 points at the latest gw, first filter for latest gw
    latest_round = available_players[available_players["round"]==gw_idx]
    # Now filter in latest round for players scoring 1 or more points
    scorers = latest_round[latest_round.cum_points>0]

    # Get this gameweek's costs from the cumulative df cost feature
    week_cost = cumulative[cumulative["round"]==gw_idx][["element", "cost"]]

    rd_player_set= scorers.merge(week_cost, how="left", left_on="id", right_on="element").drop("element", axis=1)

    # Finally we create a value feature (points per £0.1m, since cost is stored in tenths of £m) 
    # which will assist our team selection
    dec = 1
    rd_player_set["value"] = rd_player_set.cum_points / rd_player_set.cost
    rd_player_set["value"] = rd_player_set["value"].apply(lambda x: round(x, dec)) # Round the values to 1 decimal place
    
    #if team == None

    # Now generate a target per position
    gk2 = rd_player_set[rd_player_set.element_type==1].sort_values(by=["total_points", "cost"], ascending=[False, True])
    def2 = rd_player_set[rd_player_set.element_type==2].sort_values(by=["total_points", "cost"], ascending=[False, True])
    mid2 = rd_player_set[rd_player_set.element_type==3].sort_values(by=["total_points", "cost"], ascending=[False, True])
    att2 = rd_player_set[rd_player_set.element_type==4].sort_values(by=["total_points", "cost"], ascending=[False, True])
    
    # Take the top-ranked player in each position as this round's targets
    # Initialise i to 0, although this will change as we iterate
    #### If the initial 4 players are rejected we'll need to go to the next level down.
    ### We would expect to go only 3 or 4 levels deep in the worst case, but more thinking is required here
    i=0
    next_gk = gk2.iloc[i].to_frame().T
    next_def = def2.iloc[i].to_frame().T
    next_mid = mid2.iloc[i].to_frame().T
    next_att = att2.iloc[i].to_frame().T

    # DataFrame.append was removed in pandas 2.0, so stack the four rows with pd.concat (order: gk, def, mid, att)
    targets = pd.concat([next_gk, next_def, next_mid, next_att])

    
    return rd_player_set, targets

In [None]:
round_player_set, targets = round_players(data, 4) # index changed

In [None]:
targets
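
The docstring of `round_players` notes that targets should not already be in the current team. A minimal sketch of one way to do that, filtering out owned players before taking the best player per position. It assumes the same `id`, `element_type`, `total_points` and `cost` columns used above; `pick_targets` and `toy_players` are hypothetical:

```python
import pandas as pd

def pick_targets(players, owned_ids):
    """Best not-yet-owned player per position: most points, then cheapest."""
    candidates = players[~players["id"].isin(owned_ids)]
    ranked = candidates.sort_values(by=["element_type", "total_points", "cost"],
                                    ascending=[True, False, True])
    # After sorting, the first row of each position group is that position's target
    return ranked.groupby("element_type").head(1).reset_index(drop=True)

toy_players = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "element_type": [1, 1, 2, 2, 2, 3],
    "total_points": [8, 6, 9, 9, 4, 7],
    "cost": [50, 45, 60, 55, 40, 80],
})
# Player 1 is already in the squad, so the goalkeeper target falls to player 2
toy_targets = pick_targets(toy_players, owned_ids=[1])
print(toy_targets)
```

Because the current squad is excluded up front, the "Son out, Son in" case flagged in `net_loss` could not arise from targets built this way.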

## Repeat STEP 4 Again

In [None]:
def team_sorted(round_player_set, current_team): # pass in the existing team
    
    
    # First put our current team ids into a list 
    team_ids = current_team.id.values.tolist()
    
    # Now get these ids from the round_player_set
    current_team_next = round_player_set[round_player_set.id.isin(team_ids)].sort_values(by="element_type", 
                                                                                         ascending=True)
    
    
    # Sort by points, cost for each position
    team_gk = current_team_next[current_team_next.element_type==1].sort_values(by=["total_points", "cost"], 
                                                                           ascending=[False, True])
    team_def = current_team_next[current_team_next.element_type==2].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_mid = current_team_next[current_team_next.element_type==3].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    team_att = current_team_next[current_team_next.element_type==4].sort_values(by=["total_points", "cost"], 
                                                                                ascending=[False, True])
    
    # Join the positions back into a dataframe (DataFrame.append was removed in pandas 2.0, so use pd.concat)
    team_sorted_for_transfer = pd.concat([team_gk, team_def, team_mid, team_att])
    team_sorted_for_transfer = team_sorted_for_transfer.reset_index(drop=True)

    # Index of the worst player in each position: gk=1, def=6, mid=11, att=14
    
    return team_sorted_for_transfer

In [None]:
team_indexed_for_transfer = team_sorted(round_player_set, updated_team)

In [None]:
team_indexed_for_transfer

In [None]:
appended_teams

## Repeat STEP 5 Again

In [None]:
def net_loss(df, targets, g_idx, d_idx, m_idx, f_idx):
    """
    Calculate the points lost (or gained) by keeping the weakest squad player in each position rather than 
    transferring in that position's target, then rank the positions from most to least beneficial transfer.
    
    df = team_sorted_for_transfer
    targets = one target per position, in the order gk, def, mid, fwd
    g_idx, d_idx, m_idx, f_idx = row positions in df of the weakest gk, def, mid and fwd
    
    Returns a list of element types (1=gk, 2=def, 3=mid, 4=fwd), best transfer first.
    """
    
    ### TODO: ADD SOME FUNCTIONALITY TO CHECK THAT THE DF.ID AND TARGETS.ID ARE NOT THE SAME, I.E. WE'RE NOT 
    ### TRYING TO TRANSFER IN THE SAME PLAYER: SON OUT SON IN, KANE OUT KANE IN, ETC.
    ### THIS IS PRACTICALLY IMPOSSIBLE BUT MAYBE NOT THEORETICALLY SO
    
    ### THE ABOVE COMMENT BEGS THE QUESTION - WHAT HAPPENS IF MY CURRENT TEAM ARE THE BEST SCORING PLAYERS 
    ### IN THE LEAGUE... WE NEED TO MAKE SURE THAT NO TRANSFERS OCCUR IN THAT CASE!!
    
    # Loss = current player's points minus the target's points, so the most negative value is the biggest gain
    losses = [
        df.total_points.iloc[g_idx] - targets.total_points.iloc[0],
        df.total_points.iloc[d_idx] - targets.total_points.iloc[1],
        df.total_points.iloc[m_idx] - targets.total_points.iloc[2],
        df.total_points.iloc[f_idx] - targets.total_points.iloc[3],
    ]

    # Rank the positions by loss, low to high (biggest gain first). sorted() is stable, so tied positions
    # keep their gk/def/mid/fwd order; list.index() would return the same position twice on a tie.
    # Note the values have 1 added to equate to the element types 1=gk, 2=def, 3=mid, 4=fwd
    ordered_idx = [i + 1 for i in sorted(range(len(losses)), key=lambda i: losses[i])]
    
    return ordered_idx

In [None]:
targets_indexed = net_loss(team_indexed_for_transfer, targets, 1, 6, 11, 14)

In [None]:
targets_indexed

## Repeat STEP 6 AGAIN

In [None]:
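def make_transfers(order_index, team_sorted_for_transfer, team_cost, targets):
    """
    Work through the positions in order_index (best transfer first) and make the first swap that keeps 
    the squad cost at or under £100m. At most one transfer is made per call.
    
    Returns the updated team and its cost in £m.
    """

    # Row index of the weakest player in each position of the sorted team
    g_idx, d_idx, m_idx, f_idx = 1, 6, 11, 14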

    #team_cost = base_team_gw1.cost.sum()/10
    #cost_out = team_sorted_for_transfer.iloc[11].cost/10

    #cost_in = float(targets[targets.element_type==3].cost/10)

    #new_budget = team_cost - cost_out
    #new_team_cost = new_budget + cost_in
    new_team_cost = 0

    best_team = team_sorted_for_transfer.copy()

    for m, n in enumerate(order_index):
        if n == 1: # i.e if we're looking to swap a goalkeeper
            cost_out = team_sorted_for_transfer.iloc[g_idx].cost/10
            cost_in = float(targets[targets.element_type==1].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=g_idx, axis=0)
                best_team.loc[g_idx] = targets.iloc[0]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 2: # i.e if we're looking to swap a defender
            cost_out = team_sorted_for_transfer.iloc[d_idx].cost/10
            cost_in = float(targets[targets.element_type==2].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=d_idx, axis=0)
                best_team.loc[d_idx] = targets.iloc[1]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 3: # i.e if we're looking to swap a midfielder
            cost_out = team_sorted_for_transfer.iloc[m_idx].cost/10
            cost_in = float(targets[targets.element_type==3].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=m_idx, axis=0)
                best_team.loc[m_idx] = targets.iloc[2]
                best_team = best_team.sort_index()
                break
            else:
                continue

        elif n == 4: # i.e if we're looking to swap an attacker
            cost_out = team_sorted_for_transfer.iloc[f_idx].cost/10
            cost_in = float(targets[targets.element_type==4].cost/10)
            new_budget = team_cost - cost_out
            new_team_cost = new_budget + cost_in
            if new_team_cost <= 100:
                best_team = best_team.drop(index=f_idx, axis=0)
                best_team.loc[f_idx] = targets.iloc[3]
                best_team = best_team.sort_index()  # sort_index(inplace=True) returns None, so don't assign it
                break
            else:
                continue
    
    team_cost = best_team.cost.sum()/10


    # Review our new team            
    return best_team, team_cost

In [None]:
updated_team, team_cost = make_transfers(targets_indexed, team_indexed_for_transfer, team_cost, targets)

In [None]:
print(team_cost)
updated_team

## Repeat STEP 7 Again

In [None]:
def append_teams(gw1_team, updated_team):
    
    # DataFrame.append was removed in pandas 2.0; pd.concat stacks the two frames the same way
    season = pd.concat([gw1_team, updated_team])
    
    return season

In [None]:
appended_teams = append_teams(appended_teams, updated_team)

In [None]:
appended_teams#.web_name.values

In [None]:
appended_teams.web_name.value_counts()

## NARRATIVE
So the only item that needs a manual update is the gameweek. We should now be able to write a general set of functions to solve the problem:

In [None]:
def generate_teams():
    
    
    # step 1, generate the set of gw1 players to choose from
    round_player_set, targets = round_players(data, 1)
    
    # step 2 Now generate the baseline team
    gw1_team, team_cost = baseline_team(round_player_set)
    
    updated_team = gw1_team.copy()
    appended_teams = gw1_team.copy()
    
    # Create a list of gwks to iterate through - note we'll start from gw2
    # max() does not depend on the row order of data, unlike taking the last unique value
    gwks = list(range(2, data["round"].max() + 1))
    
    for i, j in enumerate(gwks):
        # step 3, generate the set of players for the next gameweek, together with a set of targets
        round_player_set, targets = round_players(data, j)

        # step 4, sort the team so our weakest players sit at their appointed transfer indexes
        team_indexed_for_transfer = team_sorted(round_player_set, updated_team)

        # step 5, get the index of preferred transfers
        targets_indexed = net_loss(team_indexed_for_transfer, targets, 1, 6, 11, 14)

        # step 6, make a transfer and update the team cost
        updated_team, team_cost = make_transfers(targets_indexed, team_indexed_for_transfer, team_cost, targets)


        # step 7, append the updated team to the master season list of teams
        appended_teams = append_teams(appended_teams, updated_team)
        print(i, updated_team.web_name.values)
    
    return appended_teams

In [None]:
season = generate_teams()

In [None]:
updated_team.index.tolist()

In [None]:
variable

In [None]:
season

In [None]:
gwks = list(range(2, data["round"].max() + 1))

In [None]:
for i, j in enumerate(gwks):
    print(i, j)