
## Overview

This notebook aims to deliver training session plans to various triathletes that I coach which takes into account their aims and goals in triathlon, their current fitness and their time and availability. The outcome of this is to create a training programme that gives an achievable stretch to their training and aiming to increase their training load (which manifests in the form of training volume and training intensity) safely.


## Data Collection - sessions

This section collects the metadata from every single training session I have created (with key information such as training type, TSS (training stress score, a measure of how much training stress the session will give, based on how intense the session is and how long it is). Other features include whether the session is applicable for the training phase (which are preparation, base, build and peak), and session type (i.e. the focus of the session, such as endurance focussed, strength focussed, speed focussed etc)


In [265]:


import pandas as pd
file_path = '/Users/nicholasholman/Downloads/tps-2.csv' 

df = pd.read_csv(file_path)


df['workout_name'] = df['Sport'] + df['Name']

df.tail()

Unnamed: 0,Sport,Name,sport_name,Time,TSS,Swim_ind,Bike_ind,Run_ind,Recovery_ind,Elite_ind,...,Base 3_ind,Build 1_ind,Build 2_ind,Peak_ind,E_ind,S_ind,M_Ind,P_ind,F_ind,A_ind
509,Run,E2.13Elt,RunE2,90.0,109.5,0,0,1,0,1,...,1,1,1,1,1,0,0,0,0,0
510,Run,E2.14Elt,RunE2,97.5,119.0,0,0,1,0,1,...,1,1,1,1,1,0,0,0,0,0
511,Run,E2.15Elt,RunE2,105.0,128.5,0,0,1,0,1,...,1,1,1,1,1,0,0,0,0,0
512,Run,E2.16Elt,RunE2,112.5,138.0,0,0,1,0,1,...,1,1,1,1,1,0,0,0,0,0
513,Run,E2.17Elt,RunE2,120.0,147.0,0,0,1,0,1,...,1,1,1,1,1,0,0,0,0,0


## Data Collection - athletes

This section is a stopgap solution as TrainingPeaks (the software I use to create the sessions) has severed their API links and therefore have to create a manual workaround. This essentially collects information on their current fitness, and time availability across each of the 3 disciplines.


In [None]:
# Original data
import pandas as pd
import numpy as np

athlete = {
            'athlete_name': ['VN','IB','KH', 'MA', 'NS'],
            'ramp': [1,3,1,5,1],
            'recov_ramp': [-1,-2,-1,-3,-2],
            'starting_ctl' : [48,58,36,90,29],
            'run_ramp': [0.6,1,0.6,1.5,-1],
            'run_rec_ramp': [-1.25, -0.66, -0.66,-1, -0.66],
            'run_start_ctl': [17,26,10,42,18],
            'bike_ramp': [0.6,1,0.6,2.5,-1],
            'bike_rec_ramp': [-1.25, -0.66, -0.66,-1, -0.66],
            'bike_start_ctl': [17,26,10,23,18],
            'phase' : ['Base 2' ,'Base 2' ,'Base 2', 'Base 3', 'Base 1'],
            'Non_tri_TSS': [30, 70, 45, 0, 50],
            'max_time_swim': [50, 80, 45, 150, 60],
            'max_time_bike': [150, 120, 60, 180, 60],
            'max_time_run': [100, 90, 4, 180, 60],
            'max_swim' : [1,1,3,3,2],
            'max_bike': [2,2,2,4,2],
            'max_run': [2,2,2,4,2],
            'max_total': [5,5,5,10,6]

}

athletes_df = pd.DataFrame(athlete)

## Target TSS creation
This next section creates a target TSS (training stress score) for the next 4 weeks, based on their current fitness, current focus and their target stretch (ramp rate). FYI - CTL stands for Chronic Training Load - a measure of how fit an athlete is. The code is based on a 4 week training cycle, which gives 3 weeks of improving fitness and 1 week of recovery.

In [392]:
data1 = []
for _, row in athletes_df.iterrows():
        athlete_name = row['athlete_name']
        ramp = row['ramp']
        run_ramp = row['run_ramp']
        bike_ramp = row['bike_ramp']
        max_total = row['max_total']
        max_swim = row['max_swim']
        max_bike = row['max_bike']
        max_run = row['max_run']
        max_time_swim = row['max_time_swim']
        max_time_bike = row['max_time_bike']
        max_time_run = row['max_time_run']
        Non_tri_TSS = row ['Non_tri_TSS']
        recov_ramp = row['recov_ramp']
        run_rec_ramp = row['run_rec_ramp']
        bike_rec_ramp = row['bike_rec_ramp']
        phase = row['phase']
        starting_ctl = row['starting_ctl']
        run_start_ctl = row['run_start_ctl']
        bike_start_ctl = row['bike_start_ctl']
        data1.extend([
            {'week_num': 'Week 1', 'Athlete': athlete_name, 'max_total': max_total, 'max_time_swim': max_time_swim, 'max_time_bike': max_time_bike, 'max_time_run': max_time_run, 'max_swim': max_swim, 'max_bike': max_bike, 'max_run': max_run, 'Non_tri_TSS': Non_tri_TSS, 'phase': phase, 'ramp': ramp, 'ctl' : starting_ctl, 'run_ramp': run_ramp, 'run_ctl' : run_start_ctl , 'bike_ramp': bike_ramp, 'bike_ctl' : bike_start_ctl   },
            {'week_num': 'Week 2', 'Athlete': athlete_name, 'max_total': max_total, 'max_time_swim': max_time_swim, 'max_time_bike': max_time_bike, 'max_time_run': max_time_run, 'max_swim': max_swim, 'max_bike': max_bike, 'max_run': max_run, 'Non_tri_TSS': Non_tri_TSS, 'phase': phase, 'ramp': ramp, 'ctl' : starting_ctl + ramp, 'run_ramp': run_ramp, 'run_ctl' : run_start_ctl + run_ramp, 'bike_ramp': bike_ramp, 'bike_ctl' : bike_start_ctl + bike_ramp },
            {'week_num': 'Week 3', 'Athlete': athlete_name, 'max_total': max_total, 'max_time_swim': max_time_swim, 'max_time_bike': max_time_bike, 'max_time_run': max_time_run, 'max_swim': max_swim, 'max_bike': max_bike, 'max_run': max_run, 'Non_tri_TSS': Non_tri_TSS, 'phase': phase, 'ramp': ramp, 'ctl' :starting_ctl + ramp + ramp, 'run_ramp': run_ramp, 'run_ctl' : run_start_ctl + run_ramp + run_ramp, 'bike_ramp': bike_ramp, 'bike_ctl' : bike_start_ctl + bike_ramp + bike_ramp},
            {'week_num': 'Week 4', 'Athlete': athlete_name, 'max_total': max_total, 'max_time_swim': max_time_swim, 'max_time_bike': max_time_bike, 'max_time_run': max_time_run, 'max_swim': max_swim, 'max_bike': max_bike, 'max_run': max_run, 'Non_tri_TSS': Non_tri_TSS, 'phase': phase, 'ramp': recov_ramp, 'ctl' : starting_ctl + ramp + ramp + ramp, 'run_ramp': run_rec_ramp, 'run_ctl' : run_start_ctl + run_ramp + run_ramp + run_ramp, 'bike_ramp': bike_rec_ramp, 'bike_ctl' : bike_start_ctl + bike_ramp + bike_ramp + bike_ramp}

        ])

# Create DataFrame
demand = pd.DataFrame(data1)


# Display the updated DataFrame


demand['TSS_target'] = ((demand['ctl'] + (demand['ramp'] * np.exp(-1/42)) - demand['ctl'] * np.exp(-7/42)) /
                           (1 - np.exp(-1/42)))  

demand['run_TSS_target'] = ((demand['run_ctl'] + (demand['run_ramp'] * np.exp(-1/42)) - demand['run_ctl'] * np.exp(-7/42)) /
                           (1 - np.exp(-1/42)))  

demand['bike_TSS_target'] = ((demand['bike_ctl'] + (demand['bike_ramp'] * np.exp(-1/42)) - demand['bike_ctl'] * np.exp(-7/42)) /
                           (1 - np.exp(-1/42)))  

print(demand)

   week_num Athlete  max_total  max_time_swim  max_time_bike  max_time_run  \
0    Week 1      VN          5             50            150           100   
1    Week 2      VN          5             50            150           100   
2    Week 3      VN          5             50            150           100   
3    Week 4      VN          5             50            150           100   
4    Week 1      IB          5             80            120            90   
5    Week 2      IB          5             80            120            90   
6    Week 3      IB          5             80            120            90   
7    Week 4      IB          5             80            120            90   
8    Week 1      KH          5             45             60             4   
9    Week 2      KH          5             45             60             4   
10   Week 3      KH          5             45             60             4   
11   Week 4      KH          5             45             60    

## Session creation
Now we have the targets and the sessions available, we can now solve by using a linear programming package called pulp. The goal is to achieve the target TSS scores (overall and per discipline) with the minimum amount of time with the various constraints

In [269]:

from pulp import *
prod = pulp.LpVariable.dicts("prod",
                                     ((workout_name) for  workout_name in df.index),
                                     lowBound=0, upBound=1,
                                     cat='Integer')
                                  

In [371]:
weeks = ['Week 1','Week 2' ,'Week 3' ,'Week 4']

athletes = [ 'MA', 'NS', 'VN','KH','IB']

In [398]:
for week in weeks:
    for athlete in athletes:
        # Create a new model and appropriate constraints for the current athlete and week
        model = LpProblem("The_TP_Problem_" + athlete + "_" + week, LpMinimize)
        # Add constraints to the model for the current athlete and week
        tss_subset = demand[(demand['Athlete'] == athlete) & (demand['week_num'] == week)]
        if len(tss_subset) == 0:
            print(f"No TSS target value for athlete {athlete} in week {week}")
        else:
            tss_value = tss_subset['TSS_target'].iloc[0] - tss_subset['Non_tri_TSS'].iloc[0] 
            run_tss_value = tss_subset['run_TSS_target'].iloc[0] 
            bike_tss_value = tss_subset['bike_TSS_target'].iloc[0] 
            tss_value_low = tss_value * 0.98
            tss_value_high = tss_value * 1.03
            run_tss_value_low = run_tss_value * 0.98
            run_tss_value_high = run_tss_value * 1.03
            bike_tss_value_low = bike_tss_value * 0.98
            bike_tss_value_high = bike_tss_value * 1.03
            swim_max = tss_subset['max_swim'].iloc[0] 
            bike_max = tss_subset['max_bike'].iloc[0] 
            run_max = tss_subset['max_run'].iloc[0] 
            swim_time_max = tss_subset['max_time_swim'].iloc[0] 
            bike_time_max = tss_subset['max_time_bike'].iloc[0] 
            run_time_max = tss_subset['max_time_run'].iloc[0] 
            total_max = tss_subset['max_total'].iloc[0] 
            # TSS values between target ranges 
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index]) >= tss_value_low
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index]) <= tss_value_high
            
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index if df.loc[workout_name, 'Run_ind'] == 1]) <= run_tss_value_high
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index if df.loc[workout_name, 'Run_ind'] == 1]) >= run_tss_value_low
          
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index if df.loc[workout_name, 'Bike_ind'] == 1]) <= bike_tss_value_high
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'TSS'] for workout_name in df.index if df.loc[workout_name, 'Bike_ind'] == 1]) >= bike_tss_value_low

            # Constraint: Number of sessions per discipline do not exceed maximum for each athlete
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Swim_ind'] for workout_name in df.index] ) <= swim_max
            # Bikes per week
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Bike_ind'] for workout_name in df.index] ) <= bike_max 
            # Runs per week
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Run_ind'] for workout_name in df.index] ) <= run_max 
        
            # Constraint: Number of sessions per discipline do not exceed minimum for each athlete
            model += pulp.lpSum(
                 [prod[workout_name] * df.loc[(workout_name), 'Swim_ind'] for workout_name in df.index] ) >= 1
            # Bikes per week
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Bike_ind'] for workout_name in df.index] ) >= 1 
            # Runs per week
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Run_ind'] for workout_name in df.index] ) >= 1 

            #Max sessions in week
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'Base 3_ind'] for workout_name in df.index] ) <= total_max 
      
            model += pulp.lpSum(
                [prod[workout_name] * df.loc[(workout_name), 'M_Ind'] for workout_name in df.index] ) == 0   

 
        # Solve the model for the current athlete and week
        model.solve()

        # Output the results for the current athlete and week
        output = []
        for workout_name in prod:
            var_output = {
                'Week': week,
                'Athlete': athlete,
                'Workout': workout_name,
                'Unit': prod[(workout_name)].varValue,
                'Workout Name': df.loc[(workout_name), 'workout_name'],
                'TSS': df.loc[(workout_name), 'TSS'],
                'Time': df.loc[(workout_name), 'Time']
            }
            output.append(var_output)

        output_df = pd.DataFrame.from_records(output).loc[lambda x: x['Unit'] != 0].sort_values(['Unit', 'Workout'], ascending=False)
        output_df.set_index(['Workout'], inplace=True)

        print("Results for Week {} and Athlete {}\n".format(week, athlete))
        print(output_df)    # Create a new model and appropriate constraints for the current athlete and week
      



Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - /usr/local/anaconda3/lib/python3.12/site-packages/pulp/solverdir/cbc/osx/64/cbc /var/folders/vp/r78j8zcd58zcmdq5cn012pb40000gn/T/1cb8745036f84a038efb0c0a2ef0879e-pulp.mps -timeMode elapsed -branch -printingOptions all -solution /var/folders/vp/r78j8zcd58zcmdq5cn012pb40000gn/T/1cb8745036f84a038efb0c0a2ef0879e-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 19 COLUMNS
At line 4479 RHS
At line 4494 BOUNDS
At line 5010 ENDATA
Problem MODEL has 14 rows, 515 columns and 3430 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 0 - 0.00 seconds
Cgl0002I 70 variables fixed
Cgl0004I processed model has 7 rows, 298 columns (298 integer (204 of which binary)) and 1082 elements
Cbc0038I Initial state - 2 integers unsatisfied sum - 0.240393
Cbc0038I Pass   1: suminf.    0.36955 (2) obj. 0 iterations 2
Cbc003



Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - /usr/local/anaconda3/lib/python3.12/site-packages/pulp/solverdir/cbc/osx/64/cbc /var/folders/vp/r78j8zcd58zcmdq5cn012pb40000gn/T/9eedf5cea9404bf981fdd220dbdc3acb-pulp.mps -timeMode elapsed -branch -printingOptions all -solution /var/folders/vp/r78j8zcd58zcmdq5cn012pb40000gn/T/9eedf5cea9404bf981fdd220dbdc3acb-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 19 COLUMNS
At line 4479 RHS
At line 4494 BOUNDS
At line 5010 ENDATA
Problem MODEL has 14 rows, 515 columns and 3430 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 0 - 0.00 seconds
Cgl0002I 70 variables fixed
Cgl0004I processed model has 7 rows, 298 columns (298 integer (204 of which binary)) and 1082 elements
Cbc0038I Initial state - 3 integers unsatisfied sum - 0.690749
Cbc0038I Pass   1: suminf.    0.85088 (2) obj. 0 iterations 2
Cbc003