Section 0: Intro

In [1]:
#USAGE INSTRUCTIONS:

#The way this notebook is written, you'll want to Restart + Run All for each test.
#menu_start(), in the bottom cell, starts the whole program. This is the only function you'll need to call.

#Athletes (with their runs) and segments are held in ATHLETES and SEGMENTS.
#These are a mix of personal bests, record times, and Strava times.

#James' set is entirely from Strava workouts. Some points are close to ideal performance, but others, like his mile, are from normal workouts that offset his curve.
#Jay, Max, and Julion have data closer to ideal performance. Some runs are actually logged on Strava, others were manually inserted.
#The set of world record performances is stored as option 5.

#In the off chance a seldom-used method is missing, everything here was pasted from Jay_Notebooks/pretty_methods_3.ipynb


In [2]:
#COMPARISON METHODS

#Single Athlete compares an athlete's pace curve to segment distance, estimating their time for the segment
#Multiple Athlete factors in the KOM holder's other runs, giving a better indicator of the segment's difficulty
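
A minimal sketch of the single athlete idea described above, assuming the same log fit the notebook uses later (pace = b + a*ln(distance)). It uses only the standard library instead of np.polyfit, and the helper names fit_log_curve and predict_time are made up for this illustration. The data points are Jay's runs and the "Lake and back" KOM from the data section.

```python
import math

def fit_log_curve(points):
    """Least-squares fit of pace = b + a*ln(distance); returns (a, b)."""
    xs = [math.log(d) for d, _ in points]
    ys = [p for _, p in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def predict_time(dt_points, distance):
    """Predict time (seconds) for a distance (miles) from (distance, time) points."""
    paces = [(d, t / d) for d, t in dt_points]  # seconds per mile
    a, b = fit_log_curve(paces)
    return (b + a * math.log(distance)) * distance  # pace * distance = time

# Jay's (miles, seconds) points and the Lake and back segment (2.9 miles, KOM 1080 s)
jay = [(0.25, 53), (0.5, 120), (1, 275), (2, 600), (3.0, 960), (4.97, 1610)]
kom = 1080
estimate = predict_time(jay, 2.9)
print(f"predicted {estimate:.0f} s against a KOM of {kom} s")
```

Because the prediction ignores terrain entirely, it only says how the athlete's fitted curve compares to the KOM time, not how hard the segment is.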

In [3]:
#EXAMPLES AND INSIGHTS

#Lake and back: This is from a workout, not representative of ideal effort. 
#Since this pace is slower than every athlete's curve predicts, any single athlete comparison will predict a successful effort.
#Due to the relatively slow pace, in a multi athlete comparison (assuming Jay is correctly entered as the KOM holder) every athlete except Julion (who is better at short distances) should be able to beat this.
#Jay will be predicted to beat his KOM in a single athlete comparison since it is slower than his average curve, but he will tie against himself in a multi athlete comparison since the slow time is not used to calculate his curve.

#Trashmore: This is a hard hill circuit, times will be significantly slower than ideal due to the added terrain/incline
#In a single athlete comparison, Julion is predicted to beat Jay's time: although Jay is generally faster at this distance, the run appears quite doable when difficulty is not factored in.
#In a comparison against Jay, Julion loses, since the difficulty scaling slows down Julion's prediction.
#In a comparison against Jay, James will win, since he is faster than Jay at long distances.

#Julion's 2mile PR: An old PR from a runner who specializes in distances shorter than this one
#If Julion is compared against himself for an old time he ran, he is projected to win by a hair.
#Though it is his own effort, it is older than his other data points, so his curve indicates he can probably beat his time if he tries now.

#World Records: Because multi-athlete comparisons scale by difficulty, selecting a much faster athlete like this set as the KOM holder will blow any other athlete's projected time out of the water, since a KOM time far slower than the holder's curve implies extreme trail difficulty.
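
The difficulty scaling behind these multi athlete examples can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own code: the curves are hypothetical (slope, intercept) pairs for pace = intercept + slope*ln(distance), and the names pace and adjusted_time are invented here. Only the segment numbers (2.7 miles, 1296 s) come from the Trashmore entry in the data section.

```python
import math

def pace(curve, distance):
    """Pace in seconds per mile from a (slope, intercept) log curve."""
    a, b = curve
    return b + a * math.log(distance)

def adjusted_time(my_curve, kom_curve, seg_dist, real_kom):
    """Scale my ideal time by how much slower the KOM was than its holder's ideal."""
    expected_kom = pace(kom_curve, seg_dist) * seg_dist  # holder's ideal time
    difficulty = real_kom / expected_kom                 # > 1 means a hard segment
    my_expected = pace(my_curve, seg_dist) * seg_dist    # my ideal time
    return my_expected * difficulty

# Hypothetical curves, chosen only for illustration
holder = (40.0, 270.0)
challenger = (30.0, 270.0)
record_set = (30.0, 220.0)  # a much faster athlete entered as the KOM holder

seg_dist, real_kom = 2.7, 1296

tie = adjusted_time(holder, holder, seg_dist, real_kom)
normal = adjusted_time(challenger, holder, seg_dist, real_kom)
inflated = adjusted_time(challenger, record_set, seg_dist, real_kom)
print(f"self: {tie:.0f} s, vs holder: {normal:.0f} s, vs record set: {inflated:.0f} s")
```

Comparing the holder against their own curve returns the KOM time itself, which is the tie described for Lake and back. Entering a much faster athlete as the holder inflates the difficulty factor and therefore everyone else's projected time.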

Section 1: menu functions

In [4]:
import matplotlib.pyplot as pp
import numpy as np

In [5]:
def menu_start(): #starts program for user
    print("Welcome to implementation 3")
    print("\nSelect mode: ")
    print("1. Single Athlete Prediction")    
    print("2. Multi Athlete Prediction")
    
    mode = getchoice(2)
    
    if mode==1:
        menu_single()
    if mode==2:
        menu_multiple()

In [6]:
def menu_single(): #makes single athlete comparison to segment
    
    #get data for comparison
    print("Comparing segment to athlete's recorded performances.\n")
    athletes = ATHLETES
    print("\nSelect Athlete: ")
    athlete = getAthlete(athletes)
    print("\nSelecting Segment: ")
    segments = SEGMENTS
    segment = getSegment(segments)

    #format data for prediction
    dt_data = make_dt_points(athlete.get_activities())
    
    #make prediction (predict_1v_seg converts to pace and fits the curve itself)
    prediction = int(predict_1v_seg(segment,dt_data))
    to_beat = int(segment.get_KOM())
    difference = prediction - to_beat

    #present results to user
    print("Based on your past performances, your estimated time for the segment '"+segment.get_name()+"' is "+time_to_string(prediction))
    comparison_str = ""
    if difference > 0:
        comparison_str = "This is about "+time_to_string(difference)+" slower than the current KOM."
    elif difference < 0:
        comparison_str = "This is about "+time_to_string(difference*-1)+" faster than the current KOM!"
    else:
        comparison_str = "This time would tie the current KOM. Go for it!"
    print(comparison_str)    
    
    #print("without information on the KOM holder's other performances, prediction can't account for segment-specific difficulty")

In [7]:
def menu_multiple(): #allows user to select type of multi-user comparison. Currently overridden
    print("Comparing multiple athletes' recorded performances")
    
    print("\nSelect comparison type: ")
    print("1. Segment Prediction")    
    print("2. Pace Prediction")
    mode = getchoice(2)
    
    if mode==1:
        menu_multiple_segment()
    if mode==2:
        menu_multiple_paces()

In [8]:
def menu_multiple_segment(): #compares user to KOM-holder to predict user's time on segment
    
    #get data for comparison
    athletes = ATHLETES
    print("\nSelect Athlete 1 (You): ")
    athlete1 = getAthlete(athletes)
    print("\nSelect Athlete 2 (KOM Holder): ")
    athlete2 = getAthlete(athletes)

    segments = SEGMENTS
    print("\nSelect Segment (This should be a performance by Athlete 2): ")
    segment = getSegment(segments)
        
    athlete1_dt = make_dt_points(athlete1.get_activities())
    athlete2_dt = make_dt_points(athlete2.get_activities())
    
    #determine pace curves
    athlete1_spm = dt_to_spm(athlete1_dt)
    athlete2_spm = dt_to_spm(athlete2_dt)
    
    curve1 = points_to_logeq(athlete1_spm)
    curve2 = points_to_logeq(athlete2_spm) 
    
    #predict KOM holder's time on the segment, compare to actual time to determine segment's difficulty compared to "ideal" conditions
    KOM_dist = segment.get_dist()
    expected_KOM = curve2.f(KOM_dist) * KOM_dist #curve gives pace (seconds per mile), so multiply by distance to get a time
    
    real_KOM = segment.get_KOM()
    seg_difficulty = real_KOM/expected_KOM #unitless ratio, above 1 means harder than ideal
    
    #predict user's time and scale by difficulty, compare this prediction to record time
    expected_time = curve1.f(KOM_dist) * KOM_dist
    adjusted_time = expected_time * seg_difficulty
    expected_gap = adjusted_time - real_KOM
    
    
    #present results to user
    print("Based on your past performances, your estimated time for the segment '"+segment.get_name()+"' is "+time_to_string(adjusted_time))
    
    comparison_str = ""
    if expected_gap > 0:
        comparison_str = "This is about "+time_to_string(expected_gap)+" slower than the current KOM."
    elif expected_gap < 0:
        comparison_str = "This is about "+time_to_string(expected_gap*-1)+" faster than the current KOM!"
    else:
        comparison_str = "This time would tie the current KOM. Go for it! "
    print(comparison_str)
    

In [9]:
#This is an idea I came up with more recently, to use pace curves in a more interesting way.
#The method crashes in Jupyter for a reason that eludes me. It reports a mismatch in the number of arguments, but the method is only defined in one place and it clearly has the right number. Jupyter is running really slowly and weirdly for me right now, so I'm hoping it's just an issue on my end.
#I think the code is really cool, but since I haven't been able to test it on real data points, I can't be certain it doesn't have bugs. Since I can't vouch for the accuracy of its predictions here and don't want to lose points by formally including it in my submission, I'm adding a boolean to skip it.
    
    
def menu_multiple_paces(): 
    #change the boolean and explore at your own risk!
    abort = False
    if abort:
        print("\nMethod skipped, see menu_multiple_paces() for details")
        return
    
    
    #get data for comparison
    athletes = ATHLETES
    print("\nSelect Athlete 1 (You): ")
    athlete1 = getAthlete(athletes)
    print("\nSelect Athlete 2 (Nemesis): ")
    athlete2 = getAthlete(athletes)
    
    athlete1_dt = make_dt_points(athlete1.get_activities())
    athlete2_dt = make_dt_points(athlete2.get_activities())
    
    
    athlete1_spm = dt_to_spm(athlete1_dt)
    athlete2_spm = dt_to_spm(athlete2_dt)
    
    #get curves from data
    curve1 = points_to_logeq(athlete1_spm)
    curve2 = points_to_logeq(athlete2_spm)
    curves = gap_formula(curve1,curve2)
    
    #choose distance to evaluate over
    max_distance = None
    while max_distance is None:
        entry = input("Enter a maximum distance (in miles) for comparison: ")
        if entry.isdecimal() and int(entry) > 0: #0 would put log(0) into the pace curves
            max_distance = int(entry)
        else:
            print("invalid entry '"+entry+"'. Please note only positive integers are supported")
    
    #compare curves for selected range, report to user
    min_distance = 0.1
    print("about to get an error?")
    comparison = curves.check_cross(min_distance,max_distance)
    if comparison == 0:
        crosspoint, short_d = curves.compare_paces(max_distance)
        shorter = athlete1.get_name() if short_d == 1 else athlete2.get_name()
        longer = athlete2.get_name() if short_d == 1 else athlete1.get_name()
        print("Based on both athletes' performances, it looks like "+athlete1.get_name()+" and "+athlete2.get_name()+" are evenly matched at about "+str("%.2f" % crosspoint)+" miles, with "+shorter+" winning shorter races and "+longer+" winning anything longer.")
    elif comparison == 1:
        print("Based on both athletes' performances, it looks like "+athlete1.get_name()+" will beat "+athlete2.get_name()+" for any distance under "+str(max_distance)+" miles.")
    elif comparison == 2:
        print("Based on both athletes' performances, it looks like "+athlete2.get_name()+" will beat "+athlete1.get_name()+" for any distance under "+str(max_distance)+" miles.")


Section 2: Data Classes 

In [10]:
class Athlete(): #class to represent athlete data
    def __init__(self,name):
        self.name = name
        self.activities = []
    
    def add_activities(self,act):
        for a in act:
            self.activities.append(a)
            
    def add_activity(self,act):
        self.activities.append(act)
                    
    def get_name(self):
        return self.name
    
    def get_activities(self):
        return self.activities

In [11]:
class Activity(): #class to represent activity data
    def __init__(self,name,distance,time):
        self.n = name
        self.d = distance
        self.t = time
    
    def get_time(self):
        return self.t
    
    def get_dist(self):
        return self.d
        
    def get_name(self):
        return self.n

In [12]:
class Segment(): #class to represent segment data
    def __init__(self,name,distance,KOM):
        self.n = name
        self.d = distance
        self.t = KOM
    
    def get_name(self):
        return self.n  
    
    def get_KOM(self):
        return self.t
    
    def get_dist(self):
        return self.d
        

Section 3: Pace prediction functions

In [13]:
def predict_1v_seg(seg,dt): #given segment and set of (distance,time) points, predicts user's time on segment
    spm_data = dt_to_spm(dt)
    curve = points_to_logeq(spm_data) #fit curve to points
    seg_length = seg.get_dist()
    pred_p = curve.f(seg_length) #predict pace per mile given segment's distance
    est_t = pred_p * seg_length 
    return est_t

    #print("without information on the KOM holder's other performances, prediction can't account for segment-specific difficulty")
    
def predict_1v_time(d,dt): #given run distance and set of (distance,time) points, predicts user's time for run distance
    spm_data = dt_to_spm(dt)
    curve = points_to_logeq(spm_data) #fit curve to points
    pred_p = curve.f(d) #predict pace per mile given run distance
    est_t = pred_p * d
    return est_t

def points_to_logeq(points): #finds constants for log curve representing ideal pace as a function of distance
    pts = np.array(points)
    x = pts[:,0] #extract x and y's from points set
    y = pts[:,1]
    fit = np.polyfit(np.log(x), y, 1)# determine polynomial
    return log_eq(fit)

class log_eq(): #class to represent a log equation with predetermined constants
    def __init__(self,args):
        self.a = args[0]
        self.b = args[1]
        #print(self.f(5))
        #print("----")
        
    def f(self,x):# y = b + a log(x)
        return self.b+np.log(x)*self.a 
    
class gap_formula(): #holds two pace curves and has methods for comparison
    
    def __init__(self,formula_a,formula_b):
        self.a = formula_a
        self.b = formula_b
        x = np.linspace(.001,5,100)
        y1 = []
        y2 = []
        y3 = []
        for i in range(len(x)):
            y1.append(self.f(x[i]))
            y2.append(self.a.f(x[i]))
            y3.append(self.b.f(x[i]))

        
        
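    def check_cross(self,min_distance,max_distance): #compares endpoints of curves to see if they cross, or one runner is faster over the whole range
        #returns 0 if the curves cross, 1 if runner a is faster throughout, 2 if runner b is faster throughout
        min_point = self.f(min_distance)
        max_point = self.f(max_distance)
        
        side1 = 1 if min_point>0 else -1
        side2 = 1 if max_point>0 else -1
        
        if side1 == side2: #no cross
            if side1 == 1: #positive gap: a's pace (seconds per mile) is higher, so a is slower
                return 2
            else:
                return 1
        else: #cross
            return 0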
        
    def compare_paces(self,max_distance):
        crosspoint = self.find_0(0.25,max_distance,0.1)
        shorter = 1 if self.f(0.25) < 0 else 2
        return (crosspoint,shorter)
                
    def find_0(self,min_x,max_x,accuracy): #recursively finds y=0 (point for which curves meet) by splitting range in half and recursing into the half that contains the cross
        xrange = max_x - min_x
        halfrange = xrange/2
        
        if xrange<accuracy:
            return min_x+halfrange
        else:
            if self.check_cross(min_x,min_x+halfrange) == 0: #check_cross returns 0 when the curves cross
                return self.find_0(min_x,min_x+halfrange,accuracy)
            
            elif self.check_cross(max_x-halfrange,max_x) == 0:
                return self.find_0(max_x-halfrange,max_x,accuracy)
            else:
                print("no cross detected, something went wrong here")
        
    def f(self,x): #determine gap by f(x) = a(x) - b(x)
        return self.a.f(x) - self.b.f(x)
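
The crossover search can be sketched on its own as a plain bisection on the pace gap. This is a simplified stand-in for gap_formula, not a copy of it: curves are hypothetical (slope, intercept) pairs, and the names gap and find_crossover are invented for this example. For two log curves the crossover also has a closed form, exp((b2 - b1)/(a1 - a2)), which the sketch uses as a check.

```python
import math

def gap(curve_a, curve_b, x):
    """Pace gap a(x) - b(x) in seconds per mile; negative means a is faster."""
    (a1, b1), (a2, b2) = curve_a, curve_b
    return (b1 + a1 * math.log(x)) - (b2 + a2 * math.log(x))

def find_crossover(curve_a, curve_b, lo, hi, tol=1e-6):
    """Bisect [lo, hi] for the distance where the two pace curves meet."""
    g_lo = gap(curve_a, curve_b, lo)
    if g_lo * gap(curve_a, curve_b, hi) > 0:
        return None  # same sign at both ends: no crossover is bracketed
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = gap(curve_a, curve_b, mid)
        if g_lo * g_mid <= 0:
            hi = mid              # sign change lies in the lower half
        else:
            lo, g_lo = mid, g_mid  # sign change lies in the upper half
    return (lo + hi) / 2

# Hypothetical curves: the sprinter starts faster but slows down more with distance
sprinter = (45.0, 265.0)
stayer = (30.0, 275.0)

cross = find_crossover(sprinter, stayer, 0.1, 10)
closed_form = math.exp((275.0 - 265.0) / (45.0 - 30.0))
print(f"bisection: {cross:.3f} miles, closed form: {closed_form:.3f} miles")
```

Returning None when the endpoints have the same sign keeps the "one runner is faster throughout" case separate from the crossover case, which is the same split check_cross makes.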

Section 4: Data formatting functions

In [14]:
def plotable(data): #format set of tuples into graphable coordinates tuple
    xdata = []
    ydata = []
    for d in data:
        xdata.append(d[0])
        ydata.append(d[1])
    return (xdata,ydata)

def make_dt_points(activities):
    dt_data = []
    for a in activities:
        d = a.get_dist()
        t = a.get_time()
        dt = (d,t)
        dt_data.append(dt)
    return dt_data

def dt_to_spm(old_set): #convert (distance,time) pairs to (distance,pace) pairs, pace in seconds per mile
    new_set = []
    for old_tuple in old_set:
        if old_tuple[0]!=0:
            new_tuple = (old_tuple[0],old_tuple[1]/old_tuple[0])
            new_set.append(new_tuple)
        else:
            print("bad data: d=0, skipping point") #a zero distance has no pace, and log(0) would break the curve fit
    return new_set

def spm_to_dt(old_set): #convert pace to time pair
    new_set = []
    for old_tuple in old_set:
        new_tuple = (old_tuple[0],old_tuple[1]*old_tuple[0])
        new_set.append(new_tuple)
    return new_set
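
A quick round-trip check of the two conversions, written as a self-contained sketch. The names dt_to_pace and pace_to_dt are stand-ins invented for this example, and skipping zero-distance points is an assumption of the sketch. The first three points are taken from Jay's data; the zero-distance point is made up to show the skip.

```python
def dt_to_pace(points):
    """(miles, seconds) -> (miles, seconds per mile); zero distances are skipped."""
    return [(d, t / d) for d, t in points if d != 0]

def pace_to_dt(points):
    """(miles, seconds per mile) -> (miles, seconds)."""
    return [(d, p * d) for d, p in points]

runs = [(0.25, 53), (1, 275), (2, 600), (0, 10)]  # last point is deliberately bad
paces = dt_to_pace(runs)
restored = pace_to_dt(paces)

for (d, p), (_, t) in zip(paces, restored):
    print(f"{d} mi: {p:.1f} s/mile, {t:.0f} s total")
```

Converting to pace before fitting matters because the log curve models seconds per mile, not total time.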



Section 5: Functions relevant to user experience

In [15]:
def getchoice(n): #returns valid integer choice in range
    choice = None
    while choice==None:
        entry = input("enter number 1-"+str(n)+": ")
        if entry.isdecimal():
            val = int(entry)
            if val > 0 and val <= n:
                print()
                return val
        print("invalid entry '"+entry+"'")

def getAthlete(athletes): #selects athlete from list
    for i in range(len(athletes)):
        print(str(i+1)+". "+athletes[i].get_name())
    choice = getchoice(len(athletes))-1
    return athletes[choice]

def getSegment(segments): #selects segment from list
    for i in range(len(segments)):
        print(str(i+1)+". "+segments[i].get_name())
    choice = getchoice(len(segments))-1
    return segments[choice]

def time_to_string(seconds): #self explanatory
    seconds = int(seconds)
    mins = seconds//60
    secs = seconds % 60
    
    mterm = "minutes" if mins!=1 else "minute"
    sterm = "seconds" if secs!=1 else "second"
    
    if mins==0:
        return str(secs)+" "+sterm
    else:
        return str(mins)+" "+mterm+" and "+str(secs)+" "+sterm

def minsec(minutes,seconds): # for user friendly time entry, all calculations here use seconds for time input
    return (minutes*60)+seconds

def hourminsec(hours,minutes,seconds):
    return (hours*3600)+(minutes*60)+seconds

Section 6: Data

In [16]:
James = Athlete("James")

run1 = Activity("Afternoon Run",0.621371192, 132)
run2 = Activity("Getting ready to sit and kick on paul at Ted C",1.0, 289)
run3 = Activity("I could feel Josh Fry's presence",3.10685596, 963)
run4 = Activity("POV you work at the Columbia North YMCA",6.21371192, 2081)

James.add_activities([run1,run2,run3,run4])

In [17]:
Jay = Athlete("Jay")

run1 = Activity("NTK NTK NTK",0.25, 53)
run2 = Activity("State 2018",0.5, 120)
run3 = Activity("Mile PR",1, 275)
run4 = Activity("2018 2 mile",2, 600)
run5 = Activity("2017 XC Sectional",3.0, 960)
run6 = Activity("midwest xc>>",4.97, 1610)
Jay.add_activities([run1,run2,run3,run4,run5,run6])

In [18]:
Julion = Athlete("Julion")

run1 = Activity("run",0.25, 49)
run2 = Activity("run",0.5, 116)
run3 = Activity("run",1, 270)
run4 = Activity("run",3, 945)

Julion.add_activities([run1,run2,run3,run4])

In [19]:
Max = Athlete("Max")

run1 = Activity("run",0.25, 55)
run2 = Activity("run",0.5, 121)
run3 = Activity("run",1, 272)
run4 = Activity("run",3.1, 925)
run5 = Activity("run",4.97, 1560)

Max.add_activities([run1,run2,run3,run4,run5])

In [20]:
records = Athlete("Chuck Norris") # world records 400m to 5000m

run1 = Activity("run",0.25, 43)
run2 = Activity("run",0.5, 101)
run3 = Activity("run",1, 223)
run4 = Activity("run",2, 478)
run5 = Activity("run",3.1, 750)
run6 = Activity("run",4.97, 1264)

records.add_activities([run1,run2,run3,run4,run5,run6])

In [21]:
lake3 = Segment("Lake and back (Jay, beatable effort)",2.9,1080)
HomewoodFlossmoor = Segment("Julion's 2mile PR (Julion)",2,600)
trashmore = Segment("Mt. Trashmore Circuits. (Jay, hills)",2.7,1296)

In [22]:
ATHLETES = [James,Jay,Max,Julion,records]
SEGMENTS = [lake3,HomewoodFlossmoor,trashmore]






––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––






In [23]:
menu_start()

Welcome to implementation 3

Select mode: 
1. Single Athlete Prediction
2. Multi Athlete Prediction
enter number 1-2: 2

Comparing multiple athletes' recorded performances

Select comparison type: 
1. Segment Prediction
2. Pace Prediction
enter number 1-2: 1


Select Athlete 1 (You): 
1. James
2. Jay
3. Max
4. Julion
5. Chuck Norris
enter number 1-5: 1


Select Athlete 2 (KOM Holder): 
1. James
2. Jay
3. Max
4. Julion
5. Chuck Norris
enter number 1-5: 2


Select Segment (This should be a performance by Athlete 2): 
1. Lake and back (Jay, beatable effort)
2. Julion's 2mile PR (Julion)
3. Mt. Trashmore Circuits. (Jay, hills)
enter number 1-3: 3

Based on your past performances, your estimated time for the segment 'Mt. Trashmore Circuits. (Jay, hills)' is 21 minutes and 12 seconds
This is about 23 seconds faster than the current KOM!
