<div align="right" style="text-align: right"><i>Peter Norvig, Oct 2017<br>pandas Aug 2020<br>Data updated monthly</i></div>

# Bike Code

Code to support the analysis in the notebook [Bike Speed versus Grade.ipynb](Bike%20Speed%20versus%20Grade.ipynb).

In [1]:
from IPython.core.display import HTML
from typing import Iterator, Tuple, List, Dict
import matplotlib
import matplotlib.pyplot as plt
import numpy  as np
import pandas as pd
import re

# Reading Data: `rides`

I downloaded a bunch of my recorded [Strava](https://www.strava.com/athletes/575579) rides, most of them longer than 25 miles, as [`bikerides.tsv`](bikerides.tsv). The columns are the date, the year, a title, the elapsed time of the ride, the length of the ride in miles, and the total climbing in feet, e.g.:

    Mon, 10/5	2020	Half way around the bay on bay trail	6:26:35	80.05	541
    
I parse the file into the pandas dataframe `rides`, adding derived columns for miles per hour, vertical feet climbed per hour (VAM), grade in feet per mile, grade in percent, and kilometers ridden:

In [2]:
def parse_hours(time: str) -> float: 
    pass
def parse_int(field: str) -> int: return int(field.replace(',', ''))

def add_derived_columns(rides) -> pd.DataFrame:
    return rides.assign(
        mph=round(rides['miles'] / rides['hours'], 2),
        vam=round(rides['feet'] / rides['hours']),
        fpm=round(rides['feet']  / rides['miles']),
        pct=round(rides['feet']  / rides['miles'] * 100 / 5280, 2),
        kms=round(rides['miles'] * 1.609, 2))

In [3]:
rides = add_derived_columns(pd.read_table(open('bikerides.tsv'), comment='#',
            converters=dict(hours=parse_hours, feet=parse_int)))
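
As a worked example of the derived columns, the sketch below applies the same formulas to the sample line above in plain Python. `parse_hours_sketch` is a hypothetical stand-in for `parse_hours`, whose body is not shown in this listing; it assumes times are written as `H:MM:SS` or `MM:SS`. The helper `derived` is also a name made up for this example.

```python
def parse_hours_sketch(time: str) -> float:
    """Hypothetical stand-in for parse_hours: 'H:MM:SS' or 'MM:SS' => hours."""
    seconds = 0
    for part in time.strip().split(':'):
        seconds = seconds * 60 + int(part)  # the last field is always seconds
    return seconds / 3600

def parse_int(field: str) -> int:
    return int(field.replace(',', ''))

def derived(miles: float, hours: float, feet: int) -> dict:
    """The same formulas as add_derived_columns, applied to a single ride."""
    return dict(mph=round(miles / hours, 2),
                vam=round(feet / hours),
                fpm=round(feet / miles),
                pct=round(feet / miles * 100 / 5280, 2),
                kms=round(miles * 1.609, 2))

line = 'Mon, 10/5\t2020\tHalf way around the bay on bay trail\t6:26:35\t80.05\t541'
date, year, title, time, miles, feet = line.split('\t')
stats = derived(float(miles), parse_hours_sketch(time), parse_int(feet))
print(stats)
```

For this mostly flat ride the sketch gives 12.42 mph and a grade of 0.13%.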

# Reading Data: `segments`

I picked some representative climbing segments ([`bikesegments.csv`](bikesegments.csv)) with the segment length in miles and climb in feet, along with several of my times on the segment. A line like

    Old La Honda, 2.98, 1255, 28:49, 34:03, 36:44
    
means that this segment of Old La Honda Rd is 2.98 miles long with 1255 feet of climbing, and that I've selected three of my times on that segment: the fastest, middle, and slowest of the times that Strava shows. (However, I ended up dropping the slowest time in the charts to make them less busy.)

In [4]:
def parse_segments(lines):
    """Parse segments into rides. Each ride is a tuple of:
    (segment_title, hours, miles, feet_climb)."""
    for segment in lines:
        title, mi, ft, *times = segment.split(',')[:5]
        for time in times:
            yield title, parse_hours(time), float(mi), parse_int(ft)

In [5]:
segments = add_derived_columns(pd.DataFrame(
               parse_segments(open('bikesegments.csv')), 
               columns='title	hours	miles	feet'.split()))
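
To illustrate what `parse_segments` yields, the following sketch runs the same splitting logic on the sample line. It repeats the generator so that it runs standalone, and `parse_hours_sketch` is a hypothetical stand-in (assuming `MM:SS` or `H:MM:SS` times) for `parse_hours`, whose body is not shown in this listing.

```python
def parse_hours_sketch(time: str) -> float:
    """Hypothetical stand-in for parse_hours: 'H:MM:SS' or 'MM:SS' => hours."""
    seconds = 0
    for part in time.strip().split(':'):
        seconds = seconds * 60 + int(part)
    return seconds / 3600

def parse_int(field: str) -> int:
    return int(field.replace(',', ''))

def parse_segments(lines):
    for segment in lines:
        # The [:5] slice keeps the title, miles, feet, and at most two times.
        title, mi, ft, *times = segment.split(',')[:5]
        for time in times:
            yield title, parse_hours_sketch(time), float(mi), parse_int(ft)

rides = list(parse_segments(['Old La Honda, 2.98, 1255, 28:49, 34:03, 36:44']))
for title, hours, miles, feet in rides:
    print(f'{title}: {60 * hours:.1f} min, {miles / hours:.2f} mph')
```

The single input line becomes two rides, one per retained time; the third (slowest) time is cut off by the slice.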

# Reading Data: `places`

Each month, I take my [summary data from wandrer.earth](https://wandrer.earth/athletes/3534/santa-clara-county-california) and enter it in the file [bikeplaces.txt](bikeplaces.txt), in a format where

      Cupertino: 172: 22.1 23.9 26.2*3 26.3 | 26.4
      
means that Cupertino has 172 miles of roads, and that by the first month I started keeping track, I had ridden 22.1% of them; in the last month 26.4%; and the `26.2*3` means that for 3 months in a row I had 26.2%. The `|` indicates the end of a year. A line that starts with `#` is a comment.
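
The shorthand can be expanded mechanically. The sketch below is a simplified, standalone illustration (the name `expand_percents` is hypothetical) that turns the percent field of the Cupertino line into one value per month, treating `|` as a plain separator.

```python
import re
from typing import List

def expand_percents(field: str) -> List[float]:
    """Expand '26.2*3' into three copies of 26.2 and ignore '|' year markers."""
    months = []
    for token in field.replace('|', ' ').split():
        m = re.fullmatch(r'([0-9.]+)\*([0-9]+)', token)
        if m:   # a repeated value such as '26.2*3'
            months.extend([float(m.group(1))] * int(m.group(2)))
        else:   # a single month's value
            months.append(float(token))
    return months

line = 'Cupertino: 172: 22.1 23.9 26.2*3 26.3 | 26.4'
place, miles, field = line.split(':')
print(place, float(miles), expand_percents(field))
```

The seven resulting values correspond to seven consecutive months of tracking.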

In [6]:
class Month(int):
    """An integer in the form: 12 * year + month."""
    def __str__(self): return f'{(self - 1) // 12}-{(self % 12) or 12:02d}'

start   = Month(2020 * 12 + 7) # Starting month: July 2020
bonuses = (25, 90, 99)         # Percents that earn important bonuses

Entry = Tuple[str, float, List[float]] # (Place_Name, miles_of_roads, [pct_by_month,...])

def wandrer(category, entries, start=start):
    pass
def label(pcts, place, miles) -> str:
    pct = f'{rounded(pcts[-1]):>3}' if pcts[-1] > 1.4 else f'{pcts[-1]}'
    done = miles * pcts[-1]
    bonus = next((f' {rounded((p - pcts[-1]) / 100 * miles):>3} to {p}%' 
                  for p in bonuses if p >= pcts[-1]), '')
    return f'{pct}% ({rounded(done / 100):>3}/{rounded(miles):<3} mi){bonus} {place}'
    
def parse_places(lines) -> Dict[str, List[Entry]]:
    "Parse bikeplaces.txt into a dict of {'Title': [entry,...]}"
    pass
def parse_entry(line: str) -> Entry:
    """Parse line => ('Place Name', miles, [percents]); '=' repeats the previous percent."""
    if line.count(':') != 2:
        print('bad', line)
    place, miles, pcts = line.replace('|', ' ').split(':')
    pcts = re.sub('( [0-9.]+)[*]([0-9]+)', lambda m: m.group(1) * int(m.group(2)),
                  pcts).split()
    for i, p in enumerate(pcts):
        pcts[i] = pcts[i - 1] if p == '=' else 100 if p == '100' else float(p)
    return place, float(miles), pcts 
                   
def rounded(x: float) -> str: return f'{round(x):,d}' if x > 10 else f'{x:.1f}'

def wandering(places: dict):
    "Plot charts of unique roads ridden in various places."
    for category in places:
        wandrer(category, places[category])

In [7]:
places = parse_places(open('bikeplaces.txt'))

In [8]:
def update_places(filename='bikeplaces.txt'):
    pass
def update_line(line):
    words = line.split()
    if not words or words[0].startswith(':'):
        pass
    elif '*' in words[-1]:
        m, d = words[-1].split('*')
        words[-1] = m + '*' + str(int(d) + 1)
    else:
        words[-1] += '*2'
    return ' '.join(words) + '\n'
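
The effect of `update_line` is easiest to see on a couple of inputs. This sketch repeats the function as written above so that it runs standalone, and applies it twice to the Cupertino sample line.

```python
def update_line(line):
    words = line.split()
    if not words or words[0].startswith(':'):
        pass                                   # leave such lines unchanged
    elif '*' in words[-1]:
        m, d = words[-1].split('*')
        words[-1] = m + '*' + str(int(d) + 1)  # bump an existing repeat count
    else:
        words[-1] += '*2'                      # start a repeat count
    return ' '.join(words) + '\n'

once  = update_line('Cupertino: 172: 22.1 23.9 26.2*3 26.3 | 26.4\n')
twice = update_line(once)
print(once, twice, sep='')
```

Each call repeats the last recorded percent for one more month: `26.4` becomes `26.4*2`, then `26.4*3`. Note that the line is re-joined with single spaces, so any original indentation is lost.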

# Eddington Number

In [9]:
def Ed_number(distances) -> int:
    pass
def Ed_gap(distances, target) -> int:
    """The number of rides needed to reach an Eddington number target."""
    return target - sum(distances >= target) # a ride of exactly `target` counts

def Ed_progress(years=range(2013, 2022), rides=rides) -> pd.DataFrame:
    """A table of Eddington numbers by year, and a plot."""
    def Ed(year, d): return Ed_number(rides[rides['year'] <= year][d])
    data  = [(y, Ed(y, 'kms'), Ed(y, 'miles')) for y in years]
    frame = pd.DataFrame(data, columns=['year', 'Ed_km', 'Ed_mi'])
    frame.plot('year', ['Ed_km', 'Ed_mi'], style='o:',
               title='Eddington Numbers in kms and miles')
    grid(axis='y')
    return frame
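
The body of `Ed_number` is not shown in this listing. For reference, an Eddington number is usually defined as the largest integer E such that you have ridden a distance of at least E on at least E days. The sketch below implements that definition in plain Python under the hypothetical names `eddington` and `eddington_gap`, treating each ride as one day; it is not necessarily how `Ed_number` is written, and the sample distances are made up.

```python
from typing import List

def eddington(distances: List[float]) -> int:
    """Largest e such that at least e of the distances are >= e."""
    e = 0
    for rank, d in enumerate(sorted(distances, reverse=True), start=1):
        if d >= rank:
            e = rank   # the top `rank` rides are all at least `rank` long
        else:
            break      # later rides are shorter, so no larger e is possible
    return e

def eddington_gap(distances: List[float], target: int) -> int:
    """Rides of at least `target` still needed to reach that Eddington number."""
    return max(0, target - sum(d >= target for d in distances))

sample = [62.5, 41.0, 30.2, 5.1, 4.0, 3.3]   # made-up distances
print(eddington(sample), eddington_gap(sample, 5))
```

Here four rides are at least 4 long but only four are at least 5 long, so the Eddington number is 4 and one more ride of 5 or longer is needed to reach 5. Unlike `Ed_gap`, this sketch clamps the gap at zero once the target has been met.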

# Plotting and Curve-Fitting

In [10]:
plt.rcParams["figure.figsize"] = (10, 6)

def show(X, Y, data, title='', degrees=(2, 3)): 
    pass
def grid(axis='both'): 
    "Turn on the grid."
    plt.minorticks_on() 
    plt.grid(which='major', ls='-', alpha=3/4, axis=axis)
    plt.grid(which='minor', ls=':', alpha=1/2, axis=axis)
    
def poly_fit(X, Y, degree: int) -> callable:
    pass
estimator = poly_fit(rides['feet'] / rides['miles'], 
                   rides['miles'] / rides['hours'], 2)

def estimate(miles, feet, estimator=estimator) -> float:
    pass
def top(frame, field, n=20): return frame.sort_values(field, ascending=False).head(n)
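
The bodies of `poly_fit` and `estimate` are not shown in this listing. As an illustration of the general idea only, the sketch below fits a polynomial of speed against grade with NumPy's `np.polyfit` and uses it to estimate a ride time. The names `poly_fit_sketch` and `estimate_minutes`, the choice to return minutes, and the sample numbers are all assumptions made for this example, not code or data from the notebook.

```python
import numpy as np

def poly_fit_sketch(X, Y, degree: int):
    """Return a function f(x): the least-squares polynomial fit of Y against X."""
    return np.poly1d(np.polyfit(X, Y, degree))

def estimate_minutes(miles: float, feet: float, estimator) -> float:
    """Estimated time in minutes: distance divided by the speed predicted for the grade."""
    return 60 * miles / estimator(feet / miles)

# Made-up data lying exactly on mph = 14 - 0.05 * fpm + 0.00005 * fpm**2
fpm = np.array([0, 50, 100, 200, 400])
mph = 14 - 0.05 * fpm + 0.00005 * fpm ** 2

speed = poly_fit_sketch(fpm, mph, 2)
print(round(float(speed(100)), 2))                # predicted mph at 100 feet per mile
print(round(estimate_minutes(30, 3000, speed)))   # minutes for 30 miles with 3000 feet
```

Because the made-up points lie exactly on a quadratic, the degree-2 fit recovers it; real ride data would scatter around the fitted curve.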