This is the date for the activity that we will display.  If more than one actiivy is defined for this date then set `index` to select the appropriate entry (zero-index, time order).

The group is the name of the ActivityGroup.  If you are running, then it would be 'Run'.  You can see available vales using the session below, or by running

    > ch2 data table ActivityGroup
    
and looking at the `name` column values.

In [1]:
date = '2018-02-14'
index = 0
group = 'Bike'

Import the commands we need to access the database from Choochoo and open a session.

In [2]:
from ch2.data import *
from ch2.squeal import *

s = session('-v4')

    INFO: Using database at /home/andrew/.ch2/database.sqln


Find the ActivityJournal entry associated with the given date (and index) and display StatisticNames associated with that activity (there will be other - global - statistics, associated with Intervals, that we will access later).

In [3]:
from ch2.lib.date import to_time
from sqlalchemy import distinct
import datetime as dt

aj = s.query(ActivityJournal).\
       join(ActivityGroup).\
       filter(ActivityJournal.start > to_time(date), 
              ActivityJournal.start < to_time(date) + dt.timedelta(days=1),
              ActivityGroup.name == group
             ).all()[index]
print('Found activity from %s to %s' % (aj.start, aj.finish))

sn = s.query(distinct(StatisticName.name)).join(StatisticJournal).filter(StatisticJournal.source == aj).all()
print('\nAssociated StatisticNames: %s' % ', '.join(s[0] for s in sn))

Found activity from 2018-02-14 10:17:17+00:00 to 2018-02-14 15:28:29+00:00

Associated StatisticNames: Latitude, Longitude, Spherical Mercator X, Spherical Mercator Y, Distance, Speed, Heart Rate, Active Distance, Active Time, Active Speed, Median 5km Time, Median 10km Time, Median 15km Time, Median 20km Time, Median 25km Time, Median 50km Time, Median 75km Time, Median 100km Time, Percent in Z1, Percent in Z2, Percent in Z3, Percent in Z4, Percent in Z5, Percent in Z6, Time in Z1, Time in Z2, Time in Z3, Time in Z4, Time in Z5, Time in Z6, Max Med HR 5m, Max Med HR 10m, Max Med HR 15m, Max Med HR 20m, Max Med HR 30m, Max Med HR 60m, Max Med HR 120m, HR Zone, HR Impulse, HR Impulse (duration), HR Impulse / 10s


Name the statistics we're interested in to save typing later, and then load their values.

In [4]:
sphx = 'Spherical Mercator X'
sphy = 'Spherical Mercator Y'
dist = 'Distance'
speed = 'Speed'
hr = 'Heart Rate'
hrz = 'HR Zone'
hri = 'HR Impulse'
hrid = 'HR Impulse (duration)'
hr10 = 'HR Impulse / 10s'
st = statistics(s, sphx, sphy, dist, speed, hr, hrz, hri, hrid, hr10, source_ids=[aj.id])
print(st.describe())



       Spherical Mercator X  Spherical Mercator Y       Distance        Speed  \
count          2.755000e+03          2.755000e+03    2755.000000  2755.000000   
mean          -7.863457e+06         -3.962990e+06   54643.968849     6.488172   
std            9.843230e+03          1.177180e+04   32692.495815     2.185844   
min           -7.879939e+06         -3.981475e+06       0.540000     0.000000   
25%           -7.874589e+06         -3.974786e+06   25890.380000     5.458000   
50%           -7.861143e+06         -3.961700e+06   53300.050000     6.821000   
75%           -7.853755e+06         -3.952083e+06   85222.600000     7.903000   
max           -7.851148e+06         -3.945191e+06  110697.800000    12.578000   

        Heart Rate      HR Zone  HR Impulse / 10s  HR Impulse (duration)  \
count  2755.000000  2755.000000       1868.000000            2747.000000   
mean    127.886388     3.045361          0.267278               6.285766   
std      10.781979     0.606612          0

In [5]:
nearby = nearby_any_time(s, aj)[0]
print(nearby)
nb = statistics(s, sphx, sphy, dist, speed, hr, hrz, hri, hrid, hr10, source_ids=[nearby.id])



ActivityJournal Bike 2018-01-14 09:43:29 to 2018-01-14 14:30:51


In [6]:
distkm = '%s / km' % dist
st[distkm] = st[dist] / 1000
nb[distkm] = nb[dist] / 1000
speedkm = '%s / kmh' % speed
st[speedkm] = st[speed] * 3.6
nb[speedkm] = nb[speed] * 3.6

At this point it's useful to generate evenly sampled data for nicer display.  The `HR Impulse / 10s` data are already sampled every 10 seconds (this is done by Choochoo's statistic pipeline because the calculation is more complex than simple interpolation), so we interpolate everything else and select just those rows.

In [7]:
import pandas as pd

def interpolate_10(data):
    data10 = data.copy()
    data10['keep'] = pd.notna(data10[hr10])
    data10.interpolate(method='time', inplace=True)
    data10 = data10.loc[data10['keep'] == True]
    return data10

st10 = interpolate_10(st)
print(st10.describe())
nb10 = interpolate_10(nb)

       Spherical Mercator X  Spherical Mercator Y       Distance        Speed  \
count          1.868000e+03          1.868000e+03    1868.000000  1868.000000   
mean          -7.864681e+06         -3.962863e+06   58496.285755     6.036530   
std            9.772855e+03          1.184373e+04   32829.035477     2.579967   
min           -7.879929e+06         -3.981468e+06       0.540000     0.000000   
25%           -7.874779e+06         -3.974724e+06   29864.584375     4.999962   
50%           -7.864278e+06         -3.961613e+06   59276.137381     6.654421   
75%           -7.854066e+06         -3.951678e+06   87807.700667     7.779937   
max           -7.851152e+06         -3.945196e+06  110697.280000    12.438200   

        Heart Rate      HR Zone  HR Impulse / 10s  HR Impulse (duration)  \
count  1868.000000  1868.000000       1868.000000            1867.000000   
mean    128.058731     3.029693          0.267278               7.461218   
std      10.840838     0.647895          0

We're going to plot some data, so load and initialise Bokeh.

In [8]:
from bokeh.plotting import figure, output_notebook, show

output_notebook()

We'll structure plotting as a function.  This will help us re-use the code later when we're experimenting with different layouts.

This is a map that shows stress at different parts of the ride.  It shows where we rode and also how "hard" each part of the ride was.

In [9]:
from bokeh.tile_providers import CARTODBPOSITRON, STAMEN_TERRAIN, STAMEN_TONER
from bokeh.palettes import Inferno, Plasma, Viridis
from colorcet import b_diverging_gkr_60_10_c40 as palette

def hr10_map(side, data):
    
    mx, mn = data[hr10].max(), data[hr10].min()
    z = (data[hr10] - mn) / (mx - mn)
    sz = side * (z ** 3) / 10

    f = figure(plot_width=side, plot_height=side, x_axis_type='mercator', y_axis_type='mercator')
    f.circle(x=data[sphx], y=data[sphy], line_alpha=0, fill_color='red', size=sz, fill_alpha=0.03)
    f.line(x=data[sphx], y=data[sphy], line_color='black')
    f.add_tile(STAMEN_TERRAIN, alpha=0.1)
    f.axis.visible = False
    f.toolbar_location = None

    return f
    
show(hr10_map(300, st10))

In [10]:
def interpolate_to_index(reference, raw, method='linear'):
    reference = pd.DataFrame(True, index=reference.index, columns=['keep'])
    raw = pd.DataFrame(raw, index=raw.index)
    both = reference.merge(raw, how='outer', left_index=True, right_index=True)
    both['keep'].loc[pd.isna(both['keep'])] = False
    both.interpolate(method=method, inplace=True)
    both = both.loc[both['keep'] == True]
    return both.drop(columns=['keep']).dropna().iloc[:, 0]

def closed_patch(y, zero=0):
    x = y.index
    return y.append(pd.Series([zero, zero], index=[x[len(x)-1], x[0]]))

def _delta(z):
    z = z.dropna()
    if len(z):
        return closed_patch(z)
    else:
        return None

def delta_patches(y1, y2):
    dy = y1 - y2
    scale = dy.dropna().abs().max() * 1.1
    y1 = _delta(dy.clip(lower=0))
    y2 = _delta(dy.clip(upper=0))
    return y1, y2, Range1d(start=-scale, end=scale)

In [11]:
from bokeh.models.formatters import NumeralTickFormatter, PrintfTickFormatter
from bokeh.models import LinearAxis, Range1d

def td_plot(nx, ny, key, x_axis, data, nearby=None):
    
    if x_axis not in ('time', 'distance'): raise Exception(x_axis)
    
    f = figure(plot_width=nx, plot_height=ny, 
               x_axis_type='datetime' if x_axis == 'time' else 'linear')

    y1 = data[key]
    if x_axis == 'time':
        f.xaxis.axis_label = 'Time'
        f.xaxis[0].formatter = NumeralTickFormatter(format='00:00:00')
        y1.index = (data.index - data.index[0]).total_seconds()
    else:
        f.xaxis.axis_label = '%s / km' % dist
        f.xaxis[0].formatter = PrintfTickFormatter(format='%.3f')
        y1.index = data[dist] / 1000

    f.line(x=y1.index, y=y1, color='black')
    f.y_range = Range1d(start=0, end=y1.max() * 1.1)
    f.yaxis.axis_label = key
    
    if nearby is not None:
        y2 = nearby[key]
        if x_axis == 'time':
            y2.index = (nearby.index - nearby.index[0]).total_seconds()
        else:
            y2.index = nearby[dist] / 1000
            y2 = interpolate_to_index(y1, y2)
        f.line(x=y2.index, y=y2, color='grey')
        
        y1, y2, range = delta_patches(y1, y2)
        f.extra_y_ranges = {'delta': range}
        if y1 is not None:
            f.patch(x=y1.index, y=y1, color='green', alpha=0.1, y_range_name='delta')
        if y2 is not None:
            f.patch(x=y2.index, y=y2, color='red', alpha=0.1, y_range_name='delta')

    f.toolbar_location = None

    return f
    
show(td_plot(900, 150, hr10, 'time', st10, nb10))
show(td_plot(900, 150, hr10, 'distance', st10, nb10))
show(td_plot(900, 150, distkm, 'time', st10, nb10))

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)


Cumulative HR Impulse / 10s is a way to visualise how intense the session was.  This will make more sense when overlaid with data from another activity.

In [12]:
def cum_plot(nx, ny, key, data, nearby=None):
    
    y1 = data[key].sort_values(ascending=False).reset_index()[key]
    
    f = figure(plot_width=nx, plot_height=ny, 
               x_range=Range1d(start=y1.index.max() * 10, end=0), 
               x_axis_type='datetime',
               x_axis_label='Time',
               y_range=Range1d(start=0, end=1.1 * y1.max()),
               y_axis_location='right',
               y_axis_label=key)
    f.xaxis[0].formatter = NumeralTickFormatter(format='00:00:00')

    f.line(x=y1.index * 10, y=y1, color='black')
    
    if nearby is not None:
        y2 = nearby[key].sort_values(ascending=False).reset_index()[key]
        f.line(x=y2.index * 10, y=y2, color='grey')
        
        y1, y2, range = delta_patches(y1, y2)
        f.extra_y_ranges = {'delta': range}
        f.patch(x=y1.index * 10, y=y1, color='green', alpha=0.1, y_range_name='delta')
        f.patch(x=y2.index * 10, y=y2, color='red', alpha=0.1, y_range_name='delta')

    f.toolbar_location = None
    
    return f
    
show(cum_plot(200, 200, hr10, st10, nb10))
show(cum_plot(200, 200, speedkm, st10, nb10))

Now we come to the global statistics, unassociated with this particular activity (although affected by it).

In [13]:
fn = 'Fitness'
fg = 'Fatigue'

finish = aj.finish + dt.timedelta(hours=3)  # to show new level
start = finish - dt.timedelta(days=30)

ff = statistics(s, fn, fg, start=start, finish=finish)
print(ff.describe())

            Fitness       Fatigue
count  21536.000000  21536.000000
mean   46536.292652  37706.594087
std     3635.406614  10190.947271
min    37771.161788  10276.302393
25%    44452.500657  31641.262385
50%    46889.563532  36319.954599
75%    49718.526184  45959.300887
max    51691.821704  57944.199347


In [14]:
from sqlalchemy import func

def ff_plot(nx, ny, data):
    
    f = figure(plot_width=nx, plot_height=ny, x_axis_type='datetime')
    f.xaxis.axis_label = 'Date'

    max_f = s.query(func.max(StatisticJournalFloat.value)). \
              join(StatisticName). \
              filter(StatisticName.name.in_([fn, fg])).scalar()
    
    f.line(x=data.index, y=data[fn], color='black')
    f.y_range = Range1d(start=0, end=max_f)
    f.yaxis.axis_label = '%s, %s' % (fn, fg)
    f.yaxis[0].formatter = PrintfTickFormatter(format='')
    
    y2 = closed_patch(data[fg])
    f.patch(x=y2.index, y=y2, color='grey', line_alpha=0.2, fill_alpha=0.1)

    f.toolbar_location = None
    
    return f
    
show(ff_plot(500, 200, ff))

In [15]:
from bokeh.layouts import column, row

doc = column(row(td_plot(700, 200, hr10, 'distance', st10, nb10),
                 cum_plot(200, 200, hr10, st10, nb10)),
             row(column(row(td_plot(400, 150, distkm, 'time', st10, nb10), 
                            cum_plot(150, 150, speedkm, st10, nb10)),
                        ff_plot(600, 150, ff)),
                 hr10_map(300, st10)))
show(doc)