# Load Strava Data
This notebook loads activity data from Strava, puts it in a pandas data frame, and stores the data frame on disk. The stored data frame is not listed in .gitignore, so the data is also committed to the git repo. If you have privacy concerns, you may want to change that.

## Connect to Strava
To run through the Strava OAuth workflow, run the first cell of this notebook and click the resulting URL. This takes you to the Strava login page and, once you are logged in, to a redirect that fails to load. Never mind the failed redirect; just copy the temp code from the `code` parameter of the redirect URL and insert it into the STRAVA_CODE variable in the second cell of this notebook. Once you run the second cell, you have an authenticated client for Strava.
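
The temp code can also be pulled out of the redirect URL programmatically. A minimal sketch using only the standard library, assuming you paste the full URL from the browser's address bar; the helper name `extract_code` and the example URL are made up for illustration:

```python
from urllib.parse import urlparse, parse_qs

def extract_code(redirect_url):
    """Return the value of the `code` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("code")
    return values[0] if values else None

# Hypothetical redirect URL as it might appear in the address bar
example_url = "http://localhost:8282/authorized?state=&code=abc123&scope=read"
print(extract_code(example_url))
```

The returned string could then be assigned to STRAVA_CODE instead of copying it by hand.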

In [1]:
import os

from stravalib.client import Client

# The Strava API keys are expected as environment variables;
# indexing os.environ raises a KeyError that names a missing variable
STRAVA_ID = int(os.environ["STRAVA_ID"])
STRAVA_SECRET = os.environ["STRAVA_SECRET"]

client=Client() 
authorize_url = client.authorization_url(
    client_id=STRAVA_ID, 
    redirect_uri='http://localhost:8282/authorized') 
print(authorize_url)

https://www.strava.com/oauth/authorize?client_id=29670&redirect_uri=http%3A%2F%2Flocalhost%3A8282%2Fauthorized&approval_prompt=auto&response_type=code


In [2]:
STRAVA_CODE="171482836f49334b7bc81ea7365372d372131d8d"

access_token = client.exchange_code_for_token(
    client_id=STRAVA_ID, 
    client_secret=STRAVA_SECRET, 
    code=STRAVA_CODE)
client = Client(access_token=access_token) 

## Load basic data from Strava
The query requests all activities recorded since 1 January 2018, up to a limit of 500. For other time intervals, change the query parameters accordingly. The data is stored in a pandas data frame. The columns to load are defined in the `columns` list.
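
To restrict the query to a closed interval, `get_activities` also accepts a `before` parameter alongside `after`. A small sketch that builds both bounds for one calendar year, assuming UTC boundaries are acceptable; `year_bounds` is a hypothetical helper, not part of stravalib:

```python
from datetime import datetime, timezone

def year_bounds(year):
    """Return ISO 8601 strings for the start of `year` and the start of the next year (UTC)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return start.strftime(fmt), end.strftime(fmt)

after, before = year_bounds(2018)
print(after, before)

# The bounds could then be passed to the client, for example:
# client.get_activities(after=after, before=before, limit=500)
```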

In [49]:
import pandas as pd

# Define columns and create data frame
data = []
columns =['average_cadence', 'average_heartrate', 'average_speed', 'calories',  'description', 'distance', 'elapsed_time', 'end_latlng', 'gear', 'id', 'location_city', 'location_country', 'start_date', 'start_date_local', 'start_latitude', 'start_longitude', 'start_latlng', 'type', 'workout_type']
index = []
index_column = "start_date_local"

# Request the activities recorded since the start of 2018 (at most 500)
activities = client.get_activities(after = "2018-01-01T00:00:00Z", limit=500)

for activity in activities:
    # Copy only the selected attributes of each activity
    activity_dict = {}
    for column in columns:
        activity_dict[column] = getattr(activity, column)
    data.append(activity_dict)
    index.append(activity_dict[index_column])
    
activity_df = pd.DataFrame(
    data, 
    index=index, 
    columns=columns)

activity_df.to_pickle("./activity_basic.pkl")

No such attribute visibility on entity <Activity id=1917790913 name='Gernlinden - Thal' resource_state=2>
No such attribute heartrate_opt_out on entity <Activity id=1917790913 name='Gernlinden - Thal' resource_state=2>
No such attribute display_hide_heartrate_option on entity <Activity id=1917790913 name='Gernlinden - Thal' resource_state=2>

To save requests against Strava's API limits, you can load the activity data frame as it was BEFORE enrichment from this pickle.
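
The same idea can be wrapped in a small "load from disk, otherwise fetch" helper. This is a sketch under stated assumptions: `load_or_fetch` is a hypothetical name, and it uses the standard `pickle` module so that it runs without pandas. For a data frame, `pd.read_pickle` and `DataFrame.to_pickle` would play the same roles.

```python
import os
import pickle

def load_or_fetch(path, fetch):
    """Return the object cached at `path`, or call `fetch()` once and cache its result."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = fetch()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result
```

Here `fetch` would be a function that performs the Strava query, so the API is only contacted when no cached file exists yet.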

In [52]:
activity_df = pd.read_pickle("./activity_basic.pkl")
activity_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 258 entries, 2018-01-05 10:59:23 to 2018-10-28 15:40:26
Data columns (total 19 columns):
average_cadence      251 non-null float64
average_heartrate    258 non-null float64
average_speed        258 non-null object
calories             0 non-null object
description          0 non-null object
distance             258 non-null object
elapsed_time         258 non-null timedelta64[ns]
end_latlng           207 non-null object
gear                 0 non-null object
id                   258 non-null int64
location_city        0 non-null object
location_country     257 non-null object
start_date           258 non-null datetime64[ns, UTC]
start_date_local     258 non-null datetime64[ns]
start_latitude       207 non-null float64
start_longitude      207 non-null float64
start_latlng         207 non-null object
type                 258 non-null object
workout_type         1 non-null object
dtypes: datetime64[ns, UTC](1), datetime64[ns](1), float

## Enrich activity data
For each activity, additional data points such as heart rate are provided as streams. Streams have to be requested separately for each activity id.
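
Because `get_activity_streams` takes a list of stream types, several streams could be requested with one call per activity rather than one call per stream. A sketch of unpacking such a response, assuming it is a dict keyed by stream name whose values carry a `.data` list, which is how this notebook reads it. The stand-in objects only imitate that shape so the example runs without API access, and `unpack_streams` is a hypothetical helper:

```python
from types import SimpleNamespace

def unpack_streams(streams, names):
    """Map each requested stream name to its data list, or [] if it is missing."""
    streams = streams or {}  # tolerate an empty response
    return {
        name: streams[name].data if name in streams else []
        for name in names
    }

# Stand-in for a response containing two of three requested streams
fake_streams = {
    "heartrate": SimpleNamespace(data=[95, 95, 94]),
    "time": SimpleNamespace(data=[0, 11, 18]),
}
result = unpack_streams(fake_streams, ["heartrate", "time", "altitude"])
print(result)
```

Each entry of the resulting dict could then be written to its own `stream_...` column.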

In [53]:
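# Look up a single stream of an activity; returns [] if the stream is missing
def lookup_stream(activity_id, stream_name):
    types = [stream_name]
    streams = client.get_activity_streams(activity_id, types=types, resolution='medium')
    # Guard against an empty response as well as a missing stream
    if streams and stream_name in streams:
        return streams[stream_name].data
    else:
        return []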
    
activity_df['stream_heartrate'] = activity_df.apply(
    lambda row: lookup_stream(row['id'], 'heartrate'), axis=1)

activity_df.head()

Unnamed: 0,average_cadence,average_heartrate,average_speed,calories,description,distance,elapsed_time,end_latlng,gear,id,location_city,location_country,start_date,start_date_local,start_latitude,start_longitude,start_latlng,type,workout_type,stream_heartrate
2018-01-05 10:59:23,51.3,106.4,1.52 m / s,,,7949.40 m,01:28:40,"(48.22, 11.29)",,1917790913,,,2018-01-05 09:59:23+00:00,2018-01-05 10:59:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[95, 95, 94, 94, 94, 94, 94, 94, 111, 109, 107..."
2018-01-06 12:12:54,41.9,96.0,1.25 m / s,,,9725.20 m,02:20:37,"(48.07, 11.38)",,1917790861,,,2018-01-06 11:12:54+00:00,2018-01-06 12:12:54,48.02,11.37,"(48.02, 11.37)",Walk,,"[98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 1..."
2018-01-07 14:55:21,50.2,118.8,1.60 m / s,,,5685.60 m,00:59:45,"(48.22, 11.29)",,1917790652,,,2018-01-07 13:55:21+00:00,2018-01-07 14:55:21,48.22,11.29,"(48.22, 11.29)",Walk,,"[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 9..."
2018-01-08 19:54:08,51.3,125.7,1.70 m / s,,,6057.40 m,00:59:29,"(48.22, 11.29)",,1917790573,,,2018-01-08 18:54:08+00:00,2018-01-08 19:54:08,48.22,11.29,"(48.22, 11.29)",Walk,,"[119, 119, 119, 119, 111, 110, 109, 109, 109, ..."
2018-01-09 20:55:23,56.1,118.4,1.74 m / s,,,5909.70 m,00:57:17,"(48.22, 11.29)",,1917790502,,,2018-01-09 19:55:23+00:00,2018-01-09 20:55:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[113, 113, 113, 113, 113, 113, 113, 113, 113, ..."


In [55]:
activity_df['stream_time'] = activity_df.apply(
    lambda row: lookup_stream(row['id'], 'time'), axis=1)
activity_df.head()

In [58]:
activity_df['stream_latlng'] = activity_df.apply(
    lambda row: lookup_stream(row['id'], 'latlng'), axis=1)
activity_df.head()

Unnamed: 0,average_cadence,average_heartrate,average_speed,calories,description,distance,elapsed_time,end_latlng,gear,id,...,start_date,start_date_local,start_latitude,start_longitude,start_latlng,type,workout_type,stream_heartrate,stream_time,stream_latlng
2018-01-05 10:59:23,51.3,106.4,1.52 m / s,,,7949.40 m,01:28:40,"(48.22, 11.29)",,1917790913,...,2018-01-05 09:59:23+00:00,2018-01-05 10:59:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[95, 95, 94, 94, 94, 94, 94, 94, 111, 109, 107...","[0, 11, 18, 24, 29, 34, 40, 45, 51, 55, 62, 70...","[[48.218981, 11.292518], [48.219032, 11.292494..."
2018-01-06 12:12:54,41.9,96.0,1.25 m / s,,,9725.20 m,02:20:37,"(48.07, 11.38)",,1917790861,...,2018-01-06 11:12:54+00:00,2018-01-06 12:12:54,48.02,11.37,"(48.02, 11.37)",Walk,,"[98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 1...","[0, 9, 15, 23, 30, 35, 40, 46, 51, 58, 64, 72,...","[[48.016734, 11.372375], [48.016779, 11.37234]..."
2018-01-07 14:55:21,50.2,118.8,1.60 m / s,,,5685.60 m,00:59:45,"(48.22, 11.29)",,1917790652,...,2018-01-07 13:55:21+00:00,2018-01-07 14:55:21,48.22,11.29,"(48.22, 11.29)",Walk,,"[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 9...","[0, 3, 7, 11, 14, 18, 21, 23, 26, 29, 33, 36, ...","[[48.216719, 11.29224], [48.216719, 11.29224],..."
2018-01-08 19:54:08,51.3,125.7,1.70 m / s,,,6057.40 m,00:59:29,"(48.22, 11.29)",,1917790573,...,2018-01-08 18:54:08+00:00,2018-01-08 19:54:08,48.22,11.29,"(48.22, 11.29)",Walk,,"[119, 119, 119, 119, 111, 110, 109, 109, 109, ...","[0, 5, 9, 13, 17, 21, 26, 28, 33, 36, 40, 43, ...","[[48.217046, 11.292157], [48.217046, 11.292157..."
2018-01-09 20:55:23,56.1,118.4,1.74 m / s,,,5909.70 m,00:57:17,"(48.22, 11.29)",,1917790502,...,2018-01-09 19:55:23+00:00,2018-01-09 20:55:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[113, 113, 113, 113, 113, 113, 113, 113, 113, ...","[0, 5, 8, 12, 14, 17, 20, 24, 27, 31, 34, 37, ...","[[48.217386, 11.292988], [48.217386, 11.292988..."


In [61]:
activity_df['stream_altitude'] = activity_df.apply(
    lambda row: lookup_stream(row['id'], 'altitude'), axis=1)
activity_df.head()

Unnamed: 0,average_cadence,average_heartrate,average_speed,calories,description,distance,elapsed_time,end_latlng,gear,id,...,start_date_local,start_latitude,start_longitude,start_latlng,type,workout_type,stream_heartrate,stream_time,stream_latlng,stream_altitude
2018-01-05 10:59:23,51.3,106.4,1.52 m / s,,,7949.40 m,01:28:40,"(48.22, 11.29)",,1917790913,...,2018-01-05 10:59:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[95, 95, 94, 94, 94, 94, 94, 94, 111, 109, 107...","[0, 11, 18, 24, 29, 34, 40, 45, 51, 55, 62, 70...","[[48.218981, 11.292518], [48.219032, 11.292494...","[507.1, 507.1, 507.1, 507.1, 507.1, 507.1, 507..."
2018-01-06 12:12:54,41.9,96.0,1.25 m / s,,,9725.20 m,02:20:37,"(48.07, 11.38)",,1917790861,...,2018-01-06 12:12:54,48.02,11.37,"(48.02, 11.37)",Walk,,"[98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 1...","[0, 9, 15, 23, 30, 35, 40, 46, 51, 58, 64, 72,...","[[48.016734, 11.372375], [48.016779, 11.37234]...","[611.8, 611.9, 612.1, 612.4, 612.7, 612.9, 613..."
2018-01-07 14:55:21,50.2,118.8,1.60 m / s,,,5685.60 m,00:59:45,"(48.22, 11.29)",,1917790652,...,2018-01-07 14:55:21,48.22,11.29,"(48.22, 11.29)",Walk,,"[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 9...","[0, 3, 7, 11, 14, 18, 21, 23, 26, 29, 33, 36, ...","[[48.216719, 11.29224], [48.216719, 11.29224],...","[511.1, 511.1, 510.9, 510.8, 510.7, 510.6, 510..."
2018-01-08 19:54:08,51.3,125.7,1.70 m / s,,,6057.40 m,00:59:29,"(48.22, 11.29)",,1917790573,...,2018-01-08 19:54:08,48.22,11.29,"(48.22, 11.29)",Walk,,"[119, 119, 119, 119, 111, 110, 109, 109, 109, ...","[0, 5, 9, 13, 17, 21, 26, 28, 33, 36, 40, 43, ...","[[48.217046, 11.292157], [48.217046, 11.292157...","[511.1, 511.1, 511.1, 511.1, 510.6, 510.5, 510..."
2018-01-09 20:55:23,56.1,118.4,1.74 m / s,,,5909.70 m,00:57:17,"(48.22, 11.29)",,1917790502,...,2018-01-09 20:55:23,48.22,11.29,"(48.22, 11.29)",Walk,,"[113, 113, 113, 113, 113, 113, 113, 113, 113, ...","[0, 5, 8, 12, 14, 17, 20, 24, 27, 31, 34, 37, ...","[[48.217386, 11.292988], [48.217386, 11.292988...","[509.9, 509.9, 509.8, 509.6, 509.6, 509.5, 509..."


In [62]:
activity_df.to_pickle("./activity_streams.pkl")

In [69]:
laps_data = []
laps_columns = ['activity_id', 'average_cadence', 'average_heartrate', 
               'average_speed', 'distance', 'elapsed_time', 'id', 
               'end_index', 'lap_index', 'max_heartrate', 'max_speed', 
               'moving_time', 'name', 'pace_zone', 'resource_state',
               'split', 'start_date', 'start_date_local', 'start_index',
               'total_elevation_gain']
laps_index = []
laps_index_column = 'start_date_local'

for idx, row in activity_df.iterrows():
    # One API request per activity
    for lap in client.get_activity_laps(row['id']):
        lap_dict = {}
        for lap_column in laps_columns:
            if lap_column == "activity_id":
                # The activity id lives on the nested activity object
                lap_dict[lap_column] = lap.activity.id
            else:
                lap_dict[lap_column] = getattr(lap, lap_column)

        laps_data.append(lap_dict)
        laps_index.append(lap_dict[laps_index_column])

lap_df = pd.DataFrame(
    laps_data, 
    index=laps_index, 
    columns=laps_columns)

lap_df.head()

Unnamed: 0,activity_id,average_cadence,average_heartrate,average_speed,distance,elapsed_time,id,end_index,lap_index,max_heartrate,max_speed,moving_time,name,pace_zone,resource_state,split,start_date,start_date_local,start_index,total_elevation_gain
2018-01-05 10:59:23,1917790913,50.0,106.8,1.51 m / s,1000.00 m,00:11:03,6171899282,658,1,118.0,2.00 m / s,00:11:03,Lap 1,,2,1,2018-01-05 09:59:23+00:00,2018-01-05 10:59:23,0,0.00 m
2018-01-05 11:10:27,1917790913,50.9,105.7,1.64 m / s,1000.00 m,00:10:10,6171899283,1268,2,113.0,2.10 m / s,00:10:10,Lap 2,,2,2,2018-01-05 10:10:27+00:00,2018-01-05 11:10:27,659,0.00 m
2018-01-05 11:20:37,1917790913,54.0,108.7,1.60 m / s,1000.00 m,00:10:24,6171899284,1893,3,113.0,2.20 m / s,00:10:24,Lap 3,,2,3,2018-01-05 10:20:37+00:00,2018-01-05 11:20:37,1269,2.30 m
2018-01-05 11:31:02,1917790913,52.5,109.9,1.41 m / s,1000.00 m,00:11:47,6171899285,2601,4,121.0,2.10 m / s,00:11:47,Lap 4,,2,4,2018-01-05 10:31:02+00:00,2018-01-05 11:31:02,1894,3.00 m
2018-01-05 11:42:50,1917790913,48.9,104.0,1.50 m / s,1000.00 m,00:11:07,6171899286,3268,5,111.0,2.10 m / s,00:11:07,Lap 5,,2,5,2018-01-05 10:42:50+00:00,2018-01-05 11:42:50,2602,2.50 m


In [70]:
lap_df.info()

## Enrich location data
The location data is provided as latitude/longitude only. We use the Nominatim service to convert the coordinates into an address and store the country and postcode.
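
Many activities start at nearly the same spot, so repeated reverse lookups can be avoided by caching on rounded coordinates. A sketch using `functools.lru_cache`: the name `make_cached_lookup`, the rounding to three decimals (roughly 100 m in latitude), and the stand-in `fake_reverse` with its return value are assumptions for illustration. In the notebook, the wrapped function would be the one that calls the geocoder.

```python
from functools import lru_cache

def make_cached_lookup(reverse, precision=3):
    """Wrap `reverse(lat, lon)` so nearby coordinates share one cached result."""
    @lru_cache(maxsize=None)
    def cached(lat, lon):
        return reverse(lat, lon)

    def lookup(lat, lon):
        # Round first so that nearby points map to the same cache key
        return cached(round(lat, precision), round(lon, precision))

    return lookup

calls = []

def fake_reverse(lat, lon):
    # Stand-in for the real geocoder call; records how often it is used
    calls.append((lat, lon))
    return {"country": "Germany", "postcode": "82216"}

lookup = make_cached_lookup(fake_reverse)
first = lookup(48.216719, 11.29224)
second = lookup(48.217046, 11.292157)  # rounds to the same key as the first point
print(len(calls))
```

The trade-off is precision: coarser rounding saves more requests but may assign a neighbouring postcode to points near a boundary.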

In [67]:
from time import sleep
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="mybinder")

def lookup_address(start_latlng):
    # Pause between requests to respect Nominatim's usage policy
    sleep(1.5)
    lat = start_latlng.lat
    lon = start_latlng.lon
    loc = geolocator.reverse(str(lat)+", "+str(lon))
    return loc.raw["address"]
    
def lookup_country(start_latlng):
    if start_latlng:
        # .get returns None when the address has no country
        return lookup_address(start_latlng).get("country")

def lookup_postcode(start_latlng):
    if start_latlng:
        # Not every address has a postcode, so avoid a KeyError
        return lookup_address(start_latlng).get("postcode")

#activity_df['location_country'] = activity_df.apply(
#    lambda row: lookup_country(row['start_latlng']), axis=1)
activity_df['location_postcode'] = activity_df.apply(
    lambda row: lookup_postcode(row['start_latlng']), axis=1)
activity_df["location_postcode"]

2018-01-05 10:59:23    82216
2018-01-06 12:12:54    82319
2018-01-07 14:55:21    82216
2018-01-08 19:54:08    82216
2018-01-09 20:55:23    82216
2018-01-10 20:58:40    82216
2018-01-11 20:10:20    82216
2018-01-12 18:56:49    82216
2018-01-13 15:31:56    82216
2018-01-14 14:30:01    82216
2018-01-15 20:08:06    82216
2018-01-20 11:07:55    82216
2018-01-21 13:11:36    82216
2018-01-22 14:11:08    82216
2018-01-23 20:28:07    82216
2018-01-24 22:49:22    82216
2018-01-25 20:59:05    82216
2018-01-26 23:49:23    82216
2018-01-27 15:26:26    82216
2018-01-28 14:24:29    82216
2018-01-29 21:02:35    82216
2018-01-31 20:25:05    82216
2018-02-01 20:31:13    82216
2018-02-02 20:42:45    82216
2018-02-03 15:19:43    82216
2018-02-04 12:42:53    82216
2018-02-04 15:35:15    82216
2018-02-07 20:22:47    82216
2018-02-08 21:41:04    82216
2018-02-09 18:26:49    82216
                       ...  
2018-09-25 06:17:51    08034
2018-09-26 06:12:06    08034
2018-09-27 05:25:22    10717
2018-09-28 05:

In [68]:
activity_df.to_pickle("./activity_streams_location.pkl")