# Strava Activities Downloader

Download all of your Strava activities and export them to CSV.

See `strava_data_analysis.ipynb` for analysis and visualization of running, cycling, and other activities.

References:

* https://developers.strava.com/docs/

* https://github.com/sladkovm/stravaio



Contributions:

* Kristoffer J. Zieba (https://github.com/kriszieba)

* Contributors of StravaIO (https://github.com/sladkovm/stravaio)

## 1. Import required libraries

In [1]:
import os # will be used for reading environment variables
from stravaio import StravaIO, strava_oauth2 # will be used for accessing Strava data
import pandas as pd # if you know this, you are qualified at least for the data analytics job
import numpy as np # the mother of all data processing in Python
import datetime # decoding of time info
from dateutil.tz import tzutc # decoding of time info

## 2. Load Strava Data

To access your Strava data programmatically you need an **access token**. Strava recently changed its policy, and obtaining a token with activity-level permission has become somewhat cumbersome. For such cases I've created a project that launches a local server to obtain a personal token. See [strava-oauth](https://github.com/sladkovm/strava-oauth) for more info.

To work with the Strava data you need the *stravaio* library. Its purpose is to be as declarative as possible, so that the user never feels they are working with a web service. At the center of the *stravaio* workflow is the *StravaIO* object, which exposes all the functions required to access *athlete*, *activities* and *streams* data.

In [2]:
access_token = strava_oauth2(client_id=NNNNN, client_secret='XXXXXXX') # Replace Ns and Xs by real values

2020-08-26 16:34:00.016 | INFO     | stravaio:strava_oauth2:343 - serving at port 8000
2020-08-26 16:34:02.771 | DEBUG    | stravaio:run_server_and_wait_for_token:397 - code: 6c3fe70011648fe076106d23b7fc1fbeefb0725a
2020-08-26 16:34:03.338 | DEBUG    | stravaio:run_server_and_wait_for_token:406 - Authorized athlete: 83758d7707e4c7cf3b057faedb4e2bedb8044f35
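
The `os` import above is noted as being for reading environment variables, which keeps the client secret out of the notebook itself. A minimal sketch of that idea follows. The helper `load_strava_credentials` and the variable names `STRAVA_CLIENT_ID` and `STRAVA_CLIENT_SECRET` are assumptions of this example, not something this notebook defines.

```python
import os


def load_strava_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    The variable names used here are an assumption of this sketch; use
    whatever names you export in your own shell.
    """
    try:
        client_id = int(env["STRAVA_CLIENT_ID"])  # the notebook passes the ID as a number
        client_secret = env["STRAVA_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None
    return client_id, client_secret


# Usage (requires both variables to be set before starting Jupyter):
# client_id, client_secret = load_strava_credentials()
# access_token = strava_oauth2(client_id=client_id, client_secret=client_secret)
```

Passing the mapping as a parameter makes the helper easy to try out with a plain dict instead of the real environment.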


In [3]:
client = StravaIO(access_token=access_token['access_token'])

The StravaIO object directly exposes the [Strava Swagger API interfaces](https://developers.strava.com/docs/reference/):

In [4]:
client.__dict__

{'configuration': <swagger_client.configuration.Configuration at 0x1a2b01bf1c8>,
 '_api_client': <swagger_client.api_client.ApiClient at 0x1a2b01bf188>,
 'athletes_api': <swagger_client.api.athletes_api.AthletesApi at 0x1a2b01db4c8>,
 'activities_api': <swagger_client.api.activities_api.ActivitiesApi at 0x1a2b01db588>,
 'streams_api': <swagger_client.api.streams_api.StreamsApi at 0x1a2b01db548>}

StravaIO also exposes a number of methods that give direct access to Strava data. These are essentially wrappers around the API interfaces that simplify common tasks.

In [5]:
for m in dir(client):
    if not m.startswith('_'):
        print(m)

activities_api
athletes_api
configuration
get_activity_by_id
get_activity_streams
get_logged_in_athlete
get_logged_in_athlete_activities
local_activities
local_athletes
local_streams
streams_api
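
The loop above prints methods and plain attributes together. As a rough sketch, the same introspection can separate the two with `callable`. The helper `split_public_members` and the `DemoClient` class are hypothetical stand-ins used only for illustration; they are not part of stravaio.

```python
def split_public_members(obj):
    """Return (methods, attributes): public names of obj, split by callability."""
    methods, attributes = [], []
    for name in dir(obj):  # dir() returns names in sorted order
        if name.startswith("_"):
            continue
        target = methods if callable(getattr(obj, name)) else attributes
        target.append(name)
    return methods, attributes


class DemoClient:
    """Hypothetical stand-in for a client object; not part of stravaio."""

    def __init__(self):
        self.configuration = {"host": "example"}

    def get_item(self):
        return "item"


methods, attributes = split_public_members(DemoClient())
print("methods:", methods)
print("attributes:", attributes)
```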


### Get athlete information
Only information about the logged-in athlete can be accessed. The access_token unambiguously identifies the athlete.

In [6]:
athlete = client.get_logged_in_athlete()

In [7]:
print(f"""
Name: {athlete.api_response.firstname}, 
Last Name: {athlete.api_response.lastname}, 
FTP: {athlete.api_response.ftp}, 
""")


Name: Kristoffer Jan, 
Last Name: Zieba, 
FTP: 300, 



### Get the athlete's activities

The *after* parameter takes a date in human-readable format; you can even pass "Last year". The function returns a list of SummaryActivity objects.

In [8]:
activities = client.get_logged_in_athlete_activities(after='last month')

Fetched 18, the latests is on 2020-08-22 12:41:46+00:00
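
For context, the underlying Strava endpoint filters activities with an epoch timestamp in its `after` parameter, so a human-readable date has to be converted at some point. How stravaio parses the string is not shown here. The sketch below only illustrates what such a timestamp looks like, using the standard library; the helper `epoch_days_ago` is hypothetical, and treating "last month" as 30 days is a simplification.

```python
from datetime import datetime, timedelta, timezone


def epoch_days_ago(days, now=None):
    """Return the Unix timestamp (whole seconds) for `days` days before `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp())


# A fixed reference time keeps the example reproducible.
reference = datetime(2020, 8, 26, tzinfo=timezone.utc)
print(epoch_days_ago(30, now=reference))
```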


In [9]:
type(activities[0])

swagger_client.models.summary_activity.SummaryActivity

## 3. Make a local copy of training data

In [10]:
data = list(activities)  # local copy of the downloaded activities

In [11]:
type(data)

list

In [12]:
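# Round-trip through the string representation to turn the model objects into
# plain dicts that pandas can build a DataFrame from. The repr appears to be
# dict-like and to contain datetime.datetime(...) and tzutc() expressions,
# which is why `datetime` and `tzutc` were imported above.
# Only do this with data you trust: eval() executes arbitrary code.
txt = str(data)
lst = eval(txt)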
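
Swagger-generated Python models usually expose a `to_dict()` method, which would avoid the `str()`/`eval()` round trip. The sketch below assumes that `SummaryActivity` has such a method, which this notebook does not confirm. `DemoActivity` is a hypothetical stand-in with made-up values so the example runs without a Strava connection.

```python
import datetime


class DemoActivity:
    """Hypothetical stand-in for a swagger-generated model such as SummaryActivity."""

    def __init__(self, name, distance, start_date):
        self.name = name
        self.distance = distance
        self.start_date = start_date

    def to_dict(self):
        # Assumption: the real model returns its fields as a plain dict like this.
        return {"name": self.name, "distance": self.distance, "start_date": self.start_date}


def activities_to_rows(activities):
    """Convert model objects to a list of dicts without using eval()."""
    return [activity.to_dict() for activity in activities]


rows = activities_to_rows([
    DemoActivity("Demo ride", 58156.0, datetime.datetime(2020, 7, 27, 15, 9, 54)),
])
print(rows[0]["name"], rows[0]["distance"])
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame`, just as `lst` is in the next cell.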

In [13]:
df = pd.DataFrame(lst)
df = df.set_index("start_date_local")

In [14]:
df

| start_date_local | achievement_count | athlete | athlete_count | average_speed | average_watts | comment_count | commute | device_watts | distance | elapsed_time | ... | start_date | start_latlng | timezone | total_elevation_gain | total_photo_count | trainer | type | upload_id | weighted_average_watts | workout_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020-07-27 17:09:54+00:00 | 9 | {'id': 1010749} | 1 | 7.552 | 196.6 | 0 | False | True | 58156.0 | 8565 | ... | 2020-07-27 15:09:54+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 1014.0 | 0 | False | Ride | 4087852343 | 244.0 | |
| 2020-07-28 20:38:56+00:00 | 12 | {'id': 1010749} | 1 | 7.656 | 182.7 | 0 | False | False | 17670.0 | 2335 | ... | 2020-07-28 18:38:56+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 228.0 | 0 | False | Ride | 4095779077 | | |
| 2020-07-29 16:32:42+00:00 | 14 | {'id': 1010749} | 1 | 8.055 | 191.3 | 0 | False | True | 81100.0 | 10227 | ... | 2020-07-29 14:32:42+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 930.0 | 0 | False | Ride | 4101312337 | 214.0 | |
| 2020-07-31 15:49:32+00:00 | 9 | {'id': 1010749} | 1 | 8.483 | 186.8 | 0 | False | False | 37570.4 | 7911 | ... | 2020-07-31 13:49:32+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 307.0 | 0 | False | Ride | 4112490263 | | |
| 2020-08-01 12:18:42+00:00 | 18 | {'id': 1010749} | 1 | 7.959 | 194.3 | 0 | False | True | 104618.0 | 13864 | ... | 2020-08-01 10:18:42+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 1299.0 | 2 | False | Ride | 4117730459 | 212.0 | 10.0 |
| 2020-08-04 17:28:28+00:00 | 0 | {'id': 1010749} | 1 | 8.143 | 167.0 | 0 | False | False | 23746.2 | 4334 | ... | 2020-08-04 15:28:28+00:00 | [63.43, 10.34] | (GMT+01:00) Europe/Oslo | 215.0 | 0 | False | Ride | 4134832264 | | |
| 2020-08-04 18:46:26+00:00 | 14 | {'id': 1010749} | 11 | 10.877 | 324.6 | 3 | False | False | 29183.5 | 2683 | ... | 2020-08-04 16:46:26+00:00 | [63.33, 10.3] | (GMT+01:00) Europe/Oslo | 238.0 | 2 | False | Ride | 4134830886 | | 11.0 |
| 2020-08-04 19:50:29+00:00 | 2 | {'id': 1010749} | 3 | 6.769 | 147.9 | 0 | False | False | 15575.8 | 2450 | ... | 2020-08-04 17:50:29+00:00 | [63.33, 10.3] | (GMT+01:00) Europe/Oslo | 237.0 | 0 | False | Ride | 4135139826 | | |
| 2020-08-05 08:19:52+00:00 | 0 | {'id': 1010749} | 1 | 6.891 | 158.9 | 0 | False | True | 34050.9 | 4991 | ... | 2020-08-05 06:19:52+00:00 | [63.43, 10.35] | (GMT+01:00) Europe/Oslo | 419.0 | 0 | False | Ride | 4137786561 | 175.0 | |
| 2020-08-07 11:54:47+00:00 | 42 | {'id': 1010749} | 1 | 7.499 | 187.7 | 0 | False | True | 121131.0 | 18587 | ... | 2020-08-07 09:54:47+00:00 | [63.2, 9.77] | (GMT+01:00) Europe/Oslo | 1624.0 | 3 | False | Ride | 4150962314 | 213.0 | 10.0 |
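
The Strava API reports `distance` in meters, `elapsed_time` in seconds, and `average_speed` in meters per second, so the raw columns are easier to read after conversion. A small sketch with hypothetical helper names, applied to the values from the first row above:

```python
def meters_to_km(meters):
    """Convert a distance in meters to kilometers."""
    return meters / 1000.0


def mps_to_kmh(speed_mps):
    """Convert a speed in meters per second to kilometers per hour."""
    return speed_mps * 3.6


def seconds_to_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Values taken from the first row of the table above.
print(round(meters_to_km(58156.0), 2), "km")
print(round(mps_to_kmh(7.552), 2), "km/h")
print(seconds_to_hms(8565))
```

With a DataFrame, the same conversions can be applied column-wise, for example `df["distance"] / 1000`.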


In [15]:
os.makedirs('data', exist_ok=True)  # make sure the output directory exists
df.to_csv('data/strava-activities-raw.csv')
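
The `athlete` and `start_latlng` columns hold nested values (a dict and a list), which end up as strings in the CSV file. One option is to flatten them into scalar columns before building the DataFrame. The sketch below assumes each activity is already a plain dict; the helper `flatten_activity` and the column names `athlete_id`, `start_lat` and `start_lng` are hypothetical.

```python
def flatten_activity(activity):
    """Return a copy of an activity dict with nested fields expanded into scalar columns."""
    flat = dict(activity)  # shallow copy so the input is left untouched
    athlete = flat.pop("athlete", None) or {}
    flat["athlete_id"] = athlete.get("id")
    # Fall back to empty coordinates when the position is missing or empty.
    latlng = flat.pop("start_latlng", None) or [None, None]
    flat["start_lat"], flat["start_lng"] = latlng[0], latlng[1]
    return flat


example = {"athlete": {"id": 1010749}, "start_latlng": [63.43, 10.35], "type": "Ride"}
print(flatten_activity(example))
```

Applied to the list of dicts from the earlier cell, this would look like `pd.DataFrame([flatten_activity(a) for a in lst])`.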