Prototype where I write down all steps in case this becomes an instructional blog later.

Approach:
1. Set up connection to Strava
2. Set up local PostgreSQL database
3. Design
4. Retrieve data using Strava API
5. Load data to database

# Export Strava data (using stravalib) to a local database

This notebook includes the code as discussed in the following Medium blog post:
https://medium.com/@mandieq/accessing-user-data-via-the-strava-api-using-stravalib-d5bee7fdde17

Documentation: authentication
https://stravalib.readthedocs.io/en/latest/get-started/authenticate-with-strava.html

Documentation: working with activities
https://stravalib.readthedocs.io/en/latest/get-started/activities.html



## Connection to Strava

### Create a new application
Log into strava.com and navigate to your profile, Settings, My API Application:
https://www.strava.com/settings/api

Create a new application. Set the callback domain to localhost; the other parameters are arbitrary. Upload an icon.

Make a note of the client ID and client secret. The other two tokens shown on that page aren't useful because their scope isn't enough to retrieve data.


Load ID and secret for your Strava App as set up via Strava's developer area:

In [3]:
CLIENT_ID = '108084'
CLIENT_SECRET = 'bd51d6ce5c6c2e663f195859b3b092631bcd6242'
print('client_ID=', CLIENT_ID, 'secret=', CLIENT_SECRET)

client_ID= 108084 secret= bd51d6ce5c6c2e663f195859b3b092631bcd6242
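
Hard-coding the secret in a notebook makes it easy to leak when the notebook is shared. Below is a minimal sketch of one alternative, assuming the credentials are stored in environment variables. The variable names `STRAVA_CLIENT_ID` and `STRAVA_CLIENT_SECRET` and the helper `load_strava_credentials` are hypothetical; neither Strava nor stravalib defines them.

```python
import os


def load_strava_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return env["STRAVA_CLIENT_ID"], env["STRAVA_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None
```

With both variables set, `CLIENT_ID, CLIENT_SECRET = load_strava_credentials()` could replace the two hard-coded assignments.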


### One time authentication with athlete

This only needs to be done once, to obtain the first access token. After that, use the refresh token to get a new access token whenever the old one expires.

The cells below need to be converted back to code cells in order to run them. Note: **once only**!

Run the cell, then copy/paste the printed URL into a browser and click the Authorize button. The browser is redirected to a new URL.
Copy the code in the middle of that URL (between code= and scope=).


In [9]:
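from stravalib import Client

client = Client()
# Build the URL the athlete must open in a browser to authorize this application
url = client.authorization_url(client_id=CLIENT_ID,
                               redirect_uri='http://127.0.0.1:5000/authorization')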
print(url)

#http://127.0.0.1:5000/authorization?state=&code=0cfa95587f7f320a002dc2eaa89d9b1693aba03a&scope=read,activity:read
#0cfa95587f7f320a002dc2eaa89d9b1693aba03a



https://www.strava.com/oauth/authorize?client_id=108084&redirect_uri=http%3A%2F%2F127.0.0.1%3A5000%2Fauthorization&approval_prompt=auto&scope=read%2Cactivity%3Aread&response_type=code
auth_code= 0cfa95587f7f320a002dc2eaa89d9b1693aba03a
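
Instead of copying the code out of the redirect URL by hand, it can be parsed with the standard library. This is a small sketch; the helper name `extract_auth_code` and the placeholder URL are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url):
    """Return the value of the `code` query parameter from the redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("No 'code' parameter found in the redirect URL")
    return codes[0]


# Placeholder URL in the same shape as the redirect Strava sends back
example = "http://127.0.0.1:5000/authorization?state=&code=abc123&scope=read,activity:read"
print(extract_auth_code(example))
```

Parsing the query string avoids depending on the order of the parameters, which the manual "between code= and scope=" approach relies on.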


In [9]:
AUTH_CODE = '0cfa95587f7f320a002dc2eaa89d9b1693aba03a'
print('auth_code=', AUTH_CODE)

auth_code= 0cfa95587f7f320a002dc2eaa89d9b1693aba03a


### Exchange auth code for access token + refresh token

In [10]:
import json
from stravalib import Client
client = Client()

In [11]:
# Exchange the one-time authorization code for an access token and a refresh token
token_response = client.exchange_code_for_token(client_id=CLIENT_ID,
                                              client_secret=CLIENT_SECRET,
                                              code=AUTH_CODE)
access_token = token_response['access_token']
refresh_token = token_response['refresh_token']  # Needed once the access token expires (after 6 hours)
print(token_response)

with open('strava.token', 'w') as outfile:
    json.dump(token_response, outfile)


No rates present in response headers


Fault: 400 Client Error: Bad Request [Bad Request: [{'resource': 'AuthorizationCode', 'field': 'code', 'code': 'invalid'}]]

In [12]:
# Retrieve the saved tokens from the token file; json.load returns them as a dict
with open('strava.token', 'r') as infile:
    token_string = json.load(infile)
print(token_string)

{'access_token': '687d63e48317d2f1f9bebed892b571aaa039ffc8', 'refresh_token': '0967598ae6324d33c3790bfe68e7a1cf84f42e20', 'expires_at': 1714267638}


### Refresh access token


In [13]:
# Use the refresh token loaded from strava.token in the previous cell
token_response = client.refresh_access_token(client_id=CLIENT_ID,
                                      client_secret=CLIENT_SECRET,
                                      refresh_token=token_string['refresh_token'])
print(token_response)

with open('strava.token', 'w') as outfile:
    json.dump(token_response, outfile)

### Check when token expires

In [17]:
from datetime import datetime

expires_at = '1714267638'
print('expires_at=',expires_at)
print(datetime.fromtimestamp(int(expires_at)))

expires_at= 1714267638
2024-04-28 03:27:18


In [23]:

print(datetime.now())

# The access token is what expires; the refresh token is used to obtain a new one
if datetime.now() < datetime.fromtimestamp(int(expires_at)):
    print('no need to refresh the access token')
else:
    print('time to refresh the access token')


1714247218.7974396
2024-04-27 21:46:58.797439
no need to refresh the access token
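
The same check can be wrapped in a small function that works directly on the dict stored in strava.token. This is only a sketch: the name `needs_refresh` and the five-minute safety margin are my own choices, not part of stravalib.

```python
import time


def needs_refresh(token, now=None, margin_seconds=300):
    """True if the access token has expired or will expire within the margin.

    `token` is a dict with an `expires_at` epoch timestamp, as stored in
    strava.token. The five-minute margin is an arbitrary choice.
    """
    now = time.time() if now is None else now
    return now >= int(token["expires_at"]) - margin_seconds


token = {"expires_at": 1714267638}
print(needs_refresh(token, now=1714247218))
```

Comparing epoch seconds directly sidesteps any confusion between local time and UTC that can creep in when naive datetime objects are compared.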


## Design



* Create one schema for the historical load of raw data: STG (staging)
* Create one schema for the transformed model: DM (data mart)

The source tables live under STG and are named `<source name>_<api or entity>`, e.g.:
* STRAVA_ATHLETE
* STRAVA_GEAR
* STRAVA_ACTIVITY
* REF_DATE
* KNMI_weather_something
* GARMIN_SLEEP
* GARMIN_ATHLETE
* GARMIN_period_something
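
To keep the naming convention consistent, the table names and schema statements could be generated rather than typed by hand. A rough sketch follows; the helpers `staging_table` and `create_schema_statements` are hypothetical.

```python
SCHEMAS = {"STG": "historical load of raw data", "DM": "transformed model"}


def staging_table(source, entity):
    """Build a schema-qualified staging table name: STG.<SOURCE>_<ENTITY>."""
    return f"STG.{source}_{entity}".upper()


def create_schema_statements():
    """Return one CREATE SCHEMA statement per schema in the design."""
    return [f"CREATE SCHEMA IF NOT EXISTS {name};" for name in SCHEMAS]


print(staging_table("strava", "activity"))
for statement in create_schema_statements():
    print(statement)
```

Note that PostgreSQL folds unquoted identifiers to lowercase, so the upper-case names here are a readability convention only.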


## Retrieve data using API

### Get Athlete (one-off)

In [24]:
athlete = client.get_athlete()
print(athlete)


bound_client = <stravalib.client.Client object at 0x000001536E12EBD0> 
id=106384724 
city=None 
country=None 
created_at=datetime.datetime(2022, 8, 1, 20, 14, 41, tzinfo=datetime.timezone.utc) 
firstname='Nerrida' 
lastname='Dempster' 
premium=False 
profile='https://dgalywyr863hv.cloudfront.net/pictures/athletes/106384724/25269633/2/large.jpg' 
profile_medium='https://dgalywyr863hv.cloudfront.net/pictures/athletes/106384724/25269633/2/medium.jpg' 
resource_state=2 
sex=None 
state=None 
summit=False 
updated_at=datetime.datetime(2024, 4, 27, 18, 40, 26, tzinfo=datetime.timezone.utc) 
bikes=None clubs=None follower_count=None friend_count=None ftp=None measurement_preference=None shoes=None weight=0.0 
is_authenticated=None athlete_type=None friend=None follower=None approve_followers=None badge_type_id=0 mutual_friend_count=None date_preference=None 
email=None super_user=None email_language=None max_heartrate=None username=None description=None instagram_username=None offer_in_app_payment=None 
global_privacy=None receive_newsletter=None email_kom_lost=None dateofbirth=None facebook_sharing_enabled=None profile_original=None 
premium_expiration_date=None email_send_follower_notices=None plan=None agreed_to_terms=None follower_request_count=None 
email_facebook_twitter_friend_joins=None receive_kudos_emails=None receive_follower_feed_emails=None receive_comment_emails=None 
sample_race_distance=None sample_race_time=None membership=None admin=None owner=None subscription_permissions=None



### Get Athlete Stats

In [25]:
athlete_stats = client.get_athlete_stats()
print(athlete_stats)


all_ride_totals=ActivityTotals(achievement_count=None, count=9, distance=140543.0, elapsed_time=datetime.timedelta(seconds=42100), elevation_gain=317.0, moving_time=datetime.timedelta(seconds=29257)) 
all_run_totals=ActivityTotals(achievement_count=None, count=52, distance=154589.0, elapsed_time=datetime.timedelta(seconds=74898), elevation_gain=15.0, moving_time=datetime.timedelta(seconds=73419)) 
all_swim_totals=ActivityTotals(achievement_count=None, count=0, distance=0.0, elapsed_time=datetime.timedelta(0), elevation_gain=0.0, moving_time=datetime.timedelta(0)) 
biggest_climb_elevation_gain=20.9 biggest_ride_distance=29129.9 recent_ride_totals=ActivityTotals(achievement_count=0, count=1, distance=6571.25, elapsed_time=datetime.timedelta(seconds=2264), elevation_gain=2.825714349746704, moving_time=datetime.timedelta(seconds=1400)) recent_run_totals=ActivityTotals(achievement_count=0, count=8, distance=22480.0, elapsed_time=datetime.timedelta(seconds=9446), elevation_gain=0.0, moving_time=datetime.timedelta(seconds=9446)) recent_swim_totals=ActivityTotals(achievement_count=0, count=0, distance=0.0, elapsed_time=datetime.timedelta(0), elevation_gain=0.0, moving_time=datetime.timedelta(0)) ytd_ride_totals=ActivityTotals(achievement_count=None, count=6, distance=86504.0, elapsed_time=datetime.timedelta(seconds=20022), elevation_gain=65.0, moving_time=datetime.timedelta(seconds=16838)) ytd_run_totals=ActivityTotals(achievement_count=None, count=17, distance=36808.0, elapsed_time=datetime.timedelta(seconds=16810), elevation_gain=0.0, moving_time=datetime.timedelta(seconds=16807)) ytd_swim_totals=ActivityTotals(achievement_count=None, count=0, distance=0.0, elapsed_time=datetime.timedelta(0), elevation_gain=0.0, moving_time=datetime.timedelta(0))



### Get Activities (Repeatable)

In [1]:
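# `client` must be the authenticated Client from the token cells above;
# in a fresh kernel this cell fails with the NameError shown below.
activities = client.get_activities(limit=3)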

for activity in activities:
    print(activity)

NameError: name 'client' is not defined

In [35]:
for activity in activities:
    print("{0.id} {0.name} {0.start_date} {0.moving_time}".format(activity))


11269224530 Weighed walk 8kg 2024-04-26 15:13:09+00:00 0:56:06
11261527928 Treadmill jog 2024-04-25 14:12:54+00:00 0:20:22
11247062574 Treadmill jog 2024-04-23 15:51:52+00:00 0:20:34
11239384435 Treadmill jog 2024-04-22 15:31:55+00:00 0:21:21
11232759663 Afternoon Walk 2024-04-21 12:00:34+00:00 0:31:14
11223836583 Yoga 2024-04-20 12:47:55+00:00 0:33:36
11223490375 Treadmill jog 2024-04-20 12:09:16+00:00 0:26:23
11211687883 Evening Walk 2024-04-18 16:44:13+00:00 1:16:01
11196806607 Weighted walk 8kg 2024-04-16 17:17:07+00:00 0:42:33
11183483728 Treadmill jog 2024-04-14 19:02:14+00:00 0:20:19
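
Before loading, each activity has to be flattened into plain values. The sketch below does that for the four fields printed in the cell before this one. It assumes `moving_time` is a `timedelta` and `start_date` is a `datetime`; the stand-in object is built with `SimpleNamespace` so the example runs without calling the API, and the function name `activity_to_row` is made up.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def activity_to_row(activity):
    """Flatten one activity into a dict of plain values ready for loading."""
    moving = activity.moving_time
    # Assumption: moving_time is a timedelta; any other numeric value goes through int()
    seconds = moving.total_seconds() if isinstance(moving, timedelta) else moving
    return {
        "id": activity.id,
        "name": activity.name,
        "start_date": activity.start_date.isoformat(),
        "moving_time_s": int(seconds),
    }


# Stand-in object mirroring one of the activities listed in this notebook
sample = SimpleNamespace(
    id=11261527928,
    name="Treadmill jog",
    start_date=datetime(2024, 4, 25, 14, 12, 54, tzinfo=timezone.utc),
    moving_time=timedelta(minutes=20, seconds=22),
)
print(activity_to_row(sample))
```

Storing the duration as whole seconds and the start date as an ISO 8601 string keeps the row JSON-serialisable.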


### Get Gear
After retrieving all activities, make a list of the distinct gear IDs they reference and retrieve each gear item.

In [32]:
gear_id = 'g11424741'
gear = client.get_gear(gear_id)
print(gear)

#distance=261640.0 id='g11260745' name='Brooks Defyance 12' primary=False resource_state=3 brand_name='Brooks' description=None frame_type=None model_name='Defyance 12'
#distance=805092.0 id='g11424741' name='Meindl UTAH GTX' primary=False resource_state=3 brand_name='Meindl' description=None frame_type=None model_name='UTAH GTX'


distance=805092.0 id='g11424741' name='Meindl UTAH GTX' primary=False resource_state=3 brand_name='Meindl' description=None frame_type=None model_name='UTAH GTX'
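
Collecting the distinct gear IDs could look roughly like this. It assumes each activity exposes a `gear_id` attribute that may be empty; the stand-in activities are built with `SimpleNamespace` so the example runs offline, and `distinct_gear_ids` is a hypothetical helper.

```python
from types import SimpleNamespace


def distinct_gear_ids(activities):
    """Return the sorted distinct non-empty gear IDs seen across activities."""
    return sorted({a.gear_id for a in activities if getattr(a, "gear_id", None)})


# Stand-in activities; real ones would come from client.get_activities()
sample = [
    SimpleNamespace(gear_id="g11424741"),
    SimpleNamespace(gear_id="g11260745"),
    SimpleNamespace(gear_id="g11424741"),
    SimpleNamespace(gear_id=None),
]
print(distinct_gear_ids(sample))
```

Each ID in the result can then be passed to `client.get_gear(...)` as in the cell above, so every gear item is fetched only once.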


## Ingest to database

Start pgAdmin 4.

In the left-hand pane, navigate to Databases. Right-click and create a new database: Sport_Activity_Tracker

Utility to convert JSON to PostgreSQL DDL:
https://konbert.com/convert/json/to/postgres
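
As an offline alternative to the online converter, a first draft of the DDL can be inferred from one sample record. This is a deliberately naive sketch: the type mapping is my own guess at sensible defaults, and the helpers `pg_type` and `create_table_ddl` are hypothetical.

```python
def pg_type(value):
    """Map a Python value decoded from JSON to a PostgreSQL column type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE PRECISION"
    if isinstance(value, (dict, list)):
        return "JSONB"
    return "TEXT"  # strings, None, and anything unrecognised


def create_table_ddl(table, record):
    """Build a CREATE TABLE statement from a single sample record."""
    columns = ",\n".join(
        f'    "{name}" {pg_type(value)}' for name, value in record.items()
    )
    return f"CREATE TABLE {table} (\n{columns}\n);"


# Values taken from the gear output in this notebook
gear = {"id": "g11424741", "name": "Meindl UTAH GTX", "distance": 805092.0, "primary": False}
print(create_table_ddl("STG.STRAVA_GEAR", gear))
```

The column names are double-quoted because some field names, such as `primary`, are reserved words in PostgreSQL. A single record can't reveal nullable or mixed-type columns, so the result is a starting point to review by hand.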

## To do, improvements

Consider additional data sources:
* Garmin for sleep and period data
* KNMI for weather

Other improvements:
* Date dimension
* Transform to custom activity types
* Add measures as defined in ideas.txt