# Exporting Fitbit Data

<b> Katriona Goldmann </b>

This script accesses the fitbit API and outputs the data to csv files. Fitbit API Python Client Implemented using scripts by [Brad Pitcher's python-fitbit repo](https://github.com/orcasgit/python-fitbit). 

## Outline:
* [Step 1: Access the fitbit API](#first-bullet)
* [Step 2: Extract the Data](#second-bullet)
* [Step 3: Export Exercise Data](#exercise-bullet)
* [Step 4: Export Daily Summaries](#summary-bullet)
* [Step 5: Export the sleep summary data](#sleep-bullet)

I have automated the script to replot the data every other day using [launchd](https://medium.com/@chetcorcos/a-simple-launchd-tutorial-9fecfcf2dbb3) on my laptop. There is also a copy of the launchd script, as well as the bash script it exectutes (fitbit-update.sh), in this repo. 

## Step 1: Access the fitbit API <a class="anchor" id="first-bullet"></a>

Import the necessary packages and their use cases for this project:
   
>fitbit: access fitbit data <br>
>gather_keys_oauth2: authorize fitbit access <br>
>pandas: data frames and data manipulation <br>
>numpy: summary statistics <br>
>datetime: turn the dates into datetime objects / get day of week <br>
>cherrypy: for web applications<br>
>sys: This module provides access to some variables used or maintained by the interpreter<br>
>gather_keys_oauth: to authorize the API, script [here](https://github.com/orcasgit/python-fitbit/blob/master/gather_keys_oauth2.py)<br>

To download the fitbit module simply run 
> $ pip install git+git://github.com/orcasgit/python-fitbit

In [29]:
import fitbit
import pandas as pd
import numpy as np
import os
import datetime
from datetime import date, timedelta
import cherrypy
import sys
import gather_keys_oauth2 as Oauth2

I have saved my api keys in a csv file located within this directory. Your will need to enter your unique client ID and secret as described in step 1 [here](https://towardsdatascience.com/collect-your-own-fitbit-data-with-python-ff145fa10873):

In [30]:
df = pd.read_csv('./Inputs/api_key.csv')

CLIENT_ID = df.iat[0,0]
CLIENT_SECRET = df.iat[0,1]

cherrypy.config.update({'server.socket_host': '127.0.0.1'})

In [31]:
server = Oauth2.OAuth2Server(CLIENT_ID, CLIENT_SECRET)

The next cell will take you to a new tab for authentification, you may need to log in and approve. The tab can be closed once authorized. If this causes errors, this may be due to package versions installed (see [here](https://github.com/orcasgit/python-fitbit/issues/142)). If this is the case roll back to previous versions: pip install requests-oauthlib==1.1.0, and pip install oauthlib==2.1.0

In [32]:
server.browser_authorize()

[05/Mar/2019:10:25:06] ENGINE Listening for SIGTERM.
[05/Mar/2019:10:25:06] ENGINE Listening for SIGHUP.
[05/Mar/2019:10:25:06] ENGINE Listening for SIGUSR1.
[05/Mar/2019:10:25:06] ENGINE Bus STARTING
CherryPy Checker:
The Application mounted at '' has an empty config.

[05/Mar/2019:10:25:06] ENGINE Started monitor thread 'Autoreloader'.
[05/Mar/2019:10:25:06] ENGINE Serving on http://127.0.0.1:8080
[05/Mar/2019:10:25:06] ENGINE Bus STARTED


127.0.0.1 - - [05/Mar/2019:10:25:09] "GET /?code=f4d6b60a7be7d7f7e2a725460ada0a42ae642680&state=ebt7pJoE136fpCWl7xN76BMtja7YXP HTTP/1.1" 200 122 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.3 Safari/605.1.15"


[05/Mar/2019:10:25:10] ENGINE Bus STOPPING
[05/Mar/2019:10:25:15] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8080)) shut down
[05/Mar/2019:10:25:15] ENGINE Stopped thread 'Autoreloader'.
[05/Mar/2019:10:25:15] ENGINE Bus STOPPED
[05/Mar/2019:10:25:15] ENGINE Bus EXITING
[05/Mar/2019:10:25:15] ENGINE Bus EXITED
[05/Mar/2019:10:25:15] ENGINE Waiting for child threads to terminate...


In [33]:
ACCESS_TOKEN = str(server.fitbit.client.session.token['access_token'])
REFRESH_TOKEN = str(server.fitbit.client.session.token['refresh_token'])

auth2_client = fitbit.Fitbit(
    CLIENT_ID,
    CLIENT_SECRET,
    oauth2=True,
    access_token=ACCESS_TOKEN,
    refresh_token=REFRESH_TOKEN)

# Step 2: Extract the User Data <a class="anchor" id="second-bullet"></a>

The details and documentation for the API functions can be found [here](https://python-fitbit.readthedocs.io/en/latest/).

In [35]:
user_info = auth2_client.user_profile_get()["user"]

In [36]:
print("\nAverage daily steps: ", user_info["averageDailySteps"], 
      "\nstride length: ", user_info["strideLengthWalking"]*0.0254, 'm',
      "\nrunning stride length: ", user_info["strideLengthRunning"]*0.0254, 'm')


Average daily steps:  10601 
stride length:  0.648 m 
running stride length:  0.961 m


# Step 3: Export Exercise Data <a class="anchor" id="exercise-bullet"></a>


Extract the log of exercise so that it can be analysed further down the line. This way we can monitor exercise frequncy and check for improvements or changes. But first lets check the dates we still need to run:

In [37]:
# If the analysis has been run before, we only need to run from the last date
if os.path.exists('./Outputs/exercise.csv'):
    ex_comp = pd.read_csv('./Outputs/exercise.csv')
    lastdate = ex_comp.iloc[-1]['lastModified'][0:10]
else:
    lastdate = "2015/01/01" # define the start date

d1 = date(int(lastdate[0:4]), int(lastdate[5:7]), int(lastdate[8:10]))    # start date
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ex_dates_list = []
for i in range(delta.days + 1): ex_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

Range of dates selected: 12 days, 0:00:00 from 2019-02-20 to 2019-03-04


In [38]:
ex_df = pd.DataFrame()

for d in ex_dates_list:
    summary = auth2_client.activities(date=d)
    exercise = pd.DataFrame.from_records(summary['activities'])
    ex_df = ex_df.append(exercise, sort=True)

print(ex_df['distance'])
ex_df['distance (miles)'] = ex_df['distance']
ex_df['distance'] = ex_df['distance (miles)'] * 1.60934  # Convert to km

ex_df['startDate'] = [
    datetime.datetime.strptime(x, '%Y-%m-%d').strftime('%d/%m/%Y')
    for x in ex_df['startDate']
]

0    2.236936
0    2.796170
Name: distance, dtype: float64


Inspect the data frame and export it to a csv: 

In [40]:
with open('./Outputs/exercise.csv', 'a') as f:
    f.write('\n') 
    ex_df.to_csv(f, header=False)

In [41]:
ex_df

Unnamed: 0,activityId,activityParentId,activityParentName,calories,description,distance,duration,hasStartTime,isFavorite,lastModified,logId,name,startDate,startTime,steps,distance (miles)
0,90009,90009,Run,188,Running - 5 mph (12 min/mile),3.599991,1320000,True,False,2019-03-05T10:04:41.000Z,20214666562,Run,24/02/2019,12:33,3746,2.236936
0,90009,90009,Run,256,Running - 5 mph (12 min/mile),4.499988,1807000,True,False,2019-03-05T10:03:23.000Z,20216333806,Run,04/03/2019,18:10,4547,2.79617


# Step 4: Export Daily Summaries <a class="anchor" id="summary-bullet"></a>

Here we will export the summary of each day. This inlcudes minutes of activity, steps, calories and heart rate. 

In [73]:
dict_vars = ('Cardio (mins at HR)', 'Fat Burn (mins at HR)', 'Out of Range/low (mins at HR)', \
             'Peak (mins at HR)', 'caloriesOut', 'fairlyActiveMinutes', 'lightlyActiveMinutes', \
             'restingHeartRate', 'sedentaryMinutes', 'steps', 'veryActiveMinutes')

In [77]:
# If the analysis has been run before, we only need to run from the last date
if os.path.exists('./Outputs/daily_summary.csv'):
    ds_comp = pd.read_csv('./Outputs/daily_summary.csv')
    lastdate = ds_comp.iloc[-1]['date'][0:10]
else:
    lastdate = "2015/01/01" # define the start date

d1 = date(int(lastdate[6:10]), int(lastdate[3:5]), int(lastdate[0:2])) 
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ds_dates_list = []
for i in range(delta.days + 1): ds_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

Range of dates selected: 8 days, 0:00:00 from 2019-02-24 to 2019-03-04


In [78]:
daily_df = pd.DataFrame()

for d in ds_dates_list:
    summary = auth2_client.activities(date=d)
    daily_sum = {k: summary['summary'][k] for k in dict_vars if k in summary['summary'].keys()}
    daily_sum2 = {k: 0 for k in dict_vars if k not in summary['summary'].keys()}
    daily_sum.update(daily_sum2)

    if 'heartRateZones' in summary['summary'].keys():
        daily_sum['Out of Range/low (mins at HR)'] = summary['summary']['heartRateZones'][0]['minutes']
        daily_sum['Fat Burn (mins at HR)'] = summary['summary']['heartRateZones'][1]['minutes']
        daily_sum['Cardio (mins at HR)'] = summary['summary']['heartRateZones'][2]['minutes']
        daily_sum['Peak (mins at HR)'] = summary['summary']['heartRateZones'][3]['minutes']
    
    daily_sum_df = pd.DataFrame.from_records(daily_sum, index=[0])
    daily_df = daily_df.append(daily_sum_df, sort=True)
    
daily_df = daily_df.fillna(0)

In [81]:
daily_df = daily_df.reindex(columns=dict_vars)
daily_df['date'] = [x.strftime('%m/%d/%Y') for x in ds_dates_list]

Inspect and export the daily summaries:

In [83]:
with open('./Outputs/daily_summary.csv', 'a') as f:
    f.write('\n') 
    daily_df.to_csv(f, header=False, index=False)

In [82]:
daily_df.head()

Unnamed: 0,Cardio (mins at HR),Fat Burn (mins at HR),Out of Range/low (mins at HR),Peak (mins at HR),caloriesOut,fairlyActiveMinutes,lightlyActiveMinutes,restingHeartRate,sedentaryMinutes,steps,veryActiveMinutes,date
0,16,52,509,0,1892,12,75,60,1275,13451,78,02/24/2019
0,0,28,1410,0,1875,21,99,59,1242,13231,78,02/25/2019
0,0,54,1376,0,1928,14,99,59,787,14875,86,02/26/2019
0,0,51,1171,0,1737,9,136,59,734,10209,16,02/27/2019
0,10,136,1154,0,2265,43,156,59,692,21923,94,02/28/2019


# Step 5: Export Sleep Summary Data <a class="anchor" id="sleep-bullet"></a>

Similarly we can export the sleep data which shows the hours in bed and hours asleep each day. 

In [67]:
# If the analysis has been run before, we only need to run from the last date
if os.path.exists('./Outputs/sleep_summary.csv'):
    ss_comp = pd.read_csv('./Outputs/sleep_summary.csv')
    lastdate= ss_comp.iloc[-1]['Date'][0:10]
else:
    lastdate = "2015/01/01" # define the start date

print(lastdate)    

d1 = date(int(lastdate[6:10]), int(lastdate[3:5]), int(lastdate[0:2]))     # start date
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ss_dates_list = []
for i in range(delta.days + 1): ss_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

26/02/2019
Range of dates selected: 6 days, 0:00:00 from 2019-02-26 to 2019-03-04


In [68]:
sleep_df = pd.DataFrame()

for d in ss_dates_list:
    fit_statsSl = auth2_client.sleep(date=d)
    stime_list = []
    sval_list = []

    if len(fit_statsSl['sleep']) != 0:
    
        for i in fit_statsSl['sleep'][0]['minuteData']:
            stime_list.append(i['dateTime'])
            sval_list.append(i['value'])
            
        #Calculate the sleep summary
        dict_sum = {
            'Date' : d, 
            'Time in bed (mins)' : len(stime_list), 
            'Time asleep (mins)' : (sval_list.count('1')),
            'Time awake (mins)' : (sval_list.count('2')),
            'Time very awake (mins)' : (sval_list.count('3')),
            'Bedtime (mins)' : (stime_list[0]),
            'Wake up (mins)' : (stime_list[-1]),
            'Total time' : "{:.2f}".format(((datetime.datetime.strptime(stime_list[-1],'%H:%M:%S') - \
                            datetime.datetime.strptime(stime_list[0],'%H:%M:%S')).total_seconds())/(60*60))
        }

        sleep_sum_df = pd.DataFrame.from_records(dict_sum, index=[0])
        sleep_df = sleep_df.append(sleep_sum_df)
    
    else:
        print('\tNo sleep data for ' + d.strftime('%Y-%m-%d'))

	No sleep data for 2019-03-04


In [69]:
sleep_df['Date'] = [x.strftime('%m/%d/%Y') for x in sleep_df['Date']]

In [71]:
with open('./Outputs/sleep_summary.csv', 'a') as f:
    f.write('\n') 
    sleep_df.to_csv(f, header=False)