# Exporting Fitbit Data

### Katriona Goldmann

This script accesses the Fitbit API and exports the data to csv files. It is implemented using the Fitbit API Python client from [Brad Pitcher's python-fitbit repo](https://github.com/orcasgit/python-fitbit).

-----

## Outline:
* [Step 1: Access the fitbit API](#first-bullet)
* [Step 2: Extract the Data](#second-bullet)
* [Step 3: Export Exercise Data](#exercise-bullet)
* [Step 4: Export Daily Summaries](#summary-bullet)
* [Step 5: Export the sleep summary data](#sleep-bullet)
* [Step 6: Export the sleep minute-by-minute data](#sleep-min-bullet) [Unnecessary for this analysis]
* [Step 7: Export data minute by minute](#minutes-bullet) [Unnecessary for this analysis]

I have automated the script to replot the data every other day using [launchd](https://medium.com/@chetcorcos/a-simple-launchd-tutorial-9fecfcf2dbb3) on my laptop. A copy of the launchd script, and of the bash script it executes (fitbit-update.sh), is also in this repo.

-----
## Step 1: Access the fitbit API <a class="anchor" id="first-bullet"></a> 

Import the necessary packages and their use cases for this project:
   
>fitbit: access fitbit data <br>
>gather_keys_oauth2: authorize fitbit access <br>
>pandas: data frames and data manipulation <br>
>numpy: summary statistics <br>
>datetime: turn the dates into datetime objects / get day of week 

To install the fitbit module, run (over https, since GitHub no longer serves the `git://` protocol):
> $ pip install git+https://github.com/orcasgit/python-fitbit

In [15]:
import fitbit
import pandas as pd
import numpy as np
import os
import datetime
from datetime import date, timedelta
import cherrypy
import sys
import gather_keys_oauth2 as Oauth2

I have saved my API keys in a csv file located within this directory. You will need to enter your unique client ID and secret as described in step 1 [here](https://towardsdatascience.com/collect-your-own-fitbit-data-with-python-ff145fa10873):

In [16]:
df = pd.read_csv('./Inputs/api_key.csv')

CLIENT_ID = df.iat[0,0]
CLIENT_SECRET = df.iat[0,1]

#cherrypy.config.update({'server.socket_host': '127.0.0.1'})
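
The cell above reads the client ID and secret from the first row of the csv by position. As a minimal sketch, assuming a hypothetical header row `client_id,client_secret` (the column names are not given in this notebook, and only the column order matters to `df.iat`), the file could be laid out and read like this:

```python
import io
import pandas as pd

# Hypothetical contents of ./Inputs/api_key.csv. The header names are an
# assumption and the values are placeholders, not real credentials.
example_csv = io.StringIO("client_id,client_secret\nABC123,0123456789abcdef\n")

# dtype=str keeps an all-digit ID or secret from being parsed as a number.
keys = pd.read_csv(example_csv, dtype=str)

# Same positional lookup as the notebook: row 0, columns 0 and 1.
client_id = keys.iat[0, 0]
client_secret = keys.iat[0, 1]
```

Keeping this file out of version control (for example via `.gitignore`) avoids publishing the secret alongside the notebook.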

#### Access the API

In [17]:
server = Oauth2.OAuth2Server(CLIENT_ID, CLIENT_SECRET)

**The next cell will open a new browser tab for authentication**; you may need to log in and approve access. The tab can be closed once authorized.

In [18]:
server.browser_authorize()

[18/Feb/2019:17:41:30] ENGINE Listening for SIGTERM.
[18/Feb/2019:17:41:30] ENGINE Listening for SIGHUP.
[18/Feb/2019:17:41:30] ENGINE Listening for SIGUSR1.
[18/Feb/2019:17:41:30] ENGINE Bus STARTING
CherryPy Checker:
The Application mounted at '' has an empty config.

[18/Feb/2019:17:41:30] ENGINE Started monitor thread 'Autoreloader'.
[18/Feb/2019:17:41:30] ENGINE Serving on http://127.0.0.1:8080
[18/Feb/2019:17:41:30] ENGINE Bus STARTED
[18/Feb/2019:17:41:31] ENGINE Error in background task thread function <bound method Autoreloader.run of <cherrypy.process.plugins.Autoreloader object at 0x11b02f748>>.
Traceback (most recent call last):
  File "/Applications/anaconda3/lib/python3.7/site-packages/cherrypy/process/plugins.py", line 517, in run
    self.function(*self.args, **self.kwargs)
  File "/Applications/anaconda3/lib/python3.7/site-packages/cherrypy/process/plugins.py", line 669, in run
    for filename in self.sysfiles() | self.files:
  File "/Applications/anaconda3/lib/python

127.0.0.1 - - [18/Feb/2019:17:41:32] "GET /?code=ccbf5264e75a374216580fc627360fe4ed8401a3&state=K9KxkRANCcON7AaYLu7j3QRs5HBHKc HTTP/1.1" 200 122 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.3 Safari/605.1.15"


[18/Feb/2019:17:41:33] ENGINE Bus STOPPING
[18/Feb/2019:17:41:38] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8080)) shut down
[18/Feb/2019:17:41:38] ENGINE Stopped thread 'Autoreloader'.
[18/Feb/2019:17:41:38] ENGINE Bus STOPPED
[18/Feb/2019:17:41:38] ENGINE Bus EXITING
[18/Feb/2019:17:41:38] ENGINE Bus EXITED
[18/Feb/2019:17:41:38] ENGINE Waiting for child threads to terminate...


In [19]:
ACCESS_TOKEN = str(server.fitbit.client.session.token['access_token'])
REFRESH_TOKEN = str(server.fitbit.client.session.token['refresh_token'])

In [20]:
auth2_client = fitbit.Fitbit(CLIENT_ID, CLIENT_SECRET, oauth2=True, access_token=ACCESS_TOKEN, refresh_token=REFRESH_TOKEN)


-----

## Step 2: Extract the Data <a class="anchor" id="second-bullet"></a>

The details and documentation for the API functions can be found [here](https://python-fitbit.readthedocs.io/en/latest/).

### i) Create range of dates to extract and analyse:

Select the date range by setting d1 and d2. The current setting sets d1 as the day I got the fitbit (a Christmas present 🎄🎅🏻🎁) and d2 as yesterday. We put in yesterday since today's data is incomplete. 

Then create a list of dates to be analysed. 

In [22]:
time_list, date_list = [], []

d1 = date(2018, 12, 25)  # start date
d2 = datetime.datetime.today().date() - timedelta(1) # take off one day since not complete
delta = (d2) - (d1) 
print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

dates_list = []
for i in range(delta.days + 1): dates_list.append(d1 + timedelta(i))

Range of dates selected: 54 days, 0:00:00 from 2018-12-25 to 2019-02-17
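
The same inclusive list of dates can be built with a small helper, which is convenient because the later steps repeat this pattern. This is only a sketch; the function name `date_range` is hypothetical and not part of the notebook:

```python
from datetime import date, timedelta

def date_range(start, end):
    """Return every date from start to end, inclusive (empty if end < start)."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Four consecutive days starting on the notebook's start date
example = date_range(date(2018, 12, 25), date(2018, 12, 28))
```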


### ii) Export the user data

In [23]:
user_info = auth2_client.user_profile_get()["user"]

In [24]:
print("\nAverage daily steps: ", user_info["averageDailySteps"], 
      "\nstride length: ", user_info["strideLengthWalking"]*0.0254, 'm',
      "\nrunning stride length: ", user_info["strideLengthRunning"]*0.0254, 'm')


Average daily steps:  16843 
stride length:  0.648 m 
running stride length:  0.9530000000000001 m
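
The factor 0.0254 is the number of metres in an inch, so the cell above appears to assume that the profile reports stride lengths in inches. A small sketch of that conversion, with rounding to avoid output such as `0.9530000000000001`, might look like the following. The helper name and the input value are hypothetical:

```python
INCH_TO_M = 0.0254  # metres per inch

def inches_to_metres(value_in, ndigits=3):
    """Convert a length in inches to metres, rounded to ndigits decimals."""
    return round(value_in * INCH_TO_M, ndigits)

# Made-up stride length in inches, for illustration only
stride_m = inches_to_metres(37.52)
```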


-----

## Step 3: Export Exercise Data <a class="anchor" id="exercise-bullet"></a>


Extract the exercise log so that it can be analysed further down the line. This way we can monitor exercise frequency and check for improvements or changes. But first, let's check which dates we still need to run:

In [25]:
ex_comp = pd.read_csv('./Outputs/exercise.csv')
temps = ex_comp.iloc[-1]['lastModified'][0:10]

d1 = date(int(temps[0:4]), int(temps[5:7]), int(temps[8:10]))    # start date
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ex_dates_list = []
for i in range(delta.days + 1): ex_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

Range of dates selected: 2 days, 0:00:00 from 2019-02-15 to 2019-02-17
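
Steps 3 to 5 all resume from the last date stored in an existing csv. A sketch of that logic using `strptime` instead of string slicing is shown below. The helper name `dates_to_export` is hypothetical, and the format string has to match how each csv stores its dates:

```python
from datetime import date, datetime, timedelta

def dates_to_export(last_stamp, fmt, until):
    """List the dates from the last exported day up to `until`, inclusive."""
    start = datetime.strptime(last_stamp, fmt).date()
    return [start + timedelta(days=i) for i in range((until - start).days + 1)]

# ISO-style stamp, like the first 10 characters of 'lastModified'
iso = dates_to_export("2019-02-15", "%Y-%m-%d", date(2019, 2, 17))
```

Like the cell above, this includes the last exported day itself, so that day is fetched again on the next run.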


In [26]:
ex_df = pd.DataFrame()

for d in ex_dates_list :
    summary = auth2_client.activities(date=d)
    exercise = pd.DataFrame.from_records(summary['activities'])
    # DataFrame.append was removed in pandas 2.0; pd.concat works in old and new versions
    ex_df = pd.concat([ex_df, exercise], sort=True)

Inspect the data frame and export it to a csv: 

In [27]:
with open('./Outputs/exercise.csv', 'a') as f:
    f.write('\n') 
    ex_df.to_csv(f, header=False)

In [28]:
ex_df

Unnamed: 0,activityId,activityParentId,activityParentName,calories,description,distance,duration,hasStartTime,isFavorite,lastModified,logId,name,startDate,startTime,steps
0,90009,90009,Run,133,Running - 5 mph (12 min/mile),1.491291,931000,True,False,2019-02-15T19:08:39.000Z,19895844014,Run,2019-02-15,18:51,2439
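
When appending to an existing csv, one option is to let pandas open the file in append mode and write the header only if the file is new. The following is a sketch rather than the notebook's method: the helper `append_csv` is hypothetical, the demonstration rows are illustrative, and unlike the cell above it drops the index column:

```python
import os
import tempfile
import pandas as pd

def append_csv(df, path):
    """Append df to path, writing the header only when the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode="a", header=is_new, index=False)

# Demonstration in a temporary directory with illustrative rows
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "exercise.csv")
    append_csv(pd.DataFrame({"name": ["Run"], "steps": [2439]}), demo_path)
    append_csv(pd.DataFrame({"name": ["Walk"], "steps": [1000]}), demo_path)
    combined = pd.read_csv(demo_path)
```

This only works if every appended frame has the same columns in the same order as the file.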


-----

## Step 4: Export Daily Summaries <a class="anchor" id="summary-bullet"></a>

Here we will export the summary of each day. This includes minutes of activity, steps, calories and heart rate.

In [31]:
dict_vars = ('sedentaryMinutes', 'lightlyActiveMinutes', 'fairlyActiveMinutes', 'veryActiveMinutes', \
             'steps', 'caloriesOut', 'restingHeartRate')

In [32]:
ds_comp = pd.read_csv('./Outputs/daily_summary.csv')
temps = ds_comp.iloc[-1]['date'][0:10]

d1 = date(int(temps[6:10]), int(temps[3:5]), int(temps[0:2]))     # start date
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ds_dates_list = []
for i in range(delta.days + 1): ds_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

Range of dates selected: 1 day, 0:00:00 from 2019-02-17 to 2019-02-18


In [58]:
daily_df = pd.DataFrame()

for d in ds_dates_list:
    summary = auth2_client.activities(date=d)
    daily_sum = {k: summary['summary'][k] for k in dict_vars if k in summary['summary'].keys()}
    daily_sum2 = {k: 0 for k in dict_vars if k not in summary['summary'].keys()}
    daily_sum.update(daily_sum2)

    if 'heartRateZones' in summary['summary'].keys():
        daily_sum['Out of Range/low (mins at HR)'] = summary['summary']['heartRateZones'][0]['minutes']
        daily_sum['Fat Burn (mins at HR)'] = summary['summary']['heartRateZones'][1]['minutes']
        daily_sum['Cardio (mins at HR)'] = summary['summary']['heartRateZones'][2]['minutes']
        daily_sum['Peak (mins at HR)'] = summary['summary']['heartRateZones'][3]['minutes']
    
    daily_sum_df = pd.DataFrame.from_records(daily_sum, index=[0])
    # DataFrame.append was removed in pandas 2.0; pd.concat works in old and new versions
    daily_df = pd.concat([daily_df, daily_sum_df], sort=True)
    
daily_df['date'] = ds_dates_list
daily_df = daily_df.fillna(0)

In [63]:
daily_df = daily_df.reindex(columns=dict_vars)
daily_df

Unnamed: 0,sedentaryMinutes,lightlyActiveMinutes,fairlyActiveMinutes,veryActiveMinutes,steps,caloriesOut,restingHeartRate
0,1440,0,0,0,0,1227,0
0,1081,0,0,0,0,921,0


Inspect and export the daily summaries:

In [None]:
with open('./Outputs/daily_summary.csv', 'a') as f:
    f.write('\n') 
    daily_df.to_csv(f, header=False)

In [33]:
daily_df.head()

Unnamed: 0,Cardio (mins at HR),Fat Burn (mins at HR),Out of Range/low (mins at HR),Peak (mins at HR),caloriesOut,fairlyActiveMinutes,lightlyActiveMinutes,restingHeartRate,sedentaryMinutes,steps,veryActiveMinutes,date
0,0.0,0.0,244.0,0.0,1279,0,26,63.0,1414,366,0,2019-02-11
0,13.0,125.0,762.0,14.0,2280,16,160,63.0,1170,20719,94,2019-02-12
0,18.0,87.0,874.0,0.0,1995,18,90,62.0,1241,16162,91,2019-02-13
0,0.0,109.0,1263.0,0.0,2000,28,128,62.0,628,16191,81,2019-02-14
0,9.0,113.0,864.0,0.0,2095,24,111,62.0,1200,18302,105,2019-02-15
0,14.0,88.0,917.0,10.0,1863,14,126,63.0,803,12844,51,2019-02-16
0,0.0,0.0,0.0,0.0,1227,0,0,0.0,1440,0,0,2019-02-17
0,0.0,0.0,0.0,0.0,832,0,0,0.0,977,0,0,2019-02-18
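
The per-day flattening can also be written as a small function that fills missing keys with 0 and labels the heart rate zones by position, as the loop above does. This is a sketch under assumptions: the function name is hypothetical, and the sample payload is made up to mirror only the fields the notebook indexes:

```python
def flatten_summary(summary, keys, zone_labels):
    """Flatten one day's summary dict into a single row, defaulting to 0."""
    row = {k: summary.get(k, 0) for k in keys}
    # Zones are matched by position, in the same order the notebook assumes.
    for label, zone in zip(zone_labels, summary.get("heartRateZones", [])):
        row[label + " (mins at HR)"] = zone.get("minutes", 0)
    return row

keys = ("steps", "caloriesOut", "restingHeartRate")
zone_labels = ("Out of Range/low", "Fat Burn", "Cardio", "Peak")

# Made-up payload shaped like summary['summary']; 'restingHeartRate' is absent
sample = {
    "steps": 366,
    "caloriesOut": 1279,
    "heartRateZones": [{"minutes": 244}, {"minutes": 0}, {"minutes": 0}, {"minutes": 0}],
}
row = flatten_summary(sample, keys, zone_labels)
```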


-----

## Step 5: Export Sleep Summary Data <a class="anchor" id="sleep-bullet"></a>

Similarly, we can export the sleep data, which shows the time in bed and the time asleep each day.

In [20]:
ss_comp = pd.read_csv('./Outputs/sleep_summary.csv')
ss_comp.head()

temps = ss_comp.iloc[-1]['Date'][0:10]

d1 = date(int(temps[0:4]), int(temps[5:7]), int(temps[8:10]))    # start date
d2 = datetime.datetime.today().date() - timedelta(1) 
delta = (d2) - (d1) 

ss_dates_list = []
for i in range(delta.days + 1): ss_dates_list.append(d1 + timedelta(i))

print('Range of dates selected:', delta, 'from',  d1, 'to', d2)

Range of dates selected: 10 days, 0:00:00 from 2019-02-08 to 2019-02-18


In [21]:
sleep_df = pd.DataFrame()

for d in ss_dates_list:
    fit_statsSl = auth2_client.sleep(date=d)
    stime_list = []
    sval_list = []

    if len(fit_statsSl['sleep']) != 0:
    
        for i in fit_statsSl['sleep'][0]['minuteData']:
            stime_list.append(i['dateTime'])
            sval_list.append(i['value'])
            
        #Calculate the sleep summary
        dict_sum = {
            'Date' : d, 
            'Time in bed (mins)' : len(stime_list), 
            'Time asleep (mins)' : (sval_list.count('1')),
            'Time awake (mins)' : (sval_list.count('2')),
            'Time very awake (mins)' : (sval_list.count('3')),
            'Bedtime (mins)' : (stime_list[0]),
            'Wake up (mins)' : (stime_list[-1]),
            'Total time' : "{:.2f}".format(((datetime.datetime.strptime(stime_list[-1],'%H:%M:%S') - \
                            datetime.datetime.strptime(stime_list[0],'%H:%M:%S')).total_seconds())/(60*60))
        }

        sleep_sum_df = pd.DataFrame.from_records(dict_sum, index=[0])
        # DataFrame.append was removed in pandas 2.0; pd.concat works in old and new versions
        sleep_df = pd.concat([sleep_df, sleep_sum_df])
    
    else:
        print('\tNo sleep data for ' + d.strftime('%Y-%m-%d'))

	No sleep data for 2019-02-09
	No sleep data for 2019-02-11
	No sleep data for 2019-02-12
	No sleep data for 2019-02-13
	No sleep data for 2019-02-15
	No sleep data for 2019-02-17
	No sleep data for 2019-02-18


In [22]:
with open('./Outputs/sleep_summary.csv', 'a') as f:
    sleep_df.to_csv(f, header=False)

In [23]:
sleep_df.head()

Unnamed: 0,Bedtime (mins),Date,Time asleep (mins),Time awake (mins),Time in bed (mins),Time very awake (mins),Total time,Wake up (mins)
0,23:01:30,2019-02-08,502,9,511,0,-15.5,07:31:30
0,01:15:00,2019-02-10,442,17,460,1,7.65,08:54:00
0,22:05:00,2019-02-14,555,17,575,3,-14.43,07:39:00
0,00:45:30,2019-02-16,437,9,446,0,7.42,08:10:30
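
The negative 'Total time' values above (for example -15.5) occur when bedtime falls before midnight, because the two times are parsed without a date and the wake-up time then looks earlier than the bedtime. One possible correction, sketched here with a hypothetical helper and assuming a night never lasts 24 hours or more, is to add a day whenever the wake-up time is earlier:

```python
from datetime import datetime, timedelta

def hours_in_bed(bedtime, wake_up, fmt="%H:%M:%S"):
    """Hours between two clock times, allowing for a night that spans midnight."""
    start = datetime.strptime(bedtime, fmt)
    end = datetime.strptime(wake_up, fmt)
    if end < start:
        # Wake-up is on the following calendar day
        end += timedelta(days=1)
    return round((end - start).total_seconds() / 3600, 2)

before_midnight = hours_in_bed("23:01:30", "07:31:30")
after_midnight = hours_in_bed("01:15:00", "08:54:00")
```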


-----

## Step 6: Export Sleep minute-by-minute Data <a class="anchor" id="sleep-min-bullet"></a>

The minute-by-minute sleep data is taken from the minuteData field returned by auth2_client.sleep(), the same call used for the sleep summary in Step 5.

To avoid requesting the same data from the API over and over again, we only export dates which have not previously been extracted. Requesting too many days in one run can lead to errors, so if you have a big gap it might be worth limiting the date range to a week or so.
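
One way to keep each run to about a week is to split the outstanding dates into batches and process one batch at a time. The helper below is a hypothetical sketch, not part of the notebook:

```python
from datetime import date, timedelta

def chunk(items, size=7):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Sixteen made-up outstanding days, split into week-long batches
days = [date(2019, 1, 1) + timedelta(days=i) for i in range(16)]
batches = chunk(days)
```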

In [None]:
run_min_sleep = False


if run_min_sleep:
    # Gather the dates we have exported so far
    file_list = os.listdir("./Outputs/sleep/")
    file_list = [f for f in file_list if f.endswith('.csv')]
    file_list = [w.replace('sleep_', '').replace('.csv', '') for w in file_list]
    file_list = [datetime.datetime.strptime(d, '%Y-%m-%d').date() for d in file_list]

    # Find the dates we still need to run
    dl = [d for d in dates_list if d not in file_list]
    print("sleep: running " + str(len(dl)) + " day(s)")

    for d in dl:
        fit_statsSl = auth2_client.sleep(date=d)
        stime_list = []
        sval_list = []

        if len(fit_statsSl['sleep']) != 0:

            # Collect the time stamp and sleep state for every minute
            for i in fit_statsSl['sleep'][0]['minuteData']:
                stime_list.append(i['dateTime'])
                sval_list.append(i['value'])

            sleepdf = pd.DataFrame({'State': sval_list, 'Time': stime_list})

            sleepdf['Interpreted'] = sleepdf['State'].map({'2': 'Awake', '3': 'Very Awake', '1': 'Asleep'})

            sleepdf.to_csv('./Outputs/sleep/' + 'sleep_' + d.strftime('%Y-%m-%d') + '.csv',
                           columns=['Time', 'State', 'Interpreted'], header=True,
                           index=False)
        else:
            print('\tNo sleep data for ' + d.strftime('%Y-%m-%d'))
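
The bookkeeping that decides which days still need exporting can be isolated into a helper and tried without touching the API. This sketch assumes the `sleep_YYYY-MM-DD.csv` naming used above; the function name is hypothetical, and it uses a set and prefix slicing rather than the chained `replace` calls in the cell:

```python
import os
import tempfile
from datetime import date, datetime

def exported_dates(folder, prefix):
    """Return the set of dates that already have a '<prefix>YYYY-MM-DD.csv' file."""
    found = set()
    for name in os.listdir(folder):
        if name.startswith(prefix) and name.endswith(".csv"):
            stamp = name[len(prefix):-len(".csv")]
            found.add(datetime.strptime(stamp, "%Y-%m-%d").date())
    return found

# Demonstration with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for stamp in ("2019-02-08", "2019-02-10"):
        open(os.path.join(tmp, "sleep_" + stamp + ".csv"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()  # ignored: not a csv
    done = exported_dates(tmp, "sleep_")

wanted = [date(2019, 2, 8), date(2019, 2, 9), date(2019, 2, 10)]
todo = [d for d in wanted if d not in done]
```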

-----

## Step 7: Extract Daily Data Minute by Minute: <a class="anchor" id="minutes-bullet"></a>

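As in Step 6, but for the other variables: each one is fetched at one-minute resolution with auth2_client.intraday_time_series() and written to one csv file per day.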

In [None]:
activities = ['heart', 'calories', 'steps', 'distance', 'floors', 'elevation', \
              'minutesSedentary', 'minutesLightlyActive', 'minutesFairlyActive', 'minutesVeryActive']

In [None]:
for act in activities:
    if not os.path.exists('./Outputs/'+ act):
        os.makedirs('./Outputs/'+ act)

In [None]:
run_daily_min = False


if run_daily_min:
    for var in activities:

        # Skip days that have already been exported for this variable
        file_list = os.listdir("./Outputs/" + var + "/")
        file_list = [f for f in file_list if f.endswith('.csv')]
        file_list = [w.replace(var + '_', '').replace('.csv', '') for w in file_list]
        file_list = [datetime.datetime.strptime(d, '%Y-%m-%d').date() for d in file_list]

        # Find the dates we still need to run
        dl = [d for d in dates_list if d not in file_list]
        print(var + ": running " + str(len(dl)) + " day(s)")

        # The loop variable is `day` so the imported `date` class is not shadowed
        for day in dl:
            stats2 = auth2_client.intraday_time_series('activities/' + var, base_date=day, detail_level='1min')

            # Start from empty lists for every day so each file holds one day only
            val_list, date_list, time_list = [], [], []
            for i in stats2['activities-' + var + '-intraday']['dataset']:
                val_list.append(i['value'])
                date_list.append(day)
                time_list.append(i['time'])

            # Export the data to a csv file
            df = pd.DataFrame({var: val_list, 'Date': date_list, 'Time': time_list})
            df.to_csv('./Outputs/' + var + '/' + var + '_' + str(day) + '.csv',
                      columns=['Date', 'Time', var], header=True,
                      index=False)
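
The conversion from one intraday response to a per-day table can be sketched without calling the API. The payload below is made up and mirrors only the keys indexed in the cell above (`activities-<var>-intraday`, `dataset`, `time`, `value`); the helper name is hypothetical:

```python
from datetime import date
import pandas as pd

def intraday_frame(response, var, day):
    """Turn one intraday response into a Date/Time/<var> data frame."""
    dataset = response["activities-" + var + "-intraday"]["dataset"]
    frame = pd.DataFrame(dataset, columns=["time", "value"])
    frame = frame.rename(columns={"time": "Time", "value": var})
    frame.insert(0, "Date", day)  # same date repeated on every row
    return frame[["Date", "Time", var]]

# Made-up two-minute sample in the shape the notebook expects
sample = {
    "activities-steps-intraday": {
        "dataset": [
            {"time": "00:00:00", "value": 0},
            {"time": "00:01:00", "value": 12},
        ]
    }
}
steps = intraday_frame(sample, "steps", date(2019, 2, 16))
```

Building the frame directly from the list of records avoids keeping three parallel lists in step with each other.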