In [2]:
# import the module itself: the functions below call datetime.datetime and datetime.timedelta
import datetime
import json
import keyring
import requests

import fitbit
import gather_keys_oauth2 as Oauth2
import pandas as pd 
from pprint import pprint

First we have to set up authorization for the Fitbit API. The instructions below show you how to set up Fitbit so that you can connect to their API.

https://towardsdatascience.com/collect-your-own-fitbit-data-with-python-ff145fa10873

At the moment, the first chunk of the code in this notebook is copied directly from this post.


When the directions mention secrets and keys, you'll notice that the code in this notebook stores the key and secret using the keyring library. This library helps you manage your keys and IDs (so that if you share your code, you don't share your credentials!).

Here's a great link on how/why to use the keyring library.

https://alexwlchan.net/2016/11/you-should-use-keyring/
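
The cells below read the credentials back with `keyring.get_password("fitbit", "key")` and `keyring.get_password("fitbit", "secret")`, so they have to be stored once beforehand. Here is a minimal sketch of that one-time step. `store_fitbit_credentials` and `InMemoryBackend` are hypothetical names made up for this example, not part of keyring; the helper accepts any object with a `set_password(service, username, password)` method so it can be tried without touching a real credential store.

```python
def store_fitbit_credentials(backend, client_id, client_secret):
    """Save the Fitbit app credentials under the names this notebook reads back."""
    backend.set_password("fitbit", "key", client_id)
    backend.set_password("fitbit", "secret", client_secret)


class InMemoryBackend:
    """Stand-in for the keyring module, used here only for demonstration."""

    def __init__(self):
        self.store = {}

    def set_password(self, service, username, password):
        self.store[(service, username)] = password

    def get_password(self, service, username):
        return self.store.get((service, username))


demo = InMemoryBackend()
store_fitbit_credentials(demo, "my-client-id", "my-client-secret")
print(demo.get_password("fitbit", "key"))
```

With the real library, passing the `keyring` module itself as `backend` should work, since `keyring.set_password` takes the same three arguments. Run it once outside the notebook so the values never appear in shared code.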

One last thing to note is that while we import the fitbit library, we only use it for authentication. In other words, we stop following the instructions after step two. Why? The fitbit Python library calls the Fitbit API in units of one day, and the Fitbit API limits a single user's calls to 150 per hour. If we used the library for everything, we could grab only about 150 days (roughly 5 months) of data per hour.

Instead, we're going to create some functions that interact directly with the fitbit API so that we can grab a range of days' worth of data at a time.
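
To make the trade-off concrete, here is a small back-of-the-envelope calculation. It only uses the numbers stated above (150 calls per hour, one day per call); the 30-day range size is an assumption that matches the windows used later in this notebook.

```python
import math

CALLS_PER_HOUR = 150               # per-user hourly limit stated above
DAYS_PER_CALL = 1                  # the fitbit library requests one day per call
AVG_DAYS_PER_MONTH = 365.25 / 12

# One day per call: how much history fits into one hour of calls?
days_per_hour = CALLS_PER_HOUR * DAYS_PER_CALL
months_per_hour = days_per_hour / AVG_DAYS_PER_MONTH
print(f"{days_per_hour} days per hour, about {months_per_hour:.1f} months")

# Range requests: how many calls does a full year take at 30 days per call?
RANGE_DAYS = 30                    # assumed window size
calls_for_one_year = math.ceil(365 / RANGE_DAYS)
print(f"{calls_for_one_year} range calls cover one year")
```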

To be clear, sometimes you might want to get a single day's worth of data, but for this analysis, I'm more interested in trends across days than within days.


In [3]:
CLIENT_ID = keyring.get_password("fitbit", "key")
CLIENT_SECRET = keyring.get_password("fitbit", "secret")

server = Oauth2.OAuth2Server(CLIENT_ID, CLIENT_SECRET)
server.browser_authorize()
ACCESS_TOKEN = str(server.fitbit.client.session.token['access_token'])
REFRESH_TOKEN = str(server.fitbit.client.session.token['refresh_token'])
auth2_client = fitbit.Fitbit(CLIENT_ID, CLIENT_SECRET, oauth2=True, access_token=ACCESS_TOKEN, refresh_token=REFRESH_TOKEN)

[19/May/2018:10:00:40] ENGINE Listening for SIGTERM.
[19/May/2018:10:00:40] ENGINE Bus STARTING
[19/May/2018:10:00:40] ENGINE Set handler for console events.
CherryPy Checker:
The Application mounted at '' has an empty config.

[19/May/2018:10:00:40] ENGINE Started monitor thread 'Autoreloader'.
[19/May/2018:10:00:40] ENGINE Serving on http://127.0.0.1:8080
[19/May/2018:10:00:40] ENGINE Bus STARTED


127.0.0.1 - - [19/May/2018:10:00:43] "GET /?code=9701acc93bf4e73eaf7baf0c0e4435d436abc71f&state=xnX4iDzafsaY727iEIEgfu0jIIw3vJ HTTP/1.1" 200 122 "" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"


[19/May/2018:10:00:44] ENGINE Bus STOPPING
[19/May/2018:10:00:44] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8080)) shut down
[19/May/2018:10:00:44] ENGINE Stopped thread 'Autoreloader'.
[19/May/2018:10:00:44] ENGINE Removed handler for console events.
[19/May/2018:10:00:44] ENGINE Bus STOPPED
[19/May/2018:10:00:44] ENGINE Bus EXITING
[19/May/2018:10:00:44] ENGINE Waiting for child threads to terminate...
[19/May/2018:10:00:44] ENGINE Bus EXITED
[19/May/2018:10:00:44] ENGINE Waiting for thread Thread-19.


Now we're authorized to pull data from the Fitbit API. If you've never interacted with an API before, that won't keep you from moving forward with this analysis. Here are the only pieces of information you'll need:

1. APIs are tools that organizations provide so that your program can connect directly to their data. It's how we request data without going through a user interface. In Fitbit's case, the API provides more complete and detailed access to your data than the UI download interface does.

2. APIs let you 'get', 'post', 'delete', and 'patch' (edit) data. We'll only 'get' data, using the Python 'requests' library.

3. Well-designed APIs use consistent URL formats to structure API calls. Getting, posting (etc.) data involves:
    1. using the correct verb from the requests library (get, post, etc.)
    2. structuring the text of the URL to meet the pattern that the API in question uses.


Here's an example URL from the Fitbit API:
 "https://api.fitbit.com/1.2/user/-/sleep/date/2018-04-02/2018-04-08.json"
 
This breaks down into the following pattern:

"https://api.fitbit.com/1.2/user/-/" + endpoint + "/date/" + start_date + "/" + end_date + ".json"

We'll use this to build a generic function that takes the endpoint name, start_date, and end_date.

In [4]:
# getEndpointData is a generic function that lets us retrieve data from any fitbit api endpoint we want
def getEndpointData(endpoint, start_date, end_date):
    # At some point, we should insert some defensive coding here to make sure that the start_date and 
    # end_date are provided in the proper format (YYYY-MM-DD e.g. '2018-04-28'). For now, we'll 
    # leave it to the user to know the correct format
    
    url = "https://api.fitbit.com/1.2/user/-/" + endpoint + "/date/" + start_date + "/" + end_date + ".json"
    results = requests.get(url = url, headers={'Authorization':'Bearer ' + ACCESS_TOKEN})
    if results.status_code == 200:
        activity_data = json.loads(results.text)
        return activity_data
    else:
        print(results.text)
        return "ERROR"
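
The comment at the top of getEndpointData mentions adding defensive checks on the date format. One possible sketch of that check is below. `validate_date` is a hypothetical helper, not something the notebook defines; it relies only on `datetime.strptime` and `strftime` from the standard library.

```python
from datetime import datetime


def validate_date(date_string):
    """Return date_string unchanged if it is a real YYYY-MM-DD date, else raise ValueError."""
    # strptime raises ValueError for the wrong layout and for impossible dates
    parsed = datetime.strptime(date_string, "%Y-%m-%d")
    # strptime accepts unpadded values such as '2018-4-2', so compare the round trip
    if parsed.strftime("%Y-%m-%d") != date_string:
        raise ValueError(f"expected YYYY-MM-DD, got {date_string!r}")
    return date_string


print(validate_date("2018-04-28"))
```

Calling it on both start_date and end_date before building the URL would turn a malformed date into an immediate, readable error instead of a failed API call.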
    

Let's also build a function that moves our initial dates backwards in time. This gives us an easy way to loop back through all of the data we have stored in Fitbit for a given endpoint.

In [5]:
# right now this is set to assume we're pulling one month at a time. That's something 
# that it will probably make sense to change in the future.

def changeDates(start_date, end_date):
    end_date = (datetime.datetime.strptime(end_date, "%Y-%m-%d") - datetime.timedelta(days=30)).strftime("%Y-%m-%d")
    start_date = (datetime.datetime.strptime(start_date, "%Y-%m-%d") - datetime.timedelta(days=30)).strftime("%Y-%m-%d")
    return start_date, end_date
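
One thing to watch for: changeDates shifts both dates by 30 days, so with the window used in the next cells (2018-03-29 to 2018-04-28) the new end date lands on the previous start date and that day is requested twice. Below is a sketch of an alternative that produces back-to-back windows with no shared days. `date_windows` is a hypothetical helper, not the notebook's method, and `date.fromisoformat` needs Python 3.7 or later.

```python
from datetime import date, timedelta


def date_windows(latest_end, window_days, n_windows):
    """Yield (start, end) date strings, newest window first, with no shared days."""
    end = date.fromisoformat(latest_end)
    for _ in range(n_windows):
        # a window of window_days inclusive days ends on `end`
        start = end - timedelta(days=window_days - 1)
        yield start.isoformat(), end.isoformat()
        # the next (older) window ends the day before this one starts
        end = start - timedelta(days=1)


windows = list(date_windows("2018-04-28", 30, 2))
for start, end in windows:
    print(start, end)
```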

Now let's try this out with some sleep data!

In [6]:
endpoint = "sleep"

end_date = "2018-04-28"
start_date = "2018-03-29"

activity_data = getEndpointData(endpoint, start_date, end_date)

In [7]:
pprint(activity_data)

{'sleep': [{'dateOfSleep': '2018-04-28',
            'duration': 33960000,
            'efficiency': 97,
            'endTime': '2018-04-28T08:29:30.000',
            'infoCode': 0,
            'levels': {'data': [{'dateTime': '2018-04-27T23:03:30.000',
                                 'level': 'wake',
                                 'seconds': 330},
                                {'dateTime': '2018-04-27T23:09:00.000',
                                 'level': 'light',
                                 'seconds': 840},
                                {'dateTime': '2018-04-27T23:23:00.000',
                                 'level': 'deep',
                                 'seconds': 2820},
                                {'dateTime': '2018-04-28T00:10:00.000',
                                 'level': 'light',
                                 'seconds': 330},
                                {'dateTime': '2018-04-28T00:15:30.000',
                                 'level': 'rem',
      

                                 'level': 'light',
                                 'seconds': 180},
                                {'dateTime': '2018-04-27T01:11:00.000',
                                 'level': 'deep',
                                 'seconds': 330},
                                {'dateTime': '2018-04-27T01:16:30.000',
                                 'level': 'light',
                                 'seconds': 1950},
                                {'dateTime': '2018-04-27T01:49:00.000',
                                 'level': 'deep',
                                 'seconds': 330},
                                {'dateTime': '2018-04-27T01:54:30.000',
                                 'level': 'light',
                                 'seconds': 1710},
                                {'dateTime': '2018-04-27T02:23:00.000',
                                 'level': 'rem',
                                 'seconds': 2010},
                                {'d

                       'summary': {'deep': {'count': 1,
                                            'minutes': 57,
                                            'thirtyDayAvgMinutes': 60},
                                   'light': {'count': 14,
                                             'minutes': 211,
                                             'thirtyDayAvgMinutes': 250},
                                   'rem': {'count': 7,
                                           'minutes': 113,
                                           'thirtyDayAvgMinutes': 111},
                                   'wake': {'count': 13,
                                            'minutes': 36,
                                            'thirtyDayAvgMinutes': 44}}},
            'logId': 17983632764,
            'minutesAfterWakeup': 0,
            'minutesAsleep': 381,
            'minutesAwake': 36,
            'minutesToFallAsleep': 0,
            'startTime': '2018-04-23T01:11:30.000',
            'time

                                     {'dateTime': '2018-04-18T07:59:30.000',
                                      'level': 'wake',
                                      'seconds': 60}],
                       'summary': {'deep': {'count': 4,
                                            'minutes': 41,
                                            'thirtyDayAvgMinutes': 60},
                                   'light': {'count': 20,
                                             'minutes': 263,
                                             'thirtyDayAvgMinutes': 249},
                                   'rem': {'count': 4,
                                           'minutes': 103,
                                           'thirtyDayAvgMinutes': 109},
                                   'wake': {'count': 18,
                                            'minutes': 40,
                                            'thirtyDayAvgMinutes': 44}}},
            'logId': 17930427293,
            'minutesAft

            'endTime': '2018-04-13T08:05:30.000',
            'infoCode': 0,
            'levels': {'data': [{'dateTime': '2018-04-13T00:18:00.000',
                                 'level': 'wake',
                                 'seconds': 1200},
                                {'dateTime': '2018-04-13T00:38:00.000',
                                 'level': 'deep',
                                 'seconds': 2040},
                                {'dateTime': '2018-04-13T01:12:00.000',
                                 'level': 'light',
                                 'seconds': 1380},
                                {'dateTime': '2018-04-13T01:35:00.000',
                                 'level': 'rem',
                                 'seconds': 1350},
                                {'dateTime': '2018-04-13T01:57:30.000',
                                 'level': 'light',
                                 'seconds': 720},
                                {'dateTime': '2018-04-13T0

                                 'seconds': 840},
                                {'dateTime': '2018-04-09T03:02:00.000',
                                 'level': 'light',
                                 'seconds': 750},
                                {'dateTime': '2018-04-09T03:14:30.000',
                                 'level': 'deep',
                                 'seconds': 720},
                                {'dateTime': '2018-04-09T03:26:30.000',
                                 'level': 'light',
                                 'seconds': 5010},
                                {'dateTime': '2018-04-09T04:50:00.000',
                                 'level': 'rem',
                                 'seconds': 270},
                                {'dateTime': '2018-04-09T04:54:30.000',
                                 'level': 'light',
                                 'seconds': 690},
                                {'dateTime': '2018-04-09T05:06:00.000',
               

                                 'level': 'rem',
                                 'seconds': 1950},
                                {'dateTime': '2018-04-07T08:27:00.000',
                                 'level': 'light',
                                 'seconds': 2490}],
                       'shortData': [{'dateTime': '2018-04-07T01:42:00.000',
                                      'level': 'wake',
                                      'seconds': 60},
                                     {'dateTime': '2018-04-07T02:45:00.000',
                                      'level': 'wake',
                                      'seconds': 60},
                                     {'dateTime': '2018-04-07T03:01:00.000',
                                      'level': 'wake',
                                      'seconds': 30},
                                     {'dateTime': '2018-04-07T03:04:30.000',
                                      'level': 'wake',
                                   

                                           'thirtyDayAvgMinutes': 120},
                                   'wake': {'count': 17,
                                            'minutes': 39,
                                            'thirtyDayAvgMinutes': 40}}},
            'logId': 17768478507,
            'minutesAfterWakeup': 0,
            'minutesAsleep': 404,
            'minutesAwake': 39,
            'minutesToFallAsleep': 0,
            'startTime': '2018-04-03T00:25:30.000',
            'timeInBed': 443,
            'type': 'stages'},
           {'dateOfSleep': '2018-04-02',
            'duration': 25020000,
            'efficiency': 97,
            'endTime': '2018-04-02T07:59:30.000',
            'infoCode': 0,
            'levels': {'data': [{'dateTime': '2018-04-02T01:02:00.000',
                                 'level': 'wake',
                                 'seconds': 540},
                                {'dateTime': '2018-04-02T01:11:00.000',
                        

In [8]:
def processSleepResults(activity_data, sleep_summaries, sleep_time_events_detail):
    # Split each sleep record into its detailed stage timeline and its summary.
    if not activity_data['sleep']:
        # sleep endpoint no longer returns results
        return sleep_summaries, sleep_time_events_detail, "stop"

    for sleep_event in activity_data['sleep']:
        # move the detailed stage timeline out of the record and store it separately
        sleep_time_events_detail.append(sleep_event['levels'].pop('data'))
        # naps have no shortData, so a missing key is expected here
        sleep_event['levels'].pop('shortData', None)
        sleep_summaries.append(sleep_event)
    return sleep_summaries, sleep_time_events_detail, "continue"
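
Once the detailed timeline is stripped out, each summary is still nested, which is awkward for looking at trends across days. Here is a sketch of one way to flatten a summary into a single row. `flatten_sleep_summary` is a hypothetical helper; the example record is hand-assembled from values visible in the In [7] output above and only includes a few of the fields.

```python
def flatten_sleep_summary(sleep_event):
    """Turn one sleep summary into a flat dict with one '<stage>_minutes' key per stage."""
    row = {
        "logId": sleep_event["logId"],
        "minutesAsleep": sleep_event["minutesAsleep"],
        "minutesAwake": sleep_event["minutesAwake"],
    }
    # loop over whatever stages are present rather than assuming fixed names
    for stage, stats in sleep_event["levels"]["summary"].items():
        row[stage + "_minutes"] = stats["minutes"]
    return row


# A record shaped like one entry of sleep_summaries (illustrative subset of fields)
example_event = {
    "logId": 17983632764,
    "minutesAsleep": 381,
    "minutesAwake": 36,
    "levels": {"summary": {
        "deep": {"count": 1, "minutes": 57, "thirtyDayAvgMinutes": 60},
        "light": {"count": 14, "minutes": 211, "thirtyDayAvgMinutes": 250},
        "rem": {"count": 7, "minutes": 113, "thirtyDayAvgMinutes": 111},
        "wake": {"count": 13, "minutes": 36, "thirtyDayAvgMinutes": 44},
    }},
}

row = flatten_sleep_summary(example_event)
print(row)
```

A list of such rows can be passed straight to `pd.DataFrame` to get one line per night.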

In [None]:
sleep_time_events_detail = []
sleep_summaries = []
endpoint = "sleep"

end_date = "2018-04-28"
start_date = "2018-03-29"
    
status = "continue"
    
while status == "continue":
    activity_data = getEndpointData('sleep', start_date, end_date)
    if activity_data != "ERROR":
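        # pass the accumulator lists in so each batch is appended to them
        sleep_summaries, sleep_time_events_detail, status = processSleepResults(
            activity_data, sleep_summaries, sleep_time_events_detail)
        # step the window back using the helper defined above
        start_date, end_date = changeDates(start_date, end_date)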
    else:
        break

In [None]:
pprint((sleep_summaries))

In [None]:
# the body works on a single day, so the function takes that date as its argument
def buildActivityData(current_date):
    activity_stats = auth2_client.activities(date=current_date)['summary']
    heartRateZones = activity_stats['heartRateZones']
    heartRate_df = accumulateHeartData(current_date, heartRateZones)
    del activity_stats['distances']
    del activity_stats['heartRateZones']
    activity_df = pd.DataFrame(activity_stats, index=[current_date])
    return activity_df, heartRate_df

In [None]:
# sleep_data is never defined in this notebook; inspect the first collected summary instead
pprint(sleep_summaries[0])
#body_stats = auth2_client.body()   (/sleep/)

#sleep_stats = auth2_client.sleep()

In [None]:
with open('sleep_time_events_detail.txt', 'w') as outfile:
    json.dump(sleep_time_events_detail, outfile)

In [None]:
start_date = "2017-04-28"
end_date = "2018-04-28"
endpoint = "activities/minutesVeryActive"
endpoints = ["activities/calories", "activities/caloriesBMR", "activities/steps", "activities/distance", 
             "activities/floors", "activities/elevation", "activities/minutesSedentary", 
             "activities/minutesLightlyActive", "activities/minutesFairlyActive", "activities/minutesVeryActive",
            "activities/activityCalories"]


data = getEndpointData(endpoint, start_date, end_date)
pprint(data)

In [None]:
activity_dict_label = endpoint.replace("/","-")
df_label = endpoint.replace("activities/", "")
print(activity_dict_label)
# `data` is the response returned by getEndpointData in the previous cell
activity_data = data[activity_dict_label]

activity_df = pd.DataFrame(activity_data, columns=['date', df_label ])
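
Here is a sketch of turning a time-series response into rows with the column names used above. It assumes each entry in the response has 'dateTime' and 'value' keys, with the value arriving as a string; check the `pprint(data)` output to confirm before relying on it. `time_series_rows` is a hypothetical helper and the example response is hand-written, not real data.

```python
def time_series_rows(response, endpoint):
    """Convert a time-series response into a list of {'date': ..., <label>: ...} dicts."""
    key = endpoint.replace("/", "-")               # e.g. 'activities-minutesVeryActive'
    label = endpoint.replace("activities/", "")    # e.g. 'minutesVeryActive'
    return [
        {"date": entry["dateTime"], label: float(entry["value"])}
        for entry in response[key]
    ]


# Hand-written example in the assumed response shape
example_response = {
    "activities-minutesVeryActive": [
        {"dateTime": "2018-04-27", "value": "22"},
        {"dateTime": "2018-04-28", "value": "0"},
    ]
}

rows = time_series_rows(example_response, "activities/minutesVeryActive")
print(rows)
```

If the entries are keyed this way, renaming first matters: `pd.DataFrame(entries, columns=[...])` selects columns by name, so asking for names that are not in the entries produces empty columns. The rows built here can be passed to `pd.DataFrame(rows)` directly.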

Below is some code Melissa originally wrote to interact with the fitbit library (the one that runs into the 150-calls-per-hour limit). We're keeping it here in case it's helpful as we build out the other data sets.

In [15]:
#yesterday2 = str((datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d"))
#today = str(datetime.now().strftime("%Y%m%d"))

#yesterday2 = ((datetime.now() - timedelta(days=1)))
#yesterday3 = (yesterday2 - timedelta(days=1))
#print(yesterday3)

current_day = "2018-05-18"


'''
These functions use the intra-day endpoint. 

CAUTION: Plan your calls wisely, or you will exceed 150 API calls per hour.

'''

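# take a starting date and a number of earlier days to pull as input;
# start_date needs to be in YYYY-MM-DD format.
# Note: this fetches start_date plus the `days` days before it (days + 1 calls),
# and call_type is currently unused.
def pullFitBitData(start_date, days, call_type):
    # TODO: add date error checking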
    print("Processing day: {}".format(start_date))
    current_date = start_date
    activity_df, heartRate_df = buildActivityData(current_date)
    day_counter = 0
    while day_counter < days:
        current_date = (datetime.datetime.strptime(current_date, "%Y-%m-%d") - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
        print("Processing day: {}".format(current_date))
        activity_df2, heartRate_df2 = buildActivityData(current_date)
        activity_df = pd.concat([activity_df, activity_df2])
        heartRate_df = pd.concat([heartRate_df, heartRate_df2])
        day_counter += 1
    print("Ended processing on {}.".format(current_date))
    return activity_df, heartRate_df
        

In [22]:
fit_statsHR = auth2_client.intraday_time_series('activities/heart', base_date=current_day, detail_level='15min')
heartRateZones = (fit_statsHR['activities-heart'][0]['value']['heartRateZones'])

'''
column_names = heartRatePivot_df.columns.values
new_column_names = []
for name in column_names:
    new_name = name[1].replace(' ', '_')+'.'+name[0]
    new_column_names.append(new_name)
heartRatePivot_df.columns = new_column_names
print(heartRatePivot_df)
'''


def accumulateHeartData(current_day, heartRateZones):
    heartRateZones_df = pd.DataFrame.from_records(heartRateZones)
    # assign the date to every zone row instead of hard-coding four copies of it
    heartRateZones_df['date'] = current_day
    heartRatePivot_df = heartRateZones_df.pivot(index='date', columns='name')
    column_names = heartRatePivot_df.columns.values
    new_column_names = []
    for name in column_names:
        new_name = name[1].replace(' ', '_')+'.'+name[0]
        new_column_names.append(new_name)
    heartRatePivot_df.columns = new_column_names
    return heartRatePivot_df

def buildActivityData(current_date):
    activity_stats = auth2_client.activities(date=current_date)
    pprint(activity_stats)
    activity_stats = activity_stats['summary']
    heartRateZones = activity_stats['heartRateZones']
    heartRate_df = accumulateHeartData(current_date, heartRateZones)
    del activity_stats['distances']
    del activity_stats['heartRateZones']
    activity_df = pd.DataFrame(activity_stats, index=[current_date])
    return activity_df, heartRate_df
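
The pivot-and-rename step in accumulateHeartData is easier to follow with a plain-Python equivalent. The sketch below is not the notebook's method, just an illustration of the same `Zone_Name.field` naming; `flatten_zones` is a hypothetical helper, and the two example zones use values from the heart rate output shown at the end of this notebook.

```python
def flatten_zones(heart_rate_zones):
    """Flatten a list of zone dicts into one dict keyed as 'Zone_Name.field'."""
    flat = {}
    for zone in heart_rate_zones:
        prefix = zone["name"].replace(" ", "_")
        for field, value in zone.items():
            if field != "name":
                flat[prefix + "." + field] = value
    return flat


zones = [
    {"caloriesOut": 1276.90612, "max": 89, "min": 30, "minutes": 1114, "name": "Out of Range"},
    {"caloriesOut": 849.2915, "max": 124, "min": 89, "minutes": 251, "name": "Fat Burn"},
]

flat = flatten_zones(zones)
for column, value in flat.items():
    print(column, value)
```

Because it loops over whatever zones are present, this version does not depend on there being exactly four of them.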

In [23]:
activity_df, heartRate_df = buildActivityData("2018-05-18")

{'activities': [],
 'goals': {'activeMinutes': 30,
           'caloriesOut': 1900,
           'distance': 5,
           'floors': 20,
           'steps': 10000},
 'summary': {'activeScore': -1,
             'activityCalories': 1041,
             'caloriesBMR': 1401,
             'caloriesOut': 2241,
             'distances': [{'activity': 'total', 'distance': 4.37},
                           {'activity': 'tracker', 'distance': 4.37},
                           {'activity': 'loggedActivities', 'distance': 0},
                           {'activity': 'veryActive', 'distance': 0.99},
                           {'activity': 'moderatelyActive', 'distance': 0.45},
                           {'activity': 'lightlyActive', 'distance': 2.92},
                           {'activity': 'sedentaryActive', 'distance': 0}],
             'elevation': 100,
             'fairlyActiveMinutes': 17,
             'floors': 10,
             'heartRateZones': [{'caloriesOut': 1276.90612,
                       

In [21]:
print(activity_df)

            activeScore  activityCalories  caloriesBMR  caloriesOut  \
2018-05-18           -1              1041         1401         2241   

            elevation  fairlyActiveMinutes  floors  lightlyActiveMinutes  \
2018-05-18        100                   17      10                   233   

            marginalCalories  restingHeartRate  sedentaryMinutes  steps  \
2018-05-18               591                69               750  10404   

            veryActiveMinutes  
2018-05-18                 22  


In [25]:
print(heartRate_df)

            Cardio.caloriesOut  Fat_Burn.caloriesOut  \
date                                                   
2018-05-18             42.8296              849.2915   

            Out_of_Range.caloriesOut  Peak.caloriesOut  Cardio.max  \
date                                                                 
2018-05-18                1276.90612               0.0         151   

            Fat_Burn.max  Out_of_Range.max  Peak.max  Cardio.min  \
date                                                               
2018-05-18           124                89       220         124   

            Fat_Burn.min  Out_of_Range.min  Peak.min  Cardio.minutes  \
date                                                                   
2018-05-18            89                30       151               5   

            Fat_Burn.minutes  Out_of_Range.minutes  Peak.minutes  
date                                                              
2018-05-18               251                  1114             0

In [26]:
fit_statsHR = auth2_client.intraday_time_series('activities/heart', base_date=current_day, detail_level='15min')
pprint(fit_statsHR)

{'activities-heart': [{'dateTime': '2018-05-18',
                       'value': {'customHeartRateZones': [],
                                 'heartRateZones': [{'caloriesOut': 1276.90612,
                                                     'max': 89,
                                                     'min': 30,
                                                     'minutes': 1114,
                                                     'name': 'Out of Range'},
                                                    {'caloriesOut': 849.2915,
                                                     'max': 124,
                                                     'min': 89,
                                                     'minutes': 251,
                                                     'name': 'Fat Burn'},
                                                    {'caloriesOut': 42.8296,
                                                     'max': 151,
                                          