# 1 - Interacting with the Fitbit API

In this section we use the python-fitbit and requests modules to get data from the Fitbit API. This is not the only way to do it; a simple alternative is the Fitbit Web API Explorer (https://dev.fitbit.com/build/reference/web-api/explore/). The steps taken here are largely outlined in this Towards Data Science article: https://towardsdatascience.com/using-the-fitbit-web-api-with-python-f29f119621ea.

## 1.1 Setting up 

In this section we load the necessary modules, complete the authorization with the Fitbit API, and define the Fitbit object (from the python-fitbit module) that is used to make GET requests to the Fitbit API.

In [1]:
# Import necessary modules
import gather_keys_oauth2 as Oauth2 # This is a python file you need to have in the same directory as your code so you can import it
import fitbit
import pandas as pd 
import datetime as dt
import requests as req
import warnings

# Enter CLIENT_ID and CLIENT_SECRET
CLIENT_ID = '23QRRC'
CLIENT_SECRET = '51922a48a2df4434cc20afaac4ee97b8'

# Date range for which we pull data (END_DATE itself is excluded by daterange() below)
START_DATE = "2023-03-29"
END_DATE = "2023-04-20"



When you execute the cell below, a new browser tab opens and asks you to log in to your Fitbit account. After logging in, you should see a page that says something like "Authentication Complete, you may close this tab".

In [2]:
# Authorize user
server = Oauth2.OAuth2Server(CLIENT_ID, CLIENT_SECRET)
server.browser_authorize()
# Save access and refresh tokens
ACCESS_TOKEN = str(server.fitbit.client.session.token['access_token'])
REFRESH_TOKEN = str(server.fitbit.client.session.token['refresh_token'])
EXPIRES_AT = str(server.fitbit.client.session.token['expires_at'])

[04/May/2023:13:53:04] ENGINE Listening for SIGTERM.
[04/May/2023:13:53:04] ENGINE Bus STARTING
CherryPy Checker:
The Application mounted at '' has an empty config.

[04/May/2023:13:53:04] ENGINE Set handler for console events.
[04/May/2023:13:53:04] ENGINE Started monitor thread 'Autoreloader'.
[04/May/2023:13:53:04] ENGINE Serving on http://127.0.0.1:8080
[04/May/2023:13:53:04] ENGINE Bus STARTED


127.0.0.1 - - [04/May/2023:13:53:07] "GET /?code=ebd399e4fa36e8524efe89cfa57220208adcfb4b&state=gRICXeKEeCXSaFQC9GJjGZ7iBqepcb HTTP/1.1" 200 122 "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"


[04/May/2023:13:53:08] ENGINE Bus STOPPING
[04/May/2023:13:53:08] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8080)) shut down
[04/May/2023:13:53:08] ENGINE Removed handler for console events.
[04/May/2023:13:53:08] ENGINE Stopped thread 'Autoreloader'.
[04/May/2023:13:53:08] ENGINE Bus STOPPED
[04/May/2023:13:53:08] ENGINE Bus EXITING
[04/May/2023:13:53:08] ENGINE Waiting for child threads to terminate...
[04/May/2023:13:53:08] ENGINE Bus EXITED
[04/May/2023:13:53:08] ENGINE Waiting for thread Thread-16.


In the previous cells we completed the authorization process for the client ID and secret specified in the first code block. Next, we create an instance of the Fitbit object from the python-fitbit module, which is the base object we use to get the data we want.

In [3]:
# Create Fitbit object which will be used to get the data
auth2_client = fitbit.Fitbit(client_id = CLIENT_ID,
                             client_secret = CLIENT_SECRET,
                             expires_at = EXPIRES_AT,
                             oauth2 = True,
                             access_token = ACCESS_TOKEN,
                             refresh_token = REFRESH_TOKEN)

## 1.2 Get data

To get Sleep data we use the requests module. The reason is that python-fitbit hardcodes the API version and does not let us change it (it is 1, while the Sleep endpoints need 1.2). As far as we know there is an issue about this on the module's GitHub page, but no implemented solution.

Therefore we used the Fitbit Web API Explorer to get the cURL command for the endpoint we want to draw data from, converted it to Python using the requests module, and used that to get the data we want.
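
As a rough illustration of that conversion, the sketch below assembles the URL and headers that such a cURL command carries. The helper name `build_sleep_request` and the placeholder token are hypothetical; only the endpoint path and the bearer-token header are taken from the code in this notebook. No network call is made.

```python
import datetime as dt

API_VERSION = "1.2"  # the Sleep endpoints need version 1.2, as noted above


def build_sleep_request(date, access_token):
    """Return (url, headers) for the 'sleep log by date' GET request.

    Hypothetical helper for illustration; it performs no network call.
    """
    if isinstance(date, dt.date):
        date = date.isoformat()  # yyyy-mm-dd
    url = "https://api.fitbit.com/{}/user/-/sleep/date/{}.json".format(API_VERSION, date)
    headers = {
        "accept": "application/json",
        "authorization": "Bearer {}".format(access_token),
    }
    return url, headers


url, headers = build_sleep_request(dt.date(2023, 3, 29), "<ACCESS_TOKEN>")
print(url)
```

The returned pair can then be handed to requests, for example `req.get(url, headers=headers)`.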

For Activity data we used the python-fitbit module. We have used several different resources to quantify activity (steps, minutes active/sedentary).

Please keep in mind that the Fitbit API enforces a rate limit of **150 API requests per hour** for each user who has consented to share their data. The counter resets at the top of each hour.
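
To get a feel for that limit, here is a small back-of-the-envelope sketch. It assumes one sleep request plus five activity requests per day (the five resources listed in `get_activity_data`) and that all requests fall within the same clock hour; the helper names are hypothetical.

```python
import datetime as dt

RATE_LIMIT_PER_HOUR = 150   # limit quoted above
CALLS_PER_DAY = 1 + 5       # one sleep request plus five activity resources


def requests_needed(start_date, end_date):
    """API calls needed for the dates from start_date up to, but not including, end_date."""
    n_days = (end_date - start_date).days
    return n_days * CALLS_PER_DAY


def seconds_until_reset(now):
    """Seconds from `now` until the top of the next hour, when the quota resets."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + dt.timedelta(hours=1)
    return (next_hour - now).total_seconds()


needed = requests_needed(dt.date(2023, 3, 29), dt.date(2023, 4, 20))
print(needed, needed <= RATE_LIMIT_PER_HOUR)                     # 132 True
print(seconds_until_reset(dt.datetime(2023, 5, 4, 13, 53, 4)))   # 416.0
```

With the date range used in this notebook the whole pull fits in one hourly quota, but a few more days would not, in which case waiting until the next reset is the simplest option.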


In [4]:
def to_date(x):
    """
    Returns x as a datetime.date() object.
    """
    if isinstance(x, str):
        # If input is a string, parse it as a date
        return dt.datetime.strptime(x, "%Y-%m-%d").date()
    elif isinstance(x, dt.date):
        # If input is already a date object, return it
        return x
    else:
        # Otherwise, raise an error
        raise ValueError("Input must be a string or datetime.date object")


def get_sleep_data(date):
    """
    Inputs:
        - date <str> (yyyy-mm-dd): Date for which we want to pull data.
        
    Returns the data related to sleep for <date> from the Fitbit API. 
    """

    global START_DATE
    global ACCESS_TOKEN
        
    date = to_date(date)
    start_date = to_date(START_DATE)

    # Check date value
    if date < start_date:
        raise ValueError("date cannot be before {}".format(start_date))
    
    # Make API get request
    headers = {
        'accept': 'application/json',
        'authorization': 'Bearer {}'.format(ACCESS_TOKEN),
    }
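    response = req.get('https://api.fitbit.com/1.2/user/-/sleep/date/{}.json'.format(date), 
                       headers = headers)
    # requests does not raise python-fitbit exceptions, so check for HTTP 429 (rate limit) explicitly
    if response.status_code == 429:
        tryAfterMin = int(response.headers.get('Retry-After', 0))/60
        raise Exception("Too many requests, please try again after {:.1f} min.".format(tryAfterMin))
    # Raise an error for any other unsuccessful response (e.g. an expired token)
    response.raise_for_status()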

    return response.json()


def get_activity_data(date):
    """
    Inputs:
        - date <str> (yyyy-mm-dd): Date we want the data of.
    
    Returns activity data for the specified date. Activity is quantified in terms of the elements of the resources list
    that is defined inside the function.
    """
    
    global START_DATE

    date = to_date(date)
    start_date = to_date(START_DATE)

    # Check date value
    if date < start_date:
        raise ValueError("date cannot be before {}".format(start_date))
   
    # Dictionary where data returned by the API will be stored
    data = {}

    # Different kinds of resources that quantify activity
    resources = [
        "minutesSedentary",
        "minutesLightlyActive",
        "minutesFairlyActive",
        "minutesVeryActive",
        "steps"
    ]
    
    try:
        # A separate API call is made for each resource
        for resource in resources:
            resourceString = "activities/" + resource
            # detailString can be one of 1min, 5min, 15min
            if resource == "steps":
                # Thought this might make more sense, feel free to change it if you think otherwise
                detailString = "1min"
            else:
                detailString = "15min"
            
            # Use fitbit module to make the API get request
            data[resource] = auth2_client.intraday_time_series(resourceString, 
                                                               date, 
                                                               detail_level = detailString)
    except fitbit.exceptions.HTTPTooManyRequests as e:
        tryAfterMin = e.retry_after_secs/60
        errorMessage = str(e) + ", please try again after {:.1f} min.".format(tryAfterMin)
        raise Exception(errorMessage)
        
    return data


def daterange(start_date, end_date):
    """
    Inputs:
        start_date, end_date: Can be either string or datetime.date

    Yields all dates from start_date up to, but not including, end_date as datetime.date objects.
    """
    start_date = to_date(start_date)
    end_date = to_date(end_date)
    for n in range(int((end_date - start_date).days)):
        yield start_date + dt.timedelta(n)


### Example code on how to get and transform sleep data: https://towardsdatascience.com/using-the-fitbit-web-api-with-python-f29f119621ea

# 2 - Import data into MongoDB

In [None]:
#!pip install pymongo

In [5]:
import pymongo as mongo
client = mongo.MongoClient('localhost', 27017)

# Check if the connection to the db was successful
try:
    db = client.admin
    server_info = db.command('serverStatus')
    print('Connection to MongoDB server successful.')
    
except mongo.errors.ConnectionFailure as e:
    print('Connection to MongoDB server failed: %s' % e)


USER_UUID = "3cc4e2ee-8c2f-4c25-955b-fe7f6ffcbe44"
DB_NAME = "fitbit"
DATA_COLLECTION_NAME = "fitbitCollection"

Connection to MongoDB server successful.


In [6]:
# client.drop_database("fitbit")

In [7]:
def check_create_collection(mongoDb, collection):
    """ 
    Checks if collection exists in mongoDb, and if it doesn't it creates it.
    """
    if collection in mongoDb.list_collection_names():
        print(f"Collection {collection} already exists, proceeding.")
    else:
        mongoDb.create_collection(collection)
        print(f"Collection {collection} created.")

    return mongoDb[collection]


def check_create_index(collection, index, indexName):
    """ 
    Checks if index with indexName exists in collection, and if it doesn't it creates it.
    """
    # Check if the index exists
    if indexName not in [existingIndex['name'] for existingIndex in collection.list_indexes()]:
        # Create the index if it does not exist
        collection.create_index(index, name = indexName, unique=True)
        print(f"Index {indexName} created.")
    else:
        print(f"Index {indexName} already exists, proceeding.")


# Connect to the fitbitDb and the collection where the data are stored or create them if they don't exist
fitbitDb = client[DB_NAME]
fitbitCollection = check_create_collection(fitbitDb, DATA_COLLECTION_NAME)
fitbitDb.list_collection_names()

# Define index
fitbitIndex = [('type', mongo.ASCENDING), ('data.dateTime', mongo.ASCENDING)]
fitbitIndexName = "type_1_data.dateTime_1"
check_create_index(fitbitCollection, fitbitIndex, fitbitIndexName)


Collection fitbitCollection created.
Index type_1_data.dateTime_1 created.
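
The compound index on `type` and `data.dateTime` matches queries that filter on a document type and a date range. The sketch below shows one such query; `fetch_series` is a hypothetical helper, and it relies only on the standard `find(...).sort(...)` calls of a pymongo collection.

```python
import datetime as dt


def fetch_series(collection, document_type, start, end):
    """Return documents of one type with start <= data.dateTime < end, oldest first."""
    # MongoDB stores datetimes, so plain datetime.date bounds are rejected early
    for bound in (start, end):
        if not isinstance(bound, dt.datetime):
            raise TypeError("start and end must be datetime.datetime objects")
    query = {
        "type": document_type,
        "data.dateTime": {"$gte": start, "$lt": end},
    }
    # 1 is the ascending sort direction (same value as pymongo.ASCENDING)
    return list(collection.find(query).sort("data.dateTime", 1))


# Example call, assuming a populated `fitbitCollection`:
# steps = fetch_series(fitbitCollection, "steps-intraday",
#                      dt.datetime(2023, 3, 29), dt.datetime(2023, 3, 30))
```

Because the index is also declared unique, inserting a second document with the same `type` and `data.dateTime` raises a duplicate key error, which is what makes re-running the import cells safe.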


## Import Sleep data to mongo

In this section we want to draw data from the Fitbit API and more specifically the Sleep endpoint, and then save that data in our local MongoDB instance.

Reference for what the keys mean:
https://dev.fitbit.com/build/reference/web-api/sleep/get-sleep-log-by-date/ 
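
To make the target layout concrete, the sketch below builds by hand the kind of document that the cells in this section store for a single entry of the sleep time series. The data point is a made-up example in the shape of an entry of `levels.data`; the `id`, `type`, and `data` fields follow the functions defined below.

```python
import datetime as dt

USER_UUID = "3cc4e2ee-8c2f-4c25-955b-fe7f6ffcbe44"  # same value as in the notebook

# Hypothetical data point in the shape of an entry of sleep["levels"]["data"]
data_point = {"dateTime": "2023-03-29T23:21:30.000", "level": "light", "seconds": 630}

document = {
    "id": USER_UUID,
    "type": "sleepLevelsData-data",
    "data": {
        # Stored as datetime.datetime so that MongoDB saves it as a date
        "dateTime": dt.datetime.fromisoformat(data_point["dateTime"]),
        "level": data_point["level"],
        "value": data_point["seconds"],
    },
}
print(document["data"]["dateTime"])  # 2023-03-29 23:21:30
```

Every document, whether sleep or activity, follows this `id` / `type` / `data` pattern, with `data.dateTime` always present so that the unique index can apply.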

In [8]:
def create_document(documentType, dataDict):
    """ 
    Inputs:
        > documentType <str>: The type entry of the document to be created.
        > dataDict <dict>: The 'data' entry of the document to be created.

    Creates a document to be inserted into MongoDB.
    """
    global USER_UUID

    myDocument = {}
    myDocument["id"] = USER_UUID
    myDocument["type"] = documentType
    myDocument["data"] = dataDict

    return myDocument

     
def save_document(myCollection, myDocument):
    """
    Inputs:
        > myCollection: MongoDB collection in which we want to add myDocument.
        > myDocument <dict>: Document to be inserted.
    Adds myDocument to myCollection and checks that it was inserted successfully.
    If the insert fails with a duplicate key error, the existing document is looked up;
    if it cannot be found, a ValueError is raised.
    """

    try:
        # Insert myDocument in Mongo
        result = myCollection.insert_one(myDocument)
        # Check if the document was inserted successfully
        if not result.inserted_id:
            raise Exception(f"Document {myDocument} not inserted.")
    # If record already exists
    except mongo.errors.DuplicateKeyError:
        # Try to find the document in the DB
        query = {
            'type': myDocument['type'],
            'data.dateTime': myDocument['data']['dateTime']
        }

        existing_doc = myCollection.find_one(query)
        if existing_doc is None:
            # Something went wrong, raise an error
            raise ValueError("Cannot find existing document in the collection.")
        else:
            # Document already exists, ignore it
            pass
    except Exception as e:
        print('Error: %s' % e)
    

def create_and_save_document(fitbitCollection, documentType, dataDict):
    """ 
    Creates and saves document into fitbitCollection
    """
    dataDocument = create_document(documentType, dataDict)
    save_document(fitbitCollection, dataDocument)


def to_datetime(date, time = ""):
    """
    Converts date (str or datetime.date) into datetime.datetime. If a time argument is given, it includes it
    in the datetime.datetime object it returns.
    """

    if isinstance(date, str):
        datetimeObj = dt.datetime.fromisoformat(date)
    elif isinstance(date, dt.date):
        datetimeObj = dt.datetime.combine(date, dt.datetime.min.time())
    else:
        raise ValueError("Unsupported type for date. It should be either a string or a datetime.date object.")
    
    if time != "":
        datetimeObj = dt.datetime.combine(datetimeObj, dt.datetime.strptime(time, '%H:%M:%S').time())
    
    return datetimeObj


def get_summary_key_data(sleepSummaryData, sleepSummaryDataKey, single_date):
    """
    Input:
        > sleepSummaryData <dict>: Contains the sleep data from the Fitbit API response under the 'summary' key.
        > sleepSummaryDataKey <str>: The key of sleepSummaryData to parse.
        > single_date <datetime.date>: The date whose data we parse.
    Output:
        > documentType <str>: The 'type' key of the document to be inserted in MongoDB.
        > dataDict <dict>: The 'data' key of the document to be inserted in MongoDB.

    This function parses through the data the Fitbit API gives us for a single day and returns
    only the data under the 'summary' key that we are interested in keeping in our MongoDB. 
    More specifically, it returns the type of the entry, and the relevant data for each key we decide to keep.
    """

    # Dictionary that will hold the 'data' entry of the document
    dataDict = {}
    documentType = "sleepSummary-{}".format(sleepSummaryDataKey)

    dataDict["dateTime"] = to_datetime(single_date)
    if sleepSummaryDataKey == "stages":
        for stage in sleepSummaryData[sleepSummaryDataKey].keys():
            dataDict[stage] = sleepSummaryData[sleepSummaryDataKey][stage]
    else:
        dataDict[sleepSummaryDataKey] = sleepSummaryData[sleepSummaryDataKey]

    return documentType, dataDict


In [9]:
try:
    # Keys of the returned sleep record that are not saved as 'sleep-<key>' documents
    # ('levels' is re-included below because it holds the time series)
    skipKeys = ["levels", "infoCode", "logId", "logType", "type", "dateOfSleep"]

    for single_date in daterange(START_DATE, END_DATE):
        # Get data from the fitbit API
        oneDaySleepData = get_sleep_data(single_date)

        # Check if sleep data exist for the date we are looking at
        if (len(oneDaySleepData["sleep"]) > 0):
            # Get data related to general sleep info as well as the sleep time series
            sleepData = oneDaySleepData["sleep"][0]
            # Define which keys we want to keep
            sleepDataKeys = [key for key in sleepData.keys() if key not in skipKeys or key == "levels"]
            # For each key containing general information on sleep
            for sleepDataKey in sleepDataKeys:  
                # Dictionary that will hold the 'data' entry of the document
                dataDict = {}    
                # 'levels' contains the time series data
                if sleepDataKey == "levels":
                    sleepLevelsData = sleepData[sleepDataKey]
                    for key in sleepLevelsData.keys():
                        if key == "data" or key == "shortData":
                            documentType = f"sleepLevelsData-{key}"
                            for dataPoint in sleepLevelsData[key]:
                                dataDict = {}
                                # Convert string date to datetime.datetime so that it's saved correctly in mongo
                                dataDict["dateTime"] = to_datetime(dataPoint["dateTime"])
                                dataDict["level"] = dataPoint["level"]
                                dataDict["value"] = dataPoint["seconds"]
                                create_and_save_document(fitbitCollection, documentType, dataDict)
                else:
                    documentType = "sleep-{}".format(sleepDataKey)
                    # Convert datetime.date to datetime.datetime so that it's saved correctly in mongo
                    dataDict["dateTime"] = to_datetime(single_date)
                    if isinstance(sleepData[sleepDataKey], str):
                        dataDict["value"] = to_datetime(sleepData[sleepDataKey])
                    else:
                        dataDict["value"] = sleepData[sleepDataKey]
                    create_and_save_document(fitbitCollection, documentType, dataDict)
            # Get data related to summary sleep info
            sleepSummaryData = oneDaySleepData["summary"]
            # For each key containing summary information on sleep
            for sleepSummaryDataKey in sleepSummaryData.keys():
                documentType, dataDict = get_summary_key_data(sleepSummaryData, sleepSummaryDataKey, single_date)
                create_and_save_document(fitbitCollection, documentType, dataDict)
        else:
            warnings.warn(f"Could not find sleep data for {single_date}.")
            continue
        print(f'Loaded sleep data for {single_date}.')
except Exception as e:
    print('Error: %s' % e)



Loaded data for 2023-03-29.
Loaded data for 2023-03-30.
Loaded data for 2023-03-31.
Loaded data for 2023-04-01.
Loaded data for 2023-04-02.




Loaded data for 2023-04-04.
Loaded data for 2023-04-05.
Loaded data for 2023-04-06.
Loaded data for 2023-04-07.
Loaded data for 2023-04-08.
Loaded data for 2023-04-09.
Loaded data for 2023-04-10.
Loaded data for 2023-04-11.
Loaded data for 2023-04-12.
Loaded data for 2023-04-13.
Loaded data for 2023-04-14.
Loaded data for 2023-04-15.
Loaded data for 2023-04-16.
Loaded data for 2023-04-17.
Loaded data for 2023-04-18.
Loaded data for 2023-04-19.


## Import Activity data to mongo
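
The cell below walks through the intraday responses and turns each value into its own document. As a minimal sketch of that transformation, the example here flattens a made-up, shortened response for the "steps" resource; `flatten_activity` is a hypothetical helper, and the sample values are invented.

```python
import datetime as dt

# Hypothetical, shortened response in the shape returned for the "steps" resource
response = {
    "activities-steps": [{"dateTime": "2023-03-29", "value": "57"}],
    "activities-steps-intraday": {
        "dataset": [
            {"time": "00:00:00", "value": 0},
            {"time": "00:01:00", "value": 57},
        ],
        "datasetInterval": 1,
        "datasetType": "minute",
    },
}


def flatten_activity(response, date):
    """Turn one intraday response into (documentType, dataDict) pairs."""
    pairs = []
    for key, payload in response.items():
        document_type = key.replace("activities-", "")
        if "intraday" in key:
            # One document per point of the intraday time series
            for point in payload["dataset"]:
                time = dt.datetime.strptime(point["time"], "%H:%M:%S").time()
                pairs.append((document_type,
                              {"dateTime": dt.datetime.combine(date, time),
                               "value": int(point["value"])}))
        else:
            # One document holding the daily total
            day = payload[0]
            pairs.append((document_type,
                          {"dateTime": dt.datetime.fromisoformat(day["dateTime"]),
                           "value": int(day["value"])}))
    return pairs


pairs = flatten_activity(response, dt.date(2023, 3, 29))
print(len(pairs))                  # 3
print(pairs[0][0], pairs[1][0])    # steps steps-intraday
```

Each pair maps directly onto the `type` and `data` fields of a stored document.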


In [10]:
# START_DATE and END_DATE are the module-level constants defined in section 1.1

try:
    for single_date in daterange(START_DATE, END_DATE):
        # Get activity data from the fitbit API
        oneDayActivityData = get_activity_data(single_date)

        # For each kind of activity
        for activityTypeKey in oneDayActivityData.keys():
            for key in oneDayActivityData[activityTypeKey].keys():
                # Check if activity data exist for the date and type of activity we are looking at
                if len(oneDayActivityData[activityTypeKey][key]) > 0:
                    documentType = key.replace('activities-',"")
                    if "intraday" not in key:
                        dataDict = {}
                        dataDict["dateTime"] = to_datetime(oneDayActivityData[activityTypeKey][key][0]["dateTime"])
                        dataDict["value"] = int(oneDayActivityData[activityTypeKey][key][0]["value"])
                        create_and_save_document(fitbitCollection, documentType, dataDict)
                    else:
                        for dataPoint in oneDayActivityData[activityTypeKey][key]["dataset"]:
                            dataDict = {}
                            dataDict["dateTime"] = to_datetime(single_date, time = dataPoint["time"])
                            dataDict["value"] = int(dataPoint["value"])
                            create_and_save_document(fitbitCollection, documentType, dataDict)
                else:
                    warnings.warn(f"Could not find {activityTypeKey}-{key} data for {single_date}.")
                    continue 
        print(f'Loaded activity data for {single_date}.')                            
except Exception as e:
    print('Error: %s' % e)

Loaded data for 2023-03-29.
Loaded data for 2023-03-30.
Loaded data for 2023-03-31.
Loaded data for 2023-04-01.
Loaded data for 2023-04-02.
Loaded data for 2023-04-03.
Loaded data for 2023-04-04.
Loaded data for 2023-04-05.
Loaded data for 2023-04-06.
Loaded data for 2023-04-07.
Loaded data for 2023-04-08.
Loaded data for 2023-04-09.
Loaded data for 2023-04-10.
Loaded data for 2023-04-11.
Loaded data for 2023-04-12.
Loaded data for 2023-04-13.
Loaded data for 2023-04-14.
Loaded data for 2023-04-15.
Loaded data for 2023-04-16.
Loaded data for 2023-04-17.
Loaded data for 2023-04-18.
Loaded data for 2023-04-19.
