# Gather Data

**Purpose**: In this notebook we'll establish a connection to Fitbit's API and pull the data used for downstream analysis. By the end of the notebook we'll have a pandas DataFrame saved to disk as a CSV, ready for consumption.

## Imports

*Note*: This notebook depends on the third-party, open source Python library `fitbit`. If it is not already installed on the host machine, run the following in a terminal:

```sh
pip install fitbit
```

In [3]:
import fitbit
import requests
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import threading
import webbrowser

from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
from requests.auth import HTTPBasicAuth
from IPython.display import display, HTML

## Establish Connection to Fitbit API

Now we'll use the `fitbit` library we imported by passing in a client ID and client secret. We'll also specify a redirect URI that points at localhost (`http://127.0.0.1:8080/`), since we want to intercept the URL parameters that the Fitbit API returns to us.

At a high level, the Fitbit API uses the OAuth 2.0 workflow to grant appropriate access to requestors. In this project, the requestor is the Jupyter notebook. By providing credentials from my account, I'm telling Fitbit that this notebook is a trusted application, so Fitbit issues a time-limited token that is included with every subsequent request for actual data. We store that token in the notebook and pass it along to the `fitbit` library.
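
After authorization, the browser is sent to the redirect URI with the authorization code attached as a `code` query parameter. As a minimal sketch (the helper name `extract_auth_code` and the example URL are hypothetical and not part of the original notebook), the code could be pulled out of a pasted URL with the standard library instead of being copied by hand:

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirected_url: str) -> str:
    """Return the value of the 'code' query parameter from a redirect URL.

    Hypothetical helper for illustration; raises ValueError if no code is present.
    """
    # Only the query string is parsed, so any trailing '#fragment' is ignored.
    query = parse_qs(urlparse(redirected_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("no 'code' parameter found in URL")
    return codes[0]


# Made-up URL in the shape of an OAuth 2.0 redirect; the code value is a placeholder.
example_url = "http://127.0.0.1:8080/?code=abc123#_=_"
example_code = extract_auth_code(example_url)
```

The returned string could then be passed as the `code` argument when fetching the token, in place of the value typed into `input()`.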

In [12]:
CLIENT_ID = 'REDACTED'
CLIENT_SECRET = 'REDACTED'
redirect_uri = 'http://127.0.0.1:8080/'
fitbit_date_format = '%Y-%m-%d'

In [5]:
scope = ['activity','heartrate', 'location', 'nutrition', 'profile', 'settings', 'sleep', 'social', 'weight']
auth  = HTTPBasicAuth(CLIENT_ID, CLIENT_SECRET)
oauth = OAuth2Session(client_id=CLIENT_ID,redirect_uri=redirect_uri,scope=scope)
auth_url, _ = oauth.authorization_url('https://www.fitbit.com/oauth2/authorize')

In [6]:
html = HTML(f"<a href='{auth_url}' target='_blank'>Click here and then copy the 'code' parameter in the URL</a>")
display(html)

In [7]:
code = input()

9f44aa5f8b445144cf80dafba12e93263bf23926


In [8]:
# Reuse the HTTPBasicAuth object built above so the client credentials are sent in the Authorization header.
token = oauth.fetch_token(token_url='https://api.fitbit.com/oauth2/token', auth=auth, code=code)

Here we store the token fetched above and initialize the auth2_client object with that token so that we can make authenticated calls.

In [9]:
auth2_client = fitbit.Fitbit(CLIENT_ID, CLIENT_SECRET, oauth2=True, access_token=token['access_token'], refresh_token=token['refresh_token'])

In [10]:
auth2_client.API_VERSION = 1.2  # set manually because the Python library defaults to version 1

Now we'll build the dataframe by making separate requests for each resource we're interested in and extracting the relevant fields from each response.
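
Each resource below is requested in consecutive date windows, because a single request covers only a limited date range. A minimal sketch of that windowing logic, assuming the same 100-day step and three iterations used in the loops that follow (the generator name `date_windows` is hypothetical and does not appear in the original cells):

```python
import datetime


def date_windows(first_day, n_windows, span_days=100):
    """Yield (start, end) pairs; each window starts the day after the previous one ends."""
    start = first_day
    for _ in range(n_windows):
        end = start + datetime.timedelta(days=span_days)
        yield start, end
        start = end + datetime.timedelta(days=1)


# Three windows starting on 2018-01-01, formatted the way the API calls expect.
windows = list(date_windows(datetime.date(2018, 1, 1), n_windows=3))
formatted = [(s.strftime('%Y-%m-%d'), e.strftime('%Y-%m-%d')) for s, e in windows]
```

Factoring the windows out this way would let the sleep, step, and heart rate loops share one definition of the date ranges instead of repeating the arithmetic in each cell.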

## Build Sleep Data

In [9]:
date_list = []
startTime_list = []
minutesAsleep_list = []
first_of_year = datetime.datetime.strptime('2018-01-01', fitbit_date_format)
start_date = first_of_year
for i in range(3):
    # Fitbit only allows pulls of 100 days of data at a time.
    hunnid_days_later = start_date + datetime.timedelta(days=100)
    sleep_records = auth2_client.time_series(resource='sleep'
             , base_date=start_date.strftime(fitbit_date_format)
             , end_date=hunnid_days_later.strftime(fitbit_date_format))
    for record in sleep_records['sleep']:
        date_list.append(record['dateOfSleep'])
        startTime_list.append(record['startTime'])
        minutesAsleep_list.append(record['minutesAsleep'])
        
    start_date = hunnid_days_later + datetime.timedelta(days=1)

In [10]:
df_sleep = pd.DataFrame({'StartTime':startTime_list, 'MinutesAsleep':minutesAsleep_list}, index=date_list)
df_sleep = df_sleep[['StartTime', 'MinutesAsleep']]

In [11]:
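# A date can have several sleep records (e.g. naps): total the minutes and keep the first start time.
df_sleep = df_sleep.groupby(df_sleep.index).agg({'MinutesAsleep': 'sum', 'StartTime': 'first'}).sort_index()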

## Build Step Data

In [13]:
steps_list = []
date_list = []
first_of_year = datetime.datetime.strptime('2018-01-01', fitbit_date_format)
start_date = first_of_year
for i in range(3):
    # Fitbit only allows pulls of 100 days of data at a time.
    hunnid_days_later = start_date + datetime.timedelta(days=100)
    step_records = auth2_client.time_series(resource='activities/steps'
             , base_date=start_date.strftime(fitbit_date_format)
             , end_date=hunnid_days_later.strftime(fitbit_date_format))
    for record in step_records['activities-steps']:
        date_list.append(record['dateTime'])
        steps_list.append(record['value'])
        
    start_date = hunnid_days_later + datetime.timedelta(days=1)

In [14]:
df_steps = pd.DataFrame({'Steps': list(map(int,steps_list))}, index=date_list)

## Build Resting Heart Rate Data

In [31]:
hr_list = []
date_list = []
first_of_year = datetime.datetime.strptime('2018-01-01', fitbit_date_format)
start_date = first_of_year
for i in range(3):
    # Fitbit only allows pulls of 100 days of data at a time.
    hunnid_days_later = start_date + datetime.timedelta(days=100)
    hr_records = auth2_client.time_series(resource='activities/heart'
             , base_date=start_date.strftime(fitbit_date_format)
             , end_date=hunnid_days_later.strftime(fitbit_date_format))
    for record in hr_records['activities-heart']:
        if 'restingHeartRate' in record['value']:
            date_list.append(record['dateTime'])
            hr_list.append(record['value']['restingHeartRate'])
        
    start_date = hunnid_days_later + datetime.timedelta(days=1)

In [17]:
df_heartrate = pd.DataFrame({'Heartrate': list(map(int,hr_list))}, index=date_list)

## Consolidate Data

In [19]:
df = pd.merge(df_sleep, df_steps, left_index=True, right_index=True, how="outer")

In [20]:
df = pd.merge(df, df_heartrate, left_index=True, right_index=True, how="outer")
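
Both merges above are outer joins on the date index, so a date that appears in any one frame is kept and the columns missing for that date are filled with `NaN`. A toy sketch with made-up numbers (the `toy_*` names and values are illustrative only, not data from this notebook):

```python
import pandas as pd

# Made-up values for illustration only.
toy_sleep = pd.DataFrame({'MinutesAsleep': [420, 390]},
                         index=['2018-01-01', '2018-01-02'])
toy_steps = pd.DataFrame({'Steps': [8000, 12000]},
                         index=['2018-01-02', '2018-01-03'])

# Outer join on the index keeps the union of dates from both frames.
toy_merged = pd.merge(toy_sleep, toy_steps,
                      left_index=True, right_index=True, how='outer')

# Count the dates that have step data but no sleep record.
missing_sleep = toy_merged['MinutesAsleep'].isna().sum()
```

An inner join would instead drop every date missing from either frame, which would discard days where only some of the metrics were recorded.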

In [22]:
df.to_csv("data")