# Fitbit tracker data review 

#### Description  
In this notebook, we review the contents of a Fitbit tracker data download to see whether it can help us investigate a research goal.   

I use my own Fitbit tracker data download and my research goal as an example here. If you use a Fitbit tracker, you can download your Fitbit data using [these instructions](https://help.fitbit.com/articles/en_US/Help_article/1133.htm) and follow along.  

**My research goal:** to better understand how my wellness habits affect how well I feel.

#### Guiding questions for this data review:  
- What kind of data is available?  
- What is the format of the data?  
- For time series data, what are the frequency and date ranges of the data?  
- Does the data appear to have lots of missing values?  
- How can this data help investigate the research goal?
- Is this data enough for the research goal? If not, what information is missing?  

#### Steps:
- Download the Fitbit tracker data and unzip it    
- Browse the folders  
- Browse any documentation/README files included  
- Take a look at the contents of the data files  
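
The "browse the folders" step can also be done from Python. The sketch below is one possible approach, not part of the original workflow: the helper name `summarize_folders` is hypothetical, and it only looks at files directly inside each top-level folder of the unzipped download.

```python
from collections import Counter
from pathlib import Path


def summarize_folders(root):
    """Return {folder name: Counter of file extensions} for each subfolder of root."""
    summary = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        # Count the files directly inside the folder by extension (".json", ".csv", ...)
        extensions = Counter(
            f.suffix.lower() or "(none)" for f in folder.iterdir() if f.is_file()
        )
        summary[folder.name] = extensions
    return summary
```

Calling `summarize_folders(data_path)` on the unzipped download gives a quick overview: a folder with an empty counter has no files in it and is a candidate for deletion.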

 
Depending on your Fitbit device and subscriptions, some of the folders in the download may be empty. Others contain data that is irrelevant here, such as profile information. Folders for Fitbit features you rarely used are also unlikely to hold useful data. To simplify things, you can delete any empty or irrelevant folders.  

I use a very basic Fitbit tracker, the Inspire HR, with a free Fitbit account. My download has relevant data in the following folders:  
- Physical Activity  
- Sleep  
- Menstrual Health  

Your available tracker data may be very different, but the steps to review it at a high level should be similar.  

Let's review the data.

In [None]:
import pandas as pd

import os
from dotenv import load_dotenv

In [None]:
load_dotenv()  # take environment variables from .env.

In [None]:
data_path = os.environ.get("DATAPATH")
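
If `DATAPATH` is missing from the `.env` file, `os.environ.get` returns `None` and the problem only shows up later, in the shell commands. A small check like the following can fail early with a clearer message. This is a sketch only: the helper name `resolve_data_path` is hypothetical and not part of the original notebook.

```python
import os
from pathlib import Path


def resolve_data_path(env_var="DATAPATH"):
    """Return the export folder named by env_var, or raise a clear error."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file")
    path = Path(value).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"{env_var} points to a missing folder: {path}")
    return path
```

It would be called after `load_dotenv()`, for example as `data_path = resolve_data_path()`.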

In [None]:
!ls $data_path

## Physical Activity data

In [None]:
folder_name = r'Physical\ Activity'  # raw string: the backslash escapes the space for the shell

In [None]:
!ls $data_path/$folder_name | grep ".txt"

In [None]:
!ls $data_path/$folder_name | grep "README"

In [None]:
!cat $data_path/$folder_name/Daily\ Readiness\ README.txt

The "Physical Activity" folder contains a large number of files. The JSON data files are named by measurement type, followed by a daily or monthly date.  

This list of files shows the available measurement types:

In [None]:
%%capture capt

!ls $data_path/$folder_name 

In [None]:
file_names_list = capt.stdout.splitlines()  # handles both '\n' and '\r\n' line endings

In [None]:
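x_last_prefix = ''
j = 0  # number of file names printed for the current prefix
# Walk the names in reverse order so that, within each prefix, later dates come first
for x in reversed(file_names_list):
    if not x:
        continue  # skip empty strings left over from splitting the output
    # The measurement type is the part of the name before the first '-'
    x_prefix = x.split('-')[0]
    if x_prefix != x_last_prefix:
        # New measurement type: print a header and reset the counter
        print('\n-------')
        print(f'{x_prefix}\n')
        j = 0

    # Show at most five example files per measurement type
    if j < 5:
        print(x)

    j += 1
    x_last_prefix = x_prefix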
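
The same grouping is repeated for each folder in this notebook, so it could be wrapped in a reusable function. The version below is a sketch under that assumption. The name `group_by_prefix` and the `max_per_group` parameter are hypothetical, and it returns a dictionary instead of printing.

```python
from collections import defaultdict


def group_by_prefix(file_names, max_per_group=5):
    """Group file names by the text before the first '-', keeping a few examples each."""
    groups = defaultdict(list)
    # Reverse-sorted so that, within each prefix, later dates come first
    for name in sorted(file_names, reverse=True):
        if not name:
            continue  # skip empty strings left over from splitting
        prefix = name.split("-")[0]
        if len(groups[prefix]) < max_per_group:
            groups[prefix].append(name)
    return dict(groups)
```

For example, `group_by_prefix(file_names_list).keys()` lists the measurement types found in the current folder.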

Let's look at some sample raw data in those files.

In [None]:
!head -n 20 $data_path/$folder_name/calories-2023-01-06.json

In [None]:
!head -n 20 $data_path/$folder_name/distance-2023-01-06.json

In [None]:
!head -n 20 $data_path/$folder_name/heart_rate-2023-01-06.json

In [None]:
!head -n 20 $data_path/$folder_name/lightly_active_minutes-2023-01-06.json

In [None]:
!head -n 20 $data_path/$folder_name/steps-2023-01-06.json

In [None]:
!tail -n 20 $data_path/$folder_name/steps-2023-01-06.json
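
To answer the guiding questions about frequency, date range, and missing values, a file like this can be loaded into pandas. The sketch below does not read the real file. It uses a made-up sample and assumes each file is a list of records with a `dateTime` string in `MM/DD/YY HH:MM:SS` form and a `value` string. Check the `head` output above and adjust the key names and date format if your files differ.

```python
import json

import pandas as pd

# Made-up sample that mimics the ASSUMED layout of a measurement file
sample_json = """[
  {"dateTime": "01/06/23 00:00:00", "value": "0"},
  {"dateTime": "01/06/23 00:01:00", "value": "12"},
  {"dateTime": "01/06/23 00:02:00", "value": "7"},
  {"dateTime": "01/06/23 00:04:00", "value": "3"}
]"""


def load_series(text):
    """Parse the JSON text into a numeric Series indexed by timestamp."""
    df = pd.DataFrame(json.loads(text))
    df["dateTime"] = pd.to_datetime(df["dateTime"], format="%m/%d/%y %H:%M:%S")
    # Values that cannot be parsed as numbers become NaN
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.set_index("dateTime")["value"].sort_index()


steps = load_series(sample_json)
summary = {
    "start": steps.index.min(),
    "end": steps.index.max(),
    # Most common gap between consecutive records
    "typical_interval": steps.index.to_series().diff().mode().iloc[0],
    "missing_values": int(steps.isna().sum()),
}
print(summary)
```

For a real file, pass its contents to `load_series`, for example the text read from `steps-2023-01-06.json`.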

In [None]:
!head -n 20 $data_path/$folder_name/time_in_heart_rate_zones-2023-01-06.json

In [None]:
!ls $data_path/$folder_name | grep "demographic_vo2_max-.*.json"

In [None]:
!head -n 20 $data_path/$folder_name/demographic_vo2_max-2022-10-18.json

In [None]:
!head -n 100 $data_path/$folder_name/exercise-900.json

In [None]:
!head -n 20 $data_path/$folder_name/resting_heart_rate-2022-10-18.json

## Sleep data

In [None]:
folder_name = 'Sleep'

In [None]:
!ls $data_path/$folder_name | grep ".txt"

In [None]:
!ls $data_path/$folder_name | grep "README"

This list of files shows the available measurement types:

In [None]:
%%capture capt

!ls $data_path/$folder_name 

In [None]:
file_names_list = capt.stdout.splitlines()  # handles both '\n' and '\r\n' line endings

In [None]:
x_last_prefix = ''
j = 0
for x in list(reversed(file_names_list)):
    x_prefix = x.split('-')[0]
    if x_prefix != x_last_prefix:
        print('\n-------')
        print(f'{x_prefix}\n')
        j = 0
        
    if j < 5:
        print(x)  
        
    j += 1 
    x_last_prefix = x_prefix

### Sleep score

In [None]:
!head -n 10 $data_path/$folder_name/"sleep_score.csv"
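
A daily file like this is a natural place to check for missing days. The sketch below uses only the standard library and a made-up sample. The column names `timestamp` and `overall_score`, and the ISO-style timestamp that starts with `YYYY-MM-DD`, are assumptions. Compare them with the `head` output above before reusing this.

```python
import csv
import io
from datetime import datetime, timedelta

# Made-up sample; the column names and timestamp format are ASSUMPTIONS
sample_csv = """timestamp,overall_score
2023-04-01T07:10:00Z,81
2023-04-02T06:55:00Z,77
2023-04-05T07:30:00Z,84
"""


def missing_days(csv_text, date_column="timestamp"):
    """Return the calendar days between the first and last row that have no entry."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Keep only the date part (first 10 characters) of each timestamp
    days = {datetime.strptime(row[date_column][:10], "%Y-%m-%d").date() for row in rows}
    first, last = min(days), max(days)
    all_days = (first + timedelta(days=i) for i in range((last - first).days + 1))
    return [day for day in all_days if day not in days]


gaps = missing_days(sample_csv)
print(gaps)
```

Long runs of missing days would suggest periods when the tracker was not worn at night, which matters when relating sleep to other habits.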

### Sleep data

In [None]:
!head -n 100 $data_path/$folder_name/"sleep-2023-04-06.json"

### Respiratory Rate Summary

In [None]:
!cat $data_path/$folder_name/"Respiratory Rate README.txt"

In [None]:
!head -n 10 $data_path/$folder_name/"Respiratory Rate Summary - 2023-04-01.csv"

### Daily Respiratory Rate Summary

In [None]:
!cat $data_path/$folder_name/"Daily Respiratory Rate Summary README.txt"

In [None]:
!head -n 20 $data_path/$folder_name/"Daily Respiratory Rate Summary - 2023-04-23.csv"

In [None]:
!head -n 20 $data_path/$folder_name/"Daily Respiratory Rate Summary - 2023-04-22.csv"

### Daily Heart Rate Variability Summary

In [None]:
!cat $data_path/$folder_name/"Daily Heart Rate Variability Summary README.txt"

Let's look at some sample raw data in those files.

In [None]:
!head -n 20 $data_path/$folder_name/"Daily Heart Rate Variability Summary - 2023-04-23.csv"

In [None]:
!head -n 20 $data_path/$folder_name/"Daily Heart Rate Variability Summary - 2023-04-22.csv"

### Heart Rate Variability Histogram

In [None]:
!head -n 20 $data_path/$folder_name/"Heart Rate Variability Histogram - 2023-04-01.csv"

### Heart Rate Variability Details

In [None]:
!head -n 20 $data_path/$folder_name/"Heart Rate Variability Details - 2023-04-23.csv"

## Menstrual Health data

In [None]:
folder_name = r'Menstrual\ Health'  # raw string: the backslash escapes the space for the shell

In [None]:
!ls $data_path/$folder_name | grep ".txt"

In [None]:
!ls $data_path/$folder_name | grep "README"

This list of files shows the available measurement types:

In [None]:
%%capture capt

!ls $data_path/$folder_name 

In [None]:
file_names_list = capt.stdout.splitlines()  # handles both '\n' and '\r\n' line endings

In [None]:
x_last_prefix = ''
j = 0
for x in list(reversed(file_names_list)):
    x_prefix = x.split('-')[0]
    if x_prefix != x_last_prefix:
        print('\n-------')
        print(f'{x_prefix}\n')
        j = 0
        
    if j < 5:
        print(x)  
        
    j += 1 
    x_last_prefix = x_prefix

In [None]:
!cat $data_path/$folder_name/"menstrual_health_README.txt"

In [None]:
!head -n 10 $data_path/$folder_name/"menstrual_health_cycles.csv"