# Fitbit Exploration
For an explanation of the variables, see the [data dictionary created by Fitabase](https://www.fitabase.com/media/1546/fitabasedatadictionary.pdf).

## Sleep Sensitivity - 1 Variable
In this notebook we look at the individual variables that might affect sleep.

In [1]:
import os
import sys
sys.path.append('../')

import pandas as pd
import numpy as np

from datetime import datetime, timedelta

import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.dates as mdates
from matplotlib import cm
from matplotlib.colors import ListedColormap, LinearSegmentedColormap

from joypy import joyplot

# Data Import
Sleep data are divided into two primary datasets:

1. Sleep Summaries by Day (daily sleep)
2. Sleep Data by Minute (sleep stages)

In [8]:
daily_sleep = pd.read_csv("../data/processed/fitbit-sleep_daily-ux_s20.csv",parse_dates=["date","start_time","end_time"])
# convert the duration from milliseconds to hours
daily_sleep['tst'] = daily_sleep['duration_ms'] / 3600000
# keep only the main sleep episode for each day
daily_sleep = daily_sleep[daily_sleep["main_sleep"] == True]
daily_sleep = daily_sleep.drop(["minutes_to_sleep","main_sleep"],axis=1)
# reorder the remaining columns
daily_sleep = daily_sleep[['beiwe', 'start_time', 'end_time', 'date','tst','duration_ms','minutes_after_wakeup', 'minutes_asleep', 'minutes_awake', 'time_in_bed', 'efficiency']]
daily_sleep.head()

Unnamed: 0,beiwe,start_time,end_time,date,tst,duration_ms,minutes_after_wakeup,minutes_asleep,minutes_awake,time_in_bed,efficiency
0,hfttkth7,2020-05-14 00:27:00,2020-05-14 07:13:00,2020-05-14,6.766667,24360000,0,379,27,406,97
1,hfttkth7,2020-05-14 23:53:30,2020-05-15 08:06:30,2020-05-15,8.216667,29580000,8,392,101,493,87
2,hfttkth7,2020-05-15 23:28:00,2020-05-16 04:57:00,2020-05-16,5.483333,19740000,7,287,42,329,95
3,hfttkth7,2020-05-17 02:01:30,2020-05-17 09:28:30,2020-05-17,7.45,26820000,8,403,44,447,96
4,hfttkth7,2020-05-18 00:24:00,2020-05-18 07:20:00,2020-05-18,6.933333,24960000,0,351,65,416,92
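
As a quick check on the units, the sketch below rebuilds the conversion on a toy frame whose values are copied from the first two rows of the output above. The names `toy_sleep`, `MS_PER_HOUR`, and `MS_PER_MINUTE` are illustrative and not part of the notebook. For these rows, `duration_ms` expressed in minutes matches `time_in_bed` rather than `minutes_asleep`, so `tst` as computed here appears to include time awake in bed.

```python
import pandas as pd

# Toy frame: values copied from the first two rows of the output above
toy_sleep = pd.DataFrame({
    "duration_ms": [24360000, 29580000],
    "minutes_asleep": [379, 392],
    "time_in_bed": [406, 493],  # minutes
})

MS_PER_HOUR = 3_600_000
MS_PER_MINUTE = 60_000

# Same conversion as in the notebook cell: milliseconds to hours
toy_sleep["tst"] = toy_sleep["duration_ms"] / MS_PER_HOUR
# The same duration expressed in minutes, for comparison with the minute columns
toy_sleep["duration_min"] = toy_sleep["duration_ms"] / MS_PER_MINUTE

# True where the duration in minutes equals time_in_bed
toy_sleep["matches_time_in_bed"] = toy_sleep["duration_min"] == toy_sleep["time_in_bed"]
print(toy_sleep)
```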


In [9]:
sleep_stages = pd.read_csv("../data/processed/fitbit-sleep_stages-ux_s20.csv",parse_dates=["start_date","end_date","time"])
sleep_stages.head()

Unnamed: 0,start_date,end_date,time,stage,time_at_stage,beiwe,value
0,2020-05-14,2020-05-14,2020-05-14 00:27:00,wake,510,hfttkth7,0
1,2020-05-14,2020-05-14,2020-05-14 00:35:30,light,420,hfttkth7,1
2,2020-05-14,2020-05-14,2020-05-14 00:42:30,deep,1590,hfttkth7,2
3,2020-05-14,2020-05-14,2020-05-14 01:09:00,light,1290,hfttkth7,1
4,2020-05-14,2020-05-14,2020-05-14 01:30:30,rem,840,hfttkth7,3
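
One way to relate the stage-level data to the daily summaries is to total the time spent in each stage per night. The sketch below does this on a toy frame holding the five rows shown above. It assumes `time_at_stage` is in seconds, which is consistent with the gaps between consecutive `time` values in those rows. The names `toy_stages` and `stage_minutes` are illustrative only.

```python
import pandas as pd

# Toy frame: the five rows shown in the output above
toy_stages = pd.DataFrame({
    "start_date": ["2020-05-14"] * 5,
    "beiwe": ["hfttkth7"] * 5,
    "stage": ["wake", "light", "deep", "light", "rem"],
    "time_at_stage": [510, 420, 1590, 1290, 840],  # assumed to be seconds
})

# Sum the seconds per participant, night, and stage, then convert to minutes
stage_minutes = (
    toy_stages
    .groupby(["beiwe", "start_date", "stage"])["time_at_stage"]
    .sum()
    .div(60)
    .unstack("stage")  # one column per stage
)
print(stage_minutes)
```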


The [data dictionary](https://www.fitabase.com/media/1546/fitabasedatadictionary.pdf) is worth consulting for these variables, since many of them are not useful for this analysis.

# Getting Features
Here we combine datasets from the Fitbit, the EMAs, and the Beacon.

In [21]:
beacon = pd.read_csv("../data/processed/beacon-fb_ema_and_gps_filtered-ux_s20.csv",index_col=0,parse_dates=True)
beacon.columns

Index(['lat', 'long', 'altitude', 'accuracy', 'tvoc', 'lux', 'no2', 'co',
       'co2', 'pm1_number', 'pm2p5_number', 'pm10_number', 'pm1_mass',
       'pm2p5_mass', 'pm10_mass', 'temperature_c', 'rh', 'beacon', 'beiwe',
       'fitbit', 'redcap', 'start_time', 'end_time'],
      dtype='object')

In [53]:
# NOTE: despite its name, beacon_mean holds the range (max - min) of each measurement per night
beacon_by_pt_list = []
for pt in beacon["beiwe"].unique():
    # copy so that later changes do not act on a slice of `beacon`
    beacon_by_pt = beacon[beacon["beiwe"] == pt].copy()
    ids = beacon_by_pt[["end_time","beacon","beiwe","fitbit","redcap"]]
    beacon_by_pt = beacon_by_pt.drop(["beiwe","fitbit","redcap","end_time"],axis=1)
    little = beacon_by_pt.groupby("start_time").min()
    big = beacon_by_pt.groupby("start_time").max()
    beacon_mean_by_pt = big - little
    # assumes each start_time has exactly one end_time, in the same order
    beacon_mean_by_pt["end_time"] = ids["end_time"].unique()
    for col in ids.columns[1:]:
        # restore the identifier columns (positional access to the first row)
        beacon_mean_by_pt[col] = ids[col].iloc[0]
    beacon_by_pt_list.append(beacon_mean_by_pt)

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
beacon_mean = pd.concat(beacon_by_pt_list)
beacon_mean


start_time,lat,long,altitude,accuracy,tvoc,lux,no2,co,co2,pm1_number,...,pm1_mass,pm2p5_mass,pm10_mass,temperature_c,rh,beacon,end_time,beiwe,fitbit,redcap
2020-08-10 04:42:30,0.00018,0.00011,37.31855,40.48316,91.860000,0.102000,,0.663800,267.800265,4.103723,...,0.620745,2.025447,3.560924,1.00,0.94,21,2020-08-10 12:35:30,lkkjddam,25,12
2020-08-12 02:59:30,0.00010,0.00005,12.98383,0.53336,110.880000,0.163200,,0.496840,121.969624,9.710013,...,1.231882,2.793039,4.622537,0.00,1.50,21,2020-08-12 10:52:30,lkkjddam,25,12
2020-08-14 03:05:00,0.00034,0.00014,2.83618,18.97467,79.200000,2.040000,,5.044353,204.770229,5.193445,...,0.881304,2.926095,5.131563,1.24,1.20,21,2020-08-14 11:23:30,lkkjddam,25,12
2020-08-16 04:21:30,0.00007,0.00008,32.37683,11.82401,92.000000,2.040000,,0.393240,80.422490,3.954363,...,0.619553,1.984098,3.324671,0.42,0.56,21,2020-08-16 11:53:00,lkkjddam,25,12
2020-08-17 03:00:00,0.00027,0.00032,41.06684,47.69561,220.250000,2.040000,,0.879920,288.195615,8.872496,...,1.180683,3.044860,5.413904,1.80,1.50,21,2020-08-17 11:30:30,lkkjddam,25,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-08-24 00:25:00,0.00019,0.00027,4.35970,1402.95731,144.360000,4.896000,,0.918640,55.752006,15.724111,...,2.296188,5.513977,9.081423,1.00,5.60,36,2020-08-24 07:15:30,tlmlq19s,9,47
2020-08-25 23:46:30,0.00897,0.01069,7.93757,9551.28418,140.686667,8.160000,,2.288360,107.117194,11.585351,...,1.762849,4.827313,7.705100,2.00,5.80,36,2020-08-26 08:03:00,tlmlq19s,9,47
2020-08-30 01:30:00,0.00131,0.00253,11.62827,1409.35997,97.520000,11.734080,,1.228280,157.346263,7.829587,...,1.150610,3.834445,6.592023,1.00,10.80,36,2020-08-30 08:36:30,tlmlq19s,9,47
2020-08-30 23:42:30,0.00023,0.00028,11.86171,1409.31074,109.880000,4.080000,,2.734040,76.227309,65.495725,...,8.609431,14.160868,19.589573,1.68,7.00,36,2020-08-31 07:26:00,tlmlq19s,9,47
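
The cell above computes, for each participant and night (`start_time`), the difference between the maximum and minimum of every measurement. The same idea can be sketched with a single `groupby` instead of a loop over participants. The toy frame and the names `toy_beacon`, `measurements`, and `toy_range` are hypothetical, and grouping on `end_time` as well assumes each `start_time` has exactly one `end_time`.

```python
import pandas as pd

# Hypothetical toy data laid out like `beacon`, with two measurement columns
toy_beacon = pd.DataFrame({
    "beiwe": ["a", "a", "a", "b", "b"],
    "start_time": pd.to_datetime(["2020-08-10 04:42:30"] * 3 + ["2020-08-24 00:25:00"] * 2),
    "end_time": pd.to_datetime(["2020-08-10 12:35:30"] * 3 + ["2020-08-24 07:15:30"] * 2),
    "co2": [600.0, 750.0, 700.0, 500.0, 540.0],
    "temperature_c": [24.0, 25.0, 24.5, 22.0, 23.0],
})

measurements = ["co2", "temperature_c"]
keys = ["beiwe", "start_time", "end_time"]

# Range (max - min) of each measurement per participant and night
grouped = toy_beacon.groupby(keys)[measurements]
toy_range = (grouped.max() - grouped.min()).reset_index()
print(toy_range)
```

Keeping the identifier columns in the group keys avoids having to reattach them after the aggregation.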


In [11]:
beacon = pd.read_csv("../data/processed/fitbit_beiwe_beacon-sleep_summary-ux_s20.csv")
beacon.columns

Index(['date', 'duration_ms', 'efficiency', 'end_time', 'main_sleep',
       'minutes_after_wakeup', 'minutes_asleep', 'minutes_awake',
       'minutes_to_sleep', 'start_time', 'time_in_bed', 'content', 'stress',
       'lonely', 'sad', 'energy', 'tst', 'sol', 'naw', 'restful', 'beiwe',
       'beacon', 'fitbit', 'redcap'],
      dtype='object')

In [12]:
beacon = pd.read_csv("../data/processed/beacon-fb_ema_and_gps_filtered-ux_s20.csv")

In [15]:
beacon.columns

Index(['timestamp', 'lat', 'long', 'altitude', 'accuracy', 'tvoc', 'lux',
       'no2', 'co', 'co2', 'pm1_number', 'pm2p5_number', 'pm10_number',
       'pm1_mass', 'pm2p5_mass', 'pm10_mass', 'temperature_c', 'rh', 'beacon',
       'beiwe', 'fitbit', 'redcap', 'start_time', 'end_time'],
      dtype='object')

# Reviewing Datasets

In [109]:
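# `morning` and `evening` hold the morning and evening EMA responses (assumed to be loaded in a cell not shown here)
emas_by_pt = []
for pt in morning["beiwe"].unique():
    morning_by_pt = morning[morning["beiwe"] == pt]
    evening_by_pt = evening[evening["beiwe"] == pt]
    ema_by_pt = morning_by_pt.merge(evening_by_pt,on=["date","beiwe"],suffixes=('_morning', '_evening'))
    emas_by_pt.append(ema_by_pt)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
emas = pd.concat(emas_by_pt)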

In [110]:
emas

Unnamed: 0,beiwe,content_morning,stress_morning,lonely_morning,sad_morning,energy_morning,tst,sol,naw,restful,date,content_evening,stress_evening,lonely_evening,sad_evening,energy_evening
0,qh34m4r9,3.0,0.0,0.0,0.0,1.0,8.0,20.0,2.0,3.0,2020-05-13,3.0,1,0.0,0,4
1,qh34m4r9,3.0,0.0,0.0,0.0,3.0,7.0,30.0,3.0,3.0,2020-05-15,3.0,0,0.0,0,3
2,qh34m4r9,3.0,0.0,0.0,1.0,2.0,8.0,5.0,0.0,3.0,2020-05-17,3.0,0,0.0,1,3
3,qh34m4r9,3.0,0.0,0.0,0.0,3.0,9.0,3.0,2.0,3.0,2020-05-18,2.0,0,0.0,0,3
4,qh34m4r9,3.0,0.0,0.0,0.0,2.0,9.0,20.0,1.0,3.0,2020-05-20,3.0,1,0.0,0,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
20,hfttkth7,1.0,3.0,0.0,1.0,2.0,6.5,20.0,1.0,1.0,2020-08-31,1.0,2,1.0,1,2
0,r11k6uxz,1.0,1.0,1.0,0.0,2.0,7.0,5.0,0.0,2.0,2020-08-07,1.0,0,1.0,0,2
1,r11k6uxz,0.0,0.0,0.0,0.0,2.0,7.0,0.0,0.0,1.0,2020-08-09,1.0,2,1.0,0,2
2,r11k6uxz,0.0,0.0,0.0,0.0,2.0,7.0,0.0,0.0,1.0,2020-08-09,1.0,2,1.0,1,2
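
The last two rows above share the same participant and date and differ only in `sad_evening`, which suggests that more than one evening survey was submitted that day and the merge paired the morning row with each of them. A minimal sketch of one way to handle this, on hypothetical toy data: keep a single evening submission per participant and date before merging. Keeping the last one is an arbitrary choice made for illustration.

```python
import pandas as pd

# Hypothetical toy EMAs: participant "p1" submitted two evening surveys on 2020-08-09
toy_morning = pd.DataFrame({
    "beiwe": ["p1", "p1"],
    "date": ["2020-08-07", "2020-08-09"],
    "tst": [7.0, 7.0],
})
toy_evening = pd.DataFrame({
    "beiwe": ["p1", "p1", "p1"],
    "date": ["2020-08-07", "2020-08-09", "2020-08-09"],
    "sad": [0, 0, 1],
})

keys = ["date", "beiwe"]

# Merging directly repeats the 2020-08-09 morning row once per evening submission
naive = toy_morning.merge(toy_evening, on=keys)

# Keep the last evening submission per participant and date (arbitrary choice)
evening_unique = toy_evening.drop_duplicates(subset=keys, keep="last")
# validate raises a MergeError if either side still has duplicate keys
deduped = toy_morning.merge(evening_unique, on=keys, validate="one_to_one")
print(len(naive), len(deduped))
```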


In [129]:
emas = morning.merge(evening,on=["date","beiwe"],suffixes=('_morning', '_evening'))
emas

Unnamed: 0,timestamp_morning,beiwe,content_morning,stress_morning,lonely_morning,sad_morning,energy_morning,tst,sol,naw,restful,date,timestamp_evening,content_evening,stress_evening,lonely_evening,sad_evening,energy_evening
0,2020-05-13 09:10:27,qh34m4r9,3.0,0.0,0.0,0.0,1.0,8.0,20.0,2.0,3.0,2020-05-13,2020-05-13 21:00:18,3.0,1,0.0,0,4
1,2020-05-13 09:15:49,awa8uces,0.0,2.0,1.0,1.0,1.0,2.0,10.0,3.0,1.0,2020-05-13,2020-05-13 19:00:23,1.0,1,1.0,3,2
2,2020-05-13 09:42:19,xxvnhauv,1.0,1.0,1.0,3.0,0.0,6.0,30.0,3.0,1.0,2020-05-13,2020-05-13 20:07:04,1.0,3,1.0,2,0
3,2020-05-13 09:43:27,rvhdl2la,1.0,1.0,2.0,3.0,0.0,5.3,5.0,2.0,2.0,2020-05-13,2020-05-13 19:30:38,2.0,1,0.0,0,1
4,2020-05-13 12:30:38,lkkjddam,1.0,1.0,3.0,3.0,2.0,7.0,45.0,2.0,1.0,2020-05-13,2020-05-13 19:21:32,0.0,2,3.0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1735,2020-09-01 13:01:00,2xtqkfz1,2.0,1.0,2.0,1.0,1.0,7.0,0.0,0.0,2.0,2020-09-01,2020-09-01 13:00:40,3.0,1,1.0,1,2
1736,2020-09-01 13:10:16,7dhu3pn7,2.0,2.0,0.0,0.0,2.0,8.0,5.0,0.0,3.0,2020-09-01,2020-09-01 12:20:46,3.0,2,0.0,0,2
1737,2020-09-01 14:14:17,745vq78e,3.0,0.0,0.0,0.0,2.0,7.6,0.0,1.0,2.0,2020-09-01,2020-09-01 14:13:49,3.0,0,0.0,0,2
1738,2020-09-01 17:28:26,axk49ssu,2.0,2.0,0.0,1.0,1.0,7.0,40.0,3.0,1.0,2020-09-01,2020-09-01 17:28:35,2.0,2,0.0,1,1
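
In some of the rows above, `timestamp_evening` is earlier than `timestamp_morning` (rows 1735 to 1737, for example). A small sketch of how such rows could be flagged for review, using a toy frame with two pairs of timestamps copied from the output; `toy_emas` and `evening_before_morning` are illustrative names.

```python
import pandas as pd

# Toy frame: timestamps copied from rows 0 and 1736 of the output above
toy_emas = pd.DataFrame({
    "timestamp_morning": pd.to_datetime(["2020-05-13 09:10:27", "2020-09-01 13:10:16"]),
    "timestamp_evening": pd.to_datetime(["2020-05-13 21:00:18", "2020-09-01 12:20:46"]),
})

# Flag rows where the evening survey was submitted before the morning survey
toy_emas["evening_before_morning"] = (
    toy_emas["timestamp_evening"] < toy_emas["timestamp_morning"]
)
print(toy_emas)
```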


In [151]:
df1 = pd.read_csv("../data/processed/fitbit-sleep_data_summary-ux_s20.csv",parse_dates=["end_date"])
df1

Unnamed: 0,start_date,end_date,deep_count,deep_minutes,light_count,light_minutes,rem_count,rem_minutes,wake_count,wake_minutes,...,duration_ms,efficiency,end_time,main_sleep,minutes_after_wakeup,minutes_asleep,minutes_awake,minutes_to_sleep,start_time,time_in_bed
0,2020-05-14,2020-05-14,5,84,20,213,10,82,21,27,...,24360000,97,2020-05-14T07:13:00.000,True,0,379,27,0,2020-05-14T00:27:00.000,406
1,2020-05-14,2020-05-15,4,95,31,250,6,47,33,101,...,29580000,87,2020-05-15T08:06:30.000,True,8,392,101,0,2020-05-14T23:53:30.000,493
2,2020-05-15,2020-05-16,2,47,17,190,8,50,20,42,...,19740000,95,2020-05-16T04:57:00.000,True,7,287,42,0,2020-05-15T23:28:00.000,329
3,2020-05-17,2020-05-17,5,78,21,242,11,83,25,44,...,26820000,96,2020-05-17T09:28:30.000,True,8,403,44,0,2020-05-17T02:01:30.000,447
4,2020-05-18,2020-05-18,5,96,20,167,14,88,28,65,...,24960000,92,2020-05-18T07:20:00.000,True,0,351,65,0,2020-05-18T00:24:00.000,416
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3041,2020-09-03,2020-09-03,4,78,29,207,7,49,28,69,...,24180000,95,2020-09-03T07:46:30.000,True,0,334,69,0,2020-09-03T01:03:30.000,403
3042,2020-09-05,2020-09-05,2,36,26,214,5,77,28,64,...,23460000,93,2020-09-05T08:30:00.000,True,0,327,64,0,2020-09-05T01:59:00.000,391
3043,2020-09-05,2020-09-06,2,80,29,228,15,119,37,77,...,30240000,92,2020-09-06T07:49:00.000,True,0,427,77,0,2020-09-05T23:25:00.000,504
3044,2020-09-07,2020-09-07,3,64,23,209,13,120,30,57,...,27000000,90,2020-09-07T08:13:00.000,True,0,393,57,0,2020-09-07T00:43:00.000,450


In [155]:
df2 = pd.read_csv("../data/processed/fitbit-daily-ux_s20.csv",parse_dates=["timestamp"])
df2

Unnamed: 0,timestamp,calories,bmr,steps,distance,sedentary_minutes,lightly_active_minutes,fairly_active_minutes,very_active_minutes,calories_from_activities,bmi,fat,weight,food_calories_logged,water_logged,beiwe
0,2020-05-13,2781.0,1876.0,9207,4.396294,1241,70,118,11,1097.0,23.754000,0.0,180.0,0.0,0.0,hfttkth7
1,2020-05-14,3727.0,1876.0,15207,7.261114,614,263,134,23,2234.0,23.754000,0.0,180.0,0.0,0.0,hfttkth7
2,2020-05-15,3909.0,1876.0,14556,8.028501,577,205,57,108,2381.0,23.754000,0.0,180.0,0.0,0.0,hfttkth7
3,2020-05-16,3927.0,1876.0,18453,8.748670,760,176,24,151,2364.0,23.754000,0.0,180.0,0.0,0.0,hfttkth7
4,2020-05-17,4180.0,1876.0,15425,7.973149,605,207,50,131,2652.0,23.754000,0.0,180.0,0.0,0.0,hfttkth7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4481,2020-09-05,2048.0,1541.0,3103,1.357559,882,160,6,1,527.0,19.596317,0.0,125.0,0.0,0.0,e8js2jdf
4482,2020-09-06,1992.0,1541.0,2551,1.116337,782,154,0,0,483.0,19.596317,0.0,125.0,0.0,0.0,e8js2jdf
4483,2020-09-07,2180.0,1541.0,5014,2.193546,768,222,0,0,740.0,19.596317,0.0,125.0,0.0,0.0,e8js2jdf
4484,2020-09-08,1886.0,1541.0,1765,0.772315,872,120,0,0,366.0,19.596317,0.0,125.0,0.0,0.0,e8js2jdf


In [157]:
df1.merge(df2,left_on=["end_date","beiwe"],right_on=["timestamp","beiwe"])

Unnamed: 0,start_date,end_date,deep_count,deep_minutes,light_count,light_minutes,rem_count,rem_minutes,wake_count,wake_minutes,...,sedentary_minutes,lightly_active_minutes,fairly_active_minutes,very_active_minutes,calories_from_activities,bmi,fat,weight,food_calories_logged,water_logged
0,2020-05-14,2020-05-14,5,84,20,213,10,82,21,27,...,614,263,134,23,2234.0,23.754000,0.0,180.0,0.0,0.0
1,2020-05-14,2020-05-15,4,95,31,250,6,47,33,101,...,577,205,57,108,2381.0,23.754000,0.0,180.0,0.0,0.0
2,2020-05-15,2020-05-16,2,47,17,190,8,50,20,42,...,760,176,24,151,2364.0,23.754000,0.0,180.0,0.0,0.0
3,2020-05-17,2020-05-17,5,78,21,242,11,83,25,44,...,605,207,50,131,2652.0,23.754000,0.0,180.0,0.0,0.0
4,2020-05-18,2020-05-18,5,96,20,167,14,88,28,65,...,637,224,46,117,2561.0,23.754000,0.0,180.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3041,2020-09-03,2020-09-03,4,78,29,207,7,49,28,69,...,872,165,0,0,531.0,19.596317,0.0,125.0,0.0,0.0
3042,2020-09-05,2020-09-05,2,36,26,214,5,77,28,64,...,882,160,6,1,527.0,19.596317,0.0,125.0,0.0,0.0
3043,2020-09-05,2020-09-06,2,80,29,228,15,119,37,77,...,782,154,0,0,483.0,19.596317,0.0,125.0,0.0,0.0
3044,2020-09-07,2020-09-07,3,64,23,209,13,120,30,57,...,768,222,0,0,740.0,19.596317,0.0,125.0,0.0,0.0
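
The merge above is an inner join, so a sleep record with no daily activity row for the same participant and date would be dropped silently. A hedged sketch of how to check for this on hypothetical toy data: use a left join with `indicator=True` and inspect the `_merge` column. The frames `toy_nights` and `toy_daily` are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frames: "p2" has a sleep record but no daily activity row
toy_nights = pd.DataFrame({
    "beiwe": ["p1", "p1", "p2"],
    "end_date": pd.to_datetime(["2020-05-14", "2020-05-15", "2020-05-14"]),
    "minutes_asleep": [379, 392, 350],
})
toy_daily = pd.DataFrame({
    "beiwe": ["p1", "p1"],
    "timestamp": pd.to_datetime(["2020-05-14", "2020-05-15"]),
    "steps": [15207, 14556],
})

# Left join keeps every sleep record; `_merge` records where each row came from
merged = toy_nights.merge(
    toy_daily,
    left_on=["end_date", "beiwe"],
    right_on=["timestamp", "beiwe"],
    how="left",
    indicator=True,
)

# Sleep records without a matching daily activity row
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["beiwe", "end_date"]])
```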


In [80]:
df3 = pd.read_csv("../data/processed/fitbit_beiwe_beacon-sleep_summary-ux_s20.csv")
df3

Unnamed: 0,date,start_date,end_date,deep_count,deep_minutes,light_count,light_minutes,rem_count,rem_minutes,wake_count,...,sad,energy,tst,sol,naw,restful,beiwe,beacon,fitbit,redcap
0,2020-08-10,2020-08-10,2020-08-10,2,41,26,285,2,71,25,...,0.0,2.0,6.0,15.0,5.0,0.0,lkkjddam,21,25,12
1,2020-08-12,2020-08-12,2020-08-12,3,52,34,291,7,65,35,...,0.0,2.0,8.0,10.0,3.0,2.0,lkkjddam,21,25,12
2,2020-08-14,2020-08-14,2020-08-14,4,49,38,299,6,76,38,...,2.0,3.0,8.0,10.0,3.0,3.0,lkkjddam,21,25,12
3,2020-08-16,2020-08-16,2020-08-16,4,79,27,200,5,97,25,...,2.0,1.0,6.0,20.0,4.0,1.0,lkkjddam,21,25,12
4,2020-08-17,2020-08-17,2020-08-17,4,85,42,263,8,94,36,...,1.0,1.0,6.0,25.0,2.0,1.0,lkkjddam,21,25,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
155,2020-08-24,2020-08-24,2020-08-24,2,84,18,194,12,99,26,...,0.0,3.0,7.0,10.0,2.0,3.0,tlmlq19s,36,9,47
156,2020-08-26,2020-08-25,2020-08-26,3,71,33,256,14,91,39,...,0.0,1.0,7.0,20.0,3.0,2.0,tlmlq19s,36,9,47
157,2020-08-30,2020-08-30,2020-08-30,2,72,23,214,12,80,30,...,0.0,3.0,7.0,15.0,3.0,2.0,tlmlq19s,36,9,47
158,2020-08-31,2020-08-30,2020-08-31,3,67,28,286,9,40,35,...,0.0,2.0,7.0,10.0,2.0,2.0,tlmlq19s,36,9,47
