# Fitbit Exploration
For an explanation of the variables, see the [data dictionary created by Fitabase](https://www.fitabase.com/media/1546/fitabasedatadictionary.pdf).

## Sleep Sensitivity - 1 Variable
In this notebook we look at the individual variables that might affect sleep.

In [1]:
import os
import sys
sys.path.append('../')

import pandas as pd
import numpy as np

from datetime import datetime, timedelta

import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.dates as mdates
from matplotlib import cm
from matplotlib.colors import ListedColormap, LinearSegmentedColormap

from joypy import joyplot

# Data Import
Sleep data are divided into two primary datasets:

1. Sleep Summaries by Day (daily sleep)
2. Sleep Data by Minute (sleep stages)

In [2]:
daily_sleep = pd.read_csv("../data/processed/fitbit-sleep_daily-ux_s20.csv",parse_dates=["date","start_time","end_time"],infer_datetime_format=True)
# convert duration from milliseconds to hours (total sleep time, tst)
daily_sleep['tst'] = daily_sleep['duration_ms'] / 3600000
# keep only the main sleep episode; copy so the in-place drop below does not act on a slice
daily_sleep = daily_sleep[daily_sleep["main_sleep"] == True].copy()
daily_sleep.drop(["minutes_to_sleep","main_sleep"],axis=1,inplace=True)
daily_sleep = daily_sleep[['beiwe', 'start_time', 'end_time', 'date','tst','duration_ms','minutes_after_wakeup', 'minutes_asleep', 'minutes_awake', 'time_in_bed', 'efficiency']]
daily_sleep.head()

Unnamed: 0,beiwe,start_time,end_time,date,tst,duration_ms,minutes_after_wakeup,minutes_asleep,minutes_awake,time_in_bed,efficiency
0,hfttkth7,2020-05-14 00:27:00,2020-05-14 07:13:00,2020-05-14,6.766667,24360000,0,379,27,406,97
1,hfttkth7,2020-05-14 23:53:30,2020-05-15 08:06:30,2020-05-15,8.216667,29580000,8,392,101,493,87
2,hfttkth7,2020-05-15 23:28:00,2020-05-16 04:57:00,2020-05-16,5.483333,19740000,7,287,42,329,95
3,hfttkth7,2020-05-17 02:01:30,2020-05-17 09:28:30,2020-05-17,7.45,26820000,8,403,44,447,96
4,hfttkth7,2020-05-18 00:24:00,2020-05-18 07:20:00,2020-05-18,6.933333,24960000,0,351,65,416,92
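
The `tst` column above is `duration_ms` expressed in hours (3,600,000 ms per hour). A minimal sketch of that conversion, using the first two rows of the output as sample values and cross-checking against `time_in_bed` (in minutes). The names `sample`, `MS_PER_HOUR`, and `tib_hours` are illustrative and not part of the notebook:

```python
import pandas as pd

# Sample values copied from the first two rows of the daily sleep output
sample = pd.DataFrame({
    "duration_ms": [24360000, 29580000],
    "time_in_bed": [406, 493],  # minutes
})

MS_PER_HOUR = 60 * 60 * 1000  # 3,600,000 ms in one hour

# Total sleep time in hours
sample["tst"] = sample["duration_ms"] / MS_PER_HOUR

# Cross-check: for these rows, time in bed converted to hours matches tst
sample["tib_hours"] = sample["time_in_bed"] / 60
print(sample)
```

For these two nights the two columns agree, which suggests `duration_ms` covers the same interval as `time_in_bed`; that may not hold for every record.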


In [3]:
sleep_stages = pd.read_csv("../data/processed/fitbit-sleep_stages-ux_s20.csv",parse_dates=["start_date","end_date","time"],infer_datetime_format=True)
sleep_stages.head()

Unnamed: 0,start_date,end_date,time,stage,time_at_stage,beiwe,value
0,2020-05-14,2020-05-14,2020-05-14 00:27:00,wake,510,hfttkth7,0
1,2020-05-14,2020-05-14,2020-05-14 00:35:30,light,420,hfttkth7,1
2,2020-05-14,2020-05-14,2020-05-14 00:42:30,deep,1590,hfttkth7,2
3,2020-05-14,2020-05-14,2020-05-14 01:09:00,light,1290,hfttkth7,1
4,2020-05-14,2020-05-14,2020-05-14 01:30:30,rem,840,hfttkth7,3
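
Judging from the timestamps in the output above, `time_at_stage` appears to be in seconds (for example, 00:27:00 plus 510 s is 00:35:30, the start of the next stage). Under that assumption, the following sketch totals the minutes spent in each stage per participant and night, using the five rows shown as sample data. The names `stages` and `stage_minutes` are illustrative:

```python
import pandas as pd

# Sample rows copied from the sleep stages output
stages = pd.DataFrame({
    "start_date": ["2020-05-14"] * 5,
    "beiwe": ["hfttkth7"] * 5,
    "stage": ["wake", "light", "deep", "light", "rem"],
    "time_at_stage": [510, 420, 1590, 1290, 840],  # assumed to be seconds
})

# Sum the seconds per stage for each participant-night, then convert to minutes
stage_minutes = (
    stages.groupby(["beiwe", "start_date", "stage"])["time_at_stage"]
    .sum()
    .div(60)
    .unstack("stage")
)
print(stage_minutes)
```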


The [data dictionary](https://www.fitabase.com/media/1546/fitabasedatadictionary.pdf) is worth consulting for these variables, since many of them are not useful for this analysis.

# Getting Features
Here we combine datasets from the Fitbit, the EMAs, and the Beacon.

In [4]:
beacon = pd.read_csv("../data/processed/beacon-fb_ema_and_gps_filtered-ux_s20.csv",index_col=0,parse_dates=True,infer_datetime_format=True)
beacon.columns

Index(['lat', 'long', 'altitude', 'accuracy', 'tvoc', 'lux', 'no2', 'co',
       'co2', 'pm1_number', 'pm2p5_number', 'pm10_number', 'pm1_mass',
       'pm2p5_mass', 'pm10_mass', 'temperature_c', 'rh', 'beacon', 'beiwe',
       'fitbit', 'redcap', 'start_time', 'end_time'],
      dtype='object')

In [5]:
# NOTE: despite its name, beacon_mean holds the per-night range (max - min) of each variable
beacon_parts = []
for pt in beacon["beiwe"].unique():
    # copy so the in-place drop below does not act on a slice of `beacon`
    beacon_by_pt = beacon[beacon["beiwe"] == pt].copy()
    ids = beacon_by_pt[["end_time","beacon","beiwe","fitbit","redcap"]]
    beacon_by_pt.drop(["beiwe","fitbit","redcap","end_time"],axis=1,inplace=True)
    little = beacon_by_pt.groupby("start_time").min()
    big = beacon_by_pt.groupby("start_time").max()
    beacon_mean_by_pt = big - little
    # assumes one end_time per start_time, in the same order as the sorted groups
    beacon_mean_by_pt["end_time"] = ids["end_time"].unique()
    for col in ids.columns[1:]:
        # positional lookup; the frame is indexed by timestamp, not by integers
        beacon_mean_by_pt[col] = ids[col].iloc[0]
    beacon_parts.append(beacon_mean_by_pt)

# DataFrame.append was removed in pandas 2.0; concatenate the pieces instead
beacon_mean = pd.concat(beacon_parts)
beacon_mean


start_time,lat,long,altitude,accuracy,tvoc,lux,no2,co,co2,pm1_number,...,pm1_mass,pm2p5_mass,pm10_mass,temperature_c,rh,beacon,end_time,beiwe,fitbit,redcap
2020-08-10 04:42:30,0.00018,0.00011,37.31855,40.48316,91.900000,0.1360,,0.731650,304.035761,2.009295,...,0.202818,0.620298,0.627820,1.000,0.950000,21,2020-08-10 12:35:30,lkkjddam,25,12
2020-08-12 02:59:30,0.00010,0.00005,12.98383,0.53336,108.600000,0.2040,,0.512533,146.009840,5.033866,...,0.429567,0.702017,0.657346,0.000,1.500000,21,2020-08-12 10:52:30,lkkjddam,25,12
2020-08-14 03:05:00,0.00034,0.00014,2.83618,18.97467,81.800000,2.0400,,5.108383,233.372161,3.178799,...,0.254569,0.736714,0.783074,1.175,1.250000,21,2020-08-14 11:23:30,lkkjddam,25,12
2020-08-16 04:21:30,0.00007,0.00008,32.37683,11.82401,88.216667,2.0400,,0.312050,96.153213,3.125266,...,0.216821,0.471061,0.445183,0.500,0.583333,21,2020-08-16 11:53:00,lkkjddam,25,12
2020-08-17 03:00:00,0.00027,0.00032,41.06684,47.69561,249.066667,2.0400,,0.819100,331.097623,6.091469,...,0.488183,1.358703,1.371396,1.750,1.500000,21,2020-08-17 11:30:30,lkkjddam,25,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-08-24 00:25:00,0.00019,0.00027,4.35970,1402.95731,95.916667,4.5424,,0.805100,58.582825,14.461965,...,0.956106,1.626959,1.629446,1.000,5.500000,36,2020-08-24 07:15:30,tlmlq19s,9,47
2020-08-25 23:46:30,0.00897,0.01069,7.93757,9551.28418,140.400000,8.0784,,2.340050,115.775437,8.941261,...,0.702543,1.098767,1.033726,2.000,6.000000,36,2020-08-26 08:03:00,tlmlq19s,9,47
2020-08-30 01:30:00,0.00131,0.00253,11.62827,1409.35997,98.950000,11.5600,,1.313850,166.573677,5.201494,...,0.370893,0.974239,1.013575,1.000,11.000000,36,2020-08-30 08:36:30,tlmlq19s,9,47
2020-08-30 23:42:30,0.00023,0.00028,11.86171,1409.31074,88.700000,4.0800,,2.661700,80.088870,56.705639,...,3.394869,3.543439,2.882672,1.250,7.000000,36,2020-08-31 07:26:00,tlmlq19s,9,47
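
The loop above computes, for each participant and night (`start_time`), the range (maximum minus minimum) of every beacon variable. The same idea can be sketched as a single grouped aggregation. The toy values, the night labels `n1`/`n2`, and the reduced set of columns below are illustrative assumptions, not the study data:

```python
import pandas as pd

# Toy measurements: two participants, with night labels standing in for start_time
toy = pd.DataFrame({
    "beiwe": ["a", "a", "a", "b", "b"],
    "start_time": ["n1", "n1", "n2", "n1", "n1"],
    "co2": [400.0, 450.0, 500.0, 600.0, 650.0],
    "tvoc": [10.0, 30.0, 20.0, 5.0, 6.0],
})

# Group by participant and night, then take max - min of each measured variable
grouped = toy.groupby(["beiwe", "start_time"])[["co2", "tvoc"]]
delta = grouped.max() - grouped.min()
print(delta)
```

Grouping on both keys at once avoids the per-participant loop and keeps `beiwe` in the index, so identifier columns do not have to be reattached afterwards.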


In [6]:
beacon = pd.read_csv("../data/processed/fitbit_beiwe_beacon-sleep_summary-ux_s20.csv")
beacon.columns

Index(['date', 'start_date', 'end_date', 'deep_count', 'deep_minutes',
       'light_count', 'light_minutes', 'rem_count', 'rem_minutes',
       'wake_count', 'wake_minutes', 'beiwe', 'tst_fb', 'efficiency',
       'end_time', 'minutes_after_wakeup', 'minutes_asleep', 'minutes_awake',
       'minutes_to_sleep', 'start_time', 'time_in_bed', 'redcap_x', 'beacon_x',
       'tst_ema', 'sol_ema', 'naw_ema', 'restful_ema', 'beacon_y', 'fitbit',
       'redcap_y'],
      dtype='object')
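
Column names such as `redcap_x` and `redcap_y` in the index above are what pandas produces by default when both sides of a merge carry a same-named column that is not a join key. The sketch below shows one way to avoid the duplicates, assuming the identifiers agree for a given participant. The frames `fitbit` and `ema` and their values are hypothetical:

```python
import pandas as pd

# Hypothetical frames that share an identifier column besides the join key
fitbit = pd.DataFrame({"beiwe": ["a", "b"], "redcap": [1, 2], "tst_fb": [7.1, 6.4]})
ema = pd.DataFrame({"beiwe": ["a", "b"], "redcap": [1, 2], "tst_ema": [7.0, 6.0]})

# Default behavior: the shared non-key column is duplicated with _x/_y suffixes
dup = fitbit.merge(ema, on="beiwe")

# Including every shared identifier in the join keys keeps a single copy
clean = fitbit.merge(ema, on=["beiwe", "redcap"])

print(list(dup.columns))
print(list(clean.columns))
```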

In [7]:
beacon = pd.read_csv("../data/processed/beacon-fb_ema_and_gps_filtered-ux_s20.csv")

In [8]:
beacon.columns

Index(['timestamp', 'lat', 'long', 'altitude', 'accuracy', 'tvoc', 'lux',
       'no2', 'co', 'co2', 'pm1_number', 'pm2p5_number', 'pm10_number',
       'pm1_mass', 'pm2p5_mass', 'pm10_mass', 'temperature_c', 'rh', 'beacon',
       'beiwe', 'fitbit', 'redcap', 'start_time', 'end_time'],
      dtype='object')

# Reviewing Datasets

In [9]:
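# NOTE: `morning` and `evening` (the morning and evening EMA DataFrames) must be loaded
# before running this cell; they are not defined earlier in this notebook
ema_parts = []
for pt in morning["beiwe"].unique():
    morning_by_pt = morning[morning["beiwe"] == pt]
    evening_by_pt = evening[evening["beiwe"] == pt]
    ema_by_pt = morning_by_pt.merge(evening_by_pt,on=["date","beiwe"],suffixes=('_morning', '_evening'))
    ema_parts.append(ema_by_pt)
# DataFrame.append was removed in pandas 2.0; concatenate the pieces instead
emas = pd.concat(ema_parts)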

NameError: name 'morning' is not defined

In [None]:
emas

In [None]:
emas = morning.merge(evening,left_on=["date","beiwe"],right_on=["date","beiwe"],suffixes=('_morning', '_evening'))
emas
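
Because `beiwe` is already one of the join keys, the single merge above should pair the same rows as the per-participant loop. A sketch on toy data: `morning` and `evening` here are small stand-ins for the EMA tables, and the `content` column is a hypothetical EMA item.

```python
import pandas as pd

# Toy stand-ins for the morning and evening EMA tables
morning = pd.DataFrame({
    "beiwe": ["a", "a", "b"],
    "date": ["2020-05-14", "2020-05-15", "2020-05-14"],
    "content": [2, 3, 1],
})
evening = pd.DataFrame({
    "beiwe": ["a", "b", "b"],
    "date": ["2020-05-14", "2020-05-14", "2020-05-15"],
    "content": [1, 2, 3],
})

# One merge over both keys
emas = morning.merge(evening, on=["date", "beiwe"], suffixes=("_morning", "_evening"))

# The per-participant version, collected in a list and concatenated once
parts = [
    morning[morning["beiwe"] == pt].merge(
        evening[evening["beiwe"] == pt],
        on=["date", "beiwe"],
        suffixes=("_morning", "_evening"),
    )
    for pt in morning["beiwe"].unique()
]
looped = pd.concat(parts, ignore_index=True)

print(emas)
```

Only days with both a morning and an evening response survive the default inner join, so unmatched surveys are dropped silently.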

In [10]:
df1 = pd.read_csv("../data/processed/fitbit-sleep_data_summary-ux_s20.csv",parse_dates=["end_date"],infer_datetime_format=True)
df1.columns

Index(['start_date', 'end_date', 'deep_count', 'deep_minutes', 'light_count',
       'light_minutes', 'rem_count', 'rem_minutes', 'wake_count',
       'wake_minutes', 'beiwe', 'duration_ms', 'efficiency', 'end_time',
       'main_sleep', 'minutes_after_wakeup', 'minutes_asleep', 'minutes_awake',
       'minutes_to_sleep', 'start_time', 'time_in_bed', 'redcap', 'beacon'],
      dtype='object')

In [None]:
df2 = pd.read_csv("../data/processed/fitbit-daily-ux_s20.csv",parse_dates=["timestamp"],infer_datetime_format=True)
df2

In [None]:
df1.merge(df2,left_on=["end_date","beiwe"],right_on=["timestamp","beiwe"])
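
A merge like the one above, keyed on differently named date columns, relies on both key columns having the same dtype, which is why both files are read with `parse_dates`. The sketch below uses hypothetical values and `indicator=True` to show which nights found a matching daily record. The frames `sleep` and `daily` and the `steps` column are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sleep summary and daily table with datetime keys
sleep = pd.DataFrame({
    "end_date": pd.to_datetime(["2020-05-14", "2020-05-15"]),
    "beiwe": ["a", "a"],
    "efficiency": [97, 87],
})
daily = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-05-14"]),
    "beiwe": ["a"],
    "steps": [8000],
})

# A left join keeps every night; the _merge column flags rows without a daily match
merged = sleep.merge(
    daily,
    left_on=["end_date", "beiwe"],
    right_on=["timestamp", "beiwe"],
    how="left",
    indicator=True,
)
print(merged["_merge"].value_counts())
```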

In [None]:
df3 = pd.read_csv("../data/processed/fitbit_beiwe_beacon-sleep_summary-ux_s20.csv")
df3

In [13]:
df4 = pd.read_csv("../data/processed/beacon-fb_and_gps_filtered_summary-ux_s20.csv")
df4.columns

Index(['start_time', 'lat_mean', 'long_mean', 'altitude_mean', 'accuracy_mean',
       'tvoc_mean', 'lux_mean', 'no2_mean', 'co_mean', 'co2_mean',
       'pm1_number_mean', 'pm2p5_number_mean', 'pm10_number_mean',
       'pm1_mass_mean', 'pm2p5_mass_mean', 'pm10_mass_mean',
       'temperature_c_mean', 'rh_mean', 'end_time', 'beacon', 'beiwe',
       'fitbit', 'redcap', 'lat_median', 'long_median', 'altitude_median',
       'accuracy_median', 'tvoc_median', 'lux_median', 'no2_median',
       'co_median', 'co2_median', 'pm1_number_median', 'pm2p5_number_median',
       'pm10_number_median', 'pm1_mass_median', 'pm2p5_mass_median',
       'pm10_mass_median', 'temperature_c_median', 'rh_median', 'lat_delta',
       'long_delta', 'altitude_delta', 'accuracy_delta', 'tvoc_delta',
       'lux_delta', 'no2_delta', 'co_delta', 'co2_delta', 'pm1_number_delta',
       'pm2p5_number_delta', 'pm10_number_delta', 'pm1_mass_delta',
       'pm2p5_mass_delta', 'pm10_mass_delta', 'temperature_c_delta',
