## Analyzing Sleep Data: a [Quantified Self](https://medium.com/quantified-self-dublin/quantified-self-health-wellness-now-into-f853e68c5d72) Study


### Executive Summary
---

There are many motivations behind self-tracking, but with the rise of newer technologies and recent trends in physical and mental self-improvement, it is natural to want to study yourself.  For me, this obsession started in 2017, when I was trying to figure out whether I had gastrointestinal issues.  At first, I told myself it was perhaps my age, since I am no longer in my early 20s.  As I researched further, I found the [Nourish Balance Thrive Podcast](http://www.nourishbalancethrive.com/podcasts/nourish-balance-thrive/).  ***Voila!*** My passion for self-improvement through nutrition and sleep was ignited.  The podcast spoke to me because the host, Christopher Kelly, was a data scientist turned health pioneer; I also worked in data science and needed answers about my health.  Over the past year, I have experimented with my sleep without tracking it closely.  Since the beginning of my data science immersive with [General Assembly](https://generalassemb.ly/), I have been tracking my sleep with a Fitbit.  The purpose of this analysis is twofold: I want to see whether a consistent week of good sleep leads to more good sleep, and I want to look at my sleep data not only night by night but also over a longer period.

---

### The Problem Statement: 
---
We are trying to draw several inferences from this sleep data.  A model might predict whether I will get more or less sleep than average, but that prediction alone does not tell me much about my sleep patterns.  To keep things simple, this analysis focuses on the following questions:
 - Does consistently good sleep lead to more good sleep?  
 - What other factors can impact sleep?
 - What should we explore next?
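
One way to frame the first question is sketched below. It is only an illustration, not the model used in this project: it assumes a "good" night means more `minutesAsleep` than the overall average, that the nights are consecutive with no gaps, and that "consistent" means every night in the preceding window was good. The function names are hypothetical, and a window of 3 is used only to keep the sample short.

```python
from statistics import mean

def label_above_average(minutes_asleep):
    """Label each night True if it had more sleep than the overall average."""
    avg = mean(minutes_asleep)
    return [m > avg for m in minutes_asleep]

def prior_window_all_good(labels, window=7):
    """For each night from index `window` onward, report whether all of the
    previous `window` nights were labeled good."""
    return [all(labels[i - window:i]) for i in range(window, len(labels))]

# Made-up sample values for illustration only (not real Fitbit data).
minutes = [420, 430, 440, 450, 300, 310, 320, 330]
labels = label_above_average(minutes)
window = 3
features = prior_window_all_good(labels, window=window)
targets = labels[window:]  # the night that follows each window

# Pair "was the prior window consistently good?" with "was this night good?"
pairs = list(zip(features, targets))
```

Counting how often a good night follows a consistently good window, compared with how often it follows any other window, gives a rough first look at the question before any modeling.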

---

### The Data:
---
The data is a downloadable zip file provided by Fitbit.  Fitbit also offers an API, but due to time constraints, I chose to download the zip file of all of my data.  The data came in JSON format, and I used a Python library to parse it into two main datasets.  The first dataset (the header dataset), which I also use for my model, contains each night's sleep data.  The second dataset, also created from the JSON files, contains the times at which I had different sleeping activity and can be linked back to the header dataset via the logID.  Sleeping activity is classified as asleep, restless, or awake.  The two tables below describe the layout of each dataset.

#### Header Table

|Column Name|Data Type|Description|
|---|---|---|
|asleep_cnt|int64|counts of continuous sleep periods for 1 night|
|asleep_min|int64|minutes of sleep for 1 night|
|awake_cnt|int64|counts of continuous awake periods for 1 night|
|awake_min|int64|minutes being awake for 1 night|
|dateOfSleep|datetime64|date of sleep|
|duration|int64|duration of sleep|
|efficiency|int64|efficiency of sleep|
|endTime|object|end time of sleep (when you wake up and get out of bed)|
|infoCode|int64|n/a|
|logID|int64|unique id for each night's sleep|
|minutesAfterWakeup|int64|minutes in bed after you wake up|
|minutesAsleep|int64|minutes spent asleep|
|minutesAwake|int64|minutes awake during sleep|
|minutesToFallAsleep|int64|minutes to fall asleep|
|restless_cnt|int64|counts of continuous restless periods for 1 night|
|restless_min|int64|minutes of restlessness for 1 night|
|startTime|object|time at which Fitbit infers you fell asleep|
|timeInBed|int64|total time spent in bed for 1 night of sleep|
|type|object|n/a|

#### Sleep Table

|Column Name|Data Type|Description|
|---|---|---|
|dateTime|datetime64|timestamp in `YYYY-MM-DDTHH:mm:ss.nnn` format|
|level|object|description of sleep activity|
|logID|int64|unique id for each night's sleep|
|seconds|int64|seconds in specific level|
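
As a minimal sketch of how the two tables relate, the snippet below totals the `seconds` spent in each `level` per `logID` and attaches the result to the matching header row. The rows are made-up sample values, and the helper name `seconds_per_level` is hypothetical; only the column names come from the tables above.

```python
from collections import defaultdict

# Made-up sample rows using the column names from the two tables above.
header_rows = [
    {"logID": 1, "dateOfSleep": "2019-01-02", "minutesAsleep": 400},
    {"logID": 2, "dateOfSleep": "2019-01-03", "minutesAsleep": 380},
]
sleep_rows = [
    {"logID": 1, "level": "asleep", "seconds": 3600},
    {"logID": 1, "level": "restless", "seconds": 120},
    {"logID": 1, "level": "asleep", "seconds": 1800},
    {"logID": 2, "level": "awake", "seconds": 60},
]

def seconds_per_level(rows):
    """Sum the seconds spent in each level, grouped by logID."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row["logID"]][row["level"]] += row["seconds"]
    return totals

totals = seconds_per_level(sleep_rows)

# Attach the per-level totals to each header row via the shared logID.
nightly = [
    {**header, "seconds_by_level": dict(totals.get(header["logID"], {}))}
    for header in header_rows
]
```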

---
### The Methodologies
---

#### 1. Gather Data
Gathering data was much more difficult than I had expected.  The hardest part was actually keeping the Fitbit on my wrist.  Some nights of sleep data are missing because I took off my Fitbit to shower or wash my hands and the band never made it back onto my wrist.  On other nights, I remembered to wear my Fitbit but did not realize that the battery was dead and needed to be charged.  These lapses created gaps in the sleep study: where I should have had 120 days of sleep data, I had closer to 85.
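
Gaps like these can be listed by comparing the recorded `dateOfSleep` values against every calendar day in the study period. The sketch below is a rough illustration with made-up dates; the function name `missing_nights` is hypothetical.

```python
from datetime import date, timedelta

def missing_nights(recorded, start, end):
    """Return every date from start to end (inclusive) with no sleep record."""
    recorded_set = set(recorded)
    total_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(total_days))
    return [day for day in all_days if day not in recorded_set]

# Made-up sample: three recorded nights inside a five-day window.
recorded = [date(2019, 1, 1), date(2019, 1, 2), date(2019, 1, 5)]
gaps = missing_nights(recorded, date(2019, 1, 1), date(2019, 1, 5))
```

Knowing where the gaps fall matters for any analysis that assumes consecutive nights, such as looking at a full week of sleep.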

#### 2. Clean Data
The process of parsing the data from the downloadable zip file was fairly simple.  By parsing the JSON into Python dictionaries, I created the two simple datasets described above from the sleep information.
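
The snippet below is a minimal sketch of that parsing step, not the project's actual code. The JSON shape is an assumption: it supposes each record carries a `logId`, top-level nightly fields, and a nested `levels` object with a `summary` (count and minutes per level) and a `data` list of timestamped entries. The sample values and the function name `parse_sleep_records` are hypothetical.

```python
import json

# Hypothetical sample record; the field names and nesting are assumptions
# about the export format, and the values are made up.
SAMPLE_JSON = """
[
  {
    "logId": 1001,
    "dateOfSleep": "2019-01-02",
    "minutesAsleep": 400,
    "timeInBed": 450,
    "levels": {
      "summary": {
        "asleep": {"count": 0, "minutes": 400},
        "restless": {"count": 8, "minutes": 30},
        "awake": {"count": 2, "minutes": 20}
      },
      "data": [
        {"dateTime": "2019-01-01T23:00:00.000", "level": "asleep", "seconds": 3600},
        {"dateTime": "2019-01-02T00:00:00.000", "level": "restless", "seconds": 120}
      ]
    }
  }
]
"""

def parse_sleep_records(records):
    """Split raw records into header rows (one per night) and sleep rows
    (one per timestamped level entry), linked by logID."""
    header_rows, sleep_rows = [], []
    for record in records:
        levels = record.get("levels", {})
        # Keep the top-level nightly fields; rename logId to logID.
        row = {k: v for k, v in record.items() if k not in ("levels", "logId")}
        row["logID"] = record["logId"]
        # Flatten the per-level summary into <level>_cnt / <level>_min columns.
        for level, stats in levels.get("summary", {}).items():
            row[f"{level}_cnt"] = stats.get("count")
            row[f"{level}_min"] = stats.get("minutes")
        header_rows.append(row)
        # Each timestamped entry becomes one row of the sleep table.
        for entry in levels.get("data", []):
            sleep_rows.append({"logID": row["logID"], **entry})
    return header_rows, sleep_rows

header_rows, sleep_rows = parse_sleep_records(json.loads(SAMPLE_JSON))
```

Both results are plain lists of dictionaries, so they can be passed directly to `pandas.DataFrame` if tabular dtypes like those in the tables above are needed.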

#### 3. Initial EDA

#### 4. Modeling

#### 5. Metrics for modeling

---

### Conclusions & Next Steps
---


### The Files
---
The files are organized into three folders.  Each folder contains subfolders that cover a different portion of the analysis.  Because this is a two-part project, each subfolder relates to a different aspect, such as the dashboard/interactive tool, the data cleaning, or the modeling.
1. Code
2. Data
3. jupyterlab-dash