# Sleep Health & Lifestyle

The dataset we are going to explore covers sleep health, measured by sleep duration and quality, together with lifestyle factors such as stress level and daily steps, and cardiovascular indicators such as BMI category and blood pressure. It includes the following variables:

1. **Demographics** 🧑‍🦰  

    | Variable | Details |
    |--------------|-------------|
    | **Person ID**  | Unique identifier for each individual |
    | **Gender**     | 🚻 (Male/Female) |
    | **Age**        | 📅 (Years) |
    | **Occupation** | 💼 (e.g., Software Engineer, Doctor, Teacher) |

<br>

2. **Lifestyle** 🏃‍♀️  

    | Variable | Details |
    |----------------------------|-------------|
    | **Physical Activity Level** | ⏱️ (Minutes per day) |
    | **Daily Steps**             | 🚶 (Number of steps) |
    | **Stress Level**            | 😟 (Scale of 1 to 10) |

<br>

3. **Sleep Health** 💤  

    | Variable | Details |
    |-------------------|-------------|
    | **Sleep Duration**  | ⏰ (Hours per day) |
    | **Quality of Sleep** | ⭐ (Scale of 1 to 10) |
    | **Sleep Disorder**   | 🤕 (No Disorder, Insomnia, Sleep Apnea) - **Target Variable** |

<br>

4. **Cardiovascular Health** ❤️  

    | Variable | Details |
    |--------------|-------------|
    | **BMI Category** | 📏 (Underweight, Normal, Overweight, Obese) |
    | **Blood Pressure** | 🌡️ (Systolic/Diastolic) |
    | **Heart Rate** | 💓 (Beats per minute) |

<br>

Using this dataset, we will answer the following questions:



1. Test
2. Test
3. Test
4. Test


### Test

## Step 1 - Modules
As a first step, we import all necessary modules.

In [27]:
# Import modules

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
import plotly.express as px


## Step 2 - Data Import + First Look 
Next, we load the dataset and get a first overview of what it looks like.


In [28]:
df = pd.read_csv("Sleep_health_and_lifestyle_dataset_FINAL.csv")

In [29]:
# df.shape returns a (rows, columns) tuple
n_rows, n_cols = df.shape

print(f"We have {n_cols} columns and {n_rows} rows.")

We have 15 columns and 374 rows.


In [30]:
df.head(5)

| | Person ID | Gender | Gender Count | Employee Count | Age | Occupation | Sleep Duration in Hours | Quality of Sleep | Physical Activity in Min | Stress Level | BMI Category | Blood Pressure | Heart Rate | Daily Steps | Sleep Disorder |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Male | 1 | 1 | 27 | Software Engineer | 6.1 | 6 | 42 | 6 | Overweight | 126/83 | 77 | 4200 | No Disorder |
| 1 | 2 | Male | 1 | 1 | 28 | Doctor | 6.2 | 6 | 60 | 8 | Normal | 125/80 | 75 | 10000 | No Disorder |
| 2 | 3 | Male | 1 | 1 | 28 | Doctor | 6.2 | 6 | 60 | 8 | Normal | 125/80 | 75 | 10000 | No Disorder |
| 3 | 4 | Male | 1 | 1 | 28 | Sales Representative | 5.9 | 4 | 30 | 8 | Obese | 140/90 | 85 | 3000 | Sleep Apnea |
| 4 | 5 | Male | 1 | 1 | 28 | Sales Representative | 5.9 | 4 | 30 | 8 | Obese | 140/90 | 85 | 3000 | Sleep Apnea |
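
The `Blood Pressure` values in the preview are strings of the form systolic/diastolic (for example `126/83`), so they cannot be summarized numerically as they stand. One possible way to prepare the column is to split it into two integer columns. The sketch below is not part of the original notebook and uses only the five preview rows. The names `sample`, `Systolic` and `Diastolic` are assumptions chosen for illustration.

```python
import pandas as pd

# Blood pressure strings copied from the df.head(5) preview.
sample = pd.DataFrame(
    {"Blood Pressure": ["126/83", "125/80", "125/80", "140/90", "140/90"]}
)

# Split "systolic/diastolic" into two columns and convert them to integers.
# "Systolic" and "Diastolic" are hypothetical column names, not from the dataset.
bp = sample["Blood Pressure"].str.split("/", expand=True).astype(int)
bp.columns = ["Systolic", "Diastolic"]

sample = pd.concat([sample, bp], axis=1)
print(sample)
```

On the full dataset the same split could be applied to `df["Blood Pressure"]`, assuming every value follows the same pattern.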


In [31]:
df.info()

print("\n\nFINDINGS")
print("\nUsing 'df.info()' provides a quick overview of the dtypes and the non-null counts.")
print("")
print("Our dataset looks rather clean, as there are no null values to take care of.")
print("In addition, we have 9 int columns, 5 object columns and 1 float column.")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 374 entries, 0 to 373
Data columns (total 15 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Person ID                 374 non-null    int64  
 1   Gender                    374 non-null    object 
 2   Gender Count              374 non-null    int64  
 3   Employee Count            374 non-null    int64  
 4   Age                       374 non-null    int64  
 5   Occupation                374 non-null    object 
 6   Sleep Duration in Hours   374 non-null    float64
 7   Quality of Sleep          374 non-null    int64  
 8   Physical Activity in Min  374 non-null    int64  
 9   Stress Level              374 non-null    int64  
 10  BMI Category              374 non-null    object 
 11  Blood Pressure            374 non-null    object 
 12  Heart Rate                374 non-null    int64  
 13  Daily Steps               374 non-null    int64  
 14  Sleep Diso
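
The findings read off `df.info()` by eye (no null values, and the split between int, text and float columns) can also be computed directly. The following is a minimal sketch using the five rows from the `df.head(5)` preview as stand-in data; in the notebook itself, `df` would take the place of the hypothetical `sample` frame.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

# Five rows copied from the df.head(5) preview; "sample" is a stand-in for df.
sample = pd.DataFrame({
    "Person ID": [1, 2, 3, 4, 5],
    "Gender": ["Male"] * 5,
    "Gender Count": [1] * 5,
    "Employee Count": [1] * 5,
    "Age": [27, 28, 28, 28, 28],
    "Occupation": ["Software Engineer", "Doctor", "Doctor",
                   "Sales Representative", "Sales Representative"],
    "Sleep Duration in Hours": [6.1, 6.2, 6.2, 5.9, 5.9],
    "Quality of Sleep": [6, 6, 6, 4, 4],
    "Physical Activity in Min": [42, 60, 60, 30, 30],
    "Stress Level": [6, 8, 8, 8, 8],
    "BMI Category": ["Overweight", "Normal", "Normal", "Obese", "Obese"],
    "Blood Pressure": ["126/83", "125/80", "125/80", "140/90", "140/90"],
    "Heart Rate": [77, 75, 75, 85, 85],
    "Daily Steps": [4200, 10000, 10000, 3000, 3000],
    "Sleep Disorder": ["No Disorder", "No Disorder", "No Disorder",
                       "Sleep Apnea", "Sleep Apnea"],
})

# Total number of missing values across all columns.
total_nulls = int(sample.isna().sum().sum())

# Count columns per broad dtype group; everything else is treated as text.
n_int = sum(is_integer_dtype(t) for t in sample.dtypes)
n_float = sum(is_float_dtype(t) for t in sample.dtypes)
n_text = sample.shape[1] - n_int - n_float

print(f"Null values: {total_nulls}")
print(f"{n_int} int columns, {n_text} text columns and {n_float} float column.")
```

Computing the counts this way keeps the printed findings in step with the data if the file changes.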

In [32]:
df.describe().round(2)

| | Person ID | Gender Count | Employee Count | Age | Sleep Duration in Hours | Quality of Sleep | Physical Activity in Min | Stress Level | Heart Rate | Daily Steps |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 |
| mean | 187.5 | 1.0 | 1.0 | 42.18 | 7.13 | 7.31 | 59.17 | 5.39 | 70.17 | 6816.84 |
| std | 108.11 | 0.0 | 0.0 | 8.67 | 0.8 | 1.2 | 20.83 | 1.77 | 4.14 | 1617.92 |
| min | 1.0 | 1.0 | 1.0 | 27.0 | 5.8 | 4.0 | 30.0 | 3.0 | 65.0 | 3000.0 |
| 25% | 94.25 | 1.0 | 1.0 | 35.25 | 6.4 | 6.0 | 45.0 | 4.0 | 68.0 | 5600.0 |
| 50% | 187.5 | 1.0 | 1.0 | 43.0 | 7.2 | 7.0 | 60.0 | 5.0 | 70.0 | 7000.0 |
| 75% | 280.75 | 1.0 | 1.0 | 50.0 | 7.8 | 8.0 | 75.0 | 7.0 | 72.0 | 8000.0 |
| max | 374.0 | 1.0 | 1.0 | 59.0 | 8.5 | 9.0 | 90.0 | 8.0 | 86.0 | 10000.0 |
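
In the summary above, `Gender Count` and `Employee Count` have a standard deviation of 0 and identical minimum and maximum values, so they hold the same value in every row. Such constant columns could be detected programmatically with `nunique()`. The sketch below uses a few columns from the preview rows as stand-in data; `sample` and `constant_cols` are illustrative names.

```python
import pandas as pd

# A few columns copied from the df.head(5) preview; "sample" is a stand-in for df.
sample = pd.DataFrame({
    "Gender Count": [1] * 5,
    "Employee Count": [1] * 5,
    "Age": [27, 28, 28, 28, 28],
    "Heart Rate": [77, 75, 75, 85, 85],
})

# A column with a single distinct value does not vary across rows.
n_unique = sample.nunique()
constant_cols = n_unique[n_unique == 1].index.tolist()

print(constant_cols)
```

Whether to drop such columns is a judgment call for the analysis that follows.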


In [33]:
df.describe(exclude="object").round(2).style.background_gradient(cmap="BuPu")


| | Person ID | Gender Count | Employee Count | Age | Sleep Duration in Hours | Quality of Sleep | Physical Activity in Min | Stress Level | Heart Rate | Daily Steps |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 | 374.0 |
| mean | 187.5 | 1.0 | 1.0 | 42.18 | 7.13 | 7.31 | 59.17 | 5.39 | 70.17 | 6816.84 |
| std | 108.11 | 0.0 | 0.0 | 8.67 | 0.8 | 1.2 | 20.83 | 1.77 | 4.14 | 1617.92 |
| min | 1.0 | 1.0 | 1.0 | 27.0 | 5.8 | 4.0 | 30.0 | 3.0 | 65.0 | 3000.0 |
| 25% | 94.25 | 1.0 | 1.0 | 35.25 | 6.4 | 6.0 | 45.0 | 4.0 | 68.0 | 5600.0 |
| 50% | 187.5 | 1.0 | 1.0 | 43.0 | 7.2 | 7.0 | 60.0 | 5.0 | 70.0 | 7000.0 |
| 75% | 280.75 | 1.0 | 1.0 | 50.0 | 7.8 | 8.0 | 75.0 | 7.0 | 72.0 | 8000.0 |
| max | 374.0 | 1.0 | 1.0 | 59.0 | 8.5 | 9.0 | 90.0 | 8.0 | 86.0 | 10000.0 |
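
The numeric summary above leaves out the text columns (`Gender`, `Occupation`, `BMI Category`, `Blood Pressure`, `Sleep Disorder`). A possible complement is to call `describe` with numeric columns excluded, which reports the count, the number of unique values, the most frequent value and its frequency. The sketch below uses stand-in data copied from the preview rows; `sample` and `categorical_summary` are illustrative names.

```python
import pandas as pd

# Text columns (plus one numeric column) copied from the df.head(5) preview.
sample = pd.DataFrame({
    "Gender": ["Male"] * 5,
    "Occupation": ["Software Engineer", "Doctor", "Doctor",
                   "Sales Representative", "Sales Representative"],
    "BMI Category": ["Overweight", "Normal", "Normal", "Obese", "Obese"],
    "Sleep Disorder": ["No Disorder", "No Disorder", "No Disorder",
                       "Sleep Apnea", "Sleep Apnea"],
    "Age": [27, 28, 28, 28, 28],
})

# Excluding numeric dtypes leaves only the text columns in the summary,
# with rows: count, unique, top (most frequent value) and freq.
categorical_summary = sample.describe(exclude="number")

print(categorical_summary)
```

Applied to the full `df`, this would give a first look at how the categories are distributed, for example how many distinct occupations appear.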
