# Introduction
### Importance of Sleep
Sleep is often neglected in today's fast-paced routines, yet it is essential to overall health and well-being. It is not merely a time to rest but a critical biological function that restores the mind and body. Studies have linked adequate sleep to improved memory, mood stability, and physical health, including better metabolic regulation and a lower risk of chronic diseases such as diabetes and heart disease. Sleep deprivation, by contrast, impairs mental clarity and emotional stability and raises the risk of chronic conditions. Understanding the role of sleep is therefore a necessity, not a luxury, in the pursuit of a healthy and productive life.
### Why do we sleep?
Sleep is a complex biological process characterized by altered consciousness, reduced interaction with surroundings, and periodic cycles involving different stages. It plays a vital role in many bodily functions, including the healing and repair of the heart and blood vessels, balancing hormones, and supporting cognitive functions. Different people require varying amounts of sleep, influenced by age, lifestyle, and health conditions. Evaluating the quality and quantity of sleep is essential in assessing an individual's overall health profile.
### Purpose
This project explores the factors that affect sleep and aims to identify which of them most strongly determine sleep quality and quantity. Lifestyle choices, environmental factors, medical conditions, and psychological stressors will be examined for their impact on sleep patterns. Data science, with its ability to process complex data and surface underlying trends, is well suited to this exploration. The findings can inform interventions, policies, and personal routines, from better sleep hygiene to addressing medical conditions that interfere with sleep, with implications for individuals and communities alike.

In short, because sleep is intertwined with virtually every aspect of our health, this project sets out to understand the dynamics of sleep and its determinants. The resulting insights should both enrich our understanding and guide interventions and policies that improve quality of life through better sleep practices.

In [20]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# First Dataset

In [21]:
file_path = 'Sleep_health_and_lifestyle_dataset.csv'
sleep_health_and_lifestyle_df = pd.read_csv(file_path)

# Displaying the first few rows of the dataset
sleep_health_and_lifestyle_df.head()

Unnamed: 0,Person ID,Gender,Age,Occupation,Sleep Duration,Quality of Sleep,Physical Activity Level,Stress Level,BMI Category,Blood Pressure,Heart Rate,Daily Steps,Sleep Disorder
0,1,Male,27,Software Engineer,6.1,6,42,6,Overweight,126/83,77,4200,
1,2,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,
2,3,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,
3,4,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea
4,5,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea


# Tidy Data

In [22]:
# In the 'Sleep Disorder' column, 'None' (which pandas may read as NaN) means the person has no disorder; it is not missing data.
# Optional relabeling of those entries as 'No Disorder', left disabled:
# sleep_health_and_lifestyle_df['Sleep Disorder'] = sleep_health_and_lifestyle_df['Sleep Disorder'].fillna('No Disorder')

# One-hot encoding for Occupation and BMI Category columns
sleep_health_and_lifestyle_df = pd.get_dummies(sleep_health_and_lifestyle_df, columns=['Occupation', 'BMI Category'], drop_first=True)

# Manually encoding Gender and Sleep Disorder columns
sleep_health_and_lifestyle_df['Gender'] = sleep_health_and_lifestyle_df['Gender'].map({'Male': 1, 'Female': 0})
# 0 = no disorder (NaN or the literal string 'None'), 1 = any recorded disorder
sleep_health_and_lifestyle_df['Sleep Disorder'] = sleep_health_and_lifestyle_df['Sleep Disorder'].map(lambda x: 0 if pd.isna(x) or x == 'None' else 1)

# Splitting Blood Pressure into Systolic and Diastolic
# expand=True returns one column per part: 0 = systolic, 1 = diastolic
bp_parts = sleep_health_and_lifestyle_df['Blood Pressure'].str.split('/', expand=True)
sleep_health_and_lifestyle_df['Systolic BP'] = bp_parts[0].astype(int)
sleep_health_and_lifestyle_df['Diastolic BP'] = bp_parts[1].astype(int)

# Dropping the original Blood Pressure column
sleep_health_and_lifestyle_df.drop(columns=['Blood Pressure'], inplace=True)

# Displaying the first few rows of the processed dataset
sleep_health_and_lifestyle_df.head()

The Kaggle dataset gives a more holistic view of sleep. It contains more general information, from which we can draw conclusions about how stress impacts sleep.

The second dataset, Sayo pillow, contains data recorded from subjects while they sleep, which makes it easier to test our null hypothesis.

Null Hypothesis: No correlation between stress and sleep
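
One way to test this null hypothesis is a permutation test on the Pearson correlation. The sketch below is only an illustration, not the analysis used in this project: the helper name `permutation_corr_test`, the number of permutations, and the fixed seed are assumptions, and the toy frame reuses just the five rows shown in the `head()` output above, so its numbers carry no real meaning.

```python
import numpy as np
import pandas as pd

def permutation_corr_test(x, y, n_permutations=5000, seed=0):
    """Return (Pearson r, two-sided permutation p-value) for two numeric sequences.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.corrcoef(x, y)[0, 1]
    rng = np.random.default_rng(seed)
    # Shuffle y to break any link with x, then count how often a shuffled
    # correlation is at least as extreme as the observed one.
    extreme = 0
    for _ in range(n_permutations):
        permuted = np.corrcoef(x, rng.permutation(y))[0, 1]
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    # The add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Toy frame built from the five rows displayed earlier (illustration only).
toy_df = pd.DataFrame({
    'Stress Level': [6, 8, 8, 8, 8],
    'Sleep Duration': [6.1, 6.2, 6.2, 5.9, 5.9],
})
r, p = permutation_corr_test(toy_df['Stress Level'], toy_df['Sleep Duration'])
print(f"Pearson r = {r:.3f}, permutation p-value = {p:.3f}")
```

On the full dataset, a small p-value would count as evidence against the null hypothesis, while a large one would mean the data are compatible with no correlation.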

1. Kaggle dataset: exploratory analysis.
2. Kaggle dataset: modeling.
3. Draw the conclusion that stress is the most important factor.
4. Formulate the hypothesis: stress is the most important factor.
5. Pillow dataset: exploratory analysis.
6. Pillow dataset: modeling to test the hypothesis.
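
The exploratory step in this plan could begin by ranking the numeric columns by their correlation with a sleep measure. This is a minimal sketch under stated assumptions: `rank_by_correlation` is a hypothetical helper, `Quality of Sleep` is chosen as the target only for illustration, and the toy frame again contains just the five rows shown earlier. Correlation is a first look and does not by itself establish which factor is most important.

```python
import pandas as pd

def rank_by_correlation(df, target):
    """Rank numeric columns by absolute Pearson correlation with `target`.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    numeric = df.select_dtypes(include='number')
    correlations = numeric.corr()[target].drop(target)
    # Order by magnitude but keep the signed values for interpretation.
    order = correlations.abs().sort_values(ascending=False).index
    return correlations.reindex(order)

# Toy frame built from the five rows displayed earlier (illustration only).
toy_df = pd.DataFrame({
    'Sleep Duration': [6.1, 6.2, 6.2, 5.9, 5.9],
    'Quality of Sleep': [6, 6, 6, 4, 4],
    'Physical Activity Level': [42, 60, 60, 30, 30],
    'Stress Level': [6, 8, 8, 8, 8],
    'Heart Rate': [77, 75, 75, 85, 85],
    'Daily Steps': [4200, 10000, 10000, 3000, 3000],
})
ranking = rank_by_correlation(toy_df, 'Quality of Sleep')
print(ranking)
```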
