Jessica: Analysis of trends/correlations between health factors and sleep disorders. 

Question: Does lack of sleep cause certain health problems in patients with diagnosed sleep disorders versus patients without sleep disorders?

Null Hypothesis: There is no significant difference in health factors (blood pressure, heart rate, BMI) between patients with a sleep disorder and patients without one.

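Alternative Hypothesis: There is a significant difference in health factors (blood pressure, heart rate, BMI) between patients with a sleep disorder and patients without one. The t-tests used here can show that the groups differ; they cannot by themselves establish cause and effect.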

In [139]:
import pandas as pd
from pathlib import Path
import scipy.stats as stats
import numpy as np
from scipy.stats import sem
from scipy.stats import linregress
from sklearn import datasets

In [140]:
sleep_data = Path("Sleep_Study/Sleep_health_and_lifestyle_dataset.csv")

sleep_data_pd = pd.read_csv(sleep_data)

sleep_data_pd.head()

Unnamed: 0,Person ID,Gender,Age,Occupation,Sleep Duration,Quality of Sleep,Physical Activity Level,Stress Level,BMI Category,Blood Pressure,Heart Rate,Daily Steps,Sleep Disorder
0,1,Male,27,Software Engineer,6.1,6,42,6,Overweight,126/83,77,4200,
1,2,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,
2,3,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,
3,4,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea
4,5,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea


In [141]:
sleep_data_pd[['systolic blood pressure','diastolic blood pressure']] = sleep_data_pd['Blood Pressure'].str.split("/", expand=True).astype(float)
sleep_data_pd.head()

Unnamed: 0,Person ID,Gender,Age,Occupation,Sleep Duration,Quality of Sleep,Physical Activity Level,Stress Level,BMI Category,Blood Pressure,Heart Rate,Daily Steps,Sleep Disorder,systolic blood pressure,diastolic blood pressure
0,1,Male,27,Software Engineer,6.1,6,42,6,Overweight,126/83,77,4200,,126.0,83.0
1,2,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,,125.0,80.0
2,3,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,,125.0,80.0
3,4,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea,140.0,90.0
4,5,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea,140.0,90.0


In [142]:
sleep_data_pd['Sleep Disorder'].unique()

array(['None', 'Sleep Apnea', 'Insomnia'], dtype=object)
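
One caveat, offered as an assumption rather than something this notebook shows: depending on the pandas version, read_csv may parse the literal text None in the Sleep Disorder column as a missing value (NaN), in which case a filter such as == 'None' would match no rows. A minimal sketch of a version-independent guard follows. It uses a small inline sample (values copied from the rows above) in place of the real CSV, and the names csv_text, sample and disorder_labels are illustrative only.

```python
import io

import pandas as pd

# Tiny inline stand-in for the CSV; values copied from the head() output above.
csv_text = """Person ID,Heart Rate,Sleep Disorder
1,77,None
2,75,None
4,85,Sleep Apnea
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Whether "None" arrived as a string or as NaN, normalise it to the string 'None'
# so that later comparisons against 'None' behave the same on any pandas version.
sample["Sleep Disorder"] = sample["Sleep Disorder"].fillna("None")

disorder_labels = sorted(sample["Sleep Disorder"].unique())
print(disorder_labels)
```

Applying the same fillna call to sleep_data_pd right after loading would keep the 'None', 'Sleep Apnea' and 'Insomnia' labels consistent for the group filters used later.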

In [143]:
sleep_cleaned = sleep_data_pd[['Person ID','Sleep Duration', 'Quality of Sleep','BMI Category','systolic blood pressure','diastolic blood pressure','Heart Rate','Sleep Disorder']]

In [144]:
health_stat_table1 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None'].agg({'Sleep Duration':["mean", "median","var","std","sem"]})
health_stat_table2 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None'].agg({'Quality of Sleep':["mean", "median","var","std","sem"]})
health_stat_table3 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None'].agg({'Heart Rate':["mean", "median","var","std","sem"]})
health_stat_table4 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None'].agg({'systolic blood pressure':["mean", "median","var","std","sem"]})
health_stat_table5 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None'].agg({'diastolic blood pressure':["mean", "median","var","std","sem"]})

health_stat_table6 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea'].agg({'Sleep Duration':["mean", "median","var","std","sem"]})
health_stat_table7 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea'].agg({'Quality of Sleep':["mean", "median","var","std","sem"]})
health_stat_table8 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea'].agg({'Heart Rate':["mean", "median","var","std","sem"]})
health_stat_table9 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea'].agg({'systolic blood pressure':["mean", "median","var","std","sem"]})
health_stat_table10 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea'].agg({'diastolic blood pressure':["mean", "median","var","std","sem"]})

health_stat_table11 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia'].agg({'Sleep Duration':["mean", "median","var","std","sem"]})
health_stat_table12 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia'].agg({'Quality of Sleep':["mean", "median","var","std","sem"]})
health_stat_table13 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia'].agg({'Heart Rate':["mean", "median","var","std","sem"]})
health_stat_table14 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia'].agg({'systolic blood pressure':["mean", "median","var","std","sem"]})
health_stat_table15 = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia'].agg({'diastolic blood pressure':["mean", "median","var","std","sem"]})


In [145]:
health_stats_none_merged = health_stat_table1.copy()

health_stats_none_merged['Quality of Sleep']= health_stat_table2['Quality of Sleep']
health_stats_none_merged['Heart Rate']= health_stat_table3['Heart Rate']
health_stats_none_merged['systolic blood pressure']= health_stat_table4['systolic blood pressure']
health_stats_none_merged['diastolic blood pressure']= health_stat_table5['diastolic blood pressure']


In [146]:
health_stats_none_merged

Unnamed: 0,Sleep Duration,Quality of Sleep,Heart Rate,systolic blood pressure,diastolic blood pressure
mean,7.358447,7.625571,69.018265,124.045662,81.0
median,7.4,8.0,70.0,125.0,80.0
var,0.536293,0.950903,7.063885,32.887814,15.926606
std,0.73232,0.975142,2.657797,5.73479,3.990815
sem,0.049486,0.065894,0.179597,0.387521,0.269674


In [147]:
health_stats_apnea_merged = health_stat_table6.copy()

health_stats_apnea_merged['Quality of Sleep']= health_stat_table7['Quality of Sleep']
health_stats_apnea_merged['Heart Rate']= health_stat_table8['Heart Rate']
health_stats_apnea_merged['systolic blood pressure']= health_stat_table9['systolic blood pressure']
health_stats_apnea_merged['diastolic blood pressure']= health_stat_table10['diastolic blood pressure']

health_stats_apnea_merged

Unnamed: 0,Sleep Duration,Quality of Sleep,Heart Rate,systolic blood pressure,diastolic blood pressure
mean,7.032051,7.205128,73.089744,137.769231,92.717949
median,6.8,6.0,75.0,140.0,95.0
var,0.950258,2.710623,26.186647,26.43956,20.15318
std,0.974812,1.646397,5.117289,5.141941,4.489229
sem,0.110376,0.186418,0.579419,0.58221,0.508305


In [150]:
health_stats_insomnia_merged = health_stat_table11.copy()

health_stats_insomnia_merged['Quality of Sleep']= health_stat_table12['Quality of Sleep']
health_stats_insomnia_merged['Heart Rate']= health_stat_table13['Heart Rate']
health_stats_insomnia_merged['systolic blood pressure']= health_stat_table14['systolic blood pressure']
health_stats_insomnia_merged['diastolic blood pressure']= health_stat_table15['diastolic blood pressure']

health_stats_insomnia_merged

Unnamed: 0,Sleep Duration,Quality of Sleep,Heart Rate,systolic blood pressure,diastolic blood pressure
mean,6.58961,6.532468,70.467532,132.038961,86.857143
median,6.5,7.0,72.0,130.0,85.0
var,0.149891,0.646958,24.489064,15.485304,10.097744
std,0.387157,0.804337,4.948643,3.935137,3.177695
sem,0.044121,0.091663,0.56395,0.44845,0.362132
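
The three summary tables above are built from fifteen separate agg calls. As a hedged alternative, the same statistics can be computed in one pass with groupby. The sketch below runs on five rows copied from the head() output (standing in for sleep_cleaned); value_cols, stat_names, summary and apnea_table are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Five rows copied from the head() output above, standing in for sleep_cleaned.
sample = pd.DataFrame({
    "Sleep Duration": [6.1, 6.2, 6.2, 5.9, 5.9],
    "Quality of Sleep": [6, 6, 6, 4, 4],
    "Heart Rate": [77, 75, 75, 85, 85],
    "systolic blood pressure": [126.0, 125.0, 125.0, 140.0, 140.0],
    "diastolic blood pressure": [83.0, 80.0, 80.0, 90.0, 90.0],
    "Sleep Disorder": ["None", "None", "None", "Sleep Apnea", "Sleep Apnea"],
})

value_cols = [
    "Sleep Duration",
    "Quality of Sleep",
    "Heart Rate",
    "systolic blood pressure",
    "diastolic blood pressure",
]
stat_names = ["mean", "median", "var", "std", "sem"]

# One row per disorder group, one (measurement, statistic) column pair per value.
summary = sample.groupby("Sleep Disorder")[value_cols].agg(stat_names)

# Reshape a single group into the layout used above:
# statistics as rows, measurements as columns.
apnea_table = summary.loc["Sleep Apnea"].unstack(level=0).loc[stat_names, value_cols]
print(apnea_table)
```

On the full dataset, replacing sample with sleep_cleaned should yield the same numbers as the merged tables, with one groupby call instead of one agg call per column and group.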


In [174]:
control = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='None']
apnea = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Sleep Apnea']
insomnia = sleep_cleaned[sleep_cleaned['Sleep Disorder']=='Insomnia']

In [175]:
stats.ttest_ind(control['Sleep Duration'],apnea['Sleep Duration'])

Ttest_indResult(statistic=3.0837349347047125, pvalue=0.0022377044307714178)

In [176]:
stats.ttest_ind(control['Quality of Sleep'],apnea['Quality of Sleep'])

Ttest_indResult(statistic=2.6850628735269377, pvalue=0.007661936450979)

In [177]:
stats.ttest_ind(control['systolic blood pressure'],apnea['systolic blood pressure'])

Ttest_indResult(statistic=-18.631512683042992, pvalue=9.397806654358879e-52)

In [178]:
stats.ttest_ind(control['diastolic blood pressure'],apnea['diastolic blood pressure'])

Ttest_indResult(statistic=-21.53464843219751, pvalue=1.8096144119624439e-62)

In [179]:
stats.ttest_ind(control['Heart Rate'],apnea['Heart Rate'])

Ttest_indResult(statistic=-8.893141383973877, pvalue=6.080232027932548e-17)
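
stats.ttest_ind assumes equal group variances by default, yet the summary tables show heart-rate variance of about 26.2 in the apnea group against about 7.1 in the control group, and the groups differ in size. A hedged check is Welch's t-test (equal_var=False). The sketch below works from the summary statistics printed above rather than the raw data. The group sizes are not printed in the notebook; they are inferred here from n = (std / sem)**2, which gives 219 and 78, and should be treated as an assumption.

```python
from scipy import stats

# Heart-rate summary statistics copied from the tables above.
# Group sizes are inferred from (std / sem)**2 and are an assumption.
control_mean, control_std, control_n = 69.018265, 2.657797, 219
apnea_mean, apnea_std, apnea_n = 73.089744, 5.117289, 78

# Pooled-variance (Student) t-test, the default behaviour of ttest_ind.
pooled = stats.ttest_ind_from_stats(
    control_mean, control_std, control_n,
    apnea_mean, apnea_std, apnea_n,
    equal_var=True,
)

# Welch's t-test, which does not assume equal variances.
welch = stats.ttest_ind_from_stats(
    control_mean, control_std, control_n,
    apnea_mean, apnea_std, apnea_n,
    equal_var=False,
)

print("pooled:", pooled.statistic, pooled.pvalue)
print("welch: ", welch.statistic, welch.pvalue)
```

The pooled call should reproduce the heart-rate statistic reported above to within rounding, which supports the inferred group sizes. The Welch statistic is smaller in magnitude, though the difference remains significant. On the raw data the equivalent call would be stats.ttest_ind(control['Heart Rate'], apnea['Heart Rate'], equal_var=False).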

In [182]:
stats.ttest_ind(control['Sleep Duration'],insomnia['Sleep Duration'])

Ttest_indResult(statistic=8.784361641549475, pvalue=1.3296979544241514e-16)

In [183]:
stats.ttest_ind(control['Quality of Sleep'],insomnia['Quality of Sleep'])

Ttest_indResult(statistic=8.833683252255163, pvalue=9.388601134698313e-17)

In [184]:
stats.ttest_ind(control['Heart Rate'],insomnia['Heart Rate'])

Ttest_indResult(statistic=-3.216135888583218, pvalue=0.001444178722913637)

In [185]:
stats.ttest_ind(control['systolic blood pressure'],insomnia['systolic blood pressure'])

Ttest_indResult(statistic=-11.323247457299466, pvalue=6.551792290581626e-25)

In [186]:
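# Compare control against the insomnia group; the call previously passed apnea[...] and so repeated In [178].
stats.ttest_ind(control['diastolic blood pressure'],insomnia['diastolic blood pressure'])

Output pending re-run. The result previously recorded here (statistic=-21.53464843219751, pvalue=1.8096144119624439e-62) is the control-versus-apnea comparison from In [178].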
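
The statistic recorded for In [186] is identical to the control-versus-apnea value from In [178], so the notebook shows no insomnia-specific result for diastolic blood pressure. Until the cell is re-run on the data, the comparison can be approximated from the summary tables alone. This is a sketch under stated assumptions: the means and standard deviations are copied from the tables above, and the group sizes (219 and 77) are inferred from (std / sem)**2 rather than printed anywhere in the notebook.

```python
from scipy import stats

# Diastolic blood pressure summary statistics copied from the tables above.
# Group sizes are inferred from (std / sem)**2 and are an assumption.
control_mean, control_std, control_n = 81.0, 3.990815, 219
insomnia_mean, insomnia_std, insomnia_n = 86.857143, 3.177695, 77

# Same pooled-variance test that stats.ttest_ind runs by default.
diastolic_insomnia = stats.ttest_ind_from_stats(
    control_mean, control_std, control_n,
    insomnia_mean, insomnia_std, insomnia_n,
    equal_var=True,
)

print(diastolic_insomnia.statistic, diastolic_insomnia.pvalue)
```

The negative statistic indicates higher diastolic pressure in the insomnia group than in the control group. Re-running stats.ttest_ind on control and insomnia directly remains the authoritative check.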

T-test results, each comparing one disorder group against the control group:

Control group = patients without a sleep disorder
Apnea group = patients with sleep apnea
Insomnia group = patients with insomnia

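At the conventional 0.05 significance level, every comparison above is statistically significant. The strongest differences from the control group (p < 1e-15) are:

systolic blood pressure, diastolic blood pressure, and heart rate in patients with sleep apnea

sleep duration, quality of sleep, and systolic blood pressure in patients with insomnia

Weaker but still significant differences: sleep duration (p ≈ 0.002) and quality of sleep (p ≈ 0.008) in patients with sleep apnea, and heart rate (p ≈ 0.001) in patients with insomnia.

Diastolic blood pressure in patients with insomnia is also higher on average than in the control group (86.9 versus 81.0), but the result recorded under In [186] is the control-versus-apnea comparison, so no insomnia-specific test result is shown for it.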
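
Because several t-tests are run against the same control group, a multiple-comparison correction is a reasonable extra check. The sketch below applies a Bonferroni correction to the nine distinct p-values reported above (the duplicated diastolic result is counted once). The 0.05 level is an assumption, since the notebook does not state one, and the names alpha, p_values, threshold and not_significant are illustrative.

```python
alpha = 0.05  # assumed significance level; the notebook does not state one

# p-values copied from the t-test outputs above (group, measurement).
p_values = {
    ("apnea", "Sleep Duration"): 0.0022377044307714178,
    ("apnea", "Quality of Sleep"): 0.007661936450979,
    ("apnea", "systolic blood pressure"): 9.397806654358879e-52,
    ("apnea", "diastolic blood pressure"): 1.8096144119624439e-62,
    ("apnea", "Heart Rate"): 6.080232027932548e-17,
    ("insomnia", "Sleep Duration"): 1.3296979544241514e-16,
    ("insomnia", "Quality of Sleep"): 9.388601134698313e-17,
    ("insomnia", "Heart Rate"): 0.001444178722913637,
    ("insomnia", "systolic blood pressure"): 6.551792290581626e-25,
}

# Bonferroni: divide the significance level by the number of tests.
threshold = alpha / len(p_values)

# Comparisons whose p-value exceeds the corrected threshold.
not_significant = [key for key, p in p_values.items() if p > threshold]

print("corrected threshold:", threshold)
print("not significant after correction:", not_significant)
```

Bonferroni is conservative; it is used here only because it needs no extra libraries, not because the notebook prescribes it.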

