In [5]:
%%capture
# Export this Notebook to PDF
!jupyter nbconvert --to pdf "Hypothesis Testing.ipynb" \
    --TagRemovePreprocessor.enabled=True  \
    --TagRemovePreprocessor.remove_cell_tags remove_cell \
    --TagRemovePreprocessor.remove_all_outputs_tags remove_output \
    --TagRemovePreprocessor.remove_input_tags remove_input;

In [1]:
# Make Jupyter reload library before every execution

%load_ext autoreload
%autoreload 2

import warnings
warnings.filterwarnings('ignore')

In [2]:
import pandas as pd
import numpy as np

df = pd.read_csv("data/all.csv", parse_dates=True)

# Hypothesis Testing

The observed correlations from the data analysis and visualizations suggest several hypotheses that could be explored through further analysis:

1. **Impact of Sleep Disturbances on Quality:** Given the strong negative correlation between sleep disturbances and quality, we can hypothesize that more frequent sleep disturbances lower the quality of sleep.

1. **Age in Relation to Sleep Duration:** The negative correlation between age and calculated night sleep duration leads to the hypothesis that sleep duration may decrease with age.

1. **Relationship Between Sleep Onset Time and Quality:** The moderate negative correlation observed between sleep onset time and quality suggests that a longer time to fall asleep might be associated with poorer sleep quality.

1. **Influence of Exercise on Sleep Duration and Quality:** The slight positive correlation between exercise days per week and sleep duration suggests the hypothesis that increased physical activity contributes to longer, and possibly better quality, sleep.

1. **Nap Duration's Effect on Nighttime Sleep Duration:** Although the correlation is weak, we could investigate whether the duration of naps has any effect on the duration of nighttime sleep.

**Null Hypothesis ($H_0$)**: There is no monotonic association between the level of Sleep Disturbances and Sleep Quality.

**Alternative Hypothesis ($H_1$)**: Higher levels of Sleep Disturbances are associated with lower Sleep Quality (a negative monotonic association).
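Because $H_1$ is directional, the test can also be run one-sided. The following is a minimal sketch on synthetic stand-in data (not the survey data), assuming SciPy 1.7 or later, where `spearmanr` accepts an `alternative` argument. The variable names and the simulated relationship are illustrative assumptions only.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the survey): quality tends to fall as disturbances rise.
rng = np.random.default_rng(0)
disturbances = rng.integers(0, 5, size=100)  # ordinal codes 0..4
quality = 10 - disturbances + rng.normal(0, 2, size=100)

# Default test: H1 is "rho != 0" (two-sided).
rho, p_two_sided = stats.spearmanr(quality, disturbances)

# Directional test: H1 is "rho < 0", matching a negative-association hypothesis.
_, p_one_sided = stats.spearmanr(quality, disturbances, alternative="less")

print(f"rho         = {rho:.3f}")
print(f"two-sided p = {p_two_sided:.3g}")
print(f"one-sided p = {p_one_sided:.3g}")
```

When the observed coefficient is negative, the one-sided p-value for `alternative="less"` is about half the two-sided one, so a result that is significant two-sided is also significant in the hypothesized direction.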

In [3]:
sleep_disturbances_mapping = {
    "Never": 0,
    "Rarely": 1,
    "Sometimes": 2,
    "Frequently": 3,
    "Often": 4,
}

df["Sleep Disturbances Ordinal"] = df["Sleep Disturbances"].map(
    sleep_disturbances_mapping
)
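A quick sanity check after an ordinal mapping like this is to confirm that every response label was covered, since `Series.map` turns unmapped labels into `NaN` and `spearmanr` propagates `NaN` by default. The sketch below uses a small hypothetical series rather than the real column; the `"N/A"` label is an invented example of an uncovered category.

```python
import pandas as pd

# Hypothetical responses; "N/A" stands in for a label missing from the mapping.
responses = pd.Series(["Never", "Often", "Sometimes", "N/A", "Rarely"])

mapping = {"Never": 0, "Rarely": 1, "Sometimes": 2, "Frequently": 3, "Often": 4}

encoded = responses.map(mapping)

# Labels that .map() could not translate show up as NaN in the encoded series.
unmapped = responses[encoded.isna()].unique()
print("Unmapped labels:", list(unmapped))
```

If the list is non-empty, the mapping needs extending or those rows need to be dropped before the correlation is computed.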


In [4]:
import scipy.stats as stats

# Perform Spearman correlation test
correlation, p_value = stats.spearmanr(
    df["Sleep Quality"].to_numpy(),
    df["Sleep Disturbances Ordinal"].to_numpy().astype(float),
)

print(f'Correlation: {correlation}')
print(f'P-value: {p_value}')

Correlation: -0.45326163606763725
P-value: 8.396028187377574e-07


With the given results of a Spearman correlation coefficient (ρ) of approximately -0.453 and a p-value of approximately 8.4e-07, we can draw the following conclusions about the relationship between sleep disturbances and sleep quality:

**Strength and Direction of Correlation:** The Spearman correlation coefficient of -0.453 indicates a moderate negative correlation between sleep disturbances and sleep quality. This means that as sleep disturbances increase (become more frequent), sleep quality tends to decrease (gets worse).

**Statistical Significance:** The p-value is the probability of observing a correlation at least as extreme as the one in our sample if there were no actual relationship in the population. A p-value of 8.4e-07 is far below the common alpha level of 0.05, so a correlation this strong would be very unlikely to arise from random variation alone; the result is statistically significant.
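One way to make the meaning of the p-value concrete is a permutation test: shuffling one variable breaks any real relationship, so the shuffled correlations show what "no relationship" looks like. This is an illustrative sketch on synthetic data, not the method used above; the sample size, effect size, and number of permutations are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the survey data).
rng = np.random.default_rng(42)
n = 60
x = rng.integers(0, 5, size=n).astype(float)  # ordinal codes 0..4
y = 8 - 0.8 * x + rng.normal(0, 1.5, size=n)  # built-in negative trend

observed, _ = stats.spearmanr(x, y)

# Null distribution: Spearman's rho after randomly shuffling y.
n_perm = 2000
perm_stats = np.empty(n_perm)
for i in range(n_perm):
    perm_stats[i], _ = stats.spearmanr(x, rng.permutation(y))

# Two-sided p-value: share of shuffles at least as extreme as the observed rho.
# The +1 terms count the observed arrangement itself and avoid a p-value of 0.
p_perm = (np.sum(np.abs(perm_stats) >= abs(observed)) + 1) / (n_perm + 1)

print(f"observed rho  = {observed:.3f}")
print(f"permutation p = {p_perm:.4f}")
```

A small permutation p-value means that almost none of the shuffled datasets produced a correlation as strong as the observed one, which is the same idea the analytic p-value expresses.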

**Conclusion:**
Based on the Spearman correlation test, we reject the null hypothesis that there is no correlation between sleep disturbances and sleep quality. The data support the alternative hypothesis: more frequent sleep disturbances are associated with worse sleep quality. Because this is a correlation test, it establishes an association rather than a causal effect. The result aligns with intuition: individuals who experience more disturbances during sleep tend to report lower overall sleep quality.