# Sleep Health and Lifestyle - EDA Summary

This notebook provides a lightweight summary of the exploratory data analysis for the Sleep Health and Lifestyle dataset.

## Dataset Overview

- **Task**: Classification
- **Target**: `sleep_disorder` (categorical)
- **Features**: Demographics, sleep metrics, lifestyle factors, and blood pressure
- **Dataset location**: `data/raw/Sleep_health_and_lifestyle_dataset.csv`

## Key Steps Performed

1. **Data Loading and Preprocessing**
   - Standardized column names
   - Parsed `blood_pressure` (S/D format) into `bp_systolic` and `bp_diastolic`
   - Removed duplicates
   - Cast categorical variables

2. **Univariate Analysis**
   - Histograms and boxplots for numeric features
   - Count plots for categorical features
   - See figures: `reports/figures/sleep_univariate_*.png`

3. **Outlier Detection**
   - Applied IQR method to all numeric features
   - Removed rows where any numeric value falls below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR

4. **Bivariate Analysis**
   - Boxen plots: numeric features vs. `sleep_disorder`
   - Stacked bar charts: categorical features vs. `sleep_disorder`
   - See figures: `reports/figures/sleep_bivariate_*.png`

5. **Correlation Analysis**
   - Spearman correlation matrix
   - **Decision**: Drop `bp_diastolic` when its absolute Spearman correlation with `bp_systolic` exceeds 0.8, to reduce multicollinearity
   - See figure: `reports/figures/sleep_correlation_matrix.png`

6. **Train/Test Split**
   - 80/20 split with stratification by `sleep_disorder` classes
   - Stratification preserves the original `sleep_disorder` class proportions in both sets (it does not rebalance the classes)
   - Outputs: `data/processed/sleep_train.csv`, `data/processed/sleep_test.csv`
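
The preprocessing, outlier, and split steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/eda_sleep.py`: the function names, the `seed` value, the lowercase-with-underscores naming rule, and the choice to treat every non-numeric column as categorical are all assumptions. The split uses a pandas-only stand-in (per-class sampling); the actual script may use a different tool.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Step 1: standardize names, split blood pressure, dedupe, cast categoricals."""
    out = df.copy()
    # Assumed naming rule: "Blood Pressure" -> "blood_pressure"
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_")
    # "126/83" -> bp_systolic=126, bp_diastolic=83
    bp = out["blood_pressure"].str.split("/", expand=True)
    out["bp_systolic"] = pd.to_numeric(bp[0])
    out["bp_diastolic"] = pd.to_numeric(bp[1])
    out = out.drop(columns="blood_pressure").drop_duplicates()
    # Assumption: every remaining non-numeric column is categorical.
    text_cols = [c for c in out.columns
                 if not pd.api.types.is_numeric_dtype(out[c])]
    out = out.astype({c: "category" for c in text_cols})
    return out.reset_index(drop=True)


def remove_iqr_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Step 3: keep rows whose numeric values all lie in [Q1 - k*IQR, Q3 + k*IQR]."""
    num = df.select_dtypes(include="number")
    q1, q3 = num.quantile(0.25), num.quantile(0.75)
    iqr = q3 - q1
    inside = ((num >= q1 - k * iqr) & (num <= q3 + k * iqr)).all(axis=1)
    return df[inside]


def stratified_split(df: pd.DataFrame, target: str = "sleep_disorder",
                     test_frac: float = 0.2, seed: int = 42):
    """Step 6: sample test_frac of each class for test; the rest is train."""
    test = df.groupby(target, observed=True).sample(
        frac=test_frac, random_state=seed
    )
    train = df.drop(test.index)  # relies on a unique index
    return train, test
```

Because the test set is drawn per class, each class contributes roughly 20% of its rows, which is what keeps the class proportions similar in both files. Note that IQR filtering across all numeric columns at once can discard many rows on a small dataset, so it is worth checking the row count before and after.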

## Key Insights

- Blood pressure features are highly correlated, so only systolic pressure is kept
- Sleep disorders show distinct patterns across occupation, BMI category, and sleep duration
- Gender distribution varies across sleep disorder categories
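
The blood pressure insight can be checked with a small helper along these lines. It is a sketch under assumptions: `drop_if_correlated` is a hypothetical name, and the 0.8 threshold and Spearman method are taken from the correlation step described earlier.

```python
import pandas as pd


def drop_if_correlated(df: pd.DataFrame, keep: str = "bp_systolic",
                       candidate: str = "bp_diastolic",
                       threshold: float = 0.8):
    """Drop `candidate` when |Spearman rho| with `keep` exceeds `threshold`.

    Returns the (possibly reduced) frame and the measured correlation.
    """
    rho = df[[keep, candidate]].corr(method="spearman").loc[keep, candidate]
    if abs(rho) > threshold:
        return df.drop(columns=candidate), rho
    return df, rho
```

Returning the correlation alongside the frame makes it easy to log the value that drove the decision.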

## How to Run

```bash
cd sleep-insurance-eda
python src/eda_sleep.py
```

## Figures

All generated figures are saved in `reports/figures/` with the prefix `sleep_`.
