# Explore Sleep Health Data

Your client, SleepInc, has shared anonymized sleep data from their hot new
sleep tracking app, SleepScope. As their data science consultant, your mission is
to analyze the lifestyle survey data with Python to discover relationships
between exercise, gender, occupation, and sleep quality. See if you can identify
patterns that lead to insights on sleep quality.

In [5]:
import pandas as pd

df = pd.read_csv('sleep_health_data.csv')
df.columns

Index(['Person ID', 'Gender', 'Age', 'Occupation', 'Sleep Duration',
       'Quality of Sleep', 'Physical Activity Level', 'Stress Level',
       'BMI Category', 'Blood Pressure', 'Heart Rate', 'Daily Steps',
       'Sleep Disorder'],
      dtype='object')

In [6]:
df.shape

(374, 13)

The dataset has 374 rows and 13 columns covering sleep duration, sleep quality, sleep disorders, exercise, stress, demographics, and other health factors related to sleep.

| Column                        | Description                                                                 | Data Type       | Type Description |
|-------------------------------|-----------------------------------------------------------------------------|-----------------|------------------|
| Person ID                     | An identifier for each individual.                                           | Nominal         | Categorical data with no inherent order. |
| Gender                        | The gender of the person (Male/Female).                                      | Nominal         | Categorical data with no inherent order. |
| Age                           | The age of the person in years.                                              | Ratio           | Numeric data with a true zero point. |
| Occupation                    | The occupation or profession of the person.                                  | Nominal         | Categorical data with no inherent order. |
| Sleep Duration (hours)        | The average number of hours the person sleeps per day.                       | Ratio           | Numeric data with a true zero point. |
| Quality of Sleep (scale: 1-10)| A subjective rating of the quality of sleep, ranging from 1 to 10.           | Ordinal         | Categorical data with a meaningful order. |
| Physical Activity Level (minutes/day) | The average number of minutes the person engages in physical activity daily. | Ratio           | Numeric data with a true zero point. |
| Stress Level (scale: 1-10)    | A subjective rating of the stress level experienced by the person, ranging from 1 to 10. | Ordinal         | Categorical data with a meaningful order. |
| BMI Category                  | The BMI category of the person (e.g., Underweight, Normal, Overweight).      | Nominal         | Categorical data with no inherent order. |
| Blood Pressure (systolic/diastolic) | The average blood pressure measurement of the person, indicated as systolic pressure over diastolic pressure. | Ratio           | Numeric data with a true zero point. |
| Heart Rate (bpm)              | The average resting heart rate of the person in beats per minute.            | Ratio           | Numeric data with a true zero point. |
| Daily Steps                   | The average number of steps the person takes per day.                        | Ratio           | Numeric data with a true zero point. |
| Sleep Disorder                | The presence or absence of a sleep disorder in the person (None, Insomnia, Sleep Apnea). | Nominal         | Categorical data with no inherent order. |
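
The table describes Blood Pressure as systolic over diastolic, which suggests the column holds a single combined value per person rather than two numbers. A minimal sketch of splitting it into two numeric columns follows, assuming the values are strings such as `"126/83"`. The sample rows and the `Systolic` and `Diastolic` column names are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real values would come from sleep_health_data.csv.
sample = pd.DataFrame({"Blood Pressure": ["126/83", "140/90", "120/80"]})

# Assumption: each value is a "systolic/diastolic" string.
# Split on "/" and convert both parts to integers.
bp = sample["Blood Pressure"].str.split("/", expand=True).astype(int)
bp.columns = ["Systolic", "Diastolic"]  # illustrative column names

sample = sample.join(bp)
print(sample.dtypes)
```

With the two parts stored as integers, they can be averaged or correlated like the other ratio-scale columns.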

#### 1. Which occupation has the lowest average sleep duration? Save this in a string variable called lowest_sleep_occ.  

In [18]:
lowest_sleep_occ_df = df.groupby('Occupation')['Sleep Duration'].mean()
lowest_sleep_occ = lowest_sleep_occ_df.idxmin()
print(
    f"The occupation with the lowest average sleep duration is {lowest_sleep_occ} "
    f"with an average sleep duration of {lowest_sleep_occ_df[lowest_sleep_occ]} hours."
)

The occupation with the lowest average sleep duration is Sales Representative with an average sleep duration of 5.9 hours.
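
`idxmin()` returns only the single lowest group. To see how close the other occupations are, one option is to sort the group means and read off the first entry. The sketch below uses a small made-up DataFrame named `toy` (hypothetical values, not the SleepScope data) so that it runs on its own.

```python
import pandas as pd

# Hypothetical toy data standing in for df; the numbers are made up.
toy = pd.DataFrame({
    "Occupation": ["Nurse", "Nurse", "Engineer", "Engineer", "Teacher"],
    "Sleep Duration": [6.0, 6.4, 7.8, 8.0, 6.9],
})

# Average sleep per occupation, sorted from shortest to longest.
avg_sleep = toy.groupby("Occupation")["Sleep Duration"].mean().sort_values()

# The first index label is the occupation with the lowest average.
lowest = avg_sleep.index[0]
print(avg_sleep.round(2))
print(lowest)
```

On the real data, the same pattern applied to `df` should give the same occupation as `idxmin()`, with the full ranking available for inspection.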


#### 2. Which occupation has the lowest average sleep quality? Save this in a string variable called lowest_sleep_quality_occ. Did the occupation with the lowest sleep duration also have the lowest sleep quality? Assign a boolean value to the variable same_occ: True if it is the same occupation, and False if it isn't.  

In [25]:
lowest_sleep_quality_occ_df = df.groupby('Occupation')['Quality of Sleep'].mean()
lowest_sleep_quality_occ = lowest_sleep_quality_occ_df.idxmin()

print("The occupation with the lowest sleep quality is", lowest_sleep_quality_occ)

same_occ = lowest_sleep_occ == lowest_sleep_quality_occ
print(same_occ)

The occupation with the lowest sleep quality is Sales Representative
True
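
Questions 1 and 2 can also be answered in a single pass by averaging both columns in one `groupby` and calling `idxmin()` on the resulting DataFrame. This is a sketch on hypothetical toy data (the `toy` frame and its values are made up), not the notebook's original approach.

```python
import pandas as pd

# Hypothetical toy data standing in for df; the numbers are made up.
toy = pd.DataFrame({
    "Occupation": ["Nurse", "Nurse", "Engineer", "Engineer", "Teacher"],
    "Sleep Duration": [6.0, 6.4, 7.8, 8.0, 6.9],
    "Quality of Sleep": [5, 6, 8, 9, 7],
})

# Mean of both columns per occupation in one groupby.
means = toy.groupby("Occupation")[["Sleep Duration", "Quality of Sleep"]].mean()

# DataFrame.idxmin() returns, for each column, the index label of its minimum.
lowest_by_column = means.idxmin()

# True when the same occupation is lowest on both measures.
same = lowest_by_column["Sleep Duration"] == lowest_by_column["Quality of Sleep"]
print(lowest_by_column)
print(same)
```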


#### 3. Let's explore how BMI Category relates to sleep disorder rates. Start by finding what ratio of app users in each BMI Category have been diagnosed with Insomnia. Create a dictionary named bmi_insomnia_ratios. The key should be the BMI Category as a string, while the value should be the ratio of people in this category with insomnia as a float rounded to two decimal places.

In [47]:
# Placeholder values; the loop below overwrites each one with the computed ratio.
bmi_insomnia_ratios = {
    "Normal": None,
    "Overweight": None,
    "Obese": None
}

for bmi in bmi_insomnia_ratios:
    ppl_with_bmi = df[df['BMI Category']==bmi].shape[0]
    ppl_with_bmi_insomnia = df[(df['BMI Category']==bmi) & (df['Sleep Disorder']=="Insomnia")].shape[0]
    bmi_insomnia_ratios[bmi] = round(ppl_with_bmi_insomnia/ppl_with_bmi,2)

bmi_insomnia_ratios

{'Normal': 0.04, 'Overweight': 0.43, 'Obese': 0.4}
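
The loop above hard-codes the category names. An alternative that discovers the categories from the data is to build a boolean "has insomnia" Series and take its mean per BMI Category, since the mean of a boolean column is the share of `True` values. The sketch below uses hypothetical toy data (`toy` and its values are made up), so its numbers differ from the results above.

```python
import pandas as pd

# Hypothetical toy data standing in for df; the values are made up.
toy = pd.DataFrame({
    "BMI Category": ["Normal", "Normal", "Normal", "Normal",
                     "Overweight", "Overweight", "Obese", "Obese"],
    "Sleep Disorder": [None, None, None, "Insomnia",
                       "Insomnia", None, "Insomnia", "Sleep Apnea"],
})

# True where the person has insomnia; missing values compare as False.
is_insomnia = toy["Sleep Disorder"].eq("Insomnia")

# Mean of a boolean Series per group = fraction of the group with insomnia.
ratios = is_insomnia.groupby(toy["BMI Category"]).mean().round(2)

# Convert to a plain dict of Python floats.
ratios_dict = {category: float(value) for category, value in ratios.items()}
print(ratios_dict)
```

Because the groups come from the column itself, any category present in the data is included automatically, which may or may not be desirable if the task expects a fixed set of keys.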