![insomnia](insomnia.jpg)


We have anonymized sleep data from a sleep tracking app. Our aim is to analyze this lifestyle survey data with Python to discover relationships between exercise, gender, occupation, and sleep quality, and to identify patterns that lead to insights on sleep quality.

# Reading the data

## Importing Libraries

In [49]:
import pandas as pd

In [50]:
sleep_data = pd.read_csv('sleep_health_data.csv')

## Viewing the data information

In [51]:
sleep_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 374 entries, 0 to 373
Data columns (total 13 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Person ID                374 non-null    int64  
 1   Gender                   374 non-null    object 
 2   Age                      374 non-null    int64  
 3   Occupation               374 non-null    object 
 4   Sleep Duration           374 non-null    float64
 5   Quality of Sleep         374 non-null    int64  
 6   Physical Activity Level  374 non-null    int64  
 7   Stress Level             374 non-null    int64  
 8   BMI Category             374 non-null    object 
 9   Blood Pressure           374 non-null    object 
 10  Heart Rate               374 non-null    int64  
 11  Daily Steps              374 non-null    int64  
 12  Sleep Disorder           374 non-null    object 
dtypes: float64(1), int64(7), object(5)
memory usage: 38.1+ KB


Every column reports 374 non-null entries out of 374 rows, so the data does not contain any null (NaN) values.
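The same check can be made explicitly rather than read off the `info()` summary. The snippet below is a minimal sketch that uses a small hypothetical `sample` frame in place of `sleep_data` (the rows are illustrative only); `isna().sum()` reports the number of missing values per column.

```python
import pandas as pd

# Hypothetical three-row sample standing in for sleep_data
sample = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Male'],
    'Sleep Duration': [6.1, 7.2, 5.9],
    'Sleep Disorder': ['Insomnia', 'Sleep Apnea', 'Insomnia'],
})

# Number of missing values in each column
missing_per_column = sample.isna().sum()
print(missing_per_column)

# True only if no column contains a missing value
has_no_missing = not sample.isna().any().any()
print(has_no_missing)
```

Running the same two calls on `sleep_data` should report zero for every column if the `info()` output above holds.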

In [52]:
# Viewing first 10 rows of data
sleep_data.head(10)

| | Person ID | Gender | Age | Occupation | Sleep Duration | Quality of Sleep | Physical Activity Level | Stress Level | BMI Category | Blood Pressure | Heart Rate | Daily Steps | Sleep Disorder |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Male | 27 | Software Engineer | 6.1 | 6 | 42 | 6 | Overweight | 126/83 | 77 | 4200 | |
| 1 | 2 | Male | 28 | Doctor | 6.2 | 6 | 60 | 8 | Normal | 125/80 | 75 | 10000 | |
| 2 | 3 | Male | 28 | Doctor | 6.2 | 6 | 60 | 8 | Normal | 125/80 | 75 | 10000 | |
| 3 | 4 | Male | 28 | Sales Representative | 5.9 | 4 | 30 | 8 | Obese | 140/90 | 85 | 3000 | Sleep Apnea |
| 4 | 5 | Male | 28 | Sales Representative | 5.9 | 4 | 30 | 8 | Obese | 140/90 | 85 | 3000 | Sleep Apnea |
| 5 | 6 | Male | 28 | Software Engineer | 5.9 | 4 | 30 | 8 | Obese | 140/90 | 85 | 3000 | Insomnia |
| 6 | 7 | Male | 29 | Teacher | 6.3 | 6 | 40 | 7 | Obese | 140/90 | 82 | 3500 | Insomnia |
| 7 | 8 | Male | 29 | Doctor | 7.8 | 7 | 75 | 6 | Normal | 120/80 | 70 | 8000 | |
| 8 | 9 | Male | 29 | Doctor | 7.8 | 7 | 75 | 6 | Normal | 120/80 | 70 | 8000 | |
| 9 | 10 | Male | 29 | Doctor | 7.8 | 7 | 75 | 6 | Normal | 120/80 | 70 | 8000 | |


# Exploring the Data

## Occupation with the lowest average sleep duration

In [53]:
lowest_sleep_occ_df = sleep_data.groupby(['Occupation']).aggregate({'Sleep Duration':'mean'}).sort_values(by=['Sleep Duration']).reset_index()

In [54]:
# First row after the ascending sort holds the occupation with the lowest mean
lowest_sleep_occ = lowest_sleep_occ_df.loc[0, 'Occupation']
print(lowest_sleep_occ)

Sales Representative


**Sales Representative** is the occupation with the lowest average sleep duration.

## Occupation with the lowest average quality of sleep

In [55]:
lowest_sleep_quality_df = sleep_data.groupby(['Occupation']).aggregate({'Quality of Sleep':'mean'}).sort_values(by=['Quality of Sleep']).reset_index()

In [56]:
# First row after the ascending sort holds the occupation with the lowest mean
lowest_sleep_quality_occ = lowest_sleep_quality_df.loc[0, 'Occupation']
print(lowest_sleep_quality_occ)

Sales Representative


**Sales Representative** is also the occupation with the lowest average sleep quality.

In [57]:
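# True when the same occupation ranks lowest on both sleep duration and sleep quality
same_occ = lowest_sleep_occ == lowest_sleep_quality_occ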
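Both lookups can also be done in one pass. The sketch below is one possible alternative, not the approach used in the cells above: it averages both columns per occupation and uses `idxmin()` to return the occupation label with the lowest mean in each column. The `sample` frame is a hypothetical stand-in for `sleep_data`, so its values are illustrative only.

```python
import pandas as pd

# Hypothetical sample standing in for sleep_data
sample = pd.DataFrame({
    'Occupation': ['Doctor', 'Doctor', 'Teacher', 'Sales Representative'],
    'Sleep Duration': [7.8, 6.2, 6.3, 5.9],
    'Quality of Sleep': [7, 6, 6, 4],
})

# Mean of both metrics per occupation (index is the occupation name)
occupation_means = sample.groupby('Occupation')[['Sleep Duration', 'Quality of Sleep']].mean()

# idxmin returns, for each column, the index label of its smallest value
lowest = occupation_means.idxmin()
lowest_sleep_occ = lowest['Sleep Duration']
lowest_sleep_quality_occ = lowest['Quality of Sleep']

# Compare the two results instead of hard-coding the answer
same_occ = lowest_sleep_occ == lowest_sleep_quality_occ
print(lowest_sleep_occ, lowest_sleep_quality_occ, same_occ)
```

This avoids sorting and resetting the index, and `same_occ` stays correct if the underlying data changes.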

## BMI Categories Analysis

In [58]:
users_per_bmi_cat = sleep_data.groupby('BMI Category').aggregate({'Sleep Disorder':'count'}).reset_index()
print(users_per_bmi_cat)

  BMI Category  Sleep Disorder
0       Normal             216
1        Obese              10
2   Overweight             148
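One caveat about this count: `count` tallies only non-null values of `Sleep Disorder`, so it equals the number of users per category only because that column has no missing entries. The sketch below, on a hypothetical `sample` frame with one deliberately missing value, shows how `size()` counts rows regardless of missing data.

```python
import pandas as pd

# Hypothetical sample with one missing Sleep Disorder value
sample = pd.DataFrame({
    'BMI Category': ['Normal', 'Normal', 'Overweight', 'Obese'],
    'Sleep Disorder': [None, 'Insomnia', 'Insomnia', 'Sleep Apnea'],
})

# size() counts every row in the group
rows_per_category = sample.groupby('BMI Category').size()

# count() skips missing values in the selected column
non_null_per_category = sample.groupby('BMI Category')['Sleep Disorder'].count()

print(rows_per_category)
print(non_null_per_category)
```

In this sample the `Normal` group has two rows but only one non-null `Sleep Disorder` value, so the two results differ for that category.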


### Users with Insomnia in different BMI Categories

In [59]:
insomnia_df = sleep_data.loc[sleep_data['Sleep Disorder'] == 'Insomnia']
users_per_bmi_cat_insomnia = insomnia_df.groupby('BMI Category').aggregate({'Sleep Disorder':'count'}).reset_index()
print(users_per_bmi_cat_insomnia)

  BMI Category  Sleep Disorder
0       Normal               9
1        Obese               4
2   Overweight              64


In [60]:
# Placeholder values; each ratio is filled in by the loop in the next cell
bmi_insomnia_ratios = {'Normal': 0.0, 'Overweight': 0.0, 'Obese': 0.0}

In [61]:
# Ratio of insomnia cases to all users in each BMI category, rounded to 2 decimals
for category in list(bmi_insomnia_ratios):
    insomnia_count = users_per_bmi_cat_insomnia.loc[users_per_bmi_cat_insomnia['BMI Category'] == category, 'Sleep Disorder'].iloc[0]
    total_count = users_per_bmi_cat.loc[users_per_bmi_cat['BMI Category'] == category, 'Sleep Disorder'].iloc[0]
    bmi_insomnia_ratios[category] = round(float(insomnia_count / total_count), 2)

### Ratio of app users diagnosed with **Insomnia** in each BMI Category

In [62]:
print(bmi_insomnia_ratios)

{'Normal': 0.04, 'Overweight': 0.43, 'Obese': 0.4}
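These ratios can also be computed without intermediate count tables. The sketch below is an alternative formulation, not the method used above: the mean of a boolean "has insomnia" flag within each BMI category is the share of users with insomnia. The `sample` frame is hypothetical, and the string `'None'` for users without a disorder is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for sleep_data
sample = pd.DataFrame({
    'BMI Category': ['Normal', 'Normal', 'Normal', 'Normal',
                     'Overweight', 'Overweight', 'Obese', 'Obese'],
    'Sleep Disorder': ['None', 'None', 'None', 'Insomnia',
                       'Insomnia', 'Sleep Apnea', 'Insomnia', 'Insomnia'],
})

# Boolean flag per user: True if diagnosed with insomnia
has_insomnia = sample['Sleep Disorder'] == 'Insomnia'

# Mean of the flag per BMI category = fraction of users with insomnia
ratios = has_insomnia.groupby(sample['BMI Category']).mean().round(2)
ratio_dict = ratios.to_dict()
print(ratio_dict)
```

A category with no insomnia cases gets a ratio of 0.0 here, whereas a lookup with `.iloc[0]` on a filtered count table would raise an `IndexError` for it. When reading the results above, note that the Obese category has only 10 users, so its ratio of 0.4 rests on four cases.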
