# Sleep Durations

> Author: [Yalim Demirkesen](https://github.com/demirkeseny)

This notebook analyzes the sleep data of 8 Fitbit users. The data is first preprocessed to put the columns in the desired format and to keep only the selected users. The remaining data is then analyzed to get an overall picture and to interpret the trends.

In [29]:
# necessary libraries:
import pandas as pd
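# Note: from plotly 4 onward this module lives in the separate
# chart_studio package (chart_studio.plotly); the import below needs plotly < 4.
import plotly.plotly as py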
import plotly.graph_objs as go

In [2]:
csv_loc = './mturkfitbit_export_4.12.16-5.12.16/Fitabase Data 4.12.16-5.12.16/sleepDay_merged.csv'

In [9]:
sleep = pd.read_csv(csv_loc)

In [18]:
sleep = sleep.drop(columns=['TotalSleepRecords'])

In [19]:
sleep.shape

(168, 4)

In [20]:
# Ids of the 8 participants to keep (a plain list of Ids, not a boolean mask):
mask = [8792009665,
 5553957443,
 5577150313,
 4020332650,
 6962181067,
 6117666160,
 2347167796,
 4388161847]

In [21]:
# Let's filter the data so that we only have the users we need:
sleep = sleep.loc[sleep['Id'].isin(mask)]
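
The shape reported before and after this filter is the same, (168, 4), so it can be worth confirming what `isin` actually removed. The following is a minimal sketch on a made-up toy frame; `toy`, `keep_ids`, `missing`, and `dropped` are hypothetical names, and the Ids and minutes are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the sleep table; the Ids and minutes are made up.
toy = pd.DataFrame({
    "Id": [101, 101, 202, 303],
    "TotalMinutesAsleep": [420, 390, 450, 300],
})
keep_ids = [101, 303]  # hypothetical list playing the role of `mask`

# Same pattern as the notebook: keep only rows whose Id is in the list.
filtered = toy.loc[toy["Id"].isin(keep_ids)].reset_index(drop=True)

# Ids that were requested but never appear in the data.
missing = sorted(set(keep_ids) - set(filtered["Id"]))
# Number of rows the filter removed.
dropped = len(toy) - len(filtered)
```

On the real data, the same two checks would show whether every Id in `mask` has sleep records and how many rows were discarded.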

In [22]:
sleep.reset_index(drop=True, inplace=True)

In [23]:
sleep.head()

| | Id | SleepDay | TotalMinutesAsleep | TotalTimeInBed |
|---|---|---|---|---|
| 0 | 2347167796 | 4/13/2016 12:00:00 AM | 467 | 531 |
| 1 | 2347167796 | 4/14/2016 12:00:00 AM | 445 | 489 |
| 2 | 2347167796 | 4/15/2016 12:00:00 AM | 452 | 504 |
| 3 | 2347167796 | 4/17/2016 12:00:00 AM | 556 | 602 |
| 4 | 2347167796 | 4/18/2016 12:00:00 AM | 500 | 557 |


In [24]:
sleep.shape

(168, 4)

In [25]:
sleep.dtypes

Id                     int64
SleepDay              object
TotalMinutesAsleep     int64
TotalTimeInBed         int64
dtype: object

In [26]:
sleep.SleepDay = pd.to_datetime(sleep.SleepDay)
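
The call above lets pandas infer the date format. As a hedged alternative, the format can be spelled out so that parsing does not depend on inference. The sketch below assumes every `SleepDay` value follows the `month/day/year hour:minute:second AM/PM` pattern seen in the `head()` output; `raw` and `parsed` are illustrative names.

```python
import pandas as pd

# Two SleepDay strings copied from the head() output above.
raw = pd.Series(["4/13/2016 12:00:00 AM", "4/14/2016 12:00:00 AM"])

# %I is the 12-hour clock and %p the AM/PM marker,
# so "12:00:00 AM" is parsed as midnight.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p")
```

With an explicit format, a value that does not match the pattern raises an error instead of being silently parsed in a different way.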

In [27]:
sleep.head()

| | Id | SleepDay | TotalMinutesAsleep | TotalTimeInBed |
|---|---|---|---|---|
| 0 | 2347167796 | 2016-04-13 | 467 | 531 |
| 1 | 2347167796 | 2016-04-14 | 445 | 489 |
| 2 | 2347167796 | 2016-04-15 | 452 | 504 |
| 3 | 2347167796 | 2016-04-17 | 556 | 602 |
| 4 | 2347167796 | 2016-04-18 | 500 | 557 |


In [28]:
sleep.dtypes

Id                             int64
SleepDay              datetime64[ns]
TotalMinutesAsleep             int64
TotalTimeInBed                 int64
dtype: object

In [30]:
participants_avg = pd.DataFrame(None, 
            columns=['Id','TotalMinutesAsleep','TotalTimeInBed'])

In [31]:
participants_avg['Id'] = pd.Series(mask)

In [32]:
# Fill in each participant's mean sleep and in-bed minutes.
# .loc[row, col] writes into the frame directly; the chained form
# participants_avg[col][i] = ... triggers pandas' SettingWithCopyWarning.
for col in participants_avg.columns.tolist()[1:]:
    for i in range(len(participants_avg)):
        participants_avg.loc[i, col] = round(sleep.loc[sleep['Id'] == participants_avg.loc[i, 'Id'], col].mean(), 4)


In [34]:
participants_avg['Id'] = participants_avg['Id'].astype(str)

In [38]:
participants_avg.columns = ['Id','AvgSleep','AvgBed']

In [39]:
participants_avg

| | Id | AvgSleep | AvgBed |
|---|---|---|---|
| 0 | 8792009665 | 435.667 | 453.8 |
| 1 | 5553957443 | 463.484 | 505.871 |
| 2 | 5577150313 | 432.0 | 460.615 |
| 3 | 4020332650 | 349.375 | 379.75 |
| 4 | 6962181067 | 448.0 | 466.129 |
| 5 | 6117666160 | 478.778 | 510.167 |
| 6 | 2347167796 | 446.8 | 491.333 |
| 7 | 4388161847 | 403.125 | 426.208 |
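
The per-participant averages were built with an explicit double loop. A more compact route to a table of the same form is `groupby`. The sketch below is one possible alternative, not the notebook's method, and it runs on a made-up toy frame; `toy` and `avg` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for the filtered `sleep` frame (values are made up).
toy = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "TotalMinutesAsleep": [400, 420, 300, 360],
    "TotalTimeInBed": [450, 470, 330, 390],
})

avg = (
    toy.groupby("Id")[["TotalMinutesAsleep", "TotalTimeInBed"]]
    .mean()        # mean of each column per participant
    .round(4)      # same rounding as the loop version
    .rename(columns={"TotalMinutesAsleep": "AvgSleep",
                     "TotalTimeInBed": "AvgBed"})
    .reset_index() # turn Id back into a regular column
)
avg["Id"] = avg["Id"].astype(str)  # string Ids, as in the notebook
```

One difference to keep in mind: `groupby` sorts the result by `Id`, whereas the loop keeps the order of the `mask` list.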


In [42]:
trace1 = go.Bar(
    x=["Id:" + identity for identity in participants_avg['Id'].tolist()],
    y=participants_avg['AvgSleep'].tolist(),
    name='Average Sleeping Duration (min)'
)
trace2 = go.Bar(
    x=["Id:" + identity for identity in participants_avg['Id'].tolist()],
    y=participants_avg['AvgBed'].tolist(),
    name='Average in Bed Duration (min)'
)

data = [trace1, trace2]
layout = go.Layout(
    barmode='group'
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='grouped-bar')