# Sleep analysis

I have been tracking my sleep data for six months. I bought a Samsung Galaxy Watch Active 2, which lets me track every stage of my sleep, every night.

The goal of this project is to analyse my sleep and look for correlations between my sleep quality and external factors.

In [69]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import datetime

In [23]:
sleep_data = pd.read_csv('sleep_data_2021_02_09.csv', sep=';', skiprows=[0])

In [24]:
sleep_data.head(5)

Unnamed: 0,start_time,sleep_id,custom,update_time,create_time,stage,time_offset,deviceuuid,pkg_name,end_time,datauuid
0,2020-09-01 23:22:00.000,b953ce09-66f0-5b6f-5b56-83c7dcd10e3b,,02.09.20 07:28,02.09.20 07:28,40001,UTC+0200,0CI+8SNNP+,com.sec.android.app.shealth,2020-09-01 23:26:00.000,3713d761-0e35-bae5-ae2e-bdb42c2e1536
1,2020-09-01 23:49:00.000,b953ce09-66f0-5b6f-5b56-83c7dcd10e3b,,02.09.20 07:28,02.09.20 07:28,40002,UTC+0200,0CI+8SNNP+,com.sec.android.app.shealth,2020-09-02 00:06:00.000,72ec0096-b482-f885-2580-a6f95f91af15
2,2020-09-02 00:42:00.000,b953ce09-66f0-5b6f-5b56-83c7dcd10e3b,,02.09.20 07:28,02.09.20 07:28,40004,UTC+0200,0CI+8SNNP+,com.sec.android.app.shealth,2020-09-02 00:49:00.000,a25927cb-e04c-04d4-6231-032a60a218dc
3,2020-09-02 01:09:00.000,b953ce09-66f0-5b6f-5b56-83c7dcd10e3b,,02.09.20 07:28,02.09.20 07:28,40004,UTC+0200,0CI+8SNNP+,com.sec.android.app.shealth,2020-09-02 01:23:00.000,3a97ac3d-a55b-d34e-db1c-4b594e055823
4,2020-09-02 02:25:00.000,b953ce09-66f0-5b6f-5b56-83c7dcd10e3b,,02.09.20 07:28,02.09.20 07:28,40002,UTC+0200,0CI+8SNNP+,com.sec.android.app.shealth,2020-09-02 02:58:00.000,f12df4f9-97aa-6a4b-dd17-27f2d70b8c06


In [25]:
sleep_data.drop(['custom','sleep_id','deviceuuid','pkg_name','datauuid'],axis=1, inplace=True)

In [26]:
sleep_data

Unnamed: 0,start_time,update_time,create_time,stage,time_offset,end_time
0,2020-09-01 23:22:00.000,02.09.20 07:28,02.09.20 07:28,40001,UTC+0200,2020-09-01 23:26:00.000
1,2020-09-01 23:49:00.000,02.09.20 07:28,02.09.20 07:28,40002,UTC+0200,2020-09-02 00:06:00.000
2,2020-09-02 00:42:00.000,02.09.20 07:28,02.09.20 07:28,40004,UTC+0200,2020-09-02 00:49:00.000
3,2020-09-02 01:09:00.000,02.09.20 07:28,02.09.20 07:28,40004,UTC+0200,2020-09-02 01:23:00.000
4,2020-09-02 02:25:00.000,02.09.20 07:28,02.09.20 07:28,40002,UTC+0200,2020-09-02 02:58:00.000
...,...,...,...,...,...,...
12499,2021-02-09 15:14:00.000,09.02.21 17:27,09.02.21 17:27,40004,UTC+0100,2021-02-09 15:20:00.000
12500,2021-02-09 15:20:00.000,09.02.21 17:27,09.02.21 17:27,40002,UTC+0100,2021-02-09 15:26:00.000
12501,2021-02-09 15:26:00.000,09.02.21 17:27,09.02.21 17:27,40003,UTC+0100,2021-02-09 15:55:00.000
12502,2021-02-09 15:55:00.000,09.02.21 17:27,09.02.21 17:27,40002,UTC+0100,2021-02-09 16:00:00.000


In [27]:
# Drop update_time and create_time: they add nothing actionable beyond start_time
# and end_time.
sleep_data.drop(['update_time','create_time'],axis=1, inplace=True)

In [28]:
sleep_data

Unnamed: 0,start_time,stage,time_offset,end_time
0,2020-09-01 23:22:00.000,40001,UTC+0200,2020-09-01 23:26:00.000
1,2020-09-01 23:49:00.000,40002,UTC+0200,2020-09-02 00:06:00.000
2,2020-09-02 00:42:00.000,40004,UTC+0200,2020-09-02 00:49:00.000
3,2020-09-02 01:09:00.000,40004,UTC+0200,2020-09-02 01:23:00.000
4,2020-09-02 02:25:00.000,40002,UTC+0200,2020-09-02 02:58:00.000
...,...,...,...,...
12499,2021-02-09 15:14:00.000,40004,UTC+0100,2021-02-09 15:20:00.000
12500,2021-02-09 15:20:00.000,40002,UTC+0100,2021-02-09 15:26:00.000
12501,2021-02-09 15:26:00.000,40003,UTC+0100,2021-02-09 15:55:00.000
12502,2021-02-09 15:55:00.000,40002,UTC+0100,2021-02-09 16:00:00.000


| Stage | Meaning | Description |
| ------------- | :------------- | :------------- |
| 40001 | **Awake stage.** | Eyes open. Responsive to external stimuli. |
| 40002 | **Light stage of sleep.** | Breathing slows down and heartbeat becomes regular. Typically lasts between 1 and 20 minutes after falling asleep. |
| 40003 | **Deep stage of sleep.** | Brain waves slow down and become larger. Typically starts 35 to 45 minutes after falling asleep. |
| 40004 | **REM (Rapid Eye Movement) stage of sleep.** | Brain waves similar to waking. Most vivid dreams happen in this stage. Body does not move. |

*https://developer.samsung.com/health/server/partner-only/api-reference/data-types/sleep-stage.html*

In [37]:
# The rows are not in chronological order, so sort them by start time
# and rebuild a clean 0..n-1 index.
sleep_data = sleep_data.sort_values('start_time').reset_index(drop=True)

In [38]:
sleep_data

Unnamed: 0,start_time,stage,time_offset,end_time
0,2020-09-01 23:22:00.000,40001,UTC+0200,2020-09-01 23:26:00.000
1,2020-09-01 23:26:00.000,40003,UTC+0200,2020-09-01 23:29:00.000
2,2020-09-01 23:29:00.000,40002,UTC+0200,2020-09-01 23:30:00.000
3,2020-09-01 23:30:00.000,40003,UTC+0200,2020-09-01 23:33:00.000
4,2020-09-01 23:33:00.000,40002,UTC+0200,2020-09-01 23:43:00.000
...,...,...,...,...
12499,2021-02-09 17:18:00.000,40001,UTC+0100,2021-02-09 17:19:00.000
12500,2021-02-09 17:19:00.000,40002,UTC+0100,2021-02-09 17:20:00.000
12501,2021-02-09 17:20:00.000,40001,UTC+0100,2021-02-09 17:21:00.000
12502,2021-02-09 17:21:00.000,40002,UTC+0100,2021-02-09 17:23:00.000
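
Sorting on `start_time` while it is still a string works here because the timestamps are zero-padded and written in `YYYY-MM-DD HH:MM:SS.mmm` order, so alphabetical order coincides with chronological order. A minimal sketch that checks this on a small sample (the `sample` frame and the variable names are made up for illustration, not part of the export):

```python
import pandas as pd

# Hypothetical sample in the same timestamp format as the export.
sample = pd.DataFrame({
    'start_time': [
        '2020-09-02 00:42:00.000',
        '2020-09-01 23:22:00.000',
        '2021-02-09 15:14:00.000',
        '2020-09-01 23:49:00.000',
    ]
})

# Sort once on the raw strings and once on parsed datetimes.
by_string = sample.sort_values('start_time').reset_index(drop=True)
by_datetime = (
    sample.assign(parsed=pd.to_datetime(sample['start_time']))
          .sort_values('parsed')
          .reset_index(drop=True)
)

# Both orderings agree for zero-padded timestamps in this format.
same_order = by_string['start_time'].equals(by_datetime['start_time'])
print(same_order)
```

If the format ever changed (for example day-first dates like those in `update_time`), a string sort would no longer be safe and parsing first would be the more robust choice.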


In [50]:
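# Split each timestamp string into a date part and a time-of-day part
# (the fractional seconds are dropped).
sleep_data['start_time_date'] = sleep_data['start_time'].apply(lambda x: x.split(' ')[0])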
sleep_data['start_time_hour'] = sleep_data['start_time'].apply(lambda x: x.split(' ')[1].split('.')[0])
sleep_data['end_time_date'] = sleep_data['end_time'].apply(lambda x: x.split(' ')[0])
sleep_data['end_time_hour'] = sleep_data['end_time'].apply(lambda x: x.split(' ')[1].split('.')[0])

In [51]:
sleep_data

Unnamed: 0,start_time,stage,time_offset,end_time,start_time_date,start_time_hour,end_time_date,end_time_hour
0,2020-09-01 23:22:00.000,40001,UTC+0200,2020-09-01 23:26:00.000,2020-09-01,23:22:00,2020-09-01,23:26:00
1,2020-09-01 23:26:00.000,40003,UTC+0200,2020-09-01 23:29:00.000,2020-09-01,23:26:00,2020-09-01,23:29:00
2,2020-09-01 23:29:00.000,40002,UTC+0200,2020-09-01 23:30:00.000,2020-09-01,23:29:00,2020-09-01,23:30:00
3,2020-09-01 23:30:00.000,40003,UTC+0200,2020-09-01 23:33:00.000,2020-09-01,23:30:00,2020-09-01,23:33:00
4,2020-09-01 23:33:00.000,40002,UTC+0200,2020-09-01 23:43:00.000,2020-09-01,23:33:00,2020-09-01,23:43:00
...,...,...,...,...,...,...,...,...
12499,2021-02-09 17:18:00.000,40001,UTC+0100,2021-02-09 17:19:00.000,2021-02-09,17:18:00,2021-02-09,17:19:00
12500,2021-02-09 17:19:00.000,40002,UTC+0100,2021-02-09 17:20:00.000,2021-02-09,17:19:00,2021-02-09,17:20:00
12501,2021-02-09 17:20:00.000,40001,UTC+0100,2021-02-09 17:21:00.000,2021-02-09,17:20:00,2021-02-09,17:21:00
12502,2021-02-09 17:21:00.000,40002,UTC+0100,2021-02-09 17:23:00.000,2021-02-09,17:21:00,2021-02-09,17:23:00


In [61]:
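def stage_pairing(x):
    # Translate a Samsung Health stage code into a readable label.
    if x == 40001:
        return 'Awaken'
    elif x == 40002:
        return 'Light'
    elif x == 40003:
        return 'Deep'
    elif x == 40004:
        return 'REM'
    # Unknown codes are returned unchanged instead of being silently labelled REM.
    return x

sleep_data['stage_of_sleep'] = sleep_data['stage'].apply(stage_pairing)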

In [62]:
sleep_data

Unnamed: 0,start_time,stage,time_offset,end_time,start_time_date,start_time_hour,end_time_date,end_time_hour,stage_of_sleep
0,2020-09-01 23:22:00.000,40001,UTC+0200,2020-09-01 23:26:00.000,2020-09-01,23:22:00,2020-09-01,23:26:00,Awaken
1,2020-09-01 23:26:00.000,40003,UTC+0200,2020-09-01 23:29:00.000,2020-09-01,23:26:00,2020-09-01,23:29:00,Deep
2,2020-09-01 23:29:00.000,40002,UTC+0200,2020-09-01 23:30:00.000,2020-09-01,23:29:00,2020-09-01,23:30:00,Light
3,2020-09-01 23:30:00.000,40003,UTC+0200,2020-09-01 23:33:00.000,2020-09-01,23:30:00,2020-09-01,23:33:00,Deep
4,2020-09-01 23:33:00.000,40002,UTC+0200,2020-09-01 23:43:00.000,2020-09-01,23:33:00,2020-09-01,23:43:00,Light
...,...,...,...,...,...,...,...,...,...
12499,2021-02-09 17:18:00.000,40001,UTC+0100,2021-02-09 17:19:00.000,2021-02-09,17:18:00,2021-02-09,17:19:00,Awaken
12500,2021-02-09 17:19:00.000,40002,UTC+0100,2021-02-09 17:20:00.000,2021-02-09,17:19:00,2021-02-09,17:20:00,Light
12501,2021-02-09 17:20:00.000,40001,UTC+0100,2021-02-09 17:21:00.000,2021-02-09,17:20:00,2021-02-09,17:21:00,Awaken
12502,2021-02-09 17:21:00.000,40002,UTC+0100,2021-02-09 17:23:00.000,2021-02-09,17:21:00,2021-02-09,17:23:00,Light
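
The same code-to-label translation can also be written as a dictionary lookup. This is only a sketch of an alternative, not what the notebook runs: `STAGE_LABELS` and the `stages` sample are names introduced here, and `49999` is a made-up code used to show what happens to values missing from the table.

```python
import pandas as pd

# Stage codes and labels as listed in the stage table, using the notebook's labels.
STAGE_LABELS = {
    40001: 'Awaken',
    40002: 'Light',
    40003: 'Deep',
    40004: 'REM',
}

# Hypothetical sample; 49999 is a made-up code that is not in the table.
stages = pd.Series([40001, 40003, 40002, 40004, 49999])

# Series.map looks each code up in the dictionary;
# codes without an entry become NaN, which makes them easy to spot.
labels = stages.map(STAGE_LABELS)
print(labels.tolist())
```

On the real frame this would read `sleep_data['stage'].map(STAGE_LABELS)`, followed by a check such as `labels.isna().sum()` to confirm that every code was recognised.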


In [64]:
sleep_data.drop(['start_time','stage','time_offset','end_time'], axis=1, inplace=True)

In [66]:
sleep_data.head(20)

Unnamed: 0,start_time_date,start_time_hour,end_time_date,end_time_hour,stage_of_sleep
0,2020-09-01,23:22:00,2020-09-01,23:26:00,Awaken
1,2020-09-01,23:26:00,2020-09-01,23:29:00,Deep
2,2020-09-01,23:29:00,2020-09-01,23:30:00,Light
3,2020-09-01,23:30:00,2020-09-01,23:33:00,Deep
4,2020-09-01,23:33:00,2020-09-01,23:43:00,Light
5,2020-09-01,23:43:00,2020-09-01,23:49:00,Awaken
6,2020-09-01,23:49:00,2020-09-02,00:06:00,Light
7,2020-09-02,00:06:00,2020-09-02,00:07:00,Awaken
8,2020-09-02,00:07:00,2020-09-02,00:21:00,Deep
9,2020-09-02,00:21:00,2020-09-02,00:33:00,Light
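
Because the date and hour columns are plain strings, a phase that crosses midnight (row 6 above) needs both its date and its hour to get the right length. One possible approach, sketched on rows 5 to 7 copied from the output above, is to glue the strings back together, parse them with `pd.to_datetime`, and subtract. The `phases` frame and the `duration_min` column name are assumptions made for this example.

```python
import pandas as pd

# Rows 5, 6 and 7 from the output above.
phases = pd.DataFrame({
    'start_time_date': ['2020-09-01', '2020-09-01', '2020-09-02'],
    'start_time_hour': ['23:43:00', '23:49:00', '00:06:00'],
    'end_time_date':   ['2020-09-01', '2020-09-02', '2020-09-02'],
    'end_time_hour':   ['23:49:00', '00:06:00', '00:07:00'],
    'stage_of_sleep':  ['Awaken', 'Light', 'Awaken'],
})

# Rebuild full timestamps from the date and hour strings, then parse them.
start = pd.to_datetime(phases['start_time_date'] + ' ' + phases['start_time_hour'])
end = pd.to_datetime(phases['end_time_date'] + ' ' + phases['end_time_hour'])

# Subtracting datetimes gives a Timedelta; convert it to minutes.
phases['duration_min'] = (end - start).dt.total_seconds() / 60

# Total minutes per stage in this small sample.
minutes_per_stage = phases.groupby('stage_of_sleep')['duration_min'].sum()
print(phases[['stage_of_sleep', 'duration_min']])
print(minutes_per_stage)
```

Keeping the duration as a plain number of minutes makes later aggregation (per night, per stage) straightforward; keeping the raw `Timedelta` would work as well if exact time arithmetic is preferred.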


In [71]:
# Dates and hours are still strings. I need to convert them to datetime values so that I can add a
# new "Duration" column holding the length of each sleep phase.