# Sleep classification using a wrist-worn accelerometer

## Data 
We will be working with the [Newcastle dataset](https://zenodo.org/record/1160410#.ZAfRby-l1qs), which contains a single night of polysomnography (PSG) data from each of 28 sleep clinic patients.

The tri-axial accelerometer data were sampled at 30 Hz, and we have grouped every 30 seconds of readings into an "epoch". The data therefore comprise a sequence of epochs, each containing $30\times 30=900$ readings from each of the three axes.

input format: `n x 3 x 900`

sleep_label: `n x 1`
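
To make the `n x 3 x 900` layout concrete, here is a minimal sketch of how a continuous recording could be cut into epochs. It assumes the raw signal is stored as a `(num_samples, 3)` array, which the dataset description does not confirm; `segment_into_epochs` and the synthetic `raw_demo` are hypothetical names used only for illustration.

```python
import numpy as np

SAMPLE_RATE_HZ = 30   # sampling frequency stated above
EPOCH_SECONDS = 30    # epoch length stated above
SAMPLES_PER_EPOCH = SAMPLE_RATE_HZ * EPOCH_SECONDS  # 900

def segment_into_epochs(raw):
    """Reshape a (num_samples, 3) recording into (n, 3, 900) epochs.

    Trailing samples that do not fill a whole epoch are dropped.
    """
    n_epochs = raw.shape[0] // SAMPLES_PER_EPOCH
    trimmed = raw[: n_epochs * SAMPLES_PER_EPOCH]
    # (n, 900, 3) -> (n, 3, 900): axes first, samples last
    return trimmed.reshape(n_epochs, SAMPLES_PER_EPOCH, 3).transpose(0, 2, 1)

# Synthetic stand-in for a raw recording (assumed layout): 2.5 epochs of samples
rng = np.random.default_rng(0)
raw_demo = rng.normal(size=(2250, 3))
epochs_demo = segment_into_epochs(raw_demo)
print(epochs_demo.shape)  # (2, 3, 900)
```

The half epoch at the end is discarded, so every remaining epoch lines up with exactly one 30-second PSG label.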

## Learning objectives 
1. Sleep prediction using a heuristic-based method
2. Sleep parameters derivation 
3. Sleep visualisation 

## 0. Data loading

In [None]:
import numpy as np
import os
%matplotlib inline

In [None]:
prac_root = './'
y_path = os.path.join(prac_root, 'y.npy')
x_path = os.path.join(prac_root, 'X.npy')
times_path = os.path.join(prac_root, 'times.npy')
times = np.load(times_path)
X = np.load(x_path)
y_five_class = np.load(y_path) # each 30-second PSG epoch was scored as one of five sleep stages

In [None]:
X.shape # 1069 epochs

In [None]:
y_five_class

In [None]:
WAKE_LABEL = 0
SLEEP_LABEL = 1

# for simplicity, we only distinguish two classes: wake and sleep
EPOCH_LENGTH = 30
label_dict = {'N1': SLEEP_LABEL,
              'N2': SLEEP_LABEL,
              'N3': SLEEP_LABEL,
              'R': SLEEP_LABEL,
              'W': WAKE_LABEL
            }
sleep_labels = [label_dict[my_class] for my_class in y_five_class]
sleep_labels = np.array(sleep_labels)

In [None]:
np.unique(sleep_labels) # now our sleep labels only contain 0s and 1s 

## 1. Sleep prediction 

### 1.1 Simple sleep classifier

In this section, we will implement a rule-based method to discriminate between wake and sleep. A very naive approach looks at the mean of the per-axis standard deviations over an epoch. If the mean standard deviation is below a threshold, we consider that epoch to be sleep. Formally, given one epoch of tri-axial signal $\vec{a} \in \mathbb{R}^{3 \times 900}$, the standard deviation $\sigma_i(\vec{a})$ of axis $i$, and a threshold value $\lambda$, the threshold method is as follows:

$$
\text{sleep}(\vec{a}) = 
\begin{cases}
\text{FALSE} & \text{if } \frac{1}{3} \sum_{i \in \{x,y,z\}} \sigma_i(\vec{a}) \geq \lambda \\
\text{TRUE} & \text{otherwise}
\end{cases}
$$

In [None]:
from numpy import linalg as LA

In [None]:
np.std(X[0],axis=1).shape

In [None]:
def simple_class_classifier(x, threshold=0.01):
    # x of size 3 by 900
    
    std_axis = np.std(x,axis=1)
    mean_std = np.mean(std_axis)
    
    if mean_std >= threshold:
        return WAKE_LABEL
    else:
        return SLEEP_LABEL

In [None]:
# make classifications using the sample data 
sleep_pred = [simple_class_classifier(my_window) for my_window in X]
sleep_pred = np.array(sleep_pred)
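
The per-epoch loop above can also be written as one vectorised NumPy expression. The following is a minimal sketch of that idea, not a replacement for the function above; `classify_epochs` and the synthetic `still` and `moving` epochs are hypothetical and only illustrate the rule on data where the answer is obvious.

```python
import numpy as np

WAKE, SLEEP = 0, 1  # same encoding as WAKE_LABEL / SLEEP_LABEL

def classify_epochs(epochs, threshold=0.01):
    """Label every epoch in an (n, 3, 900) array in one pass."""
    # std over the sample axis gives (n, 3); mean over the axes gives (n,)
    mean_std = epochs.std(axis=2).mean(axis=1)
    return np.where(mean_std >= threshold, WAKE, SLEEP)

# Synthetic epochs: one nearly still, one with much larger movement
rng = np.random.default_rng(0)
still = rng.normal(scale=0.001, size=(3, 900))
moving = rng.normal(scale=0.1, size=(3, 900))
demo_pred = classify_epochs(np.stack([still, moving]))
print(demo_pred)  # [1 0]
```

The still epoch falls below the threshold and is labelled sleep, while the moving epoch is labelled wake, which matches the case analysis in the formula.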

### 1.2 Classification evaluation 
Let's compute the sensitivity and specificity against the ground truth.

In [None]:
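# Confusion matrix, accuracy, sensitivity and specificity
from sklearn.metrics import confusion_matrix

# Rows are true labels and columns are predictions, ordered [wake, sleep]
cm1 = confusion_matrix(sleep_labels,
                       sleep_pred,
                       labels=[WAKE_LABEL, SLEEP_LABEL])
print('Confusion Matrix : \n', cm1)

total1 = cm1.sum()
# accuracy: fraction of epochs classified correctly
accuracy1 = (cm1[0,0] + cm1[1,1]) / total1
print('Accuracy : ', accuracy1)

# sleep is the positive class, so sensitivity is the fraction of true sleep epochs detected
sensitivity1 = cm1[1,1] / (cm1[1,0] + cm1[1,1])
print('Sensitivity : ', sensitivity1)

# specificity is the fraction of true wake epochs detected
specificity1 = cm1[0,0] / (cm1[0,0] + cm1[0,1])
print('Specificity : ', specificity1)

### Assignment: Custom sleep classifier
Treating sleep as the positive class, our current classifier has a high sensitivity but a low specificity: it detects most sleep epochs but misses many wake epochs. There must be ways to improve its performance. Can you try to build your own sleep classifier by extracting your own features?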

You might want to explore:
* Relationship between different windows 
* Commonly used spatiotemporal features like mean, frequency and power 
* If you are ambitious enough, there are well-validated rule-based methods that you can try to implement. Refer to [Towards Benchmarked Sleep Detection with Wrist-Worn Sensing Units](https://ieeexplore.ieee.org/document/7052479)
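
As a starting point for the feature ideas listed above, here is a minimal sketch that computes a mean, a dominant frequency and a power value for one epoch. Working on the vector magnitude of the three axes is an assumption made here for simplicity, and `extract_features` and `demo_epoch` are hypothetical names; this is one possible feature set, not a validated method.

```python
import numpy as np

SAMPLE_RATE_HZ = 30  # sampling frequency of the accelerometer data

def extract_features(epoch):
    """Return (mean, dominant frequency in Hz, power) for one (3, 900) epoch."""
    magnitude = np.linalg.norm(epoch, axis=0)   # vector magnitude, shape (900,)
    centred = magnitude - magnitude.mean()      # remove the constant offset
    spectrum = np.abs(np.fft.rfft(centred)) ** 2
    freqs = np.fft.rfftfreq(centred.size, d=1 / SAMPLE_RATE_HZ)
    dominant_freq = freqs[np.argmax(spectrum)]  # frequency with the most energy
    power = np.mean(centred ** 2)               # variance of the magnitude signal
    return magnitude.mean(), dominant_freq, power

# Synthetic epoch: a 2 Hz oscillation around 1 on the x axis, other axes at zero
t = np.arange(900) / SAMPLE_RATE_HZ
demo_epoch = np.zeros((3, 900))
demo_epoch[0] = 1 + 0.1 * np.sin(2 * np.pi * 2 * t)
mean_mag, dom_freq, power = extract_features(demo_epoch)
print(round(mean_mag, 3), round(dom_freq, 3), round(power, 4))
```

For this synthetic epoch the dominant frequency comes out at 2 Hz, the frequency that was injected. Features like these could be stacked per epoch and fed into a threshold rule or any classifier of your choice.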

## 2. Sleep parameter estimation 
We now try to estimate the following sleep parameters:
- *Total sleep time*,
- *Sleep efficiency*, which is your total sleep time divided by your time in bed,
- *Sleep onset*, which is when you transition from being awake to asleep, 
- *Sleep onset latency (SOL)*, which is how long it takes you to fall asleep from attempting to sleep, and 
- *Wake after sleep onset (WASO)*, which is the amount of time spent awake after initially falling asleep and before the final awakening.

### 2.1 Total sleep time estimate 
Getting *total sleep time* (TST) is easy because you just need to count the number of sleep labels in the input array.

In [None]:
def get_tst(my_sleep_labels):
    # output in minutes 
    return np.sum(my_sleep_labels==SLEEP_LABEL) * EPOCH_LENGTH / 60

In [None]:
get_tst(sleep_labels)

### 2.2 Sleep efficiency

Sleep efficiency is just TST / Time in bed.

In [None]:
def get_se(my_sleep_labels):
    # assuming all the input labels are time in bed 
    timeinbed = len(my_sleep_labels)  * EPOCH_LENGTH / 60
    return get_tst(my_sleep_labels) / timeinbed

In [None]:
get_se(sleep_labels)
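
A hand-checkable toy sequence is a useful sanity check for the two functions above. The sketch below re-implements them under hypothetical names (`tst_minutes`, `sleep_efficiency`) so that it runs on its own, and the label sequence `toy_labels` is made up for illustration.

```python
import numpy as np

SLEEP = 1           # same encoding as SLEEP_LABEL
EPOCH_SECONDS = 30  # same value as EPOCH_LENGTH

def tst_minutes(labels):
    """Total sleep time in minutes: number of sleep epochs times epoch length."""
    return np.sum(labels == SLEEP) * EPOCH_SECONDS / 60

def sleep_efficiency(labels):
    """TST divided by time in bed, assuming every epoch counts as time in bed."""
    time_in_bed = len(labels) * EPOCH_SECONDS / 60
    return tst_minutes(labels) / time_in_bed

toy_labels = np.array([0, 0, 1, 1, 1, 0, 1, 1])  # 8 epochs = 4 minutes in bed
print(tst_minutes(toy_labels))       # 2.5
print(sleep_efficiency(toy_labels))  # 0.625
```

Five of the eight epochs are sleep, so TST is $5 \times 30 / 60 = 2.5$ minutes and sleep efficiency is $2.5 / 4 = 0.625$.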

### Assignment: implement sleep onset latency (SOL) estimation 
Having seen how to compute total sleep time and sleep efficiency, could you implement the following two functions, `get_sleep_onset` and `get_sleep_onset_latency`, which estimate *sleep onset* and *sleep onset latency* respectively?

In [None]:
def get_sleep_onset(my_sleep_labels, times):
    # output the time of sleep onset
    pass # TODO

In [None]:
def get_sleep_onset_latency(my_sleep_labels):
    # output in minutes
    pass # TODO

Q: What is the SOL for this particular subject?

### Assignment: Wake after sleep onset estimation 
Can you implement a function `get_waso` to estimate the wake after sleep onset? 

In [None]:
def get_waso(my_sleep_labels):
    # output in minutes
    pass # TODO

Q: What is the WASO for this particular subject?

## 3. Sleep visualisation
Finally, let's come up with informative ways of visualising the sleep data.

In [None]:
import seaborn as sns 
import pandas as pd
from matplotlib.pyplot import figure
import matplotlib.dates as mdates

sns.set_theme(style="darkgrid")

In [None]:
data2visu = {'times': times,
             'y': sleep_labels}

my_df = pd.DataFrame.from_dict(data2visu)
my_df['times'] = pd.to_datetime(my_df['times'])
my_df.head()

In [None]:
figure(figsize=(16, 6), dpi=80)

ax = sns.lineplot(x="times", y="y",
             data=my_df)
ax.set_yticks([0, 1])
ax.set_yticklabels(("Wake", "Sleep"))


myFmt = mdates.DateFormatter('%m/%d/%y %H:%M') # change timestamp format; portable equivalent of '%D %H:%M'
ax.xaxis.set_major_formatter(myFmt)

### Assignment: Better sleep visualisation

Coming up with a high-quality scientific figure is hard. Can you improve the figure above, for instance by showing other sleep parameters such as SOL and WASO on it to increase its data density?