
# Sleep Spindle Study

## Building Model

In this notebook, we build a model to detect the presence of sleep spindles in the entire EEG recording. 
        


## Imports

We import the libraries needed to process the data, build the model, and evaluate its performance.
        

In [1]:
# Standard library
import json

# Third-party libraries
import keras
import matplotlib.pyplot as plt
import mne
import numpy as np
import pandas as pd
import tensorflow as tf
from keras.callbacks import EarlyStopping
from memory_profiler import profile
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
from sklearn.model_selection import KFold, train_test_split
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import Callback
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Dropout, LSTM, Dense, BatchNormalization, Flatten
from tensorflow.keras.metrics import Metric
from tensorflow.keras.models import Sequential

# Project-local modules
import data_preparation
import feature_extraction
import preprocess
import utils

2023-12-30 15:14:37.047760: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-12-30 15:14:37.092911: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-30 15:14:37.092959: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-30 15:14:37.093863: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-12-30 15:14:37.100125: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-12-30 15:14:37.100745: I tensorflow/core/platform/cpu_feature_guard.cc:1

### Download data

We use the `processed_data` function from the previous step to load the concatenated raw recording together with its corresponding preprocessing and features.

In [2]:
X, labels = data_preparation.processed_data(["../dataset/train_S002_night1_hackathon_raw.mat",
                                            "../dataset/train_S003_night5_hackathon_raw.mat"
                                            ],
                                            ["../dataset/train_S002_labeled.csv",
                                            "../dataset/train_S003_labeled.csv"
                                            ],
                                            labels=["K0", "K1"],
                                            fmin=8,
                                            fmax=16,
                                            include_entire_recording=True)


Creating RawArray with float64 data, n_channels=1, n_times=4965399
    Range : 0 ... 4965398 =      0.000 ... 19861.592 secs
Ready.
Filtering raw data in 1 contiguous segment
Setting up band-pass filter from 8 - 16 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 8.00
- Lower transition bandwidth: 2.00 Hz (-6 dB cutoff frequency: 7.00 Hz)
- Upper passband edge: 16.00 Hz
- Upper transition bandwidth: 4.00 Hz (-6 dB cutoff frequency: 18.00 Hz)
- Filter length: 413 samples (1.652 s)

Used Annotations descriptions: ['0_0', '0_1', '1_0']
Not setting metadata
1191 matching events found
Applying baseline correction (mode: mean)
0 projection items activated
Using data from preloaded Raw for 1191 events and 626 original time points ...
0 bad epochs dropped
epcohs.get_data().shape: (1191, 1, 6


#### Model

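The model built below is a 1D convolutional network: three `Conv1D` layers with batch normalization, followed by dense layers with dropout. The LSTM layers imported above are well suited to time-dependent samples, but they are not used in this version of the model. K-fold cross-validation is implemented with 3 folds: in each round, 2 folds are used for training and the remaining fold for validation.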
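
To make the partitioning concrete, here is a minimal sketch of how k-fold cross-validation divides a dataset into training and validation parts. The helper `kfold_sizes` is hypothetical and written only for illustration; it assumes the usual convention that any remainder is spread over the first folds. With the 2241 epochs and the `n_splits=3` used in the training cell, it gives the sizes reported in that cell's output (1494 training and 747 validation samples).

```python
def kfold_sizes(n_samples, n_splits):
    """Return (train_size, test_size) for each fold of a k-fold split.

    Hypothetical helper for illustration: the first `n_samples % n_splits`
    folds receive one extra test sample, the rest receive the base size.
    """
    base, extra = divmod(n_samples, n_splits)
    test_sizes = [base + 1 if i < extra else base for i in range(n_splits)]
    # Every sample that is not in the test fold is used for training.
    return [(n_samples - t, t) for t in test_sizes]


for fold_no, (n_train, n_test) in enumerate(kfold_sizes(2241, 3)):
    print(f"fold {fold_no}: train={n_train}, test={n_test}")
```

Each sample ends up in the validation fold exactly once, so the per-fold scores together cover the whole dataset.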
        

In [3]:
class F1Score(Metric):
    """Streaming F1 score built from the Keras Precision and Recall metrics."""

    def __init__(self, name='f1_score', **kwargs):
        super().__init__(name=name, **kwargs)
        self.precision = tf.keras.metrics.Precision()
        self.recall = tf.keras.metrics.Recall()

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Precision and Recall accumulate their own counts across batches.
        self.precision.update_state(y_true, y_pred, sample_weight)
        self.recall.update_state(y_true, y_pred, sample_weight)

    def result(self):
        # Harmonic mean of precision and recall; epsilon avoids division by zero.
        p = self.precision.result()
        r = self.recall.result()
        return 2 * (p * r) / (p + r + tf.keras.backend.epsilon())

    def reset_state(self):
        # `reset_state` replaces the deprecated `reset_states` name.
        self.precision.reset_state()
        self.recall.reset_state()

print("X.shape:", X.shape)
print("labels.shape:", labels.shape)
print("shape before reshaping:", X.shape)
X = X.transpose((0, 2, 1))
print("shape after reshaping:", X.shape)

X.shape: (2241, 1, 626)
labels.shape: (2241, 2)
shape before reshaping: (2241, 1, 626)
shape after reshaping: (2241, 626, 1)
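
As a quick sanity check of the `F1Score` metric defined above, the sketch below recomputes the F1 score in plain Python from a precision and recall pair. The input values are taken from the last progress line of the training output in this notebook (precision 0.0625, recall 0.4286, f1_score 0.1091). The `epsilon` default of 1e-7 is an assumption that mirrors the default Keras backend epsilon.

```python
def f1_from_precision_recall(precision, recall, epsilon=1e-7):
    """Harmonic mean of precision and recall, using the same formula as F1Score.

    `epsilon` is assumed to match the default Keras backend epsilon (1e-7);
    it keeps the denominator non-zero when both inputs are 0.
    """
    return 2 * (precision * recall) / (precision + recall + epsilon)


# Values copied from the logged training output (epoch 12, step 3/47).
f1 = f1_from_precision_recall(0.0625, 0.4286)
print(round(f1, 4))
```

The rounded result agrees with the `f1_score` value Keras logged on that line, which suggests the custom metric behaves as intended on accumulated precision and recall.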


In [None]:
def create_model():
    input_layer = keras.Input(shape=(X.shape[1], X.shape[2]))

    x = Conv1D(
        filters=32, kernel_size=3, strides=1, activation="relu", padding="same"
    )(input_layer)
    x = BatchNormalization()(x)

    x = Conv1D(
        filters=64, kernel_size=3, strides=1, activation="relu", padding="same"
    )(x)
    x = BatchNormalization()(x)

    x = Conv1D(
        filters=128, kernel_size=5, strides=1, activation="relu", padding="same"
    )(x)
    x = BatchNormalization()(x)

    # Flatten the (timesteps, filters) feature map into a vector for the dense layers
    x = Flatten()(x)

    x = Dense(
        2048, activation="relu"
    )(x)
    x = Dropout(0.2)(x)

    x = Dense(
        1024, activation="relu"
    )(x)
    x = Dropout(0.2)(x)
    x = Dense(
        128, activation="relu"
    )(x)
    output_layer = Dense(labels.shape[1], activation="sigmoid")(x)

    return keras.Model(inputs=input_layer, outputs=output_layer)
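
To get a feel for the size of this architecture, the following sketch counts the weights of the flattened feature vector and the first dense layer by hand. It assumes the input shape `(626, 1)` printed earlier and the standard dense-layer parameter count (one weight per input-output pair plus one bias per unit). The helper name is hypothetical and not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Parameter count of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units


# With padding="same" and strides=1, every Conv1D layer keeps all 626 timesteps,
# so Flatten sees 626 timesteps times the 128 filters of the last Conv1D layer.
timesteps = 626
last_conv_filters = 128
flattened_features = timesteps * last_conv_filters

first_dense = dense_params(flattened_features, 2048)
second_dense = dense_params(2048, 1024)

print("flattened features:", flattened_features)
print("Dense(2048) parameters:", first_dense)
print("Dense(1024) parameters:", second_dense)
```

Under these assumptions, the first dense layer alone holds roughly 164 million parameters, far more than the rest of the network combined. Given that only about 1500 epochs are available for training per fold, this is worth keeping in mind when interpreting the validation scores.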


In [None]:

kfold = KFold(n_splits=3, shuffle=True)
for fold_no, (train, test) in enumerate(kfold.split(X, labels)):
    print("train indices:", train.shape)
    print("test indices:", test.shape)
    # Define the model architecture
    model = create_model()
    
    # Compile the model
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss="binary_crossentropy",
        metrics=[
            'accuracy',
            tf.keras.metrics.Precision(),
            tf.keras.metrics.Recall(),
            F1Score(),
        ]
    )
    
    # Train the model
    history = model.fit(
        X[train],
        labels[train],
        epochs=30,
        validation_data=(X[test], labels[test]),
    )


    training_f1_scores = history.history['f1_score']
    validation_f1_scores = history.history['val_f1_score']

    plt.plot(training_f1_scores, label='Training F1 Score')
    plt.plot(validation_f1_scores, label='Validation F1 Score')
    plt.xlabel('Epochs')
    plt.ylabel('F1 Score')
    plt.legend()
    plt.show()

train indices: (1494,)
test indices: (747,)
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
 3/47 [>.............................] - ETA: 1:17 - loss: 0.6905 - accuracy: 0.3542 - precision_6: 0.0625 - recall_6: 0.4286 - f1_score: 0.1091