### Order of execution of the code

**Step 1: Aligning the .edf signals with the sleep stage annotations and converting the .edf files to .mat files.**<br/>
Input: the path to the .edf signal files and the path to the .xml sleep annotation files.<br/>
Output: .mat files in which the signals are aligned with the sleep stage annotations.<br/>

In [1]:
python data_extraction.py

**Step 2: Creating two separate pkl files. The first maps each .mat filename to its length in 30-second epochs. The second maps each filename to the cumulative length up to and including that file (e.g. length of the 1st file, length of the 1st+2nd files, length of the 1st+2nd+3rd files, etc.).**

In [5]:
import utils
import random
from random import shuffle
import os

path_to_mat_folder='D:/DEEPSLEEP/test_for_github/datasets/shhs/mat files/'
path_to_file_length_train='D:/DEEPSLEEP/test_for_github/datasets/shhs/train/trainFilesNum30secEpochs_all_shhs1.pkl'
path_to_file_length_train_cumul='D:/DEEPSLEEP/test_for_github/datasets/shhs/train/trainFilesNum30secEpochsCumulative_all_shhs1.pkl'
path_to_file_length_val='D:/DEEPSLEEP/test_for_github/datasets/shhs/val/valFilesNum30secEpochs_all_shhs1.pkl'

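random.seed(30)  # fix the seed used by shuffle

# Collect every file in the .mat folder and shuffle the order.
mat_files=os.listdir(path_to_mat_folder)
shuffle(mat_files)

# End indices into the shuffled list: files [0:5] form the training set, files [5:10] the validation set.
training_set=5
validation_set=10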

utils.calculate_num_samples(mat_files[:training_set],path_to_file_length_train)
utils.calculate_num_samples_cumulative(mat_files[:training_set],path_to_file_length_train_cumul)
utils.calculate_num_samples(mat_files[training_set:validation_set],path_to_file_length_val)

0 eeg_annotation_shhs1-200004
1 eeg_annotation_shhs1-200007
2 eeg_annotation_shhs1-200008
3 eeg_annotation_shhs1-200003
4 eeg_annotation_shhs1-200002
0 eeg_annotation_shhs1-200004
1 eeg_annotation_shhs1-200007
2 eeg_annotation_shhs1-200008
3 eeg_annotation_shhs1-200003
4 eeg_annotation_shhs1-200002
0 eeg_annotation_shhs1-200006
1 eeg_annotation_shhs1-200010
2 eeg_annotation_shhs1-200001
3 eeg_annotation_shhs1-200005
4 eeg_annotation_shhs1-200009
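
To make the two kinds of lengths concrete, the sketch below uses the epoch counts that Step 3 reports for the five training files (875, 919, 959, 1049, 1079). The helper name `locate_epoch`, and the idea of using the cumulative lengths to map a global epoch index back to a file, are assumptions for illustration; they are not taken from `utils.py`.

```python
from bisect import bisect_right
from itertools import accumulate

# Epoch counts per training file, as printed in Step 3 of this README.
lengths = {
    "eeg_annotation_shhs1-200004": 875,
    "eeg_annotation_shhs1-200007": 919,
    "eeg_annotation_shhs1-200008": 959,
    "eeg_annotation_shhs1-200003": 1049,
    "eeg_annotation_shhs1-200002": 1079,
}

names = list(lengths)
# Running total: length of file 1, files 1+2, files 1+2+3, ...
cumulative = list(accumulate(lengths.values()))


def locate_epoch(global_index):
    """Hypothetical helper: map a global epoch index to (file name, local index)."""
    if not 0 <= global_index < cumulative[-1]:
        raise IndexError(global_index)
    # Position of the first cumulative total that is strictly greater than the index.
    file_pos = bisect_right(cumulative, global_index)
    start = cumulative[file_pos - 1] if file_pos > 0 else 0
    return names[file_pos], global_index - start


print(cumulative)
print(locate_epoch(875))
```

The last cumulative value (4881) equals the first dimension of the combined training dataset reported in Step 3, which is a quick consistency check on the two pkl files.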


**Step 3: Creating 3 separate hdf5 files from all the .mat files in the training set, in the validation set and in the test set (the test-set file is created in Step 5)**

In [6]:
path_to_hdf5_file_train='D:/DEEPSLEEP/test_for_github/datasets/shhs/train/hdf5_file_train_all_chunking_shhs1.hdf5'
utils.hdf5_creation1(mat_files[:training_set],path_to_hdf5_file_train)
path_to_hdf5_file_val='D:/DEEPSLEEP/test_for_github/datasets/shhs/val/hdf5_file_val_all_chunking_shhs1.hdf5'
utils.hdf5_creation1(mat_files[training_set:validation_set],path_to_hdf5_file_val)

0 eeg_annotation_shhs1-200004
(875, 1, 5, 3750)
(875, 1)
1 eeg_annotation_shhs1-200007
(919, 1, 5, 3750)
(919, 1)
2 eeg_annotation_shhs1-200008
(959, 1, 5, 3750)
(959, 1)
3 eeg_annotation_shhs1-200003
(1049, 1, 5, 3750)
(1049, 1)
4 eeg_annotation_shhs1-200002
(1079, 1, 5, 3750)
(1079, 1)
dset data shape outside: (4881, 1, 5, 3750)
dset label shape outside: (4881, 1)
0 eeg_annotation_shhs1-200006
(1084, 1, 5, 3750)
(1084, 1)
1 eeg_annotation_shhs1-200010
(1084, 1, 5, 3750)
(1084, 1)
2 eeg_annotation_shhs1-200001
(1084, 1, 5, 3750)
(1084, 1)
3 eeg_annotation_shhs1-200005
(1084, 1, 5, 3750)
(1084, 1)
4 eeg_annotation_shhs1-200009
(1086, 1, 5, 3750)
(1086, 1)
dset data shape outside: (5422, 1, 5, 3750)
dset label shape outside: (5422, 1)
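
The shapes printed above follow from the per-file epoch counts: each file contributes an array of shape `(epochs, 1, n_channels, time_steps)`, and the combined dataset stacks these along the first axis. The sketch below checks that arithmetic for the training files and estimates the uncompressed size. The 4-byte (float32) sample size is an assumption, since the dtype used by `utils.hdf5_creation1` is not stated here, and `combined_shape` is an illustrative name.

```python
from math import prod

N_CHANNELS = 5        # EEG, EOG, and EMG channels combined
TIME_STEPS = 3750     # 30 s x 125 samples per second
BYTES_PER_SAMPLE = 4  # assumption: float32 storage

# Epoch counts of the five training files, taken from the output above.
train_epochs = [875, 919, 959, 1049, 1079]


def combined_shape(epochs_per_file):
    """Shape of the dataset obtained by stacking all files along axis 0."""
    return (sum(epochs_per_file), 1, N_CHANNELS, TIME_STEPS)


shape = combined_shape(train_epochs)
size_gib = prod(shape) * BYTES_PER_SAMPLE / 2**30
print(shape)
print(f"{size_gib:.2f} GiB")
```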


**Step 4: Training and validation of the STQS model**

In [None]:
python train_val.py
or, with the arguments set explicitly:
python train_val.py --batch_size=192 --n_workers=16 --learning_rate=0.0001 --max_epochs=200 --time_steps=3750 --n_channels=5 --modality_pipelines=3 --seq_length=8 --class_imbalance='oversampling' --lstm_option=False --rc_option=False --patience_epoch=7
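
Flags such as `--lstm_option=False` arrive as strings, and a plain `type=bool` in `argparse` would treat the string `'False'` as true. The following is a hypothetical sketch of how a few of the flags from the example command could be declared so that boolean values parse as expected. `build_parser` and `str2bool` are illustrative names, the default for `--class_imbalance` is an assumption, and the actual parser in `train_val.py` may differ.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to bool (illustrative helper)."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical subset of the train_val.py flags."""
    parser = argparse.ArgumentParser(description="Sketch of train_val.py flags")
    parser.add_argument("--batch_size", type=int, default=192)
    parser.add_argument("--learning_rate", type=float, default=0.0001)
    parser.add_argument("--seq_length", type=int, default=8)
    # Assumption: the README does not state a default for class_imbalance.
    parser.add_argument(
        "--class_imbalance",
        default="None",
        choices=["None", "oversampling", "weightedcostfunc1", "weightedcostfunc2"],
    )
    parser.add_argument("--lstm_option", type=str2bool, default=False)
    parser.add_argument("--rc_option", type=str2bool, default=False)
    parser.add_argument("--patience_epoch", type=int, default=7)
    return parser


args = build_parser().parse_args(["--lstm_option=False", "--class_imbalance=oversampling"])
print(args.lstm_option, args.class_imbalance)
```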

Possible arguments:
1. batch_size = size of the mini-batch used for training the model; default=192
2. n_workers = number of CPU cores used by the PyTorch data loader; default=16
3. time_steps = number of samples per 30-second epoch = 30 secs x sampling rate = 30x125 = 3750 (for the SHHS dataset)
4. max_epochs = maximum number of epochs for which the model is trained if early stopping does not end training sooner; default=200
5. n_channels = total number of channels across all modalities (EEG, EOG, EMG); default=5 for our model trained on the SHHS dataset
6. seq_length = sequence length for the LSTM; default=8
7. lstm_option = if False, the architecture contains only the CNN part (ST part); if True, the architecture contains the CNN+LSTM part, but not the residual connection; default=False
8. rc_option = if False, the model does not contain the residual connection block; if True, the model contains it; default=False
9. class_imbalance = one of ['None','oversampling','weightedcostfunc1','weightedcostfunc2']. 'None' disables class imbalance handling; each of the other values selects a method of handling it.
10. lr = learning rate (written as --learning_rate in the example command above); default=0.0001
11. modality_pipeline = number of modality pipelines in the model (written as --modality_pipelines in the example command above); default=3 (EEG, EOG, EMG pipelines)
12. patience_epoch = patience for early stopping, i.e. training stops if the validation loss does not decrease for this many consecutive epochs; default=7
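
The early-stopping rule described for `patience_epoch` can be sketched as follows. This is a minimal illustration, assuming the rule is "stop once the validation loss has failed to improve on its best value for `patience` consecutive epochs". The class name `EarlyStopping` and its interface are hypothetical and not taken from `train_val.py`.

```python
class EarlyStopping:
    """Track validation loss and signal when training should stop."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            # New best value: remember it and reset the counter.
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses, used only to exercise the rule with patience=3.
stopper = EarlyStopping(patience=3)
losses = [0.90, 0.80, 0.85, 0.82, 0.81, 0.70]
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at)
```

In this toy run, the loss never improves on 0.80 during epochs 3 to 5, so training stops at epoch 5 and the final value 0.70 is never reached. A larger patience trades longer training for a lower risk of stopping too early.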

**Step 5: Testing of STQS model**

In [8]:
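# Re-list the .mat files (in the order returned by os.listdir, not the shuffled order from Step 2).
mat_files=os.listdir(path_to_mat_folder)

# Use two files as a small test set.
sample_test_files=mat_files[10:12]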

path_to_file_length_test='D:/DEEPSLEEP/test_for_github/datasets/shhs/test/testFilesNum30secEpochs_all_shhs1.pkl'
utils.calculate_num_samples(sample_test_files,path_to_file_length_test)

path_to_hdf5_file_test='D:/DEEPSLEEP/test_for_github/datasets/shhs/test/hdf5_file_test_all_chunking_shhs1.hdf5'
utils.hdf5_creation1(sample_test_files,path_to_hdf5_file_test)

0 eeg_annotation_shhs1-200011
1 eeg_annotation_shhs1-200012
0 eeg_annotation_shhs1-200011
(989, 1, 5, 3750)
(989, 1)
1 eeg_annotation_shhs1-200012
(964, 1, 5, 3750)
(964, 1)
dset data shape outside: (1953, 1, 5, 3750)
dset label shape outside: (1953, 1)


In [None]:
python test.py