This notebook evaluates the CNN model defined in bnci_utils.py (cnn_model()) on the entire dataset. It was also used to
evaluate the model from the P300 dataset, simply by swapping in the commented-out line in the cross-validation loop.

Note that the results presented in the thesis may not be exactly reproducible: the network was trained on a GPU, where
the computation does not honor the specified seed. Only the splitting of the dataset is consistent between runs; the
network weights may be initialized differently each time.

To run this notebook, bnci_utils.py is required.

In [19]:
import os
import numpy as np
import tensorflow as tf
import keras
import nengo_dl
from tensorflow.python.keras import Input, Model
import nengo
from tensorflow.python.keras.callbacks import EarlyStopping
from tensorflow.python.keras.layers import Conv2D, Dropout, AveragePooling2D, Flatten, Dense, BatchNormalization, \
    Conv3D, MaxPooling2D
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split, KFold
from keras import backend as K
import pandas as pd
from sklearn import metrics

import bnci_utils as utils

In [20]:

# Dataset path (saved by default as dataset_result/bci_dataset.npz; this notebook loads entire_dataset.npz)
dataset_path = os.path.join('dataset_result', 'entire_dataset.npz')

# Get features and labels
features, labels = utils.load_dataset(dataset_path)

f'Features shape: {features.shape}, labels shape: {labels.shape}'

'Features shape: (2976, 14, 36, 10), labels shape: (2976,)'

In [21]:
# Check if the dataset is balanced
yes = labels[labels == 'yes']
no = labels[labels == 'no']

f'yes: {yes.shape} ({(yes.shape[0]/labels.shape[0]) * 100}%), no: {no.shape} ({(no.shape[0]/labels.shape[0]) * 100}%)'

'yes: (1521,) (51.108870967741936%), no: (1455,) (48.891129032258064%)'
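
The balance check above can also be written for any number of classes. The following is a minimal sketch, assuming the labels are a 1-D array of strings; `class_balance` and `labels_demo` are hypothetical names, and the demo array is synthetic, built only to match the class counts reported above.

```python
import numpy as np

def class_balance(labels):
    """Return {label: (count, percentage)} for a 1-D label array."""
    values, counts = np.unique(labels, return_counts=True)
    total = counts.sum()
    return {str(v): (int(c), float(100.0 * c / total)) for v, c in zip(values, counts)}

# Synthetic labels with the same class counts as the real dataset (assumption:
# only the counts matter here, not the order of the samples).
labels_demo = np.array(['yes'] * 1521 + ['no'] * 1455)

for name, (count, pct) in class_balance(labels_demo).items():
    print(f'{name}: {count} ({pct:.2f}%)')
```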

In [22]:
# Reshape the dataset
features, labels = utils.reshape_dataset(features, labels)
f'Features shape: {features.shape}, labels shape: {labels.shape}'

'Features shape: (2976, 1, 5040), labels shape: (2976, 1, 2)'
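
The shapes above suggest that each sample is flattened (14 * 36 * 10 = 5040) and each label is one-hot encoded, with an extra axis of length 1 that is presumably the time-step axis NengoDL expects. The sketch below illustrates such a transformation; it is not the actual `utils.reshape_dataset`, and the function name and the one-hot column order ([no, yes]) are assumptions.

```python
import numpy as np

def reshape_dataset_sketch(features, labels, positive='yes'):
    """Flatten each sample and one-hot encode string labels.

    Assumed one-hot column order: [negative, positive]. The real helper
    in bnci_utils.py may use a different order.
    """
    n = features.shape[0]
    x = features.reshape(n, 1, -1)               # (n, 1, 14 * 36 * 10)
    is_pos = (labels == positive).astype(int)    # 1 for the positive class, else 0
    y = np.eye(2)[is_pos].reshape(n, 1, 2)       # one-hot rows, shape (n, 1, 2)
    return x, y

# Small synthetic stand-in for the real dataset.
features_demo = np.zeros((8, 14, 36, 10))
labels_demo = np.array(['yes', 'no'] * 4)

x, y = reshape_dataset_sketch(features_demo, labels_demo)
print(x.shape, y.shape)
```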

In [23]:
# Set seed for consistency
seed = 1
np.random.seed(seed)
tf.random.set_seed(seed)
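
Seeding NumPy and TensorFlow as above does not by itself make GPU training deterministic. The sketch below is one way to seed everything in a single place and to request deterministic ops where the installed TensorFlow offers that switch. `seed_everything` is a hypothetical helper, and whether `enable_op_determinism` is available (or compatible with the TensorFlow version NengoDL requires) has not been checked here.

```python
import random
import numpy as np

def seed_everything(seed):
    """Seed Python, NumPy and (if installed) TensorFlow random generators.

    Returns True if TensorFlow was found and seeded, False otherwise.
    """
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
    except ImportError:
        return False
    tf.random.set_seed(seed)
    # Only newer TensorFlow 2.x releases provide this switch; skip it otherwise.
    enable = getattr(tf.config.experimental, 'enable_op_determinism', None)
    if enable is not None:
        enable()
    return True

seed_everything(1)
first = np.random.rand(3)
seed_everything(1)
second = np.random.rand(3)
print(np.array_equal(first, second))
```

Deterministic ops can slow training down, and some GPU kernels may raise an error instead of running non-deterministically, so this is a trade-off rather than a free fix.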

In [24]:
x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=0.25, random_state=seed, shuffle=True)
print(y_train)
print(y_test)

f'x_train shape: {x_train.shape}, y_train shape: {y_train.shape}, ' \
f'x_test shape: {x_test.shape}, y_test shape: {y_test.shape}'

[[[0. 1.]]

 [[1. 0.]]

 [[0. 1.]]

 ...

 [[0. 1.]]

 [[1. 0.]]

 [[0. 1.]]]
[[[0. 1.]]

 [[0. 1.]]

 [[1. 0.]]

 ...

 [[0. 1.]]

 [[1. 0.]]

 [[0. 1.]]]


'x_train shape: (2232, 1, 5040), y_train shape: (2232, 1, 2), x_test shape: (744, 1, 5040), y_test shape: (744, 1, 2)'

In [25]:
params_output_path = 'cnn_all_samples_nengo_params'
os.makedirs(params_output_path, exist_ok=True)

utils.cnn_model(seed=seed).summary() # Print the model

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_layer (InputLayer)     [(None, 14, 360, 1)]      0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 360, 32)       832       
_________________________________________________________________
dropout_3 (Dropout)          (None, 14, 360, 32)       0         
_________________________________________________________________
average_pooling2d_2 (Average (None, 7, 180, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 5, 178, 64)        18496     
_________________________________________________________________
dropout_4 (Dropout)          (None, 5, 178, 64)        0         
_________________________________________________________________
average_pooling2d_3 (Average (None, 2, 89, 64)         0   

In [26]:
ann, snn = [], []  # per-fold results for the analog (ANN) and spiking (SNN) networks

num_iterations = 10  # number of folds in the cross-validation
iteration = 1  # index of the current fold (1-based)

for train, valid in KFold(n_splits=num_iterations).split(x_train): # perform K-Fold CV
    print('Current iteration: ', iteration)
    x_train_curr, y_train_curr = x_train[train], y_train[train] # get current training data
    x_val_curr, y_val_curr = x_train[valid], y_train[valid] # get current validation data

    params_path = os.path.join(params_output_path, f'params_{iteration}') # configure path for parameters

    model = utils.cnn_model(seed=seed) # create the model

    # Uncomment this line to perform the simulation on the p300 model instead
    # model = utils.original_p300_model(seed)

    # run ann
    ann_result = utils.run_ann(model=model,
                               train=(x_train_curr, y_train_curr),
                               valid=(x_val_curr, y_val_curr),
                               test=(x_test, y_test),
                               optimizer=keras.optimizers.Adam(),
                               loss=keras.losses.BinaryCrossentropy(),
                               params_save_path=params_path,
                               iteration=iteration,
                               callbacks=[EarlyStopping(patience=8, restore_best_weights=True, verbose=1)],
                               num_epochs=30
                         )

    # run snn
    snn_result = utils.run_snn(model,
                               x_test, y_test,
                               params_load_path=params_path,
                               iteration=iteration
                               )

    ann.append(ann_result)
    snn.append(snn_result)
    iteration += 1

    K.clear_session()  # clear the Keras session and delete the model to avoid leaking memory between folds
    del model

Current iteration:  1
Build finished in 0:00:01                                                      
Optimization finished in 0:00:00                                               
Construction finished in 0:00:00                                               
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Restoring model weights from the end of the best epoch.
Epoch 00024: early stopping
1. CNN: accuracy = 52.85326086956522%, precision = 0.5316239316239316, recall = 0.8098958333333334, f1 = 0.6418988648090815
Confusion matrix:
[[ 78 274]
 [ 73 311]]
Build finished in 0:00:01                                                      
Optimization finished in 0:00:00                                               
Construction finished in 0:00:00                  

  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
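
For reference, the fold layout produced by an unshuffled `KFold` can be reproduced in plain NumPy. The sketch below is a re-implementation for illustration only (`kfold_indices` is a hypothetical name, not part of scikit-learn or bnci_utils.py): folds are contiguous index ranges, and the first `n_samples % n_splits` folds get one extra sample.

```python
import numpy as np

def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, valid_idx) for contiguous, unshuffled folds."""
    indices = np.arange(n_samples)
    fold_sizes = np.full(n_splits, n_samples // n_splits)
    fold_sizes[: n_samples % n_splits] += 1   # spread the remainder over the first folds
    start = 0
    for size in fold_sizes:
        stop = start + size
        valid = indices[start:stop]
        train = np.concatenate([indices[:start], indices[stop:]])
        yield train, valid
        start = stop

# 2232 training samples split into 10 folds, as in the loop above.
sizes = [(len(tr), len(va)) for tr, va in kfold_indices(2232, 10)]
print(sizes)
```

Because `train_test_split` has already shuffled the samples, contiguous folds do not follow the original recording order here.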


In [27]:
# Create data dictionary for pandas dataframe
df = utils.create_data_df(ann, snn, num_iterations)

df

   iterations  ann_accuracy  ann_precision  ann_recall    ann_f1  snn_accuracy  snn_precision  snn_recall    snn_f1
0           1      0.528533       0.531624    0.809896  0.641899      0.495924       0.550388    0.184896  0.276803
1           2      0.525815       0.533849    0.718750  0.612653      0.490489       0.517928    0.338542  0.409449
2           3      0.474185       0.200000    0.002604  0.005141      0.525815       0.528363    0.848958  0.651349
3           4      0.513587       0.522569    0.783854  0.627083      0.485054       0.506562    0.502604  0.504575
4           5      0.521739       0.521739    1.000000  0.685714      0.519022       0.521866    0.932292  0.669159
5           6      0.478261       0.000000    0.000000  0.000000      0.512228       0.521515    0.789062  0.627979
6           7      0.514946       0.521951    0.835938  0.642643      0.512228       0.520325    0.833333  0.640641
7           8      0.506793       0.515195    0.927083  0.662326      0.494565       0.509434    0.843750  0.635294
8           9      0.521739       0.521739    1.000000  0.685714      0.470109       0.493119    0.559896  0.524390
9          10      0.478261       0.000000    0.000000  0.000000      0.516304       0.533981    0.572917  0.552764


In [28]:
data_output_folder = 'entire_dataset_output' # output path for data from each iteration
os.makedirs(data_output_folder, exist_ok=True)

# Save the dataframe to excel
df.to_excel(os.path.join(data_output_folder, 'cnn_10_fold_entire_dataset.xlsx'))

# Save path for the P300 model
# df.to_excel(os.path.join(data_output_folder, 'cnn_p300_model_10_fold_entire_dataset.xlsx'))

'Statistics for iterations successfully saved.'

'Statistics for iterations successfully saved.'

In [29]:
# Create statistics such as maximums and averages for each metric
df_stats = utils.create_stats_df(df)

df_stats

   models  average_accuracy  max_accuracy  accuracy_std  average_precision  max_precision  average_recall  max_recall  average_f1    max_f1
0     ann          0.506386      0.528533      0.021301           0.386867       0.533849        0.607812    1.000000    0.456317  0.685714
1     snn          0.502174      0.525815      0.017636           0.520348       0.550388        0.640625    0.932292    0.549240  0.669159
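
The summary statistics can be cross-checked from the per-fold values. The sketch below recomputes the ANN accuracy statistics with NumPy; `summarize` is a hypothetical helper, not `utils.create_stats_df`. The reported `accuracy_std` appears to be the sample standard deviation (`ddof=1`, the pandas default), which is an inference from the numbers rather than something stated in the notebook.

```python
import numpy as np

# Per-fold ANN accuracies, copied from the per-iteration table.
ann_accuracy = np.array([0.528533, 0.525815, 0.474185, 0.513587, 0.521739,
                         0.478261, 0.514946, 0.506793, 0.521739, 0.478261])

def summarize(values):
    """Mean, maximum and sample standard deviation of a 1-D sequence."""
    values = np.asarray(values, dtype=float)
    return {
        'average': float(values.mean()),
        'max': float(values.max()),
        'std': float(values.std(ddof=1)),   # sample std, as pandas computes by default
    }

stats = summarize(ann_accuracy)
print({name: round(value, 6) for name, value in stats.items()})
```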


In [30]:
# Create dataframe for statistics and save it as excel file
df_stats.to_excel(os.path.join(data_output_folder, 'cnn_10_fold_entire_dataset_stats.xlsx'))

# Save path for the P300 model
# df_stats.to_excel(os.path.join(data_output_folder, 'cnn_p300_model_10_fold_entire_dataset_stats.xlsx'))

'File with statistics successfully saved.'

'File with statistics successfully saved.'

In [31]:
# Print confusion matrices for ANN and SNN in each iteration
utils.print_confusion_matrices(ann, snn)


Confusion matrices for the ANN:
[[ 78 274]
 [ 73 311]] 

[[111 241]
 [108 276]] 

[[348   4]
 [383   1]] 

[[ 77 275]
 [ 83 301]] 

[[  0 352]
 [  0 384]] 

[[352   0]
 [384   0]] 

[[ 58 294]
 [ 63 321]] 

[[ 17 335]
 [ 28 356]] 

[[  0 352]
 [  0 384]] 

[[352   0]
 [384   0]] 

Confusion matrices for the SNN
[[294  58]
 [313  71]] 

[[231 121]
 [254 130]] 

[[ 61 291]
 [ 58 326]] 

[[164 188]
 [191 193]] 

[[ 24 328]
 [ 26 358]] 

[[ 74 278]
 [ 81 303]] 

[[ 57 295]
 [ 64 320]] 

[[ 40 312]
 [ 60 324]] 

[[131 221]
 [169 215]] 

[[160 192]
 [164 220]] 
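
The per-fold metrics can be recovered directly from these matrices. The sketch below assumes the scikit-learn layout `[[TN, FP], [FN, TP]]` (rows are true classes, columns are predicted classes), which is consistent with the values printed for the first fold. `metrics_from_confusion` is a hypothetical helper; returning 0.0 for an undefined ratio is a choice made here to match the zeros in the table.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Accuracy, precision, recall and F1 from a 2x2 matrix [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = np.asarray(cm, dtype=float)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Ratios with an empty denominator are reported as 0.0 instead of raising.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(accuracy), float(precision), float(recall), float(f1)

# First ANN fold, and a fold where the ANN predicted a single class only.
print(metrics_from_confusion([[78, 274], [73, 311]]))
print(metrics_from_confusion([[352, 0], [384, 0]]))
```

Folds in which a network predicts only one class have no predicted positives, so precision is undefined; this is presumably what triggers the `_warn_prf` warnings in the training output.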

