# Detection of interictal periods in EEG signals using machine learning

## Train and test machine learning models to detect interictal periods in EEG signals with pycaret

In this notebook, we use the pycaret library to train and test machine learning models that detect interictal periods in EEG signals. The models are trained with pycaret on features extracted from the EEG signals. The notebook gives a high-level overview of the process and reports which model detects interictal periods best. The models are evaluated with metrics such as accuracy, precision, recall, and F1-score. The best model is selected on these metrics and kept for further analysis.

## Prepare the environment

### Install requirements

In [None]:
!pip install -r ../requirements.txt

### Global variables

In [2]:
PATH_DATASET = "./datasets"
PATH_SCRIPTS = "./scripts"
PATH_RESULTS = "./results"

### Import libraries

In [12]:
import pandas as pd
from pycaret.classification import (
    setup, compare_models, plot_model, get_config, ensemble_model,
    blend_models, stack_models, pull, predict_model,
)

## Loading final dataset

In [4]:
# The feature table is stored as a semicolon-separated CSV file
path_final_dataset = PATH_RESULTS + "/features/EEG_features_AllFeatures.csv"
df_final = pd.read_csv(path_final_dataset, sep=';')


## Training Model

We use pycaret's setup function to prepare the data for training. It preprocesses the data automatically, imputing missing values and encoding categorical variables where present, and it holds out part of the data as a test set. We also specify the target variable, Label, which indicates whether a signal is interictal or not.

In [5]:

Target_data = 'Label'
clf1 = setup(data=df_final, target=Target_data, session_id=123, verbose=True)

,Description,Value
0,Session id,123
1,Target,Label
2,Target type,Binary
3,Original data shape,"(650, 173)"
4,Transformed data shape,"(650, 173)"
5,Transformed train set shape,"(454, 173)"
6,Transformed test set shape,"(196, 173)"
7,Numeric features,172
8,Preprocess,True
9,Imputation type,simple


## Compare models

We use the compare_models function to train and evaluate multiple machine learning models on the dataset with cross-validation. It returns the best model according to the sort metric, which is accuracy by default, and displays a score grid with each model's accuracy, AUC, recall, precision, F1-score, Kappa, MCC, and training time.

In [6]:
best_model = compare_models()

,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC,TT (Sec)
et,Extra Trees Classifier,0.9734,0.9935,0.965,0.9662,0.965,0.9435,0.9443,0.279
rf,Random Forest Classifier,0.9713,0.9938,0.9595,0.9666,0.9623,0.9391,0.94,0.38
lightgbm,Light Gradient Boosting Machine,0.9625,0.99,0.9425,0.9607,0.9501,0.9201,0.9219,1.954
gbc,Gradient Boosting Classifier,0.9602,0.9889,0.9425,0.9548,0.9471,0.9153,0.9172,2.082
xgboost,Extreme Gradient Boosting,0.9582,0.9895,0.9373,0.954,0.9443,0.9109,0.9125,0.473
ada,Ada Boost Classifier,0.9339,0.9794,0.9141,0.9166,0.914,0.8604,0.8621,0.786
lr,Logistic Regression,0.9272,0.9731,0.9127,0.9002,0.9046,0.8457,0.8481,0.881
dt,Decision Tree Classifier,0.9271,0.9275,0.9301,0.891,0.9072,0.8474,0.852,0.091
ridge,Ridge Classifier,0.8919,0.912,0.9431,0.8176,0.8725,0.7804,0.7909,0.052
lda,Linear Discriminant Analysis,0.8898,0.9211,0.949,0.8065,0.8703,0.7762,0.7862,0.102
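
Which model counts as "best" depends on the metric used for ranking. The following is a minimal sketch, not part of the original notebook, that re-ranks the top three rows of the score grid above in plain Python; the helper `rank_by` is a hypothetical name introduced only for illustration. In pycaret itself, the same choice is made through the `sort` argument of compare_models.

```python
# Top three rows copied from the compare_models score grid above.
score_grid = {
    "et": {"Accuracy": 0.9734, "AUC": 0.9935, "Recall": 0.9650, "F1": 0.9650},
    "rf": {"Accuracy": 0.9713, "AUC": 0.9938, "Recall": 0.9595, "F1": 0.9623},
    "lightgbm": {"Accuracy": 0.9625, "AUC": 0.9900, "Recall": 0.9425, "F1": 0.9501},
}


def rank_by(grid, metric):
    """Return model IDs ordered from best to worst on the given metric.

    Hypothetical helper for illustration; higher values are treated as better.
    """
    return sorted(grid, key=lambda model_id: grid[model_id][metric], reverse=True)


ranking_accuracy = rank_by(score_grid, "Accuracy")
ranking_auc = rank_by(score_grid, "AUC")

print("Ranked by accuracy:", ranking_accuracy)
print("Ranked by AUC:     ", ranking_auc)
```

Extra Trees leads on accuracy, while Random Forest is marginally ahead on AUC, so the two are close enough that the ranking metric decides between them.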



## Review shapes and keys of the training data


In [None]:
X_train = get_config('X_train')
y_train = get_config('y_train')

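# Show the shapes of the training features and labels, then the feature names
print(X_train.shape, y_train.shape)
print(X_train.keys())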


## Visualize key plots for the best model

In [None]:
plot_model(best_model, plot='confusion_matrix', save=True)

plot_model(best_model, plot='auc', save=True)

plot_model(best_model, plot='learning', save=True)

plot_model(best_model, plot='feature', save=True)

plot_model(best_model, plot='class_report', save=True)

In [7]:
from pycaret.classification import evaluate_model
evaluate_model(best_model)



In [13]:

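# pull() returns the most recently displayed score grid as a DataFrame
results_df = pull()

# Score the best model on the hold-out test set created by setup
predictions = predict_model(best_model)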


,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Extra Trees Classifier,0.9694,0.9958,0.9733,0.9481,0.9605,0.9355,0.9357
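
To make the hold-out metrics above concrete, the sketch below recomputes them from a 2x2 confusion matrix. The counts are an assumption: the notebook does not print them, and they were reconstructed so that they reproduce the reported values on the 196-sample test set, taking the class scored by Recall and Prec. as the positive class. AUC is left out because it cannot be derived from counts alone.

```python
import math

# Hypothetical confusion counts, reconstructed to match the reported
# hold-out metrics; they are NOT printed anywhere in the notebook.
tp, fn, fp, tn = 73, 2, 4, 117
n = tp + fn + fp + tn  # 196 test samples, as reported by setup

accuracy = (tp + tn) / n
recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)

# Cohen's kappa: observed agreement corrected for chance agreement.
expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
kappa = (accuracy - expected) / (1 - expected)

# Matthews correlation coefficient.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

for name, value in [
    ("Accuracy", accuracy), ("Recall", recall), ("Prec.", precision),
    ("F1", f1), ("Kappa", kappa), ("MCC", mcc),
]:
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the model misclassifies 6 of the 196 hold-out samples.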


In [10]:


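# Bagging ensemble of the best model (pycaret's default ensembling method)
ensemble = ensemble_model(best_model)
# Voting classifier that combines the best model and its bagged version
blended = blend_models([best_model, ensemble])
# Stacked classifier: a meta-model is trained on the base models' predictions
stacked = stack_models([best_model, ensemble, blended])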


Fold,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.9565,0.998,0.8889,1.0,0.9412,0.9069,0.9108
1,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,0.9783,1.0,1.0,0.9474,0.973,0.9548,0.9558
3,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,1.0,1.0,1.0,1.0,1.0,1.0,1.0
5,0.9778,1.0,0.9412,1.0,0.9697,0.9522,0.9533
6,0.9333,0.9937,0.9412,0.8889,0.9143,0.8598,0.8608
7,0.9333,0.9601,0.9412,0.8889,0.9143,0.8598,0.8608
8,0.9333,0.9853,0.8824,0.9375,0.9091,0.8565,0.8575
9,0.9556,0.9968,1.0,0.8947,0.9444,0.9076,0.9115



Fold,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.9783,0.998,0.9444,1.0,0.9714,0.9539,0.9549
1,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,0.9783,1.0,1.0,0.9474,0.973,0.9548,0.9558
3,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,1.0,1.0,1.0,1.0,1.0,1.0,1.0
5,0.9778,1.0,0.9412,1.0,0.9697,0.9522,0.9533
6,0.9556,0.9937,0.9412,0.9412,0.9412,0.9055,0.9055
7,0.9333,0.958,0.9412,0.8889,0.9143,0.8598,0.8608
8,0.9333,0.9895,0.8824,0.9375,0.9091,0.8565,0.8575
9,0.9556,1.0,1.0,0.8947,0.9444,0.9076,0.9115



Fold,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.9565,0.998,0.8889,1.0,0.9412,0.9069,0.9108
1,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,0.9783,1.0,1.0,0.9474,0.973,0.9548,0.9558
3,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,1.0,1.0,1.0,1.0,1.0,1.0,1.0
5,1.0,1.0,1.0,1.0,1.0,1.0,1.0
6,0.9556,0.9937,0.9412,0.9412,0.9412,0.9055,0.9055
7,0.9333,0.958,0.9412,0.8889,0.9143,0.8598,0.8608
8,0.9333,0.9895,0.8824,0.9375,0.9091,0.8565,0.8575
9,0.9556,1.0,1.0,0.8947,0.9444,0.9076,0.9115
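
The three fold tables above do not include a summary row, so a quick way to compare them is to average the per-fold accuracy. The sketch below does this with the standard library. It assumes the tables appear in the order of the calls (ensemble_model, blend_models, stack_models), which the output itself does not label.

```python
from statistics import mean, stdev

# Per-fold accuracy copied from the three tables above.
# The table-to-function mapping is an assumption based on call order.
fold_accuracy = {
    "ensemble_model": [0.9565, 1.0, 0.9783, 1.0, 1.0,
                       0.9778, 0.9333, 0.9333, 0.9333, 0.9556],
    "blend_models": [0.9783, 1.0, 0.9783, 1.0, 1.0,
                     0.9778, 0.9556, 0.9333, 0.9333, 0.9556],
    "stack_models": [0.9565, 1.0, 0.9783, 1.0, 1.0,
                     1.0, 0.9556, 0.9333, 0.9333, 0.9556],
}

# Mean and sample standard deviation across the 10 folds.
summary = {
    name: (mean(values), stdev(values))
    for name, values in fold_accuracy.items()
}

for name, (avg, spread) in summary.items():
    print(f"{name}: mean accuracy = {avg:.4f}, std = {spread:.4f}")
```

Reading the means next to the fold-to-fold spread helps judge whether the gap between the three variants is meaningful or within the noise of the cross-validation.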



## Save the best model

In [11]:

import joblib

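# Serialize the fitted estimator to the working directory.
# Note: this stores best_model only, not the preprocessing pipeline built by setup.
joblib.dump(best_model, "Extra-Trees-Classifier.joblib")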

['Extra-Trees-Classifier.joblib']