# __BioPyC__

## An open-source Python platform for offline EEG and biosignal decoding and analysis

This application guides users through the following steps:
- __Reading__ EEG and physiological signals (.gdf, .mat, .fiff)
- __Filtering__ the signals (CSP, FBCSP, etc.)
- __Classifying__ the signals (LDA, NN, Riemannian geometry methods)
- __Evaluating__ the algorithms by computing classification performance on a given data set (ROC, CV, etc.)
- __Visualizing__ classification performance
- __Statistical testing__ of classification performance

BioPyC offers a single interface to quickly create ML models and compare classifiers based on EEG signals, physiological signals, or both, without any coding.

## <font color = orange> 1 - Load the application</font>
Loading BioPyC application...

In [1]:
%reload_ext autoreload
%autoreload 2

from application import application
from src.graphic_user_interface.jupyter_widgets.study_parameters import study_parameters

# Object storing the study parameters, updated at each step below
study_parameters_object = study_parameters()

new_application = application()

# Initializing widgets that will be displayed at each step of BioPyC

# Signals type (EEG, physio or EEG+physio)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.signals_type import signals_type
signals_type_object = signals_type()

# Data type (raw data or preprocessed data)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.data_type import data_type
data_type_object = data_type()

# Dataset (Ex: BCI_competition_4_dataset_a)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.dataset import dataset
dataset_object = dataset()

# Preprocessing parameters (run(s), session(s), list of channels, tmin, tmax, stimulations, etc)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.preprocessing_parameters import preprocessing_parameters
preprocessing_parameters_object = preprocessing_parameters()

# Preprocessed data parameters (band-pass delimiters)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.preprocessed_parameters import preprocessed_parameters
preprocessed_parameters_object = preprocessed_parameters()

# Filters (CSP, FBCSP)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.filters import filters
filters_object = filters()

# Classifiers (Riemannian methods, CNN, LDA)
from src.graphic_user_interface.jupyter_widgets.specific_parameters.classifiers import classifiers
classifiers_object = classifiers()

# Calibration type ('subject-independent', 'subject_specific')
from src.graphic_user_interface.jupyter_widgets.specific_parameters.calibration_type import calibration_type
calibration_type_object = calibration_type()

# Evaluation type ('classic', 'cross-validation')
from src.graphic_user_interface.jupyter_widgets.specific_parameters.evaluation_type import evaluation_type
evaluation_type_object = evaluation_type()

# Evaluation parameters
from src.graphic_user_interface.jupyter_widgets.specific_parameters.evaluation_parameters import evaluation_parameters
evaluation_parameters_object = evaluation_parameters()
evaluation_parameters_object.parameter_type = 'cross-validation_type' # kfold or loo

evaluation_parameters_object_2 = evaluation_parameters()
evaluation_parameters_object_2.parameter_type = 'type_split' # chronological or shuffle

evaluation_parameters_object_3 = evaluation_parameters()
#evaluation_parameters_object_3.parameter_type = 'training_split_ratio'
#evaluation_parameters_object_3.parameter_type_2 = 'nb_kfold'

# Plotting type
from src.graphic_user_interface.jupyter_widgets.specific_parameters.plot_type import plot_type
plotting_parameters_object = plot_type()

# Statistical tests 
from src.graphic_user_interface.jupyter_widgets.specific_parameters.stats_type import stats_type
stats_parameters_object = stats_type()


... BioPyC is loaded ! 

BioPyC has initialized all parameters. They are updated as you follow the steps of the BCI process below, i.e., reading data, pre-processing, signal processing and classification, data visualization and statistical tests.

## <font color = orange> 2 - Type of data/signals</font>

__A - Signals types__

Choose the type of signals you would like to work on:
- __EEG signals__: the signals are studied with algorithms designed for EEG signals
- __physiological signals__: the signals are studied with algorithms designed for physiological signals (heart rate, breathing or electrodermal activity)
- __EEG and physiological signals__: a combination of both EEG and physiological signals is studied with algorithms designed for it

In [2]:
# Load the widgets for choosing the signals type, updates data type widgets when button is clicked
signals_type_object.display_widgets(study_parameters_object=study_parameters_object, 
                                    data_type_object=data_type_object)


__B - Data types__

Choose the type of data you would like to work on:
- __raw data__: data that need to be pre-processed (band-pass filtering, artifact cleaning, epoching, etc.) before any algorithm is applied to them.   
Your data set should be stored in "BioPyC/data_store/rawdata_datasets/"
- __preprocessed data__: data that have already been preprocessed and saved as is.  
Your data set should be stored in "BioPyC/data_store/preprocessed_datasets/"

In [3]:
# Load the widgets for choosing the data type
data_type_object.display_widgets(study_parameters_object = study_parameters_object,
                                dataset_object=dataset_object)


## <font color = orange> 3 - Reading the dataset</font>

#### Choose the dataset
This first step allows you to read data from a folder you previously placed in the data store. You will find your folder by running the code below. 

The folder must be organized as follows:
- __folder name__: name of your study. For example, if you want to analyze the "BCI competition IV dataset a", call it "bci_competition_4_dataset_a"
<br /><br />
- __subfolder names__: names of the subjects. Ex: "subject_1", "subject_2", etc.
<br /><br />
- __files__: each subfolder corresponds to one subject and contains all the files for that subject, each file holding the data band-pass filtered in a particular frequency band. Ex: data band-pass filtered in the [8,12] Hz frequency band. 

This gives the following tree structure: __example - data_store/EEG_data/raw_data/bci_competition_IV_dataset_a/subject_1/[8,12].mat__  

*NB: for the .mat format, we assume that the MATLAB structure header starts with "EEG". Feel free to change this in /src/data/data_readers/mat.py*
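
Before launching the dataset widget, it can help to check that a folder follows this layout. The snippet below is a minimal sketch using only the standard library; `list_subject_files` is a hypothetical helper written for this example, not a BioPyC function, and the temporary folder only stands in for a real dataset.

```python
from pathlib import Path
import tempfile


def list_subject_files(dataset_dir):
    """Map each subject subfolder to the sorted names of the files it contains."""
    dataset_dir = Path(dataset_dir)
    layout = {}
    for subject_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        layout[subject_dir.name] = sorted(
            f.name for f in subject_dir.iterdir() if f.is_file()
        )
    return layout


# Build a throwaway dataset folder that mimics the expected structure.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "bci_competition_4_dataset_a"
    for subject in ("subject_1", "subject_2"):
        (dataset / subject).mkdir(parents=True)
        (dataset / subject / "[8,12].mat").touch()
    layout = list_subject_files(dataset)

print(layout)
```

Each key should be a subject name and each value the list of per-passband files for that subject; an empty list or a missing subject points to a misplaced file.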

In [4]:
# Load the widgets for choosing the dataset
dataset_object.display_widgets(study_parameters_object=study_parameters_object,
                              list_available_datasets=data_type_object.list_available_datasets,
                              preprocessing_parameters_object=preprocessing_parameters_object,
                              preprocessed_parameters_object=preprocessed_parameters_object)


## <font color = orange> 4 - Define Parameters</font>

#### A - Raw data: pre-processing parameters 

If your dataset contains raw signals, you have to specify some parameters for the pre-processing. 
<br /><br />

* __list of subjects to keep__: list containing integers of the subjects you want to keep (Ex: [2, 3, 7])
<br /><br />

* __list of sessions__: list containing integers of the sessions (Ex: [1, 2]).  <font color = red>Must be specified in each file name.</font>
<br /><br />

* __list of runs__: list containing integers of the runs (Ex: [1,2,3,4]).  <font color = red>Must be specified in each file name.</font>
<br /><br />

* __specify labeling__: boolean. If True, you have to implement a script that labels your data, place this Python script in the folder "src/utils/specific_studies_labeling/" and name it after your data set. Note that you have to keep the same structure as in the two provided examples (curiosity.py and bci_competition_4_dataset_2a.py).<br /> 

    * _Example: for a data set called "bci_competition_4_dataset_2a", call your script "bci_competition_4_dataset_2a.py"._
<br /><br />

* __dictionary of "labels: stimulations"__: where the keys are the labels and the values the associated stimulations
<br /><br />

* __t min__: dictionary with the window start time for each type of signal, which can be before or after the stimulation (Ex: -2.5 for 2.5 s before stimulation) <br /> 
    * _Example 1: tmin = {'breathing': -10.0, 'eda': -10.0, 'heart_rate': -10.0, 'eeg': -4.0} (if working with all 4 types of signals)_
    * _Example 2: tmin = {'eeg': 0.5} (if working with EEG signals only)_
<br /><br />

* __t max__: dictionary with the window stop time for each type of signal, which can be before or after the stimulation (Ex: 0.5 for 0.5 s after stimulation).
    * _Example 1: tmax = {'breathing': 4.0, 'eda': 4.0, 'heart_rate': 4.0, 'EEG': -0.1} (if working with all 4 types of signals)_
    * _Example 2: tmax = {'EEG': 2.5} (if working with EEG signals only)_
<br /><br />

* __dictionary of signal types / list of channels__: where the key is the type of signal and the value is the list of channels associated with it. Note that exactly one channel must be given in the list for each type of physiological signal you want to use.
    * _Example 1: {'EEG': ['Fp1', 'Fz', 'F3', 'F7', 'FT9']}_
    * _Example 2: {'breathing': ['Channel 65'], 'eda': ['Channel 67'], 'heart_rate': ['Channel 68'], 'EEG': ['Fp1', 'Fz', 'F3', 'F7', 'FT9']}_
<br /><br />

* __list of EOG channels__: list of EOG channels ['eog_1', 'eog_2', etc]
<br /><br />

* __list of channels to drop__: list of channels to drop ['unnamed_1', 'unnamed_2', etc]
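
The dictionaries above have to agree with each other, so a quick consistency check before typing them into the widgets can save a failed run. The sketch below is an illustration only: `check_preprocessing_parameters` is a hypothetical helper, not part of BioPyC, the key spelling (`'EEG'`) is an assumption, and the rule that `tmin` must be smaller than `tmax` is an assumption about what makes a valid window.

```python
# Hypothetical helper, not part of BioPyC.
PHYSIOLOGICAL_SIGNALS = {"breathing", "eda", "heart_rate"}


def check_preprocessing_parameters(tmin, tmax, channels):
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    for signal, signal_channels in channels.items():
        if signal not in tmin or signal not in tmax:
            problems.append(f"{signal}: missing tmin or tmax")
        elif tmin[signal] >= tmax[signal]:
            # Assumption: a window must start before it stops.
            problems.append(f"{signal}: tmin must be smaller than tmax")
        # Stated above: one channel per physiological signal type.
        if signal in PHYSIOLOGICAL_SIGNALS and len(signal_channels) != 1:
            problems.append(f"{signal}: exactly one channel expected")
    return problems


tmin = {"breathing": -10.0, "eda": -10.0, "heart_rate": -10.0, "EEG": -4.0}
tmax = {"breathing": 4.0, "eda": 4.0, "heart_rate": 4.0, "EEG": -0.1}
channels = {
    "breathing": ["Channel 65"],
    "eda": ["Channel 67"],
    "heart_rate": ["Channel 68"],
    "EEG": ["Fp1", "Fz", "F3", "F7", "FT9"],
}

problems = check_preprocessing_parameters(tmin, tmax, channels)
bad_problems = check_preprocessing_parameters(
    {"EEG": 2.5}, {"EEG": 0.5}, {"EEG": ["Fp1", "Fz"]}
)
print(problems)
print(bad_problems)
```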


In [5]:
# Load the widgets for choosing the preprocessing parameters
preprocessing_parameters_object.display_widgets(study_parameters_object=study_parameters_object,
                                                classifiers_object=classifiers_object,
                                                filters_object=filters_object)


#### B - Preprocessed data: identifying parameters in file names

If your dataset contains pre-processed signals, you have to specify the passband(s) in which the signals have been filtered.  
*NB: there are no lists of runs/sessions; files from different runs/sessions should have been concatenated during the preprocessing step.*
- __passband delimiters__: list of strings that delimit the bounds of the passband in the file name. (Ex: ['\_','to','Hz'] to indicate the 8-12 Hz passband for a file like 'subj1_8to12Hz.mat') 
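
To make the role of the three delimiters concrete, here is a minimal sketch of how a passband could be recovered from a file name. `parse_passband` is a hypothetical function written for this example; how BioPyC actually parses file names is not shown here and may differ.

```python
def parse_passband(filename, delimiters):
    """Extract [low, high] from a file name using three delimiter strings.

    The lower bound sits between the first and second delimiter,
    the upper bound between the second and third.
    """
    start, middle, end = delimiters
    # Search from the right so that earlier occurrences of a delimiter
    # in the subject name do not interfere.
    end_pos = filename.rindex(end)
    mid_pos = filename.rindex(middle, 0, end_pos)
    start_pos = filename.rindex(start, 0, mid_pos)
    low = float(filename[start_pos + len(start):mid_pos])
    high = float(filename[mid_pos + len(middle):end_pos])
    return [low, high]


passband = parse_passband("subj1_8to12Hz.mat", ["_", "to", "Hz"])
print(passband)
```

A file name that lacks one of the delimiters raises a `ValueError`, which is a convenient way to spot files that do not follow the naming scheme.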

In [6]:
# Load the widgets for choosing the preprocessed data parameters
preprocessed_parameters_object.display_widgets(study_parameters_object=study_parameters_object,
                                              classifiers_object=classifiers_object,
                                               filters_object=filters_object)


## <font color = orange> 5 - Filtering signals</font>

This step is <font color = red>__optional__</font>.

First, the application looks in the "filters" folder of the BioPyC toolbox to list the implemented filters. 


*NB: you can use your own filter by placing your script in BioPyC's "filters" folder, provided that your script follows BioPyC's formalism.* 
<br /><br />

- __Available filters:__ so far, the Common Spatial Pattern (CSP) and the Filter Bank CSP (FBCSP) are implemented and available. 
<br /><br />

- __CSP number of filter pairs:__ for the CSP, a single passband is used (see the next section, i.e. the classifier section, to choose this passband). You need to choose the number of filter pairs you want to keep (usually 3, meaning 3 pairs of filters, i.e., 6 features). 
<br /><br />

- __FBCSP number of filter pairs:__ for the FBCSP, multiple passbands are used (see the classifier section as well). You need to choose the number of filter pairs to select per passband.
<br /><br />

- __FBCSP number of features to keep:__ you then need to choose the number of features to keep across the different passbands. Note that these features are selected using the mRMR feature selection algorithm.
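
For reference, CSP is usually formulated as a generalized eigenvalue problem. The formulation below is the standard textbook one and is not taken from BioPyC's source code. The symbols are notation introduced here: $C_1$ and $C_2$ are the average spatial covariance matrices of the two classes, $X$ is one band-pass filtered trial (channels × samples) and $m$ is the number of filter pairs.

$$
C_1 w = \lambda \,(C_1 + C_2)\, w
$$

The $m$ eigenvectors with the largest $\lambda$ and the $m$ with the smallest are kept as spatial filters $w_1, \dots, w_{2m}$, and each trial is typically summarized by log-variance features:

$$
f_j = \log\left(\operatorname{var}\left(w_j^\top X\right)\right), \qquad j = 1, \dots, 2m
$$

With $m = 3$, this gives 6 features per trial, which matches the usual setting of 3 filter pairs.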


In [7]:
# Load the widgets for choosing the filters
filters_object.display_widgets(study_parameters_object=study_parameters_object)


## <font color = orange> 6 - Classifying signals</font>
The application looks in the "classifiers" folder of BioPyC in order to list the implemented classifiers. These classifiers are then displayed in the interface.  

*NB: you can use your own classifier by placing your script in BioPyC's "classifiers" folder.*

In [8]:
# Load the widgets for choosing the classifiers
classifiers_object.display_widgets(study_parameters_object=study_parameters_object,
                                  calibration_type_object=calibration_type_object)


## <font color = orange> 7 - Calibration & Evaluation</font>

__A - Calibration type__

BioPyC can run different types of calibration:

- __subject-specific study__: the data of each subject are first split into two parts, the training and testing sets. Machine learning algorithms are then trained on the first set and evaluated on the second one. 
<br /><br />

- __subject-independent study__: in BioPyC, this type of calibration uses a leave-one-subject-out cross-validation, i.e. a machine learning model is built for each subject, where the training phase uses the data of all other subjects and the testing phase uses the data of the target subject only. 

You can choose either one or both of these calibrations. 
<br /><br />
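
The subject-independent scheme can be sketched in a few lines. This is only an illustration of leave-one-subject-out splitting; `leave_one_subject_out` is a hypothetical helper, not BioPyC's implementation.

```python
def leave_one_subject_out(subjects):
    """Yield (training_subjects, test_subject) pairs, one per subject."""
    for target in subjects:
        training = [s for s in subjects if s != target]
        yield training, target


splits = list(leave_one_subject_out(["subject_1", "subject_2", "subject_3"]))
for training, target in splits:
    print(f"train on {training}, test on {target}")
```

One model is built per target subject, so a study with N subjects trains N models in this calibration mode.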

__B - Evaluation type__

BioPyC can run different types of evaluation:

- __classic__: the data set is split into two subsets, one used as the training set and the other as the testing set. This evaluation method requires you to enter a "split ratio" in the "Evaluation parameters" section coming next. This split ratio must be a float between 0.0 and 1.0 (Ex: 0.8 to keep 80% of the data as the training set and 20% as the testing set).
<br /><br />

- __cross-validation__: the data from each participant are divided into n parts: n-1 parts are used for training the classifier and the remaining one for testing the classifier for that participant. This process is repeated n times, with each part used exactly once as the testing set. 
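
The two evaluation types can be illustrated on a list of trial indices. The functions below are a minimal sketch, not BioPyC's code: `classic_split` and `k_fold` are hypothetical names, the `shuffle` flag mirrors the "chronological or shuffle" split type, and the fixed `seed` is an assumption added so the shuffle is reproducible.

```python
import random


def classic_split(trials, training_split_ratio, shuffle=False, seed=0):
    """Split trials into (training, testing) according to the split ratio."""
    trials = list(trials)
    if shuffle:
        random.Random(seed).shuffle(trials)  # otherwise keep chronological order
    cut = int(round(len(trials) * training_split_ratio))
    return trials[:cut], trials[cut:]


def k_fold(trials, n_folds):
    """Yield (training, testing) pairs; each trial is tested exactly once."""
    trials = list(trials)
    base, extra = divmod(len(trials), n_folds)
    start = 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        testing = trials[start:start + size]
        training = trials[:start] + trials[start + size:]
        yield training, testing
        start += size


trials = list(range(10))
training, testing = classic_split(trials, 0.8)
folds = list(k_fold(trials, 5))
print(training, testing)
print([test for _, test in folds])
```

Setting `n_folds` to the number of trials turns the k-fold loop into leave-one-out, the other cross-validation type offered in the evaluation parameters.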


In [9]:
# Load the widgets for choosing the calibration type
calibration_type_object.display_widgets(study_parameters_object=study_parameters_object,
                                       evaluation_type_object=evaluation_type_object)


In [10]:
# Load the widgets for choosing the evaluation type
evaluation_type_object.display_widgets(study_parameters_object=study_parameters_object,
                                       evaluation_parameters_object=evaluation_parameters_object,
                                       evaluation_parameters_object_2=evaluation_parameters_object_2,
                                       evaluation_parameters_object_3=evaluation_parameters_object_3)


In [11]:


evaluation_parameters_object.display_widgets(widget_type='select_multiple',
                                             study_parameters_object=study_parameters_object,
                                             evaluation_parameters_object_2=evaluation_parameters_object_2,
                                             evaluation_parameters_object_3=evaluation_parameters_object_3)

evaluation_parameters_object_2.display_widgets(widget_type='select_multiple',
                                               study_parameters_object=study_parameters_object,
                                               evaluation_parameters_object_2=evaluation_parameters_object_2,
                                               evaluation_parameters_object_3=evaluation_parameters_object_3)

evaluation_parameters_object_3.display_widgets(widget_type='textbox',
                                               study_parameters_object=study_parameters_object,
                                               evaluation_parameters_object_2=evaluation_parameters_object_2,
                                               evaluation_parameters_object_3=evaluation_parameters_object_3)




In [12]:
plotting_parameters_object.display_widgets(study_parameters_object=study_parameters_object)


In [13]:
stats_parameters_object.display_widgets(study_parameters_object=study_parameters_object)


## Run the application

In [14]:

def print_summary_parameters():
    
    print('--------  SUMMARY PARAMETERS   -------- ')
    print('application directory: ', study_parameters_object.application_directory)

    print('Calibration type: ', study_parameters_object.calibration_type)

    # Study sub types
    print('Cross validation type: ', study_parameters_object.cross_val_type )
    print('Type split: ', study_parameters_object.type_split)
    print('training split ratio: ', study_parameters_object.training_split_ratio ) # split ratio for the training set (Ex: 0.8 for keeping 80% of the data for the training set)
    print('nb kfold: ', study_parameters_object.kfold)

    # Dataset
    # === global parameters ===
    print('dataset format : ', study_parameters_object.dataset_format ) # the format of the dataset, i.e. .mat, .gdf etc
    print('signals type: ', study_parameters_object.signals_type)  # EEG, physio or EEG+physio
    print('data type: ', study_parameters_object.data_type) # 'raw' or 'preprocessed'
    print('dataset: ', study_parameters_object.dataset) # (Ex: 'bci_competition_4_dataset_2a')
    # self.passband = [] # set of passbands data should be filtered into
    print('passband repository: ', study_parameters_object.passband_repository) # set of passbands data should be filtered into

    # === preprocessed data === (no list of runs/session, should have been concatenated during the preprocessing)
    print('passband delimiters: ', study_parameters_object.passband_delimiters) # strings that delimit the bounds of the passband in the file name

    # === raw data === (meaning those are parameters for the preprocessing)
    print('specify labeling: ', study_parameters_object.specify_labeling )  # Meaning you have to implement a script to label your data
    print('list subjects to keep : ', study_parameters_object.list_subjects_to_keep ) # list containing strings of the subjects (Ex: ['subject_1', 'subject_2'])
    print('list sessions: ', study_parameters_object.list_sessions ) # list containing integers of the sessions (Ex: [1,2])
    print('list runs: ', study_parameters_object.list_runs ) # list containing integers of the runs (Ex: [1,2,3,4])
    print('dictionary stimulations: ', study_parameters_object.dictionary_stimulations ) # (Ex: {'right':1, 'left':2})
    print('tmin: ', study_parameters_object.tmin ) # start time of the signal window, can be before/after the stimulation (Ex: 2.5 for 2.5 s after stimulation)
    print('tmax: ', study_parameters_object.tmax)  # stop time of the signal window, can be before/after the stimulation (Ex: -1.0 for 1 s before the stimulation)
    print('list all channels: ', study_parameters_object.list_all_channels) # list of channels ['channel_1', 'channels_2', etc]
    print('list eog: ', study_parameters_object.list_eog )# list EOGs ['eog_1', 'eog_2', etc]
    print('list channels to drop: ', study_parameters_object.list_channels_to_drop) # list of channels ['unnamed_1', 'unnamed_2', etc]

    # Algorithm
    # ==== type ====
    print('list filters : ', study_parameters_object.list_filters)
    print('list classifiers: ', study_parameters_object.list_classifiers)
    # ==== hyper-parameters ====
    print('filter parameters: ', study_parameters_object.filter_parameter )
    print('classifier parameters: ', study_parameters_object.classifier_parameter)

    # Results
    # ==== Evaluation ====
    print('Evaluation type: ', study_parameters_object.evaluation_type)  # 'classic' or 'cross_validation'
    # ==== Analysis to make ====
    print('list statistical tests: ', study_parameters_object.list_statistical_tests)
    print('list plots: ', study_parameters_object.list_plots)
    # ==== Saving path for results dataframe ====
    print('results filename : ', study_parameters_object.results_filename)


    print('----------------------------------------')
    
    print('')
    print('')
    
    
from IPython.display import display
import ipywidgets as widgets
button_ = widgets.Button(description='Run the study !',
                                 button_style='success',
                                 disabled=False,
                                 layout=widgets.Layout(width='50%',
                                                       height='40px'))

display(button_)

out = widgets.Output(layout={'border': '2px solid black'})
display(out)

def click_run_study(b):
    
    with out:
        print_summary_parameters()
        new_application.run_study(studies_parameters=study_parameters_object, b=b)

button_.on_click(click_run_study)


