# Tutorial no. 1 of SpeechBrain-MOABB: Setting up EEG decoding

## **Prerequisites**


### Download SpeechBrain-MOABB

SpeechBrain-MOABB can be downloaded from the GitHub repository listed below.

In [None]:
%%capture
!git clone https://github.com/speechbrain/benchmarks
%cd /content/benchmarks
!git checkout eeg

%cd /content/benchmarks/benchmarks/MOABB
!pip install -r extra-requirements.txt # Install additional dependencies

### Install SpeechBrain and its requirements

In [None]:
%%capture
# Clone SpeechBrain repository (development branch)
%cd /content/
!git clone https://github.com/speechbrain/speechbrain/
%cd /content/speechbrain/

# Install required dependencies
!pip install -r requirements.txt

# Install SpeechBrain in editable mode
!pip install -e .

%cd /content/



## **Define a yaml file with the hyper-parameters of a decoding pipeline**

Let us address 4-class motor imagery decoding on the BNCI2014-001 dataset (also known as the "BCI IV2a dataset"), adopting a leave-one-session-out strategy on the first participant's signals and leaving out the session named '0test'. EEGNet is used as the decoder.
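
As a rough illustration of the leave-one-session-out idea, the sketch below splits a list of session names into training sessions and one held-out session. The helper `leave_one_session_out` and the session labels are hypothetical, and sorting the names before indexing is an assumption made here for reproducibility; it is not taken from the SpeechBrain-MOABB code.

```python
def leave_one_session_out(sessions, target_session_idx):
    """Split session names into (training sessions, held-out session)."""
    # Assumption: sessions are sorted so that the index is reproducible.
    ordered = sorted(sessions)
    test_session = ordered[target_session_idx]
    train_sessions = [s for s in ordered if s != test_session]
    return train_sessions, test_session


# Hypothetical session labels; the real names come from the MOABB dataset.
train, test = leave_one_session_out(["session_a", "session_b"], target_session_idx=1)
print(train, test)  # ['session_a'] session_b
```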

You can set all hyper-parameters to specific values if you already know them; otherwise, you can leave them as placeholders (i.e., set them to `!PLACEHOLDER`). For example, folders (e.g., the data folder, the folder for the cached dataset, and the output folder) and dataset information (e.g., the data iterator and the indices of the subject and session to use) are usually kept as placeholders.

Before you start writing the yaml file, please follow the SpeechBrain tutorial dedicated to HyperPyYAML at https://speechbrain.github.io/tutorial_basics.html.
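
When a file contains many placeholders, it can help to list the keys that still need a value before launching a run. The following is a minimal sketch using only the standard library; `find_placeholders` is a hypothetical helper (not part of HyperPyYAML) and it only looks at top-level `key: !PLACEHOLDER` lines.

```python
import re


def find_placeholders(yaml_text):
    """Return the top-level keys whose value is the !PLACEHOLDER tag."""
    pattern = re.compile(r"^(\w+):[ \t]*!PLACEHOLDER\b", flags=re.MULTILINE)
    return pattern.findall(yaml_text)


snippet = """
seed: 1234
data_folder: !PLACEHOLDER  # filled in from the command line
output_folder: !PLACEHOLDER
"""
print(find_placeholders(snippet))  # ['data_folder', 'output_folder']
```

Each key returned by this check has to be supplied as a command-line override when the training script is called.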

In [None]:
from hyperpyyaml import load_hyperpyyaml, dump_hyperpyyaml

example_hyperparams = """
seed: 1234
__set_torchseed: !apply:torch.manual_seed [!ref <seed>]

# DIRECTORIES
data_folder: !PLACEHOLDER  #'/path/to/dataset'. The dataset will be automatically downloaded in this folder
cached_data_folder: !PLACEHOLDER #'path/to/pickled/dataset'
output_folder: !PLACEHOLDER #'path/to/results'

# DATASET HPARS
# Defining the MOABB dataset.
dataset: !new:moabb.datasets.BNCI2014001
save_prepared_dataset: True # set to True if you want to save the prepared dataset as a pkl file to load and use afterwards
data_iterator_name: 'leave-one-session-out'
target_subject_idx: 0
target_session_idx: 1
events_to_load: null # all events will be loaded
original_sample_rate: 250 # Original sampling rate provided by dataset authors
sample_rate: 125 # Target sampling rate (Hz)
# band-pass filtering cut-off frequencies
fmin: 1
fmax: 40
n_classes: 4
tmin: 0.
tmax: 4.0
# number of steps used when selecting adjacent channels from a seed channel (default at Cz)
n_steps_channel_selection: 3
T: !apply:math.ceil
    - !ref <sample_rate> * (<tmax> - <tmin>)
C: 22
test_with: 'best' # 'last' or 'best'
test_key: "acc" # Possible opts: "loss", "f1", "auc", "acc"

# METRICS
f1: !name:sklearn.metrics.f1_score
    average: 'macro'
acc: !name:sklearn.metrics.balanced_accuracy_score
cm: !name:sklearn.metrics.confusion_matrix
metrics:
    f1: !ref <f1>
    acc: !ref <acc>
    cm: !ref <cm>

# TRAINING HPARS
n_train_examples: 100  # it will be replaced in the train script
# checkpoints to average
avg_models: 10
number_of_epochs: 1000
lr: 0.0001
# Learning rate scheduling (cyclic learning rate is used here)
max_lr: !ref <lr> # Upper bound of the cycle (max value of the lr)
base_lr: 0.00000001 # Lower bound in the cycle (min value of the lr)
step_size_multiplier: 5 #from 2 to 8
step_size: !apply:round
    - !ref <step_size_multiplier> * <n_train_examples> / <batch_size>
lr_annealing: !new:speechbrain.nnet.schedulers.CyclicLRScheduler
    base_lr: !ref <base_lr>
    max_lr: !ref <max_lr>
    step_size: !ref <step_size>
label_smoothing: 0.0
loss: !name:speechbrain.nnet.losses.nll_loss
    label_smoothing: !ref <label_smoothing>
optimizer: !name:torch.optim.Adam
    lr: !ref <lr>
epoch_counter: !new:speechbrain.utils.epoch_loop.EpochCounter  # epoch counter
    limit: !ref <number_of_epochs>
batch_size: 32
valid_ratio: 0.2

# DATA NORMALIZATION
dims_to_normalize: 1 # 1 (time) or 2 (EEG channels)
normalize: !name:speechbrain.processing.signal_processing.mean_std_norm
    dims: !ref <dims_to_normalize>

# MODEL
input_shape: [null, !ref <T>, !ref <C>, null]
cnn_temporal_kernels: 8
cnn_temporal_kernelsize: 62
cnn_spatial_depth_multiplier: 2
cnn_spatial_max_norm: 1.
cnn_spatial_pool: 4
cnn_septemporal_depth_multiplier: 1
cnn_septemporal_point_kernels: !ref <cnn_temporal_kernels> * <cnn_spatial_depth_multiplier> * <cnn_septemporal_depth_multiplier>
cnn_septemporal_kernelsize: 16
cnn_septemporal_pool: 8
cnn_pool_type: 'avg'
dense_max_norm: 0.25
dropout: 0.5
activation_type: 'elu'

model: !new:models.EEGNet.EEGNet
    input_shape: !ref <input_shape>
    cnn_temporal_kernels: !ref <cnn_temporal_kernels>
    cnn_temporal_kernelsize: [!ref <cnn_temporal_kernelsize>, 1]
    cnn_spatial_depth_multiplier: !ref <cnn_spatial_depth_multiplier>
    cnn_spatial_max_norm: !ref <cnn_spatial_max_norm>
    cnn_spatial_pool: [!ref <cnn_spatial_pool>, 1]
    cnn_septemporal_depth_multiplier: !ref <cnn_septemporal_depth_multiplier>
    cnn_septemporal_point_kernels: !ref <cnn_septemporal_point_kernels>
    cnn_septemporal_kernelsize: [!ref <cnn_septemporal_kernelsize>, 1]
    cnn_septemporal_pool: [!ref <cnn_septemporal_pool>, 1]
    cnn_pool_type: !ref <cnn_pool_type>
    activation_type: !ref <activation_type>
    dense_max_norm: !ref <dense_max_norm>
    dropout: !ref <dropout>
    dense_n_neurons: !ref <n_classes>

"""
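
Several entries in the yaml file above are derived from other entries through `!ref` arithmetic and `!apply`. The sketch below reproduces three of them in plain Python, using the values from the file, to show what they evaluate to. Reading `step_size` as the half-cycle length of the cyclic learning rate (in iterations) is an assumption about the scheduler, not something stated in this tutorial.

```python
import math

# Values copied from the yaml file above
sample_rate, tmin, tmax = 125, 0.0, 4.0
n_train_examples, batch_size, step_size_multiplier = 100, 32, 5

# T: number of time samples per trial after resampling
T = math.ceil(sample_rate * (tmax - tmin))
# step_size: assumed to be the half-cycle length of the cyclic learning rate
step_size = round(step_size_multiplier * n_train_examples / batch_size)
# cnn_septemporal_point_kernels: temporal kernels * spatial depth multiplier
# * separable-temporal depth multiplier
point_kernels = 8 * 2 * 1

print(T, step_size, point_kernels)  # 500 16 16
```

Note that `n_train_examples` is only a default here; as the comment in the yaml file says, it is replaced in the train script, so the actual `step_size` depends on the size of the training set.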


In [None]:
# Save the yaml file on disk
with open('/content/example_hyperparams.yaml', "w") as f:
    f.write(example_hyperparams)

### **Note about data augmentation**

It is worth highlighting that the previous yaml file does not include any data augmentation. However, you can easily add it by defining each augmenter (e.g., CutCat and random time shift).

The augmenters defined in this way are provided as input to the `Augmenter` class, which combines and applies them. For instance, you can apply the augmenters in sequence or in parallel (`parallel_augment` input parameter), use one or more augmenters for augmenting each mini-batch of data (`min_augmentations` and `max_augmentations` input parameters), and repeat data augmentation multiple times for each mini-batch (`repeat_augment` input parameter). See the `Augmenter` documentation for further details.

In [None]:
data_augmentation_hyperparams = """
# DATA AUGMENTATION
# cutcat (disabled when min_num_segments=max_num_segments=1)
max_num_segments: 3 # @orion_step2: --max_num_segments~"uniform(2, 6, discrete=True)"
cutcat: !new:speechbrain.augment.time_domain.CutCat
    min_num_segments: 2
    max_num_segments: !ref <max_num_segments>
# random amplitude gain between 0.5-1.5 uV (disabled when amp_delta=0.)
amp_delta: 0.01742 # @orion_step2: --amp_delta~"uniform(0.0, 0.5)"
rand_amp: !new:speechbrain.augment.time_domain.RandAmp
    amp_low: !ref 1 - <amp_delta>
    amp_high: !ref 1 + <amp_delta>
# random shifts between -300 ms to 300 ms (disabled when shift_delta=0.)
shift_delta_: 1 # orion_step2: --shift_delta_~"uniform(0, 25, discrete=True)"
shift_delta: !ref 1e-2 * <shift_delta_> # 0.250 # 0.-0.25 with steps of 0.01
min_shift: !apply:math.floor
    - !ref 0 - <sample_rate> * <shift_delta>
max_shift: !apply:math.floor
    - !ref 0 + <sample_rate> * <shift_delta>
time_shift: !new:speechbrain.augment.freq_domain.RandomShift
    min_shift: !ref <min_shift>
    max_shift: !ref <max_shift>
    dim: 1
# injection of gaussian white noise
snr_white_low: 15.0 # @orion_step2: --snr_white_low~"uniform(0.0, 15, precision=2)"
snr_white_delta: 19.1 # @orion_step2: --snr_white_delta~"uniform(5.0, 20.0, precision=3)"
snr_white_high: !ref <snr_white_low> + <snr_white_delta>
add_noise_white: !new:speechbrain.augment.time_domain.AddNoise
    snr_low: !ref <snr_white_low>
    snr_high: !ref <snr_white_high>

repeat_augment: 1 # @orion_step1: --repeat_augment 0
augment: !new:speechbrain.augment.augmenter.Augmenter
    parallel_augment: True
    concat_original: True
    parallel_augment_fixed_bs: True
    repeat_augment: !ref <repeat_augment>
    shuffle_augmentations: True
    min_augmentations: 4
    max_augmentations: 4
    augmentations: [
        !ref <cutcat>,
        !ref <rand_amp>,
        !ref <time_shift>,
        !ref <add_noise_white>]
"""

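example_hyperparams += data_augmentation_hyperparams

# Save the yaml file again so that the augmentation settings are also written to disk
with open('/content/example_hyperparams.yaml', "w") as f:
    f.write(example_hyperparams)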
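
To see which ranges the augmentation settings above resolve to, the derived values can be recomputed in plain Python. This is a minimal sketch that mirrors the `!ref` expressions of the yaml block; it assumes that `min_shift` and `max_shift` are counted in samples along the time dimension.

```python
import math

sample_rate = 125        # Hz, from the main hyper-parameter file
amp_delta = 0.01742
shift_delta = 1e-2 * 1   # seconds (shift_delta_ = 1)
snr_white_low, snr_white_delta = 15.0, 19.1

# Random amplitude gain range
amp_low, amp_high = 1 - amp_delta, 1 + amp_delta
# Random shift range (assumed to be in samples)
min_shift = math.floor(0 - sample_rate * shift_delta)
max_shift = math.floor(0 + sample_rate * shift_delta)
# Upper bound of the signal-to-noise ratio for white noise
snr_white_high = snr_white_low + snr_white_delta

print(min_shift, max_shift)  # -2 1
```

Because both bounds use `math.floor`, the shift range is not symmetric when `sample_rate * shift_delta` is not an integer: here 1.25 samples becomes -2 on the lower side and 1 on the upper side.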

## **Train the neural network on a single cross-validation fold**

Start network training by running the `train.py` script, providing the path to the yaml file and overriding the variables set as placeholders in it. For brevity, we also override the number of training epochs, reducing it from 1000 to 50 with `--number_of_epochs 50`. Any other hyper-parameter can be overridden in the same way.

In [None]:
%cd /content/benchmarks/benchmarks/MOABB/

!python train.py /content/example_hyperparams.yaml \
--data_folder '/content/data/BNCI2014001' \
--cached_data_folder '/content/data' \
--output_folder '/content/results/single-fold-example/BNCI2014001' \
--data_iterator_name 'leave-one-session-out' \
--target_subject_idx 0 \
--target_session_idx 1 \
--number_of_epochs 50 \
--device 'cpu' # Switch to cuda for a speed up.
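
If you launch many runs with different overrides, the command can also be assembled programmatically. The sketch below turns a dictionary of overrides into `--key value` arguments; `build_train_command` is a hypothetical helper written for this illustration, and the overridden keys (`lr`, `batch_size`) are simply entries from the yaml file above.

```python
import shlex


def build_train_command(hparams_file, overrides):
    """Build the train.py call, turning each override into a --key value pair."""
    command = ["python", "train.py", hparams_file]
    for key, value in overrides.items():
        command += [f"--{key}", str(value)]
    return command


command = build_train_command(
    "/content/example_hyperparams.yaml",
    {"number_of_epochs": 50, "lr": 0.001, "batch_size": 64},
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run`; remember that the placeholder entries (folders and so on) still have to be included among the overrides.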


## **Run a complete experiment by looping over the entire dataset**

In the previous cell, `train.py` was called for a single cross-validation fold (i.e., one participant and one held-out session in leave-one-session-out cross-validation). To run training on all participants and cross-validation folds, we provide a command line interface, `./run_experiments.sh`.

Here, besides the relevant folders, you should specify the hyper-parameter file, the number of participants and sessions to use (`nsbj` and `nsess`), and the data iteration scheme (`train_mode`: leave-one-session-out or leave-one-subject-out). You can also run the code multiple times, each time with a different random seed for weight initialization, by setting the `nruns` parameter. Finally, you can define the `device` to use (set it to `cpu` if you do not have a GPU).



In [None]:
!./run_experiments.sh --hparams /content/example_hyperparams.yaml \
--data_folder '/content/data/BNCI2014001' \
--cached_data_folder '/content/data' \
--output_folder '/content/results/full-experiment/BNCI2014001' \
--nsbj 9 --nsess 2 --nruns 1 --train_mode 'leave-one-session-out' \
--number_of_epochs 50 \
--device 'cpu'
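
To get a feeling for how many training runs such an experiment involves, the folds can be enumerated conceptually as below. This is only a sketch of the leave-one-session-out case under the settings used above; `enumerate_folds` is a hypothetical helper and the ordering of the loops is an assumption, not the actual logic of `run_experiments.sh`.

```python
from itertools import product


def enumerate_folds(nsbj, nsess, nruns):
    """List (run, subject index, held-out session index) for leave-one-session-out."""
    return list(product(range(nruns), range(nsbj), range(nsess)))


folds = enumerate_folds(nsbj=9, nsess=2, nruns=1)
print(len(folds))  # 18
print(folds[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```

With 9 participants, 2 sessions, and 1 run, this gives 18 calls equivalent to the single-fold example, which is why reducing the number of epochs or switching to a GPU makes a noticeable difference.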