# Deep Learning for EEG data analysis
We have already seen from a more theoretical point of view what Deep Learning models are and what they are meant for. <br>
Now it is time to understand how to use these models with Python!! <br>

In the following, we will see a complete pipeline for the task of <span style="color:orange">Emotion Recognition from EEG data</span> using the SEED dataset (Zheng et al., 2015, https://bcmi.sjtu.edu.cn/home/seed). <br>

The workflow is composed of the following steps:

<span style="font-size:24px">1. Data Loading & Preprocessing</span><br>
<span style="font-size:24px">2. Model Definition</span><br>
<span style="font-size:24px">3. Model Training</span><br>
<span style="font-size:24px">4. Model Evaluation</span><br>

We will go over every step, one at a time, discussing the problem and trying to find a solution. 

## 1. Data Loading & Preprocessing


Here is a brief description of how the SEED dataset was collected (from https://bcmi.sjtu.edu.cn/home/seed/seed.html):

*"Fifteen Chinese film clips (positive, neutral and negative emotions) were chosen from the pool of materials as stimuli used in the experiments. [...] The duration of each film clip is approximately 4 minutes. Each film clip is well edited to create coherent emotion eliciting and maximize emotional meanings. [...] There is a total of 15 trials (ed. movies) for each experiment. There was a 5 s hint before each clip, 45 s for self-assessment and 15 s to rest after each clip in one session. The order of presentation is arranged in such a way that two film clips that target the same emotion are not shown consecutively. For feedback, the participants were told to report their emotional reactions to each film clip by completing the questionnaire immediately after watching each clip. [...] [EEG signals] were collected with the 62-channel ESI NeuroScan System."*

<div style="text-align:center"><img src="./images/Data_collection_setup.png" alt="exp_setup"></div>

The resulting dataset can be summarized as follows:

<div style="text-align:center"><img src="./images/dataset_schema.png" alt="exp_setup" width="1000" height="500"></div>

Finally, it is worth mentioning that the provided data were **downsampled** to **200 Hz** and filtered with a **bandpass frequency filter** from **0 - 75 Hz**. 

<span style="color:red;font-size:24px"><em>A bit of Math</em></span>

As we discussed before, Deep Learning is all about Linear Algebra and tensors. Let's write down a *Legend* to fix symbols for the dimensions of our dataset. <br>
<div style="text-align:left">
    <p style="font-size:24px;">- N: <small><em>number of EEG recordings</em></small><br> - C: <small><em>number of EEG channels</em></small><br> - L: <small><em>EEG signal length</em></small><br></p>
</div>

Now, if we call the dataset containing the EEG signals $X$ and the associated labels $y$, we have that $X$ and $y$ are tensors of size:
    $$X \in [N \times C \times L],$$    $$y \in [N]$$
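
To make the notation concrete, here is a minimal sketch that builds random arrays with these shapes. The sizes are toy values chosen only for illustration (they are not the real SEED dimensions), and encoding the three emotions as the integers 0, 1, 2 is an assumption.

```python
import numpy as np

# Toy sizes (hypothetical, much smaller than the real dataset)
N, C, L = 6, 62, 1000   # recordings, channels, timepoints per recording

rng = np.random.default_rng(0)
X = rng.standard_normal((N, C, L))   # one (C x L) signal matrix per recording
y = rng.integers(0, 3, size=N)       # one label per recording (assumed encoding: 0, 1, 2)

print(X.shape, y.shape)
```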


### 1.1. Load, Extract and Transform
The first step is to **load** data from file, **extract** the meaningful information, and **transform** them into a proper format. 

<span style="color:orange;font-size:18px">This step requires some proficiency with file and data handling, plus some experience with PyTorch tensor manipulation, and hence it goes a bit beyond the scope of this tutorial. <br>
Anyway, for those who are interested, the code used to manipulate the input data is available at: https://github.com/federico-carrara/DL_for_Neuro_workshop/blob/main/dataset.py</span>

### 1.2. Data preprocessing
<span style="color:orange;font-size:20px"><em>Deep Learning models are powerful feature extractors that can deal with unstructured data. <br></em></span>

Well, in principle, that's true. However, in practice, the signal-to-noise ratio of most data sources is very low. <br>
So, if we feed our model noisy data, most of the time we end up in a *garbage-in-garbage-out* situation:
<div style="text-align:center"><img src="./images/gigo.png" alt="gigo"></div>

Therefore, **data preprocessing** is an essential step in all data analysis pipelines. <br>
Specifically, in this case, we do the following:

<span style="font-size:24px">I) Division of the EEG signals in overlapping windows</span><br>
Each of the input signals covers about 4 minutes of EEG recording sampled at 200 Hz. Hence each signal is made of ~48k timepoints!!!<br>
Splitting each signal into windows spanning 1, 5, or 10 seconds gives us data that are easier to handle and still informative. 
<div style="text-align:left"><img src="./images/window_overlap.png" alt="overlap" height="300" width="400"></div>

Why overlapping windows?? It is an example of *Data Augmentation*!

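<span style="color:red">A bit of Math:</span> after this step $X \in [\hat{N} \times C \times W]$ and $y \in [\hat{N}]$, where $W$ = window length, and $\hat{N} \approx N \cdot \frac{L}{W}$ for non-overlapping windows (overlapping windows yield more: with a step of $S$ timepoints between window starts, $\hat{N} \approx N \cdot \frac{L}{S}$).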
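
The windowing step can be sketched with NumPy as follows. This is only an illustration, not the workshop's implementation: the function name `split_into_windows` and the `step` parameter (the distance in timepoints between the starts of consecutive windows) are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def split_into_windows(signal, window, step):
    """Split a (C, L) signal into windows of `window` timepoints taken every `step` timepoints.

    Returns an array of shape (n_windows, C, window). Windows overlap when step < window.
    """
    # All windows with a step of 1: shape (C, L - window + 1, window)
    all_windows = sliding_window_view(signal, window_shape=window, axis=-1)
    # Keep one window every `step` timepoints and move the window axis first
    return np.ascontiguousarray(all_windows[:, ::step, :].transpose(1, 0, 2))

signal = np.arange(2 * 20, dtype=float).reshape(2, 20)   # toy signal: C=2, L=20
no_overlap = split_into_windows(signal, window=5, step=5)
overlapping = split_into_windows(signal, window=5, step=2)
print(no_overlap.shape)    # (4, 2, 5)
print(overlapping.shape)   # (8, 2, 5)
```

With the same signal, a smaller step produces more (partially redundant) windows, which is why overlapping acts as a form of data augmentation.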

<span style="font-size:24px">II) Extraction of EEG sub-bands</span><br>
We apply a *Butterworth filter* to extract 4 sub-signals with frequencies in the *Theta*, *Alpha*, *Beta* and *Gamma* ranges.

<span style="color:red">A bit of Math:</span> after this step $X \in [\hat{N} \times F \times C \times W]$ and $y \in [\hat{N}]$, where $F = 4$ is the number of sub-bands.
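
A possible way to extract the sub-bands with SciPy is sketched below. The band edges in `BANDS`, the filter order, and the use of zero-phase filtering (`sosfiltfilt`) are assumptions made for this example; the workshop code may use different values.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 200  # sampling rate in Hz (from the dataset description)
# Band edges in Hz: common conventions, assumed here rather than taken from the workshop code
BANDS = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 31), "gamma": (31, 50)}

def extract_bands(windows, fs=FS, bands=BANDS, order=4):
    """Band-pass filter (N, C, W) windows into len(bands) sub-signals: (N, F, C, W)."""
    filtered = []
    for low, high in bands.values():
        sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
        filtered.append(sosfiltfilt(sos, windows, axis=-1))  # zero-phase filtering
    return np.stack(filtered, axis=1)

t = np.arange(1000) / FS                                  # a 5 s window
windows = np.sin(2 * np.pi * 10 * t)[None, None, :]       # N=1, C=1, a pure 10 Hz tone
sub_bands = extract_bands(windows)
print(sub_bands.shape)                                    # (1, 4, 1, 1000)

energy = (sub_bands ** 2).sum(axis=-1).ravel()            # energy left in each band
print(list(BANDS)[int(energy.argmax())])                  # alpha
```

The 10 Hz test tone falls inside the assumed alpha range, so almost all of its energy survives only in that sub-signal.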

<span style="font-size:24px">III) Computation of <em>channel-wise</em> Differential Entropy </span><br>
Intuitively, the **Differential Entropy** is a measure of the *dispersion* of the distribution of a continuous variable. <br>
Assuming that the EEG signal is a set of *realizations* of a *Gaussian* random variable, the **Differential Entropy** of a given signal $S$ (with standard deviation $\sigma$) can be computed as:
$$DE(S) = \frac{1}{2} \log_2{(2\pi e \sigma^2)}$$
Therefore, given a signal as input, we compute a single number as output. We compute the $DE$ separately on every channel of each window, so from the 62 signals of a window, each one corresponding to a different channel, we get a single vector of 62 values.<br>

<span style="color:red">A bit of Math:</span> after this step $X \in [\hat{N} \times F \times C]$ and $y \in [\hat{N}]$.
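
Under the Gaussian assumption the DE depends only on the variance of the signal, so it can be computed in a couple of lines. The sketch below uses random unit-variance noise as a stand-in for the filtered EEG windows; the function name and toy sizes are chosen for illustration.

```python
import numpy as np

def differential_entropy(x, axis=-1):
    """DE of a Gaussian signal along `axis`: 0.5 * log2(2 * pi * e * variance)."""
    variance = np.var(x, axis=axis)
    return 0.5 * np.log2(2 * np.pi * np.e * variance)

rng = np.random.default_rng(0)
# Toy tensor with shape (N_hat, F, C, W) = (3, 4, 62, 1000)
x = rng.standard_normal((3, 4, 62, 1000))
de = differential_entropy(x)
print(de.shape)   # (3, 4, 62)
```

The time axis disappears, as expected. Note also that doubling the amplitude of a signal multiplies its variance by 4 and therefore raises its DE by exactly 1 bit.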

<span style="font-size:24px">IV) Standardization of data <em>by-trial</em></span><br>
A challenging problem when dealing with EEG data is that the signals are extremely *subject* and *trial-dependent*. <br>
Standardizing the data *by-trial* enables us to lower the *within-trial* variability of the data, so that the model is able to focus more on the *between-trials* variability.

<span style="color:red">A bit of Math:</span> standardization does not alter the size of the data tensors.
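
One way to standardize by trial is to z-score every feature using only the windows that belong to the same trial. The sketch below assumes that a vector `trial_ids` stores the trial of each window and that statistics are computed separately for every (sub-band, channel) pair; both choices are assumptions and may differ from the workshop code.

```python
import numpy as np

def standardize_by_trial(features, trial_ids, eps=1e-8):
    """Z-score each feature using the mean and std of the windows of the same trial.

    features:  (N_hat, F, C) array of DE values
    trial_ids: (N_hat,) array with the trial index of every window
    """
    out = np.empty_like(features, dtype=float)
    for trial in np.unique(trial_ids):
        mask = trial_ids == trial
        mean = features[mask].mean(axis=0, keepdims=True)
        std = features[mask].std(axis=0, keepdims=True)
        out[mask] = (features[mask] - mean) / (std + eps)  # eps avoids division by zero
    return out

rng = np.random.default_rng(0)
trial_ids = np.repeat([0, 1], 20)                              # 2 toy trials, 20 windows each
offsets = np.where(trial_ids == 0, 0.0, 5.0)[:, None, None]    # trial 1 has a shifted baseline
features = rng.standard_normal((40, 4, 62)) + offsets
standardized = standardize_by_trial(features, trial_ids)
print(standardized.shape)   # (40, 4, 62)
```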

<span style="font-size:24px"><span style="color:magenta">V) Option 1:</span> Concatenation of sub-bands vector</span><br>
To make the transformed data suitable for training a *Neural Network* we **concatenate** the sub-band vectors into a single vector.

<span style="color:red">A bit of Math:</span> after this step $X \in [\hat{N} \times (F*C)]$
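
With $F = 4$ sub-bands and $C = 62$ channels, the concatenated vector has $4 \times 62 = 248$ entries, which matches the `input_size` used for the FFNN later on. A minimal sketch of the operation, assuming the sub-band vectors are simply placed one after the other:

```python
import numpy as np

F, C = 4, 62
x = np.arange(2 * F * C, dtype=float).reshape(2, F, C)   # toy batch with N_hat = 2

# Flatten the (F, C) axes: entries 0-61 come from band 0, 62-123 from band 1, and so on
x_flat = x.reshape(x.shape[0], -1)
print(x_flat.shape)   # (2, 248)
```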

<span style="font-size:24px"><span style="color:cyan">V) Option 2:</span> Mapping to Electrode Grid</span><br>
Instead of concatenating vectors, we can map the value computed from each electrode to a 2D grid whose cells represent the positions of the electrodes. In this way we can additionally exploit the information about the relative position of the electrodes.

<div style="display: flex; justify-content: left;">
  <img src="./images/Electrode_pos.png" style="width: 25%;">
  <img src="./images/map_to_grid.png" style="width: 70%;">
</div>

<span style="color:red">A bit of Math:</span> after this step $X \in [\hat{N} \times F \times 9 \times 9]$, since a $9 \times 9$ grid is needed to accommodate all 62 electrodes.
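
The mapping itself is a scatter operation: every channel value is written to the grid cell of its electrode, and cells without an electrode stay at zero. The sketch below uses a hypothetical `CHANNEL_TO_CELL` dictionary with only three electrodes; the actual layout used in the workshop covers all 62 channels and is not reproduced here.

```python
import numpy as np

# Hypothetical (row, col) cells for three electrodes only; the real layout
# covers all 62 channels and may place them differently.
CHANNEL_TO_CELL = {0: (0, 3), 1: (0, 4), 2: (0, 5)}

def map_to_grid(features, channel_to_cell, grid_size=(9, 9)):
    """Scatter (N_hat, F, C) features onto a (N_hat, F, H, W) grid; empty cells stay 0."""
    n, f, _ = features.shape
    grid = np.zeros((n, f, *grid_size), dtype=features.dtype)
    for channel, (row, col) in channel_to_cell.items():
        grid[:, :, row, col] = features[:, :, channel]
    return grid

features = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)   # N_hat=2, F=4, C=3
grid = map_to_grid(features, CHANNEL_TO_CELL)
print(grid.shape)   # (2, 4, 9, 9)
```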

<span style="color:orange;font-size:18px">The preprocessing of a large dataset like SEED may take several minutes of computation. Therefore, for the workshop you are provided with the already preprocessed data (for different window sizes and both the concatenation and grid options).

Once again, the code used to manipulate the input data is available at: https://github.com/federico-carrara/DL_for_Neuro_workshop/blob/main/dataset.py</span>

### It's finally time to play a bit with the data!!!
As we mentioned, we have two types of inputs for the Deep Learning models:
- <span style="color:magenta">Concatenated data</span>
- <span style="color:cyan">Data mapped to 2D grid</span>

Clearly, depending on the particular input, we need a different Neural Network to process it. <br>
Namely, we will use:
- <span style="color:magenta">A Feed Forward Neural Network (FFNN)</span>
- <span style="color:cyan">A Convolutional Neural Network (CNN)</span>


### <span style="color:magenta">Option 1: Concatenated Data and Feed Forward Neural Network</span>

#### Import Libraries

In [15]:
from torch.utils.data import DataLoader
from dataset_v2 import SEEDDataset
from FFNN_model import EEGFeedForwardNet, EEGFeedForwardNetModel
import pytorch_lightning as pl
from torchsummary import summary 

#### Load preprocessed data

Define window length, batch size, and the path to the preprocessed data files

In [26]:
window_length = 1000 # possible values: [200, 1000, 2000]
batch_sz = 16 # possible values: [8,16,32,64,128,256,512]

dataset_type = "concatenated" # possible values: ["concatenated", "grid"]

path_to_training_data = f"../data/processed_data_{dataset_type}/train_data_processed_win{window_length}_{dataset_type}.h5"
path_to_validation_data = f"../data/processed_data_{dataset_type}/valid_data_processed_win{window_length}_{dataset_type}.h5"
path_to_test_data = f"../data/processed_data_{dataset_type}/test_data_processed_win{window_length}_{dataset_type}.h5"

Load the data

In [27]:
train_dataset = SEEDDataset(
    path_to_preprocessed=path_to_training_data,
    split="train"
)

val_dataset = SEEDDataset(
    path_to_preprocessed=path_to_validation_data,
    split="validation"
)

test_dataset = SEEDDataset(
    path_to_preprocessed=path_to_test_data,
    split="test"
)

Loading preprocessed train data at ../data/processed_data_concatenated/train_data_processed_win1000_concatenated.h5...
Loading preprocessed validation data at ../data/processed_data_concatenated/valid_data_processed_win1000_concatenated.h5...
Loading preprocessed test data at ../data/processed_data_concatenated/test_data_processed_win1000_concatenated.h5...


Create Data Loaders <small>(objects that automatically prepare data to be fed into the network)</small>

In [28]:
train_dataloader = DataLoader(train_dataset, batch_size=batch_sz, shuffle=True, num_workers=4)
val_dataloader = DataLoader(val_dataset, batch_size=batch_sz, shuffle=False, num_workers=4)
test_dataloader = DataLoader(test_dataset, batch_size=batch_sz, shuffle=False, num_workers=4)

#### Define the model

Specify model parameters

In [29]:
model_params = {
    "input_size": 248, # BETTER LEAVE LIKE THIS
    "num_classes": 3, # BETTER LEAVE LIKE THIS
    "dropout_prob": 0.25, # possible values: any in the interval [0, 0.8] 
    "hidden_size": 64,
    "norm_method": "batch" # BETTER LEAVE LIKE THIS
}

Build the Neural Network to inspect its structure *(not needed for training)*

In [30]:
ffnn_model = EEGFeedForwardNetModel(**model_params)

In [31]:
summary(
    model=ffnn_model, 
    input_size=(248,),
    batch_size=batch_sz, 
    device="cpu"
)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
       BatchNorm1d-1                  [16, 248]             496
            Linear-2                   [16, 64]          15,936
       BatchNorm1d-3                   [16, 64]             128
              ReLU-4                   [16, 64]               0
           Dropout-5                   [16, 64]               0
            Linear-6                   [16, 64]           4,160
       BatchNorm1d-7                   [16, 64]             128
              ReLU-8                   [16, 64]               0
           Dropout-9                   [16, 64]               0
           Linear-10                   [16, 64]           4,160
      BatchNorm1d-11                   [16, 64]             128
             ReLU-12                   [16, 64]               0
          Dropout-13                   [16, 64]               0
           Linear-14                   
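
The parameter counts in the summary can be checked by hand: a linear layer has one weight per input-output pair plus one bias per output, and a batch-norm layer has one learnable scale and one learnable shift per feature. The sketch below redoes this arithmetic. Since the last row of the summary is cut off, the size of the final layer (`hidden_size` to `num_classes`) is an assumption.

```python
def linear_params(n_in, n_out):
    return n_in * n_out + n_out    # weights + biases

def batchnorm_params(n_features):
    return 2 * n_features          # learnable scale and shift

input_size, hidden_size, num_classes = 248, 64, 3

layers = {
    "BatchNorm1d-1": batchnorm_params(input_size),
    "Linear-2": linear_params(input_size, hidden_size),
    "BatchNorm1d-3": batchnorm_params(hidden_size),
    "Linear-6": linear_params(hidden_size, hidden_size),
    "BatchNorm1d-7": batchnorm_params(hidden_size),
    "Linear-10": linear_params(hidden_size, hidden_size),
    "BatchNorm1d-11": batchnorm_params(hidden_size),
    # Assumption: the truncated last row maps hidden_size -> num_classes
    "Linear-14": linear_params(hidden_size, num_classes),
}
for name, count in layers.items():
    print(f"{name:>15}: {count:>7,}")
print("total:", sum(layers.values()))   # total: 25331
```

Under that assumption the total is consistent with the 25.3 K trainable parameters that the trainer reports when training starts.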

Specify Training Parameters

In [32]:
training_params = {    
    "lr": 5e-4, # possible values: any in the interval [1e-6, 1e-2]
    "betas": [0.9, 0.99], # BETTER LEAVE LIKE THIS
    "weight_decay": 1e-6, # BETTER LEAVE LIKE THIS
    "epochs": 20, # possible values: any (but be careful to check for overfitting!!)
    "lr_patience": 3 # possible values: no more than 1/3 of `epochs` is a reasonable value
}
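
The name `lr_patience` suggests a *reduce-on-plateau* schedule, where the learning rate is lowered once the validation loss stops improving. The plain-Python sketch below illustrates that idea under this assumption; the function, the `factor` value, and the loss values are made up for the example and are not taken from the workshop code.

```python
def reduce_on_plateau(val_losses, lr, patience, factor=0.1):
    """Return the learning rate used at each epoch under a simple plateau rule.

    The rate is multiplied by `factor` once the validation loss has failed to
    improve for more than `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        history.append(lr)
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
    return history

# Made-up validation losses: improvement stalls after the third epoch
val_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.70, 0.73, 0.69, 0.69]
history = reduce_on_plateau(val_losses, lr=5e-4, patience=3)
print(history)
```

In this toy run the rate stays at `5e-4` for the first seven epochs and drops by a factor of 10 afterwards. A patience that is too large compared to `epochs` would never trigger a reduction, which is one reason to keep it small.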

Build the actual trainable model

In [33]:
ffnn_net = EEGFeedForwardNet(
    model_parameters=model_params,
    **training_params
)

#### Train the model

Some useful stuff for training (BETTER LEAVE LIKE THIS)

In [34]:
best_checkpoint_callback = pl.callbacks.ModelCheckpoint(
    save_top_k=1,
    monitor="val_loss",
    mode="min",
    filename="best_checkpoint",
)

last_checkpoint_callback = pl.callbacks.ModelCheckpoint(
    save_last=True,
    filename="last_checkpoint_at_{epoch:02d}",
)

lr_monitor = pl.callbacks.LearningRateMonitor(logging_interval='epoch')

It is time to train

In [35]:
trainer = pl.Trainer(
    accelerator="gpu",
    max_epochs=training_params["epochs"], 
    callbacks=[best_checkpoint_callback, last_checkpoint_callback, lr_monitor])
trainer.fit(ffnn_net, train_dataloader, val_dataloader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name     | Type                   | Params
----------------------------------------------------
0 | loss_fun | CrossEntropyLoss       | 0     
1 | model    | EEGFeedForwardNetModel | 25.3 K
----------------------------------------------------
25.3 K    Trainable params
0         Non-trainable params
25.3 K    Total params
0.101     Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=20` reached.


#### Evaluate the model
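
This section is left empty in the notebook. As a minimal sketch of what an evaluation on the held-out test split could compute, the example below derives the accuracy and the confusion matrix from a matrix of class scores (logits). The function name and the toy numbers are made up for illustration; how the logits are collected from the trained model over `test_dataloader` depends on the workshop's model classes and is not shown.

```python
import numpy as np

def evaluate(logits, targets, num_classes=3):
    """Return accuracy and confusion matrix (rows: true class, columns: predicted class)."""
    predictions = logits.argmax(axis=1)
    accuracy = float((predictions == targets).mean())
    confusion = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(confusion, (targets, predictions), 1)   # count each (true, predicted) pair
    return accuracy, confusion

# Toy logits for 5 windows and 3 classes (made-up numbers, not model output)
logits = np.array([
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.3, 0.2, 0.9],
    [1.2, 0.4, 0.1],
    [0.1, 0.3, 0.2],
])
targets = np.array([0, 1, 2, 1, 1])

accuracy, confusion = evaluate(logits, targets)
print(accuracy)    # 0.8
print(confusion)
```

The confusion matrix shows which emotions get mistaken for which, something a single accuracy number cannot tell.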

### <span style="color:cyan">Option 2: Data mapped to 2D grid and Convolutional Neural Network</span>

#### Import Libraries

In [41]:
from torch.utils.data import DataLoader
from dataset_v2 import SEEDDataset
from CNN_model import CCNN, CCNNModel
import pytorch_lightning as pl
from torchsummary import summary 

#### Load preprocessed data

Define window length, batch size, and the path to the preprocessed data files

In [42]:
window_length = 1000 # possible values: [200, 1000, 2000]
batch_sz = 16 # possible values: [8,16,32,64,128,256,512]

dataset_type = "grid" # possible values: ["concatenated", "grid"]

path_to_training_data = f"../data/processed_data_{dataset_type}/train_data_processed_win{window_length}_{dataset_type}.h5"
path_to_validation_data = f"../data/processed_data_{dataset_type}/valid_data_processed_win{window_length}_{dataset_type}.h5"
path_to_test_data = f"../data/processed_data_{dataset_type}/test_data_processed_win{window_length}_{dataset_type}.h5"

Load the data

In [43]:
train_dataset = SEEDDataset(
    path_to_preprocessed=path_to_training_data,
    split="train"
)

val_dataset = SEEDDataset(
    path_to_preprocessed=path_to_validation_data,
    split="validation"
)

test_dataset = SEEDDataset(
    path_to_preprocessed=path_to_test_data,
    split="test"
)

Loading preprocessed train data at ../data/processed_data_grid/train_data_processed_win1000_grid.h5...
Loading preprocessed validation data at ../data/processed_data_grid/valid_data_processed_win1000_grid.h5...
Loading preprocessed test data at ../data/processed_data_grid/test_data_processed_win1000_grid.h5...


Create Data Loaders <small>(objects that automatically prepare data to be fed into the network)</small>

In [44]:
train_dataloader = DataLoader(train_dataset, batch_size=batch_sz, shuffle=True, num_workers=4)
val_dataloader = DataLoader(val_dataset, batch_size=batch_sz, shuffle=False, num_workers=4)
test_dataloader = DataLoader(test_dataset, batch_size=batch_sz, shuffle=False, num_workers=4)

#### Define the model

Specify model parameters

In [54]:
model_params = {
    "input_ch": 4, # BETTER LEAVE LIKE THIS
    "grid_size": (9, 9), # BETTER LEAVE LIKE THIS
    "num_classes": 3, # BETTER LEAVE LIKE THIS
    "output_ch":  128, # possible values: [8, 16, 32, 64, 128]
    "kernel_size":  4, # possible values: [3, 4, 5]
    "hidden_size":  1024, # possible values: [128, 256, 512, 1024]
    "dropout_prob": 0.25, # possible values: any in the interval [0, 0.8] 
}

Build the Neural Network to inspect its structure *(not needed for training)*

In [55]:
cnn_model = CCNNModel(**model_params)

In [56]:
summary(
    model=cnn_model, 
    input_size=(4, 9, 9),
    batch_size=batch_sz, 
    device="cpu"
)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [16, 128, 9, 9]           8,320
              ReLU-2            [16, 128, 9, 9]               0
            Conv2d-3            [16, 256, 9, 9]         524,544
              ReLU-4            [16, 256, 9, 9]               0
            Conv2d-5            [16, 512, 9, 9]       2,097,664
              ReLU-6            [16, 512, 9, 9]               0
            Conv2d-7            [16, 128, 9, 9]       1,048,704
              ReLU-8            [16, 128, 9, 9]               0
            Linear-9                 [16, 1024]      10,617,856
             SELU-10                 [16, 1024]               0
        Dropout2d-11                 [16, 1024]               0
           Linear-12                    [16, 3]           3,075
Total params: 14,300,163
Trainable params: 14,300,163
Non-trainable params: 0
-------------------------
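
The numbers in this summary can be reproduced by hand. A 2D convolution has `kernel_size * kernel_size` weights for every input-output channel pair plus one bias per output channel, and the first linear layer receives the flattened `128 x 9 x 9` feature map. The sketch below redoes the arithmetic, with the channel widths read off the summary above.

```python
def conv2d_params(c_in, c_out, kernel_size):
    return c_in * c_out * kernel_size * kernel_size + c_out   # weights + biases

def linear_params(n_in, n_out):
    return n_in * n_out + n_out                               # weights + biases

kernel_size = 4
grid_h, grid_w = 9, 9
hidden_size, num_classes = 1024, 3
channels = [4, 128, 256, 512, 128]   # input channels, then the output of each Conv2d

counts = [
    conv2d_params(c_in, c_out, kernel_size)
    for c_in, c_out in zip(channels, channels[1:])
]
flattened = channels[-1] * grid_h * grid_w                    # 128 * 9 * 9 = 10368
counts.append(linear_params(flattened, hidden_size))
counts.append(linear_params(hidden_size, num_classes))

print(counts)        # [8320, 524544, 2097664, 1048704, 10617856, 3075]
print(sum(counts))   # 14300163
```

Almost three quarters of the parameters sit in the first linear layer, so `hidden_size` is the setting that affects the model size the most.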



Specify Training Parameters

In [57]:
training_params = {    
    "lr": 5e-4, # possible values: any in the interval [1e-6, 1e-2]
    "betas": [0.9, 0.99], # BETTER LEAVE LIKE THIS
    "weight_decay": 1e-6, # BETTER LEAVE LIKE THIS
    "epochs": 20, # possible values: any (but be careful to check for overfitting!!)
    "lr_patience": 3 # possible values: no more than 1/3 of `epochs` is a reasonable value
}

Build the actual trainable model

In [58]:
cnn_net = CCNN(
    model_parameters=model_params,
    **training_params
)

#### Train the model

Some useful stuff for training (BETTER LEAVE LIKE THIS)

In [59]:
best_checkpoint_callback = pl.callbacks.ModelCheckpoint(
    save_top_k=1,
    monitor="val_loss",
    mode="min",
    filename="best_checkpoint",
)

last_checkpoint_callback = pl.callbacks.ModelCheckpoint(
    save_last=True,
    filename="last_checkpoint_at_{epoch:02d}",
)

lr_monitor = pl.callbacks.LearningRateMonitor(logging_interval='epoch')

It is time to train

In [60]:
trainer = pl.Trainer(
    accelerator="gpu",
    max_epochs=training_params["epochs"], 
    callbacks=[best_checkpoint_callback, last_checkpoint_callback, lr_monitor])
trainer.fit(cnn_net, train_dataloader, val_dataloader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name     | Type             | Params
----------------------------------------------
0 | loss_fun | CrossEntropyLoss | 0     
1 | model    | CCNNModel        | 14.3 M
----------------------------------------------
14.3 M    Trainable params
0         Non-trainable params
14.3 M    Total params
57.201    Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

  rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown...")


### It is your turn to experiment!
Try to train and evaluate your own model. <br>

You can customize:
<p style="font-size:24px">
- Window Size <br>
- <span style="color:magenta"><em>Concatenated</em></span> vs <span style="color:cyan"><em>Mapped to Grid</em></span> data <br>
- <span style="color:magenta"><em>FFNN</em></span> vs <span style="color:cyan"><em>CNN</em></span> model <br>
- Model parameters <br>
- Training parameters <br>
</p>

Notice that next to every parameter there is a comment with the suggested range to get reasonable results. <br>
There are a few parameters that are better left untouched (they are marked with *BETTER LEAVE LIKE THIS*). <br>
Finally, it is important to remark that the *Concatenated* data work only with the *FFNN* model, whereas the *Mapped to Grid* data work only with the *CNN*.

<p style="font-size:24px; color:orange">GOOD LUCK WITH YOUR TRAININGS!!</p>

#### Import Libraries

In [None]:
from torch.utils.data import DataLoader
from dataset_v2 import SEEDDataset
from CNN_model import CCNN
from FFNN_model import EEGFeedForwardNet
import pytorch_lightning as pl
from torchsummary import summary 

#### Load preprocessed data

Define window length, batch size, and the path to the preprocessed data files

Load the data

Create Data Loaders <small>(objects that automatically prepare data to be fed into the network)</small>

#### Define the model

Specify model parameters

Specify Training Parameters

Build the actual trainable model

#### Train the model

Some useful stuff for training (BETTER LEAVE LIKE THIS)

It is time to train