## P&O ISSP: Brain-computer interface for steering a directional acoustic zoom

In this notebook, we will start building a basic deep learning implementation that classifies which of the two stimuli was attended to, given the EEG and both stimuli as input.


One way to process the EEG data is to look for specific patterns in the signal. Based on the presence or absence of these patterns, we decide where the auditory attention is directed. Handcrafting these patterns can be difficult, so we will use a convolutional neural network to learn filters that detect them.

The implementation proceeds in multiple phases. First, we will get more familiar with Keras and the deep learning framework by mimicking the linear regression-based network, but in a non-linear context. Once this is implemented, we can start experimenting with deep learning architectures: add some blocks, see what different training schemes do, and so on.

Once we have a working model, we can start to play with the data and see if we can improve the performance. The basic model transforms the EEG to a space where it has to resemble the envelope, and we then evaluate it by calculating the correlation between this representation and both envelopes. Instead of only transforming the EEG, we can try to transform the envelopes as well. In this way, both EEG and envelopes are transformed to a common space, and we can compare how similar they are in this latent space. This gives the model more degrees of freedom to find a representation that is good for both the EEG and the envelopes.
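As a minimal sketch of the comparison step described above, the snippet below picks the envelope that correlates best with a reconstructed signal. The helper names (`pearson_correlation`, `pick_attended`) and the synthetic data are assumptions made for illustration; they are not part of the course code.

```python
import numpy as np


def pearson_correlation(x, y):
    """Pearson correlation between two 1-D signals of equal length."""
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def pick_attended(reconstruction, env_1, env_2):
    """Return 1 if env_1 correlates best with the reconstruction, else 2."""
    r1 = pearson_correlation(reconstruction, env_1)
    r2 = pearson_correlation(reconstruction, env_2)
    return 1 if r1 > r2 else 2


# Synthetic toy data: the "reconstruction" is a noisy copy of envelope 1.
rng = np.random.default_rng(0)
env_1 = rng.standard_normal(640)
env_2 = rng.standard_normal(640)
reconstruction = env_1 + 0.5 * rng.standard_normal(640)

decision = pick_attended(reconstruction, env_1, env_2)
print("Decoded attended stream:", decision)
```

In the notebook, the reconstruction would come from the model's transformed EEG rather than from synthetic data.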



**Note**: If Keras is not already installed, execute: `!pip install keras`. The code cells below call `tf.keras`, which requires TensorFlow (`!pip install tensorflow`).

In [None]:
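# Load required libraries
import numpy as np        # used by batch_equalizer below
import tensorflow as tf   # the model and dataset cells below use tf.keras and tf.data
import keras
from keras.models import Sequential
from keras.layers import Dense, Conv1D, Flatten, Activation
from keras import regularizers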



* The EEG data preprocessing is explained in another tutorial.
* We have already implemented method a), the linear decoder baseline.

**Convolutional baseline network**
* The first step in the model is a convolutional layer: a (64 x 16) spatio-temporal filter (64 channels by 16 time samples) is shifted over the input matrix containing the EEG.
* A rectified linear unit (ReLU) activation function is applied after the convolution step. The kernel size of 16 is chosen because, as in the linear model, we want to look at future EEG to predict the current envelope. The EEG is sampled at fs = 64 Hz, so the kernel covers a temporal window of 16/64 s = 250 ms.
* The output of the convolutional block is a (time-window, 1) signal.
* In the next step, we calculate the cosine similarity between this signal and each of the two envelopes. The cosine similarity is a *dot product* between the two signals after each has been scaled to unit length (in Keras, for example, a `Dot` layer with `normalize=True`).
* As a last step, we have to decide which of the two envelopes is the attended one. We do this with a single neuron (a **dense layer** in Keras) with a sigmoid activation function.
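The steps in this list could be put together roughly as follows. This is a sketch under stated assumptions, not a reference solution: the segment length `time_window = 640`, the use of unpadded convolution with `Cropping1D` to align the envelopes, and the optimizer and loss are all choices made here for illustration.

```python
import numpy as np
import tensorflow as tf

time_window = 640  # assumed segment length: 10 s at 64 Hz
kernel_size = 16   # 16 samples = 250 ms at 64 Hz

eeg = tf.keras.layers.Input(shape=(time_window, 64))
env1 = tf.keras.layers.Input(shape=(time_window, 1))
env2 = tf.keras.layers.Input(shape=(time_window, 1))

# One (64 x 16) spatio-temporal filter followed by a ReLU.
# Without padding, the output has time_window - kernel_size + 1 samples,
# and output sample t is computed from EEG samples t .. t + 15.
eeg_proj = tf.keras.layers.Conv1D(
    filters=1, kernel_size=kernel_size, activation="relu")(eeg)

# Drop the last kernel_size - 1 envelope samples so that envelope sample t
# is compared with the output computed from the EEG at t and later.
crop = tf.keras.layers.Cropping1D(cropping=(0, kernel_size - 1))
env1_cropped, env2_cropped = crop(env1), crop(env2)

# Cosine similarity: dot product over time of L2-normalised signals.
cosine = tf.keras.layers.Dot(axes=1, normalize=True)
cos1 = cosine([eeg_proj, env1_cropped])  # shape (batch, 1, 1)
cos2 = cosine([eeg_proj, env2_cropped])  # shape (batch, 1, 1)

# A single sigmoid neuron decides which envelope was attended.
merged = tf.keras.layers.Flatten()(tf.keras.layers.Concatenate()([cos1, cos2]))
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

baseline = tf.keras.Model(inputs=[eeg, env1, env2], outputs=[out])
baseline.compile(optimizer="adam", loss="binary_crossentropy",
                 metrics=["accuracy"])
```

Calling `baseline.summary()` shows the single convolutional filter and the two-input dense layer.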

**Deep learning model**
* The idea here is the same. We still give the EEG and the envelopes to the model; there are just more processing steps in between before a decision is made.
* We first apply a one-dimensional convolution to the EEG, with 8 output filters. We can interpret this as a kind of non-linear dimensionality reduction, as the resulting EEG has shape (time-window, 8) instead of the original (time-window, 64).
* Next, there are some convolutional blocks. These convolutions are applied to both the EEG and the envelopes. We will have a separate track for the EEG (e.g. 1 convolutional block) and one for the envelopes (e.g. also 1 convolutional block). To keep the model stable and simple, there is only one 'track' for the envelopes: both the attended and the unattended envelope are transformed by the same convolutional block, so the model has to learn to distinguish between the attended and unattended envelope.
* After that, we once again compute the dot product and feed the result into a sigmoid neuron to reach a final decision.
* The possibilities are endless: we can try to add more convolutional blocks, or even add a recurrent layer (LSTM blocks) to the model. What is important is that you start from a simple model and then gradually expand it. This way, if something does not work, it is easier to find the problem or to revert to a simpler model.
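One possible shape for this deeper model is sketched below. Several details are assumptions made here because the text does not fix them: the segment length, `kernel_size=1` for the first 8-filter convolution, and 16 filters with kernel size 16 in each convolutional block. The one thing taken directly from the description is that a single layer instance (`env_block`) is reused for both envelopes.

```python
import numpy as np
import tensorflow as tf

time_window = 640  # assumed segment length: 10 s at 64 Hz

eeg = tf.keras.layers.Input(shape=(time_window, 64))
env1 = tf.keras.layers.Input(shape=(time_window, 1))
env2 = tf.keras.layers.Input(shape=(time_window, 1))

# Non-linear dimensionality reduction: 64 channels -> 8 feature maps.
# kernel_size=1 is an assumption; the text only specifies 8 output filters.
eeg_reduced = tf.keras.layers.Conv1D(8, kernel_size=1, activation="relu")(eeg)

# One convolutional block per track (filter count and kernel size assumed).
eeg_block = tf.keras.layers.Conv1D(16, kernel_size=16, activation="relu")
env_block = tf.keras.layers.Conv1D(16, kernel_size=16, activation="relu")

eeg_feat = eeg_block(eeg_reduced)
# The same layer instance transforms both envelopes, so weights are shared.
env1_feat = env_block(env1)
env2_feat = env_block(env2)

# Normalised dot product over time between every EEG and envelope feature.
cosine = tf.keras.layers.Dot(axes=1, normalize=True)
cos1 = cosine([eeg_feat, env1_feat])  # shape (batch, 16, 16)
cos2 = cosine([eeg_feat, env2_feat])  # shape (batch, 16, 16)

merged = tf.keras.layers.Flatten()(tf.keras.layers.Concatenate()([cos1, cos2]))
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

deep_model = tf.keras.Model(inputs=[eeg, env1, env2], outputs=[out])
```

Because both tracks use the same kernel size without padding, their outputs have the same length and no cropping is needed before the dot product.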





In [None]:

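# `time_window` (samples per segment) and `output_name` (name of the output
# layer) must be defined before running this cell.
eeg = tf.keras.layers.Input(shape=[time_window, 64])
env1 = tf.keras.layers.Input(shape=[time_window, 1])
env2 = tf.keras.layers.Input(shape=[time_window, 1])

# Add model layers
## ---- add your code here ----
# Your code must define `cos1` and `cos2`: the similarity between the
# transformed EEG and each of the two envelopes.

# Classification
out1 = tf.keras.layers.Dense(1, activation="sigmoid")(
    tf.keras.layers.Flatten()(tf.keras.layers.Concatenate()([cos1, cos2])))

# One output value per sample in the batch
out = tf.keras.layers.Reshape([1], name=output_name)(out1)
model = tf.keras.Model(inputs=[eeg, env1, env2], outputs=[out])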



In [None]:
# To check the model summary:
model.summary()

Before we start training the model, we need to make sure that the data is balanced. We give the model both attended and unattended envelopes. If we always put the attended envelope at stream 1 and the unattended envelope at stream 2, the model will quickly figure out that it should always output stream 1, and hence it will not learn anything.

The solution is to present each segment of EEG twice, swapping the two envelopes (and thus the labels) the second time.

In [None]:

def batch_equalizer(eeg, env_1, env_2, labels):
    """Present each EEG segment [bs, window_length, 64] twice, with envelopes and labels swapped."""
    eeg_eq = np.concatenate([eeg, eeg], axis=0)
    env_1_eq = np.concatenate([env_1, env_2], axis=0)
    env_2_eq = np.concatenate([env_2, env_1], axis=0)
    # Binary labels: swapping the envelopes flips 0 <-> 1.
    labels_eq = np.concatenate([labels, (labels + 1) % 2], axis=0)
    return (eeg_eq, env_1_eq, env_2_eq), labels_eq
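
To see what the equalizer does, the small check below applies it to a synthetic batch in which the attended envelope is always in stream 1. The function is restated only so the snippet runs on its own; the batch size, window length and random data are assumptions made for illustration.

```python
import numpy as np


def batch_equalizer(eeg, env_1, env_2, labels):
    """Same logic as the function above, restated so this snippet is self-contained."""
    inputs = (np.concatenate([eeg, eeg], axis=0),
              np.concatenate([env_1, env_2], axis=0),
              np.concatenate([env_2, env_1], axis=0))
    return inputs, np.concatenate([labels, (labels + 1) % 2], axis=0)


bs, window_length = 4, 320  # assumed toy sizes
rng = np.random.default_rng(0)
eeg = rng.standard_normal((bs, window_length, 64))
env_attended = rng.standard_normal((bs, window_length, 1))
env_unattended = rng.standard_normal((bs, window_length, 1))
labels = np.ones((bs, 1), dtype=int)  # stream 1 is always attended

(eeg_eq, env_1_eq, env_2_eq), labels_eq = batch_equalizer(
    eeg, env_attended, env_unattended, labels)

print(eeg_eq.shape)      # batch size has doubled
print(labels_eq.mean())  # half of the labels are now 0
```

After equalizing, the second half of the batch contains the same EEG with the envelopes in the opposite streams, so a model that always answers "stream 1" scores only 50%.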


# Data loading

* As you can see, the total amount of data is quite a few GB. It will most probably not fit in your RAM, so we have to load the data in batches.
* Python generators are a great way to do this.
* Now we prepare our data to train the model.

In [None]:
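# `DataGenerator` is defined in the next cell; run that cell first.
# `files` is the list of recording files to load.
data_generator = DataGenerator(files)

# Create a tf.data dataset from the generator.
# from_generator has to be told what the generator yields. The signature below
# is an assumption: (eeg, env1, env2) as float32 arrays of shape
# (time, 64), (time, 1) and (time, 1).
dataset = tf.data.Dataset.from_generator(
    data_generator,
    output_signature=tuple(
        tf.TensorSpec(shape=(None, channels), dtype=tf.float32)
        for channels in (64, 1, 1)
    ),
)

# Now that you have a dataset, you can perform operations on the fly (using built-in functions such as 'map', 'window', 'batch', etc.), e.g.:
# - window the dataset into slices of (EEG, envelope1, envelope2) of a certain length, with a hop size between consecutive slices
# - batch the data into batches of a certain size
# - shuffle the data
# - create a correct label for each sample (is envelope 1 or envelope 2 attended, i.e. label 1 or 0)

# Using Keras, we can easily train a model on this dataset by passing the dataset to the model.fit() function.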
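
The operations listed in these comments could be chained as in the sketch below. Everything specific here is an assumption made for illustration: `toy_generator` is a hypothetical stand-in for `DataGenerator`, the window length, hop size and batch size are arbitrary, framing is done with `tf.signal.frame` rather than the `window` method, and envelope 1 is assumed to be the attended one before equalizing.

```python
import numpy as np
import tensorflow as tf

WINDOW_LENGTH = 320  # assumed: 5 s at 64 Hz
HOP_LENGTH = 64      # assumed: 1 s hop between consecutive windows
BATCH_SIZE = 16


def toy_generator():
    """Hypothetical stand-in for DataGenerator: yields (eeg, env1, env2) per recording."""
    rng = np.random.default_rng(0)
    for _ in range(2):
        yield tuple(rng.standard_normal((1000, c)).astype(np.float32)
                    for c in (64, 1, 1))


signature = tuple(tf.TensorSpec(shape=(None, c), dtype=tf.float32)
                  for c in (64, 1, 1))
recordings = tf.data.Dataset.from_generator(toy_generator,
                                            output_signature=signature)


def to_windows(eeg, env1, env2):
    """Cut one recording into overlapping windows along the time axis."""
    frames = tuple(tf.signal.frame(x, WINDOW_LENGTH, HOP_LENGTH, axis=0)
                   for x in (eeg, env1, env2))
    return tf.data.Dataset.from_tensor_slices(frames)


def add_label(eeg, env1, env2):
    """Assumes env1 is the attended stream, so every window gets label 1."""
    return (eeg, env1, env2), tf.ones([1], dtype=tf.float32)


def equalize(inputs, labels):
    """Present every window twice, with envelopes and labels swapped."""
    eeg, env1, env2 = inputs
    doubled = (tf.concat([eeg, eeg], axis=0),
               tf.concat([env1, env2], axis=0),
               tf.concat([env2, env1], axis=0))
    return doubled, tf.concat([labels, 1.0 - labels], axis=0)


dataset = (recordings
           .flat_map(to_windows)      # recordings -> individual windows
           .map(add_label)
           .shuffle(buffer_size=256)
           .batch(BATCH_SIZE)
           .map(equalize))            # balance the labels within each batch
```

A dataset built this way yields `((eeg, env1, env2), labels)` batches, which is the structure `model.fit(dataset)` expects for a three-input, one-output model.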


In [None]:

class DataGenerator:
    """Generate data for the Match/Mismatch task."""

    def __init__(
        self,
        files
    ):
        """Initialize the DataGenerator.

        Parameters
        ----------
        files: Sequence[Union[str, pathlib.Path]]
            Files to load.
        """
        self.files = files


    def __len__(self):
        return len(self.files)

    def __getitem__(self, recording_index):
        """Get data for a certain recording.

        Parameters
        ----------
        recording_index: int
            Index of the recording in this dataset to load data for.

        Returns
        -------
        Union[Tuple[tf.Tensor,...], Tuple[np.ndarray,...]]
            The features corresponding to the recording_index recording
        """

        # Load the data
        # prepare the data for the model   ( eeg, env1, env2)
        # return the data


    def __call__(self):
        """Load data for the next recording.

        Yields
        -------
        Union[Tuple[tf.Tensor,...], Tuple[np.ndarray,...]]
            The features corresponding to the recording_index recording
        """
        for idx in range(len(self)):
            yield self[idx]

            if idx == len(self) - 1:
                self.on_epoch_end()

    def on_epoch_end(self):
        """Change state at the end of an epoch."""

        # choose if you want to do something with the data at the end of an epoch
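
As an illustration of how the loading step might be filled in, the sketch below reads one `.npz` file per recording. The class name `NpzDataGenerator`, the `.npz` format and the array keys `"eeg"`, `"env1"` and `"env2"` are all hypothetical; the actual file layout of the course data may differ.

```python
import os
import tempfile

import numpy as np


class NpzDataGenerator:
    """Hypothetical DataGenerator variant that reads one .npz file per recording."""

    def __init__(self, files):
        self.files = list(files)

    def __len__(self):
        return len(self.files)

    def __getitem__(self, recording_index):
        # Assumed file layout: arrays stored under the keys "eeg", "env1", "env2".
        with np.load(self.files[recording_index]) as data:
            eeg = data["eeg"].astype(np.float32)
            env1 = data["env1"].astype(np.float32)
            env2 = data["env2"].astype(np.float32)
        return eeg, env1, env2

    def __call__(self):
        for idx in range(len(self)):
            yield self[idx]


# Write two small synthetic recordings to a temporary folder and read them back.
tmp_dir = tempfile.mkdtemp()
files = []
for i in range(2):
    path = os.path.join(tmp_dir, f"recording_{i}.npz")
    np.savez(path, eeg=np.zeros((100, 64)),
             env1=np.zeros((100, 1)), env2=np.zeros((100, 1)))
    files.append(path)

shapes = [tuple(x.shape for x in recording)
          for recording in NpzDataGenerator(files)()]
print(shapes)
```

Only one recording is held in memory at a time, which is the reason for using a generator when the full data set does not fit in RAM.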

