# Preprocessing EEG data
In the last tutorial, we saw that the raw data is hard to read because it is contaminated by various artifacts. In this tutorial, we implement some methods to denoise EEG data, using the same data files as last time.

In [None]:
import mne
import matplotlib.pyplot as plt
import numpy as np
%matplotlib notebook

## Import data
Follow the tutorial from last week to import the raw data files into an `eegdata` variable. There is no need to epoch yet: for the reasons mentioned in the associated lecture, it is better to apply the first preprocessing steps to the full recording. Visualize the PSD once again.

## Filter data
Apply a 3-35 Hz bandpass filter to the raw data (use the method `eegdata.filter(l_freq, h_freq)`, where `l_freq` and `h_freq` are the lower and upper cutoff frequencies in Hz) and look at how the PSD has changed. What do you notice?

Try changing some of the filter parameters (refer to the documentation), in particular the `method` and the `phase` of the filter. Compare the results: list the advantages and drawbacks of each method, and take the computation time into account too.

## Epoching
Extract epochs from -0.5 to 2 seconds relative to the stimulus (markers `10` and `20`).

## Channel rejection and interpolation
Inspect the data to see whether any channels malfunctioned.

In [None]:
#ep.plot(block=True,scalings='auto')

Reject and interpolate channels Fp1, Fp2 and T8:
- List the names of these three channels in `ep.info['bads']`
- Use the `mne.pick_types()` function to reject the bad channels. Set the parameter `exclude` to __[ ]__
- Interpolate these channels with the method `.interpolate_bads(reset_bads=False)`

Before the interpolation, you will need to add the montage, which gives the electrode locations. Since we did not measure them during our recordings, we simply use a template that is provided: the standard 10-20 montage.

In [None]:
montage1020 = mne.channels.make_standard_montage('standard_1020')
ep = ep.set_montage(montage1020)



Visualize the epochs again and look at what has changed.

In [None]:
#ep.plot()

## Epoch rejection
[The tutorial provided by MNE](https://mne.tools/stable/auto_tutorials/preprocessing/plot_20_rejecting_bad_data.html#sphx-glr-auto-tutorials-preprocessing-plot-20-rejecting-bad-data-py) explains epoch rejection based on visual inspection very well. You can also explore the `autoreject` package.

For epoch rejection based on thresholding, use the `.drop_bad()` method. Be sure to apply it to a copy of the `ep` object and not to `ep` directly, as this action is irreversible. Tune the threshold so that up to 10% of epochs are rejected.

## ICA
We have learned that ICA splits the signal into independent sources. It helps identify the remaining EOG artifacts and the heartbeat.
Since we have rejected 3 channels, we can only extract $32-3=29$ independent components.
Create the ICA object using `mne.preprocessing.ICA()`, then fit it to the cleaned epochs object (`ica.fit(cleaned_epochs)`).

Visualize the time series of the components using the `ica.plot_sources(epochs, show_scrollbars=False)` method.

By now you may be able to identify components that are clearly related to the heartbeat.

Now we introduce a new kind of plot: the __topoplots__. They represent activity on a projection of the head shape. They will be particularly useful for looking at activity at precise times or in frequency bands, as we will see in the next tutorials. For now, they will help us understand how the ICs have been built, that is, which combination of channels makes up each component. This can be done using `ica.plot_components()`.

Now you have identified the components that you want to exclude (namely, those showing ECG and EOG activity). To exclude these components:
- List the indices to reject, and run `ica.exclude = [list of indices]`
- Apply the new transform to the epochs data (don't forget to make a copy as this is irreversible), using `ica.apply()`

Now the data should be cleaner. Have a look at what it looks like! For easier representation, average epochs and look at what we obtain.
`evoked = epochs.average()`