# **EEG Preprocessing**
The steps to preprocess EEG data include the following:

  - Import the raw data
  - Downsample the data
  - Bandpass filter the data
  - Re-reference the data
  - Run the PREP pipeline to mark bad channels
  - Run independent component analysis (ICA) and reject noisy components
  - Save the preprocessed data

These steps are sometimes done in a different order, or some of them are omitted, depending on the researcher's preference.

The Python package used to run through each of these steps is MNE: https://mne.tools/stable/index.html

### **Data Description**

![](Pictures/Data.png)

### **What is our "Clean" data?**

From the raw data, we identify and remove intervals of special procedures performed on patients during recordings, such as hyperventilation (deep breathing) and photic stimulation (flashing light). 

The recordings also contain intervals with no signal, which result from equipment being turned off or an electrode being disconnected. These flat intervals with zero signal must be avoided as well.

The target slices are therefore taken only from the clean intervals of each EEG recording, excluding flat intervals, hyperventilation, and photic stimulation.
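
As an illustration of how flat intervals could be located, here is a minimal NumPy sketch. The helper name `find_flat_intervals`, the tolerance, and the minimum duration are assumptions made for this example; they are not taken from the pipeline described here.

```python
import numpy as np


def find_flat_intervals(signal, sfreq, min_duration=1.0, tol=1e-12):
    """Return (start_s, end_s) pairs where |signal| <= tol for at least
    min_duration seconds. Hypothetical helper for illustration only."""
    is_flat = np.abs(signal) <= tol
    # Pad with False so every flat run has a rising and a falling edge.
    padded = np.concatenate(([False], is_flat, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first flat sample of each run
    stops = np.flatnonzero(edges == -1)   # first non-flat sample after each run
    min_samples = int(round(min_duration * sfreq))
    return [
        (start / sfreq, stop / sfreq)
        for start, stop in zip(starts, stops)
        if stop - start >= min_samples
    ]


# Synthetic single-channel example: 10 s at 100 Hz with a flat stretch.
sfreq = 100.0
rng = np.random.default_rng(0)
signal = rng.normal(size=1000)
signal[300:600] = 0.0  # simulate a disconnected electrode from 3 s to 6 s

flat = find_flat_intervals(signal, sfreq)
print(flat)
```

The returned intervals can then be excluded when choosing the clean slices.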

#### Time domain plot of the clean data segment (5-second window shown)
![](Pictures/Clean_t.png)

#### Power spectral density of the clean data segment (1 to 100 Hz)
![](Pictures/Clean_psd.png)

### **Resampling**

The sampling frequency of the data was changed from 512 Hz to 500 Hz. 

#### Time domain plot (5 seconds)
![](Pictures/t_resample.png)

#### Power Spectral Density plot
![](Pictures/p_resample.png)

### **Bandpass Filtering the Data**

The data needs to be filtered to remove low-frequency and high-frequency signal, which in scalp EEG often results from environmental or muscle noise and is generally not the focus of analysis. High-pass and low-pass filtering remove content below and above the chosen cut-off frequencies, respectively, so that only the band in between remains in the data.

This can be done using the MNE method `raw.filter()`, for which you must specify the band cut-offs. Typically, depending on your planned analyses, the passband is set to around 1 to 100 Hz for EEG signals and around 1 to 5 Hz for EOG signals.

Bandpass filtering also smooths the data, so the filtered signal typically looks visibly different from the raw signal.

**Note:** the EOG signals change visibly in the PSD plot.

#### Time domain plot (5 seconds)
![](Pictures/t_filter.png)

#### Power Spectral Density plot
![](Pictures/p_filter.png)


### **Applying the PREP Pipeline**

What the PREP Pipeline does:
- Removes line noise (60 Hz) without committing to a filtering strategy
- Robustly references the signal relative to an estimate of the “true” average reference
- Detects and interpolates bad channels relative to this reference
- Marks the interpolated channels as bad channels in the raw EEG data

Further details about the PREP pipeline can be found at: https://www.frontiersin.org/articles/10.3389/fninf.2015.00016/full
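
To give a feel for the "robust" part of the list above, here is a heavily simplified NumPy sketch. It is not the PREP pipeline itself: it flags channels only by an amplitude-based robust z-score (using 0.7413 × interquartile range as a robust standard deviation and a threshold of 5) and then references to the mean of the remaining channels. The function names and the synthetic data are assumptions for illustration.

```python
import numpy as np


def robust_std(x, axis=-1):
    """Robust standard deviation estimate: 0.7413 * interquartile range."""
    q75, q25 = np.percentile(x, [75, 25], axis=axis)
    return 0.7413 * (q75 - q25)


def find_deviant_channels(data, z_thresh=5.0):
    """Flag channels whose robust amplitude is an outlier across channels."""
    amplitude = robust_std(data, axis=1)
    z = (amplitude - np.median(amplitude)) / robust_std(amplitude, axis=0)
    return np.flatnonzero(np.abs(z) > z_thresh)


def robust_average_reference(data, bad_idx):
    """Subtract the average of the good channels from every channel."""
    good = np.setdiff1d(np.arange(data.shape[0]), bad_idx)
    reference = data[good].mean(axis=0)
    return data - reference


# Synthetic data: 16 channels of noise, one of them far too large.
rng = np.random.default_rng(42)
data = rng.normal(scale=10e-6, size=(16, 5000))
data[3] *= 20  # simulate one very noisy channel

bad_idx = find_deviant_channels(data)
referenced = robust_average_reference(data, bad_idx)
print("Channels flagged as bad:", bad_idx)
```

Excluding the flagged channel before averaging keeps its noise from leaking into every other channel through the reference, which is the motivation behind robust referencing.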

#### Time domain plot (5 seconds)
![](Pictures/t_prep.png)

#### Power Spectral Density plot 
![](Pictures/p_prep.png)

### **Running ICA**

**Why use ICA?**

Many electrodes simultaneously record a mixture of signals from many sources: blinks, heartbeats, activity in different areas of the brain, muscular activity from jaw clenching or swallowing, and so on. ICA aims to separate these sources and then reconstruct the sensor signals after excluding the unwanted ones.

Hence, the next step after cleaning is to run the data through Independent Component Analysis. This allows you to reject components that seem to be heavily influenced by movement-related artifacts from blinking or from jaw, neck, arm, or upper back movement.

The main thing to look out for is hotspots of activity around the edges of the topomap. Blink artifacts are typically the easiest to identify this way, as they appear near the very front of the topomap (around the frontal channels). Jaw artifacts appear at the very sides of the topomap. Otherwise, highly concentrated spots of activity are typically an artifact of a single noisy channel.


**Steps to remove artifacts**

- Filtering to remove slow drifts
- Fitting and plotting the solution
- Using EOG channel to select which ICA components to remove
- Visualising plots before and after applying ICA

Notice that some of the labels are automatically greyed out. This is because MNE can auto-detect which ICs result from artifacts such as eye movements or heartbeats when given information about the EOG and ECG electrodes. The detection is imperfect, so do not trust it blindly.

![](Pictures/ica1.png)

#### ICA Components (EOG)
![](Pictures/ica_eog.png)

#### Time domain plot before and after applying ICA (5 seconds)
Before Applying ICA             |  After Applying ICA
:-------------------------:|:-------------------------:
![](Pictures/t_ica_eog_b.png)  |  ![](Pictures/t_ica_eog_a.png)

#### Power Spectral Density plot before and after applying ICA
Before Applying ICA             |  After Applying ICA
:-------------------------:|:-------------------------:
![](Pictures/p_ica_eog_b.png)  |  ![](Pictures/p_ica_eog_a.png)


### **Applying Current Source Density or Laplacian Transform**

**Why CSD?**

A single neural source contributes to the EEG at many electrodes. The Laplacian highlights local features and acts as a spatial filter: it minimizes the contribution of deep and distant sources so that each electrode reflects a smaller region of the brain.

![](Pictures/csd.png)


#### Time domain plot (5 seconds)
![](Pictures/t_csd.png)
