# EEG preprocessing 

In this notebook: 
- Necessary imports
- Data loader for events, EEG and metadata
- Filtering algorithm
- EEG raw to epochs
- Epochs to evoked responses (ERPs)
- Averaging code for ERPs

Not working:
- Filtering algorithm
- ERPs based on channel

Missing:
- Grand average 
- Mismatch response
- Turning data into pandas dataframe

## Imports

The data is processed using the mne library. In addition, eegyolk provides libraries for loading the metadata, the EEG data and the event markers. These libraries need to be imported.

In [1]:
import mne      # toolbox for analyzing and visualizing EEG data
import os       # using operating system dependent functionality (folders)
import pandas as pd # data analysis and manipulation
import numpy as np    # numerical computing (manipulating and performing operations on arrays of data)
import copy     # copy and deepcopy objects so the original stays untouched
from ipywidgets import IntSlider, Output
import ipywidgets as widgets
from IPython.display import display
import matplotlib.pyplot as plt
from math import nan

import sys
sys.path.insert(0, '../eegyolk') # path to helper functions
from eegyolk import helper_functions as hf # library useful for eeg and erp data cleaning
from eegyolk import initialization_functions #library to import data
from eegyolk import epod_helper

## Load metadata and eeg files

First, the paths to the different datasets need to be defined. There are three paths: EEG, metadata and events. The files can be loaded using the initialization_functions library. All event markers need to be saved in a separate folder. If they have not been saved already, they will be saved using the initialization_functions library. The data must be stored in a separate folder called "epod_data_not_pushed" in the ePodium repository.

In [2]:
path_metadata = os.path.join('../epod_data_not_pushed','metadata')
path_eeg = os.path.join('../epod_data_not_pushed','not_zip')
path_eventmarkers = os.path.join('../epod_data_not_pushed','not_zip', 'event_markers')

In [3]:
# load metadata
files_metadata = ["children.txt", "cdi.txt", "parents.txt", "CODES_overview.txt"]  
children, cdi, parents, codes = initialization_functions.load_metadata(path_metadata, files_metadata)

In [4]:
# load eeg
! echo 1 > /proc/sys/vm/overcommit_memory
eeg, eeg_filename =  initialization_functions.load_dataset(path_eeg, preload=False) # preload must be set to True once on the cloud

119 EEG files loaded


In [6]:
print(type(eeg[1]))

<class 'mne.io.edf.edf.RawEDF'>


Unless you run the above cell on Linux (natively or in a virtual machine), I'm not sure what the following command will do:
    ! echo 1 > /proc/sys/vm/overcommit_memory
Please explain what you are trying to achieve with it.

Also, this is not really safe on Linux, as it might kill processes if the memory limit is reached.
It could also start swapping. Just don't go there...

In [None]:
# load events 
# check if event markers are already saved in a separate folder
# (os.listdir fails on a missing folder, so test for its existence first)
if not os.path.isdir(path_eventmarkers) or len(os.listdir(path_eventmarkers)) == 0:
    initialization_functions.save_event_markers(path_eventmarkers, eeg, eeg_filename) # save event markers

event_markers = initialization_functions.load_events(path_eventmarkers, eeg_filename) # load event markers
event_markers_simplified = epod_helper.group_events_12(event_markers) # simplify events

The cell above breaks for me because I do not have events for every file path. The function should probably be rewritten to account for this possibility.
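
One way to make the loading step tolerant of missing event files is to check, per recording, whether an event file exists before trying to load it. This is a minimal sketch, assuming (hypothetically) that each event file is named after its recording with an `_events.txt` suffix; the naming actually used by `initialization_functions` may differ, and `split_by_available_events` is not part of eegyolk.

```python
import os

def split_by_available_events(eeg_filenames, event_dir, suffix="_events.txt"):
    """Split recordings into those with and without an event file.

    The naming scheme (<recording name> + suffix) is an assumption made
    for this sketch, not the documented eegyolk convention.
    """
    with_events, without_events = [], []
    for name in eeg_filenames:
        event_path = os.path.join(event_dir, name + suffix)
        if os.path.isfile(event_path):
            with_events.append(name)
        else:
            without_events.append(name)
    return with_events, without_events
```

The recordings that end up in the second list could then be skipped or reported, instead of raising an error halfway through loading.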

## Data info

Choose which participant you want to view in the box below. 

In [None]:
index = widgets.IntText(
    step=1,
    description='Participant',
    disabled=False
)
widgets.VBox([index])

In [None]:
index = int(index.value)

In [None]:
# set the standard 10-20 montage on every recording (needed to plot the sensors)
montage = mne.channels.make_standard_montage('standard_1020')
#montage.plot(kind='topomap', show_names=True)
for i in range(len(eeg)):
    eeg[i].info.set_montage(montage, on_missing='ignore')


In [None]:
eeg[index].plot_sensors(ch_type='eeg', show_names=True)

In [None]:
drop_ch = ['EXG2', 'EXG3', 'EXG4', 'EXG5', 'EXG6', 'EXG7', 'EXG8']
eeg[index].drop_channels(drop_ch)

In [None]:
eeg[index].plot(duration=0.1)

In [None]:
%matplotlib inline 
fig = mne.viz.plot_events(event_markers_simplified[index], event_id = epod_helper.event_dictionary)

## Filtering

In [None]:
# bad channel remover
def removebadchannel(eeg):
    for i in range(len(eeg)):
        if len(eeg[i].info['bads']) != 0:
            # mne.pick_types only returns channel indices; pick the good EEG channels on the recording itself
            eeg[i] = eeg[i].pick('eeg', exclude='bads')
    return eeg
removebadchannel(eeg)

In [None]:
#eeg[index] = mne.preprocessing.annotate_nan(eeg[index])

In [None]:
lowpass = widgets.IntText(
    step=0.1,
    description='lowpass:',
    disabled=False
)

highpass = widgets.IntText(
    step=0.1,
    description='highpass:',
    disabled=False
)

widgets.VBox([lowpass,highpass])


In [None]:
# change type to integer
lowpass = int(lowpass.value)
highpass = int(highpass.value)

I would preset the filter values to something that makes sense, and limit the range so that values that don't make sense cannot be chosen. Right now I can filter to -3 and -900.
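
A small check along these lines could reject nonsensical cutoffs before they reach the filter. It is only a sketch: `validate_band` is a hypothetical helper, and it assumes that `l_freq` is the lower edge of the pass band, `h_freq` the upper edge, and that the sampling frequency is known.

```python
def validate_band(l_freq, h_freq, sfreq):
    """Return (l_freq, h_freq) as floats if they describe a usable pass band.

    l_freq is the lower edge (high-pass cutoff) and h_freq the upper edge
    (low-pass cutoff), both in Hz; sfreq is the sampling frequency in Hz.
    """
    l_freq, h_freq = float(l_freq), float(h_freq)
    nyquist = sfreq / 2.0
    if l_freq <= 0:
        raise ValueError(f"lower edge must be positive, got {l_freq} Hz")
    if h_freq >= nyquist:
        raise ValueError(f"upper edge must be below Nyquist ({nyquist} Hz), got {h_freq} Hz")
    if l_freq >= h_freq:
        raise ValueError(f"lower edge ({l_freq} Hz) must be below upper edge ({h_freq} Hz)")
    return l_freq, h_freq
```

On the widget side, ipywidgets also offers bounded inputs such as `BoundedFloatText` with `min` and `max` arguments, which would prevent such values from being entered in the first place.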

The bandpass filter doesn't work due to memory issues. This can be fixed by setting preload = True in load_data() in helper_functions.py. However, Jupyter then crashes due to lack of memory. This needs to be fixed. The data contains power line noise, which disrupts the rest of the signal. The notch filter removes the power line frequency of 50 Hz. It does not work either, because of the same preload memory issue.
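
To get a feeling for why preloading everything exhausts memory, the size of one recording can be estimated from its dimensions. This is a rough sketch, assuming 8 bytes per sample (64-bit floats, which MNE uses by default for preloaded data); the channel count, duration and sampling rate below are hypothetical and should be replaced by the values of the actual recordings.

```python
def preload_size_gb(n_channels, n_samples, bytes_per_sample=8):
    """Approximate RAM needed to hold one recording as a dense array."""
    return n_channels * n_samples * bytes_per_sample / 1024 ** 3

# Hypothetical recording: 40 channels, 30 minutes at 2048 Hz.
n_samples = 30 * 60 * 2048
size = preload_size_gb(40, n_samples)
print(f"{size:.2f} GB per recording, {119 * size:.0f} GB for 119 recordings")
```

If the estimate for all files together exceeds the available RAM, preloading the whole dataset at once cannot work, whatever the kernel memory settings are.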

In [None]:
# bandpass, notch and bad channel filter
freqs = (60, 120, 180, 240)

def filter_eeg(eeg, lowpass, highpass, freqs):
    for i in range(len(eeg)): eeg[i] = hf.band_pass_filter(eeg[i].get_data(), lowpass, highpass) # bandpass filter
        break
    for i in range(len(eeg)): # remove bad channels
        if len(eeg[i].info['bads']) != 0:
            eeg[i] = mne.pick_types(eeg[i].info, meg=False, eeg=True, exclude='bads')
        break
    for i in range(len(eeg)): eeg[i] = notch_filter(eeg[i].get_data(), freqs=freqs) # notch filter
        break
    return eeg

eeg = filter_eeg(eeg, lowpass, highpass, freqs)

Put the body of each loop on a separate line from the loop header (lines 4 and 11).

In [None]:
freqs = (60, 120, 180, 240)

def filter_eeg(eeg, lowpass, highpass, freqs):
    for i in range(len(eeg)): 
        eeg[i] = hf.band_pass_filter(eeg[i].get_data(), lowpass, highpass) # bandpass filter
        break
    for i in range(len(eeg)): # remove bad channels
        if len(eeg[i].info['bads']) != 0:
            eeg[i] = mne.pick_types(eeg[i].info, meg=False, eeg=True, exclude='bads')
        break
    for i in range(len(eeg)): 
        eeg[i] = notch_filter(eeg[i].get_data(), freqs=freqs) # notch filter
        break
    return eeg

eeg = filter_eeg(eeg, lowpass, highpass, freqs)

After reformatting this so that it makes sense, it still didn't work, because the function you call takes the result of mne.io.read_raw_bdf(something...), and you manipulated eeg many times afterwards.

Please always clear your kernel and then run your entire notebook from top to bottom, or you will make something that is impossible for me to even evaluate. I'm going to stop evaluating here. Please let me know when you have this organized as a ready-to-run notebook.
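
As a possible direction, the filtering could operate on each Raw object directly instead of on the arrays returned by `get_data()`. The sketch below uses the `load_data`, `filter` and `notch_filter` methods of MNE Raw objects; `filter_recording` itself is a hypothetical helper, not part of helper_functions.py. The text mentions 50 Hz power line noise while the cell uses multiples of 60 Hz, so the line frequency is a parameter here and its default of 50 Hz is an assumption.

```python
def filter_recording(raw, l_freq, h_freq, line_freq=50.0, n_harmonics=4):
    """Band-pass and notch filter a single recording in place.

    raw is expected to be an MNE Raw object; l_freq and h_freq are the
    lower and upper pass-band edges in Hz.
    """
    raw.load_data()  # filtering needs the data in memory
    raw.filter(l_freq=l_freq, h_freq=h_freq)  # band-pass filter
    # power line frequency and its harmonics, e.g. 50, 100, 150 and 200 Hz
    freqs = [line_freq * k for k in range(1, n_harmonics + 1)]
    raw.notch_filter(freqs=freqs)
    return raw
```

Calling this in a loop, for example `for raw in eeg: filter_recording(raw, 1, 40)`, still keeps every filtered recording in memory, so with many files it may be necessary to save or epoch each result and release it before loading the next one.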

Below is a working filter, although it is not imported from helper_functions.py. This is a band-pass filter with user-defined cutoff frequencies. The filter is useful for limiting the bandwidth of the output signal to reduce noise.

In [None]:
# plotting filter
filter_params = mne.filter.create_filter(eeg[index].get_data(), eeg[index].info['sfreq'],
                                         l_freq=lowpass, h_freq=highpass)
mne.viz.plot_filter(filter_params, eeg[index].info['sfreq'], flim=(0.01, 5))

## Creating epoched data

Epochs are created by combining the EEG data with specific events. tmin and tmax are the start and stop times relative to each event. mne.Epochs automatically applies a baseline correction.
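
The idea behind epoching can be illustrated with plain NumPy. This is a conceptual sketch, not how mne.Epochs is implemented: `cut_epochs` is a hypothetical function that cuts a window from tmin to tmax around each event sample and subtracts the mean of the interval from tmin up to the event, assuming tmin is not positive.

```python
import numpy as np

def cut_epochs(data, event_samples, sfreq, tmin=-0.3, tmax=0.7):
    """Cut fixed-length windows around events and baseline-correct them.

    data: array of shape (n_channels, n_samples)
    event_samples: sample indices at which the events occur
    Returns an array of shape (n_epochs, n_channels, n_times).
    Assumes tmin <= 0 so that a pre-stimulus baseline exists.
    """
    start_offset = int(round(tmin * sfreq))  # negative: samples before the event
    stop_offset = int(round(tmax * sfreq))   # samples after the event
    n_baseline = -start_offset + 1           # samples from tmin up to the event
    epochs = []
    for sample in event_samples:
        start, stop = sample + start_offset, sample + stop_offset + 1
        if start < 0 or stop > data.shape[1]:
            continue  # skip events too close to the edges of the recording
        window = data[:, start:stop].astype(float)
        # baseline correction: subtract the mean of the pre-stimulus interval
        baseline = window[:, :n_baseline].mean(axis=1, keepdims=True)
        epochs.append(window - baseline)
    return np.array(epochs)
```

With tmin = -0.3 and tmax = 0.7, as used in this notebook, each epoch spans one second of data around its event.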

In [None]:
event_dictionary = epod_helper.event_dictionary
event_dictionary

In [None]:
epochs = hf.create_epochs(eeg, event_markers_simplified, -0.3, 0.7)

In [None]:
evoked = hf.evoked_responses(epochs, event_dictionary)

In [None]:
for i, (name, code) in enumerate(event_dictionary.items()):
    evoked[index][i].plot(spatial_colors=True, exclude='bads')
    print((name, code))

In [None]:
channelnames = epochs[1].ch_names

In [None]:
evoked[index][1].plot_joint()

In [None]:
c1 = mne.grand_average(evoked[index])
c1.plot(spatial_colors=True)

In [None]:
std_evoked = epochs[index][2,5,8,11].average()
dev_evoked = epochs[index][3,6,9,12].average()

In [None]:
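# difference wave: standard minus deviant (weights 1 and -1)
difference = mne.combine_evoked([std_evoked, dev_evoked], weights=[1, -1])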
evokeds = dict(standard=std_evoked, deviant=dev_evoked, difference=difference)
mne.viz.plot_compare_evokeds(evokeds, combine='mean')

## Create pandas dataset out of epoched data

In [None]:
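epochs = epochs[0:4] # subset to test function
def create_pd_df(epochs):
    frames = []

    for i in range(len(epochs)):
        df = epochs[i].to_data_frame()
        df['index'] = i # position of the recording in the epochs list
        frames.append(df)
    # DataFrame.append was removed in pandas 2.0; concatenate all frames once instead
    return pd.concat(frames)

# keep the result so it can be inspected in the next cell
df_epochs = create_pd_df(epochs)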

In [None]:
df_epochs