# Lab 1

Introduction to Labstream and EEG Analysis. This notebook is for you to get exposure to EEG signal analysis and artifact removal.

## EEG Theory

### Electroencephalography

The electroencephalogram, or EEG, is a recording of the biopotentials in the cerebrum of the brain. These potentials are typically recorded at the surface of the scalp and can vary with respect to the emotional, mental, and physiological state of a person. The action potentials and synaptic potentials of individual neurons are too small to be measured by electrodes. Therefore, an EEG is a measurement of the summation of the electrical signals produced by neurons in a defined area and over a specific amount of time. It is important to note that these neurons need not be synchronized but may be producing signals in an asynchronous manner. EEG signals can be categorized by the four major frequency ranges, or brainwaves, in which they occur: alpha, beta, delta, and theta. The corresponding frequencies, amplitudes, and typical human functionality of the waves are shown in the table below.


| Brainwave | Frequency (Hz) | Amplitude (µV) | Human Function |
| --------- | -------------- | -------------- | -------------- |
| alpha | 8 - 13 | 2 - 100 | Awake, Quiet, Resting, Eyes Open |
| beta | 13 - 22 | 5 - 100 | Mental Activity or External Stimulus |
| delta | 0.4 - 4 | 20 - 100 | Sleep |
| theta | 4 - 8 | 10 | Emotional Stress |
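
As a quick way to work with the table, the sketch below maps a frequency to its band name. The helper `classify_band` and the dictionary `BANDS_HZ` are hypothetical names introduced here, and treating each range as half-open (so that 4, 8, and 13 Hz each belong to exactly one band) is an assumption, since the table does not say which band owns a shared edge.

```python
# Band edges in Hz, copied from the table above.
BANDS_HZ = {
    "delta": (0.4, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 22.0),
}


def classify_band(freq_hz):
    """Return the band name for a frequency, or None if it falls outside the table."""
    for name, (low, high) in BANDS_HZ.items():
        # Assumption: half-open intervals, so a shared edge belongs to the higher band.
        if low <= freq_hz < high:
            return name
    return None


print(classify_band(10.0))  # alpha
```

Frequencies above 22 Hz or below 0.4 Hz return `None` because the table does not assign them to a band.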


A special system of electrode placement called the 10-20 system is used during EEG recordings. The 10 and 20 refer to percent distances of the electrodes from each other with respect to the size of the patient's head. The letters F, T, C, P, and O in the 10-20 system refer to frontal, temporal, central, parietal, and occipital, which are essentially lobes of the brain (excluding central, which is not a lobe). Also, even numbers are located on the right hemisphere and odd numbers on the left hemisphere. The letter z is an indicator of the central line of the head.
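
The naming rules above can be sketched as a small parser. The function `describe_electrode` is a hypothetical helper, not part of MNE, and it only covers the letters listed in this section. Labels that use other letters (for example `Fp1`) are deliberately rejected rather than guessed at.

```python
import re

# Region letters described in the text above.
REGIONS = {"F": "frontal", "T": "temporal", "C": "central",
           "P": "parietal", "O": "occipital"}


def describe_electrode(label):
    """Return (region, side) for a 10-20 style label such as 'C3', 'Fz' or 'FC6'."""
    match = re.fullmatch(r"([FTCPO]+)(z|\d+)", label)
    if match is None:
        raise ValueError(f"Unrecognised electrode label: {label!r}")
    letters, suffix = match.groups()

    # Compound labels such as 'FC6' sit between two regions.
    region = "-".join(REGIONS[letter] for letter in letters)

    if suffix == "z":
        side = "midline"   # z marks the central line of the head
    elif int(suffix) % 2 == 0:
        side = "right"     # even numbers: right hemisphere
    else:
        side = "left"      # odd numbers: left hemisphere
    return region, side


print(describe_electrode("C3"))   # ('central', 'left')
print(describe_electrode("FC6"))  # ('frontal-central', 'right')
```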



## Import the CSV
We first need to read in the CSV file containing your data. Please insert the name of your file in the code below:

In [None]:
import pandas as pd
import mne
# Opens up another window with your plot
%matplotlib qt 
# Read in Data
EEGCsv = pd.read_csv('Saved_Data/Your_File_name.csv')
EEGCsv = EEGCsv.drop(columns=['Timestamp','Device_Time'])

# Set channel sampling frequency
fs = 512 
display(EEGCsv)

## EEG Signals
The above code prints out a table containing your collected data. One thing to note is the column names and what they correspond to. We have time, electrode position values, and Event/IsTarget. For now, let's focus on the electrode positions.

The data was collected using the standard 10-20 placement. The code below shows you these positions.


In [None]:
mon = mne.channels.make_standard_montage('standard_1020')
mon.plot(kind='topomap', show_names=True)# 2d Plot

In [None]:
# The window that this opens is interactable, so you can rotate around to get a better idea where each electrode is.
mon.plot(kind='3d', show_names=True)# 3d Plot

We now need to plot signals from our dataset. Unfortunately, some of the electrode positions are incorrectly named and need to be renamed. Our analysis uses the MNE toolbox. Feel free to look at the MNE documentation online.

In [None]:
# Drop Unused Columns
df_mne = EEGCsv.drop(columns=['Time', 'Event', 'IsTarget', 'IsNonTarget', 'IsStartOfNewBlock', 'EndOfRepetitionNumber'])
# Rename Channels
ch_names = list(df_mne.columns)
ch_names[2] = 'FC5'
ch_names[4] = 'FC6'
display(df_mne)

In [None]:
# Create MNE Info to apply to csv data
info = mne.create_info(ch_names, fs, ch_types='eeg')

# Create the Raw Object for MNE
raw = mne.io.RawArray(df_mne.values.transpose(), info)

# Plot the data
raw.plot(n_channels=16)

In [None]:
# You should have gotten a very messy plot, but you can scale the image using the +/- buttons. Let's fix this.

# We can also apply a function to an entire pandas dataframe using the .apply method, which returns a dataframe
# The lambda keyword creates a one-time function definition with input x. Here x is each column of the dataframe (a pandas Series).

df_mne_scaled = df_mne.apply(lambda x: x/1000000)
display(df_mne_scaled)

# Create the Raw Object for MNE
raw = mne.io.RawArray(df_mne_scaled.values.transpose(), info)

# Plot the data
raw.plot(n_channels=16)
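
MNE stores EEG data in volts, which is why the cell above divides by 1,000,000 (this assumes the CSV values are in microvolts). The sketch below uses a small made-up dataframe, `df_uv`, to show that `.apply` with a lambda gives the same result as dividing the whole dataframe directly.

```python
import pandas as pd

# Hypothetical two-channel recording in microvolts (values are made up).
df_uv = pd.DataFrame({"Fz": [12.0, -8.5, 30.0], "Cz": [5.0, 7.5, -2.0]})

# .apply hands each column (a Series) to the lambda, one column at a time.
scaled_apply = df_uv.apply(lambda col: col / 1_000_000)

# Dividing the dataframe directly is vectorised and gives the same values.
scaled_direct = df_uv / 1_000_000

print(scaled_apply.equals(scaled_direct))  # True
```

Either form works for the lab. The direct division is shorter, while `.apply` generalises to functions that need a whole column at once.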


In [None]:
raw.plot_psd(area_mode='range', tmax=10.0, show=False, average=True)

In [None]:
import scipy
import scipy.fftpack
import matplotlib.pyplot as plt
import numpy as np
# Run the following to notch filter at 50 Hz and its harmonics (100, 150, 200, 250 Hz), then view the resulting PSD
raw.notch_filter(np.arange(50, 251, 50))
raw.plot_psd(area_mode='range', tmax=10.0, show=False, average=True)
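
To see which frequencies the notch filter call above targets, the sketch below evaluates the same `np.arange` expression and checks it against the Nyquist frequency for `fs = 512`. The 60 Hz variant is an assumption for recordings made where mains power runs at 60 Hz. Use whichever matches the line noise peak in your own PSD.

```python
import numpy as np

fs = 512            # sampling rate used in this notebook (Hz)
nyquist = fs / 2    # highest frequency the recording can represent (256 Hz)

# Same expression that is passed to raw.notch_filter above.
notch_freqs = np.arange(50, 251, 50)
print(notch_freqs.tolist())  # [50, 100, 150, 200, 250]

# Every notch frequency must sit below the Nyquist frequency.
print(bool(notch_freqs.max() < nyquist))  # True

# Assumed alternative for 60 Hz mains: fundamental plus harmonics below Nyquist.
notch_freqs_60 = np.arange(60, 241, 60)
print(notch_freqs_60.tolist())  # [60, 120, 180, 240]
```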

In [None]:
from mne.viz import plot_topomap
from mne.time_frequency import psd_welch

# FOOOF imports
from fooof import FOOOFGroup
from fooof.bands import Bands
from fooof.analysis import get_band_peak_fg
from fooof.plts.spectra import plot_spectrum

from matplotlib import cm, colors, colorbar

raw.set_montage(mon)

spectra, freqs = psd_welch(raw, fmin=1, fmax=40, tmin=0, tmax=250,
                           n_overlap=150, n_fft=512)

# Initialize a FOOOFGroup object, with desired settings
fg = FOOOFGroup(peak_width_limits=[1, 6], min_peak_height=0.15,
                peak_threshold=2., max_n_peaks=6, verbose=False)

# Define the frequency range to fit
freq_range = [1, 30]

# Fit the power spectrum model across all channels
fg.fit(freqs, spectra, freq_range)

# Define frequency bands of interest
bands = Bands({'theta': [3, 7],
               'alpha': [7, 14],
               'beta': [15, 30]})

# Extract alpha peaks
alphas = get_band_peak_fg(fg, bands.alpha)

# Extract the power values from the detected peaks
alpha_pw = alphas[:, 1]

# Plot the topography of alpha power
plot_topomap(alpha_pw, raw.info, cmap=cm.viridis, contours=0);

In [None]:
# Next, let's check the power spectra for the largest detected peaks within each band.

def check_nans(data, nan_policy='zero'):
    """Check an array for nan values, and replace, based on policy."""

    # Find where there are nan values in the data
    nan_inds = np.where(np.isnan(data))

    # Apply desired nan policy to data
    if nan_policy == 'zero':
        data[nan_inds] = 0
    elif nan_policy == 'mean':
        data[nan_inds] = np.nanmean(data)
    else:
        raise ValueError('Nan policy not understood.')

    return data

fig, axes = plt.subplots(1, 3, figsize=(15, 6))
for ind, (label, band_def) in enumerate(bands):

    # Get the power values across channels for the current band
    band_power = check_nans(get_band_peak_fg(fg, band_def)[:, 1])

    # Extract and plot the power spectrum model with the most band power
    fg.get_fooof(np.argmax(band_power)).plot(ax=axes[ind], add_legend=False)

    # Set some plot aesthetics & plot title
    axes[ind].yaxis.set_ticklabels([])
    axes[ind].set_title('Largest ' + label + ' peak', {'fontsize' : 16})

## Event Related Potentials

Now that we have an idea of how to extract some meaningful information from EEG signals, we need to do some analysis specifically with our dataset. This dataset is part of this [paper](https://hal.archives-ouvertes.fr/hal-02078533v3/document)

Going through this paper, you see that there are symbols that flash on screen. The participant is looking for a specific target symbol, hence the 'IsTarget' and 'IsNonTarget' columns. These stimuli produce what is called a P300 Event Related Potential. We are going to compare these potentials between the target and non-target stimuli.

Each Event has an event number indicating what occurred, such as a target symbol displaying at a specific location on screen.
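
To make the event columns concrete, the sketch below builds a tiny made-up dataframe, `toy`, with the same column names and shows that `groupby(...).get_group(1)` selects the same rows as a boolean mask. The event numbers 65 and 33 are illustrative values only, not taken from your recording.

```python
import pandas as pd

# Hypothetical miniature of the event columns (values are made up).
toy = pd.DataFrame({
    "Time": [0.0, 0.5, 1.0, 1.5],
    "Event": [0, 65, 0, 33],
    "IsTarget": [0, 1, 0, 0],
    "IsNonTarget": [0, 0, 0, 1],
})

# Rows where a target stimulus was shown, selected two ways.
via_groupby = toy.groupby("IsTarget").get_group(1)
via_mask = toy[toy["IsTarget"] == 1]

print(via_groupby["Time"].tolist())  # [0.5]
print(via_mask["Event"].tolist())    # [65]
```

The mask form does not raise an error when no row matches, which can be convenient on recordings that contain no target events.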

In [None]:
# We are going to go back to the original dataset and use the Pandas Groupby method to figure out when the target events occur
EEGCsv.groupby('IsTarget').get_group(1)

In [None]:
# Now for the non-target events
EEGCsv.groupby('IsNonTarget').get_group(1)

In [None]:
# We can now compare an ERP between a target and non-target event.
# We need to slice our dataframe to get the P300 timeframe from these events.
# A P300 is a positive deflection that occurs approximately 300 msec after a triggering event (stimulus)
# Let's see if we can find this peak visually

# We first find a target time by using the .iloc method
TargetTime = EEGCsv.groupby('IsTarget').get_group(1).iloc[0].Time

# Then we can index the TimeFrame for the P300
StartTime = TargetTime - 200*0.001 # Starting at -200 msec
EndTime   = TargetTime + 500*0.001 # Ending at 500 msec

ERPdf = EEGCsv[(EEGCsv['Time'] > StartTime) & (EEGCsv['Time'] < EndTime)]

# Let's look at what the raw data looks like for 1 ERP
display(ERPdf)

# This code will be useful for the lab questions when computing the PSD of the ERPs. The next code sections are better for plotting

### MNE Library and ERPs
We first need to drop columns from our raw data that are not EEG based. Then we can scale and create a raw MNE array for easy plotting.

The next step creates Event Info from our Event column in the original array. We can then use MNE functionality to plot the ERPs between target and non-target events.
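
Before handing the work to MNE, it can help to see the idea behind epoching in plain NumPy. The sketch below is not MNE's implementation. It uses a synthetic single-channel signal with a made-up deflection 300 ms after each hypothetical event, cuts a window from -0.2 s to 0.5 s around every event, drops windows whose peak-to-peak amplitude exceeds a threshold, and averages the rest. All signal values and event positions are assumptions chosen for illustration.

```python
import numpy as np

fs = 512                      # sampling rate in Hz, as in this notebook
tmin, tmax = -0.2, 0.5        # epoch window in seconds relative to the event
rng = np.random.default_rng(0)

# Synthetic recording in volts: background noise only, 10 seconds long.
signal = rng.normal(0.0, 2e-6, 10 * fs)

# Hypothetical event onsets (in samples), one per second.
event_samples = np.arange(fs, 9 * fs, fs)

# Add a smooth positive deflection centred 300 ms after each event.
t_post = np.arange(int(tmax * fs)) / fs
bump = 20e-6 * np.exp(-((t_post - 0.3) / 0.03) ** 2)
for onset in event_samples:
    signal[onset:onset + bump.size] += bump

# Cut one window per event and stack them: shape (n_events, n_times).
start, stop = int(round(tmin * fs)), int(round(tmax * fs))
epochs = np.stack([signal[s + start:s + stop] for s in event_samples])

# Reject windows whose peak-to-peak amplitude is too large, then average.
reject_v = 180e-6
keep = np.ptp(epochs, axis=1) <= reject_v
erp = epochs[keep].mean(axis=0)

times = np.arange(start, stop) / fs
print(f"Kept {keep.sum()} of {len(epochs)} epochs")
print(f"ERP peak near {times[np.argmax(erp)] * 1000:.0f} ms")
```

Averaging across windows shrinks the background noise while the time-locked deflection stays put, which is why the peak becomes easier to see in the average than in any single window.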

In [None]:
# Drop Non-EEG Columns
non_eeg_cols = ['Time', 'Event', 'IsTarget', 'IsNonTarget', 'IsStartOfNewBlock', 'EndOfRepetitionNumber']
df_mne = EEGCsv.drop(columns=non_eeg_cols)
# Rename Channels
ch_names = list(df_mne.columns)
ch_names[2] = 'FC5'
ch_names[4] = 'FC6'

# Create MNE Info to apply to csv data
info = mne.create_info(ch_names, fs, ch_types='eeg')

# Rescale data using the same factor as the earlier scaling cell
df_mne = df_mne.apply(lambda x: x/1000000)

# Create Raw Array (channels x samples)
raw = mne.io.RawArray(df_mne.values.transpose(), info)


# Create Event info and add to raw array
info = mne.create_info(['STI 0'], raw.info['sfreq'], ['stim'])
temp = EEGCsv['Event'].values.transpose()
temp = np.reshape(temp, (1, -1))
stim_raw = mne.io.RawArray(temp, info)
raw.add_channels([stim_raw], force_update_info=True)

# Get events object for easy access
events = mne.find_events(raw, stim_channel='STI 0', verbose=True)


# Plot the data
raw.plot(events = events, n_channels=16)

# Note the vertical lines indicating events

In [None]:
# Add reject threshold for EEG signals
reject = dict(eeg=180e-6)

# Set the epoch window from 0.2 seconds before the stimulus to 0.5 seconds after it
tmin, tmax = -0.2, 0.5

# Get the Target Event values using the Pandas groupby method
targets = EEGCsv.groupby('IsTarget').get_group(1).Event.values
display(targets)

# Set the epoch parameters based on the events in the raw data, the first event Id (65)
epochs_params = dict(events=events, event_id=int(targets[0]), tmin=tmin, tmax=tmax,
                     reject=reject)

# Now we parse our data based on the target events and average across epochs (each channel keeps its own average)
epochs = mne.Epochs(raw, **epochs_params).average()
# Set montage to 10-20
epochs.set_montage(mon)
# Plot EEG data and a topomap
display(epochs.plot())
epochs.plot_topomap()

In [None]:
# We can do the same for the non target events
nontargets = EEGCsv.groupby('IsNonTarget').get_group(1).Event.values
display(nontargets)
epochs_params = dict(events=events, event_id=int(nontargets[0]), tmin=tmin, tmax=tmax,
                     reject=reject)
epochs = mne.Epochs(raw, **epochs_params).average()
epochs.set_montage(mon)
display(epochs.plot())
epochs.plot_topomap()

We now have two plots for target and non-target events. Note the difference between the two graphs.

## Your Turn

The above code allows you to get a feel for some EEG analysis for ERPs. Please use the code above as reference to complete the following:

### All data analysis
The following should be applied to all of your collected data
1. Apply a notch filter
2. Compute the average Alpha, Beta, and Theta Power
3. Plot the topomaps for each band (Alpha, Beta, and Theta)
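
One possible starting point for step 2 is sketched below: average the PSD values that fall inside each band. This is a simpler technique than the FOOOF peak extraction used earlier, and it is only one of several reasonable definitions of "average band power". The helper `band_power`, the synthetic `freqs` and `psd` arrays, and the band edges (taken from the table in the theory section) are assumptions. With real data you would pass in the frequencies and spectra computed from your recording.

```python
import numpy as np


def band_power(freqs, psd, band):
    """Mean PSD inside [low, high) Hz for each channel.

    freqs: 1-D array of frequencies in Hz.
    psd:   2-D array of shape (n_channels, n_freqs).
    band:  (low, high) tuple in Hz.
    """
    low, high = band
    mask = (freqs >= low) & (freqs < high)
    if not mask.any():
        raise ValueError(f"No frequency bins fall inside band {band}")
    return psd[:, mask].mean(axis=1)


# Synthetic spectra for two channels; channel 0 has extra alpha power.
freqs = np.arange(1.0, 41.0)             # 1 to 40 Hz in 1 Hz steps
psd = np.ones((2, freqs.size))
psd[0, (freqs >= 8) & (freqs < 13)] = 5.0

bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 22)}
for name, band in bands.items():
    print(name, band_power(freqs, psd, band))
```

The result is one value per channel for each band, which is the shape a topomap plot expects.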

### Target and Non Target data analysis
Choose a target event id and a non-target event id that are different from the ids used above.

1. Use the notch filtered Raw data from above
2. Compute the average Alpha, Beta, and Theta PSDs for the target and non target event
3. Plot the Raw Averaged Epoched data for the target and non-target event
4. Plot a Topomap for the Raw Averaged Epoched Data for the target and non-target event

 