# Assignment 3: EEG and Event-Related Potentials
Please submit this assignment to Canvas as a Jupyter notebook (.ipynb). The assignment introduces you to EEG brain data and to some of the techniques and methods used to analyze it.

In [1]:
# imports
import pandas as pd
import cmlreaders as cml
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

## Working with EEG data
In Assignment 1 we learned how to load basic information about CML experiments and experimental events. Next, we will load the EEG/iEEG data that correspond to those events.

## Question 1: Electrodes in the Brain
For this question, we will use the term "electrode" to refer to individual contacts, as opposed to bipolar referenced virtual "electrodes".  In other words, use the contacts dataframe (not the pairs dataframe) to answer the following.

1) How many electrodes does R1111M have placed in the temporal cortex?
2) How many electrodes does R1111M have on the left hemisphere and how many on the right hemisphere?

In [5]:
# Question 1.1
### YOUR CODE HERE

In [6]:
# Question 1.2
### YOUR CODE HERE

## Question 2
Now we will compare z-scored voltage traces for encoded words that were subsequently recalled vs. not recalled. This kind of "Subsequent Memory Effect" (SME) analysis is of great interest because it lets us ask which neural features underlie successful memory.

1) Plot the Z-scored ERP for a 2-second span (-400 ms to 1600 ms) surrounding each word encoding event for a lateral occipital electrode from R1383J's first FR1 session. We want two lines on our plot: one averaging over words that were subsequently recalled, and one averaging over words that were not later recalled.
* Add vertical and horizontal lines on your plot to indicate word onset and Z = 0, respectively.
* Add a legend and label each line (Rec v. NRec)


2) Does the occipital electrode show an effect?
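
The recalled vs. not-recalled contrast can be sketched on synthetic data before touching the real recordings. Everything below is an assumption for illustration: the array sizes, the 500 Hz sampling rate, the random `recalled` flags, and the choice to z-score against the mean and standard deviation of all samples from the electrode. It is not the required solution, only one way the averaging step might look.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one electrode: 40 encoding events x 1000 samples.
# Assumed 500 Hz sampling, so the 2 s span from -400 ms is 1000 samples.
n_events, n_samples = 40, 1000
times_ms = np.arange(n_samples) * 2.0 - 400.0   # 2 ms per sample
voltage = rng.normal(loc=5.0, scale=20.0, size=(n_events, n_samples))
recalled = rng.integers(0, 2, size=n_events).astype(bool)  # hypothetical recall flags

# Assumption: z-score against all samples recorded from this electrode.
z = (voltage - voltage.mean()) / voltage.std()

# One ERP per condition: average over events, keep the time axis.
erp_rec = z[recalled].mean(axis=0)
erp_nrec = z[~recalled].mean(axis=0)

# Index of the sample closest to word onset (0 ms).
onset_idx = int(np.argmin(np.abs(times_ms)))
```

To plot the result, `plt.plot(times_ms, erp_rec, label='Rec')` and the matching call for `erp_nrec` draw the two lines, while `plt.axvline(0)` and `plt.axhline(0)` mark word onset and Z = 0.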

In [17]:
# Question 2.1
### YOUR CODE HERE

Question 2.2

**YOUR ANSWER HERE**

## Event Related Potentials (ERPs)

For the remaining exercises in this assignment, we will analyze neural responses to events using Event Related Potentials (ERPs). You will work with data from a single electrode, electrode 75 (Oz, labeled 'E75'), in a single subject, LTP093, from the LTPFR2 experimental sessions. In this and all subsequent exercises, you will analyze the time series of data surrounding each word presentation, from 200 ms prior to word onset until 1,000 ms after word onset. To remove electrical signals that do not reflect neural activity, we use the Localized Component Filtering (LCF) method (DelPozo-Banos & Weidemann, 2017). To load the LCF-cleaned data, pass the `clean=True` argument to the CMLReader's `load_eeg()` method.

A few points:

* The LTPFR2 data set you'll be working with for this assignment is a *scalp EEG* data set and has slightly different column definitions than intracranial EEG sessions.  For example, the 'trial' column indicates the list number of the event.
* If you use array indexing to access electrode 75 (as opposed to e.g., xarray indexing), make sure to double check the index for this contact.
* Load the scalp EEG data with `load_eeg()` by setting `clean=True` and omitting the `scheme` argument. The scheme is needed only for intracranial data, where it indicates whether monopolar or bipolar-rereferenced EEG should be loaded and for which contacts or bipolar pairs.
* Make sure to subset the events to just the WORD or REC_WORD events before loading the EEG. CMLReader won't load overlapping EEG intervals (i.e., epochs) surrounding events, which is what happens if you try to load EEG surrounding all the behavioral events. Here is some example code:

In [18]:
exp = 'ltpFR2'                      # LTPFR2 experiment
df = cml.get_data_index('ltp')
df = df.query("experiment == @exp")     # select only LTPFR2 sessions
df

| Unnamed: 0 | all_events | experiment | import_type | math_events | original_session | session | subject | subject_alias | task_events |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 487 | protocols/ltp/subjects/LTP093/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP093/experiments/ltpF... | 0 | 0 | LTP093 | LTP093 | protocols/ltp/subjects/LTP093/experiments/ltpF... |
| 488 | protocols/ltp/subjects/LTP093/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP093/experiments/ltpF... | 1 | 1 | LTP093 | LTP093 | protocols/ltp/subjects/LTP093/experiments/ltpF... |
| 489 | protocols/ltp/subjects/LTP093/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP093/experiments/ltpF... | 10 | 10 | LTP093 | LTP093 | protocols/ltp/subjects/LTP093/experiments/ltpF... |
| 490 | protocols/ltp/subjects/LTP093/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP093/experiments/ltpF... | 11 | 11 | LTP093 | LTP093 | protocols/ltp/subjects/LTP093/experiments/ltpF... |
| 491 | protocols/ltp/subjects/LTP093/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP093/experiments/ltpF... | 12 | 12 | LTP093 | LTP093 | protocols/ltp/subjects/LTP093/experiments/ltpF... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6526 | protocols/ltp/subjects/LTP393/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP393/experiments/ltpF... | 5 | 5 | LTP393 | LTP393 | protocols/ltp/subjects/LTP393/experiments/ltpF... |
| 6527 | protocols/ltp/subjects/LTP393/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP393/experiments/ltpF... | 6 | 6 | LTP393 | LTP393 | protocols/ltp/subjects/LTP393/experiments/ltpF... |
| 6528 | protocols/ltp/subjects/LTP393/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP393/experiments/ltpF... | 7 | 7 | LTP393 | LTP393 | protocols/ltp/subjects/LTP393/experiments/ltpF... |
| 6529 | protocols/ltp/subjects/LTP393/experiments/ltpF... | ltpFR2 | build | protocols/ltp/subjects/LTP393/experiments/ltpF... | 8 | 8 | LTP393 | LTP393 | protocols/ltp/subjects/LTP393/experiments/ltpF... |


In [19]:
sess_df = df.iloc[0]  # grab meta-data for first session in LTPFR2 data set

# get a data reader as before, but without the montage and localization arguments
reader = cml.CMLReader(subject=sess_df.subject,
                       session=sess_df.session,
                       experiment=sess_df.experiment)

# load the behavioral events for this session
evs = reader.load('events')
word_evs = evs[evs['type'] == 'WORD']

# load the EEG; for scalp EEG we want to specify clean=True
# rel_start and rel_stop are in ms relative to each event
eeg = reader.load_eeg(events=word_evs, rel_start=-100, rel_stop=1000, clean=True).to_ptsa()
eeg

Opening raw data file /protocols/ltp/subjects/LTP093/experiments/ltpFR2/sessions/0/ephys/current_processed/LTP093 20140902 0959.2_clean_raw.fif...
    Range : 0 ... 2858664 =      0.000 ...  5717.328 secs
Ready.
Reading 0 ... 2858664  =      0.000 ...  5717.328 secs...
Not setting metadata
576 matching events found
No baseline correction applied
0 projection items activated
Using data from preloaded Raw for 576 events and 551 original time points ...
0 bad epochs dropped
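
The earlier note about array indexing is easy to trip over, so here is a minimal sketch of looking up a channel's position by its label instead of assuming the position equals the electrode number. The channel list and the events x channels x time layout below are hypothetical stand-ins. With the real data, the labels should be read from the loaded EEG object.

```python
import numpy as np

# Hypothetical channel labels 'E1' ... 'E129'; the real list comes from the loaded EEG.
channels = np.array([f"E{i}" for i in range(1, 130)])

# Toy data laid out as events x channels x time (an assumed layout).
data = np.zeros((3, channels.size, 5))

# Look the position up by label rather than hard-coding 75.
idx = int(np.flatnonzero(channels == "E75")[0])

# Select that channel for every event and time point.
e75 = data[:, idx, :]
```

In this toy layout the labels start at 'E1' while array positions start at 0, so 'E75' sits at position 74. Label-based selection avoids that off-by-one risk altogether.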


## Question 3

1) For each word encoding event in the first list of the first session of the subject introduced above (LTP093), plot the individual voltage time series from 200 ms before the onset of the study word until 1000 ms after its onset. Mark study word onset in each figure panel.

2) Can you identify any consistent patterns in the voltage time series between serial positions just by looking at them?

In [20]:
# Question 3.1
### YOUR CODE HERE

Question 3.2

**YOUR ANSWER HERE**

## Question 4
1) Generate ERPs for the events in the first session (subject LTP093) for the first, second, and third serial positions separately (one ERP per serial position), and then an ERP averaged across the events in the remaining serial positions (4-24) of that session. To carry out an ERP analysis on the voltage data, we must:
    1. Load the desired events.
    2. Filter to only encoding events.
    3. Get the voltage for all encoding events.
    4. Baseline correct each voltage trace separately: define a "baseline" period from 250 ms before stimulus onset (word presentation) to 50 ms before stimulus onset, subtract the mean of the baseline period from every value in the trace, and divide the resulting values by the standard deviation of the baseline period.
        * Baseline correction accounts for differing electrical characteristics of each channel, and for the fact that the electrical activity (or roughly speaking, state) of the brain will differ from event to event when the stimulus is presented. What we care about is how much the stimulus impacts the electrical activity of the brain relative to what it was just prior to the stimulus.
    5. Filter events by serial position.
    6. Plot the average voltage traces for the different groups of serial positions together on a single figure. 
2) Compare the ERPs to the individual time series you plotted earlier. Is it easier to distinguish the serial positions of items when including data from all events?
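
The baseline-correction and grouping steps listed above can be sketched with NumPy on synthetic traces. The data, the 500 Hz sampling rate, the `serialpos` array, and the helper name `baseline_zscore` are all assumptions made for illustration. Note that the toy window starts at -250 ms so that the whole baseline period is actually present in the data.

```python
import numpy as np

def baseline_zscore(traces, times_ms, base_start=-250.0, base_stop=-50.0):
    """Z-score each trace (row) against its own pre-stimulus baseline window."""
    mask = (times_ms >= base_start) & (times_ms <= base_stop)
    base_mean = traces[:, mask].mean(axis=1, keepdims=True)
    base_std = traces[:, mask].std(axis=1, keepdims=True)
    return (traces - base_mean) / base_std

rng = np.random.default_rng(1)

# Assumed 500 Hz sampling (2 ms steps), window from -250 ms up to 1000 ms.
times_ms = np.arange(-250.0, 1000.0, 2.0)

# Hypothetical events: two lists of 24 items each, one trace per item.
serialpos = np.tile(np.arange(1, 25), 2)
traces = rng.normal(loc=10.0, scale=5.0, size=(serialpos.size, times_ms.size))

corrected = baseline_zscore(traces, times_ms)

# One ERP per group of serial positions: average over events in each group.
erps = {
    "SP 1": corrected[serialpos == 1].mean(axis=0),
    "SP 2": corrected[serialpos == 2].mean(axis=0),
    "SP 3": corrected[serialpos == 3].mean(axis=0),
    "SP 4-24": corrected[serialpos >= 4].mean(axis=0),
}
```

Using `keepdims=True` keeps the per-trace mean and standard deviation as column vectors, so the subtraction and division broadcast across the time axis without an explicit loop.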

In [21]:
# Question 4.1
### YOUR CODE HERE

Question 4.2

**YOUR ANSWER HERE**

## Question 5
1) Again generate ERPs comparing items across the groups of serial positions given in Question 4, but this time averaging over data from all sessions from the same subject.

2) Compare the ERPs obtained here with those obtained in the previous exercise.
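
One way to average over all sessions is to pool the baseline-corrected events from every session and average once. The sketch below uses synthetic arrays with made-up event counts and contrasts pooling with the alternative of averaging per-session ERPs. Which of the two is intended here is left open by the question, so treat this as an illustration of the difference rather than a prescribed method.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical baseline-corrected data (events x time) for three sessions
# with unequal event counts.
counts = (576, 550, 576)
sessions = [rng.normal(size=(n, 600)) for n in counts]

# Option A: pool the events from every session, then average once.
pooled = np.concatenate(sessions, axis=0)
erp_pooled = pooled.mean(axis=0)

# Option B: average within each session first, then average the session ERPs.
session_erps = np.stack([s.mean(axis=0) for s in sessions])
erp_of_means = session_erps.mean(axis=0)

# Option A equals a weighted mean of the session ERPs, weighted by event count.
erp_weighted = np.average(session_erps, axis=0, weights=counts)
```

Pooling weights every event equally, while averaging session ERPs weights every session equally. The two agree exactly only when all sessions contribute the same number of events.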

In [22]:
# Question 5.1
### YOUR CODE HERE

Question 5.2

**YOUR ANSWER HERE**