# EEG Problem Set (Part 2)

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from mne import set_log_level
set_log_level(verbose=False)
%matplotlib inline

## Instructions

In the second part of this problem set, you will be performing an ERP analysis on the preprocessed data. First let's describe the task in greater detail. 

This experiment is based on the oddball paradigm used in [Luck et al., (2009)](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1469-8986.2009.00817.x). In the oddball paradigm, a letter or digit was presented every 1100-1500 ms. Subjects were instructed to press a button with one hand for digits and with the other hand for letters. For a given trial block, either the letters or the digits were rare (20%) and the other category was frequent (80%). Thus, the stimulus category and the probability were counterbalanced. The probability manipulation was designed to isolate the probability-sensitive P300 component. Different event codes were used for the digits when they were rare, the digits when they were frequent, the letters when they were rare, and the letters when they were frequent. 

The P300 is a neural marker of surprise. As such, we expect a larger P300 on rare trials than on frequent trials. In this second notebook, you will analyze the difference (if any) in P300s between conditions. Specifically, you will (1) epoch the raw data, (2) organize the data by condition, (3) visualize the evoked potentials, and (4) perform permutation testing.

We begin by loading in the preprocessed data.

In [None]:
from mne.io import read_raw_fif

## Load data.
raw = read_raw_fif('sub-01_task-digitsymbol_preproc_raw.fif', preload=True, verbose=False)
print(raw)

## Step 1: Gather Events

In this first step, we will read in all of the trial events from the raw data. To do this, use `mne.events_from_annotations`. Import the function and apply it to the raw data.

In [None]:
from mne import events_from_annotations

## Gather events.
events, event_id = events_from_annotations(raw)

Next, use `plot_events` from `mne.viz` to visualize the events.

**Q**: How many trial types are there? How are they organized?

> &nbsp;

**Q**: How many response types are there? What are they?

> &nbsp;

## Step 2: Epoching

Now we will perform epoching. First we must define the **event_id**. Remember that event IDs are Python dictionaries, where the keys are the event labels (e.g. 20_Dig_R, 80_Dig_R) and the values are the event integers (see y-axis of plot above).

In the following, include only the events corresponding to the onset of the stimuli (i.e. do not include the response events).
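
As a rough illustration of the filtering idea, here is a minimal sketch in plain Python. The dictionary below is hypothetical: only `20_Dig_R` and `80_Dig_R` appear in the text above, while the other stimulus labels, the response labels (`Resp_L`, `Resp_R`), and all integer codes are invented for the example. Check the real labels returned by `events_from_annotations` before adapting it.

```python
# Hypothetical mapping, shaped like the second value returned by
# mne.events_from_annotations. Labels other than 20_Dig_R / 80_Dig_R and
# all integer codes are made up for illustration.
all_event_id = {
    "20_Dig_R": 1,
    "80_Dig_R": 2,
    "20_Let_L": 3,
    "80_Let_L": 4,
    "Resp_L": 5,
    "Resp_R": 6,
}

# Keep only stimulus-onset events, assumed here to be the labels that
# start with a probability tag ("20_" or "80_").
stim_event_id = {
    label: code
    for label, code in all_event_id.items()
    if label.startswith(("20_", "80_"))
}

print(stim_event_id)
```

The resulting dictionary is the kind of object you would pass as `event_id` when epoching.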

Now we must define the time window for our epochs. We will use:
- tmin: -200 ms (i.e. 200 ms before stimulus onset; `tmin=-0.2`, since MNE expects seconds)
- tmax: 1000 ms (`tmax=1.0`)
- baseline: (None, 0)
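
To make the role of the baseline concrete, the sketch below mimics what a `(None, 0)` baseline does, using NumPy on synthetic data: it subtracts each channel's mean over the pre-stimulus samples (from the start of the epoch up to time 0). It assumes an epoch that starts 200 ms before stimulus onset; the sampling rate and the data are hypothetical, and this is not MNE's own implementation.

```python
import numpy as np

sfreq = 250.0            # hypothetical sampling rate in Hz
tmin, tmax = -0.2, 1.0   # epoch window in seconds (MNE uses seconds)

# Time axis of a single epoch.
n_times = int(round((tmax - tmin) * sfreq)) + 1
times = tmin + np.arange(n_times) / sfreq

# Synthetic epoch: 3 channels of noise riding on a constant offset.
rng = np.random.default_rng(0)
epoch = rng.normal(size=(3, n_times)) + 5.0

# baseline=(None, 0): use every sample from the epoch start up to time 0.
baseline_mask = times <= 1e-9
baseline_mean = epoch[:, baseline_mask].mean(axis=1, keepdims=True)
corrected = epoch - baseline_mean

print(corrected[:, baseline_mask].mean(axis=1))  # approximately zero
```

Note that a baseline ending at 0 only makes sense when the epoch begins before stimulus onset, which is why `tmin` has to be negative.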

Now we must define our rejection criterion. Define a reasonable threshold.
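
For intuition, here is a minimal NumPy sketch of peak-to-peak rejection on synthetic data. In MNE this idea is expressed through the `reject` argument of `mne.Epochs` (e.g. a dictionary keyed by channel type); the 150 µV threshold and the simulated artifact below are hypothetical choices, not a recommendation for this dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
n_epochs, n_channels, n_times = 6, 4, 100

# Synthetic EEG in volts: background noise with a standard deviation of 10 µV.
data = rng.normal(scale=10e-6, size=(n_epochs, n_channels, n_times))

# Simulate an artifact (e.g. a blink) in one channel of epoch 2.
data[2, 1, 40:50] += 300e-6

reject_threshold = 150e-6  # hypothetical peak-to-peak threshold (150 µV)

# Peak-to-peak amplitude per epoch and channel.
ptp = data.max(axis=-1) - data.min(axis=-1)

# An epoch is bad if any channel exceeds the threshold.
bad = (ptp > reject_threshold).any(axis=1)
clean = data[~bad]

print(bad, clean.shape)
```

A threshold that is too strict discards many usable trials, while one that is too lenient lets artifacts contaminate the average, so it is worth checking how many epochs survive per condition.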

Now perform epoching using `mne.Epochs`. 

**Note:** Use `pick_types` to include only the EEG channels (i.e. we no longer need the EOG or trigger channels).

Now we drop bad epochs. 

**Q**: How many trials are left per condition after dropping bad epochs? 
> &nbsp;

Finally, let's save our epochs. Save the data as *sub-01_task-digitsymbol-epo.fif*.

## Step 3: Event Related Potential Analysis

Now we get to the fun part. In the following, you will look to find the P300 in the evoked potentials of each condition.

### Evoked Potentials

First, make two evoked potentials:
- *frequent*: an average of all the frequent (80) trials, collapsing over symbol and hand.
- *rare*: an average of all the rare (20) trials, collapsing over symbol and hand.

### Compare Evoked Potentials
Using `mne.viz.plot_evoked_topo`, plot a comparison of all the evoked potentials across the scalp.

**Q**: Is there an obvious P300?

> &nbsp;

**Q**: If there is a P300, is it prominent everywhere?

> &nbsp;

### Topographic Plots
Make topographic plots for the **difference wave** (rare minus frequent). Remember that the P300 should start around 300 ms and persist for many hundreds of milliseconds.
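
The difference wave itself is just a subtraction of the two averages. The NumPy sketch below illustrates this on synthetic evoked data and then asks where and when the difference is largest. The channel list, sampling rate, waveform shape, and the 300-600 ms window are all assumptions made for the example. With real `Evoked` objects, `mne.combine_evoked` with weights `[1, -1]` is one way to obtain the same subtraction.

```python
import numpy as np

sfreq = 250.0                          # hypothetical sampling rate in Hz
times = np.arange(-50, 251) / sfreq    # -0.2 s to 1.0 s
ch_names = ["Fz", "Cz", "Pz"]          # hypothetical subset of channels

# Synthetic evoked responses (n_channels, n_times): a bump peaking at
# 0.4 s that is larger for rare trials and grows from Fz to Pz.
bump = np.exp(-0.5 * ((times - 0.4) / 0.08) ** 2)
gain = np.array([1.0, 2.0, 3.0])[:, None]
frequent = 1e-6 * gain * bump
rare = 3e-6 * gain * bump

# Difference wave: rare minus frequent.
difference = rare - frequent

# Mean amplitude per channel in an assumed P300 window (300-600 ms).
window = (times >= 0.3) & (times <= 0.6)
mean_amp = difference[:, window].mean(axis=1)
peak_channel = ch_names[int(mean_amp.argmax())]

# Latency at which the largest absolute difference occurs.
peak_time = times[np.abs(difference).max(axis=0).argmax()]

print(peak_channel, peak_time)
```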

**Q**: If present, when is the P300 most prominent?

> &nbsp;

**Q**: If present, where is the P300 most prominent?

> &nbsp;

## Step 4: Replication + Permutation Testing

In this final step, we will formalize our analysis by replicating and extending Figure 2 from [Luck et al., (2009)](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1469-8986.2009.00817.x). To do so, we will perform permutation testing, testing for differences in the amplitude of the P300 between conditions across three sets of channels.

First, make two separate epoch objects:
- *frequent*: all the frequent (80) trials
- *rare*: all the rare (20) trials

Next, find the corresponding indices for the following sets of channels. Find the indices using `mne.pick_channels`. 
- frontal: F1, Fz, F2
- central: C3, Cz, C4
- parietal: P3, Pz, P4
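
Conceptually, looking up channel indices is a name-to-position search in the channel list. The plain-Python sketch below shows that lookup on a hypothetical channel ordering; it is only an illustration of the idea, and the real ordering should come from the epochs' `ch_names`.

```python
# Hypothetical channel order; the real order comes from the recording.
ch_names = ["Fp1", "F1", "Fz", "F2", "C3", "Cz", "C4", "P3", "Pz", "P4", "Oz"]

channel_sets = {
    "frontal": ["F1", "Fz", "F2"],
    "central": ["C3", "Cz", "C4"],
    "parietal": ["P3", "Pz", "P4"],
}

# Map each set name to the integer positions of its channels.
picks = {
    name: [ch_names.index(ch) for ch in members]
    for name, members in channel_sets.items()
}

print(picks)
```

These integer indices can then be used to slice the channel axis of an array shaped `(n_trials, n_channels, n_times)`.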

Following the permutation testing code presented in the `eeg-02` demo, write a *for loop* that performs the following for each channel set:

1. Extracts the trials by channel set and condition (frequent, rare).
2. Averages over the channels.
3. Performs permutation testing with 1024 permutations.
4. Plots the evoked potential (i.e. average over trials) per condition and highlights significant clusters.
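
To illustrate the logic behind permutation testing, here is a simplified NumPy sketch. It is not the cluster-based procedure from the `eeg-02` demo: it only shuffles condition labels and compares the difference in one summary value per trial (for example, mean amplitude over a channel set and time window). The function name `permutation_test`, the trial counts, and the synthetic amplitudes are all hypothetical.

```python
import numpy as np

def permutation_test(a, b, n_permutations=1024, seed=0):
    """Two-sided label-shuffling test on the difference of means."""
    rng = np.random.default_rng(seed)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Shuffle the trials, then split them back into two groups
        # of the original sizes.
        shuffled = rng.permutation(pooled)
        diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Count the observed statistic as one of the permutations.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value

# Synthetic per-trial amplitudes (arbitrary units): fewer rare trials,
# with a larger mean than the frequent trials.
rng = np.random.default_rng(42)
rare_amp = rng.normal(loc=5.0, scale=2.0, size=40)
frequent_amp = rng.normal(loc=1.0, scale=2.0, size=160)

observed, p_value = permutation_test(rare_amp, frequent_amp)
print(observed, p_value)
```

The cluster-based version applies the same shuffling idea to the whole time course and corrects for testing many time points at once, which is why it can highlight significant time windows rather than a single value.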