# EEG Problem Set (Part 1)

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

## Instructions

In this problem set, we will be analyzing EEG data from one participant who completed an [oddball paradigm](https://en.wikipedia.org/wiki/Oddball_paradigm). 

In this first notebook, you will preprocess the raw data. Specifically, you will (1) load and inspect the raw data, (2) mark bad channels, (3) filter the data, and (4) remove eyeblinks with independent component analysis (ICA). Question prompts are included throughout the notebook to guide your analysis. You must answer these prompts to earn full points.

## Step 1: Load and inspect raw data

In [None]:
from mne.io import read_raw_fif

## Load data.
raw = read_raw_fif('sub-01_task-digitsymbol_raw.fif', preload=True, verbose=False)
print(raw)

### Inspecting metadata
Inspect the recording metadata with `raw.info` and answer the following questions:

**Q**: How many EEG channels does this dataset have? How many peripheral channels?

> &nbsp;

**Q**: What is the sampling frequency of the data?

> &nbsp;

**Q**: Has the data already been filtered?

> &nbsp;

### Channel layout
Plot the channel layout below.

**Q**: Are the channels organized according to the [10-20 international electrode placement system](https://en.wikipedia.org/wiki/10%E2%80%9320_system_(EEG))?

> &nbsp;

## Step 2: Marking Bad Channels

Take a moment to browse the raw data.

---

**Note:** The raw data viewer requires the *matplotlib qt5* backend, which cannot be activated in the same notebook as the *matplotlib inline* backend. For your convenience, code for inspecting the raw data is provided in **eeg-ps-inspector.ipynb**.

---

**Q**: Do any of the channels look bad? 

> &nbsp;

Mark any bad channels below.

In [None]:
## Designate bad channels.
raw.info['bads'] = []

## Step 3: Filtering

Apply a 0.5-30 Hz bandpass filter to the data.

**Note**: Use `mne.pick_types` to apply the filter only to the EEG channels.

## Step 4: Independent Component Analysis (ICA)

In this final step, we will perform ICA to remove eyeblinks from the data. Use the code from class to fit an ICA model with 25 components to the data. 

**Note:** The eyeblink artifact is very large in the frontal channels (e.g., FPz, FP1, FP2). Use a large amplitude rejection threshold when fitting; otherwise many blink-contaminated segments will be excluded from the ICA fit.

Next, plot the scalp topography of each component to identify artifactual-looking components.

**Q**: Based on first impressions, which components look like ocular artifacts?

> &nbsp;

To more objectively identify artifactual components, you will construct eyeblink epochs to correlate with each component. Use the `create_eog_epochs` code we covered in class. Plot the resulting eyeblinks. 

Next, detect EOG-related components using correlation. Detection is based on the Pearson correlation between the filtered data and the filtered EOG channel.
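
The idea behind correlation-based detection can be sketched with plain NumPy. The component time courses and the EOG signal below are synthetic and hypothetical; in MNE, the `ICA.find_bads_eog` method is the library routine for this kind of scoring, and this sketch is not a reimplementation of it.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 2000

# Three hypothetical component time courses (one per row).
sources = rng.standard_normal((3, n_samples))

# Hypothetical EOG channel: mostly component 1 plus independent noise.
eog = 2.0 * sources[1] + 0.5 * rng.standard_normal(n_samples)

# Pearson correlation of each component with the EOG signal.
scores = np.array([np.corrcoef(src, eog)[0, 1] for src in sources])

# The component with the largest absolute correlation is the candidate.
best = int(np.argmax(np.abs(scores)))
print(np.round(scores, 2), best)
```

The absolute value is used because the sign of an ICA component is arbitrary.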

**Q**: Which components most strongly correlate with the eyeblinks?

> &nbsp;

Inspect the source time course within the time window of our EOG average.

Now visualize how we would modify our signals if we removed this component from the data.

Register any bad components using the `ica.exclude` attribute.

Now remove the effects of the rejected components using the `apply` method. Apply the ICA transformation to a copy of the original raw data.

## Step 5: Save the Preprocessed Data

Finally, save the new preprocessed raw data for use in the next notebook. Save it as *sub-01_task-digitsymbol_preproc_raw.fif*.