# The Event-Related Potential from a Scalp Electroencephalogram

<div class="alert alert-block alert-info">
*Synopsis* <br><br>

**Data:** 1 s of scalp EEG data sampled at 500 Hz during 1,000 trials in two conditions.<br>

**Goal:** Characterize the response of the EEG in the two conditions.<br>

**Tools:** Visualization, event-related potential, confidence intervals, bootstrapping.

</div>

## Background


Voltage recordings from the scalp surface - the electroencephalogram or EEG - provide a powerful window into the brain's electrical activity.  Some of the earliest human EEG recordings were made in 1924, when Dr. Hans Berger made a remarkable discovery:  the EEG of a human subject at rest with eyes closed exhibits rhythmic activity, an approximately 10 Hz oscillation he labeled the alpha rhythm.  Although it has now been studied for nearly 100 years, the definitive functional role (if any) of the alpha rhythm remains unknown.  Since then, many other EEG rhythms have been detected and labeled (typically with Greek letters), and the analysis of EEG rhythms remains an active area of research.

Compared to other modalities for measuring brain activity, the EEG possesses both advantages and disadvantages.  Perhaps the most important advantages are:

1. The EEG is non-invasive, and
2. The EEG permits a high temporal resolution (on the order of milliseconds).

However, the EEG also suffers from significant disadvantages, the most serious being its poor spatial resolution:  a single scalp electrode detects the summed activity from approximately 10 cm<sup>2</sup> of cortex.

In this chapter, we consider EEG data recorded from a single scalp electrode.  We will analyze these data to determine what (if any) activity is evoked following two different types of stimuli presented to a human subject.  In doing so, we will continue to develop our skills in Python, and begin to see how this powerful tool can help us understand these time series data.  We begin with a brief description of the EEG data.

## Case Study: an EEG ERP task
An undergraduate student volunteers to participate in a psychology study at his university. In this study, EEG electrodes (sampling rate 500 Hz, i.e., 500 samples per second) are placed on the student's scalp, and he is seated in a comfortable chair in a dark, electrically isolated room.  The student is instructed to place headphones over his ears and listen to a series of repeated sounds.  The sounds consist of two tones - either a high pitch tone or a low pitch tone.  A single tone is presented once every few seconds, and the student responds with a button press to the low pitch tone.  The tone presentation is repeated to collect the EEG response to numerous presentations of the two tones, as illustrated here:

<img src="imgs/example_recording.png"></img>

In this cartoon illustration of the EEG experiment, the EEG electrodes are placed on the scalp surface of a human subject (left).  The EEG activity (blue) is recorded as a function of time during presentation of high pitch tones (black) and low pitch tones (orange).

Our collaborator leading this research study has agreed to provide us with EEG data recorded at a single electrode for 1000 presentations of the high pitch tone, and 1000 presentations of the low pitch tone.  In each presentation - or "trial" - she provides us with 1 s of EEG data, such that the tone occurs at 0.25 s into the trial.  She asks us to analyze these data to determine whether the EEG signal differs following the two tone presentations.
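As a quick check on these numbers, a 500 Hz sampling rate and a 1 s trial imply 500 samples per trial, and a tone at 0.25 s falls at sample index 125 when counting from zero. The sketch below assumes a time axis that starts at 0 s; the names `fs`, `trial_duration`, `stim_time`, and `t_assumed` are illustrative only, and the time axis supplied with the data may be defined differently.

```python
import numpy as np

fs = 500                 # sampling rate [Hz], from the case study
trial_duration = 1.0     # length of each trial [s]
stim_time = 0.25         # tone onset within the trial [s]

n_samples = int(round(fs * trial_duration))   # samples per trial
dt = 1 / fs                                   # sampling interval [s]

# Assumed time axis starting at 0 s (the provided axis may differ).
t_assumed = np.arange(n_samples) * dt

# Index of the sample at the tone onset, under the assumption above.
stim_index = int(round(stim_time * fs))

print(n_samples, dt, stim_index)
```

Under this assumption, samples before index 125 precede the tone and the remaining samples follow it.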


## Loading the Data

Our first step is to load the data into Python.  To do so, we use the function `loadmat()` from the `scipy.io` module as follows,

In [5]:
import scipy.io as sio
data = sio.loadmat('Ch2-EEG-1.mat')

To understand the outcome of issuing this command, let's examine the variable `data` now present in the workspace.  This variable is a *dictionary* variable (execute `type(data)` and the result should be `dict`). To see the *keys* of a dictionary, use the `keys()` method.

In [8]:
data.keys()

['__globals__', 'EEGb', 'EEGa', '__header__', 't', '__version__']

The keys that start and end with two underscores ( `__` ) are private and contain information about the MATLAB file. The variables that we are interested in here are `EEGa`, `EEGb`, and `t`. These correspond to the EEG data recorded in the two conditions (i.e., EEGa to condition A and EEGb to condition B) as well as a time axis (t). Let's extract these variables from the `data` dictionary.

In [15]:
EEGa = data['EEGa']
EEGb = data['EEGb']
t = data['t']
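To experiment with this loading pattern without the data file at hand, one option is to write a small synthetic MATLAB file and read it back. The sketch below does that with `scipy.io.savemat` and `scipy.io.loadmat`; the file name `stand_in.mat`, the random values, and the four-trial size are all made up for illustration and are not the real recording.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Synthetic stand-in data (NOT the real recording): 4 trials x 500 samples.
rng = np.random.default_rng(0)
fake = {
    'EEGa': rng.standard_normal((4, 500)),
    'EEGb': rng.standard_normal((4, 500)),
    't': np.arange(500).reshape(1, 500) / 500,
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'stand_in.mat')   # hypothetical file name
    sio.savemat(path, fake)
    loaded = sio.loadmat(path)

# loadmat returns a dictionary; the arrays come back two-dimensional.
EEGa_demo = loaded['EEGa']
t_demo = loaded['t']
print(type(loaded), EEGa_demo.shape, t_demo.shape)

# Optional: flatten the 1 x 500 time axis into a 1-D array.
t_flat = t_demo.squeeze()
print(t_flat.shape)
```

Flattening the time axis is a matter of taste; a one-dimensional array is often more convenient to pass to plotting functions.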

<div class="alert alert-block alert-info">
In general, a single underscore at the beginning of a variable, function or method indicates that this object should be treated as *private*. Double underscores often indicate that Python will interpret this object with some special instructions. In both cases, for what we are doing, we can usually ignore an object that starts with an underscore.
</div>
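One way to list only the variables of interest is to skip every key that begins with an underscore. The sketch below uses a stand-in dictionary, `data_demo`, with placeholder values in place of the real arrays; the same comprehension should work on the dictionary returned by `loadmat()`.

```python
# Stand-in for the dictionary returned by loadmat(); values are placeholders.
data_demo = {
    '__header__': b'placeholder',
    '__version__': '1.0',
    '__globals__': [],
    'EEGa': 'array A',
    'EEGb': 'array B',
    't': 'time axis',
}

# Keep only the keys that do not begin with an underscore.
variable_names = sorted(k for k in data_demo if not k.startswith('_'))
print(variable_names)   # ['EEGa', 'EEGb', 't']
```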

Let's use the `whos` command to get some more information about the variables.

In [16]:
whos

Variable   Type       Data/Info
-------------------------------
EEGa       ndarray    1000x500: 500000 elems, type `float64`, 4000000 bytes (3 Mb)
EEGb       ndarray    1000x500: 500000 elems, type `float64`, 4000000 bytes (3 Mb)
data       dict       n=6
sio        module     <module 'scipy.io' from '<...>s/scipy/io/__init__.pyc'>
t          ndarray    1x500: 500 elems, type `float64`, 4000 bytes


<div class="alert alert-block alert-info">
We could also have used `EEGa.shape` to find out the dimensions of the variable `EEGa`.
</div>

In the *Data/Info* column we see *1000x500* for `EEGa` and `EEGb`.  Both variables are matrices with 1000 rows and 500 columns.  Our collaborator tells us that each row corresponds to a separate trial, and each column to a point in time.  So there are 1000 total trials, each consisting of 500 time points.  As a matter of convenience, we define a new variable to record the number of trials,

In [21]:
ntrials = EEGa.shape[0]

The `shape` attribute of an array is a *tuple* that indicates the size of each dimension. Adding `[0]` at the end tells Python to give us only the first value in the tuple, which here is the number of rows (trials). Recall that Python indexing starts at 0. The variable `ntrials` will be useful later, as we'll see.
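The row and column convention can be tried out on a placeholder array. The sketch below uses an all-zeros array, `EEG_demo`, with the same dimensions as `EEGa`; the names ending in `_demo` are illustrative and it contains no real data.

```python
import numpy as np

# Placeholder array with the same dimensions as EEGa (all zeros, not real data).
EEG_demo = np.zeros((1000, 500))

print(EEG_demo.shape)        # (1000, 500)

# Unpack both dimensions at once.
ntrials_demo, nsamples_demo = EEG_demo.shape

# Row index selects a trial; column index selects a time point.
first_trial = EEG_demo[0, :]      # all 500 samples of the first trial
first_sample = EEG_demo[:, 0]     # first sample of every trial

print(first_trial.shape, first_sample.shape)   # (500,) (1000,)
```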

<div class="alert alert-block alert-info">
A *tuple* is another data structure in Python that is similar to an array or a list because it usually contains more than one element. Python treats each of these structures slightly differently, however. One of the most challenging things about starting with Python is learning to understand the different data structures. Here, we will mainly work with arrays, which are good for holding multidimensional data. If you are curious and want to know more about data structures, there is a very concise description <a href="http://thomas-cokelaer.info/tutorials/python/data_structures.html">here</a> to get you started.
</div>
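A few of the differences between these structures can be seen in a short sketch. The variable names below are made up for illustration: a tuple cannot be changed after it is created, a list can grow, and multiplying by 2 repeats a list but scales each element of a NumPy array.

```python
import numpy as np

shape_tuple = (1000, 500)               # tuple: fixed once created
values_list = [1.0, 2.0, 3.0]           # list: can be changed in place
values_array = np.array(values_list)    # array: supports elementwise math

values_list.append(4.0)                 # lists can grow
try:
    shape_tuple[0] = 10                 # tuples cannot be modified
except TypeError as err:
    tuple_error = type(err).__name__

doubled_list = values_list * 2          # repeats the list
doubled_array = values_array * 2        # multiplies each element

print(tuple_error)
print(doubled_list)
print(doubled_array)
```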
