<a href="https://colab.research.google.com/github/abelowska/mlNeuro/blob/main/2025/MLN_ERP_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ERP analysis in MNE

[`MNE`](https://mne.tools/stable/index.html) is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data: MEG, EEG, sEEG, ECoG, NIRS, and more.

The easiest way to install MNE is with `pip` or `conda`, for example through Anaconda (see [installation instructions](https://mne.tools/stable/install/manual_install.html)).

In [None]:
!pip install mne

Imports

In [None]:
from pathlib import Path
import matplotlib.pyplot as plt
import mne
import numpy as np
import pandas as pd

## Raw

We are going to use data from the [ERP CORE Dataset](https://doi.org/10.1016/j.neuroimage.2020.117465), which can be downloaded via `MNE`. This dataset contains EEG recordings from a single participant performing the Flanker task. The recording includes markers for task events, so we can extract signal segments around the events where we expect large populations of neurons to synchronize, leading to observable event-related activity.

In [None]:
# download dataset
data_dir = Path(mne.datasets.erp_core.data_path('.'))
file_name = data_dir / "ERP-CORE_Subject-001_Task-Flankers_eeg.fif"

**MNE**-Python data structures are based around the FIF file format from Neuromag, but there are reader functions for a wide [variety of other data formats](https://mne.tools/stable/overview/implementation.html#data-formats). Data is loaded into a so-called [Raw](https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw) object.


In [None]:
# read raw from one individual
raw = mne.io.read_raw(file_name, preload=True)

You can get a glimpse of the basic details of a Raw object by printing it; even more is available by printing its `info` attribute (a dictionary-like object that is preserved across Raw, Epochs, and Evoked objects). The `info` data structure keeps track of channel locations, applied filters, projectors, etc. Notice especially the `chs` entry, showing that MNE-Python detects different sensor types and handles each appropriately. See the documentation of the [`Info`](https://mne.tools/stable/generated/mne.Info.html) class for more details.

In [None]:
raw.info

In [None]:
print(raw.info)

Let's see our EEG data. The basic MNE classes ([`Raw`](https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw), [`Epochs`](https://mne.tools/stable/generated/mne.Epochs.html), [`Evoked`](https://mne.tools/stable/generated/mne.Evoked.html)) have dedicated plotting methods. Just call the `plot()` method on the `Raw` object to see the data.

In [None]:
fig = raw.plot()

It doesn't look pretty, does it? That's because the EEG data is contaminated with high-frequency noise. Take a look at the documentation of [`plot()`](https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.plot) and plot our data with high- and low-pass filters applied, starting from the 60th second.

In [None]:
fig = raw.plot(highpass=0.05, lowpass=40, start=60)

**On the plot, we can see the triggers (events) marked with vertical colored lines.**

**Annotations**

[Annotations](https://mne.tools/stable/generated/mne.Annotations.html) in MNE-Python store short strings of information about temporal spans of a Raw object. Below the surface, Annotations are list-like objects, where each element comprises three pieces of information: an onset time (in seconds), a duration (also in seconds), and a description (a text string). Additionally, the Annotations object itself also keeps track of orig_time, which is a POSIX timestamp denoting a real-world time relative to which the annotation onsets should be interpreted. In general, annotations store information about [Events](https://mne.tools/dev/events.html#events) that occurred and were encoded during EEG recording. Such events could be a keystroke, the appearance of a target, feedback, etc.

In our case, we have information on *responses* and *stimuli*.

In [None]:
# extract events
events, event_ids = mne.events_from_annotations(raw)

In [None]:
# display events names
event_ids

**Event ids are defined as a dictionary of names and their codes, as we can see above.**

The resulting events array is an ordinary 3-column `ndarray`, with **sample number** in the first column and **integer event ID** in the last column; the middle column is usually ignored.
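Because the events array is a plain `ndarray`, it can be sliced and filtered with ordinary NumPy operations. The sketch below uses a small made-up array and an assumed sampling frequency, not the values from the dataset.

```python
import numpy as np

# Hypothetical events array: [sample number, (usually ignored), event ID]
demo_events = np.array([
    [100, 0, 3],
    [160, 0, 1],
    [400, 0, 5],
    [455, 0, 2],
])
demo_sfreq = 1024.0  # assumed sampling frequency in Hz

# Convert sample numbers to seconds (assuming the recording starts at sample 0)
onsets_sec = demo_events[:, 0] / demo_sfreq

# Keep only events with selected IDs (here: 3 and 5)
mask = np.isin(demo_events[:, 2], [3, 5])
selected_events = demo_events[mask]

print(onsets_sec)
print(selected_events)
```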

In [None]:
# display events
events

On the zoomed-in plot below, we can see stimulus and response events marked with coloured vertical lines, and the EEG signal on 32 electrodes.

In [None]:
fig = raw.plot(
    start=60.5,
    duration=1,
    highpass=0.05,
    lowpass=40,
    n_channels=32,
    show=False
)

plt.show()

For further ERP analysis, let's filter our data with a band-pass filter:

In [None]:
# filter raw data
raw_filtered = raw.copy().filter(l_freq=0.1, h_freq=30)

# plot filtered data
fig = raw_filtered.plot()

Keep in mind that many MNE methods, including `filter()`, work in place: MNE objects are mutable, and such methods modify the object they are called on. To keep your original object, work on a copy, e.g.

```
raw_filtered = raw.copy().filter(l_freq=0.1, h_freq=30)
```

**For now, we are not going to do any further signal pre-processing because the data is already basically pre-processed.**

## Epochs

The Raw object and the events array are the bare minimum needed to create an [`Epochs`](https://mne.tools/stable/generated/mne.Epochs.html#mne.Epochs) object, which we create with the `Epochs` class constructor.

Basically, `Epochs` store single-trial event-related potentials (ERPs), thus they have to be constructed **around some defined events**.

To do so, we **have to**:
- pass the event dictionary as the `event_id` parameter;
- pass the events array as the `events` parameter;
- specify `tmin` and `tmax` (the time relative to each event at which to start and end each epoch).

By default, Raw and Epochs data aren't loaded into memory (they're accessed from disk only when needed), but here we'll force loading into memory using `preload=True`.

Now, we are going to create epochs around the stimulus events that are in our `Raw`.

In [None]:
# set the time-window of the segments
tmin=-0.2
tmax=0.8

# get the events list from raw
events, _ = mne.events_from_annotations(raw_filtered)

# select only subset of our events - those related to stimuli
event_ids = {
    'stimulus/compatible/target_left': 3,
    'stimulus/compatible/target_right': 4,
    'stimulus/incompatible/target_left': 5,
    'stimulus/incompatible/target_right': 6
}

# create segments (Epochs)
epochs = mne.Epochs(
  raw=raw_filtered,
  events=events,
  event_id=event_ids,
  tmin=tmin,
  tmax=tmax,
  baseline=(-0.2, 0),
  preload=True,
)
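The hard-coded codes above rely on the name-to-code mapping returned by `mne.events_from_annotations()`. As a hedged alternative, the stimulus entries could be selected by name from that dictionary instead of typing the codes manually. The sketch below uses a made-up dictionary (`all_event_ids`) standing in for the returned mapping; the response names in it are hypothetical.

```python
# Stand-in for the dictionary returned by mne.events_from_annotations(raw)
all_event_ids = {
    "response/left": 1,   # hypothetical name
    "response/right": 2,  # hypothetical name
    "stimulus/compatible/target_left": 3,
    "stimulus/compatible/target_right": 4,
    "stimulus/incompatible/target_left": 5,
    "stimulus/incompatible/target_right": 6,
}

# Keep only the entries whose name starts with "stimulus/"
stimulus_event_ids = {
    name: code
    for name, code in all_event_ids.items()
    if name.startswith("stimulus/")
}
print(stimulus_event_ids)
```

A dictionary built this way can be passed as `event_id` to `mne.Epochs` in the same way as the hand-written one.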

Again, we can print the basic details of the `Epochs` object.

In [None]:
epochs

In [None]:
epochs.info

We can now plot the segmented signal:

In [None]:
fig = epochs.plot(
    events=events,
    event_id=event_ids
)

Like `Raw`, `Epochs` also have a number of built-in plotting methods. One is [`plot_image()`](https://mne.tools/stable/generated/mne.Epochs.html#mne.Epochs.plot_image), which shows each epoch as one row of an image map, with color representing signal magnitude; the average evoked response and the sensor location are shown below the image:

In [None]:
fig = epochs.plot_image(picks=['FCz'])

### Get data from epochs

To extract data from `Epochs` as an `ndarray`, use the `get_data()` method, just as with `Raw`. The resulting `ndarray` has the shape `(n_epochs, n_channels, n_samples)`:

In [None]:
epochs_data = epochs.get_data()
print(epochs_data.shape)
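Once the data is an `ndarray`, a single channel and a time window can be selected with plain NumPy indexing. The sketch below uses a random array as a stand-in for `epochs.get_data()`; the channel order and time axis are assumptions that mimic `epochs.ch_names` and `epochs.times`.

```python
import numpy as np

# Synthetic stand-in for epochs.get_data(): 5 epochs, 3 channels, 101 samples
rng = np.random.default_rng(0)
demo_data = rng.normal(size=(5, 3, 101))
demo_ch_names = ["Fz", "FCz", "Cz"]       # assumed channel order
demo_times = np.linspace(-0.2, 0.8, 101)  # assumed time axis in seconds

# Index of the channel of interest
ch_idx = demo_ch_names.index("FCz")

# Boolean mask for a time window (window limits are arbitrary here)
window = (demo_times >= 0.295) & (demo_times <= 0.505)

# Select one channel, then the samples inside the window: (n_epochs, n_window_samples)
fcz_window = demo_data[:, ch_idx, :][:, window]

# Mean amplitude per epoch within the window: (n_epochs,)
mean_amplitude = fcz_window.mean(axis=1)
print(fcz_window.shape, mean_amplitude.shape)
```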

## Evoked

MNE-Python has a special object for averaged epochs called [`Evoked`](https://mne.tools/stable/generated/mne.Evoked.html#mne-evoked). `Evoked` objects typically store EEG or MEG signals that have been averaged over multiple epochs, which is a common technique for estimating stimulus-evoked activity (ERPs). Evoked objects can only contain the average of a single set of conditions.

The data in an Evoked object are stored in an array of shape `(n_channels, n_times)` (in contrast to an Epochs object, which stores data of shape `(n_epochs, n_channels, n_times)`).

The simplest way to create an `Evoked` is to call the `average()` method on an `Epochs` object. The resulting `Evoked` is a classic ERP waveform.

In [None]:
# create evokeds
congruent_erp = epochs['stimulus/compatible'].average()
incongruent_erp = epochs['stimulus/incompatible'].average()

Let's see the info about the created object:

In [None]:
congruent_erp

The information about the signal in `Epochs` is transferred to derived `Evoked` objects:

In [None]:
congruent_erp.info

We can plot the `Evoked` objects using their default plotting method.

In [None]:
fig = congruent_erp.plot(spatial_colors=True)
fig = incongruent_erp.plot(spatial_colors=True)

We can use a dedicated MNE visualization function, [`mne.viz.plot_compare_evokeds()`](https://mne.tools/stable/generated/mne.viz.plot_compare_evokeds.html#mne-viz-plot-compare-evokeds), to directly compare our two conditions: congruent and incongruent.

In [None]:
# compare congruent and incongruent ERPs
evokeds = dict(
    congruent=congruent_erp,
    incongruent=incongruent_erp
)
picks = ['Cz']

fig = mne.viz.plot_compare_evokeds(
    evokeds=evokeds,
    picks=picks,
    invert_y=True
)

### Get data from Evoked

To extract data from `Evoked` as an `ndarray`, use the `get_data()` method, just as with `Raw` and `Epochs`. The resulting `ndarray` has the shape `(n_channels, n_samples)`:

In [None]:
evoked_data = congruent_erp.get_data()
print(evoked_data.shape)

As with `Raw` and `Epochs` objects, `Evoked` offers many ways to work with the signal. For examples of manipulating and working with Evoked, see the [tutorial](https://mne.tools/stable/auto_tutorials/evoked/10_evoked_overview.html#sphx-glr-auto-tutorials-evoked-10-evoked-overview-py).

## Exercise: Basic ERP statistical analysis

On the plot above, it is visible that there is a difference between congruent and incongruent ERPs. Let's conduct a statistical analysis of the significance of this difference. We will use a t-test to test this difference.

1. Create congruent epochs and incongruent epochs.
2. (Think!) Select a time window and electrode most suitable for testing the difference.
3. Use `get_data()` to extract data as an ndarray within the desired time window and on the desired channel. Now, your data for both congruent and incongruent epochs will have the shape `(n_epochs, n_channels, n_timepoints)`.
4. Calculate the mean within the selected time window across all timepoints to get the average amplitude within the selected time window on the selected electrode.
5. Use the t-test for related samples to compare congruent and incongruent epochs. Note that `scipy.stats.ttest_rel` requires both arrays to have the same length.

```
scipy.stats.ttest_rel(compatible_data, incompatible_data)
```
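Steps 3 to 5 can be rehearsed on synthetic data before applying them to the real epochs. The sketch below is not a solution for the dataset: the arrays are random numbers that mimic data already restricted to one channel and one time window, and the epoch counts and effect size are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic single-channel data, already cut to the time window:
# shape (n_epochs, n_channels, n_timepoints)
rng = np.random.default_rng(0)
n_epochs, n_timepoints = 40, 50
demo_congruent = rng.normal(loc=0.0, scale=1.0, size=(n_epochs, 1, n_timepoints))
demo_incongruent = rng.normal(loc=0.5, scale=1.0, size=(n_epochs, 1, n_timepoints))

# Step 4: mean amplitude across timepoints, one value per epoch
congruent_mean = demo_congruent.mean(axis=2)[:, 0]
incongruent_mean = demo_incongruent.mean(axis=2)[:, 0]

# Step 5: t-test for related samples (both arrays must have equal length)
result = stats.ttest_rel(congruent_mean, incongruent_mean)
print(result.statistic, result.pvalue)
```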

In [None]:
# your code here