<a href="https://colab.research.google.com/github/abelowska/mlNeuro/blob/main/2025/MLN_feature_extraction_exercises.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Basic feature extraction exercises using MNE

Install [`MNE`](https://mne.tools/stable/index.html):

In [None]:
!pip install mne

Imports

In [None]:
from pathlib import Path
import matplotlib.pyplot as plt
import mne
import numpy as np
import pandas as pd

## Load data

We are going to use data from the [ERP CORE Dataset](https://doi.org/10.1016/j.neuroimage.2020.117465) via `MNE`. The sample we load contains EEG recordings from a single participant performing the Flanker task. Because the task has well-defined events, we can extract signal segments around them. At these events we expect large populations of neurons to synchronize, leading to observable event-related activity.

In [None]:
# download dataset
data_dir = Path(mne.datasets.erp_core.data_path('.'))
file_name = data_dir / "ERP-CORE_Subject-001_Task-Flankers_eeg.fif"

# read raw from one individual
raw = mne.io.read_raw(file_name, preload=True)

# filter data
raw_filtered = raw.copy().filter(l_freq=0.1, h_freq=30)

Let's see our EEG data:

In [None]:
fig = raw_filtered.plot(start=60)

**On the plot, we can see the triggers (events) marked with vertical colored lines.**

## Create epochs around stimuli

In [None]:
# set the time-window of the segments in seconds
tmin=-0.2
tmax=0.8

# get the events list from raw
events, _ = mne.events_from_annotations(raw_filtered)

# select only subset of our events - those related to stimuli
event_ids = {
  'stimulus/compatible/target_left': 3,
  'stimulus/compatible/target_right': 4,
  'stimulus/incompatible/target_left': 5,
  'stimulus/incompatible/target_right': 6
 }

# create segments (Epochs)
epochs = mne.Epochs(
  raw=raw_filtered,
  events=events,
  event_id=event_ids,
  tmin=tmin,
  tmax=tmax,
  baseline=(-0.2, 0),
  preload=True,
)

Plot the epochs: each epoch as one row of an image map, with color representing signal magnitude; the average evoked response and the sensor location are also shown on the image:

In [None]:
fig = epochs.plot_image(picks=['FCz'])

## ERP Features

Try to code all the feature extractions below:

### ERP-Feature 1: Mean amplitude in time-window

Get the mean amplitude in the time window 0.2 - 0.4 s for each trial for a single channel, Cz. Your output vector should have a shape of `(400, 1)`.
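
As a hint, the windowed-mean idea can be sketched on synthetic data with plain NumPy. The array below only stands in for what `epochs.get_data()` returns, i.e. `(n_epochs, n_channels, n_times)`. The sampling rate, channel list, and number of trials are assumptions made for illustration, not properties of the dataset.

```python
import numpy as np

# Synthetic stand-in for epoched data: (n_epochs, n_channels, n_times).
# All sizes and names below are hypothetical.
rng = np.random.default_rng(0)
ch_names = ["Fz", "Cz", "Pz"]
times = np.arange(-20, 81) / 100.0  # -0.2 ... 0.8 s at an assumed 100 Hz
data = rng.normal(size=(8, len(ch_names), times.size))

# Boolean mask selecting the samples inside the 0.2 - 0.4 s window
window = (times >= 0.2) & (times <= 0.4)

# Keep the channel axis by indexing with a list, then average over time
cz = ch_names.index("Cz")
mean_amp = data[:, [cz], :][:, :, window].mean(axis=-1)

print(mean_amp.shape)
```

Dropping the channel selection (`data[:, :, window].mean(axis=-1)`) gives one value per trial and channel instead.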

In [None]:
# your code here

Get the mean amplitude in the time window 0.2 - 0.4 s for each trial for all channels. Your output vector should have a shape of `(400, 33)`.

In [None]:
# your code here

### ERP-Feature 2: Peak amplitude

Get the positive peak amplitude in the time window 0.2 - 0.4 s, i.e., the highest amplitude within this window, for each trial in channel Cz. You can use e.g. `np.max()`. Your output vector should have a shape of `(400, 1)`.
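
A minimal sketch of peak extraction on synthetic data, assuming a single channel and a 100 Hz time axis (both hypothetical). Each fake trial gets a positive bump at 0.3 s and a negative dip at 0.1 s, so the expected peaks are known in advance.

```python
import numpy as np

# Synthetic single-channel epochs: (n_epochs, 1, n_times), all values hypothetical
times = np.arange(-20, 81) / 100.0  # -0.2 ... 0.8 s
n_epochs = 5
amplitudes = np.arange(1, n_epochs + 1, dtype=float)

data = np.zeros((n_epochs, 1, times.size))
data[:, 0, 50] = amplitudes   # positive bump at 0.3 s
data[:, 0, 30] = -amplitudes  # negative dip at 0.1 s

pos_window = (times >= 0.2) & (times <= 0.4)
neg_window = (times >= 0.0) & (times <= 0.2)

# Highest value in the window per trial; the same pattern with min()
# gives the most negative value in a window
pos_peak = data[:, :, pos_window].max(axis=-1)
neg_peak = data[:, :, neg_window].min(axis=-1)

print(pos_peak.shape, neg_peak.shape)
```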

In [None]:
# your code here

Get the negative peak amplitude in the time window 0 - 0.2 s, i.e., the lowest amplitude within this window, for each trial in channel Cz. Your output vector should have a shape of `(400, 1)`.

In [None]:
# your code here

### ERP-Feature 3: Peak latency

Get the latency of the positive peak amplitude within the time window 0.2 - 0.4 seconds, i.e., the time point (index) of the highest amplitude within this window, for each trial in channel Cz. Your output vector should have a shape of `(400, 1)`.
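
One possible way to turn a peak index into a latency, sketched on synthetic data: `argmax` returns an index relative to the start of the window, so it has to be mapped back through the time axis of that window. The time axis and peak positions below are made up for illustration.

```python
import numpy as np

# Synthetic single-channel epochs with one peak per trial (hypothetical values)
times = np.arange(-20, 81) / 100.0  # -0.2 ... 0.8 s at an assumed 100 Hz
peak_samples = np.array([42, 45, 50, 55, 60])  # sample index of each trial's peak
n_epochs = peak_samples.size

data = np.zeros((n_epochs, 1, times.size))
data[np.arange(n_epochs), 0, peak_samples] = 1.0

window = (times >= 0.2) & (times <= 0.4)
window_times = times[window]

# Index of the maximum inside the window, per trial and channel
peak_idx = data[:, :, window].argmax(axis=-1)

# Map window-relative indices back to seconds
peak_latency = window_times[peak_idx]

print(peak_latency.shape)
```

Whether the exercise expects the latency in seconds or as a sample index is left open by the task description; `peak_idx` holds the index and `peak_latency` the time in seconds.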

In [None]:
# your code here

## Oscillatory Features

Try to code all the feature extractions below:

### Oscillatory-Feature 1: Log-variance of the signal

Calculate the log-variance of the alpha (8-13 Hz) band for one channel, Cz. Note that you have to (1) filter your raw data first, (2) create epochs, and (3) calculate the log-variance for all segments. Your output vector should have a shape of `(400, 1)`.
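
Step (3) can be sketched with NumPy on synthetic data. The example assumes the signal is already band-limited: each fake trial is a pure 10 Hz sine with a different amplitude, so its variance is known to be `A**2 / 2`. The sampling rate and amplitudes are hypothetical.

```python
import numpy as np

# Synthetic "alpha-filtered" epochs: one channel, 1 s at an assumed 100 Hz
t = np.arange(100) / 100.0
amps = np.array([1.0, 2.0, 3.0, 4.0])

# Shape (n_epochs, 1, n_times): a 10 Hz sine scaled per trial
data = amps[:, None, None] * np.sin(2 * np.pi * 10 * t)[None, None, :]

# Log-variance over the time axis, one value per trial and channel
log_var = np.log(data.var(axis=-1))

print(log_var.shape)
```

Taking the logarithm compresses the heavily right-skewed distribution of band variance, which is why log-variance is often preferred over raw variance as a feature.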

In [None]:
# 1. filter raw
l_freq = # your code here
h_freq = # your code here

raw_alpha = raw.copy().filter(l_freq=l_freq, h_freq=h_freq)


# 2. create epochs/segments
# set the time-window of the segments in seconds
tmin=-0.2
tmax=0.8

# get the events list from raw
events, _ = mne.events_from_annotations(raw_alpha)

# select only subset of our events - those related to stimuli
event_ids = {
  'stimulus/compatible/target_left': 3,
  'stimulus/compatible/target_right': 4,
  'stimulus/incompatible/target_left': 5,
  'stimulus/incompatible/target_right': 6
 }

# create segments (Epochs)
epochs_alpha = mne.Epochs(
  raw=raw_alpha,
  events=events,
  event_id=event_ids,
  tmin=tmin,
  tmax=tmax,
  baseline=(-0.2, 0),
  preload=True,
)

In [None]:
# 3. calculate the log-variance of the epochs_alpha

# your code here

### Oscillatory-Feature 2: Power of brain bands

Calculate the power of each of the main brain bands (delta, theta, alpha, beta) for one channel, Cz. Use the [`compute_psd()`](https://mne.tools/stable/generated/mne.Epochs.html#mne.Epochs.compute_psd) method to create a **spectral representation of each epoch**. `compute_psd()` returns an [`EpochsSpectrum`](https://mne.tools/stable/generated/mne.time_frequency.EpochsSpectrum.html#mne.time_frequency.EpochsSpectrum) object. You can get frequency data from `EpochsSpectrum` in a similar way as from `Epochs` (using `get_data()`), but instead of a time window you provide the lower and upper limits of the band you would like to extract. Then average the returned power values across frequencies to get the average power within the given band.

Your output vector should have shape of `(400, 1, 4)`.
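
The band-averaging step can be sketched on a synthetic PSD array shaped `(n_epochs, n_channels, n_freqs)`. Only the alpha limits (8-13 Hz) come from this notebook; the delta, theta, and beta limits below are commonly used values and are an assumption, as are the frequency grid and the half-open band edges.

```python
import numpy as np

# Synthetic PSD: (n_epochs, n_channels, n_freqs), values are random placeholders
rng = np.random.default_rng(0)
freqs = np.arange(1.0, 30.5, 0.5)  # 1 ... 30 Hz in 0.5 Hz steps (hypothetical grid)
psd = rng.random((6, 1, freqs.size))

# Band limits in Hz; all except alpha are assumed conventional values
bands = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}

# Average power inside each band (lower edge inclusive, upper edge exclusive,
# so that no frequency bin is counted in two bands), stacked on a new last axis
band_power = np.stack(
    [psd[:, :, (freqs >= lo) & (freqs < hi)].mean(axis=-1) for lo, hi in bands.values()],
    axis=-1,
)

print(band_power.shape)
```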

In [None]:
# 1. Compute epochs PSD
epochs_spectrum = epochs.compute_psd(fmin=1.0, fmax=30.0)
fig = epochs_spectrum.plot()

In [None]:
# 2. extract and average the data for 4 brain bands

# your code here