# Preprocessing intracranial EEG using MNE-python


*NeuroHackademy 2023*  
[Liberty Hamilton, PhD](https://slhs.utexas.edu/research/hamilton-lab)  
Assistant Professor, Department of Speech, Language, and Hearing Sciences and  
Department of Neurology  
The University of Texas at Austin 

## Part 1: Loading, Plotting, and Referencing
This notebook will show you how to preprocess intracranial EEG using MNE-python. It uses a freely available iEEG dataset on audiovisual movie watching from [Julia Berezutskaya, available on OpenNeuro.org](https://openneuro.org/datasets/ds003688/versions/1.0.7/metadata). This notebook covers the basics of how to look at iEEG data, interpret the effects of referencing, and inspect data quality. In part 2, we will cover high gamma extraction and plotting evoked data. The method of high gamma extraction is identical to that used in [Hamilton et al. 2018](https://doi.org/10.1016/j.cub.2018.04.033) and [Hamilton et al. 2021](https://doi.org/10.1016/j.cell.2021.07.019).

## Python libraries used in this tutorial

* matplotlib
* mne_bids
* [MNE-python](https://mne.tools/stable/install/index.html)
* numpy
* pandas
* pybids

## What you will do in this tutorial

* Load an iEEG dataset in MNE-python
* Compare an iEEG dataset loaded with BIDS metadata vs. without, so you know what to do if you encounter data that lacks this information
* Plot the power spectrum of the data to check for bad channels and compare channel types
* Re-reference the data according to different reference schemes

### and in part 2! ([`02_ieeg_preprocessing_MNE_epochs.ipynb`](02_ieeg_preprocessing_MNE_epochs.ipynb))
* Compute the high gamma analytic amplitude of the signal
* Plot evoked data

In [None]:
%matplotlib notebook

import mne
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import os
from mne_bids import read_raw_bids, print_dir_tree
from mne_bids.path import get_bids_path_from_fname
from bids import BIDSLayout
from ecog_preproc_utils import transformData
import bids 
import re  # regex for comparing channel names

## Download BIDS iEEG dataset

Here we will download an example iEEG dataset from [Berezutskaya et al., Open multimodal iEEG-fMRI dataset from naturalistic stimulation with a short audiovisual film](https://openneuro.org/datasets/ds003688/versions/1.0.7/metadata). For this tutorial we will use only the `iemu` session of `sub-06`, which has already been downloaded to the JupyterHub. The whole dataset is rather large (15 GB), so you may prefer to download just this session.

In [None]:
# This is the example participant's data that we will load for the tutorial,
# but there are more options.
subj = '06'
sess = 'iemu'
task = 'film'
acq = 'clinical'
run = 1

In [None]:
# Change the data directory below to where your data are located. 
parent_dir = '/home/jovyan/shared/ds003688/'  # This is on the jupyter hub
ieeg_dir = f'{parent_dir}/sub-{subj}/ses-{sess}/ieeg/'
channel_path = f'{ieeg_dir}/sub-{subj}_ses-{sess}_task-{task}_acq-{acq}_run-{run}_channels.tsv'
raw_path = f'{ieeg_dir}/sub-{subj}_ses-{sess}_task-{task}_acq-{acq}_run-{run}_ieeg.vhdr'

bids_path = get_bids_path_from_fname(raw_path)
base_name = os.path.basename(raw_path).split('.')[0]
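
The file names above follow the BIDS convention of chaining `key-value` entities with underscores and ending in a suffix (here `ieeg`). As a minimal sketch of that convention, the base name can be assembled and taken apart as shown below. The helper names `build_bids_basename` and `parse_bids_basename` are made up for illustration and are not part of `mne_bids`.

```python
def build_bids_basename(entities, suffix):
    """Join key-value entities with underscores and append the suffix."""
    pairs = [f"{key}-{value}" for key, value in entities.items()]
    return "_".join(pairs + [suffix])


def parse_bids_basename(basename):
    """Split a BIDS base name back into its entities and suffix."""
    *pairs, suffix = basename.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix


# Entities for the example participant used in this tutorial
entities = {"sub": "06", "ses": "iemu", "task": "film",
            "acq": "clinical", "run": "1"}
basename = build_bids_basename(entities, "ieeg")
print(basename)

parsed_entities, parsed_suffix = parse_bids_basename(basename)
print(parsed_entities, parsed_suffix)
```

In practice you would not write this yourself: `get_bids_path_from_fname` (used above) does the parsing for you.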

## BIDS layout

We can use `pybids` to show a little bit about the files in this BIDS dataset. We won't go into much detail here, but if you'd like to try this tutorial on your own you may wish to explore it further.

In [None]:
layout = BIDSLayout(parent_dir)

In [None]:
layout.get_tasks()

In [None]:
all_files = layout.get()
print("There are {} files in the layout.".format(len(all_files)))
print("\nThe first 10 files are:")
all_files[:10]

In [None]:
print_dir_tree(parent_dir, max_depth=3)

## Let's load some iEEG data!

First, we will choose the relevant subject, session, task, acquisition, and run. Note that if you wish to change these variables, you may need to download the data yourself.

To show the capabilities of BIDS and contrast to when we don't use BIDS, we'll load the data in two ways. The data structure using BIDS will be called `raw`, the data structure without BIDS will be `raw_nobids`.

In [None]:
# Read data and extract parameters from BIDS files
raw = read_raw_bids(bids_path, verbose=True)

In [None]:
# Read the data assuming we didn't have the BIDS structure in place
raw_nobids = mne.io.read_raw_brainvision(raw_path, preload=True)

In [None]:
# Let's load the data into memory and print some information about it. The
# info structure contains a lot of helpful metadata about number of channels,
# sampling rate, data types, etc. It can also contain information about the
# participant and date of acquisition; however, this dataset has been anonymized.
raw.load_data()
raw.info

In [None]:
raw.info['ch_names']

## Plot the raw data

Let's first inspect the raw data to see how it looks, what type of information we have, and whether we can immediately see any bad channels or bad time segments that should be rejected. Typically this portion should be done interactively so you can scroll through the data using the arrow keys. 

We will compare and contrast the data loaded using MNE BIDS (`raw`) versus the data loaded without this information (`raw_nobids`).

In [None]:
# Plot the data, reject bad segments. Look for times where there
# are spike wave discharges (epileptiform artifacts) or large
# movement artifacts. Be selective, look out for blocks with a 
# ton of seizure activity
raw.plot(scalings='auto', color=dict(eeg='b', ecog='b'), n_channels=64, block=True)

In [None]:
# Plot the data, reject bad segments. Look for times where there
# are spike wave discharges (epileptiform artifacts) or large
# movement artifacts. Be selective, look out for blocks with a 
# ton of seizure activity. This is when we don't have the nice
# BIDS metadata automatically loaded in.
raw_nobids.plot(scalings='auto', color=dict(eeg='b', ecog='b'), n_channels=64, block=True)

## Plot the power spectrum

Now we will plot the power spectrum of the signal to give us an idea of the signals we're getting. Bad channels (or channels that are not EEG/ECoG) will often have a very different power spectrum than the good channels. They will fall well outside the range of the other channels: either flat, or with much higher or lower power.
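
The same idea can be checked numerically. The sketch below is not what `raw.compute_psd()` does internally. It uses a plain FFT periodogram on simulated data and an arbitrary threshold of one order of magnitude (both are assumptions for illustration) to flag channels whose broadband power is far from the median across channels.

```python
import numpy as np


def flag_outlier_channels(data, max_log10_diff=1.0):
    """Return indices of channels whose mean power is far from the median.

    data : array, shape (n_channels, n_times)
    max_log10_diff : allowed distance from the across-channel median, in
        log10 power units (1.0 = one order of magnitude; arbitrary choice).
    """
    # Periodogram from the real FFT; skip the DC bin
    power = np.abs(np.fft.rfft(data, axis=1)) ** 2
    # The small constant avoids log10(0) for completely flat channels
    log_power = np.log10(power[:, 1:].mean(axis=1) + 1e-30)
    distance = np.abs(log_power - np.median(log_power))
    return np.flatnonzero(distance > max_log10_diff)


# Simulated recording: 8 channels of white noise
rng = np.random.default_rng(0)
data = rng.standard_normal((8, 2048))
data[2] = 0.0    # flat (disconnected) channel
data[5] *= 50.0  # channel with much higher power than the rest

bad_idx = flag_outlier_channels(data)
print(bad_idx)
```

A check like this can point you to suspicious channels, but it does not replace looking at the spectrum and the raw traces yourself.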

In [None]:
raw.compute_psd().plot(picks='data', exclude=[]);

In [None]:
# If we want to see other options we have for computing the power spectrum, 
# we can consult the help function
raw.compute_psd?

## Do the same without having loaded the metadata from BIDS

Here we will see the data with bad channels included, and with all the channel types marked as EEG.

In [None]:
raw_nobids.info

In [None]:
raw_nobids.compute_psd().plot();
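
If you have a `channels.tsv` sidecar but did not load the data through `mne_bids`, you can recover the channel types and bad channels from it by hand. The sketch below parses a small, made-up table (the channel names and rows are hypothetical, not taken from this dataset) using the standard BIDS columns `name`, `type`, and `status`.

```python
import csv
import io

# Hypothetical contents of a BIDS channels.tsv file (made up for this example)
channels_tsv = (
    "name\ttype\tunits\tstatus\n"
    "P01\tECOG\tuV\tgood\n"
    "P02\tECOG\tuV\tbad\n"
    "ECG+\tECG\tuV\tgood\n"
)

rows = list(csv.DictReader(io.StringIO(channels_tsv), delimiter="\t"))

# BIDS types are upper case, while MNE expects lower-case names such as 'ecog'.
# Lower-casing only works for types that exist under the same name in both.
ch_types = {row["name"]: row["type"].lower() for row in rows}
bads = [row["name"] for row in rows if row["status"] == "bad"]

print(ch_types)
print(bads)
```

With a real file you would open `channel_path` instead of the in-memory string. The results could then be applied with `raw_nobids.set_channel_types(ch_types)` and `raw_nobids.info['bads'] = bads`.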

## Referencing

Referencing or re-referencing your data should be done with some knowledge of your recording setup and what you wish to measure. You can read more about referencing [here (for EEG)](https://pressrelease.brainproducts.com/referencing/#:~:text=The%20reference%20influences%20the%20amplitude,affected%20by%20similar%20electrical%20activity.). Typically, experimenters will choose one of the following references:

1. Based on a single electrode in white matter (or a relatively "quiet" electrode far away from your signals of interest).
2. Based on the average of all electrodes or a block of electrodes (CAR or Common Average Reference). Note that the CAR is *not* a good idea if all of your electrodes are within a single functional area, as you will likely subtract out more signal than noise. 
3. Bipolar referencing, in which pairs of adjacent electrodes are subtracted to calculate more local signals. This is a bit more complicated but allows you to work with data in a single region without the drawbacks of the CAR.
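
To make the arithmetic behind these three schemes concrete, here is a minimal NumPy sketch on simulated data. It is only an illustration of the subtraction each scheme performs, not MNE's implementation. The choice of channel 0 as the single reference and the pairing of neighbouring rows for the bipolar case are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n_channels, n_times = 6, 1000

# Each channel = its own local activity + noise shared by all channels
local = rng.standard_normal((n_channels, n_times))
shared = 10.0 * rng.standard_normal(n_times)
data = local + shared

# 1. Single reference: subtract one channel from every channel
ref_idx = 0
single_ref = data - data[ref_idx]

# 2. Common average reference: subtract the mean across channels
car = data - data.mean(axis=0, keepdims=True)

# 3. Bipolar: subtract each channel's neighbour (one fewer channel results)
bipolar = data[:-1] - data[1:]

print(np.abs(single_ref[ref_idx]).max())  # the reference channel becomes flat
print(data.std(), car.std(), bipolar.std())
```

In all three cases the shared noise cancels, because it is identical on every channel in this simulation. Real recordings are less tidy, which is why the choice of reference matters.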

In [None]:
# Example of a single reference channel. This subtracts the signal of the chosen
# channel from every channel (including itself), so the reference channel will
# appear as a flat line after this step.
raw_ref = raw.copy()
raw_ref.set_eeg_reference(ref_channels=['P01'])
raw_ref.plot(scalings='auto', color=dict(eeg='b', ecog='b'), n_channels=64, block=False)

In [None]:
# Example of common average reference. The common average reference subtracts the average
# signal across all *good* channels from every single channel. This is typically a good
# choice for removing noise across all channels (for example, sometimes EMG from chewing,
# some movement artifacts, electrical noise).
raw_ref_car = raw.copy()
raw_ref_car.set_eeg_reference(ref_channels='average')
raw_ref_car.plot(scalings='auto', color=dict(eeg='b', ecog='b'), n_channels=64, block=False)
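
Why does it matter that only *good* channels enter the average? The NumPy sketch below uses simulated data with one artificially bad channel to show how a single bad channel contaminates every channel if it is included in the common average. The function name `common_average_reference` is made up for this example.

```python
import numpy as np


def common_average_reference(data, bad_idx=()):
    """Subtract the mean of the good channels from every channel.

    data : array, shape (n_channels, n_times)
    bad_idx : indices of channels to leave out of the average
    """
    good = np.ones(data.shape[0], dtype=bool)
    good[np.asarray(bad_idx, dtype=int)] = False
    reference = data[good].mean(axis=0, keepdims=True)
    return data - reference


rng = np.random.default_rng(1)
data = rng.standard_normal((5, 500))
data[3] += 1000.0  # simulated bad channel with a large offset

car_all = common_average_reference(data)               # bad channel included
car_good = common_average_reference(data, bad_idx=[3])  # bad channel excluded

# Mean of a good channel after each version of the CAR
print(car_all[0].mean(), car_good[0].mean())
```

When the bad channel is included, its offset is spread across all channels. Excluding it keeps the good channels centred near zero.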

## Bipolar reference

Bipolar referencing is a bit trickier and is not fully implemented here. You need to use knowledge of the physical locations of the electrodes to properly create the bipolar montage. For example, in the image below, we would need to use the knowledge of how the electrodes are placed in order to create the appropriate pairs for the anode and cathode.

![sub-06 electrode locations](sub-06_ses-iemu_acq-render_photo_ecog_left.jpeg)

In [None]:
# Example of bipolar reference. This would normally be done with some manual
# intervention of the specific channel pairs. 
raw_ieeg = raw.copy()
raw_ieeg.pick_types(ecog=True)
ch_pairs = list(zip(raw_ieeg.info['ch_names'][:-1],
                    raw_ieeg.info['ch_names'][1:]))

# Make a list of channels for the anode and the cathode
anode = []
cathode = []
# Only subtract channels with the same device name (these will be
# close in space). This is still not ideal as we are probably
# subtracting electrodes that are far away from one another in 
# space (for example, electrodes 8 and 9 on the grid picture
# above should not be used for a bipolar reference)
for pair in ch_pairs:
    # get rid of the numbers in the ch_name
    ch1_dev = re.sub(r'\d+', '', pair[0]) 
    ch2_dev = re.sub(r'\d+', '', pair[1]) 
    # if these are part of the same device, consider them for 
    # anode and cathode selection
    if ch1_dev == ch2_dev:
        anode.append(pair[0])
        cathode.append(pair[1])

# Apply the bipolar reference
raw_ref_bip = mne.set_bipolar_reference(raw, anode=anode, cathode=cathode)
raw_ref_bip.plot(scalings='auto', color=dict(eeg='b', ecog='b'), n_channels=64, block=True)
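
One way to avoid pairing electrodes that sit at opposite ends of neighbouring grid rows is to build the pairs from the grid geometry. The sketch below assumes a hypothetical grid that is 8 columns wide, numbered row by row starting at 1, with channel names of the form `G01`. The real layout and names must come from the electrode photo or localization. The function `grid_bipolar_pairs` is made up for illustration.

```python
def grid_bipolar_pairs(prefix, n_rows, n_cols, width=2):
    """Pair horizontally adjacent electrodes on a row-major grid.

    Pairs never cross from the end of one row to the start of the next.
    """
    anode, cathode = [], []
    for row in range(n_rows):
        for col in range(n_cols - 1):
            first = row * n_cols + col + 1  # 1-based electrode number
            anode.append(f"{prefix}{first:0{width}d}")
            cathode.append(f"{prefix}{first + 1:0{width}d}")
    return anode, cathode


# Hypothetical 2 x 8 grid: electrodes G01-G08 in row 1, G09-G16 in row 2
anode, cathode = grid_bipolar_pairs("G", n_rows=2, n_cols=8)
for a, c in zip(anode, cathode):
    print(f"{a} - {c}")
```

The resulting lists could be passed as `anode` and `cathode` to `mne.set_bipolar_reference`, in place of the name-based pairs built above.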

## Want to plot some evoked data? 

Go to Part 2! [02_ieeg_preprocessing_MNE_epochs.ipynb](02_ieeg_preprocessing_MNE_epochs.ipynb)