# Lexical Decision Task
Today we will try to apply the general EEG preprocessing pipeline from the MNE tutorial to the EEG data from the lexical decision task.


#### Setting up Python
Before analysing our own EEG data, we need to make sure that the virtual environment we created during the `MNE-tutorial` is available as a kernel in this notebook.

1. Press `Select Kernel`, then `Python Environments...` and then choose any Python kernel. 
2. Run the code chunk below
3. Change the kernel used to run the code in this notebook. Press where it says `Python X.XX.XX` in the top right corner, then `Select Another Kernel`, then `Jupyter kernel...` and then select `env`. If `env` does not show up, press the little refresh symbol! 

In [1]:
!bash ../env_to_ipynb_kernel.sh

Installed kernelspec env in /home/ucloud/.local/share/jupyter/kernels/env


# Analysis of the collected EEG data
The preprocessing workflow is as follows:
1. Load the data
2. Exclude bad channels
3. Common average reference
4. Filtering
5. Artefact rejection
6. Epoching
7. Downsampling

This is the same workflow as in the `preprocessing_tutorial.ipynb`. This notebook serves as a "skeleton" where you fill out the code you need. All code bits needed can be found in the notebook from yesterday! 

<div class="alert alert-block alert-info"><b>Tip:</b>
The MNE package has some really nice documentation! If you have any questions on how to use a function or if you want to see other ways you can plot your data, have a look at it! 

https://mne.tools/stable/index.html
</div>

In [1]:
import mne
from pathlib import Path
import matplotlib
import numpy as np
import pandas as pd

# importing helper functions that Laura and Andreas wrote to fix the triggers based on the logfile
import sys
sys.path.append("..")

from helper_functions import update_events_group1_group2, update_event_ids

%matplotlib inline

## 1. Load the data
As we are no longer using sample data from MNE, the process of loading the data will be a bit different. Therefore, code to help you is provided here! **Remember to change the group_number variable!**

In [2]:
# reading the file & loading in the data
data_folder = Path("/work/EEG_lab/raw")
group_number = "1"

# path to the data, made using the group_number variable
EEG_path = data_folder / "EEG" / f"group{group_number}.vhdr"

raw = mne.io.read_raw_brainvision(EEG_path)
raw.load_data()

# set standard montage (let MNE know the layout of the electrodes on the cap)
montage = mne.channels.make_standard_montage('standard_1020')
raw.set_montage(montage, verbose=False)

Extracting parameters from /work/EEG_lab/raw/EEG/group1.vhdr...
Setting channel info structure...
Reading 0 ... 531959  =      0.000 ...   531.959 secs...


| Section | Field | Value |
|---|---|---|
| General | Filename(s) | group1.eeg |
| General | MNE object type | RawBrainVision |
| General | Measurement date | 2024-10-08 at 14:03:19 UTC |
| General | Participant | Unknown |
| General | Experimenter | Unknown |
| Acquisition | Duration | 00:08:52 (HH:MM:SS) |
| Acquisition | Sampling frequency | 1000.00 Hz |
| Acquisition | Time points | 531960 |


In [3]:
# reading in the csv file with experiment information
behavioural_path = data_folder / "behavioural" / f"subject-{group_number}.csv"
logfile = pd.read_csv(behavioural_path)

# clean logfile: remove surrounding whitespace and stray "?" characters
logfile["target_gender"] = logfile["target_gender"].str.strip().str.strip("?")

# show the first lines of the csv file (only the last expression in a cell is displayed)
logfile.head()
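To see what the clean-up of `target_gender` does, here is a minimal sketch on a made-up column. The values in `toy` are invented for illustration and are not taken from the real logfile.

```python
import pandas as pd

# Made-up example values: stray whitespace and trailing question marks
toy = pd.DataFrame({"target_gender": [" female", "male??", "neutral ", "control"]})

# Remove surrounding whitespace first, then any leading/trailing "?" characters
toy["target_gender"] = toy["target_gender"].str.strip().str.strip("?")

print(toy["target_gender"].tolist())
```

Checking `logfile["target_gender"].unique()` afterwards is a quick way to confirm that only the expected labels remain.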

## 2. Exclude bad channels

In [5]:
# Filtering before plotting can make the channels easier to inspect, as it removes some of the noise.
# BUT if you filter here, there is no need to filter again in step 4 :D
# raw.filter(0.1, 40)

In [None]:
# STEP 1: Plot the raw data to help you identify bad channels
raw.plot(
    n_channels=32, 
    start=100, 
    scalings={"eeg": 250e-7}, # try modifying this value to make the plot more pleasant to look at 
    duration=10);

In [7]:
# STEP 2: Mark bad channels as bad if there are any!

In [8]:
# STEP 3: Remove the bad channels

## 3. Common average reference

In [None]:
# STEP 1: Choose the common average reference
raw.set_eeg_reference('average', projection=True)

# STEP 2: applying the reference to the data
raw.apply_proj()

## 4. Filtering

In [10]:
# STEP 1: high-pass filter the data at 0.1 Hz and low-pass at 40 Hz

# STEP 2: plot the filtered data for inspection

## 5. Artefact rejection

In [11]:
# Define the rejection threshold; it is applied later, when we create the epochs
reject = dict(eeg=150e-6)  # 150 µV
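The threshold is compared against the peak-to-peak amplitude (maximum minus minimum) of each channel within an epoch. A minimal NumPy sketch of that check, using a made-up two-channel epoch:

```python
import numpy as np

reject_threshold = 150e-6  # 150 µV, expressed in volts

# Made-up epoch: 2 channels x 5 samples, in volts
toy_epoch = np.array([
    [10e-6, -20e-6, 30e-6, 0.0, 5e-6],     # peak-to-peak = 50 µV
    [100e-6, -80e-6, 20e-6, 0.0, -10e-6],  # peak-to-peak = 180 µV
])

# Peak-to-peak amplitude per channel
peak_to_peak = np.ptp(toy_epoch, axis=1)

# The epoch counts as bad if any channel exceeds the threshold
epoch_is_bad = bool((peak_to_peak > reject_threshold).any())
print(peak_to_peak * 1e6, epoch_is_bad)
```

This only illustrates the idea; `mne.Epochs` performs the check itself when `reject` is passed.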

## 6. Epoching

In [5]:
event_id = {
   # word / prime / prime_gender / age
   'word/prime/female/adult': 11,
   'word/prime/female/child': 12,
   'word/prime/female/neutral': 13,
   'word/prime/male/adult': 21,
   'word/prime/male/child': 22,
   'word/prime/male/neutral': 23,
   'word/prime/neutral/adult': 31,
   'word/prime/neutral/child': 32,
   'word/prime/neutral/neutral': 33,
   'word/prime/filler': 40,

   # word / target / target_gender / congruency or filler or neutral (prime gender)
   'word/target/female/congruent': 111,
   'word/target/female/incongruent': 121,
   'word/target/female/neutral': 131,
   'word/target/female/filler': 141,
   'word/target/male/incongruent': 112,
   'word/target/male/congruent': 122,
   'word/target/male/neutral': 132,
   'word/target/male/filler': 142,
   'word/target/neutral/female': 113,
   'word/target/neutral/male': 123,
   'word/target/neutral/neutral': 133,
   'word/target/neutral/filler': 143,
   'word/target/control/female': 114,
   'word/target/control/male': 124,
   'word/target/control/neutral': 134,
   'word/target/control/filler': 144,

   # response / correct / target_gender / congruency or filler or neutral (prime gender) / button that was pressed
   'response/incorrect/female/congruent/m': 161,
   'response/incorrect/female/congruent/z': 166,
   'response/incorrect/female/incongruent/m': 171,
   'response/incorrect/female/incongruent/z': 176,
   'response/incorrect/female/neutral/m': 181,
   'response/incorrect/female/neutral/z': 186,
   'response/incorrect/female/filler/m': 191,
   'response/incorrect/female/filler/z': 196,
   'response/incorrect/male/incongruent/m': 162,
   'response/incorrect/male/incongruent/z': 167,
   'response/incorrect/male/congruent/m': 172,
   'response/incorrect/male/congruent/z': 177,
   'response/incorrect/male/neutral/m': 182,
   'response/incorrect/male/neutral/z': 187,
   'response/incorrect/male/filler/m': 192,
   'response/incorrect/male/filler/z': 197,
   'response/incorrect/neutral/female/m': 163,
   'response/incorrect/neutral/female/z': 168,
   'response/incorrect/neutral/male/m': 173,
   'response/incorrect/neutral/male/z': 178,
   'response/incorrect/neutral/neutral/m': 183,
   'response/incorrect/neutral/neutral/z': 188,
   'response/incorrect/neutral/filler/m': 193,
   'response/incorrect/neutral/filler/z': 198,
   'response/incorrect/control/female/m': 164,
   'response/incorrect/control/female/z': 169,
   'response/incorrect/control/male/m': 174,
   'response/incorrect/control/male/z': 179,
   'response/incorrect/control/neutral/m': 184,
   'response/incorrect/control/neutral/z': 189,
   'response/incorrect/control/filler/m': 194,
   'response/incorrect/control/filler/z': 199,
   'response/correct/female/congruent/m': 211,
   'response/correct/female/congruent/z': 216,
   'response/correct/female/incongruent/m': 221,
   'response/correct/female/incongruent/z': 226,
   'response/correct/female/neutral/m': 231,
   'response/correct/female/neutral/z': 236,
   'response/correct/female/filler/m': 241,
   'response/correct/female/filler/z': 246,
   'response/correct/male/incongruent/m': 212,
   'response/correct/male/incongruent/z': 217,
   'response/correct/male/congruent/m': 222,
   'response/correct/male/congruent/z': 227,
   'response/correct/male/neutral/m': 232,
   'response/correct/male/neutral/z': 237,
   'response/correct/male/filler/m': 242,
   'response/correct/male/filler/z': 247,
   'response/correct/neutral/female/m': 213,
   'response/correct/neutral/female/z': 218,
   'response/correct/neutral/male/m': 223,
   'response/correct/neutral/male/z': 228,
   'response/correct/neutral/neutral/m': 233,
   'response/correct/neutral/neutral/z': 238,
   'response/correct/neutral/filler/m': 243,
   'response/correct/neutral/filler/z': 248,
   'response/correct/control/female/m': 214,
   'response/correct/control/female/z': 219,
   'response/correct/control/male/m': 224,
   'response/correct/control/male/z': 229,
   'response/correct/control/neutral/m': 234,
   'response/correct/control/neutral/z': 239,
   'response/correct/control/filler/m': 244,
   'response/correct/control/filler/z': 249
}
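The `/` in the event names separates tags, and MNE can select epochs by a subset of tags (for example `epochs["word/target"]`). The plain-Python sketch below mimics that matching idea on an excerpt of the dictionary; `select_by_tags` is a hypothetical helper written for this illustration, not an MNE function.

```python
# A small excerpt of the event_id dictionary above
event_id_excerpt = {
    "word/prime/female/adult": 11,
    "word/target/female/congruent": 111,
    "word/target/male/congruent": 122,
    "response/correct/female/congruent/m": 211,
}


def select_by_tags(event_id, query):
    """Return entries whose '/'-separated tags include every tag in `query`."""
    wanted = set(query.split("/"))
    return {
        name: code
        for name, code in event_id.items()
        if wanted <= set(name.split("/"))
    }


targets = select_by_tags(event_id_excerpt, "word/target")
female_congruent = select_by_tags(event_id_excerpt, "female/congruent")
print(targets)
print(female_congruent)
```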

In [9]:
print(logfile["prime_gender"].unique())
print(logfile["target_gender"].unique())

['filler' 'female' 'neutral' 'male']
['female' 'control' 'neutral' 'male']


In [7]:
# STEP 2: Locate the stimulus events in the recording and save them in a variable called events
# We do this a bit differently here, since we don't have a stimulus channel like in the sample data;
# instead, the triggers are stored as annotations in the file.
# Therefore I have provided the code for you :)
events, _ = mne.events_from_annotations(raw)

Used Annotations descriptions: [np.str_('New Segment/'), np.str_('Stimulus/S  1'), np.str_('Stimulus/S 10'), np.str_('Stimulus/S 20'), np.str_('Stimulus/S 30')]


**THIS CODE IS ONLY TO BE RUN BY GROUP 1 AND GROUP 2**

All other groups can just delete the code in the chunk below!

In [8]:
# STEP 2B: ONLY GROUP 1 and GROUP 2
# get rid of practice trials from the logfile
logfile = logfile[logfile["practice"]=="no"]

events = update_events_group1_group2(events, event_id, logfile)


KeyError: 'word/target/filler/female'
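A `KeyError` like this means that a label was looked up in `event_id` that is not one of its keys. How `update_events_group1_group2` builds its labels is not shown in this notebook, so the sketch below only illustrates a general check: whether the dictionary contains a key made of the same tags in a different order. It uses an excerpt of `event_id` and does not claim to identify the actual cause.

```python
# Excerpt of event_id: the target entries for female targets
event_id_excerpt = {
    "word/target/female/congruent": 111,
    "word/target/female/incongruent": 121,
    "word/target/female/neutral": 131,
    "word/target/female/filler": 141,
}

missing_label = "word/target/filler/female"  # the label from the KeyError

# Is the label itself a key?
print(missing_label in event_id_excerpt)

# Keys made of the same tags, possibly in a different order
same_tags = [
    name for name in event_id_excerpt
    if sorted(name.split("/")) == sorted(missing_label.split("/"))
]
print(same_tags)
```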

In [15]:
# STEP 3 (EVERYONE): establish a time window for epochs (tmin and tmax)
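The time window determines how many samples each epoch contains. A small arithmetic sketch, where the values of `tmin` and `tmax` are purely illustrative (choose your own based on the effects you want to look at) and the sampling frequency is the one reported when the data were loaded:

```python
sfreq = 1000.0          # sampling frequency reported when loading the data (Hz)
tmin, tmax = -0.2, 0.5  # illustrative values only, in seconds

# Both end points are included in the epoch, hence the + 1
n_samples_per_epoch = int(round((tmax - tmin) * sfreq)) + 1
print(n_samples_per_epoch)
```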

In [16]:
# STEP 4: Update the event_id dictionary
# When creating the epochs, mne.Epochs throws an error if an entry in the
# dictionary has no matching trigger in the events.
# However, if a participant never answered incorrectly to, e.g., incongruent stimuli,
# that trigger will not be in the recording, so we remove such entries first.
new_event_id = update_event_ids(events, event_id)
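The implementation of `update_event_ids` lives in `helper_functions` and is not shown here. As a rough idea of what such a step might do, the sketch below keeps only the dictionary entries whose trigger code occurs in the events array. `keep_present_event_ids` is a hypothetical stand-in, and the events are made up.

```python
import numpy as np


def keep_present_event_ids(events, event_id):
    """Keep only the entries whose trigger code occurs in the events array."""
    present_codes = set(events[:, 2].tolist())
    return {name: code for name, code in event_id.items() if code in present_codes}


# Made-up events: [sample, previous value, trigger code]
toy_events = np.array([[100, 0, 11], [350, 0, 111], [900, 0, 211]])
toy_event_id = {
    "word/prime/female/adult": 11,
    "word/target/female/congruent": 111,
    "response/correct/female/congruent/m": 211,
    "response/incorrect/female/congruent/m": 161,  # never occurs in toy_events
}

toy_new_event_id = keep_present_event_ids(toy_events, toy_event_id)
print(toy_new_event_id)
```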


In [None]:
# STEP 5: Create the epochs
epochs = mne.Epochs(
    raw,
    events,
    event_id=new_event_id,
    tmin=tmin,
    tmax=tmax,
    picks=["eeg"],
    baseline=(None, 0),
    reject=reject,
    preload=True,
)

**At this point call Laura over to have a look!**

## 7. Downsampling

In [None]:
# STEP 1: Downsample to 250 Hz

# Analysis
Make some initial plots of the different conditions. You can get inspiration from the preprocessing tutorial notebook from yesterday!
