## TUTORIAL W38

In this tutorial, we will go through how to analyse visual responses.
Steps will include:  
- preprocessing
    - artefact rejection
    - filtering
- epoching
    - rejecting based on peak-to-peak amplitude
- evoked responses
    - difference waves
- estimating noise covariance
    - whitening the data
- computing an inverse model (MNE)
- extracting labels from the cerebral cortex
    - plotting source time courses from different labels
    - making difference plots
- doing a multivariate analysis in source space
    - (doing it in sensor space is left as an exercise to the reader)

In [None]:
## IMPORTS AND DEFAULT PLOTTING PARAMETERS

import mne ## MNE-Python for analysing data
## below magic provides interactive plots in notebook
%matplotlib widget
from os import chdir
from os.path import join
import matplotlib.pyplot as plt ## for basic plotting
import matplotlib as mpl ## for setting default parameters


## SAMPLE DATA SET (https://mne.tools/stable/documentation/datasets.html#sample)
*These data were acquired with the Neuromag Vectorview system at MGH/HMS/MIT Athinoula A. Martinos Center Biomedical Imaging. EEG data from a 60-channel electrode cap was acquired simultaneously with the MEG. The original MRI data set was acquired with a Siemens 1.5 T Sonata scanner using an MPRAGE sequence.*

*In this experiment, checkerboard patterns were presented to the subject into the left and right visual field, interspersed by tones to the left or right ear. The interval between the stimuli was 750 ms. Occasionally a smiley face was presented at the center of the visual field. The subject was asked to press a key with the right index finger as soon as possible after the appearance of the face.*

Change `sample_path` below to the location of the sample data on your own system.

In [None]:
#%% LOAD SAMPLE DATA SET

sample_path = '/work/MEG_data/MNE-sample-data' ## UCloud
sample_meg_path = join(sample_path, 'MEG', 'sample')
chdir(sample_meg_path)
subjects_dir = '../../subjects/'

## MARKING BAD CHANNELS

First, try to identify the two bad channels: one electrode and one planar gradiometer.  
(They have been greyed out. You can mark a channel as bad by clicking it, and unmark it by clicking it again.)  
Notice that the two marked channels look considerably different from the others.  
The bad channels are also listed in `raw.info['bads']`.  
The x-axis shows time; the y-axes show magnetic field (T), magnetic gradient (T/m) or voltage (V).  
This is your $\boldsymbol b(t)$, the data measured at the sensors at each time point.

In [None]:
#%% READ RAW

raw = mne.io.read_raw_fif('sample_audvis_raw.fif', preload=True)
fig = raw.plot()

## IDENTIFYING BAD CHANNELS FROM THE POWER SPECTRAL DENSITY (PSD)

Now we will look at the two bad channels in the power spectral density.  
Describe to yourself how they differ from the rest. Bad channels are often easier to identify here than in the raw traces.  
The x-axis shows the frequency (Hz) and the y-axis shows the power at each frequency in dB. Here, too, you can click channels to toggle them as bad.

In [None]:
raw.compute_psd().plot();
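
As a rough illustration of what a PSD plot shows, the sketch below computes a simple periodogram with NumPy for a made-up signal. It is not how `raw.compute_psd()` works internally; the sampling frequency, the toy signal and all variable names are assumptions chosen for the example.

```python
import numpy as np

sfreq = 600.0                      # assumed sampling frequency (Hz)
n_times = 1200                     # two seconds of data
times = np.arange(n_times) / sfreq

# hypothetical channel: a 10 Hz rhythm plus ten times weaker 60 Hz line noise
signal = np.sin(2 * np.pi * 10 * times) + 0.1 * np.sin(2 * np.pi * 60 * times)

# periodogram: squared magnitude of the Fourier transform
freqs = np.fft.rfftfreq(n_times, d=1 / sfreq)
power = np.abs(np.fft.rfft(signal)) ** 2 / (sfreq * n_times)
power_db = 10 * np.log10(power + 1e-30)  # small constant avoids log(0)

peak_freq = freqs[np.argmax(power_db)]
print(f'Strongest frequency: {peak_freq:.1f} Hz')
```

Because the y-axis is in dB, a tenfold difference in amplitude shows up as a 20 dB difference in power. A channel whose spectrum sits far above or below the others across many frequencies is a candidate bad channel.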

## FILTERING

With filtering, we can attenuate frequencies that contain signal that is not of interest to our analysis.  
For each of the copies below, compute a PSD and work out for yourself what each filter does and does not do.

In [None]:
copy_lowpass = raw.copy() ## create a copy so we do not overwrite the original
copy_lowpass.filter(h_freq=40,   l_freq=None) ## lowpass filter of 40 Hz

copy_highpass = raw.copy()
copy_highpass.filter(h_freq=None, l_freq=1) ## highpass filter of 1 Hz

copy_bandstop = raw.copy()
copy_bandstop.filter(h_freq=1,    l_freq=40) ## bandstop filter of 1-40 Hz (l_freq > h_freq)

copy_bandpass = raw.copy()
copy_bandpass.filter(h_freq=40,   l_freq=1); ## bandpass filter of 1-40 Hz
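
To get a feel for what the different filter types do, here is a minimal NumPy sketch that uses a crude brick-wall filter in the frequency domain. It is a stand-in for illustration only, not the filter that MNE-Python designs; the helper `brickwall_filter` and the toy signal are hypothetical. It mimics the convention used above, where `l_freq > h_freq` gives a band-stop filter.

```python
import numpy as np

sfreq = 600.0                      # assumed sampling frequency (Hz)
n_times = 1200
times = np.arange(n_times) / sfreq
slow = np.sin(2 * np.pi * 10 * times)   # 10 Hz component
fast = np.sin(2 * np.pi * 60 * times)   # 60 Hz component
signal = slow + fast


def brickwall_filter(data, sfreq, l_freq=None, h_freq=None):
    """Zero all Fourier coefficients outside the pass-band (illustration only)."""
    freqs = np.fft.rfftfreq(data.size, d=1 / sfreq)
    keep = np.ones(freqs.size, dtype=bool)
    if l_freq is not None and h_freq is not None and l_freq > h_freq:
        # band-stop: remove everything between h_freq and l_freq
        keep = (freqs <= h_freq) | (freqs >= l_freq)
    else:
        if l_freq is not None:
            keep &= freqs >= l_freq     # high-pass edge
        if h_freq is not None:
            keep &= freqs <= h_freq     # low-pass edge
    return np.fft.irfft(np.fft.rfft(data) * keep, n=data.size)


lowpassed = brickwall_filter(signal, sfreq, h_freq=40)              # keeps 10 Hz
bandstopped = brickwall_filter(signal, sfreq, l_freq=40, h_freq=1)  # keeps 60 Hz
bandpassed = brickwall_filter(signal, sfreq, l_freq=1, h_freq=40)   # keeps 10 Hz
```

Real filters have a gradual transition band rather than a sharp cut, which is why the PSDs of the filtered copies roll off smoothly around the cut-off frequencies.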

## CHOOSE ONE OF THE FILTERS - AND THEN FIND THE EVENTS

Choose a filter and apply it to your `raw` before you go further.  
For the events, we will use the two checkerboard stimuli (LV and RV).

In [None]:
#%% FIND EVENTS

events = mne.find_events(raw)

## LA: 1: Response to left-ear auditory stimulus (a tone)
## RA: 2: Response to right-ear auditory stimulus
## LV: 3: Response to left visual field stimulus (checkerboard)
## RV: 4: Response to right visual field stimulus
## smiley: 5: Response to the smiley face
## button: 32: Response triggered by the button press
# https://mne.tools/stable/overview/datasets_index.html#sample

## CREATING EPOCHS

Create epochs using `mne.Epochs`. Specifically, create two epochs objects: `epochs` and `epochs_eog_cleaned`.
- In both:
    - include 200 ms before and 550 ms after each visual event.
    - make sure to include the epochs of both checkerboard stimuli (LV and RV).
    - do the demeaning by setting a baseline interval from -200 ms to 0 ms.
    - apply `set_eeg_reference(projection=True)`.
- In `epochs_eog_cleaned`:
    - include a `reject` dict that removes epochs where the peak-to-peak amplitude of eye-related activity (EOG) exceeds 250 µV.

Check out https://mne.tools/stable/generated/mne.Epochs.html
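
The three steps in the list above (cutting epochs, baseline correction and peak-to-peak rejection) can be sketched on toy data with plain NumPy. This is only meant to show the logic, not how `mne.Epochs` is implemented; the sampling frequency, event onsets and simulated blink are assumptions. Note that MNE-Python works in SI units, so 250 µV is written as `250e-6` (volts).

```python
import numpy as np

rng = np.random.default_rng(seed=0)
sfreq = 600.0                                   # assumed sampling frequency (Hz)
# toy EOG channel in volts: noise around a constant offset
data = rng.normal(scale=10e-6, size=6000) + 50e-6
data[2950:3050] += 400e-6                       # simulated blink of 400 µV
event_samples = np.array([1000, 2000, 3000, 4000, 5000])  # hypothetical onsets

# cut out -200 ms to 550 ms around each event
tmin, tmax = -0.200, 0.550
start, stop = int(round(tmin * sfreq)), int(round(tmax * sfreq))
epochs = np.stack([data[onset + start:onset + stop + 1]
                   for onset in event_samples])

# baseline correction: subtract the mean of the pre-stimulus samples
# (simplification: the sample at 0 ms itself is left out of the baseline)
n_baseline = -start
epochs = epochs - epochs[:, :n_baseline].mean(axis=1, keepdims=True)

# reject epochs whose peak-to-peak amplitude exceeds 250 µV
peak_to_peak = epochs.max(axis=1) - epochs.min(axis=1)
keep = peak_to_peak <= 250e-6
print(f'Keeping epochs {np.flatnonzero(keep)}')
```

Only the epoch that overlaps the simulated blink is dropped; the constant offset does not matter, because peak-to-peak amplitude ignores the mean level.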


In [None]:
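## one possible solution, following the specification above (LV=3, RV=4)
kwargs = dict(event_id=dict(LV=3, RV=4), tmin=-0.200, tmax=0.550, baseline=(-0.200, 0), preload=True)
epochs = mne.Epochs(raw, events, **kwargs).set_eeg_reference(projection=True)
epochs_eog_cleaned = mne.Epochs(raw, events, reject=dict(eog=250e-6), **kwargs).set_eeg_reference(projection=True)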

In [None]:
## CREATING AVERAGES AND DIFFERENCE WAVES

evokeds = list()
evokeds_eog_cleaned = list()
for event in epochs.event_id:
    evokeds.append(epochs[event].average())
    evokeds_eog_cleaned.append(epochs_eog_cleaned[event].average())

evoked_diff = evokeds[0].copy() # create a copy
evoked_diff.data -= evokeds[1].data # modify the data in place
evoked_diff.comment = 'LV - RV'



## PLOT THE EVOKEDS AND THEIR DIFFERENCE WAVES
Find out where the differences are the greatest. What does this indicate?  
Use `mne.viz.plot_evoked`, `mne.viz.plot_evoked_topo`, `mne.viz.plot_evoked_topomap`, `mne.viz.plot_evoked_joint`, `mne.viz.plot_evoked_image` to get a feel for the data.

Also create a difference wave `evokeds_eog_cleaned_diff`.  
Then compute the difference of the differences between `evoked_diff` and `evokeds_eog_cleaned_diff`. On which sensors are the differences most pronounced?
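
The idea behind averaging and difference waves can be sketched with NumPy on simulated single trials. Everything here (array sizes, the response template, which channel responds to which condition) is made up for illustration and is unrelated to the sample data set.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n_epochs, n_channels, n_times = 50, 3, 100
template = np.sin(np.linspace(0, np.pi, n_times))   # toy evoked response

# hypothetical single trials: LV responds on channel 0, RV on channel 2
epochs_lv = rng.normal(size=(n_epochs, n_channels, n_times))
epochs_rv = rng.normal(size=(n_epochs, n_channels, n_times))
epochs_lv[:, 0] += 2 * template
epochs_rv[:, 2] += 2 * template

evoked_lv = epochs_lv.mean(axis=0)     # averaging over epochs reduces noise
evoked_rv = epochs_rv.mean(axis=0)
evoked_diff = evoked_lv - evoked_rv    # difference wave, LV - RV

# channels with the largest absolute difference carry the condition effect
peak_per_channel = np.abs(evoked_diff).max(axis=1)
print(peak_per_channel)
```

Channel 1 responds to neither condition, so its difference wave contains only residual noise, whereas channels 0 and 2 show differences of opposite sign.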

## FORWARD MODEL

$\boldsymbol L (\boldsymbol r)$ is our forward model that for each source location $\boldsymbol r$, expresses how that source is linked to the sensors. The SI-unit is $\frac {T} {Am}$.  
The SI-unit for the magnetic field at each time point, $t$, $\boldsymbol b (t)$ is $T$.  
The SI-unit for the current density $\boldsymbol s(\boldsymbol r, t)$  at each location $\boldsymbol r$ and time point $t$ is $Am$.  
For each source at location $\boldsymbol r$, the forward model states how its activation in $Am$ is linked to the magnetic field at each sensor, e.g. $b_1(t)$.  
$\boldsymbol n(t)$ is the Gaussian noise at each time point.

$\boldsymbol b(t) = \left[
\begin{array}{c} 
b_1(t) \\
b_2(t) \\
\vdots \\
b_M(t)
\end{array}
\right]$  
$\boldsymbol{b}(t) = \boldsymbol{L}(\boldsymbol{r}) \boldsymbol s(\boldsymbol r, t) + \boldsymbol n(t)$
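
A toy version of this equation in NumPy may help to keep the dimensions straight. The lead field here is random and the sizes and amplitudes are arbitrary assumptions; a real forward model comes from the head geometry and the sensor positions.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_sensors, n_sources, n_times = 6, 20, 50      # M sensors, N sources (toy sizes)

L = rng.normal(size=(n_sensors, n_sources))    # hypothetical lead field, M x N
s = np.zeros((n_sources, n_times))             # source time courses, N x time
s[3] = 1e-8 * np.sin(np.linspace(0, 2 * np.pi, n_times))  # one active source
n = 1e-10 * rng.normal(size=(n_sensors, n_times))         # Gaussian sensor noise

b = L @ s + n                                  # b(t) = L s(t) + n(t)
print(b.shape)                                 # one row per sensor
```

Each column of `L` is the sensor pattern produced by one source with unit activation. With a single active source, the noise-free data are simply that column scaled by the source time course.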

The nice people from MNE-Python have already made a forward model for us

In [None]:
## READ FORWARD MODEL
fwd = mne.read_forward_solution('sample_audvis-meg-eeg-oct-6-fwd.fif')

##  THE MINIMUM NORM ESTIMATE

$$
\huge \hat{\boldsymbol \nu}_{vox}(t) = \boldsymbol L_V^T(\boldsymbol G + \epsilon \boldsymbol I)^{-1} \boldsymbol b(t)    
$$
with
$$
\huge \boldsymbol G = \int_\Omega \boldsymbol L (\boldsymbol r) \boldsymbol L^T (\boldsymbol r) d^3r
$$
and with
$$ 
\huge
\boldsymbol{\hat{\nu}}_{vox}(t) = \left[
\begin{array}{c} 
\boldsymbol{\hat s} (\boldsymbol r_1, t) \\
\boldsymbol{\hat s} (\boldsymbol r_2, t) \\
\vdots \\
\boldsymbol{\hat s} (\boldsymbol r_N, t)
\end{array}
\right]  
$$
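
The estimate can be tried out numerically on a toy problem. The sketch below assumes a discretised source space, so that the integral for $\boldsymbol G$ becomes the matrix product $\boldsymbol L \boldsymbol L^T$; the random lead field, the sizes and the value of $\epsilon$ are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n_sensors, n_sources = 6, 20
L = rng.normal(size=(n_sensors, n_sources))    # hypothetical lead field
s_true = np.zeros(n_sources)
s_true[3] = 1.0                                # a single active source
b = L @ s_true                                 # noise-free measurement

G = L @ L.T                                    # discretised Gram matrix
epsilon = 1e-6                                 # regularisation (assumed value)
s_hat = L.T @ np.linalg.solve(G + epsilon * np.eye(n_sensors), b)

explains_data = np.allclose(L @ s_hat, b, atol=1e-4)
has_smaller_norm = np.linalg.norm(s_hat) <= np.linalg.norm(s_true)
print(explains_data, has_smaller_norm)
```

With far more sources than sensors, many source patterns explain the data equally well. The minimum norm estimate picks the one with the smallest overall amplitude, which is why it spreads activity over many sources instead of recovering the single active one.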

## CREATE SOURCE TIME COURSES FOR EACH CONDITION

Create `stc` (using `mne.minimum_norm.apply_inverse`) from `evokeds`, one for each event (LV and RV), and likewise `stc_eog_cleaned` from `evokeds_eog_cleaned`.
- Do you think you would see more frontal activity in `stc` compared to `stc_eog_cleaned`? Why?
- In the noise covariance plots, which of the three kinds of sensors shows the highest correlations between sensors?

In [None]:
##  WHITEN the data, i.e. normalizing magnetometers, gradiometers
## and electrode readings to make them comparable
noise_cov = mne.compute_covariance(epochs, tmin=None, tmax=0)
noise_cov.plot(raw.info)

inverse_operator = mne.minimum_norm.make_inverse_operator(epochs.info, fwd,
                                                          noise_cov)

# estimating the source pattern for each time point, nu_vox(t)
# use the MNE method and not the default dSPM
stc = mne.minimum_norm.apply_inverse(evokeds[0], inverse_operator,
                                     method='MNE')  # evokeds[0] is LV
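
What whitening does can be shown with a few lines of NumPy. This is a sketch of the principle under simplifying assumptions, not MNE-Python's implementation: three hypothetical channels with different scales stand in for magnetometers, gradiometers and electrodes, and the whitener is built from an eigendecomposition of the noise covariance.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n_channels, n_samples = 3, 5000

# hypothetical baseline noise: correlated channels with very different scales
scales = np.array([1.0, 10.0, 100.0])
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
noise = scales[:, np.newaxis] * (mixing @ rng.normal(size=(n_channels, n_samples)))

noise_cov = np.cov(noise)                       # channels x channels
eigvals, eigvecs = np.linalg.eigh(noise_cov)
whitener = np.diag(1 / np.sqrt(eigvals)) @ eigvecs.T

whitened = whitener @ noise
whitened_cov = np.cov(whitened)
print(np.round(whitened_cov, 3))                # close to the identity matrix
```

After whitening, every channel has unit noise variance and the channels are uncorrelated, so the different sensor types can be combined in one inverse model on an equal footing.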

## EXTRACT RELEVANT LABELS FROM THE CEREBRAL CORTEX

Go to `MNE-sample-data/subjects/sample/label`, which contains labels from the segmented brain, and read some relevant labels from there using `mne.read_label`.
These could be:
 - `lh.V1.label`
 - `lh.V2.label`
 - `rh.V1.label`
 - `rh.V2.label`

You can read in even more labels by running `labels = mne.read_labels_from_annot('sample', subjects_dir=subjects_dir)`

Plot some (visual) labels that should show a response, using `plt.plot` from `matplotlib`. Here's some code that could get you started:

```
lh_V1_label = mne.read_label(join(subjects_dir, 'sample', 'label', 'lh.V1.label'))
stc_lh_V1_label = stc.in_label(lh_V1_label)
plt.figure()
plt.plot(stc_lh_V1_label.times, stc_lh_V1_label.data.T)
plt.xlabel('Time (s)')
plt.ylabel('Current density (Am)')
plt.title(lh_V1_label.name)
plt.show()

```

Also plot some labels that should not show a response (for example, some language-related labels).  
Discuss with each other what each of the lines represents.

##  MACHINE LEARNING LOGISTIC REGRESSION 

Finally, we are going to do some machine learning on these data, trying to predict whether the checkerboard was shown in the left or the right visual field. For this, we need to do source reconstruction of the individual epochs instead of the evokeds.


Below is some code to get you started

```

## GETTING SOURCE TIME COURSES FROM EPOCHS

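## lambda2 is the regularisation parameter (lambda is a reserved word in Python)
stcs = mne.minimum_norm.apply_inverse_epochs(epochs, inverse_operator,
                                             lambda2=1, method='MNE')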

## READING LABELS FROM ANNOTATION
labels = mne.read_labels_from_annot('sample', subjects_dir=subjects_dir)

## EXTRACT LABELS FROM MULTIPLE STCS
def extract_label(label, stcs):
    label_stcs = list()
    for stc in stcs:
        label_stcs.append(stc.in_label(label))
        
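    return label_stcs

## choose one label (lateraloccipital-lh is only an example choice)
label = [label for label in labels if label.name == 'lateraloccipital-lh'][0]
label_stcs = extract_label(label, stcs)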


#%% CREATE X AND Y FOR LOGISTIC REGRESSION

import numpy as np

y = epochs.events[:, 2] # the codes for the visual events

n_events = len(y) ## how many repetitions
n_samples = label_stcs[0].data.shape[1] # how many time points
n_label_vertices = label_stcs[0].data.shape[0]  # how many source positions

X = np.zeros(shape=(n_events, n_label_vertices, n_samples))

## assign data to X
for event_index in range(n_events):
    X[event_index, :, :] = label_stcs[event_index].data
    
    
#%% DO LOGISTIC REGRESSION PER TIME SAMPLE

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.preprocessing import StandardScaler

scores_list_samples = [None] * n_samples

sc = StandardScaler()

for sample_index in range(n_samples):
    print(sample_index)
    this_X = X[:, :, sample_index]
    this_X_std = sc.fit_transform(this_X) ## standardise the data
    logr = LogisticRegression(C=1e-3)
    scores_list_samples[sample_index] = np.mean(cross_val_score(logr,
                                                                this_X_std,
                                                                y,
                                                        cv=StratifiedKFold()))


## PLOTTING THE CLASSIFICATION ACCURACY

plt.figure()
plt.plot(stcs[0].times, scores_list_samples)
plt.title('Classification of LV vs RV')
plt.xlabel('Time (s)')
plt.ylabel('Proportion correctly classified')
plt.show() ## show the figure after titles and labels have been set



    
```