# A tutorial for building Inverted Encoding Models on EEG data: An SSVEP study

Here I will apply two novel methodologies to fit an inverted encoding model (IEM) to EEG data. The [data](https://osf.io/qx94k/?view_only=e256bebd6e7c4f888a696fb35919913d) come from a study [link](NA) in which participants were presented with two superimposed dot arrays, each flickering at a distinct frequency (7.5 and 8.57 Hz). The dataframe contains the amplitudes obtained from a fast Fourier transform (FFT) procedure for several participants.
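
To make the input format concrete, here is a minimal sketch of how an amplitude spectrum with two frequency-tagged peaks could be computed. The sampling rate, duration, component amplitudes, and noise level are all hypothetical; this is a synthetic illustration, not the preprocessing used in the study.

```python
import numpy as np

# Hypothetical recording parameters (not taken from the study)
sampling_rate = 256.0                  # Hz
duration = 8.0                         # s, so the frequency resolution is 1/8 = 0.125 Hz
stim_freqs = 120 / np.array((14, 16))  # 8.57 Hz and 7.5 Hz, as in the tutorial code

rng = np.random.default_rng(0)
t = np.arange(int(sampling_rate * duration)) / sampling_rate

# Synthetic single-channel signal: two sinusoids plus white noise
signal = (1.0 * np.sin(2 * np.pi * stim_freqs[0] * t)
          + 0.6 * np.sin(2 * np.pi * stim_freqs[1] * t)
          + 0.5 * rng.standard_normal(t.size))

# Single-sided amplitude spectrum
frequencies = np.fft.rfftfreq(t.size, d=1 / sampling_rate)
amplitudes = 2 * np.abs(np.fft.rfft(signal)) / t.size

# For each stimulation frequency, take the largest amplitude among
# the nearest frequency bin and its two neighbours
peak_freqs = []
for f in stim_freqs:
    nearest = np.argmin(np.abs(frequencies - f))
    window = nearest + np.array([-1, 0, 1])
    peak = window[np.argmax(amplitudes[window])]
    peak_freqs.append(frequencies[peak])
peak_freqs = np.array(peak_freqs)

print(peak_freqs)
```

Because 8.57 Hz does not fall exactly on a frequency bin here, its energy leaks into neighbouring bins, which is why the peak is searched in a small window around the nearest bin rather than read from a single bin.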

The key manipulation was that the attended array's colors were sampled from either a narrow or a broad range of colors. Let's first examine the dataset.

In [None]:
import pandas as pd
import numpy as np

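# The following function extracts the peak amplitude for each stimulation frequency
def find_local_max(df, n=1, stim_freqs=120/np.array((14,16)), dv='fft_amplitude'):

    # Only the immediate neighbours (+/- 1 bin) are supported for now
    if n != 1: raise ValueError("n has to be 1 for now.")

    # Candidate bins: the bin closest to each stimulation frequency and its two neighbours
    idx = np.array([np.argmin(np.abs(df['frequency']-fqN)) + n*np.array([-1,0,1]) for fqN in stim_freqs])
    # Within each candidate window, pick the bin with the largest amplitude
    peak_idx = np.array([idxN[np.argmax(df.iloc[idxN][dv])] for idxN in idx])
    # If both frequencies ended up on the same bin, move one of them to a neighbouring bin
    if len(np.unique(peak_idx)) == 1:

        if np.argmax(df.iloc[peak_idx + [-1,1]][dv]) == 0: peak_idx[0] += 1
        else: peak_idx[1] -= 1

    return(df.iloc[peak_idx].assign(search_frequency = stim_freqs))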

# Our dependent variable is the FFT amplitude
dv = "fft_amplitude"

# These are the stimulation frequencies
stim_freqs = 120/np.array((14,16))

# The centers of the colors that were presented. The details will be given later.
stimulated_color_centers = np.linspace(-180,180,7)[:-1]*np.pi/180

# Directly extract peak amplitudes per stimulation frequencies
df = (
    pd.read_csv('./data/trial_fft_and_snrs_per_channel.csv')
    # Pick the first participant's data
    .set_index('participant').loc[0].reset_index()
    .groupby(['participant','trial','channels'])
    .apply(find_local_max)
      )
df[["conditions","target_frequency","color_center"]] = df['conditions'].str.split('/',expand=True)

# Transform the dataframe in a format where features are columns
df = pd.pivot_table(df.reset_index(drop=True), 
               values=[dv],
               columns=["channels"],
               index=["participant","search_frequency","trial","conditions","target_frequency","color_center"]
               ).reset_index()

# Housekeeping
df["target_frequency"] = df["target_frequency"].replace(
                   {"TargetFreq8":stim_freqs[0],"TargetFreq7":stim_freqs[1]}
               )
df["color_center"] = df["color_center"].replace({"Color{}".format(n):deg for n, deg in zip(range(6),stimulated_color_centers)})
df["color_center"] = df["color_center"].where(df['search_frequency'] == df['target_frequency'], 
                                              other=((((df["color_center"] + np.pi) + np.pi) % (2 * np.pi)) - np.pi))
df["color_center"] = np.around(df["color_center"],decimals=5)
df['isTargetSNR'] = df['search_frequency'] == df['target_frequency']
df.head()

The current dataframe contains FFT amplitudes per stimulation frequency (7.5 or 8.57 Hz) for every trial. In roughly half of the trials, one stimulation frequency tags the attended array and the other tags the unattended array; in the remaining half, the reverse is true. The color range of the target array is indicated under 'conditions' as 'Broad' or 'Narrow'. The 'color_center' column gives, in radians, one of six equidistant color centers around an isoluminant color wheel.
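
The housekeeping code above shifts the color center by π for rows whose 'search_frequency' differs from the 'target_frequency', and wraps the result back onto the wheel. The sketch below isolates that arithmetic. The helper name `opposite_center` is hypothetical, and reading the shift as "the unattended array sits on the opposite side of the color wheel" is an assumption based on the code, not something the source states.

```python
import numpy as np

# Six equidistant color centers in radians, as defined in the tutorial code
stimulated_color_centers = np.linspace(-180, 180, 7)[:-1] * np.pi / 180

def opposite_center(theta):
    """Rotate an angle by pi and wrap the result into [-pi, pi)."""
    return (((theta + np.pi) + np.pi) % (2 * np.pi)) - np.pi

opposite = opposite_center(stimulated_color_centers)

# Print each center next to its shifted counterpart, in degrees for readability
for center, other in zip(np.degrees(stimulated_color_centers), np.degrees(opposite)):
    print(f"{center:7.1f} deg -> {other:7.1f} deg")
```

The shifted values land on the same six centers (for example, -180° maps to 0° and 0° maps to -180°). That is why rounding to five decimals afterwards is enough to make the values directly comparable across rows.
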
Each stimulation frequency induces idiosyncratic steady-state visual evoked potentials (SSVEPs). This idiosyncrasy arises from differences in how neural populations respond to a specific frequency. Next, we will minimize the differences due to stimulation frequency using a Procrustes transformation (REF).

## Procrustes Transformation

A Procrustes transformation is a set of linear transformations that makes the structure of a matrix A as similar as possible to that of a matrix B, in a high-dimensional space where each feature defines one coordinate axis. Each row (observation) in matrix A should belong to the same condition of interest as the corresponding row in matrix B. This way, corresponding rows retain their latent structure and the transformation does not distort the underlying experimental manipulations. Therefore, we need to split our dataframe by peak frequency, then subsample and order each part so that corresponding rows share the same 'conditions', 'target_frequency', and 'color_center'.
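
The two steps described above, row matching and alignment, can be sketched on toy data. Everything here is hypothetical: the channel names, the `pair` helper column, the simulated distortion, and the function `procrustes_align`. The alignment is the classical Procrustes solution (centering, one global scale, and an orthogonal rotation or reflection obtained from an SVD), which may differ from the exact variant applied later in this tutorial.

```python
import itertools
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
keys = ["conditions", "color_center"]   # labels that must agree row by row
channels = ["ch0", "ch1", "ch2"]        # hypothetical channel names

# One latent channel pattern per label combination (toy ground truth)
labels = pd.DataFrame(list(itertools.product(["Broad", "Narrow"], [0.0, 1.0472, 2.0944])),
                      columns=keys)
patterns = rng.standard_normal((len(labels), len(channels)))

# Random orthogonal matrix standing in for a frequency-specific distortion
Q, _ = np.linalg.qr(rng.standard_normal((len(channels), len(channels))))

def toy_frame(n_rows, distort):
    """Draw random trials; optionally rotate, scale, and shift their amplitudes."""
    picks = rng.integers(len(labels), size=n_rows)
    X = patterns[picks] + 0.05 * rng.standard_normal((n_rows, len(channels)))
    if distort:
        X = 2.0 * X @ Q + 3.0
    return pd.concat([labels.iloc[picks].reset_index(drop=True),
                      pd.DataFrame(X, columns=channels)], axis=1)

a_df = toy_frame(40, distort=True)    # stands in for one peak frequency
b_df = toy_frame(50, distort=False)   # stands in for the other

# Number the trials within each label combination, then keep only the pairs
# that exist in both frames: this subsamples and orders the rows in one step
a_df["pair"] = a_df.groupby(keys).cumcount()
b_df["pair"] = b_df.groupby(keys).cumcount()
matched = a_df.merge(b_df, on=keys + ["pair"], suffixes=("_a", "_b"))

A = matched[[c + "_a" for c in channels]].to_numpy()
B = matched[[c + "_b" for c in channels]].to_numpy()

def procrustes_align(A, B):
    """Map A onto B by centering, a global scale, and an orthogonal matrix."""
    mu_a, mu_b = A.mean(axis=0), B.mean(axis=0)
    A0, B0 = A - mu_a, B - mu_b
    U, s, Vt = np.linalg.svd(A0.T @ B0)
    R = U @ Vt                           # minimizes ||A0 @ R - B0|| over orthogonal R
    scale = s.sum() / (A0 ** 2).sum()    # least-squares optimal global scale
    return scale * A0 @ R + mu_b, R, scale

A_aligned, R, scale = procrustes_align(A, B)

error_before = np.linalg.norm(A - B)
error_after = np.linalg.norm(A_aligned - B)
print(len(matched), error_before, error_after)
```

The alignment only helps because matched rows carry the same labels. If the rows of A and B were paired arbitrarily, the SVD would be fitting noise rather than the shared condition structure.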