# Realign the epochs of two EEGLAB Epochs objects

@Author [@FranckPrts](https://github.com/FranckPrts).

Here we attempt to provide a Python-based solution to the following question: **How can I realign concurrent epochs from two epoched objects?**

Our main goal here is to identify every epoch in a given EEG that has no *sister* (concurrently recorded) epoch in the other EEG because that sister was rejected during preprocessing. At the end, we should have two epoched EEGs with the same number of epochs, where sister epochs share the same index.

First, the EEG data of each participant were segmented into 1-second epochs, and the two recordings were then preprocessed independently. At each stage of this iterative process (2-3 iterations in our case), the IDs of the rejected epochs were noted in a separate file.

Our issue arises from the fact that, once each step was performed, saving the data lost track of the epochs' original IDs, as illustrated below:

#IMAGE

In that process we see that an epoch that originally had the ID #6 can end up with the new ID #3. 

To retrieve the original ID of each epoch, we have to work backward from the last iteration of preprocessing to the first. At each step we store the previous ID of every epoch so that we can eventually recover the original IDs.
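
Once the original IDs of the surviving epochs are known for both recordings, the realignment itself reduces to an intersection. The sketch below only illustrates that last step under this assumption: `ids_eeg1` and `ids_eeg2` are made-up arrays of original IDs (not values from the recordings used here), and `pair_sister_epochs` is a hypothetical helper name.

```python
import numpy as np


def pair_sister_epochs(ids_eeg1, ids_eeg2):
    """Return the shared original IDs and their positions in each recording.

    Hypothetical helper: both inputs list the original ID of every epoch
    that survived preprocessing, in the order the epochs are stored.
    """
    common, pos1, pos2 = np.intersect1d(
        ids_eeg1, ids_eeg2, assume_unique=True, return_indices=True
    )
    return common, pos1, pos2


# Made-up original IDs of the epochs kept in each recording
ids_eeg1 = np.array([0, 1, 4, 5, 6, 9])
ids_eeg2 = np.array([1, 2, 4, 6, 7, 9])

common, pos1, pos2 = pair_sister_epochs(ids_eeg1, ids_eeg2)
print(common)  # original IDs present in both recordings
print(pos1)    # positions of those epochs in the first recording
print(pos2)    # positions of those epochs in the second recording
```

Selecting the epochs at `pos1` in the first recording and at `pos2` in the second should then give two objects of equal length in which index *i* refers to the same original epoch.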

## Imports

In [338]:
# Package 
import mne
import numpy as np
import pandas as pd
import tempfile

# Custom functions
import utils

# %matplotlib inline

We import two EEG streams that were preprocessed in MATLAB.

In [339]:
import os


def EpochsEEGLAB_to_mneEpochsFIF(path):
    """
    Loads a SET file into a mne.io.eeglab.eeglab.EpochsEEGLAB object
    and converts it into a mne.Epochs instance.

    Arguments
    ----------
    path: str
        path to the epoched EEGLAB file (.set)

    Returns
    --------
    mneEpochs:
        instance of mne.Epochs.
    """
    # read the file and get a mne.io.eeglab.eeglab.EpochsEEGLAB instance
    tmp = mne.io.read_epochs_eeglab(path)

    with tempfile.TemporaryDirectory() as tmpdir:
        # save it in FIF, inside the temporary directory
        # (MNE expects epochs file names to end with -epo.fif)
        fif_path = os.path.join(tmpdir, "tmp-epo.fif")
        tmp.save(fif_path, overwrite=True, verbose=None)

        # re-read it while the directory still exists, so it is now
        # a mne.EpochsFIF with its data loaded in memory
        return mne.read_epochs(fif_path, preload=True)


In [340]:
files_to_process = np.loadtxt("files_to_process.csv",
                 delimiter=",", dtype=str)

dyad = [x for x in files_to_process]
# Careful, files_to_process is in the order (dyad_nb, eeg_filepath_child, eeg_filepath_adult)
dy = dyad[0]
data_path = '../FINS-data/'

In [341]:
eeg1 = EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1])) 
eeg2 = EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]))

Extracting parameters from /Users/zoubou/Documents/Work/NYU/Brito-Lab/FINS-Codes/../FINS-data/220_child_FP/FINS_220_Child_FreePlay_xchan_rej3.set...
Not setting metadata
159 matching events found
No baseline correction applied
0 projection items activated
Ready.
Reading /var/folders/vv/stc9rswn5c95vxdzpx7z6qqr0000gn/T/tmpwntk963btmp.fif ...
    Found the data of interest:
        t =       0.00 ...     998.00 ms
        0 CTF compensation matrices available
0 bad epochs dropped
Not setting metadata
159 matching events found
No baseline correction applied
0 projection items activated
Extracting parameters from /Users/zoubou/Documents/Work/NYU/Brito-Lab/FINS-Codes/../FINS-data/220_adult_FP/FINS_220_Adult_FreePlay_xchan_ica_rej3.set...
Not setting metadata
206 matching events found
No baseline correction applied
0 projection items activated
Ready.
Reading /var/folders/vv/stc9rswn5c95vxdzpx7z6qqr0000gn/T/tmp31cajzmwtmp.fif ...
    Found the data of interest:
        t =       0.00 ...     



0 bad epochs dropped
Not setting metadata
206 matching events found
No baseline correction applied
0 projection items activated


Let's see how many epochs we have per EEG file:

In [342]:
print('EEG-1 has {} epochs.'.format(eeg1.get_data().shape[0]))
print('EEG-2 has {} epochs.'.format(eeg2.get_data().shape[0]))

EEG-1 has 159 epochs.
EEG-2 has 206 epochs.


There should be the same number of epochs in each file, yet the counts differ. Moreover, the epoch indexes (see the x-axis of the plots below) are contiguous in both files, so they give no indication of which epochs were rejected:

In [343]:
# eeg1.plot()

In [344]:
# eeg2.plot()

In [345]:
# eeg1.to_data_frame()

### What's the plan now?

When a file in the EEGLAB format is loaded, its epochs are numbered contiguously. Say you have the following epoch indexes in your preprocessed file:

`1, 2, 3, 4, 5, 6, 7`

And you know that the following epochs were rejected:

`3, 7, 8`

but then get 

`1, 2, 3, 4`

We'll now reconstruct the original epoch indexes as follows (the index in the preprocessed file is given within brackets):

`1(1), 2(2), NaN, 4(3), 5(4), 6(5), NaN, NaN, 9(6), 10(7)`


> **Careful, we have multiple rounds of rejection, so this method has to be repeated for each round.**
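
For a single round, the reconstruction above can be sketched as follows. This is a minimal illustration using the 1-based numbering of the example. `reconstruct_single_round` is a hypothetical helper name, and `None` stands in for the `NaN` placeholder.

```python
def reconstruct_single_round(n_kept, rejected):
    """Map each original index (1-based) to its index after rejection.

    Rejected epochs map to None. `rejected` is expressed in the original
    numbering, as in the example above.
    """
    rejected = set(rejected)
    n_original = n_kept + len(rejected)
    mapping = {}
    new_idx = 0
    for orig in range(1, n_original + 1):
        if orig in rejected:
            mapping[orig] = None
        else:
            # Kept epochs are renumbered contiguously
            new_idx += 1
            mapping[orig] = new_idx
    return mapping


mapping = reconstruct_single_round(n_kept=7, rejected=[3, 7, 8])
for orig, new in mapping.items():
    print(orig, new)
```

With 7 kept epochs and epochs 3, 7 and 8 rejected, original epoch 9 maps to index 6 and original epoch 10 to index 7, as in the listing above.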

### Extracting the epoch data

We're now going to extract the epoch data from the `mne.EpochsFIF` objects to apply the operation described above.

In [346]:
df1 = eeg1.to_data_frame()
df2 = eeg2.to_data_frame()

In [347]:
df2.epoch.unique()


array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,
       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,
       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,
       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,
       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 18

## Make an example

In [348]:
df = pd.DataFrame({'Letters': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'], 'Indexes': [0, 1, 2, 3, 4, 5, 6, 7]})

# pd.set_option('display.max_rows', len(df))

df

Unnamed: 0,Letters,Indexes
0,A,0
1,B,1
2,C,2
3,D,3
4,E,4
5,F,5
6,G,6
7,H,7


In [349]:
# Now we remove two rows in a first round:
# Create a list of elements to remove
rmed_1 = [1, 3]

# Create a boolean mask indicating which rows to remove
mask = df['Indexes'].isin(rmed_1)

# Remove the rows that match the elements in the list
df.drop(index=df[mask].index, inplace=True)
df

Unnamed: 0,Letters,Indexes
0,A,0
2,C,2
4,E,4
5,F,5
6,G,6
7,H,7


In [350]:
# Now we reset the indexes, as happens when the saved 'eeg' file is read again for the next rejection round
df.Indexes = [i for i in range(len(df))]
df

Unnamed: 0,Letters,Indexes
0,A,0
2,C,1
4,E,2
5,F,3
6,G,4
7,H,5


In [351]:
# Now we remove three rows and directly reset the indexes
# Create a list of elements to remove
rmed_2 = [2, 5, 0]

# Create a boolean mask indicating which rows to remove
mask = df['Indexes'].isin(rmed_2)

# Remove the rows that match the elements in the list
df.drop(index=df[mask].index, inplace=True)

df.Indexes = [i for i in range(len(df))]

df

Unnamed: 0,Letters,Indexes
2,C,0
5,F,1
6,G,2


Alright, now we have two lists containing the indexes that were removed **`at the time of their round of rejection`**. 

Keep in mind that the same index, e.g. #4, could be deleted in multiple rounds, since #4 is reassigned to another epoch each time the file is re-read.
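
To make that point concrete, here is a small sketch in which index 4 is rejected in two consecutive rounds and removes a different item each time. `reject` is a hypothetical helper that mimics the implicit re-indexing with a plain list.

```python
def reject(items, indexes):
    """Drop the given positions; the remaining items are implicitly re-indexed."""
    dropped = set(indexes)
    return [item for i, item in enumerate(items) if i not in dropped]


letters = list("ABCDEFGH")

round_1 = reject(letters, [4])  # index 4 is 'E' in the original list
round_2 = reject(round_1, [4])  # index 4 now points to 'F'

print(round_1)
print(round_2)
```

This is why each list of rejected indexes can only be interpreted in the numbering of its own round.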

In [352]:
last_state = df['Indexes'].tolist()
print('Indexes that were rejected at the\n\t1st round: {}\n\t2nd round: {}'.format(rmed_1, rmed_2))
print('The indexes as they are after the last rejection round {}'.format(last_state))

Indexes that were rejected at the
	1st round: [1, 3]
	2nd round: [2, 5, 0]
The indexes as they are after the last rejection round [0, 1, 2]


In [353]:
def create_initial_state_dict(last_state: list) -> dict:
    # Each index of the last state initially maps to itself
    return {i: i for i in last_state}

In [354]:
def take_one_step_back(state_dict: dict, rm_idx: list, first_round_bool: bool = False) -> dict:
    """
    Step back through one rejection round.

    state_dict maps each index as it was after that round to its index in
    the final state (or 'NaN' if the epoch is rejected in a later round).
    rm_idx lists the indexes rejected during that round. The returned dict
    uses the indexes as they were before the round as keys.

    first_round_bool is kept so existing calls still work; the same logic
    applies to the last rejection round and to the earlier ones.
    """
    # Values of the current state, ordered by index
    known_values = iter([state_dict[key] for key in sorted(state_dict)])

    # Reintroducing the removed indexes lengthens the state
    removed = set(rm_idx)
    n_before = len(state_dict) + len(removed)

    new_state = {}
    for idx in range(n_before):
        if idx in removed:
            # This index was rejected during the round
            new_state[idx] = 'NaN'
        else:
            # Every index located after a removed one is shifted back,
            # so it takes the next value that has not been assigned yet
            new_state[idx] = next(known_values)

    # Update the dict in place, as the calling cell reuses it
    state_dict.clear()
    state_dict.update(new_state)
    return state_dict

In [355]:
# Define the last state
last_state=[0, 1, 2]

# Define the indexes rejected at each round, from the last round back to the first (only the last round here)
rmed_list=[[2, 5, 0]]

In [361]:
# First initialise the state dict containing the true idx as keys 
# and their corresponding idx in the last state as values
state_dict = create_initial_state_dict(last_state=last_state)
print('Initial state:')
for i in state_dict.keys():
    print('\t',i, state_dict[i])

first_round_bool = True
for rm_idx in rmed_list:
    updated_state_dict = take_one_step_back(state_dict=state_dict, rm_idx=rm_idx, first_round_bool=first_round_bool)
    print('Updated state:')
    for i in updated_state_dict.keys():
        print('\t',i, updated_state_dict[i])
    first_round_bool = False

Initial state:
	 0 0
	 1 1
	 2 2
Updated state:
	 0 NaN
	 1 0
	 2 NaN
	 3 1
	 4 2
	 5 NaN
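
The cell above only steps back through the last rejection round. As a rough sketch of how the walk could be repeated over every round of the toy example, the block below undoes both rounds and recovers the original index of each surviving row. `step_back` and `recover_original_indexes` are hypothetical helpers written for this illustration, and `None` stands in for the `'NaN'` placeholder.

```python
def step_back(values, removed):
    """Undo one rejection round.

    `values[i]` is the final index of the item stored at position i after
    the round (or None if it is rejected later on). `removed` holds the
    positions rejected during the round, in that round's own numbering.
    """
    removed = set(removed)
    remaining = iter(values)
    n_before = len(values) + len(removed)
    return [None if i in removed else next(remaining) for i in range(n_before)]


def recover_original_indexes(n_final, rounds_first_to_last):
    """Walk backward from the last round to the first one."""
    values = list(range(n_final))
    for removed in reversed(rounds_first_to_last):
        values = step_back(values, removed)
    return values


rmed_1 = [1, 3]      # rejected at the 1st round
rmed_2 = [2, 5, 0]   # rejected at the 2nd round

# mapping[original_index] is the final index, or None if rejected
mapping = recover_original_indexes(3, [rmed_1, rmed_2])
kept = [orig for orig, final in enumerate(mapping) if final is not None]

print(mapping)
print(kept)
```

In the toy DataFrame, the three remaining rows are `C`, `F` and `G`, whose original indexes were 2, 5 and 6, which is what `kept` should contain.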


## Useful references

- To get comfortable with the MNE documentation, you should know that MNE is built on Python [Object-Oriented Programming (OOP)](https://realpython.com/python3-object-oriented-programming/). Its objects are defined from a Python `Class`.
    - You can get familiar with the OOP structure and its components, e.g. `methods` (a function associated with the object) and `attributes` (a variable associated with the object), with [this tutorial](https://www.datacamp.com/tutorial/python-oop-tutorial).
    - In MNE, we find [`Raw` objects](https://mne.tools/stable/generated/mne.io.Raw.html) (continuous data) and [`Epochs` objects](https://mne.tools/stable/generated/mne.Epochs.html) (a collection of epochs). 

You can find an introduction to the **Epochs data structure** in MNE [here](https://mne.tools/stable/auto_tutorials/epochs/10_epochs_overview.html). 